http://www.sqlpass.org/24hours/2016/summitpreview/Sessions/Details.aspx?sid=53738
I was watching the session and realized I would like to add something to the batch commands section. Back in 2003 I wrote a query optimization model that still works to this day, and I would like to share it.
Problem
You have a lot of data you need to import. Pushing it one row at a time is very inefficient, but what is the "right" number of queries to push in one go?
Solution
SQL commands are sent to the server in network packets. These packets are 4096 bytes / 4 KB in size by default; the size can be configured on the system, but most installations leave this setting intact. This means that if you write an insert for one row and send it to the server, there is plenty of unused space in the 4096-byte packet.
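As a minimal sketch (assuming a local SQL Server and a hypothetical connection string), the packet size can be requested through the "Packet Size" connection-string keyword and inspected on an open SqlConnection via its PacketSize property:

using System;
using System.Data.SqlClient;

class PacketSizeCheck
{
    static void Main()
    {
        // "Packet Size=4096" is the default; stated explicitly here for illustration.
        var connectionString =
            "Server=localhost;Database=TestDb;Integrated Security=true;Packet Size=4096";

        using (var connection = new SqlConnection(connectionString))
        {
            connection.Open();
            // PacketSize reports the packet size (in bytes) actually negotiated with the server.
            Console.WriteLine($"Network packet size: {connection.PacketSize} bytes");
        }
    }
}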
So the task at hand is very simple: optimize your batch so that you send as many queries as possible in one 4096-byte packet without going over, since an overflowing query is sent in another packet with trailing empty space as a result.
If you watch the session by Mladen Prajdić, there is a reference to a System.Data.SqlClient.SqlCommandSet that is used internally by the .NET Framework; my guess is that it works similarly and tries to optimize the network traffic to the server.
A more brutal implementation (clean and simple) is to write a batch query engine yourself, measure the byte length of the resulting query, and only push it to the server when the byte length is close to 4096 but not over.
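Here is a minimal sketch of that byte-length approach (not the gist code itself; the dbo.Customers table and Name column are hypothetical). Statements are appended to a batch until adding the next one would push the batch past the packet size, at which point the batch is flushed to the server:

using System;
using System.Collections.Generic;
using System.Data.SqlClient;
using System.Text;

class ByteSizedBatcher
{
    const int PacketSizeBytes = 4096;

    static void ImportRows(SqlConnection connection, IEnumerable<string> names)
    {
        var batch = new StringBuilder();

        foreach (var name in names)
        {
            // Build one statement; a real implementation should parameterize or fully escape values.
            var statement =
                $"INSERT INTO dbo.Customers (Name) VALUES (N'{name.Replace("'", "''")}');";

            // The batch text travels as UTF-16 over TDS, so measure the byte length accordingly.
            int currentBytes = Encoding.Unicode.GetByteCount(batch.ToString());
            int statementBytes = Encoding.Unicode.GetByteCount(statement);

            // Flush when the next statement would overflow the packet.
            if (batch.Length > 0 && currentBytes + statementBytes > PacketSizeBytes)
            {
                Execute(connection, batch.ToString());
                batch.Clear();
            }

            batch.AppendLine(statement);
        }

        // Send whatever is left over.
        if (batch.Length > 0)
            Execute(connection, batch.ToString());
    }

    static void Execute(SqlConnection connection, string sql)
    {
        using (var command = new SqlCommand(sql, connection))
            command.ExecuteNonQuery();
    }
}

The sketch re-measures the batch on every iteration for clarity; keeping a running byte counter instead would avoid the repeated ToString calls.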
I have shared my base code for this brutal approach on https://gist.github.com/janhebnes/726811a5ba0d36c9de7c323ae177bf22