Unbuffered data fetching from MySQL - mysql_free_result performance issues

When you run a SELECT statement through MySQL client library (or some driver implemented on top of it, such as Perl's DBD::mysql or PHP's mysql_query), by default the server buffers the complete query result and returns rows to the client only after the entire query has finished. This is undesirable if the result set is potentially huge and you want to obtain rows as soon as possible. Fortunately, there is a documented solution for this problem - you can configure your client library to use the API function mysql_use_result rather than mysql_store_result (which is equivalent to the --quick option in mysql command-line client).

With that option in place, let's say you wish to implement a sort of client-side LIMIT statement, based on some required client-side filtering of the sorted result set, and discard the rest of the rows as soon as you have found your matches. Unfortunately, your luck ends here because mysql_use_result will not easily satisfy this scenario. When you attempt to finish/release the statement handle after reading a couple rows (as you are required to do, e.g. by calling $sth->finish if you use Perl's DBI), under the hood the function mysql_free_result is called. The documentation makes it appear harmless, saying that it "frees the memory". Apparently, for unknown reasons, this function (or more precisely, cli_flush_use_result called from within it) does a lot more and takes ages to complete given a big result set. Your client may remain blocked for one minute or longer before it is given an opportunity to execute the next statement, all while causing a high CPU usage on the server. You really wanted to simply abort the query, but with MySQL you can't.

This regrettable behavior is even partly documented: you must retrieve all the rows even if you determine in mid-retrieval that you've found the information you were looking for, although the reason for it is not explained. The only workaround I found was to close the connection on which the incomplete query was issued and then reconnect immediately. Ironically, this brute-force way of "freeing memory" works instantly in comparison. (Of course, it is not acceptable if you're inside a transaction, which will be lost by dropping the connection.)

No comments:

Post a Comment