Practical Query Analysis

Sample report Documentation Download Project page Credits

If you have a busy PostgreSQL or MySQL database application, you might want to analyze the queries to see if they can be improved. Here's a little utility to help with that.


To use PQA, you'll need Ruby, PQA itself, and of course a PostgreSQL or MySQL installation.

Here's an example report generated from data collected on RubyForge's database activity.

You can install PQA via RubyGems by downloading the RubyGem file and doing a:

gem install pqa-1.6.gem

And then you can run PQA like this:

pqa -file /path/to/your/logfile -normalize -top 5

Or, if you want to use the zip file, download it and unzip it into any directory. Then open a command line window, cd into the pqa-1.6 directory, and run it like this:

[tom@hal lib]$ ./pqa.rb -file ../sample/pglog_sample.log -normalize -top 5
645 queries (65 unique), longest ran in 0.370524 seconds), parsed in 0.088793 seconds
### Queries by type
SELECTs: 593  (92%)
INSERTs: 51   (8%)
UPDATEs: 1  (0%)
### 5 most frequent queries
61 times: SELECT total FROM forum_group_list_vw WHERE group_forum_id=''
46 times: SELECT plugin_id, plugin_name FROM plugins
46 times: INSERT INTO activity_log (day,hour,group_id,browser,ver,platform,time,page,type) VALUES (0,'','','','','','','','');
40 times: select classname from supported_languages where language_code = ''
40 times: SELECT language_code FROM supported_languages WHERE language_id=''


Here are some recommendations on configuring postgresql.conf:
VariableRecommending settingNotes
log_statementtrueThis reports the actual SQL statements; turn this one on.
log_durationtrueThis reports how long it took to run each statement; turn this one on too.
log_pidtrueThis reports the process id (pid) of the client making the query. This is needed to correctly track messages from multiple clients. Turn this one on.
log_min_duration_statementtime in ms
(see note)
This setting allows only queries that are taking more than the specified number of milliseconds are reported. All SQL statements that run for the time specified or longer will be logged with their duration. Setting this to zero will print all queries and their durations. It is not necessary to enable log_statement or log_duration to use feature. log_min_duration_statement was introduced with PostgreSQL 7.4.
log_connectionfalseThis reports when connections are established. Turn this one off.
log_timestampfalseThis reports the time at which each event occurs. Turn this one off as well.
log_error_verbositydefaultThis controls how verbose the log file is; 'default' is the format PQA expects.
lc_messages'C'This ensures that the log file will contain the English messages - that's necessary because PQA looks for strings like "Duration"

There are two ways to collect data from your PostgreSQL database server:

And to collect data from MySQL:

Note that PQA assumes that the ruby binary is in /usr/local/bin/. If it's somewhere else, like /usr/bin/, you'll need to edit the first line of pqa.rb to reflect that location.

If there's anything we can add to make this documentation more helpful, please let us know about it by posting to the project forums. Thanks!

PHP programmers may be interested in a similar application written by Havard Eide - EPCQueryAnalyzer.



Contributors Article

Tom Copeland wrote an article (here's the errata) about PQA which might be helpful reading. It's a bit outdated, though, and it includes some statements about syslog that reflect my ignorance rather than syslog's actual capabilities :-)

The article states that to turn off logging, you can just comment out the syslog configuration settings and do a postgresql reload. That won't work; instead, you'll need to change the settings back to what they were originally and then postgresql reload. Thanks to Neil Conway for pointing this out to me.

The article also has the wrong name for the PostgreSQL IRC channel - it's #postgresql, not #pgsql. Thanks to Robert Treat for noticing this.