Monday, November 5, 2012

MySQL 5.6 : Performance Schema - Statement Digest

Along with lots of new features in performance schema, a new summary table, event_statement_summary_by_digest, has been added to MySQL 5.6. This tables gives a consolidated view of different statements executed/executing on MySQL server.

WHAT IS DIGEST?
Digest is a hash calculated on normalized form of statement text which removes general property of a statement but keeps specific property. For ex. for two statements X and Y,

Removing general property means
normalized (X) == normalized (Y)
Preserving specific property after normalization means:
normalized (X) != normalized (Y)

On these normalized statement, a hash is calculated which is called DIGEST. So digest is a unique representation of statements having same structure.


Example :
SELECT 1 FROM t1; 
SELECT 2 FROM t1; 
Above two statements have same general structure so they would end up having same digest.

SELECT 1 FROM t1;
SELECT 1 FROM t2.
Above two statements don't have same general structure as they have different table names (specific property), so they would be having different digests.

NEED FOR AGGREGATION BASED ON DIGEST?
As it is clear that aggregation based on digest will put together all statements having similar structure, this gives MySQL user (like DBA) a way to investigate what is happening with statement of specific types.

NEW TABLE(S) AND COLUMN(S)?
To store statements statistics based on digest, a new summary table is introduced named event_statements_summary_by_digest. And new columns DIGEST and DIGEST_TEXT are introduced in events_statements_current, events_statements_history and events_statements_history_long. DDL of tables look like following:
-->
    CREATE TABLE `events_statements_summary_by_digest` (
`DIGEST` varchar(32),
`DIGEST_TEXT` longtext,
`COUNT_STAR` bigint(20) unsigned NOT NULL,
`SUM_TIMER_WAIT` bigint(20) unsigned NOT NULL,
`MIN_TIMER_WAIT` bigint(20) unsigned NOT NULL,
`AVG_TIMER_WAIT` bigint(20) unsigned NOT NULL,
`MAX_TIMER_WAIT` bigint(20) unsigned NOT NULL,
`SUM_LOCK_TIME` bigint(20) unsigned NOT NULL,
`SUM_ERRORS` bigint(20) unsigned NOT NULL,
`SUM_WARNINGS` bigint(20) unsigned NOT NULL,
`SUM_ROWS_AFFECTED` bigint(20) unsigned NOT NULL,
`SUM_ROWS_SENT` bigint(20) unsigned NOT NULL,
`SUM_ROWS_EXAMINED` bigint(20) unsigned NOT NULL,
`SUM_CREATED_TMP_DISK_TABLES` bigint(20) unsigned NOT NULL,
`SUM_CREATED_TMP_TABLES` bigint(20) unsigned NOT NULL,
`SUM_SELECT_FULL_JOIN` bigint(20) unsigned NOT NULL,
`SUM_SELECT_FULL_RANGE_JOIN` bigint(20) unsigned NOT NULL,
`SUM_SELECT_RANGE` bigint(20) unsigned NOT NULL,
`SUM_SELECT_RANGE_CHECK` bigint(20) unsigned NOT NULL,
`SUM_SELECT_SCAN` bigint(20) unsigned NOT NULL,
`SUM_SORT_MERGE_PASSES` bigint(20) unsigned NOT NULL,
`SUM_SORT_RANGE` bigint(20) unsigned NOT NULL,
`SUM_SORT_ROWS` bigint(20) unsigned NOT NULL,
`SUM_SORT_SCAN` bigint(20) unsigned NOT NULL,
`SUM_NO_INDEX_USED` bigint(20) unsigned NOT NULL,
`SUM_NO_GOOD_INDEX_USED` bigint(20) unsigned NOT NULL,
'FIRST_SEEN TIMESTAMP(0) default 0,',
'LAST_SEEN TIMESTAMP(0) default 0,'
) ENGINE=PERFORMANCE_SCHEMA DEFAULT CHARSET=utf8

And
CREATE TABLE `events_statements_current` (
...
`SQL_TEXT` longtext,
`DIGEST` varchar(32),
`DIGEST_TEXT` longtext,
...
`CURRENT_SCHEMA` varchar(64) DEFAULT NULL,
) ENGINE=PERFORMANCE_SCHEMA DEFAULT CHARSET=utf8

and similar to events_statements_history and events_statements_history_long.
A new consumer is added in setup_consumer table.

mysql> select * from setup_consumers;
+--------------------------------+---------+
| NAME                           | ENABLED |
+--------------------------------+---------+
...
| statements_digest              | YES     |
+--------------------------------+---------+
12 rows in set (0.00 sec)

By default this consumer is enabled.
 
ANY NEW SERVER VARIABLES?

One new server variable is added which could be specified while starting mysql server:

performance-schema-digests-size
     Indicates the number of digest to be collected.
     Default value is 1000.

CHANGES ON USER'S SCREEN?
When server is started with performance schema, user sees following changes:

mysql> show variables like '%performance%';
+--------------------------------------------------------+---------+
| Variable_name                                          | Value   |
+--------------------------------------------------------+---------+
| performance_schema                                     | ON      |
...
| performance_schema_digests_size                        | 1000    |
...
+--------------------------------------------------------+---------+
30 rows in set (0.05 sec)

mysql> use performance_schema;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed

mysql> show tables;
+----------------------------------------------------+
| Tables_in_performance_schema                       |
+----------------------------------------------------+
...
| events_statements_summary_by_digest                |
...
+----------------------------------------------------+
52 rows in set (0.01 sec)

mysql> select * from setup_consumers;
+--------------------------------+---------+
| NAME                           | ENABLED |
+--------------------------------+---------+
...
| statements_digest              | YES     |
...
+--------------------------------+---------+
12 rows in set (0.00 sec)

All statements executed will be captured and their statistics will be stored in evens_statements_summary_by_digest. This table can be queried to see statements captured.
mysql> select DIGEST, DIGEST_TEXT, COUNT_STAR from performance_schema.events_statements_summary_by_digest;
+----------------------------------+-------------------------------------+------------+
| DIGEST                           | DIGEST_TEXT                         | COUNT_STAR |
+----------------------------------+-------------------------------------+------------+
| 021d13e766a40ee278c0d30dfcffbc58 | SELECT @ @ version_comment LIMIT ?  |          1 |
| ed3bec8844f8b8ac0632d5409e07e104 | SELECT SCHEMA ( )                   |          1 |
| 42b93d481e96b9c9b4049b9407900194 | SHOW SCHEMAS                        |          1 |
| a32df38ba4db0890b236e490d15889e2 | SHOW TABLES                         |          2 |
| 8204f53e4eae088fe292d09d13e8049d | SELECT VARIABLES LIKE ?             |          1 |
| 1c465973a09e28ad9da5f65303c66343 | SHOW VARIABLES LIKE ?               |          2 |
| b92f1ec95780e5dc6820a951f93af825 | SELECT * FROM `setup_consumer`      |          1 |
| f92a5f6103d7a9da0da846b5c5830fa0 | SELECT * FROM `setup_consumers`     |          2 |
+----------------------------------+-------------------------------------+------------+
8 rows in set (0.00 sec)

 

If number of different statements exceeds performance-schema-digests-size, statistics of all further new statements will be captured and stored at index 0 on table and as all different statements are captured there, DIGEST/DIGEST_TEXT will be NULL for this index. Based on number of statements captured in this index, DBA can decide upon an optimal value of performance-schema-digests-size.

mysql> show variables like "%digest%";
+---------------------------------+-------+
| Variable_name                   | Value |
+---------------------------------+-------+
| performance_schema_digests_size | 2     |
+---------------------------------+-------+
1 row in set (0.00 sec)

mysql> select DIGEST, DIGEST_TEXT, COUNT_STAR from performance_schema.events_statements_summary_by_digest;
+----------------------------------+-------------------------------------+------------+
| DIGEST                           | DIGEST_TEXT                         | COUNT_STAR |
+----------------------------------+-------------------------------------+------------+
| NULL                             | NULL                                |          6 |
| 021d13e766a40ee278c0d30dfcffbc58 | SELECT @ @ version_comment LIMIT ?  |          1 |
+----------------------------------+-------------------------------------+------------+
2 rows in set (0.00 sec)