One table is partitioned, one partition per day.
There is a query on this table. It often only queries the data of a certain day in the table, but almost all data in the entire partition is scanned every time. Is there any way to optimize it?
There is a large table, the amount of data generated every day is about 1 million, so the table partitioning scheme is adopted, one partition per day.
Here is the DDL of the table:
CREATE TABLE `t1` ( `id` bigint(20) NOT NULL AUTO_INCREMENT, `date` date NOT NULL, `kid` int(11) DEFAULT '0', `uid` int(11) NOT NULL, `iid` int(11) DEFAULT '0', `icnt` int(8) DEFAULT '0', `tst` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, `countp` smallint(11) DEFAULT '1', `isr` int(2) NOT NULL DEFAULT '0', `clv` int(5) NOT NULL DEFAULT '1', PRIMARY KEY (`id`,`date`), UNIQUE KEY `date` (`date`,`uid`,`iid`), KEY `date_2` (`date`,`kid`) ) ENGINE=InnoDB AUTO_INCREMENT=3180686682 DEFAULT CHARSET=utf8mb4 /*!50500 PARTITION BY RANGE COLUMNS(`date`) (PARTITION p20161201 VALUES LESS THAN ('2016-12-02') ENGINE = InnoDB, PARTITION p20161202 VALUES LESS THAN ('2016-12-03') ENGINE = InnoDB, PARTITION p20161203 VALUES LESS THAN ('2016-12-04') ENGINE = InnoDB, ...
The following slow queries often occur on this table:
SELECT ... FROM `t1` WHERE `date` = '2017-04-01' AND `icnt`> 300 AND `id` = '801301';
Want to optimize a SQL, it is to look at the general implementation plan, to observe whether or not to use the index as much as possible , at the same time pay attention to the number of scanning lines is expected , as well as whether or not a temporary table (Using temporary) or whether you need to sort (Using filesort ) , find a way to eliminate these situations.
Further optimization strategies may require adjustment of program code logic, or even technical architecture or business requirements. This action is relatively large. Generally, core problems on non-core systems will not be so violent. In most cases, you still need to rely on DBA as much as possible. Use your ingenuity to solve it.
Now, let's look at the execution plan of this SQL:
yejr@imysql.com[myDB]> EXPLAIN PARTITIONS SELECT ... FROM `t1` WHERE `date` = '2017-03-02' AND `icnt`> 100 AND `iid` = '502302'\G *************************** 1. row ******************** ******* id: 1 select_type: SIMPLE table: t1 partitions: p20170302 type: range possible_keys: date,date_2 key: date key_len: 3 ref: const rows: 9384602 Extra: Using where
This execution plan looks good, with indexes available, no temporary tables, and no filesort. However, we also noticed that the expected number of rows to be scanned is still quite large rows: 9384602, and all data in the entire partition is scanned. No wonder the efficiency is not high, it is always SLOW QUERY.
We noticed that this SQL always needs to query the data of a certain day. This table has been partitioned by day. Is it possible to ignore the time condition in the WHERE clause ?
Also, since the date condition has been removed, looking at the table DDL, it seems that there is no suitable index for the remaining conditions, right?
So, we try to create a new index:
yejr@imysql.com[myDB]> ALTER TABLE t1 ADD INDEX iid (iid, icnt);
Then, transform the SQL into the following, and then look at the execution plan:
yejr@imysql.com[myDB]> EXPLAIN PARTITIONS SELECT ... FROM `t1` partition(p2017030) WHERE `icnt`> 100 AND `iid` = '502302'\G *************************** 1. row ******************** ******* id: 1 select_type: SIMPLE table: t1 partitions: p20170302 type: ref possible_keys: date,date_2,iid key: iid key_len: 10 ref: const rows: 7800 Extra: Using where This optimizes the effect, leveraging. In fact, if the partition is not mandatory, the optimization effect can be achieved: yejr@imysql.com[myDB]> EXPLAIN PARTITIONS SELECT ... FROM `t1` WHERE `date` = '2017-03-02' AND `icnt`> 100 AND `iid` = '502302'\G *************************** 1. row ******************** ******* id: 1 select_type: SIMPLE table: t1 partitions: p20170302 type: ref possible_keys: date,date_2,iid key: iid key_len: 10 ref: NULL rows: 7800 Extra: Using where
The vast majority of SQL is completed by simple methods such as adding indexes and appropriately adjusting the SQL code (for example, adjusting the order of driving tables).
To say a few more words, when you encounter SQL optimization performance bottlenecks and want to ask the technical group for advice, please provide a few necessary information first: