Optimize MySQL

When MySQL uses indexes

  • Using >, >=, =, <, <=, IF NULL and BETWEEN on a key.
    • SELECT * FROM table_name WHERE key_part1=1 and key_part2 > 5;
    • SELECT * FROM table_name WHERE key_part1 IS NULL;
  • When you use a LIKE that doesn't start with a wildcard.
    • SELECT * FROM table_name WHERE key_part1 LIKE 'jani%'
  • Retrieving rows from other tables when performing joins.
    • SELECT * from t1,t2 where t1.col=t2.key_part
  • Find the MAX() or MIN() value for a specific index.
    • SELECT MIN(key_part2),MAX(key_part2) FROM table_name where key_part1=10
  • ORDER BY or GROUP BY on a prefix of a key.
    • SELECT * FROM foo ORDER BY key_part1,key_part2,key_part3
  • When all columns used in the query are part of one key.
    • SELECT key_part3 FROM table_name WHERE key_part1=1

When MySQL doesn't use an index

  • Indexes are NOT used if MySQL can calculate that it will probably be faster to scan the whole table.
  • For example if key_part1 is evenly distributed between 1 and 100, it's not good to use an index in the following query:
    • SELECT * FROM table_name where key_part1 > 1 and key_part1 < 90
  • If you are using HEAP tables and you don't search on all key parts with =
  • When you use ORDER BY on a HEAP table
  • If you are not using the first key part
    • SELECT * FROM table_name WHERE key_part2=1
  • If you are using LIKE that starts with a wildcard
    • SELECT * FROM table_name WHERE key_part1 LIKE '%jani%'
  • When you search on one index and do an ORDER BY on another
    • SELECT * from table_name WHERE key_part1 = # ORDER BY key2

Learn to use EXPLAIN

Use EXPLAIN on every query that you think is too slow!

mysql> explain select t3.DateOfAction, t1.TransactionID
	-> from t1 join t2 join t3
	-> where t2.ID = t1.TransactionID and t3.ID = t2.GroupID      
	-> order by t3.DateOfAction, t1.TransactionID;  
+-------+--------+---------------+---------+---------+------------------+------+---------------------------------+  
| table | type   | possible_keys | key     | key_len | ref              | rows | Extra                           |  
+-------+--------+---------------+---------+---------+------------------+------+---------------------------------+  
| t1    | ALL    | NULL          | NULL    |    NULL | NULL             |   11 | Using temporary; Using filesort |  
| t2    | ref    | ID            | ID      |       4 | t1.TransactionID |   13 |                                 |  
| t3    | eq_ref | PRIMARY       | PRIMARY |       4 | t2.GroupID       |    1 |                                 |  
+-------+--------+---------------+---------+---------+------------------+------+---------------------------------+  

Types ALL and range signal a potential problem.


Learn to use SHOW PROCESSLIST

Use SHOW processlist to find out what is going on:

+----+-------+-----------+----+---------+------+--------------+-------------------------------------+  
| Id | User  | Host      | db | Command | Time | State        | Info                                |  
+----+-------+-----------+----+---------+------+--------------+-------------------------------------+  
| 6  | monty | localhost | bp | Query   | 15   | Sending data | select * from station,station as s1 |  
| 8  | monty | localhost |    | Query   | 0    |              | show processlist                    |  
+----+-------+-----------+----+---------+------+--------------+-------------------------------------+  

Use KILL in mysql or mysqladmin to kill off runaway threads.


How to find out how MySQL solves a query

Run the following commands and try to understand the output:

  • SHOW VARIABLES;
  • SHOW COLUMNS FROM ...\G
  • EXPLAIN SELECT ...\G
  • FLUSH STATUS;
  • SELECT ...;
  • SHOW STATUS;

MySQL is extremely good

  • For logging.
  • When you do many connects; connect is very fast.
  • Where you use SELECT and INSERT at the same time.
  • When you don't combine updates with selects that take a long time.
  • When most selects/updates are using unique keys.
  • When you use many tables without long conflicting locks.
  • When you have big tables (MySQL uses a very compact table format).

Things to avoid with MySQL

  • Updates to a table or INSERT on a table with deleted rows, combined with SELECTS that take a long time.
  • HAVING on things you can have in a WHERE clause.
  • JOINS without using keys or keys which are not unique enough.
  • JOINS on columns that have different column types.
  • Using HEAP tables when not using a full key match with =
  • Forgetting a WHERE clause with UPDATE or DELETE in the MySQL monitor. If you tend to do this, use the --i-am-a-dummy option to the mysq client.

Different locks in MySQL

  • Internal table locks.
  • LOCK TABLES (Works on all table types)
  • GET_LOCK()/RELEASE_LOCK()
  • Page locks (for BDB tables)
  • ALTER TABLE also does a table lock on BDB tables.
  • LOCK TABLES gives you multiple readers on a table or one writer.
  • Normally a WRITE lock has higher priority than a READ lock to avoid starving the writers. For writers that are not important one can use the LOW_PRIORITY keyword to let the lock handler prefer readers.
      UPDATE LOW_PRIORITY SET value=10 WHERE id=10;  

Tricks to give MySQL more information to solve things better

Note that you can always comment out a MySQL feature to make the query portable:

SELECT /*! SQL_BUFFER_RESULTS */ ...  
  • SELECT SQL_BUFFER_RESULTS ...
    Will force MySQL to make a temporary result set. As soon as the temporary set is done, all locks on the tables are released. This can help when you get a problem with table locks or when it takes a long time to transfer the result to the client.
  • SELECT SQL_SMALL_RESULT ... GROUP BY ...
    To tell the optimizer that the result set will only contain a few rows.
  • SELECT SQL_BIG_RESULT ... GROUP BY ...
    To tell the optimizer that the result set will contain many rows.
  • SELECT STRAIGHT_JOIN ...
    Forces the optimizer to join the tables in the order in which they are listed in the FROM clause.
  • SELECT ... FROM table_name [USE INDEX (index_list) | IGNORE INDEX (index_list)] table_name2
    Forces MySQL to use/ignore the listed indexes.

Example of doing transactions

  • How to do a transaction with MyISAM tables:
    mysql> LOCK TABLES trans READ, customer WRITE;  mysql> select sum(value) from trans where customer_id=some_id;  mysql> update customer set total_value=sum_from_previous_statement             where customer_id=some_id;  mysql> UNLOCK TABLES;  
  • How to do a transaction with Berkeley DB tables:
    mysql> BEGIN WORK;   mysql> select sum(value) from trans where customer_id=some_id;  mysql> update customer set total_value=sum_from_previous_statement             where customer_id=some_id;  mysql> COMMIT;  
  • Note that you can often avoid transactions altogether by doing:
    UPDATE customer SET value=value+new_value WHERE customer_id=some_id;  

Example of using REPLACE

REPLACE works exactly like INSERT, except that if an old record in the table has the same value as a new record on a unique index, the old record is deleted before the new record is inserted. Instead of using

  SELECT 1 FROM t1 WHERE key=#    
	IF found-row      
		LOCK TABLES t1      
		DELETE FROM t1 WHERE key1=#      
		INSERT INTO t1 VALUES (...)      
		UNLOCK TABLES t1;    
	ENDIF  

Do

  REPLACE INTO t1 VALUES (...)      

General tips

    • Use short primary keys. Use numbers, not strings, when joining tables.
    • When using multi-part keys, the first part should be the most-used key.
    • When in doubt, use columns with more duplicates first to get better key compression.
    • If you run the client and MySQL server on the same machine, use sockets instead of TCP/IP when connecting to MySQL (this can give you up to a 7.5 % improvement). You can do this by specifying no hostname or localhost when connecting to the MySQL server.
    • Use --skip-locking (default on some OSes) if possible. This will turn off external locking and will give better performance.
    • Use application-level hashed values instead of using long keys:
SELECT * FROM table_name WHERE hash=MD5(concat(col1,col2)) AND    
col_1='constant' AND col_2='constant'  
  • Store BLOB's that you need to access as files in files. Store only the file name in the database.
  • It is faster to remove all rows than to remove a large part of the rows.
  • If SQL is not fast enough, take a look at the lower level interfaces to access the data.

Benefits of using MySQL 3.23

  • MyISAM ; Portable BIG table format
  • HEAP ; In memory tables
  • Berkeley DB ; Transactional tables from Sleepycat.
  • A lot of raised limits
  • Dynamic character sets
  • More STATUS variables.
  • CHECK and REPAIR table.
  • Faster GROUP BY and DISTINCT
  • LEFT JOIN ... IF NULL optimization.
  • CREATE TABLE ... SELECT
  • CREATE TEMPORARY table_name (...)
  • Automatic conversion of temporary HEAP to MyISAM tables
  • Replication
  • mysqlhotcopy script.

Important features that we are actively working on

  • Improving transactions
  • Fail safe replication
  • Text searching
  • Delete with many tables (Updates with many tables will be done after this.)
  • Better key cache
  • Atomic RENAME (RENAME TABLE foo as foo_old, foo_new as foo)
  • A query cache
  • MERGE TABLES
  • A better GUI client