SQL Query Performance Optimisation

Try not to use `SELECT *` and provide specific fields

Bad Example

1
SELECT * FROM user;

Good Example

1
SELECT id, username, tel FROM user;

Reason

Specify fields will save computational resources, and reduce network expense.
It’s possible using pre-built table index to reduce table returns, and improve query efficiency.

Avoid using `OR` in the where clause to connect conditions

Bad Example

1
2
SELECT id, username, tel FROM user
WHERE id=1 OR salary=500;

Good Example

Use `UNION ALL` to concatenate results

1
2
3
4
5
SELECT id, username, tel FROM user
WHERE id=1
UNION ALL
SELECT id, username, tel FROM user
WHERE salary=500;

Use two separate SQL

1
2
SELECT id, username, tel FROM user
WHERE id=1

1
2
SELECT id, username, tel FROM user
WHERE salary=500;

Reason

The use of OR may invalidate the table index, and thus will do full table scan.
That is to say, the whole process requires three steps: 1) full table scan, 2) index scan, 3) results merge. If it had started with a full table scan, it is done with one scan directly.
Although some databases (i.e. MySQL) has an optimizer that considered in terms of efficiency and cost, encountering or conditions, the index may still be invalid.

Use `numeric` values instead of character string as much as possible

Good Example

primary key (ID). primary key gives priority to the use of numeric type INT, TINYINT.
gender (sex). 0 represents female, and 1 represents male. Database does not have boolean type, MySQL is recommended to use TINYINT.

Reason

Because the query engine compares each character in the character string one by one when processing queries and joins.
While for numeric types, it is enough to compare only once.
Characters will reduce the performance of queries and joins, and will also increase the storage expense.

Use `VARCHAR` instead of char as much as possible

Bad Example

1
address CHAR(100) DEFAULT NULL COMMENT 'address'

Good Example

1
address VARCHAR(100) DEFAULT NULL COMMENT 'address'

Reason

VARCHAR is a variable-length field are stored according to the actual length of the data content, with small storage space, which can save storage expense.
CHAR is stored according to the declared size, it is insufficient to fill in the spaces.
It’s more efficient to search within a relatively small field.

Differences between `CHAR` and `VARCHAR2`

The length of CHAR is fixed, while the length of VARCHAR2 can be changed.

For example, store string 101 for CHAR(10), it means that the character you store will take up 10 characters (including 7 empty characters), and it is occupied by space in the database.

While the same string with VARCHAR2(10) only take up 3 bytes of length, 10 is just the maximum. When the character you store is less than 10, it will be stored according to the actual length.

When do we use CHAR and when do we use VARCHAR2?

CHAR and VARCHAR2 is contradictory unity, the two are complementary relationship. VARCHAR2 is more space-saving than CHAR, but a little less efficient than CHAR. If you want to get efficiency, you must sacrifice a little space. This is what we often say in database design, use the space in exchange for efficiency.

Although VARCHAR2 is more space-saving than CHAR, but if a VARCHAR2 column is frequently modified, and each time the length of the modified data is different, this will cause Row Migrating phenomenon, which will cause extra database I/O and we need to avoid in database design.

On the other hand, CHAR will automatically fill in the spaces, because you insert to a CHAR field automatically supplemented by spaces. But SELECT after the space is not deleted, so when querying the CHAR type you must remember to use TRIM() function.

It is for the above reasons that fixed-width storage space can lead to tables and associated indexes that are much larger than usual, and is accompanied by problem with bind variables.

So it is important to avoid using CHAR types no matter what circumstances.

Reference

1p3a: How to Optimise SQL Performance

Try not to use SELECT * and provide specific fields#

Avoid using OR in the where clause to connect conditions#

Use UNION ALL to concatenate results#

Use two separate SQL#

Use numeric values instead of character string as much as possible#

Use VARCHAR instead of char as much as possible#

Differences between CHAR and VARCHAR2#

Reference#