Skip to content
bobby_dreamer

MySQL Tips and Tricks

mysql3 min read

Last updated : 31/Oct/2021

Events are in reverse chronological order.



Speeding up mySQL LOAD

My scenario, i had around 1200 csv files to be loaded and each load takes between 15 seconds(70%) and some takes around 1m30s to 4m30s. My estimation was it should complete by approx 8hrs. Why its taking so long ? Is because of Primary Key. MySQL checks the PRIMARY KEY integrity with each INSERT.

I know the data i am going to load into the table is going to be unique, so i dropped all the indexes.

1drop index uix1 on <<tablename>>;

What would have run for 8 hrs completed in less than 5mins

Same is the case with mass loading as well. With unique indexes it took.

1mysql> LOAD DATA INFILE 'D:/tmp/nse_daily_temp.csv'
2 -> IGNORE INTO TABLE nse_daily_history2
3 -> COLUMNS TERMINATED BY '|' LINES TERMINATED BY '\r\n'
4 -> (symbol, series, OPEN, HIGH, LOW, CLOSE, LAST, PREVCLOSE, TOTTRDQTY, TOTTRDVAL, ts, TOTALTRADES, ISIN)
5 -> ;
6Query OK, 1730759 rows affected (3 hours 48 min 23.27 sec)
7Records: 1730759 Deleted: 0 Skipped: 0 Warnings: 0

It could have taken less time without indexes.


The total number of locks exceeds the lock table size

To resolve,

  1. Update C:\ProgramData\MySQL\MySQL Server 8.0\my.ini
  2. innodb_buffer_pool_size from 8M to 128M. innodb_buffer_pool_size=128M
  3. Go to windows services and restart the MySQL server

Never do lengthy INSERT via SELECT as rollbacks are painfully slow

It took around 8hrs to rollback. If rows > 50k LOAD in mySQL.

Below is a snip from output SHOW ENGINE INNODB STATUS \G

1---TRANSACTION 59090816, ACTIVE 1856 sec recovered trx
2ROLLING BACK 1 lock struct(s), heap size 1136, 0 row lock(s), undo log entries 797821

Lock wait timeout exceeded
1ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction

Got the above error message, when executing an UPDATE statement in MySQL terminal. Upon a bit of analysis found that, a program was abnormally terminated and it had the thread still active, could see in SHOW PROCESSLIST;. So after killing the thread, UPDATE statement was reexecuted and completed successfully in few seconds.

1mysql> show processlist;
2+------+-----------------+-----------------+------+---------+--------+------------------------+------------------------------------------------------------------------------------------------------+
3| Id | User | Host | db | Command | Time | State | Info |
4+------+-----------------+-----------------+------+---------+--------+------------------------+------------------------------------------------------------------------------------------------------+
5| 4 | event_scheduler | localhost | NULL | Daemon | 930419 | Waiting on empty queue | NULL |
6| 2372 | sushanth | localhost:50242 | test | Query | 732 | Sending data | update qReport_JSON a, ( select a.lastupdated, a.sym, json_object('lastupdated', a.lastupdated |
7| 2373 | sushanth | localhost:50581 | test | Query | 0 | starting | show processlist |
8+------+-----------------+-----------------+------+---------+--------+------------------------+------------------------------------------------------------------------------------------------------+
93 rows in set (0.00 sec)
10
11mysql> kill 2372;
12Query OK, 0 rows affected (0.00 sec)
13
14mysql> show processlist;
15+------+-----------------+-----------------+------+---------+--------+------------------------+------------------+
16| Id | User | Host | db | Command | Time | State | Info |
17+------+-----------------+-----------------+------+---------+--------+------------------------+------------------+
18| 4 | event_scheduler | localhost | NULL | Daemon | 930427 | Waiting on empty queue | NULL |
19| 2373 | sushanth | localhost:50581 | test | Query | 0 | starting | show processlist |
20+------+-----------------+-----------------+------+---------+--------+------------------------+------------------+
212 rows in set (0.00 sec)

Creating test user with full access on test database
1CREATE USER 'test'@'localhost' IDENTIFIED BY 'test01';
2GRANT ALL PRIVILEGES ON TEST.* To 'test'@'localhost' ;
3GRANT INSERT, SELECT, DELETE, UPDATE ON test.* TO 'test'@'localhost' IDENTIFIED BY 'test01';
4GRANT FILE ON *.* TO 'test'@'localhost';
5GRANT EXECUTE ON PROCEDURE TEST.* TO 'test'@'localhost' identified by 'test01';
6FLUSH PRIVILEGES;

After creating the user, can be tested by opening a new command prompt,

1C:\Users\Sushanth>mysql -utest -p
2Enter password: ******
3Welcome to the MySQL monitor. Commands end with ; or \g.
4Your MySQL connection id is 344
5Server version: 8.0.12 MySQL Community Server - GPL
6
7Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.
8
9Oracle is a registered trademark of Oracle Corporation and/or its
10affiliates. Other names may be trademarks of their respective
11owners.
12
13Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
14
15mysql> show databases;
16+--------------------+
17| Database |
18+--------------------+
19| information_schema |
20| sakila |
21| test |
22+--------------------+
233 rows in set (0.07 sec)

For python, you may need to use mysql_native_password, i have written a separate blog post for this.

1CREATE USER 'snake_charmer'@'localhost' IDENTIFIED BY 'pepe' PASSWORD EXPIRE NEVER;
2ALTER USER 'snake_charmer'@'localhost' IDENTIFIED WITH mysql_native_password BY 'pepe';
3GRANT ALL PRIVILEGES ON TEST.* To 'snake_charmer'@'localhost' ;

MySQL commonly used commands
1-- Both the commands display similar outputs. I prefer using indexes.
2show indexes from actor;
3show keys from actor;
4
5-- Shows all the databases in the current server
6show databases;
7
8-- Shows all tables in the current database
9show tables;
10show tables in sakila;
11
12-- Describe command shows the structure of table which include name of the column, data-type of column and the nullability
13desc actor;
14
15-- Displays threads and their details(user, host, db, command, time and state) which are currently running in MySQL
16show processlist;
17
18-- Display the DDL of the table actor
19show create table actor;
20
21-- Shows the accesspath of the MySQL query
22explain select * from actor where actor_id=5 ;
23
24-- Show Server status, can look for current long running transactions to evaluate how long its going to run
25SHOW ENGINE INNODB STATUS \G

Virtual columns

They appear to be normal columns but their values are calculated or derived from other columns and they are not materialized insense they are not stored in disk.

In my usecase, i find them extremely useful when dealing with JSON columns as JSON queries are tend to get lot bigger and complex. So, this virtal column greatly helps. Heres a sample,

1drop table if exists test.products1 ;
2CREATE TABLE test.products1(
3 `id` INT UNSIGNED NOT NULL AUTO_INCREMENT ,
4 `name` VARCHAR(250) NOT NULL ,
5 `brand_id` INT UNSIGNED NOT NULL ,
6 `category_id` INT UNSIGNED NOT NULL ,
7 `attributes` JSON NOT NULL ,
8 PRIMARY KEY(`id`)
9);
10
11INSERT INTO test.`products1`(`name` ,`brand_id` ,`category_id` ,`attributes`)
12VALUES(
13 'Bravia' ,'1' ,'1' ,
14 '{"screen": "25 inch"
15 , "resolution": "1366 x 768 pixels"
16 , "ports": {"hdmi": 1, "usb": 0}
17 , "speakers": {"left": "5 watt", "right": "5 watt"}}'
18);
19
20INSERT INTO test.`products1`(`name` ,`brand_id` ,`category_id` ,`attributes`)
21VALUES(
22 'Proton' ,'1' ,'1' ,
23 '{"screen": "20 inch"
24 , "resolution": "1280 x 720 pixels"
25 , "ports": {"hdmi": 0, "usb": 0}
26 , "speakers": {"left": "5 watt", "right": "5 watt"}}'
27);

Here is a simple JSON query retreving usb > -1

1SELECT * FROM test.`products1`
2WHERE JSON_EXTRACT(`attributes` , '$.ports.usb') > -1;
3mysql> SELECT * FROM test.`products1`
4 -> WHERE JSON_EXTRACT(`attributes` , '$.ports.usb') > -1;
5+----+--------+----------+-------------+---------------------------------------------------------------------------------------------------------------------------------------------+
6| id | name | brand_id | category_id | attributes |
7+----+--------+----------+-------------+---------------------------------------------------------------------------------------------------------------------------------------------+
8| 1 | Bravia | 1 | 1 | {"ports": {"usb": 0, "hdmi": 1}, "screen": "25 inch", "speakers": {"left": "5 watt", "right": "5 watt"}, "resolution": "1366 x 768 pixels"} |
9| 2 | Proton | 1 | 1 | {"ports": {"usb": 0, "hdmi": 0}, "screen": "20 inch", "speakers": {"left": "5 watt", "right": "5 watt"}, "resolution": "1280 x 720 pixels"} |
10+----+--------+----------+-------------+---------------------------------------------------------------------------------------------------------------------------------------------+
112 rows in set (0.00 sec)

Lets add virtual column, to simplify the query,

1ALTER TABLE test.products1
2 ADD COLUMN usb integer GENERATED ALWAYS AS (JSON_EXTRACT(`attributes` , '$.ports.usb'));
3mysql> desc products1;
4+-------------+------------------+------+-----+---------+-------------------+
5| Field | Type | Null | Key | Default | Extra |
6+-------------+------------------+------+-----+---------+-------------------+
7| id | int(10) unsigned | NO | PRI | NULL | auto_increment |
8| name | varchar(250) | NO | | NULL | |
9| brand_id | int(10) unsigned | NO | | NULL | |
10| category_id | int(10) unsigned | NO | | NULL | |
11| attributes | json | NO | | NULL | |
12| usb | int(11) | YES | | NULL | VIRTUAL GENERATED |
13+-------------+------------------+------+-----+---------+-------------------+
146 rows in set (0.00 sec)

Now all we have to do is

1mysql> select id, name, brand_id, category_id, usb from products1 where usb > -1;
2+----+--------+----------+-------------+------+
3| id | name | brand_id | category_id | usb |
4+----+--------+----------+-------------+------+
5| 1 | Bravia | 1 | 1 | 0 |
6| 2 | Proton | 1 | 1 | 0 |
7+----+--------+----------+-------------+------+
82 rows in set (0.00 sec)

Viewing specfic range of output by using OFFSET

Using offset we can review a big result or table part by part in the terminal which doesn't have scrollbars

1mysql> SELECT * FROM actor LIMIT 15;
2+----------+------------+--------------+---------------------+
3| actor_id | first_name | last_name | last_update |
4+----------+------------+--------------+---------------------+
5| 2 | NICK | WAHLBERG | 2016-12-02 00:00:00 |
6| 3 | ED | CHASE | 2016-12-02 00:00:00 |
7| 4 | Jenni | Lewis | 2017-04-07 19:35:18 |
8| 5 | JOHNNY | LOLLOBRIGIDA | 2016-12-02 00:00:00 |
9| 6 | BETTE | NICHOLSON | 2016-12-02 00:00:00 |
10| 7 | GRACE | MOSTEL | 2016-12-02 00:00:00 |
11| 8 | MATTHEW | JOHANSSON | 2016-12-02 00:00:00 |
12| 10 | CHRISTIAN | GABLE | 2016-12-02 00:00:00 |
13| 11 | ZERO | CAGE | 2016-12-02 00:00:00 |
14| 12 | KARL | BERRY | 2016-12-02 00:00:00 |
15| 13 | UMA | WOOD | 2016-12-02 00:00:00 |
16| 14 | VIVIEN | BERGEN | 2016-12-02 00:00:00 |
17| 15 | CUBA | OLIVIER | 2016-12-02 00:00:00 |
18| 16 | FRED | COSTNER | 2016-12-02 00:00:00 |
19| 17 | HELEN | VOIGHT | 2016-12-02 00:00:00 |
20+----------+------------+--------------+---------------------+
2115 rows in set (0.00 sec)

What below query does is,

  • LIMIT 5 : Limits number of rows to be retreived to 5
  • OFFSET 5: Retreive rows after skipping 5 rows from the resultset
1mysql> SELECT * FROM actor LIMIT 5 offset 5;
2+----------+------------+-----------+---------------------+
3| actor_id | first_name | last_name | last_update |
4+----------+------------+-----------+---------------------+
5| 7 | GRACE | MOSTEL | 2016-12-02 00:00:00 |
6| 8 | MATTHEW | JOHANSSON | 2016-12-02 00:00:00 |
7| 10 | CHRISTIAN | GABLE | 2016-12-02 00:00:00 |
8| 11 | ZERO | CAGE | 2016-12-02 00:00:00 |
9| 12 | KARL | BERRY | 2016-12-02 00:00:00 |
10+----------+------------+-----------+---------------------+
115 rows in set (0.00 sec)

Output in vertical format

Usually when we are retreiving data from table we expect to be in table format but at certain times, when we want to report or note it down, it will be better if its in 'vertial format'.

Default format

1mysql> select * from actor limit 1;
2+----------+------------+-----------+---------------------+
3| actor_id | first_name | last_name | last_update |
4+----------+------------+-----------+---------------------+
5| 2 | NICK | WAHLBERG | 2016-12-02 00:00:00 |
6+----------+------------+-----------+---------------------+
71 row in set (0.00 sec)

Vertical format

1mysql> select * from actor limit 1\G
2*************************** 1. row ***************************
3 actor_id: 2
4 first_name: NICK
5 last_name: WAHLBERG
6last_update: 2016-12-02 00:00:00
71 row in set (0.00 sec)

on duplicate update
1drop table if exists test_abc;
2create table test_abc(
3lastUpdated timestamp not null DEFAULT CURRENT_timestamp ON UPDATE CURRENT_timestamp
4,name VARCHAR(100) not null default ''
5,place VARCHAR(100) not null default ''
6,animal VARCHAR(100) not null default ''
7,things VARCHAR(100) not null default ''
8);
9create unique index uix1 on test_abc(name);
10
11insert ignore into test_abc(name, place, animal, things) values
12 ('Sushanth', 'Chennai', 'Human', 'Sony Vaio user');
13insert ignore into test_abc(name, place, animal, things) values
14 ('Bobby', 'Chennai', 'Human', 'Windows user');
15
16select * from test_abc;
17+---------------------+----------+---------+--------+----------------+
18| lastUpdated | name | place | animal | things |
19+---------------------+----------+---------+--------+----------------+
20| 2020-04-26 12:14:13 | Bobby | Chennai | Human | Windows user |
21| 2020-04-26 12:14:13 | Sushanth | Chennai | Human | Sony Vaio user |
22+---------------------+----------+---------+--------+----------------+
232 rows in set (0.00 sec)
24
25insert into test_abc(name, place, animal, things) values('Bobby', 'Earth', 'Human', 'Windows user')
26 on duplicate key update place = values(place);
27Query OK, 2 rows affected (0.13 sec)
28
29select * from test_abc;
30+---------------------+----------+---------+--------+----------------+
31| lastUpdated | name | place | animal | things |
32+---------------------+----------+---------+--------+----------------+
33| 2020-04-26 12:14:29 | Bobby | Earth | Human | Windows user |
34| 2020-04-26 12:14:13 | Sushanth | Chennai | Human | Sony Vaio user |
35+---------------------+----------+---------+--------+----------------+
362 rows in set (0.00 sec)
37
38drop table test_abc;

VALUES(col_name) in the ON DUPLICATE KEY UPDATE clause refers to the value of col_name that would be inserted, had no duplicate-key conflict occurred

Above when it says, "2 rows affected", its not two rows, its just value 2, for example,

  • if 0, had come, it means nothing is udpated
  • 1 means row is inserted
  • 2 means row is updated.

Its confusing a bit here but it is how it is.


Row change timestamp

Represents timestamp, when the row was last changed.

1drop table if exists test_abc;
2create table test_abc(
3lastUpdated timestamp not null DEFAULT CURRENT_timestamp ON UPDATE CURRENT_timestamp
4,name VARCHAR(100) not null default ''
5,place VARCHAR(100) not null default ''
6,animal VARCHAR(100) not null default ''
7,things VARCHAR(100) not null default ''
8);
9create unique index uix1 on test_abc(name);
10
11insert ignore into test_abc(name, place, animal, things) values
12 ('Sushanth', 'Chennai', 'Human', 'Sony Vaio user');
13insert ignore into test_abc(name, place, animal, things) values
14 ('Bobby', 'Chennai', 'Human', 'Windows user');
15
16select * from test_abc;
17+---------------------+----------+---------+--------+----------------+
18| lastUpdated | name | place | animal | things |
19+---------------------+----------+---------+--------+----------------+
20| 2020-04-26 11:57:30 | Bobby | Chennai | Human | Windows user |
21| 2020-04-26 11:57:30 | Sushanth | Chennai | Human | Sony Vaio user |
22+---------------------+----------+---------+--------+----------------+
23
24update test_abc SET place = 'Earth' where name = 'Bobby';
25
26select * from test_abc;
27+---------------------+----------+---------+--------+----------------+
28| lastUpdated | name | place | animal | things |
29+---------------------+----------+---------+--------+----------------+
30| 2020-04-26 11:57:35 | Bobby | Earth | Human | Windows user |
31| 2020-04-26 11:57:30 | Sushanth | Chennai | Human | Sony Vaio user |
32+---------------------+----------+---------+--------+----------------+
332 rows in set (0.00 sec)
34
35drop table test_abc;

Running MySQL script from windows prompt

Below is the load card to truncate and load data into mySQL table

LR-N_TI.sql
1truncate table N_TI
2LOAD DATA INFILE 'D:/BigData/12. Python/data/TI.csv'
3IGNORE INTO TABLE N_TI
4COLUMNS TERMINATED BY '|' LINES TERMINATED BY '\r\n'
5IGNORE 1 LINES
6(ts, unitName, TI180d, p180d, TI3d, p3d, TI5d, p5d, TI15d, p15d, TI30d, p30d)
7;

Below windows batch file to execute the above load control card,

dataLoads.bat
1mysql -uroot -predpill test < ../data/loadcards/LR-N_TI.sql > ../data/outputs/LR-N_TI.out

FYI : Load card - Its a mainframe term meaning a member(equivalent to file in folder) containing load statement.


Updating column values during LOAD
1LOAD DATA INFILE 'D:/BigData/12. Python/data/TI.csv'
2IGNORE INTO TABLE N_TI
3COLUMNS TERMINATED BY '|' LINES TERMINATED BY '\r\n'
4IGNORE 1 LINES
5(ts, unitName, TI180d, p180d, TI3d, p3d, TI5d, p5d, TI15d, p15d, TI30d, p30d)
6;

Line number and description,

  1. Input file path, its Windows, so remember its a forward slash
  2. IGNORE ignores duplicates while loading into table
  3. Columns are terminated by pipe symbol(|) rather than comma(,) as unitName can contain commas.
  4. Ignore the header line
  5. Table Column names as they appear in the delimited file

Updating values while loading the table

SET clause can be used for this purpose, all MySQL functions can be used.

1LOAD DATA INFILE 'D:/BigData/12. Python/data/TI.csv'
2IGNORE INTO TABLE N_TI
3COLUMNS TERMINATED BY '|' LINES TERMINATED BY '\r\n'
4IGNORE 1 LINES
5(ts, @var_unitName, @var_status, TI180d, p180d, TI3d, p3d, TI5d, p5d, TI15d, p15d, TI30d, p30d)
6SET
7ts = str_to_date(@ts, '%d-%b-%Y')
8, unitName = trim(@var_unitName)
9, status = if(@var_status = '',null,@var_status);

Thanks