— mysql — 3 min read
Last updated : 31/Oct/2021
Events are in reverse chronological order.
My scenario: I had around 1,200 CSV files to load. Most loads (about 70%) took around 15 seconds each, while some took anywhere from 1m30s to 4m30s, so my estimate was that the whole run would take approximately 8 hours. Why was it so slow? Because of the primary key: MySQL checks PRIMARY KEY integrity on every INSERT.
I knew the data I was going to load into the table was unique, so I dropped all the indexes:
```sql
drop index uix1 on <<tablename>>;
```
What would have run for 8 hours completed in less than 5 minutes.
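The full drop-load-recreate cycle looks roughly like this; a sketch, where the indexed column (`symbol`) is an assumption for illustration:

```sql
-- Drop the unique index before the bulk load: no per-row uniqueness
-- check is performed during the 1200 CSV loads.
DROP INDEX uix1 ON nse_daily_history2;

-- ... run all the CSV loads here ...

-- Rebuild the index once, after all the data is in. This is a single
-- bulk index build instead of millions of incremental checks.
CREATE UNIQUE INDEX uix1 ON nse_daily_history2 (symbol);
```

This only works safely because the incoming data is known to be unique; otherwise the index rebuild will fail on duplicates.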
The same holds for mass loading. With the unique indexes in place, it took:
```
mysql> LOAD DATA INFILE 'D:/tmp/nse_daily_temp.csv'
    -> IGNORE INTO TABLE nse_daily_history2
    -> COLUMNS TERMINATED BY '|' LINES TERMINATED BY '\r\n'
    -> (symbol, series, OPEN, HIGH, LOW, CLOSE, LAST, PREVCLOSE, TOTTRDQTY, TOTTRDVAL, ts, TOTALTRADES, ISIN)
    -> ;
Query OK, 1730759 rows affected (3 hours 48 min 23.27 sec)
Records: 1730759  Deleted: 0  Skipped: 0  Warnings: 0
```
It would have taken far less time without the indexes.

Another fix for slow loads: increase the InnoDB buffer pool. In `C:\ProgramData\MySQL\MySQL Server 8.0\my.ini`, change `innodb_buffer_pool_size` from 8M to 128M:

```ini
innodb_buffer_pool_size=128M
```
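On MySQL 5.7.5 and later the buffer pool is a dynamic variable, so it can also be resized at runtime without editing my.ini and restarting; a sketch:

```sql
-- Check the current size (value is in bytes)
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';

-- Resize online to 128M (128 * 1024 * 1024 bytes)
SET GLOBAL innodb_buffer_pool_size = 134217728;
```

Note the runtime change does not persist across restarts unless it is also written to my.ini (or set with `SET PERSIST` on 8.0).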
A killed bulk INSERT took around 8 hours to roll back. Lesson learned: if you are loading more than ~50k rows, use LOAD DATA in MySQL rather than row-by-row INSERTs.
Below is a snippet from the output of `SHOW ENGINE INNODB STATUS \G`:
```
---TRANSACTION 59090816, ACTIVE 1856 sec recovered trx
ROLLING BACK 1 lock struct(s), heap size 1136, 0 row lock(s), undo log entries 797821
```
```
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
```
I got the above error message when executing an UPDATE statement in the MySQL terminal. A bit of analysis showed that a program had terminated abnormally and its thread was still active, visible in `SHOW PROCESSLIST;`. After killing that thread, the UPDATE statement was re-executed and completed successfully in a few seconds.
```
mysql> show processlist;
+------+-----------------+-----------------+------+---------+--------+------------------------+------------------------------------------------------------------------------------------------+
| Id   | User            | Host            | db   | Command | Time   | State                  | Info                                                                                           |
+------+-----------------+-----------------+------+---------+--------+------------------------+------------------------------------------------------------------------------------------------+
|    4 | event_scheduler | localhost       | NULL | Daemon  | 930419 | Waiting on empty queue | NULL                                                                                           |
| 2372 | sushanth        | localhost:50242 | test | Query   |    732 | Sending data           | update qReport_JSON a, ( select a.lastupdated, a.sym, json_object('lastupdated', a.lastupdated |
| 2373 | sushanth        | localhost:50581 | test | Query   |      0 | starting               | show processlist                                                                               |
+------+-----------------+-----------------+------+---------+--------+------------------------+------------------------------------------------------------------------------------------------+
3 rows in set (0.00 sec)

mysql> kill 2372;
Query OK, 0 rows affected (0.00 sec)

mysql> show processlist;
+------+-----------------+-----------------+------+---------+--------+------------------------+------------------+
| Id   | User            | Host            | db   | Command | Time   | State                  | Info             |
+------+-----------------+-----------------+------+---------+--------+------------------------+------------------+
|    4 | event_scheduler | localhost       | NULL | Daemon  | 930427 | Waiting on empty queue | NULL             |
| 2373 | sushanth        | localhost:50581 | test | Query   |      0 | starting               | show processlist |
+------+-----------------+-----------------+------+---------+--------+------------------------+------------------+
2 rows in set (0.00 sec)
```
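When the blocker is not obvious from SHOW PROCESSLIST, the sys schema (MySQL 5.7+) can show directly who is waiting on whom; a sketch, not tied to the session above:

```sql
-- Which transaction is waiting, and which one is blocking it
SELECT waiting_pid, blocking_pid, wait_age, locked_table
FROM sys.innodb_lock_waits;

-- Then terminate the blocker, using the processlist id from blocking_pid:
-- KILL <blocking_pid>;
```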
In MySQL 8.0, GRANT no longer accepts `IDENTIFIED BY`; create the user first, then grant:

```sql
CREATE USER 'test'@'localhost' IDENTIFIED BY 'test01';

-- grant everything on the test database...
GRANT ALL PRIVILEGES ON test.* TO 'test'@'localhost';
-- ...or only the privileges actually needed
GRANT INSERT, SELECT, DELETE, UPDATE ON test.* TO 'test'@'localhost';

GRANT FILE ON *.* TO 'test'@'localhost';        -- needed for LOAD DATA INFILE
GRANT EXECUTE ON test.* TO 'test'@'localhost';  -- run stored procedures in test
FLUSH PRIVILEGES;
```
After creating the user, it can be tested by opening a new command prompt:
```
C:\Users\Sushanth>mysql -utest -p
Enter password: ******
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 344
Server version: 8.0.12 MySQL Community Server - GPL

Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| sakila             |
| test               |
+--------------------+
3 rows in set (0.07 sec)
```
For Python, you may need to use `mysql_native_password`; I have written a separate blog post about this.
```sql
CREATE USER 'snake_charmer'@'localhost' IDENTIFIED BY 'pepe' PASSWORD EXPIRE NEVER;
ALTER USER 'snake_charmer'@'localhost' IDENTIFIED WITH mysql_native_password BY 'pepe';
GRANT ALL PRIVILEGES ON test.* TO 'snake_charmer'@'localhost';
```
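With that user in place, a Python connection might look like this; a sketch assuming the mysql-connector-python package (verify the parameter names against your own driver):

```python
# Connection settings for the user created above. The auth_plugin key
# forces the legacy mysql_native_password plugin that some Python
# drivers expect instead of MySQL 8's default caching_sha2_password.
conn_params = {
    "host": "localhost",
    "user": "snake_charmer",
    "password": "pepe",
    "database": "test",
    "auth_plugin": "mysql_native_password",
}

# With mysql-connector-python installed (assumed), the connection would be:
#   import mysql.connector
#   conn = mysql.connector.connect(**conn_params)
```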
```sql
-- Both commands display similar output; I prefer "indexes"
show indexes from actor;
show keys from actor;

-- Shows all the databases on the current server
show databases;

-- Shows all tables in the current database (or a named database)
show tables;
show tables in sakila;

-- Describes the structure of a table: column name, data type and nullability
desc actor;

-- Displays threads currently running in MySQL and their details
-- (user, host, db, command, time and state)
show processlist;

-- Displays the DDL of the table actor
show create table actor;

-- Shows the access path (query plan) of a MySQL query
explain select * from actor where actor_id = 5;

-- Shows server status; look for long-running transactions to
-- evaluate how long they are going to run
SHOW ENGINE INNODB STATUS \G
```
They appear to be normal columns, but their values are calculated or derived from other columns, and they are not materialized, in the sense that they are not stored on disk.
In my use case, I find them extremely useful when dealing with JSON columns, as JSON queries tend to get a lot bigger and more complex, and virtual columns help greatly. Here's a sample:
```sql
drop table if exists test.products1;
CREATE TABLE test.products1 (
  `id` INT UNSIGNED NOT NULL AUTO_INCREMENT,
  `name` VARCHAR(250) NOT NULL,
  `brand_id` INT UNSIGNED NOT NULL,
  `category_id` INT UNSIGNED NOT NULL,
  `attributes` JSON NOT NULL,
  PRIMARY KEY (`id`)
);

INSERT INTO test.`products1` (`name`, `brand_id`, `category_id`, `attributes`)
VALUES (
  'Bravia', '1', '1',
  '{"screen": "25 inch"
  , "resolution": "1366 x 768 pixels"
  , "ports": {"hdmi": 1, "usb": 0}
  , "speakers": {"left": "5 watt", "right": "5 watt"}}'
);

INSERT INTO test.`products1` (`name`, `brand_id`, `category_id`, `attributes`)
VALUES (
  'Proton', '1', '1',
  '{"screen": "20 inch"
  , "resolution": "1280 x 720 pixels"
  , "ports": {"hdmi": 0, "usb": 0}
  , "speakers": {"left": "5 watt", "right": "5 watt"}}'
);
```
Here is a simple JSON query retrieving rows where usb > -1:
```
mysql> SELECT * FROM test.`products1`
    -> WHERE JSON_EXTRACT(`attributes`, '$.ports.usb') > -1;
+----+--------+----------+-------------+--------------------------------------------------------------------------------------------------------------------------------------------+
| id | name   | brand_id | category_id | attributes                                                                                                                                 |
+----+--------+----------+-------------+--------------------------------------------------------------------------------------------------------------------------------------------+
|  1 | Bravia |        1 |           1 | {"ports": {"usb": 0, "hdmi": 1}, "screen": "25 inch", "speakers": {"left": "5 watt", "right": "5 watt"}, "resolution": "1366 x 768 pixels"} |
|  2 | Proton |        1 |           1 | {"ports": {"usb": 0, "hdmi": 0}, "screen": "20 inch", "speakers": {"left": "5 watt", "right": "5 watt"}, "resolution": "1280 x 720 pixels"} |
+----+--------+----------+-------------+--------------------------------------------------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)
```
Let's add a virtual column to simplify the query:
```
ALTER TABLE test.products1
  ADD COLUMN usb integer GENERATED ALWAYS AS (JSON_EXTRACT(`attributes`, '$.ports.usb'));

mysql> desc products1;
+-------------+------------------+------+-----+---------+-------------------+
| Field       | Type             | Null | Key | Default | Extra             |
+-------------+------------------+------+-----+---------+-------------------+
| id          | int(10) unsigned | NO   | PRI | NULL    | auto_increment    |
| name        | varchar(250)     | NO   |     | NULL    |                   |
| brand_id    | int(10) unsigned | NO   |     | NULL    |                   |
| category_id | int(10) unsigned | NO   |     | NULL    |                   |
| attributes  | json             | NO   |     | NULL    |                   |
| usb         | int(11)          | YES  |     | NULL    | VIRTUAL GENERATED |
+-------------+------------------+------+-----+---------+-------------------+
6 rows in set (0.00 sec)
```
Now all we have to do is
```
mysql> select id, name, brand_id, category_id, usb from products1 where usb > -1;
+----+--------+----------+-------------+------+
| id | name   | brand_id | category_id | usb  |
+----+--------+----------+-------------+------+
|  1 | Bravia |        1 |           1 |    0 |
|  2 | Proton |        1 |           1 |    0 |
+----+--------+----------+-------------+------+
2 rows in set (0.00 sec)
```
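A further benefit: InnoDB allows secondary indexes on virtual generated columns, so the JSON lookup can be made indexed; a sketch (the index name is mine):

```sql
-- Index the virtual column; only the index is materialized, not the column
CREATE INDEX idx_usb ON test.products1 (usb);

-- The optimizer can now consider idx_usb for this predicate
EXPLAIN SELECT id, name FROM test.products1 WHERE usb > -1;
```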
Using OFFSET, we can review a big result set or table part by part in a terminal that has no scrollbars.
```
mysql> SELECT * FROM actor LIMIT 15;
+----------+------------+--------------+---------------------+
| actor_id | first_name | last_name    | last_update         |
+----------+------------+--------------+---------------------+
|        2 | NICK       | WAHLBERG     | 2016-12-02 00:00:00 |
|        3 | ED         | CHASE        | 2016-12-02 00:00:00 |
|        4 | Jenni      | Lewis        | 2017-04-07 19:35:18 |
|        5 | JOHNNY     | LOLLOBRIGIDA | 2016-12-02 00:00:00 |
|        6 | BETTE      | NICHOLSON    | 2016-12-02 00:00:00 |
|        7 | GRACE      | MOSTEL       | 2016-12-02 00:00:00 |
|        8 | MATTHEW    | JOHANSSON    | 2016-12-02 00:00:00 |
|       10 | CHRISTIAN  | GABLE        | 2016-12-02 00:00:00 |
|       11 | ZERO       | CAGE         | 2016-12-02 00:00:00 |
|       12 | KARL       | BERRY        | 2016-12-02 00:00:00 |
|       13 | UMA        | WOOD         | 2016-12-02 00:00:00 |
|       14 | VIVIEN     | BERGEN       | 2016-12-02 00:00:00 |
|       15 | CUBA       | OLIVIER      | 2016-12-02 00:00:00 |
|       16 | FRED       | COSTNER      | 2016-12-02 00:00:00 |
|       17 | HELEN      | VOIGHT       | 2016-12-02 00:00:00 |
+----------+------------+--------------+---------------------+
15 rows in set (0.00 sec)
```
The query below skips the first 5 rows and returns the next 5:
```
mysql> SELECT * FROM actor LIMIT 5 offset 5;
+----------+------------+-----------+---------------------+
| actor_id | first_name | last_name | last_update         |
+----------+------------+-----------+---------------------+
|        7 | GRACE      | MOSTEL    | 2016-12-02 00:00:00 |
|        8 | MATTHEW    | JOHANSSON | 2016-12-02 00:00:00 |
|       10 | CHRISTIAN  | GABLE     | 2016-12-02 00:00:00 |
|       11 | ZERO       | CAGE      | 2016-12-02 00:00:00 |
|       12 | KARL       | BERRY     | 2016-12-02 00:00:00 |
+----------+------------+-----------+---------------------+
5 rows in set (0.00 sec)
```
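The general pattern for paging is `OFFSET (page - 1) * page_size` with `LIMIT page_size`; a small helper to illustrate the arithmetic (the function name is mine, not MySQL's):

```python
def page_clause(page: int, page_size: int) -> str:
    """Build a LIMIT/OFFSET clause for a 1-based page number."""
    offset = (page - 1) * page_size  # rows to skip before this page starts
    return f"LIMIT {page_size} OFFSET {offset}"

# Page 2 of size 5 reproduces the query above:
print(page_clause(2, 5))  # LIMIT 5 OFFSET 5
```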
Usually when we retrieve data from a table we expect it in tabular format, but at certain times, when we want to report it or note it down, it is better in 'vertical format'.
Default format
```
mysql> select * from actor limit 1;
+----------+------------+-----------+---------------------+
| actor_id | first_name | last_name | last_update         |
+----------+------------+-----------+---------------------+
|        2 | NICK       | WAHLBERG  | 2016-12-02 00:00:00 |
+----------+------------+-----------+---------------------+
1 row in set (0.00 sec)
```
Vertical format
```
mysql> select * from actor limit 1\G
*************************** 1. row ***************************
   actor_id: 2
 first_name: NICK
  last_name: WAHLBERG
last_update: 2016-12-02 00:00:00
1 row in set (0.00 sec)
```
```
drop table if exists test_abc;
create table test_abc(
  lastUpdated timestamp not null DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
, name   VARCHAR(100) not null default ''
, place  VARCHAR(100) not null default ''
, animal VARCHAR(100) not null default ''
, things VARCHAR(100) not null default ''
);
create unique index uix1 on test_abc(name);

insert ignore into test_abc(name, place, animal, things) values
    ('Sushanth', 'Chennai', 'Human', 'Sony Vaio user');
insert ignore into test_abc(name, place, animal, things) values
    ('Bobby', 'Chennai', 'Human', 'Windows user');

select * from test_abc;
+---------------------+----------+---------+--------+----------------+
| lastUpdated         | name     | place   | animal | things         |
+---------------------+----------+---------+--------+----------------+
| 2020-04-26 12:14:13 | Bobby    | Chennai | Human  | Windows user   |
| 2020-04-26 12:14:13 | Sushanth | Chennai | Human  | Sony Vaio user |
+---------------------+----------+---------+--------+----------------+
2 rows in set (0.00 sec)

insert into test_abc(name, place, animal, things) values('Bobby', 'Earth', 'Human', 'Windows user')
    on duplicate key update place = values(place);
Query OK, 2 rows affected (0.13 sec)

select * from test_abc;
+---------------------+----------+---------+--------+----------------+
| lastUpdated         | name     | place   | animal | things         |
+---------------------+----------+---------+--------+----------------+
| 2020-04-26 12:14:29 | Bobby    | Earth   | Human  | Windows user   |
| 2020-04-26 12:14:13 | Sushanth | Chennai | Human  | Sony Vaio user |
+---------------------+----------+---------+--------+----------------+
2 rows in set (0.00 sec)

drop table test_abc;
```
VALUES(col_name) in the ON DUPLICATE KEY UPDATE clause refers to the value of col_name that would have been inserted, had no duplicate-key conflict occurred.
Above, when it says "2 rows affected", it does not mean two rows were changed. With ON DUPLICATE KEY UPDATE, the affected-rows count is 1 if the row is inserted as new, 2 if an existing row is updated, and 0 if an existing row is left unchanged. It is a bit confusing, but that is how it works.
A timestamp column with ON UPDATE CURRENT_TIMESTAMP represents the time when the row was last changed.
```
drop table if exists test_abc;
create table test_abc(
  lastUpdated timestamp not null DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
, name   VARCHAR(100) not null default ''
, place  VARCHAR(100) not null default ''
, animal VARCHAR(100) not null default ''
, things VARCHAR(100) not null default ''
);
create unique index uix1 on test_abc(name);

insert ignore into test_abc(name, place, animal, things) values
    ('Sushanth', 'Chennai', 'Human', 'Sony Vaio user');
insert ignore into test_abc(name, place, animal, things) values
    ('Bobby', 'Chennai', 'Human', 'Windows user');

select * from test_abc;
+---------------------+----------+---------+--------+----------------+
| lastUpdated         | name     | place   | animal | things         |
+---------------------+----------+---------+--------+----------------+
| 2020-04-26 11:57:30 | Bobby    | Chennai | Human  | Windows user   |
| 2020-04-26 11:57:30 | Sushanth | Chennai | Human  | Sony Vaio user |
+---------------------+----------+---------+--------+----------------+

update test_abc SET place = 'Earth' where name = 'Bobby';

select * from test_abc;
+---------------------+----------+---------+--------+----------------+
| lastUpdated         | name     | place   | animal | things         |
+---------------------+----------+---------+--------+----------------+
| 2020-04-26 11:57:35 | Bobby    | Earth   | Human  | Windows user   |
| 2020-04-26 11:57:30 | Sushanth | Chennai | Human  | Sony Vaio user |
+---------------------+----------+---------+--------+----------------+
2 rows in set (0.00 sec)

drop table test_abc;
```
Below is the load card to truncate and reload data into a MySQL table:
```sql
truncate table N_TI;
LOAD DATA INFILE 'D:/BigData/12. Python/data/TI.csv'
IGNORE INTO TABLE N_TI
COLUMNS TERMINATED BY '|' LINES TERMINATED BY '\r\n'
IGNORE 1 LINES
(ts, unitName, TI180d, p180d, TI3d, p3d, TI5d, p5d, TI15d, p15d, TI30d, p30d)
;
```
Below is a Windows batch file to execute the above load card:
```
mysql -uroot -predpill test < ../data/loadcards/LR-N_TI.sql > ../data/outputs/LR-N_TI.out
```
FYI: "load card" is a mainframe term meaning a member (the equivalent of a file in a folder) containing a load statement.
```sql
LOAD DATA INFILE 'D:/BigData/12. Python/data/TI.csv'
IGNORE INTO TABLE N_TI
COLUMNS TERMINATED BY '|' LINES TERMINATED BY '\r\n'
IGNORE 1 LINES
(ts, unitName, TI180d, p180d, TI3d, p3d, TI5d, p5d, TI15d, p15d, TI30d, p30d)
;
```
Notes on the statement:

- `IGNORE` ignores duplicates while loading into the table.
- `COLUMNS TERMINATED BY '|'` uses pipe (`|`) rather than comma (`,`), as unitName can contain commas.

Updating values while loading the table

The `SET` clause can be used for this purpose, and all MySQL functions can be used in it.
```sql
LOAD DATA INFILE 'D:/BigData/12. Python/data/TI.csv'
IGNORE INTO TABLE N_TI
COLUMNS TERMINATED BY '|' LINES TERMINATED BY '\r\n'
IGNORE 1 LINES
(@ts, @var_unitName, @var_status, TI180d, p180d, TI3d, p3d, TI5d, p5d, TI15d, p15d, TI30d, p30d)
SET
  ts       = str_to_date(@ts, '%d-%b-%Y')
, unitName = trim(@var_unitName)
, status   = if(@var_status = '', null, @var_status);
```

Note that the first column is read into the user variable `@ts` (not the column `ts` directly) so that the `SET` clause can transform it.