Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ddl: add ddl_reorg_batch_size variable to control ddl worker batch size. #8365

Merged
merged 11 commits into from
Dec 4, 2018

Conversation

crazycs520
Copy link
Contributor

@crazycs520 crazycs520 commented Nov 20, 2018

What problem does this PR solve?

here is a test:
environment: 1 machine, 16G, disk is nvme. both tidb,tikv,pd in this machine.

test add index with different batch size

set @@tidb_ddl_reorg_worker_cnt=1;  #avoid region split impact testing, so set ddl worker to only 1.
mysql root@127.0.0.1:test> select * from t_slim limit 1;
+---+---+
| a | b |
+---+---+
| 1 | 1 |
+---+---+
mysql root@127.0.0.1:test> select count(*) from t_slim; #80W rows in this table.
+----------+
| count(*) |
+----------+
| 800000   |
+----------+

# there is no other workload in tidb, only add index test.
batchCnt = 128:
mysql> alter table t_slim add index idx1(a);
Time: 139.031s

batchCnt = 512:
mysql> alter table t_slim add index idx3(a);
Time: 67.523s

batchCnt = 1024:
mysql> alter table t_slim add index idx5(a);
Time: 35.031s

batchCnt = 4096:
mysql> alter table t_slim add index idx7(a);
Time: 19.246s

batchCnt = 8192:
mysql> alter table t_slim add index idx9(a);
Time: 15.309s

batchCnt = 10240:
mysql> alter table t_slim add index idx33(a);
Query OK, 0 rows affected
Time: 13.849s

batchCnt = 20480:
mysql root@192.168.197.180:test> alter table t_slim add index idx36(a)
Query OK, 0 rows affected
Time: 11.732s

batchCnt = 40960:
mysql> alter table t_slim add index idx39(a)
Query OK, 0 rows affected
Time: 11.501s



set @@tidb_ddl_reorg_worker_cnt=16; # add more ddl worker.
batchCnt = 8192:
mysql> alter table t_slim add index idx31(a); # only 8 regions in table t_slim, so only 8 ddl worker do task, other workers is idle.
Query OK, 0 rows affected
Time: 9.831s

test with transaction conflict

80W rows in table, 1 session send add index request, 20 sessions send update request, table schema is:

# table t_slim have 80W rows records.
mysql> select * from t_slim order by a limit 10;
+---+----+
| a | b  |
+---+----+
| 0 | 1  |
| 1 | 2  |
...
mysql> select count(*) from t_slim;
+----------+
| count(*) |
+----------+
| 800000   |
+----------+


session 0: 
mysql> alter table t_slim add index idx(a);

session 1 ~ session 20:
for{
	n = rand.Intn(800000)
	update t_slim set a=${n+1} where a=${n}; 
	sleep(10ms)
}

result

batchcnt=128, ddl_reorg_worker_cnt=1;
add index cost time(s):
250.118721157
279.992544007
303.099265579
269.927179136



batchcnt=8196, ddl_reorg_worker_cnt=1;
add index cost time(s):
50.934135623
56.634457149
50.81117151
50.383097911

# check how many data has been updated:
mysql> select count(*) from t_slim where (a+1) != b;
+----------+
| count(*) |
+----------+
| 440973   |
+----------+

Maybe next step, I wish to make tidb can auto tune the batch size according to whether there is a transaction write conflict, if no conflict, increase the batch, other wise reduce the batch size.

What is changed and how it works?

Add a session variable.

Check List

Tests

  • Unit test

Code changes

Side effects

Related changes


This change is Reviewable

@crazycs520
Copy link
Contributor Author

@winkyao @zimulala PTAL

@crazycs520
Copy link
Contributor Author

/run-all-tests

@crazycs520
Copy link
Contributor Author

/run-sqllogic-test

@crazycs520
Copy link
Contributor Author

crazycs520 commented Nov 21, 2018

tidb pprof svg.tar.gz

@winkyao
Copy link
Contributor

winkyao commented Nov 21, 2018

@crazycs520 Good job.

@winkyao
Copy link
Contributor

winkyao commented Nov 22, 2018

You need to append TiDBDDLReorgBatchSize to loadCommonGlobalVarsSQL, then the global variable can affect all the sessions.

@winkyao
Copy link
Contributor

winkyao commented Nov 22, 2018

I think 4096 maybe the proper default batch size. But we should find out a way to handle too large transaction error, and then reduce the batchSize automatically. Otherwise the add index will block by the error.

Copy link
Contributor

@winkyao winkyao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@zimulala zimulala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add the information about the test cluster

Copy link
Contributor

@zimulala zimulala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@zimulala zimulala added the status/LGT2 Indicates that a PR has LGTM 2. label Nov 22, 2018
Copy link
Contributor

@ciscoxll ciscoxll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ciscoxll ciscoxll added status/LGT3 The PR has already had 3 LGTM. and removed status/LGT2 Indicates that a PR has LGTM 2. labels Dec 3, 2018
@crazycs520
Copy link
Contributor Author

/run-all-tests

@crazycs520
Copy link
Contributor Author

/run-integration-ddl-test

@crazycs520
Copy link
Contributor Author

/run-integration-ddl-test

1 similar comment
@crazycs520
Copy link
Contributor Author

/run-integration-ddl-test

@crazycs520
Copy link
Contributor Author

/run-integration-ddl-test

@crazycs520 crazycs520 merged commit e478be7 into pingcap:master Dec 4, 2018
crazycs520 added a commit to crazycs520/tidb that referenced this pull request Dec 7, 2018
crazycs520 added a commit to crazycs520/tidb that referenced this pull request Dec 7, 2018
winkyao pushed a commit that referenced this pull request Dec 11, 2018
iamzhoug37 pushed a commit to iamzhoug37/tidb that referenced this pull request Dec 13, 2018
@you06 you06 added the sig/sql-infra SIG: SQL Infra label Mar 4, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sig/sql-infra SIG: SQL Infra status/LGT3 The PR has already had 3 LGTM.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants