TiDB v5.2 Planning

bb7133 · July 19, 2021, 3:42am

Hello all, we are glad to announce the release of TiDB v5.1 and the short-term planning for the next version of TiDB.

You’re encouraged to leave suggestions or comments on our planning. Furthermore, if you’re interested in participating in any of the projects, feel free to contact us and join the feature crew. Let’s improve TiDB together!

Attention: Code freeze for v5.2 will be called on Aug. 6th. After that, the release-5.2 branch will be cut and feature pull requests can no longer be merged into the release branch.

TiDB Server

The planned features of TiDB Server are organized by SIG. You are welcome to join a SIG for more details on its features.

SIG Execution

Feature	Tracking Issue	Owner	Description
Aggregation spill to disk	#25882	wshwsh12	At present, HashAgg cannot spill to disk, which means that large queries that use HashAgg can consume a huge amount of memory and trigger OOM.
Refactor the execution framework		XuHuaiyu	In order to enhance the resource management mechanism of tidb-server, we would like to provide a better parallel query framework for the runtime execution.
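
To illustrate the general idea behind "aggregation spill to disk", here is a minimal Go sketch. It is not TiDB's implementation: the `Row` type, the `sumByKey` and `aggregateOnce` functions, and the `maxGroups` quota are all hypothetical names chosen for this example. The sketch keeps at most `maxGroups` groups in memory; rows that belong to a new group once the quota is reached are written to a temporary file and aggregated in a later pass.

```go
package main

import (
	"encoding/gob"
	"errors"
	"fmt"
	"io"
	"os"
)

// Row is a hypothetical input row for: SELECT Key, SUM(Val) ... GROUP BY Key.
type Row struct {
	Key string
	Val int
}

// sumByKey aggregates rows while holding at most maxGroups groups in memory
// per pass. Rows that do not fit are spilled and handled in a later pass.
func sumByKey(rows []Row, maxGroups int) (map[string]int, error) {
	if maxGroups < 1 {
		return nil, errors.New("maxGroups must be at least 1")
	}
	result := make(map[string]int)
	for len(rows) > 0 {
		spilled, err := aggregateOnce(rows, maxGroups, result)
		if err != nil {
			return nil, err
		}
		rows = spilled
	}
	return result, nil
}

// aggregateOnce runs one pass: it aggregates admitted groups into result and
// returns the rows that were spilled to disk. For brevity the spilled rows are
// read back into a slice; a real operator would stream them instead.
func aggregateOnce(rows []Row, maxGroups int, result map[string]int) ([]Row, error) {
	f, err := os.CreateTemp("", "hashagg-spill-*")
	if err != nil {
		return nil, err
	}
	defer os.Remove(f.Name())
	defer f.Close()

	groups := make(map[string]int)
	enc := gob.NewEncoder(f)
	for _, r := range rows {
		if _, seen := groups[r.Key]; seen || len(groups) < maxGroups {
			groups[r.Key] += r.Val
		} else if err := enc.Encode(r); err != nil {
			return nil, err
		}
	}
	// A key is either admitted or spilled for the whole pass, so the groups
	// finished here never reappear in a later pass.
	for k, v := range groups {
		result[k] = v
	}

	if _, err := f.Seek(0, io.SeekStart); err != nil {
		return nil, err
	}
	var spilled []Row
	dec := gob.NewDecoder(f)
	for {
		var r Row
		err := dec.Decode(&r)
		if err == io.EOF {
			return spilled, nil
		}
		if err != nil {
			return nil, err
		}
		spilled = append(spilled, r)
	}
}

func main() {
	rows := []Row{{"a", 1}, {"b", 2}, {"a", 3}, {"c", 4}, {"b", 5}}
	sums, err := sumByKey(rows, 1)
	if err != nil {
		panic(err)
	}
	fmt.Println(sums["a"], sums["b"], sums["c"])
}
```

The trade-off in this sketch is extra disk I/O and multiple passes in exchange for a bounded number of in-memory groups.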

SIG Planner

Feature	Tracking Issue	Owner	Description
Enhance SQL Plan Management (SPM)	#25970	eurekaka	We would like to enhance the SPM of TiDB by providing the ability to automatically capture and bind plans.
Plan Recreator	#26325	rebelice	A one-button dump and load utility designed to help DBAs and developers better analyze and improve execution plans.
Improvement of index selection	#26020	winoros	We hope to improve the index selection of TiDB through enhanced heuristic/skyline rules, a refined index cost model and formula, and a better strategy for choosing indexes under severe stats.
Enhancement of cardinality estimation	#26085	time-and-fate	We plan to improve the out-of-range/DNF/Limit/TopN estimation and provide better selectivity estimates under severe stats.
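
As background for the "skyline rules" mentioned in the index selection item, here is a minimal Go sketch of skyline pruning in general. It is not the TiDB planner's code: the `Candidate` type and the score dimensions are assumptions made for this example. A candidate is dropped when another candidate is at least as good on every dimension and strictly better on at least one.

```go
package main

import "fmt"

// Candidate is a hypothetical access path. Scores holds illustrative
// dimensions (for example: matched columns, avoids a table lookup);
// higher is better, and all candidates use the same number of dimensions.
type Candidate struct {
	Name   string
	Scores []int
}

// dominates reports whether a is at least as good as b on every dimension
// and strictly better on at least one.
func dominates(a, b Candidate) bool {
	strictlyBetter := false
	for i := range a.Scores {
		if a.Scores[i] < b.Scores[i] {
			return false
		}
		if a.Scores[i] > b.Scores[i] {
			strictlyBetter = true
		}
	}
	return strictlyBetter
}

// skyline keeps only the candidates that no other candidate dominates.
func skyline(cands []Candidate) []Candidate {
	var kept []Candidate
	for i, c := range cands {
		dominated := false
		for j, other := range cands {
			if i != j && dominates(other, c) {
				dominated = true
				break
			}
		}
		if !dominated {
			kept = append(kept, c)
		}
	}
	return kept
}

func main() {
	cands := []Candidate{
		{Name: "idx_a", Scores: []int{2, 1}},
		{Name: "idx_b", Scores: []int{1, 1}},
		{Name: "idx_ab", Scores: []int{1, 2}},
	}
	for _, c := range skyline(cands) {
		fmt.Println(c.Name)
	}
}
```

Pruning of this kind only removes candidates that cannot win on any dimension; the remaining candidates still have to be compared by a cost model.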

SIG SQL-Infra

Feature	Tracking Issue	Owner	Description
The database-level placement rule in SQL	#18030	morgo	The database-level placement rule in SQL would provide a convenient way to manipulate the placement rules for all objects in a database through the ‘DDL’ syntax in TiDB, which we believe would enhance the usability of TiDB in cross-region deployment scenarios.
Temporary Table	#24169	djshow832	For now, `CREATE TEMPORARY TABLE` is a no-op in TiDB. We plan to support it to provide better compatibility with MySQL.
A framework to provide the full character set support, as well as the GB18030 encoding	#25152	zimulala	Since its very first version, TiDB has supported only UTF-8 and its subsets as character sets (`utf8mb4`, `utf8`, `ascii`). We hope to support other MySQL `CHARACTER SET` values, as some customers have requested them.
Make expression index generally available	#25150	wjhuang2016	The ‘expression index’ of TiDB was released in TiDB v4.0 as an experimental feature. We now think it is stable enough to become generally available once it is covered by a wider scope of integration tests.
Improve the ORM compatibility for TiDB	#24194	bb7133	We would like to provide a better experience for developers who build applications on TiDB by testing ORM frameworks against TiDB and providing a TiDB-specific adaptor where necessary. You can check this doc for the TiDBDialect we’ve done for Hibernate ORM.
Remove TiDB-to-TiDB RPC from tikv client	#25808	crazycs520	Now TiDB can access TiDB by RPC to grab some runtime information. It is implemented by defining a store type TiDB in the tikv client and all RPCs are served as coprocessor calls. However, a better way would be for TiDB to start its own gRPC client to access other TiDBs.

disksing · July 1, 2021, 3:28pm

Can we consider doing this improvement?

tison · July 2, 2021, 2:51am

@bb7133 could you elaborate a bit on how to contact you or the feature crew? Should we reply here, send an e-mail, or use some other approach?

BTW, since the definition of a feature crew isn’t formally published in our development guide, could you give a brief introduction to it?

bb7133 · July 2, 2021, 11:59pm

Yes for sure, I’ve added it. Thanks!

bb7133 · July 3, 2021, 12:00am

Thanks, I’ll try to add the names of ‘feature crew owner’ and give a brief introduction to this term.

zhangjinpeng1987 · July 4, 2021, 2:30am

Many exciting features

skyzh · July 4, 2021, 12:23pm

Looks interesting! Is there any detail for this plan?

bb7133 · July 4, 2021, 2:59pm

Hello! There will be a tracking issue on GitHub (as you may see, it is missing now).

Before that, I would suggest you contact Huaiyu Xu (https://tidbcommunity.slack.com/archives/D01G237N27L) or Jian Zhang (https://tidbcommunity.slack.com/archives/DTW4PMMH6) directly for more details.

tison · July 5, 2021, 1:17am

cc @skyzh @zz-jason @bb7133

Since Slack cannot retain all history and is yet another place for discussion, I suggest that you briefly summarize the direction of this feature in this thread. And if @skyzh is interested in more details, you and the feature owner can create a separate, dedicated discussion thread to continue.

XuHuaiyu · July 5, 2021, 2:05am

Hi, we’re working on a rough design doc now and may finish it this week. More details are under investigation.
An issue with the detailed plan will be created this week.
If you’re interested, we can create a separate discussion group for this work.

leiysky · July 11, 2021, 3:31am

What’s the theoretical basis of this refactor?

Is it about pipeline level parallelism?

tison · July 11, 2021, 3:33am

Any updates? Even a rough design doc is worth publishing, in a form like the first version of

It may not be a good idea to publish a complex design only after a few more days, since that causes higher overhead to understand or change it later.

tison · July 11, 2021, 3:36am

Hi @bb7133 @rebelice I’d like to know whether these features are still discussed offline.

Enhance the plan management of TiDB
Plan recreator
Improvement of index selection

Any updates?

tison · July 17, 2021, 4:05am

Thanks for your replies! Updated on the list.

tison · July 17, 2021, 4:31am

@rebelice could you comment here with the tracking issue of “Plan Recreator” once it is created?

guo-shaoge · July 27, 2021, 7:41am

Here is the basic design doc: Execution framework refactor. Comments are welcome.

leiysky · July 28, 2021, 3:46pm

Thanks for sharing the design document @guo-shaoge! Here are my two cents.

It seems this execution framework doesn’t address the resource management issue. And there’s no reason to expect a significant improvement from replacing operator-level parallelism with pipeline-level parallelism.

Resource management is a big scope. It’s better to fix the issues case by case.

If you’d like to improve the execution performance of TiDB, why not just support running MPP on tidb-server?

guo-shaoge · July 29, 2021, 11:40am

Thanks for your reply!!!

I did a basic performance test here. There is no significant performance improvement from replacing the parallelism paradigm.

But IMO, this could be the starting point for resource management. By controlling the execution of the exchange operator, CPU usage can be controlled at the framework level rather than inside each operator.

In terms of memory, we can change the buffer size of the exchange operator to control the speed of the producer and the consumer.

Finally, at present, some operators in TiDB do not implement intra operator parallelism. Exchange can help them implement parallelism easily.
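
To make the exchange idea above concrete, here is a minimal Go sketch. It is not taken from the design doc or from TiDB: the `Chunk` and `Exchange` types, the `workers` and `bufSize` fields, and the `sumAll` helper are hypothetical names for this example. The worker count bounds how many goroutines run the wrapped operator logic, and the channel capacity bounds how many chunks can be in flight, so a slow consumer blocks the producer.

```go
package main

import (
	"fmt"
	"sync"
)

// Chunk is a hypothetical batch of rows passed between operators.
type Chunk []int

// Exchange is a hypothetical exchange operator. It runs `workers` goroutines
// that apply `process` to incoming chunks, so a single-threaded operator
// gains parallelism without any changes to its own code.
type Exchange struct {
	workers int               // caps CPU parallelism at the framework level
	bufSize int               // caps the number of buffered output chunks
	process func(Chunk) Chunk // the wrapped operator logic
}

// Run consumes chunks from in and returns a channel of processed chunks.
func (e Exchange) Run(in <-chan Chunk) <-chan Chunk {
	out := make(chan Chunk, e.bufSize)
	var wg sync.WaitGroup
	for i := 0; i < e.workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for c := range in {
				// This send blocks while the buffer is full, which slows
				// the producer down to the consumer's pace.
				out <- e.process(c)
			}
		}()
	}
	go func() {
		wg.Wait()
		close(out)
	}()
	return out
}

// sumAll feeds n single-row chunks (1..n) through the exchange and sums
// every value that comes out.
func sumAll(e Exchange, n int) int {
	in := make(chan Chunk, e.bufSize)
	go func() {
		defer close(in)
		for i := 1; i <= n; i++ {
			in <- Chunk{i}
		}
	}()
	total := 0
	for c := range e.Run(in) {
		for _, v := range c {
			total += v
		}
	}
	return total
}

func main() {
	double := func(c Chunk) Chunk {
		out := make(Chunk, len(c))
		for i, v := range c {
			out[i] = v * 2
		}
		return out
	}
	e := Exchange{workers: 4, bufSize: 2, process: double}
	fmt.Println(sumAll(e, 100)) // prints 10100
}
```

In this sketch, lowering `workers` reduces CPU usage and lowering `bufSize` reduces buffered memory, in both cases without touching the `process` function itself.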

tison · August 13, 2021, 12:51am

Have we created a tracking issue for this effort yet? I think we could also create a post or a PR with the design document so that we can proceed with the feature task.