serverlessDB for HTAP

shipx123 · June 30, 2021, 9:44am

Hello everyone, I would like to introduce the project serverlessDB for HTAP. serverlessDB is a database for serverless services based on TiDB. It dynamically scales compute and storage nodes up and down as the business load changes, with zero user perception. It provides the following features.

Dynamic scaling based on business load to ensure continuous and stable business with zero user perception.
The service load model can differentiate between AP and TP services, ensuring that AP and TP services do not affect each other.
Always ensure that the load on each computing node is balanced and kept within a reasonable range.
Supports ultra-small compute node specifications and ensures a smooth transition from ultra-small to large specifications.

To implement serverless TiDB, we designed a proxy module and a serverless module. The proxy module handles permission control, computation under low load, and traffic forwarding under high load, while the serverless module mainly manages tidb-server instances and scales tidb-server smoothly. Below is an overview of the system design.

proxy module

Low-load computing

Under low load, the proxy module acts as a tidb-server: it interacts directly with PD and TiKV and returns SQL execution results to the client.

High-load traffic forwarding

Under high load, the proxy module mainly implements SQL processing and traffic forwarding. SQL processing includes SQL parsing, SQL optimization, execution plan generation, and estimation of the SQL execution cost. Traffic forwarding load-balances according to the estimated cost of each SQL statement and forwards the execution plan to a regular tidb-server instance for execution.

Medium-load

Under medium load, the proxy acts as both a compute node and a proxy node. For example, for simple point-query SQL, the proxy completes the calculation itself without forwarding to other TiDB nodes, while relatively complex SQL is forwarded to other nodes.
The proxy establishes connections to the other TiDB nodes, with one connection pool per node. When a user request comes in, the proxy estimates its cost, selects a suitable TiDB node based on that cost, and then takes a connection from that node's pool to run the task.
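
To make the cost-based forwarding above concrete, here is a minimal sketch in Python. It is not the project's code: the `Backend` class, the `outstanding_cost` counter, and the "pick the node with the lowest outstanding cost" policy are all assumptions made for illustration, and the connections are stand-in strings rather than real MySQL-protocol connections.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Backend:
    """One tidb-server instance as seen by the proxy (hypothetical model)."""
    name: str
    pool: deque = field(default_factory=deque)  # idle connections to this node
    outstanding_cost: float = 0.0  # summed estimated cost of in-flight SQL


def pick_backend(backends, sql_cost):
    """Pick the node with the lowest outstanding cost that has an idle connection.

    Returns (backend, connection) and charges the SQL cost to that backend.
    """
    candidates = [b for b in backends if b.pool]
    if not candidates:
        raise RuntimeError("no idle connection on any backend")
    target = min(candidates, key=lambda b: b.outstanding_cost)
    conn = target.pool.popleft()
    target.outstanding_cost += sql_cost
    return target, conn


def release(backend, conn, sql_cost):
    """Return the connection to its pool once the SQL has finished."""
    backend.pool.append(conn)
    backend.outstanding_cost -= sql_cost
```

A real proxy would also need to handle an exhausted pool (queueing or opening new connections) and stale cost estimates; the sketch only shows the selection step.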

From the above description, there will be three roles for the proxy: pure compute node role, pure proxy role, and mixed compute and proxy role. These three roles will be dynamically adjusted online based on the business load.
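
The three roles can be pictured as a small decision function. The sketch below is only an illustration under stated assumptions: the normalized `load` value and the `low`/`high` thresholds are hypothetical, and treating "point query" as the only locally executed class under mixed mode is a simplification of the description above.

```python
from enum import Enum


class ProxyRole(Enum):
    COMPUTE = "compute"  # low load: proxy executes all SQL itself
    MIXED = "mixed"      # medium load: simple SQL local, complex SQL forwarded
    PROXY = "proxy"      # high load: proxy only plans and forwards


def choose_role(load, low=0.3, high=0.7):
    """Map a normalized load in [0, 1] to a role (thresholds are hypothetical)."""
    if load < low:
        return ProxyRole.COMPUTE
    if load < high:
        return ProxyRole.MIXED
    return ProxyRole.PROXY


def should_forward(role, is_point_query):
    """Decide whether a statement is forwarded to another tidb-server."""
    if role is ProxyRole.COMPUTE:
        return False
    if role is ProxyRole.PROXY:
        return True
    # Mixed role: keep point queries local, forward everything else.
    return not is_point_query
```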

serverless module

tidb-server instance management

The serverless module controls the specification and number of compute and storage instances. It supports load-based and rule-based elastic scaling, and both scale up/down and scale out/in of compute and storage nodes.

Smooth expansion and contraction of tidb-server
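
As an illustration of what a load-based scale-out/in rule can look like, the sketch below uses the proportional formula known from the Kubernetes Horizontal Pod Autoscaler (desired = ceil(current × metric / target)). This is a swapped-in, well-known rule, not necessarily what serverlessDB does; the CPU metric, the 60% target, and the instance limits are all hypothetical.

```python
def desired_instances(current, cpu_percent, target_percent=60, min_n=1, max_n=16):
    """Return how many tidb-server instances to run.

    current        -- number of running instances
    cpu_percent    -- average CPU utilization across instances, 0..100
    target_percent -- utilization the scaler tries to hold (assumed value)
    """
    # Integer ceiling division avoids floating-point rounding surprises.
    wanted = -(-(current * cpu_percent) // target_percent)
    return max(min_n, min(max_n, wanted))
```

For example, 4 instances at 90% average CPU with a 60% target gives 6 instances, and 4 instances at 30% gives 2.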

When scaling out, the serverless module registers the new instances with the proxy module; when scaling in, it deregisters the removed instances from the proxy module. After each operation, the proxy module rebalances load across the current instances to achieve smooth business migration.
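
A minimal sketch of the register/deregister flow, assuming the proxy tracks which client sessions are assigned to which instance and evens them out after every change. The `InstanceRegistry` class and the session-count balancing policy are hypothetical; the actual proxy balances by SQL cost and may migrate load differently.

```python
class InstanceRegistry:
    """Proxy-side view of tidb-server instances (hypothetical model)."""

    def __init__(self):
        self.sessions = {}  # instance name -> list of session ids

    def register(self, name):
        """Called by the serverless module after scaling out."""
        self.sessions.setdefault(name, [])
        self._rebalance()

    def deregister(self, name):
        """Called by the serverless module before scaling in."""
        if name not in self.sessions:
            return
        if len(self.sessions) == 1 and self.sessions[name]:
            raise ValueError("cannot remove the last instance while it has sessions")
        orphaned = self.sessions.pop(name)
        for session in orphaned:
            min(self.sessions.values(), key=len).append(session)
        self._rebalance()

    def add_session(self, session_id):
        """Assign a new client session to the least-loaded instance."""
        if not self.sessions:
            raise ValueError("no instance registered")
        min(self.sessions.values(), key=len).append(session_id)

    def _rebalance(self):
        """Move sessions from the busiest to the idlest instance until even."""
        if len(self.sessions) < 2:
            return
        while True:
            busiest = max(self.sessions.values(), key=len)
            idlest = min(self.sessions.values(), key=len)
            if len(busiest) - len(idlest) <= 1:
                break
            idlest.append(busiest.pop())
```

In practice a session can only be moved at a safe point (for example, outside a transaction), which this sketch ignores.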

The latest progress of this project is:

Development of the serverless module is complete.
TiDB is being modified so that it can act as the proxy (connection pooling development is complete).

Limitations:
1. It is difficult to cope with scenarios where the business load changes dramatically, e.g. TPS rises from 100 to 1,000,000 (100W) in a very short period of time. This leads to business performance fluctuations, with some SQL latencies reaching 5s and the disturbance lasting up to 30s.
2. It is difficult to establish a fully accurate business load model, so there is a risk of unbalanced load on some of the computing nodes.
3. Introducing the proxy causes some performance loss. Ideally, this loss is expected to be kept within 15%.

TODO:

Distinguish SQL between AP and TP.
Optimize the business load model.

tison · June 30, 2021, 9:45am

Thanks for sharing your project @shipx123 ! Welcome to the TiDB community.

This topic is not a proposal to be accepted into the TiDB codebase but a project in the TiDB ecosystem calling for participants and contributions. Thus I removed the proposal tag for you.

shipx123 · July 1, 2021, 12:47am

Thanks for your attention.

tison · August 5, 2021, 8:01am

Hi @shipx123 ,

I’m watching your project and found it a bit weird that commits are pushed without discussion or pull requests. That makes the evolution of the project hard to understand.

It seems that you guys develop and discuss internally and later push the commits to the open source repo, which is quite like the so-called “open source code only” model.

Would you mind creating an issue to track each development task and later submitting a PR that closes it, with communication happening on GitHub so that those of us who are interested in the project can participate?

See also this tracking issue, its subtasks, and the pull requests against the subtasks.

skyzh · August 5, 2021, 8:16am

Will there be some feature like “remote compaction” or “store SST on s3” to better scale the storage layer?

shipx123 · August 5, 2021, 10:49am

Sorry @tison , we were discussing the new architecture design of the system, sorting out the internal code, and uploading it to GitHub. We have just created some issues for the development tasks, and we will develop through issues in the future. Thanks for the reminder!

shipx123 · August 5, 2021, 10:53am

Sorry @skyzh , there is no plan for this. For now, we are only considering TiKV storage capacity expansion on the storage layer.