How to understand the GCSafePoint in PD?

I am reading the RPC code in PD, and don’t understand the following functions:

func (s *Server) GetGCSafePoint(ctx context.Context, request *pdpb.GetGCSafePointRequest) (*pdpb.GetGCSafePointResponse, error)
func (s *Server) UpdateServiceGCSafePoint(ctx context.Context, request *pdpb.UpdateServiceGCSafePointRequest) (*pdpb.UpdateServiceGCSafePointResponse, error)

How to understand the “GC” and the “safepoint”?

I now know that “GC” stands for “Garbage Collection”.

Who does the GC? PD, TiKV or RocksDB? I think it should be done in the storage layer.

TiDB uses MVCC for concurrency control, with the TSO (you can basically treat it as a Unix timestamp) serving as the monotonically increasing version. However, we can’t keep every version of the data in TiKV forever, so garbage collection has to run periodically to clean up outdated data. To delete outdated data, we first need to know which data is outdated. This leads to your question: what is the GCSafePoint?

The TSO is monotonically increasing within a TiDB cluster, so once TiDB has obtained a TSO t1, no later request can ever get a TSO t2 that is smaller than t1. Suppose we have a cluster with 3 TiDB instances. At any moment, we can take the minimum TSO still in use across these three instances, and this minimum is basically the GCSafePoint: data before this version will never be read again, so it can be cleaned up safely. In real scenarios we also need to take lock handling into account, but this should be enough for a basic understanding.
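To make the "min TSO among the instances" idea concrete, here is a minimal Go sketch. It is not PD's actual code: `minStartTS` and the sample timestamps are made up for illustration, and it ignores locks just like the explanation above.

```go
package main

import "fmt"

// minStartTS is a hypothetical helper, not a PD function. It returns the
// smallest timestamp in startTSs, i.e. the oldest snapshot that any
// instance may still read. ok is false when the slice is empty.
func minStartTS(startTSs []uint64) (safePoint uint64, ok bool) {
	if len(startTSs) == 0 {
		return 0, false
	}
	safePoint = startTSs[0]
	for _, ts := range startTSs[1:] {
		if ts < safePoint {
			safePoint = ts
		}
	}
	return safePoint, true
}

func main() {
	// Made-up oldest in-use timestamps reported by three TiDB instances.
	perInstance := []uint64{420, 405, 431}
	if sp, ok := minStartTS(perInstance); ok {
		fmt.Println(sp) // prints 405
	}
}
```

Nothing in the cluster still needs a snapshot older than 405 in this example, so 405 is a safe lower bound for GC.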

GC does not run very frequently, and we handle the TSO and GCSafePoint carefully to guarantee consistency. Because of this, TiDB can still read somewhat stale data, e.g. through the Stale Read feature.
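As a rough sketch of why stale data stays readable, the Go snippet below models what a safe point means for the versions of a single key. The `version` type, `gcKey` and `canRead` are hypothetical names of mine, not TiKV APIs, and the model is simplified (no locks, no delete markers). It assumes GC keeps every version newer than the safe point plus the newest version at or before it, and that a read is allowed when its timestamp is not older than the safe point.

```go
package main

import (
	"fmt"
	"sort"
)

// version is a hypothetical MVCC record: one value of a key at commitTS.
type version struct {
	commitTS uint64
	value    string
}

// gcKey returns the versions of one key that survive GC at safePoint:
// everything newer than safePoint, plus the newest version at or before
// it, so that a read at safePoint still finds a value.
func gcKey(versions []version, safePoint uint64) []version {
	// Copy and sort newest first so the input slice is left untouched.
	sorted := append([]version(nil), versions...)
	sort.Slice(sorted, func(i, j int) bool {
		return sorted[i].commitTS > sorted[j].commitTS
	})
	kept := make([]version, 0, len(sorted))
	for _, v := range sorted {
		kept = append(kept, v)
		if v.commitTS <= safePoint {
			break // all older versions are garbage
		}
	}
	return kept
}

// canRead reports whether a snapshot read at readTS is still safe
// under this simplified model.
func canRead(readTS, safePoint uint64) bool {
	return readTS >= safePoint
}

func main() {
	history := []version{{10, "a"}, {20, "b"}, {30, "c"}, {40, "d"}}
	kept := gcKey(history, 25)
	fmt.Println(len(kept))       // versions at 40, 30 and 20 survive
	fmt.Println(canRead(28, 25)) // stale, but not older than the safe point
	fmt.Println(canRead(15, 25)) // older than the safe point
}
```

With a safe point of 25, a stale read at 28 still sees the value committed at 20, while a read at 15 could need the version at 10, which GC is allowed to delete.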
