HugeGraph Database X TiKV

Introduction

PingCAP Hackathon is a hackathon held by the TiDB community. In this edition, the TiDB community teams up with HugeGraph, a star open source product in the graph database field, to build a graph database that uses TiKV as its back-end storage. This document introduces the project in detail and helps the team get started quickly.

Background

HugeGraph is an easy-to-use, efficient, and general-purpose open source graph database system. It implements the Apache TinkerPop3 framework and is fully compatible with the Gremlin query language. HugeGraph supports rapid import of hundreds of billions of vertices and edges, and provides both millisecond-level relational queries (OLTP) and large-scale offline computing and analysis (OLAP). It also provides an easy-to-use RESTful API and client, along with tool components for import, export, backup, recovery, and visualization, so you can easily build a variety of graph-based applications and products. It is widely used in relationship-rich scenarios such as social relationship analysis, marketing recommendation, public opinion monitoring and social listening, information dissemination, and fraud prevention.

HugeGraph supports pluggable back-end storage, and currently supports RocksDB, Cassandra, ScyllaDB, HBase, MySQL, PostgreSQL, Palo, and InMemory, among others.

The requirements of this project are: use TiKV as the back-end storage of HugeGraph to build a graph engine with high scalability, high performance, and high stability. The team can also build on the TiKV+HugeGraph foundation and integrate TiDB to create valuable application solutions for finance and other markets.

PingCAP and HugeGraph will provide experienced mentors to help the project team formulate a reasonable follow-up plan and business direction. After the project graduates, it will be incorporated into the HugeGraph open source project as appropriate, depending on its completion status, for long-term maintenance as a real, production-ready community project. At the same time, HugeGraph and TiKV will actively publicize and promote the project to attract a large number of HugeGraph and TiKV community users to try it out and to keep improving and optimizing it.

HugeGraph and TiKV are both excellent domestic open source products with strong open source community resources. In this project they work together to create a graph database system based on TiKV storage, achieving joint technological innovation and industrial ecosystem cooperation. We look forward to creating unlimited possibilities with the most hardcore technology and the boldest ideas.

Detailed Design

You can refer to the implementation of the RocksDB backend, [com.baidu.hugegraph.backend.store.rocksdb](https://github.com/hugegraph/hugegraph/tree/master/hugegraph-rocksdb).
Create a com.baidu.hugegraph.backend.store.tikv package that implements the following.

  • TikvFeatures, which declares the features the backend supports
  • TikvMetrics, which collects monitoring information from TiKV
  • TikvOptions, which stores the required configuration items; expose as many of the parameters supported by java-client as possible
  • TikvSessions, the interface through which HugeGraph interacts with TiKV
  • TikvStdSessions, the class that actually talks to the storage and implements the methods declared in TikvSessions ([reference](https://github.com/hugegraph/hugegraph/blob/master/hugegraph-rocksdb/src/main/java/com/baidu/hugegraph/backend/store/rocksdb/RocksDBStdSessions.java)), including:
    • Connect to and disconnect from the cluster
    • Create, initialize, and delete a namespace, which can be represented by a fixed key prefix
    • Create and delete a table, which can also be represented by a fixed key prefix
    • table specific operations
      • get(table, key), query by key
      • put(table, key, value), insert data
      • remove(table, key), delete by key
      • scan(table), iterate over the whole table
      • scan(table, startkey, endkey), iterate over a key range
      • scan(table, prefix), iterate over keys with a given prefix (optional)
      • increase(table, key, increment), atomically increase the value stored under the key
      • batchCommit, which guarantees atomic commit of multiple statements
      • TTL support
  • TikvStore, the logical structure that records which tables belong to the store
    • Implements open(), init(), clear(), drop(), etc. by wrapping the functions in TikvSessions
  • TikvStoreProvider, the provider of the store
  • TikvTable, the table template, which encapsulates insert, delete, update, and query operations on a table and relies on the basic operations in TikvStdSessions
  • TikvTables, the table-specific parts that each concrete table must implement on top of the TikvTable template
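
To make the "fixed prefix" idea and the table operations above more concrete, here is a minimal sketch. It is not the HugeGraph or java-client API: the class name PrefixedKvSketch, its methods, and the "<namespace>/<table>/" key layout are all hypothetical, and an in-memory sorted map stands in for TiKV's ordered key space. It shows how get/put/remove, range scan, prefix scan, and increase can all be built on ordered keys with a table prefix. batchCommit and TTL are left out.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Sketch only: a TreeMap stands in for TiKV's ordered key space.
// Class and method names are hypothetical, not HugeGraph or TiKV client APIs.
class PrefixedKvSketch {
    // Keys are ordered as unsigned bytes, like an ordered key-value store.
    private final NavigableMap<byte[], byte[]> kv =
            new TreeMap<byte[], byte[]>((a, b) -> Arrays.compareUnsigned(a, b));
    private final String namespace;

    PrefixedKvSketch(String namespace) { this.namespace = namespace; }

    static byte[] bytes(String s) { return s.getBytes(StandardCharsets.UTF_8); }

    // Physical key = "<namespace>/<table>/" + user key (assumed layout).
    private byte[] encode(String table, byte[] key) {
        byte[] prefix = bytes(namespace + "/" + table + "/");
        byte[] out = Arrays.copyOf(prefix, prefix.length + key.length);
        System.arraycopy(key, 0, out, prefix.length, key.length);
        return out;
    }

    // Smallest key greater than every key that starts with `prefix`,
    // or null if no such key exists (prefix is empty or all 0xFF bytes).
    static byte[] nextPrefix(byte[] prefix) {
        for (int i = prefix.length - 1; i >= 0; i--) {
            if (prefix[i] != (byte) 0xFF) {
                byte[] end = Arrays.copyOf(prefix, i + 1);
                end[i]++;
                return end;
            }
        }
        return null;
    }

    synchronized void put(String table, byte[] key, byte[] value) {
        kv.put(encode(table, key), value);
    }

    synchronized byte[] get(String table, byte[] key) {
        return kv.get(encode(table, key));
    }

    synchronized void remove(String table, byte[] key) {
        kv.remove(encode(table, key));
    }

    // Range scan over [startKey, endKey) within one table, in key order.
    synchronized List<byte[]> scan(String table, byte[] startKey, byte[] endKey) {
        return new ArrayList<>(kv.subMap(encode(table, startKey), true,
                                         encode(table, endKey), false).values());
    }

    // Prefix scan expressed as a range scan [prefix, nextPrefix(prefix)).
    synchronized List<byte[]> scanPrefix(String table, byte[] keyPrefix) {
        byte[] start = encode(table, keyPrefix);
        byte[] end = nextPrefix(start);
        NavigableMap<byte[], byte[]> range = (end == null)
                ? kv.tailMap(start, true)
                : kv.subMap(start, true, end, false);
        return new ArrayList<>(range.values());
    }

    // Whole-table scan is a prefix scan with an empty user-key prefix.
    synchronized List<byte[]> scan(String table) {
        return scanPrefix(table, new byte[0]);
    }

    // Read-modify-write of an 8-byte counter. Here `synchronized` makes it
    // atomic; against a real cluster this would need a transaction or CAS.
    synchronized long increase(String table, byte[] key, long increment) {
        byte[] k = encode(table, key);
        byte[] old = kv.get(k);
        long next = (old == null ? 0L : ByteBuffer.wrap(old).getLong()) + increment;
        kv.put(k, ByteBuffer.allocate(Long.BYTES).putLong(next).array());
        return next;
    }

    public static void main(String[] args) {
        PrefixedKvSketch s = new PrefixedKvSketch("g");
        s.put("vertex", bytes("a1"), bytes("x"));
        s.put("vertex", bytes("a2"), bytes("y"));
        s.put("edge", bytes("a1"), bytes("e"));
        System.out.println(s.scanPrefix("vertex", bytes("a")).size());
        System.out.println(s.increase("counter", bytes("id"), 5));
    }
}
```

In this layout, dropping a table amounts to deleting the key range covered by its prefix, and scan(table, prefix) needs no dedicated storage primitive because it reduces to a range scan. This is one possible reading of "not necessary" in the list above. The sketch also assumes that namespace and table names never contain the "/" delimiter.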

Notes:

  • The work above is partially implemented and still needs to be completed
  • TikvOptions and TikvSessions are closely tied to TiKV and determine the performance of the implementation
  • Use java-client to access TiKV; modify or extend java-client as needed