Replies: 3 comments 4 replies
-
Thanks for your awesome design proposal.
For the second solution, it's a costless and consistent way to generate SST files, but I'm a bit worried about the performance, because every command needs to go through the network stack. As for the command part, it'd be better to use the RESP format if possible, so that we won't need to take care of a special format. Let's see if others have any thoughts on this topic. @torwig @PragmaTwice @mapleFU @enjoy-binbin @caipengbo @ShooterIT
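To make the RESP suggestion concrete: RESP frames every command as an array of bulk strings, so a bulk-load command would need no special wire format at all. A minimal Python sketch of the encoding (the INGEST command name and the file path are hypothetical, purely for illustration):

```python
def encode_resp(*args: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    parts = [f"*{len(args)}\r\n".encode()]
    for arg in args:
        data = arg.encode()
        parts.append(f"${len(data)}\r\n".encode() + data + b"\r\n")
    return b"".join(parts)

# A hypothetical bulk-load command, framed like any other command:
print(encode_resp("INGEST", "/tmp/exchange.json"))
```

Any existing RESP client or proxy would then be able to send the command without special-casing it.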
-
The amount of imported data is usually very large, and it may be inefficient to use kvrocks itself to do this. I think we should provide external tools or big-data programs to directly generate the final key-value data.
For importing information, I think you can use a command like:
The common use cases I can think of for bulk loading are cold starts, migrating to kvrocks from other databases, or periodically ingesting data. Blocking writes should be acceptable behavior for the user. In general, ingesting data should be fast (I think tens of seconds is enough for big data); what's slower is generating and downloading the data.
-
BulkLoad is nice: it lets us make kvrocks load batch-computed results. The data source can come from Spark ETL jobs and other pipelines. Generally, generating SST files is not hard; however, we need to consider the syntax for bulk load:
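As a starting point for that discussion (purely a sketch to anchor the syntax question, not a settled design), the command might take a path to a manifest plus flags for conflict handling and blocking behavior:

```
BULKLOAD <path-to-exchange-file> [REPLACE] [SYNC|ASYNC]
```

REPLACE would control whether imported keys overwrite existing ones, and SYNC/ASYNC whether the command blocks until ingestion finishes.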
-
Discussion for this issue #1301
Hello everyone, after some effort I now have the basic logic of bulk load working. However, there are still some aspects I am unsure of that require discussion with the community. Below, I will briefly explain the basic process of the bulk load function and raise some questions.
How can we implement bulk load
Bulk load uses the SST-ingestion feature of RocksDB (DB::IngestExternalFile), which allows us to quickly import pre-generated SST files into the kvrocks DB. Similar open-source projects that use RocksDB as their storage engine, such as TiDB and Pegasus, implement bulk load the same way. We can use a tool to generate SST files in the kvrocks database format, and then ingest those SST files into kvrocks.
Therefore, to implement the bulk load, we need to design three things:
1. a tool for generating SST files (we call it the make-sst tool),
2. an ingest command in kvrocks that can ingest SST files (we call it the ingest command),
3. a format for SST resources that the ingest command can accept, generated by the make-sst tool (we call it the exchange format file).

Make-sst tool
There are two ways to generate SST files:
1. Using SstFileWriter to generate SST files directly in the kvrocks SST file format.
2. Starting a kvrocks storage engine, inserting elements using its API, and finally taking the generated DB files.
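Method 1's main constraint is that SstFileWriter only accepts keys in ascending order, so the make-sst tool must sort and chunk the input first. A small Python sketch of that preprocessing step (the real SstFileWriter is part of RocksDB's C++ API, so this only plans the batches; the 128 MiB size cap is an arbitrary placeholder):

```python
from typing import Iterable, List, Tuple

TARGET_SST_BYTES = 128 * 1024 * 1024  # placeholder size cap per SST file


def plan_sst_files(pairs: Iterable[Tuple[bytes, bytes]],
                   cap: int = TARGET_SST_BYTES) -> List[List[Tuple[bytes, bytes]]]:
    """Sort key-value pairs and split them into batches, one per SST file.

    SstFileWriter requires keys in ascending order, so we sort globally
    first; each batch stays under the size cap.
    """
    batches: List[List[Tuple[bytes, bytes]]] = []
    current: List[Tuple[bytes, bytes]] = []
    size = 0
    for key, value in sorted(pairs, key=lambda kv: kv[0]):
        entry_size = len(key) + len(value)
        if current and size + entry_size > cap:
            batches.append(current)
            current, size = [], 0
        current.append((key, value))
        size += entry_size
    if current:
        batches.append(current)
    return batches
```

Each batch would then be streamed, in order, through one SstFileWriter instance to produce one SST file.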
For method 1, because SstFileWriter requires that writes be in key order, we need to sort the elements ourselves before writing them to the SST file. In addition, we also need to determine an appropriate size for each SST file.
For method 2, I am not sure whether the kvrocks storage engine can be started independently and whether the resulting DB SST files can be provided directly to the ingest command. If that entire process works, I think method 2 can reduce many compatibility risks for us.

Exchange format file
For the exchange format, we need to record which SST file should be merged into which column family, and we also need the checksum of each SST file to verify its integrity.
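For the integrity check, an ordinary per-file digest would be enough; a minimal sketch using SHA-256 (the choice of algorithm is open, this is only for illustration):

```python
import hashlib


def sst_checksum(path: str) -> str:
    """Return the hex SHA-256 digest of an SST file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return "sha256:" + digest.hexdigest()
```

On the ingest side, kvrocks would recompute the digest before handing the file to RocksDB and reject the file on a mismatch.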
I think we can use a JSON file to store this information. By providing this JSON file to the ingest command, we can import multiple SST files at once. In this schema, version specifies the version of the exchange format, files is an array of SST files to ingest, and each file object contains the path to the SST file, the column family to merge into, and the checksum of the file.

Ingest command
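The ingest command would consume the exchange format file described above. To make that concrete, one possible shape for the JSON file (field names and column-family values here are a sketch, not a settled design):

```json
{
  "version": 1,
  "files": [
    {
      "path": "part-000.sst",
      "column_family": "metadata",
      "checksum": "sha256:..."
    },
    {
      "path": "part-001.sst",
      "column_family": "default",
      "checksum": "sha256:..."
    }
  ]
}
```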
The Ingest command needs a way to choose whether the imported keys overwrite existing keys in the database, and we need to provide both sync and async versions of the command.

Summary
So, we need to talk about the following issues, for example: what fields should the exchange format have?

Reference
RocksDB docs
TiDB docs
TiDB docs
Pegasus blog (Chinese only)
https://rockset.com/blog/optimizing-bulk-load-in-rocksdb/
https://www.cockroachlabs.com/blog/bulk-ingest-from-csv/