Inserting data way too slow. #68

Open
pveentjer opened this issue Aug 12, 2021 · 3 comments

pveentjer commented Aug 12, 2021

I'm using the following command to insert data:

go/bin/scylla-bench  -workload sequential -mode write -partition-count 10000 -nodes 172.31.24.133  2>&1 | tee -a scylla-bench-12-08-2021_15-03-19.log

Both Scylla Monitor and the command-line output show about 22K operations/second:

  39s   21738   21738      0 1.4ms  1ms    885µs  819µs  786µs  754µs  735µs  
  40s   21803   21803      0 1.5ms  1.1ms  852µs  819µs  786µs  754µs  733µs  
  41s   21779   21779      0 1.5ms  1ms    885µs  819µs  786µs  754µs  734µs  
  42s   21714   21714      0 1.5ms  1ms    885µs  819µs  786µs  754µs  736µs  
  43s   21682   21682      0 1.4ms  1ms    885µs  819µs  786µs  754µs  737µs  
  44s   21713   21713      0 1.6ms  1.3ms  918µs  819µs  786µs  754µs  736µs  
  45s   21766   21766      0 1.5ms  1.1ms  885µs  819µs  786µs  754µs  734µs  
  46s   21765   21765      0 1.4ms  1.1ms  885µs  819µs  786µs  754µs  734µs  
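
As a quick check on the 22K figure, the sample lines can be averaged with a short script. This is only a sketch: it assumes the second whitespace-separated column is the operations/second value, which matches the rate quoted in the post but is not confirmed by a header line here. The helper name `ops_per_second` is made up for this example.

```python
# Sample lines copied from the scylla-bench output in the post.
SAMPLE_OUTPUT = """\
  39s   21738   21738      0 1.4ms  1ms    885µs  819µs  786µs  754µs  735µs
  40s   21803   21803      0 1.5ms  1.1ms  852µs  819µs  786µs  754µs  733µs
  41s   21779   21779      0 1.5ms  1ms    885µs  819µs  786µs  754µs  734µs
  42s   21714   21714      0 1.5ms  1ms    885µs  819µs  786µs  754µs  736µs
  43s   21682   21682      0 1.4ms  1ms    885µs  819µs  786µs  754µs  737µs
  44s   21713   21713      0 1.6ms  1.3ms  918µs  819µs  786µs  754µs  736µs
  45s   21766   21766      0 1.5ms  1.1ms  885µs  819µs  786µs  754µs  734µs
  46s   21765   21765      0 1.4ms  1.1ms  885µs  819µs  786µs  754µs  734µs
"""


def ops_per_second(output):
    """Return the second column of every non-empty line as integers.

    Assumption: the second column holds the operations/second figure.
    """
    rates = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            rates.append(int(fields[1]))
    return rates


rates = ops_per_second(SAMPLE_OUTPUT)
mean_rate = sum(rates) / len(rates)
print(f"samples={len(rates)} mean={mean_rate:.0f} ops/s")
```

The mean over these eight samples comes out a little under 22K operations/second, in line with what the monitor shows.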

At that rate, inserting 10,000 partitions should take about half a second.

In reality, the command runs for 46 seconds.

I also calculated the insert throughput manually (partition-count divided by elapsed time) and got about 210 inserts/second. When I increase the partition count to 20K or 30K, the manually calculated throughput stays constant at about 210 inserts/second.

So it seems there is roughly a factor of 100 difference between the manually calculated insertion rate and the writes/second listed by Scylla Monitor and scylla-bench.
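
The manual estimate can be reproduced with a few lines of Python. This is a sketch using only numbers from the post: 10,000 partitions, a 46-second run, and a reported rate of 21,745 operations/second (the mean of the eight sample lines shown). With the 46-second figure the rate works out to about 217 partitions/second, in the same range as the 210 quoted.

```python
partition_count = 10_000        # -partition-count from the command in the post
elapsed_seconds = 46            # observed run time from the post
reported_ops_per_sec = 21_745   # mean of the eight sample lines in the post

# Manually calculated throughput: partitions written per second of wall-clock time.
partitions_per_sec = partition_count / elapsed_seconds

# How much larger the reported rate is than the manual one.
ratio = reported_ops_per_sec / partitions_per_sec

print(f"manual rate: {partitions_per_sec:.1f} partitions/s")
print(f"reported / manual: {ratio:.1f}x")
```

The ratio lands almost exactly on 100, which suggests the two numbers count different units rather than one of them being mismeasured.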

@pveentjer pveentjer added the bug label Aug 12, 2021
pveentjer (Author) commented

When I add -clustering-row-count 1, the problem is resolved:

go/bin/scylla-bench -workload sequential -clustering-row-count 1 -mode write -partition-count 40000 -partition-offset 0 -nodes 172.31.24.133 2>&1 | tee -a scylla-bench-12-08-2021_15-39-25.log

Inserting 20K partitions finishes very quickly.

michoecho commented

> So it seems there is roughly a factor of 100 difference between the manually calculated insertion rate and the writes/second listed by Scylla Monitor and scylla-bench.
> When I add clustering-row-count 1, the problem is resolved.

That's because the default is -clustering-row-count 100, so 100 rows are inserted into each partition. You get 210 partition inserts per second, which equals 21,000 row inserts per second.
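
The arithmetic behind this explanation can be sketched as follows. It assumes the default of 100 clustering rows per partition stated in this comment, takes the 10,000 partitions from the original command, and uses 21,745 rows/second as the measured rate (the mean of the sample lines in the original post). The variable names are illustrative only.

```python
partition_count = 10_000       # -partition-count from the original command
clustering_row_count = 100     # default per this comment; not set on the command line
row_rate = 21_745              # rows/s, mean of the sample lines in the original post

# Each partition receives clustering_row_count rows, so the run writes this many rows.
total_rows = partition_count * clustering_row_count

# Expected wall-clock time at the measured row rate.
expected_seconds = total_rows / row_rate

# The same rate expressed in partitions per second.
partition_rate = row_rate / clustering_row_count

print(f"total rows: {total_rows}")
print(f"expected run time: {expected_seconds:.1f} s")
print(f"partition rate: {partition_rate:.1f} partitions/s")
```

Under these assumptions the expected run time is about 46 seconds, matching the observed duration, and the partition rate is about 217 per second, close to the manually calculated figure.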

pveentjer (Author) commented

Why not set the default to 1? That would make more sense.
