How to increase the Milvus Serving QPS? #38060
-
I guess the bottleneck is the EBS disk: 5,000 IOPS, 250 MB/s throughput?
-
You can try increasing the queryNode.grouping.maxNQ value in milvus.yaml (see the snippet below):
This configuration controls the request-merging behavior of query nodes: small requests are merged into a single request before execution, which improves throughput but also increases per-request latency.
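For reference, a minimal milvus.yaml fragment; the values below are only examples, not recommendations, so check your deployment's defaults and raise maxNQ gradually:

```yaml
queryNode:
  grouping:
    enabled: true   # request merging must stay enabled for maxNQ to take effect
    maxNQ: 2000     # example value; higher merges more requests but adds per-request latency
```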
-
I got something new:
Which is more likely the root cause? @xiaofan-luan
-
So I tried HNSW + MMAP + local NVMe SSD, and IVF_SQ8 + MMAP + local NVMe SSD:
I'm surprised by these numbers. I thought HNSW would be faster? @xiaofan-luan
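For context, this is roughly how I build the two index variants and turn on mmap with pymilvus. It's only a sketch: the collection/field names, metric type, and build parameters are placeholders, not my exact settings.

```python
from pymilvus import Collection, connections

connections.connect(host="localhost", port="19530")  # placeholder endpoint

coll = Collection("my_collection")  # placeholder collection name
coll.release()                      # collection must be released before changing the index
coll.drop_index()

# Variant 1: HNSW (graph index; random-access heavy when mmapped from disk).
coll.create_index(
    field_name="embedding",         # placeholder 1536-d vector field name
    index_params={
        "index_type": "HNSW",
        "metric_type": "L2",                        # or IP/COSINE, depending on the schema
        "params": {"M": 16, "efConstruction": 200}, # example build params, not tuned values
    },
)

# Variant 2: IVF_SQ8 (quantized, smaller footprint).
# coll.create_index(
#     field_name="embedding",
#     index_params={"index_type": "IVF_SQ8", "metric_type": "L2", "params": {"nlist": 4096}},
# )

# Enable mmap at the collection level (collection property in Milvus 2.4).
coll.set_properties({"mmap.enabled": "true"})

coll.load()
```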
-
I have a Milvus distributed cluster (2.4.17). The data uses a partition key and a clustering key, with ~30M 1536-dimensional vectors. The collection uses HNSW + MMAP on an AWS GP3 EBS disk (5,000 IOPS, 250 MB/s throughput).
I launched 16 threads that continuously send top-100 searches to this collection, using the partition key and clustering key as the filtering condition (roughly like the sketch below), and got the following results:
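The load generator looks roughly like this pymilvus sketch; the collection/field names, filter expression, and ef value are placeholders for my real ones:

```python
import random
import threading
import time

from pymilvus import Collection, connections

connections.connect(host="localhost", port="19530")  # placeholder endpoint
coll = Collection("my_collection")                    # placeholder collection name
coll.load()

NUM_THREADS = 16
TOP_K = 100
DIM = 1536
count = 0
lock = threading.Lock()

def worker():
    global count
    # Placeholder filter on the partition-key and clustering-key fields.
    expr = 'tenant_id == "tenant_42" and cluster_id == 7'
    vec = [[random.random() for _ in range(DIM)]]     # one query vector per request
    while True:
        coll.search(
            data=vec,
            anns_field="embedding",                   # placeholder vector field name
            param={"metric_type": "L2", "params": {"ef": 128}},  # ef is an example value
            limit=TOP_K,
            expr=expr,
        )
        with lock:
            count += 1

for _ in range(NUM_THREADS):
    threading.Thread(target=worker, daemon=True).start()

start = time.time()
while True:
    time.sleep(10)
    print(f"observed QPS: {count / (time.time() - start):.1f}")
```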
I'm struggling to increase the query rate further and want to know where the bottleneck is.
The cluster has only 1 query node (15 cores, 120 GB memory + 1 TB GP3 EBS volume).
Per the dashboard, query node CPU utilization is around 20-40% and proxy node CPU utilization is ~15%, so it seems nothing is throttled.
I tried increasing the number of query nodes and proxy nodes, but it didn't help increase the QPS.
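If it matters: my understanding is that search QPS mostly scales with the number of in-memory replicas rather than the raw query-node count, so extra query nodes may not take search traffic unless the collection is loaded with more replicas. Treat that as an assumption on my part; here is a minimal sketch with a placeholder collection name, where replica_number=2 is only an example:

```python
from pymilvus import Collection, connections

connections.connect(host="localhost", port="19530")  # placeholder endpoint
coll = Collection("my_collection")                    # placeholder collection name

# Release, then reload with more in-memory replicas so additional query nodes
# can serve copies of the segments (replica_number defaults to 1).
coll.release()
coll.load(replica_number=2)
```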
May I know what I should do to increase the QPS to >10k?