Performance impact of large number of collections #37594

jubingc · 2024-11-11T21:18:44Z

A Milvus instance allows up to 65,536 collections. However, too many collections may result in performance issues.

I would like to better understand how the number of collections and partitions affects performance. Could you clarify what constitutes "too many" collections in this context?

As an example, the calculation below multiplies the number of collections, shards, and partitions. However, shards are primarily for data writing, while partitions and segments are used for data reading. Why are these elements multiplied together?

60 (collections) x 2 (shards) x 4 (partitions) + 40 (collections) x 1 (shard) x 12 (partitions) = 960
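The arithmetic above can be reproduced with a short sketch. The function name `general_capacity` and the tuple layout are illustrative assumptions, not part of any Milvus API; the function just sums collections x shards x partitions over each group of collections.

```python
from typing import Iterable, Tuple

# Hypothetical helper (not a Milvus API). Each group is
# (number of collections, shards per collection, partitions per collection).
def general_capacity(groups: Iterable[Tuple[int, int, int]]) -> int:
    """Sum collections * shards * partitions across all groups."""
    return sum(c * s * p for c, s, p in groups)

# The example from the question: 60 x 2 x 4 + 40 x 1 x 12
example = [(60, 2, 4), (40, 1, 12)]
print(general_capacity(example))  # 960
```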

Additionally, per the documentation, the maximum number of partitions in a collection is 4,096 (with a default of 1,024, controlled by rootCoord.maxPartitionNum). Given a shared rootCoord.maxGeneralCapacity, which of the following configurations would likely yield better performance?

1,024 collections, 2 shards per collection, 16 partitions per collection = 32,768 general capacity
16 collections, 2 shards per collection, 1,024 partitions per collection = 32,768 general capacity

Beyond performance, I’d also appreciate insights into the pros and cons of each setup. Some drawbacks of the second setup I’m aware of include:
a. The recommended size for a partition is up to 1 billion items (reference).
b. There is currently no way to filter data within a partition quickly.

Are there additional pros or cons to consider for each of these configurations?

yhmo · 2024-11-12T02:41:54Z

Collaborator

Take a look at this chart to understand how Milvus manages data in shards, partitions, collections, and segments:

5 replies

yhmo Nov 12, 2024
Collaborator

Assume a collection has 2 shards and 2 partitions. To maintain the data in this collection, Milvus maintains 2 virtual channels for each partition (one per shard), so there are 2 * 2 virtual channels to manage. With 60 collections, each having 2 shards and 4 partitions, there are 60 * 2 * 4 = 480 virtual channel objects to manage. Every row of data passing through a virtual channel has to be carefully recorded.

yhmo Nov 12, 2024
Collaborator

1,024 collections, 2 shards per collection, 16 partitions
16 collections, 2 shards per collection, 1,024 partitions

There is not much difference between the two cases: either way, 1024 * 2 * 16 = 32,768 virtual channels need to be maintained.

Typically, we recommend keeping the total number of virtual channels under 1,000. In v2.4.x, 5,000 to 10,000 v-channels may also work, but we don't recommend running with a high number of v-channels.
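To make the comparison concrete, here is a minimal sketch that counts virtual channels for both layouts and checks them against the rough guidance above. The function names and threshold constants are illustrative; they come from the numbers in this reply, not from any Milvus setting or enforced limit.

```python
# Rough guidance from this reply, not enforced Milvus limits.
RECOMMENDED_MAX = 1_000       # "controlled under 1000"
MAY_WORK_MAX_V24 = 10_000     # upper end of "5000 ~ 10000" for v2.4.x

def vchannel_count(collections: int, shards: int, partitions: int) -> int:
    """Virtual channels to manage: collections * shards * partitions."""
    return collections * shards * partitions

def assess(count: int) -> str:
    """Classify a v-channel count against the rough guidance."""
    if count < RECOMMENDED_MAX:
        return "within recommendation"
    if count <= MAY_WORK_MAX_V24:
        return "may work on v2.4.x, not recommended"
    return "beyond the range discussed"

layout_a = vchannel_count(1024, 2, 16)   # many collections, few partitions
layout_b = vchannel_count(16, 2, 1024)   # few collections, many partitions
print(layout_a, layout_b, assess(layout_a))
```

Both layouts produce the same count, which is why neither is expected to behave much differently from the other on this metric alone.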

jubingc Nov 12, 2024
Author

@yhmo thanks. That diagram is helpful. Is it available somewhere in the docs? Suppose we have more than 10K v-channels, which component would become the bottleneck first?

jubingc Nov 14, 2024
Author

@yhmo v-channels are for write performance, right? What about read performance: do the following two options have the same read performance?

1,024 collections, 2 shards per collection, 16 partitions
16 collections, 2 shards per collection, 1,024 partitions

I'm asking because of this sentence from the doc:

The search performance of partition-oriented multi-tenancy is much better than collection-oriented multi-tenancy.

Why does partition-oriented multi-tenancy have better search performance than collection-oriented multi-tenancy? @xiaofan-luan

yhmo Nov 15, 2024
Collaborator

As @xiaofan-luan mentioned, "partition is considered to be more lightweight than collection". So partition-oriented multi-tenancy is better.

xiaofan-luan · 2024-11-12T19:31:05Z

Maintainer

@yanliang567 is working on the effect of a large number of collections/partitions.

The goal here is to support 10,000 collections with 4,096 partitions in one cluster.

This could be part of Milvus 2.5.X.

4 replies

jubingc Nov 12, 2024
Author

@xiaofan-luan, regarding your mention that Milvus 2.5.X will support:

10,000 collections with 4,096 partitions in one milvus cluster

Does this imply that the total number of v-channels will be 10,000 * 4,096? From @yhmo's response above, it seems that the number of v-channels significantly impacts performance. If so, the performance calculation would depend primarily on the number of v-channels, determined by multiplying the number of partitions by the number of shards. Collections, by contrast, don’t appear to directly impact performance, except that partitions and shards are contained within individual collections and cannot cross collection boundaries.

Also, is there an estimated release timeline for Milvus 2.5.X?

xiaofan-luan Nov 12, 2024
Maintainer

collection * shard * partition is the minimum number of segments you get for a cluster.
For 1000 collections * 2 shards * 1000 partitions, you get 2M segments, which is huge and brings a lot of maintenance overhead.
Here are some tricks:

1. For a small collection (< 100M), only one shard is required.
2. A partition is considered more lightweight than a collection.

But the number of partitions per collection is limited. We are going to test it, but right now our recommendation is fewer than 10K.

So something we can try is to start from 16 collections * 1 shard * 1000 partitions. This should work perfectly.
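A quick sketch of that lower bound, assuming at least one segment per collection/shard/partition combination as described above. `min_segments` is a hypothetical helper, not a Milvus function.

```python
def min_segments(collections: int, shards: int, partitions: int) -> int:
    """Lower bound on segment count: one per collection/shard/partition combination."""
    return collections * shards * partitions

large = min_segments(1000, 2, 1000)   # the 2M-segment case
small = min_segments(16, 1, 1000)     # the suggested starting point
print(f"{large:,} vs {small:,} segments as a lower bound")
```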

xiaofan-luan Nov 12, 2024
Maintainer

2.5 is going to be released soon, this week or next. It doesn't include those improvements yet; this is currently under evaluation and still needs to be improved.

jubingc Nov 14, 2024
Author

@xiaofan-luan Thank you for the suggestion.

Beyond performance, we're also evaluating strategies for multi-tenancy. Our use case involves managing Milvus as a service for internal teams. We currently use collections for multi-tenancy, assigning each user their own collection, since usage patterns vary and we can't predict collection sizes (users may ingest data unpredictably). We have around 150 collections, each with only one partition. Not all collections share the same schema, so partition-based multi-tenancy might not be an option.

If we were to use partitions for multi-tenancy, though, we’d likely encounter some limitations sooner, including:

All partitions in the same collection must share the same schema.
The maximum partition limit per collection (4,096) is significantly lower than the maximum number of collections (65,536).
Data filtering by partition would no longer be an option (a tenant could have multiple partitions, which adds management overhead).
Partition size limitations (1B records) could cause users to reach size limits earlier than if they used collections.
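These constraints can be written down as a small feasibility check. This is only a sketch: `partition_tenancy_blockers` is a hypothetical helper, and the limits are the figures quoted in this thread (4,096 partitions per collection, about 1B records per partition), not values read from a live cluster.

```python
from typing import List

MAX_PARTITIONS_PER_COLLECTION = 4_096        # limit quoted in this thread
RECOMMENDED_MAX_ROWS_PER_PARTITION = 10**9   # "1B records" quoted in this thread

def partition_tenancy_blockers(num_tenants: int,
                               distinct_schemas: int,
                               largest_tenant_rows: int) -> List[str]:
    """List reasons why one partition per tenant in a single collection would not fit."""
    blockers = []
    if distinct_schemas > 1:
        blockers.append("tenants do not share one schema")
    if num_tenants > MAX_PARTITIONS_PER_COLLECTION:
        blockers.append("more tenants than partitions allowed per collection")
    if largest_tenant_rows > RECOMMENDED_MAX_ROWS_PER_PARTITION:
        blockers.append("largest tenant exceeds the recommended partition size")
    return blockers

# 150 tenants as in our setup; the schema count and row count are illustrative.
print(partition_tenancy_blockers(150, distinct_schemas=5, largest_tenant_rows=10**7))
```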

Any suggestions would be helpful. Thanks!
