docs: table sharding in user guide #1050
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Walkthrough
The update introduces a user guide for table sharding in GreptimeDB, explaining its role in distributed databases, providing guidelines for sharding tables, and detailing partition rules based on column value ranges. It also adds a new section for table-sharding in the
Changes
Actionable comments posted: 2
Outside diff range, codebase verification and nitpick comments (2)
docs/nightly/en/user-guide/operations/table-sharding.md (2)
1-5: Enhance the introduction by mentioning the benefits.
The introduction is clear, but adding a brief mention of the benefits of table sharding, such as improved performance and scalability, would provide better context for readers.
- Table sharding can achieve better performance and scalability.
+ Table sharding can achieve better performance, load balancing, and scalability.
7-10: Improve sentence structure and punctuation.
The sentence structure can be improved for better readability, and a comma is missing after "Thus".
- Thus when you have a table that is too large to fit into a single node, or the table is too hot to be served by a single node, you should consider sharding it.
+ Thus, when you have a table that is too large to fit into a single node, or the table is too hot to be served by a single node, you should consider sharding it.
Tools
LanguageTool
[uncategorized] ~9-~9: A comma may be missing after the conjunctive/linking adverb ‘Thus’.
Context: ... is corresponding to a table partition. Thus when you have a table that is too large...(SENT_START_CONJUNCTIVE_LINKING_ADVERB_COMMA)
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files ignored due to path filters (1)
docs/public/table-sharding-partition.png is excluded by !**/*.png
Files selected for processing (3)
- docs/nightly/en/contributor-guide/frontend/table-sharding.md (1 hunks)
- docs/nightly/en/summary.yml (1 hunks)
- docs/nightly/en/user-guide/operations/table-sharding.md (1 hunks)
Files skipped from review due to trivial changes (2)
- docs/nightly/en/contributor-guide/frontend/table-sharding.md
- docs/nightly/en/summary.yml
Additional context used
LanguageTool
docs/nightly/en/user-guide/operations/table-sharding.md
[uncategorized] ~9-~9: A comma may be missing after the conjunctive/linking adverb ‘Thus’.
Context: ... is corresponding to a table partition. Thus when you have a table that is too large...(SENT_START_CONJUNCTIVE_LINKING_ADVERB_COMMA)
[uncategorized] ~65-~65: A comma may be missing after the conjunctive/linking adverb ‘Hence’.
Context: ...onal storage system like MySQL or TiKV. Hence you can write whatever complex rule you...(SENT_START_CONJUNCTIVE_LINKING_ADVERB_COMMA)
GitHub Check: Run Linters
docs/nightly/en/user-guide/operations/table-sharding.md
[warning] 26-26: "partitons" should be "partitions".
[warning] 65-65: "separatly" should be "separately".
Additional comments not posted (1)
docs/nightly/en/user-guide/operations/table-sharding.md (1)
11-25: LGTM!
The explanation of the partition rule set is clear and accurate.
Actionable comments posted: 0
Outside diff range, codebase verification and nitpick comments (3)
docs/nightly/en/user-guide/operations/table-sharding.md (3)
9-9: Add a comma after 'Thus'.
A comma may be missing after the conjunctive/linking adverb ‘Thus’.
- Thus when you have a table that is too large to fit into a single node, or the table is too hot to be served by a single node, you should consider sharding it.
+ Thus, when you have a table that is too large to fit into a single node, or the table is too hot to be served by a single node, you should consider sharding it.
Tools
LanguageTool
[uncategorized] ~9-~9: A comma may be missing after the conjunctive/linking adverb ‘Thus’.
Context: ... is corresponding to a table partition. Thus when you have a table that is too large...(SENT_START_CONJUNCTIVE_LINKING_ADVERB_COMMA)
64-64: Possible missing preposition 'to'.
A preposition may be missing before 'stress'.
- Two things need stress in this complex example:
+ Two things need to stress in this complex example:
Tools
LanguageTool
[uncategorized] ~64-~64: Possible missing preposition found.
Context: ...harding-partition.png) Two things need stress in this complex example: - Each column ...(AI_HYDRA_LEO_MISSING_TO)
65-65: Add a comma after 'Hence'.
A comma may be missing after the conjunctive/linking adverb ‘Hence’.
- Hence you can write whatever complex rule you want without considering the "primary key order" or "physical storage order".
+ Hence, you can write whatever complex rule you want without considering the "primary key order" or "physical storage order".
Tools
LanguageTool
[uncategorized] ~65-~65: A comma may be missing after the conjunctive/linking adverb ‘Hence’.
Context: ...onal storage system like MySQL or TiKV. Hence you can write whatever complex rule you...(SENT_START_CONJUNCTIVE_LINKING_ADVERB_COMMA)
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Actionable comments posted: 2
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (1)
- docs/nightly/en/user-guide/operations/table-sharding.md (1 hunks)
Additional context used
LanguageTool
docs/nightly/en/user-guide/operations/table-sharding.md
[uncategorized] ~9-~9: A comma may be missing after the conjunctive/linking adverb ‘Thus’.
Context: ... is corresponding to a table partition. Thus when you have a table that is too large...(SENT_START_CONJUNCTIVE_LINKING_ADVERB_COMMA)
[uncategorized] ~26-~26: Possible missing comma found.
Context: ... name and literals. Here is a concrete example that shard `my_table` into 3 partitions...(AI_HYDRA_LEO_MISSING_COMMA)
[uncategorized] ~65-~65: A comma may be missing after the conjunctive/linking adverb ‘Hence’.
Context: ...onal storage system like MySQL or TiKV. Hence you can write whatever complex rule you...(SENT_START_CONJUNCTIVE_LINKING_ADVERB_COMMA)
Additional comments not posted (4)
docs/nightly/en/user-guide/operations/table-sharding.md (4)
1-4: LGTM!
The introduction section explains the importance and benefits of table sharding in GreptimeDB clearly and concisely.
41-58: LGTM!
The section provides a clear example of partitioning based on multiple primary key columns, and the illustration is helpful.
59-63: LGTM!
The section provides a clear illustration of the partition rule in a 2-dimensional space, which is helpful.
10-26: Fix missing comma.
A comma is missing after "example".
- Here is a concrete example that shard `my_table` into 3 partitions based on column `a`:
+ Here is a concrete example that sharded `my_table` into 3 partitions based on column `a`:
Likely invalid or redundant comment.
Tools
LanguageTool
[uncategorized] ~26-~26: Possible missing comma found.
Context: ... name and literals. Here is a concrete example that shard `my_table` into 3 partitions...(AI_HYDRA_LEO_MISSING_COMMA)
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Actionable comments posted: 8
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files ignored due to path filters (1)
docs/public/table-sharding-load.png is excluded by !**/*.png
Files selected for processing (1)
- docs/nightly/en/user-guide/operations/table-sharding.md (1 hunks)
Additional context used
LanguageTool
docs/nightly/en/user-guide/operations/table-sharding.md
[uncategorized] ~9-~9: A comma may be missing after the conjunctive/linking adverb ‘Thus’.
Context: ... is corresponding to a table partition. Thus when you have a table that is too large...(SENT_START_CONJUNCTIVE_LINKING_ADVERB_COMMA)
[typographical] ~11-~11: Consider adding a comma after ‘Ideally’ for more clarity.
Context: ...ase the number of regions in the table. Ideally the overall throughput of a table shoul...(RB_LY_COMMA)
[grammar] ~13-~13: Make sure the noun ‘requirement’ is in agreement with the verb ‘ingest’. Beware that some collective nouns (like ‘police’ or ‘team’) can be treated as both singular and plural.
Context: ...eed to consider the requirement of data ingest rate, the query performance, the data d...(DT_NN_OF_NNS_VB)
[grammar] ~13-~13: The word ‘shard’ is a noun or an adjective. A verb or adverb is missing or misspelled here, or maybe a comma is missing.
Context: ...tribution on storage system. You should shard a table only when necessary. ## Partit...(PRP_MD_NN)
[uncategorized] ~30-~30: Possible missing comma found.
Context: ... name and literals. Here is a concrete example that shard `my_table` into 3 partitions...(AI_HYDRA_LEO_MISSING_COMMA)
[uncategorized] ~49-~49: Possible missing comma found.
Context: ...we want to partition the table for some reason like one region's capacity is 30 load u...(AI_HYDRA_LEO_MISSING_COMMA)
[uncategorized] ~72-~72: Possible missing preposition found.
Context: ...harding-partition.png) Two things need stress in this complex example: - Each column ...(AI_HYDRA_LEO_MISSING_TO)
[uncategorized] ~73-~73: A comma may be missing after the conjunctive/linking adverb ‘Hence’.
Context: ...onal storage system like MySQL or TiKV. Hence you can write whatever complex rule you...(SENT_START_CONJUNCTIVE_LINKING_ADVERB_COMMA)
[style] ~78-~78: Try using a synonym here to strengthen your writing.
Context: ...information-schema/partitions.md) which gives the detail of partitions inside one tab...(GIVE_PROVIDE)
GitHub Check: Run Linters
docs/nightly/en/user-guide/operations/table-sharding.md
[warning] 13-13: "increse" should be "increase".
Actionable comments posted: 7
Outside diff range, codebase verification and nitpick comments (1)
docs/nightly/en/user-guide/operations/table-sharding.md (1)
11-11: Add a comma for clarity.
Consider adding a comma after ‘Ideally’ for more clarity.
- Ideally the overall throughput of a table should be proportional to the number of regions.
+ Ideally, the overall throughput of a table should be proportional to the number of regions.
Tools
LanguageTool
[typographical] ~11-~11: Consider adding a comma after ‘Ideally’ for more clarity.
Context: ...ase the number of regions in the table. Ideally the overall throughput of a table shoul...(RB_LY_COMMA)
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (1)
- docs/nightly/en/user-guide/operations/table-sharding.md (1 hunks)
Additional context used
LanguageTool
docs/nightly/en/user-guide/operations/table-sharding.md
[uncategorized] ~9-~9: This verb may not be in the correct tense. Consider changing the tense to fit the context better.
Context: ...ed on the region level. And each region is corresponding to a table partition. Thus when you hav...(AI_EN_LECTOR_REPLACEMENT_VERB_TENSE)
[uncategorized] ~9-~9: A comma may be missing after the conjunctive/linking adverb ‘Thus’.
Context: ... is corresponding to a table partition. Thus when you have a table that is too large...(SENT_START_CONJUNCTIVE_LINKING_ADVERB_COMMA)
[typographical] ~11-~11: Consider adding a comma after ‘Ideally’ for more clarity.
Context: ...ase the number of regions in the table. Ideally the overall throughput of a table shoul...(RB_LY_COMMA)
[uncategorized] ~13-~13: The grammatical number of this noun doesn’t look right. Consider replacing it.
Context: ...sed in parallel among regions. In other word the query latency is depends on the "sl...(AI_EN_LECTOR_REPLACEMENT_NOUN_NUMBER)
[uncategorized] ~13-~13: A comma might be missing here.
Context: ...n parallel among regions. In other word the query latency is depends on the "slowes...(AI_EN_LECTOR_MISSING_PUNCTUATION_COMMA)
[grammar] ~13-~13: The verb form seems incorrect.
Context: ...egions. In other word the query latency is depends on the "slowest" region's latency. But...(IS_VBZ)
[grammar] ~15-~15: Make sure the noun ‘requirement’ is in agreement with the verb ‘ingest’. Beware that some collective nouns (like ‘police’ or ‘team’) can be treated as both singular and plural.
Context: ...eed to consider the requirement of data ingest rate, the query performance, the data d...(DT_NN_OF_NNS_VB)
[grammar] ~15-~15: The word ‘shard’ is a noun or an adjective. A verb or adverb is missing or misspelled here, or maybe a comma is missing.
Context: ...tribution on storage system. You should shard a table only when necessary. ## Partit...(PRP_MD_NN)
[uncategorized] ~32-~32: Possible missing comma found.
Context: ... name and literals. Here is a concrete example that shard `my_table` into 3 partitions...(AI_HYDRA_LEO_MISSING_COMMA)
[uncategorized] ~51-~51: A comma might be missing here.
Context: ...we want to partition the table for some reason like one region's capacity is 30 load u...(AI_EN_LECTOR_MISSING_PUNCTUATION_COMMA)
[uncategorized] ~74-~74: Possible missing preposition found.
Context: ...harding-partition.png) Two things need stress in this complex example: - Each column ...(AI_HYDRA_LEO_MISSING_TO)
[uncategorized] ~75-~75: This verb does not appear to agree with the subject. Consider using a different form.
Context: ...le: - Each column in the partition rule are evaluated separately, this is diffe...(AI_EN_LECTOR_REPLACEMENT_VERB_AGREEMENT)
[uncategorized] ~75-~75: A comma may be missing after the conjunctive/linking adverb ‘Hence’.
Context: ...onal storage system like MySQL or TiKV. Hence you can write whatever complex rule you...(SENT_START_CONJUNCTIVE_LINKING_ADVERB_COMMA)
[uncategorized] ~80-~80: The grammatical number of this noun doesn’t look right. Consider replacing it.
Context: ...le GreptimeDB provides severals system table to check DB's state. For table sharding...(AI_EN_LECTOR_REPLACEMENT_NOUN_NUMBER)
[uncategorized] ~80-~80: You might be missing the article “the” here.
Context: ...provides severals system table to check DB's state. For table sharding information...(AI_EN_LECTOR_MISSING_DETERMINER_THE)
[style] ~80-~80: Try using a synonym here to strengthen your writing.
Context: ...information-schema/partitions.md) which gives the detail of partitions inside one tab...(GIVE_PROVIDE)
GitHub Check: Run Linters
docs/nightly/en/user-guide/operations/table-sharding.md
[warning] 15-15: "increse" should be "increase".
Additional comments not posted (4)
docs/nightly/en/user-guide/operations/table-sharding.md (4)
19-26: LGTM!
The SQL syntax for partitioning tables is clear and well-written.
28-31: LGTM!
The description of the syntax for partitioning tables is clear and well-written.
69-73: LGTM!
The illustration of the partition rule set is clear and well-written.
5-5: Fix typographical error.
The phrase "the when and how" should be "when and how".
- This document will describe the when and how to configure table sharding.
+ This document will describe when and how to configure table sharding.
Likely invalid or redundant comment.
@coderabbitai pause
Actions performed
Reviews paused.
@coderabbitai resolve
Actions performed
Comments resolved.
LGTM, and when @killme2008 or @fengjiachun is free, it is better to confirm all the description is correct.
@@ -76,6 +76,7 @@
- admin
- configuration
- capacity-plan
- table-sharding
I think it's not a part of the operation but table management.
As for which specific partition column to use or how many regions to create, it depends on the data distribution and the query pattern. A general goal is to make the data distribution among regions as even as possible. And the query pattern should be considered when designing the partition rule set as one query can be processed in parallel among regions. In other word the query latency is depends on the "slowest" region's latency.

But notice that the increase of regions will bring some basic consumption and increase the complexity of the system. You need to consider the requirement of data ingest rate, the query performance, the data distribution on storage system. You should shard a table only when necessary.
Include a diagram illustrating how tables, regions, frontends, and datanodes collaborate.
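The quoted paragraph makes two points that a small sketch may help pin down: data should be spread evenly across regions, and a query that fans out to all regions is only as fast as its slowest region. The following is a toy model in Python, not GreptimeDB code; the function names and all numbers are made up for illustration.

```python
# Toy model (assumption): a query fans out to every region in parallel and
# completes only when the slowest region has answered.

def query_latency(region_latencies_ms):
    """Latency of a query that touches all regions in parallel."""
    return max(region_latencies_ms)


def imbalance(region_rows):
    """Largest region size divided by the mean size (1.0 means perfectly even)."""
    mean = sum(region_rows) / len(region_rows)
    return max(region_rows) / mean


# Hypothetical row counts per region, for illustration only.
even = [100, 100, 100]
skewed = [50, 50, 200]

print(imbalance(even))               # 1.0
print(imbalance(skewed))             # 2.0
print(query_latency([12, 15, 40]))   # 40
```

In this model, a rule set that leaves one region with twice the average data would be expected to dominate query latency, which is the motivation for keeping the distribution even.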
The syntax mainly consists of two parts:
- `PARTITION ON COLUMNS` followed by a comma-separated list of column names, which specifies which columns might be used for partitioning. The partition list specified here is only used as an "allow list", and in reality only a portion of the columns specified here will be used for partitioning.
Add a note? The partition columns should be in primary key constraint.
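To make the discussion above concrete, here is a rough sketch in Python of what a rule set over one column amounts to, together with the check the reviewer suggests documenting. It is not GreptimeDB code: the boundaries 1000 and 2000, the partition names, and both helper functions are hypothetical, and the sketch assumes that rules are meant to be non-overlapping and to cover every row.

```python
# Hypothetical rule set over column `a`. The boundaries and partition names
# are invented for this sketch and are not taken from the PR.
RULES = [
    ("p0", lambda row: row["a"] < 1000),
    ("p1", lambda row: 1000 <= row["a"] < 2000),
    ("p2", lambda row: row["a"] >= 2000),
]


def route(row, rules=RULES):
    """Return the name of the single partition whose rule matches `row`."""
    matches = [name for name, predicate in rules if predicate(row)]
    if len(matches) != 1:
        # Assumption: a valid rule set matches each row exactly once.
        raise ValueError(f"row {row} matched {len(matches)} partitions")
    return matches[0]


def missing_from_primary_key(partition_columns, primary_key):
    """Partition columns that are not part of the primary key."""
    return [c for c in partition_columns if c not in primary_key]


print(route({"a": 5}))       # p0
print(route({"a": 1000}))    # p1
print(route({"a": 999999}))  # p2
print(missing_from_primary_key(["a", "c"], ["a", "b"]))  # ['c']
```

If the constraint in the review comment holds, a non-empty result from `missing_from_primary_key` would indicate a partition column list that should be rejected.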
And we want to partition the table for some reason like one region's capacity is 30 load unit. So we'll need 6 partitions with each has similar load. One possible partition rule set is:

```sql
CREATE TABLE my_table (
```
Better to change the table name.
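The sizing step in the quoted text (a region capacity of 30 load units leading to 6 partitions) is a ceiling division. A minimal sketch in Python follows; the capacity of 30 comes from the quoted sentence, while the total load of 170 is an assumed figure chosen only so that the arithmetic yields 6, since the actual total is not shown in this conversation.

```python
import math


def partitions_needed(total_load, region_capacity):
    """Smallest number of partitions so that no region exceeds its capacity,
    assuming the load can be split evenly."""
    return math.ceil(total_load / region_capacity)


REGION_CAPACITY = 30   # from the quoted text
TOTAL_LOAD = 170       # assumption, not from the PR

print(partitions_needed(TOTAL_LOAD, REGION_CAPACITY))  # 6
```

Any total between 151 and 180 load units gives the same answer under this model, so the example does not depend on the exact assumed value.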
## Inspect a sharded table

GreptimeDB provides severals system table to check DB's state. For table sharding information, you can query [`information_schema.partitions`](../../reference/sql/information-schema/partitions.md) which gives the detail of partitions inside one table, and [`information_schema.region_peers`](../../reference/sql/information-schema/region-peers.md) which gives the runtime distribution of regions.
check regions state?
Co-authored-by: Yiran <cuiyiran3@gmail.com>
hold this for repartition feature
may be we can write a new PR for the repartition feature
What's Changed in this PR
Closes #1046
Describe the when and how to shard a table
Checklist
summary.yml matches the current document structure when you changed the document structure.
Summary by CodeRabbit
table-sharding in the summary of topics.