Partitioned scan for RegionEngine #3886
Labels
C-performance
Category Performance
What type of enhancement is this?
API improvement
What does the enhancement do?
The `RegionEngine` trait provides a `handle_query()` method that scans a region and returns a stream of `RecordBatch`.
greptimedb/src/store-api/src/region_engine.rs
Lines 136 to 140 in d997463
This method is easy to use but has some limitations:
To maximize parallelism in #2806, the engine should provide a way to return multiple streams to scan different partitions of a region concurrently.
Implementation challenges
This issue proposes to add a new method to the region engine which supports partitioned scan. The method returns a trait object that can create a stream according to a partition index.
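A minimal sketch of what such a scanner could look like. All names here (`RegionScanner`, `scan_partition`, `InMemoryScanner`, `handle_partitioned_query`) are hypothetical and not the actual GreptimeDB API. To keep the example dependency-free, a plain `Iterator` stands in for the async `RecordBatch` stream and `RecordBatch` is a toy struct.

```rust
// Hypothetical sketch: names are illustrative, not GreptimeDB's API.
// A plain Iterator stands in for the async RecordBatch stream.

/// Stand-in for a real `RecordBatch`; here just a batch of row values.
#[derive(Debug, Clone, PartialEq)]
pub struct RecordBatch {
    pub rows: Vec<i64>,
}

/// Stand-in for a stream of record batches.
pub type BatchIter = Box<dyn Iterator<Item = RecordBatch> + Send>;

/// Hypothetical scanner returned by the engine for one query.
pub trait RegionScanner: Send + Sync {
    /// Number of partitions that can be scanned independently.
    fn num_partitions(&self) -> usize;

    /// Creates the stream for the partition at `index`.
    fn scan_partition(&self, index: usize) -> Result<BatchIter, String>;
}

/// Toy scanner: each partition owns a fixed list of batches.
pub struct InMemoryScanner {
    partitions: Vec<Vec<RecordBatch>>,
}

impl InMemoryScanner {
    pub fn new(partitions: Vec<Vec<RecordBatch>>) -> Self {
        Self { partitions }
    }
}

impl RegionScanner for InMemoryScanner {
    fn num_partitions(&self) -> usize {
        self.partitions.len()
    }

    fn scan_partition(&self, index: usize) -> Result<BatchIter, String> {
        // Reject indexes outside the advertised partition count.
        let batches = self
            .partitions
            .get(index)
            .cloned()
            .ok_or_else(|| format!("partition {} out of range", index))?;
        Ok(Box::new(batches.into_iter()))
    }
}

/// Hypothetical engine-side entry point that returns the scanner as a trait object.
pub fn handle_partitioned_query(data: Vec<Vec<RecordBatch>>) -> Box<dyn RegionScanner> {
    Box::new(InMemoryScanner::new(data))
}
```

A caller would ask the scanner for `num_partitions()` once and then call `scan_partition(i)` for each index, possibly from different tasks, which is why the trait in this sketch is `Send + Sync`.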
We could then use the scanner to implement a `PhysicalPlan` and let the query engine process multiple partitions. We might need to refactor the `StreamScanAdapter`, as it assumes there is only one partition.
greptimedb/src/table/src/table/scan.rs
Lines 103 to 119 in a6a702d
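To illustrate why exposing more than one partition matters, here is a rough sketch of a consumer that drives every partition concurrently. It is not a `PhysicalPlan` implementation: scoped OS threads stand in for the query engine's executor, an `Iterator` of `i64` stands in for a `RecordBatch` stream, and `PartitionScanner`, `RangeScanner`, and `parallel_sum` are hypothetical names introduced only for this example.

```rust
use std::thread;

// Hypothetical, simplified scanner interface (not GreptimeDB's API).
pub trait PartitionScanner: Sync {
    fn num_partitions(&self) -> usize;
    /// Rows of one partition; an Iterator stands in for an async stream.
    fn scan_partition(&self, index: usize) -> Box<dyn Iterator<Item = i64> + Send>;
}

/// Toy scanner that splits the range `0..total` into contiguous chunks.
pub struct RangeScanner {
    pub total: i64,
    pub partitions: usize,
}

impl PartitionScanner for RangeScanner {
    fn num_partitions(&self) -> usize {
        self.partitions
    }

    fn scan_partition(&self, index: usize) -> Box<dyn Iterator<Item = i64> + Send> {
        let n = self.partitions as i64;
        let chunk = (self.total + n - 1) / n; // ceiling division
        let start = (index as i64 * chunk).min(self.total);
        let end = (start + chunk).min(self.total);
        Box::new(start..end)
    }
}

/// Scans every partition on its own thread and sums all rows.
/// A single-partition adapter would run the same work on one thread.
pub fn parallel_sum(scanner: &dyn PartitionScanner) -> i64 {
    thread::scope(|s| {
        let handles: Vec<_> = (0..scanner.num_partitions())
            .map(|i| s.spawn(move || scanner.scan_partition(i).sum::<i64>()))
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().unwrap())
            .sum::<i64>()
    })
}
```

For example, `parallel_sum(&RangeScanner { total: 10, partitions: 3 })` scans the chunks `0..4`, `4..8`, and `8..10` on three threads and combines the partial sums.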