Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ScaleLimitsAnalyzer using bluejay to function-runner #351

Merged
merged 3 commits into from
Sep 6, 2024

Conversation

mssalemi
Copy link
Contributor

@mssalemi mssalemi commented Sep 5, 2024

#gsd:37875

Addresses:

Description

Current scaling implementation: https://github.com/Shopify/shopify/pull/518219/files#diff-bef9ccfcca30dfc391c6ad9d0a3e0739bf8a2ab27e659473894e4e62d43bad08

We currently scale our function run limits based on the directive scaleLimits with an argument rate on cartLines. We scale by the length of the field. We multiply the directive rate found on the fields definition, by the length of that fields (seem by either array length or string length) to determine how much to scale limits our default limits by.

This PR adds an analyzer that will determine the maximum scale factor from all fields with the scale limits directive on it. We chose to use this implementation because we want to ability to add directives for scale limits of different fields in the future. Currently we only scale on cart lines.

Notes on Visitor implementation:

  • The ScaleLimitsAnalyzer is used to dynamically adjust resource limits for function executions based on data provided in GraphQL queries. It does this by analyzing the query and associated input JSON, calculating a scaling factor based on the lengths of fields specified in the query that have an associated @scaleLimits directive.

Avoiding Double Counting
A key feature of the ScaleLimitsAnalyzer is its ability to avoid double counting fields when calculating the scale factor. This is crucial for ensuring that the scaling adjustments are accurate and reflect the true resource needs of the function execution. Here’s how double counting is avoided:

Path Tracking: The analyzer uses a path_stack to keep track of the current path within the query during traversal. This helps in uniquely identifying each field's location within the query structure.

Unique Path Identification: Each field's path, combined with its index in case of arrays, is used to create a unique identifier (PathWithIndex). This ensures that each instance of a field, even if the field name is the same, is treated uniquely based on its context and position in the input data.

Rate Aggregation: When calculating the scale factor, the analyzer aggregates rates using these unique identifiers. If a field appears multiple times in different parts of the query or within nested structures, each occurrence is treated independently based on its unique path and context.

Maximal Increment Strategy: For fields that appear multiple times in the same context (e.g., duplicated fields in a query), the analyzer only considers the maximum scale increment calculated for these fields. This approach ensures that the scale factor reflects the maximum resource requirement for repeated fields without cumulatively adding the increments.

Tophat

  • can be seen via the tests, it will be implemented in the next PR.

Cargo.toml Outdated
@@ -34,6 +34,9 @@ rust-embed = "8.5.0"
rmp-serde = "1.3"
is-terminal = "0.4.13"
wasmprof = "0.7.0"
bluejay-parser = { git = "https://github.com/Shopify/bluejay.git", rev = "c7e7c2bfb73c7b4869aa8569c15cd3c4eb48b8bf", features = ["format-errors"] }
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will update after new version of bluejay is released.

@mssalemi mssalemi changed the title Adds bluejay analyzer Add ScaleLimitsAnalyzer using bluejay to function-runner Sep 5, 2024
@mssalemi mssalemi marked this pull request as ready for review September 5, 2024 13:12
src/bluejay_schema_analyzer.rs Outdated Show resolved Hide resolved
src/bluejay_schema_analyzer.rs Show resolved Hide resolved
src/scale_limits_analyzer.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@adampetro adampetro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀

Cargo.toml Show resolved Hide resolved
Copy link
Member

@jacobsteves jacobsteves left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome. Is #349 still an issue with your new double counting strategy? I think so?

@mssalemi
Copy link
Contributor Author

mssalemi commented Sep 6, 2024

PathWithIndex

No, we can close that I believe (will verify), added a PathWithIndex in the scale limits analyzer that we can use to uniquely identify fields with same name to avoid the overcounting :)

I added a test for it in the next PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants