-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SI-2749-first-and-last-agg #47
Conversation
Validate the assumption. I was afraid so I suggested something else. |
Early experiments seem pretty good. |
@KevinJoiner If this does work then we can probably replace the argmaxes for latest signals. |
internal/service/ch/queries.go
Outdated
firstGroup = "first_value(" + vss.ValueNumberCol + ")" | ||
lastGroup = "last_value(" + vss.ValueNumberCol + ")" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, I don't think we want to use these since they are not deterministic, since first_value
is an alias for any
danger
The query can be executed in any order and even in a different order each time, so the result of this function is indeterminate. To get a determinate result, you can use the min or max function instead of any.
-- https://clickhouse.com/docs/en/sql-reference/aggregate-functions/reference/any
anyLast
Selects the last value encountered. The result is just as indeterminate as for the any function.
--https://clickhouse.com/docs/en/sql-reference/aggregate-functions/reference/anylast
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the first_value
doc
As with any, without Window Functions the result will be random if the source stream is not ordered and the return type matches the input type
Maybe the bucket here is an ordered stream (inheriting some ordering from the index). It seems to behave that way.
I’m nervous.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fwiw, this is what I was referencing when initially thinking that the source was ordered https://github.com/DIMO-Network/telemetry-api/blob/main/internal/service/ch/queries.go#L316
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That’s ordering the buckets.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is ordered based on the signal table's create statement, but I don't love relying on that. max and min seem safest unless we hit some performance problems
I think you need to rebase with main to get CI working. |
Return the first and last values, chronologically, that appear in the bucket
ex:
Note:
insertTestData
to make sure that the first string value and the last string value were distinct so that the test was more clear. This also required me to update the expected response forStringAggregationUnique