cortadocodes opened this issue on Apr 11, 2024 · 0 comments
Labels
decision needed: A decision is required (e.g. on UX or company policy)
tech-debt: Technical debt (tidy up, refactoring, restructuring, caused by laziness now)
We need to decide which fields to cluster on in the BigQuery event store and whether to pull the event kind out as a column.
Current state
The event kind is stored in the event JSON field. It is queryable but cannot be ordered by (I don't think we need to order by it). We're currently clustering on ["sender", "question_uuid"], in that order. Clustering is order-dependent: a query only takes advantage of a clustered field if it also filters on every higher-priority clustered field (those to its left).
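The order-dependence can be sketched with a deliberately simplified model. This is not BigQuery's actual pruning logic, just the "leftmost fields first" rule described above, and the function name is a hypothetical one chosen for illustration.

```python
def usable_cluster_prefix(cluster_fields, filtered_fields):
    """Return the leading clustered fields that a query's filters can exploit.

    Simplified model (an assumption, not BigQuery internals): walk the
    clustering fields left to right and stop at the first one the query
    does not filter on.
    """
    filtered = set(filtered_fields)
    prefix = []
    for field in cluster_fields:
        if field not in filtered:
            break
        prefix.append(field)
    return prefix


# Current clustering from this issue.
current = ["sender", "question_uuid"]

# Filtering on both fields can use the full clustering.
print(usable_cluster_prefix(current, ["sender", "question_uuid"]))
# Filtering on the leftmost field alone still benefits.
print(usable_cluster_prefix(current, ["question_uuid"]))
# Filtering only on question_uuid skips the leftmost field, so under this
# model the result is an empty list.
```

Under this model, where event_kind sits in the clustering order matters as much as whether it is included: it only helps queries that also filter on every field placed before it.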
@thclark says: "We’d need to cluster on event_kind otherwise you’d have to process (for example) all the log rows every time you want to query for input or output values (remember it’s column based storage so the filters aren’t like conventional SQL, it’ll process all rows in order to apply a filter). Also, regardless of clustering I think (??) it may be more efficient to filter directly on a column than on a JSONField."
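To make the comparison concrete, here is a sketch of the two query shapes being discussed. The table name, the `$.kind` JSON path and the `event_kind` column are all assumptions for illustration (the column is the proposed one and does not exist yet); `JSON_VALUE` and `@kind` named parameters are standard BigQuery SQL.

```python
# Hypothetical table name; substitute the real event store table.
TABLE = "my_dataset.events"

# Current layout: the kind lives inside the `event` JSON field.
# The '$.kind' path is an assumption about the event structure.
QUERY_ON_JSON_FIELD = f"""
SELECT event
FROM `{TABLE}`
WHERE JSON_VALUE(event, '$.kind') = @kind
"""

# Proposed layout: the kind is pulled out into its own column
# (called `event_kind` here), which could also be clustered on.
QUERY_ON_COLUMN = f"""
SELECT event
FROM `{TABLE}`
WHERE event_kind = @kind
"""

print(QUERY_ON_JSON_FIELD)
print(QUERY_ON_COLUMN)
```

Running both shapes against real data and comparing the bytes processed would be one way to check the "(??)" above rather than guessing.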
Proposed Solution
Discuss and choose:
- Whether to pull the event kind out as a field
- The fields to cluster on and in what order