feat(profiles): Add a dataset to store profile chunks #5895
Conversation
This PR has a migration; here is the generated SQL:

-- start migrations
-- forward migration profile_chunks : 0001_create_profile_chunks_table
Local op:
CREATE TABLE IF NOT EXISTS profile_chunks_local (
    project_id UInt64,
    profiler_id UUID,
    chunk_id UUID,
    start_timestamp DateTime64(6) CODEC (DoubleDelta),
    end_timestamp DateTime64(6) CODEC (DoubleDelta),
    retention_days UInt16,
    partition UInt16,
    offset UInt64
)
ENGINE ReplicatedReplacingMergeTree('/clickhouse/tables/profile_chunks/{shard}/default/profile_chunks_local', '{replica}')
ORDER BY (project_id, profiler_id, start_timestamp, cityHash64(chunk_id))
PARTITION BY (retention_days, toStartOfDay(start_timestamp))
SAMPLE BY cityHash64(chunk_id)
TTL toDateTime(end_timestamp) + toIntervalDay(retention_days)
SETTINGS index_granularity=8192;

Distributed op:
CREATE TABLE IF NOT EXISTS profile_chunks_dist (
    project_id UInt64,
    profiler_id UUID,
    chunk_id UUID,
    start_timestamp DateTime64(6) CODEC (DoubleDelta),
    end_timestamp DateTime64(6) CODEC (DoubleDelta),
    retention_days UInt16,
    partition UInt16,
    offset UInt64
)
ENGINE Distributed(`cluster_one_sh`, default, profile_chunks_local, cityHash64(profiler_id));
-- end forward migration profile_chunks : 0001_create_profile_chunks_table

-- backward migration profile_chunks : 0001_create_profile_chunks_table
Distributed op: DROP TABLE IF EXISTS profile_chunks_dist;
Local op: DROP TABLE IF EXISTS profile_chunks_local;
-- end backward migration profile_chunks : 0001_create_profile_chunks_table
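For illustration, here is a hypothetical row matching this schema (all values invented; partition and offset presumably record the ingest consumer's position in the source topic):

INSERT INTO profile_chunks_dist
    (project_id, profiler_id, chunk_id, start_timestamp, end_timestamp, retention_days, partition, offset)
VALUES
    (42, generateUUIDv4(), generateUUIDv4(),
     toDateTime64('2024-04-22 10:00:00.000000', 6),
     toDateTime64('2024-04-22 10:00:05.250000', 6),
     90, 0, 1234);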
Review comments on snuba/snuba_migrations/profile_chunks/0001_create_profile_chunks_table.py:
This looks good to me 👍
A few questions
Test Failures Detected: Due to failing tests, we cannot provide coverage reports at this time. ❌ Failed Test Results: Completed 286 tests with failures. View the full list of failed tests.
# Excerpt: registering the new migration group and its storage set.
MigrationGroup.PROFILE_CHUNKS: _MigrationGroup(
    loader=ProfileChunksLoader(),
    storage_sets_keys={StorageSetKey.PROFILE_CHUNKS},
    readiness_state=ReadinessState.PARTIAL,
)
Does the ClickHouse cluster already exist in SaaS (US + DE) and S4S? If the table is supposed to live on the profiles cluster, has the storage_set been added there? If this isn't done yet, we should set the readiness state to ReadinessState.LIMITED for now so that GoCD doesn't try to run these migrations on a cluster that doesn't exist.
The ClickHouse cluster we'd use exists in all environments (us, de, s4s, all STs). I also have https://github.com/getsentry/ops/pull/10526 to configure it for all environments.
+1
Which cluster are you using, @phacops?
The cluster dedicated to profiling.
PR reverted: 64c044e
This reverts commit 6dfa3a7. Co-authored-by: volokluev <3169433+volokluev@users.noreply.github.com>
This will add a new dataset for profile chunks in order to support our continuous profiling feature (receiving profiles chunk by chunk, not one profile per transaction).

The SDK will start a profiler session, identified by profiler_id, will profile, and will send chunks containing this profiler ID. It will also tag the spans it collects with this profiler ID. Later on, to fetch a profile for a span, we'll receive a profiler ID and start and stop timestamps for the span and, using this dataset, we'll query the chunk IDs necessary to assemble the profile for that span, with a query looking like this:
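(A sketch reconstructed from the schema above, not the literal query; the placeholders use ClickHouse's {name: Type} parameter syntax. A chunk overlaps the span if it starts before the span ends and ends after the span starts.)

SELECT chunk_id
FROM profile_chunks_dist
WHERE project_id = {project_id: UInt64}
  AND profiler_id = {profiler_id: UUID}
  AND start_timestamp <= {span_end: DateTime64(6)}
  AND end_timestamp >= {span_start: DateTime64(6)}
ORDER BY start_timestamp;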
Since all our queries will contain a date range on both start_timestamp and end_timestamp, I think it's useful to have them in the sort key. chunk_id appears there in order to make it work with a ReplacingMergeTree, since this guarantees uniqueness of rows, even though having 2 chunks for the same profiler session and timestamps would be a bug.

We're using the DateTime64 type to be able to store sub-millisecond precision, and we now have 2 different timestamp fields. I added support for that type in #5896.

This PR (https://github.com/getsentry/ops/pull/10526) is related, as it adds the necessary config for StorageSetKey.PROFILE_CHUNKS to the profiling cluster in every environment.
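A note on the ReplacingMergeTree choice: it deduplicates rows sharing a sort key only when parts merge, so a duplicate insert (e.g. from a consumer retry) can remain visible for a while. A reader that needs deduplicated results can force it with FINAL; a sketch reusing the lookup above:

SELECT chunk_id
FROM profile_chunks_dist FINAL
WHERE project_id = {project_id: UInt64}
  AND profiler_id = {profiler_id: UUID}
  AND start_timestamp <= {span_end: DateTime64(6)}
  AND end_timestamp >= {span_start: DateTime64(6)}
ORDER BY start_timestamp;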