Feat: Implement data lifecycle policies #536
Comments
This handles the first part: openedx/aspects-dbt#42. However, I'm not sure whether the main xapi table should be partitioned too.
Yeah, especially that table should be on the list. It'll have to be another Alembic migration on that table, sadly.
Should we also implement some clean-up workflow on the
Hmm, I suppose so, though I think it would make sense to have that be separate from the user data. If you want to keep user data for 2 years and course data for 5, I think we should support that. Maybe a setting per table, or some kind of table grouping. We should remember the vector tables as well.
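To make the "setting per table, or some kind of table grouping" idea concrete, here is a minimal Python sketch of what grouped retention settings could look like. Everything in it is an assumption for illustration: the `RetentionPolicy` class, the `ttl_statements` helper, and the table, column, and database names are hypothetical and not part of Aspects. It only renders ClickHouse `ALTER TABLE ... MODIFY TTL` statements as strings; it does not run them.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class RetentionPolicy:
    """Hypothetical retention setting shared by a group of tables."""
    group: str                # label for the group, e.g. "user_data"
    tables: Tuple[str, ...]   # tables covered by this policy (names are made up)
    time_column: str          # timestamp column the TTL is computed from
    keep_years: int           # how long rows are kept


def ttl_statements(policy: RetentionPolicy, database: str) -> List[str]:
    """Render one ALTER TABLE ... MODIFY TTL statement per table in the group."""
    return [
        f"ALTER TABLE {database}.{table} MODIFY TTL "
        f"toDateTime({policy.time_column}) + INTERVAL {policy.keep_years} YEAR"
        for table in policy.tables
    ]


# Example groups mirroring the comment above: 2 years for user data, 5 for course data.
POLICIES = [
    RetentionPolicy("user_data", ("example_user_events",), "emission_time", 2),
    RetentionPolicy("course_data", ("example_course_blocks",), "updated_at", 5),
]

if __name__ == "__main__":
    for policy in POLICIES:
        for statement in ttl_statements(policy, database="example_db"):
            print(statement)
```

Grouping keeps the number of settings small while still allowing different retention periods for user and course data; a per-table override could be layered on top if a single table needs a different period.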
I said in the ADR that we'd be putting partitions into dbt instead of using Alembic. Was I incorrect?
@pomegranited it needs to be both, since some tables can't be managed in dbt (yet, we're working on it).
The approach recommended by Altinity is to use TTL with the setting
@Ian2012 is pushing this forward in dbt-clickhouse here: ClickHouse/dbt-clickhouse#254
@Ian2012 can we consider this done now?
Yes, it's
Actually I need to reopen this since we didn't get to the documentation. @Ian2012 do you think you can write up a "concepts" doc in openedx-aspects describing the TTL, how it works, and the setting options?
Sure
We would like to make data lifecycle management a default part of Aspects. Currently we are unable to gracefully age off old data, which creates potential compliance issues and makes it difficult to implement data best practices. This is primarily because the time-series tables are not partitioned, so ClickHouse deletes force a complete rewrite of the table. To enable lifecycle management we need to do the following: