Moving where tableversion data is stored #11
All possible and makes sense, but that's a large change. What's the driver?
Potential future issues. Joe is currently backing up the AIMS and ROADS schemas independently. If we ever expanded that to include admin boundaries, or turned on table versioning for any AIMS or ROADS tables, we'd run into this problem.
Also, @dwsilk just said roads is currently versioned. So it would be difficult to resolve now if we had to do a restore for roads. Maybe there are other ways to get around this problem? A ver_repair() function that can repair the restored table to some specified date or version number?
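A purely hypothetical sketch of what such a repair function's signature might look like - the name, parameters and stub body are invented here, nothing like this exists in table_version today:

```sql
-- Hypothetical only: reconcile a restored table against its revision
-- data, up to a given revision id or point in time.
CREATE FUNCTION table_version.ver_repair(
    p_schema   name,
    p_table    name,
    p_revision integer   DEFAULT NULL,  -- repair to this revision id...
    p_as_at    timestamp DEFAULT NULL   -- ...or to this point in time
) RETURNS boolean
LANGUAGE sql AS $$
    SELECT false;  -- placeholder body; real logic would rebuild the table
$$;
```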
Proposal

Place versioned-table-specific objects into the same schema as the source versioned table. This will be done by creating a table-specific metadata table to record the revision metadata (IDs, descriptions, dates), and by moving the location of the revision data table and the data access functions. The additional metadata table is required to remove the need for the global table_version.revision tables. The new layout would look something like:

```
{schema}.{table}
{schema}.{table}___revision
{schema}.{table}___revision_metadata
```
These 3 tables can then be backed up and restored independently. I also propose that the table_version.versioned_tables table gets removed and a view is created in its place. This view definition would then look up the database table catalogue and return the tables that have revision objects attached (a sketch of such a view is at the end of this comment).

NOTE: This is a major change that will result in a new major version release, i.e. 2.0.

Changes

Supporting isolated backups of versioned tables means we no longer support global version ID management. One of the main side effects of this change is that we can no longer support function calls such as:

```
table_version.ver_get_modified_tables(p_revision integer)
table_version.ver_get_revision(...)
table_version.ver_get_revisions(...)
table_version.ver_create_revision()
table_version.ver_get_last_revision()
table_version.ver_complete_revision()
table_version.ver_delete_revision()
```

The following new table-specific functions will be created:

```
table_version.ver_create_revision(p_schema name, p_table name, p_comment text, p_revision_time timestamp DEFAULT now(), p_schema_change boolean DEFAULT false) RETURNS BOOL
table_version.ver_complete_revision(p_schema name, p_table name, p_comment text, p_revision_time timestamp DEFAULT now(), p_schema_change boolean DEFAULT false) RETURNS BOOL
table_version.ver_get_table_revisions(p_schema name, p_table name) RETURNS TABLE(..)
```

Also, to keep table-specific objects in the same schema, the table revision and diff data access functions will be renamed and moved:

```
{schema}.{table}_get_table_diff(p_revision1 integer, p_revision2 integer) RETURNS TABLE(..)
{schema}.{table}_get_table_revision(p_revision integer) RETURNS TABLE(..)
```

Example

Create table foo.bar:

```sql
CREATE SCHEMA foo;
CREATE TABLE foo.bar
(
  id INTEGER NOT NULL PRIMARY KEY,
baz TEXT
);
SELECT table_version.ver_enable_versioning('foo', 'bar');
```

This will then create the following auxiliary tables:

```sql
CREATE TABLE foo.bar___revision
(
_revision_created INTEGER NOT NULL, -- references foo.bar___revision_metadata.id but not enforced as foreign key due to performance
_revision_expired INTEGER, -- references foo.bar___revision_metadata.id but not enforced as foreign key due to performance
id INTEGER NOT NULL,
baz TEXT,
CONSTRAINT "pkey_foo_bar_revision" PRIMARY KEY (_revision_created, id)
);
CREATE TABLE foo.bar___revision_metadata
(
  id SERIAL NOT NULL PRIMARY KEY,
revision_time TIMESTAMP NOT NULL DEFAULT now(),
start_time TIMESTAMP NOT NULL DEFAULT clock_timestamp(),
user_name TEXT NOT NULL DEFAULT "current_user"(),
  comment TEXT
);
```

and the following table-specific access functions:

```
foo.bar_get_table_diff(p_revision1 integer, p_revision2 integer) RETURNS TABLE(..)
foo.bar_get_table_revision(p_revision integer) RETURNS TABLE(..)
```

and the following trigger and function:

```sql
CREATE TRIGGER foo_bar_revision_trg
AFTER INSERT OR UPDATE OR DELETE
ON foo.bar
FOR EACH ROW
EXECUTE PROCEDURE foo.bar_revision_changes();
```

Creating and completing a revision will now be done with a table-specific call:

```sql
SELECT table_version.ver_create_revision('foo', 'bar', 'My revision xxx');
-- ... do some edits ...
SELECT table_version.ver_complete_revision('foo', 'bar');
```
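Under this proposal, querying the relocated per-table access functions would then presumably look like this (the revision ids are illustrative):

```sql
-- Changes to foo.bar between revisions 1 and 2:
SELECT * FROM foo.bar_get_table_diff(1, 2);

-- Full contents of foo.bar as at revision 1:
SELECT * FROM foo.bar_get_table_revision(1);
```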
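As for the catalogue-backed view mentioned above as a replacement for table_version.versioned_tables, a minimal sketch is below. This is only one possible definition, assuming the ___revision naming convention from this proposal:

```sql
-- Sketch only: derive the list of versioned tables by finding their
-- ___revision companion tables in the system catalogue.
CREATE VIEW table_version.versioned_tables AS
SELECT n.nspname AS schema_name,
       left(c.relname, -length('___revision')) AS table_name
FROM pg_catalog.pg_class c
JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind = 'r'
  AND c.relname ~ '___revision$';
```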
This looks great for simplifying back-up / restore on individual tables and avoiding the creation of a single schema with mixed application data.

At the moment, it is convenient that a revision can relate to multiple tables, because a related set of edits across tables can be managed under a single revision ID.
With this change, a single 'revision' to a related set of data will require a separate revision per table. Have I got that right?
Thanks @dwsilk, interesting. I think we have a conflict in feature requests, and I can't see a nice simple implementation that can support both independent table backups and multi-table single-revision management. Do you have an application requirement for this multi-table functionality right now? If so, you could solve that problem at the per-application level by recording the last revision id per table in some sort of set (see the sketch below). If revisions are created and managed in a transaction you shouldn't lose any functionality.
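A rough sketch of that application-level bookkeeping, assuming the proposed 2.0 API above - the app schema and revision_set table are invented for illustration and are not part of table_version:

```sql
-- Illustrative only: an application-owned table grouping the per-table
-- revision ids of one logical edit into a single set.
CREATE SCHEMA IF NOT EXISTS app;
CREATE TABLE app.revision_set (
    set_id      INTEGER NOT NULL,
    schema_name NAME    NOT NULL,
    table_name  NAME    NOT NULL,
    revision_id INTEGER NOT NULL,
    PRIMARY KEY (set_id, schema_name, table_name)
);

BEGIN;
SELECT table_version.ver_create_revision('foo', 'bar', 'grouped edit');
-- ... edits to foo.bar ...
SELECT table_version.ver_complete_revision('foo', 'bar');
-- Record the table's newest revision id as part of set 1:
INSERT INTO app.revision_set (set_id, schema_name, table_name, revision_id)
SELECT 1, 'foo', 'bar', max(id) FROM foo.bar___revision_metadata;
COMMIT;
```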
Yes, that would work. I do have this requirement now and would need to make changes simultaneously with the release of table_version 2.0 to production.
Would need to test it on the test server, the same as for the BDE processor and Roads. Other than that I don't think it should have an effect on AIMS, if it upgrades properly and is tested?
Ok I will roll out the new version in a branch, get @imincik to package it up for the testing PPA, and leave it up to you to test and deploy to production.
Actually I think this might have an effect on the AIMS backend code. I think there are stored procedures that use the table_version.ver_xxx API. We should review this, but it should only be a simple change.
So each versioned table will have these additional objects within the same schema:

```
{schema}.{table}___revision
{schema}.{table}___revision_metadata
{schema}.{table}_get_table_diff(p_revision1 integer, p_revision2 integer) RETURNS TABLE(..)
{schema}.{table}_get_table_revision(p_revision integer) RETURNS TABLE(..)
```

I wonder if it would be cleaner if these objects were still stored in a separate schema.

It would also be nice if these two functions could be dynamic and remain in the table_version schema:

```
{schema}.{table}_get_table_diff(p_revision1 integer, p_revision2 integer) RETURNS TABLE(..)
{schema}.{table}_get_table_revision(p_revision integer) RETURNS TABLE(..)
```

But it seems that is not really possible.
Aside from those fairly minor points, it would be good to see these changes go ahead.
Thanks for the comment @dwsilk 👍
This somewhat seems to defeat the purpose of having all the table and revision data in an easy-to-backup place with a tool like pg_dump. I believe it's quite common to do backups by schema, for example.

In saying this, I did consider another design option: per-schema revisioning, to support application-level versioning and tracking changes in one logical group across a revision (which solves your initially stated issue/requirement). In this design, like the current state, the version IDs would be shared across all tables within the schema. In addition there would be a group difference function that would return the changes across all tables within that revision. This design of course means that consistent backups can only occur at the schema level. One of the issues with this design is that the group revision difference function would likely cause performance and semantic calling issues. So I went for the simple per-table version model that allows the lowest level of granularity for revisioning and consistent backups. Maybe with the removal of the group revision difference function this design could still work (a rough layout sketch is below).

I'm open either way - but I do want the extension to remain general purpose. @imincik do you have a view? From my understanding of our current extension usage a per-schema revisioning design is ok.
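A rough layout sketch of that per-schema alternative, for comparison (the naming is purely illustrative):

```
foo.___revision_metadata   -- one metadata table per schema; revision ids shared
foo.bar                    -- versioned table
foo.bar___revision         -- per-table revision data, keyed by the shared ids
foo.qux                    -- another versioned table in the same revision group
foo.qux___revision
```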
Do you have a specific reason for still wanting the functions in the table_version schema?
Unfortunately it's not, without having the function return a generic record type, which then forces every caller to supply a column definition list (see the sketch below).
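To illustrate the limitation, here is a hypothetical generic accessor - it is not part of the proposal, just a demonstration of why PostgreSQL makes this awkward:

```sql
-- Hypothetical: a single dynamic accessor has to return SETOF record,
-- because its result shape differs per table.
CREATE FUNCTION table_version.ver_get_table_revision_generic(
    p_schema name, p_table name, p_revision integer)
RETURNS SETOF record
LANGUAGE plpgsql AS $$
BEGIN
    RETURN QUERY EXECUTE format(
        'SELECT * FROM %I.%I___revision
          WHERE _revision_created <= $1
            AND (_revision_expired IS NULL OR _revision_expired > $1)',
        p_schema, p_table)
    USING p_revision;
END
$$;

-- Every caller must then restate the table's full column list:
SELECT *
FROM table_version.ver_get_table_revision_generic('foo', 'bar', 10)
  AS t(_revision_created integer, _revision_expired integer,
       id integer, baz text);
```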
@palmerj did you mean to close this? I agree with the per-table model instead of per-schema / per-application. While the other option you considered suits the multi-table requirement I raised, per-table is the more general-purpose design.
The proposal makes it easy to back up both table and revision data together, but not so easy to separate them. So I'm saying that if you turned on revisioning for tables in a schema, you can currently still dump just the current data by excluding the table_version schema. With the proposed design, the same thing would require an explicit exclusion of every revision table (see the illustration below).

There may also be use cases for backing up current data and revision data at different frequencies, particularly with a large schema where revision data grows rapidly. A more modular design would be helpful for that too?

It doesn't worry me too much - it's just a big design change so I want to make sure that it is well considered.
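To make that concrete, a hedged pg_dump illustration (the database name and patterns are invented for the example):

```sh
# Today: dumping only schema foo naturally excludes the revision data,
# because it all lives in the table_version schema:
pg_dump --schema=foo mydb > foo_current.sql

# Under the proposal: revision data sits beside the tables, so the same
# separation needs explicit per-table exclusion patterns:
pg_dump --schema=foo --exclude-table='foo.*___revision*' mydb > foo_current.sql
```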
Just for the same reason that it's nice that we can have one schema that holds all of the versioning functions.
Nope sorry!
Ok great.
I'm not sure of the requirement to separate the main table from the revision data table? If the functional requirement to enable revisioning on a table has been made, then you dump both. The main table will also have trigger references to the revision objects.
I really only see dumping per table (three tables per revisioned table in this case), per schema, or per database as options in operational databases.
I don't really understand the requirement for this.
IMHO if the table is growing rapidly then more disk is required. I don't think we should get into the game of designing a complex solution for this. Partitioning (with some data going to cheaper disk) or some sort of bespoke old-data purge could be considered. But really, the way the revisioning data model is designed, it's pretty hard to archive old versions.
Yip would be nice but not possible.
There are a couple of things that come to mind.
I still think this is pretty minor, and would be happy to see the proposal go ahead if you feel that it is the better approach.
OK cool - sounds sensible. I haven't had much exposure to that part of database management.
Hi @imincik do you have a reason for liking this idea? Is that to remove the duplication of functions and metadata tables required under the current proposal? Do you like the idea of a group changeset function? I'm ok to implement this, but just wanted to understand your reasons a little more, so any new design proposal can accommodate the issues.
@imincik could you please clarify, are you supporting:
...or something else? It's not clear to me.
@dwsilk, I am voting for "Jeremy's Other Consideration".
Because it allows us to logically maintain, back up and replicate data based on schema. I can imagine a few use cases for this. If you want to maintain your data as one unit, then you have only one schema. If you create more schemas, you want to logically maintain your data in multiple groups.
Ok. I will re-adjust the proposal as a wiki page and then get both of your reviews again. Cheers all!
This issue has been automatically marked as stale as there has not been any activity for some time. The issue will be closed in 14 days if there is no further activity.
We need to review what we want to do with this change.
Not sure we have much of a driver to do this at this stage. Labelling as
Not sure if this is best practice or a good idea, but would it be possible to move tableversion data into the schema of the table that has been versioned?
e.g. moving the table version data, the metadata about the table, and the functions specific to that table into the schema that the table originates from.
The main benefit would be to allow backups/restores of individual schemas that preserve the versions and the metadata about the versions. At the moment you have to backup/restore the entire database, I believe.