Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add PostgreSQL support as alternative data storage #3236

Open
wants to merge 13 commits into
base: main
Choose a base branch
from

Conversation

robinbraemer
Copy link
Contributor

@robinbraemer robinbraemer commented Oct 10, 2024

See #3235

Checklist

  • Implement storage.InstallationProvider
  • Implement storage.ParameterSetProvider
  • Implement storage.CredentialSetProvider
  • Future-proof migrations support for evolving schemas
  • Allow specifying Postgres as storage
  • Did you write tests?
  • Add to integration tests as well
  • Specify gorm logger
  • Did you write documentation?

Notes

  • When using Postgres, the legacy Porter v0.38 storage plugin migrate is not implemented
    • I expect there are no such old versions in the wild anymore!
  • It will allow the addition of MySQL, SQLite, and other later thanks to GORM‘s nature - but is out of scope for this PR

@kichristensen
Copy link
Contributor

@robinbraemer Just let us know when you want some feedback. I have skimmed through it, and it looks good on the surface.

robinbraemer and others added 11 commits October 18, 2024 12:13
Signed-off-by: robinbraemer <robin.braemer@web.de>
Signed-off-by: robinbraemer <robin.braemer@web.de>
Signed-off-by: robinbraemer <robin.braemer@web.de>
Signed-off-by: robinbraemer <robin.braemer@web.de>
Signed-off-by: robinbraemer <robin.braemer@web.de>
Signed-off-by: robinbraemer <robin.braemer@web.de>
Signed-off-by: robinbraemer <robin.braemer@web.de>
* Example on how to wire an image via exec mixin

Signed-off-by: John Cudd <jmcudd@gmail.com>

* Small tweak to naming

Signed-off-by: John Cudd <jmcudd@gmail.com>

---------

Signed-off-by: John Cudd <jmcudd@gmail.com>
Signed-off-by: robinbraemer <robin.braemer@web.de>
Signed-off-by: robinbraemer <robin.braemer@web.de>
Signed-off-by: robinbraemer <robin.braemer@web.de>
Signed-off-by: robinbraemer <robin.braemer@web.de>
Signed-off-by: robinbraemer <robin.braemer@web.de>
@robinbraemer
Copy link
Contributor Author

robinbraemer commented Oct 18, 2024

@robinbraemer Just let us know when you want some feedback. I have skimmed through it, and it looks good on the surface.

@kichristensen I just got the first working install outside of tests.

image

Can you look into whether this SQL schema makes sense? For example, there are ParameterSet columns in the installations table. Does Mongo store copy the parameters into an installation object and update it on upgrades? If yes, that's good for the installation's parameters' immutability.

Should the slice of params/creds in installations be foreign keys or allow loose reference and deleting from the parameter sets table without affecting the installations that used those (I like this more).

Are empty string "" namespaces treated equally to global namespaces. I might ensure that those are actually NULL in the table to ensure queries work as expected. There are possibly other cases where we should use NULL (gorm default:null struct tag) instead of the zero value

Other than those clarifications you can start to review more, ideally, we should also test Postgres in the integration tests

func NewTestPorter(t *testing.T) *TestPorter {
.

Signed-off-by: robinbraemer <robin.braemer@web.de>
@robinbraemer
Copy link
Contributor Author

robinbraemer commented Oct 18, 2024

Here is a schema dump from my local cockroach instance after a fresh sql/migrate.Migrate.

create table goose_db_version
(
    id         bigint    default nextval('public.goose_db_version_id_seq'::REGCLASS) not null
        primary key,
    version_id bigint                                                                not null,
    is_applied boolean                                                               not null,
    tstamp     timestamp default now()                                               not null
);

create table installations
(
    id                         text not null
        primary key,
    schema_type                text,
    schema_version             text,
    name                       text,
    namespace                  text,
    uninstalled                boolean,
    bundle_repository          text,
    bundle_version             text,
    bundle_digest              text,
    bundle_tag                 text,
    custom                     jsonb,
    labels                     jsonb,
    credential_sets            jsonb,
    parameter_sets             jsonb,
    parameters_schema_type     text,
    parameters_schema_version  text,
    parameters_namespace       text,
    parameters_name            text,
    parameters_labels          jsonb,
    parameters_parameters      jsonb,
    parameters_status_created  timestamp with time zone,
    parameters_status_modified timestamp with time zone,
    status_run_id              text,
    status_action              text,
    status_result_id           text,
    status_result_status       text,
    status_created             timestamp with time zone,
    status_modified            timestamp with time zone,
    status_installed           timestamp with time zone,
    status_uninstalled         timestamp with time zone,
    status_bundle_reference    text,
    status_bundle_version      text,
    status_bundle_digest       text,
    constraint idx_installations_namespace_name
        unique (namespace, name)
);

create table results
(
    schema_version  text,
    id              text not null
        primary key,
    created         timestamp with time zone,
    namespace       text,
    installation    text,
    run_id          text,
    message         text,
    status          text,
    output_metadata jsonb,
    custom          jsonb
);

create index idx_results_namespace_installation
    on results (namespace, installation);

create index idx_results_run_id
    on results (run_id);

create table outputs
(
    schema_version text,
    name           text,
    namespace      text,
    installation   text,
    run_id         text,
    result_id      text,
    key            text,
    value          bytea,
    constraint idx_outputs_result_id_name
        unique (result_id, name)
);

create index idx_outputs_namespace_installation_result_id
    on outputs (namespace asc, installation asc, result_id desc);

create index idx_outputs_namespace_installation_name_result_id
    on outputs (namespace asc, installation asc, name asc, result_id desc);

create table runs
(
    schema_version                       text,
    id                                   text not null
        primary key,
    created                              timestamp with time zone,
    modified                             timestamp with time zone,
    namespace                            text,
    installation                         text,
    revision                             text,
    action                               text,
    bundle                               text,
    bundle_reference                     text,
    bundle_digest                        text,
    "parameterOverrides_schema_type"     text,
    "parameterOverrides_schema_version"  text,
    "parameterOverrides_namespace"       text,
    "parameterOverrides_name"            text,
    "parameterOverrides_labels"          jsonb,
    "parameterOverrides_parameters"      jsonb,
    "parameterOverrides_status_created"  timestamp with time zone,
    "parameterOverrides_status_modified" timestamp with time zone,
    credential_sets                      jsonb,
    parameter_sets                       jsonb,
    parameters_schema_type               text,
    parameters_schema_version            text,
    parameters_namespace                 text,
    parameters_name                      text,
    parameters_labels                    jsonb,
    parameters_parameters                jsonb,
    parameters_status_created            timestamp with time zone,
    parameters_status_modified           timestamp with time zone,
    custom                               jsonb,
    parameters_digest                    text,
    credentials_schema_type              text,
    credentials_schema_version           text,
    credentials_namespace                text,
    credentials_name                     text,
    credentials_labels                   jsonb,
    credentials_credentials              jsonb,
    credentials_status_created           timestamp with time zone,
    credentials_status_modified          timestamp with time zone,
    credentials_digest                   text
);

create index idx_runs_namespace_installation
    on runs (namespace, installation);

create table credential_sets
(
    schema_type     text,
    schema_version  text,
    namespace       text,
    name            text,
    labels          jsonb,
    credentials     jsonb,
    status_created  timestamp with time zone,
    status_modified timestamp with time zone,
    constraint idx_credentials_namespace_name
        unique (namespace, name)
);

create table parameter_sets
(
    schema_type     text,
    schema_version  text,
    namespace       text,
    name            text,
    labels          jsonb,
    parameters      jsonb,
    status_created  timestamp with time zone,
    status_modified timestamp with time zone,
    constraint idx_parameters_namespace_name
        unique (namespace, name)
);

We may want to be more explicit in not null fields.

@robinbraemer robinbraemer marked this pull request as ready for review October 21, 2024 12:15
@kichristensen
Copy link
Contributor

@robinbraemer Sorry, didn't notice the notifications. I will take a look tomorrow

@kichristensen
Copy link
Contributor

Does Mongo store copy the parameters into an installation object and update it on upgrades? If yes, that's good for the installation's parameters' immutability.

That part is not where I have most knowledge in Porter, but looking at the code it looks like that is the case. @schristoff might be able to confirm.

Should the slice of params/creds in installations be foreign keys or allow loose reference and deleting from the parameter sets table without affecting the installations that used those (I like this more).

I agree with you, loose references should be used. There can be good reasons for deleting parameter sets, without deleteing installations that used them.

Are empty string "" namespaces treated equally to global namespaces. I might ensure that those are actually NULL in the table to ensure queries work as expected.

To my knowlegde yes, but again @schristoff might be able to confirm

Copy link
Contributor

@kichristensen kichristensen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@robinbraemer Thank you very much for the PR, really good work. I have added some comments.
There is currently some compilation errors that needs to be fixed. Once fixed I will play around with it locally, to ensure things works.
Meanwhile it would be great to if @schristoff and @sgettys would take a look too.

}
s, err := c.GetStorage(c.Data.DefaultStorage)
if err != nil {
return
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it not make sense to return the error here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code would work either way. Ignoring the error will result in an empty string s. I should be more explicit and just return ok = false on err != nil. Good point to clarify.

@@ -49,16 +54,6 @@ func NewTestRuntimeFor(tc *config.TestConfig, testInstallations *storage.TestIns
}
}

func (t *TestRuntime) Close() error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this function no longer used? It was previously used in some unit tests, and doesn't look like you changed those tests

signer := signing.NewPluginAdapter(signingplugin.NewSigner(c))
return NewFor(c, storage, secretStorage, signer)
// differ full initialization to Connect
return newWith(nil, nil, nil, nil, nil, nil, nil, nil)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am personally a little against passing all these nils. Might be a good idea to implement the Options pattern, might also simply some of the later logic in the other functions.
On the other hand it will make this PR even bigger, so might a better to do something about it in a different PR.


// NewFor creates a new Porter client with the provided configuration and storage backend.
//
// Deprecated: Use NewWith instead. NewFor does not support newer sql storage backends.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you planning to remove this function in this PR, or will it stay deprecated for a while?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, I'm fine to remove it since we already say Porter does not guarantee backwards compatibility in its library. :)

@@ -45,7 +46,7 @@ type Run struct {

// Bundle is the definition of the bundle.
// Bundle has custom marshal logic in MarshalJson.
Bundle bundle.Bundle `json:"-"`
Bundle bundle.Bundle `json:"-" gorm:"json"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have not used Gorm a lot, but that this mean that the custom logic in MarshalJson will be executed for when stored?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, types can implement the json.Marshaller interface to change their default json marshal structure. It will store that output in the db.

@kichristensen
Copy link
Contributor

Hi @robinbraemer. Today I wrote with @schristoff, and if possible we would really like to get on a call with you.
After diving a little more into this, we have a few design topics we would like to discuss with you, which would be easier on a call. Would it possible to have a call and some point, or alternatively that you could attend a community meeting?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants