What is the best way to handle schema migrations? #59
Replies: 10 comments 5 replies
-
Yes, right now what you have described is the case: the only way to handle schema changes is to stop Marmot, drop a snapshot on all nodes, and restart the Marmot process. The major reason for not implementing schema change propagation is:
Having said that, here are my recommendations:
I've been brainstorming about this topic, and I have questions that I believe the community can answer. It will really help me develop a solution that works well for the community:
Again, I am open to suggestions and contributions on this topic.
-
@maxpert thanks for the elaborate response! The algorithm that you describe involves downtime. Also, if I understand correctly, the migration runs on a single Marmot node, and the distribution to other nodes happens via snapshots, so the other nodes would be required to download the snapshot to be up to date? If that is the case, it would be problematic with bigger databases (100s of GB), and the migration would take a long time.

How it could work: maybe it would be possible to have a special table that Marmot would use to signal the state of the migration. A rolling update is good, and having some way to see the progress of the roll-out through the cluster would be great.
How migrations are executed in general is a very broad question, since there are so many tools to manage them. I guess the simplest would be to post a SQL file with two attributes, name + timestamp, as the identifier for a particular migration. We could start with only forward migrations, because supporting reversible migrations does not really work in production. We would also need to store applied migrations in a table, maybe "__marmot_migrations"?

Maybe we should collect those ideas in a document (could be a markdown file in the repo), with arguments for why those design decisions were preferred. From that document we could distill the minimal feature set that would just solve the issue at hand.
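To make the idea concrete, here is a minimal sketch of that tracking table, assuming a small Go helper running against the local SQLite file (the table layout, the migration name, and the `users` example are all invented for illustration, not an agreed design):

```go
package main

import (
	"database/sql"
	"log"
	"time"

	_ "github.com/mattn/go-sqlite3" // SQLite driver; any driver would do
)

func main() {
	db, err := sql.Open("sqlite3", "app.db")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Demo table so the ALTER below has something to act on.
	if _, err := db.Exec(`CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY)`); err != nil {
		log.Fatal(err)
	}

	// Hypothetical tracking table: one row per applied forward migration.
	if _, err := db.Exec(`CREATE TABLE IF NOT EXISTS __marmot_migrations (
		name       TEXT PRIMARY KEY,  -- e.g. "20240101_add_users_email"
		applied_at INTEGER NOT NULL   -- unix timestamp when it ran
	)`); err != nil {
		log.Fatal(err)
	}

	// Apply the migration and record it in the same transaction, so a node
	// can never end up with the schema change but no record of it.
	tx, err := db.Begin()
	if err != nil {
		log.Fatal(err)
	}
	if _, err := tx.Exec(`ALTER TABLE users ADD COLUMN email TEXT`); err != nil {
		tx.Rollback()
		log.Fatal(err)
	}
	if _, err := tx.Exec(
		`INSERT INTO __marmot_migrations(name, applied_at) VALUES (?, ?)`,
		"20240101_add_users_email", time.Now().Unix(),
	); err != nil {
		tx.Rollback()
		log.Fatal(err)
	}
	if err := tx.Commit(); err != nil {
		log.Fatal(err)
	}
}
```

Bundling the DDL and the bookkeeping row in one transaction is what would keep the tracking table trustworthy even if a node dies mid-migration.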
-
No, we are not restoring snapshots; I am using snapshots as a safety measure. You actually run the migration scripts everywhere.
Excellent, you are essentially converging on the same idea: publish the migration script on NATS, and then apply it everywhere.
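A minimal sketch of the "apply it everywhere" side, assuming each node runs a small Go subscriber next to Marmot (the `marmot.migrations` subject is made up; Marmot does not define such a subject today):

```go
package main

import (
	"database/sql"
	"log"

	_ "github.com/mattn/go-sqlite3"
	"github.com/nats-io/nats.go"
)

func main() {
	db, err := sql.Open("sqlite3", "app.db")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		log.Fatal(err)
	}
	defer nc.Close()

	// Every node subscribes to the same (hypothetical) subject and applies
	// whatever migration script gets published there against its local DB.
	_, err = nc.Subscribe("marmot.migrations", func(m *nats.Msg) {
		if _, err := db.Exec(string(m.Data)); err != nil {
			log.Printf("migration failed on this node: %v", err)
			return
		}
		log.Printf("applied migration (%d bytes)", len(m.Data))
	})
	if err != nil {
		log.Fatal(err)
	}

	select {} // keep the subscriber running
}
```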
I can introduce an RFC category under Discussions, but in the long run we might have to move to a different repo, like https://github.com/reactjs/rfcs.
-
Here is an alternative way that solves all of the issues raised above and also gives extra features. It is designed to scale well and to be upgradable with no downtime.
Why?
Changes:
https://github.com/mholt/caddy-l4
https://caddy.community/t/cant-connect-to-database-behind-layer4-server/16168/5
This all results in one binary, because Caddy can be compiled with the Caddy Marmot module. Writing Caddy modules is not complex and the API is quite stable. You could use NATS as the proxy, but then the client also needs NATS, which is a leaky abstraction. Hence the Caddy L4 proxy.
-
The Caddy L4 proxy just passes the SQL through it. It does nothing else unless you want it to, so it's maybe not what you think; it's not an ORM.
-
OK, so forget the Caddy idea; it's too complex. I agree with the sentiment of using NATS to do the schema migration, because it's a no-downtime solution and we already have it, so no new parts need to be added. If devs want to integrate it with CI, they could call out to NATS from their CI.
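For that CI call-out, the publishing side could be as small as this sketch (same made-up `marmot.migrations` subject as in the subscriber sketch above); the `nats` CLI's `pub` command would do the same job as the Go program:

```go
package main

import (
	"log"
	"os"

	"github.com/nats-io/nats.go"
)

func main() {
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		log.Fatal(err)
	}
	defer nc.Close()

	// CI reads the migration script from the repo and publishes it to the
	// hypothetical subject the nodes are listening on.
	script, err := os.ReadFile("migrations/001_add_users_email.sql")
	if err != nil {
		log.Fatal(err)
	}
	if err := nc.Publish("marmot.migrations", script); err != nil {
		log.Fatal(err)
	}
	if err := nc.Flush(); err != nil { // make sure it actually went out before CI exits
		log.Fatal(err)
	}
}
```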
-
There are a couple of projects that implement zero-downtime migrations for Postgres:
There are also a couple of links with ideas:
References
Maybe there is something there that we could also apply to SQLite + Marmot.
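For what it's worth, the core trick behind those Postgres tools is the expand/contract pattern: add the new shape next to the old one, backfill, switch the application over, and only then drop the old shape. A hedged sketch of what the same steps could look like on plain SQLite (the `users` schema and column names are invented; note SQLite only supports `DROP COLUMN` since 3.35):

```go
package main

import (
	"database/sql"
	"log"

	_ "github.com/mattn/go-sqlite3"
)

// Each step keeps the previous schema working; only the final "contract"
// steps remove the old columns once nothing reads or writes them anymore.
var expandContractSteps = []string{
	// expand: add the new column alongside the old ones
	`ALTER TABLE users ADD COLUMN full_name TEXT`,
	// backfill: copy the old shape into the new one
	`UPDATE users SET full_name = first_name || ' ' || last_name WHERE full_name IS NULL`,
	// contract: drop the old columns (SQLite 3.35+)
	`ALTER TABLE users DROP COLUMN first_name`,
	`ALTER TABLE users DROP COLUMN last_name`,
}

func main() {
	db, err := sql.Open("sqlite3", ":memory:")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Assumed starting schema, only here so the sketch runs end to end.
	if _, err := db.Exec(`CREATE TABLE users (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)`); err != nil {
		log.Fatal(err)
	}

	for i, step := range expandContractSteps {
		if _, err := db.Exec(step); err != nil {
			log.Fatalf("step %d failed: %v", i+1, err)
		}
	}
	log.Println("expand/contract migration applied")
}
```

Because every intermediate step is backwards compatible with the previous schema, this is the kind of sequence that could in principle be rolled through a cluster one node at a time without downtime.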
-
Interesting; it seems like it enforces a full framework for how migrations should be done. I've been working with a couple of people on how they handle migrations, and so far, for SQLite especially, people had their typical flow of running a script to alter tables. I can maybe take a page or two out of pgroll to see how that can be mapped onto SQLite workflows.
-
Any progress on this?
-
I have been reading up on this problem space and found a system that is like Marmot: it's SQLite with CRDTs. They are able to do live SQL migrations by modifying the tracking tables, so as not to break the sync tables. That's a tough problem to solve. There are 3 working examples too. @maxpert I am guessing you have read all their stuff?

I personally think that NATS is required for all these DB sync systems. The reason is that I always end up with a farm of SQLite DBs, and I want to do a SQL subscription and make sure that only one of the clients handles each subscription feed event. This is a classic pattern for me and almost every other system: only once (or at least once). So NATS is needed for that anyway. This stuff is HARD :)
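On the "only one of the clients handles the event" point: core NATS queue groups already give that distribution out of the box (at-most-once delivery; at-least-once would need JetStream), so nothing Marmot-specific is required. A rough sketch with invented subject and queue names:

```go
package main

import (
	"log"

	"github.com/nats-io/nats.go"
)

func main() {
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		log.Fatal(err)
	}
	defer nc.Close()

	// Every client subscribes with the same queue group name ("workers");
	// NATS then delivers each message on the subject to only one member
	// of the group, which is the "only one client handles it" behaviour.
	_, err = nc.QueueSubscribe("app.events", "workers", func(m *nats.Msg) {
		log.Printf("this node handles the event: %s", string(m.Data))
	})
	if err != nil {
		log.Fatal(err)
	}

	select {} // keep the subscriber alive
}
```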
-
Currently the triggers are only attached during Marmot's start phase. Individual schema changes (new tables / columns) are not reflected at runtime and require manually applying them on all DBs in the cluster, plus restarting each Marmot node to update the triggers and changelog tables.
Are there any recommendations on how to approach this in a straightforward and maintainable way?
Best,
Roman