Tables as Database Files as a Potential Solution to the Schema Change Problem #113
-
Hey @WesselAtWork, thank you for making the effort to explain this approach. I like that you're taking a different angle and brainstorming alternatives.

Because Marmot is designed to work with different systems like PocketBase and others, putting the effort on the application developer might not work that well. I worry about that a bit. Some of my own thoughts from a pragmatic perspective: if we can calculate the migration diff automatically, maybe we can keep the effort off the application developer.

As far as I know, the tricky part when you do table migrations in a CRDT system is that you also need to keep the old table schema in the DB, because CRDTs are designed to allow offline changes on ANY of the SQLite DBs. With Marmot, because we are ordered (thanks to NATS), it could be much easier: the migration is calculated on one server and then pushed (via NATS) to all the other SQLite DBs. I am of course not certain, but this makes sense to me as a way to do it.

What about race conditions? If NATS is partitioned (a fancy word for the network between some NATS servers going down for a bit) and the network then reforms, I do not know whether the above race condition would occur. Would be cool if @maxpert could chime in here if he knows.

https://atlasgo.io/getting-started/sqlite-declarative-sql shows how we can calculate the schema change automatically. I don't know if that only works for Ent though — anyone know? https://github.com/ariga/atlas is the code.

This project is perhaps relevant for ideas too: https://superfly.github.io/corrosion/schema.html. Corrosion uses CR-SQLite btw: https://vlcn.io/docs/cr-sqlite/migrations. It uses the CRDT ideas, not the NATS RAFT approach. Some things I can point out with Corrosion that are sort of the same as Marmot:
-
Reading the feature promises of CR-SQLite, it's supposed to deal with diverging writes if a network partition happens. You would probably need to integrate more directly with the library and manage the meta-structures inside Marmot.
Could you elaborate?
-
Tables as Database Files
The approach is simple in that we don't try to deal with schema changes at Marmot's level, but at the boundary between Marmot and the application.
In a theoretical setup, every table has multiple files (versions) that are all replicated by Marmot.
We can use ATTACH to create a database connection that binds all (or some) of the table files together.
This moves the concern to the application, and by extension the application developer.
Inside the application we will have to deal with migrations and/or data fusing.
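As a sketch of the ATTACH mechanics, a single connection can bind several per-table files and even join across them. The file and table names here are purely illustrative, not Marmot conventions:

```python
import os
import sqlite3

# Hypothetical file names; one SQLite file per table, per the proposal above.
for f in ("users.db", "orders.db"):
    if os.path.exists(f):
        os.remove(f)  # start the demo from a clean slate

main = sqlite3.connect("users.db")
main.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

# Bind a second table file into the same connection under a schema alias.
main.execute("ATTACH DATABASE 'orders.db' AS odb")
main.execute("CREATE TABLE odb.orders (id INTEGER PRIMARY KEY, user_id INTEGER)")

# Queries can now join across the two physical files.
main.execute("INSERT INTO users (name) VALUES ('ada')")
main.execute("INSERT INTO odb.orders (user_id) VALUES (1)")
rows = main.execute(
    "SELECT u.name FROM users u JOIN odb.orders o ON o.user_id = u.id"
).fetchall()
```

Note that SQLite limits how many databases can be attached to one connection (ten by default), which may matter if every table gets its own file.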
Simple Switchover
If your application needs a code change to deal with the new schema anyway, you have to stop all instances of the application from adding to the old table.
To have a new schema for `tableX`, we would create the new file `tableX-16.db`, apply the new schema, and then migrate the data over.
Once the new table is live, all the application instances are activated again and will now operate on the new table file.
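A minimal sketch of that offline migration step, assuming hypothetical file names and a new column as the schema change:

```python
import os
import sqlite3

# Hypothetical: tableX.db holds the old schema, tableX-16.db the new one.
for f in ("tableX.db", "tableX-16.db"):
    if os.path.exists(f):
        os.remove(f)  # start the demo from a clean slate

old = sqlite3.connect("tableX.db")
old.execute("CREATE TABLE tableX (id INTEGER PRIMARY KEY, name TEXT)")
old.execute("INSERT INTO tableX (name) VALUES ('alpha')")
old.commit()

# 1. Create the new table file and apply the new schema (here: an extra column).
new = sqlite3.connect("tableX-16.db")
new.execute("CREATE TABLE tableX (id INTEGER PRIMARY KEY, name TEXT, created_at TEXT)")

# 2. With all writers stopped, copy the old rows across, filling in defaults.
rows = old.execute("SELECT id, name FROM tableX").fetchall()
new.executemany(
    "INSERT INTO tableX (id, name, created_at) VALUES (?, ?, datetime('now'))", rows
)
new.commit()

count = new.execute("SELECT COUNT(*) FROM tableX").fetchone()[0]
```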
Gradual Switchover
If you need every instance of the application to always be online, both databases (tables) have to be active and live at some point in time.
You need to deal with reading from an old version at the same time as you are writing to the new one; as a bonus, this gives your application the ability to do time-based partitions.
This increases the programming burden, as you need to deal with more than one version of the representation of your data.
Once you have the process well defined for your application, you could deal with any change at any time.
If you already have your application partitioned into a READ component and a WRITE component, this is much easier: the writing component just needs to start writing into the new schema the moment it becomes active, and the reading component deals with fusing the data across files.
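The read-side fusing can be sketched with ATTACH plus UNION ALL. The version suffixes and the added column are assumptions for illustration, not Marmot conventions:

```python
import os
import sqlite3

for f in ("tableX_v1.db", "tableX_v2.db"):
    if os.path.exists(f):
        os.remove(f)  # start the demo from a clean slate

# Old-version file: still readable, no longer written to.
v1 = sqlite3.connect("tableX_v1.db")
v1.execute("CREATE TABLE tableX (id INTEGER PRIMARY KEY, name TEXT)")
v1.execute("INSERT INTO tableX (name) VALUES ('old-row')")
v1.commit()
v1.close()

# New-version file: all new writes land here (schema gained a 'flag' column).
v2 = sqlite3.connect("tableX_v2.db")
v2.execute("CREATE TABLE tableX (id INTEGER PRIMARY KEY, name TEXT, flag INTEGER DEFAULT 0)")
v2.execute("INSERT INTO tableX (name) VALUES ('new-row')")
v2.commit()

# The read path fuses both versions behind one query, filling a default
# for the column that the old schema lacks.
v2.execute("ATTACH DATABASE 'tableX_v1.db' AS v1")
fused = v2.execute(
    "SELECT name, 0 AS flag FROM v1.tableX "
    "UNION ALL SELECT name, flag FROM main.tableX"
).fetchall()
```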
You could also version every table file by the application version, e.g. `tableX_v1.0.1.db`, to make the distinction between old data and new data easy inside the application.
Implementation requirement from Marmot
IINS: easily dealing with multiple db files at the same time.
Currently we CAN do this model by running multiple instances of Marmot, each connected to a different file, but that will get unmanageable very quickly.
The ideal would be a single process watching a folder and registering any new database files as they arrive.
This way the migration would be controlled completely by the application, which would create the new files in the directory and use them.
All marmot needs to do, is get it replicated as fast as possible.
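A minimal polling sketch of that folder-watching idea. This is illustrative only, not how Marmot is implemented today; the `scan` helper is a hypothetical name:

```python
import glob
import os
import sqlite3
import tempfile

# Files already known to the (hypothetical) replicator.
registered = set()

def scan(folder):
    """Return table files that appeared since the last scan."""
    new_files = []
    for path in sorted(glob.glob(os.path.join(folder, "*.db"))):
        if path not in registered:
            registered.add(path)   # here Marmot would start replicating the file
            new_files.append(path)
    return new_files

folder = tempfile.mkdtemp()

# The application creates a new table file in the watched folder.
conn = sqlite3.connect(os.path.join(folder, "tableX.db"))
conn.execute("CREATE TABLE tableX (id INTEGER PRIMARY KEY)")
conn.commit()
conn.close()

first = scan(folder)   # picks up tableX.db
second = scan(folder)  # nothing new since the last scan
```

A real implementation would likely use filesystem notifications (inotify and friends) rather than polling, but the registration logic is the same.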
Optionally: it could be useful if the Marmot process could signal the application instance it is sidecar-ing for, either by issuing a SIGHUP or by creating a file somewhere the application can watch.
This should cause the application to "reload config" and catch the new table file. (Of course what EXACTLY the application does is left to the implementer)
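On the application side, the SIGHUP half of that contract might look like the following. `reload_tables` and the table file name are hypothetical stand-ins for whatever the implementer chooses:

```python
import signal

# State the application would rebuild on reload.
current_tables = []

def reload_tables(signum, frame):
    # A real handler would re-read the folder / config table and rebuild
    # its database connections; appending a name stands in for that work.
    current_tables.append("tableX-16.db")

# Register the handler so a sidecar's SIGHUP triggers a config reload.
signal.signal(signal.SIGHUP, reload_tables)

# Simulate Marmot signalling this process after replicating a new file.
signal.raise_signal(signal.SIGHUP)
```

(SIGHUP is POSIX-only; on other platforms the file-based notification variant would be needed.)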
Other problems
Config.
Easily solved by including a configuration table with a fairly static schema that contains the filenames and their associated logical table names.
The application would then "listen" to changes on this table and deal with them appropriately.
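A possible shape for such a configuration table; the column names are assumptions for illustration, not an existing Marmot schema:

```python
import os
import sqlite3

if os.path.exists("config.db"):
    os.remove("config.db")  # start the demo from a clean slate

# Static-schema config table mapping logical table names to physical files.
cfg = sqlite3.connect("config.db")
cfg.execute(
    "CREATE TABLE table_files ("
    " logical_name TEXT NOT NULL,"
    " file_name    TEXT NOT NULL,"
    " active       INTEGER NOT NULL DEFAULT 1,"
    " PRIMARY KEY (logical_name, file_name))"
)
cfg.execute("INSERT INTO table_files VALUES ('tableX', 'tableX-15.db', 0)")
cfg.execute("INSERT INTO table_files VALUES ('tableX', 'tableX-16.db', 1)")
cfg.commit()

# The application polls (or is signalled) and resolves the live file per table.
active = dict(
    cfg.execute("SELECT logical_name, file_name FROM table_files WHERE active = 1")
)
```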
Cleanup
Having a simple way to clean up old db files would be nice, but that should really be an application concern.
Bonus
You can "watch" tables selectively!
In a theoretical setup where every table is self-contained in a separate SQLite database file, all you need to do to "watch" a single table is to include only that table's file in your application environment.
i.e. from Marmot we just need the ability to apply a regex to the SQLite files we want to replicate into our local instance from the main pool.
Again, this pushes the implementation burden onto the developer, who needs to be cognizant of it when implementing their program.
The application developer needs to be aware of what they want from the database and then set the db connection appropriately.
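The filtering itself is trivial; a hypothetical pattern like the following would select only one logical table's files from the pool (the pattern and file names are illustrative, not a Marmot flag):

```python
import re

# Replicate only the files for logical table "tableX" into this instance.
pattern = re.compile(r"^tableX.*\.db$")

pool = ["tableX-15.db", "tableX-16.db", "tableY-3.db", "config.db"]
local = [name for name in pool if pattern.match(name)]
```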