
Provide DB synchronization #27

Open
pkulchenko opened this issue Sep 8, 2023 · 1 comment
pkulchenko commented Sep 8, 2023

Background

While working on several state synchronization projects (SyncX, an implementation of Sync9 in Lua, and Cabinet, an implementation of shelf in Lua), I came to realize that it may be worth extending the web framework I've been developing with SQLite-based data synchronization: the framework already includes SQLite, has a network layer to send and accept HTTP/HTTPS requests, and has a DB management layer that takes care of query management and schema upgrades.

Terminology

A quick word on terminology first. I'm using nodeid, peerid, and occasionally userid, which all mean slightly different things. nodeid is a unique identifier for each node/instance running the module and participating in synchronization. Each node can have multiple databases it synchronizes, but for the purposes of this document we can assume one DB per node (without changing anything in the design). A peer is one of the nodes that a particular node directly interacts with; its identity and some associated version information are tracked.

SQLite Session extension

The original goal was to build the simplest synchronization mechanism that would take DB changes and send them to other peers to apply. Since SQLite already provides the session extension that allows tracking of changes to one database and packaging and applying those changes to another database, I decided to build the synchronization on top of that mechanism.

The session extension API that is needed for the synchronization to work is quite simple:

  • capture the changes (using create_session() and changeset() functions)
  • package those changes into a changepack (one or more changesets)
  • send the changes
  • apply the received changes (using the apply_changeset() function)
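
The flow above can be sketched as follows. This is only an illustration: it assumes a Lua binding that exposes the session extension under the function names listed above (create_session(), changeset(), apply_changeset()); the actual method names and signatures may differ.

```lua
-- hypothetical Lua binding for the SQLite session extension; function
-- names follow the list above, signatures and methods are assumptions
local session = db:create_session("main") -- start tracking changes on the main DB
session:attach("foo")                     -- track changes to the 'foo' table

db:execute("insert into foo(k, a, b, c, cnt1) values (1, 'v1', 'v1', 'v1', 0)")

local changeset = session:changeset()     -- capture accumulated changes as a blob
-- ...package one or more changesets into a changepack and send it to a peer...

-- on the receiving node, apply the changes with a conflict resolver callback
peerdb:apply_changeset(changeset, conflictResolver)
```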

When changes are captured, the session extension generates the smallest set of INSERT/UPDATE/DELETE statements that produce the same result when applied.

The session extension also detects several categories of conflicts when changes are applied (each needing to be resolved differently) and provides functions to inspect changesets if needed.

Design

The session extension needs to be configured to track changes, so the framework provides one function (sync()) that sets up the necessary tables and hooks on the current connection. When the sync() function is called, it adds methods that capture/apply changes and "promotes" the connection to support synchronization by installing the hook that captures the changes. It also creates two tables that are used to track the changes: one to track peers to exchange changes with and the other to track generated and received changesets.

  local dbm1 = fm.makeStorage("sync1.db", sql) -- open new connection to sync1.db
  assert(dbm1:upgrade({delete = false})) -- upgrade the schema if needed
  local cdbm1 = dbm1:sync({userid = "u1", remotes = {"http://localhost"}})

  assert(1 == dbm1:execute("insert into foo(k, a, b, c, cnt1) values (1, 'v1', 'v1', 'v1', 0)"))
  assert(1 == dbm1:execute("update foo set a = 'v2', b = 'v2', c = 'v1', cnt1 = cnt1 + 5 where k = 1"))
  assert(1 == dbm1:execute("update foo set a = 'v3', b = 'v1', cnt1 = cnt1 + 2 where k = 1"))

All changes executed through the "original" connection are versioned. If some operations need to be excluded, they can be executed through the connection returned by the sync() function. If some tables need to be excluded, they can be excluded using the filter option of the sync() function.
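
As a sketch of these options (the filter signature here is an assumption for illustration, not a documented parameter):

```lua
-- operations through the connection returned by sync() are not versioned
assert(cdbm1:execute("insert into foo(k, a, b, c, cnt1) values (2, 'x', 'x', 'x', 0)"))
-- a hypothetical filter option could exclude whole tables from versioning
local cdbm2 = dbm2:sync({userid = "u2",
  filter = function(tblname) return tblname ~= "log" end})
```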

As a result of this setup, any combination of versioned and non-versioned tables is supported.

The schema changes are not versioned, but schema upgrades are provided by the upgrade() function and are done transparently for the user (as long as the schema follows some rules, as described below).

Only nodes that have been configured with one or more "remotes" are going to send messages to those remotes. Only nodes that have been configured with a "userid" (node identifier) are going to capture local changes. A node can capture local changes but not send them anywhere; it may instead accept requests from other nodes and return the recorded changes in response.

This allows for a flexible and dynamic configuration with nodes accepting requests as desired and sending generated and received messages to other nodes (based on configured "remotes" values).

Changeset versioning

Versioning is based on hybrid logical clocks, with the clock advanced on each local operation and moved forward if a received remote clock value is larger.
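
To illustrate the clock logic, here is a minimal, self-contained sketch of a hybrid logical clock (not the framework's actual implementation):

```lua
-- minimal hybrid logical clock sketch; a timestamp is a pair of
-- physical time (pt) and a logical counter (lc)
local function newHLC()
  local clock = {pt = 0, lc = 0}
  -- advance the clock for a local event (or before sending a message);
  -- wall is the current wall-clock reading
  function clock.tick(wall)
    if wall > clock.pt then clock.pt, clock.lc = wall, 0
    else clock.lc = clock.lc + 1 end
    return clock.pt, clock.lc
  end
  -- merge a received remote timestamp (rpt, rlc): take the largest
  -- physical component and bump the logical counter accordingly
  function clock.recv(wall, rpt, rlc)
    local pt = math.max(clock.pt, rpt, wall)
    local lc
    if pt == clock.pt and pt == rpt then lc = math.max(clock.lc, rlc) + 1
    elseif pt == clock.pt then lc = clock.lc + 1
    elseif pt == rpt then lc = rlc + 1
    else lc = 0 end
    clock.pt, clock.lc = pt, lc
    return pt, lc
  end
  return clock
end
```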

Only local changes are versioned. As existing rows or values are not versioned individually, deletes don't need to be tracked in any special ways (no tombstones).

Changesets can be purged when their acceptance is acknowledged by all peers. It's easy to check for the earliest version seen by all peers and delete all older versions (if needed).
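
The purge check reduces to finding the minimum acknowledged version across all peers; a sketch (field names are illustrative):

```lua
-- find the earliest version acknowledged by all peers, so older
-- changesets can be safely purged
local function purgeableBefore(peers)
  -- peers: list of {peerid = ..., acked = <last version acknowledged>}
  local earliest
  for _, peer in ipairs(peers) do
    if not earliest or peer.acked < earliest then earliest = peer.acked end
  end
  return earliest -- changesets with versions older than this can be deleted
end
```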

Most of the time the received changesets are going to have a newer/later version, so they can be directly applied to the current state of the DB. In some cases, some local changesets may already record newer changes, in which case to apply received changesets, the local changesets need to be "reverted" and re-applied after remote changesets are applied.

In other words, if versions 1, 3, and 4 are applied locally and version 2 is received, then 4 and then 3 need to be reverted with the state restored as of version 1 with 2 then applied and 3 and 4 re-applied. All these changes happen within a transaction and the conflicts are resolved as described below.
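
The ordering logic from the example above can be sketched as:

```lua
-- given locally applied versions and a received (older) version, compute
-- which local changesets to revert (newest-first) and re-apply (oldest-first)
local function replayPlan(applied, received)
  -- applied: sorted list of locally applied versions, e.g. {1, 3, 4}
  local revert, reapply = {}, {}
  for i = #applied, 1, -1 do          -- revert newest-first
    if applied[i] > received then revert[#revert + 1] = applied[i] end
  end
  for i = #revert, 1, -1 do           -- re-apply oldest-first afterwards
    reapply[#reapply + 1] = revert[i]
  end
  return revert, reapply
end
```

For versions {1, 3, 4} with 2 received, this yields revert order {4, 3} and re-apply order {3, 4}, matching the description above.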

Conflict handling

With the session extension sending old/new values for all UPDATE statements, it's possible to detect various conflicts that may happen with concurrent updates and need to be resolved for the changes to be consistently applied despite arriving in (potentially) different order.

Here is the conflict resolver function that is provided as part of the apply_changeset() call:

    local function conflictResolver(udata, econf, iter)
      -- local and remote update/delete for the same key; allow
      if econf == sdbm.CHANGESET_DATA then return sdbm.CHANGESET_REPLACE end
      -- no primary key for update/delete; ignore
      if econf == sdbm.CHANGESET_NOTFOUND then return sdbm.CHANGESET_OMIT end
      -- local and remote insert with the same key; the applied insert replaces the existing one
      if econf == sdbm.CHANGESET_CONFLICT then return sdbm.CHANGESET_REPLACE end
      -- FK violations from the changeset; abort the changes and retry after fixing.
      -- this may happen when the changeset is deleting a record still being referenced locally
      if econf == sdbm.CHANGESET_FOREIGN_KEY then return sdbm.CHANGESET_ABORT end
      -- ignore changes that violate constraints, as they can't be applied
      -- TODO: mark as PENDING and re-apply with other PENDING changes
      -- (as these may be related to schema changes that haven't been applied yet)
      if econf == sdbm.CHANGESET_CONSTRAINT then return sdbm.CHANGESET_OMIT end
      error("Conflict handler: Unexpected conflict value "..tostring(econf))
    end

Some of the received changesets may have conflicts that can't be resolved locally. For example, if the remote schema has been updated to add fields to a table, then an insert into that table can't be applied locally until the schema is updated. These changesets are marked as PENDING and are re-applied as part of each sync() call, with the expectation that local conditions have improved enough to correctly apply them.

Sending messages in the opposite direction (from peers with an older schema to ones with a newer schema) is allowed, as changes for tables with fewer fields are still accepted (assuming schema changes add fields but don't remove them).

Package format

I've considered using JSON, multipart/mixed, a SQLite DB, and a couple of other options, but for now I've settled on zipping the changesets into one archive with their versions as filenames, as SQLite can both generate and read/decompress zip archives, so it plays well with the goals of the project.
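
As a sketch, assuming the SQLite build includes the zipfile extension (as the sqlite3 shell does; the query method names below are illustrative):

```lua
-- writing: the zipfile() aggregate packs (name, data) pairs into one archive blob
local pack = dbm:fetchOne([[select zipfile(version, changeset) as archive
  from changesets where version > ?]], lastSentVersion)
-- reading: the zipfile() table-valued function unpacks a received archive
for _, row in ipairs(dbm:fetchAll([[select name, data from zipfile(?)]], pack.archive)) do
  -- row.name is the changeset version; row.data is the changeset blob to apply
end
```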

Requirements/Limitations

Sync-related constraints

  • use the WAL journal mode
  • set primary keys for all tracked tables and use NOT NULL for fields in primary keys
  • use random instead of autoincrement keys to minimize unnecessary conflicts
  • use cascade deletes to avoid/minimize foreign key constraint violations
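
A schema sketch that follows these constraints (table and field names are illustrative):

```lua
local sql = [[
  -- random blob primary keys instead of autoincrement; NOT NULL key fields
  create table if not exists foo (
    k blob not null primary key default (randomblob(16)),
    a text, b text, c text, cnt1 integer not null default 0
  );
  -- cascade deletes minimize foreign key violations when parents are removed
  create table if not exists bar (
    k blob not null primary key default (randomblob(16)),
    foo_k blob references foo(k) on delete cascade
  );
]]
-- WAL journal mode is set on the connection (pragma journal_mode = wal)
```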

Upgrade-related constraints

  • set defaults on all (added) non-null fields (to allow upgrades for existing records)
  • use "create table if not exists" to facilitate schema upgrades, as the upgrades
    need to execute the current schema script without errors
  • use expanding schema evolution to add tables/fields
  • schema changes are not propagated, but can be supported with fullmoon upgrades
  • don't get too fancy with special characters in table names (because of upgrades,
    which require parsing of sql statements to support table changes)

Motivation/Goals

  • provide simple and efficient SQLite-based data synchronization
  • require minimal coding from developers (in the simplest case only one sync() call is needed)
  • provide redbean-based network and security layer
  • support direct (p2p), client-server, and relay-based models of data sync
@pkulchenko pkulchenko self-assigned this Sep 8, 2023
diyism commented Nov 25, 2023

Great idea; a mesh network database seems like a blockchain.

I'm dreaming of a mesh network based on redbean.
Currently I'm using a customized Caddy web server with HTTP/3 behind NAT,
and I added about 10 lines of Go code to the Caddy server to send UDP packets between peers to penetrate the NAT and build a mesh network,
but what a pity, redbean still doesn't support HTTP/3:
jart/cosmopolitan#595
