[Merged by Bors] - sql: improve database schema handling #6003

Changes from 4 commits
60 commits
7a97f7a
sql: improve database schema handling
ivan4th May 31, 2024
6a53628
sql: fixup: malsync and rewards
ivan4th Jun 1, 2024
5db7e18
sql: add migration / schema drift related tests
ivan4th Jun 1, 2024
c2938b2
sql: make schema drift table ignore regexp configurable
ivan4th Jun 1, 2024
ec3397e
sql: refactor localsql / statesql tests
ivan4th Jun 1, 2024
1af47bb
Update CHANGELOG.md
ivan4th Jun 1, 2024
7bb2b1b
config: fix db presets for db schema drift detection
ivan4th Jun 1, 2024
bc98240
sql: fix review comments
ivan4th Jun 5, 2024
895f50f
sql: add database schema handling docs
ivan4th Jun 5, 2024
6f33999
Merge branch 'develop' into feature/schema-snapshot
ivan4th Jun 5, 2024
6257174
config: fix build errors
ivan4th Jun 5, 2024
aada7c6
Fix README.md
ivan4th Jun 7, 2024
dbea047
Fix go.mod / go.sum
ivan4th Jun 7, 2024
6bf114a
sql: use go:generate for database schema files
ivan4th Jun 12, 2024
bd62a8e
sql: make schema drift fatal by default
ivan4th Jun 12, 2024
c9d1a6c
sql: remove db-ignore-schema-rx config option
ivan4th Jun 12, 2024
beccddd
sql: avoid cyclic dependencies in future coded migrations
ivan4th Jun 12, 2024
8552a96
Merge branch 'develop' into feature/schema-snapshot
ivan4th Jun 12, 2024
2e41e1a
activation: fix test
ivan4th Jun 12, 2024
dd2ed3e
sql: fix tests
ivan4th Jun 12, 2024
13e8c14
sql: fix query cache handling
ivan4th Jun 12, 2024
fbf2880
sql: update mocks
ivan4th Jun 13, 2024
6dbc902
sql: fix QueryCache related mocks
ivan4th Jun 13, 2024
09e36bb
node: don't print error twice upon failure
ivan4th Jun 13, 2024
9dc3f0c
sql: fix schema drift on Windows
ivan4th Jun 13, 2024
8bbeeb5
sql: update docs on schema handling
ivan4th Jun 13, 2024
30dcf71
sql, malsync: fix handling of context cancelation
ivan4th Jun 13, 2024
8506114
sql: split Schema.Migrate() method
ivan4th Jun 13, 2024
feb39fe
sql: another fix for Windows newlines in the schema
ivan4th Jun 13, 2024
77b46d1
merge-nodes: fix test naming
ivan4th Jun 13, 2024
ee36ee4
sql: close db on schema errors
ivan4th Jun 13, 2024
da6ab28
sql, datastore: remove unneeded mocks
ivan4th Jun 16, 2024
cbf4195
sql: fix naming
ivan4th Jun 16, 2024
b56dc59
sql: remove unneeded assertions from tests
ivan4th Jun 16, 2024
f1f11fe
sql: schemagen: fix help
ivan4th Jun 16, 2024
a172a94
Merge branch 'develop' into feature/schema-snapshot
ivan4th Jun 16, 2024
b149e23
Merge branch 'develop' into feature/schema-snapshot
ivan4th Jun 25, 2024
c2c7e6f
Moved database schema handling docs to CODING.md
ivan4th Jun 25, 2024
0400d33
api: fix database handling in the test
ivan4th Jun 25, 2024
a2679a5
sql: fix identities test
ivan4th Jun 25, 2024
33b9501
sql: simplify StateDatabase and LocalDatabase interfaces
ivan4th Jun 26, 2024
ff35334
node: make it possibe to allow localsql schema drift
ivan4th Jun 26, 2024
93a5087
Merge branch 'develop' into feature/schema-snapshot
ivan4th Jun 28, 2024
b568b25
Merge branch 'develop' into feature/schema-snapshot
ivan4th Jun 28, 2024
bd28a9f
Merge branch 'develop' into feature/schema-snapshot
ivan4th Jul 8, 2024
507aca6
Merge branch 'develop' into feature/schema-snapshot
ivan4th Jul 8, 2024
5b47e76
sql: fix failing tests
ivan4th Jul 8, 2024
4090f64
Merge branch 'develop' into feature/schema-snapshot
ivan4th Jul 8, 2024
945f2ec
Merge branch 'develop' into feature/schema-snapshot
ivan4th Jul 9, 2024
ce9647b
Merge branch 'develop' into feature/schema-snapshot
ivan4th Jul 11, 2024
1220037
Merge branch 'develop' into feature/schema-snapshot
ivan4th Aug 13, 2024
dcab056
statesql: update schema
ivan4th Aug 13, 2024
1d8d310
tmp: fix lint errors (will need to revert)
ivan4th Aug 13, 2024
a942ccc
Revert "tmp: fix lint errors (will need to revert)"
ivan4th Aug 15, 2024
fe95d18
Move statesql migrations to a separate package to avoid cyclic deps
ivan4th Aug 15, 2024
354dc56
Merge branch 'develop' into feature/schema-snapshot
ivan4th Aug 15, 2024
22847df
sql: fix schemagen
ivan4th Aug 15, 2024
a59cc3c
Merge branch 'develop' into feature/schema-snapshot
ivan4th Aug 20, 2024
47fbc84
Addressed comments
ivan4th Aug 20, 2024
2e7e150
sql: ignore whitespace during schema drift checks
ivan4th Aug 20, 2024
80 changes: 80 additions & 0 deletions README.md
@@ -514,6 +514,86 @@ $ grpcurl -plaintext 127.0.0.1:9093 spacemesh.v1.DebugService.NetworkInfo
}
```

#### Handling database schema changes

go-spacemesh currently maintains two SQLite databases in the data folder: `state.sql` (state database) and `local.sql` (local database). Both databases use schema versioning, and an older version of either database can be upgraded to the current schema version by running a series of migrations. go-spacemesh also tracks schema drift (unexpected schema changes) in both databases.

When a database is first created, the corresponding schema file embedded in the go-spacemesh executable is used to initialize it:
* `sql/statesql/schema/schema.sql` for `state.sql`
* `sql/localsql/schema/schema.sql` for `local.sql`

The schema file includes `PRAGMA user_version = ...`, which sets the version of the database schema. The schema version is equal to the number of migrations defined for the corresponding database (`state.sql` or `local.sql`).

For an existing database, `PRAGMA user_version` is checked against the expected version number. If the database's schema version is too new, go-spacemesh fails right away, because an older go-spacemesh version cannot be expected to work with a database from a newer version. If the database's schema version is older than expected, go-spacemesh runs the necessary migration scripts embedded in the go-spacemesh executable and updates `PRAGMA user_version = ...`. The migration scripts are located in the following folders:
* `sql/statesql/schema/migrations` for `state.sql`
* `sql/localsql/schema/migrations` for `local.sql`
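
The version check described above can be sketched in Go as follows. This is a minimal illustration, not the go-spacemesh implementation: `pendingMigrations` and `ErrSchemaTooNew` are hypothetical names, and the sketch assumes migrations are numbered consecutively from 1, so that the expected schema version equals the number of migrations.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrSchemaTooNew is a hypothetical error used only in this sketch.
var ErrSchemaTooNew = errors.New("database schema version is too new")

// pendingMigrations returns the migration versions that still need to run.
// dbVersion stands for the value read from PRAGMA user_version, and
// numMigrations for the number of migrations known to the binary
// (which is also the expected schema version).
func pendingMigrations(dbVersion, numMigrations int) ([]int, error) {
	if dbVersion > numMigrations {
		// A database written by a newer binary cannot be used safely.
		return nil, fmt.Errorf("%w: database is at %d, binary expects %d",
			ErrSchemaTooNew, dbVersion, numMigrations)
	}
	pending := make([]int, 0, numMigrations-dbVersion)
	for v := dbVersion + 1; v <= numMigrations; v++ {
		pending = append(pending, v)
	}
	return pending, nil
}

func main() {
	// Assume the binary knows 5 migrations; try an old, a current, and a too-new database.
	for _, dbVersion := range []int{3, 5, 6} {
		pending, err := pendingMigrations(dbVersion, 5)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("db at version %d: migrations to run: %v\n", dbVersion, pending)
	}
}
```

After the pending migrations succeed, the real code would also update `PRAGMA user_version`; that step is omitted here because it requires an open SQLite connection.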

Additionally, some migrations ("coded migrations") can be implemented in Go code, in which case they reside in `.go` files located in `sql/statesql` and `sql/localsql` packages, respectively. It is worth noting that old coded migrations can be removed at some point, rendering database versions that are *too* old unusable with newer go-spacemesh versions.

After all the migrations are run, go-spacemesh compares the schema of each database to the embedded schema scripts and if they differ, warns the user about any differences:
```
logger.go:146: 2024-06-05T05:39:32.247+0400 WARN database schema drift detected {"uri": "file:/var/folders/r0/4mks2v4n5ysbntnf3xq6h_q80000gn/T/TestSchemaidempotent_migration3425594786/001/test.db", "diff": " (\n \t\"\"\"\n \t... // 81 identical lines\n \t PRIMARY KEY (kind, epoch)\n \t) WITHOUT ROWID;\n- \t\n- \t-- some change\n \t\"\"\"\n )\n"}
```

In this case, an empty line and `-- some change` were added to `schema.sql` by hand. The pretty-printed diff looks like this:
```
(
"""
... // 81 identical lines
PRIMARY KEY (kind, epoch)
) WITHOUT ROWID;
-
- -- some change
"""
)
```
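
A drift check of this kind boils down to comparing two schema scripts. The sketch below is a simplified assumption of how such a comparison could work: it normalizes Windows line endings and trailing whitespace before comparing, whereas the warning above comes from a structured diff. `normalizeSchema` and `schemaDrifted` are hypothetical names, not go-spacemesh functions.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeSchema converts Windows newlines to Unix ones, strips trailing
// whitespace from every line, and trims leading/trailing blank lines.
func normalizeSchema(s string) string {
	s = strings.ReplaceAll(s, "\r\n", "\n")
	lines := strings.Split(s, "\n")
	for i, l := range lines {
		lines[i] = strings.TrimRight(l, " \t")
	}
	return strings.TrimSpace(strings.Join(lines, "\n"))
}

// schemaDrifted reports whether the actual schema differs from the
// expected one after normalization.
func schemaDrifted(expected, actual string) bool {
	return normalizeSchema(expected) != normalizeSchema(actual)
}

func main() {
	expected := "CREATE TABLE a (id INT);\n"
	sameWithCRLF := "CREATE TABLE a (id INT);\r\n"
	withComment := "CREATE TABLE a (id INT);\n\n-- some change\n"

	fmt.Println("CRLF variant drifted:", schemaDrifted(expected, sameWithCRLF))
	fmt.Println("commented variant drifted:", schemaDrifted(expected, withComment))
}
```

Note that this sketch deliberately tolerates differences in trailing blank lines, so it is more lenient than the behavior shown in the log above, where the added empty line is part of the reported diff.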

Possible reasons for schema drift include:
* running an unreleased version of go-spacemesh against your data folder; an unreleased version may contain migrations that change before the release happens
* manual changes in the database
* external SQLite tooling used on the database that adds tables, indices, etc.

In the last case, it is possible to make go-spacemesh ignore certain objects (tables and indices) when checking for schema drift. To do so, use the `main.db-ignore-schema-rx` setting to specify a regular expression; tables and indices whose names match it are ignored during schema drift checks. The setting defaults to `^_litestream` to help with certain tooling.
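
To illustrate how such an ignore pattern behaves, here is a small Go sketch that filters object names with the default `^_litestream` expression. The helper `filterByPattern` and the object names are made up for this example; the sketch only shows the regular-expression semantics, not how go-spacemesh applies the setting internally.

```go
package main

import (
	"fmt"
	"regexp"
)

// filterByPattern returns the names that do NOT match the ignore pattern.
// An empty pattern disables filtering. This is a hypothetical helper.
func filterByPattern(names []string, pattern string) ([]string, error) {
	if pattern == "" {
		return names, nil
	}
	rx, err := regexp.Compile(pattern)
	if err != nil {
		return nil, fmt.Errorf("bad ignore pattern %q: %w", pattern, err)
	}
	var kept []string
	for _, n := range names {
		if rx.MatchString(n) {
			continue // ignored during the drift check
		}
		kept = append(kept, n)
	}
	return kept, nil
}

func main() {
	// Illustrative object names; only those starting with "_litestream" are ignored.
	names := []string{"atxs", "_litestream_seq", "_litestream_lock", "my_litestream_notes"}
	kept, err := filterByPattern(names, "^_litestream")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("objects checked for drift:", kept)
}
```

Because the pattern is anchored with `^`, a name such as `my_litestream_notes` is still checked; dropping the anchor would ignore any object whose name contains `_litestream`.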

Schema changes in go-spacemesh code should always be made by adding migrations. Once a migration is added, the schema tests in `sql/localsql` and `sql/statesql` start failing. When the tests fail, they display the difference between the schema stored in `schema.sql` and the schema loaded from the database after running all the migrations.
If the schema changes shown in the diff are expected, the schema file needs to be updated:

```console
$ # run the tests
$ eval $(make print-test-env) go test ./sql/localsql ./sql/statesql
...
=== RUN TestSchema/schema/force_migrations
test.go:106: updated schema written to schema/schema.sql.updated
test.go:108:
Error Trace: /Users/ivan4th/work/spacemesh/go-spacemesh/sql/test/test.go:108
Error: Should be empty, but was (
"""
... // 81 identical lines
PRIMARY KEY (kind, epoch)
) WITHOUT ROWID;
- -- some change
"""
)
Test: TestSchema/schema/force_migrations
Messages: schema diff
FAIL
FAIL github.com/spacemeshos/go-spacemesh/sql/localsql 0.163s
ok github.com/spacemeshos/go-spacemesh/sql/statesql 0.286s
FAIL
$ git status
...
Untracked files:
(use "git add <file>..." to include in what will be committed)
sql/localsql/schema/schema.sql.updated

$ # update the schema file
$ mv sql/localsql/schema/schema.sql{.updated,}

$ # rerun the tests
$ eval $(make print-test-env) go test -count=1 ./sql/localsql ./sql/statesql
ok github.com/spacemeshos/go-spacemesh/sql/localsql 0.166s
ok github.com/spacemeshos/go-spacemesh/sql/statesql 0.293s
```

#### Next Steps

- Please visit our [wiki](https://github.com/spacemeshos/go-spacemesh/wiki)
2 changes: 1 addition & 1 deletion cmd/merge-nodes/internal/merge_action.go
@@ -191,7 +191,7 @@ func openDB(dbLog *zap.Logger, path string) (*localsql.Database, error) {

db, err := localsql.Open("file:"+dbPath,
sql.WithLogger(dbLog),
- sql.WithEnableMigrations(false),
+ sql.WithMigrationsDisabled(),
)
if err != nil {
return nil, fmt.Errorf("open source database %s: %w", dbPath, err)
4 changes: 2 additions & 2 deletions cmd/merge-nodes/internal/merge_action_test.go
@@ -42,7 +42,7 @@ func Test_MergeDBs_InvalidTargetScheme(t *testing.T) {
require.NoError(t, db.Close())

err = MergeDBs(context.Background(), zaptest.NewLogger(t), "", tmpDst)
- require.ErrorIs(t, err, sql.ErrOld)
+ require.ErrorIs(t, err, sql.ErrOldSchema)
require.ErrorContains(t, err, "target database")
}

@@ -100,7 +100,7 @@ func Test_MergeDBs_InvalidSourceScheme(t *testing.T) {
require.NoError(t, db.Close())

err = MergeDBs(context.Background(), zaptest.NewLogger(t), tmpSrc, tmpDst)
- require.ErrorIs(t, err, sql.ErrOld)
+ require.ErrorIs(t, err, sql.ErrOldSchema)
require.ErrorContains(t, err, "source database")
}

10 changes: 5 additions & 5 deletions config/config.go
@@ -117,7 +117,7 @@ type BaseConfig struct {
DatabaseSkipMigrations []int `mapstructure:"db-skip-migrations"`
DatabaseQueryCache bool `mapstructure:"db-query-cache"`
DatabaseQueryCacheSizes DatabaseQueryCacheSizes `mapstructure:"db-query-cache-sizes"`
- DatabaseIgnoreTableRx string `mapstructure:"db-ignore-table-rx"`
+ DatabaseSchemaIgnoreRx string `mapstructure:"db-ignore-schema-rx"`

PruneActivesetsFrom types.EpochID `mapstructure:"prune-activesets-from"`

@@ -246,10 +246,10 @@ func defaultBaseConfig() BaseConfig {
ATXBlob: 10000,
ActiveSetBlob: 200,
},
- DatabaseIgnoreTableRx: "^_litestream",
- NetworkHRP: "sm",
- ATXGradeDelay: 10 * time.Second,
- PostValidDelay: 12 * time.Hour,
+ DatabaseSchemaIgnoreRx: "^_litestream",
+ NetworkHRP: "sm",
+ ATXGradeDelay: 10 * time.Second,
+ PostValidDelay: 12 * time.Hour,

PprofHTTPServerListener: "localhost:6060",
}
20 changes: 10 additions & 10 deletions config/mainnet.go
@@ -67,16 +67,16 @@ func MainnetConfig() Config {
hare3conf.EnableLayer = 35117
return Config{
BaseConfig: BaseConfig{
- DataDirParent: defaultDataDir,
- FileLock: filepath.Join(os.TempDir(), "spacemesh.lock"),
- MetricsPort: 1010,
- DatabaseConnections: 16,
- DatabasePruneInterval: 30 * time.Minute,
- DatabaseVacuumState: 15,
- DatabaseIgnoreTableRx: "^_litestream",
- PruneActivesetsFrom: 12, // starting from epoch 13 activesets below 12 will be pruned
- ScanMalfeasantATXs: false, // opt-in
- NetworkHRP: "sm",
+ DataDirParent: defaultDataDir,
+ FileLock: filepath.Join(os.TempDir(), "spacemesh.lock"),
+ MetricsPort: 1010,
+ DatabaseConnections: 16,
+ DatabasePruneInterval: 30 * time.Minute,
+ DatabaseVacuumState: 15,
+ DatabaseSchemaIgnoreRx: "^_litestream",
+ PruneActivesetsFrom: 12, // starting from epoch 13 activesets below 12 will be pruned
+ ScanMalfeasantATXs: false, // opt-in
+ NetworkHRP: "sm",

LayerDuration: 5 * time.Minute,
LayerAvgSize: 50,
2 changes: 1 addition & 1 deletion config/presets/testnet.go
@@ -65,7 +65,7 @@ func testnet() config.Config {
DatabaseConnections: 16,
DatabaseSizeMeteringInterval: 10 * time.Minute,
DatabasePruneInterval: 30 * time.Minute,
- DatabaseIgnoreTableRx: "^_litestream",
+ DatabaseSchemaIgnoreRx: "^_litestream",
NetworkHRP: "stest",

LayerDuration: 5 * time.Minute,