[RFC] 075 - 客户端模式 从 Dexie DB 迁移 pglite #4868
Replies: 7 comments 6 replies
-
PGlite 和 Postgres 复用Schema 定义是否可以直接给 PGlite 使用? 现有的 Schema 定义代码可以直接给 PGlite 使用,原因如下:
因此,不需要对现有的 Schema 定义代码做任何修改,可以直接复用。 |
Beta Was this translation helpful? Give feedback.
-
Pglite 浏览器 Migration 机制和服务端 migration 不同,在浏览器中 pglite 迁移只能在用户启动网站时完成,因此需要做一个运行时 migration 的实现方案,参考官方的一个 discussion: drizzle-team/drizzle-orm#2532 进行实现即可。 核心思路分两步: Step1. 提前将所有的 sql 文件编译为一个 migrations.json 文件import { readMigrationFiles } from 'drizzle-orm/migrator';
import { writeFileSync } from 'node:fs';
import { join } from 'node:path';
const dbBase = join(__dirname, '../../src/database');
const migrationsFolder = join(dbBase, './migrations');
const migrations = readMigrationFiles({ migrationsFolder: migrationsFolder });
writeFileSync(
join(dbBase, './client/migrations.json'),
JSON.stringify(migrations, null, 2), // null, 2 adds indentation for better readability
);
console.log('🏁 client migrations.json compiled!'); 示例 SQL: ALTER TABLE "messages" ADD COLUMN "client_id" text;--> statement-breakpoint
ALTER TABLE "session_groups" ADD COLUMN "client_id" text;--> statement-breakpoint
ALTER TABLE "sessions" ADD COLUMN "client_id" text;--> statement-breakpoint
ALTER TABLE "topics" ADD COLUMN "client_id" text;--> statement-breakpoint
CREATE INDEX IF NOT EXISTS "messages_client_id_idx" ON "messages" ("client_id");--> statement-breakpoint
ALTER TABLE "messages" ADD CONSTRAINT "messages_client_id_unique" UNIQUE("client_id");--> statement-breakpoint
ALTER TABLE "session_groups" ADD CONSTRAINT "session_groups_client_id_unique" UNIQUE("client_id");--> statement-breakpoint
ALTER TABLE "sessions" ADD CONSTRAINT "sessions_client_id_unique" UNIQUE("client_id");--> statement-breakpoint
ALTER TABLE "topics" ADD CONSTRAINT "topics_client_id_unique" UNIQUE("client_id"); 输出JSON 示例: {
"sql": [
"ALTER TABLE \"messages\" ADD COLUMN \"client_id\" text;",
"\nALTER TABLE \"session_groups\" ADD COLUMN \"client_id\" text;",
"\nALTER TABLE \"sessions\" ADD COLUMN \"client_id\" text;",
"\nALTER TABLE \"topics\" ADD COLUMN \"client_id\" text;",
"\nCREATE INDEX IF NOT EXISTS \"messages_client_id_idx\" ON \"messages\" (\"client_id\");",
"\nALTER TABLE \"messages\" ADD CONSTRAINT \"messages_client_id_unique\" UNIQUE(\"client_id\");",
"\nALTER TABLE \"session_groups\" ADD CONSTRAINT \"session_groups_client_id_unique\" UNIQUE(\"client_id\");",
"\nALTER TABLE \"sessions\" ADD CONSTRAINT \"sessions_client_id_unique\" UNIQUE(\"client_id\");",
"\nALTER TABLE \"topics\" ADD CONSTRAINT \"topics_client_id_unique\" UNIQUE(\"client_id\");\n"
],
"bps": true,
"folderMillis": 1717153686544,
"hash": "ddb29ee7e7a675c12b44996e4be061b1736e8f785052242801f4cdfb2a94f258"
} 这些字段的含义:
Step2. 在运行时调用 migrate 方法触发迁移import { PgDialect } from 'drizzle-orm/pg-core';
import { clientDB } from './db';
import migrations from './migrations.json';
export const migrate = async () => {
// refs: https://github.com/drizzle-team/drizzle-orm/discussions/2532
// @ts-ignore
await clientDB.dialect.migrate(migrations, clientDB.session, {});
return clientDB;
}; 注意事项: breakpoint statement有一个巨坑需要注意,迁移的 sql 语句必须要有 breakpoint statement。 一开始遇到的一个问题是报 后来发现是因为之前加过一些自定义的 SQL (如果全用 drizzle 生成的 sql 就不会遇到这个问题) ,如下: -- step 1: create a temporary table to store the rows we want to keep
CREATE TEMP TABLE embeddings_temp AS
SELECT DISTINCT ON (chunk_id) *
FROM embeddings
ORDER BY chunk_id, random();
-- step 2: delete all rows from the original table
DELETE FROM embeddings;
-- step 3: insert the rows we want to keep back into the original table
INSERT INTO embeddings
SELECT * FROM embeddings_temp;
-- step 4: drop the temporary table
DROP TABLE embeddings_temp;
-- step 5: now it's safe to add the unique constraint
ALTER TABLE "embeddings" ADD CONSTRAINT "embeddings_chunk_id_unique" UNIQUE("chunk_id"); 这些执行步骤中如果没添加 {
"sql": [
"-- step 1: create a temporary table to store the rows we want to keep\nCREATE TEMP TABLE embeddings_temp AS\nSELECT DISTINCT ON (chunk_id) *\nFROM embeddings\nORDER BY chunk_id, random();\n\n-- step 2: delete all rows from the original table\nDELETE FROM embeddings;\n\n-- step 3: insert the rows we want to keep back into the original table\nINSERT INTO embeddings\nSELECT * FROM embeddings_temp;\n\n-- step 4: drop the temporary table\nDROP TABLE embeddings_temp;\n\n-- step 5: now it's safe to add the unique constraint\nALTER TABLE \"embeddings\" ADD CONSTRAINT \"embeddings_chunk_id_unique\" UNIQUE(\"chunk_id\");\n"
],
"bps": true,
"folderMillis": 1724254147447,
"hash": "6aa3e7a9ff9dcd0541ade5471ceec758bc741ee4a3045b4b848e46faedeae7af"
} 这样就会导致出错。所以需要添加 \-- step 1: create a temporary table to store the rows we want to keep
CREATE TEMP TABLE embeddings_temp AS
SELECT DISTINCT ON (chunk_id) *
FROM embeddings
ORDER BY chunk_id, random();
+ --> statement-breakpoint
\-- step 2: delete all rows from the original table
DELETE FROM embeddings;
+ --> statement-breakpoint
\-- step 3: insert the rows we want to keep back into the original table
INSERT INTO embeddings
SELECT * FROM embeddings_temp;
+ --> statement-breakpoint
\-- step 4: drop the temporary table
DROP TABLE embeddings_temp;
+ --> statement-breakpoint
\-- step 5: now it's safe to add the unique constraint
ALTER TABLE "embeddings" ADD CONSTRAINT "embeddings_chunk_id_unique" UNIQUE("chunk_id"); 这样一来生成的 json 就会正确: {
"sql": [
"-- step 1: create a temporary table to store the rows we want to keep\nCREATE TEMP TABLE embeddings_temp AS\nSELECT DISTINCT ON (chunk_id) *\nFROM embeddings\nORDER BY chunk_id, random();\n",
"\n\n-- step 2: delete all rows from the original table\nDELETE FROM embeddings;\n",
"\n\n-- step 3: insert the rows we want to keep back into the original table\nINSERT INTO embeddings\nSELECT * FROM embeddings_temp;\n",
"\n\n-- step 4: drop the temporary table\nDROP TABLE embeddings_temp;\n",
"\n\n-- step 5: now it's safe to add the unique constraint\nALTER TABLE \"embeddings\" ADD CONSTRAINT \"embeddings_chunk_id_unique\" UNIQUE(\"chunk_id\");\n"
],
"bps": true,
"folderMillis": 1724254147447,
"hash": "e99840848ffbb33ca4d7ead6158f02b8d12cb4ff5706d4529d7fa586afa4c2a9"
}, |
Beta Was this translation helpful? Give feedback.
-
Beta Was this translation helpful? Give feedback.
-
pglite 初始化应用初始化需要一个地方初始化并存储 userId。 现在存储的位置是 IndexedDB -> class _UserModel extends BaseModel {
getUser = async (): Promise<DB_User & { id: number }> => {
const noUser = !(await this.table.count());
if (noUser) await this.table.put({ uuid: uuid() });
const list = (await this.table.toArray()) as (DB_User & { id: number })[];
return list[0];
};
} 是不是可以直接参考这个方案在 useInitUserState 中做初始化呢? 先这么做了,加了一个 private async makeSureUserExist() {
const existUsers = await clientDB.query.users.findMany();
let user: { id: string };
if (existUsers.length === 0) {
const result = await clientDB.insert(users).values({ id: uuid() }).returning();
user = result[0];
} else {
user = existUsers[0];
}
if (typeof window !== 'undefined') {
window.__lobeClientUserId = user.id;
}
} db 部分初始化Cluade 还是很强的,对话了几轮就给了一个很好的实现。 |
Beta Was this translation helpful? Give feedback.
-
PGLite 数据库管理器设计核心设计思路
关键实现特性
使用示例// 初始化数据库
await initializeDB({
onStateChange: (state) => {
console.log('Database state:', state);
},
onProgress: ({ phase, progress }) => {
console.log(`Loading ${phase}: ${progress}%`);
},
});
// 数据库操作
await clientDB.query(...);
// 手动迁移(如果需要)
await migrate(true); 设计优势
|
Beta Was this translation helpful? Give feedback.
-
现存的一些bug:
修改设置时报错。原因是 userModel 有加密存储数据的逻辑,这部分逻辑需要调整掉
|
Beta Was this translation helpful? Give feedback.
-
数据迁移dexie 现存数据理论上走一道 importService 就能导入。 测下来所有的消息都没有问题,但没有做文件部分的导入。 用户导出配置数据旧版本走一道 Migration 迁移到最新版,然后用 importService 新版本直接走 schema 变更 SQL 数据迁移方案 |
Beta Was this translation helpful? Give feedback.
-
背景
LobeChat 目前采用了一种兼容客户端 DB 和服务端 DB 的架构设计,以实现在不同场景下的灵活应用。这种设计的核心是在前端 store 中实现核心业务逻辑,通过 service 层区分客户端 DB 和服务端 DB 的操作。这种架构使得我们能够用同一套代码同时支持两种模式,为用户提供了更多选择。
然而,这种设计也带来了一些挑战:
随着 PGlite 的逐渐成熟,特别是其开始支持插件生态(如 pgvector),我们看到了一个统一前后端 DB 逻辑的机会。迁移到 PGlite 将带来以下优势:
这次迁移是我们技术架构演进的重要一步,将为 LobeChat 带来更好的性能、更统一的开发体验,以及更强大的功能支持。同时,这种改造主要集中在 service 和 client DB 层,对现有的前端逻辑影响较小,是一种高性价比的技术升级路径。
设计思路
database/server/schemas
和database/server/models
作为通用的实现,进而实现一码双端;实施进展
分三阶段走:
Step1. 项目目录架构调整
database/server
目录迁移到database/
目录,并完成相应配套工程实现的重构;♻️ refactor: move schema and migration folder #4874Step2. server 端 db 实现改造
database/repositories
目录下Step3. client 端部分实现统一
Step4. 集成 pglite 客户端实现
考虑到 pglite 目前的不成熟性(在 macOS 11.5.2 中无法正常使用),先不准备强制迁移到 pglite 了,准备让用户通过添加
NEXT_PUBLIC_CLIENT_DB='pglite'
来手动开启。Beta Was this translation helpful? Give feedback.
All reactions