feat(db): make PostgreSQL a first-class citizen (independent migration tree + JSONB + integration tests) #427
PostgreSQL previously ran on the KingbaseES migration tree (db/migration/kingbase) because the two share a SQL dialect. This couples PostgreSQL-specific evolution to KingbaseES and blocks PG-only optimizations.
Fork db/migration/kingbase -> db/migration/postgresql (152 files, V1-V158, byte-identical) and point application-postgres.yml's flyway.locations at the new tree. The switch is transparent for existing PostgreSQL deployments: Flyway tracks version + checksum, not the classpath location, and the identical scripts keep identical checksums, so nothing re-runs.
Also align application-kingbase.yml's mate.wiki block with the postgres profile by adding the settings it was missing (allowed-source-roots / watcher-enabled / watcher-interval-ms).
No column definitions change in this commit; PG schema output is identical to what the kingbase tree produced. JSONB optimizations land in a follow-up.
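The transparency argument rests on the checksum depending only on script content. A minimal sketch of that idea, assuming a CRC32-over-lines checksum in the spirit of Flyway's; the class, method, and file names here are illustrative and are not Flyway's API:

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

/** Illustrative only: a content-based checksum, not Flyway's actual implementation. */
final class ChecksumSketch {

    /** CRC32 over the script's lines, ignoring line terminators. */
    static int checksum(String script) {
        CRC32 crc = new CRC32();
        for (String line : script.split("\\R")) {
            crc.update(line.getBytes(StandardCharsets.UTF_8));
        }
        return (int) crc.getValue();
    }

    public static void main(String[] args) {
        String sql = "CREATE TABLE demo (id BIGINT PRIMARY KEY);\n";
        // Hypothetical byte-identical copies of one script under the two trees.
        int fromKingbase = checksum(sql);   // e.g. db/migration/kingbase/V1__demo.sql
        int fromPostgresql = checksum(sql); // e.g. db/migration/postgresql/V1__demo.sql
        // The path never enters the computation, so the two values match.
        System.out.println(fromKingbase == fromPostgresql);
    }
}
```

Under that assumption, a byte-identical copy in a new location yields the same version and checksum pair that the schema history table already holds, which is why no script would be re-applied.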
…ostgreSQL
The forked postgresql migration tree stored all JSON payloads as TEXT
(config_json / headers_json / settings_json / delivery_config / ... ~40
columns), inherited from the kingbase/h2/mysql dialect. On PostgreSQL these
are better modelled as JSONB: writes are validated as well-formed JSON at the
database boundary, and the door is open to GIN indexing / JSON queries later.
Changes (postgresql tree only; kingbase/h2/mysql untouched):
- Convert 46 columns across 18 migrations from TEXT to JSONB. Columns were
whitelisted by name + verified against their entities and seed inserts; free
text columns (description, source_code, encrypted_value, ...) keep TEXT.
- Six NOT NULL columns get a JSONB default ('{}'::jsonb, or '[]'::jsonb for the
array-typed steps_json) so a missing write can't break the NOT NULL contract.
- Three columns stay TEXT on purpose: mate_tool.params_schema and
mate_wiki_transformation.output_schema (arbitrary JSON-Schema text) and
mate_message.metadata (frequently truncated half-structured blob).
- Rewrite V53's connection_mode recovery to JSONB-native ops. The TEXT version
used TRIM/POSITION/SUBSTRING/CONCAT/REPLACE on config_json, which are invalid
on jsonb; the JSONB merge operator `||` does the same idempotent key-set in
one step while preserving the other keys.
- Add stringtype=unspecified to the datasource and flyway JDBC URLs so the
driver sends String-bound JSON values as `unknown`, letting PostgreSQL coerce
them into jsonb (covers both the JacksonTypeHandler path and plain String
columns). Without it, setString -> jsonb fails at runtime.
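For illustration, the URL shape the last bullet describes could be assembled as below. This is a sketch, not the project's configuration: the host, port, and database name are placeholders, and only the currentSchema and stringtype parameters come from this change.

```java
/** Illustrative only: builds a PostgreSQL JDBC URL with the two parameters discussed above. */
final class PgJdbcUrlSketch {

    static String jdbcUrl(String host, int port, String database, String schema) {
        return "jdbc:postgresql://" + host + ":" + port + "/" + database
                // Resolve unqualified table names against the application schema.
                + "?currentSchema=" + schema
                // Send String-bound parameters untyped so the server can infer jsonb.
                + "&stringtype=unspecified";
    }

    public static void main(String[] args) {
        // Placeholder host and database; substitute the values from your own environment.
        System.out.println(jdbcUrl("localhost", 5432, "mateclaw", "mateclaw"));
    }
}
```

The same query string would need to appear on both the datasource URL and the Flyway URL, since seed and migration inserts go through Flyway's own connection.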
Verified against a real PostgreSQL 16 container: all 153 migrations apply
cleanly, converted columns report data_type=jsonb (params_schema / metadata
stay text), invalid JSON is rejected by a jsonb column, and the V53 merge sets
connection_mode=websocket while preserving sibling keys.
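The property checked for V53 (set one key, keep the siblings, and stay safe to re-run) is what a shallow merge with right-hand precedence gives. A minimal Java model of that behaviour, assuming flat top-level keys; the class name and sample keys are hypothetical, and treating a null left side as an empty object is an assumption of this sketch:

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative model of a shallow JSON object merge where right-hand keys win. */
final class JsonbMergeSketch {

    /** Returns left merged with right; a null left side is treated as an empty object. */
    static Map<String, Object> merge(Map<String, Object> left, Map<String, Object> right) {
        Map<String, Object> out = new LinkedHashMap<>();
        if (left != null) {
            out.putAll(left);
        }
        out.putAll(right); // right-hand keys overwrite, all other keys survive
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> config = new LinkedHashMap<>();
        config.put("token", "abc");               // hypothetical sibling key
        config.put("connection_mode", "polling"); // hypothetical previous value

        Map<String, Object> patch = Map.of("connection_mode", "websocket");

        Map<String, Object> once = merge(config, patch);
        Map<String, Object> twice = merge(once, patch);
        System.out.println(once);
        System.out.println(once.equals(twice)); // re-running the merge changes nothing
    }
}
```

Because the merge only overwrites the keys present on the right-hand side, applying it a second time is a no-op, which is the idempotency the migration relies on.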
…tgresql tree
The postgresql migration tree (JSONB columns, JSONB-native V53) can only be exercised on a real PostgreSQL server: H2/MySQL/Kingbase profiles never touch it, so nothing in CI proved it actually applies. Add a shared base class and two Testcontainers test classes against postgres:16-alpine:
- PostgresE2EBaseTest: shared base that points the datasource + Flyway at a throwaway PG container with the postgresql tree, postgre_sql dialect, the mateclaw schema (init script), and currentSchema/stringtype URL params (mirrors application-postgres.yml). @Testcontainers(disabledWithoutDocker = true), so a normal `mvn test` skips, rather than fails, where no Docker daemon exists.
- PostgresMigrationSmokeTest: asserts all 150+ migrations apply with no failed flyway_schema_history rows, and that the upgraded columns are physically jsonb while the excluded ones (params_schema, message.metadata) stay text.
- CronJobDeliveryConfigPgTest: drives the JacksonTypeHandler path end-to-end (CronJobEntity.deliveryConfig insert -> select round-trip), confirms the column is jsonb and queryable via ->> on the server, and that a jsonb column rejects malformed JSON.
testcontainers junit-jupiter + postgresql deps added at test scope (versions managed by Spring Boot's testcontainers-bom).
Verified: both test classes green (4 tests) against postgres:16-alpine. This also exercises the DatabaseBootstrapRunner seed path (data-kingbase-zh.sql is the PostgreSQL-family seed), which the migration tree alone doesn't cover.
…ostgreSQL
The base docker-compose.yml already defines a postgres service but wires
mateclaw-server to MySQL (SPRING_PROFILES_ACTIVE=mysql, depends_on: mysql), so
the postgres service is dead weight. Add an override that re-points the app at
PostgreSQL:
docker compose -f docker-compose.yml -f docker-compose.pg.yml up -d
- mateclaw-server: switch to the postgres profile and DB_HOST=postgres; drop
the inherited mysql dependency with `depends_on.mysql: !reset null` (compose
merges depends_on maps, so the base mysql healthcheck dependency would
otherwise survive and block startup). Requires Compose v2.20+.
- mysql: gate behind a `mysql` profile so it doesn't start under the override.
(Compose still interpolates the base mysql service's ${VAR:?} at config-load
regardless of profiles, so .env must still carry DB_PASSWORD /
DB_ROOT_PASSWORD as placeholders — documented inline and in .env.example.)
- DB creds: the app reads DB_NAME / DB_USERNAME / DB_PASSWORD; map them to the
base postgres service's PGSQL_DB_* values so one .env drives both.
- .env.example: add the PGSQL_DB_* block with usage notes.
The mateclaw schema is created by Flyway on startup (application-postgres.yml
init-sqls), so the container needs no init script.
Verified: `docker compose ... config` resolves cleanly (app depends only on
postgres + searxng, profile=postgres); postgres:16 with the app creds comes up
healthy and accepts the CREATE SCHEMA init. End-to-end app-on-PostgreSQL boot
is covered by the Testcontainers suite in the previous commit.
Docs only mentioned MySQL/H2 even though PostgreSQL is now a fully supported,
tested target. Bring the docs in line with the code:
- New docs/{zh,en}/database-postgresql.md: quick start (Docker + manual),
connection string (currentSchema + the required stringtype=unspecified),
JSONB design (why stringtype is required, which columns stay TEXT, how to add
a GIN index later), MySQL differences, pg_dump backups, the transparent
upgrade path from the old parasitic-kingbase-tree setup, and how to run the
Testcontainers verification.
- CLAUDE.md: database section now lists MySQL/PostgreSQL/KingbaseES and the four
migration trees (h2/mysql/kingbase/postgresql) with the "keep all four in
sync" rule and the PG specifics.
- README.md: prose + capability table mention PostgreSQL 14+ / KingbaseES 8+.
- docs/{zh,en}/config.md: profile table gains postgres + kingbase rows, a
PostgreSQL datasource block, the four-tree migration list, and a "switching
to PostgreSQL" section.
- docs/{zh,en}/docker-deploy.md: a "use PostgreSQL instead of MySQL" section
pointing at docker-compose.pg.yml.
Self-checked: all ./database-postgresql links resolve, referenced test classes
exist, stringtype is actually in application-postgres.yml, and no stale
"MySQL-only" claims remain.
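Thank you very much for this work on making PostgreSQL a first-class citizen: an independent migration tree, JSONB, and integration tests. It is a large change (169 files), and we agree with the direction. Precisely because it is large and touches the DB runtime layer shared by all dialects (migration tree layout, Flyway location resolution, TypeHandler, etc.), the merge risk needs a more careful assessment. We are not merging it for now and will schedule a more thorough review before following up. A few points need to be confirmed first:
Once these are confirmed and startup has been verified on all three dialects, we will merge. Thanks for the hard work 🙏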
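Thanks for the detailed review! Replying to the three points one by one:
1. Relationship between postgresql/ and kingbase/
Coexistence, not replacement.
At fork time (commit … it will not break h2/mysql/kingbase startup: those three trees and their profile configs are completely untouched by this PR.
2. Mapping between the JSONB columns and TEXT in the other dialects + TypeHandler round-trip
3. Migration version-number conflicts
No conflict. PR #427 was forked from … V160/V161 are on another branch ( …
I have run startup on all three dialects (H2 default / MySQL profile / Testcontainers PG 16), and all passed cleanly. If you need additional verification or have other concerns, please let me know 🙏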
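A supplementary note on why an independent postgresql/ tree is needed rather than just reusing kingbase/: the core motivation is to unlock PostgreSQL's native data structures.
Why not reuse the kingbase tree
Although KingbaseES is based on the PG kernel, its SQL dialect and type system differ (JSONB operator support, default-value expression syntax, some DDL behavior). Staying parasitic on the kingbase tree leaves two options, both with a cost:
An independent tree lets each dialect take its own optimal path: PG uses JSONB + native operators, KingbaseES stays on TEXT (compatibility first), and neither constrains the other.
Practical benefits of the fork
1. Data integrity: validation at the database layer, not the application layer
TEXT columns do no validation of JSON content, so bad data can be written silently. With JSONB columns the database engine validates the JSON on write, and malformed JSON is rejected outright (verified by test). This moves the data-quality guardrail from the application layer down to the database layer, without relying on defensive parsing in Java code.
2. Query capability: structured JSON queries out of the box
JSONB columns can use …
3. Storage and performance: binary storage + compression + index paths
For frequently read JSON columns (e.g. …
4. The forced rewrite of V53 is the best example
The original V53 used …
-- TEXT era (string concatenation, fragile)
SET config_json = REPLACE(config_json, ..., CONCAT(...))
-- JSONB era (native merge, safe)
SET config_json = COALESCE(config_json, '{}'::jsonb) || '{"connection_mode":"websocket"}'::jsonb
This is exactly the value of an independent PG tree: let PG do things the PG way instead of being forced down to the lowest common denominator.
Impact on the architecture
With independent trees, database dialect handling moves from "one set of migrations that barely fits several engines" to "each engine has its own optimal migrations":
New PG-only migrations start at V160 and do not reach back to affect the other three trees. This turns "dialect differences" from a runtime compromise into a compile-time (migration-time) correctness guarantee.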
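Closes #426. Related: #244.
Background
PostgreSQL was previously a half-finished target living parasitically on the KingbaseES migration tree: no independent migration tree, never verified against a real database, all JSON stored as TEXT, and no deployment docs. This PR turns it into a first-class citizen that is actually usable, verifiable, and documented.
Changes (5 commits, ordered by single concern, reviewable one at a time)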
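- chore(db): fork kingbase migration tree into independent postgresql tree
  Copy the kingbase tree → db/migration/postgresql (152 files, byte-identical); application-postgres.yml points at the new tree; fill in the mate.wiki.watcher-* settings missing from application-kingbase.yml. Zero behavior change: switching the location is transparent for existing PG deployments (Flyway decides by version + checksum rather than path, and identical scripts have identical checksums).
- feat(db): upgrade high-frequency JSON columns from TEXT to JSONB on PostgreSQL
  TEXT → JSONB (the 6 NOT NULL columns get DEFAULT '{}'/'[]'); params_schema / output_schema / mate_message.metadata deliberately stay TEXT; datasource + flyway URLs gain stringtype=unspecified; V53 is rewritten as a JSONB-native `||` merge (it previously used TRIM/SUBSTRING/CONCAT/REPLACE, which are invalid on jsonb).
- test(db): add Testcontainers PostgreSQL integration tests
  PostgresE2EBaseTest + a migration smoke test + a JSONB CRUD round-trip test; @Testcontainers(disabledWithoutDocker=true) skips automatically when Docker is unavailable.
- ops(docker): add docker-compose.pg.yml override
  `depends_on.mysql: !reset null` removes the dependency on MySQL; MySQL is gated behind a profile.
- docs(db): document PostgreSQL as a first-class database target
  New database-postgresql.md; README / CLAUDE.md / config / docker-deploy updated.
Design notes
- stringtype=unspecified solves it in one line (covers JacksonTypeHandler + plain String JSON columns) with zero Java changes, and is the approach officially recommended for PG JDBC.
- … ->>/@>, so adding one now would just be an empty index; the docs spell out how to add it later.
Verification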
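On a real PostgreSQL 16:
- … ${...} placeholders, V53 rewrite)
- … jsonb, excluded columns stay text
- … ->> queries
- docker compose -f docker-compose.yml -f docker-compose.pg.yml config validates cleanly
- Tests run: 4, Failures: 0, Errors: 0 / BUILD SUCCESS
Run the tests:
mvn -Dtest='PostgresMigrationSmokeTest,CronJobDeliveryConfigPgTest' test
(requires Docker on the local machine).
If the maintainers would rather have this split into 5 separate PRs merged individually, I can split it.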