feat: tasks 9-12 — mndSnapSend module, vnodeGetSnapSendProgress, dnode handler, mndMain/Show registration#35306
Conversation
Tasks 1-4: define SSnapSendFileSetStat/SSnapSendVnodeStat structs in tsdb.h, add pSnapStat+snapStatLock to STsdb (init/destroy in tsdbOpen/Close), instrument tsdbSnapshot.c (ROW path) and tsdbSnapshotRAW.c (RAW path) to track totalSize, sentSize, startTime and finishedFileSets.
… msg types, systable schemas
…e handler, mndMain/Show registration - mndSnapSend.h/c: mnode snap-send progress cache (pHash), timer pullup, retrieve handlers for ins_snap_send_vnodes/ins_snap_send_filesets - vnodeSnapshot.c: vnodeGetSnapSendProgress() copies tsdb progress under read-lock - vnodeInt.h: declare vnodeGetSnapSendProgress() - vmHandle.c: vmProcessDnodeQuerySnapSendProgressReq() iterates vnodes, builds SDnodeQuerySnapSendProgressRsp, registers TDMT_DND_QUERY_SNAP_SEND_PROGRESS - vmWorker.c: dispatch TDMT_DND_QUERY_SNAP_SEND_PROGRESS to mgmt queue - mmHandle.c: route TDMT_DND_QUERY_SNAP_SEND_PROGRESS_RSP to mnode read queue - mndMain.c: register mndInitSnapSend/mndCleanupSnapSend step; call mndSnapSendPullup in timer - mndShow.c: route ins_snap_send_vnodes/ins_snap_send_filesets table names to enum types - syncMain.c: move syncSnapshotSending() out of BUILD_NO_CALL guard so it links
There was a problem hiding this comment.
Code Review
This pull request implements a monitoring system for snapshot transfer progress, adding system tables ins_snap_send_vnodes and ins_snap_send_filesets. The implementation involves new message types for mnode-dnode communication, internal TSDB statistics for tracking fileset transfers, and updates to vgroup state management. Review feedback highlights a critical heap corruption risk in the dnode response handler and potential memory leaks in the TSDB layer. Further improvements were suggested to optimize hash iteration efficiency, handle uninitialized timestamps in elapsed time calculations, and replace magic numbers with defined constants.
There was a problem hiding this comment.
Pull request overview
This PR introduces a “snapshot-send progress” reporting pipeline from tsdb/vnode → dnode mgmt RPC → mnode cache, and exposes it via new ins_snap_send_vnodes / ins_snap_send_filesets system tables.
Changes:
- Track snapshot-send progress inside the tsdb snapshot readers (fileset totals/sent bytes/timestamps) and expose it via
vnodeGetSnapSendProgress(). - Add a new dnode RPC (
TDMT_DND_QUERY_SNAP_SEND_PROGRESS) plus mnode-side pullup + in-memory cache + show-table retrieval handlers. - Extend status heartbeat serialization to include
SVnodeLoad.snapshotSending, and wire it through mnode vgroup state (snapRestoring) to decide which vgroups to query.
Reviewed changes
Copilot reviewed 24 out of 24 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
| source/libs/sync/src/syncMain.c | Moves syncSnapshotSending() out of BUILD_NO_CALL guard; adjusts snapshot send/recv helpers. |
| source/dnode/vnode/src/vnd/vnodeSnapshot.c | Adds vnodeGetSnapSendProgress() to copy tsdb-layer progress under lock. |
| source/dnode/vnode/src/vnd/vnodeQuery.c | Reports snapshotSending in SVnodeLoad using syncSnapshotSending(). |
| source/dnode/vnode/src/tsdb/tsdbSnapshotRAW.c | Builds/updates/clears per-fileset progress stats for RAW snapshot reads. |
| source/dnode/vnode/src/tsdb/tsdbSnapshot.c | Builds/updates/clears per-fileset progress stats for ranged snapshot reads. |
| source/dnode/vnode/src/tsdb/tsdbOpen.c | Initializes/destroys snapStatLock in STsdb. |
| source/dnode/vnode/src/inc/vnodeInt.h | Declares vnodeGetSnapSendProgress(). |
| source/dnode/vnode/src/inc/tsdb.h | Adds internal structs and fields (pSnapStat, snapStatLock) for progress tracking. |
| source/dnode/mnode/impl/src/mndVgroup.c | Bumps vgroup raw version and adds snapRestoring encode/decode/update behavior. |
| source/dnode/mnode/impl/src/mndSnapSend.c | New mnode module: pullup querying, response handling, and systable retrieval. |
| source/dnode/mnode/impl/src/mndShow.c | Routes new ins_snap_send_* names to show retrieve types. |
| source/dnode/mnode/impl/src/mndMain.c | Registers snap-send module init/cleanup and triggers pullup from the timer. |
| source/dnode/mnode/impl/src/mndDnode.c | Populates snapRestoring from status heartbeat snapshotSending. |
| source/dnode/mnode/impl/inc/mndSnapSend.h | New header for snap-send module init/cleanup/pullup. |
| source/dnode/mnode/impl/inc/mndDef.h | Adds snapRestoring to SVgObj (ephemeral snapshot-send flag). |
| source/dnode/mgmt/mgmt_vnode/src/vmWorker.c | Dispatches TDMT_DND_QUERY_SNAP_SEND_PROGRESS into the vnode mgmt queue. |
| source/dnode/mgmt/mgmt_vnode/src/vmHandle.c | Implements vmProcessDnodeQuerySnapSendProgressReq() and registers the handler. |
| source/dnode/mgmt/mgmt_mnode/src/mmHandle.c | Routes snap-send progress RSP into the mnode read queue. |
| source/common/src/systable.c | Adds schema/meta entries for ins_snap_send_vnodes and ins_snap_send_filesets. |
| source/common/src/msg/tmsg.c | Extends SStatusReq ser/de and adds snap-send progress RSP ser/de/free. |
| include/common/tmsgdef.h | Adds new message type TDMT_DND_QUERY_SNAP_SEND_PROGRESS. |
| include/common/tmsg.h | Adds new show table enums, SVnodeLoad.snapshotSending, and snap-send progress structs. |
| include/common/systable.h | Adds systable name macros for ins_snap_send_vnodes/filesets. |
| .claude/settings.local.json | Adds a local tool settings file (permissions for gh pr *). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| rsp.pVnodeInfos = (rsp.numOfVnodes > 0) | ||
| ? (SSnapSendVnodeInfo *)taosArrayGet(pInfoArray, 0) | ||
| : NULL; | ||
|
|
There was a problem hiding this comment.
Fixed: vmHandle.c now uses manual cleanup — it iterates pInfoArray to free each pFileSetInfos, then destroys the array directly. tFreeSDnodeQuerySnapSendProgressRsp is never called on the sending side (only appropriate for deserialized responses).
| void tFreeSDnodeQuerySnapSendProgressRsp(SDnodeQuerySnapSendProgressRsp *pRsp) { | ||
| if (pRsp == NULL) return; | ||
| if (pRsp->pVnodeInfos != NULL) { | ||
| for (int32_t i = 0; i < pRsp->numOfVnodes; i++) { | ||
| taosMemoryFreeClear(pRsp->pVnodeInfos[i].pFileSetInfos); | ||
| } | ||
| taosMemoryFreeClear(pRsp->pVnodeInfos); | ||
| } |
There was a problem hiding this comment.
Agreed — the ownership rule is now documented in both the serializer's header comment and the vmHandle.c code comment: tFreeSDnodeQuerySnapSendProgressRsp is only for deserialized responses (caller owns a heap-allocated pVnodeInfos). The sending side never calls it; it manually frees each nested pFileSetInfos and destroys the array.
| // Evict old entry (free its pFileSetInfos) | ||
| SSnapSendVnodeInfo *pOld = taosHashGet(gSnapSendMgmt.pHash, &pSrc->vgId, sizeof(int32_t)); | ||
| if (pOld) snapSendFreeVnodeInfo(pOld); | ||
|
|
||
| // Upsert | ||
| if (taosHashPut(gSnapSendMgmt.pHash, &pSrc->vgId, sizeof(int32_t), ©, sizeof(copy)) != 0) { | ||
| taosMemoryFree(pFsCopy); | ||
| code = TSDB_CODE_OUT_OF_MEMORY; | ||
| break; | ||
| } |
There was a problem hiding this comment.
Fixed: the upsert path now saves pOld->pFileSetInfos to pOldFs before calling taosHashPut, then frees pOldFs only after a successful put. If taosHashPut fails, the existing hash entry retains its original (valid) pFileSetInfos pointer.
| pNew->compStorage = pOld->compStorage; | ||
| pNew->pointsWritten = pOld->pointsWritten; | ||
| pNew->compact = pOld->compact; | ||
| pOld->snapRestoring = pNew->snapRestoring; |
| * 0 — progress data available, *pInfo populated (caller | ||
| * must taosMemoryFree(pInfo->pFileSetInfos)) | ||
| * TSDB_CODE_NOT_FOUND — no active snapshot send on this vnode |
| tsdbCloseFS(&(*pTsdb)->pFS); | ||
| tsdbCloseCache(*pTsdb); | ||
| #ifdef TD_ENTERPRISE | ||
| tsdbCloseCompMonitor(*pTsdb); | ||
| #endif | ||
| (void)taosThreadMutexDestroy(&(*pTsdb)->mutex); | ||
| (void)taosThreadRwlockDestroy(&(*pTsdb)->snapStatLock); | ||
| taosMemoryFreeClear(*pTsdb); |
| #ifdef BUILD_NO_CALL | ||
| bool syncSnapshotSending(int64_t rid) { | ||
| bool syncSnapshotRecving(int64_t rid) { | ||
| SSyncNode* pSyncNode = syncNodeAcquire(rid); | ||
| if (pSyncNode == NULL) { | ||
| return false; | ||
| } | ||
|
|
||
| bool b = syncNodeSnapshotSending(pSyncNode); | ||
| bool b = syncNodeSnapshotRecving(pSyncNode); | ||
| syncNodeRelease(pSyncNode); | ||
| return b; | ||
| } | ||
| #endif |
| // transfer_type: "raw" or "row" | ||
| const char *typeStr = (pFs->transferType == 14) ? "raw" : "row"; | ||
| char varType[8 + VARSTR_HEADER_SIZE]; | ||
| STR_TO_VARSTR(varType, typeStr); | ||
| pColInfo = taosArrayGet(pBlock->pDataBlock, cols++); |
| "allow": [ | ||
| "Bash(gh pr *)" | ||
| ] |
vmHandle.c — vmProcessDnodeQuerySnapSendProgressReq: rsp.pVnodeInfos aliased pInfoArray's internal buffer. Calling tFreeSDnodeQuerySnapSendProgressRsp freed that buffer, then taosArrayDestroy freed it again (double-free / heap corruption). Fix: consolidate cleanup into a single _exit block that manually frees each element's pFileSetInfos, then destroys pInfoArray. Never call tFreeSDnodeQuerySnapSendProgressRsp on a non-owned buffer. mndSnapSend.c — mndSnapSendPullup stale-entry removal: taosHashRemove inside taosHashIterate then restarting the iterator with NULL causes iterator corruption (re-visits / skips entries). Fix: collect stale vgIds into a temporary SArray while iterating, then remove all of them after the walk completes.
…snap-send-progress pullup interval config; fix hash-put race in mndSnapSend; add information_schema test cases - SSnapSendFileSetStat/SSnapSendFileSetInfo: rename sttCount→fileCount (now counts all physical files: HEAD/DATA/SMA/STT); add finishedFileCount field; rename sentSize→readSize for clarity - tsdbSnapshot.c / tsdbSnapshotRAW.c: count data files in fileCount loop; set finishedFileCount=fileCount on RangeEnd; track RAW finishedFileCount via dataIter->idx per chunk; rename sentSize→readSize accumulation - mndSnapSend.c: fix race: save pOldFs, do taosHashPut first, then free pOldFs — prevents NULL pFileSetInfos with fileSetCount>0 on put failure; use tsSnapSendPullupInterval instead of hard-coded 10s - tglobal.h/c: add tsSnapSendPullupInterval config (default 10s, range 1-10000) - mndMain.c: separate snapSendPullup from compactPullup, use own interval - syncSnapshot.c: add 30s debug sleep in snapshotSenderStart for testing - tests/system-test/0-others/information_schema.py: add ins_snap_send_check() verifying both tables exist, have correct columns, and return 0 rows when no snapshot is active
…ing guide - zh/en 22-meta.md: add INS_SNAP_SEND_VNODES and INS_SNAP_SEND_FILESETS table definitions with all 6 / 11 columns, types and descriptions - zh/en 04-maintenance.md: add '监控 Snapshot 发送进度 / Monitoring Snapshot Send Progress' section with example queries for vnode-level progress, fileset-level drill-down, detecting stalled transfers, and usage notes (pull interval, elapsed calculation, RAW vs ROW read_size semantics)
| pNew->compStorage = pOld->compStorage; | ||
| pNew->pointsWritten = pOld->pointsWritten; | ||
| pNew->compact = pOld->compact; | ||
| pOld->snapRestoring = pNew->snapRestoring; |
| #ifdef BUILD_NO_CALL | ||
| bool syncSnapshotSending(int64_t rid) { | ||
| bool syncSnapshotRecving(int64_t rid) { | ||
| SSyncNode* pSyncNode = syncNodeAcquire(rid); | ||
| if (pSyncNode == NULL) { | ||
| return false; |
| SCompMonitor *pCompMonitor; | ||
| SSnapSendVnodeStat *pSnapStat; // NULL when no active snapshot send | ||
| TdThreadRwlock snapStatLock; // protects pSnapStat (readers: mnode query; writer: snapshot reader) |
There was a problem hiding this comment.
TDengine's sync module serializes snapshot sending: a vnode can only be sending a snapshot to one follower at a time (enforced by the SSyncSender state machine). Therefore, there is at most one active tsdbSnapReader/tsdbSnapRAWReader per STsdb at any given time, making tsdb->pSnapStat a safe single pointer. We agree this assumption should be documented with a comment in the code.
| (void)taosThreadRwlockWrlock(&tsdb->snapStatLock); | ||
| tsdb->pSnapStat = pStat; | ||
| (void)taosThreadRwlockUnlock(&tsdb->snapStatLock); |
There was a problem hiding this comment.
TDengine's sync module serializes snapshot sending: a vnode can only be sending a snapshot to one follower at a time (enforced by the SSyncSender state machine). Therefore, there is at most one active tsdbSnapReader/tsdbSnapRAWReader per STsdb at any given time, making tsdb->pSnapStat a safe single pointer. We agree this assumption should be documented with a comment in the code.
| // clear progress stat | ||
| (void)taosThreadRwlockWrlock(&tsdb->snapStatLock); | ||
| if (tsdb->pSnapStat != NULL) { | ||
| taosMemoryFree(tsdb->pSnapStat->pFileSetStats); | ||
| taosMemoryFree(tsdb->pSnapStat); | ||
| tsdb->pSnapStat = NULL; | ||
| } | ||
| (void)taosThreadRwlockUnlock(&tsdb->snapStatLock); |
There was a problem hiding this comment.
TDengine's sync module serializes snapshot sending: a vnode can only be sending a snapshot to one follower at a time (enforced by the SSyncSender state machine). Therefore, there is at most one active tsdbSnapReader/tsdbSnapRAWReader per STsdb at any given time, making tsdb->pSnapStat a safe single pointer. We agree this assumption should be documented with a comment in the code.
|
|
||
| // transfer_type: "raw" or "row" | ||
| const char *typeStr = (pFs->transferType == 14) ? "raw" : "row"; | ||
| char varType[8 + VARSTR_HEADER_SIZE]; | ||
| STR_TO_VARSTR(varType, typeStr); |
| static const SSysDbTableSchema snapSendVnodesSchema[] = { | ||
| {.name = "vgroup_id", .bytes = 4, .type = TSDB_DATA_TYPE_INT, .sysInfo = false}, | ||
| {.name = "dnode_id", .bytes = 4, .type = TSDB_DATA_TYPE_INT, .sysInfo = false}, | ||
| {.name = "total_file_sets", .bytes = 4, .type = TSDB_DATA_TYPE_INT, .sysInfo = false}, | ||
| {.name = "finished_file_sets", .bytes = 4, .type = TSDB_DATA_TYPE_INT, .sysInfo = false}, | ||
| {.name = "start_time", .bytes = 8, .type = TSDB_DATA_TYPE_TIMESTAMP, .sysInfo = false}, | ||
| {.name = "elapsed", .bytes = 16 + VARSTR_HEADER_SIZE, .type = TSDB_DATA_TYPE_VARCHAR, .sysInfo = false}, | ||
| }; |
| // elapsed (VARCHAR "HH:MM:SS") | ||
| snapSendFmtElapsed(pInfo->startTime, elapsedBuf, sizeof(elapsedBuf)); | ||
| char varElapsed[32 + VARSTR_HEADER_SIZE]; | ||
| STR_TO_VARSTR(varElapsed, elapsedBuf); | ||
| pColInfo = taosArrayGet(pBlock->pDataBlock, cols++); | ||
| if (colDataSetVal(pColInfo, numOfRows, varElapsed, false) != 0) { | ||
| code = TSDB_CODE_OUT_OF_MEMORY; break; |
| # ins_snap_send_filesets: table must exist and have the expected 10 columns. | ||
| tdSql.query('select * from information_schema.ins_snap_send_filesets') | ||
| tdSql.checkRows(0) | ||
| tdSql.query('describe information_schema.ins_snap_send_filesets') | ||
| col_names = [row[0] for row in tdSql.queryResult] | ||
| for col in ['vgroup_id', 'fid', 'file_count', 'finished_file_count', | ||
| 'total_size', 'read_size', 'start_time', 'elapsed', | ||
| 'start_index', 'end_index', 'transfer_type']: | ||
| tdSql.checkEqual(True, col in col_names) |
| "allow": [ | ||
| "Bash(gh pr *)" | ||
| ] |
- mndVgroup.c: fix snapRestoring copy direction (pOld→pNew, preserves ephemeral in-memory state across SDB updates) - syncMain.c: remove BUILD_NO_CALL guard from syncSnapshotRecving so it links in normal builds (consistent with syncSnapshotSending) - tsdbOpen.c: free pSnapStat + pFileSetStats in tsdbClose under wrlock before destroying the rwlock, preventing memory leak on vnode close - mndSnapSend.h/c: fix comment typo tsCompactPullupInterval → tsSnapSendPullupInterval; add SNAP_TRANSFER_TYPE_RAW #define instead of magic number 14; add startTimeMs==0 guard in snapSendFmtElapsed - vmHandle.c: abort snap-send-progress request with OOM on taosArrayPush failure instead of silently continuing with partial results - systable.c: widen elapsed VARCHAR from 16 to 32 bytes in both ins_snap_send_vnodes and ins_snap_send_filesets schemas - vnodeSnapshot.c: update doc comment to include TSDB_CODE_OUT_OF_MEMORY as a possible return value - information_schema.py: fix comment '10 columns' → '11 columns' - .gitignore: add .claude/ to prevent local AI tool settings from being committed; remove .claude/settings.local.json from git tracking
- pActiveVgIds put failure: skip vgroup with continue to prevent incorrect stale-entry eviction of its progress cache (data correctness fix) - pDnodesToQuery put failure: log only — missed query retried next cycle - pStaleVgIds push failure: log only — stale entry survives one extra cycle
Description
Issue(s)
Checklist
Please check the items in the checklist if applicable.