Merged
3 changes: 3 additions & 0 deletions .gitignore
Expand Up @@ -61,3 +61,6 @@ test/allure-results/
.idea/
cmake-build-debug/
cmake-build-release/

# Local Claude AI tool settings
.claude/
16 changes: 16 additions & 0 deletions docs/en/08-operation/04-maintenance.md
Expand Up @@ -75,6 +75,22 @@ restore qnode on dnode <dnode_id>; # Restore qnode on dnode
- This feature is based on the recovery of existing replication capabilities, not disaster recovery or backup recovery. Therefore, for the mnode and vnode to be recovered, the prerequisite for using this command is that the other two replicas of the mnode or vnode can still function normally.
- This command cannot repair individual files in the data directory that are damaged or lost. For example, if individual files or data in an mnode or vnode are damaged, it is not possible to recover a specific file or block of data individually. In this case, you can choose to completely clear the data of that mnode/vnode and then perform recovery.

### Monitoring Snapshot Send Progress

After you run `restore dnode`, TDengine synchronizes data from the other replicas to the target node via snapshot replication. Query the following system tables to monitor the progress in real time:

```sql
-- Vnode-level: overall progress for each vnode currently sending a snapshot
SELECT * FROM information_schema.ins_snap_send_vnodes;

-- Fileset-level: per-time-partition file transfer details for a given vgroup
SELECT * FROM information_schema.ins_snap_send_filesets
WHERE vgroup_id = <vgroup_id>
ORDER BY fid;
```

For column definitions of both tables, see [Metadata](../../tdengine-reference/sql-manual/metadata).

## Splitting Virtual Groups

When a vgroup is overloaded with CPU or Disk resource usage due to too many subtables, after adding a dnode, you can split the vgroup into two virtual groups using the `split vgroup` command. After the split, the newly created two vgroups will undertake the read and write services originally provided by one vgroup. This command was first released in version 3.0.6.0, and it is recommended to use the latest version whenever possible.
31 changes: 31 additions & 0 deletions docs/en/14-reference/03-taos-sql/22-meta.md
Expand Up @@ -359,6 +359,37 @@ Provides information about file sets.
| 7 | last_compact | TIMESTAMP | Time of the last compaction |
| 8 | should_compact | bool | Whether the file set should be compacted |

## INS_SNAP_SEND_VNODES

Provides overall progress information for vnodes currently undergoing snapshot transfer. A row appears when a vgroup's leader is transferring a snapshot to a follower, and disappears automatically when the transfer completes.

| # | **Column Name** | **Data Type** | **Description** |
| --- | :------------------: | ------------- | -------------------------------------------------- |
| 1 | vgroup_id | INT | ID of the vgroup that the vnode belongs to |
| 2 | dnode_id | INT | ID of the dnode hosting the leader |
| 3 | total_file_sets | INT | Total number of filesets to transfer |
| 4 | finished_file_sets | INT | Number of filesets fully transferred |
| 5 | start_time | TIMESTAMP | Time when the snapshot reader was opened |
| 6 | elapsed | VARCHAR(16) | Elapsed duration, in `H:MM:SS` format |

## INS_SNAP_SEND_FILESETS

Provides file-level transfer progress for each fileset (time partition) in the active snapshot send. Rows disappear automatically when the snapshot transfer for the owning vnode completes.

| # | **Column Name** | **Data Type** | **Description** |
| --- | :-------------------: | ------------- | ---------------------------------------------------------------------------------------------------------------------------------- |
| 1 | vgroup_id | INT | ID of the vgroup that the vnode belongs to |
| 2 | fid | INT | Fileset ID (time-partition ID) |
| 3 | file_count | INT | Total number of physical files in this fileset (HEAD, DATA, SMA, STT, etc.) |
| 4 | finished_file_count | INT | Number of physical files fully transferred |
| 5 | total_size | BIGINT | Sum of all physical file sizes in this fileset, in bytes |
| 6 | read_size | BIGINT | Bytes read and sent so far (monotonically increasing). In RAW mode the value is directly comparable to total_size; in ROW mode it counts re-compressed bytes, so use it only as a trend indicator |
| 7 | start_time | TIMESTAMP | Time when the transfer of this fileset began |
| 8 | elapsed | VARCHAR(16) | Elapsed time for this fileset, in `H:MM:SS` format |
| 9 | start_index | BIGINT | Start version of this fileset (sver) |
| 10 | end_index | BIGINT | End version of this fileset (ever) |
| 11 | transfer_type | VARCHAR(4) | Transfer mode: `raw` (full RAW transfer) or `row` (incremental ROW transfer) |
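
The column semantics above can be illustrated with a small C sketch. It is not taken from the TDengine source: the helper names `filesetProgressPct`, `formatElapsed`, and `snapTransferTypeName` are hypothetical, and the mapping of the numeric codes 2 and 14 (mentioned in the `tmsg.h` comment in this change) to `row` and `raw` is an assumption made for illustration.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

// Hypothetical helper: percentage of a fileset transferred, clamped to [0, 100].
// In ROW mode read_size counts re-compressed bytes, so treat the result as a
// trend indicator rather than an exact ratio.
static double filesetProgressPct(int64_t readSize, int64_t totalSize) {
  if (totalSize <= 0) return 0.0;
  double pct = 100.0 * (double)readSize / (double)totalSize;
  if (pct < 0.0) pct = 0.0;
  if (pct > 100.0) pct = 100.0;
  return pct;
}

// Hypothetical helper: render a duration given in milliseconds as H:MM:SS,
// the format documented for the `elapsed` column.
static int formatElapsed(int64_t elapsedMs, char *buf, size_t len) {
  if (elapsedMs < 0) elapsedMs = 0;
  int64_t secs = elapsedMs / 1000;
  return snprintf(buf, len, "%" PRId64 ":%02d:%02d", secs / 3600,
                  (int)((secs / 60) % 60), (int)(secs % 60));
}

// Assumed mapping from the numeric transfer type to the `transfer_type` text:
// 14 (SNAP_DATA_RAW) -> "raw", 2 (SNAP_DATA_TSDB) -> "row".
static const char *snapTransferTypeName(int8_t transferType) {
  switch (transferType) {
    case 14: return "raw";
    case 2:  return "row";
    default: return "unknown";
  }
}
```

For a fileset with `read_size = 50` and `total_size = 200`, `filesetProgressPct` yields 25 percent. Clamping keeps the value within range in ROW mode, where the two sizes are not measured in the same way.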

## INS_VNODES

Provides information about vnodes in the system. Users with SYSINFO property set to 0 cannot view this table.
16 changes: 16 additions & 0 deletions docs/zh/08-operation/04-maintenance.md
Expand Up @@ -77,6 +77,22 @@ restore qnode on dnode <dnode_id>;# 恢复dnode上的qnode
- 该功能是基于已有的复制功能的恢复,不是灾难恢复或者备份恢复,所以对于要恢复的 mnode 和 vnode 来说,使用该命令的前提是还存在该 mnode 或 vnode 的其它两个副本仍然能够正常工作。
- 该命令不能修复数据目录中的个别文件的损坏或者丢失。例如,如果某个 mnode 或者 vnode 中的个别文件或数据损坏,无法单独恢复损坏的某个文件或者某块数据。此时,可以选择将该 mnode/vnode 的数据全部清空再进行恢复。

### 监控 Snapshot 发送进度

执行 `restore dnode` 后,TDengine 会通过 snapshot 复制将数据从其他副本同步到目标节点。可以通过以下系统表实时查看进度:

```sql
-- vnode 级:查看每个正在传输 snapshot 的 vnode 的整体进度
SELECT * FROM information_schema.ins_snap_send_vnodes;

-- fileset 级:查看指定 vgroup 中每个时间分片的文件传输详情
SELECT * FROM information_schema.ins_snap_send_filesets
WHERE vgroup_id = <vgroup_id>
ORDER BY fid;
```

两张表的列定义参见[元数据](../../reference/taos-sql/meta)。

## 分裂虚拟组

当一个 vgroup 因为子表数过多而导致 CPU 或 Disk 资源使用量负载过高时,增加 dnode 节点后,可通过 `split vgroup` 命令把该 vgroup 分裂为两个虚拟组。分裂完成后,新产生的两个 vgroup 承担原来由一个 vgroup 提供的读写服务。该命令在 3.0.6.0 版本第一次发布,建议尽可能使用最新版本。
31 changes: 31 additions & 0 deletions docs/zh/14-reference/03-taos-sql/22-meta.md
Expand Up @@ -360,6 +360,37 @@ TDengine 内置了一个名为 `INFORMATION_SCHEMA` 的数据库,提供对数
| 7 | last_compact | TIMESTAMP | 最后一次压缩的时间 |
| 8 | should_compact | bool | 是否需要压缩,true:需要,false:不需要 |

## INS_SNAP_SEND_VNODES

提供当前正在进行 snapshot 发送的 vnode 的整体进度信息。当某 vgroup 的 leader 正在向 follower 传输 snapshot 时,对应行出现;传输完成后自动消失。

| # | **列名** | **数据类型** | **说明** |
| --- | :-----------------: | ------------- | --------------------------------------------------------------------- |
| 1 | vgroup_id | INT | vnode 所属 vgroup ID |
| 2 | dnode_id | INT | leader 所在 dnode ID |
| 3 | total_file_sets | INT | 本次发送需传输的 fileset 总数 |
| 4 | finished_file_sets | INT | 已完整传输完成的 fileset 数量 |
| 5 | start_time | TIMESTAMP | snapshot reader 打开时间 |
| 6 | elapsed | VARCHAR(16) | 持续时长,格式 `H:MM:SS` |

## INS_SNAP_SEND_FILESETS

提供当前活跃 snapshot 发送中每个 fileset(时间分片)的文件级传输进度。随所属 vnode 的 snapshot 发送完成后自动消失。

| # | **列名** | **数据类型** | **说明** |
| --- | :-------------------: | ------------- | ------------------------------------------------------------------------------------------------------------------ |
| 1 | vgroup_id | INT | vnode 所属 vgroup ID |
| 2 | fid | INT | fileset ID(时间分片 ID) |
| 3 | file_count | INT | 该 fileset 包含的物理文件总数(HEAD/DATA/SMA/STT 等累计) |
| 4 | finished_file_count | INT | 已完整传输的物理文件数 |
| 5 | total_size | BIGINT | 该 fileset 所有物理文件大小之和(bytes) |
| 6 | read_size | BIGINT | 已读取并发送的字节数(单调递增;RAW 模式与 total_size 同单位,ROW 模式为重压缩后字节,仅供趋势参考) |
| 7 | start_time | TIMESTAMP | 开始传输该 fileset 的时间 |
| 8 | elapsed | VARCHAR(16) | 该 fileset 已耗时,格式 `H:MM:SS` |
| 9 | start_index | BIGINT | 该 fileset 的起始 version(sver) |
| 10 | end_index | BIGINT | 该 fileset 的结束 version(ever) |
| 11 | transfer_type | VARCHAR(4) | 传输方式:`raw`(RAW 全量传输)或 `row`(ROW 增量传输) |

## INS_VNODES

提供系统中 vnode 的相关信息。属性为 0 的用户不能查看此表。
2 changes: 2 additions & 0 deletions include/common/systable.h
Expand Up @@ -64,6 +64,8 @@ extern "C" {
#define TSDB_INS_DISK_USAGE "ins_disk_usage"
#define TSDB_INS_TABLE_FILESETS "ins_filesets"
#define TSDB_INS_TABLE_TRANSACTION_DETAILS "ins_transaction_details"
#define TSDB_INS_TABLE_SNAP_SEND_VNODES "ins_snap_send_vnodes"
#define TSDB_INS_TABLE_SNAP_SEND_FILESETS "ins_snap_send_filesets"

#define TSDB_PERFORMANCE_SCHEMA_DB "performance_schema"
#define TSDB_PERFS_TABLE_SMAS "perf_smas"
1 change: 1 addition & 0 deletions include/common/tglobal.h
Expand Up @@ -297,6 +297,7 @@ extern bool tsWalDeleteOnCorruption;
extern bool tsDiskIDCheckEnabled;
extern int32_t tsTransPullupInterval;
extern int32_t tsCompactPullupInterval;
extern int32_t tsSnapSendPullupInterval;
extern int32_t tsMqRebalanceInterval;
extern int32_t tsStreamCheckpointInterval;
extern int32_t tsThresholdItemsInWriteQueue;
36 changes: 36 additions & 0 deletions include/common/tmsg.h
Expand Up @@ -164,6 +164,8 @@ typedef enum _mgmt_table {
TSDB_MGMT_TABLE_USAGE,
TSDB_MGMT_TABLE_FILESETS,
TSDB_MGMT_TABLE_TRANSACTION_DETAIL,
TSDB_MGMT_TABLE_SNAP_SEND_VNODES,
TSDB_MGMT_TABLE_SNAP_SEND_FILESETS,
TSDB_MGMT_TABLE_MAX,
} EShowType;

Expand Up @@ -2030,6 +2032,7 @@ typedef struct {
int64_t syncCommitIndex;
int64_t bufferSegmentUsed;
int64_t bufferSegmentSize;
int8_t snapshotSending; // 1 if this vnode (as leader) is actively sending a snapshot
} SVnodeLoad;

typedef struct {
Expand Up @@ -2326,6 +2329,39 @@ int32_t tSerializeSDnodeQueryCompactProgressRsp(void *buf, int32_t bufLen, SDnod
int32_t tDeserializeSDnodeQueryCompactProgressRsp(void *buf, int32_t bufLen, SDnodeQueryCompactProgressRsp *pRsp);
void tFreeSDnodeQueryCompactProgressRsp(SDnodeQueryCompactProgressRsp *pRsp);

// Snap send progress query (mnode → dnode, dnode → mnode RSP)
typedef struct {
int32_t fid;
int32_t fileCount;
int32_t finishedFileCount;
int64_t totalSize;
int64_t readSize;
int64_t startTime; // ms timestamp
int64_t sver;
int64_t ever;
int8_t transferType; // SNAP_DATA_TSDB(2) or SNAP_DATA_RAW(14)
} SSnapSendFileSetInfo;

typedef struct {
int32_t vgId;
int32_t dnodeId;
int32_t totalFileSets;
int32_t finishedFileSets;
int64_t startTime; // ms timestamp of reader open
int32_t fileSetCount; // length of pFileSetInfos
SSnapSendFileSetInfo *pFileSetInfos;
} SSnapSendVnodeInfo;

typedef struct {
int32_t dnodeId;
int32_t numOfVnodes;
SSnapSendVnodeInfo *pVnodeInfos; // array of numOfVnodes elements
} SDnodeQuerySnapSendProgressRsp;

int32_t tSerializeSDnodeQuerySnapSendProgressRsp(void *buf, int32_t bufLen, SDnodeQuerySnapSendProgressRsp *pRsp);
int32_t tDeserializeSDnodeQuerySnapSendProgressRsp(void *buf, int32_t bufLen, SDnodeQuerySnapSendProgressRsp *pRsp);
void tFreeSDnodeQuerySnapSendProgressRsp(SDnodeQuerySnapSendProgressRsp *pRsp);
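
The response nests two levels of heap-allocated arrays (vnodes, then filesets per vnode), so cleanup has to release the inner arrays before the outer one. The sketch below shows one way this could look. It is not the actual `tFreeSDnodeQuerySnapSendProgressRsp` implementation, which is not shown here: the function name `freeSnapSendProgressRsp` is hypothetical, the structs are local copies of the declarations above so that the block compiles on its own, and it assumes the arrays were allocated with the standard `malloc`/`calloc` rather than TDengine's own memory wrappers.

```c
#include <stdint.h>
#include <stdlib.h>

// Local copies of the structs declared in the diff above.
typedef struct {
  int32_t fid;
  int32_t fileCount;
  int32_t finishedFileCount;
  int64_t totalSize;
  int64_t readSize;
  int64_t startTime;
  int64_t sver;
  int64_t ever;
  int8_t  transferType;
} SSnapSendFileSetInfo;

typedef struct {
  int32_t               vgId;
  int32_t               dnodeId;
  int32_t               totalFileSets;
  int32_t               finishedFileSets;
  int64_t               startTime;
  int32_t               fileSetCount;
  SSnapSendFileSetInfo *pFileSetInfos;
} SSnapSendVnodeInfo;

typedef struct {
  int32_t             dnodeId;
  int32_t             numOfVnodes;
  SSnapSendVnodeInfo *pVnodeInfos;
} SDnodeQuerySnapSendProgressRsp;

// Hypothetical cleanup: free each vnode's fileset array, then the vnode array,
// and reset the counters so that calling it twice is harmless.
static void freeSnapSendProgressRsp(SDnodeQuerySnapSendProgressRsp *pRsp) {
  if (pRsp == NULL) return;
  if (pRsp->pVnodeInfos != NULL) {
    for (int32_t i = 0; i < pRsp->numOfVnodes; ++i) {
      free(pRsp->pVnodeInfos[i].pFileSetInfos);
      pRsp->pVnodeInfos[i].pFileSetInfos = NULL;
      pRsp->pVnodeInfos[i].fileSetCount = 0;
    }
    free(pRsp->pVnodeInfos);
  }
  pRsp->pVnodeInfos = NULL;
  pRsp->numOfVnodes = 0;
}
```

Resetting the pointers and counts after freeing is a defensive choice in this sketch. It lets error paths call the cleanup unconditionally without risking a double free.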

typedef struct {
int32_t vgId;
int32_t dnodeId;
1 change: 1 addition & 0 deletions include/common/tmsgdef.h
Expand Up @@ -138,6 +138,7 @@
TD_DEF_MSG_TYPE(TDMT_MND_SET_VGROUP_KEEP_VERSION, "set-vgroup-keep-version", NULL, NULL)
TD_DEF_MSG_TYPE(TDMT_MND_TRIM_DB_WAL, "trim-db-wal", NULL, NULL)
TD_DEF_MSG_TYPE(TDMT_DND_QUERY_COMPACT_PROGRESS, "dnode-query-compact-progress", NULL, NULL)
TD_DEF_MSG_TYPE(TDMT_DND_QUERY_SNAP_SEND_PROGRESS, "dnode-query-snap-send-progress", NULL, NULL)
TD_CLOSE_MSG_SEG(TDMT_DND_MSG)

TD_NEW_MSG_SEG(TDMT_MND_MSG) // 1<<8