MixedDbClient: batch ZMQ Set/Del for multi-key gNMI updates (#27250)#675
Open
hdwhdw wants to merge 1 commit into
Open
MixedDbClient: batch ZMQ Set/Del for multi-key gNMI updates (#27250)#675hdwhdw wants to merge 1 commit into
hdwhdw wants to merge 1 commit into
Conversation
Why:
The current handleTableData loops DbSetTable/DbDelTable per key, so a
single gNMI Set carrying N keys to a DASH_* table generates N ZMQ
round-trips even though the messages all target the same producer.
At the scale DASH route programming requires (10k-100k entries per
update), the per-key overhead dominates wall-clock time.
What:
* Add DbBatchSetTable(table, []tableBatchEntry) and DbBatchDelTable(
table, []string) on MixedDbClient. When the table is ZMQ-backed
these collapse the batch into a single ZmqProducerBatchedSet /
ZmqProducerBatchedDel call (one ZMQ message). For plain Table or
non-ZMQ ProducerStateTable backends — where no batched C++ API
exists — they fall back to the existing per-entry loop, so behavior
for DASH_HA_* tables and ordinary CONFIG_DB writes is unchanged.
* Wire the multi-key code paths in handleTableData to the new
helpers: the opRemove branch and the multi-key JSON opAdd branch
now build a slice and make one batched call. Single-key paths are
untouched (batching one entry is pointless).
* Wrap the new SWIG entry points with ZmqProducerBatched{Set,Del}Wrapper
to convert C++ exceptions into Go errors, mirroring the existing
ProducerStateTable wrappers.
Tests:
client_test.go adds:
* TestZmqBatchedSetEndToEnd / TestZmqBatchedDelEndToEnd — round-trip
N entries through a real ZmqProducerStateTable + ZmqConsumerStateTable
and verify the consumer drains them all.
* TestDbBatchSetTableFallsBackForPlainTable / ...EmptyAndSingle —
verify the plainTableMap fallback path and empty/single-entry
shortcuts.
* TestHandleTableDataBatchesMultiKeyJSON / ...OpRemove — gomonkey-
patched proof that handleTableData routes through DbBatchSetTable /
DbBatchDelTable exactly once for multi-key payloads instead of N
per-key calls.
Related:
* sonic-net/sonic-buildimage#27250 — tracking issue for the
optimization.
* Depends on sonic-net/sonic-swss-common#1188, which exposes
ZmqProducerBatchedSet/Del/Send to Go via SWIG %inline helpers.
Until that PR lands and a fresh libswsscommon artifact is
published, sonic-gnmi CI on this branch will fail to build — the
Go bindings for the new symbols do not exist in the current
master swsscommon artifact.
Signed-off-by: Dawei Huang <daweihuang@microsoft.com>
Contributor
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
Contributor
There was a problem hiding this comment.
Pull request overview
This PR optimizes the gNMI Set hot path in MixedDbClient by batching multi-key ZMQ-backed table writes/deletes into a single ZMQ message, reducing per-key round-trips for large DASH updates.
Changes:
- Add
DbBatchSetTable/DbBatchDelTableto batch SET/DEL operations for ZMQ-backed tables with a fallback to per-entry loops for non-ZMQ backends. - Update
handleTableDatato route multi-key JSONopAddand multi-keyopRemovethrough the new batch APIs. - Add end-to-end and routing tests covering batched ZMQ set/del, fallback behavior, and
handleTableDatabatching behavior.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
sonic_data_client/mixed_db_client.go |
Adds batched ZMQ Set/Del wrappers and batch APIs; updates handleTableData to use batching for multi-key updates. |
sonic_data_client/client_test.go |
Adds new tests validating batched ZMQ behavior and that handleTableData calls batch APIs exactly once. |
Comment on lines
+2338
to
+2346
| // handleTableData looks up RedisDbMap[<mapkey>:<dbName>]; ensure a | ||
| // redis client is registered for the key. | ||
| mapKey := "test:" + APPL_DB_NAME | ||
| if _, ok := RedisDbMap[mapKey]; !ok { | ||
| RedisDbMap[mapKey] = redis.NewClient(&redis.Options{ | ||
| Addr: "127.0.0.1:6379", | ||
| DB: 0, | ||
| }) | ||
| } |
Comment on lines
+2396
to
+2407
| if _, ok := RedisDbMap[mapKey]; !ok { | ||
| RedisDbMap[mapKey] = redis.NewClient(&redis.Options{ | ||
| Addr: "127.0.0.1:6379", | ||
| DB: 0, | ||
| }) | ||
| } | ||
|
|
||
| // Seed three keys in Redis so the wildcard Keys() lookup returns | ||
| // multiple dbkeys. | ||
| const tableName = "DASH_BATCH_REMOVE_TABLE" | ||
| const delim = ":" | ||
| rdb := RedisDbMap[mapKey] |
Comment on lines
+2172
to
+2173
| for _, c := range zmqClientMap { | ||
| swsscommon.DeleteZmqClient(c) |
Comment on lines
+2211
to
+2212
| for _, c := range zmqClientMap { | ||
| swsscommon.DeleteZmqClient(c) |
|
@hdwhdw could you check PR failures and copilot comments? |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Why
handleTableDatainMixedDbClient(undersonic_data_client/mixed_db_client.go) currently loops per key when servicing a gNMI Set:opRemove→ callsDbDelTableper dbkey.opAdd→ callsDbSetTableper JSON top-level key.For DASH route programming this is the hot path — a single gNMI Set can carry tens of thousands of keys, and we generate that many ZMQ round-trips even though they all target the same producer on the same DPU.
ZmqProducerStateTablealready has a batchedset(vector<KeyOpFieldsValuesTuple>)/del(vector<string>)insonic-swss-commonthat fires one ZMQ message regardless of N — but the Go bindings did not expose a callable form (see the dependency below).What
DbBatchSetTable(table string, entries []tableBatchEntry) errorandDbBatchDelTable(table string, keys []string) erroronMixedDbClient. For ZMQ-backed tables (DASH_*minusDASH_HA_*) they assemble parallelVectorString/FieldValuePairsListand call the newswsscommon.ZmqProducerBatchedSet/swsscommon.ZmqProducerBatchedDelhelpers underRetryHelper. For plainTableand non-ZMQProducerStateTablebackends — where there is no batched C++ API — they fall back to the existing per-entry loop, so behaviour forDASH_HA_*and ordinary CONFIG_DB writes is unchanged.handleTableData:opRemovebuilds a key slice and makes a singleDbBatchDelTablecall.opAddbuilds an entry slice and makes a singleDbBatchSetTablecall.ZmqProducerBatchedSetWrapper/ZmqProducerBatchedDelWrapperwrap the new SWIG entry points to convert C++ exceptions into Go errors (same pattern as the existingProducerStateTable…Wrappers).Tests
sonic_data_client/client_test.gogains:TestZmqBatchedSetEndToEnd/TestZmqBatchedDelEndToEnd— round-trip 5 entries through a realZmqProducerStateTable+ZmqConsumerStateTableand verify the consumer drains them all.TestDbBatchSetTableFallsBackForPlainTable/TestDbBatchSetTableEmptyAndSingle— verify theplainTableMapfallback and the no-op/single-entry shortcuts.TestHandleTableDataBatchesMultiKeyJSON/TestHandleTableDataBatchesOpRemove— gomonkey-patched proof thathandleTableDataroutes multi-key payloads throughDbBatchSetTable/DbBatchDelTableexactly once (not N times).Related
ZmqProducerBatchedSet/Del/Sendto Go via SWIG%inlinehelpers. TheSet__SWIG_3(vector<KeyOpFieldsValuesTuple>)overload that already exists in the generatedswsscommon.gois unreachable from Go because the vector element type has no constructor (it's an opaqueStd_tuple_<…>). PR #1188 adds parallel-vector shims that reuse the already-templatedVectorString/FieldValuePairsListtypes.CI on this branch will fail to build until sonic-swss-common#1188 lands and a new
libswsscommonartifact is published. The compile errors will be on theswsscommon.ZmqProducerBatched{Set,Del}symbols.Out of scope (deliberate)
ZmqProducerBatchedSend(mixed SET/DEL in one message) — exported by the swss-common PR for completeness but no gNMI caller needs it today.