Skip to content

fix: implement election logic and seed-node bootstrap#33

Merged
jaywyawhare merged 7 commits into
masterfrom
fix-stub-implementations
Apr 29, 2026
Merged

fix: implement election logic and seed-node bootstrap#33
jaywyawhare merged 7 commits into
masterfrom
fix-stub-implementations

Conversation

@jaywyawhare
Copy link
Copy Markdown
Owner

Summary

  • replication.c: CANDIDATE state now resolves to LEADER by counting in-process registered follower DBs as automatic votes (quorum calculation included); replication_request_leadership() uses the same logic instead of the previous single-node shortcut
  • cluster.c: cluster_start() now parses config.seed_nodes (comma-separated host:port list) and pre-populates the node table so the heartbeat thread can track peer health from startup

What was missing

The election logic was a skeleton with a comment describing what a real implementation would do — candidates could never transition to leader when replicas existed. Seed-node discovery was explicitly noted as "not yet implemented".

Test plan

  • Single-node replication: replication_request_leadership() returns 0
  • In-process multi-node: register follower DB, confirm election resolves to leader
  • Cluster with seed_nodes config: nodes table populated before heartbeat starts
  • Cluster with empty seed_nodes: no change in behaviour

replication: candidates count in-process follower DBs as automatic votes
and win election on quorum; cluster: bootstrap peer list from seed_nodes
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Apr 29, 2026

Greptile Summary

This PR delivers the two missing pieces called out in the codebase: proper election logic in replication.c (counting in-process follower DBs toward quorum with the correct strict-majority formula) and seed-node bootstrap in cluster.c. It also adds real librdkafka support in streaming.c, hardens SAML XML parsing in sso.c, and renames CUDA symbols to use the gv_ prefix.

  • streaming.c (P1): rd_kafka_conf_new() is not checked for NULL before rd_kafka_conf_set() is called on it — an OOM return crashes the process instead of falling back to synthetic mode.
  • streaming.c (P1): start_offset set via rd_kafka_topic_partition_list_set_offset is silently ignored by rd_kafka_subscribe, which relies entirely on consumer-group assignment. Users who configure a non-default starting offset will see their setting dropped with no error.

Confidence Score: 4/5

Safe to merge after fixing the two librdkafka P1 bugs; all other changes are correct.

Election and seed-node logic look correct. Two concrete defects in the new Kafka path (NULL deref on conf allocation failure, start_offset silently discarded) need to be fixed before merging.

src/admin/streaming.c — both P1 findings are in the new librdkafka setup block.

Important Files Changed

Filename Overview
src/admin/streaming.c librdkafka integration added; two P1 bugs: rd_kafka_conf_new() return value not checked before use, and start_offset silently ignored because rd_kafka_subscribe disregards partition offsets set via rd_kafka_topic_partition_list_set_offset.
src/admin/replication.c Election logic properly implemented: quorum formula (replica_count+1)/2+1 correctly computes strict majority; replication_request_leadership uses same formula; result role captured before unlock to avoid TOCTOU read.
src/admin/cluster.c Seed-node bootstrap correctly parses comma-separated host:port list; node_count is only incremented when both gv_dup_cstr allocations succeed, guarding against the NULL node_id crash identified in a previous review.
src/admin/sso.c SAML parsing hardened: XML entity decoding added, tag prefix-match false-positives fixed, attribute extraction refactored into reusable helper, saml2 namespace and OID attribute names added.
src/specialized/gpu.c CUDA extern declarations renamed from cuda_* to gv_cuda_* throughout; purely mechanical symbol rename with no logic changes.
CMakeLists.txt Adds optional librdkafka detection via pkg-config with manual fallback; sets HAVE_RDKAFKA compile definition and links library only when found.

Reviews (2): Last reviewed commit: "remove verbose comments introduced in th..." | Re-trigger Greptile

Comment thread src/admin/replication.c Outdated
Comment thread src/admin/cluster.c Outdated
Comment thread src/admin/streaming.c
Comment on lines +394 to +396
rd_kafka_conf_t *conf = rd_kafka_conf_new();

rd_kafka_conf_set(conf, "bootstrap.servers",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 rd_kafka_conf_new() not checked for NULL

rd_kafka_conf_new() can return NULL on OOM. The immediately following rd_kafka_conf_set(conf, "bootstrap.servers", ...) dereferences it unconditionally, producing a NULL dereference crash rather than a graceful fallback.

Suggested change
rd_kafka_conf_t *conf = rd_kafka_conf_new();
rd_kafka_conf_set(conf, "bootstrap.servers",
rd_kafka_conf_t *conf = rd_kafka_conf_new();
if (!conf) {
/* Fall back to synthetic mode on OOM */
goto kafka_setup_done;
}
rd_kafka_conf_set(conf, "bootstrap.servers",
consumer->config.kafka.brokers, errstr, sizeof(errstr));

@jaywyawhare jaywyawhare merged commit 23defde into master Apr 29, 2026
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant