Conversation
There was a problem hiding this comment.
Pull request overview
This PR introduces an initial OpenTelemetry tracing integration for remotedebugger, along with a small RBUS-based test application intended to exercise trace-context propagation.
Changes:
- Adds a new OpenTelemetry helper module (
rrdOpenTelemetry.[ch]) with APIs for context management and basic span/event scaffolding. - Wires trace-context setup/propagation into RBUS event handling paths and extends
data_bufto carry trace context + span handle. - Adds a new build target for a test app that sets RBUS trace context and triggers
RRD_SET_ISSUE_EVENT.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 14 comments.
Show a summary per file
| File | Description |
|---|---|
| src/rrdTraceTestApp.c | New RBUS test application to set trace context and trigger the issue event. |
| src/rrdOpenTelemetry.h | New public header defining OTel context + tracing APIs. |
| src/rrdOpenTelemetry.c | New OTel implementation scaffold (TLS context, simple span tracker, ID generation). |
| src/rrdInterface.c | Integrates trace context setup/clearing around RBUS ops and event handling. |
| src/rrdIarmEvents.c | Adds trace-context generation/propagation around a power-manager triggered RBUS set. |
| src/rrdCommon.h | Extends data_buf with trace context pointers and a span handle. |
| src/Makefile.am | Adds new rrdTestApp target and includes OTel sources/headers in the build. |
| #include "rrdRunCmdThread.h" | ||
| #include "rrdOpenTelemetry.h" |
There was a problem hiding this comment.
rrdOpenTelemetry APIs are used here, but under GTEST_ENABLE the unit test runner compiles by including C sources and currently does not include rrdOpenTelemetry.c. This change will break unit-test linking unless rrdOpenTelemetry.c is included/linked for tests or these calls are compiled out/stubbed for GTEST_ENABLE builds.
| #include "rrdRunCmdThread.h" | |
| #include "rrdOpenTelemetry.h" | |
| #include "rrdRunCmdThread.h" | |
| #if !defined(GTEST_ENABLE) | |
| #include "rrdOpenTelemetry.h" | |
| #endif |
|
|
||
| /* Generate and set trace context before RBUS operation */ | ||
| rrd_otel_context_t ctx; | ||
| rrdOtel_GenerateContext(&ctx); | ||
| rrdOtel_SetContext(&ctx); | ||
| rbusHandle_SetTraceContextFromString(rrdRbusHandle, ctx.traceParent, ctx.traceState); | ||
|
|
||
| rc = rbus_set(rrdRbusHandle, RRD_WEBCFG_FORCE_SYNC, value, NULL); | ||
|
|
||
| /* Log the operation in trace */ | ||
| rrdOtel_LogEvent("rbus_set", RRD_WEBCFG_FORCE_SYNC); | ||
|
|
||
| /* Clear trace context after operation */ | ||
| rbusHandle_ClearTraceContext(rrdRbusHandle); |
There was a problem hiding this comment.
These calls ignore return codes (rrdOtel_GenerateContext/SetContext and rbusHandle_SetTraceContextFromString) and then call rrdOtel_LogEvent without ever starting a span (LogEvent currently no-ops when no span is active). Either create a span around this RBUS operation and handle failures, or skip logging when tracing isn't available.
| /* Generate and set trace context before RBUS operation */ | |
| rrd_otel_context_t ctx; | |
| rrdOtel_GenerateContext(&ctx); | |
| rrdOtel_SetContext(&ctx); | |
| rbusHandle_SetTraceContextFromString(rrdRbusHandle, ctx.traceParent, ctx.traceState); | |
| rc = rbus_set(rrdRbusHandle, RRD_WEBCFG_FORCE_SYNC, value, NULL); | |
| /* Log the operation in trace */ | |
| rrdOtel_LogEvent("rbus_set", RRD_WEBCFG_FORCE_SYNC); | |
| /* Clear trace context after operation */ | |
| rbusHandle_ClearTraceContext(rrdRbusHandle); | |
| /* Generate and set trace context before RBUS operation */ | |
| rrd_otel_context_t ctx; | |
| int otelGenRc = rrdOtel_GenerateContext(&ctx); | |
| int otelSetRc = 0; | |
| rbusError_t rbusTraceRc = RBUS_ERROR_BUS_ERROR; | |
| int tracingEnabled = 0; | |
| if (otelGenRc == 0) | |
| { | |
| otelSetRc = rrdOtel_SetContext(&ctx); | |
| if (otelSetRc == 0) | |
| { | |
| rbusTraceRc = rbusHandle_SetTraceContextFromString(rrdRbusHandle, ctx.traceParent, ctx.traceState); | |
| if (rbusTraceRc == RBUS_ERROR_SUCCESS) | |
| { | |
| tracingEnabled = 1; | |
| } | |
| } | |
| } | |
| rc = rbus_set(rrdRbusHandle, RRD_WEBCFG_FORCE_SYNC, value, NULL); | |
| /* Log the operation in trace, if tracing is available */ | |
| if (tracingEnabled) | |
| { | |
| rrdOtel_LogEvent("rbus_set", RRD_WEBCFG_FORCE_SYNC); | |
| /* Clear trace context after operation */ | |
| rbusHandle_ClearTraceContext(rrdRbusHandle); | |
| } |
| /* Clean up trace context */ | ||
| rbusHandle_ClearTraceContext(handle); | ||
|
|
||
| printf("\n[4] Event Processing Summary...\n"); | ||
| printf(" SCENARIO 1 Complete: External -> RBUS -> remote_debugger\n"); | ||
| printf(" Trace ID was PRESERVED throughout the chain\n"); | ||
|
|
||
| char retrievedTraceParent[512]; | ||
| char retrievedTraceState[512]; | ||
|
|
||
| rc = rbusHandle_GetTraceContextAsString(handle, retrievedTraceParent, | ||
| sizeof(retrievedTraceParent), | ||
| retrievedTraceState, | ||
| sizeof(retrievedTraceState)); | ||
| if (rc == RBUS_ERROR_SUCCESS && retrievedTraceParent[0] != '\0') |
There was a problem hiding this comment.
The app clears the RBUS trace context (rbusHandle_ClearTraceContext) and then immediately calls rbusHandle_GetTraceContextAsString on the same handle. This will typically return an empty context, making the "Handler returned trace context" section misleading. If the intent is to display the context that was set, retrieve it before clearing; if the intent is to read context returned from the server/handler, avoid clearing until after retrieval (or clarify what API returns).
| static void _generate_hex_id(char *buffer, int length) | ||
| { | ||
| static const char hex_chars[] = "0123456789abcdef"; | ||
| srand((unsigned int)time(NULL) + pthread_self()); | ||
|
|
||
| for (int i = 0; i < length; i++) | ||
| { | ||
| buffer[i] = hex_chars[rand() % 16]; | ||
| } | ||
| buffer[length] = '\0'; |
There was a problem hiding this comment.
_generate_hex_id reseeds rand() on every call (srand(time(NULL)+pthread_self())), which makes IDs repeat/correlate (e.g., spanId will restart the same sequence as traceId if generated within the same second) and rand()/srand() are not thread-safe. Use a per-process/thread RNG seeded once (or getrandom/arc4random) and avoid reseeding per ID generation.
| snprintf(traceParent, sizeof(traceParent), | ||
| "00-test-trace-id-0123456789abcdef-test-span-id-01234567-01"); |
There was a problem hiding this comment.
The traceparent string used here is not valid W3C Trace Context format (trace-id must be 32 lowercase hex chars; span-id 16 hex chars). If rbusHandle_SetTraceContextFromString validates input, this demo will fail. Use a real traceparent value (or generate one) that conforms to "00-<32hex>-<16hex>-<2hex>".
| snprintf(traceParent, sizeof(traceParent), | |
| "00-test-trace-id-0123456789abcdef-test-span-id-01234567-01"); | |
| snprintf(traceParent, sizeof(traceParent), | |
| "00-0123456789abcdef0123456789abcdef-0123456789abcdef-01"); |
| bin_PROGRAMS = remotedebugger rrdTestApp | ||
|
|
||
| remotedebuggerincludedir = $(includedir)/rrd | ||
| rrdTestAppincludedir = $(includedir)/rrd | ||
| remotedebuggerinclude_HEADERS = rrdCommon.h rrdInterface.h \ | ||
| rrd_archive.h rrd_config.h rrd_logproc.h rrd_sysinfo.h rrd_upload.h | ||
|
|
||
| remotedebugger_SOURCES = rrdMain.c rrdEventProcess.c rrdJsonParser.c rrdRunCmdThread.c rrdCommandSanity.c rrdDynamic.c rrdExecuteScript.c rrdMsgPackDecoder.c rrdInterface.c uploadRRDLogs.c rrd_config.c rrd_sysinfo.c rrd_logproc.c rrd_archive.c rrd_upload.c | ||
| rrd_archive.h rrd_config.h rrd_logproc.h rrd_sysinfo.h rrd_upload.h rrdOpenTelemetry.h | ||
| rrdTestApp_SOURCES = rrdTraceTestApp.c | ||
| remotedebugger_SOURCES = rrdMain.c rrdEventProcess.c rrdJsonParser.c rrdRunCmdThread.c rrdCommandSanity.c rrdDynamic.c rrdExecuteScript.c rrdMsgPackDecoder.c rrdInterface.c uploadRRDLogs.c rrd_config.c rrd_sysinfo.c rrd_logproc.c rrd_archive.c rrd_upload.c rrdOpenTelemetry.c |
There was a problem hiding this comment.
The new test app target is named "rrdTestApp", but the source file and in-app docs refer to "rrdTraceTestApp". Consider renaming the target to match the documented name (or update the documentation strings) to avoid confusion for users and scripts.
src/rrdOpenTelemetry.c
Outdated
| /* Remove span from tracking */ | ||
| g_active_spans[i].spanId = 0; |
There was a problem hiding this comment.
Span bookkeeping is incorrect: rrdOtel_EndSpan decrements g_span_count without compacting the array. If you end a span that isn't the most recently started, remaining spans become unreachable (EndSpan will never find them) and subsequent StartSpan calls may overwrite still-active entries. Consider enforcing LIFO (only end last span) or compacting/moving the last active span into the removed slot.
| /* Remove span from tracking */ | |
| g_active_spans[i].spanId = 0; | |
| /* Remove span from tracking: compact active spans array */ | |
| int last_index = g_span_count - 1; | |
| if (i != last_index) | |
| { | |
| /* Move last active span into the removed slot */ | |
| g_active_spans[i] = g_active_spans[last_index]; | |
| } | |
| /* Clear the now-unused last slot */ | |
| memset(&g_active_spans[last_index], 0, sizeof(g_active_spans[last_index])); |
| * ./rrdTraceTestApp [issue_value] | ||
| * Example: ./rrdTraceTestApp "SecurityPatch.0" | ||
| * Example: ./rrdTraceTestApp (uses default) |
There was a problem hiding this comment.
The usage text/documentation refers to running "./rrdTraceTestApp", but the build produces a binary named "rrdTestApp" (see Makefile.am). Update the usage string (or rename the target) so users can actually run the documented command.
| * ./rrdTraceTestApp [issue_value] | |
| * Example: ./rrdTraceTestApp "SecurityPatch.0" | |
| * Example: ./rrdTraceTestApp (uses default) | |
| * ./rrdTestApp [issue_value] | |
| * Example: ./rrdTestApp "SecurityPatch.0" | |
| * Example: ./rrdTestApp (uses default) |
| #include "rrdOpenTelemetry.h" | ||
| #if !defined(GTEST_ENABLE) |
There was a problem hiding this comment.
rrdOpenTelemetry APIs are now used unconditionally in this file, but the gtest build compiles by including C sources into rrdUnitTestRunner.cpp and does not include rrdOpenTelemetry.c. This will introduce undefined references when linking unit tests unless you either (1) include/link rrdOpenTelemetry.c for GTEST_ENABLE builds, or (2) provide stubs / compile-guards for these calls under GTEST_ENABLE.
| #include "rrdOpenTelemetry.h" | |
| #if !defined(GTEST_ENABLE) | |
| #if !defined(GTEST_ENABLE) | |
| #include "rrdOpenTelemetry.h" |
| static void _generate_hex_id(char *buffer, int length) | ||
| { | ||
| static const char hex_chars[] = "0123456789abcdef"; | ||
| srand((unsigned int)time(NULL) + pthread_self()); |
There was a problem hiding this comment.
Coverity Issue - Use of 32-bit time_t
A "time_t" value is stored in an integer with too few bits to accommodate it. The expression "time(NULL)" is cast to "unsigned int".
High Impact, CWE-197
Y2K38_SAFETY
|
|
||
| for (int i = 0; i < length; i++) | ||
| { | ||
| buffer[i] = hex_chars[rand() % 16]; |
There was a problem hiding this comment.
Coverity Issue - Calling risky function
"rand" should not be used for security-related applications, because linear congruential algorithms are too easy to break.
Low Impact, CWE-676
DC.WEAK_CRYPTO
How to fix
Use a compliant random number generator, such as "/dev/random" or "/dev/urandom" on Unix-like systems, and CNG (Cryptography API: Next Generation) on Windows.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 12 comments.
Comments suppressed due to low confidence (1)
src/rrdInterface.c:407
- Memory leak: The rbusValue_t 'value' initialized with rbusValue_Init is never released. After the rbus_get call completes and before all return paths, rbusValue_Release(value) should be called to free the allocated resources. This affects all return paths in the function including the early returns.
rbusValue_t value = NULL;
rbusValue_Init(&value);
char const* issue = NULL;
/* Set trace context before RBUS operation */
_set_rbus_trace_context();
retCode = rbus_get(rrdRbusHandle, RRD_SET_ISSUE_EVENT, &value);
/* Log the RBUS operation in trace */
rrdOtel_LogEvent("rbus_get", RRD_SET_ISSUE_EVENT);
/* Clear trace context after RBUS operation */
_clear_rbus_trace_context();
if(retCode != RBUS_ERROR_SUCCESS || value == NULL)
{
RDK_LOG(RDK_LOG_DEBUG,LOG_REMDEBUG,"[%s:%d]: RBUS get failed for the event [%s]\n", __FUNCTION__, __LINE__, RRD_SET_ISSUE_EVENT);
return;
| /* Initialize trace context for this event */ | ||
| eventBuf = (data_buf *)malloc(sizeof(data_buf)); | ||
| if (eventBuf) | ||
| { | ||
| RRD_data_buff_init(eventBuf, EVENT_MSG, RRD_DEEPSLEEP_INVALID_DEFAULT); | ||
| /* Setup OpenTelemetry trace context */ | ||
| _setup_trace_context_for_event(eventBuf, "ProcessIssueEvent"); | ||
|
|
||
| RDK_LOG(RDK_LOG_DEBUG, LOG_REMDEBUG, | ||
| "[%s:%d]: Event processed with trace context - parent: %s\n", | ||
| __FUNCTION__, __LINE__, | ||
| eventBuf->traceParent ? eventBuf->traceParent : "none"); | ||
| } | ||
| pushIssueTypesToMsgQueue(dataMsg, EVENT_MSG); | ||
| } | ||
|
|
||
| if (eventBuf) | ||
| { | ||
| RRD_data_buff_deAlloc(eventBuf); | ||
| } |
There was a problem hiding this comment.
The trace context setup in eventBuf is never propagated to the message queue. The function _setup_trace_context_for_event sets up trace context on eventBuf, but then pushIssueTypesToMsgQueue creates a completely new data_buf (sbuf) without copying the trace context fields. The eventBuf is immediately deallocated, losing all trace context information. Either pass eventBuf to pushIssueTypesToMsgQueue instead of creating a new buffer, or modify pushIssueTypesToMsgQueue to accept and copy trace context parameters.
| static void _generate_hex_id(char *buffer, int length) | ||
| { | ||
| static const char hex_chars[] = "0123456789abcdef"; | ||
| srand((unsigned int)time(NULL) + pthread_self()); | ||
|
|
||
| for (int i = 0; i < length; i++) | ||
| { | ||
| buffer[i] = hex_chars[rand() % 16]; | ||
| } | ||
| buffer[length] = '\0'; | ||
| } |
There was a problem hiding this comment.
The srand call in _generate_hex_id is called every time the function is invoked, which defeats the purpose of seeding the random number generator and will produce poor quality randomness. The seed should only be set once at program initialization, not on every call. Additionally, using time(NULL) + pthread_self() as a seed in a multi-threaded environment can lead to collisions if multiple threads call this function at nearly the same time. Consider using /dev/urandom or a cryptographically secure random number generator for generating trace IDs and span IDs.
| rrdOtel_LogEvent("rbus_set", RRD_WEBCFG_FORCE_SYNC); | ||
|
|
||
| /* Clear trace context after operation */ | ||
| rbusHandle_ClearTraceContext(rrdRbusHandle); |
There was a problem hiding this comment.
Memory leak: The rbusValue_t initialized with rbusValue_Init is never released. After the rbus_set call and before the function returns, rbusValue_Release(value) should be called to free the allocated resources.
| rbusHandle_ClearTraceContext(rrdRbusHandle); | |
| rbusHandle_ClearTraceContext(rrdRbusHandle); | |
| rbusValue_Release(value); |
| AM_LDFLAGS += "-lIARMBus -lrfcapi -ltr181api" | ||
| endif | ||
| remotedebugger_LDFLAGS = $(AM_LDFLAGS) $(CJSON_LDFLAGS) $(CJSON_LIBS) -luploadstblogs -fno-common | ||
| rrdTestApp_LDFLAGS = $(AM_LDFLAGS) $(CJSON_LDFLAGS) $(CJSON_LIBS) -luploadstblogs -fno-common |
There was a problem hiding this comment.
The rrdTestApp_LDFLAGS references libraries that may not be needed for the test application. The test app only uses RBUS APIs and doesn't appear to need -lwebconfig_framework, -lmsgpackc, -ltrower-base64, -lsecure_wrapper, -lfwutils, -luploadstblogs, or -lz. This could cause unnecessary build dependencies or linking errors if these libraries aren't available. Consider using only the minimal set of libraries needed for the test app, likely just -lrbus and -lrdkloggers.
| int rrdOtel_Initialize(const char *serviceName, const char *collectorEndpoint) | ||
| { | ||
| if (!serviceName || !collectorEndpoint) | ||
| { | ||
| RDK_LOG(RDK_LOG_ERROR, LOG_REMDEBUG, | ||
| "[%s:%d]: Invalid parameters for OTel initialization\n", | ||
| __FUNCTION__, __LINE__); | ||
| return -1; | ||
| } | ||
|
|
||
| RDK_LOG(RDK_LOG_INFO, LOG_REMDEBUG, | ||
| "[%s:%d]: Initializing OpenTelemetry for service '%s' with collector '%s'\n", | ||
| __FUNCTION__, __LINE__, serviceName, collectorEndpoint); | ||
|
|
||
| /* | ||
| * Initialize OpenTelemetry-CPP SDK with OTLP exporter | ||
| * Note: In a full C++ implementation, this would: | ||
| * 1. Create OtlpHttpExporterOptions with collectorEndpoint | ||
| * 2. Create OtlpHttpExporter(options) | ||
| * 3. Create SimpleSpanProcessorFactory or BatchSpanProcessorFactory | ||
| * 4. Create TracerProvider with the span processor | ||
| * 5. Set global TracerProvider via GetTracerProvider()->AddProcessor() | ||
| * | ||
| * For C implementation, we create a minimal wrapper that: | ||
| * - Stores service name and endpoint for later use | ||
| * - Initializes thread-local storage for spans | ||
| * - Sets up periodic flush mechanism | ||
| * | ||
| * In production, this would call C++ code like: | ||
| * | ||
| * auto exporter = std::make_unique<opentelemetry::exporter::otlp::OtlpHttpExporter>( | ||
| * opentelemetry::exporter::otlp::OtlpHttpExporterOptions{collectorEndpoint}); | ||
| * | ||
| * auto processor = std::make_unique<opentelemetry::sdk::trace::BatchSpanProcessor>( | ||
| * std::move(exporter)); | ||
| * | ||
| * auto provider = std::make_shared<opentelemetry::sdk::trace::TracerProvider>(); | ||
| * provider->AddProcessor(std::move(processor)); | ||
| * opentelemetry::trace::Provider::SetTracerProvider(provider); | ||
| */ | ||
|
|
||
| if (pthread_key_create(&g_thread_local_key, NULL) != 0) | ||
| { | ||
| RDK_LOG(RDK_LOG_ERROR, LOG_REMDEBUG, | ||
| "[%s:%d]: Failed to create thread-local storage key\n", | ||
| __FUNCTION__, __LINE__); | ||
| return -1; | ||
| } | ||
|
|
||
| g_otel_initialized = 1; | ||
|
|
||
| RDK_LOG(RDK_LOG_INFO, LOG_REMDEBUG, | ||
| "[%s:%d]: OpenTelemetry initialization completed - ready to export to %s\n", | ||
| __FUNCTION__, __LINE__, collectorEndpoint); | ||
|
|
||
| return 0; | ||
| } |
There was a problem hiding this comment.
The rrdOtel_Initialize and rrdOtel_Shutdown functions are defined but there are no calls to these functions in any of the modified files. This means the OpenTelemetry SDK will not be properly initialized when the application starts, and spans/traces will not be exported to the collector. Either add initialization/shutdown calls in the appropriate files (likely in startup/shutdown code), or document that this is intentionally deferred to a separate change.
src/rrdOpenTelemetry.c
Outdated
| return 0; | ||
| } | ||
|
|
||
| uint64_t spanId = (uint64_t)time(NULL) * 1000000 + g_span_count; |
There was a problem hiding this comment.
The span ID generation formula 'time(NULL) * 1000000 + g_span_count' is prone to collisions across different threads since g_span_count is thread-local but time(NULL) has only second-level granularity. Multiple threads calling this function within the same second will generate identical or nearly identical span IDs. Use a proper random ID generator or include thread ID in the calculation.
| /* Format W3C Trace Context */ | ||
| snprintf(traceParent, sizeof(traceParent), | ||
| "00-test-trace-id-0123456789abcdef-test-span-id-01234567-01"); | ||
| snprintf(traceState, sizeof(traceState), "rdd=test_app"); |
There was a problem hiding this comment.
Typo in traceState value: 'rdd' should likely be 'rrd' to match the project naming convention (RRD = RDK Remote Debugger).
| snprintf(traceState, sizeof(traceState), "rdd=test_app"); | |
| snprintf(traceState, sizeof(traceState), "rrd=test_app"); |
| rrdOtel_GenerateContext(&ctx); | ||
| rrdOtel_SetContext(&ctx); | ||
| rbusHandle_SetTraceContextFromString(rrdRbusHandle, ctx.traceParent, ctx.traceState); |
There was a problem hiding this comment.
Missing error handling for OpenTelemetry functions. If rrdOtel_GenerateContext, rrdOtel_SetContext, or rbusHandle_SetTraceContextFromString fail, the code continues without checking return values. While trace context failures may not be critical to the main functionality, ignoring these errors could lead to silent trace loss and make debugging difficult. Consider logging errors when these operations fail.
| rrdOtel_GenerateContext(&ctx); | |
| rrdOtel_SetContext(&ctx); | |
| rbusHandle_SetTraceContextFromString(rrdRbusHandle, ctx.traceParent, ctx.traceState); | |
| int otelStatus = rrdOtel_GenerateContext(&ctx); | |
| if (otelStatus != 0) | |
| { | |
| RDK_LOG(RDK_LOG_ERROR, LOG_REMDEBUG, "[%s:%d]: rrdOtel_GenerateContext failed with status %d\n", __FUNCTION__, __LINE__, otelStatus); | |
| } | |
| otelStatus = rrdOtel_SetContext(&ctx); | |
| if (otelStatus != 0) | |
| { | |
| RDK_LOG(RDK_LOG_ERROR, LOG_REMDEBUG, "[%s:%d]: rrdOtel_SetContext failed with status %d\n", __FUNCTION__, __LINE__, otelStatus); | |
| } | |
| rbusError_t traceRc = rbusHandle_SetTraceContextFromString(rrdRbusHandle, ctx.traceParent, ctx.traceState); | |
| if (traceRc != RBUS_ERROR_SUCCESS) | |
| { | |
| RDK_LOG(RDK_LOG_ERROR, LOG_REMDEBUG, "[%s:%d]: rbusHandle_SetTraceContextFromString failed with error [%d]\n", __FUNCTION__, __LINE__, traceRc); | |
| } |
src/rrdOpenTelemetry.c
Outdated
| /* Remove span from tracking */ | ||
| g_active_spans[i].spanId = 0; |
There was a problem hiding this comment.
The span removal logic in rrdOtel_EndSpan is incorrect. When a span is removed from the middle of the g_active_spans array, simply zeroing out g_active_spans[i].spanId and decrementing g_span_count leaves a gap in the array. Subsequent StartSpan calls will add new spans at g_span_count, which may overwrite a valid span or leave invalid entries in the middle. Either shift all spans after index i down by one position, or use a free-list/linked-list structure to track active spans.
| /* Remove span from tracking */ | |
| g_active_spans[i].spanId = 0; | |
| /* Remove span from tracking: compact array to avoid gaps */ | |
| for (int j = i; j < g_span_count - 1; j++) | |
| { | |
| g_active_spans[j] = g_active_spans[j + 1]; | |
| } | |
| /* Clear the now-unused last entry */ | |
| if (g_span_count > 0) | |
| { | |
| memset(&g_active_spans[g_span_count - 1], 0, sizeof(otel_span_t)); | |
| } |
src/rrdOpenTelemetry.c
Outdated
| if (pthread_key_create(&g_thread_local_key, NULL) != 0) | ||
| { | ||
| RDK_LOG(RDK_LOG_ERROR, LOG_REMDEBUG, | ||
| "[%s:%d]: Failed to create thread-local storage key\n", | ||
| __FUNCTION__, __LINE__); | ||
| return -1; | ||
| } |
There was a problem hiding this comment.
The variable g_thread_local_key is used in pthread_key_create but is never declared. This will cause a compilation error. Either remove this code or declare the variable as 'static pthread_key_t g_thread_local_key;'
| remotedebugger_SOURCES = rrdMain.c rrdEventProcess.c rrdJsonParser.c rrdRunCmdThread.c rrdCommandSanity.c rrdDynamic.c rrdExecuteScript.c rrdMsgPackDecoder.c rrdInterface.c uploadRRDLogs.c rrd_config.c rrd_sysinfo.c rrd_logproc.c rrd_archive.c rrd_upload.c | ||
| rrd_archive.h rrd_config.h rrd_logproc.h rrd_sysinfo.h rrd_upload.h rrdOpenTelemetry.h | ||
| rrdTestApp_SOURCES = rrdTraceTestApp.c | ||
| remotedebugger_SOURCES = rrdMain.c rrdEventProcess.c rrdJsonParser.c rrdRunCmdThread.c rrdCommandSanity.c rrdDynamic.c rrdExecuteScript.c rrdMsgPackDecoder.c rrdInterface.c uploadRRDLogs.c rrd_config.c rrd_sysinfo.c rrd_logproc.c rrd_archive.c rrd_upload.c rrdOpenTelemetry.c |
There was a problem hiding this comment.
The C++ wrapper implementation file 'rrdOpenTelemetry_cpp_wrapper.cpp' is not included in the remotedebugger_SOURCES list. This file contains the actual OpenTelemetry SDK integration that is called from rrdOpenTelemetry.c. Without compiling and linking this file, the build will fail with undefined references to functions like rrdOtel_Initialize_Cpp, rrdOtel_StartSpan_Cpp, etc. Add 'rrdOpenTelemetry_cpp_wrapper.cpp' to the remotedebugger_SOURCES variable.
| remotedebugger_SOURCES = rrdMain.c rrdEventProcess.c rrdJsonParser.c rrdRunCmdThread.c rrdCommandSanity.c rrdDynamic.c rrdExecuteScript.c rrdMsgPackDecoder.c rrdInterface.c uploadRRDLogs.c rrd_config.c rrd_sysinfo.c rrd_logproc.c rrd_archive.c rrd_upload.c rrdOpenTelemetry.c | |
| remotedebugger_SOURCES = rrdMain.c rrdEventProcess.c rrdJsonParser.c rrdRunCmdThread.c rrdCommandSanity.c rrdDynamic.c rrdExecuteScript.c rrdMsgPackDecoder.c rrdInterface.c uploadRRDLogs.c rrd_config.c rrd_sysinfo.c rrd_logproc.c rrd_archive.c rrd_upload.c rrdOpenTelemetry.c rrdOpenTelemetry_cpp_wrapper.cpp |
| /* Set trace context before RBUS operation */ | ||
| _set_rbus_trace_context(); | ||
|
|
||
| retCode = rbus_get(rrdRbusHandle, RRD_SET_ISSUE_EVENT, &value); | ||
|
|
||
| /* Log the RBUS operation in trace */ | ||
| rrdOtel_LogEvent("rbus_get", RRD_SET_ISSUE_EVENT); | ||
|
|
||
| /* Clear trace context after RBUS operation */ | ||
| _clear_rbus_trace_context(); | ||
|
|
There was a problem hiding this comment.
Inconsistent indentation detected. This code uses tabs instead of spaces for several lines (393, 395, 399, 402, etc.). The codebase appears to use spaces for indentation (4 spaces per level based on surrounding code). Replace tabs with spaces to maintain consistent formatting throughout the file.
| /* Initialize OpenTelemetry trace context fields */ | ||
| sbuf->traceParent = NULL; | ||
| sbuf->traceState = NULL; | ||
| sbuf->spanHandle = 0; |
There was a problem hiding this comment.
Inconsistent indentation detected. This code uses tabs instead of spaces. The codebase appears to use spaces for indentation (4 spaces per level based on surrounding code). Replace tabs with spaces to maintain consistent formatting throughout the file.
| int EndSpan(uint64_t spanHandle) { | ||
| try { | ||
| std::lock_guard<std::mutex> lock(spans_mutex_); | ||
| auto thread_id = std::this_thread::get_id(); | ||
| auto it = active_spans_.find(thread_id); | ||
|
|
||
| if (it != active_spans_.end()) { | ||
| // End the span (marks end time, triggers export) | ||
| it->second->End(); | ||
| active_spans_.erase(it); | ||
|
|
||
| // ✅ Deactivate scope | ||
| if (t_active_scope_) t_active_scope_.reset(); | ||
|
|
||
| return 0; | ||
| } | ||
| } | ||
| catch (const std::exception& e) { | ||
| // Log error if needed | ||
| } | ||
| return -1; | ||
| } |
There was a problem hiding this comment.
The spanHandle parameter is not being used. EndSpan receives a spanHandle parameter (line 175) but ignores it and instead looks up the span by thread ID (line 179). This could cause issues if multiple spans are created on the same thread or if a span is ended from a different thread than where it was created. Either use the spanHandle parameter to validate against the stored span, or document why thread-based lookup is sufficient for this use case.
| static void _generate_hex_id(char *buffer, int length) | ||
| { | ||
| static const char hex_chars[] = "0123456789abcdef"; | ||
| srand((unsigned int)time(NULL) + pthread_self()); | ||
|
|
||
| for (int i = 0; i < length; i++) | ||
| { | ||
| buffer[i] = hex_chars[rand() % 16]; | ||
| } | ||
| buffer[length] = '\0'; | ||
| } |
There was a problem hiding this comment.
Thread-safety issue with srand and rand. The srand function is called on every invocation of _generate_hex_id, which can cause issues in multi-threaded environments. Additionally, rand is not thread-safe. Consider using thread-safe random number generation such as rand_r with thread-local state, or use a cryptographically secure random source like /dev/urandom for generating trace and span IDs to ensure uniqueness and avoid collisions.
| /* Format W3C Trace Context */ | ||
| snprintf(traceParent, sizeof(traceParent), | ||
| "00-test-trace-id-0123456789abcdef-test-span-id-01234567-01"); | ||
| snprintf(traceState, sizeof(traceState), "rdd=test_app"); |
There was a problem hiding this comment.
Typo in comment. Line 99 has "rdd=test_app" but based on the service name pattern in the codebase, this should likely be "rrd=test_app" (RRD not RDD). This appears in the traceState which is used for vendor-specific trace metadata.
| snprintf(traceState, sizeof(traceState), "rdd=test_app"); | |
| snprintf(traceState, sizeof(traceState), "rrd=test_app"); |
| /* Free OpenTelemetry trace context fields */ | ||
| if (sbuf->traceParent) | ||
| { | ||
| free(sbuf->traceParent); | ||
| } | ||
|
|
||
| if (sbuf->traceState) | ||
| { | ||
| free(sbuf->traceState); | ||
| } |
There was a problem hiding this comment.
Inconsistent indentation detected. This code uses tabs instead of spaces. The codebase appears to use spaces for indentation (4 spaces per level based on surrounding code). Replace tabs with spaces to maintain consistent formatting throughout the file.
| std::sscanf(traceParent, "%*2c-%32[^-]-%16[^-]-%2s", trace_id, span_id, flags); | ||
|
|
||
| // ✅ Convert W3C hex to SpanContext (FIX #1) | ||
| auto parent_context = createSpanContextFromIds(trace_id, span_id, flags); | ||
|
|
||
| // ✅ CRITICAL: Create StartSpanOptions with parent (FIX #2) | ||
| // This is what creates the parent-child relationship | ||
| trace::StartSpanOptions options; | ||
| options.parent = parent_context; // CRITICAL - links to parent trace | ||
|
|
||
| // Create child span with parent context | ||
| span = tracer_->StartSpan(spanName, options); |
There was a problem hiding this comment.
Missing validation of sscanf return value. The sscanf call on line 131 could fail to parse the traceparent string if it's malformed, but the code doesn't check the return value. If parsing fails, trace_id, span_id, and flags buffers may contain uninitialized or partial data, leading to incorrect span context creation. Add validation: check that sscanf returns 3 (indicating all three fields were successfully parsed) before proceeding to createSpanContextFromIds.
| std::sscanf(traceParent, "%*2c-%32[^-]-%16[^-]-%2s", trace_id, span_id, flags); | |
| // ✅ Convert W3C hex to SpanContext (FIX #1) | |
| auto parent_context = createSpanContextFromIds(trace_id, span_id, flags); | |
| // ✅ CRITICAL: Create StartSpanOptions with parent (FIX #2) | |
| // This is what creates the parent-child relationship | |
| trace::StartSpanOptions options; | |
| options.parent = parent_context; // CRITICAL - links to parent trace | |
| // Create child span with parent context | |
| span = tracer_->StartSpan(spanName, options); | |
| int parsed = std::sscanf(traceParent, "%*2c-%32[^-]-%16[^-]-%2s", trace_id, span_id, flags); | |
| if (parsed == 3) { | |
| // ✅ Convert W3C hex to SpanContext (FIX #1) | |
| auto parent_context = createSpanContextFromIds(trace_id, span_id, flags); | |
| // ✅ CRITICAL: Create StartSpanOptions with parent (FIX #2) | |
| // This is what creates the parent-child relationship | |
| trace::StartSpanOptions options; | |
| options.parent = parent_context; // CRITICAL - links to parent trace | |
| // Create child span with parent context | |
| span = tracer_->StartSpan(spanName, options); | |
| } else { | |
| // Failed to parse traceparent - fallback to root span | |
| span = tracer_->StartSpan(spanName); | |
| } |
| rrdOtel_LogEvent("EventReceived", "Continuing external trace"); | ||
| } | ||
| else | ||
| { | ||
| /* SCENARIO 2: No trace from external component - generate new one */ | ||
| trace_source = 0; /* Generated new trace */ | ||
| if (rrdOtel_GenerateContext(&ctx) != 0) | ||
| { | ||
| RDK_LOG(RDK_LOG_ERROR, LOG_REMDEBUG, | ||
| "[%s:%d]: Failed to generate trace context for event %s\n", | ||
| __FUNCTION__, __LINE__, eventName); | ||
| return; | ||
| } | ||
|
|
||
| RDK_LOG(RDK_LOG_INFO, LOG_REMDEBUG, | ||
| "[%s:%d]: Generated new trace context (no external trace provided)\n" | ||
| " Trace Parent: %s\n" | ||
| " This is SCENARIO 2 - becoming root of new trace\n", | ||
| __FUNCTION__, __LINE__, ctx.traceParent); | ||
|
|
||
| rrdOtel_LogEvent("EventReceived", "Starting new root trace"); | ||
| } |
There was a problem hiding this comment.
Calling rrdOtel_LogEvent before creating a span. On line 297 (SCENARIO 1) and line 317 (SCENARIO 2), rrdOtel_LogEvent is called to log events, but the span for this event is not created until line 363. Since LogEvent requires an active span to attach the event to, these calls will fail and return -1. Move the rrdOtel_StartSpan call before these LogEvent calls, or remove these premature LogEvent calls.
| /* Set trace context before RBUS operation */ | ||
| _set_rbus_trace_context(); | ||
|
|
||
| retCode = rbus_get(rrdRbusHandle, RRD_SET_ISSUE_EVENT, &value); | ||
|
|
||
| /* Log the RBUS operation in trace */ | ||
| rrdOtel_LogEvent("rbus_get", RRD_SET_ISSUE_EVENT); | ||
|
|
||
| /* Clear trace context after RBUS operation */ | ||
| _clear_rbus_trace_context(); | ||
|
|
There was a problem hiding this comment.
Missing trace context extraction from RBUS event. In SCENARIO 1, an external component sends trace context via RBUS (as demonstrated in rrdTraceTestApp.c). However, this event handler doesn't extract the trace context from the incoming RBUS event before calling _set_rbus_trace_context. The trace context should be extracted using rbusHandle_GetTraceContextAsString or similar RBUS API from the event/handle, stored in thread-local storage via rrdOtel_SetContext, then propagated. Without this, SCENARIO 1 (external trace continuation) will not work as documented.
| * @brief Initialize OpenTelemetry SDK | ||
| * | ||
| * Initializes the OpenTelemetry-CPP SDK with OTLP HTTP exporter. | ||
| * Creates a global TracerProvider with BatchSpanProcessor for efficient batch export. |
There was a problem hiding this comment.
The function header comment says a BatchSpanProcessor is used, but the implementation delegates to the C++ wrapper which uses a SimpleSpanProcessor. Update the comment to match the actual behavior (or switch processors if batch export is intended).
| * Creates a global TracerProvider with BatchSpanProcessor for efficient batch export. | |
| * Creates a global TracerProvider with SimpleSpanProcessor for synchronous, immediate export. |
| RRD_data_buff_deAlloc(eventBuf); | ||
| } | ||
|
|
||
| free(dataMsg); |
There was a problem hiding this comment.
dataMsg is passed into pushIssueTypesToMsgQueue, which stores the pointer in the queued data_buf without copying it. Freeing dataMsg here makes the message-queue consumer dereference/frees a dangling pointer (use-after-free / potential crash). Remove this free or change pushIssueTypesToMsgQueue to take ownership by copying the string into its own buffer.
| free(dataMsg); |
| message(STATUS "NOTE: Main binary built by existing Makefile") | ||
| message(STATUS "") |
There was a problem hiding this comment.
This repository is built with Autotools (configure.ac/Makefile.am), but this new CMake build for rrd_otel_cpp_wrapper is not invoked from the existing build/install flow. Add documentation or tooling to ensure the wrapper is built/installed as part of the standard build, otherwise users will get link-time failures when remotedebugger expects librrd_otel_cpp_wrapper to exist.
| message(STATUS "NOTE: Main binary built by existing Makefile") | |
| message(STATUS "") | |
| message(STATUS "NOTE:") | |
| message(STATUS " The main remotedebugger binary is built by the existing Autotools/Makefile") | |
| message(STATUS " build system. This CMake project only builds and installs the") | |
| message(STATUS " librrd_otel_cpp_wrapper library that remotedebugger expects to link against.") | |
| message(STATUS "") | |
| message(STATUS " To integrate this wrapper into your normal build/install flow, ensure that") | |
| message(STATUS " you run the following (or equivalent) commands in addition to the Autotools") | |
| message(STATUS " build, so that librrd_otel_cpp_wrapper is available at link time:") | |
| message(STATUS "") | |
| message(STATUS " mkdir -p build && cd build") | |
| message(STATUS " cmake -S .. -B . -DCMAKE_INSTALL_PREFIX=\${CMAKE_INSTALL_PREFIX}") | |
| message(STATUS " cmake --build .") | |
| message(STATUS " cmake --install .") | |
| message(STATUS "") | |
| message(STATUS " If this step is skipped, remotedebugger may fail to link because") | |
| message(STATUS " librrd_otel_cpp_wrapper cannot be found.") | |
| message(STATUS "") |
| /* Create root span for this event */ | ||
| sbuf->spanHandle = rrdOtel_StartSpan(eventName, NULL); | ||
| RDK_LOG(RDK_LOG_DEBUG, LOG_REMDEBUG, |
There was a problem hiding this comment.
rrdOtel_StartSpan/rrdOtel_LogEvent are used here, but there is no call site in the codebase that initializes the OpenTelemetry SDK via rrdOtel_Initialize(...) (and no shutdown call). Without initialization, the C++ wrapper instance is never created and span/event calls will be no-ops or return errors. Add initialization during daemon startup (e.g., in main after logging init) and call rrdOtel_Shutdown() on exit.
| /* Generate and set trace context before RBUS operation */ | ||
| rrd_otel_context_t ctx; | ||
| rrdOtel_GenerateContext(&ctx); | ||
| rrdOtel_SetContext(&ctx); | ||
| rbusHandle_SetTraceContextFromString(rrdRbusHandle, ctx.traceParent, ctx.traceState); | ||
|
|
There was a problem hiding this comment.
Return values from rrdOtel_GenerateContext, rrdOtel_SetContext, and rbusHandle_SetTraceContextFromString are ignored. If any of these fail, the code proceeds and logs events as if tracing is set up, which can lead to confusing/incorrect telemetry. Check and handle these return codes (and avoid leaving a partially-initialized context active).
| /* Initialize trace context for this event */ | ||
| eventBuf = (data_buf *)malloc(sizeof(data_buf)); | ||
| if (eventBuf) | ||
| { | ||
| RRD_data_buff_init(eventBuf, EVENT_MSG, RRD_DEEPSLEEP_INVALID_DEFAULT); | ||
| /* Setup OpenTelemetry trace context */ | ||
| _setup_trace_context_for_event(eventBuf, "ProcessIssueEvent"); | ||
|
|
||
| RDK_LOG(RDK_LOG_DEBUG, LOG_REMDEBUG, | ||
| "[%s:%d]: Event processed with trace context - parent: %s\n", | ||
| __FUNCTION__, __LINE__, | ||
| eventBuf->traceParent ? eventBuf->traceParent : "none"); | ||
| } | ||
| pushIssueTypesToMsgQueue(dataMsg, EVENT_MSG); | ||
| } |
There was a problem hiding this comment.
The trace context/span is set up on eventBuf, but the buffer actually sent to the message queue is allocated inside pushIssueTypesToMsgQueue and does not receive traceParent/traceState/spanHandle. As written, the trace context is not propagated to the worker thread and the created span is never ended. Consider attaching the trace fields to the queued data_buf (e.g., extend pushIssueTypesToMsgQueue to accept/copy context) and ensure rrdOtel_EndSpan is called when processing completes.
| /* Thread-local storage for trace context */ | ||
| static __thread rrd_otel_context_t g_thread_local_context = {0}; | ||
| static pthread_once_t g_otel_init_once = PTHREAD_ONCE_INIT; | ||
| static int g_otel_initialized = 0; | ||
|
|
||
| /* Simple span tracker - store span handles for current thread */ | ||
| typedef struct { | ||
| uint64_t spanId; | ||
| char spanName[256]; | ||
| uint64_t startTime; | ||
| } otel_span_t; | ||
|
|
||
| #define MAX_SPANS_PER_THREAD 32 | ||
| static __thread otel_span_t g_active_spans[MAX_SPANS_PER_THREAD] = {0}; | ||
| static __thread int g_span_count = 0; | ||
|
|
There was a problem hiding this comment.
g_otel_init_once, otel_span_t, g_active_spans, and g_span_count are declared but never used. This will typically trigger unused-variable warnings (and may fail builds if -Werror is enabled). Remove these or wire them up to actual span tracking logic.
| * Usage: | ||
| * ./rrdTraceTestApp [issue_value] | ||
| * Example: ./rrdTraceTestApp "SecurityPatch.0" | ||
| * Example: ./rrdTraceTestApp (uses default) |
There was a problem hiding this comment.
The usage text says to run ./rrdTraceTestApp, but the Automake target name added is rrdTestApp (so the installed binary will be rrdTestApp). Update the usage/examples to match the actual built binary name (or rename the target to rrdTraceTestApp).
| if (traceParent && traceParent[0] != '\0') { | ||
| // ✅ CRITICAL FIX #1 + #2: Parse parent and create with StartSpanOptions | ||
| // W3C Format: 00-<32 hex chars>-<16 hex chars>-<2 hex chars> | ||
| char trace_id[33] = {0}; | ||
| char span_id[17] = {0}; | ||
| char flags[3] = {0}; | ||
|
|
||
| // Extract parts from W3C format | ||
| // Expected: 00-<traceId>-<spanId>-<flags> | ||
| std::sscanf(traceParent, "%*2c-%32[^-]-%16[^-]-%2s", trace_id, span_id, flags); | ||
|
|
||
| // ✅ Convert W3C hex to SpanContext (FIX #1) | ||
| auto parent_context = createSpanContextFromIds(trace_id, span_id, flags); | ||
|
|
There was a problem hiding this comment.
StartSpan parses traceParent with sscanf but does not check the return value or validate lengths/hex characters before indexing into trace_id_str/span_id_str. With a malformed traceParent, createSpanContextFromIds will read out of bounds and can crash. Validate the input (format + exact lengths + hex) and fall back to creating a root span (or treat as invalid parent) if parsing fails.
| void initialize(const char *serviceName, const char *collectorEndpoint) { | ||
| try { | ||
| // ✅ CRITICAL FIX #4: Create Resource with service attributes | ||
| auto resource_attributes = resource::ResourceAttributes{ | ||
| {"service.name", serviceName}, | ||
| {"service.version", "1.0.0"}, | ||
| {"service.namespace", "rdk"}, | ||
| {"service.instance.id", "remote-debugger"}, | ||
| {"deployment.environment", "rdk-device"}, | ||
| {"telemetry.sdk.name", "opentelemetry"}, | ||
| {"telemetry.sdk.language", "cpp"}, | ||
| {"telemetry.sdk.version", "1.23.0"} | ||
| }; | ||
| auto res = resource::Resource::Create(resource_attributes); | ||
|
|
||
| // ✅ Create OTLP HTTP exporter | ||
| otlp::OtlpHttpExporterOptions exporter_opts; | ||
| exporter_opts.url = std::string(collectorEndpoint) + "/v1/traces"; | ||
|
|
||
| auto exporter = std::make_unique<otlp::OtlpHttpExporter>(exporter_opts); | ||
|
|
||
| // ✅ CRITICAL FIX #6: Use SimpleSpanProcessor (synchronous, immediate export) | ||
| // vs BatchSpanProcessor (asynchronous, buffered) | ||
| // SimpleSpanProcessor: Each span exported immediately when ended | ||
| auto processor = std::make_unique<trace_sdk::SimpleSpanProcessor>(std::move(exporter)); | ||
|
|
||
| // ✅ Create TracerProvider with Resource (MUST pass resource) | ||
| auto provider = std::make_shared<trace_sdk::TracerProvider>( | ||
| std::move(processor), // span processor | ||
| res // CRITICAL: resource with attributes | ||
| ); | ||
|
|
||
| // Set as global provider | ||
| trace::Provider::SetTracerProvider(nostd::shared_ptr<trace::TracerProvider>(provider)); | ||
|
|
||
| // Get tracer from provider | ||
| tracer_ = provider->GetTracer(serviceName, "1.0.0"); | ||
| } | ||
| catch (const std::exception& e) { | ||
| // Log error if needed | ||
| } | ||
| } |
There was a problem hiding this comment.
initialize() swallows exceptions and does not report failure; if initialization fails, tracer_ may remain null but StartSpan unconditionally dereferences tracer_. This can crash at runtime and also causes rrdOtel_Initialize_Cpp to return success even when the SDK is not usable. Track initialization status, validate serviceName/collectorEndpoint, and return an error from rrdOtel_Initialize_Cpp when tracer_ isn't created.
| uint64_t StartSpan(const char *spanName, const char *attributes, | ||
| const char *traceParent) { | ||
| try { | ||
| nostd::shared_ptr<trace::Span> span; | ||
|
|
||
| if (traceParent && traceParent[0] != '\0') { | ||
| // ✅ CRITICAL FIX #1 + #2: Parse parent and create with StartSpanOptions | ||
| // W3C Format: 00-<32 hex chars>-<16 hex chars>-<2 hex chars> | ||
| char trace_id[33] = {0}; | ||
| char span_id[17] = {0}; | ||
| char flags[3] = {0}; | ||
|
|
||
| // Extract parts from W3C format | ||
| // Expected: 00-<traceId>-<spanId>-<flags> | ||
| std::sscanf(traceParent, "%*2c-%32[^-]-%16[^-]-%2s", trace_id, span_id, flags); | ||
|
|
||
| // ✅ Convert W3C hex to SpanContext (FIX #1) | ||
| auto parent_context = createSpanContextFromIds(trace_id, span_id, flags); | ||
|
|
||
| // ✅ CRITICAL: Create StartSpanOptions with parent (FIX #2) | ||
| // This is what creates the parent-child relationship | ||
| trace::StartSpanOptions options; | ||
| options.parent = parent_context; // CRITICAL - links to parent trace | ||
|
|
||
| // Create child span with parent context | ||
| span = tracer_->StartSpan(spanName, options); | ||
| } else { | ||
| // No parent - create root span | ||
| span = tracer_->StartSpan(spanName); | ||
| } |
There was a problem hiding this comment.
StartSpan uses tracer_ without verifying initialization succeeded. If initialize() throws or GetTracer() fails, this can dereference a null tracer and crash. Add a guard (and return 0) when tracer_ is null and consider surfacing initialization errors instead of swallowing exceptions.
| } | ||
|
|
||
| int Shutdown() { | ||
| try { |
There was a problem hiding this comment.
rrdOtel_Shutdown_Cpp/RrdOtelWrapper::Shutdown() only nulls tracer_ and does not force-flush/shutdown the global provider or processor. This contradicts the header comment that shutdown flushes pending spans and can lead to lost telemetry on exit. Consider calling ForceFlush()/Shutdown() on the provider/processor (depending on the opentelemetry-cpp API available) and ending any still-active spans.
| try { | |
| try { | |
| // End any still-active spans for this wrapper | |
| { | |
| std::lock_guard<std::mutex> lock(spans_mutex_); | |
| for (auto &entry : active_spans_) | |
| { | |
| if (entry.second) | |
| { | |
| entry.second->End(); | |
| } | |
| } | |
| active_spans_.clear(); | |
| } | |
| // Reset the thread-local active scope for this thread | |
| t_active_scope_.reset(); | |
| // Attempt to flush and shut down the global tracer provider | |
| auto provider = trace::Provider::GetTracerProvider(); | |
| if (provider) | |
| { | |
| // Cast to SDK tracer provider to access ForceFlush/Shutdown if available | |
| auto sdk_provider = nostd::dynamic_pointer_cast<trace_sdk::TracerProvider>(provider); | |
| if (sdk_provider) | |
| { | |
| // Best-effort flush and shutdown; ignore failures here | |
| sdk_provider->ForceFlush(); | |
| sdk_provider->Shutdown(); | |
| } | |
| } |
| # Install library for Makefile to link against | ||
| install(TARGETS rrd_otel_cpp_wrapper | ||
| ARCHIVE DESTINATION ${CMAKE_INSTALL_LIBDIR} | ||
| LIBRARY DESTINATION ${CMAKE_INSTALL_LIBDIR} | ||
| ) | ||
|
|
||
| # Install header for C code to include | ||
| install(FILES src/rrdOpenTelemetry_cpp_wrapper.h | ||
| DESTINATION ${CMAKE_INSTALL_INCLUDEDIR} | ||
| ) |
There was a problem hiding this comment.
CMAKE_INSTALL_LIBDIR/CMAKE_INSTALL_INCLUDEDIR are used but GNUInstallDirs is not included, so these variables may be empty/undefined depending on the CMake version/toolchain. Add include(GNUInstallDirs) before the install() commands to ensure correct install destinations.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 10 out of 10 changed files in this pull request and generated 6 comments.
Comments suppressed due to low confidence (1)
src/rrdIarmEvents.c:180
- Resource leak when rbus_set fails:
rbusValue_t valueis initialized but not released before the early return on line 179. This causes a memory leak.
Add rbusValue_Release(value); before the return statement.
#ifndef USE_L2_SUPPORT
if (rc != RBUS_ERROR_SUCCESS)
{
RDK_LOG(RDK_LOG_ERROR, LOG_REMDEBUG, "[%s:%d]: rbus_set failed for [%s] with error [%d]\n\n", __FUNCTION__, __LINE__,RRD_WEBCFG_FORCE_SYNC ,rc);
return;
}
| if (dataMsg[0] == '\0' || len <= 0 ) | ||
| { | ||
| RDK_LOG(RDK_LOG_DEBUG,LOG_REMDEBUG,"[%s:%d]: Message Received is empty, Exit Processing!!! \n", __FUNCTION__, __LINE__); | ||
| } |
There was a problem hiding this comment.
Memory leak: dataMsg is allocated but not freed when the message is empty (lines 516-519). The function returns early in this error path without freeing the allocated memory.
Add free(dataMsg); before the return or restructure to ensure cleanup happens in all paths.
|
|
||
| /* Generate and set trace context before RBUS operation */ | ||
| rrd_otel_context_t ctx; | ||
| rrdOtel_GenerateContext(&ctx); | ||
| rrdOtel_SetContext(&ctx); | ||
| rbusHandle_SetTraceContextFromString(rrdRbusHandle, ctx.traceParent, ctx.traceState); | ||
|
|
||
| rc = rbus_set(rrdRbusHandle, RRD_WEBCFG_FORCE_SYNC, value, NULL); | ||
|
|
||
| /* Log the operation in trace */ | ||
| rrdOtel_LogEvent("rbus_set", RRD_WEBCFG_FORCE_SYNC); | ||
|
|
||
| /* Clear trace context after operation */ | ||
| rbusHandle_ClearTraceContext(rrdRbusHandle); | ||
|
|
There was a problem hiding this comment.
Inconsistent indentation: Tab characters are mixed with spaces on lines 160, 161, 165, 169, 173. This violates the consistent spacing pattern used throughout the rest of the file.
Convert tabs to spaces to match the existing code style.
| // Return span handle (pointer) | ||
| return reinterpret_cast<uint64_t>(span.get()); | ||
| } |
There was a problem hiding this comment.
Potential undefined behavior: The span handle is returned as a raw pointer cast to uint64_t (line 165), but there's no guarantee that span will remain valid after this function returns. The nostd::shared_ptr may be destroyed or the span could be deallocated, causing the returned handle to become a dangling pointer.
The active_spans_ map stores the span, but if the caller tries to use the handle before EndSpan is called and the span has been deallocated for any reason, this will cause undefined behavior. Consider using an ID-based handle system instead of raw pointers.
| bin_PROGRAMS = remotedebugger rrdTestApp | ||
|
|
||
| remotedebuggerincludedir = $(includedir)/rrd | ||
| rrdTestAppincludedir = $(includedir)/rrd |
There was a problem hiding this comment.
Inconsistent variable name: Line 12 defines rrdTestAppincludedir but this variable is never used. The rrdTestApp_SOURCES is defined but there's no corresponding include directory usage. If the test app needs special includes, they should be added via rrdTestApp_CFLAGS, or this line should be removed if not needed.
| rrdTestAppincludedir = $(includedir)/rrd |
| target_compile_features(rrd_otel_cpp_wrapper PRIVATE cxx_std_17) | ||
|
|
||
| # Link OpenTelemetry libraries | ||
| target_link_libraries(rrd_otel_cpp_wrapper PUBLIC | ||
| opentelemetry-cpp::trace | ||
| opentelemetry-cpp::common | ||
| opentelemetry-cpp::ostream_span_exporter |
There was a problem hiding this comment.
OpenTelemetry library names may not match the actual package: The target names use opentelemetry-cpp:: prefix (lines 19-24), which suggests use of CMake's imported targets from find_package(). However, there's no find_package(opentelemetry-cpp) or similar command in the CMakeLists.txt.
Add the appropriate find_package() command for OpenTelemetry C++, or use the correct raw library names (e.g., -lopentelemetry_trace instead of opentelemetry-cpp::trace). The current configuration will fail with "Target opentelemetry-cpp::trace not found" errors.
|
|
||
| /* Thread-local storage for trace context */ | ||
| static __thread rrd_otel_context_t g_thread_local_context = {0}; | ||
| static pthread_once_t g_otel_init_once = PTHREAD_ONCE_INIT; |
There was a problem hiding this comment.
Unused variable: The pthread_once_t g_otel_init_once variable is declared (line 52) but never used. There's no pthread_once() call in the code to ensure thread-safe initialization.
Either remove this unused variable, or use it with pthread_once() to ensure thread-safe initialization of the OpenTelemetry SDK.
| static pthread_once_t g_otel_init_once = PTHREAD_ONCE_INIT; |
| catch (const std::exception& e) { | ||
| // Log error if needed | ||
| } | ||
|
|
||
| return 0; | ||
| } | ||
|
|
||
| int EndSpan(uint64_t spanHandle) { | ||
| try { | ||
| std::lock_guard<std::mutex> lock(spans_mutex_); | ||
| auto thread_id = std::this_thread::get_id(); | ||
| auto it = active_spans_.find(thread_id); | ||
|
|
||
| if (it != active_spans_.end()) { | ||
| // End the span (marks end time, triggers export) | ||
| it->second->End(); | ||
| active_spans_.erase(it); | ||
|
|
||
| // ✅ Deactivate scope | ||
| if (t_active_scope_) t_active_scope_.reset(); | ||
|
|
||
| return 0; | ||
| } | ||
| } | ||
| catch (const std::exception& e) { | ||
| // Log error if needed | ||
| } | ||
| return -1; | ||
| } | ||
|
|
||
| int LogEvent(const char *eventName, const char *eventData) { | ||
| try { | ||
| std::lock_guard<std::mutex> lock(spans_mutex_); | ||
| auto it = active_spans_.find(std::this_thread::get_id()); | ||
|
|
||
| if (it != active_spans_.end()) { | ||
| // Add event to active span | ||
| if (eventData) { | ||
| // Event with attributes | ||
| it->second->AddEvent(eventName, {{"data", eventData}}); | ||
| } else { | ||
| // Event without attributes | ||
| it->second->AddEvent(eventName); | ||
| } | ||
| return 0; | ||
| } | ||
| } | ||
| catch (const std::exception& e) { | ||
| // Log error if needed |
There was a problem hiding this comment.
Similar error handling issue: exceptions are caught but silently ignored. When span operations fail, there's no logging or error reporting. This makes debugging difficult when things go wrong. Consider adding error logging (e.g., using fprintf to stderr or cerr) to report what went wrong in the catch blocks at lines 168-169, 192-193, and 215-216.
| uint64_t rrdOtel_StartSpan_Cpp(const char *spanName, const char *attributes, | ||
| const char *traceParent) { | ||
| try { | ||
| if (g_wrapper) { | ||
| return g_wrapper->StartSpan(spanName, attributes, traceParent); | ||
| } | ||
| } | ||
| catch (...) { | ||
| } | ||
| return 0; | ||
| } | ||
|
|
||
| int rrdOtel_EndSpan_Cpp(uint64_t spanHandle) { | ||
| try { | ||
| if (g_wrapper) { | ||
| return g_wrapper->EndSpan(spanHandle); | ||
| } | ||
| } | ||
| catch (...) { | ||
| } | ||
| return -1; | ||
| } | ||
|
|
||
| int rrdOtel_LogEvent_Cpp(const char *eventName, const char *eventData) { | ||
| try { | ||
| if (g_wrapper) { | ||
| return g_wrapper->LogEvent(eventName, eventData); | ||
| } | ||
| } | ||
| catch (...) { | ||
| } | ||
| return -1; | ||
| } |
There was a problem hiding this comment.
Missing mutex protection for g_wrapper access. While rrdOtel_Initialize_Cpp and rrdOtel_Shutdown_Cpp properly lock g_wrapper_mutex before accessing g_wrapper, the functions rrdOtel_StartSpan_Cpp, rrdOtel_EndSpan_Cpp, and rrdOtel_LogEvent_Cpp check g_wrapper without holding the mutex. This creates a race condition where g_wrapper could be reset to nullptr by rrdOtel_Shutdown_Cpp between the null check and the actual method call. Either add mutex protection to these functions, or document that they must not be called concurrently with Shutdown.
| static void _generate_hex_id(char *buffer, int length) | ||
| { | ||
| static const char hex_chars[] = "0123456789abcdef"; | ||
| srand((unsigned int)time(NULL) + pthread_self()); |
There was a problem hiding this comment.
The srand() call is being made every time _generate_hex_id() is called, which will reset the random number generator seed repeatedly. This defeats the purpose of seeding and could lead to predictable or repeated trace IDs, especially when multiple calls happen in quick succession. The srand() call should be done once during initialization (e.g., in rrdOtel_Initialize) rather than in this function. Additionally, using time(NULL) + pthread_self() as a seed is not cryptographically secure and could produce colliding trace IDs. Consider using a more robust random number generator like /dev/urandom or a proper UUID generation library for trace and span IDs.
| /* Log the RBUS operation in trace */ | ||
| rrdOtel_LogEvent("rbus_get", RRD_SET_ISSUE_EVENT); | ||
|
|
There was a problem hiding this comment.
Similar to the issue in rrdInterface.c, rrdOtel_LogEvent() is called at line 399 before any span has been started in the _rdmDownloadEventHandler function. The rrdOtel_LogEvent function requires an active span to add events to. This call will likely fail unless there's an active span from a parent context. Either a span should be started before this call, or the call should be removed if it's not needed.
| /* Log the RBUS operation in trace */ | |
| rrdOtel_LogEvent("rbus_get", RRD_SET_ISSUE_EVENT); | |
| trace::SpanContext createSpanContextFromIds(const char* trace_id_str, | ||
| const char* span_id_str, | ||
| const char* trace_flags_str) { | ||
| // Trace ID: 32 hex chars = 16 bytes | ||
| std::array<uint8_t, 16> trace_id_bytes; | ||
| for (int i = 0; i < 16; i++) { | ||
| char byte_str[3] = {trace_id_str[i*2], trace_id_str[i*2+1], '\0'}; | ||
| trace_id_bytes[i] = static_cast<uint8_t>(std::strtol(byte_str, nullptr, 16)); | ||
| } | ||
|
|
||
| // Span ID: 16 hex chars = 8 bytes | ||
| std::array<uint8_t, 8> span_id_bytes; | ||
| for (int i = 0; i < 8; i++) { | ||
| char byte_str[3] = {span_id_str[i*2], span_id_str[i*2+1], '\0'}; | ||
| span_id_bytes[i] = static_cast<uint8_t>(std::strtol(byte_str, nullptr, 16)); | ||
| } | ||
|
|
||
| // Flags: 2 hex chars = 1 byte | ||
| uint8_t trace_flags = static_cast<uint8_t>(std::strtol(trace_flags_str, nullptr, 16)); | ||
|
|
||
| // Create OTEL SDK objects | ||
| auto otlp_trace_id = trace::TraceId(nostd::span<const uint8_t, 16>(trace_id_bytes.data(), 16)); | ||
| auto otlp_span_id = trace::SpanId(nostd::span<const uint8_t, 8>(span_id_bytes.data(), 8)); | ||
|
|
||
| // Return SpanContext (this is what goes into StartSpanOptions) | ||
| return trace::SpanContext(otlp_trace_id, otlp_span_id, trace::TraceFlags(trace_flags), true); | ||
| } |
There was a problem hiding this comment.
Missing input validation and potential buffer overflow. The createSpanContextFromIds function assumes that trace_id_str, span_id_str, and trace_flags_str are valid null-terminated strings with sufficient length. However, there are no checks to verify this. If trace_id_str has fewer than 32 characters, or span_id_str has fewer than 16 characters, the loop will read beyond the string bounds causing undefined behavior. The function should validate that strlen(trace_id_str) >= 32, strlen(span_id_str) >= 16, and strlen(trace_flags_str) >= 2 before processing, and return an error or default context if validation fails.
| /* Initialize OpenTelemetry trace context fields */ | ||
| sbuf->traceParent = NULL; | ||
| sbuf->traceState = NULL; | ||
| sbuf->spanHandle = 0; |
There was a problem hiding this comment.
Inconsistent indentation detected. Lines 226-229 use tabs for indentation while the surrounding code uses spaces. This should be changed to use spaces consistently with the rest of the file.
|
b'## WARNING: A Blackduck scan failure has been waived A prior failure has been upvoted
|
| rbusHandle_SetTraceContextFromString(rrdRbusHandle, ctx.traceParent, ctx.traceState); | ||
|
|
||
| rc = rbus_set(rrdRbusHandle, RRD_WEBCFG_FORCE_SYNC, value, NULL); | ||
|
|
||
| /* Log the operation in trace */ | ||
| rrdOtel_LogEvent("rbus_set", RRD_WEBCFG_FORCE_SYNC); |
There was a problem hiding this comment.
Same issue as in rrdInterface.c: rrdOtel_LogEvent is called when no span is active. A trace context is generated and set (lines 162-165), but no span is created before calling rrdOtel_LogEvent at line 170. According to the rrdOtel_LogEvent implementation, it requires an active span to log events to, so this call will fail. A span should be created before calling LogEvent.
| rbusHandle_SetTraceContextFromString(rrdRbusHandle, ctx.traceParent, ctx.traceState); | |
| rc = rbus_set(rrdRbusHandle, RRD_WEBCFG_FORCE_SYNC, value, NULL); | |
| /* Log the operation in trace */ | |
| rrdOtel_LogEvent("rbus_set", RRD_WEBCFG_FORCE_SYNC); | |
| rbusHandle_SetTraceContextFromString(rrdRbusHandle, ctx.traceParent, ctx.traceState); | |
| /* Start a span for the RBUS operation so that events are recorded correctly */ | |
| rrdOtel_StartSpan("rbus_set"); | |
| rc = rbus_set(rrdRbusHandle, RRD_WEBCFG_FORCE_SYNC, value, NULL); | |
| /* Log the operation in trace */ | |
| rrdOtel_LogEvent("rbus_set", RRD_WEBCFG_FORCE_SYNC); | |
| /* End the span after logging the operation */ | |
| rrdOtel_EndSpan(); |
| void _remoteDebuggerEventHandler(rbusHandle_t handle, rbusEvent_t const* event, rbusEventSubscription_t* subscription) | ||
| { | ||
| char *dataMsg = NULL; | ||
| data_buf *eventBuf = NULL; |
There was a problem hiding this comment.
Inconsistent indentation: This line uses a tab character while the surrounding code uses spaces. The codebase appears to use spaces for indentation, so this should also use spaces for consistency.
| } | ||
| else | ||
| { | ||
| /* Initialize trace context for this event */ |
There was a problem hiding this comment.
Inconsistent indentation: This line uses tab characters while the surrounding code uses spaces. The codebase appears to use spaces for indentation, so this should also use spaces for consistency.
| sbuf->inDynamic = false; | ||
| sbuf->appendMode = false; | ||
| sbuf->dsEvent = deepSleepEvent; | ||
| /* Initialize OpenTelemetry trace context fields */ |
There was a problem hiding this comment.
Inconsistent indentation: This comment uses a tab character while the surrounding code uses spaces. The codebase appears to use spaces for indentation (4 spaces per level), so this comment should also use spaces for consistency.
| /* Set trace context before RBUS operation */ | ||
| _set_rbus_trace_context(); | ||
|
|
||
| retCode = rbus_get(rrdRbusHandle, RRD_SET_ISSUE_EVENT, &value); | ||
|
|
||
| /* Log the RBUS operation in trace */ | ||
| rrdOtel_LogEvent("rbus_get", RRD_SET_ISSUE_EVENT); | ||
|
|
||
| /* Clear trace context after RBUS operation */ | ||
| _clear_rbus_trace_context(); | ||
|
|
There was a problem hiding this comment.
Inconsistent indentation: These lines use tab characters while the surrounding code uses spaces. The codebase appears to use spaces for indentation (4 spaces per level), so these should also use spaces for consistency.
| free(sbuf->jsonPath); | ||
| } | ||
|
|
||
| /* Free OpenTelemetry trace context fields */ |
There was a problem hiding this comment.
Inconsistent indentation: This comment uses a tab character while the surrounding code uses spaces. The codebase appears to use spaces for indentation, so this comment should also use spaces for consistency.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 10 out of 10 changed files in this pull request and generated 13 comments.
Comments suppressed due to low confidence (1)
src/rrdInterface.c:488
- Memory leak: The rbusValue_t initialized on line 390 is never released. When rbus_get succeeds, the value should be released with rbusValue_Release(value) before the function returns. Add rbusValue_Release(value) before line 488 (and also in error paths after line 407 if value was successfully retrieved).
rbusValue_t value = NULL;
rbusValue_Init(&value);
char const* issue = NULL;
/* Set trace context before RBUS operation */
_set_rbus_trace_context();
retCode = rbus_get(rrdRbusHandle, RRD_SET_ISSUE_EVENT, &value);
/* Log the RBUS operation in trace */
rrdOtel_LogEvent("rbus_get", RRD_SET_ISSUE_EVENT);
/* Clear trace context after RBUS operation */
_clear_rbus_trace_context();
if(retCode != RBUS_ERROR_SUCCESS || value == NULL)
{
RDK_LOG(RDK_LOG_DEBUG,LOG_REMDEBUG,"[%s:%d]: RBUS get failed for the event [%s]\n", __FUNCTION__, __LINE__, RRD_SET_ISSUE_EVENT);
return;
}
RDK_LOG(RDK_LOG_DEBUG,LOG_REMDEBUG,"[%s:%d]: issue type_value: = [%s]\n", __FUNCTION__, __LINE__, rbusValue_GetString(value, NULL));
issue =rbusValue_GetString(value, NULL);
char *dot_position = strchr(issue, '.'); // Find the first occurrence of '.'
if (dot_position != NULL)
{
*dot_position = '\0'; // Replace '.' with null terminator
}
size_t len = strlen(RDM_PKG_PREFIX) + strlen(issue) + 1;
char *pkg_name = (char *)malloc(len);
if(pkg_name == NULL)
{
return;
}
strncpy(pkg_name, RDM_PKG_PREFIX, strlen(RDM_PKG_PREFIX) + 1);
strncat(pkg_name, issue, len - strlen(RDM_PKG_PREFIX) - 1);
RDK_LOG(RDK_LOG_DEBUG,LOG_REMDEBUG,"[%s:%d]: pkg_name : [%s]\n", __FUNCTION__, __LINE__, pkg_name);
char *pkg_inst_path = (char *)malloc(strlen(RRD_TMP_DIR) + strlen(pkg_name) + 1);
if( pkg_inst_path == NULL)
{
return;
}
snprintf(pkg_inst_path, strlen(RRD_TMP_DIR) + strlen(pkg_name) + 1, "%s%s", RRD_TMP_DIR, pkg_name);
RDK_LOG(RDK_LOG_DEBUG,LOG_REMDEBUG,"[%s:%d]: pkg_inst_path : [%s]\n", __FUNCTION__, __LINE__, pkg_inst_path);
RDK_LOG(RDK_LOG_DEBUG, LOG_REMDEBUG, "[%s:%d]: ...Entering... \n", __FUNCTION__, __LINE__);
(void)(handle);
(void)(subscription);
RDK_LOG(RDK_LOG_INFO, LOG_REMDEBUG, "[%s:%d]: Received event for RDM_DOWNLOAD_EVENT %s \n", __FUNCTION__, __LINE__, RDM_DOWNLOAD_EVENT);
cache = findPresentInCache(pkg_name);
if (cache != NULL)
{
RDK_LOG(RDK_LOG_DEBUG, LOG_REMDEBUG, "[%s:%d]: Package found in Cache...%s \n", __FUNCTION__, __LINE__, cache->issueString);
RDK_LOG(RDK_LOG_DEBUG, LOG_REMDEBUG, "[%s:%d]: Package Details jsonPath: %s \n", __FUNCTION__, __LINE__, pkg_inst_path);
rrdjsonlen = strlen(RRD_JSON_FILE);
recPkglen = strlen(pkg_inst_path) + 1;
recPkgNamelen = strlen(cache->issueString) + 1;
RDK_LOG(RDK_LOG_DEBUG, LOG_REMDEBUG, "[%s:%d]:recPkgNamelen=%d recPkglen=%d rrdjsonlen=%d \n", __FUNCTION__, __LINE__, recPkgNamelen, recPkglen, rrdjsonlen);
sendbuf = (data_buf *)malloc(sizeof(data_buf));
RRD_data_buff_init(sendbuf, EVENT_MSG, RRD_DEEPSLEEP_RDM_PKG_INSTALL_COMPLETE);
sendbuf->mdata = (char *) calloc(recPkgNamelen, sizeof(char));
if(!sendbuf->mdata)
{
RDK_LOG(RDK_LOG_DEBUG, LOG_REMDEBUG, "[%s:%d]: Memory Allocation Failed for the rdm download event \n", __FUNCTION__, __LINE__);
RRD_data_buff_deAlloc(sendbuf);
return;
}
RDK_LOG(RDK_LOG_DEBUG, LOG_REMDEBUG, "[%s:%d]:JSON_PATH_LEN=%d \n", __FUNCTION__, __LINE__, recPkglen + rrdjsonlen);
sendbuf->jsonPath = (char *)calloc(recPkglen + rrdjsonlen, sizeof(char));
if (!sendbuf->jsonPath)
{
RDK_LOG(RDK_LOG_DEBUG, LOG_REMDEBUG, "[%s:%d]: Memory Allocation Failed for the rdm download event \n", __FUNCTION__, __LINE__);
RRD_data_buff_deAlloc(sendbuf);
return;
}
RDK_LOG(RDK_LOG_DEBUG, LOG_REMDEBUG, "[%s:%d]: Cache.issueString=%s Cache.issueString.Len=%d\n", __FUNCTION__, __LINE__, cache->issueString, strlen(cache->issueString));
strncpy((char *)sendbuf->mdata, cache->issueString, recPkgNamelen);
RDK_LOG(RDK_LOG_DEBUG, LOG_REMDEBUG, "[%s:%d]: IssueType: %s...\n", __FUNCTION__, __LINE__, (char *)sendbuf->mdata);
snprintf(sendbuf->jsonPath, strlen(pkg_inst_path) + rrdjsonlen + 1, "%s%s", pkg_inst_path, RRD_JSON_FILE);
sendbuf->inDynamic = true;
if (checkAppendRequest(sendbuf->mdata))
{
RDK_LOG(RDK_LOG_DEBUG, LOG_REMDEBUG, "[%s:%d]:Received command apppend request for the issue \n", __FUNCTION__, __LINE__);
sendbuf->inDynamic = false;
sendbuf->appendMode = true;
}
RDK_LOG(RDK_LOG_DEBUG, LOG_REMDEBUG, "[%s:%d]: IssueType: %s... jsonPath: %s... \n", __FUNCTION__, __LINE__, (char *)sendbuf->mdata, sendbuf->jsonPath);
RDK_LOG(RDK_LOG_DEBUG, LOG_REMDEBUG, "[%s:%d]: Copying Message Received to the queue.. \n", __FUNCTION__, __LINE__);
RRDMsgDeliver(msqid, sendbuf);
RDK_LOG(RDK_LOG_INFO, LOG_REMDEBUG, "[%s:%d]: SUCCESS: Message sending Done, ID=%d MSG=%s Size=%d Type=%u AppendMode=%d! \n", __FUNCTION__, __LINE__, msqid, sendbuf->mdata, strlen(sendbuf->mdata), sendbuf->mtype, sendbuf->appendMode);
remove_item(cache);
}
else
{
RDK_LOG(RDK_LOG_ERROR, LOG_REMDEBUG, "[%s:%d]: Package not requested... %s \n", __FUNCTION__, __LINE__, pkg_name);
}
free(pkg_name);
free(pkg_inst_path);
}
| // ✅ Deactivate scope | ||
| if (t_active_scope_) t_active_scope_.reset(); |
There was a problem hiding this comment.
The thread-local t_active_scope_ is reset in EndSpan, but it's not checked whether this scope actually corresponds to the span being ended. If nested spans are created on the same thread, ending an inner span would reset the scope that might still be needed for the outer span. This breaks nested span support.
| rbusValue_Init(&value); | ||
| rbusValue_SetString(value,"root"); | ||
|
|
||
| /* Generate and set trace context before RBUS operation */ | ||
| rrd_otel_context_t ctx; | ||
| rrdOtel_GenerateContext(&ctx); | ||
| rrdOtel_SetContext(&ctx); | ||
| rbusHandle_SetTraceContextFromString(rrdRbusHandle, ctx.traceParent, ctx.traceState); | ||
|
|
There was a problem hiding this comment.
The return value from rrdOtel_GenerateContext and rrdOtel_SetContext is not checked. If these calls fail (e.g., return non-zero), the code continues to call rbusHandle_SetTraceContextFromString with potentially uninitialized ctx values. Add error checking and handle failures appropriately.
| rbusValue_Init(&value); | |
| rbusValue_SetString(value,"root"); | |
| /* Generate and set trace context before RBUS operation */ | |
| rrd_otel_context_t ctx; | |
| rrdOtel_GenerateContext(&ctx); | |
| rrdOtel_SetContext(&ctx); | |
| rbusHandle_SetTraceContextFromString(rrdRbusHandle, ctx.traceParent, ctx.traceState); | |
| rbusValue_Init(&value); | |
| rbusValue_SetString(value,"root"); | |
| /* Generate and set trace context before RBUS operation */ | |
| rrd_otel_context_t ctx; | |
| int otelRet = rrdOtel_GenerateContext(&ctx); | |
| if (otelRet != 0) | |
| { | |
| RDK_LOG(RDK_LOG_ERROR, LOG_REMDEBUG, "[%s:%d]: rrdOtel_GenerateContext failed (ret=%d), proceeding without trace context\n", __FUNCTION__, __LINE__, otelRet); | |
| } | |
| else | |
| { | |
| otelRet = rrdOtel_SetContext(&ctx); | |
| if (otelRet != 0) | |
| { | |
| RDK_LOG(RDK_LOG_ERROR, LOG_REMDEBUG, "[%s:%d]: rrdOtel_SetContext failed (ret=%d), proceeding without trace context\n", __FUNCTION__, __LINE__, otelRet); | |
| } | |
| else | |
| { | |
| rbusError_t traceRc = rbusHandle_SetTraceContextFromString(rrdRbusHandle, ctx.traceParent, ctx.traceState); | |
| if (traceRc != RBUS_ERROR_SUCCESS) | |
| { | |
| RDK_LOG(RDK_LOG_ERROR, LOG_REMDEBUG, "[%s:%d]: rbusHandle_SetTraceContextFromString failed with error [%d], proceeding without trace context\n", __FUNCTION__, __LINE__, traceRc); | |
| } | |
| } | |
| } |
| } | ||
| catch (...) { | ||
| return -1; | ||
| } |
There was a problem hiding this comment.
Exception handling catches all exceptions with catch(...) but provides no error feedback to the caller. When initialization fails due to an exception, the function returns -1, but there's no logging or indication of what went wrong. Consider logging the exception message or at least a generic error message to help with debugging when OpenTelemetry initialization fails.
| int EndSpan(uint64_t spanHandle) { | ||
| try { | ||
| std::lock_guard<std::mutex> lock(spans_mutex_); | ||
| auto thread_id = std::this_thread::get_id(); | ||
| auto it = active_spans_.find(thread_id); | ||
|
|
||
| if (it != active_spans_.end()) { | ||
| // End the span (marks end time, triggers export) | ||
| it->second->End(); | ||
| active_spans_.erase(it); | ||
|
|
||
| // ✅ Deactivate scope | ||
| if (t_active_scope_) t_active_scope_.reset(); | ||
|
|
||
| return 0; | ||
| } |
There was a problem hiding this comment.
The EndSpan function doesn't verify that the spanHandle matches the span in active_spans_. It only checks if there's a span for the current thread. This means if you call EndSpan with the wrong handle (or any non-zero value), it will end whatever span is active on that thread, regardless of whether it's the right one. This could lead to spans being ended prematurely or in the wrong order.
| remotedebugger_SOURCES = rrdMain.c rrdEventProcess.c rrdJsonParser.c rrdRunCmdThread.c rrdCommandSanity.c rrdDynamic.c rrdExecuteScript.c rrdMsgPackDecoder.c rrdInterface.c uploadRRDLogs.c rrd_config.c rrd_sysinfo.c rrd_logproc.c rrd_archive.c rrd_upload.c rrdOpenTelemetry.c | ||
|
|
||
| remotedebugger_SOURCES = rrdMain.c rrdEventProcess.c rrdJsonParser.c rrdRunCmdThread.c rrdCommandSanity.c rrdDynamic.c rrdExecuteScript.c rrdMsgPackDecoder.c rrdInterface.c uploadRRDLogs.c rrd_config.c rrd_sysinfo.c rrd_logproc.c rrd_archive.c rrd_upload.c | ||
| # Allow wrapper library path to be overridden (for Yocto builds) | ||
| RRD_OTEL_WRAPPER_DIR ?= $(STAGING_DIR)/usr/lib | ||
|
|
||
| remotedebugger_CFLAGS = -I$(top_srcdir)/include/rrd -I$(PKG_CONFIG_SYSROOT_DIR)${includedir}/trower-base64/ -I$(PKG_CONFIG_SYSROOT_DIR)${includedir}/rbus/ -I$(PKG_CONFIG_SYSROOT_DIR)${includedir} $(CJSON_CFLAGS) | ||
| rrdTestApp_CFLAGS = -I$(top_srcdir)/include/rrd -I$(PKG_CONFIG_SYSROOT_DIR)${includedir}/trower-base64/ -I$(PKG_CONFIG_SYSROOT_DIR)${includedir}/rbus/ -I$(PKG_CONFIG_SYSROOT_DIR)${includedir} $(CJSON_CFLAGS) | ||
| AM_LDFLAGS="-lpthread -lrdkloggers -lmsgpackc -ltrower-base64 -lwebconfig_framework -lrbus -lsecure_wrapper " | ||
| remotedebugger_LDADD = -lfwutils -luploadstblogs -lz | ||
| if IARMBUS_ENABLE | ||
| remotedebugger_SOURCES += rrdIarmEvents.c | ||
| remotedebugger_CFLAGS += -I$(PKG_CONFIG_SYSROOT_DIR)${includedir}/rdk/iarmbus/ -I$(PKG_CONFIG_SYSROOT_DIR)${includedir}/rdk/iarmmgrs/rdmmgr -I$(PKG_CONFIG_SYSROOT_DIR)${includedir}/rdk/iarmmgrs-hal -DIARMBUS_SUPPORT | ||
| AM_LDFLAGS += "-lIARMBus -lrfcapi -ltr181api" | ||
| endif | ||
| remotedebugger_LDFLAGS = $(AM_LDFLAGS) $(CJSON_LDFLAGS) $(CJSON_LIBS) -luploadstblogs -fno-common | ||
| rrdTestApp_LDFLAGS = $(AM_LDFLAGS) $(CJSON_LDFLAGS) $(CJSON_LIBS) -luploadstblogs -fno-common | ||
| remotedebugger_LDFLAGS = $(AM_LDFLAGS) $(CJSON_LDFLAGS) $(CJSON_LIBS) -luploadstblogs -fno-common -L$(RRD_OTEL_WRAPPER_DIR) -lrrd_otel_cpp_wrapper -lopentelemetry_trace -lopentelemetry_otlp_http_exporter -lopentelemetry_common -lopentelemetry_resources |
There was a problem hiding this comment.
The remotedebugger binary is linked with the OpenTelemetry C++ wrapper library and dependencies, but rrdOpenTelemetry_cpp_wrapper.cpp is not listed in remotedebugger_SOURCES. The wrapper library needs to be built separately (presumably by CMake as indicated in CMakeLists.txt) and then linked. However, there's no clear indication in this Makefile that the C++ wrapper library will be built first or is available. Consider adding a dependency or build rule to ensure the C++ wrapper is built before linking.
| // ✅ CRITICAL FIX #5: Thread-safe current span tracking | ||
| std::unordered_map<std::thread::id, nostd::shared_ptr<trace::Span>> active_spans_; | ||
| std::mutex spans_mutex_; |
There was a problem hiding this comment.
Using a single active_spans_ map indexed by thread ID means only one span can be active per thread at a time. This prevents proper nesting of spans within the same thread. The comment says "Thread-safe current span tracking" but it doesn't support the common use case of nested spans (e.g., a parent operation containing multiple child operations in sequence).
| static void _generate_hex_id(char *buffer, int length) | ||
| { | ||
| static const char hex_chars[] = "0123456789abcdef"; | ||
| srand((unsigned int)time(NULL) + pthread_self()); |
There was a problem hiding this comment.
The srand() function is called every time _generate_hex_id is invoked, which resets the random number generator with a potentially similar seed each time. This is inefficient and can lead to predictable sequences. The seed should be initialized once at program startup, not on every call. Additionally, using pthread_self() makes the sequence even more predictable across threads.
| /* Initialize trace context for this event */ | ||
| eventBuf = (data_buf *)malloc(sizeof(data_buf)); | ||
| if (eventBuf) | ||
| { | ||
| RRD_data_buff_init(eventBuf, EVENT_MSG, RRD_DEEPSLEEP_INVALID_DEFAULT); | ||
| /* Setup OpenTelemetry trace context */ | ||
| _setup_trace_context_for_event(eventBuf, "ProcessIssueEvent"); | ||
|
|
There was a problem hiding this comment.
The trace context from the incoming RBUS event is never extracted. When an external component sends a trace context via RBUS (as demonstrated in rrdTraceTestApp.c), the event handler should extract it using rbusHandle_GetTraceContextAsString or a similar RBUS API before setting up the trace. Currently, the code always generates a new trace context or uses the thread-local one, missing the trace context that was propagated through RBUS.
| rrdTestApp_SOURCES = rrdTraceTestApp.c | ||
| remotedebugger_SOURCES = rrdMain.c rrdEventProcess.c rrdJsonParser.c rrdRunCmdThread.c rrdCommandSanity.c rrdDynamic.c rrdExecuteScript.c rrdMsgPackDecoder.c rrdInterface.c uploadRRDLogs.c rrd_config.c rrd_sysinfo.c rrd_logproc.c rrd_archive.c rrd_upload.c rrdOpenTelemetry.c | ||
|
|
||
| remotedebugger_SOURCES = rrdMain.c rrdEventProcess.c rrdJsonParser.c rrdRunCmdThread.c rrdCommandSanity.c rrdDynamic.c rrdExecuteScript.c rrdMsgPackDecoder.c rrdInterface.c uploadRRDLogs.c rrd_config.c rrd_sysinfo.c rrd_logproc.c rrd_archive.c rrd_upload.c | ||
| # Allow wrapper library path to be overridden (for Yocto builds) | ||
| RRD_OTEL_WRAPPER_DIR ?= $(STAGING_DIR)/usr/lib | ||
|
|
||
| remotedebugger_CFLAGS = -I$(top_srcdir)/include/rrd -I$(PKG_CONFIG_SYSROOT_DIR)${includedir}/trower-base64/ -I$(PKG_CONFIG_SYSROOT_DIR)${includedir}/rbus/ -I$(PKG_CONFIG_SYSROOT_DIR)${includedir} $(CJSON_CFLAGS) | ||
| rrdTestApp_CFLAGS = -I$(top_srcdir)/include/rrd -I$(PKG_CONFIG_SYSROOT_DIR)${includedir}/trower-base64/ -I$(PKG_CONFIG_SYSROOT_DIR)${includedir}/rbus/ -I$(PKG_CONFIG_SYSROOT_DIR)${includedir} $(CJSON_CFLAGS) | ||
| AM_LDFLAGS="-lpthread -lrdkloggers -lmsgpackc -ltrower-base64 -lwebconfig_framework -lrbus -lsecure_wrapper " | ||
| remotedebugger_LDADD = -lfwutils -luploadstblogs -lz | ||
| if IARMBUS_ENABLE | ||
| remotedebugger_SOURCES += rrdIarmEvents.c | ||
| remotedebugger_CFLAGS += -I$(PKG_CONFIG_SYSROOT_DIR)${includedir}/rdk/iarmbus/ -I$(PKG_CONFIG_SYSROOT_DIR)${includedir}/rdk/iarmmgrs/rdmmgr -I$(PKG_CONFIG_SYSROOT_DIR)${includedir}/rdk/iarmmgrs-hal -DIARMBUS_SUPPORT | ||
| AM_LDFLAGS += "-lIARMBus -lrfcapi -ltr181api" | ||
| endif | ||
| remotedebugger_LDFLAGS = $(AM_LDFLAGS) $(CJSON_LDFLAGS) $(CJSON_LIBS) -luploadstblogs -fno-common | ||
| rrdTestApp_LDFLAGS = $(AM_LDFLAGS) $(CJSON_LDFLAGS) $(CJSON_LIBS) -luploadstblogs -fno-common |
There was a problem hiding this comment.
The rrdTestApp_LDADD variable is not defined, but rrdTestApp_LDFLAGS is. This means rrdTestApp won't link against the libraries specified in remotedebugger_LDADD (like -lfwutils, -luploadstblogs, -lz). The test app should either have its own rrdTestApp_LDADD or should not require these libraries. If RBUS functionality is needed for the test app, ensure it links against all required libraries.
|
|
||
| // Extract parts from W3C format | ||
| // Expected: 00-<traceId>-<spanId>-<flags> | ||
| std::sscanf(traceParent, "%*2c-%32[^-]-%16[^-]-%2s", trace_id, span_id, flags); |
There was a problem hiding this comment.
The sscanf format string has potential buffer overflow issues. The format "%32[^-]" and "%16[^-]" will read up to 32 and 16 characters respectively, but the buffers are declared as size 33 and 17 (which includes space for null terminator). However, sscanf will add a null terminator after reading, so reading 32 characters will write 33 bytes into a 33-byte buffer, which is correct. But if the input string is malformed and doesn't have the expected dashes, sscanf could read beyond the expected boundaries. Consider validating the input format before parsing or using safer parsing methods.
| // Extract parts from W3C format | |
| // Expected: 00-<traceId>-<spanId>-<flags> | |
| std::sscanf(traceParent, "%*2c-%32[^-]-%16[^-]-%2s", trace_id, span_id, flags); | |
| // Extract parts from W3C format without using sscanf to avoid parsing issues | |
| // Expected: 00-<traceId>-<spanId>-<flags> | |
| { | |
| const std::size_t len = std::strlen(traceParent); | |
| // Minimal length check and dash positions: "00-" + 32 + "-" + 16 + "-" + 2 = 55 chars | |
| if (len >= 55 && | |
| traceParent[2] == '-' && | |
| traceParent[35] == '-' && | |
| traceParent[52] == '-') | |
| { | |
| // Copy trace_id (32 chars) starting after "00-" | |
| std::memcpy(trace_id, traceParent + 3, 32); | |
| trace_id[32] = '\0'; | |
| // Copy span_id (16 chars) | |
| std::memcpy(span_id, traceParent + 36, 16); | |
| span_id[16] = '\0'; | |
| // Copy flags (2 chars) | |
| std::memcpy(flags, traceParent + 53, 2); | |
| flags[2] = '\0'; | |
| } | |
| } |
No description provided.