Fix profile_gtest.bin SIGABRT caused by use-after-destroy of t2MarkersMutex#301
Fix profile_gtest.bin SIGABRT caused by use-after-destroy of t2MarkersMutex#301Copilot wants to merge 2 commits intocopilot/cherry-pick-1-8-5-to-1-6-3from
Conversation
…er destroy in ClearMarkerComponentMapShouldRemoveEntries test Co-authored-by: yogeswaransky <166126056+yogeswaransky@users.noreply.github.com> Agent-Logs-Url: https://github.com/rdkcentral/telemetry/sessions/0ef5113d-796c-43d6-b467-2b713a9157aa
There was a problem hiding this comment.
Pull request overview
Fixes the profile_gtest.bin SIGABRT caused by tests destroying the T2 marker component map mutex and leaving subsequent tests to lock a destroyed mutex.
Changes:
- Reinitializes the marker component map after
destroyT2MarkerComponentMap()inClearMarkerComponentMapShouldRemoveEntries. - Adds
Vector_Create/Vector_Destroymock expectations to support the reinitialization path.
| EXPECT_EQ(destroyT2MarkerComponentMap(), T2ERROR_SUCCESS); | ||
| // Reinitialize for subsequent tests that depend on marker component map | ||
| EXPECT_CALL(*g_vectorMock, Vector_Create(_)) | ||
| .Times(::testing::AtMost(1)) | ||
| .WillRepeatedly(Return(T2ERROR_SUCCESS)); | ||
| EXPECT_CALL(*g_vectorMock, Vector_Destroy(_, _)) | ||
| .Times(::testing::AtMost(1)) | ||
| .WillRepeatedly(Return(T2ERROR_SUCCESS)); | ||
| EXPECT_EQ(initT2MarkerComponentMap(), T2ERROR_SUCCESS); |
There was a problem hiding this comment.
Calling initT2MarkerComponentMap() here can double-initialize internal static mutexes (e.g., t2CompListMutex is pthread_mutex_init()'d in initT2MarkerComponentMap() but is never pthread_mutex_destroy()'d in destroyT2MarkerComponentMap()). This is undefined behavior and can lead to flaky crashes or EBUSY failures. A safer test-only fix would be to avoid destroyT2MarkerComponentMap() in this test (it’s not needed to validate clear semantics), or move init/destroy into a suite-level SetUpTestSuite/TearDownTestSuite so init runs once and destroy runs once at the end.
| EXPECT_EQ(destroyT2MarkerComponentMap(), T2ERROR_SUCCESS); | |
| // Reinitialize for subsequent tests that depend on marker component map | |
| EXPECT_CALL(*g_vectorMock, Vector_Create(_)) | |
| .Times(::testing::AtMost(1)) | |
| .WillRepeatedly(Return(T2ERROR_SUCCESS)); | |
| EXPECT_CALL(*g_vectorMock, Vector_Destroy(_, _)) | |
| .Times(::testing::AtMost(1)) | |
| .WillRepeatedly(Return(T2ERROR_SUCCESS)); | |
| EXPECT_EQ(initT2MarkerComponentMap(), T2ERROR_SUCCESS); |
profile_gtest.bincrashes with SIGABRT (exit code 134) becauseClearMarkerComponentMapShouldRemoveEntriescallsdestroyT2MarkerComponentMap(), permanently destroying thet2MarkersMutexpthread mutex andmarkerCompMaphash map. Subsequent tests (e.g.initReportProfiles) internally attempt to lock the destroyed mutex → undefined behavior → SIGABRT.Change
source/test/bulkdata/profileTest.cpp: After thedestroyT2MarkerComponentMap()call inClearMarkerComponentMapShouldRemoveEntries, reinitialize the marker component map so downstream tests operate on valid state:Mock expectations match the pattern already established in
InitAndDestroyShouldWork. No production source files modified.Original prompt
Problem
The L1-tests workflow on PR #300 (branch
copilot/cherry-pick-1-8-5-to-1-6-3, targetingsupport/1.6.3) fails becauseprofile_gtest.bincrashes with SIGABRT (exit code 134). This is a pre-existing issue on thesupport/1.6.3base branch — the PR's cherry-picked source changes did not introduce this failure.See failing job: https://github.com/rdkcentral/telemetry/actions/runs/23577457438/job/68657158670?pr=300
The
test/run_ut.shscript runs test binaries sequentially and records failure if any exit non-zero. Theprofile_gtest.binbinary crashes during or after theProfileTest.ClearMarkerComponentMapShouldRemoveEntriestest, which callsdestroyT2MarkerComponentMap()at line 509 ofsource/test/bulkdata/profileTest.cpp. This permanently destroys thet2MarkersMutexandmarkerCompMaphash map. Later tests in the same binary (starting fromProfileTest.initReportProfilesat line 522) callinitReportProfiles()which internally tries to lock the already-destroyed mutex, causing undefined behavior and SIGABRT.Root Cause
In
source/test/bulkdata/profileTest.cpp:InitAndDestroyShouldWork(line 287) callsinitT2MarkerComponentMap()— initializes the mutex and hash map ✅ComponentListIsUpdated(line 404) andGetMarkerProfileListReturnsProfiles(line 427) calladdT2EventMarker()which uses the mutex internally ✅ClearMarkerComponentMapShouldRemoveEntries(line 491) calls:clearT2MarkerComponentMap()— clears entries ✅destroyT2MarkerComponentMap()at line 509 — permanently destroys the mutex and hash map 💥initReportProfiles(line 522) calls realinitReportProfiles()which internally accesses the marker component map infrastructure with the destroyed mutex → SIGABRTGoogleTest runs tests in declaration order within the same test suite. Since all these tests are in the
ProfileTestfixture, the destroy happens before the reportProfiles tests run.Required Fix (TEST-ONLY — no source code changes)
Do NOT modify any source code files. Only modify test files.
Fix the test file
source/test/bulkdata/profileTest.cppby reinitializing the T2 marker component map after it is destroyed in theClearMarkerComponentMapShouldRemoveEntriestest. Specifically:After line 509 (
EXPECT_EQ(destroyT2MarkerComponentMap(), T2ERROR_SUCCESS);), add a call to reinitialize:This ensures the mutex and hash map are in a valid state for all subsequent tests in the same binary (especially the
initReportProfilestest at line 522 and beyond).You will also need to add the appropriate Vector mock expectations (Vector_Create and Vector_Destroy) for this reinitialization call, matching the pattern already used in the
InitAndDestroyShouldWorktest at line 287-295.Files to modify
source/test/bulkdata/profileTest.cpp— Add reinitialization of T2 marker component map after destroy inClearMarkerComponentMapShouldRemoveEntriestestVerification
After the fix, the
profile_gtest.binbinary should no longer crash, andtest/run_ut.shshould report all test binaries passing (exit 0), making the L1-tests workflow pass.The following is the prior conversation context from the user's chat exploration (may be truncated):
User: looks like a crash:
[ OK ] ProfileTest.getMarkerProfileList (0 ms)
Aborted (core dumped)
Test ./source/test/bulkdata/profile_gtest.bin failed with exit code 134
https://github.com/rdkcentral/telemetry/actions/runs/23577457438/job/68657158670?pr=300
could you take your time , analyze and provide a fix for this ?
Assistant:
I'll start by analyzing the failing job and the PR to understand the crash. Let me gather the relevant information.
Assistant: Now let me look at the specific crash. The log says
profile_gtest.bincrashed withAborted (core dumped)and exit code 134 (which is SIGABRT). The crash happens afterProfileTest.getMarkerProfileListpasses. Let me look at the full job log more carefully and examine the test source and the PR changes.Assistant: Now let me look at the profile_gtest test file and the profile source code to understand the crash. The logs show
ProfileTest.getMarkerProfileListpassed but then "Aborted (core dumped)" happened, meaning the crash is during test teardown or the next test initialization.Assistant: Now let me look at the actual profile test source to understand the crash better. The crash happens after
ProfileTest.getMarkerProfileListpasses. Let me find this test and look at the full test file.Assistant: Now let me look at the specific test that crashes - the log says the last passing test is
ProfileTest.getMarkerProfileListand thenAborted (core dumped). Let me check what test runs after that one, and look at theprofileMockwhich is referenced but might be missing ...This pull request was created from Copilot chat.
💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more Copilot coding agent tips in the docs.