-
Notifications
You must be signed in to change notification settings - Fork 10
RDKEMW-10277: Multiple crashes observed while searching SSID #267
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
Reason for Change: The wifiConnectionUpdate() callback registered in the wifiConnect(), got invoked after the wait(m_loop) timeout of 10sec and deleteClientConnection(). In the deleteClientConnection() function m_client is completely cleared and set to NULL. After this wifiConnectionUpdate() callback is invoked in which nm_client_activate_connection_async() uses the freed m_client, which leads to the crash. Addressed this by adding GCancellable, through which we are cancelling the pending async operations. Test Procedure: Check for crash in the issue reported scenario Priority:P1 Risks: Medium Signed-off-by: Gururaaja ESR<Gururaja_ErodeSriranganRamlingham@comcast.com>
Reason for Change: The wifiConnectionUpdate() callback registered in the wifiConnect(), got invoked after the wait(m_loop) timeout of 10sec and deleteClientConnection(). In the deleteClientConnection() function m_client is completely cleared and set to NULL. After this wifiConnectionUpdate() callback is invoked in which nm_client_activate_connection_async() uses the freed m_client, which leads to the crash. Addressed this by adding GCancellable, through which we are cancelling the pending async operations. Test Procedure: Check for crash in the issue reported scenario Priority:P1 Risks: Medium Signed-off-by: Gururaaja ESR<Gururaja_ErodeSriranganRamlingham@comcast.com>
…ager into topic/RDKEMW-10277
bpunnuru
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good.
…ager into topic/RDKEMW-10277
…ager into topic/RDKEMW-10277
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR addresses a critical crash issue (RDKEMW-10277) that occurred when async callbacks were invoked after the m_client object was freed due to timeout. The fix introduces GCancellable to properly cancel pending async operations before cleanup.
Changes:
- Added
GCancellablemember to track and cancel pending async operations - Updated all async NetworkManager API calls to pass the cancellable parameter
- Enhanced error handling to gracefully handle cancellation errors in callbacks
- Added infinite DHCP timeout configuration
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| plugin/gnome/NetworkManagerGnomeWIFI.h | Added m_cancellable member variable to wifiManager class |
| plugin/gnome/NetworkManagerGnomeWIFI.cpp | Implemented cancellable initialization, cleanup, and usage across all async operations; added cancellation error handling |
Comments suppressed due to low confidence (2)
plugin/gnome/NetworkManagerGnomeWIFI.cpp:540
- The
wifiConnectionUpdatecallback invokesnm_client_activate_connection_async(line 546) which could use the freedm_clientif called after timeout. This callback is not checking for cancellation errors and doesn't have a cancellable parameter passed tonm_remote_connection_update2at lines 1033 and 1189. Add cancellation error checking similar to other callbacks, and passm_cancellableto thenm_remote_connection_update2calls.
static void wifiConnectionUpdate(GObject *rmObject, GAsyncResult *res, gpointer user_data)
{
NMRemoteConnection *remote_con = NM_REMOTE_CONNECTION(rmObject);
wifiManager *_wifiManager = (static_cast<wifiManager*>(user_data));
GVariant *ret = NULL;
GError *error = NULL;
ret = nm_remote_connection_update2_finish(remote_con, res, &error);
if (!ret || error) {
if(error) {
NMLOG_ERROR("Error: %s.", error->message);
g_error_free(error);
}
_wifiManager->m_isSuccess = false;
_wifiManager->quit(NULL);
return;
}
plugin/gnome/NetworkManagerGnomeWIFI.cpp:1105
- The
addToKnownSSIDsUpdateCbcallback lacks cancellation error handling. This callback is associated withnm_remote_connection_update2calls that don't pass a cancellable parameter (lines 1033, 1189). Add a check forG_IO_ERROR_CANCELLEDbefore logging the error as a failure, similar to the pattern used inaddToKnownSSIDsCb.
static void addToKnownSSIDsUpdateCb(GObject *rmObject, GAsyncResult *res, gpointer user_data)
{
NMRemoteConnection *remote_con = NM_REMOTE_CONNECTION(rmObject);
wifiManager *_wifiManager = (static_cast<wifiManager*>(user_data));
GVariant *ret = NULL;
GError *error = NULL;
ret = nm_remote_connection_update2_finish(remote_con, res, &error);
if (!ret) {
if(error) {
NMLOG_ERROR("Error: %s.", error->message);
g_error_free(error);
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| if (_wifiManager->m_createNewConnection) { | ||
| // Check if operation was cancelled - this is expected during cleanup | ||
| if (g_error_matches(error, G_IO_ERROR, G_IO_ERROR_CANCELLED)) { |
Copilot
AI
Jan 15, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error pointer is checked before it's populated by the finish functions. The check at line 454 evaluates error which is still NULL at this point. This check should be moved after the finish function calls (lines 459 or 464) have populated the error. The current logic will never detect cancellation because g_error_matches(NULL, ...) returns false.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed
| NMLOG_DEBUG("Scan operation was cancelled"); | ||
| } else { | ||
| NMLOG_ERROR("Scanning Failed Error: %s.", error->message); | ||
| } |
Copilot
AI
Jan 15, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing error cleanup when cancellation is detected. When cancellation occurs (line 1378-1379), the error is logged but not freed with g_error_free(), causing a memory leak. Add g_error_free(error); in the cancellation branch or restructure to always free the error at the end.
| } | |
| } | |
| g_error_free(error); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed
| g_object_set(G_OBJECT(sIpv4Conf), NM_SETTING_IP_CONFIG_METHOD, NM_SETTING_IP4_CONFIG_METHOD_AUTO, NULL); // autoconf = true | ||
| g_object_set(G_OBJECT(sIpv4Conf), NM_SETTING_IP_CONFIG_DHCP_HOSTNAME, hostname.c_str(), NULL); | ||
| g_object_set(G_OBJECT(sIpv4Conf), NM_SETTING_IP_CONFIG_DHCP_SEND_HOSTNAME, TRUE, NULL); // hostname send enabled | ||
| g_object_set(G_OBJECT(sIpv4Conf), NM_SETTING_IP_CONFIG_DHCP_TIMEOUT, 2147483647, NULL); // 2147483647 = infinite timeout |
Copilot
AI
Jan 15, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The magic number 2147483647 (INT32_MAX) should be defined as a named constant or use a symbolic constant like G_MAXINT if available in the GLib library. This improves code readability and maintainability.
| g_object_set(G_OBJECT(sIpv4Conf), NM_SETTING_IP_CONFIG_DHCP_TIMEOUT, 2147483647, NULL); // 2147483647 = infinite timeout | |
| g_object_set(G_OBJECT(sIpv4Conf), NM_SETTING_IP_CONFIG_DHCP_TIMEOUT, G_MAXINT, NULL); // infinite timeout |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed
| void wifiManager::deleteClientConnection() | ||
| { | ||
| // Cancel all pending async operations before destroying client | ||
| if(m_cancellable != NULL) { |
Copilot
AI
Jan 15, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inconsistent NULL checking style. In GLib/GObject-based code, it's more idiomatic to check with if (m_cancellable) rather than explicit != NULL comparison. Consider using the shorter form for consistency with common GLib coding conventions.
| if(m_cancellable != NULL) { | |
| if (m_cancellable) { |
…ager into topic/RDKEMW-10277
…ager into topic/RDKEMW-10277
…ager into topic/RDKEMW-10277
…ager into topic/RDKEMW-10277
…ager into topic/RDKEMW-10277
…ager into topic/RDKEMW-10277
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if(g_error_matches(error, G_IO_ERROR, G_IO_ERROR_CANCELLED)) | ||
| { | ||
| NMLOG_ERROR("something went wrong no matching gerror found !"); | ||
| NMLOG_ERROR("Disconnect operation was cancelled"); |
Copilot
AI
Jan 23, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inconsistent log level: Cancellation is logged as NMLOG_ERROR here, but in other callbacks like wifiConnectTempCb (line 499) and wifiScanCb (line 1386), it's logged as NMLOG_DEBUG. Since cancellation is an expected part of cleanup and not an error condition, this should use NMLOG_DEBUG for consistency.
| NMLOG_ERROR("Disconnect operation was cancelled"); | |
| NMLOG_DEBUG("Disconnect operation was cancelled"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed
| g_object_set(G_OBJECT(sIpv4Conf), NM_SETTING_IP_CONFIG_DHCP_HOSTNAME, hostname.c_str(), NULL); | ||
| g_object_set(G_OBJECT(sIpv4Conf), NM_SETTING_IP_CONFIG_DHCP_SEND_HOSTNAME, TRUE, NULL); // hostname send enabled | ||
| g_object_set(G_OBJECT(sIpv4Conf), NM_SETTING_IP_CONFIG_DHCP_TIMEOUT, 2147483647, NULL); // 2147483647 = infinite timeout | ||
| nm_connection_add_setting(m_connection, NM_SETTING(sIpv4Conf)); |
Copilot
AI
Jan 23, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Magic number without explanation: The value 3 for "link-local" setting lacks documentation or a named constant. Consider using a named constant or adding a comment explaining what this value represents (e.g., NM_SETTING_IP4_LL_ENABLED, NM_SETTING_IP4_LL_DISABLED, or similar enumeration value).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed
| NMLOG_DEBUG("Scan operation was cancelled"); | ||
| } else { | ||
| NMLOG_ERROR("Scanning Failed Error: %s.", error->message); | ||
| } |
Copilot
AI
Jan 23, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Memory leak: When a scan operation is cancelled, the error is not freed. The error should be freed with g_error_free(error) after logging the cancellation message, similar to how it's done in other error handling paths.
| } | |
| } | |
| g_error_free(error); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed
| if (g_error_matches(error, G_IO_ERROR, G_IO_ERROR_CANCELLED)) { | ||
| NMLOG_DEBUG("Connection operation was cancelled"); | ||
| } |
Copilot
AI
Jan 23, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Logic error: g_error_matches is called on error before it is initialized. The error is set to NULL at line 455, and g_error_matches is called at line 460 before any operation that could set it. This will always return false. The check for cancelled operations should be done after attempting the finish operations (lines 465 or 470) where the error is actually populated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed
| nm_client_activate_connection_async(_wifiManager->m_client, | ||
| NM_CONNECTION(remoteConnection), | ||
| _wifiManager->m_wifidevice, | ||
| _wifiManager->m_objectPath, | ||
| NULL, | ||
| _wifiManager->m_cancellable, | ||
| wifiConnectCb, | ||
| _wifiManager); |
Copilot
AI
Jan 23, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential use-after-free: In this callback (wifiConnectTempCb), after the operation is cancelled and the callback returns early, there's a potential race where deleteClientConnection() might have already destroyed m_client and m_cancellable. However, if the cancellation didn't happen, the callback proceeds to call nm_client_activate_connection_async at line 514 with _wifiManager->m_client and _wifiManager->m_cancellable. Since deleteClientConnection() can set these to NULL after cancellation, accessing them without checking could cause issues. Consider adding null checks for m_client before using it in callbacks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed
| g_object_set(G_OBJECT(sIpv4Conf), NM_SETTING_IP_CONFIG_METHOD, NM_SETTING_IP4_CONFIG_METHOD_AUTO, NULL); // autoconf = true | ||
| g_object_set(G_OBJECT(sIpv4Conf), NM_SETTING_IP_CONFIG_DHCP_HOSTNAME, hostname.c_str(), NULL); | ||
| g_object_set(G_OBJECT(sIpv4Conf), NM_SETTING_IP_CONFIG_DHCP_SEND_HOSTNAME, TRUE, NULL); // hostname send enabled | ||
| g_object_set(G_OBJECT(sIpv4Conf), NM_SETTING_IP_CONFIG_DHCP_TIMEOUT, 2147483647, NULL); // 2147483647 = infinite timeout |
Copilot
AI
Jan 23, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unrelated change: The DHCP timeout and link-local settings (lines 762-763) are not mentioned in the PR description and appear unrelated to fixing the crash issue described (RDKEMW-10277). These configuration changes should be in a separate commit or PR with their own justification and testing, especially since setting DHCP timeout to INT_MAX (infinite) could have operational implications.
| g_object_set(G_OBJECT(sIpv4Conf), NM_SETTING_IP_CONFIG_DHCP_TIMEOUT, 2147483647, NULL); // 2147483647 = infinite timeout |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed
| else { | ||
| NMLOG_INFO("staring normal wifi scanning .."); | ||
| nm_device_wifi_request_scan_async(wifiDevice, NULL, wifiScanCb, this); | ||
| nm_device_wifi_request_scan_async(wifiDevice, m_cancellable, wifiScanCb, this); |
Copilot
AI
Jan 23, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spelling error: "staring" should be "starting" in the log message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed
…ager into topic/RDKEMW-10277
…ager into topic/RDKEMW-10277
…ager into topic/RDKEMW-10277
…ager into topic/RDKEMW-10277
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
RDKEMW-11488, RDKEMW-11077, RDKEMW-10276
Reason for Change: The wifiConnectionUpdate() callback registered in the wifiConnect(), got invoked after the wait(m_loop) timeout of 10sec and deleteClientConnection(). In the deleteClientConnection() function m_client is completely cleared and set to NULL. After this wifiConnectionUpdate() callback is invoked in which nm_client_activate_connection_async() uses the freed m_client, which leads to the crash. Addressed this by adding GCancellable, through which we are cancelling the pending async operations.
Test Procedure: Check for crash in the issue reported scenario
Priority:P1
Risks: Medium
Signed-off-by: Gururaaja ESRGururaja_ErodeSriranganRamlingham@comcast.com