
pickle error: #422

Open
Sonajeya31 wants to merge 21 commits into rdkcentral:develop from Sonajeya31:topic/RDKEMW-13253-1

Conversation

@Sonajeya31

Description

What does this PR change/fix and why?

If there is a corresponding JIRA ticket, please ensure it is in the title of the PR.

Test Procedure

How to test this PR (if applicable)

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Other (doesn't fit into the above categories - e.g. documentation updates)

Requires Bitbake Recipe changes?

  • The base Bitbake recipe (meta-rdk-ext/recipes-containers/dobby/dobby.bb) must be modified to support the changes in this PR (beyond updating SRC_REV)

Copilot AI review requested due to automatic review settings March 25, 2026 04:39

Copilot AI left a comment


Pull request overview

This PR appears to address a multiprocessing “pickle error” in the L2 test runner by moving the process-output waiting function to module scope so it can be used as a multiprocessing.Process target.

Changes:

  • Extracted the inner wait_for_string function into a module-level helper (_wait_for_string) to make it picklable.
  • Updated read_asynchronous to spawn the reader process using _wait_for_string.
  • Minor formatting adjustment at end-of-file.


Comment on lines 123 to 125
 found = False
-reader = multiprocessing.Process(target=wait_for_string, args=(proc, string_to_find), kwargs={})
+reader = multiprocessing.Process(target=_wait_for_string, args=(proc, string_to_find), kwargs={})
 test_utils.print_log("Starting multithread read", test_utils.Severity.debug)

Copilot AI Mar 25, 2026


multiprocessing.Process(..., args=(proc, ...)) still requires pickling proc under the spawn start method (default on macOS/Windows). subprocess.Popen instances aren’t picklable, so this change may still fail with a pickling error depending on the platform/start method. Consider switching to threading.Thread for the stderr reader (no pickling, shared file handles), or pass only a picklable primitive (e.g., proc.stderr.fileno() and reopen it in the child) instead of the Popen object.

    """Waits indefinitely until string is found in process. Must be run with timeout multiprocess.
    while True:
        # notice that all data are in stderr not in stdout, this is DobbyDaemon design
        output = proc.stderr.readline()

Copilot AI Mar 25, 2026


_wait_for_string loops forever even if the process exits and the stderr pipe hits EOF (readline() returns an empty string). In that case this becomes a tight loop that can burn CPU until the timeout/terminate path triggers. Handle EOF explicitly (e.g., break/return when output == '') so the reader can exit cleanly when the subprocess ends.

Suggested change
 output = proc.stderr.readline()
+# readline() returns an empty string on EOF; exit to avoid a tight infinite loop
+if output == "":
+    test_utils.print_log("EOF reached while waiting for \"%s\"" % string_to_find, test_utils.Severity.debug)
+    return

 found = False
-reader = multiprocessing.Process(target=wait_for_string, args=(proc, string_to_find), kwargs={})
+reader = multiprocessing.Process(target=_wait_for_string, args=(proc, string_to_find), kwargs={})
 test_utils.print_log("Starting multithread read", test_utils.Severity.debug)

Copilot AI Mar 25, 2026


Log message says "multithread read" but this code uses multiprocessing.Process, not threads. Updating the wording (or renaming the helper) will avoid confusion when debugging.

Suggested change
-test_utils.print_log("Starting multithread read", test_utils.Severity.debug)
+test_utils.print_log("Starting multiprocess read", test_utils.Severity.debug)

Copilot AI review requested due to automatic review settings March 25, 2026 05:47

Copilot AI left a comment


Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (1)

tests/L2_testing/test_runner/basic_sanity_tests.py:134

  • On timeout, read_asynchronous returns while the reader thread continues to run and block on readline() indefinitely. Because the thread is non-daemon and there’s no stop signal, this can cause the test runner to hang at process exit (or leak background threads across tests). Make the thread daemon and/or implement a cancellation mechanism (e.g., threading.Event) that is set on timeout and checked inside _wait_for_string, and ensure the pipe read unblocks when cancelling (close proc.stderr or similar).
    found = False
    reader = threading.Thread(target=_wait_for_string, args=(proc, string_to_find))
    test_utils.print_log("Starting multithread read", test_utils.Severity.debug)
    reader.start()
    reader.join(timeout)
    # if thread still running
    if reader.is_alive():
        test_utils.print_log("Reader still exists, closing", test_utils.Severity.debug)
        test_utils.print_log("Not found string \"%s\"" % string_to_find, test_utils.Severity.error)
    else:
        found = True
    return found


Comment on lines +101 to +106
    while True:
        # notice that all data are in stderr not in stdout, this is DobbyDaemon design
        output = proc.stderr.readline()
        if string_to_find in output:
            test_utils.print_log("Found string \"%s\"" % string_to_find, test_utils.Severity.debug)
            return

Copilot AI Mar 25, 2026


_wait_for_string loops forever even when the subprocess stream reaches EOF. If proc.stderr.readline() returns an empty string (e.g., after the process exits or the pipe closes), this becomes a tight infinite loop that can peg CPU and never return. Break/return on EOF (and/or handle read exceptions). If you allow the worker to exit on EOF, also adjust read_asynchronous so it doesn’t treat “thread finished” as “string found” (e.g., use an Event/shared flag that is set only when the match is found).

Comment on lines +88 to +90
# Helper function for multiprocessing - must be at module level to be picklable
def _wait_for_string(proc, string_to_find):
    """Waits indefinitely until string is found in process. Must be run with timeout multiprocess.

Copilot AI Mar 25, 2026


The comment/docstring for _wait_for_string still refers to multiprocessing/pickling ("Helper function for multiprocessing" / "timeout multiprocess"), but the implementation now uses a thread. Please update the comment/docstring to match the current threading approach to avoid misleading future maintenance.

Suggested change
-# Helper function for multiprocessing - must be at module level to be picklable
-def _wait_for_string(proc, string_to_find):
-    """Waits indefinitely until string is found in process. Must be run with timeout multiprocess.
+# Helper function intended to be run in a background thread to wait for a string in process output.
+def _wait_for_string(proc, string_to_find):
+    """Waits indefinitely until string is found in process; caller should enforce any timeout using threading.

Copilot AI review requested due to automatic review settings March 25, 2026 06:13

Copilot AI left a comment


Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.



Comment on lines +124 to 131
    reader = threading.Thread(target=_wait_for_string, args=(proc, string_to_find))
    test_utils.print_log("Starting multithread read", test_utils.Severity.debug)
    reader.start()
    reader.join(timeout)
    # if thread still running
    if reader.is_alive():
        test_utils.print_log("Reader still exists, closing", test_utils.Severity.debug)
        reader.terminate()
        test_utils.print_log("Not found string \"%s\"" % string_to_find, test_utils.Severity.error)

Copilot AI Mar 25, 2026


On timeout, the reader thread is left running (and still reading from proc.stderr). This can leak threads and also consume stderr output needed by later calls (e.g., the second read_asynchronous call), causing nondeterministic failures. Consider avoiding a background thread entirely (use selectors/select to poll proc.stderr with a timeout), or implement a cooperative stop mechanism (daemon thread + nonblocking reads + stop event) so the reader reliably stops on timeout.

Comment on lines +88 to 91
# Helper function for multiprocessing - must be at module level to be picklable
def _wait_for_string(proc, string_to_find):
    """Waits indefinitely until string is found in process. Must be run with timeout multiprocess.


Copilot AI Mar 25, 2026


The comment/docstring still refers to multiprocessing/pickling ("Helper function for multiprocessing" / "timeout multiprocess"), but the implementation now uses threading.Thread. Please update the comment/docstring to match the current behavior to avoid confusion for future maintainers.

Suggested change
-# Helper function for multiprocessing - must be at module level to be picklable
-def _wait_for_string(proc, string_to_find):
-    """Waits indefinitely until string is found in process. Must be run with timeout multiprocess.
+# Helper function for asynchronous reading in a separate thread
+def _wait_for_string(proc, string_to_find):
+    """Waits indefinitely until ``string_to_find`` is found in process output.
+    Intended to be run in a separate thread (see ``read_asynchronous``), which
+    enforces a timeout by joining the thread.
build:
runs-on: ubuntu-latest
container:
image: ubuntu:20.04

Copilot AI Mar 25, 2026


Running this job inside ubuntu:20.04 is likely to break the workflow: the image typically doesn't include sudo, and systemctl usage later in the job requires a functioning systemd init (not present by default in GitHub Actions container jobs). Either switch to the existing CI container image used elsewhere in this repo (e.g. ghcr.io/rdkcentral/docker-rdk-ci:latest) or adjust the steps to run without sudo and without systemctl (or explicitly provision/start systemd in the container).

Suggested change
-image: ubuntu:20.04
+image: ghcr.io/rdkcentral/docker-rdk-ci:latest

Copilot AI review requested due to automatic review settings March 25, 2026 07:23

Copilot AI left a comment


Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.



Comment on lines +101 to +106
    while True:
        # notice that all data are in stderr not in stdout, this is DobbyDaemon design
        output = proc.stderr.readline()
        if string_to_find in output:
            test_utils.print_log("Found string \"%s\"" % string_to_find, test_utils.Severity.debug)
            return

Copilot AI Mar 25, 2026


_wait_for_string loops forever on proc.stderr.readline(). If the process exits or the pipe closes, readline() returns an empty string and this becomes a tight infinite loop. Handle EOF (empty string) by returning/breaking (and ideally log that the process ended) to avoid spinning and leaking the reader thread.

Comment on lines +88 to +111
-    # as this function should not be used outside asynchronous read, it is moved inside it
-    def wait_for_string(proc, string_to_find):
-        """Waits indefinitely until string is found in process. Must be run with timeout multiprocess.
+# Helper function for multiprocessing - must be at module level to be picklable
+def _wait_for_string(proc, string_to_find):
+    """Waits indefinitely until string is found in process. Must be run with timeout multiprocess.

     Parameters:
     proc (process): process in which we want to read
     string_to_find (string): what we want to find in process

     Returns:
     None: Returns nothing if found, never ends if not found
     """
     while True:
         # notice that all data are in stderr not in stdout, this is DobbyDaemon design
         output = proc.stderr.readline()
         if string_to_find in output:
             test_utils.print_log("Found string \"%s\"" % string_to_find, test_utils.Severity.debug)
             return

 # we need to do this asynchronous as if there is no such string we would end in endless loop
 def read_asynchronous(proc, string_to_find, timeout):
     """Reads asynchronous from process. Ends when found string or timeout occurred.

     Parameters:
     proc (process): process in which we want to read
     string_to_find (string): what we want to find in process
     timeout (float): how long we should wait if string not found (seconds)

     Returns:
     found (bool): True if found string_to_find inside proc.
     """

Copilot AI Mar 25, 2026


The comments/docstrings still reference multiprocessing/pickling ("Helper function for multiprocessing", "timeout multiprocess"), but the implementation now uses threading.Thread. Please update the documentation to match the current behavior so future readers don’t try to “fix” it back to multiprocessing or misunderstand the constraints.

Copilot AI review requested due to automatic review settings March 25, 2026 07:43

Copilot AI left a comment


Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (4)

tests/L2_testing/test_runner/basic_sanity_tests.py:134

  • read_asynchronous() starts a thread that can remain alive after the timeout, but there is no way to stop it (the previous multiprocessing version terminated the child process). This can leak a background reader and also consume proc.stderr so later reads/tests may miss output. Refactor to avoid an unkillable thread (e.g., use select/poll in the calling thread with a timeout loop, or add a stop Event + close/stop the stream and ensure only one reader exists).
    found = False
    reader = threading.Thread(target=_wait_for_string, args=(proc, string_to_find))
    test_utils.print_log("Starting multithread read", test_utils.Severity.debug)
    reader.start()
    reader.join(timeout)
    # if thread still running
    if reader.is_alive():
        test_utils.print_log("Reader still exists, closing", test_utils.Severity.debug)
        test_utils.print_log("Not found string \"%s\"" % string_to_find, test_utils.Severity.error)
    else:
        found = True
    return found

.github/workflows/l2-tests.yml:12

  • This workflow now runs inside a plain ubuntu:20.04 container, but later steps invoke systemctl (e.g., enabling smcroute). In a standard Docker container, systemd is not PID 1, so systemctl typically fails even if systemd is installed. Either switch to a CI image that runs systemd (or explicitly start systemd as init) or avoid systemctl and start/configure required services another way.

.github/workflows/l2-tests.yml:12

  • Using container.options: --privileged grants the job container broad host access and can be disallowed in some CI environments. If privileged is only needed for specific capabilities (e.g., cgroup access), prefer adding only the minimal required flags/capabilities to reduce security and reliability risk.

.github/workflows/l2-tests.yml:72

  • The safe.directory entries are hard-coded to /__w/Dobby/Dobby/Dobby..., which is brittle if the repo is checked out under a different path (forks, renamed repo, or different path: setting). Use ${GITHUB_WORKSPACE} (and/or pwd) to build these paths dynamically.


Comment on lines +88 to 91
# Helper function for multiprocessing - must be at module level to be picklable
def _wait_for_string(proc, string_to_find):
    """Waits indefinitely until string is found in process. Must be run with timeout multiprocess.


Copilot AI Mar 25, 2026


The helper comment/docstring still references multiprocessing/pickling (and says it must be run with a timeout multiprocess), but the implementation now uses threading.Thread. Update the comment/docstring to match the actual approach so future maintainers don't assume this is safe/usable with multiprocessing.

Copilot AI review requested due to automatic review settings March 25, 2026 08:49

Copilot AI left a comment


Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated no new comments.



Copilot AI review requested due to automatic review settings March 27, 2026 08:40

Copilot AI left a comment


Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 4 comments.



Comment on lines +475 to +488
// The config path may not be accessible from the current namespace.
// Try using the bundle path from the container state as a fallback.
if (state->bundle != nullptr)
{
std::string fallbackPath = std::string(state->bundle) + "/config.json";
AI_LOG_INFO("Config not found at '%s', trying bundle path '%s'",
gConfigPath.c_str(), fallbackPath.c_str());
absPath = realpath(fallbackPath.c_str(), NULL);
}

if (absPath == nullptr)
{
AI_LOG_ERROR("Couldn't find config at %s", gConfigPath.c_str());
return EXIT_FAILURE;

Copilot AI Mar 27, 2026


If both gConfigPath and the bundle-derived fallback path fail to resolve, the error log only mentions gConfigPath even though a fallback was attempted. Including the fallback path in the failure log would make diagnosing namespace/path issues much easier.

Comment on lines +755 to +762
// Always bind mount the bundle directory into the container at the same path.
// This is needed because createContainer hooks run in the container's mount
// namespace but need to access the config.json file which exists on the host.
// By mounting the bundle directory, the host path becomes accessible.
if(!addMount(bundlePath, bundlePath, "bind", 0,
{ "rbind", "ro", "nosuid", "nodev" }))
{
AI_LOG_WARN("Failed to add bundle mount for hooks, createContainer may fail");

Copilot AI Mar 27, 2026


Recursively bind-mounting the entire bundle directory into the container exposes the full bundle contents (and the host path) to the container, which is a meaningful expansion of what code inside the container can read. If the goal is only to let hooks access config.json, consider mounting just that file (or a minimal directory) and document/justify the security implications of exposing the bundle.

Suggested change
-// Always bind mount the bundle directory into the container at the same path.
-// This is needed because createContainer hooks run in the container's mount
-// namespace but need to access the config.json file which exists on the host.
-// By mounting the bundle directory, the host path becomes accessible.
-if(!addMount(bundlePath, bundlePath, "bind", 0,
-             { "rbind", "ro", "nosuid", "nodev" }))
-{
-    AI_LOG_WARN("Failed to add bundle mount for hooks, createContainer may fail");
+// Bind-mount only the config.json file into the container at the same path.
+// This is needed because createContainer hooks run in the container's mount
+// namespace but need to access the config.json file which exists on the host.
+// By mounting just the config.json file, the required host path becomes
+// accessible without exposing the entire bundle directory.
+const std::string hostConfigPath = bundlePath + "/config.json";
+if (!addMount(hostConfigPath, hostConfigPath, "bind", 0,
+              { "bind", "ro", "nosuid", "nodev" }))
+{
+    AI_LOG_WARN("Failed to add config.json mount for hooks, createContainer may fail");
Comment on lines 123 to 132
 found = False
-reader = multiprocessing.Process(target=wait_for_string, args=(proc, string_to_find), kwargs={})
+reader = threading.Thread(target=_wait_for_string, args=(proc, string_to_find))
 test_utils.print_log("Starting multithread read", test_utils.Severity.debug)
 reader.start()
 reader.join(timeout)
 # if thread still running
 if reader.is_alive():
     test_utils.print_log("Reader still exists, closing", test_utils.Severity.debug)
-    reader.terminate()
     test_utils.print_log("Not found string \"%s\"" % string_to_find, test_utils.Severity.error)
 else:

Copilot AI Mar 27, 2026


read_asynchronous leaves a non-daemon reader thread running after a timeout, with no stop mechanism. This can keep the test runner from exiting and can interfere with later reads from the same pipe. Consider using a threading.Event to request shutdown and reader.daemon = True (or an alternative non-threaded timeout mechanism).

Comment on lines +68 to +73
    # Keep bundle around until container has fully exited
    # This is needed because Dobby uses the bundle in-place and hooks need access to config.json
    if launch_result:
        return validate_output_file(container_id, expected_output)

    return False, "Container did not launch successfully"

Copilot AI Mar 27, 2026


The new comment says the bundle is kept around "until container has fully exited", but launch_container can return success even after a 5s timeout without confirming the container has exited. Either make launch_container reliably wait for exit/failure, or adjust the comment (and possibly the logic) so it matches actual behavior.

Copilot AI review requested due to automatic review settings March 27, 2026 11:58

Copilot AI left a comment


Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (1)

tests/L2_testing/test_runner/basic_sanity_tests.py:134

  • read_asynchronous leaves the reader thread running after a timeout. Since the thread keeps blocking on proc.stderr.readline(), it can (1) prevent the Python process from exiting (non-daemon thread) and (2) race with subsequent read_asynchronous calls by consuming stderr output meant for later checks. Consider replacing this with a time-bounded read loop (e.g., select/poll on the fd) or add a stop mechanism (Event + non-blocking reads) and ensure the thread is always stopped/cleaned up before returning.
    found = False
    reader = threading.Thread(target=_wait_for_string, args=(proc, string_to_find))
    test_utils.print_log("Starting multithread read", test_utils.Severity.debug)
    reader.start()
    reader.join(timeout)
    # if thread still running
    if reader.is_alive():
        test_utils.print_log("Reader still exists, closing", test_utils.Severity.debug)
        test_utils.print_log("Not found string \"%s\"" % string_to_find, test_utils.Severity.error)
    else:
        found = True
    return found


Comment on lines 697 to 705
 #ifdef USE_STARTCONTAINER_HOOK
-    // startContainer hook paths must resolve in the container namespace,
-    // config is in container rootdir
-    configPath = "/tmp/config.json";
+    // startContainer hook runs in container namespace after pivot_root,
+    // use the path where config was mounted for startContainer
+    std::string startContainerConfigPath = "/tmp/config.json";

     // populate startContainer hook with DobbyPluginLauncher args
     rt_defs_hook *startContainerEntry = (rt_defs_hook*)calloc(1, sizeof(rt_defs_hook));
-    setPluginHookEntry(startContainerEntry, "startContainer", configPath);
+    setPluginHookEntry(startContainerEntry, "startContainer", startContainerConfigPath);
     cfg->hooks->start_container = (rt_defs_hook**)realloc(cfg->hooks->start_container, sizeof(rt_defs_hook*) * ++cfg->hooks->start_container_len);

Copilot AI Mar 27, 2026


addPluginLauncherHooks configures the startContainer hook to use /tmp/config.json, but in updateBundleConfig the bind-mount of bundlePath + "/config.json" to /tmp/config.json appears to have been removed. Unless /tmp/config.json is created elsewhere, the startContainer hook will fail to resolve the config path. Either reinstate the /tmp/config.json mount under USE_STARTCONTAINER_HOOK or update the hook’s -c path to match the newly mounted /tmp/dobby_config.json (or another guaranteed path).

              { "bind", "ro", "nosuid", "nodev" }))
 {
-    return false;
+    AI_LOG_WARN("Failed to add config mount for hooks, createContainer may fail");

Copilot AI Mar 27, 2026


When the bind-mount of the host config.json into the container fails, the code now only logs a warning and continues. Since the createContainer hook’s -c path depends on this mount, continuing can lead to hard-to-debug hook failures later. Consider treating this as a hard failure (return false) or, if it’s intentionally best-effort, also disable/omit the createContainer hook when the mount can’t be added.

Suggested change
-AI_LOG_WARN("Failed to add config mount for hooks, createContainer may fail");
+AI_LOG_ERROR("Failed to add config mount for hooks, aborting bundle config update");
+return false;

Copilot AI review requested due to automatic review settings March 30, 2026 06:29

Copilot AI left a comment


Pull request overview

Copilot reviewed 17 out of 17 changed files in this pull request and generated 5 comments.

Comments suppressed due to low confidence (1)

tests/L2_testing/test_runner/basic_sanity_tests.py:134

  • read_asynchronous starts a non-daemon thread that may remain blocked in _wait_for_string after a timeout; because it isn't stopped or marked daemon, the test runner can hang on process exit. Consider using a daemon thread, adding a stop Event, and/or closing the underlying pipe when timing out so the reader can exit cleanly.
    found = False
    reader = threading.Thread(target=_wait_for_string, args=(proc, string_to_find))
    test_utils.print_log("Starting multithread read", test_utils.Severity.debug)
    reader.start()
    reader.join(timeout)
    # if thread still running
    if reader.is_alive():
        test_utils.print_log("Reader still exists, closing", test_utils.Severity.debug)
        test_utils.print_log("Not found string \"%s\"" % string_to_find, test_utils.Severity.error)
    else:
        found = True
    return found


Comment on lines +88 to 92
# Helper function for multiprocessing - must be at module level to be picklable
def _wait_for_string(proc, string_to_find):
    """Waits indefinitely until string is found in process. Must be run with timeout multiprocess.

    Parameters:

Copilot AI Mar 30, 2026


The new helper is now used with threading.Thread, but the comment/docstring still refers to "multiprocessing" / "picklable" / "timeout multiprocess", which is misleading. Please update the comment/docstring to match the threading-based implementation (or switch back to multiprocessing with a module-level target if that’s still the intent).

Comment on lines +67 to +81
# Remove swappiness if present (not supported in cgroupv2)
if 'linux' in config and 'resources' in config['linux']:
    resources = config['linux']['resources']
    if 'memory' in resources and 'swappiness' in resources['memory']:
        del resources['memory']['swappiness']
        modified = True
        print_log("Removed swappiness (not supported in cgroupv2)", Severity.debug)

# Update cgroup mount type if needed
if 'mounts' in config:
    for mount in config['mounts']:
        if mount.get('destination') == '/sys/fs/cgroup' and mount.get('type') == 'cgroup':
            # cgroupv2 should use cgroup2 type, but 'cgroup' often works via auto-detect
            # Add bind option to help with compatibility
            pass  # crun handles this automatically

Copilot AI Mar 30, 2026


patch_bundle_for_ci’s docstring claims it removes the user namespace and updates the cgroup mount type for cgroupv2, but the implementation currently only removes linux.resources.memory.swappiness and the mounts loop is a no-op. Either implement the described behavior or adjust the docstring and remove the dead code so the function’s intent is unambiguous.

Suggested change
# Remove user namespace (can cause conflicts in some CI environments)
linux_cfg = config.get('linux')
if isinstance(linux_cfg, dict):
    namespaces = linux_cfg.get('namespaces')
    if isinstance(namespaces, list):
        original_len = len(namespaces)
        linux_cfg['namespaces'] = [
            ns for ns in namespaces
            if not isinstance(ns, dict) or ns.get('type') != 'user'
        ]
        if len(linux_cfg['namespaces']) != original_len:
            modified = True
            print_log("Removed user namespace for cgroupv2 CI environment", Severity.debug)

# Remove swappiness if present (not supported in cgroupv2)
if isinstance(linux_cfg, dict) and 'resources' in linux_cfg:
    resources = linux_cfg['resources']
    if isinstance(resources, dict) and 'memory' in resources:
        memory_cfg = resources['memory']
        if isinstance(memory_cfg, dict) and 'swappiness' in memory_cfg:
            del memory_cfg['swappiness']
            modified = True
            print_log("Removed swappiness (not supported in cgroupv2)", Severity.debug)

# Update cgroup mount type if needed
if 'mounts' in config and isinstance(config['mounts'], list):
    for mount in config['mounts']:
        if (
            isinstance(mount, dict)
            and mount.get('destination') == '/sys/fs/cgroup'
            and mount.get('type') == 'cgroup'
        ):
            mount['type'] = 'cgroup2'
            modified = True
            print_log("Updated /sys/fs/cgroup mount type to cgroup2 for cgroupv2", Severity.debug)

Comment on lines +20 to 25
import os

def is_cgroupv2():
    """Check if the system is using cgroup v2 (unified hierarchy)"""
    return os.path.exists('/sys/fs/cgroup/cgroup.controllers')

Copilot AI Mar 30, 2026

pid_limit_tests.py duplicates the is_cgroupv2() helper that was added to test_utils.py. To avoid drift and keep behavior consistent across tests, please reuse test_utils.is_cgroupv2() here instead of defining a local copy.

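To keep a single source of truth, the local copy could be replaced by an import. A sketch, assuming test_utils is on sys.path as it is for the other L2 test modules (the fallback exists only so the snippet also runs standalone):

```python
# Reuse the shared helper instead of redefining it; if the import is
# unavailable (hypothetical standalone use), fall back to the same check.
try:
    from test_utils import is_cgroupv2
except ImportError:
    import os

    def is_cgroupv2():
        """cgroup.controllers exists only at the root of a v2 hierarchy."""
        return os.path.exists('/sys/fs/cgroup/cgroup.controllers')

print(is_cgroupv2())
```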
Comment on lines +169 to +183
// Check for cgroupv2 unified hierarchy first
bool isCgroupV2 = false;
while ((mnt = getmntent_r(procMounts, &mntBuf, buf, sizeof(buf))) != nullptr)
{
// skip entries that don't have a mountpount, type or options
if (!mnt->mnt_type || !mnt->mnt_dir || !mnt->mnt_opts)
continue;

// skip non-cgroup mounts
if (strcmp(mnt->mnt_type, "cgroup") != 0)
continue;

// check for the cgroup type
for (const std::pair<const std::string, IDobbyEnv::Cgroup> cgroup : cgroupNames)
if (mnt->mnt_type && strcmp(mnt->mnt_type, "cgroup2") == 0)
{
char* mntopt = hasmntopt(mnt, cgroup.first.c_str());
if (!mntopt)
isCgroupV2 = true;
AI_LOG_INFO("detected cgroup v2 unified hierarchy @ '%s'", mnt->mnt_dir);
// For cgroupv2, all controllers are at the same mount point
std::string cgroupPath = mnt->mnt_dir;
for (const auto& cgroup : cgroupNames)
{
mounts[cgroup.second] = cgroupPath;
}
break;
Copilot AI Mar 30, 2026

In the cgroupv2 scan loop, the code no longer checks mnt->mnt_dir / mnt->mnt_opts for null before using mnt->mnt_dir in logging and assignments. This can lead to a crash if those fields are missing. Also, for cgroupv2 the code populates all controller mount points (including GPU/ION) with the unified mount, which makes callers treat unsupported controllers as available (e.g., DobbyStats will start attempting reads under that path). Consider restoring the null checks and only populating entries for controllers that are actually supported under cgroupv2 (or adding explicit cgroupv2 handling in callers).

Comment on lines +470 to +487
bool isCgroupV2 = false;
while ((mnt = getmntent_r(procMounts, &mntBuf, buf, sizeof(buf))) != nullptr)
{
// skip entries that don't have a mount point, type or options
if (!mnt->mnt_type || !mnt->mnt_dir || !mnt->mnt_opts)
continue;

// skip non-cgroup mounts
if (strcmp(mnt->mnt_type, "cgroup") != 0)
// skip non-cgroup mounts (check for both cgroup v1 and v2)
if (strcmp(mnt->mnt_type, "cgroup") != 0 && strcmp(mnt->mnt_type, "cgroup2") != 0)
continue;

// cgroupv2 doesn't support cpu.rt_runtime_us in the same way
if (strcmp(mnt->mnt_type, "cgroup2") == 0)
{
AI_LOG_INFO("cgroup v2 detected, CPU RT runtime defaults to disabled");
isCgroupV2 = true;
break;
}
Copilot AI Mar 30, 2026

isCgroupV2 is set but never used, which will trigger an unused-but-set-variable warning on many toolchains. Since the cgroupv2 early-break already forces the RT runtime values to remain disabled, consider removing isCgroupV2 or using it to make the intent explicit.

Copilot AI review requested due to automatic review settings March 30, 2026 07:08
Copilot AI left a comment

Pull request overview

Copilot reviewed 18 out of 18 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

tests/L2_testing/test_runner/basic_sanity_tests.py:134

  • read_asynchronous now spawns a thread that can outlive the timeout: on timeout you log but never stop the thread, and because it’s non-daemon it can prevent the test runner from exiting. _wait_for_string also loops forever on EOF (when readline() returns an empty string after the process exits), potentially causing a busy loop. Consider using a threading.Event/stop flag + EOF/proc.poll() checks, and mark the thread as daemon or explicitly shut it down/close the pipe on timeout. Also update the helper comment/docstring (it still refers to multiprocessing/pickling).
# Helper function for multiprocessing - must be at module level to be picklable
def _wait_for_string(proc, string_to_find):
    """Waits indefinitely until string is found in process. Must be run with timeout multiprocess.

    Parameters:
    proc (process): process in which we want to read
    string_to_find (string): what we want to find in process

    Returns:
    None: Returns nothing if found, never ends if not found

    """

    while True:
        # notice that all data are in stderr not in stdout, this is DobbyDaemon design
        output = proc.stderr.readline()
        if string_to_find in output:
            test_utils.print_log("Found string \"%s\"" % string_to_find, test_utils.Severity.debug)
            return


# we need to do this asynchronous as if there is no such string we would end in endless loop
def read_asynchronous(proc, string_to_find, timeout):
    """Reads asynchronous from process. Ends when found string or timeout occurred.

    Parameters:
    proc (process): process in which we want to read
    string_to_find (string): what we want to find in process
    timeout (float): how long we should wait if string not found (seconds)

    Returns:
    found (bool): True if found string_to_find inside proc.

    """

    found = False
    reader = threading.Thread(target=_wait_for_string, args=(proc, string_to_find))
    test_utils.print_log("Starting multithread read", test_utils.Severity.debug)
    reader.start()
    reader.join(timeout)
    # if thread still running
    if reader.is_alive():
        test_utils.print_log("Reader still exists, closing", test_utils.Severity.debug)
        test_utils.print_log("Not found string \"%s\"" % string_to_find, test_utils.Severity.error)
    else:
        found = True
    return found
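The lifetime problems described in the suppressed comment (a non-daemon thread outliving the timeout, a busy loop at EOF) can be avoided with a daemon reader thread, an Event for the result, and the empty-string EOF sentinel. A sketch, not the PR's code; it assumes the stream is in text mode:

```python
import threading

def wait_for_string(stream, string_to_find, found_event):
    """Read lines until the string appears or the stream reaches EOF."""
    for line in iter(stream.readline, ''):  # '' signals EOF on a text stream
        if string_to_find in line:
            found_event.set()
            return
    # EOF without a match: return instead of spinning on empty reads

def read_asynchronous(stream, string_to_find, timeout):
    found_event = threading.Event()
    reader = threading.Thread(target=wait_for_string,
                              args=(stream, string_to_find, found_event),
                              daemon=True)  # cannot block interpreter exit
    reader.start()
    reader.join(timeout)
    return found_event.is_set()
```

With a real process this would be called as read_asynchronous(proc.stderr, ...), matching the existing signature apart from taking the stream directly.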


Comment on lines +125 to +128
# cgroupv2 uses 'max' for unlimited
if pid_limit == 'max':
    pid_limit = 'max'

Copilot AI Mar 30, 2026

The if pid_limit == 'max': pid_limit = 'max' block is a no-op. If the intent is to normalize unlimited limits for comparison, either remove this block or translate 'max' into whatever value the tests expect (e.g. keep 'max' in expected_output or map to a consistent sentinel).

Suggested change
# cgroupv2 uses 'max' for unlimited
if pid_limit == 'max':
    pid_limit = 'max'

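If the intent of the removed block was to normalize the two hierarchies' representations of "unlimited" before comparison, a sketch (function name and sentinel are illustrative, not taken from the test):

```python
def normalize_pid_limit(raw):
    """Map cgroup v1/v2 pid-limit strings to one comparable value.

    cgroup v2 writes the literal string 'max' to pids.max for "no limit";
    numeric limits are returned as ints in both hierarchies.
    """
    raw = raw.strip()
    if raw == 'max':
        return None  # sentinel for "unlimited"
    return int(raw)

print(normalize_pid_limit('max'))     # → None
print(normalize_pid_limit(' 128\n'))  # → 128
```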
