Ctest locks for limited parallel execution#3380
Merged
Merged
Conversation
Introduce serialisation for Bash tests that create and delete temporary files, preventing parallel test runs from interfering with one another. Tests now acquire a RESOURCE_LOCK and a filesystem-level lock keyed to their shared test-data root directory, with separate resources for the C++ and Python command test trees. A NO_FILESYSTEM_LOCK opt-out keyword lets developers exempt tests that never write to disk. Following this, every Bash test for the C++ and Python commands was audited for use of the rm command or the -force option, and the 217 tests using neither were annotated with NO_FILESYSTEM_LOCK in their CMakeLists.txt entries. Session prompts: 1. > Many tests of C++ and Python commands involve the erase or > overwriting of temporary files and directories that commence with > the name "tmp*". Such filesystem contents are deleted before the > execution of each Bash test. If however multiple such tests are > executed in parallel, whether within a single ctest invocation or > in multiple invocations across instances, these tests will > interfere with one another. > 1. Set property RESOURCE_LOCK for all Bash tests. There should be > one resource for each unique directory from which such tests > can be executed: one for C++ commands and one for Python > commands (as they execute from different test data root > directories and hence their temporary filesystem contents do > not clash); investigate whether another should be added for UI > or unit tests. > 2. Over and above this, add to the machinery that executes each > test a filesystem-level lock that precludes the execution of > multiple tests from the same root directory. > 3. Add an option for add_bash_test that allows one on a per-test > basis to explicitly bypass these mechanisms if the developer > explicitly flags that that particular test does not write to > the filesystem. 2. > Perform an audit of all bash test files responsible for > evaluating C++ and Python command files. For each test, check for > the presence of the use of unix command "rm", or use of the > MRtrix3 command-line option "-force". If neither is found, then > attach the property NO_FILESYSTEM_LOCK to the corresponding test > in the relevant CMakeLists.txt file. Generated-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Audited every Bash test of the Python commands for the pattern where the command under test is run solely to write a single output image that is then fed to a single evaluation command. In each such case the intermediate file was eliminated by piping the command output directly into the testing command via the "-" sentinel, allowing the test to be annotated NO_FILESYSTEM_LOCK in testing/scripts/CMakeLists.txt since it no longer touches disk. Twenty-four tests across 5ttgen, dwi2mask, dwinormalise, labelsgmfirst and mask2glass were converted. The dedicated 5ttgen/hsvs_piping test was then removed as redundant, as the transformed hsvs_default now exercises the identical output-piping pipeline and hsvs has no pipeable image input. Session prompts: 1. > Perform an audit of all Bash tests of Python commands. In all circumstances > where the command is simply run to generate a singular output image file, > and that file is then passed as input to a singular test evaluation command, > replace the saving and loading of that image file with the MRtrix3 temporary > image sentinel "-" and join the two commands with a Unix pipe, and then set > property NO_FILESYSTEM_LOCK for that test in testing/scripts/CMakeLists.txt. 2. > Check all Python command Bash tests for redundancy with respect to > command-line piping. If the modifications from the prior prompt, leading to > the use of piping of an output image to a testing command in order to no > longer require a filesystem lock, results in redundancy between a test whose > primary purpose is to ensure correct operation of image piping for that > particular command, then the test dedicated to evaluation of command-line > piping should be removed. Generated-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
clang-tidy review says "All clean, LGTM! 👍" |
Conflicts: testing/binaries/CMakeLists.txt
|
clang-tidy review says "All clean, LGTM! 👍" |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Coding AI models are not terribly good at obeying guard rails. No matter how stringently I told Claude to not run tests in parallel, it kept trying: first via
-j, then by having different sub-agents running different subsets of tests. Context cues ultimately only condition the model context, they are not boolean. So I needed to codify this constraint instead.ctestdoes give the ability to nominate which tests can be run independently in parallel vs. not. In our case we have both: some tests will write to the filesystem using "tmp*" filesystem paths, others only write toTmpFileDirusing a randomised path in order to pipe image data from one command to the next. Additionally, C++ and Python tests can run in parallel relative to one another since they use different test data repositories.One resource lock per testing root directory would be adequate to modulate the behaviour of a single
ctestinvocation. However as mentioned, Claude tried executingctestitself multiple times in parallel. So I further added a filesystem lock that should catch these.