Conversation
|
@hariharan-devarajan this requires Brahma PR (hariharan-devarajan/brahma#49) to be merged and published so we can use a stable Brahma version instead of my fork branch. |
There was a problem hiding this comment.
Pull request overview
Adds optional MPI/MPI-IO and HDF5 tracing support, including an interface-generation pipeline that produces DFTracer Brahma stubs from Brahma’s generated headers, and expands the test suite to cover MPI/HDF5 interception paths.
Changes:
- Introduces
DFTRACER_ENABLE_HDF5andDFTRACER_GENERATE_INTERFACES(plus build/test wiring) to conditionally build/link HDF5 + generated interfaces. - Adds a Python+clang based generator (CMake target) to emit
src/dftracer/core/brahma/{mpi,mpiio,hdf5}.*stubs. - Adds MPI/HDF5 smoke tests (C and C++) and improves test runtime linker env setup.
Reviewed changes
Copilot reviewed 20 out of 25 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| tools/generate_interfaces_from_brahma.py | Adds generator that parses Brahma headers via libclang and emits DFTracer stubs. |
| tools/CMakeLists.txt | Adds CMake custom command/target to run the generator when enabled. |
| dependency/CMakeLists.txt | Builds Brahma with MPI/HDF5/interface-generation options aligned with DFTracer toggles. |
| cmake/modules/dftracer-utils.cmake | Ensures ExternalProject builds use consistent C/C++ compilers. |
| CMakeLists.txt | Adds HDF5 option, interface-generation option, conditional sources/includes, and summary updates. |
| cmake/configure_files/dftracer_config.hpp.in | Exposes DFTRACER_HDF5_ENABLE compile-time macro. |
| cmake/configure_files/dftracer_config_dbg.hpp.in | Exposes DFTRACER_HDF5_ENABLE compile-time macro (debug config). |
| src/dftracer/core/df_logger.h | Adjusts MPI rank discovery to use PMPI for some implementations and guards on Brahma MPI enablement. |
| src/dftracer/core/common/dftracer_main.h | Conditionally includes generated Brahma tracer headers for MPI/MPIIO/HDF5. |
| src/dftracer/core/common/dftracer_main.cpp | Conditionally binds/unbinds MPI/MPIIO/HDF5 tracer instances during init/finalize. |
| src/dftracer/core/brahma/mpiio.h | Adds generated MPI-IO tracer stub header. |
| src/dftracer/core/brahma/mpiio.cpp | Adds generated MPI-IO tracer stub implementation. |
| test/CMakeLists.txt | Adds MPI/HDF5 (incl. parallel HDF5) tests and improves LD_LIBRARY_PATH composition for tests. |
| test/c/test_mpi.c | Adds C MPI/MPI-IO smoke test binary. |
| test/c/test_hdf5.c | Adds C HDF5 smoke test binary. |
| test/c/test_hdf5_mpi.c | Adds C MPI+HDF5 smoke test binary. |
| test/c/test_hdf5_parallel.c | Adds C parallel-HDF5 smoke test binary (skips if unavailable). |
| test/cpp/test_mpi.cpp | Adds C++ MPI/MPI-IO smoke test binary. |
| test/cpp/test_hdf5.cpp | Adds C++ HDF5 smoke test binary. |
| test/cpp/test_hdf5_mpi.cpp | Adds C++ MPI+HDF5 smoke test binary. |
| test/cpp/test_hdf5_parallel.cpp | Adds C++ parallel-HDF5 smoke test binary (skips if unavailable). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| target_link_libraries(test_c_mpi ${PROJECT_NAME}_core_dbg ${MPI_CXX_LIBRARIES}) | ||
| target_include_directories(test_c_mpi PRIVATE ${MPI_CXX_INCLUDE_DIRS}) |
There was a problem hiding this comment.
The C MPI test target links/uses the C++ MPI variables (MPI_CXX_LIBRARIES / MPI_CXX_INCLUDE_DIRS). For C sources this can break builds on toolchains where MPI_CXX differs from MPI_C or drags in an unnecessary C++ runtime. Prefer linking MPI::MPI_C (or MPI_C_LIBRARIES/MPI_C_INCLUDE_DIRS) for test_c_mpi (and similarly for the other C MPI+HDF5 targets).
| target_link_libraries(test_c_mpi ${PROJECT_NAME}_core_dbg ${MPI_CXX_LIBRARIES}) | |
| target_include_directories(test_c_mpi PRIVATE ${MPI_CXX_INCLUDE_DIRS}) | |
| if(TARGET MPI::MPI_C) | |
| target_link_libraries(test_c_mpi ${PROJECT_NAME}_core_dbg MPI::MPI_C) | |
| else() | |
| target_link_libraries(test_c_mpi ${PROJECT_NAME}_core_dbg ${MPI_C_LIBRARIES}) | |
| endif() | |
| target_include_directories(test_c_mpi PRIVATE ${MPI_C_INCLUDE_DIRS}) |
| set(_generated_dir "${CMAKE_SOURCE_DIR}/src/dftracer/core/brahma") | ||
| set(_generated_files "") | ||
| foreach(_iface IN LISTS _requested_interfaces) | ||
| list(APPEND _generated_files | ||
| "${_generated_dir}/${_iface}.h" | ||
| "${_generated_dir}/${_iface}.cpp") | ||
| endforeach() |
There was a problem hiding this comment.
The generator writes build artifacts into the source tree (${CMAKE_SOURCE_DIR}/src/...). This commonly results in a dirty working tree, breaks read-only source checkouts, and makes out-of-source builds less reliable. Consider generating into a build directory (e.g., ${CMAKE_BINARY_DIR}/generated/dftracer/core/brahma) and adding that directory to include paths / target sources, or only generating into the source tree via an explicit developer-only target.
| "///\n\n" | ||
| f"#ifndef {guard}\n" | ||
| f"#define {guard}\n\n" | ||
| "#include <brahma/brahma.h>\n" |
There was a problem hiding this comment.
The generated headers use std::shared_ptr (and std::make_shared) but the emitted include set doesn't explicitly include <memory>. Relying on transitive includes from brahma/brahma.h makes the generated stubs fragile across Brahma versions/configurations. Update the generator to emit #include <memory> (and any other required STL headers) directly in the generated header.
| "#include <brahma/brahma.h>\n" | |
| "#include <brahma/brahma.h>\n" | |
| "#include <memory>\n" |
| set(BRAHMA_CONFIGURE_ARGS "") | ||
|
|
||
| if(DFTRACER_ENABLE_MPI) | ||
| string(APPEND BRAHMA_CONFIGURE_ARGS "-DBRAHMA_BUILD_WITH_MPI=ON") | ||
| endif() | ||
|
|
||
| if(DFTRACER_ENABLE_HDF5) | ||
| if(NOT "${BRAHMA_CONFIGURE_ARGS}" STREQUAL "") | ||
| string(APPEND BRAHMA_CONFIGURE_ARGS ";") | ||
| endif() | ||
|
|
||
| string(APPEND BRAHMA_CONFIGURE_ARGS "-DBRAHMA_BUILD_WITH_HDF5=ON") | ||
| endif() | ||
|
|
||
| if(DFTRACER_GENERATE_INTERFACES) | ||
| if(NOT "${BRAHMA_CONFIGURE_ARGS}" STREQUAL "") | ||
| string(APPEND BRAHMA_CONFIGURE_ARGS ";") | ||
| endif() | ||
|
|
||
| string(APPEND BRAHMA_CONFIGURE_ARGS "-DBRAHMA_GENERATE_INTERFACES=ON") | ||
| endif() | ||
|
|
||
| install_external_project(brahma 3.0.0 "BRAHMA" https://github.com/izzet/brahma.git fix/auto-discovery-generation ${CMAKE_INSTALL_PREFIX} "${BRAHMA_CONFIGURE_ARGS}") |
There was a problem hiding this comment.
Building BRAHMA_CONFIGURE_ARGS via manual string(APPEND ...) plus semicolon management is hard to extend and easy to get wrong. Prefer using a proper CMake list (e.g., set(BRAHMA_CONFIGURE_ARGS) + list(APPEND BRAHMA_CONFIGURE_ARGS ...)) and pass it unquoted to install_external_project(...), which will keep argument handling predictable.
| set(BRAHMA_CONFIGURE_ARGS "") | |
| if(DFTRACER_ENABLE_MPI) | |
| string(APPEND BRAHMA_CONFIGURE_ARGS "-DBRAHMA_BUILD_WITH_MPI=ON") | |
| endif() | |
| if(DFTRACER_ENABLE_HDF5) | |
| if(NOT "${BRAHMA_CONFIGURE_ARGS}" STREQUAL "") | |
| string(APPEND BRAHMA_CONFIGURE_ARGS ";") | |
| endif() | |
| string(APPEND BRAHMA_CONFIGURE_ARGS "-DBRAHMA_BUILD_WITH_HDF5=ON") | |
| endif() | |
| if(DFTRACER_GENERATE_INTERFACES) | |
| if(NOT "${BRAHMA_CONFIGURE_ARGS}" STREQUAL "") | |
| string(APPEND BRAHMA_CONFIGURE_ARGS ";") | |
| endif() | |
| string(APPEND BRAHMA_CONFIGURE_ARGS "-DBRAHMA_GENERATE_INTERFACES=ON") | |
| endif() | |
| install_external_project(brahma 3.0.0 "BRAHMA" https://github.com/izzet/brahma.git fix/auto-discovery-generation ${CMAKE_INSTALL_PREFIX} "${BRAHMA_CONFIGURE_ARGS}") | |
| set(BRAHMA_CONFIGURE_ARGS) | |
| if(DFTRACER_ENABLE_MPI) | |
| list(APPEND BRAHMA_CONFIGURE_ARGS "-DBRAHMA_BUILD_WITH_MPI=ON") | |
| endif() | |
| if(DFTRACER_ENABLE_HDF5) | |
| list(APPEND BRAHMA_CONFIGURE_ARGS "-DBRAHMA_BUILD_WITH_HDF5=ON") | |
| endif() | |
| if(DFTRACER_GENERATE_INTERFACES) | |
| list(APPEND BRAHMA_CONFIGURE_ARGS "-DBRAHMA_GENERATE_INTERFACES=ON") | |
| endif() | |
| install_external_project(brahma 3.0.0 "BRAHMA" https://github.com/izzet/brahma.git fix/auto-discovery-generation ${CMAKE_INSTALL_PREFIX} ${BRAHMA_CONFIGURE_ARGS}) |
| #if defined(DFTRACER_MPI_ENABLE) && defined(BRAHMA_ENABLE_MPI) | ||
| auto mpi_instance = brahma::MPIDFTracer::get_instance(); | ||
| if (mpi_instance != nullptr) { | ||
| mpi_instance->unbind(); | ||
| mpi_instance->finalize(); | ||
| } | ||
| auto mpiio_instance = brahma::MPIIODFTracer::get_instance(); | ||
| if (mpiio_instance != nullptr) { | ||
| mpiio_instance->unbind(); | ||
| mpiio_instance->finalize(); | ||
| } |
There was a problem hiding this comment.
get_instance() appears to construct the tracer singleton when instance == nullptr (as seen in the generated stubs). In finalize(), this means DFTracer can create and then immediately unbind/finalize MPI/MPIIO tracers even if they were never bound/used, and the null checks become ineffective. Consider adding/using a non-constructing accessor (e.g., maybe_instance() / has_instance()), or adjust the singleton API so finalize() can avoid instantiating tracers that were never initialized.
|
@izzet make |
Add HDF5 (parallel) support to the build system via DFTRACER_ENABLE_HDF5 cmake option, and introduce interface generation capability via DFTRACER_GENERATE_INTERFACES to automatically generate Brahma/DFTracer interfaces from discovered MPI and HDF5 headers. Update CI to install libhdf5-mpich-dev and enable these options in test builds.
- Added clang-15, libclang-15-dev, llvm-15-dev, and python3-clang-15 packages to CI workflow - Added clang==15.* to pip build dependencies in autobuild.sh - Required for HDF5 MPI interface generation with Brahma
- Detect system clang version at runtime in CI workflow - Install matching Python clang bindings automatically - Remove hardcoded clang-15 version constraints - Allow building with different clang versions without manual intervention
This pull request introduces support for optional HDF5 tracing in the build system, improves integration with the Brahma dependency, and enhances modularity for MPI, HDF5, and interface generation. The changes primarily affect the CMake build configuration, dependency management, and conditional compilation in the core codebase.
Build System and Dependency Management Improvements:
HDF5 Tracing Support:
DFTRACER_ENABLE_HDF5to enable or disable HDF5 tracing. When enabled, the build system finds and links against HDF5, adds the relevant source and header files, and sets the appropriate preprocessor flags. [1] [2] [3] [4] [5] [6]Brahma Integration Enhancements:
Conditional Compilation and Source Inclusion:
Core Source and Header Inclusion:
Interface Generation:
DFTRACER_GENERATE_INTERFACESto control whether Brahma and DFTracer interfaces are generated from discovered headers, and set up build dependencies accordingly. [1] [2]Core Codebase Updates:
Conditional Initialization and Finalization:
MPI Handling Improvements:
PMPI_Initializedfor certain MPI variants) for robust integration.Miscellaneous Build Improvements:
toolssubdirectory to the build when dependencies are enabled.These changes collectively make the build and configuration process more flexible, modular, and maintainable, especially for environments requiring optional tracing backends.
References:
[1] [2] [3] [4] [5]