Skip to content

Support HDF5/MPI/MPIIO#340

Open
izzet wants to merge 7 commits intollnl:developfrom
izzet:feat/support-hdf5-mpi
Open

Support HDF5/MPI/MPIIO#340
izzet wants to merge 7 commits intollnl:developfrom
izzet:feat/support-hdf5-mpi

Conversation

@izzet
Copy link
Copy Markdown
Collaborator

@izzet izzet commented Feb 24, 2026

This pull request introduces support for optional HDF5 tracing in the build system, improves integration with the Brahma dependency, and enhances modularity for MPI, HDF5, and interface generation. The changes primarily affect the CMake build configuration, dependency management, and conditional compilation in the core codebase.

Build System and Dependency Management Improvements:

  • HDF5 Tracing Support:

    • Added a new CMake option DFTRACER_ENABLE_HDF5 to enable or disable HDF5 tracing. When enabled, the build system finds and links against HDF5, adds the relevant source and header files, and sets the appropriate preprocessor flags. [1] [2] [3] [4] [5] [6]
  • Brahma Integration Enhancements:

    • The Brahma subproject is now built with MPI and/or HDF5 support depending on DFTracer options, and supports interface generation if requested. The Brahma git repository URL and branch were also updated.
    • External project installation now explicitly passes the C and C++ compilers to ensure consistent builds.

Conditional Compilation and Source Inclusion:

  • Core Source and Header Inclusion:

    • The build system conditionally includes MPI, MPIIO, and HDF5 source and header files only if the corresponding features are enabled, improving modularity and reducing unnecessary dependencies. [1] [2]
  • Interface Generation:

    • Added a new CMake option DFTRACER_GENERATE_INTERFACES to control whether Brahma and DFTracer interfaces are generated from discovered headers, and set up build dependencies accordingly. [1] [2]

Core Codebase Updates:

  • Conditional Initialization and Finalization:

    • The core logic for initializing and finalizing tracing now conditionally manages MPI, MPIIO, and HDF5 tracer instances based on build-time flags and Brahma capabilities, ensuring correct resource management. [1] [2] [3]
  • MPI Handling Improvements:

    • Improved detection and handling of MPI implementations (e.g., using PMPI_Initialized for certain MPI variants) for robust integration.

Miscellaneous Build Improvements:

  • Added support for extra MPI launcher flags for testing, and improved configuration summary output for new options. [1] [2]
  • Added the tools subdirectory to the build when dependencies are enabled.

These changes collectively make the build and configuration process more flexible, modular, and maintainable, especially for environments requiring optional tracing backends.

References:
[1] [2] [3] [4] [5]

@izzet izzet self-assigned this Feb 24, 2026
Copilot AI review requested due to automatic review settings February 24, 2026 01:07
@izzet izzet added the enhancement New feature or request label Feb 24, 2026
@izzet
Copy link
Copy Markdown
Collaborator Author

izzet commented Feb 24, 2026

@hariharan-devarajan this requires Brahma PR (hariharan-devarajan/brahma#49) to be merged and published so we can use a stable Brahma version instead of my fork branch.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds optional MPI/MPI-IO and HDF5 tracing support, including an interface-generation pipeline that produces DFTracer Brahma stubs from Brahma’s generated headers, and expands the test suite to cover MPI/HDF5 interception paths.

Changes:

  • Introduces DFTRACER_ENABLE_HDF5 and DFTRACER_GENERATE_INTERFACES (plus build/test wiring) to conditionally build/link HDF5 + generated interfaces.
  • Adds a Python+clang based generator (CMake target) to emit src/dftracer/core/brahma/{mpi,mpiio,hdf5}.* stubs.
  • Adds MPI/HDF5 smoke tests (C and C++) and improves test runtime linker env setup.

Reviewed changes

Copilot reviewed 20 out of 25 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
tools/generate_interfaces_from_brahma.py Adds generator that parses Brahma headers via libclang and emits DFTracer stubs.
tools/CMakeLists.txt Adds CMake custom command/target to run the generator when enabled.
dependency/CMakeLists.txt Builds Brahma with MPI/HDF5/interface-generation options aligned with DFTracer toggles.
cmake/modules/dftracer-utils.cmake Ensures ExternalProject builds use consistent C/C++ compilers.
CMakeLists.txt Adds HDF5 option, interface-generation option, conditional sources/includes, and summary updates.
cmake/configure_files/dftracer_config.hpp.in Exposes DFTRACER_HDF5_ENABLE compile-time macro.
cmake/configure_files/dftracer_config_dbg.hpp.in Exposes DFTRACER_HDF5_ENABLE compile-time macro (debug config).
src/dftracer/core/df_logger.h Adjusts MPI rank discovery to use PMPI for some implementations and guards on Brahma MPI enablement.
src/dftracer/core/common/dftracer_main.h Conditionally includes generated Brahma tracer headers for MPI/MPIIO/HDF5.
src/dftracer/core/common/dftracer_main.cpp Conditionally binds/unbinds MPI/MPIIO/HDF5 tracer instances during init/finalize.
src/dftracer/core/brahma/mpiio.h Adds generated MPI-IO tracer stub header.
src/dftracer/core/brahma/mpiio.cpp Adds generated MPI-IO tracer stub implementation.
test/CMakeLists.txt Adds MPI/HDF5 (incl. parallel HDF5) tests and improves LD_LIBRARY_PATH composition for tests.
test/c/test_mpi.c Adds C MPI/MPI-IO smoke test binary.
test/c/test_hdf5.c Adds C HDF5 smoke test binary.
test/c/test_hdf5_mpi.c Adds C MPI+HDF5 smoke test binary.
test/c/test_hdf5_parallel.c Adds C parallel-HDF5 smoke test binary (skips if unavailable).
test/cpp/test_mpi.cpp Adds C++ MPI/MPI-IO smoke test binary.
test/cpp/test_hdf5.cpp Adds C++ HDF5 smoke test binary.
test/cpp/test_hdf5_mpi.cpp Adds C++ MPI+HDF5 smoke test binary.
test/cpp/test_hdf5_parallel.cpp Adds C++ parallel-HDF5 smoke test binary (skips if unavailable).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread test/CMakeLists.txt
Comment on lines +420 to +421
target_link_libraries(test_c_mpi ${PROJECT_NAME}_core_dbg ${MPI_CXX_LIBRARIES})
target_include_directories(test_c_mpi PRIVATE ${MPI_CXX_INCLUDE_DIRS})
Copy link

Copilot AI Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The C MPI test target links/uses the C++ MPI variables (MPI_CXX_LIBRARIES / MPI_CXX_INCLUDE_DIRS). For C sources this can break builds on toolchains where MPI_CXX differs from MPI_C or drags in an unnecessary C++ runtime. Prefer linking MPI::MPI_C (or MPI_C_LIBRARIES/MPI_C_INCLUDE_DIRS) for test_c_mpi (and similarly for the other C MPI+HDF5 targets).

Suggested change
target_link_libraries(test_c_mpi ${PROJECT_NAME}_core_dbg ${MPI_CXX_LIBRARIES})
target_include_directories(test_c_mpi PRIVATE ${MPI_CXX_INCLUDE_DIRS})
if(TARGET MPI::MPI_C)
target_link_libraries(test_c_mpi ${PROJECT_NAME}_core_dbg MPI::MPI_C)
else()
target_link_libraries(test_c_mpi ${PROJECT_NAME}_core_dbg ${MPI_C_LIBRARIES})
endif()
target_include_directories(test_c_mpi PRIVATE ${MPI_C_INCLUDE_DIRS})

Copilot uses AI. Check for mistakes.
Comment thread tools/CMakeLists.txt
Comment on lines +70 to +76
set(_generated_dir "${CMAKE_SOURCE_DIR}/src/dftracer/core/brahma")
set(_generated_files "")
foreach(_iface IN LISTS _requested_interfaces)
list(APPEND _generated_files
"${_generated_dir}/${_iface}.h"
"${_generated_dir}/${_iface}.cpp")
endforeach()
Copy link

Copilot AI Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The generator writes build artifacts into the source tree (${CMAKE_SOURCE_DIR}/src/...). This commonly results in a dirty working tree, breaks read-only source checkouts, and makes out-of-source builds less reliable. Consider generating into a build directory (e.g., ${CMAKE_BINARY_DIR}/generated/dftracer/core/brahma) and adding that directory to include paths / target sources, or only generating into the source tree via an explicit developer-only target.

Copilot uses AI. Check for mistakes.
"///\n\n"
f"#ifndef {guard}\n"
f"#define {guard}\n\n"
"#include <brahma/brahma.h>\n"
Copy link

Copilot AI Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The generated headers use std::shared_ptr (and std::make_shared) but the emitted include set doesn't explicitly include <memory>. Relying on transitive includes from brahma/brahma.h makes the generated stubs fragile across Brahma versions/configurations. Update the generator to emit #include <memory> (and any other required STL headers) directly in the generated header.

Suggested change
"#include <brahma/brahma.h>\n"
"#include <brahma/brahma.h>\n"
"#include <memory>\n"

Copilot uses AI. Check for mistakes.
Comment thread dependency/CMakeLists.txt Outdated
Comment on lines +6 to +28
set(BRAHMA_CONFIGURE_ARGS "")

if(DFTRACER_ENABLE_MPI)
string(APPEND BRAHMA_CONFIGURE_ARGS "-DBRAHMA_BUILD_WITH_MPI=ON")
endif()

if(DFTRACER_ENABLE_HDF5)
if(NOT "${BRAHMA_CONFIGURE_ARGS}" STREQUAL "")
string(APPEND BRAHMA_CONFIGURE_ARGS ";")
endif()

string(APPEND BRAHMA_CONFIGURE_ARGS "-DBRAHMA_BUILD_WITH_HDF5=ON")
endif()

if(DFTRACER_GENERATE_INTERFACES)
if(NOT "${BRAHMA_CONFIGURE_ARGS}" STREQUAL "")
string(APPEND BRAHMA_CONFIGURE_ARGS ";")
endif()

string(APPEND BRAHMA_CONFIGURE_ARGS "-DBRAHMA_GENERATE_INTERFACES=ON")
endif()

install_external_project(brahma 3.0.0 "BRAHMA" https://github.com/izzet/brahma.git fix/auto-discovery-generation ${CMAKE_INSTALL_PREFIX} "${BRAHMA_CONFIGURE_ARGS}")
Copy link

Copilot AI Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Building BRAHMA_CONFIGURE_ARGS via manual string(APPEND ...) plus semicolon management is hard to extend and easy to get wrong. Prefer using a proper CMake list (e.g., set(BRAHMA_CONFIGURE_ARGS) + list(APPEND BRAHMA_CONFIGURE_ARGS ...)) and pass it unquoted to install_external_project(...), which will keep argument handling predictable.

Suggested change
set(BRAHMA_CONFIGURE_ARGS "")
if(DFTRACER_ENABLE_MPI)
string(APPEND BRAHMA_CONFIGURE_ARGS "-DBRAHMA_BUILD_WITH_MPI=ON")
endif()
if(DFTRACER_ENABLE_HDF5)
if(NOT "${BRAHMA_CONFIGURE_ARGS}" STREQUAL "")
string(APPEND BRAHMA_CONFIGURE_ARGS ";")
endif()
string(APPEND BRAHMA_CONFIGURE_ARGS "-DBRAHMA_BUILD_WITH_HDF5=ON")
endif()
if(DFTRACER_GENERATE_INTERFACES)
if(NOT "${BRAHMA_CONFIGURE_ARGS}" STREQUAL "")
string(APPEND BRAHMA_CONFIGURE_ARGS ";")
endif()
string(APPEND BRAHMA_CONFIGURE_ARGS "-DBRAHMA_GENERATE_INTERFACES=ON")
endif()
install_external_project(brahma 3.0.0 "BRAHMA" https://github.com/izzet/brahma.git fix/auto-discovery-generation ${CMAKE_INSTALL_PREFIX} "${BRAHMA_CONFIGURE_ARGS}")
set(BRAHMA_CONFIGURE_ARGS)
if(DFTRACER_ENABLE_MPI)
list(APPEND BRAHMA_CONFIGURE_ARGS "-DBRAHMA_BUILD_WITH_MPI=ON")
endif()
if(DFTRACER_ENABLE_HDF5)
list(APPEND BRAHMA_CONFIGURE_ARGS "-DBRAHMA_BUILD_WITH_HDF5=ON")
endif()
if(DFTRACER_GENERATE_INTERFACES)
list(APPEND BRAHMA_CONFIGURE_ARGS "-DBRAHMA_GENERATE_INTERFACES=ON")
endif()
install_external_project(brahma 3.0.0 "BRAHMA" https://github.com/izzet/brahma.git fix/auto-discovery-generation ${CMAKE_INSTALL_PREFIX} ${BRAHMA_CONFIGURE_ARGS})

Copilot uses AI. Check for mistakes.
Comment on lines +142 to +152
#if defined(DFTRACER_MPI_ENABLE) && defined(BRAHMA_ENABLE_MPI)
auto mpi_instance = brahma::MPIDFTracer::get_instance();
if (mpi_instance != nullptr) {
mpi_instance->unbind();
mpi_instance->finalize();
}
auto mpiio_instance = brahma::MPIIODFTracer::get_instance();
if (mpiio_instance != nullptr) {
mpiio_instance->unbind();
mpiio_instance->finalize();
}
Copy link

Copilot AI Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get_instance() appears to construct the tracer singleton when instance == nullptr (as seen in the generated stubs). In finalize(), this means DFTracer can create and then immediately unbind/finalize MPI/MPIIO tracers even if they were never bound/used, and the null checks become ineffective. Consider adding/using a non-constructing accessor (e.g., maybe_instance() / has_instance()), or adjust the singleton API so finalize() can avoid instantiating tracers that were never initialized.

Copilot uses AI. Check for mistakes.
@izzet
Copy link
Copy Markdown
Collaborator Author

izzet commented Feb 24, 2026

@izzet make MetadataType::IGNORE for generated functions.

Add HDF5 (parallel) support to the build system via DFTRACER_ENABLE_HDF5
cmake option, and introduce interface generation capability via
DFTRACER_GENERATE_INTERFACES to automatically generate Brahma/DFTracer
interfaces from discovered MPI and HDF5 headers. Update CI to install
libhdf5-mpich-dev and enable these options in test builds.
- Added clang-15, libclang-15-dev, llvm-15-dev, and python3-clang-15 packages to CI workflow
- Added clang==15.* to pip build dependencies in autobuild.sh
- Required for HDF5 MPI interface generation with Brahma
- Detect system clang version at runtime in CI workflow
- Install matching Python clang bindings automatically
- Remove hardcoded clang-15 version constraints
- Allow building with different clang versions without manual intervention
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants