Configuration

SOL ExecBench ROCm Port is configured through CLI flags, optional benchmark configuration JSON, pyproject.toml, Docker target metadata, Docker wrapper inputs, and runtime environment variables. The repository does not contain a required application-level .env file.

How To Use This Reference

Most users only need:

--config, --compile-timeout, --timeout, --output, --profile, and --static-evidence for single benchmark runs.
scripts/run_dataset.py flags for dataset batches and derived reports.
./scripts/run_docker.sh --target <id> when selecting a ROCm container.

The environment-variable table below is intentionally complete. Many variables are diagnostic or CI-style overrides used by tests, Docker preflight checks, or evidence sidecars. Treat them as reference material unless a command or failure message points you to one of them.

Environment Variables

No environment variable is required for normal host CLI startup. The variables below are optional runtime, Docker, diagnostic, or build inputs discovered in src/, scripts/, and docker/.

Variable	Required	Default	Description
`PYTORCH_ALLOC_CONF`	Optional	`expandable_segments:True` in compile/eval subprocesses	Set by the CLI subprocess launcher for staged PyTorch ROCm compilation and evaluation.
`PYTORCH_ROCM_ARCH`	Optional	Derived from solution `target_hardware` when unset	Overrides the ROCm architecture list used by PyTorch extension builds.
`SOL_EXECBENCH_ALLOW_CPU_TIMING`	Optional	Unset	Test/debug escape hatch that lets evaluator timing proceed on CPU tensors when set to `1`; not a GPU validation setting.
`SOL_EXECBENCH_GRACEFUL_EXIT`	Optional	Unset	When set to `1`, the eval driver uses `sys.exit(0)` instead of `os._exit(0)` at process end. Automatically injected by the profiler adapter during profiled evaluations.
`SOLEXECBENCH_ENV_SNAPSHOT`	Optional	Unset	Set to `1` to write an environment snapshot sidecar next to `--output`.
`SOLEXECBENCH_ENV_SNAPSHOT_PATH`	Optional	Unset	Explicit environment snapshot sidecar output path.
`HIP_VISIBLE_DEVICES`	Optional	Unset	Device visibility filter recorded in environment and runtime evidence.
`ROCR_VISIBLE_DEVICES`	Optional	Unset	ROCr device visibility filter recorded in environment and runtime evidence.
`HSA_OVERRIDE_GFX_VERSION`	Optional	Unset	Forced HSA architecture override recorded in environment snapshots.
`CUDA_VISIBLE_DEVICES`	Optional	Unset	Compatibility visibility variable recorded by Docker runtime evidence when present.
`GPU_DEVICE_ORDINAL`	Optional	Unset	Compatibility visibility variable recorded by Docker runtime evidence when present.
`SOL_EXECBENCH_CLOCKS_LOCKED`	Optional	`0` when unset	Set by `docker/entrypoint.sh` after clock-lock attempts; read by clock-lock checks.
`FLASHINFER_TRACE_DIR`	Optional	`/sol-execbench/data/flashinfer-trace` under `scripts/run_docker.sh`	Adds the FlashInfer trace safetensors lookup root for evaluation.
`IMAGE_NAME`	Optional	`sol-execbench`	Docker wrapper local image name.
`IMAGE_TAG`	Optional	`rocm-<selected Docker tag>`	Docker wrapper local image tag. The default target resolves to `rocm-7.1.1-complete`.
`ROCM_DOCKER_IMAGE`	Optional	`rocm/dev-ubuntu-24.04` for unknown-target override	Docker image repository override when `--allow-unknown-target` is used.
`ROCM_DOCKER_TAG`	Optional	Selected target ID for unknown-target override	Docker image tag override when `--allow-unknown-target` is used.
`SOL_EXECBENCH_ALLOW_MIXED_VERSION_DEPENDENCIES`	Optional	`0`	Allows dependency probe diagnostics for mixed-version stacks.
`SOL_EXECBENCH_ALLOW_UNTESTED_TARGET_SMOKE`	Optional	`0`	Allows `not_tested` targets to run smoke/E2E commands without validation claims.
`SOL_EXECBENCH_RECORD_CONTAINER_VALIDATION`	Optional	`0`	Records successful target-container wrapper benchmark evidence as `container_validated`.
`SOL_EXECBENCH_HOST_PYTHON`	Optional	`uv run python`	Host Python executable override for Docker wrapper helper commands.
`SOL_EXECBENCH_COMPATIBILITY_ENTRY`	Optional	Unset	Per-target compatibility JSON sidecar path.
`SOL_EXECBENCH_COMPATIBILITY_MATRIX`	Optional	Unset	Aggregate compatibility matrix JSON path.
`SOL_EXECBENCH_RUN_DOCKER_DRY_RUN`	Optional	`0`	Enables dry-run behavior in `scripts/run_docker.sh`.
`SOL_EXECBENCH_DOCKER_CONTEXT`	Optional	`docker context show` output	Test/debug override for Docker context preflight evidence.
`SOL_EXECBENCH_DOCKER_HOST`	Optional	`docker context inspect` output	Test/debug override for Docker host preflight evidence.
`SOL_EXECBENCH_DEV_KFD_PRESENT`	Optional	Filesystem probe of `/dev/kfd`	Test/debug override for Docker runtime preflight evidence.
`SOL_EXECBENCH_DEV_KFD_ACCESSIBLE`	Optional	Read/write probe of `/dev/kfd`	Test/debug override for Docker runtime preflight evidence.
`SOL_EXECBENCH_DEV_DRI_PRESENT`	Optional	Filesystem probe of `/dev/dri`	Test/debug override for Docker runtime preflight evidence.
`SOL_EXECBENCH_DEV_DRI_ACCESSIBLE`	Optional	Render/card device access probe under `/dev/dri`	Test/debug override for Docker runtime preflight evidence.
`SOL_EXECBENCH_GPU_ACCESSIBLE`	Optional	Unset	Test/debug override for Docker runtime preflight GPU accessibility.
`SOL_EXECBENCH_HOST_ROCM_VERSION`	Optional	Unset	Runtime evidence override for host ROCm version.
`SOL_EXECBENCH_HOST_DRIVER_VERSION`	Optional	Unset	Runtime evidence override for host driver version.
`SOL_EXECBENCH_IMAGE_DIGEST`	Optional	Unset	Runtime evidence override for container image digest.
`SOL_EXECBENCH_RUNTIME_DEVICE_COUNT`	Optional	Unset	Runtime evidence override for device count.
`SOL_EXECBENCH_RUNTIME_DEVICE_NAME`	Optional	Unset	Runtime evidence override for device name.
`SOL_EXECBENCH_RUNTIME_GFX_ARCHITECTURE`	Optional	Unset	Runtime evidence override for gfx architecture.
`SOL_EXECBENCH_GPU_CLK_MHZ`	Optional	Empty string in Docker wrapper environment	Forwarded into Docker runs for GPU clock diagnostics.
`SOL_EXECBENCH_DRAM_CLK_MHZ`	Optional	Empty string in Docker wrapper environment	Forwarded into Docker runs for DRAM clock diagnostics.
`SOL_EXECBENCH_DEPENDENCY_TORCH_DISTRIBUTION_VERSION`	Optional	Unset	Dependency preflight/runtime evidence override for installed Torch distribution version.
`SOL_EXECBENCH_DEPENDENCY_TORCH_VERSION`	Optional	Unset	Dependency preflight/runtime evidence override for imported Torch version.
`SOL_EXECBENCH_DEPENDENCY_TORCH_LOCAL_VERSION`	Optional	Unset	Dependency preflight/runtime evidence override for Torch local version suffix.
`SOL_EXECBENCH_DEPENDENCY_TORCH_ROCM_TARGET`	Optional	Unset	Dependency preflight/runtime evidence override for expected ROCm wheel target.
`SOL_EXECBENCH_DEPENDENCY_TORCH_HIP_VERSION`	Optional	Unset	Dependency preflight/runtime evidence override for `torch.version.hip`.
`SOL_EXECBENCH_DEPENDENCY_TORCH_CUDA_VERSION`	Optional	Unset	Dependency preflight/runtime evidence override for `torch.version.cuda`.
`SOL_EXECBENCH_DEPENDENCY_TORCH_DEVICE_AVAILABLE`	Optional	Unset	Dependency preflight/runtime evidence override for Torch device availability.
`SOL_EXECBENCH_DEPENDENCY_TORCH_IMPORT_ERROR`	Optional	Unset	Dependency preflight/runtime evidence override for Torch import failure text.
`SOL_EXECBENCH_DEPENDENCY_TORCHVISION_DISTRIBUTION_VERSION`	Optional	Unset	Dependency preflight/runtime evidence override for installed torchvision distribution version.
`SOL_EXECBENCH_DEPENDENCY_TRITON_ROCM_DISTRIBUTION_VERSION`	Optional	Unset	Dependency preflight/runtime evidence override for installed `triton-rocm` distribution version.
`SOL_EXECBENCH_DEPENDENCY_TRITON_ROCM_STATUS`	Optional	Unset	Dependency preflight/runtime evidence override for Triton ROCm status.
`SOL_EXECBENCH_DEPENDENCY_CONTAINER_ROCM_USER_SPACE_VERSION`	Optional	Unset	Dependency preflight/runtime evidence override for container ROCm user-space version.
`SOL_EXECBENCH_DEPENDENCY_HIPCC_VERSION`	Optional	Unset	Dependency preflight/runtime evidence override for `hipcc` version.
`SOL_EXECBENCH_DEPENDENCY_TOOLCHAIN_ROCM_VERSION`	Optional	Unset	Dependency preflight/runtime evidence override for toolchain ROCm version.
`ROCM_PATH`	Optional	`/opt/rocm` in Docker image	ROCm installation root in `docker/Dockerfile`.
`HIP_PATH`	Optional	`/opt/rocm` in Docker image	HIP installation root in `docker/Dockerfile`.
`HIP_PLATFORM`	Optional	`amd` in Docker image	HIP platform selector in `docker/Dockerfile`.
`UV_CACHE_DIR`	Optional	`/home/${HOST_USER}/.cache/uv` in Docker image	UV cache directory in `docker/Dockerfile`.
`UV_LINK_MODE`	Optional	`copy` in Docker image	UV link behavior in `docker/Dockerfile`.
`UV_COMPILE_BYTECODE`	Optional	`1` in Docker image	Enables bytecode compilation in Docker installs.
`UV_PYTHON_DOWNLOADS`	Optional	`never` in Docker image	Disables Python downloads during Docker installs.
`UV_PROJECT_ENVIRONMENT`	Optional	`/venv` in Docker image	Docker image virtual environment path.
`HOST_UID`	Optional	`1000` Docker build argument	Host user ID used when creating the Docker image user.
`HOST_GID`	Optional	`1000` Docker build argument	Host group ID used when creating the Docker image group.
`HOST_USER`	Optional	`sol-execbench` Docker build argument	Host user name used when creating the Docker image user.
`PYTORCH_TORCH_VERSION`	Optional	`2.10.0+rocm7.1` Docker build argument	Torch wheel version installed in the Docker image.
`PYTORCH_TORCHVISION_VERSION`	Optional	`0.25.0+rocm7.1` Docker build argument	Torchvision wheel version installed in the Docker image.
`PYTORCH_ROCM_INDEX_URL`	Optional	`https://download.pytorch.org/whl/rocm7.1` Docker build argument	PyTorch ROCm wheel index URL.
`TRITON_ROCM_VERSION`	Optional	`3.6.0` Docker build argument	`triton-rocm` wheel version installed in the Docker image.
`TRITON_ROCM_INDEX_URL`	Optional	`https://download.pytorch.org/whl/` Docker build argument	Extra wheel index URL used for `triton-rocm`.

Config File Format

The benchmark evaluator accepts an optional JSON config file through --config. It is loaded into BenchmarkConfig from src/sol_execbench/core/bench/config/benchmark_config.py.

{
  "warmup_runs": 10,
  "iterations": 50,
  "lock_clocks": false,
  "benchmark_reference": true,
  "seed": 200
}

Field	Default	Description
`warmup_runs`	`10`	Number of warmup runs before measurement. Must be greater than or equal to `0`.
`iterations`	`50`	Number of timed iterations. Must be greater than `0`.
`lock_clocks`	`false`	Whether evaluation requires clocks to be locked.
`benchmark_reference`	`true`	Whether to benchmark the reference implementation.
`seed`	`200`	Integer seed for benchmark input generation.
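
The defaults and range checks in the table can be illustrated with a minimal sketch. `BenchConfigSketch` and `load_config_sketch` are hypothetical stand-ins written for this page, not the real `BenchmarkConfig` class or its loader. The sketch also assumes that omitted fields fall back to the documented defaults.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchConfigSketch:
    """Hypothetical stand-in for BenchmarkConfig; defaults copied from the table."""

    warmup_runs: int = 10
    iterations: int = 50
    lock_clocks: bool = False
    benchmark_reference: bool = True
    seed: int = 200

    def __post_init__(self) -> None:
        # Documented constraints: warmup_runs >= 0 and iterations > 0.
        if self.warmup_runs < 0:
            raise ValueError("warmup_runs must be >= 0")
        if self.iterations <= 0:
            raise ValueError("iterations must be > 0")


def load_config_sketch(text: str) -> BenchConfigSketch:
    """Parse a JSON object; omitted fields use the defaults (an assumption)."""
    return BenchConfigSketch(**json.loads(text))


cfg = load_config_sketch('{"iterations": 100, "lock_clocks": true}')
print(cfg.iterations, cfg.warmup_runs, cfg.lock_clocks)
```

In this sketch an unknown key raises `TypeError`; how the real loader treats unknown keys is not stated here.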

The Docker target manifest at docker/rocm-targets.json is another repository configuration file. It declares default_target_id, requested_rocm_user_space_version, Docker image tags, PyTorch ROCm wheel policies, and Triton ROCm wheel policies for supported container targets. The current declared targets are:

Target ID	ROCm User Space	Docker Tag	PyTorch Policy
`rocm-7.0.2-ubuntu-24.04-container`	`7.0.2`	`7.0.2-complete`	`torch==2.10.0+rocm7.0`, `torchvision==0.25.0+rocm7.0`, `triton-rocm==3.6.0`
`rocm-7.1.1-ubuntu-24.04-container`	`7.1.1`	`7.1.1-complete`	`torch==2.10.0+rocm7.1`, `torchvision==0.25.0+rocm7.1`, `triton-rocm==3.6.0`
`rocm-7.2.0-ubuntu-24.04-container`	`7.2.0`	`7.2-complete`	`torch==2.11.0+rocm7.2`, `torchvision==0.26.0+rocm7.2`, `triton-rocm==3.6.0`

These targets describe container user-space and dependency policy. They do not, by themselves, make native-host, paper-parity, leaderboard, MI300X, CDNA4, or NVFP4/MXFP4 validation claims.
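
As a rough illustration of how a target ID relates to the local image defaults (`IMAGE_NAME`, `IMAGE_TAG`), the sketch below copies the tag column from the table. The helper `local_image_ref` is hypothetical, and joining name and tag as `name:tag` is an assumption about how the wrapper forms the reference; the real logic lives in scripts/run_docker.sh and docker/rocm-targets.json.

```python
import os

# Target ID -> Docker tag, copied from the table above.
DOCKER_TAGS = {
    "rocm-7.0.2-ubuntu-24.04-container": "7.0.2-complete",
    "rocm-7.1.1-ubuntu-24.04-container": "7.1.1-complete",
    "rocm-7.2.0-ubuntu-24.04-container": "7.2-complete",
}
DEFAULT_TARGET = "rocm-7.1.1-ubuntu-24.04-container"


def local_image_ref(target_id: str = DEFAULT_TARGET, env=None) -> str:
    """Hypothetical helper: derive the local image reference for a declared target."""
    env = os.environ if env is None else env
    docker_tag = DOCKER_TAGS[target_id]  # KeyError for undeclared targets
    name = env.get("IMAGE_NAME", "sol-execbench")
    tag = env.get("IMAGE_TAG", f"rocm-{docker_tag}")
    return f"{name}:{tag}"


print(local_image_ref(env={}))
```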

provenance.toml is the machine-readable source attribution policy. It classifies active files as upstream-retained, derivative-modified, independent ROCm work, or generated/planning material; defines header policy for each class; and lists files that may retain NVIDIA SPDX notices. The prerelease readiness gate consumes this manifest with docs/provenance.md.

Required vs Optional Settings

There are no required environment variables for starting the CLI. Required inputs are passed as files or positional arguments:

sol-execbench <problem_dir> requires definition.json and workload.jsonl in the problem directory.
If no problem directory is used, --definition and --workload are required.
A solution must be supplied through --solution or by a conventional solution.json in the problem directory.
The contract, doctor, and toolchain subcommands support only --json output and raise a Click exception when it is omitted.
BenchmarkConfig.warmup_runs must be >= 0; iterations must be > 0.
scripts/run_docker.sh --target, --compatibility-entry, and --compatibility-matrix require argument values when those flags are used.

Clock-lock settings are optional, but --lock-clocks forces BenchmarkConfig.lock_clocks = true. When clocks are not locked, clock-sensitive evaluation paths can reject the run based on the value of SOL_EXECBENCH_CLOCKS_LOCKED.
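
A quick pre-check of a problem directory against the file requirements above might look like the following sketch. `missing_inputs` is a hypothetical helper, not part of the package; the CLI performs its own validation.

```python
import tempfile
from pathlib import Path
from typing import Optional

REQUIRED_FILES = ("definition.json", "workload.jsonl")


def missing_inputs(problem_dir: Path, solution: Optional[Path] = None) -> list:
    """Return the names of documented inputs that are absent (hypothetical helper)."""
    missing = [name for name in REQUIRED_FILES if not (problem_dir / name).is_file()]
    # A solution comes from --solution or a conventional solution.json.
    solution_path = solution if solution is not None else problem_dir / "solution.json"
    if not solution_path.is_file():
        missing.append(solution_path.name)
    return missing


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "definition.json").write_text("{}")
    print(missing_inputs(root))
```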

Defaults

Setting	Default	Source
Benchmark warmup runs	`10`	`BenchmarkConfig.warmup_runs`
Benchmark iterations	`50`	`BenchmarkConfig.iterations`
Benchmark clock-lock requirement	`false`	`BenchmarkConfig.lock_clocks`
Benchmark reference timing	`true`	`BenchmarkConfig.benchmark_reference`
Benchmark seed	`200`	`BenchmarkConfig.seed`
CLI compile timeout	`120` seconds	`--compile-timeout` option
CLI evaluation timeout	`600` seconds	`--timeout` option
CLI profiling mode	`none`	`--profile` option
CLI static evidence mode	`none`	`--static-evidence` option
Dataset runner output directory	`<repo_root>/out`	`scripts/run_dataset.py --output`
Dataset runner timeout	`300` seconds per problem	`scripts/run_dataset.py --timeout`
Dataset runner phase	`all`	`scripts/run_dataset.py --phase`
Dataset runner execution mode	`serial`	`scripts/run_dataset.py --execution-mode`
Baseline comparison win threshold	`2.0` percent	`sol-execbench-baseline --win-pct`
Baseline comparison parity threshold	`5.0` percent	`sol-execbench-baseline --parity-pct`
Docker target	`rocm-7.1.1-ubuntu-24.04-container`	`docker/rocm-targets.json`
Docker base image	`rocm/dev-ubuntu-24.04:7.1.1-complete`	Default Docker target and `docker/Dockerfile`
Docker local image name	`sol-execbench`	`scripts/run_docker.sh`
Docker local image tag	`rocm-<selected Docker tag>`	`scripts/run_docker.sh`
Docker FlashInfer trace root	`/sol-execbench/data/flashinfer-trace`	`scripts/run_docker.sh`
Python package version	`1.0.2`	`pyproject.toml`; separate from milestone and prerelease labels
Python requirement	`>=3.12,<3.14`	`pyproject.toml`
Default ROCm Torch wheel	`torch==2.10.0+rocm7.1`	`pyproject.toml` Linux x86_64 marker and Docker target metadata
Default ROCm torchvision wheel	`torchvision==0.25.0+rocm7.1`	`pyproject.toml` Linux x86_64 marker and Docker target metadata
Default Triton ROCm wheel	`triton-rocm==3.6.0`	`pyproject.toml` Linux x86_64 marker and Docker target metadata
Provenance policy document	`docs/provenance.md`	`provenance.toml`

Per-Environment Overrides

The repository does not define .env.development, .env.production, or .env.test files, and it does not contain deployment-specific configuration files. Use these source-backed override paths instead:

Host benchmark runs: pass CLI flags such as --config, --compile-timeout, --timeout, --output, --profile, and --static-evidence.
Per-problem benchmark settings: place a config.json next to definition.json and workload.jsonl, or pass a config file explicitly with --config.
Dataset batch settings: pass scripts/run_dataset.py flags such as --phase, --limit, --max-workloads, --workload-shard-size, --iterations, --warmup-runs, --timeout, --timeout-overrides, --blob-precheck, and --lock-clocks.
RDNA4 profiler timing closure settings: pass scripts/internal/rdna4/run_rdna4_profiler_timing_batch.py flags such as --only-problem, --skip-problem, --skip-problem-file, --mark-blocked-problem, --mark-blocked-only, --workload-limit, --workload-offset, --workload-sharded, --workload-slice-timing-dir, --workload-sharded-import-only, --timeout, and --temp-dir.
Docker ROCm stack selection: use ./scripts/run_docker.sh --target <id> for declared targets in docker/rocm-targets.json.
Docker image overrides for unknown targets: use --allow-unknown-target with ROCM_DOCKER_IMAGE and ROCM_DOCKER_TAG.
Diagnostic and CI-style preflight overrides: set the SOL_EXECBENCH_DEPENDENCY_*, SOL_EXECBENCH_RUNTIME_*, SOL_EXECBENCH_DEV_*, SOL_EXECBENCH_DOCKER_*, and compatibility sidecar variables listed above.
Source attribution policy: update provenance.toml and docs/provenance.md; prior blanket header corrections are handled as ordinary commits unless a separate legal review requires history rewriting.

CLI Flags

Primary evaluator form:

uv run sol-execbench <problem_dir> --solution solution.json

Explicit input form:

uv run sol-execbench \
  --definition definition.json \
  --workload workload.jsonl \
  --solution solution.json

Flag	Default	Purpose
`--definition`	None	Path to `definition.json` when not using a problem directory.
`--workload`	None	Path to `workload.jsonl` when not using a problem directory.
`--solution`	Conventional `solution.json` in problem directory when present	Path to solution JSON.
`--config`	None	Optional benchmark config JSON.
`--compile-timeout`	`120`	HIP/C++ compilation timeout in seconds.
`--timeout`	`600`	Evaluation subprocess timeout in seconds.
`-o`, `--output`	None	Trace JSONL output path.
`--json`	Disabled	Print trace JSON lines to stdout.
`--lock-clocks`	Disabled	Require GPU clocks to be locked.
`--keep-staging`	Disabled	Preserve the temporary staging directory.
`--profile`	`none`	Use `rocprofv3` for optional diagnostic profiling when set to `rocprofv3`.
`--static-evidence`	`none`	Collect optional diagnostic static kernel evidence when set to `auto`.
`-v`, `--verbose`	Disabled	Show subprocess output and staging details.

No-trace diagnostic sidecars are not controlled by a separate flag. When an evaluation subprocess produces no parseable trace JSONL, the CLI writes a bounded diagnostic-only sidecar next to --output, in the kept staging directory, or in the system temp directory depending on the available path. That sidecar records stdout/stderr tails and line counts and is not canonical trace JSONL.

Metadata and diagnostic subcommands require --json:

uv run sol-execbench contract --json
uv run sol-execbench doctor --json
uv run sol-execbench toolchain --json

toolchain also accepts --evidence-level, --artifact-type, --gpu-arch, --hardware-generation, --rocm-version, and --list-registry.

Local dataset migration subcommands live under the same sol-execbench entry point:

uv run sol-execbench dataset migrate-sol <source_root> <output_root> \
  --category L1 --manifest out/sol-migration-manifest.json --json
uv run sol-execbench dataset migrate-flashinfer <source_root> <output_root> \
  --manifest out/flashinfer-migration-manifest.json --json

Subcommand	Option	Purpose
`dataset migrate-sol`	`--category`	Restrict SOL-ExecBench migration to one or more categories.
`dataset migrate-sol`	`--source-revision`	Record the source dataset revision or local commit ref in the manifest.
`dataset migrate-sol`	`--manifest`	Write the migration manifest to an explicit path.
`dataset migrate-sol`	`--json`	Print the migration manifest JSON to stdout.
`dataset migrate-flashinfer`	`--source-revision`	Record the FlashInfer Trace source revision or local commit ref in the manifest.
`dataset migrate-flashinfer`	`--manifest`	Write the migration manifest to an explicit path.
`dataset migrate-flashinfer`	`--json`	Print the migration manifest JSON to stdout.

Dataset Runner Options

scripts/run_dataset.py runs a single problem directory or a dataset root with category subdirectories. It shells out to sol-execbench for trace collection and can also build derived AMD/SOLAR/timing reports from existing traces.

uv run scripts/run_dataset.py data/SOL-ExecBench/benchmark \
  --category L1 L2 --limit 5 -o out/dataset-smoke

Option	Default	Purpose
`--category`	None	Restrict to one or more of `L1`, `L2`, `FlashInfer-Bench`, or `Quant`.
`--limit`	None	Maximum number of problems to evaluate.
`--phase`	`all`	Run `all`, `traces`, `derived`, or `timing`.
`--jobs`	`1`	Parallel workers for CPU/I/O-only phases; `auto` is accepted.
`--execution-mode`	`serial`	Use `serial` ordering or `pipeline` for trace-stage CPU preparation overlapped with serial GPU evaluation.
`--prepare-jobs`	`auto`	CPU preparation workers for pipeline mode.
`--gpu-jobs`	`1`	GPU evaluation workers for pipeline mode; only `1` is currently accepted.
`--timeout-policy`	`record`	Record timeout traces or fail on no-trace timeout behavior.
`--timeout-overrides`	None	JSON object with `default`, `problems`, or `workloads` timeout overrides.
`--blob-precheck`	`fail`	Precheck safetensors references with `fail`, `warn`, or `off`.
`--log-order`	`completion`	Pipeline log ordering: `completion` or `problem`.
`-o`, `--output`	`<repo_root>/out`	Output directory for traces and summary files.
`--timeout`	`300`	Per-problem GPU evaluation timeout in seconds.
`--max-workloads`	None	Limit workloads per problem by truncating the workload input for the run.
`--workload-shard-size`	None	Split workload files into temporary shards with this many workloads per CLI invocation.
`--iterations`	`BenchmarkConfig.iterations`	Override timed iterations in generated per-run config.
`--warmup-runs`	`BenchmarkConfig.warmup_runs`	Override warmup runs in generated per-run config.
`--lock-clocks`	Disabled	Require locked GPU clocks for benchmark and timing evidence.
`--solution-name`	Definition reference	Use a named solution file in each problem directory.
`--rerun`	Disabled	Re-evaluate problems that already have results.
`--keep-staging`	Disabled	Preserve `sol-execbench` staging directories.
`-v`, `--verbose`	Disabled	Pass verbose mode through to `sol-execbench`.
`--amd-score-report`	None	Write a derived AMD-native suite score JSON report.
`--scoring-baseline`	None	Use a release-defined scoring baseline artifact for AMD score reports.
`--amd-sol-bound-dir`	None	Write derived AMD SOL bound v2 sidecars when score reporting is enabled.
`--solar-derivation`	None	Write SOLAR derivation sidecars from definitions and workloads.
`--timing-evidence-dir`	None	Write per-problem ROCm timing evidence JSON.
`--timing-tool-version`	`rocprofv3`	Tool version string recorded in timing evidence.
`--gpu-architecture`	`unknown`	GPU architecture string recorded in timing evidence.
`--ready-subset`	None	Bound dataset execution to a ready-subset JSON.
`--readiness`	None	Enrich execution-closure blockers from readiness JSON.
`--execution-closure`	`<output>/execution_closure.json` when `--ready-subset` is supplied	Write execution closure JSON.
`--dataset-manifest`	None	Dataset manifest JSON used for closure provenance.

--execution-mode pipeline currently supports --phase traces or --phase all, rejects --ready-subset and --execution-closure, and still requires --gpu-jobs 1. With --phase all, trace collection uses pipeline scheduling and derived/timing sidecars are produced after the serial GPU trace path.
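
The pipeline-mode restrictions can be summarized as a small validation sketch. The function name and the message strings are hypothetical and only restate the rules in the paragraph above; the actual checks and error text in scripts/run_dataset.py may differ.

```python
from typing import Optional


def pipeline_flag_errors(
    phase: str = "all",
    gpu_jobs: int = 1,
    ready_subset: Optional[str] = None,
    execution_closure: Optional[str] = None,
) -> list:
    """Collect violations of the documented --execution-mode pipeline rules."""
    errors = []
    if phase not in ("traces", "all"):
        errors.append(f"pipeline mode does not support --phase {phase}")
    if gpu_jobs != 1:
        errors.append("pipeline mode requires --gpu-jobs 1")
    if ready_subset is not None:
        errors.append("pipeline mode rejects --ready-subset")
    if execution_closure is not None:
        errors.append("pipeline mode rejects --execution-closure")
    return errors


for problem in pipeline_flag_errors(phase="derived", gpu_jobs=2):
    print(problem)
```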

Baseline Comparison CLI

The package also exposes sol-execbench-baseline for comparing existing trace JSONL files. It does not run GPU evaluation.

uv run sol-execbench-baseline \
  --candidate out/current/traces.jsonl \
  --baseline out/baseline/traces.jsonl \
  --format json --output out/baseline-comparison.json

Flag	Default	Purpose
`--candidate`	Required	Candidate trace JSONL file.
`--baseline`	Required, repeatable	One or more baseline trace JSONL files.
`--format`	`text`	Output `text` or `json`.
`--output`	stdout	Optional output path.
`--win-pct`	`2.0`	Candidate must beat baseline by at least this percentage to be a win.
`--parity-pct`	`5.0`	Candidate within this percentage of baseline is parity.
`--amd-native-claim`	Disabled	Label output as an AMD-native claim and emit guardrail warnings.
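
One plausible reading of the two thresholds is a three-way classification, sketched below. It assumes a lower-is-better latency metric, that the win test is applied before the parity test, and that anything else counts as a loss. The function `classify` and the label `loss` are hypothetical and are not taken from the sol-execbench-baseline source.

```python
def classify(candidate_ms: float, baseline_ms: float,
             win_pct: float = 2.0, parity_pct: float = 5.0) -> str:
    """Hypothetical win/parity/loss rule using the documented default thresholds."""
    if baseline_ms <= 0:
        raise ValueError("baseline_ms must be positive")
    # Positive improvement means the candidate is faster than the baseline.
    improvement_pct = (baseline_ms - candidate_ms) / baseline_ms * 100.0
    if improvement_pct >= win_pct:
        return "win"
    if abs(improvement_pct) <= parity_pct:
        return "parity"
    return "loss"


for candidate in (90.0, 101.0, 110.0):
    print(candidate, classify(candidate, 100.0))
```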

Docker Wrapper Flags

./scripts/run_docker.sh selects a declared ROCm Docker target, performs host and dependency preflights, optionally builds the image, and then runs a command inside the container. Arguments after -- become the container command; arguments before -- that are not wrapper flags are forwarded to docker run.

./scripts/run_docker.sh --target rocm-7.1.1-ubuntu-24.04-container -- \
  sol-execbench tests/sol_execbench/samples/rmsnorm \
    --solution tests/sol_execbench/samples/rmsnorm/solution_triton.json

Flag	Purpose
`--build`	Build the selected local image before running.
`--target <id>`	Select a declared target from `docker/rocm-targets.json`.
`--allow-unknown-target`	Permit explicit `ROCM_DOCKER_IMAGE` and `ROCM_DOCKER_TAG` override selection.
`--allow-mixed-version-dependencies`	Allow dependency probe/smoke diagnostics for mixed-version stacks.
`--allow-untested-target-smoke`	Allow `not_tested` targets to run smoke/E2E commands without validation claims.
`--record-container-validation`	Record successful wrapper benchmark evidence as `container_validated`.
`--preflight-only`	Print preflight JSON and exit before Docker build/run.
`--compatibility-entry <path>`	Write a per-target compatibility JSON sidecar.
`--compatibility-matrix <path>`	Write or aggregate a compatibility matrix JSON report.

RDNA4 Profiler Timing Operator Flags

The RDNA4 profiler closure scripts are evidence-generation tools for existing dataset layouts and timing sidecars. They do not change canonical Trace JSONL or create score authority.

scripts/internal/rdna4/run_rdna4_profiler_timing_coverage.py accepts:

Flag	Purpose
`--dataset-root`	Dataset benchmark root, defaulting to `data/SOL-ExecBench/benchmark`.
`--output-dir`	Directory for `coverage.json`, `coverage.md`, and `coverage-summary.json`.
`--timing-evidence-dir`	Timing sidecar root; may be repeated.
`--category`	Dataset category filter; may be repeated.
`--expected-problem-denominator`	Expected denominator, defaulting to `235`.
`--no-expected-problem-denominator`	Disable fixed denominator recording.
`--require-profiler-complete`	Exit nonzero unless every denominator problem has full profiler-backed timing.

scripts/internal/rdna4/run_rdna4_profiler_timing_batch.py accepts the same dataset/output roots plus target and workload controls:

Flag	Purpose
`--source-timing-dir`	Existing fallback timing sidecar root; may be repeated.
`--replacement-timing-dir`	Destination for replacement timing sidecars.
`--limit`	Limit selected targets.
`--only-problem`	Include only a problem ID; may be repeated.
`--skip-problem`	Skip a problem ID; may be repeated.
`--skip-problem-file`	Read skipped problem IDs from a file.
`--mark-blocked-problem`	Write a classified profiler-blocked sidecar for a problem ID.
`--mark-blocked-only`	Only write requested blocked sidecars; do not profile other targets.
`--workload-limit`	Profile a bounded workload slice for a selected problem.
`--workload-offset`	Start workload-slice profiling at an offset.
`--workload-sharded`	Profile each workload independently and aggregate complete manifests.
`--workload-slice-timing-dir`	Import existing workload-slice timing roots before profiling missing slices.
`--workload-sharded-import-only`	Aggregate imported slices without profiling missing workloads.
`--timeout`	Per-profiler subprocess timeout in seconds.
`--temp-dir`	Parent directory for profiler staging directories.

The classification scripts scripts/internal/rdna4/run_rdna4_profiler_partial_failures.py and scripts/internal/rdna4/run_rdna4_profiler_sharded_closure.py accept dataset root, output directory, timing evidence directories, and denominator controls. The sharded closure audit also accepts repeated --target-status values.

Package Configuration

pyproject.toml defines:

Package name sol-execbench
Version 1.0.2, which is the Python package version rather than the v1.26 research-prerelease milestone tag
Python range >=3.12,<3.14
Console scripts sol-execbench and sol-execbench-baseline
Runtime dependencies, including PyTorch ROCm, torchvision ROCm, triton-rocm, Pydantic, Click, Rich, datasets, and native build helpers
Development dependencies for pytest, pytest-xdist, Ruff, Ty, and pre-commit
Pytest markers
Ruff exclusions
Ty source roots
UV package indexes for PyPI, PyTorch ROCm 7.1, and the PyTorch ROCm package root

Docker Wrapper Settings

scripts/run_docker.sh supports:

--build to build the selected Docker image.
--target <id> to select a declared target from docker/rocm-targets.json.
--allow-unknown-target with ROCM_DOCKER_IMAGE and ROCM_DOCKER_TAG.
--allow-mixed-version-dependencies for mixed-version dependency diagnostics.
--allow-untested-target-smoke for non-authoritative smoke/E2E runs on not_tested targets.
--record-container-validation to write container validation evidence after successful wrapper execution.
--preflight-only to run preflight classification without launching the benchmark command.
--compatibility-entry <path> and --compatibility-matrix <path> to write compatibility sidecars.

The wrapper derives PyTorch and Triton Docker build arguments from docker/rocm-targets.json so ROCm 7.0, 7.1, and 7.2 images can install target-specific ROCm wheel stacks without changing the project lockfile.

Dataset Runner Evidence Notes

The full dataset runner flag table appears above in Dataset Runner Options. The most common evidence flags are:

--ready-subset and --readiness to bound execution to ready workloads and carry blocker reasons into closure records.
--execution-closure to write attempted, skipped, missing-trace, and missing-evidence states.
--dataset-manifest to include migration or acquisition provenance in reuse decisions.
--amd-score-report, --amd-sol-bound-dir, and --solar-derivation to generate opt-in derived score and bound sidecars.
--timing-evidence-dir to write source-specific ROCm timing evidence.

Existing traces are reused only when they exist, have no failed workloads, --rerun is not set, and any requested execution-closure provenance matches the current manifest, ready subset, readiness, solution, and derived-evidence requirements.
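
The reuse rule above amounts to a conjunction of conditions. The sketch below restates it with hypothetical parameter names; it assumes that provenance is compared only when execution-closure provenance was requested.

```python
def can_reuse_traces(trace_exists: bool, failed_workloads: int, rerun: bool,
                     closure_requested: bool = False,
                     provenance_matches: bool = True) -> bool:
    """Hypothetical predicate mirroring the documented trace reuse conditions."""
    if not trace_exists or failed_workloads > 0 or rerun:
        return False
    # Provenance only matters when execution closure was requested.
    return provenance_matches or not closure_requested


print(can_reuse_traces(trace_exists=True, failed_workloads=0, rerun=False))
```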

The auto job count uses the current CPU count capped by the number of selected problems and an internal upper bound. Derived sidecar names are scoped by problem path during dataset runs, so two problems with the same definition name and workload UUID do not overwrite each other's AMD SOL or SOLAR derivation files.
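
The auto job count can be pictured as a capped minimum. In the sketch below, `upper_bound=8` is a placeholder because the internal bound is not documented here, and the floor of one worker is an assumption.

```python
import os


def auto_jobs(selected_problems: int, upper_bound: int = 8) -> int:
    """Hypothetical sketch: CPU count capped by problem count and an upper bound."""
    cpus = os.cpu_count() or 1  # os.cpu_count() may return None
    return max(1, min(cpus, selected_problems, upper_bound))


print(auto_jobs(selected_problems=3))
```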

Docker Image Settings

docker/Dockerfile sets ROCm and uv environment defaults:

Variable	Default
`ROCM_PATH`	`/opt/rocm`
`HIP_PATH`	`/opt/rocm`
`HIP_PLATFORM`	`amd`
`PATH`	Includes `/opt/rocm/bin`, `/opt/rocm/llvm/bin`, and `/venv/bin`.
`LD_LIBRARY_PATH`	`/opt/rocm/lib`
`HOME`	`/home/${HOST_USER}`
`UV_CACHE_DIR`	`/home/${HOST_USER}/.cache/uv`
`UV_LINK_MODE`	`copy`
`UV_COMPILE_BYTECODE`	`1`
`UV_PYTHON_DOWNLOADS`	`never`
`UV_PROJECT_ENVIRONMENT`	`/venv`
`PYTHONPATH`	`/sol-execbench/src`

Docker build arguments include ROCM_DOCKER_IMAGE, ROCM_DOCKER_TAG, PYTORCH_TORCH_VERSION, PYTORCH_TORCHVISION_VERSION, PYTORCH_ROCM_INDEX_URL, TRITON_ROCM_VERSION, TRITON_ROCM_INDEX_URL, HOST_UID, HOST_GID, and HOST_USER.

Optional Evidence Outputs

These settings create sidecars without changing trace JSONL schema:

--profile rocprofv3 writes profiler metadata next to --output, or under the staging directory when no output file is provided.
--static-evidence auto writes static kernel evidence metadata for native solution builds.
SOLEXECBENCH_ENV_SNAPSHOT=1 writes an environment sidecar next to --output.
SOLEXECBENCH_ENV_SNAPSHOT_PATH=<path> writes an environment sidecar to an explicit path.
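
The settings above can be combined in one invocation. The sketch below only builds the argument list and environment for such a run; `evidence_invocation` is a hypothetical helper. How the two snapshot variables interact when both are set is not described here, so the sketch sets only one of them.

```python
import os
from typing import Optional


def evidence_invocation(problem_dir: str, solution: str, output: str,
                        snapshot_path: Optional[str] = None, base_env=None):
    """Build (argv, env) for a run with the optional evidence settings enabled."""
    argv = [
        "uv", "run", "sol-execbench", problem_dir,
        "--solution", solution,
        "--output", output,
        "--profile", "rocprofv3",
        "--static-evidence", "auto",
    ]
    env = dict(os.environ if base_env is None else base_env)
    if snapshot_path is not None:
        env["SOLEXECBENCH_ENV_SNAPSHOT_PATH"] = snapshot_path
    else:
        env["SOLEXECBENCH_ENV_SNAPSHOT"] = "1"
    return argv, env


argv, env = evidence_invocation(
    "tests/sol_execbench/samples/rmsnorm",
    "tests/sol_execbench/samples/rmsnorm/solution_triton.json",
    "out/trace.jsonl",
    base_env={},
)
print(" ".join(argv))
```

On a host with the package installed, the pair could then be passed to `subprocess.run(argv, env=env, check=True)`.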

Configuration Boundaries

No deployment, staging, or production config files are present. For local benchmark changes, use CLI flags or benchmark config JSON. For ROCm container runs, use ./scripts/run_docker.sh so target selection, device mounting, dependency preflight, and compatibility sidecars stay aligned with repository logic.
