Skip to content

gdb/testsuite: support optimized code in gdb.rocm testsuite#103

Open
spatrang wants to merge 8 commits intoamd-stagingfrom
users/spatrang/testsuite-optimized-code
Open

gdb/testsuite: support optimized code in gdb.rocm testsuite#103
spatrang wants to merge 8 commits intoamd-stagingfrom
users/spatrang/testsuite-optimized-code

Conversation

@spatrang
Copy link
Copy Markdown

@spatrang spatrang commented Apr 27, 2026

Summary

This series makes the gdb.rocm testsuite pass when the test programs are
compiled with optimization (-O1/-O2/-O3, -flto) passed via
GDB_TESTCASE_OPTIONS, rather than the implicit -O0 the suite assumes today.

Three commits, split by concern:

  1. Update gdb.rocm framework to support optimized code — adds helpers in
    gdb/testsuite/lib/rocm.exp:

    • check_for_any_flags {flags} — detects whether GDB_TESTCASE_OPTIONS
      contains any of the listed flags.
    • gdb_caching_proc no_optimized_code — predicate for -O1/-O2/-O3.
    • gdb_caching_proc no_lto — predicate for -flto.
  2. Update some gdb.rocm tests to support optimized code (22 files) —
    two patterns applied surgically:

    • In .cpp/.c: add volatile on locals the test inspects, and
      __attribute__((optnone)) on helper functions the test breakpoints into.
      Add a trivial volatile int statement to empty "break here" functions
      so their body is not elided.
    • In .exp: relax line-number arithmetic, backtrace PC formatting,
      expected source strings after step, and add a bounded retry loop for
      extra ticks under optimization.
  3. Skip unsupported tests in case of optimized code (19 files) — for tests
    whose semantics cannot be reconciled with optimized code, add
    require no_optimized_code at the top; mi-aspace.exp also gets
    require no_lto. register-watchpoint.exp soft-skips the tail via
    check_for_any_flags after the earlier portion has run.

Diffstat: 42 files changed, +146 / -46.

Context

This continues the work started in internal PR
AMD-ROCm-Internal/ROCgdb#6,
rebased onto the current amd-staging of the public repo. The commits are
content-identical to the internal PR, preserving the original authors
(Aleksandar Rikalo, Andrey Kasaurov) with their Co-Authored-By: trailers
and adding my Signed-off-by.

motokultivator and others added 3 commits April 24, 2026 04:41
Introduce the necessary helper functions into the gdb.rocm
framework so that support for optimized code can be implemented.

Co-Authored-By: Andrey Kasaurov <andrey.kasaurov@amd.com>
Co-Authored-By: Luis Machado <luis.machado@amd.com>

Change-Id: I750d64e842413429e281bb8f7d85c4f46319a627
Signed-off-by: Sarang Patrange <spatrang@amd.com>
Co-Authored-By: Luis Machado <luis.machado@amd.com>

Change-Id: I750d64e842413429e281bb8f7d85c4f46319a627
Signed-off-by: Sarang Patrange <spatrang@amd.com>
Co-Authored-By: Andrey Kasaurov <andrey.kasaurov@amd.com>
Co-Authored-By: Luis Machado <luis.machado@amd.com>

Change-Id: I750d64e842413429e281bb8f7d85c4f46319a627
Signed-off-by: Sarang Patrange <spatrang@amd.com>
@spatrang
Copy link
Copy Markdown
Author

Hi @lancesix

This PR is a re-submission of the optimized-code series that was originally
opened in the internal rocgdb repository. I'm summarising your comments and
how they've been addressed in the latest push:

  1. gdb.rocm/deep-stack.exp"This part of the change seems odd. Is
    it really necessary?"

    Reverted. Frames #10..#12 now use the same ".* <name>" regex
    style as #0..#9; the dropped space was unnecessary.

  2. gdb.rocm/device-attach.cpp"Is code still valid without this
    line?"

    No, it isn't. Restored the __global__ void qualifier on
    bit_extract_kernel; without it the function isn't a HIP kernel
    and hipLaunchKernelGGL would fail to compile.

  3. gdb.rocm/hip-builtin-variables.cpp"Not a fan of the implicit
    int … const volatile reads off … Looks like it could just be
    volatile int."

    Agreed. All const volatile X = …; declarators have been replaced
    with volatile int X = …;.

  4. gdb.rocm/precise-memory-exec.c"Why volatile int sometimes,
    attribute((optnone)) other times?"

    Good point. second() now uses __attribute__((optnone)) instead
    of an artificial volatile int i = 0;, matching the convention
    used in corefile.cpp, names.cpp, ocp_mx.cpp, and
    step-schedlock-spurious-waves.cpp (touched in the same series).

  5. lib/rocm.exp"missing '.'" (×2) and "indentation" (×2)
    on no_optimized_code and no_lto.
    Both comments were added a trailing period, and the body of the
    if-blocks in both procs is now tab-indented to match the
    surrounding style.

The corresponding fix-up commit is on top of this branch. Please take
another look when you have a moment

cc: @lumachad

@lumachad
Copy link
Copy Markdown
Collaborator

pre-commit seems to be grumpy.

@spatrang spatrang changed the title gdb/testsuite: support optimized code in gdb.rocm testsuite (AIROCGDB-573) gdb/testsuite: support optimized code in gdb.rocm testsuite Apr 27, 2026
  * lib/rocm.exp: Add missing periods and fix indentation in
    no_optimized_code and no_lto.
  * gdb.rocm/device-attach.cpp: Restore the "__global__ void"
    qualifier on bit_extract_kernel.
  * gdb.rocm/hip-builtin-variables.cpp: Use "volatile int" instead
    of the implicit-int "const volatile" declarators, and apply
    __attribute__((optnone)) to kern, inner_func1 and inner_func2
    so the locals stay observable and the inner functions remain
    distinguishable as frames under -O1/-O2/-O3.
  * gdb.rocm/hip-builtin-variables.exp: Relax the "next" regex
    after stepping into inner_func2 to accept either "return;" or
    the closing brace; Clang merges the line entry of the
    explicit "return;" with the closing brace under
    __attribute__((optnone)).
@spatrang spatrang force-pushed the users/spatrang/testsuite-optimized-code branch from e05abcd to 12346d2 Compare April 28, 2026 09:26
  * unaligned-memory-access.cpp: noinline on done(), optnone on
    kernel() to keep the breakpoint target and the local variable
    observable under optimization.
  * watch-gpu-global-from-host.cpp: optnone on main() and volatile
    int *devGlobal so the triggering store is not tail-optimised.

+82 PASS / -82 FAIL at -O1/-O2/-O3; no -O0 regression.
@spatrang spatrang marked this pull request as ready for review April 28, 2026 14:57
@spatrang spatrang requested a review from a team as a code owner April 28, 2026 14:57
At -O2/-O3 Clang eliminates bit_extract_kernel's "while (loop_forever)"
spin loop (constant condition + side-effect-free body).  The kernel
returns, the host exits, and "attach <pid>" fails.

Make loop_forever volatile so the loop survives optimisation.
@lumachad
Copy link
Copy Markdown
Collaborator

lumachad commented May 7, 2026

As stated elsewhere, the series only handles GDB_TESTCASE_OPTIONS. If someone passes CFLAGS directly to the compiler the optimization level checks won't work properly. Do we want to fix that as well? Maybe as a follow up if not in this one?

@lumachad
Copy link
Copy Markdown
Collaborator

lumachad commented May 7, 2026

There's a pre-commit failure.

Other than that, I'm still seeing FAIL's for gdb.rocm/watchpoint-basic.exp when running the testsuite with -O1 or -O3. It passes when running with -O0. Could you please check that you're running the testsuite correctly?

@spatrang spatrang force-pushed the users/spatrang/testsuite-optimized-code branch 2 times, most recently from ed63897 to 7c21341 Compare May 7, 2026 13:53
  * static-global, lane-info, deref-scoped-pointer,
    scheduler-locking: apply noinline / optnone (and volatile
    where the test relies on lvalue-ness) to anchor functions
    and kernels so the breakpoint targets, lvalue addresses and
    per-statement line entries survive optimization.  Drop the
    "require no_optimized_code" gate from each .exp.

Validated at -O0, -O2, and -O3 on gfx942: 73 PASS, 0 FAIL.
@spatrang spatrang force-pushed the users/spatrang/testsuite-optimized-code branch from 7c21341 to a912e0e Compare May 7, 2026 14:03
@lumachad
Copy link
Copy Markdown
Collaborator

lumachad commented May 8, 2026

Did the latest force-pushes provide anything that would fix gdb.rocm/watchpoint-basic.exp? I can't tell.

- lib/rocm.exp: check_for_any_flags now also inspects
    CFLAGS_FOR_TARGET / CXXFLAGS_FOR_TARGET, so existing
    require no_optimized_code gates fire under
    test_rocgdb.py.
- watchpoint-basic.exp: skip rwatch/awatch
    test_non_write_watchpoint_before_runtime_load sub-cases
    as UNSUPPORTED under optimisation; HIP runtime loads
    too early for the test's premise to hold.
Comment on lines +276 to +287
if { [check_for_any_flags {-O1 -O2 -O3}] == -1 } {
# Under -O1/-O2/-O3, the HIP runtime is initialised earlier
# (during static initialisers/dynamic loading) than at -O0, so
# by the time main() reaches "Break before runtime load" the
# amd-dbgapi target is already attached and rejects the
# host-side read/access watchpoint at insert time. The test's
# "before runtime load" premise does not hold for optimised
# builds; skip the sub-test in that case.
unsupported "$wp_cmd before runtime load (optimised code)"
return
}

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't "require no_optimized_code" cover this? If so, we don't need to call check_for_any_flags here, right?

Comment thread gdb/testsuite/lib/rocm.exp Outdated
Comment on lines +172 to +197
global GDB_TESTCASE_OPTIONS
global GDB_TESTCASE_OPTIONS CFLAGS_FOR_TARGET CXXFLAGS_FOR_TARGET

if {[llength $flags] && [info exists GDB_TESTCASE_OPTIONS]} {
if {![llength $flags]} {
return 0
}

# Sources of compile flags to check, in order:
# - GDB_TESTCASE_OPTIONS (the standard GDB testsuite override).
# - CFLAGS_FOR_TARGET / CXXFLAGS_FOR_TARGET (used by some CI
# drivers, e.g. TheRock's test_rocgdb.py, to inject
# optimisation levels without going through
# GDB_TESTCASE_OPTIONS). DejaGnu populates these as Tcl
# globals from the runtest command line.
set sources [list]
foreach var {GDB_TESTCASE_OPTIONS CFLAGS_FOR_TARGET CXXFLAGS_FOR_TARGET} {
if {[info exists $var]} {
set val [set $var]
if {$val ne ""} {
lappend sources $val
}
}
}

foreach src $sources {
foreach flag $flags {
if {[lsearch -exact $GDB_TESTCASE_OPTIONS $flag] != -1} {
if {[lsearch -exact $src $flag] != -1} {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this change needs to be its own commit. Please remember this is an upstream change, so we have to format things by their rules, and upstream gdb's community favors single-purpose commits.

Also, please don't mention test_rocgdb.py as that doesn't exist upstream. It's AMD-specific.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noted.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants