Implement value barrier and apply to FrodoKEM #2431

Open
xuganyu96 wants to merge 31 commits into main from ghsa-5c97-hp95-vg56

Conversation

@xuganyu96
Contributor

Summary

This pull request implements an OQS_MEM_BLACK_BOX macro, which serves as a frontend to two implementations of a value barrier:

  • Use inline assembly __asm__ volatile("" : "+r"(v) :) where available
  • Use a volatile round-trip that reads and writes every byte in the input buffer as a fallback

The black box is then applied to FrodoKEM's ct_select to prevent the compiler from reasoning about the selector when ct_select is inlined.
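
For readers unfamiliar with the pattern, here is a minimal sketch of a value barrier applied to a masked select. It is an illustration only, not the code in this PR: the names `sketch_barrier_u8` and `sketch_ct_select` are hypothetical, the real macro is OQS_MEM_BLACK_BOX in src/common/common.h, and the select loop is only modelled on the usual mask-based shape of a constant-time select. The sketch also assumes the mask is already 0x00 or 0xFF.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not the PR's OQS_MEM_BLACK_BOX): hide the value of v
 * from the optimizer so it cannot specialize later code on it. */
static inline uint8_t sketch_barrier_u8(uint8_t v) {
#if defined(__GNUC__) || defined(__clang__)
    /* Empty asm statement that claims to read and modify v in a register.
     * The compiler must assume v may hold any value afterwards. */
    __asm__ volatile("" : "+r"(v) :);
    return v;
#else
    /* Assumed fallback for a single byte: round trip through a volatile
     * object, which the compiler may not elide. */
    volatile uint8_t tmp = v;
    return tmp;
#endif
}

/* Hypothetical select: copies a into r when mask == 0x00 and b into r when
 * mask == 0xFF, touching both inputs on every iteration. */
void sketch_ct_select(uint8_t *r, const uint8_t *a, const uint8_t *b,
                      size_t len, uint8_t mask) {
    /* Without the barrier, an inlining compiler that knows how mask was
     * derived is free to rewrite the loop as a conditional choice between
     * a and b, so that only one of them is read. */
    mask = sketch_barrier_u8(mask);
    for (size_t i = 0; i < len; i++) {
        r[i] = (uint8_t)((~mask & a[i]) | (mask & b[i]));
    }
}
```

The barrier does not change the value; it only removes what the compiler knows about it, which is why the arithmetic select survives optimization.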

Code change

value barrier implementation and application

  • src/common/common.h
  • src/kem/frodokem/external/util.c

Test harness for detecting cache timing side channel in FO-transformed KEM

  • tests/CMakeLists.txt
  • tests/kem_fo_cache_oracle.c
  • tests/kem_fo_cache_oracle.py

Thanks to @kaminuma for providing the initial PoC.

kem_fo_cache_oracle.c is a command-line program that measures the time it takes to read from a user-specified location of the KEM secret key after decapsulating some valid and/or invalid ciphertexts.
kem_fo_cache_oracle.py is a Python script that performs statistical analysis of the raw timing measurements produced by kem_fo_cache_oracle.c.
Commands for using these two programs are documented in CONFIGURE.md.

Validation

The absence of conditional memory reads was verified manually by inspecting the compiled assembly, using the following build commands:

# Apple Silicon, Homebrew Clang 18, macOS 26.4
cmake -GNinja \
    -DCMAKE_C_FLAGS="-DOQS_DISABLE_MEM_BLACK_BOX" \
    -DCMAKE_C_COMPILER="/opt/homebrew/bin/clang-18" \
    -DCMAKE_BUILD_TYPE="MinSizeRel" \
    -DCMAKE_OSX_DEPLOYMENT_TARGET="26.0" \
    -DOQS_MINIMAL_BUILD="KEM_frodokem_640_aes" \
    .. \
    && ninja \
    && objdump -d lib/liboqs.a > liboqs.a.S
  • With -DOQS_DISABLE_MEM_BLACK_BOX, <OQS_KEM_frodokem_640_aes_decaps> contains a csel instruction.
  • Without -DOQS_DISABLE_MEM_BLACK_BOX, <OQS_KEM_frodokem_640_aes_decaps> does not contain a csel instruction. This is true for all FrodoKEM parameter sets (with the OQS_MINIMAL_BUILD option removed).
  • Without -DOQS_DISABLE_MEM_BLACK_BOX, but with the source code modified to force the fallback, <OQS_KEM_frodokem_640_aes_decaps> does not contain a csel instruction either.

LLM disclosure

kem_fo_cache_oracle.py was generated with help from Claude.

  • [NO] Does this PR change the input/output behaviour of a cryptographic algorithm (i.e., does it change known answer test values)? (If so, a version bump will be required from x.y.z to x.(y+1).0.)
  • [NO] Does this PR change the list of algorithms available -- either adding, removing, or renaming? Does this PR otherwise change an API? (If so, PRs in fully supported downstream projects dependent on these, i.e., oqs-provider will also need to be ready for review and merge by the time this is merged. Also, make sure to update the list of algorithms in the continuous benchmarking files: .github/workflows/kem-bench.yml and sig-bench.yml)

@xuganyu96 xuganyu96 added this to the 0.16.0 milestone May 14, 2026
@xuganyu96 xuganyu96 force-pushed the ghsa-5c97-hp95-vg56 branch 2 times, most recently from 7ed7fe4 to 96c3a6b Compare May 14, 2026 23:45
@coveralls

coveralls commented May 15, 2026

Coverage Status

coverage: 82.263% (-0.004%) from 82.267% — GHSA-5c97-hp95-vg56 into main

@dstebila dstebila moved this from Backlog to In progress in 0.16.0 prioritization May 15, 2026
dstebila previously approved these changes May 15, 2026
@xuganyu96 xuganyu96 self-assigned this May 15, 2026
@xuganyu96 xuganyu96 marked this pull request as ready for review May 18, 2026 14:59
@xuganyu96 xuganyu96 requested review from baentsch and bhess as code owners May 18, 2026 14:59
Member

@bhess bhess left a comment

Thanks @xuganyu96 for adding this very useful feature.

I haven’t reviewed the FO cache oracle test in detail, but will this be run in CI? If not, is there a plan to run it regularly?

Regarding the countermeasures:

Use inline assembly __asm__ volatile("" : "+r"(v) :) where available
Use a volatile round-trip that reads and writes every byte in the input buffer as a fallback

I have two comments/questions:

  1. It might be good to document with a rationale or related work why they are effective.
  2. The fallback option (reading and writing every byte) seems potentially expensive if it is used with larger data structures than with the selector byte. Were there more efficient alternatives considered?

See also the comments inline.

Comment thread src/common/common.h
Comment thread src/common/common.h
Comment thread CONFIGURE.md Outdated
xuganyu96 and others added 17 commits May 26, 2026 15:09
Signed-off-by: Ganyu (Bruce) Xu <g66xu@uwaterloo.ca>
Amazon Linux 2023 ships with Clang 15 by default. PoC requires Clang 17
or above to work. Install with `sudo dnf install clang18` and use
CMAKE_C_COMPILER to specify clang-18 as the compiler.

Signed-off-by: Ganyu (Bruce) Xu <g66xu@uwaterloo.ca>
I can parameterize control location, too, but 63 is a safe value for KEM
secret keys since rejection symbols are probably 32 bytes in length

Signed-off-by: Ganyu (Bruce) Xu <g66xu@uwaterloo.ca>
Python script is generated by Claude using Sonnet 4.6 extended. The
prompt:

Please help me write a Python script to perform statistical analysis
that determines whether two sets of samples come from the same
distribution. The input CSV file has header `epoch,sample,good,probe,ctrl`.

Some context for the data:
- We are trying to detect timing side channel in implementation of
  key encapsulation mechanisms (e.g. FrodoKEM). In some KEM schemes,
  the secret key contains a rejection symbol. In a faulty implementation,
  if the input ciphertext to a decapsulation routine is valid, then
  the rejection symbol is not used, and if the input ciphertext is
  invalid, then the rejection symbol is used and thus likely brought
  into CPU cache. We want to probe the time it takes to read the
  rejection symbol in the secret key and see if there is a cache
  timing channel.
- There are many epochs, in each epoch there are many samples
- Each sample represents one call to decapsulation. The `good` column
  is 1 if the ciphertext is valid, else it's 0. The `probe` column is
  the cycle count for reading from the rejection symbol. The `ctrl`
  column is the cycle count for reading another location that is
  guaranteed to have been used, which serves as a control

The statistical analysis should have the following components:
- For each epoch, discard the outliers. The percentage of outliers to
  be discarded should be tunable.
- Compute the relevant p-value
- Graph a violin plot

The Python script should have a CLI where the user can specify
percentage of outliers to discard, but also have a core routine that
other test modules can import for unit testing.

Signed-off-by: Ganyu (Bruce) Xu <g66xu@uwaterloo.ca>
xuganyu96 added 14 commits May 26, 2026 15:09
Signed-off-by: Ganyu (Bruce) Xu <g66xu@uwaterloo.ca>
Also build kem_fo_cache_oracle only when
-DOQS_ENABLE_KEM_FO_CACHE_ORACLE=ON

Signed-off-by: Ganyu (Bruce) Xu <g66xu@uwaterloo.ca>
Not sure why this section in CONFIGURE.md is duplicated after a rebase

Signed-off-by: Ganyu (Bruce) Xu <g66xu@uwaterloo.ca>
@xuganyu96
Contributor Author

@bhess

will this be run in CI? If not, is there a plan to run it regularly?

I don't plan to run this on GitHub Actions because the cache timing test must run on bare metal for cache eviction to be meaningful and for timing measurements to be accurate. I've added a line to CONFIGURE.md explaining this.

It might be good to document with a rationale or related work why they are
effective.

I borrowed the inline ASM optimization barrier from mlkem-native. Here is one example. I am not sure whether any official document recommends this pattern, but it does seem to be the gold standard across many prominent projects.

The fallback option (reading and writing every byte) seems potentially
expensive if it is used with larger data structures than with the selector
byte. Were there more efficient alternatives considered?

The fallback option is meant to cover Windows builds, since MSVC does not support inline assembly in the same way GCC and Clang do. I plan to follow up in a subsequent patch to add an appropriate MSVC backend for the optimization barrier. This round trip is just an unfortunate temporary band-aid.
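
To make the cost concern concrete, a byte-wise volatile round trip over a buffer could look like the sketch below. This is an assumption about the general shape of such a fallback, not the code in this PR, and `sketch_volatile_round_trip` is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fallback barrier: read and write back every byte of buf
 * through a volatile pointer. The compiler must perform each access, so it
 * can no longer assume anything about the buffer's contents afterwards. */
static void sketch_volatile_round_trip(void *buf, size_t len) {
    volatile uint8_t *p = (volatile uint8_t *)buf;
    for (size_t i = 0; i < len; i++) {
        uint8_t tmp = p[i]; /* volatile read */
        p[i] = tmp;         /* volatile write of the same value */
    }
}
```

The contents of the buffer are unchanged, but the work is linear in `len`: negligible for a one-byte selector, and increasingly costly for larger structures.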

@bhess
Member

bhess commented May 28, 2026

Thanks @xuganyu96 for the update.

I borrowed the inline ASM optimization barrier from mlkem-native. Here is one example. I am not sure whether any official document recommends this pattern, but it does seem to be the gold standard across many prominent projects.

Can we document this somewhere (e.g., as a code comment with a reference to mlkem-native)? It might be helpful to keep the reference.

#define ARGS_HELP_TEXT \
"Usage: %s <kem_name> <probe_loc>\n" \
"Arguments:\n" \
" kem_name: FrodoKEM-640-AES\n" \
Member

Why just FrodoKEM-640-AES? Would it be worthwhile to list all algorithms this could be applied to (as arguments)?

-----
.. code-block:: bash

python timing_analysis.py samples.csv
Member

This Usage sample seems to disagree with the script's name. Intentional?
