Skip to content

Extend state representation to track errno across control-flow#1284

Draft
antoniofrighetto wants to merge 3 commits intoAliveToolkit:masterfrom
antoniofrighetto:feature/handle-errno-for-malloc
Draft

Extend state representation to track errno across control-flow#1284
antoniofrighetto wants to merge 3 commits intoAliveToolkit:masterfrom
antoniofrighetto:feature/handle-errno-for-malloc

Conversation

@antoniofrighetto
Copy link
Copy Markdown
Contributor

An attempt to handle errno in order to prevent Alive2 from treating store-to-load forwarding as valid refinements, if may alias errno. This is achieved by adding a new SMT expression to model errno meant to be used on malloc failure. Also treat malloc et alia as may write errno via LLVM errnomem location kind while translating from LLVM IR.


A thank to Nuno for direction on this!

An attempt to handle `errno` in order to prevent Alive2 from treating
store-to-load forwarding as valid refinements, if may alias `errno`.
This is achieved by adding a new SMT expression to model `errno` meant
to be used on malloc failure. Also treat malloc et alia as may write
errno via LLVM `errnomem` location kind while translating from LLVM IR.

// Create a new symbolic variable that represents errno if the allocation
// fails.
if (blockKind == MALLOC || blockKind == CXX_NEW) {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don’t seem to model realloc as of now, correct?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we do support realloc!

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, there doesn't appear to exist a separate REALLOC block kind, instead we seem to reuse MALLOC one when encoding realloc calls (AllocKind::Realloc) into SMT expressions:

alive2/ir/instr.cpp

Lines 2774 to 2775 in 2cb1034

auto [p_new, allocated]
= m.alloc(&size, getAlign(), Memory::MALLOC, np_size, nonnull);

Comment on lines +2339 to +2340
expr errno_on_failure = expr::mkFreshVar("#malloc_errno",
expr::mkUInt(0, 32));
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m experiencing the following false positive for test attrs/allocsize.srctgt.ll:

declare ptr @my_malloc(i32) allocsize(0)

define ptr @src() {
#0:
  %p = call ptr @my_malloc(i32 23) allocsize(0)
  ret ptr %p
}
=>
declare ptr @my_malloc(i32) allocsize(0)

define ptr @tgt() {
#0:
  %p = call ptr @my_malloc(i32 23) dereferenceable_or_null(23) allocsize(0)
  ret ptr %p
}


src_state.getReturnErrno():
(ite (or malloc_never_fails |#alloc_nondet_nonnull!3|)
     errno_init
     |#malloc_errno!4|)



tgt_state.getReturnErrno():
(ite (or malloc_never_fails |#alloc_nondet_nonnull!6|)
     errno_init
     |#malloc_errno!7|)

Transformation doesn't verify!

ERROR: Mismatch in errno at return

Possibly we shouldn’t use fresh unique symbolic variables for malloc_errno: I tried switching from using mkFreshVar w/ addNonDetVar to using the dedicated helper getFreshNondetVar instead. This fixes this issue but in doing so we don’t ; ERROR: Mismatch in errno at return anymore for the two new added tests (so we allow the refinement as valid). Any idea on what I'm missing?

@regehr
Copy link
Copy Markdown
Contributor

regehr commented Feb 4, 2026

this is super cool, I never thought about errno being something special we'd need to handle!


declare noalias ptr @malloc(i64) memory(inaccessiblemem: readwrite, errnomem: write)

; ERROR: Mismatch in errno at return
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not the expected behavior. LLVM wants to allow removal of allocation functions.

This is annoying, maybe we need to model alloc functions as written to errno only non-deterministically?

@RalfJung @nikic

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We generally consider allocation functions to only be called non-deterministically I think -- see llvm/llvm-project#177592.

ir/state.cpp Outdated
}

expr errno_on_write = expr::mkFreshVar((name + "#errno").c_str(),
expr::mkUInt(0, 32));
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of mkUInt, you can use errno_val.
I would also move this call inside mkIf_fold to avoid creating the expression altogether

ir/state.h Outdated
std::variant<smt::DisjointExpr<StateValue>, StateValue> return_val;
std::variant<smt::DisjointExpr<Memory>, Memory> return_memory;
std::set<smt::expr> return_undef_vars;
smt::expr return_errno;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can use the std::variant trick above to keep a DisjointExpr during state execution. This can shrink final expr size.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, thanks (I originally thought this wasn't necessary for a simple smt::expr).

@RalfJung
Copy link
Copy Markdown

RalfJung commented Feb 5, 2026

An attempt to handle errno in order to prevent Alive2 from treating store-to-load forwarding as valid refinements, if may alias errno.

What makes errno special here?

@antoniofrighetto
Copy link
Copy Markdown
Contributor Author

antoniofrighetto commented Mar 8, 2026

An attempt to handle errno in order to prevent Alive2 from treating store-to-load forwarding as valid refinements, if may alias errno.

What makes errno special here?

I think, per se, nothing makes errno semantically special here, it's just observable global state. Currently, Alive2 treats malloc has side-effect-free wrt errno, meaning that folding %v to 0 here:

store i32 0, ptr %p
call ptr @malloc(i64 INT64_MAX) ; may set errno
%v = load i32, ptr %p

would be allowed; except that it should not, as %p may happen to alias errno. On a discussion with Nuno, this may be modelled with varying fidelity. Particularly, it was determined to use a single symbolic variable (malloc_failure), and have every failed allocation set errno to such a variable. This RFC shows how errno, now part of MemoryEffects, is in the process of being modelled in LLVM.

@RalfJung
Copy link
Copy Markdown

RalfJung commented Mar 8, 2026

So the idea is that errno is the only global state observable by the program that malloc is allowed to mutate, and it may also only be mutated on allocation failure (and presumably, malloc is never allowed to itself observe errno, it can only be indiscriminantly overwritten)?

@antoniofrighetto
Copy link
Copy Markdown
Contributor Author

Yes, this at least is my understanding. malloc will start, hopefully, being declared as memory(inaccessiblemem: readwrite, errnomem: write) some time soon in LLVM: the inaccessiblemem part is memory outside the control of LLVM, and errno should be the only program global-visibile state malloc may mutate (and, indeed, from the new annotation, malloc should only be allowed to write to it).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants