aarch64: Major backend overhaul, ABI compliance, and Apple Silicon support#47

Open
hakanrw wants to merge 27 commits into PortableCC:master from hakanrw:aarch64-fixes

hakanrw commented Feb 21, 2026

This PR provides a significant update to the AArch64 backend, effectively purging legacy ARMv7/VFP/FPA assumptions and bringing the implementation into alignment with modern AAPCS64 and Darwin ABIs.

The primary focus was correcting core code generation inaccuracies, fixing several Internal Compiler Errors (ICEs), and implementing missing ABI features such as proper struct passing and SIMD floating-point support.

The backend now also supports the Darwin ABI (Apple Silicon), so the compiler can be used to compile C99 programs on Apple Silicon devices.

Another PR will follow in the pcc-libs repo (https://github.com/PortableCC/pcc-libs), as some arch-specific changes are necessary in CRT files for GNU/Linux.

Key Changes

Architecture & ABI Foundations

  • Purged ARMv7 Legacy: Removed 32-bit assumptions in stack frame setup, parameter passing, and instruction tables.

  • AAPCS64 & Darwin Support: Implemented correct calling conventions for structs and varargs.

  • Floating Point: Replaced old VFP/FPA logic with correct SIMD implementations. Fixed floating-point extensions.

  • Softfloat: Implemented IEEE-754 binary128 softfloat handler in common code.

Code Generation

  • Literal Pools: Introduced literal pools in pass2 for immediates exceeding MOV instruction limits.

  • OREG Logic: Refactored myormake() and offstar() to prevent undefined register behavior and capped OREG immediates to ±256 bytes.

  • Tree Folding: Fixed a critical bug in ccom where node types were not preserved during UMUL & ADDROF combination folding, which previously caused scalar loads to be treated as structure loads.

Bug Fixes

  • Symbol Handling: Fixed a major ICE caused by macro discrepancies in symbol table definitions (sap vs sss).

  • Struct Initialization: Fixed a segfault in simpleinit() related to STASG operations wrapped in UMUL.

  • Various other fixes related to symbol handling and LANG_C/LANG_CXX divergence in local.c.

Current Status

Per my tests, the backend is now significantly more stable, with no observed ICEs or segfaults during the test suite execution.

I chose c-testsuite (https://github.com/c-testsuite/c-testsuite) as the test suite because I was not previously aware that PCC had its own test suite. I am attaching the c-testsuite results for both the AAPCS64 and Darwin ABIs, along with a summary table. If required, we can rerun the tests with the pcc-tests suite.

| Target | Pass Rate | Notes |
|---|---|---|
| aarch64 AAPCS64 (GNU/Linux) | 197/220 | Miscompilations and minor assembly/preprocessor issues remaining. |
| aarch64 Darwin (macOS) | 196/220 | Miscompilations and minor assembly/preprocessor issues remaining. |

pcc_aarch64_aapcs64_test.txt
pcc_aarch64_darwin_test.txt

What Works

  • Hard/Soft floating point (both ABIs).

  • Complex struct passing/returning, including >16 byte and <=16 byte structs (both ABIs).

  • Calling vararg functions.

  • Permanent register saving and ABI-compliant stack frames.
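
To make the struct-passing item concrete, here is a minimal sketch of the two size classes. The type and function names are hypothetical and are not taken from the PR or its test runs. As far as I understand AAPCS64, a non-HFA composite of at most 16 bytes travels in up to two general-purpose registers, while a larger one is copied to memory and passed by address; the C source looks the same in both cases.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical test types; names are illustrative only. */
struct small { int64_t a, b; };        /* 16 bytes: fits in two registers */
struct large { int64_t a, b, c; };     /* 24 bytes: passed indirectly */

static int64_t sum_small(struct small s) { return s.a + s.b; }
static int64_t sum_large(struct large l) { return l.a + l.b + l.c; }

/* Returning a >16-byte struct exercises the in-memory return path. */
static struct large make_large(int64_t base)
{
	struct large l = { base, base + 1, base + 2 };
	return l;
}

/* Behaviour any conforming compiler should produce. */
void struct_passing_check(void)
{
	struct small s = { 1, 2 };

	assert(sizeof(struct small) == 16);
	assert(sizeof(struct large) == 24);
	assert(sum_small(s) == 3);
	assert(sum_large(make_large(10)) == 33);
}
```

Calling struct_passing_check() from a small driver compiled with the new backend is one way to smoke-test both paths.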

Known Limitations

  • Callee-side setup for vararg functions is still pending.

  • Calling via function pointers requires an instruction table update. Function-pointer calls currently emit old ARMv7-style instructions.

  • Long double is not implemented properly on Linux: the AArch64-specific 128-bit instructions for long double are still missing.

  • HFA (Homogeneous Floating-point Aggregate) struct passing is not implemented yet (both ABIs). HFAs are passed as if they were normal structs.
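
For readers unfamiliar with the HFA limitation, a minimal sketch of what such a type looks like. The struct and function names are hypothetical. My understanding of AAPCS64 is that an aggregate with one to four members, all of the same floating-point type, is an HFA and is passed in consecutive SIMD/FP registers when enough are free. Since this backend passes it like an ordinary struct, calls that cross into code built by another compiler would presumably disagree on where the argument lives.

```c
#include <assert.h>

/* Hypothetical HFA: four members of the same floating-point type. */
struct vec4 { float x, y, z, w; };

/* Not an HFA: member types are mixed. */
struct mixed { float f; int i; };

static float sum_vec4(struct vec4 v) { return v.x + v.y + v.z + v.w; }
static float sum_mixed(struct mixed m) { return m.f + (float)m.i; }

/* The values are exactly representable, so equality checks are safe. */
void hfa_check(void)
{
	struct vec4 v = { 1.0f, 2.0f, 3.0f, 4.0f };
	struct mixed m = { 1.5f, 2 };

	assert(sum_vec4(v) == 10.0f);
	assert(sum_mixed(m) == 3.5f);
}
```

Within a single compiler the check passes either way; the difference only shows up when sum_vec4() and its caller are built by different compilers.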

Ongoing Maintenance

I intend to continue contributing to the AArch64 backend and will address the remaining issues and limitations incrementally. I am submitting this patch series now, despite the remaining test failures, due to the upcoming academic semester; my availability to dedicate significant time to backend development will be reduced shortly. Providing these changes now prevents further divergence between the upstream repository and my local development branch.

Licensing and Claims

All changes included in this PR are provided under the original permissive licenses of their respective files. I make no proprietary claims regarding these contributions.

hakanrw and others added 27 commits February 21, 2026 16:11
Signed-off-by: Hakan Candar <hakan@candar.tr>
Add /usr/include/aarch64-none-linux-gnu to STDINC, as glibc expects this
to be available in the include path.

Signed-off-by: hakanrw <hakancandar@protonmail.com>
Signed-off-by: Hakan Candar <hakan@candar.tr>
This broke the symtab definition across local.c and symtabs.c, which led to a major ICE.

Refactor to use `sss` instead of `sap`, similar to amd64 backend.

Also fix segmentation fault in exname() caused by sp->sname not being set to NULL
during initialization.

Signed-off-by: Hakan Candar <hakan@candar.tr>
…anisms.

Remove 32-bit ARMv7 assumptions.

Signed-off-by: Hakan Candar <hakan@candar.tr>
Signed-off-by: Hakan Candar <hakan@candar.tr>
Remove 32-bit ARMv7 assumptions.

Signed-off-by: Hakan Candar <hakan@candar.tr>
…and WORD.

Signed-off-by: Hakan Candar <hakan@candar.tr>
The AArch64 ISA does not have xN,xN,[xN,#p] style OREG additions.

Signed-off-by: Hakan Candar <hakan@candar.tr>
Remove geninsn() calls in myormake() and move them to offstar().
Generating instructions from myormake() leads to unwanted behaviour, where
the result register becomes undefined. No other backends call geninsn()
in myormake(), so I assume this was a pre-existing backend bug.

Cap OREG immediate to +-256 bytes. AArch64 has a lot of instructions with
different OREG limits, but the least common denominator is 256.

Signed-off-by: Hakan Candar <hakan@candar.tr>
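
A minimal sketch of the range check implied by the OREG cap in the commit above. The helper name is hypothetical and does not come from the PCC sources. The bounds assume the signed 9-bit unscaled offset of the LDUR/STUR forms (-256 to 255), since the commit message does not say whether +256 itself is allowed.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: can `off` be used directly as a register-plus-
 * offset (OREG) displacement? Assumes the signed 9-bit unscaled range.
 */
static bool oreg_offset_fits(long long off)
{
	return off >= -256 && off <= 255;
}

void oreg_offset_check(void)
{
	assert(oreg_offset_fits(0));
	assert(oreg_offset_fits(-256));
	assert(oreg_offset_fits(255));
	assert(!oreg_offset_fits(256));
	assert(!oreg_offset_fits(-257));
}
```

Offsets outside this range would need the address computed into a register first, which is the case the cap is meant to force.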
Turn ICONs that are of INT type in PTR PLUS operations into LONGLONG.

Convert unnecessary ADDROF operations wrapped in an UMUL into a no-op.

Signed-off-by: Hakan Candar <hakan@candar.tr>
…MOV instruction.

Convert ICONs of such immediates into NAMEs, and print the literals into a pool
at the beginning of the function.

This logic is put into pass2 and not pass1 because the middle end generates
new ICONs (e.g. due to TEMP conversions) which cannot be detected during pass1.

Signed-off-by: Hakan Candar <hakan@candar.tr>
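
A minimal sketch of one way to decide whether a constant exceeds what a single MOV can carry, as discussed in the literal-pool commit above. This is an assumption about the criterion, not the check used in the PR: it only covers the 64-bit MOVZ and MOVN wide-immediate forms and ignores bitmask (ORR) immediates and MOVK sequences. All function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if v has at most one non-zero 16-bit chunk (one MOVZ suffices). */
static bool fits_movz(uint64_t v)
{
	int shift, nonzero = 0;

	for (shift = 0; shift < 64; shift += 16)
		if ((v >> shift) & 0xffff)
			nonzero++;
	return nonzero <= 1;
}

/* MOVN writes the bitwise inverse of a shifted 16-bit immediate. */
static bool fits_single_mov(uint64_t v)
{
	return fits_movz(v) || fits_movz(~v);
}

void mov_imm_check(void)
{
	assert(fits_single_mov(0x1234));
	assert(fits_single_mov(0xffff0000));
	assert(fits_single_mov((uint64_t)-1));   /* MOVN with immediate 0 */
	assert(fits_single_mov((uint64_t)-2));   /* MOVN with immediate 1 */
	assert(!fits_single_mov(0x12345678));    /* two non-zero chunks */
}
```

Under this sketch, a constant for which fits_single_mov() is false would be a candidate for the literal pool.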
Remove 32-bit ARMv7 assumptions.

Signed-off-by: Hakan Candar <hakan@candar.tr>
Signed-off-by: Hakan Candar <hakan@candar.tr>
The previous optimization removed integer SCONV nodes solely based on equal
tsize(), which is wrong for same-width signedness conversions (e.g., char ->
unsigned char) and caused miscompiles under -O1/-xtemps. Restrict the "free
conversion" elision to (u)longlong <-> (u)longlong only.

Signed-off-by: Hakan Candar <hakan@candar.tr>
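
A small illustration of why the same-width conversion in the commit above is not free. The function names are hypothetical, and signed char is used explicitly because the signedness of plain char differs between targets. Both types occupy one byte, yet widening the value to int must zero-extend in one case and sign-extend in the other, so dropping the conversion node changes the result.

```c
#include <assert.h>

/* Same width, different signedness: the cast changes how -1 widens. */
int widen_via_unsigned(signed char c)
{
	unsigned char u = (unsigned char)c;

	return u;	/* zero-extends */
}

int widen_direct(signed char c)
{
	return c;	/* sign-extends */
}

/* Assumes 8-bit chars, as on both targets discussed in this PR. */
void sconv_check(void)
{
	assert(widen_via_unsigned(-1) == 255);
	assert(widen_direct(-1) == -1);
	assert(widen_via_unsigned(5) == widen_direct(5));
}
```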
Prefix extern symbols with '@'. Expand to GOT load in pass2 via local2.c.

Signed-off-by: Hakan Candar <hakan@candar.tr>
Signed-off-by: Hakan Candar <hakan@candar.tr>
Signed-off-by: Hakan Candar <hakan@candar.tr>
Signed-off-by: Hakan Candar <hakan@candar.tr>
Signed-off-by: hakanrw <hakancandar@protonmail.com>
Signed-off-by: hakanrw <hakancandar@protonmail.com>
Signed-off-by: Hakan Candar <hakan@candar.tr>
Add condition which prevents char arrays from segfaulting the compiler.
The previous solution now applies to both arm and aarch64 backends.

Signed-off-by: Hakan Candar <hakan@candar.tr>
When buildtree(ASSIGN, strtype, strtype) is called, it generates a STASG
operation wrapped in an UMUL operation. Due to this, the righthand side
of the returned node is NULL. This part of the code had not accounted
for this case, which resulted in a segfault.

Check whether the node returned from the ASSIGN build is UMUL, and if so,
discard the parent node.

Signed-off-by: Hakan Candar <hakan@candar.tr>
During the ADDROF elimination optimization, the resulting node type
was not preserved when rewriting the tree. This could cause loads
to be interpreted as STRUCT loads instead of scalar loads.

The issue appears when accessing the first field of a struct in the
first element of a struct array, where UMUL and ADDROF transformations
may alter the node type. When collapsing the ADDROF expression, the
replacement node must inherit the type of the UMUL result.

Fix by explicitly propagating the type from the parent node to the
replacement node before freeing the intermediate nodes.

An example of the previous misoptimization:

U*, int
    +, PTR int
	U&, PTR strty
	    NAME _arr, strty
	ICON, 0, 0x0, longlong

=>

NAME _arr, strty, REG x0 , SU= 0(@reg,,,,)

This commit fixes the above misoptimization.

Signed-off-by: Hakan Candar <hakan@candar.tr>
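
A minimal sketch of the source shape described in the commit above: reading the first field of the first element of a struct array. Whether this exact snippet produces the tree shown in the dump is an assumption; the type and function names are hypothetical, and only the array name mirrors the `_arr` symbol in the dump.

```c
#include <assert.h>

/* Hypothetical struct; the first field sits at offset 0. */
struct pair { int first; int second; };

static struct pair arr[2] = { { 7, 8 }, { 9, 10 } };

/* The load here must be an int load, not a struct load. */
int first_of_first(void)
{
	return arr[0].first;
}

/* Same access pattern at a non-zero offset, for comparison. */
int second_of_second(void)
{
	return arr[1].second;
}

void addrof_fold_check(void)
{
	assert(first_of_first() == 7);
	assert(second_of_second() == 10);
}
```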
In code.c, the field sym->sap was getting rewritten to sym->sss,
which prevented setting sym->sap to NULL. This would in turn
cause a segfault in certain scenarios.

Refactor code.c, remove sap expansion from LANG_C, instead expand
sss from LANG_CXX. Rename sap access to sss to match intended
behavior.

Signed-off-by: hakanrw <hakancandar@protonmail.com>
Signed-off-by: Hakan Candar <hakan@candar.tr>
ragge0 self-assigned this Feb 21, 2026
ragge0 added the enhancement (New feature or request) label Feb 21, 2026
ragge0 (Contributor) commented Feb 21, 2026

Wow, impressive patch! I'll try to find time tomorrow to go through this! Thanks!

hakanrw (Author) commented Feb 21, 2026

Hey, thanks for your interest in the patch!

I've just sent the other patch to pcc-libs (PortableCC/pcc-libs#2).
With these two, I am able to bootstrap aarch64-none-linux-gnu PCC along with libpcc & PCC CSU.

Since the backend still lacks some features (e.g. funptr calls) and a few bugs remain, I build the compiler and pcc-libs with GCC. Once the remaining backend issues are handled, hopefully we'll be able to self-bootstrap with PCC.

Have a nice day!
