Stability changes to SIMD code (building for Clang + on the M1). #108

joerowell · 2022-07-17T14:59:11Z

TL;DR: This PR makes some small structural changes to the SIMD code. The code works as before, but it's slightly closer to standard C++. Might also be worth considering if we want to test how fast a pure C++ version of the bucketer is. Big thanks to both @ElenaKirshanova and @malb for helping me to debug some of these issues and fixing some of the build code respectively.

This PR fixes the SIMD code on Clang and ARM machines. It turns out that some of the SIMD code didn't compile with Clang or crashed on the M1. This meant that some changes had to be made to e.g. the shuffling code. There were also some type-punning differences that I had to fix (see all of the extra calls to memcpy).
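For readers unfamiliar with the memcpy idiom mentioned above, here is a minimal sketch of the general technique. The helper names (`float_bits`, `Lanes4`, `load_lanes`) are hypothetical and are not taken from the G6K sources; the point is only that copying bytes with `std::memcpy` is well-defined in standard C++, whereas reading through a `reinterpret_cast`-ed pointer of an unrelated type can violate strict aliasing or alignment rules, and compilers may treat such code differently.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not from G6K): view the bits of a float as a
// 32-bit integer without casting pointers between unrelated types.
inline std::uint32_t float_bits(float value) {
    static_assert(sizeof(std::uint32_t) == sizeof(float), "size mismatch");
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof(bits));  // well-defined byte copy
    return bits;
}

// Hypothetical stand-in for a 128-bit SIMD register: four 32-bit lanes.
struct Lanes4 {
    std::uint32_t lane[4];
};

// Load four lanes from a byte buffer. Unlike
// *reinterpret_cast<const Lanes4*>(src), this makes no assumption
// about the alignment or the dynamic type of the source bytes.
inline Lanes4 load_lanes(const unsigned char* src) {
    Lanes4 out;
    std::memcpy(&out, src, sizeof(out));
    return out;
}
```

Optimising compilers typically lower a fixed-size `std::memcpy` like this to a plain register move or load, so the idiom usually costs nothing at runtime.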

Note that this PR doesn't add full support for G6K to the M1. All of the sieves work, except for the HK3 sieve in a multi-threaded setting (there's a use-after-free crash). This is something we hope to fix.

As before, the tests for this code are here.

Note that as part of implementing this, I had to re-implement in standard C++ every Intel intrinsic that we use (so that I could test the intrinsics on ARM). This means that we could switch over to a pure C++ implementation of the bucketer if needed: we just have to switch out the calls to the intrinsics.
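To illustrate what such a re-implementation can look like, here is a sketch of a portable model of one Intel intrinsic, `_mm_shuffle_epi8`. This thread does not list which intrinsics were re-implemented, so the choice of intrinsic is an assumption (picked because the shuffling code is mentioned above), and the names `Bytes16` and `shuffle_bytes` are hypothetical rather than taken from the PR.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a 128-bit register viewed as 16 bytes.
using Bytes16 = std::array<std::uint8_t, 16>;

// Portable model of the byte shuffle performed by _mm_shuffle_epi8:
// for each output byte i, if the top bit of control[i] is set the
// result is 0; otherwise the low four bits of control[i] select a
// byte of src.
inline Bytes16 shuffle_bytes(const Bytes16& src, const Bytes16& control) {
    Bytes16 out{};
    for (std::size_t i = 0; i < out.size(); ++i) {
        if (control[i] & 0x80) {
            out[i] = 0;
        } else {
            out[i] = src[control[i] & 0x0F];
        }
    }
    return out;
}
```

A scalar model like this can be compared against the real intrinsic on x86 and then run unchanged on ARM, which is the kind of cross-checking the paragraph above describes.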

Finally, there's one outstanding curiosity: the vectorised random number code (that I wrote) sometimes gets stuck in a fixed point. I just replaced this with two calls to rand, and I didn't notice any performance differences. This might be something to look into.
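As background on how a generator can get stuck, here is a sketch using a textbook xorshift step. This is an assumption for illustration only: the thread does not say which algorithm the vectorised generator uses, and `xorshift32_step` and `two_rand_calls` are hypothetical names. Xorshift-style generators have the all-zero state as a fixed point, so a lane that ever becomes zero stays zero. The second function shows one possible way to assemble a value from two calls to `std::rand`, which may or may not match the replacement made in the PR.

```cpp
#include <cstdint>
#include <cstdlib>

// One step of the classic 32-bit xorshift generator (shifts 13, 17, 5).
// The state 0 maps to 0, so an all-zero state never recovers.
inline std::uint32_t xorshift32_step(std::uint32_t x) {
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    return x;
}

// Hypothetical fallback: combine the low 16 bits of two std::rand()
// calls into one 32-bit value. RAND_MAX is only guaranteed to be at
// least 32767, so on some platforms bits 15 and 31 are always zero.
inline std::uint32_t two_rand_calls() {
    const std::uint32_t hi = static_cast<std::uint32_t>(std::rand()) & 0xFFFFu;
    const std::uint32_t lo = static_cast<std::uint32_t>(std::rand()) & 0xFFFFu;
    return (hi << 16) | lo;
}
```

If the vectorised generator is of this family, checking that no lane is ever seeded with or reset to zero would be a cheap first diagnostic.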

ElenaKirshanova · 2022-07-18T07:43:07Z

Tested on M1 with Martin's setup.py from #107
Everything compiles fine.

lducas · 2023-12-02T06:11:49Z

Any reason we haven't merged this yet?

joerowell · 2023-12-02T12:34:41Z

Afaik, no one ever reviewed it. However, since it's a bit old now, I'll rebase with the most recent Cython changes and see if it still works.


joerowell force-pushed the arm-fixes branch from 65f118c to 744b7ef Compare July 18, 2022 06:08

joerowell added 4 commits August 4, 2024 19:05

Stability changes to SIMD code.

eee08fc

Small changes for building on ARM.

52b9a6d

Remove spurious initialisation.

15efe9d

Stability changes to SIMD code.

e848580

joerowell force-pushed the arm-fixes branch from 1fd745f to e848580 Compare August 4, 2024 18:10

joerowell mentioned this pull request Aug 4, 2024

Installing g6k (arm-fixes) on M1: suggestion and error report #128

Open
