This repository was archived by the owner on Jun 21, 2026. It is now read-only.

Some speedup with SSE 4.1 #340

Draft
jpcima wants to merge 1 commit into sfztools:develop from jpcima:sse-opt

jpcima commented Aug 1, 2020

Collaborator

This speeds up sample_quality=2 by 15 to 20%, using the SSE4.1 dot-product primitive and avoiding a bit of instruction latency.
This is only an illustration; the optimization should be CPU-dispatched.
The strings effect could possibly benefit from a similar optimization.
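
For readers unfamiliar with the primitive mentioned above, here is a minimal sketch of a dot product built on the SSE4.1 `_mm_dp_ps` intrinsic. It is not the code from this PR: the function name `dotProduct` and its signature are hypothetical, and the SSE4.1 path is selected at compile time (when the compiler defines `__SSE4_1__`) rather than CPU-dispatched.

```cpp
#include <cstddef>

#if defined(__SSE4_1__)
#include <smmintrin.h> // SSE4.1 intrinsics, including _mm_dp_ps
#endif

// Hypothetical helper (not from the PR): dot product of `n` filter
// coefficients with `n` input samples.
inline float dotProduct(const float* coeffs, const float* samples, std::size_t n)
{
    float sum = 0.0f;
    std::size_t i = 0;
#if defined(__SSE4_1__)
    // Process four floats per iteration with the dot-product instruction.
    for (; i + 4 <= n; i += 4) {
        const __m128 c = _mm_loadu_ps(coeffs + i);
        const __m128 s = _mm_loadu_ps(samples + i);
        // Mask 0xF1: multiply all four lanes, write their sum to lane 0.
        sum += _mm_cvtss_f32(_mm_dp_ps(c, s, 0xF1));
    }
#endif
    // Scalar loop: handles the tail, or everything when SSE4.1 is unavailable.
    for (; i < n; ++i)
        sum += coeffs[i] * samples[i];
    return sum;
}
```

Calling `dotProduct(coeffs, samples, 8)` on an 8-tap kernel would issue two `_mm_dp_ps` operations on an SSE4.1 build and fall back to the scalar loop elsewhere.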

jpcima force-pushed the sse-opt branch 3 times, most recently from 437892d to be7dad3 on August 1, 2020 11:02

paulfd commented Aug 2, 2020

Member

Nice one! With runtime dispatch, I think we can target interesting optimizations like this. Do you want me to benchmark it on Intel/AMD? I was thinking of working on ARM over the holidays, if I have some time 🙂

paulfd commented Aug 2, 2020

Member

btw the meanSquared SIMD helper is also a dot product.
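
To make the remark concrete: the mean of squares is the dot product of a buffer with itself, divided by its length. The sketch below uses a hypothetical function `meanSquaredViaDot`; it is a stand-in for illustration and not the project's actual `meanSquared` helper.

```cpp
#include <cstddef>
#include <numeric>

// Hypothetical stand-in (not the project's helper): mean of squares
// expressed as dot(x, x) / n.
inline float meanSquaredViaDot(const float* x, std::size_t n)
{
    if (n == 0)
        return 0.0f;
    // std::inner_product with both ranges equal to x computes sum(x[i] * x[i]).
    const float sumOfSquares = std::inner_product(x, x + n, x, 0.0f);
    return sumOfSquares / static_cast<float>(n);
}
```

Any speedup applied to a general dot-product kernel would therefore carry over to this helper by passing the same buffer as both operands.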

paulfd commented Dec 26, 2020

Member

Considering the simde version you proposed, is this speedup obsolete? We could maybe have a runtime dispatcher.

jpcima commented Dec 26, 2020

Collaborator Author

> Considering the simde version you proposed, is this speedup obsolete? We could maybe have a runtime dispatcher.

It's by no means obsolete, but it would be desirable to have the CPU dispatcher.
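
As an illustration of what a CPU dispatcher could look like, here is a minimal sketch that picks an implementation once at run time. Everything in it is an assumption rather than the project's design: the names `dotScalar`, `dotSse41`, `selectDot`, `dot` and the macro `SKETCH_X86_DISPATCH` are hypothetical, and the detection relies on the GCC/Clang builtins `__builtin_cpu_init` and `__builtin_cpu_supports`, with a scalar fallback on other compilers and architectures.

```cpp
#include <cstddef>

#if (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
#define SKETCH_X86_DISPATCH 1
#include <smmintrin.h> // SSE4.1 intrinsics
#else
#define SKETCH_X86_DISPATCH 0
#endif

using DotFn = float (*)(const float*, const float*, std::size_t);

// Portable fallback, always available.
static float dotScalar(const float* a, const float* b, std::size_t n)
{
    float sum = 0.0f;
    for (std::size_t i = 0; i < n; ++i)
        sum += a[i] * b[i];
    return sum;
}

#if SKETCH_X86_DISPATCH
// Built with SSE4.1 enabled for this function only, whatever the global flags.
__attribute__((target("sse4.1")))
static float dotSse41(const float* a, const float* b, std::size_t n)
{
    __m128 acc = _mm_setzero_ps();
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        const __m128 va = _mm_loadu_ps(a + i);
        const __m128 vb = _mm_loadu_ps(b + i);
        // 0xF1: multiply all lanes, sum into lane 0; accumulate lane 0 only.
        acc = _mm_add_ss(acc, _mm_dp_ps(va, vb, 0xF1));
    }
    float sum = _mm_cvtss_f32(acc);
    for (; i < n; ++i) // scalar tail
        sum += a[i] * b[i];
    return sum;
}
#endif

// Query the CPU and return the best available implementation.
static DotFn selectDot()
{
#if SKETCH_X86_DISPATCH
    __builtin_cpu_init();
    if (__builtin_cpu_supports("sse4.1"))
        return &dotSse41;
#endif
    return &dotScalar;
}

// Public entry point: the implementation is resolved once, on first call.
static float dot(const float* a, const float* b, std::size_t n)
{
    static const DotFn impl = selectDot();
    return impl(a, b, n);
}
```

Resolving the pointer once keeps the per-call cost to a single indirect call, which matters for short kernels invoked per sample.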

While experimenting with the strings effect, I found that loop unrolling yields large speed gains, and more so when coupled with some inlining (in some cases more than 4x on SSE, which might be explained by the latency of memory accesses or of individual instructions).
I'd like to try the same with the resampler, but the simde PR should be dealt with first.
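
The kind of unrolling described above can be sketched as follows. This is a generic illustration, not the strings-effect or resampler code: the function `dotUnrolled4` is hypothetical, and it uses four independent accumulators so that consecutive multiply-adds do not wait on one another. Note that this changes the summation order, so results may differ from a naive loop in the last bits.

```cpp
#include <cstddef>

// Hypothetical example: dot product unrolled by four, with one accumulator
// per unrolled step to shorten the dependency chain between additions.
inline float dotUnrolled4(const float* a, const float* b, std::size_t n)
{
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += a[i + 0] * b[i + 0];
        s1 += a[i + 1] * b[i + 1];
        s2 += a[i + 2] * b[i + 2];
        s3 += a[i + 3] * b[i + 3];
    }
    // Combine the partial sums, then handle the remaining 0 to 3 elements.
    float sum = (s0 + s1) + (s2 + s3);
    for (; i < n; ++i)
        sum += a[i] * b[i];
    return sum;
}
```

Whether this helps in practice depends on the compiler and the target, so any gain would need to be confirmed by benchmarking.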

paulfd commented Dec 26, 2020 via email

Member
