Hi, I'm having trouble with the flash attention package you used on the ARM architecture. Would you folks be able to release a Hugging Face version, or one that uses only PyTorch? Thanks!