[Kernel][Quantization] feat: add Gluon kernels for AWQ quantization #1520

Open
AlpinDale wants to merge 1 commit into main from gluon_awq
@AlpinDale
Collaborator

Still a WIP. This needs Triton built from source:

$ apt install zlib1g-dev
$ git clone https://github.com/triton-lang/triton.git && cd triton
$ uv pip install -r python/requirements.txt
$ uv pip install -ve .  # may take a while, needs to download a 1.2 GiB llvm archive
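
Before launching the engine, it can help to confirm that the source build is visible to the current environment. The following is only a minimal sketch: `has_module` is a hypothetical helper, and the module path `triton.experimental.gluon` is an assumption about where a source build exposes Gluon, not something this PR states.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if `name` can be located for import, False otherwise."""
    try:
        return importlib.util.find_spec(name) is not None
    except Exception:
        # Locating a dotted name imports its parent packages first, so a
        # missing or broken parent (e.g. no triton at all) ends up here.
        return False


# Assumption: a source build of Triton exposes Gluon under this module path.
for mod in ("triton", "triton.experimental.gluon"):
    print(f"{mod}: {'found' if has_module(mod) else 'missing'}")
```

If either line reports `missing`, the build was probably installed into a different environment than the one `aphrodite` runs from.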

$ APHRODITE_USE_GLUON_AWQ=1 aphrodite run Orion-zhen/Qwen3-0.6B-AWQ -q awq --dtype float16 --enforce-eager

Does not work yet; some quirks still need to be ironed out.
