[ML] Add quantized model ops to pytorch_inference allowlist (#2991)
Add aten::mul_ and quantized::linear_dynamic to the allowed operations
list, fixing validation failures for dynamically quantized models such
as ELSER v2 when imported via Eland with torch.quantization.quantize_dynamic.
Also update the model extraction tooling to support a "quantize" flag in
reference_models.json so that quantized variants are traced with dynamic
quantization applied before graph extraction, mirroring the Eland import
pipeline.
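The quantization step mirrored by the extraction tooling can be sketched as follows. This is an illustrative example, not the actual ELSER v2 trace: the model architecture and shapes are invented, and only `torch.quantization.quantize_dynamic`, `nn.Linear` targeting, and tracing before graph extraction come from the commit description.

```python
import torch
from torch import nn

# Toy stand-in for a model with nn.Linear layers (shapes are arbitrary).
model = nn.Sequential(nn.Linear(8, 8), nn.ReLU(), nn.Linear(8, 2)).eval()

# Dynamically quantize the nn.Linear layers, as Eland does on import.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

# Trace the quantized model; the inlined graph now contains
# quantized::linear_dynamic ops in place of the float linear ops,
# which is why the allowlist needs the new entries.
traced = torch.jit.trace(quantized, torch.randn(1, 8))
print("quantized::linear_dynamic" in str(traced.inlined_graph))
```

Validating the traced graph's op set against the allowlist is what previously failed for such models.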
(cherry picked from commit 92432d6)
The comment added to reference_models.json documents why the quantized variants need separate handling:

"_comment:quantized": "Quantized variants: Eland applies torch.quantization.quantize_dynamic on nn.Linear layers when importing models. These produce quantized::* ops not present in the standard traced graphs above.",
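A minimal sketch of how an entry might use the new "quantize" flag. The flag name comes from the commit description, but the surrounding entry structure (model identifier, other keys) is a hypothetical illustration, not the actual reference_models.json schema:

```json
{
  "some/quantizable-model": {
    "quantize": true
  }
}
```

With the flag set, the tooling applies dynamic quantization before tracing, so the extracted graph contains the quantized::* ops.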