larger gemma models with fp8 are generating junk #128

@programmeddeath1

Description

Hardware:
GPU: NVIDIA A100-SXM4-80GB | VRAM: 85.1 GB

Profile: a100-80gb-27b
Target:
    google/gemma-4-31b-it
    z-lab/gemma-4-31B-it-DFlash
    google/gemma-4-26b-a4b-it
    z-lab/gemma-4-26B-A4B-it-DFlash
Draft: z-lab/Qwen3.6-27B-DFlash
MemFrac: 0.82 | MaxCtx: 8192 | Quant: fp8

I tried DFlash with vLLM for both the 31B and 26B Gemma models. Both produce junk output like the following:
"''' ’/ {-/ de///// de/ de// de/ de// own///// de own own own own de own own own/ de own/ deala owned single own or single owned de own/ de owned de own own own own ownes/ de/ersas laesalen_- {laesal немиеL-////////////////// de own own own own own own own own own own own own own own own own own own own own own own own own own own own own own own own own own own own own own own own own own own own own own own/ -/////ed singleed la la/L or,, la own or,,, or own own///////// de/ de/ de/ de// own own/eseaestasasasasasasaest_теyens own own/aasasaled single own own own////////L own ownaestened//edested,,,,, own,/_ {//_ own own own own own own own own own own own own own own//// own own own own own own own ownK, own own own own own own own own own own own/ own own own own own own ownednessyens own ownest_////// single own own ownedness_L//ationsed/ed//////// or/ de own own own own ownown kind own ownnessy/1 own own own own own own own own own own own own own own own own own own own own own own own own own own/ de or own/ own own own/’// own own own own ownK single, or or or or//al or, {ed single or//ed////edesie or {ed single ownednessy/L own orness_//siesSest_Sersalen own own aestednessy__siesasasalsalness/_ {ednessalsalness/_ own or own oralsalsalsalness/ {ed or oraenness/ {_ed de,/// KC deK1KKerTKalami same orPLTP LPLBS deCist idea//est_TBK deCKist deBernBP deKTTiBB1facebooktans singleT_1/ Interess idea1TT deC single1S single deHB deS de не_ingingsP same kind orK kind la/S singular own orK kind laP, kind de ownP idea// laBCistPP idea orB single/K-/K/P idea, ownH/L singleS/CSKC/T single kinded single/ de person laK/SKP/ de/C singleSK1 personH-S singularingS/ de ผ้า single/CerT l,KB/welcomeB,P/C deBT/T/ kind Mereka small ownL ideaeTist deed,PitoeSKSKLL or deP-C demise que or own de,LSlyP,,,P, or or or orB singleLP/eLPL oneP queingH ideawood own orSáCLP singleed laTist idea own kind small/ististist person la oneingLlyPPC sameoczes deC sameT ownTTT."""

Both models load successfully but generate this kind of output for every request.
