fix: align LlamaContextParams/LlamaModelParams with llama.cpp b6862 ABI by Marvinthebored · Pull Request #40 · dianlight/gollama.cpp

Marvinthebored · 2026-06-07T01:24:44Z

Problem

LlamaContextParams matches a much older llama.cpp than the bundled b6862 libraries. By coincidence both layouts are 120 bytes, so nothing crashes — but every field in the first 40 bytes (and the bool tail) silently lands on the wrong C struct member, on every platform and both FFI paths (purego struct registration on darwin, libffi descriptors elsewhere — ffi.go encodes the same stale layout).

Go field (old)	actual C field it landed on (b6862)
`Seed`	`n_ctx`
`NCtx`	`n_batch`
`NBatch`	`n_ubatch`
`NUbatch`	`n_seq_max`
`NSeqMax`	`n_threads`
`NThreads`	`n_threads_batch`
`NThreadsBatch`	`rope_scaling_type`
`RopeScalingType`	`pooling_type`
`PoolingType`	`attention_type`
`AttentionType`	`flash_attn_type`
(aligned again from `rope_freq_base` through `abort_callback_data`)
`Logits`	`embeddings`
`Embeddings`	`offload_kqv`
`Offload_kqv`	`no_perf`
`FlashAttn`	`op_offload`
`NoPerf`	`swa_full`

Observed symptoms (darwin-amd64, official b6862 dylib):

cp.NUbatch = 512 → llama_init_from_model: failed to initialize the context: n_seq_max must be <= 256
cp.Embeddings = 1 actually enabled offload_kqv; real embeddings stayed false → llama_get_embeddings* returns NULL
Context_default_params() reported NCtx=2048 (that's n_batch's default read through the wrong offset)

Fix

LlamaContextParams: drop Seed (removed from llama_context_params upstream long ago; LLAMA_DEFAULT_SEED kept for samplers), add FlashAttnType (LLAMA_FLASH_ATTN_TYPE_{AUTO,DISABLED,ENABLED}), replace the bool tail with b6862's {embeddings, offload_kqv, no_perf, op_offload, swa_full, kv_unified}.
LlamaModelParams: add trailing NoHost (b6862).
ffi.go: both libffi struct descriptors updated to the same layout.
Hardcoded default fallbacks updated; new fields use llama.cpp's defaults.
Examples: drop obsolete ctxParams.Logits = 1 (no context-level logits flag in b6862; per-token logits via batch.logits).
ffi_test: assert on real fields instead of Seed (that assertion only passed because Seed was reading n_ctx).

Verification (darwin-amd64, official b6862 dylib)

With this patch, through the public API only:

defaults via gollama API: n_ctx=512 n_batch=2048 n_ubatch=512 n_seq_max=1 flash_attn_type=-1   ← correct llama.cpp defaults, correct fields
Init_from_model with mutated NCtx/NBatch/NUbatch/Embeddings/PoolingType → context created
Tokenize + Encode + Get_embeddings_ith → real 768-dim embedding (nomic-embed-text-v1.5 Q8_0), L2≈24.4

Note for darwin-amd64

Unrelated to this PR but adjacent: with the pinned purego v0.9.1, RegisterFunc(llama_init_from_model) panics with too many arguments on darwin-amd64 (the 120-byte by-value struct exceeds purego's stack-slot budget). Fixed upstream in purego via ebitengine/purego#425 (first available in v0.11.0-alpha.3). Happy to send that bump as a separate PR if wanted.

🤖 Generated with Claude Code

LlamaContextParams matched a much older llama.cpp than the bundled b6862 libraries. Both layouts happen to be 120 bytes, so calls succeed but every field after the divergence lands on the wrong C struct member: Go field (old) -> actual C field (b6862) Seed -> n_ctx NCtx -> n_batch NBatch -> n_ubatch NUbatch -> n_seq_max NSeqMax -> n_threads NThreads -> n_threads_batch NThreadsBatch -> rope_scaling_type RopeScalingType -> pooling_type PoolingType -> attention_type AttentionType -> flash_attn_type (aligned again from rope_freq_base ... abort_callback_data) Logits -> embeddings Embeddings -> offload_kqv Offload_kqv -> no_perf FlashAttn -> op_offload NoPerf -> swa_full (kv_unified unreachable) Observed: setting NUbatch=512 was rejected by llama_init_from_model with "n_seq_max must be <= 256"; setting Embeddings=1 actually enabled offload_kqv while real embeddings stayed false (llama_get_embeddings returns NULL). Changes: - LlamaContextParams: drop Seed (removed from llama_context_params long ago; LLAMA_DEFAULT_SEED remains for samplers), add FlashAttnType enum (LLAMA_FLASH_ATTN_TYPE_{AUTO,DISABLED,ENABLED}), replace the stale bool tail {Logits, Embeddings, Offload_kqv, FlashAttn, NoPerf} with b6862's {Embeddings, Offload_kqv, NoPerf, OpOffload, SwaFull, KvUnified}. - LlamaModelParams: add trailing NoHost bool (b6862). - ffi.go: update both libffi struct descriptors to the same layout (the non-darwin path marshals through these, so it had the same bug). - Hardcoded default fallbacks updated; new fields default to llama.cpp's values (flash_attn_type=AUTO, op_offload=true, swa_full=true, kv_unified=false). - Examples: drop obsolete `ctxParams.Logits = 1` (per-token logits are requested via batch.logits; the context-level flag no longer exists). - ffi_test: assert on real fields instead of Seed (that assertion only passed because Seed was reading C n_ctx). Verified on darwin-amd64 against the official b6862 dylib: defaults round-trip with correct values (n_ctx=512, n_batch=2048, n_ubatch=512, n_seq_max=1, flash_attn_type=-1), llama_init_from_model accepts mutated params, and Encode produces real embeddings through the public API. Note for darwin-amd64 users: RegisterFunc of llama_init_from_model needs purego >= v0.11.0-alpha.3 (ebitengine/purego#425) — with the pinned v0.9.1 registration panics with "too many arguments" regardless of this fix. Not bumped here to keep the change focused. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: align LlamaContextParams/LlamaModelParams with llama.cpp b6862 ABI#40

fix: align LlamaContextParams/LlamaModelParams with llama.cpp b6862 ABI#40
Marvinthebored wants to merge 1 commit into
dianlight:mainfrom
Marvinthebored:fix-context-params-b6862-abi

Marvinthebored commented Jun 7, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

Marvinthebored commented Jun 7, 2026

Problem

Fix

Verification (darwin-amd64, official b6862 dylib)

Note for darwin-amd64

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant