This example demonstrates how to generate a high-dimensional embedding vector for a given text with llama.cpp.
To get started right away, run the following command, making sure to use the correct path for the model you have:
### Unix-based systems (Linux, macOS, etc.):

```bash
./llama-embedding -m ./path/to/model --pooling mean --log-disable -p "Hello World!" 2>/dev/null
```

### Windows:

```powershell
llama-embedding.exe -m ./path/to/model --pooling mean --log-disable -p "Hello World!" 2>$null
```

The above command will output space-separated float values.
## extra parameters

### --embd-normalize $integer$

| $integer$ | description         | formula |
|-----------|---------------------|---------|
| $-1$      | none                | |
| $0$       | max absolute int16  | $\frac{32760\, x_i}{\max \lvert x_i \rvert}$ |
| $1$       | taxicab             | $\frac{x_i}{\sum \lvert x_i \rvert}$ |
| $2$       | euclidean (default) | $\frac{x_i}{\sqrt{\sum x_i^2}}$ |
| $>2$      | p-norm              | $\frac{x_i}{\sqrt[p]{\sum \lvert x_i \rvert^p}}$ |
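As an illustration of these normalization modes, here is a small Python sketch; the function `embd_normalize` is invented for this example and is not part of llama.cpp, it just applies the formulas above to a plain list of floats:

```python
def embd_normalize(x, norm=2):
    """Illustrative sketch of the --embd-normalize modes (not llama.cpp code)."""
    if norm == -1:                       # none: leave values untouched
        return list(x)
    if norm == 0:                        # max absolute int16: largest |x_i| maps to 32760
        s = 32760.0 / max(abs(v) for v in x)
    elif norm == 1:                      # taxicab: divide by the L1 norm
        s = 1.0 / sum(abs(v) for v in x)
    else:                                # euclidean (norm=2, default) and general p-norm
        s = 1.0 / sum(abs(v) ** norm for v in x) ** (1.0 / norm)
    return [v * s for v in x]

print(embd_normalize([3.0, -4.0]))       # euclidean: result is a unit-length vector
```

With the default euclidean mode, downstream cosine-similarity computations reduce to plain dot products, which is why `--embd-normalize 2` is a convenient default for retrieval use cases.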
### --embd-output-format $'string'$

| $'string'$ | description | |
|---|---|---|
| ''      | space-separated floats (same as above) | (default) |
| 'array' | single embedding                       | $[[x_1,...,x_n]]$ |
|         | multiple embeddings                    | $[[x_1,...,x_n],[x_1,...,x_n],...,[x_1,...,x_n]]$ |
| 'json'  | OpenAI style                           | |
| 'json+' | adds a cosine similarity matrix        | |
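Since the 'array' format is valid JSON, its output can be loaded directly with a standard JSON parser. A minimal Python sketch (the `raw` literal below is placeholder data standing in for captured stdout, not real model output):

```python
import json

# Placeholder for stdout captured from a run such as:
#   llama-embedding ... --embd-output-format 'array'
raw = "[[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]"

embeddings = json.loads(raw)   # one inner list per input prompt
print(len(embeddings), len(embeddings[0]))  # prints: 2 3
```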
### --embd-separator $"string"$

| $"string"$ | |
|---|---|
| "\n"         | (default) |
| "<#embSep#>" | for example |
| "<#sep#>"    | another example |
## examples

### Unix-based systems (Linux, macOS, etc.):

```bash
./llama-embedding -p 'Castle<#sep#>Stronghold<#sep#>Dog<#sep#>Cat' --pooling mean --embd-separator '<#sep#>' --embd-normalize 2 --embd-output-format '' -m './path/to/model.gguf' --n-gpu-layers 99 --log-disable 2>/dev/null
```

### Windows:

```powershell
llama-embedding.exe -p 'Castle<#sep#>Stronghold<#sep#>Dog<#sep#>Cat' --pooling mean --embd-separator '<#sep#>' --embd-normalize 2 --embd-output-format '' -m './path/to/model.gguf' --n-gpu-layers 99 --log-disable 2>$null
```
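Because `--embd-normalize 2` produces unit-length vectors, the cosine similarity that the 'json+' format reports reduces to a plain dot product. A hypothetical helper for comparing two embeddings yourself (illustrative Python, not llama.cpp code):

```python
import math

def cosine_sim(a, b):
    """Cosine similarity; for unit-length vectors this equals the dot product."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

print(cosine_sim([1.0, 0.0], [1.0, 0.0]))  # identical directions -> 1.0
```

Applied to the example above, related prompts such as 'Castle' and 'Stronghold' should score noticeably higher against each other than against 'Dog' or 'Cat'.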