14 commits
b5230d5
feat(ai-models): tool_call capability flag + open model registry
softmarshmallow Jun 12, 2026
0c9fcb3
feat(agent): OpenAI-compatible endpoint providers with Ollama preset …
softmarshmallow Jun 12, 2026
131cfe1
feat(desktop): local-model settings, picker integration, capability g…
softmarshmallow Jun 12, 2026
1ce4bab
fix(agent): streaming usage + thinking-model caps, live Ollama e2e (#…
softmarshmallow Jun 12, 2026
1a80910
fix(editor): pin provider_id on sends of registered endpoint models (…
softmarshmallow Jun 12, 2026
22c759f
docs(editor): local models (Ollama) user guide with screenshots (#806)
softmarshmallow Jun 12, 2026
abb9310
feat(agent): auto-detect endpoint models (Ollama /api/tags probe) (#806)
softmarshmallow Jun 12, 2026
765fa9f
feat(agent): detect the context window too (/api/ps + /api/show) (#806)
softmarshmallow Jun 12, 2026
e409380
feat(agent): detection-owned model fields + overrides escape hatch (#…
softmarshmallow Jun 12, 2026
482a6ad
docs(desktop): trim local-models doc to one screenshot, fix stale cop…
softmarshmallow Jun 12, 2026
2730188
refactor(agent): dedup #806 seams — shared merge/gate/default-model h…
softmarshmallow Jun 12, 2026
5f7b70a
fix(agent): delete endpoint secret with the endpoint; pin provider on…
softmarshmallow Jun 12, 2026
7a46ea8
fix(agent): address CodeRabbit review — key slot UI, store races, rou…
softmarshmallow Jun 12, 2026
2375916
fix(agent): scope stale-model check to its endpoint; guard settings a…
softmarshmallow Jun 12, 2026
28 changes: 28 additions & 0 deletions SECURITY.md
@@ -402,6 +402,34 @@ http://localhost:*`. The nonce is generated in the proxy, exposed
the BYOK provider; key material never returns to renderer. Closes
the exfil path even if all four layers above were bypassed.

**Endpoint providers (local LLMs, #806).** The agent host additionally
serves `/providers/endpoints/*` — CRUD over user-configured
OpenAI-compatible endpoints (Ollama preset, self-hosted gateways),
persisted at `${userData}/endpoints.json`. The split that keeps layer 5
intact: an endpoint **config** (base URL + registered model list) is
plain readable config the renderer may list back, while an endpoint's
optional **API key** rides the `/secrets/*` surface under the endpoint's
id (the secrets-route allowlist admits configured endpoint ids) and is
never readable. The config validator
(`packages/grida-ai-agent/src/protocol/endpoints.ts`) pins the shape —
http(s) URL, bounded sizes, unknown fields dropped — so a config write
cannot smuggle credentials or blobs into the readable store. The
`base_url` is user-owned egress by design (the desktop user points their
own agent at their own endpoint — same trust model as BYOK), and the
routes sit behind the same CORS/Referer/Basic-Auth stack as everything
else. The `/providers/endpoints/probe` route makes the host GET a
user-supplied URL's model listing (the renderer's grida.co origin cannot
reach a local Ollama itself) — the same egress a configured run already
performs; responses are parsed and reduced to
`{id, tool_call, contextWindow}` rows with bounded reads (timeout + size
cap), never proxied raw. On sandboxed
platforms the srt network policy additionally bounds all of this
structurally: outbound to **localhost** is permitted via the
`allowLocalBinding` local-ip rule (how the user's own `ollama serve` is
reached), while a config pointing at an arbitrary **remote** host is
blocked unless that host is in the enumerated `allowed_domains` — a
hostile config cannot turn the sidecar into an open exfil channel.
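
The "pins the shape" rule above can be illustrated with a minimal sketch. This is not the code in `packages/grida-ai-agent/src/protocol/endpoints.ts`; the type names, the size bounds, and the reduced three-field model shape are assumptions made for illustration (the real schema carries more fields, such as `overrides`). The point it shows is that the validator rebuilds the config from known fields only, so an extra key such as a credential never reaches the readable store.

```typescript
// Hypothetical shapes and bounds, for illustration only.
interface EndpointModel {
  id: string;
  tool_call: boolean;
  contextWindow: number;
}

interface EndpointConfig {
  id: string;
  base_url: string;
  models: EndpointModel[];
}

const MAX_ID_LENGTH = 128; // assumed bound
const MAX_URL_LENGTH = 2048; // assumed bound
const MAX_MODELS = 256; // assumed bound

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

function isBoundedId(v: unknown): v is string {
  return typeof v === "string" && v.length > 0 && v.length <= MAX_ID_LENGTH;
}

function validateModel(input: unknown): EndpointModel | null {
  if (!isRecord(input)) return null;
  const id = input["id"];
  const tool_call = input["tool_call"];
  const contextWindow = input["contextWindow"];
  if (!isBoundedId(id)) return null;
  if (typeof tool_call !== "boolean") return null;
  if (typeof contextWindow !== "number") return null;
  if (!Number.isInteger(contextWindow) || contextWindow <= 0) return null;
  // Rebuild from known fields only; anything else on `input` is dropped.
  return { id, tool_call, contextWindow };
}

function validateEndpointConfig(input: unknown): EndpointConfig | null {
  if (!isRecord(input)) return null;
  const id = input["id"];
  const base_url = input["base_url"];
  const models = input["models"];
  if (!isBoundedId(id)) return null;
  if (typeof base_url !== "string" || base_url.length > MAX_URL_LENGTH) return null;

  // Only http(s) URLs are accepted; file:, data:, etc. are rejected.
  let url: URL;
  try {
    url = new URL(base_url);
  } catch {
    return null;
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") return null;

  if (!Array.isArray(models) || models.length > MAX_MODELS) return null;
  const cleaned: EndpointModel[] = [];
  for (const candidate of models) {
    const model = validateModel(candidate);
    if (model === null) return null;
    cleaned.push(model);
  }
  return { id, base_url, models: cleaned };
}
```

Returning a freshly built object, rather than the input with fields deleted, is what makes "unknown fields dropped" hold by construction in this sketch.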

**Electron-side hardening (mandatory; see the
[Electron security checklist](https://www.electronjs.org/docs/latest/tutorial/security)).**
`contextIsolation: true`, `nodeIntegration: false`, `sandbox: true`,
12 changes: 12 additions & 0 deletions desktop/src/preload.ts
@@ -445,6 +445,18 @@ const bridge: DesktopBridge = {
    },
  },

  providers: {
    list_endpoints: () => agentClient.providers.list_endpoints(),
    set_endpoint: async (config) => {
      await agentClient.providers.set_endpoint(config);
    },
    delete_endpoint: async (id) => {
      await agentClient.providers.delete_endpoint(id);
    },
    info: () => agentClient.providers.info(),
    probe_endpoint: (baseUrl) => agentClient.providers.probe_endpoint(baseUrl),
  },

  agent: {
    run: (opts, onChunk) =>
      // Fresh runs always return a stream (only `reconnect` may return
8 changes: 8 additions & 0 deletions docs/editor/desktop/_category_.json
@@ -0,0 +1,8 @@
{
  "label": "Desktop",
  "link": {
    "type": "generated-index",
    "title": "Grida Desktop",
    "description": "Guides for the Grida Desktop app."
  }
}
127 changes: 127 additions & 0 deletions docs/editor/desktop/local-models.md
@@ -0,0 +1,127 @@
---
title: Local Models (Ollama)
description: Run the Grida Desktop agent on AI models that live on your own machine — no account, no API key.
keywords:
- ollama
- local llm
- local ai
- byok
- grida desktop
- ai agent
format: md
doc_tasks:
- update
---

# Local Models (Ollama)

Grida Desktop's AI agent can run on models that live entirely on your own
machine, served by [Ollama](https://ollama.com). There is no account to
create and no API key to paste — your prompts, files, and the model's
responses never leave your computer.

You can use local models alongside provider keys (OpenRouter, Vercel), or
as your only setup.

## Requirements

- **Grida Desktop** installed.
- **Ollama** installed and running (`ollama serve` — the desktop Ollama app
runs it for you).
- At least one model pulled, for example:

```sh
ollama pull gpt-oss:20b
```

A note on expectations: local models vary widely in how well they drive
the agent. The agent leans on tool calling (reading and writing files,
running commands, planning), and small models often handle this poorly.
Models in the ~30B class and up are recommended for agent tasks.

## Set up Ollama

Open **Settings** from the app menu, find the **Local Models** card, and
click **Set up Ollama**. The base URL is prefilled with Ollama's local
address (`http://localhost:11434/v1`), and the models you have pulled are
detected automatically.

![The Local Models card after setup, with an auto-detected model and its context window and tool-support badges](./img/local-models-configured.webp)

Review the list and click **Save**:

- Each detected model shows its **context window** and **tool-calling**
support as read-only badges. These come from the endpoint itself and
refresh whenever you open Settings (and on **Detect**, useful after you
`ollama pull` a new model). For a model that is currently loaded, the
context window is the size your server actually allocated; otherwise it
is the model's maximum.
- A model you add manually by id (for example on a gateway that doesn't
report capabilities) keeps editable fields instead — there, you are
the data source. Manually added models default to a conservative
`8192` context.

The first model in the list is the default — background work like session
titles and summaries also runs on it.

## Use a local model

Registered models appear in the model picker in every agent composer,
grouped under the endpoint name (for example `gpt-oss:20b · Ollama`).
Pick one and chat as usual. Everything the agent does — reading your
workspace files, making edits, planning — runs against the local model.
Each session remembers the model it ran with.

If you have no provider key configured at all, the agent uses your Ollama
setup automatically.

## Models without tool support

The agent works through tool calls, so a model that cannot make them
loses most of its abilities. Tool support is detected per model — Ollama
reports it, and `ollama show <model>` lists `tools` when a model supports
tool calling. When you select a model without tool support, the composer
shows a warning, but you can still chat with it.
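
As a rough illustration of per-model detection, the sketch below reduces an Ollama `/api/show` response to one `{id, tool_call, contextWindow}` row. It assumes the response carries a `capabilities` array (with `"tools"` for tool-calling models) and a `model_info` object with a key ending in `.context_length`, as recent Ollama versions report; the function and type names are hypothetical, and Grida's actual detection code may work differently.

```typescript
// Hypothetical row shape, mirroring the fields named in this guide.
interface ModelRow {
  id: string;
  tool_call: boolean;
  contextWindow: number | null; // null when the endpoint reports no size
}

// Reduce an already-parsed /api/show JSON body to a single row.
// Assumes `capabilities: string[]` and `model_info["<arch>.context_length"]`.
function rowFromShowResponse(id: string, body: unknown): ModelRow {
  const record: Record<string, unknown> =
    typeof body === "object" && body !== null
      ? (body as Record<string, unknown>)
      : {};

  // Tool support: present only if the capabilities list names "tools".
  const capabilities = record["capabilities"];
  const tool_call = Array.isArray(capabilities) && capabilities.includes("tools");

  // Context window: the model's maximum, keyed by architecture name.
  let contextWindow: number | null = null;
  const info = record["model_info"];
  if (typeof info === "object" && info !== null) {
    for (const [key, value] of Object.entries(info as Record<string, unknown>)) {
      if (key.endsWith(".context_length") && typeof value === "number") {
        contextWindow = value;
        break;
      }
    }
  }

  return { id, tool_call, contextWindow };
}
```

A body with no `capabilities` field yields `tool_call: false`, which matches the conservative behavior described above: the model stays usable for chat, with a warning.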

## Troubleshooting

- **The model errors immediately.** Check that Ollama is running: open
`http://localhost:11434` in a browser — it should answer
`Ollama is running`.
- **A model is missing from the picker.** Only registered models appear.
Click **Detect** in **Settings → Local Models** after pulling a new
model, or add its id manually.
- **Long sessions stop or degrade.** The detected context window may be
larger than what your serving configuration actually allows (it
converges to the served size once the model has been loaded). To pin a
smaller value, set an override in the config file — see below.
- **Slow responses.** Local speed is your hardware's speed. Smaller
models respond faster but handle agent tasks worse.

## Other OpenAI-compatible endpoints

The base URL accepts any OpenAI-compatible server on your machine, so a
local gateway such as LiteLLM or vLLM works the same way: point the base
URL at it and register the models it serves. If the gateway needs an API
key, save it in the card's **API key** field (it appears once the
endpoint is saved) — the key is stored by the agent host and never shown
back. Ollama itself needs no key.

## Advanced: the config file

Everything on this page is stored as plain JSON in `endpoints.json` (the
settings card links to it). Detected values refresh automatically, so
hand-edits to them won't stick — if an endpoint reports a value that is
wrong for your setup (for example, your server caps context below the
model's maximum), pin the correction in the model's `overrides` instead.
Overrides always win over detected values, and detection never touches
them:

```json
{
  "id": "gemma4:31b-mlx",
  "tool_call": true,
  "contextWindow": 262144,
  "overrides": { "contextWindow": 32768 }
}
```
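
The precedence rule can be sketched in a few lines of TypeScript. This is a sketch under stated assumptions, not the agent's actual merge helper: the type and function names are hypothetical, and it assumes `tool_call` can be overridden the same way as `contextWindow`.

```typescript
// Hypothetical types; field names follow the JSON example above.
interface DetectedFields {
  tool_call: boolean;
  contextWindow: number;
}

interface RegisteredModel extends DetectedFields {
  id: string;
  overrides?: Partial<DetectedFields>;
}

// The values the agent would actually use: an override wins when present.
function effectiveModel(model: RegisteredModel): DetectedFields & { id: string } {
  return {
    id: model.id,
    tool_call: model.overrides?.tool_call ?? model.tool_call,
    contextWindow: model.overrides?.contextWindow ?? model.contextWindow,
  };
}

// A detection refresh rewrites the detected fields and leaves overrides alone.
function applyDetection(model: RegisteredModel, detected: DetectedFields): RegisteredModel {
  return { ...model, tool_call: detected.tool_call, contextWindow: detected.contextWindow };
}

const stored: RegisteredModel = {
  id: "gemma4:31b-mlx",
  tool_call: true,
  contextWindow: 262144,
  overrides: { contextWindow: 32768 },
};

// Simulate a later detection pass that reports a different size.
const refreshed = applyDetection(stored, { tool_call: true, contextWindow: 131072 });
console.log(effectiveModel(refreshed).contextWindow);
```

Keeping detected values and overrides in separate slots is what lets a refresh overwrite the former freely without losing a hand-pinned correction.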