Merged
51 changes: 48 additions & 3 deletions docs/API_REFERENCE.md
@@ -42,11 +42,12 @@ Block receives a `Langfuse::Config` object with these properties:
| `timeout` | Integer | No | `5` | HTTP timeout (seconds) |
| `cache_ttl` | Integer | No | `60` | Prompt cache TTL (seconds) |
| `cache_max_size` | Integer | No | `1000` | Max cached prompts |
| `cache_backend` | Symbol | No | `:memory` | `:memory` or `:rails` |
| `cache_backend` | Symbol | No | `:memory` | `:memory`, `:rails`, or `:auto` |
| `cache_lock_timeout` | Integer | No | `10` | Lock timeout (seconds) |
| `cache_stale_while_revalidate` | Boolean | No | `false` | Advisory SWR intent flag (effective activation depends on `cache_stale_ttl`) |
| `cache_stale_ttl` | Integer or `:indefinite` | No | `0` | Stale TTL (seconds, `>0` enables SWR) |
| `cache_refresh_threads` | Integer | No | `5` | Background refresh threads |
| `prompt_cache_observer` | Callable | No | `nil` | Prompt cache event hook |
| `batch_size` | Integer | No | `50` | Score + trace export batch size |
| `flush_interval` | Integer | No | `10` | Score + trace export interval (s) |
| `sample_rate` | Float | No | `1.0` | Trace + trace-linked score sampling rate (`0.0..1.0`) |
@@ -215,7 +216,7 @@ Fetch a prompt from Langfuse (with caching).
**Signature:**

```ruby
get_prompt(name, version: nil, label: nil, fallback: nil, type: nil)
get_prompt(name, version: nil, label: nil, fallback: nil, type: nil, cache_ttl: nil)
```

**Parameters:**
@@ -227,6 +228,7 @@ get_prompt(name, version: nil, label: nil, fallback: nil, type: nil)
| `label` | String | No | Version label (e.g., "production") |
| `fallback` | String or Array<Hash> | No | Fallback prompt if not found (`String` for text, `Array<Hash>` for chat) |
| `type` | Symbol | Conditional | `:text` or `:chat` (required if `fallback` provided) |
| `cache_ttl` | Integer | No | Per-call cache TTL override. `0` bypasses cache read/write |

**Returns:** `Langfuse::TextPromptClient` or `Langfuse::ChatPromptClient`

@@ -257,14 +259,57 @@ prompt = client.get_prompt("new-prompt",

See [PROMPTS.md](PROMPTS.md) for complete guide.

### `Client#get_prompt_result`

Fetch a prompt and return cache metadata.

**Signature:**

```ruby
get_prompt_result(name, version: nil, label: nil, fallback: nil, type: nil, cache_ttl: nil)
```

**Returns:** `Langfuse::PromptFetchResult`

| Attribute | Type | Description |
| --------- | ---- | ----------- |
| `prompt` | TextPromptClient or ChatPromptClient | Prompt client |
| `logical_key` | String | Stable logical identity: prompt name plus version or label (defaults to the `production` label) |
| `storage_key` | String | Backend key for the current cache generation |
| `cache_status` | Symbol | `:hit`, `:miss`, `:stale`, `:refresh`, `:bypass`, or `:disabled` |
| `source` | Symbol | `:cache`, `:api`, or `:fallback` |
| `fallback?` | Boolean | Whether fallback content was returned |

```ruby
result = client.get_prompt_result("greeting", label: "production")
result.prompt.compile(name: "Ada")
result.cache_status # => :miss
```

### Prompt Cache Operations

Flat client APIs for operational prompt cache control:

```ruby
client.refresh_prompt("greeting", label: "production", cache_ttl: 60)
client.invalidate_prompt_cache("greeting", label: "production")
client.invalidate_prompt_cache_by_name("greeting")
client.clear_prompt_cache
client.prompt_cache_stats
client.prompt_cache_key("greeting")
client.validate_prompt_cache_backend!
```

`invalidate_prompt_cache_by_name` and `clear_prompt_cache` use generation counters. Rails.cache entries from old generations are not scanned; they become unreachable and expire by TTL.

### `Client#compile_prompt`

Convenience method: fetch and compile in one call.

**Signature:**

```ruby
compile_prompt(name, variables: {}, version: nil, label: nil, fallback: nil, type: nil)
compile_prompt(name, variables: {}, version: nil, label: nil, fallback: nil, type: nil, cache_ttl: nil)
```

**Parameters:**
103 changes: 91 additions & 12 deletions docs/CACHING.md
@@ -7,6 +7,7 @@ For configuration options, see [CONFIGURATION.md](CONFIGURATION.md).
## Table of Contents

- [Overview](#overview)
- [Public Cache Operations](#public-cache-operations)
- [In-Memory Cache](#in-memory-cache-default)
- [Rails.cache Backend](#railscache-backend-distributed)
- [Stale-While-Revalidate (SWR)](#stale-while-revalidate-swr)
@@ -22,7 +23,71 @@ The Langfuse Ruby SDK provides two caching backends to optimize prompt fetching:
1. **In-Memory Cache** (default) - Thread-safe, local cache with TTL and bounded expiration-ordered eviction
2. **Rails.cache Backend** - Distributed caching with Redis/Memcached

Both backends support TTL-based expiration and stale-while-revalidate (SWR). Distributed stampede protection via locking is specific to the Rails.cache backend; the in-memory backend mitigates stampedes within a single process using Monitor-based single-flight locks.
Both backends support TTL-based expiration, stale-while-revalidate (SWR), and logical generation-based invalidation. Distributed stampede protection via locking is specific to the Rails.cache backend; the in-memory backend mitigates stampedes within a single process using Monitor-based single-flight locks.
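The single-flight idea mentioned above can be illustrated with Ruby's standard `Monitor`. This is a minimal sketch, not the SDK's in-memory cache: the class name `SingleFlightCache` and its `fetch` method are hypothetical, and TTL handling and eviction are left out so the locking stays visible. Concurrent callers for the same key wait on a condition variable while one thread runs the loader.

```ruby
require "monitor"

# Hypothetical illustration of single-flight loading; not the SDK's cache class.
class SingleFlightCache
  def initialize
    @monitor = Monitor.new
    @values = {}     # key => loaded value
    @in_flight = {}  # key => condition variable for a load in progress
  end

  # Returns the cached value, or runs the block once per key even when
  # several threads ask for the same key at the same time.
  def fetch(key)
    @monitor.synchronize do
      # Wait while another thread is loading this key.
      @in_flight[key].wait while @in_flight.key?(key) && !@values.key?(key)
      return @values[key] if @values.key?(key)

      # This thread becomes the loader.
      @in_flight[key] = @monitor.new_cond
    end

    begin
      value = yield
      @monitor.synchronize { @values[key] = value }
      value
    ensure
      # Wake waiters whether the load succeeded or raised.
      @monitor.synchronize { @in_flight.delete(key)&.broadcast }
    end
  end
end
```

If the loader raises, waiters wake up, find no value, and one of them retries the load. Whether the SDK behaves the same way on failure is not stated in these docs.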

## Public Cache Operations

`get_prompt` remains the normal prompt-returning API. Use `get_prompt_result` when you need cache metadata for logs, metrics, or operational validation:

```ruby
result = Langfuse.client.get_prompt_result("greeting", label: "production", cache_ttl: 60)

result.prompt # TextPromptClient or ChatPromptClient
result.logical_key # "greeting:production"
result.storage_key # Generated backend key for the current cache generation
result.cache_status # :hit, :miss, :stale, :refresh, :bypass, or :disabled
result.source # :cache, :api, or :fallback
result.fallback? # true when caller-provided fallback content was used
```

Per-call `cache_ttl` overrides the write TTL for that fetch. Passing `cache_ttl: 0` bypasses the cache read, fetches from the API, and does not retain the result:

```ruby
fresh = Langfuse.client.get_prompt_result("greeting", cache_ttl: 0)
fresh.cache_status # => :bypass
```

Use `refresh_prompt` when you intentionally want to bypass the read path and write the fresh prompt through to cache:

```ruby
result = Langfuse.client.refresh_prompt("greeting", label: "production")
result.cache_status # => :refresh
```
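The three paths described above (normal read, `cache_ttl: 0` bypass, and refresh) can be sketched with a small read-through cache. This is an illustration under assumptions, not the SDK's code: `ReadThroughCache`, its `Result` struct, and the injectable `clock` are hypothetical names, and the block passed to each method stands in for the Langfuse API call.

```ruby
# Hypothetical read-through cache; not the SDK's implementation.
class ReadThroughCache
  Result = Struct.new(:value, :cache_status, :source)

  # clock is injectable so expiry can be exercised without sleeping.
  def initialize(default_ttl:, clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @default_ttl = default_ttl
    @clock = clock
    @entries = {}
  end

  # Normal path: serve a live entry, otherwise load and store it.
  def fetch(key, cache_ttl: nil)
    ttl = cache_ttl || @default_ttl
    # A TTL of 0 skips both the cache read and the cache write.
    return Result.new(yield, :bypass, :api) if ttl.zero?

    entry = @entries[key]
    return Result.new(entry[:value], :hit, :cache) if entry && entry[:expires_at] > @clock.call

    Result.new(store(key, yield, ttl), :miss, :api)
  end

  # Refresh path: skip the read and always write the fresh value through.
  def refresh(key, cache_ttl: nil)
    Result.new(store(key, yield, cache_ttl || @default_ttl), :refresh, :api)
  end

  private

  def store(key, value, ttl)
    @entries[key] = { value: value, expires_at: @clock.call + ttl }
    value
  end
end
```

In this sketch a bypassed fetch leaves any existing entry untouched, so a later normal fetch can still return the older cached value, while a refresh replaces it.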

The operational cache APIs are flat on the client:

```ruby
Langfuse.client.invalidate_prompt_cache("greeting", label: "production")
Langfuse.client.invalidate_prompt_cache_by_name("greeting")
Langfuse.client.clear_prompt_cache

key = Langfuse.client.prompt_cache_key("greeting")
key.logical_key # => "greeting:production"
key.storage_key # Includes the current global and prompt-name generations

Langfuse.client.prompt_cache_stats
Langfuse.client.validate_prompt_cache_backend!
```

Cache identity is prompt name plus version or label. When neither is supplied, the logical identity defaults to the `production` label. Runtime variables never enter the cache key; the SDK caches the managed prompt template and compiles variables afterward.

Name-wide invalidation and whole-cache clear use generation counters. Old Rails.cache entries are not physically scanned or deleted; they become unreachable under the new generated storage keys and expire by TTL.
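A minimal sketch of generation-based invalidation, assuming one global counter and one counter per prompt name. The class `GenerationKeyedCache` and its key layout are hypothetical and chosen only for illustration; the SDK's real storage key format is not shown in these docs. Bumping a counter changes the key that lookups compute, so old entries are never found again even though they stay in the store.

```ruby
# Hypothetical illustration of generation counters; the key format is an assumption.
class GenerationKeyedCache
  def initialize
    @store = {}                       # stands in for Rails.cache or a Hash
    @global_generation = 0            # bumped by a whole-cache clear
    @name_generations = Hash.new(0)   # bumped by name-wide invalidation
  end

  # The storage key embeds both generations, so a bump changes every key
  # derived from it without touching stored entries.
  def storage_key(name, label: "production")
    "g#{@global_generation}:n#{@name_generations[name]}:#{name}:#{label}"
  end

  def write(name, value, label: "production")
    @store[storage_key(name, label: label)] = value
  end

  def read(name, label: "production")
    @store[storage_key(name, label: label)]
  end

  def invalidate_by_name(name)
    @name_generations[name] += 1
  end

  def clear
    @global_generation += 1
  end

  # Number of entries physically present, including unreachable ones.
  def physical_size
    @store.size
  end
end
```

With a TTL-backed store the unreachable entries age out on their own, which is why no scan or delete pass is needed.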

Automatic mutation invalidation only covers `create_prompt` and `update_prompt` calls made by the current SDK process. Prompt edits made in the Langfuse UI or by other SDKs become visible through TTL expiry, `refresh_prompt`, or explicit invalidation.

### Cache Events

Set `prompt_cache_observer` to receive cache events without binding the SDK to your metric names:

```ruby
Langfuse.configure do |config|
  config.prompt_cache_observer = lambda do |event, payload|
    Rails.logger.info(event: event, prompt: payload[:name], status: payload[:cache_status])
  end
end
```

When `ActiveSupport::Notifications` is loaded, the SDK also instruments `prompt_cache.langfuse`. Event payloads include prompt name, version, label, logical key, storage key, backend, cache status, source, and error details when relevant.
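Because the observer only has to respond to `#call`, it can be any object, not just a lambda. The sketch below is a hypothetical in-process tally (the class name `PromptCacheTally` is an assumption) that counts events per prompt name and cache status using the `:name` and `:cache_status` payload keys shown above. The event names the SDK emits are not listed here, so the tally ignores the `event` argument.

```ruby
# Hypothetical observer object; only the #call(event, payload) shape comes from the docs.
class PromptCacheTally
  def initialize
    @mutex = Mutex.new
    @counts = Hash.new(0)
  end

  # Matches the observer signature: one event name and one payload hash.
  def call(_event, payload)
    @mutex.synchronize { @counts[[payload[:name], payload[:cache_status]]] += 1 }
  end

  # Snapshot of counts keyed by [prompt_name, cache_status].
  def counts
    @mutex.synchronize { @counts.dup }
  end

  # Share of hits among hits and misses for one prompt, or nil with no data.
  def hit_ratio(name)
    snapshot = counts
    hits = snapshot.fetch([name, :hit], 0)
    misses = snapshot.fetch([name, :miss], 0)
    total = hits + misses
    total.zero? ? nil : hits.to_f / total
  end
end

# Assumed wiring, shown as a comment because it needs the gem loaded:
#   tally = PromptCacheTally.new
#   Langfuse.configure { |config| config.prompt_cache_observer = tally }
```

Keeping the tally behind a mutex matters when background refresh threads may invoke the observer concurrently.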

## In-Memory Cache (Default)

@@ -102,6 +167,8 @@ Langfuse.configure do |config|
end
```

Set `config.cache_backend = :auto` to opt in to automatic selection: the SDK chooses `:rails` when Rails and `Rails.cache` are present and `:memory` otherwise. The default remains `:memory`.

### Features

- **Distributed**: Shared cache across all processes and servers
@@ -499,10 +566,13 @@ RUN bundle exec rake langfuse:warm_cache_all

See [CONFIGURATION.md](CONFIGURATION.md) for all cache-related configuration options:

- `cache_backend` - `:memory` or `:rails`
- `cache_backend` - `:memory`, `:rails`, or `:auto`
- `cache_ttl` - Time-to-live in seconds
- `cache_max_size` - Max prompts (in-memory only)
- `cache_lock_timeout` - Lock timeout (Rails.cache only)
- `cache_stale_ttl` - Stale serving window; `> 0` enables SWR
- `cache_refresh_threads` - Background refresh worker count
- `prompt_cache_observer` - Optional cache event hook

## Performance Considerations

@@ -559,12 +629,11 @@ config.cache_backend = :rails
### 2. Enable SWR for Production

```ruby
# Development: disabled for predictable behavior
config.cache_stale_while_revalidate = !Rails.env.development?

# Production: enabled for best performance
if Rails.env.production?
  config.cache_stale_while_revalidate = true # Advisory intent flag
  config.cache_stale_ttl = config.cache_ttl # Set explicitly (common default)
else
  config.cache_stale_ttl = 0 # Disabled for predictable prompt iteration
end
```

@@ -588,8 +657,14 @@ bundle exec rake langfuse:warm_cache_all
### 5. Monitor Cache Performance

```ruby
# Log cache hits/misses
Rails.logger.info "Fetching prompt: #{name} (cache: #{cache_hit? ? 'HIT' : 'MISS'})"
config.prompt_cache_observer = lambda do |event, payload|
  Rails.logger.info(
    event: event,
    prompt: payload[:name],
    status: payload[:cache_status],
    source: payload[:source]
  )
end
```

### 6. Handle Cache Failures Gracefully
@@ -607,9 +682,11 @@ prompt = Langfuse.client.get_prompt(

```ruby
# Rails console
Langfuse.client.api_client.cache&.clear
Langfuse.client.invalidate_prompt_cache("greeting", label: "production")
Langfuse.client.invalidate_prompt_cache_by_name("greeting")
Langfuse.client.clear_prompt_cache

# Or use rake task
# Or use the rake task
rake langfuse:clear_cache
```

@@ -658,8 +735,10 @@ end
**Solutions**:

1. Wait for TTL to expire
2. Clear cache manually: `rake langfuse:clear_cache`
3. Reduce `cache_ttl` in development
2. Refresh one prompt now: `Langfuse.client.refresh_prompt("greeting")`
3. Invalidate cached prompt entries: `Langfuse.client.invalidate_prompt_cache_by_name("greeting")`
4. Clear the prompt cache namespace: `Langfuse.client.clear_prompt_cache`
5. Reduce `cache_ttl` in development

### Stampede Protection Not Working

26 changes: 23 additions & 3 deletions docs/CONFIGURATION.md
@@ -120,7 +120,7 @@ config.cache_max_size = 5000 # Large prompt library

#### `cache_backend`

- **Type:** Symbol (`:memory` or `:rails`)
- **Type:** Symbol (`:memory`, `:rails`, or `:auto`)
- **Default:** `:memory`
- **Description:** Cache storage backend

@@ -130,8 +130,13 @@ config.cache_backend = :memory

# Rails.cache (requires Rails + Redis)
config.cache_backend = :rails

# Opt in to automatic Rails.cache detection
config.cache_backend = :auto
```

`:auto` chooses `:rails` only when Rails and `Rails.cache` are present; otherwise it falls back to `:memory`. The gem default stays `:memory`.
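The resolution rule above might look like the following. This is a sketch under assumptions: the helper names `resolve_cache_backend` and `rails_cache_available?` are hypothetical, and `ArgumentError` stands in for whatever error the SDK raises on an unknown value.

```ruby
# Hypothetical helpers illustrating the :auto rule; not the SDK's code.

# True only when the Rails constant exists and exposes a non-nil cache.
def rails_cache_available?
  !!(defined?(Rails) && Rails.respond_to?(:cache) && Rails.cache)
end

# Explicit settings pass through; :auto picks :rails when it is usable.
def resolve_cache_backend(setting)
  case setting
  when :memory, :rails then setting
  when :auto then rails_cache_available? ? :rails : :memory
  else raise ArgumentError, "cache_backend must be :memory, :rails, or :auto"
  end
end
```

In a plain Ruby process without Rails loaded, `resolve_cache_backend(:auto)` in this sketch returns `:memory`.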

**Requirements for `:rails` backend:**

- Rails must be defined
@@ -225,6 +230,20 @@ config.cache_refresh_threads = 10 # More threads for high-traffic apps

Only used when SWR is enabled (`cache_stale_ttl > 0`).
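One way to picture how `cache_ttl` and `cache_stale_ttl` interact is to classify an entry by its age. The sketch below assumes conventional stale-while-revalidate semantics: fresh until `cache_ttl`, then stale for a further `cache_stale_ttl` seconds, then expired. The helper `classify_entry` is hypothetical, and the exact boundary handling in the SDK is not specified here.

```ruby
# Hypothetical classifier for an entry's age under assumed SWR semantics.
def classify_entry(age_seconds, cache_ttl:, cache_stale_ttl:)
  return :fresh if age_seconds < cache_ttl
  # A stale TTL of 0 means SWR is off, so anything past cache_ttl is expired.
  return :expired if cache_stale_ttl == 0
  # :indefinite keeps serving stale content for as long as the entry exists.
  return :stale if cache_stale_ttl == :indefinite

  age_seconds < cache_ttl + cache_stale_ttl ? :stale : :expired
end
```

Under these assumptions a `:stale` entry would be served immediately while a background thread refreshes it, and an `:expired` entry would require a blocking fetch.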

#### `prompt_cache_observer`

- **Type:** Callable or `nil`
- **Default:** `nil`
- **Description:** Observer hook for prompt cache events

```ruby
config.prompt_cache_observer = lambda do |event, payload|
  Rails.logger.info(event: event, prompt: payload[:name], status: payload[:cache_status])
end
```

When `ActiveSupport::Notifications` is loaded, the SDK also instruments `prompt_cache.langfuse`.

#### `batch_size`

- **Type:** Integer
@@ -599,8 +618,9 @@ Validation rules:

- `public_key` must be present
- `secret_key` must be present
- `cache_backend` must be `:memory` or `:rails`
- If `:rails`, Rails must be defined
- `cache_backend` must be `:memory`, `:rails`, or `:auto`
- If `:rails` is selected, or `:auto` resolves to `:rails`, Rails and `Rails.cache` must be available
- `prompt_cache_observer` must respond to `#call` (if set)
- `should_export_span` must respond to `#call` (if set)
- `mask` must respond to `#call` (if set)

9 changes: 5 additions & 4 deletions docs/ERROR_HANDLING.md
@@ -50,8 +50,8 @@ end
**Validation checklist:**
- `public_key` present and starts with `pk-lf-`
- `secret_key` present and starts with `sk-lf-`
- `cache_backend` is `:memory` or `:rails`
- If `:rails`, Rails is defined
- `cache_backend` is `:memory`, `:rails`, or `:auto`
- If `:rails` is selected, or `:auto` resolves to `:rails`, Rails and `Rails.cache` are available

### `Langfuse::UnauthorizedError`

@@ -432,8 +432,9 @@ puts config.inspect
### Check Cache State

```ruby
cache = Langfuse.client.api_client.cache
puts "Cache backend: #{cache&.class || 'disabled'}"
stats = Langfuse.client.prompt_cache_stats
puts "Cache backend: #{stats[:backend]}"
puts "Cache enabled: #{stats[:enabled]}"
```

### Test Credentials
4 changes: 2 additions & 2 deletions docs/MIGRATION.md
@@ -685,8 +685,8 @@ prompts:
```ruby
# In Rails console
Langfuse.reset! # Clears everything
# Or just clear cache
Langfuse.client.api_client.cache&.clear
# Or just clear the prompt cache namespace
Langfuse.client.clear_prompt_cache
```

### Problem: Variables not substituting correctly
25 changes: 24 additions & 1 deletion docs/PROMPTS.md
@@ -553,7 +553,30 @@ Langfuse.configure do |config|
end
```

See [CACHING.md](CACHING.md) for advanced caching strategies (warming, stampede protection).
Override or bypass cache per call:

```ruby
prompt = client.get_prompt("greeting", cache_ttl: 300)
fresh = client.get_prompt_result("greeting", cache_ttl: 0)

fresh.cache_status # => :bypass
fresh.source # => :api
```

Use `get_prompt_result` and the flat cache operations when you need operational visibility:

```ruby
result = client.get_prompt_result("greeting")
result.cache_status # => :hit or :miss

client.refresh_prompt("greeting")
client.invalidate_prompt_cache("greeting")
client.invalidate_prompt_cache_by_name("greeting")
client.clear_prompt_cache
client.prompt_cache_stats
```

See [CACHING.md](CACHING.md) for advanced caching strategies, generation-based invalidation, cache events, warming, and stampede protection.

## Combining Prompts with Tracing
