Draft
9 changes: 9 additions & 0 deletions CHANGELOG.md
@@ -7,6 +7,15 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]

### Added
- Add SDK identity headers for REST and OTLP requests.
- Add prompt deletion with prompt-name-wide cache invalidation.
- Add dependency-light media references, media upload helpers, and media reference resolution.
- Add flat read/admin APIs for sessions, observations v2, scores v2, score configs, models, metrics v2, and health.

### Documentation
- Add the JS/Python SDK parity matrix and document the new prompt, media, and read/admin APIs.

## [0.10.1] - 2026-05-05

### Changed
252 changes: 252 additions & 0 deletions docs/API_REFERENCE.md
@@ -10,6 +10,8 @@ Complete method reference for the Langfuse Ruby SDK.
- [Trace ID Generation](#trace-id-generation)
- [Tracing & Observability](#tracing--observability)
- [Traces](#traces)
- [Media References](#media-references)
- [Read/Admin APIs](#readadmin-apis)
- [Scoring](#scoring)
- [Datasets](#datasets)
- [Experiments](#experiments)
@@ -422,6 +424,47 @@ prompt = client.update_prompt(
)
```

### `Client#delete_prompt`

Delete prompt versions and invalidate cached variants for the prompt name.

**Signature:**

```ruby
delete_prompt(name, version: nil, label: nil) # => nil
```

**Parameters:**

| Parameter | Type | Required | Description |
| --------- | ------- | -------- | ----------- |
| `name` | String | Yes | Prompt name |
| `version` | Integer | No | Specific prompt version to delete |
| `label` | String | No | Delete versions matching this label |

**Returns:** `nil`

**Raises:**

- `NotFoundError` if prompt/version/label is not found
- `UnauthorizedError` if credentials are invalid
- `ApiError` on network/server errors

**Example:**

```ruby
# Delete all versions for a prompt
client.delete_prompt("support-assistant")

# Delete only one version
client.delete_prompt("support-assistant", version: 3)

# Delete versions carrying a label
client.delete_prompt("support-assistant", label: "staging")
```

After a successful delete, `delete_prompt` invalidates all cached variants for that prompt name in the current SDK process. Edits made outside this SDK process are not invalidated automatically; they become visible through TTL expiry, `refresh_prompt`, or explicit invalidation.

### `Client#list_prompts`

List all prompts in the project.
@@ -909,6 +952,215 @@ trace = client.get_trace("trace-uuid-123")
puts trace["name"]
```

## Media References

Media references let trace input, output, and metadata carry large media content by reference instead of embedding raw bytes directly in every payload.

### `Langfuse::Media`

Wrap bytes, a file, or a base64 data URI and expose the same deterministic reference-string shape used by the JS and Python SDKs.

**Signatures:**

```ruby
Langfuse::Media.new(content_bytes:, content_type:)
Langfuse::Media.new(file_path:, content_type:)
Langfuse::Media.new(base64_data_uri:)
```

**Properties:**

| Property | Type | Description |
| -------- | ---- | ----------- |
| `content_type` | String | MIME type |
| `content_bytes` | String | Raw bytes |
| `content_length` | Integer | Byte length |
| `content_sha256_hash` | String | Base64-encoded SHA-256 digest of the content bytes |
| `media_id` | String | Deterministic Langfuse media ID derived from the SHA256 digest |
| `reference_string` | String | `@@@langfuseMedia:...@@@` reference token |
| `tag` | String | Alias for `reference_string` |
| `base64_data_uri` | String | Inline `data:` URI representation |

**Example:**

```ruby
media = Langfuse::Media.new(
  content_bytes: File.binread("receipt.png"),
  content_type: "image/png"
)

media.reference_string
# => "@@@langfuseMedia:type=image/png|id=...|source=bytes@@@"
```

`Langfuse::LangfuseMedia` is an alias for compatibility with the upstream SDK naming.
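
To make the "deterministic" part concrete, here is a minimal standalone sketch of how a content hash, a media ID, and a reference token could be derived from raw bytes. The helper name `sketch_media_reference` is hypothetical, and the ID rule (first 22 characters of the URL-safe Base64 digest) is an assumption made for illustration; the SDK's actual derivation may differ.

```ruby
require "digest"

# Hypothetical helper, not the SDK implementation.
def sketch_media_reference(bytes, content_type, source: "bytes")
  digest = Digest::SHA256.digest(bytes)   # 32 raw bytes
  sha256_base64 = [digest].pack("m0")     # strict Base64, no line breaks
  # Assumption: the ID is a truncated, URL-safe form of the Base64 digest.
  media_id = sha256_base64.tr("+/", "-_")[0, 22]
  reference = "@@@langfuseMedia:type=#{content_type}|id=#{media_id}|source=#{source}@@@"

  { sha256_base64: sha256_base64, media_id: media_id, reference_string: reference }
end

first = sketch_media_reference("hello".b, "text/plain")
second = sketch_media_reference("hello".b, "text/plain")

puts first[:reference_string]
puts first[:media_id] == second[:media_id] # identical bytes give an identical ID
```

Because the ID depends only on the content bytes, the same file referenced from several traces resolves to one media record under this scheme.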

### `Langfuse::Media.parse_reference_string`

Parse a reference token.

```ruby
reference = Langfuse::Media.parse_reference_string(media.reference_string)
reference.media_id
reference.content_type
reference.source
```
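
For readers who want to see what parsing such a token involves, the following is a rough standalone sketch. It assumes the `type=...|id=...|source=...` layout shown in the example above, and it returns a plain Hash rather than the SDK's reference object; `sketch_parse_reference` is a hypothetical name.

```ruby
# Hypothetical parser sketch, not the SDK implementation.
def sketch_parse_reference(token)
  match = /\A@@@langfuseMedia:(?<body>.+)@@@\z/.match(token)
  raise ArgumentError, "not a media reference token" unless match

  # Split "type=image/png|id=abc|source=bytes" into key/value pairs.
  fields = match[:body].split("|").to_h { |pair| pair.split("=", 2) }
  missing = %w[type id source] - fields.keys
  raise ArgumentError, "missing fields: #{missing.join(', ')}" unless missing.empty?

  { media_id: fields["id"], content_type: fields["type"], source: fields["source"] }
end

parsed = sketch_parse_reference("@@@langfuseMedia:type=image/png|id=abc123|source=bytes@@@")
puts parsed[:media_id]     # abc123
puts parsed[:content_type] # image/png
```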

### `Client#upload_media`

Create a media record, upload bytes to the returned presigned URL when one is provided, patch upload status, and return the media reference string.

**Signature:**

```ruby
upload_media(media, trace_id:, field:, observation_id: nil, timeout: nil) # => String
```

**Parameters:**

| Parameter | Type | Required | Description |
| --------- | ---- | -------- | ----------- |
| `media` | Langfuse::Media | Yes | Media wrapper |
| `trace_id` | String | Yes | Associated trace ID |
| `field` | String or Symbol | Yes | `input`, `output`, or `metadata` |
| `observation_id` | String | No | Associated observation ID |
| `timeout` | Integer | No | Upload timeout override |

**Example:**

```ruby
trace_id = Langfuse.create_trace_id(seed: "receipt-42")
reference = media.reference_string

Langfuse.observe("receipt-review", { input: { image: reference } }, trace_id: trace_id) do |obs|
  obs.update(output: "accepted")
end

client.upload_media(media, trace_id: trace_id, field: :input)
```

### Media REST Helpers

Flat helpers expose the underlying platform media API without introducing a nested manager:

```ruby
client.get_media(media_id)
client.get_media_upload_url(
  trace_id: trace_id,
  content_type: media.content_type,
  content_length: media.content_length,
  sha256_hash: media.content_sha256_hash,
  field: :input
)
client.patch_media(
  media_id: media.media_id,
  uploaded_at: Time.now.utc,
  upload_http_status: 200
)
```

### `Client#resolve_media_references`

Resolve reference strings in a nested object to base64 data URIs.

**Signature:**

```ruby
resolve_media_references(obj:, resolve_with: :base64_data_uri,
                         max_depth: 10, content_fetch_timeout: 10)
```

**Example:**

```ruby
payload = { input: { image: media.reference_string } }
resolved = client.resolve_media_references(obj: payload)
resolved[:input][:image] # => "data:image/png;base64,..."
```

Resolution is best-effort per reference: failed downloads are logged and left as reference strings so one broken media item does not destroy the whole payload.
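
The best-effort behavior can be illustrated with a small standalone sketch. It is not the SDK implementation: `sketch_resolve` and the `fetcher` lambda are hypothetical, the fetcher stands in for the real download step, and only strings that consist entirely of one token are treated as references, which is a simplification.

```ruby
# Hypothetical sketch: walk a nested Hash/Array and swap reference tokens
# for whatever the caller-supplied fetcher returns.
def sketch_resolve(obj, fetcher, max_depth: 10, depth: 0)
  return obj if depth > max_depth # stop descending past the depth limit

  case obj
  when Hash
    obj.transform_values { |v| sketch_resolve(v, fetcher, max_depth: max_depth, depth: depth + 1) }
  when Array
    obj.map { |v| sketch_resolve(v, fetcher, max_depth: max_depth, depth: depth + 1) }
  when String
    return obj unless obj.match?(/\A@@@langfuseMedia:.+@@@\z/)

    begin
      fetcher.call(obj)
    rescue StandardError => e
      warn "media resolution failed: #{e.message}" # log, then keep the token
      obj
    end
  else
    obj
  end
end

# Stand-in fetcher: fails for one ID, succeeds for everything else.
fetcher = lambda do |token|
  raise "download failed" if token.include?("id=broken")

  "data:image/png;base64,AAAA"
end

payload = {
  input: { image: "@@@langfuseMedia:type=image/png|id=ok|source=bytes@@@" },
  output: ["@@@langfuseMedia:type=image/png|id=broken|source=bytes@@@", "plain text"]
}
resolved = sketch_resolve(payload, fetcher)
puts resolved[:input][:image] # data:image/png;base64,AAAA
puts resolved[:output][0]     # the broken reference is left untouched
```

Rescuing per reference, rather than around the whole traversal, is what keeps one failed download from discarding the rest of the payload.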

## Read/Admin APIs

These are thin flat wrappers over high-value Langfuse read/admin endpoints. They return parsed response hashes or arrays and intentionally avoid nested generated-client managers.

### Sessions

```ruby
client.list_sessions(page: 1, limit: 20, environment: "production")
client.get_session("session-id")
```

`list_sessions` accepts optional API filters as Ruby snake_case keys.

### Observations v2

```ruby
client.list_observations(
  trace_id: "trace-id",
  from_start_time: Time.utc(2026, 1, 1),
  limit: 50
)
```

Snake_case query keys are converted to API camelCase, and Time-like values are formatted as ISO8601.
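
As a rough illustration of that mapping, the sketch below converts a filter hash into query parameters. The helper names are hypothetical and this is not the SDK's code; dropping `nil` values is an extra assumption made here for tidiness.

```ruby
require "time"

# Hypothetical helpers, not the SDK implementation.
def sketch_camelize(key)
  head, *tail = key.to_s.split("_")
  head + tail.map(&:capitalize).join
end

def sketch_query_params(filters)
  filters.each_with_object({}) do |(key, value), params|
    next if value.nil? # assumption: unset filters are omitted

    params[sketch_camelize(key)] = value.respond_to?(:iso8601) ? value.iso8601 : value
  end
end

params = sketch_query_params(
  trace_id: "trace-id",
  from_start_time: Time.utc(2026, 1, 1),
  limit: 50
)
puts params["traceId"]       # trace-id
puts params["fromStartTime"] # 2026-01-01T00:00:00Z
```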

### Scores v2

```ruby
client.list_scores(trace_id: "trace-id", data_type: "NUMERIC")
client.get_score("score-id")
```

Creation still uses the existing score APIs documented in [Scoring](#scoring); these v2 methods are for readback.

### Score Configs

```ruby
config = client.create_score_config(
  name: "quality",
  data_type: "NUMERIC",
  min_value: 0,
  max_value: 1
)

client.list_score_configs(limit: 20)
client.get_score_config(config["id"])
client.update_score_config(config_id: config["id"], max_value: 5)
```

Body keys are recursively converted from snake_case to camelCase.
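
A recursive key conversion of this kind might look like the sketch below. `sketch_camelize_keys` is a hypothetical name, and `nested_example` / `inner_key` are made-up keys used only to show that nested hashes and arrays are visited; they are not part of the score config API.

```ruby
# Hypothetical sketch, not the SDK implementation.
def sketch_camelize_keys(value)
  case value
  when Hash
    value.each_with_object({}) do |(key, nested), out|
      head, *tail = key.to_s.split("_")
      out[head + tail.map(&:capitalize).join] = sketch_camelize_keys(nested)
    end
  when Array
    value.map { |item| sketch_camelize_keys(item) }
  else
    value # scalars pass through unchanged
  end
end

body = sketch_camelize_keys(
  { name: "quality", min_value: 0, max_value: 1, nested_example: [{ inner_key: 1 }] }
)
puts body.keys.inspect
```

Only keys are rewritten; string values such as `"NUMERIC"` are left as they are.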

### Models

```ruby
model = client.create_model(model_name: "gpt-4o", match_pattern: "gpt-4o")
client.list_models(limit: 20)
client.get_model(model["id"])
client.delete_model(model["id"])
```

### Metrics v2

```ruby
client.query_metrics(
  query: {
    view: "observations",
    metrics: [{ measure: "count", aggregation: "count" }],
    fromTimestamp: "2026-01-01T00:00:00Z",
    toTimestamp: "2026-01-02T00:00:00Z"
  }
)
```
```

Pass either a Ruby hash or an already-encoded JSON string. The platform expects the metrics query itself as the `query` URL parameter.
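
The sketch below shows one way a hash or a pre-encoded JSON string could end up as a single `query` URL parameter. It uses only the Ruby standard library; `sketch_metrics_query_string` is a hypothetical helper and not the SDK's internal method.

```ruby
require "json"
require "uri"

# Hypothetical helper, not the SDK implementation.
def sketch_metrics_query_string(query)
  # A String is assumed to be JSON already; anything else is serialized.
  encoded = query.is_a?(String) ? query : JSON.generate(query)
  URI.encode_www_form(query: encoded)
end

from_hash = sketch_metrics_query_string({ view: "observations" })
from_json = sketch_metrics_query_string('{"view":"observations"}')

puts from_hash
puts from_hash == from_json # both inputs produce the same query string
```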

### Health

```ruby
client.health
```

## Scoring

### `Client#create_score`
2 changes: 1 addition & 1 deletion docs/CACHING.md
@@ -73,7 +73,7 @@ Cache identity is prompt name plus version or label. When neither is supplied, t

Name-wide invalidation and whole-cache clear use generation counters. Old Rails.cache entries are not physically scanned or deleted; they become unreachable under the new generated storage keys and expire by TTL.
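
The generation-counter idea can be sketched with a tiny in-memory cache. This is an illustration under stated assumptions, not the SDK's cache: `SketchPromptCache` is a hypothetical class, a plain Hash stands in for `Rails.cache`, and the storage key layout is invented for the example.

```ruby
# Hypothetical in-memory sketch of generation-counter invalidation.
class SketchPromptCache
  def initialize
    @store = {}                # stands in for Rails.cache
    @generations = Hash.new(0) # one counter per prompt name
  end

  def write(name, variant, value)
    @store[storage_key(name, variant)] = value
  end

  def read(name, variant)
    @store[storage_key(name, variant)]
  end

  # Bump the counter: old entries remain stored but are never looked up again.
  def invalidate_name(name)
    @generations[name] += 1
  end

  def size
    @store.size
  end

  private

  def storage_key(name, variant)
    "prompt:#{name}:g#{@generations[name]}:#{variant}"
  end
end

cache = SketchPromptCache.new
cache.write("greeting", "label:production", "Hello v1")
cache.write("greeting", "version:3", "Hello v3")
cache.invalidate_name("greeting")

puts cache.read("greeting", "version:3").inspect # nil
puts cache.size                                  # 2
```

Invalidation is a single counter update regardless of how many variants exist, at the cost of stale entries occupying space until their TTL expires.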

Automatic mutation invalidation only covers `create_prompt` and `update_prompt` calls made by the current SDK process. Prompt edits made in the Langfuse UI or by other SDKs become visible through TTL expiry, `refresh_prompt`, or explicit invalidation.
Automatic mutation invalidation only covers `create_prompt`, `update_prompt`, and `delete_prompt` calls made by the current SDK process. Prompt edits made in the Langfuse UI or by other SDKs become visible through TTL expiry, `refresh_prompt`, or explicit invalidation.

### Cache Events

62 changes: 62 additions & 0 deletions docs/PARITY.md
@@ -0,0 +1,62 @@
# SDK Parity Matrix

This matrix compares `langfuse-rb` against the local sibling SDKs checked into this repository:

- Ruby: `lib/langfuse/**`, version `0.10.1`
- JS: `langfuse-js/packages/**`, package version `5.3.0`
- Python: `langfuse-python/langfuse/**`, package version `4.6.0b1`

The goal is not raw feature-count parity. Ruby should stay framework-agnostic, dependency-light, and flat-client-first. Generated manager trees from JS/Python are evidence for API behavior, not a mandate to copy their public shape.

## Shipped Now

| Area | JS/Python evidence | Ruby status |
| --- | --- | --- |
| SDK identity headers | JS `LangfuseClient` and generated core client pass `X-Langfuse-Sdk-Name` / `X-Langfuse-Sdk-Version`; Python `client_wrapper.py` sets the same REST headers. Both OTLP exporters also send lower-case SDK identity headers. | `Langfuse::SdkHeaders` now centralizes REST and OTLP identity headers. REST includes `X-Langfuse-Sdk-Name`, `X-Langfuse-Sdk-Version`, and `X-Langfuse-Public-Key`; OTLP includes `x-langfuse-sdk-name`, `x-langfuse-sdk-version`, and `x-langfuse-public-key`. |
| Prompt deletion | JS `prompt.delete(name, { version, label })` calls generated DELETE `/api/public/v2/prompts/{promptName}`; Python generated `prompts.delete` exposes the same endpoint. | `client.delete_prompt(name, version: nil, label: nil)` deletes prompt versions and invalidates all cached variants for that prompt name. Ruby returns `nil` for 204 responses instead of leaking transport details. |
| Media references | JS and Python expose `LangfuseMedia`, deterministic reference strings, reference parsing, reference resolution, and media upload APIs. | `Langfuse::Media` / `Langfuse::LangfuseMedia` support bytes, file, and base64 data URI input; deterministic media IDs; reference string parsing; nested reference resolution to base64 data URIs; and `get_media`, `get_media_upload_url`, `patch_media`, `upload_media`. |
| Sessions | JS/Python generated clients expose `/api/public/sessions` list and get. | `client.list_sessions(**filters)` and `client.get_session(session_id)` are flat read APIs. |
| Observations v2 | JS/Python generated clients expose GET `/api/public/v2/observations`. | `client.list_observations(**filters)` is a thin v2 read API with Ruby snake_case query keys converted to API camelCase. |
| Scores v2 | JS/Python generated clients expose GET `/api/public/v2/scores` and GET by score ID. | `client.list_scores(**filters)` and `client.get_score(score_id)` cover v2 readback while existing score creation remains batched and flat. |
| Score configs | JS/Python generated clients expose create/list/get/update under `/api/public/score-configs`. | `client.create_score_config`, `list_score_configs`, `get_score_config`, and `update_score_config` provide thin admin access with recursive snake_case to camelCase body conversion. |
| Models | JS/Python generated clients expose create/list/get/delete under `/api/public/models`. | `client.create_model`, `list_models`, `get_model`, and `delete_model` provide thin model admin access. |
| Metrics v2 | JS/Python generated clients expose GET `/api/public/v2/metrics`. | `client.query_metrics(query:)` accepts a JSON string or a Ruby hash encoded as the API `query` parameter. |
| Health | JS/Python generated clients expose GET `/api/public/health`. | `client.health` exposes the same check. |

## Separate Issues

These gaps are real, but they are not the same kind of work as AAI-129.

| Gap | Why separate |
| --- | --- |
| Full generated REST resource tree: annotation queues, comments, organizations, projects, LLM connections, blob storage integrations, SCIM, prompt-version namespace, trace delete/update, OpenTelemetry generated namespace | Shipping all of this as hand-written flat Ruby methods would either bloat the SDK or recreate generated-client machinery under another name. Each surface needs a Rails-facing use case before it belongs in the public Ruby client. |
| Experiment/eval ergonomics beyond the current runner | The useful work is run lifecycle, result comparison, and score attachment around real eval workflows. That coordinates with AAI-6 rather than landing as generic API breadth here. |
| Automatic media extraction from tracing payloads | JS/Python include task managers or media services that walk payloads and upload media in the background. Ruby now has the safe primitives; automatic span-payload rewriting needs a separate design because it changes tracing hot-path behavior. |
| Deeper v4 ingestion semantics | This branch aligns SDK identity headers and keeps existing v4-shaped observation primitives. Any additional ingestion-contract work should coordinate with AAI-67 rather than expanding this PR past observable parity. |

## Deferred

| Gap | Reason |
| --- | --- |
| Generated client machinery | Adds maintenance and dependency weight that conflicts with the current Ruby SDK design. Thin flat APIs are enough for the high-value Rails workflows. |
| Async media upload manager | Ruby already has explicit upload primitives. A background queue would need lifecycle, shutdown, retry, and error-reporting decisions. That is real architecture, not a parity checkbox. |
| Framework integrations copied from JS/Python | Ruby should stay framework-agnostic. Rails examples and cache support belong here; Rails as a gem dependency does not. |

## Not Applicable To Ruby

| JS/Python shape | Ruby decision |
| --- | --- |
| JS nested managers such as `langfuse.prompt.delete` or generated `client.prompts.delete` | Ruby keeps the flat API: `client.delete_prompt`. |
| Python decorator/context APIs copied literally | Ruby already exposes block/stateful observation APIs that match Ruby idioms better than decorator mimicry. |
| OpenAI/LangChain framework packages as SDK dependencies | Integrations can exist outside the core gem. The core SDK stays dependency-light. |
| Browser or Node-specific media objects | Ruby media input is bytes, file path, or base64 data URI. |

## Validation Map

| Requirement | Evidence |
| --- | --- |
| Local unit coverage | `spec/langfuse/api_client_spec.rb`, `spec/langfuse/client_spec.rb`, `spec/langfuse/media_spec.rb`, `spec/langfuse/otel_setup_spec.rb` |
| Client to ApiClient mocked HTTP coverage | WebMock specs assert REST paths, query/body mapping, cache invalidation, media upload PUT, and 204 delete semantics. |
| YARD docs for public methods | New public methods have YARD docs in `ApiClient`, delegated client docs, and consumer docs in `API_REFERENCE.md`. |
| Live platform validation | Use a local scratchpad verifier with Langfuse credentials plus Langfuse CLI discovery output in the PR evidence. |
| Caveats | This matrix is committed so the PR states what shipped, what did not ship, and why. |
17 changes: 17 additions & 0 deletions docs/PROMPTS.md
@@ -418,6 +418,23 @@ puts prompt.labels # => ["production"]

**Note:** Only labels can be updated. Prompt content is immutable after creation—create a new version instead.

### `delete_prompt` - Delete Versions

Delete all versions for a prompt, one explicit version, or versions carrying a label:

```ruby
# Delete all versions
client.delete_prompt("checkout-flow")

# Delete one version
client.delete_prompt("checkout-flow", version: 3)

# Delete versions carrying a label
client.delete_prompt("checkout-flow", label: "staging")
```

`delete_prompt` returns `nil` on success and invalidates all cached variants for that prompt name in the current SDK process.

### Promotion Workflow Example

A typical promotion workflow:
2 changes: 2 additions & 0 deletions docs/README.md
@@ -21,6 +21,7 @@ This is the consumer hub. Start here unless you are already looking for a specif
### Instrument an App

- **[Tracing](TRACING.md)** — Observation hierarchy, propagation, background jobs, explicit global install
- **[API Reference](API_REFERENCE.md#media-references)** — Media reference upload, parsing, and resolution
- **[Rails](RAILS.md)** — Rails-specific patterns for controllers, services, jobs, and tests
- **[Scoring](SCORING.md)** — Capture quality signals after a trace exists

@@ -39,5 +40,6 @@ This is the consumer hub. Start here unless you are already looking for a specif
### Reference

- **[API Reference](API_REFERENCE.md)** — Exact public signatures and types
- **[SDK Parity Matrix](PARITY.md)** — What matches JS/Python, what is intentionally separate, and what is out of scope
- **[Configuration](CONFIGURATION.md)** — Option-by-option config reference
- **[Architecture](ARCHITECTURE.md)** — Implementation and internal design reference, not required for the first run