From 85e52bc716991fce88b022e49eb27b5852f9979c Mon Sep 17 00:00:00 2001 From: kadekillary Date: Sun, 3 May 2026 17:07:26 -0600 Subject: [PATCH 1/7] feat(prompts): expose prompt cache operations --- docs/API_REFERENCE.md | 48 +++- docs/CACHING.md | 67 ++++- docs/CONFIGURATION.md | 21 +- docs/PROMPTS.md | 25 +- lib/langfuse.rb | 1 + lib/langfuse/api_client.rb | 369 ++++++++++++++++++++++++- lib/langfuse/client.rb | 191 ++++++++++++- lib/langfuse/config.rb | 15 +- lib/langfuse/prompt_cache.rb | 130 ++++++++- lib/langfuse/prompt_fetch_result.rb | 121 ++++++++ lib/langfuse/rails_cache_adapter.rb | 122 +++++++- lib/langfuse/stale_while_revalidate.rb | 71 ++++- spec/langfuse/api_client_spec.rb | 201 ++++++++++++++ spec/langfuse/client_spec.rb | 60 ++++ spec/langfuse/config_spec.rb | 20 ++ spec/langfuse/prompt_cache_spec.rb | 44 +++ 16 files changed, 1455 insertions(+), 51 deletions(-) create mode 100644 lib/langfuse/prompt_fetch_result.rb diff --git a/docs/API_REFERENCE.md b/docs/API_REFERENCE.md index 4b1ff5d..745d803 100644 --- a/docs/API_REFERENCE.md +++ b/docs/API_REFERENCE.md @@ -215,7 +215,7 @@ Fetch a prompt from Langfuse (with caching). **Signature:** ```ruby -get_prompt(name, version: nil, label: nil, fallback: nil, type: nil) +get_prompt(name, version: nil, label: nil, fallback: nil, type: nil, cache_ttl: nil) ``` **Parameters:** @@ -227,6 +227,7 @@ get_prompt(name, version: nil, label: nil, fallback: nil, type: nil) | `label` | String | No | Version label (e.g., "production") | | `fallback` | String or Array | No | Fallback prompt if not found (`String` for text, `Array` for chat) | | `type` | Symbol | Conditional | `:text` or `:chat` (required if `fallback` provided) | +| `cache_ttl` | Integer | No | Per-call cache TTL override. `0` bypasses cache read/write | **Returns:** `Langfuse::TextPromptClient` or `Langfuse::ChatPromptClient` @@ -257,6 +258,49 @@ prompt = client.get_prompt("new-prompt", See [PROMPTS.md](PROMPTS.md) for complete guide. +### `Client#get_prompt_result` + +Fetch a prompt and return cache metadata. + +**Signature:** + +```ruby +get_prompt_result(name, version: nil, label: nil, fallback: nil, type: nil, cache_ttl: nil) +``` + +**Returns:** `Langfuse::PromptFetchResult` + +| Attribute | Type | Description | +| --------- | ---- | ----------- | +| `prompt` | TextPromptClient or ChatPromptClient | Prompt client | +| `logical_key` | String | Stable logical identity: name plus version or label/default production | +| `storage_key` | String | Backend key for the current cache generation | +| `cache_status` | Symbol | `:hit`, `:miss`, `:stale`, `:refresh`, `:bypass`, or `:disabled` | +| `source` | Symbol | `:cache`, `:api`, or `:fallback` | +| `fallback?` | Boolean | Whether fallback content was returned | + +```ruby +result = client.get_prompt_result("greeting", label: "production") +result.prompt.compile(name: "Ada") +result.cache_status # => :miss +``` + +### Prompt Cache Operations + +Flat client APIs for operational prompt cache control: + +```ruby +client.refresh_prompt("greeting", label: "production", cache_ttl: 60) +client.invalidate_prompt_cache("greeting", label: "production") +client.invalidate_prompt_cache_by_name("greeting") +client.clear_prompt_cache +client.prompt_cache_stats +client.prompt_cache_key("greeting") +client.validate_prompt_cache_backend! +``` + +`invalidate_prompt_cache_by_name` and `clear_prompt_cache` use generation counters. Rails.cache entries from old generations are not scanned; they become unreachable and expire by TTL. + ### `Client#compile_prompt` Convenience method: fetch and compile in one call. @@ -264,7 +308,7 @@ Convenience method: fetch and compile in one call. **Signature:** ```ruby -compile_prompt(name, variables: {}, version: nil, label: nil, fallback: nil, type: nil) +compile_prompt(name, variables: {}, version: nil, label: nil, fallback: nil, type: nil, cache_ttl: nil) ``` **Parameters:** diff --git a/docs/CACHING.md b/docs/CACHING.md index 86d7819..7717fa2 100644 --- a/docs/CACHING.md +++ b/docs/CACHING.md @@ -7,6 +7,7 @@ For configuration options, see [CONFIGURATION.md](CONFIGURATION.md). ## Table of Contents - [Overview](#overview) +- [Public Cache Operations](#public-cache-operations) - [In-Memory Cache](#in-memory-cache-default) - [Rails.cache Backend](#railscache-backend-distributed) - [Stale-While-Revalidate (SWR)](#stale-while-revalidate-swr) @@ -22,7 +23,69 @@ The Langfuse Ruby SDK provides two caching backends to optimize prompt fetching: 1. **In-Memory Cache** (default) - Thread-safe, local cache with TTL and bounded expiration-ordered eviction 2. **Rails.cache Backend** - Distributed caching with Redis/Memcached -Both backends support TTL-based expiration and stale-while-revalidate (SWR). Distributed stampede protection via locking is specific to the Rails.cache backend; the in-memory backend mitigates stampedes within a single process using Monitor-based single-flight locks. +Both backends support TTL-based expiration, stale-while-revalidate (SWR), and logical generation-based invalidation. Distributed stampede protection via locking is specific to the Rails.cache backend; the in-memory backend mitigates stampedes within a single process using Monitor-based single-flight locks. + +## Public Cache Operations + +`get_prompt` remains the normal prompt-returning API. Use `get_prompt_result` when you need cache metadata for logs, metrics, or operational validation: + +```ruby +result = Langfuse.client.get_prompt_result("greeting", label: "production", cache_ttl: 60) + +result.prompt # TextPromptClient or ChatPromptClient +result.logical_key # "greeting:production" +result.storage_key # Generated backend key for the current cache generation +result.cache_status # :hit, :miss, :stale, :refresh, :bypass, or :disabled +result.source # :cache, :api, or :fallback +result.fallback? # true when caller-provided fallback content was used +``` + +Per-call `cache_ttl` overrides the write TTL for that fetch. Passing `cache_ttl: 0` bypasses the cache read, fetches from the API, and does not retain the result: + +```ruby +fresh = Langfuse.client.get_prompt_result("greeting", cache_ttl: 0) +fresh.cache_status # => :bypass +``` + +Use `refresh_prompt` when you intentionally want to bypass the read path and write the fresh prompt through to cache: + +```ruby +result = Langfuse.client.refresh_prompt("greeting", label: "production") +result.cache_status # => :refresh +``` + +The operational cache APIs are flat on the client: + +```ruby +Langfuse.client.invalidate_prompt_cache("greeting", label: "production") +Langfuse.client.invalidate_prompt_cache_by_name("greeting") +Langfuse.client.clear_prompt_cache + +key = Langfuse.client.prompt_cache_key("greeting") +key.logical_key # => "greeting:production" +key.storage_key # Includes the current global and prompt-name generations + +Langfuse.client.prompt_cache_stats +Langfuse.client.validate_prompt_cache_backend! +``` + +Cache identity is prompt name plus version or label. When neither is supplied, the logical identity defaults to the `production` label. Runtime variables never enter the cache key; the SDK caches the managed prompt template and compiles variables afterward. + +Name-wide invalidation and whole-cache clear use generation counters. Old Rails.cache entries are not physically scanned or deleted; they become unreachable under the new generated storage keys and expire by TTL. + +### Cache Events + +Set `prompt_cache_observer` to receive cache events without binding the SDK to your metric names: + +```ruby +Langfuse.configure do |config| + config.prompt_cache_observer = lambda do |event, payload| + Rails.logger.info(event: event, prompt: payload[:name], status: payload[:cache_status]) + end +end +``` + +When `ActiveSupport::Notifications` is loaded, the SDK also instruments `prompt_cache.langfuse`. Event payloads include prompt name, version, label, logical key, storage key, backend, cache status, source, and error details when relevant. ## In-Memory Cache (Default) @@ -102,6 +165,8 @@ Langfuse.configure do |config| end ``` +Use `config.cache_backend = :auto` only when you want the SDK to choose `:rails` if Rails and `Rails.cache` are present, otherwise `:memory`. The default remains `:memory`. + ### Features - **Distributed**: Shared cache across all processes and servers diff --git a/docs/CONFIGURATION.md b/docs/CONFIGURATION.md index fca1b50..6eb4714 100644 --- a/docs/CONFIGURATION.md +++ b/docs/CONFIGURATION.md @@ -120,7 +120,7 @@ config.cache_max_size = 5000 # Large prompt library #### `cache_backend` -- **Type:** Symbol (`:memory` or `:rails`) +- **Type:** Symbol (`:memory`, `:rails`, or `:auto`) - **Default:** `:memory` - **Description:** Cache storage backend @@ -130,8 +130,13 @@ config.cache_backend = :memory # Rails.cache (requires Rails + Redis) config.cache_backend = :rails + +# Opt in to automatic Rails.cache detection +config.cache_backend = :auto ``` +`:auto` chooses `:rails` only when Rails and `Rails.cache` are present; otherwise it falls back to `:memory`. The gem default stays `:memory`. + **Requirements for `:rails` backend:** - Rails must be defined @@ -225,6 +230,20 @@ config.cache_refresh_threads = 10 # More threads for high-traffic apps Only used when SWR is enabled (`cache_stale_ttl > 0`). +#### `prompt_cache_observer` + +- **Type:** Callable or `nil` +- **Default:** `nil` +- **Description:** Observer hook for prompt cache events + +```ruby +config.prompt_cache_observer = lambda do |event, payload| + Rails.logger.info(event: event, prompt: payload[:name], status: payload[:cache_status]) +end +``` + +When ActiveSupport is loaded, the SDK also instruments `prompt_cache.langfuse`. + #### `batch_size` - **Type:** Integer diff --git a/docs/PROMPTS.md b/docs/PROMPTS.md index 3a1c582..7e42f6e 100644 --- a/docs/PROMPTS.md +++ b/docs/PROMPTS.md @@ -553,7 +553,30 @@ Langfuse.configure do |config| end ``` -See [CACHING.md](CACHING.md) for advanced caching strategies (warming, stampede protection). +Override or bypass cache per call: + +```ruby +prompt = client.get_prompt("greeting", cache_ttl: 300) +fresh = client.get_prompt_result("greeting", cache_ttl: 0) + +fresh.cache_status # => :bypass +fresh.source # => :api +``` + +Use `get_prompt_result` and the flat cache operations when you need operational visibility: + +```ruby +result = client.get_prompt_result("greeting") +result.cache_status # => :hit or :miss + +client.refresh_prompt("greeting") +client.invalidate_prompt_cache("greeting") +client.invalidate_prompt_cache_by_name("greeting") +client.clear_prompt_cache +client.prompt_cache_stats +``` + +See [CACHING.md](CACHING.md) for advanced caching strategies, generation-based invalidation, cache events, warming, and stampede protection. ## Combining Prompts with Tracing diff --git a/lib/langfuse.rb b/lib/langfuse.rb index ca8cc3f..20515e6 100644 --- a/lib/langfuse.rb +++ b/lib/langfuse.rb @@ -41,6 +41,7 @@ class UnauthorizedError < ApiError; end require_relative "langfuse/config" require_relative "langfuse/prompt_cache" +require_relative "langfuse/prompt_fetch_result" require_relative "langfuse/rails_cache_adapter" require_relative "langfuse/cache_warmer" require_relative "langfuse/api_client" diff --git a/lib/langfuse/api_client.rb b/lib/langfuse/api_client.rb index f369442..0a2ea41 100644 --- a/lib/langfuse/api_client.rb +++ b/lib/langfuse/api_client.rb @@ -5,6 +5,7 @@ require "base64" require "json" require "uri" +require_relative "prompt_fetch_result" module Langfuse # HTTP client for Langfuse API @@ -40,6 +41,9 @@ class ApiClient # rubocop:disable Metrics/ClassLength # @return [PromptCache, RailsCacheAdapter, nil] Optional cache for prompt responses attr_reader :cache + # @return [#call, nil] Optional observer for prompt cache events + attr_reader :cache_observer + # Initialize a new API client # # @param public_key [String] Langfuse public API key @@ -48,15 +52,19 @@ class ApiClient # rubocop:disable Metrics/ClassLength # @param timeout [Integer] HTTP request timeout in seconds # @param logger [Logger] Logger instance for debugging # @param cache [PromptCache, RailsCacheAdapter, nil] Optional cache for prompt responses + # @param cache_observer [#call, nil] Optional observer for prompt cache events # @return [ApiClient] - def initialize(public_key:, secret_key:, base_url:, timeout: 5, logger: nil, cache: nil) + # rubocop:disable Metrics/ParameterLists + def initialize(public_key:, secret_key:, base_url:, timeout: 5, logger: nil, cache: nil, cache_observer: nil) @public_key = public_key @secret_key = secret_key @base_url = base_url @timeout = timeout @logger = logger || Logger.new($stdout, level: Logger::WARN) @cache = cache + @cache_observer = cache_observer end + # rubocop:enable Metrics/ParameterLists # Get a Faraday connection # @@ -109,17 +117,129 @@ def list_prompts(page: nil, limit: nil) # @param name [String] The name of the prompt # @param version [Integer, nil] Optional specific version number # @param label [String, nil] Optional label (e.g., "production", "latest") + # @param cache_ttl [Integer, nil] Optional TTL override for this fetch # @return [Hash] The prompt data # @raise [ArgumentError] if both version and label are provided # @raise [NotFoundError] if the prompt is not found # @raise [UnauthorizedError] if authentication fails # @raise [ApiError] for other API errors - def get_prompt(name, version: nil, label: nil) + def get_prompt(name, version: nil, label: nil, cache_ttl: nil) + get_prompt_result(name, version: version, label: label, cache_ttl: cache_ttl).prompt + end + + # Fetch a prompt and include cache metadata. + # + # @param name [String] The name of the prompt + # @param version [Integer, nil] Optional specific version number + # @param label [String, nil] Optional label (e.g., "production", "latest") + # @param cache_ttl [Integer, nil] Optional TTL override for this fetch + # @return [PromptFetchResult] Prompt data plus cache metadata + # @raise [ArgumentError] if both version and label are provided + # @raise [ArgumentError] if cache_ttl is negative + # @raise [NotFoundError] if the prompt is not found + # @raise [UnauthorizedError] if authentication fails + # @raise [ApiError] for other API errors + def get_prompt_result(name, version: nil, label: nil, cache_ttl: nil) + validate_prompt_fetch_options!(version, label, cache_ttl) + + key = prompt_cache_key(name, version: version, label: label) + return fetch_uncached_prompt_result(key, version, label, :disabled) if cache.nil? + return fetch_uncached_prompt_result(key, version, label, :bypass) if cache_ttl&.zero? + + fetch_cached_prompt_result(key, version, label, cache_ttl) + end + + # Refresh a prompt from the API, optionally writing through to cache. + # + # @param name [String] The name of the prompt + # @param version [Integer, nil] Optional specific version number + # @param label [String, nil] Optional label + # @param cache_ttl [Integer, nil] Optional TTL override for this refresh + # @return [PromptFetchResult] Prompt data plus cache metadata + # @raise [ArgumentError] if both version and label are provided + # @raise [ArgumentError] if cache_ttl is negative + # @raise [NotFoundError] if the prompt is not found + # @raise [UnauthorizedError] if authentication fails + # @raise [ApiError] for other API errors + def refresh_prompt(name, version: nil, label: nil, cache_ttl: nil) + validate_prompt_fetch_options!(version, label, cache_ttl) + + key = prompt_cache_key(name, version: version, label: label) + refresh_prompt_result(key, version, label, cache_ttl) + end + + # Inspect the logical and generated cache keys for a prompt. + # + # @param name [String] The prompt name + # @param version [Integer, nil] Optional specific version number + # @param label [String, nil] Optional label + # @return [PromptCacheKey] Logical and generated cache keys + # @raise [ArgumentError] if both version and label are provided + def prompt_cache_key(name, version: nil, label: nil) raise ArgumentError, "Cannot specify both version and label" if version && label - return fetch_prompt_from_api(name, version: version, label: label) if cache.nil? - cache_key = PromptCache.build_key(name, version: version, label: label) - fetch_with_appropriate_caching_strategy(cache_key, name, version, label) + logical_key = PromptCache.build_key(name, version: version, label: label) + storage_key = if generated_storage_key_cache? + cache.storage_key(logical_key, name: name) + else + logical_key + end + PromptCacheKey.new(name: name, version: version, label: label, logical_key: logical_key, storage_key: storage_key) + end + + # Invalidate one exact logical prompt cache key. + # + # @param name [String] The prompt name + # @param version [Integer, nil] Optional specific version number + # @param label [String, nil] Optional label + # @return [PromptCacheKey] The invalidated key + # @raise [ArgumentError] if both version and label are provided + def invalidate_prompt_cache(name, version: nil, label: nil) + key = prompt_cache_key(name, version: version, label: label) + deleted = cache&.delete(key.storage_key) || false + emit_prompt_cache_event(:delete, event_payload(key, :miss, :cache, deleted: deleted)) + emit_prompt_cache_event(:invalidate, event_payload(key, :miss, :cache, scope: :exact)) + key + end + + # Invalidate all cached variants for one prompt name. + # + # @param name [String] The prompt name + # @return [Integer, nil] New generation, or nil when cache is disabled + def invalidate_prompt_cache_by_name(name) + generation = cache&.invalidate_name(name) + payload = { name: name, backend: cache_backend_name, generation: generation, scope: :name } + emit_prompt_cache_event(:invalidate, payload) + generation + end + + # Logically clear the whole Langfuse prompt cache namespace. + # + # @return [Integer, nil] New global generation, or nil when cache is disabled + def clear_prompt_cache + generation = cache&.clear_logically + emit_prompt_cache_event(:clear, backend: cache_backend_name, generation: generation) + generation + end + + # Return prompt cache statistics. + # + # @return [Hash] Cache statistics + def prompt_cache_stats + return disabled_prompt_cache_stats unless cache + + cache.stats + end + + # Emit a prompt cache event to configured hooks. + # + # @param event [Symbol] Event name + # @param payload [Hash] Event payload + # @return [void] + def emit_prompt_cache_event(event, payload) + normalized_payload = payload.merge(event: event.to_sym) + notify_cache_observer(event, normalized_payload) + notify_active_support(normalized_payload) end # Create a new prompt (or new version if prompt with same name exists) @@ -158,7 +278,7 @@ def create_prompt(name:, prompt:, type:, config: {}, labels: [], tags: [], commi payload[:commitMessage] = commit_message if commit_message response = connection.post(path, payload) - handle_response(response) + handle_response(response).tap { invalidate_prompt_cache_after_mutation(name) } end end # rubocop:enable Metrics/ParameterLists @@ -188,7 +308,7 @@ def update_prompt(name:, version:, labels:) payload = { newLabels: labels } response = connection.patch(path, payload) - handle_response(response) + handle_response(response).tap { invalidate_prompt_cache_after_mutation(name) } end end @@ -595,6 +715,241 @@ def delete_dataset_item(id) private + def validate_prompt_fetch_options!(version, label, cache_ttl) + raise ArgumentError, "Cannot specify both version and label" if version && label + return if cache_ttl.nil? || (cache_ttl.respond_to?(:negative?) && !cache_ttl.negative?) + + raise ArgumentError, "cache_ttl must be non-negative" + end + + def fetch_uncached_prompt_result(key, version, label, cache_status) + prompt_data = fetch_prompt_from_api(key.name, version: version, label: label) + build_prompt_result(key, prompt_data, cache_status, :api) + end + + def fetch_cached_prompt_result(key, version, label, cache_ttl) + swr_enabled = swr_cache_available? + return fetch_swr_prompt_result(key, version, label, cache_ttl) if swr_enabled + + fetch_non_swr_prompt_result(key, version, label, cache_ttl) + end + + def fetch_swr_prompt_result(key, version, label, cache_ttl) + unless generated_storage_key_cache? + prompt_data = fetch_with_swr_cache(key.storage_key, key.name, version, label) + return cache_hit_prompt_result(key, prompt_data) + end + + result = fetch_swr_cached_prompt_result(key, version, label, cache_ttl) + return result if result + + fetch_cache_miss_prompt_result(key, version, label, cache_ttl, swr_enabled: true, distributed_enabled: false) + end + + def fetch_non_swr_prompt_result(key, version, label, cache_ttl) + distributed_enabled = distributed_cache_available? + + if !generated_storage_key_cache? && distributed_enabled + prompt_data = fetch_with_distributed_cache(key.storage_key, key.name, version, label) + return cache_hit_prompt_result(key, prompt_data) + end + + cached_data = cache.get(key.storage_key) + return cache_hit_prompt_result(key, cached_data) if cached_data + + fetch_cache_miss_prompt_result( + key, + version, + label, + cache_ttl, + swr_enabled: false, + distributed_enabled: distributed_enabled + ) + end + + def fetch_swr_cached_prompt_result(key, version, label, cache_ttl) + entry = cache.entry(key.storage_key) if cache.respond_to?(:entry) + return nil unless entry.respond_to?(:fresh?) + return cache_hit_prompt_result(key, entry.data) if entry.fresh? + + return nil unless entry.stale? + + emit_prompt_cache_event(:stale_serve, event_payload(key, :stale, :cache)) + schedule_prompt_cache_refresh(key, version, label, cache_ttl) + build_prompt_result(key, entry.data, :stale, :cache) + end + + def cache_hit_prompt_result(key, prompt_data) + emit_prompt_cache_event(:hit, event_payload(key, :hit, :cache)) + build_prompt_result(key, prompt_data, :hit, :cache) + end + + def fetch_cache_miss_prompt_result(key, version, label, cache_ttl, swr_enabled: false, distributed_enabled: nil) + emit_prompt_cache_event(:miss, event_payload(key, :miss, :api)) + distributed_enabled = distributed_cache_available? if distributed_enabled.nil? + + if !swr_enabled && distributed_enabled + fetch_cache_miss_with_lock(key, version, label, cache_ttl) + else + fetch_cache_miss_directly(key, version, label, cache_ttl, swr_enabled: swr_enabled) + end + end + + def fetch_cache_miss_with_lock(key, version, label, cache_ttl) + fetched = false + prompt_data = cache_fetch_with_lock(key.storage_key, cache_ttl) do + fetched = true + fetch_prompt_from_api(key.name, version: version, label: label) + end + emit_prompt_cache_event(:write, event_payload(key, :miss, :api)) if fetched + status = fetched ? :miss : :hit + source = fetched ? :api : :cache + build_prompt_result(key, prompt_data, status, source) + end + + def fetch_cache_miss_directly(key, version, label, cache_ttl, swr_enabled: false) + prompt_data = fetch_prompt_from_api(key.name, version: version, label: label) + write_prompt_cache(key, prompt_data, cache_ttl, swr_enabled: swr_enabled) + build_prompt_result(key, prompt_data, :miss, :api) + end + + def refresh_prompt_result(key, version, label, cache_ttl) + emit_prompt_cache_event(:refresh_start, event_payload(key, :refresh, :api)) + prompt_data = fetch_prompt_from_api(key.name, version: version, label: label) + write_refresh_prompt_cache(key, prompt_data, cache_ttl) + emit_prompt_cache_event(:refresh_success, event_payload(key, refresh_cache_status(cache_ttl), :api)) + build_prompt_result(key, prompt_data, refresh_cache_status(cache_ttl), :api) + rescue StandardError => e + payload = event_payload(key, :refresh, :api, error_class: e.class.name, error_message: e.message) + emit_prompt_cache_event(:refresh_failure, payload) + raise + end + + def schedule_prompt_cache_refresh(key, version, label, cache_ttl) + return unless cache.respond_to?(:refresh_async) + + scheduled = cache.refresh_async( + key.storage_key, + ttl: cache_ttl, + on_success: ->(_value) { emit_refresh_success_events(key) }, + on_failure: ->(error) { emit_refresh_failure_event(key, error) } + ) { fetch_prompt_from_api(key.name, version: version, label: label) } + emit_prompt_cache_event(:refresh_start, event_payload(key, :stale, :cache)) if scheduled + end + + def emit_refresh_success_events(key) + emit_prompt_cache_event(:refresh_success, event_payload(key, :refresh, :api)) + emit_prompt_cache_event(:write, event_payload(key, :refresh, :api)) + end + + def emit_refresh_failure_event(key, error) + payload = event_payload(key, :stale, :cache, error_class: error.class.name, error_message: error.message) + emit_prompt_cache_event(:refresh_failure, payload) + end + + def write_refresh_prompt_cache(key, prompt_data, cache_ttl) + return unless cache + return if cache_ttl&.zero? + + write_prompt_cache(key, prompt_data, cache_ttl, cache_status: :refresh, swr_enabled: swr_cache_available?) + end + + def write_prompt_cache(key, prompt_data, cache_ttl, cache_status: :miss, swr_enabled: false) + if swr_enabled && cache.respond_to?(:write_with_stale_while_revalidate) + cache.write_with_stale_while_revalidate(key.storage_key, prompt_data, ttl: cache_ttl) + elsif cache_ttl.nil? + cache.set(key.storage_key, prompt_data) + else + cache.set(key.storage_key, prompt_data, ttl: cache_ttl) + end + emit_prompt_cache_event(:write, event_payload(key, cache_status, :api)) + end + + def cache_fetch_with_lock(storage_key, cache_ttl, &) + return cache.fetch_with_lock(storage_key, &) if cache_ttl.nil? + + cache.fetch_with_lock(storage_key, ttl: cache_ttl, &) + end + + def refresh_cache_status(cache_ttl) + return :disabled unless cache + return :bypass if cache_ttl&.zero? + + :refresh + end + + def build_prompt_result(key, prompt_data, cache_status, source) + PromptFetchResult.new( + prompt: prompt_data, + logical_key: key.logical_key, + storage_key: key.storage_key, + cache_status: cache_status, + source: source, + name: prompt_data["name"] || key.name, + version: prompt_data["version"] || key.version, + label: key.label || (key.version ? nil : "production") + ) + end + + def event_payload(key, cache_status, source, extra = {}) + { + name: key.name, + version: key.version, + label: key.label || (key.version ? nil : "production"), + logical_key: key.logical_key, + storage_key: key.storage_key, + backend: cache_backend_name, + cache_status: cache_status, + source: source + }.merge(extra) + end + + def cache_backend_name + return "disabled" unless cache + return "rails" if cache.is_a?(RailsCacheAdapter) + return "memory" if cache.is_a?(PromptCache) + + cache.class.name + end + + def disabled_prompt_cache_stats + { + backend: "disabled", + enabled: false, + current_generation_entries: nil, + orphaned_entries: nil, + total_entries: nil, + unsupported_counts: %i[current_generation_entries orphaned_entries total_entries] + } + end + + def generated_storage_key_cache? + cache.is_a?(PromptCache) || cache.is_a?(RailsCacheAdapter) + end + + def invalidate_prompt_cache_after_mutation(name) + generation = cache&.invalidate_name(name) + payload = { name: name, backend: cache_backend_name, generation: generation, scope: :name, mutation: true } + emit_prompt_cache_event(:invalidate, payload) + end + + def notify_cache_observer(event, payload) + return unless cache_observer + + observer_arity = cache_observer.respond_to?(:arity) ? cache_observer.arity : 2 + observer_arity == 1 ? cache_observer.call(payload) : cache_observer.call(event, payload) + rescue StandardError => e + logger.warn("Langfuse prompt cache observer failed: #{e.class} - #{e.message}") + end + + def notify_active_support(payload) + return unless defined?(ActiveSupport::Notifications) + + ActiveSupport::Notifications.instrument("prompt_cache.langfuse", payload) + rescue StandardError => e + logger.warn("Langfuse ActiveSupport cache notification failed: #{e.class} - #{e.message}") + end + # Fetch prompt using the most appropriate caching strategy available # # @param cache_key [String] The cache key for this prompt diff --git a/lib/langfuse/client.rb b/lib/langfuse/client.rb index 5c6fafa..de6df25 100644 --- a/lib/langfuse/client.rb +++ b/lib/langfuse/client.rb @@ -46,7 +46,8 @@ def initialize(config) base_url: config.base_url, timeout: config.timeout, logger: config.logger, - cache: cache + cache: cache, + cache_observer: config.prompt_cache_observer ) @project_id = nil @@ -68,6 +69,7 @@ def initialize(config) # @param label [String, nil] Optional label (e.g., "production", "latest") # @param fallback [String, Array, nil] Optional fallback prompt to use on error # @param type [Symbol, nil] Required when fallback is provided (:text or :chat) + # @param cache_ttl [Integer, nil] Optional TTL override for this fetch # @return [TextPromptClient, ChatPromptClient] The prompt client # @raise [ArgumentError] if both version and label are provided # @raise [ArgumentError] if fallback is provided without type @@ -77,24 +79,113 @@ def initialize(config) # # @example With fallback for graceful degradation # prompt = client.get_prompt("greeting", fallback: "Hello {{name}}!", type: :text) - def get_prompt(name, version: nil, label: nil, fallback: nil, type: nil) - # Validate fallback usage - if fallback && !type - raise ArgumentError, "type parameter is required when fallback is provided (use :text or :chat)" - end + def get_prompt(name, version: nil, label: nil, fallback: nil, type: nil, cache_ttl: nil) + get_prompt_result( + name, + version: version, + label: label, + fallback: fallback, + type: type, + cache_ttl: cache_ttl + ).prompt + end - # Try to fetch from API - prompt_data = api_client.get_prompt(name, version: version, label: label) - build_prompt_client(prompt_data) + # Fetch a prompt and return cache metadata. + # + # @param name [String] The name of the prompt + # @param version [Integer, nil] Optional specific version number + # @param label [String, nil] Optional label (e.g., "production", "latest") + # @param fallback [String, Array, nil] Optional fallback prompt to use on error + # @param type [Symbol, nil] Required when fallback is provided (:text or :chat) + # @param cache_ttl [Integer, nil] Optional TTL override for this fetch + # @return [PromptFetchResult] Prompt client plus cache metadata + # @raise [ArgumentError] if fallback is provided without type + # @raise [NotFoundError] if the prompt is not found and no fallback provided + # @raise [UnauthorizedError] if authentication fails and no fallback provided + # @raise [ApiError] for other API errors and no fallback provided + def get_prompt_result(name, version: nil, label: nil, fallback: nil, type: nil, cache_ttl: nil) + validate_fallback_usage!(fallback, type) + + api_result = api_client.get_prompt_result(name, version: version, label: label, cache_ttl: cache_ttl) + build_client_fetch_result(api_result, build_prompt_client(api_result.prompt)) rescue ApiError, NotFoundError, UnauthorizedError => e # If no fallback, re-raise the error raise e unless fallback # Log warning and return fallback config.logger.warn("Langfuse API error for prompt '#{name}': #{e.message}. Using fallback.") - build_fallback_prompt_client(name, fallback, type) + build_fallback_prompt_result(name, version, label, fallback, type, cache_ttl: cache_ttl, error: e) + end + + # Refresh a prompt from the API, optionally writing through to cache. + # + # @param name [String] The name of the prompt + # @param version [Integer, nil] Optional specific version number + # @param label [String, nil] Optional label (e.g., "production", "latest") + # @param cache_ttl [Integer, nil] Optional TTL override for this refresh + # @return [PromptFetchResult] Prompt client plus cache metadata + # @raise [ArgumentError] if both version and label are provided + # @raise [NotFoundError] if the prompt is not found + # @raise [UnauthorizedError] if authentication fails + # @raise [ApiError] for other API errors + def refresh_prompt(name, version: nil, label: nil, cache_ttl: nil) + api_result = api_client.refresh_prompt(name, version: version, label: label, cache_ttl: cache_ttl) + build_client_fetch_result(api_result, build_prompt_client(api_result.prompt)) + end + + # Invalidate one exact logical prompt cache key. + # + # @param name [String] The prompt name + # @param version [Integer, nil] Optional specific version number + # @param label [String, nil] Optional label + # @return [PromptCacheKey] The invalidated key + def invalidate_prompt_cache(name, version: nil, label: nil) + api_client.invalidate_prompt_cache(name, version: version, label: label) + end + + # Invalidate all cached variants for one prompt name. + # + # @param name [String] The prompt name + # @return [Integer, nil] New generation, or nil when cache is disabled + def invalidate_prompt_cache_by_name(name) + api_client.invalidate_prompt_cache_by_name(name) + end + + # Logically clear the whole Langfuse prompt cache namespace. + # + # @return [Integer, nil] New global generation, or nil when cache is disabled + def clear_prompt_cache + api_client.clear_prompt_cache + end + + # Return prompt cache statistics. + # + # @return [Hash] Cache statistics + def prompt_cache_stats + api_client.prompt_cache_stats + end + + # Inspect the logical and generated cache keys for a prompt. + # + # @param name [String] The prompt name + # @param version [Integer, nil] Optional specific version number + # @param label [String, nil] Optional label + # @return [PromptCacheKey] Logical and generated cache keys + def prompt_cache_key(name, version: nil, label: nil) + api_client.prompt_cache_key(name, version: version, label: label) end + # Validate the configured prompt cache backend before first prompt fetch. + # + # @return [Boolean] true when the configured backend is usable + # @raise [ConfigurationError] if the backend is invalid + # rubocop:disable Naming/PredicateMethod + def validate_prompt_cache_backend! + api_client.cache&.validate! if api_client.cache.respond_to?(:validate!) + true + end + # rubocop:enable Naming/PredicateMethod + # List all prompts in the Langfuse project # # Fetches a list of all prompt names available in your project. @@ -126,6 +217,7 @@ def list_prompts(page: nil, limit: nil) # @param label [String, nil] Optional label (e.g., "production", "latest") # @param fallback [String, Array, nil] Optional fallback prompt to use on error # @param type [Symbol, nil] Required when fallback is provided (:text or :chat) + # @param cache_ttl [Integer, nil] Optional TTL override for this fetch # @return [String, Array] Compiled prompt (String for text, Array for chat) # @raise [ArgumentError] if both version and label are provided # @raise [ArgumentError] if fallback is provided without type @@ -148,10 +240,19 @@ def list_prompts(page: nil, limit: nil) # fallback: "Hello {{name}}!", # type: :text # ) - def compile_prompt(name, variables: {}, version: nil, label: nil, fallback: nil, type: nil) - prompt = get_prompt(name, version: version, label: label, fallback: fallback, type: type) + # rubocop:disable Metrics/ParameterLists + def compile_prompt(name, variables: {}, version: nil, label: nil, fallback: nil, type: nil, cache_ttl: nil) + prompt = get_prompt( + name, + version: version, + label: label, + fallback: fallback, + type: type, + cache_ttl: cache_ttl + ) prompt.compile(**variables) end + # rubocop:enable Metrics/ParameterLists # Create a new prompt (or new version if name already exists) # @@ -738,6 +839,66 @@ def resolve_experiment_items(data, dataset_name) list_dataset_items(dataset_name: dataset_name) end + def validate_fallback_usage!(fallback, type) + return unless fallback && !type + + raise ArgumentError, "type parameter is required when fallback is provided (use :text or :chat)" + end + + def build_client_fetch_result(api_result, prompt_client) + PromptFetchResult.new( + prompt: prompt_client, + logical_key: api_result.logical_key, + storage_key: api_result.storage_key, + cache_status: api_result.cache_status, + source: api_result.source, + name: prompt_client.name, + version: prompt_client.version, + label: api_result.label + ) + end + + def build_fallback_prompt_result(name, version, label, fallback, type, fetch_context) + prompt_client = build_fallback_prompt_client(name, fallback, type) + key = api_client.prompt_cache_key(name, version: version, label: label) + api_client.emit_prompt_cache_event( + :fallback, + fallback_event_payload(key, fetch_context, fetch_context.fetch(:error)) + ) + PromptFetchResult.new( + prompt: prompt_client, + logical_key: key.logical_key, + storage_key: key.storage_key, + cache_status: fallback_cache_status(fetch_context.fetch(:cache_ttl)), + source: :fallback, + name: name, + version: version || prompt_client.version, + label: key.label || (key.version ? nil : "production") + ) + end + + def fallback_event_payload(key, fetch_context, error) + { + name: key.name, + version: key.version, + label: key.label || (key.version ? nil : "production"), + logical_key: key.logical_key, + storage_key: key.storage_key, + backend: api_client.prompt_cache_stats.fetch(:backend), + cache_status: fallback_cache_status(fetch_context.fetch(:cache_ttl)), + source: :fallback, + error_class: error.class.name, + error_message: error.message + } + end + + def fallback_cache_status(cache_ttl) + return :bypass if cache_ttl&.zero? + return :disabled unless api_client.cache + + :miss + end + # Check if caching is enabled in configuration # # @return [Boolean] @@ -754,11 +915,17 @@ def create_cache create_memory_cache when :rails create_rails_cache_adapter + when :auto + rails_cache_available? ? create_rails_cache_adapter : create_memory_cache else raise ConfigurationError, "Unknown cache backend: #{config.cache_backend}" end end + def rails_cache_available? + defined?(Rails) && Rails.respond_to?(:cache) && Rails.cache + end + # Create in-memory cache with SWR support if enabled # # @return [PromptCache] diff --git a/lib/langfuse/config.rb b/lib/langfuse/config.rb index c6ff5a6..8c9dac2 100644 --- a/lib/langfuse/config.rb +++ b/lib/langfuse/config.rb @@ -41,7 +41,7 @@ class Config # @return [Integer] Maximum number of cached items attr_accessor :cache_max_size - # @return [Symbol] Cache backend (:memory or :rails) + # @return [Symbol] Cache backend (:memory, :rails, or :auto) attr_accessor :cache_backend # @return [Integer] Lock timeout in seconds for distributed cache stampede protection @@ -57,6 +57,9 @@ class Config # @return [Integer] Number of background threads for cache refresh attr_accessor :cache_refresh_threads + # @return [#call, nil] Observer called for prompt cache events + attr_accessor :prompt_cache_observer + # @return [Boolean] Use async processing for traces (requires ActiveJob) attr_accessor :tracing_async @@ -158,6 +161,7 @@ def initialize @cache_stale_while_revalidate = DEFAULT_CACHE_STALE_WHILE_REVALIDATE @cache_stale_ttl = 0 # Default to 0 (SWR disabled, entries expire immediately after TTL) @cache_refresh_threads = DEFAULT_CACHE_REFRESH_THREADS + @prompt_cache_observer = nil @tracing_async = DEFAULT_TRACING_ASYNC @batch_size = DEFAULT_BATCH_SIZE @flush_interval = DEFAULT_FLUSH_INTERVAL @@ -189,6 +193,7 @@ def validate! validate_swr_config! validate_cache_backend! + validate_prompt_cache_observer! validate_sample_rate! validate_should_export_span! validate_mask! @@ -240,13 +245,19 @@ def initialize_tracing_defaults end def validate_cache_backend! - valid_backends = %i[memory rails] + valid_backends = %i[memory rails auto] return if valid_backends.include?(cache_backend) raise ConfigurationError, "cache_backend must be one of #{valid_backends.inspect}, got #{cache_backend.inspect}" end + def validate_prompt_cache_observer! + return if prompt_cache_observer.nil? || prompt_cache_observer.respond_to?(:call) + + raise ConfigurationError, "prompt_cache_observer must respond to #call" + end + def validate_swr_config! validate_swr_stale_ttl! validate_refresh_threads! diff --git a/lib/langfuse/prompt_cache.rb b/lib/langfuse/prompt_cache.rb index df401de..011a122 100644 --- a/lib/langfuse/prompt_cache.rb +++ b/lib/langfuse/prompt_cache.rb @@ -1,6 +1,7 @@ # frozen_string_literal: true require "monitor" +require "base64" require_relative "stale_while_revalidate" module Langfuse @@ -14,6 +15,7 @@ module Langfuse # cache.set("greeting:1", prompt_data) # cache.get("greeting:1") # => prompt_data # + # rubocop:disable Metrics/ClassLength class PromptCache include StaleWhileRevalidate @@ -79,6 +81,8 @@ def initialize(ttl: 60, max_size: 1000, stale_ttl: 0, refresh_threads: 5, logger @stale_ttl = stale_ttl @logger = logger @cache = {} + @global_generation = 0 + @name_generations = Hash.new(0) @monitor = Monitor.new @locks = {} # Track locks for in-memory locking initialize_swr(refresh_threads: refresh_threads) if swr_enabled? @@ -98,24 +102,46 @@ def get(key) end end + # Read a raw cache entry, including stale entries. + # + # @param key [String] Cache key + # @return [CacheEntry, nil] Raw cache entry + def entry(key) + @monitor.synchronize do + @cache[key] + end + end + # Set a value in the cache # # @param key [String] Cache key # @param value [Object] Value to cache # @return [Object] The cached value - def set(key, value) + def set(key, value, ttl: nil, stale_ttl: nil) @monitor.synchronize do # Evict oldest entry if at max size evict_oldest if @cache.size >= max_size now = Time.now - fresh_until = now + ttl - stale_until = fresh_until + stale_ttl + effective_ttl = ttl.nil? ? self.ttl : ttl + effective_stale_ttl = stale_ttl.nil? ? self.stale_ttl : stale_ttl + fresh_until = now + effective_ttl + stale_until = fresh_until + effective_stale_ttl @cache[key] = CacheEntry.new(value, fresh_until, stale_until) value end end + # Delete one generated storage key. + # + # @param key [String] Generated storage key + # @return [Boolean] true if an entry was removed + def delete(key) + @monitor.synchronize do + !@cache.delete(key).nil? + end + end + # Clear the entire cache # # @return [void] @@ -125,6 +151,57 @@ def clear end end + # Logically invalidate every generated storage key. + # + # @return [Integer] New global generation + def clear_logically + @monitor.synchronize do + @global_generation += 1 + end + end + + # Logically invalidate every cache variant for one prompt name. + # + # @param name [String] Prompt name + # @return [Integer] New name generation + def invalidate_name(name) + @monitor.synchronize do + @name_generations[name.to_s] += 1 + end + end + + # Build a generated storage key for the current cache generation. + # + # @param logical_key [String] Stable logical cache identity + # @param name [String] Prompt name + # @return [String] Generated storage key + def storage_key(logical_key, name:) + @monitor.synchronize do + self.class.storage_key( + logical_key, + name: name, + global_generation: @global_generation, + name_generation: @name_generations[name.to_s] + ) + end + end + + # @return [Hash] Prompt cache statistics + def stats + @monitor.synchronize do + counts = count_entries_by_generation + { + backend: "memory", + enabled: true, + current_generation_entries: counts.fetch(:current), + orphaned_entries: counts.fetch(:orphaned), + total_entries: @cache.size, + global_generation: @global_generation, + unsupported_counts: [] + } + end + end + # Remove expired entries from cache # # @return [Integer] Number of entries removed @@ -154,6 +231,15 @@ def empty? end end + # Validate that the memory cache backend is usable. + # + # @return [Boolean] + # rubocop:disable Naming/PredicateMethod + def validate! + true + end + # rubocop:enable Naming/PredicateMethod + # Build a cache key from prompt name and options # # @param name [String] Prompt name @@ -168,6 +254,18 @@ def self.build_key(name, version: nil, label: nil) key end + # Build a generated storage key from generation metadata. + # + # @param logical_key [String] Stable logical cache identity + # @param name [String] Prompt name + # @param global_generation [Integer] Global cache generation + # @param name_generation [Integer] Prompt-name cache generation + # @return [String] Generated storage key + def self.storage_key(logical_key, name:, global_generation:, name_generation:) + encoded_name = Base64.urlsafe_encode64(name.to_s, padding: false) + "g#{global_generation}:n#{encoded_name}:#{name_generation}:#{logical_key}" + end + private # Implementation of StaleWhileRevalidate abstract methods @@ -187,7 +285,7 @@ def cache_get(key) # @param key [String] Cache key # @param value [PromptCache::CacheEntry] Value to cache # @return [PromptCache::CacheEntry] The cached value - def cache_set(key, value) + def cache_set(key, value, **_options) @monitor.synchronize do # Evict oldest entry if at max size evict_oldest if @cache.size >= max_size @@ -230,6 +328,29 @@ def release_lock(lock_key) end end + def count_entries_by_generation + @cache.each_key.with_object({ current: 0, orphaned: 0 }) do |key, counts| + if current_generation_key?(key) + counts[:current] += 1 + else + counts[:orphaned] += 1 + end + end + end + + def current_generation_key?(key) + parts = key.split(":", 4) + return false unless parts.size == 4 + return false unless parts[0].start_with?("g") && parts[1].start_with?("n") + + global = Integer(parts[0][1..]) + name = Base64.urlsafe_decode64(parts[1][1..]) + name_generation = Integer(parts[2]) + global == @global_generation && name_generation == @name_generations[name] + rescue ArgumentError + false + end + # In-memory cache helper methods # Evict the oldest entry from cache @@ -250,4 +371,5 @@ def default_logger Logger.new($stdout, level: Logger::WARN) end end + # rubocop:enable Metrics/ClassLength end diff --git a/lib/langfuse/prompt_fetch_result.rb b/lib/langfuse/prompt_fetch_result.rb new file mode 100644 index 0000000..b11e511 --- /dev/null +++ b/lib/langfuse/prompt_fetch_result.rb @@ -0,0 +1,121 @@ +# frozen_string_literal: true + +module Langfuse + # Public metadata returned by prompt fetch operations. + class PromptFetchResult + # @return [Object] Prompt data or prompt client returned by the fetch + attr_reader :prompt + + # @return [String] Stable logical cache identity + attr_reader :logical_key + + # @return [String] Generated backend key for the current cache generation + attr_reader :storage_key + + # @return [Symbol] Cache status (:hit, :miss, :stale, :refresh, :bypass, :disabled) + attr_reader :cache_status + + # @return [Symbol] Source of the returned prompt (:cache, :api, :fallback) + attr_reader :source + + # @return [String] Prompt name + attr_reader :name + + # @return [Integer, nil] Prompt version + attr_reader :version + + # @return [String, nil] Prompt label + attr_reader :label + + # @param prompt [Object] Prompt data or prompt client + # @param logical_key [String] Stable logical cache identity + # @param storage_key [String] Generated backend key + # @param cache_status [Symbol] Cache status + # @param source [Symbol] Prompt source + # @param name [String] Prompt name + # @param version [Integer, nil] Prompt version + # @param label [String, nil] Prompt label + # @return [PromptFetchResult] + # rubocop:disable Metrics/ParameterLists + def initialize(prompt:, logical_key:, storage_key:, cache_status:, source:, name:, version: nil, label: nil) + @prompt = prompt + @logical_key = logical_key + @storage_key = storage_key + @cache_status = cache_status.to_sym + @source = source.to_sym + @name = name + @version = version + @label = label + end + # rubocop:enable Metrics/ParameterLists + + # Compatibility alias for callers that already use "cache key" language. + # + # @return [String] Stable logical cache identity + def cache_key + logical_key + end + + # @return [Boolean] Whether this result used caller-provided fallback content + def fallback? + source == :fallback || (prompt.respond_to?(:is_fallback) && prompt.is_fallback) + end + + # @return [Hash] Result metadata as a hash + def to_h + { + logical_key: logical_key, + storage_key: storage_key, + cache_status: cache_status, + source: source, + name: name, + version: version, + label: label, + fallback: fallback? + } + end + end + + # Public key inspection result for prompt cache operations. + class PromptCacheKey + # @return [String] Prompt name + attr_reader :name + + # @return [Integer, nil] Prompt version + attr_reader :version + + # @return [String, nil] Prompt label + attr_reader :label + + # @return [String] Stable logical cache identity + attr_reader :logical_key + + # @return [String] Generated backend key for the current cache generation + attr_reader :storage_key + + # @param name [String] Prompt name + # @param logical_key [String] Stable logical cache identity + # @param storage_key [String] Generated backend key + # @param version [Integer, nil] Prompt version + # @param label [String, nil] Prompt label + # @return [PromptCacheKey] + def initialize(name:, logical_key:, storage_key:, version: nil, label: nil) + @name = name + @version = version + @label = label + @logical_key = logical_key + @storage_key = storage_key + end + + # @return [Hash] Cache key data as a hash + def to_h + { + name: name, + version: version, + label: label, + logical_key: logical_key, + storage_key: storage_key + } + end + end +end diff --git a/lib/langfuse/rails_cache_adapter.rb b/lib/langfuse/rails_cache_adapter.rb index 319b24c..8aaff06 100644 --- a/lib/langfuse/rails_cache_adapter.rb +++ b/lib/langfuse/rails_cache_adapter.rb @@ -14,6 +14,7 @@ module Langfuse # adapter.set("greeting:1", prompt_data) # adapter.get("greeting:1") # => prompt_data # + # rubocop:disable Metrics/ClassLength class RailsCacheAdapter include StaleWhileRevalidate @@ -65,18 +66,36 @@ def get(key) Rails.cache.read(namespaced_key(key)) end + # Read a raw cache entry, including stale entries. + # + # @param key [String] Cache key + # @return [Object, nil] Raw cache entry + def entry(key) + Rails.cache.read(namespaced_key(key)) + end + # Set a value in the cache # # @param key [String] Cache key # @param value [Object] Value to cache # @return [Object] The cached value - def set(key, value) + def set(key, value, ttl: nil, stale_ttl: nil) # Calculate expiration: use total_ttl if SWR enabled, otherwise just ttl - expires_in = swr_enabled? ? total_ttl : ttl + effective_ttl = ttl.nil? ? self.ttl : ttl + effective_stale_ttl = stale_ttl.nil? ? self.stale_ttl : stale_ttl + expires_in = swr_enabled? ? effective_ttl + effective_stale_ttl : effective_ttl Rails.cache.write(namespaced_key(key), value, expires_in:) value end + # Delete one generated storage key. + # + # @param key [String] Cache key + # @return [Boolean] true if an entry was removed + def delete(key) + Rails.cache.delete(namespaced_key(key)) + end + # Clear the entire Langfuse cache namespace # # Note: This uses delete_matched which may not be available on all cache stores. @@ -88,6 +107,36 @@ def clear Rails.cache.delete_matched("#{namespace}:*") end + # Logically invalidate every generated storage key. + # + # @return [Integer] New global generation + def clear_logically + bump_generation(global_generation_key) + end + + # Logically invalidate every cache variant for one prompt name. + # + # @param name [String] Prompt name + # @return [Integer] New name generation + def invalidate_name(name) + bump_generation(name_generation_key(name)) + end + + # Build a generated storage key for the current cache generation. + # + # @param logical_key [String] Stable logical cache identity + # @param name [String] Prompt name + # @return [String] Generated storage key + def storage_key(logical_key, name:) + generated = PromptCache.storage_key( + logical_key, + name: name, + global_generation: generation_value(global_generation_key), + name_generation: generation_value(name_generation_key(name)) + ) + namespaced_key(generated) + end + # Get current cache size # # Note: Rails.cache doesn't provide a size method, so we return nil @@ -98,6 +147,19 @@ def size nil end + # @return [Hash] Prompt cache statistics + def stats + { + backend: "rails", + enabled: true, + current_generation_entries: nil, + orphaned_entries: nil, + total_entries: nil, + global_generation: generation_value(global_generation_key), + unsupported_counts: %i[current_generation_entries orphaned_entries total_entries] + } + end + # Check if cache is empty # # Note: Rails.cache doesn't provide an efficient way to check if empty, @@ -108,6 +170,17 @@ def empty? false end + # Validate that Rails.cache is available for prompt caching. + # + # @return [Boolean] + # @raise [ConfigurationError] if Rails.cache is not available + # rubocop:disable Naming/PredicateMethod + def validate! + validate_rails_cache! + true + end + # rubocop:enable Naming/PredicateMethod + # Build a cache key from prompt name and options # # @param name [String] Prompt name @@ -135,7 +208,7 @@ def self.build_key(name, version: nil, label: nil) # cache.fetch_with_lock("greeting:v1") do # api_client.get_prompt("greeting") # end - def fetch_with_lock(key) + def fetch_with_lock(key, ttl: nil) # 1. Check cache first (fast path - no lock needed) cached = get(key) return cached if cached @@ -147,7 +220,7 @@ def fetch_with_lock(key) begin # We got the lock - fetch from source and populate cache value = yield - set(key, value) + set(key, value, ttl: ttl) value ensure # Always release lock, even if block raises @@ -173,7 +246,7 @@ def fetch_with_lock(key) # @param key [String] Cache key # @return [Object, nil] Cached value def cache_get(key) - get(key) + entry(key) end # Set value in cache (SWR interface) @@ -181,8 +254,9 @@ def cache_get(key) # @param key [String] Cache key # @param value [Object] Value to cache (expects CacheEntry) # @return [Object] The cached value - def cache_set(key, value) - set(key, value) + def cache_set(key, value, ttl: nil) + Rails.cache.write(namespaced_key(key), value, expires_in: ttl || total_ttl) + value end # Build lock key with namespace @@ -248,7 +322,38 @@ def wait_for_cache(key) # @param key [String] Original cache key # @return [String] Namespaced cache key def namespaced_key(key) - "#{namespace}:#{key}" + key.start_with?("#{namespace}:") ? key : "#{namespace}:#{key}" + end + + def global_generation_key + namespaced_key("__prompt_cache_generation__:global") + end + + def name_generation_key(name) + encoded_name = Base64.urlsafe_encode64(name.to_s, padding: false) + namespaced_key("__prompt_cache_generation__:name:#{encoded_name}") + end + + def generation_value(key) + Rails.cache.read(key).to_i + end + + def bump_generation(key) + incremented = increment_generation(key) + return incremented if incremented + + new_value = generation_value(key) + 1 + Rails.cache.write(key, new_value) + new_value + end + + def increment_generation(key) + return unless Rails.cache.respond_to?(:increment) + + Rails.cache.write(key, 0, unless_exist: true) + Rails.cache.increment(key, 1) + rescue StandardError + nil end # Validate that Rails.cache is available @@ -273,4 +378,5 @@ def default_logger end end end + # rubocop:enable Metrics/ClassLength end diff --git a/lib/langfuse/stale_while_revalidate.rb b/lib/langfuse/stale_while_revalidate.rb index 9d2e264..c469a95 100644 --- a/lib/langfuse/stale_while_revalidate.rb +++ b/lib/langfuse/stale_while_revalidate.rb @@ -41,6 +41,7 @@ module Langfuse # # Implementation-specific lock release # end # end + # rubocop:disable Metrics/ModuleLength module StaleWhileRevalidate # Initialize SWR infrastructure # @@ -72,7 +73,7 @@ def initialize_swr(refresh_threads: 5) # cache.fetch_with_stale_while_revalidate("greeting:v1") do # api_client.get_prompt("greeting") # end - def fetch_with_stale_while_revalidate(key, &) + def fetch_with_stale_while_revalidate(key, ttl: nil, stale_ttl: nil, &) raise ConfigurationError, "fetch_with_stale_while_revalidate requires a positive stale_ttl" unless swr_enabled? entry = cache_get(key) @@ -84,15 +85,50 @@ def fetch_with_stale_while_revalidate(key, &) elsif entry&.stale? # REVALIDATE - return stale + refresh in background logger.debug("CACHE STALE!") - schedule_refresh(key, &) + schedule_refresh(key, ttl: ttl, stale_ttl: stale_ttl, &) entry.data # Instant response! else # MISS - must fetch synchronously logger.debug("CACHE MISS!") - fetch_and_cache(key, &) + fetch_and_cache(key, ttl: ttl, stale_ttl: stale_ttl, &) end end + # Schedule a cache refresh without performing a read. + # + # @param key [String] Cache key + # @param ttl [Integer, nil] Optional fresh TTL override + # @param stale_ttl [Integer, nil] Optional stale TTL override + # @param on_success [#call, nil] Callback invoked after a successful write + # @param on_failure [#call, nil] Callback invoked when refresh raises + # @yield Block to execute to fetch fresh data + # @return [Boolean] true if a refresh was scheduled + def refresh_async(key, ttl: nil, stale_ttl: nil, on_success: nil, on_failure: nil, &) + raise ConfigurationError, "refresh_async requires a positive stale_ttl" unless swr_enabled? + + schedule_refresh( + key, + ttl: ttl, + stale_ttl: stale_ttl, + on_success: on_success, + on_failure: on_failure, + & + ) + end + + # Write a value with stale-while-revalidate metadata. + # + # @param key [String] Cache key + # @param value [Object] Value to cache + # @param ttl [Integer, nil] Optional fresh TTL override + # @param stale_ttl [Integer, nil] Optional stale TTL override + # @return [Object] The cached value + def write_with_stale_while_revalidate(key, value, ttl: nil, stale_ttl: nil) + raise ConfigurationError, "write_with_stale_while_revalidate requires a positive stale_ttl" unless swr_enabled? + + set_cache_entry(key, value, ttl: ttl, stale_ttl: stale_ttl) + end + # Check if SWR is enabled # # SWR is enabled when stale_ttl is positive, meaning there's a grace period @@ -138,29 +174,35 @@ def initialize_thread_pool(refresh_threads) # @param key [String] Cache key # @yield Block to execute to fetch fresh data # @return [void] - def schedule_refresh(key, &block) + # rubocop:disable Naming/PredicateMethod + def schedule_refresh(key, ttl: nil, stale_ttl: nil, on_success: nil, on_failure: nil, &block) # Prevent duplicate refreshes lock_key = build_lock_key(key) - return unless acquire_lock(lock_key) + return false unless acquire_lock(lock_key) @thread_pool.post do value = yield block - set_cache_entry(key, value) + set_cache_entry(key, value, ttl: ttl, stale_ttl: stale_ttl) + on_success&.call(value) rescue StandardError => e + on_failure&.call(e) logger.error("Langfuse cache refresh failed for key '#{key}': #{e.class} - #{e.message}") ensure release_lock(lock_key) end + + true end + # rubocop:enable Naming/PredicateMethod # Fetch data and cache it with SWR metadata # # @param key [String] Cache key # @yield Block to execute to fetch fresh data # @return [Object] Freshly fetched value - def fetch_and_cache(key, &block) + def fetch_and_cache(key, ttl: nil, stale_ttl: nil, &block) value = yield block - set_cache_entry(key, value) + set_cache_entry(key, value, ttl: ttl, stale_ttl: stale_ttl) end # Set value in cache with SWR metadata (CacheEntry) @@ -168,13 +210,15 @@ def fetch_and_cache(key, &block) # @param key [String] Cache key # @param value [Object] Value to cache # @return [Object] The cached value - def set_cache_entry(key, value) + def set_cache_entry(key, value, ttl: nil, stale_ttl: nil) now = Time.now - fresh_until = now + ttl - stale_until = fresh_until + stale_ttl + effective_ttl = ttl.nil? ? self.ttl : ttl + effective_stale_ttl = stale_ttl.nil? ? self.stale_ttl : stale_ttl + fresh_until = now + effective_ttl + stale_until = fresh_until + effective_stale_ttl entry = PromptCache::CacheEntry.new(value, fresh_until, stale_until) - cache_set(key, entry) + cache_set(key, entry, ttl: effective_ttl + effective_stale_ttl) value end @@ -213,7 +257,7 @@ def cache_get(_key) # @param value [Object] Value to cache # @return [Object] The cached value # @raise [NotImplementedError] if not implemented by including class - def cache_set(_key, _value) + def cache_set(_key, _value, ttl: nil) raise NotImplementedError, "#{self.class} must implement #cache_set" end @@ -259,4 +303,5 @@ def logger @logger || raise(NotImplementedError, "#{self.class} must provide @logger") end end + # rubocop:enable Metrics/ModuleLength end diff --git a/spec/langfuse/api_client_spec.rb b/spec/langfuse/api_client_spec.rb index d34b0dc..4e74873 100644 --- a/spec/langfuse/api_client_spec.rb +++ b/spec/langfuse/api_client_spec.rb @@ -974,6 +974,207 @@ def set(_key, value) # rubocop:enable RSpec/MultipleMemoizedHelpers end + # rubocop:disable RSpec/MultipleMemoizedHelpers + describe "public prompt cache operations" do + let(:prompt_name) { "greeting" } + let(:prompt_response) do + { + "id" => "prompt-123", + "name" => prompt_name, + "version" => 1, + "prompt" => "Hello {{name}}!", + "type" => "text", + "labels" => ["production"] + } + end + let(:cache) { Langfuse::PromptCache.new(ttl: 60) } + let(:cached_client) do + described_class.new(public_key: public_key, secret_key: secret_key, base_url: base_url, cache: cache) + end + + it "returns fetch metadata while keeping get_prompt prompt-data compatible" do + stub_prompt(prompt_response) + + result = cached_client.get_prompt_result(prompt_name) + prompt = cached_client.get_prompt(prompt_name) + + expect(result).to be_a(Langfuse::PromptFetchResult) + expect(result.prompt).to eq(prompt_response) + expect(result.logical_key).to eq("greeting:production") + expect(result.storage_key).to include("g0:") + expect(result.cache_status).to eq(:miss) + expect(result.source).to eq(:api) + expect(prompt).to eq(prompt_response) + expect(a_request(:get, prompt_url)).to have_been_made.once + end + + it "reports cache hits and emits cache events" do + events = [] + client = described_class.new( + public_key: public_key, + secret_key: secret_key, + base_url: base_url, + cache: cache, + cache_observer: ->(event, payload) { events << [event, payload] } + ) + stub_prompt(prompt_response) + + miss = client.get_prompt_result(prompt_name) + hit = client.get_prompt_result(prompt_name) + + expect(miss.cache_status).to eq(:miss) + expect(hit.cache_status).to eq(:hit) + expect(hit.source).to eq(:cache) + expect(events.map(&:first)).to include(:miss, :write, :hit) + expect(events.last.last).to include(logical_key: "greeting:production", source: :cache) + expect(a_request(:get, prompt_url)).to have_been_made.once + end + + it "bypasses cache reads and writes when cache_ttl is zero" do + stub_request(:get, prompt_url) + .to_return( + { status: 200, body: prompt_response.merge("version" => 1).to_json, + headers: { "Content-Type" => "application/json" } }, + { status: 200, body: prompt_response.merge("version" => 2).to_json, + headers: { "Content-Type" => "application/json" } } + ) + + first = cached_client.get_prompt_result(prompt_name, cache_ttl: 0) + second = cached_client.get_prompt_result(prompt_name, cache_ttl: 0) + + expect(first.cache_status).to eq(:bypass) + expect(second.cache_status).to eq(:bypass) + expect(second.version).to eq(2) + expect(a_request(:get, prompt_url)).to have_been_made.twice + end + + it "refreshes a prompt and writes the refreshed value through to cache" do + stub_request(:get, prompt_url) + .to_return( + { status: 200, body: prompt_response.merge("version" => 1).to_json, + headers: { "Content-Type" => "application/json" } }, + { status: 200, body: prompt_response.merge("version" => 2, "prompt" => "Hi {{name}}!").to_json, + headers: { "Content-Type" => "application/json" } } + ) + + cached_client.get_prompt_result(prompt_name) + refresh = cached_client.refresh_prompt(prompt_name) + hit = cached_client.get_prompt_result(prompt_name) + + expect(refresh.cache_status).to eq(:refresh) + expect(refresh.source).to eq(:api) + expect(hit.cache_status).to eq(:hit) + expect(hit.prompt["prompt"]).to eq("Hi {{name}}!") + expect(a_request(:get, prompt_url)).to have_been_made.twice + end + + it "invalidates one exact logical key without prefix matching" do + other_response = prompt_response.merge("name" => "greeting-extra") + stub_prompt(prompt_response) + stub_request(:get, "#{base_url}/api/public/v2/prompts/greeting-extra") + .to_return(status: 200, body: other_response.to_json, headers: { "Content-Type" => "application/json" }) + + cached_client.get_prompt_result(prompt_name) + cached_client.get_prompt_result("greeting-extra") + invalidated = cached_client.invalidate_prompt_cache(prompt_name) + cached_client.get_prompt_result(prompt_name) + cached_client.get_prompt_result("greeting-extra") + + expect(invalidated.logical_key).to eq("greeting:production") + expect(a_request(:get, prompt_url)).to have_been_made.twice + expect(a_request(:get, "#{base_url}/api/public/v2/prompts/greeting-extra")).to have_been_made.once + end + + it "invalidates all variants for a prompt name after prompt mutation" do + stub_request(:get, prompt_url) + .to_return( + { status: 200, body: prompt_response.merge("version" => 1).to_json, + headers: { "Content-Type" => "application/json" } }, + { status: 200, body: prompt_response.merge("version" => 2).to_json, + headers: { "Content-Type" => "application/json" } } + ) + stub_request(:post, "#{base_url}/api/public/v2/prompts") + .to_return(status: 201, body: prompt_response.merge("version" => 2).to_json, + headers: { "Content-Type" => "application/json" }) + + cached_client.get_prompt_result(prompt_name) + cached_client.create_prompt(name: prompt_name, prompt: "Hi", type: "text") + result = cached_client.get_prompt_result(prompt_name) + + expect(result.version).to eq(2) + expect(a_request(:get, prompt_url)).to have_been_made.twice + end + + it "uses Rails generation keys for name-wide and global invalidation" do + store = build_rails_cache_store + stub_const("Rails", Class.new) + allow(Rails).to receive(:cache).and_return(store) + rails_cache = Langfuse::RailsCacheAdapter.new(ttl: 60) + client = described_class.new( + public_key: public_key, + secret_key: secret_key, + base_url: base_url, + cache: rails_cache + ) + stub_prompt(prompt_response) + + first_key = client.prompt_cache_key(prompt_name).storage_key + client.get_prompt_result(prompt_name) + client.invalidate_prompt_cache_by_name(prompt_name) + second_key = client.prompt_cache_key(prompt_name).storage_key + client.clear_prompt_cache + third_key = client.prompt_cache_key(prompt_name).storage_key + + expect(second_key).not_to eq(first_key) + expect(third_key).not_to eq(second_key) + expect(store.deleted_patterns).to be_empty + end + + def prompt_url + "#{base_url}/api/public/v2/prompts/#{prompt_name}" + end + + def stub_prompt(response) + stub_request(:get, prompt_url) + .to_return(status: 200, body: response.to_json, headers: { "Content-Type" => "application/json" }) + end + + def build_rails_cache_store + Class.new do + attr_reader :deleted_patterns + + def initialize + @data = {} + @deleted_patterns = [] + end + + def read(key) + @data[key] + end + + # rubocop:disable Naming/PredicateMethod + def write(key, value, **_options) + @data[key] = value + true + end + + def delete(key) + !@data.delete(key).nil? + end + # rubocop:enable Naming/PredicateMethod + + def delete_matched(pattern) + @deleted_patterns << pattern + end + + def increment(key, amount) + @data[key] = @data.fetch(key, 0).to_i + amount + end + end.new + end + end + # rubocop:enable RSpec/MultipleMemoizedHelpers + describe "#create_prompt" do let(:prompt_name) { "new-prompt" } let(:text_prompt_request) do diff --git a/spec/langfuse/client_spec.rb b/spec/langfuse/client_spec.rb index 66403ab..917b31d 100644 --- a/spec/langfuse/client_spec.rb +++ b/spec/langfuse/client_spec.rb @@ -154,6 +154,35 @@ class << self end end + context "with auto cache backend" do + let(:config_with_auto_cache) do + Langfuse::Config.new do |config| + config.public_key = "pk_test_123" + config.secret_key = "sk_test_456" + config.base_url = "https://cloud.langfuse.com" + config.cache_ttl = 60 + config.cache_backend = :auto + end + end + + it "uses memory cache when Rails.cache is unavailable" do + client = described_class.new(config_with_auto_cache) + expect(client.api_client.cache).to be_a(Langfuse::PromptCache) + end + + it "uses Rails cache when Rails.cache is available" do + rails_class = Class.new do + def self.cache + Object.new + end + end + stub_const("Rails", rails_class) + + client = described_class.new(config_with_auto_cache) + expect(client.api_client.cache).to be_a(Langfuse::RailsCacheAdapter) + end + end + context "with invalid cache backend" do let(:config_invalid_backend) do Langfuse::Config.new do |config| @@ -520,6 +549,27 @@ class << self a_request(:get, "#{base_url}/api/public/v2/prompts/greeting") ).to have_been_made.once end + + it "returns prompt clients with cache metadata" do + result = cached_client.get_prompt_result("greeting") + cached_result = cached_client.get_prompt_result("greeting") + + expect(result.prompt).to be_a(Langfuse::TextPromptClient) + expect(result.logical_key).to eq("greeting:production") + expect(result.cache_status).to eq(:miss) + expect(result.source).to eq(:api) + expect(cached_result.cache_status).to eq(:hit) + expect(cached_result.source).to eq(:cache) + end + + it "exposes flat prompt cache inspection and validation APIs" do + key = cached_client.prompt_cache_key("greeting") + + expect(key.logical_key).to eq("greeting:production") + expect(key.storage_key).to include("g0:") + expect(cached_client.prompt_cache_stats).to include(backend: "memory", enabled: true) + expect(cached_client.validate_prompt_cache_backend!).to be(true) + end end context "with fallback support" do @@ -543,6 +593,16 @@ class << self expect(result.is_fallback).to be(true) end + it "returns fallback fetch metadata" do + result = client.get_prompt_result("missing", fallback: "Hello!", type: :text) + + expect(result.prompt).to be_a(Langfuse::TextPromptClient) + expect(result.fallback?).to be(true) + expect(result.source).to eq(:fallback) + expect(result.cache_status).to eq(:miss) + expect(result.logical_key).to eq("missing:production") + end + it "raises error when no fallback provided" do expect do client.get_prompt("missing") diff --git a/spec/langfuse/config_spec.rb b/spec/langfuse/config_spec.rb index 6c860b1..40b89cf 100644 --- a/spec/langfuse/config_spec.rb +++ b/spec/langfuse/config_spec.rb @@ -300,6 +300,26 @@ config.cache_backend = :rails expect { config.validate! }.not_to raise_error end + + it "allows :auto backend" do + config.cache_backend = :auto + expect { config.validate! }.not_to raise_error + end + end + + context "when prompt_cache_observer is invalid" do + it "raises ConfigurationError when observer is not callable" do + config.prompt_cache_observer = Object.new + expect { config.validate! }.to raise_error( + Langfuse::ConfigurationError, + "prompt_cache_observer must respond to #call" + ) + end + + it "allows callable observers" do + config.prompt_cache_observer = ->(_event, _payload) {} + expect { config.validate! }.not_to raise_error + end end context "when sample_rate is invalid" do diff --git a/spec/langfuse/prompt_cache_spec.rb b/spec/langfuse/prompt_cache_spec.rb index c786933..33e5a14 100644 --- a/spec/langfuse/prompt_cache_spec.rb +++ b/spec/langfuse/prompt_cache_spec.rb @@ -224,6 +224,50 @@ end end + describe "generated cache keys" do + it "keeps logical keys stable while changing storage keys by generation" do + logical_key = described_class.build_key("greeting") + first_storage_key = cache.storage_key(logical_key, name: "greeting") + + cache.invalidate_name("greeting") + second_storage_key = cache.storage_key(logical_key, name: "greeting") + + cache.clear_logically + third_storage_key = cache.storage_key(logical_key, name: "greeting") + + expect(logical_key).to eq("greeting:production") + expect(second_storage_key).not_to eq(first_storage_key) + expect(third_storage_key).not_to eq(second_storage_key) + end + + it "tracks current and orphaned entries after logical invalidation" do + logical_key = described_class.build_key("greeting") + old_storage_key = cache.storage_key(logical_key, name: "greeting") + cache.set(old_storage_key, test_data) + + cache.invalidate_name("greeting") + new_storage_key = cache.storage_key(logical_key, name: "greeting") + cache.set(new_storage_key, test_data) + + expect(cache.stats).to include( + current_generation_entries: 1, + orphaned_entries: 1, + total_entries: 2 + ) + end + + it "deletes one generated storage key without touching sibling names" do + greeting_key = cache.storage_key(described_class.build_key("greeting"), name: "greeting") + sibling_key = cache.storage_key(described_class.build_key("greeting-extra"), name: "greeting-extra") + cache.set(greeting_key, test_data) + cache.set(sibling_key, { "name" => "greeting-extra" }) + + expect(cache.delete(greeting_key)).to be(true) + expect(cache.get(greeting_key)).to be_nil + expect(cache.get(sibling_key)).to eq({ "name" => "greeting-extra" }) + end + end + describe "TTL expiration" do it "returns nil for expired entries" do cache.set("key1", test_data) From b7e2f2bbc81f74105d2647b3bc37deb468bfa252 Mon Sep 17 00:00:00 2001 From: kadekillary Date: Mon, 4 May 2026 03:36:58 -0600 Subject: [PATCH 2/7] refactor(prompts): simplify prompt cache operations - Centralize label-defaulting on PromptCacheKey#resolved_label - Share event payload builder between ApiClient and Client - Memoize cache backend name and ActiveSupport::Notifications lookup - Skip event emission early when no observers are configured - Drop dead to_sym coercions and uncalled fetch_with_simple_cache helpers --- lib/langfuse/api_client.rb | 55 ++++++++++++----------------- lib/langfuse/client.rb | 17 ++++----- lib/langfuse/prompt_fetch_result.rb | 12 +++++-- 3 files changed, 39 insertions(+), 45 deletions(-) diff --git a/lib/langfuse/api_client.rb b/lib/langfuse/api_client.rb index 0a2ea41..67f861b 100644 --- a/lib/langfuse/api_client.rb +++ b/lib/langfuse/api_client.rb @@ -63,6 +63,8 @@ def initialize(public_key:, secret_key:, base_url:, timeout: 5, logger: nil, cac @logger = logger || Logger.new($stdout, level: Logger::WARN) @cache = cache @cache_observer = cache_observer + @cache_backend_name = compute_cache_backend_name + @active_support_notifications = defined?(ActiveSupport::Notifications) ? ActiveSupport::Notifications : nil end # rubocop:enable Metrics/ParameterLists @@ -237,11 +239,25 @@ def prompt_cache_stats # @param payload [Hash] Event payload # @return [void] def emit_prompt_cache_event(event, payload) + return if @cache_observer.nil? && @active_support_notifications.nil? + normalized_payload = payload.merge(event: event.to_sym) notify_cache_observer(event, normalized_payload) notify_active_support(normalized_payload) end + # Build a prompt cache event payload for one logical key. + # + # @api private + # @param key [PromptCacheKey] Logical and storage cache key + # @param cache_status [Symbol] Cache status (:hit, :miss, :stale, :refresh, :bypass, :disabled) + # @param source [Symbol] Prompt source (:cache, :api, :fallback) + # @param extra [Hash] Extra fields to merge into the payload + # @return [Hash] Event payload + def prompt_event_payload(key, cache_status, source, extra = {}) + event_payload(key, cache_status, source, extra) + end + # Create a new prompt (or new version if prompt with same name exists) # # @param name [String] The prompt name @@ -887,7 +903,7 @@ def build_prompt_result(key, prompt_data, cache_status, source) source: source, name: prompt_data["name"] || key.name, version: prompt_data["version"] || key.version, - label: key.label || (key.version ? nil : "production") + label: key.resolved_label ) end @@ -895,7 +911,7 @@ def event_payload(key, cache_status, source, extra = {}) { name: key.name, version: key.version, - label: key.label || (key.version ? nil : "production"), + label: key.resolved_label, logical_key: key.logical_key, storage_key: key.storage_key, backend: cache_backend_name, @@ -904,7 +920,9 @@ def event_payload(key, cache_status, source, extra = {}) }.merge(extra) end - def cache_backend_name + attr_reader :cache_backend_name + + def compute_cache_backend_name return "disabled" unless cache return "rails" if cache.is_a?(RailsCacheAdapter) return "memory" if cache.is_a?(PromptCache) @@ -943,30 +961,13 @@ def notify_cache_observer(event, payload) end def notify_active_support(payload) - return unless defined?(ActiveSupport::Notifications) + return unless @active_support_notifications - ActiveSupport::Notifications.instrument("prompt_cache.langfuse", payload) + @active_support_notifications.instrument("prompt_cache.langfuse", payload) rescue StandardError => e logger.warn("Langfuse ActiveSupport cache notification failed: #{e.class} - #{e.message}") end - # Fetch prompt using the most appropriate caching strategy available - # - # @param cache_key [String] The cache key for this prompt - # @param name [String] The name of the prompt - # @param version [Integer, nil] Optional specific version number - # @param label [String, nil] Optional label - # @return [Hash] The prompt data - def fetch_with_appropriate_caching_strategy(cache_key, name, version, label) - if swr_cache_available? - fetch_with_swr_cache(cache_key, name, version, label) - elsif distributed_cache_available? - fetch_with_distributed_cache(cache_key, name, version, label) - else - fetch_with_simple_cache(cache_key, name, version, label) - end - end - # Check if SWR cache is available def swr_cache_available? cache.respond_to?(:swr_enabled?) && cache.swr_enabled? @@ -1062,16 +1063,6 @@ def fetch_with_distributed_cache(cache_key, name, version, label) end end - # Fetch with simple cache (in-memory cache) - def fetch_with_simple_cache(cache_key, name, version, label) - cached_data = cache.get(cache_key) - return cached_data if cached_data - - prompt_data = fetch_prompt_from_api(name, version: version, label: label) - cache.set(cache_key, prompt_data) - prompt_data - end - # Fetch a prompt from the API (without caching) # # @param name [String] The name of the prompt diff --git a/lib/langfuse/client.rb b/lib/langfuse/client.rb index de6df25..9c50cda 100644 --- a/lib/langfuse/client.rb +++ b/lib/langfuse/client.rb @@ -873,23 +873,18 @@ def build_fallback_prompt_result(name, version, label, fallback, type, fetch_con source: :fallback, name: name, version: version || prompt_client.version, - label: key.label || (key.version ? nil : "production") + label: key.resolved_label ) end def fallback_event_payload(key, fetch_context, error) - { - name: key.name, - version: key.version, - label: key.label || (key.version ? nil : "production"), - logical_key: key.logical_key, - storage_key: key.storage_key, - backend: api_client.prompt_cache_stats.fetch(:backend), - cache_status: fallback_cache_status(fetch_context.fetch(:cache_ttl)), - source: :fallback, + api_client.prompt_event_payload( + key, + fallback_cache_status(fetch_context.fetch(:cache_ttl)), + :fallback, error_class: error.class.name, error_message: error.message - } + ) end def fallback_cache_status(cache_ttl) diff --git a/lib/langfuse/prompt_fetch_result.rb b/lib/langfuse/prompt_fetch_result.rb index b11e511..3ef5021 100644 --- a/lib/langfuse/prompt_fetch_result.rb +++ b/lib/langfuse/prompt_fetch_result.rb @@ -41,8 +41,8 @@ def initialize(prompt:, logical_key:, storage_key:, cache_status:, source:, name @prompt = prompt @logical_key = logical_key @storage_key = storage_key - @cache_status = cache_status.to_sym - @source = source.to_sym + @cache_status = cache_status + @source = source @name = name @version = version @label = label @@ -107,6 +107,14 @@ def initialize(name:, logical_key:, storage_key:, version: nil, label: nil) @storage_key = storage_key end + # Resolve the effective label, defaulting to "production" when neither + # an explicit label nor a version was specified. + # + # @return [String, nil] Effective label + def resolved_label + label || (version ? nil : "production") + end + # @return [Hash] Cache key data as a hash def to_h { From 36218b27cfd94de89f30002c58881bdfd7d6465f Mon Sep 17 00:00:00 2001 From: kadekillary Date: Mon, 4 May 2026 04:01:14 -0600 Subject: [PATCH 3/7] fix(prompts): address prompt cache review comments --- lib/langfuse/client.rb | 12 +++++++----- lib/langfuse/stale_while_revalidate.rb | 4 ++-- spec/langfuse/client_spec.rb | 7 +++++++ spec/langfuse/prompt_cache_spec.rb | 18 ++++++++++++++++++ 4 files changed, 34 insertions(+), 7 deletions(-) diff --git a/lib/langfuse/client.rb b/lib/langfuse/client.rb index 9c50cda..f059f37 100644 --- a/lib/langfuse/client.rb +++ b/lib/langfuse/client.rb @@ -114,7 +114,7 @@ def get_prompt_result(name, version: nil, label: nil, fallback: nil, type: nil, # Log warning and return fallback config.logger.warn("Langfuse API error for prompt '#{name}': #{e.message}. Using fallback.") - build_fallback_prompt_result(name, version, label, fallback, type, cache_ttl: cache_ttl, error: e) + build_fallback_prompt_result(name, version, label, fallback, type, { cache_ttl: cache_ttl, error: e }) end # Refresh a prompt from the API, optionally writing through to cache. @@ -859,17 +859,19 @@ def build_client_fetch_result(api_result, prompt_client) end def build_fallback_prompt_result(name, version, label, fallback, type, fetch_context) + cache_ttl = fetch_context.fetch(:cache_ttl) + error = fetch_context.fetch(:error) prompt_client = build_fallback_prompt_client(name, fallback, type) key = api_client.prompt_cache_key(name, version: version, label: label) api_client.emit_prompt_cache_event( :fallback, - fallback_event_payload(key, fetch_context, fetch_context.fetch(:error)) + fallback_event_payload(key, cache_ttl, error) ) PromptFetchResult.new( prompt: prompt_client, logical_key: key.logical_key, storage_key: key.storage_key, - cache_status: fallback_cache_status(fetch_context.fetch(:cache_ttl)), + cache_status: fallback_cache_status(cache_ttl), source: :fallback, name: name, version: version || prompt_client.version, @@ -877,10 +879,10 @@ def build_fallback_prompt_result(name, version, label, fallback, type, fetch_con ) end - def fallback_event_payload(key, fetch_context, error) + def fallback_event_payload(key, cache_ttl, error) api_client.prompt_event_payload( key, - fallback_cache_status(fetch_context.fetch(:cache_ttl)), + fallback_cache_status(cache_ttl), :fallback, error_class: error.class.name, error_message: error.message diff --git a/lib/langfuse/stale_while_revalidate.rb b/lib/langfuse/stale_while_revalidate.rb index c469a95..11afa03 100644 --- a/lib/langfuse/stale_while_revalidate.rb +++ b/lib/langfuse/stale_while_revalidate.rb @@ -181,7 +181,7 @@ def schedule_refresh(key, ttl: nil, stale_ttl: nil, on_success: nil, on_failure: return false unless acquire_lock(lock_key) @thread_pool.post do - value = yield block + value = block.call set_cache_entry(key, value, ttl: ttl, stale_ttl: stale_ttl) on_success&.call(value) rescue StandardError => e @@ -201,7 +201,7 @@ def schedule_refresh(key, ttl: nil, stale_ttl: nil, on_success: nil, on_failure: # @yield Block to execute to fetch fresh data # @return [Object] Freshly fetched value def fetch_and_cache(key, ttl: nil, stale_ttl: nil, &block) - value = yield block + value = block.call set_cache_entry(key, value, ttl: ttl, stale_ttl: stale_ttl) end diff --git a/spec/langfuse/client_spec.rb b/spec/langfuse/client_spec.rb index 917b31d..6903f0d 100644 --- a/spec/langfuse/client_spec.rb +++ b/spec/langfuse/client_spec.rb @@ -603,6 +603,13 @@ def self.cache expect(result.logical_key).to eq("missing:production") end + it "returns bypass metadata when fallback is used during a cache bypass" do + result = client.get_prompt_result("missing", fallback: "Hello!", type: :text, cache_ttl: 0) + + expect(result.fallback?).to be(true) + expect(result.cache_status).to eq(:bypass) + end + it "raises error when no fallback provided" do expect do client.get_prompt("missing") diff --git a/spec/langfuse/prompt_cache_spec.rb b/spec/langfuse/prompt_cache_spec.rb index 33e5a14..484a7e5 100644 --- a/spec/langfuse/prompt_cache_spec.rb +++ b/spec/langfuse/prompt_cache_spec.rb @@ -128,6 +128,24 @@ cache.fetch_with_stale_while_revalidate("test") { "value" } end + it "fetches misses with strict zero-arity callables" do + cache = described_class.new(ttl: 60, stale_ttl: 120) + fetch_prompt = -> { "value" } + + expect(cache.fetch_with_stale_while_revalidate("test", &fetch_prompt)).to eq("value") + end + + it "refreshes stale entries with strict zero-arity callables" do + cache = described_class.new(ttl: 60, stale_ttl: 120) + thread_pool = cache.instance_variable_get(:@thread_pool) + allow(thread_pool).to receive(:post).and_yield + fetch_prompt = -> { "refreshed" } + + cache.send(:schedule_refresh, "test", &fetch_prompt) + + expect(cache.entry("test").data).to eq("refreshed") + end + it "accepts custom refresh_threads parameter" do # Can't verify thread pool size directly, but can verify it doesn't error expect do From dbca6ebfe363629dc814191ff52dd4706222f35e Mon Sep 17 00:00:00 2001 From: kadekillary Date: Mon, 4 May 2026 04:12:39 -0600 Subject: [PATCH 4/7] refactor(prompts): use keyword args for fallback context --- lib/langfuse/client.rb | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/lib/langfuse/client.rb b/lib/langfuse/client.rb index f059f37..7602d8e 100644 --- a/lib/langfuse/client.rb +++ b/lib/langfuse/client.rb @@ -114,7 +114,7 @@ def get_prompt_result(name, version: nil, label: nil, fallback: nil, type: nil, # Log warning and return fallback config.logger.warn("Langfuse API error for prompt '#{name}': #{e.message}. Using fallback.") - build_fallback_prompt_result(name, version, label, fallback, type, { cache_ttl: cache_ttl, error: e }) + build_fallback_prompt_result(name, version, label, fallback, type, cache_ttl: cache_ttl, error: e) end # Refresh a prompt from the API, optionally writing through to cache. @@ -858,9 +858,8 @@ def build_client_fetch_result(api_result, prompt_client) ) end - def build_fallback_prompt_result(name, version, label, fallback, type, fetch_context) - cache_ttl = fetch_context.fetch(:cache_ttl) - error = fetch_context.fetch(:error) + # rubocop:disable Metrics/ParameterLists + def build_fallback_prompt_result(name, version, label, fallback, type, cache_ttl:, error:) prompt_client = build_fallback_prompt_client(name, fallback, type) key = api_client.prompt_cache_key(name, version: version, label: label) api_client.emit_prompt_cache_event( @@ -878,6 +877,7 @@ def build_fallback_prompt_result(name, version, label, fallback, type, fetch_con label: key.resolved_label ) end + # rubocop:enable Metrics/ParameterLists def fallback_event_payload(key, cache_ttl, error) api_client.prompt_event_payload( From 88372675e1ee70e09cbf342cf5baec17ce4475df Mon Sep 17 00:00:00 2001 From: kadekillary Date: Mon, 4 May 2026 14:01:39 -0600 Subject: [PATCH 5/7] fix(prompts): surface cache generation increment failures --- docs/CACHING.md | 2 ++ lib/langfuse/rails_cache_adapter.rb | 3 ++- spec/langfuse/rails_cache_adapter_spec.rb | 24 +++++++++++++++++++++++ 3 files changed, 28 insertions(+), 1 deletion(-) diff --git a/docs/CACHING.md b/docs/CACHING.md index 7717fa2..621f2f0 100644 --- a/docs/CACHING.md +++ b/docs/CACHING.md @@ -73,6 +73,8 @@ Cache identity is prompt name plus version or label. When neither is supplied, t Name-wide invalidation and whole-cache clear use generation counters. Old Rails.cache entries are not physically scanned or deleted; they become unreachable under the new generated storage keys and expire by TTL. +Automatic mutation invalidation only covers `create_prompt` and `update_prompt` calls made by the current SDK process. Prompt edits made in the Langfuse UI or by other SDKs become visible through TTL expiry, `refresh_prompt`, or explicit invalidation. + ### Cache Events Set `prompt_cache_observer` to receive cache events without binding the SDK to your metric names: diff --git a/lib/langfuse/rails_cache_adapter.rb b/lib/langfuse/rails_cache_adapter.rb index 8aaff06..ffea7d8 100644 --- a/lib/langfuse/rails_cache_adapter.rb +++ b/lib/langfuse/rails_cache_adapter.rb @@ -352,7 +352,8 @@ def increment_generation(key) Rails.cache.write(key, 0, unless_exist: true) Rails.cache.increment(key, 1) - rescue StandardError + rescue StandardError => e + logger.warn("Langfuse prompt cache generation increment failed for key '#{key}': #{e.class} - #{e.message}") nil end diff --git a/spec/langfuse/rails_cache_adapter_spec.rb b/spec/langfuse/rails_cache_adapter_spec.rb index d4165f2..2e539cc 100644 --- a/spec/langfuse/rails_cache_adapter_spec.rb +++ b/spec/langfuse/rails_cache_adapter_spec.rb @@ -261,6 +261,30 @@ class << self end end + describe "#invalidate_name" do + let(:logger) { instance_double(Logger, warn: nil) } + let(:adapter) { described_class.new(logger: logger) } + let(:generation_key) do + encoded_name = Base64.urlsafe_encode64("greeting", padding: false) + "langfuse:__prompt_cache_generation__:name:#{encoded_name}" + end + + it "warns when atomic generation increment fails" do + allow(mock_cache).to receive(:respond_to?).with(:increment).and_return(true) + expect(mock_cache).to receive(:write) + .with(generation_key, 0, unless_exist: true) + .and_raise(StandardError, "connection refused") + warning = "Langfuse prompt cache generation increment failed for key '#{generation_key}': " \ + "StandardError - connection refused" + expect(logger).to receive(:warn) + .with(warning) + expect(mock_cache).to receive(:read).with(generation_key).and_return(2) + expect(mock_cache).to receive(:write).with(generation_key, 3).and_return(true) + + expect(adapter.invalidate_name("greeting")).to eq(3) + end + end + describe "#size" do let(:adapter) { described_class.new } From bee4e78601d36fe75cce4735399f9aab7b4775c3 Mon Sep 17 00:00:00 2001 From: kadekillary Date: Mon, 4 May 2026 14:23:40 -0600 Subject: [PATCH 6/7] perf(prompts): memoize Rails cache generations --- lib/langfuse/rails_cache_adapter.rb | 47 ++++++++++++++++++++++- spec/langfuse/rails_cache_adapter_spec.rb | 36 +++++++++++++++++ 2 files changed, 81 insertions(+), 2 deletions(-) diff --git a/lib/langfuse/rails_cache_adapter.rb b/lib/langfuse/rails_cache_adapter.rb index ffea7d8..9986e7f 100644 --- a/lib/langfuse/rails_cache_adapter.rb +++ b/lib/langfuse/rails_cache_adapter.rb @@ -18,6 +18,8 @@ module Langfuse class RailsCacheAdapter include StaleWhileRevalidate + GENERATION_MEMO_TTL_SECONDS = 1.0 + # @return [Integer] Time-to-live in seconds attr_reader :ttl @@ -55,6 +57,8 @@ def initialize(ttl: 60, namespace: "langfuse", lock_timeout: 10, stale_ttl: 0, r @lock_timeout = lock_timeout @stale_ttl = stale_ttl @logger = logger + @generation_memo = {} + @generation_memo_mutex = Mutex.new initialize_swr(refresh_threads: refresh_threads) if swr_enabled? end @@ -105,6 +109,7 @@ def delete(key) def clear # Delete all keys matching the namespace pattern Rails.cache.delete_matched("#{namespace}:*") + clear_generation_memo end # Logically invalidate every generated storage key. @@ -335,15 +340,25 @@ def name_generation_key(name) end def generation_value(key) - Rails.cache.read(key).to_i + now = monotonic_time + memoized = memoized_generation_value(key, now) + return memoized unless memoized.nil? + + Rails.cache.read(key).to_i.tap do |value| + memoize_generation_value(key, value, now) + end end def bump_generation(key) incremented = increment_generation(key) - return incremented if incremented + if incremented + memoize_generation_value(key, incremented.to_i) + return incremented + end new_value = generation_value(key) + 1 Rails.cache.write(key, new_value) + memoize_generation_value(key, new_value) new_value end @@ -357,6 +372,34 @@ def increment_generation(key) nil end + def memoized_generation_value(key, now) + @generation_memo_mutex.synchronize do + entry = @generation_memo[key] + return nil unless entry + + return entry.fetch(:value) if now < entry.fetch(:expires_at) + + @generation_memo.delete(key) + nil + end + end + + def memoize_generation_value(key, value, now = monotonic_time) + @generation_memo_mutex.synchronize do + @generation_memo[key] = { value: value, expires_at: now + GENERATION_MEMO_TTL_SECONDS } + end + end + + def clear_generation_memo + @generation_memo_mutex.synchronize do + @generation_memo.clear + end + end + + def monotonic_time + Process.clock_gettime(Process::CLOCK_MONOTONIC) + end + # Validate that Rails.cache is available # # @raise [ConfigurationError] if Rails.cache is not available diff --git a/spec/langfuse/rails_cache_adapter_spec.rb b/spec/langfuse/rails_cache_adapter_spec.rb index 2e539cc..1c3bfdb 100644 --- a/spec/langfuse/rails_cache_adapter_spec.rb +++ b/spec/langfuse/rails_cache_adapter_spec.rb @@ -285,6 +285,42 @@ class << self end end + describe "#storage_key" do + let(:adapter) { described_class.new } + let(:logical_key) { "greeting:production" } + let(:encoded_name) { Base64.urlsafe_encode64("greeting", padding: false) } + let(:global_generation_key) { "langfuse:__prompt_cache_generation__:global" } + let(:name_generation_key) { "langfuse:__prompt_cache_generation__:name:#{encoded_name}" } + + it "memoizes generation reads for repeated key inspection" do + expect(mock_cache).to receive(:read).with(global_generation_key).once.and_return(4) + expect(mock_cache).to receive(:read).with(name_generation_key).once.and_return(7) + + first_key = adapter.storage_key(logical_key, name: "greeting") + second_key = adapter.storage_key(logical_key, name: "greeting") + + expect(second_key).to eq(first_key) + expect(first_key).to eq("langfuse:g4:n#{encoded_name}:7:greeting:production") + end + + it "updates memoized generations after same-process name invalidation" do + allow(mock_cache).to receive(:respond_to?).with(:increment).and_return(true) + expect(mock_cache).to receive(:read).with(global_generation_key).once.and_return(0) + expect(mock_cache).to receive(:read).with(name_generation_key).once.and_return(0) + expect(mock_cache).to receive(:write) + .with(name_generation_key, 0, unless_exist: true) + .and_return(true) + expect(mock_cache).to receive(:increment).with(name_generation_key, 1).and_return(1) + + first_key = adapter.storage_key(logical_key, name: "greeting") + adapter.invalidate_name("greeting") + second_key = adapter.storage_key(logical_key, name: "greeting") + + expect(first_key).to eq("langfuse:g0:n#{encoded_name}:0:greeting:production") + expect(second_key).to eq("langfuse:g0:n#{encoded_name}:1:greeting:production") + end + end + describe "#size" do let(:adapter) { described_class.new } From cf77981430aa7cbf5301c249a2a573e599f9dcce Mon Sep 17 00:00:00 2001 From: kadekillary Date: Tue, 5 May 2026 00:29:46 -0600 Subject: [PATCH 7/7] docs(prompts): align cache operation guidance --- docs/API_REFERENCE.md | 3 ++- docs/CACHING.md | 34 +++++++++++++++++++++++----------- docs/CONFIGURATION.md | 5 +++-- docs/ERROR_HANDLING.md | 9 +++++---- docs/MIGRATION.md | 4 ++-- docs/RAILS.md | 11 +++++++---- 6 files changed, 42 insertions(+), 24 deletions(-) diff --git a/docs/API_REFERENCE.md b/docs/API_REFERENCE.md index 745d803..17309ec 100644 --- a/docs/API_REFERENCE.md +++ b/docs/API_REFERENCE.md @@ -42,11 +42,12 @@ Block receives a `Langfuse::Config` object with these properties: | `timeout` | Integer | No | `5` | HTTP timeout (seconds) | | `cache_ttl` | Integer | No | `60` | Prompt cache TTL (seconds) | | `cache_max_size` | Integer | No | `1000` | Max cached prompts | -| `cache_backend` | Symbol | No | `:memory` | `:memory` or `:rails` | +| `cache_backend` | Symbol | No | `:memory` | `:memory`, `:rails`, or `:auto` | | `cache_lock_timeout` | Integer | No | `10` | Lock timeout (seconds) | | `cache_stale_while_revalidate` | Boolean | No | `false` | Advisory SWR intent flag (effective activation depends on `cache_stale_ttl`) | | `cache_stale_ttl` | Integer or `:indefinite` | No | `0` | Stale TTL (seconds, `>0` enables SWR) | | `cache_refresh_threads` | Integer | No | `5` | Background refresh threads | +| `prompt_cache_observer` | Callable | No | `nil` | Prompt cache event hook | | `batch_size` | Integer | No | `50` | Score + trace export batch size | | `flush_interval` | Integer | No | `10` | Score + trace export interval (s) | | `sample_rate` | Float | No | `1.0` | Trace + trace-linked score sampling rate (`0.0..1.0`) | diff --git a/docs/CACHING.md b/docs/CACHING.md index 621f2f0..cefbb65 100644 --- a/docs/CACHING.md +++ b/docs/CACHING.md @@ -566,10 +566,13 @@ RUN bundle exec rake langfuse:warm_cache_all See [CONFIGURATION.md](CONFIGURATION.md) for all cache-related configuration options: -- `cache_backend` - `:memory` or `:rails` +- `cache_backend` - `:memory`, `:rails`, or `:auto` - `cache_ttl` - Time-to-live in seconds - `cache_max_size` - Max prompts (in-memory only) - `cache_lock_timeout` - Lock timeout (Rails.cache only) +- `cache_stale_ttl` - Stale serving window; `> 0` enables SWR +- `cache_refresh_threads` - Background refresh worker count +- `prompt_cache_observer` - Optional cache event hook ## Performance Considerations @@ -626,12 +629,11 @@ config.cache_backend = :rails ### 2. Enable SWR for Production ```ruby -# Development: disabled for predictable behavior -config.cache_stale_while_revalidate = !Rails.env.development? - -# Production: enabled for best performance if Rails.env.production? + config.cache_stale_while_revalidate = true # Advisory intent flag config.cache_stale_ttl = config.cache_ttl # Set explicitly (common default) +else + config.cache_stale_ttl = 0 # Disabled for predictable prompt iteration end ``` @@ -655,8 +657,14 @@ bundle exec rake langfuse:warm_cache_all ### 5. Monitor Cache Performance ```ruby -# Log cache hits/misses -Rails.logger.info "Fetching prompt: #{name} (cache: #{cache_hit? ? 'HIT' : 'MISS'})" +config.prompt_cache_observer = lambda do |event, payload| + Rails.logger.info( + event: event, + prompt: payload[:name], + status: payload[:cache_status], + source: payload[:source] + ) +end ``` ### 6. Handle Cache Failures Gracefully @@ -674,9 +682,11 @@ prompt = Langfuse.client.get_prompt( ```ruby # Rails console -Langfuse.client.api_client.cache&.clear +Langfuse.client.invalidate_prompt_cache("greeting", label: "production") +Langfuse.client.invalidate_prompt_cache_by_name("greeting") +Langfuse.client.clear_prompt_cache -# Or use rake task +# Or use the rake task rake langfuse:clear_cache ``` @@ -725,8 +735,10 @@ end **Solutions**: 1. Wait for TTL to expire -2. Clear cache manually: `rake langfuse:clear_cache` -3. Reduce `cache_ttl` in development +2. Refresh one prompt now: `Langfuse.client.refresh_prompt("greeting")` +3. Invalidate cached prompt entries: `Langfuse.client.invalidate_prompt_cache_by_name("greeting")` +4. Clear the prompt cache namespace: `Langfuse.client.clear_prompt_cache` +5. Reduce `cache_ttl` in development ### Stampede Protection Not Working diff --git a/docs/CONFIGURATION.md b/docs/CONFIGURATION.md index 6eb4714..ffbe2fd 100644 --- a/docs/CONFIGURATION.md +++ b/docs/CONFIGURATION.md @@ -618,8 +618,9 @@ Validation rules: - `public_key` must be present - `secret_key` must be present -- `cache_backend` must be `:memory` or `:rails` -- If `:rails`, Rails must be defined +- `cache_backend` must be `:memory`, `:rails`, or `:auto` +- If `:rails` is selected, or `:auto` resolves to `:rails`, Rails and `Rails.cache` must be available +- `prompt_cache_observer` must respond to `#call` (if set) - `should_export_span` must respond to `#call` (if set) - `mask` must respond to `#call` (if set) diff --git a/docs/ERROR_HANDLING.md b/docs/ERROR_HANDLING.md index 9d536d4..82130ad 100644 --- a/docs/ERROR_HANDLING.md +++ b/docs/ERROR_HANDLING.md @@ -50,8 +50,8 @@ end **Validation checklist:** - `public_key` present and starts with `pk-lf-` - `secret_key` present and starts with `sk-lf-` -- `cache_backend` is `:memory` or `:rails` -- If `:rails`, Rails is defined +- `cache_backend` is `:memory`, `:rails`, or `:auto` +- If `:rails` is selected, or `:auto` resolves to `:rails`, Rails and `Rails.cache` are available ### `Langfuse::UnauthorizedError` @@ -432,8 +432,9 @@ puts config.inspect ### Check Cache State ```ruby -cache = Langfuse.client.api_client.cache -puts "Cache backend: #{cache&.class || 'disabled'}" +stats = Langfuse.client.prompt_cache_stats +puts "Cache backend: #{stats[:backend]}" +puts "Cache enabled: #{stats[:enabled]}" ``` ### Test Credentials diff --git a/docs/MIGRATION.md b/docs/MIGRATION.md index 6abaca3..12b077c 100644 --- a/docs/MIGRATION.md +++ b/docs/MIGRATION.md @@ -685,8 +685,8 @@ prompts: ```ruby # In Rails console Langfuse.reset! # Clears everything -# Or just clear cache -Langfuse.client.api_client.cache&.clear +# Or just clear the prompt cache namespace +Langfuse.client.clear_prompt_cache ``` ### Problem: Variables not substituting correctly diff --git a/docs/RAILS.md b/docs/RAILS.md index 9b29bf9..154c0cd 100644 --- a/docs/RAILS.md +++ b/docs/RAILS.md @@ -253,7 +253,7 @@ Useful console checks: ```ruby Langfuse.configuration -Langfuse.client.api_client.cache +Langfuse.client.prompt_cache_stats ``` ## Troubleshooting @@ -263,11 +263,14 @@ Langfuse.client.api_client.cache The usual problem is stale cache, not a broken prompt API. 1. Wait for `cache_ttl` to expire. -2. Clear the cache entry store directly if you need to inspect fresh state now. -3. Lower `cache_ttl` in development if you are iterating quickly. +2. Refresh the specific prompt if you need fresh state now. +3. Invalidate the prompt name or clear the prompt cache namespace. +4. Lower `cache_ttl` in development if you are iterating quickly. ```ruby -Langfuse.client.api_client.cache&.clear +Langfuse.client.refresh_prompt("greeting", label: "production") +Langfuse.client.invalidate_prompt_cache_by_name("greeting") +Langfuse.client.clear_prompt_cache ``` ### Traces Missing Entirely