From 85e52bc716991fce88b022e49eb27b5852f9979c Mon Sep 17 00:00:00 2001
From: kadekillary <kadekillary@pm.me>
Date: Sun, 3 May 2026 17:07:26 -0600
Subject: [PATCH 1/7] feat(prompts): expose prompt cache operations

---
 docs/API_REFERENCE.md                  |  48 +++-
 docs/CACHING.md                        |  67 ++++-
 docs/CONFIGURATION.md                  |  21 +-
 docs/PROMPTS.md                        |  25 +-
 lib/langfuse.rb                        |   1 +
 lib/langfuse/api_client.rb             | 369 ++++++++++++++++++++++++-
 lib/langfuse/client.rb                 | 191 ++++++++++++-
 lib/langfuse/config.rb                 |  15 +-
 lib/langfuse/prompt_cache.rb           | 130 ++++++++-
 lib/langfuse/prompt_fetch_result.rb    | 121 ++++++++
 lib/langfuse/rails_cache_adapter.rb    | 122 +++++++-
 lib/langfuse/stale_while_revalidate.rb |  71 ++++-
 spec/langfuse/api_client_spec.rb       | 201 ++++++++++++++
 spec/langfuse/client_spec.rb           |  60 ++++
 spec/langfuse/config_spec.rb           |  20 ++
 spec/langfuse/prompt_cache_spec.rb     |  44 +++
 16 files changed, 1455 insertions(+), 51 deletions(-)
 create mode 100644 lib/langfuse/prompt_fetch_result.rb
diff --git a/docs/API_REFERENCE.md b/docs/API_REFERENCE.md
index 4b1ff5d..745d803 100644
--- a/docs/API_REFERENCE.md
+++ b/docs/API_REFERENCE.md
@@ -215,7 +215,7 @@ Fetch a prompt from Langfuse (with caching).
 **Signature:**
 
 ```ruby
-get_prompt(name, version: nil, label: nil, fallback: nil, type: nil)
+get_prompt(name, version: nil, label: nil, fallback: nil, type: nil, cache_ttl: nil)
 ```
 
 **Parameters:**
@@ -227,6 +227,7 @@ get_prompt(name, version: nil, label: nil, fallback: nil, type: nil)
 | `label`    | String  | No          | Version label (e.g., "production")                   |
 | `fallback` | String or Array<Hash> | No | Fallback prompt if not found (`String` for text, `Array<Hash>` for chat) |
 | `type`     | Symbol  | Conditional | `:text` or `:chat` (required if `fallback` provided) |
+| `cache_ttl` | Integer | No | Per-call cache TTL override. `0` bypasses cache read/write |
 
 **Returns:** `Langfuse::TextPromptClient` or `Langfuse::ChatPromptClient`
 
@@ -257,6 +258,49 @@ prompt = client.get_prompt("new-prompt",
 
 See [PROMPTS.md](PROMPTS.md) for complete guide.
 
+### `Client#get_prompt_result`
+
+Fetch a prompt and return cache metadata.
+
+**Signature:**
+
+```ruby
+get_prompt_result(name, version: nil, label: nil, fallback: nil, type: nil, cache_ttl: nil)
+```
+
+**Returns:** `Langfuse::PromptFetchResult`
+
+| Attribute | Type | Description |
+| --------- | ---- | ----------- |
+| `prompt` | TextPromptClient or ChatPromptClient | Prompt client |
+| `logical_key` | String | Stable logical identity: name plus version or label/default production |
+| `storage_key` | String | Backend key for the current cache generation |
+| `cache_status` | Symbol | `:hit`, `:miss`, `:stale`, `:refresh`, `:bypass`, or `:disabled` |
+| `source` | Symbol | `:cache`, `:api`, or `:fallback` |
+| `fallback?` | Boolean | Whether fallback content was returned |
+
+```ruby
+result = client.get_prompt_result("greeting", label: "production")
+result.prompt.compile(name: "Ada")
+result.cache_status # => :miss
+```
+
+### Prompt Cache Operations
+
+Flat client APIs for operational prompt cache control:
+
+```ruby
+client.refresh_prompt("greeting", label: "production", cache_ttl: 60)
+client.invalidate_prompt_cache("greeting", label: "production")
+client.invalidate_prompt_cache_by_name("greeting")
+client.clear_prompt_cache
+client.prompt_cache_stats
+client.prompt_cache_key("greeting")
+client.validate_prompt_cache_backend!
+```
+
+`invalidate_prompt_cache_by_name` and `clear_prompt_cache` use generation counters. Rails.cache entries from old generations are not scanned; they become unreachable and expire by TTL.
+
 ### `Client#compile_prompt`
 
 Convenience method: fetch and compile in one call.
@@ -264,7 +308,7 @@ Convenience method: fetch and compile in one call.
 **Signature:**
 
 ```ruby
-compile_prompt(name, variables: {}, version: nil, label: nil, fallback: nil, type: nil)
+compile_prompt(name, variables: {}, version: nil, label: nil, fallback: nil, type: nil, cache_ttl: nil)
 ```
 
 **Parameters:**
diff --git a/docs/CACHING.md b/docs/CACHING.md
index 86d7819..7717fa2 100644
--- a/docs/CACHING.md
+++ b/docs/CACHING.md
@@ -7,6 +7,7 @@ For configuration options, see [CONFIGURATION.md](CONFIGURATION.md).
 ## Table of Contents
 
 - [Overview](#overview)
+- [Public Cache Operations](#public-cache-operations)
 - [In-Memory Cache](#in-memory-cache-default)
 - [Rails.cache Backend](#railscache-backend-distributed)
 - [Stale-While-Revalidate (SWR)](#stale-while-revalidate-swr)
@@ -22,7 +23,69 @@ The Langfuse Ruby SDK provides two caching backends to optimize prompt fetching:
 1. **In-Memory Cache** (default) - Thread-safe, local cache with TTL and bounded expiration-ordered eviction
 2. **Rails.cache Backend** - Distributed caching with Redis/Memcached
 
-Both backends support TTL-based expiration and stale-while-revalidate (SWR). Distributed stampede protection via locking is specific to the Rails.cache backend; the in-memory backend mitigates stampedes within a single process using Monitor-based single-flight locks.
+Both backends support TTL-based expiration, stale-while-revalidate (SWR), and logical generation-based invalidation. Distributed stampede protection via locking is specific to the Rails.cache backend; the in-memory backend mitigates stampedes within a single process using Monitor-based single-flight locks.
+
+## Public Cache Operations
+
+`get_prompt` remains the normal prompt-returning API. Use `get_prompt_result` when you need cache metadata for logs, metrics, or operational validation:
+
+```ruby
+result = Langfuse.client.get_prompt_result("greeting", label: "production", cache_ttl: 60)
+
+result.prompt        # TextPromptClient or ChatPromptClient
+result.logical_key   # "greeting:production"
+result.storage_key   # Generated backend key for the current cache generation
+result.cache_status  # :hit, :miss, :stale, :refresh, :bypass, or :disabled
+result.source        # :cache, :api, or :fallback
+result.fallback?     # true when caller-provided fallback content was used
+```
+
+Per-call `cache_ttl` overrides the write TTL for that fetch. Passing `cache_ttl: 0` bypasses the cache read, fetches from the API, and does not retain the result:
+
+```ruby
+fresh = Langfuse.client.get_prompt_result("greeting", cache_ttl: 0)
+fresh.cache_status # => :bypass
+```
+
+Use `refresh_prompt` when you intentionally want to bypass the read path and write the fresh prompt through to cache:
+
+```ruby
+result = Langfuse.client.refresh_prompt("greeting", label: "production")
+result.cache_status # => :refresh
+```
+
+The operational cache APIs are flat on the client:
+
+```ruby
+Langfuse.client.invalidate_prompt_cache("greeting", label: "production")
+Langfuse.client.invalidate_prompt_cache_by_name("greeting")
+Langfuse.client.clear_prompt_cache
+
+key = Langfuse.client.prompt_cache_key("greeting")
+key.logical_key # => "greeting:production"
+key.storage_key # Includes the current global and prompt-name generations
+
+Langfuse.client.prompt_cache_stats
+Langfuse.client.validate_prompt_cache_backend!
+```
+
+Cache identity is prompt name plus version or label. When neither is supplied, the logical identity defaults to the `production` label. Runtime variables never enter the cache key; the SDK caches the managed prompt template and compiles variables afterward.
+
+Name-wide invalidation and whole-cache clear use generation counters. Old Rails.cache entries are not physically scanned or deleted; they become unreachable under the new generated storage keys and expire by TTL.
+
+### Cache Events
+
+Set `prompt_cache_observer` to receive cache events without binding the SDK to your metric names:
+
+```ruby
+Langfuse.configure do |config|
+  config.prompt_cache_observer = lambda do |event, payload|
+    Rails.logger.info(event: event, prompt: payload[:name], status: payload[:cache_status])
+  end
+end
+```
+
+When `ActiveSupport::Notifications` is loaded, the SDK also instruments `prompt_cache.langfuse`. Event payloads include prompt name, version, label, logical key, storage key, backend, cache status, source, and error details when relevant.
 
 ## In-Memory Cache (Default)
 
@@ -102,6 +165,8 @@ Langfuse.configure do |config|
 end
 ```
 
+Use `config.cache_backend = :auto` only when you want the SDK to choose `:rails` if Rails and `Rails.cache` are present, otherwise `:memory`. The default remains `:memory`.
+
 ### Features
 
 - **Distributed**: Shared cache across all processes and servers
diff --git a/docs/CONFIGURATION.md b/docs/CONFIGURATION.md
index fca1b50..6eb4714 100644
--- a/docs/CONFIGURATION.md
+++ b/docs/CONFIGURATION.md
@@ -120,7 +120,7 @@ config.cache_max_size = 5000  # Large prompt library
 
 #### `cache_backend`
 
-- **Type:** Symbol (`:memory` or `:rails`)
+- **Type:** Symbol (`:memory`, `:rails`, or `:auto`)
 - **Default:** `:memory`
 - **Description:** Cache storage backend
 
@@ -130,8 +130,13 @@ config.cache_backend = :memory
 
 # Rails.cache (requires Rails + Redis)
 config.cache_backend = :rails
+
+# Opt in to automatic Rails.cache detection
+config.cache_backend = :auto
 ```
 
+`:auto` chooses `:rails` only when Rails and `Rails.cache` are present; otherwise it falls back to `:memory`. The gem default stays `:memory`.
+
 **Requirements for `:rails` backend:**
 
 - Rails must be defined
@@ -225,6 +230,20 @@ config.cache_refresh_threads = 10  # More threads for high-traffic apps
 
 Only used when SWR is enabled (`cache_stale_ttl > 0`).
 
+#### `prompt_cache_observer`
+
+- **Type:** Callable or `nil`
+- **Default:** `nil`
+- **Description:** Observer hook for prompt cache events
+
+```ruby
+config.prompt_cache_observer = lambda do |event, payload|
+  Rails.logger.info(event: event, prompt: payload[:name], status: payload[:cache_status])
+end
+```
+
+When ActiveSupport is loaded, the SDK also instruments `prompt_cache.langfuse`.
+
 #### `batch_size`
 
 - **Type:** Integer
diff --git a/docs/PROMPTS.md b/docs/PROMPTS.md
index 3a1c582..7e42f6e 100644
--- a/docs/PROMPTS.md
+++ b/docs/PROMPTS.md
@@ -553,7 +553,30 @@ Langfuse.configure do |config|
 end
 ```
 
-See [CACHING.md](CACHING.md) for advanced caching strategies (warming, stampede protection).
+Override or bypass cache per call:
+
+```ruby
+prompt = client.get_prompt("greeting", cache_ttl: 300)
+fresh = client.get_prompt_result("greeting", cache_ttl: 0)
+
+fresh.cache_status # => :bypass
+fresh.source       # => :api
+```
+
+Use `get_prompt_result` and the flat cache operations when you need operational visibility:
+
+```ruby
+result = client.get_prompt_result("greeting")
+result.cache_status # => :hit or :miss
+
+client.refresh_prompt("greeting")
+client.invalidate_prompt_cache("greeting")
+client.invalidate_prompt_cache_by_name("greeting")
+client.clear_prompt_cache
+client.prompt_cache_stats
+```
+
+See [CACHING.md](CACHING.md) for advanced caching strategies, generation-based invalidation, cache events, warming, and stampede protection.
 
 ## Combining Prompts with Tracing
 
diff --git a/lib/langfuse.rb b/lib/langfuse.rb
index ca8cc3f..20515e6 100644
--- a/lib/langfuse.rb
+++ b/lib/langfuse.rb
@@ -41,6 +41,7 @@ class UnauthorizedError < ApiError; end
 
 require_relative "langfuse/config"
 require_relative "langfuse/prompt_cache"
+require_relative "langfuse/prompt_fetch_result"
 require_relative "langfuse/rails_cache_adapter"
 require_relative "langfuse/cache_warmer"
 require_relative "langfuse/api_client"
diff --git a/lib/langfuse/api_client.rb b/lib/langfuse/api_client.rb
index f369442..0a2ea41 100644
--- a/lib/langfuse/api_client.rb
+++ b/lib/langfuse/api_client.rb
@@ -5,6 +5,7 @@
 require "base64"
 require "json"
 require "uri"
+require_relative "prompt_fetch_result"
 
 module Langfuse
   # HTTP client for Langfuse API
@@ -40,6 +41,9 @@ class ApiClient # rubocop:disable Metrics/ClassLength
     # @return [PromptCache, RailsCacheAdapter, nil] Optional cache for prompt responses
     attr_reader :cache
 
+    # @return [#call, nil] Optional observer for prompt cache events
+    attr_reader :cache_observer
+
     # Initialize a new API client
     #
     # @param public_key [String] Langfuse public API key
@@ -48,15 +52,19 @@ class ApiClient # rubocop:disable Metrics/ClassLength
     # @param timeout [Integer] HTTP request timeout in seconds
     # @param logger [Logger] Logger instance for debugging
     # @param cache [PromptCache, RailsCacheAdapter, nil] Optional cache for prompt responses
+    # @param cache_observer [#call, nil] Optional observer for prompt cache events
     # @return [ApiClient]
-    def initialize(public_key:, secret_key:, base_url:, timeout: 5, logger: nil, cache: nil)
+    # rubocop:disable Metrics/ParameterLists
+    def initialize(public_key:, secret_key:, base_url:, timeout: 5, logger: nil, cache: nil, cache_observer: nil)
       @public_key = public_key
       @secret_key = secret_key
       @base_url = base_url
       @timeout = timeout
       @logger = logger || Logger.new($stdout, level: Logger::WARN)
       @cache = cache
+      @cache_observer = cache_observer
     end
+    # rubocop:enable Metrics/ParameterLists
 
     # Get a Faraday connection
     #
@@ -109,17 +117,129 @@ def list_prompts(page: nil, limit: nil)
     # @param name [String] The name of the prompt
     # @param version [Integer, nil] Optional specific version number
     # @param label [String, nil] Optional label (e.g., "production", "latest")
+    # @param cache_ttl [Integer, nil] Optional TTL override for this fetch
     # @return [Hash] The prompt data
     # @raise [ArgumentError] if both version and label are provided
     # @raise [NotFoundError] if the prompt is not found
     # @raise [UnauthorizedError] if authentication fails
     # @raise [ApiError] for other API errors
-    def get_prompt(name, version: nil, label: nil)
+    def get_prompt(name, version: nil, label: nil, cache_ttl: nil)
+      get_prompt_result(name, version: version, label: label, cache_ttl: cache_ttl).prompt
+    end
+
+    # Fetch a prompt and include cache metadata.
+    #
+    # @param name [String] The name of the prompt
+    # @param version [Integer, nil] Optional specific version number
+    # @param label [String, nil] Optional label (e.g., "production", "latest")
+    # @param cache_ttl [Integer, nil] Optional TTL override for this fetch
+    # @return [PromptFetchResult] Prompt data plus cache metadata
+    # @raise [ArgumentError] if both version and label are provided
+    # @raise [ArgumentError] if cache_ttl is negative
+    # @raise [NotFoundError] if the prompt is not found
+    # @raise [UnauthorizedError] if authentication fails
+    # @raise [ApiError] for other API errors
+    def get_prompt_result(name, version: nil, label: nil, cache_ttl: nil)
+      validate_prompt_fetch_options!(version, label, cache_ttl)
+
+      key = prompt_cache_key(name, version: version, label: label)
+      return fetch_uncached_prompt_result(key, version, label, :disabled) if cache.nil?
+      return fetch_uncached_prompt_result(key, version, label, :bypass) if cache_ttl&.zero?
+
+      fetch_cached_prompt_result(key, version, label, cache_ttl)
+    end
+
+    # Refresh a prompt from the API, optionally writing through to cache.
+    #
+    # @param name [String] The name of the prompt
+    # @param version [Integer, nil] Optional specific version number
+    # @param label [String, nil] Optional label
+    # @param cache_ttl [Integer, nil] Optional TTL override for this refresh
+    # @return [PromptFetchResult] Prompt data plus cache metadata
+    # @raise [ArgumentError] if both version and label are provided
+    # @raise [ArgumentError] if cache_ttl is negative
+    # @raise [NotFoundError] if the prompt is not found
+    # @raise [UnauthorizedError] if authentication fails
+    # @raise [ApiError] for other API errors
+    def refresh_prompt(name, version: nil, label: nil, cache_ttl: nil)
+      validate_prompt_fetch_options!(version, label, cache_ttl)
+
+      key = prompt_cache_key(name, version: version, label: label)
+      refresh_prompt_result(key, version, label, cache_ttl)
+    end
+
+    # Inspect the logical and generated cache keys for a prompt.
+    #
+    # @param name [String] The prompt name
+    # @param version [Integer, nil] Optional specific version number
+    # @param label [String, nil] Optional label
+    # @return [PromptCacheKey] Logical and generated cache keys
+    # @raise [ArgumentError] if both version and label are provided
+    def prompt_cache_key(name, version: nil, label: nil)
       raise ArgumentError, "Cannot specify both version and label" if version && label
-      return fetch_prompt_from_api(name, version: version, label: label) if cache.nil?
 
-      cache_key = PromptCache.build_key(name, version: version, label: label)
-      fetch_with_appropriate_caching_strategy(cache_key, name, version, label)
+      logical_key = PromptCache.build_key(name, version: version, label: label)
+      storage_key = if generated_storage_key_cache?
+                      cache.storage_key(logical_key, name: name)
+                    else
+                      logical_key
+                    end
+      PromptCacheKey.new(name: name, version: version, label: label, logical_key: logical_key, storage_key: storage_key)
+    end
+
+    # Invalidate one exact logical prompt cache key.
+    #
+    # @param name [String] The prompt name
+    # @param version [Integer, nil] Optional specific version number
+    # @param label [String, nil] Optional label
+    # @return [PromptCacheKey] The invalidated key
+    # @raise [ArgumentError] if both version and label are provided
+    def invalidate_prompt_cache(name, version: nil, label: nil)
+      key = prompt_cache_key(name, version: version, label: label)
+      deleted = cache&.delete(key.storage_key) || false
+      emit_prompt_cache_event(:delete, event_payload(key, :miss, :cache, deleted: deleted))
+      emit_prompt_cache_event(:invalidate, event_payload(key, :miss, :cache, scope: :exact))
+      key
+    end
+
+    # Invalidate all cached variants for one prompt name.
+    #
+    # @param name [String] The prompt name
+    # @return [Integer, nil] New generation, or nil when cache is disabled
+    def invalidate_prompt_cache_by_name(name)
+      generation = cache&.invalidate_name(name)
+      payload = { name: name, backend: cache_backend_name, generation: generation, scope: :name }
+      emit_prompt_cache_event(:invalidate, payload)
+      generation
+    end
+
+    # Logically clear the whole Langfuse prompt cache namespace.
+    #
+    # @return [Integer, nil] New global generation, or nil when cache is disabled
+    def clear_prompt_cache
+      generation = cache&.clear_logically
+      emit_prompt_cache_event(:clear, backend: cache_backend_name, generation: generation)
+      generation
+    end
+
+    # Return prompt cache statistics.
+    #
+    # @return [Hash] Cache statistics
+    def prompt_cache_stats
+      return disabled_prompt_cache_stats unless cache
+
+      cache.stats
+    end
+
+    # Emit a prompt cache event to configured hooks.
+    #
+    # @param event [Symbol] Event name
+    # @param payload [Hash] Event payload
+    # @return [void]
+    def emit_prompt_cache_event(event, payload)
+      normalized_payload = payload.merge(event: event.to_sym)
+      notify_cache_observer(event, normalized_payload)
+      notify_active_support(normalized_payload)
     end
 
     # Create a new prompt (or new version if prompt with same name exists)
@@ -158,7 +278,7 @@ def create_prompt(name:, prompt:, type:, config: {}, labels: [], tags: [], commi
         payload[:commitMessage] = commit_message if commit_message
 
         response = connection.post(path, payload)
-        handle_response(response)
+        handle_response(response).tap { invalidate_prompt_cache_after_mutation(name) }
       end
     end
     # rubocop:enable Metrics/ParameterLists
@@ -188,7 +308,7 @@ def update_prompt(name:, version:, labels:)
         payload = { newLabels: labels }
 
         response = connection.patch(path, payload)
-        handle_response(response)
+        handle_response(response).tap { invalidate_prompt_cache_after_mutation(name) }
       end
     end
 
@@ -595,6 +715,241 @@ def delete_dataset_item(id)
 
     private
 
+    def validate_prompt_fetch_options!(version, label, cache_ttl)
+      raise ArgumentError, "Cannot specify both version and label" if version && label
+      return if cache_ttl.nil? || (cache_ttl.respond_to?(:negative?) && !cache_ttl.negative?)
+
+      raise ArgumentError, "cache_ttl must be non-negative"
+    end
+
+    def fetch_uncached_prompt_result(key, version, label, cache_status)
+      prompt_data = fetch_prompt_from_api(key.name, version: version, label: label)
+      build_prompt_result(key, prompt_data, cache_status, :api)
+    end
+
+    def fetch_cached_prompt_result(key, version, label, cache_ttl)
+      swr_enabled = swr_cache_available?
+      return fetch_swr_prompt_result(key, version, label, cache_ttl) if swr_enabled
+
+      fetch_non_swr_prompt_result(key, version, label, cache_ttl)
+    end
+
+    def fetch_swr_prompt_result(key, version, label, cache_ttl)
+      unless generated_storage_key_cache?
+        prompt_data = fetch_with_swr_cache(key.storage_key, key.name, version, label)
+        return cache_hit_prompt_result(key, prompt_data)
+      end
+
+      result = fetch_swr_cached_prompt_result(key, version, label, cache_ttl)
+      return result if result
+
+      fetch_cache_miss_prompt_result(key, version, label, cache_ttl, swr_enabled: true, distributed_enabled: false)
+    end
+
+    def fetch_non_swr_prompt_result(key, version, label, cache_ttl)
+      distributed_enabled = distributed_cache_available?
+
+      if !generated_storage_key_cache? && distributed_enabled
+        prompt_data = fetch_with_distributed_cache(key.storage_key, key.name, version, label)
+        return cache_hit_prompt_result(key, prompt_data)
+      end
+
+      cached_data = cache.get(key.storage_key)
+      return cache_hit_prompt_result(key, cached_data) if cached_data
+
+      fetch_cache_miss_prompt_result(
+        key,
+        version,
+        label,
+        cache_ttl,
+        swr_enabled: false,
+        distributed_enabled: distributed_enabled
+      )
+    end
+
+    def fetch_swr_cached_prompt_result(key, version, label, cache_ttl)
+      entry = cache.entry(key.storage_key) if cache.respond_to?(:entry)
+      return nil unless entry.respond_to?(:fresh?)
+      return cache_hit_prompt_result(key, entry.data) if entry.fresh?
+
+      return nil unless entry.stale?
+
+      emit_prompt_cache_event(:stale_serve, event_payload(key, :stale, :cache))
+      schedule_prompt_cache_refresh(key, version, label, cache_ttl)
+      build_prompt_result(key, entry.data, :stale, :cache)
+    end
+
+    def cache_hit_prompt_result(key, prompt_data)
+      emit_prompt_cache_event(:hit, event_payload(key, :hit, :cache))
+      build_prompt_result(key, prompt_data, :hit, :cache)
+    end
+
+    def fetch_cache_miss_prompt_result(key, version, label, cache_ttl, swr_enabled: false, distributed_enabled: nil)
+      emit_prompt_cache_event(:miss, event_payload(key, :miss, :api))
+      distributed_enabled = distributed_cache_available? if distributed_enabled.nil?
+
+      if !swr_enabled && distributed_enabled
+        fetch_cache_miss_with_lock(key, version, label, cache_ttl)
+      else
+        fetch_cache_miss_directly(key, version, label, cache_ttl, swr_enabled: swr_enabled)
+      end
+    end
+
+    def fetch_cache_miss_with_lock(key, version, label, cache_ttl)
+      fetched = false
+      prompt_data = cache_fetch_with_lock(key.storage_key, cache_ttl) do
+        fetched = true
+        fetch_prompt_from_api(key.name, version: version, label: label)
+      end
+      emit_prompt_cache_event(:write, event_payload(key, :miss, :api)) if fetched
+      status = fetched ? :miss : :hit
+      source = fetched ? :api : :cache
+      build_prompt_result(key, prompt_data, status, source)
+    end
+
+    def fetch_cache_miss_directly(key, version, label, cache_ttl, swr_enabled: false)
+      prompt_data = fetch_prompt_from_api(key.name, version: version, label: label)
+      write_prompt_cache(key, prompt_data, cache_ttl, swr_enabled: swr_enabled)
+      build_prompt_result(key, prompt_data, :miss, :api)
+    end
+
+    def refresh_prompt_result(key, version, label, cache_ttl)
+      emit_prompt_cache_event(:refresh_start, event_payload(key, :refresh, :api))
+      prompt_data = fetch_prompt_from_api(key.name, version: version, label: label)
+      write_refresh_prompt_cache(key, prompt_data, cache_ttl)
+      emit_prompt_cache_event(:refresh_success, event_payload(key, refresh_cache_status(cache_ttl), :api))
+      build_prompt_result(key, prompt_data, refresh_cache_status(cache_ttl), :api)
+    rescue StandardError => e
+      payload = event_payload(key, :refresh, :api, error_class: e.class.name, error_message: e.message)
+      emit_prompt_cache_event(:refresh_failure, payload)
+      raise
+    end
+
+    def schedule_prompt_cache_refresh(key, version, label, cache_ttl)
+      return unless cache.respond_to?(:refresh_async)
+
+      scheduled = cache.refresh_async(
+        key.storage_key,
+        ttl: cache_ttl,
+        on_success: ->(_value) { emit_refresh_success_events(key) },
+        on_failure: ->(error) { emit_refresh_failure_event(key, error) }
+      ) { fetch_prompt_from_api(key.name, version: version, label: label) }
+      emit_prompt_cache_event(:refresh_start, event_payload(key, :stale, :cache)) if scheduled
+    end
+
+    def emit_refresh_success_events(key)
+      emit_prompt_cache_event(:refresh_success, event_payload(key, :refresh, :api))
+      emit_prompt_cache_event(:write, event_payload(key, :refresh, :api))
+    end
+
+    def emit_refresh_failure_event(key, error)
+      payload = event_payload(key, :stale, :cache, error_class: error.class.name, error_message: error.message)
+      emit_prompt_cache_event(:refresh_failure, payload)
+    end
+
+    def write_refresh_prompt_cache(key, prompt_data, cache_ttl)
+      return unless cache
+      return if cache_ttl&.zero?
+
+      write_prompt_cache(key, prompt_data, cache_ttl, cache_status: :refresh, swr_enabled: swr_cache_available?)
+    end
+
+    def write_prompt_cache(key, prompt_data, cache_ttl, cache_status: :miss, swr_enabled: false)
+      if swr_enabled && cache.respond_to?(:write_with_stale_while_revalidate)
+        cache.write_with_stale_while_revalidate(key.storage_key, prompt_data, ttl: cache_ttl)
+      elsif cache_ttl.nil?
+        cache.set(key.storage_key, prompt_data)
+      else
+        cache.set(key.storage_key, prompt_data, ttl: cache_ttl)
+      end
+      emit_prompt_cache_event(:write, event_payload(key, cache_status, :api))
+    end
+
+    def cache_fetch_with_lock(storage_key, cache_ttl, &)
+      return cache.fetch_with_lock(storage_key, &) if cache_ttl.nil?
+
+      cache.fetch_with_lock(storage_key, ttl: cache_ttl, &)
+    end
+
+    def refresh_cache_status(cache_ttl)
+      return :disabled unless cache
+      return :bypass if cache_ttl&.zero?
+
+      :refresh
+    end
+
+    def build_prompt_result(key, prompt_data, cache_status, source)
+      PromptFetchResult.new(
+        prompt: prompt_data,
+        logical_key: key.logical_key,
+        storage_key: key.storage_key,
+        cache_status: cache_status,
+        source: source,
+        name: prompt_data["name"] || key.name,
+        version: prompt_data["version"] || key.version,
+        label: key.label || (key.version ? nil : "production")
+      )
+    end
+
+    def event_payload(key, cache_status, source, extra = {})
+      {
+        name: key.name,
+        version: key.version,
+        label: key.label || (key.version ? nil : "production"),
+        logical_key: key.logical_key,
+        storage_key: key.storage_key,
+        backend: cache_backend_name,
+        cache_status: cache_status,
+        source: source
+      }.merge(extra)
+    end
+
+    def cache_backend_name
+      return "disabled" unless cache
+      return "rails" if cache.is_a?(RailsCacheAdapter)
+      return "memory" if cache.is_a?(PromptCache)
+
+      cache.class.name
+    end
+
+    def disabled_prompt_cache_stats
+      {
+        backend: "disabled",
+        enabled: false,
+        current_generation_entries: nil,
+        orphaned_entries: nil,
+        total_entries: nil,
+        unsupported_counts: %i[current_generation_entries orphaned_entries total_entries]
+      }
+    end
+
+    def generated_storage_key_cache?
+      cache.is_a?(PromptCache) || cache.is_a?(RailsCacheAdapter)
+    end
+
+    def invalidate_prompt_cache_after_mutation(name)
+      generation = cache&.invalidate_name(name)
+      payload = { name: name, backend: cache_backend_name, generation: generation, scope: :name, mutation: true }
+      emit_prompt_cache_event(:invalidate, payload)
+    end
+
+    def notify_cache_observer(event, payload)
+      return unless cache_observer
+
+      observer_arity = cache_observer.respond_to?(:arity) ? cache_observer.arity : 2
+      observer_arity == 1 ? cache_observer.call(payload) : cache_observer.call(event, payload)
+    rescue StandardError => e
+      logger.warn("Langfuse prompt cache observer failed: #{e.class} - #{e.message}")
+    end
+
+    def notify_active_support(payload)
+      return unless defined?(ActiveSupport::Notifications)
+
+      ActiveSupport::Notifications.instrument("prompt_cache.langfuse", payload)
+    rescue StandardError => e
+      logger.warn("Langfuse ActiveSupport cache notification failed: #{e.class} - #{e.message}")
+    end
+
     # Fetch prompt using the most appropriate caching strategy available
     #
     # @param cache_key [String] The cache key for this prompt
diff --git a/lib/langfuse/client.rb b/lib/langfuse/client.rb
index 5c6fafa..de6df25 100644
--- a/lib/langfuse/client.rb
+++ b/lib/langfuse/client.rb
@@ -46,7 +46,8 @@ def initialize(config)
         base_url: config.base_url,
         timeout: config.timeout,
         logger: config.logger,
-        cache: cache
+        cache: cache,
+        cache_observer: config.prompt_cache_observer
       )
 
       @project_id = nil
@@ -68,6 +69,7 @@ def initialize(config)
     # @param label [String, nil] Optional label (e.g., "production", "latest")
     # @param fallback [String, Array, nil] Optional fallback prompt to use on error
     # @param type [Symbol, nil] Required when fallback is provided (:text or :chat)
+    # @param cache_ttl [Integer, nil] Optional TTL override for this fetch
     # @return [TextPromptClient, ChatPromptClient] The prompt client
     # @raise [ArgumentError] if both version and label are provided
     # @raise [ArgumentError] if fallback is provided without type
@@ -77,24 +79,113 @@ def initialize(config)
     #
     # @example With fallback for graceful degradation
     #   prompt = client.get_prompt("greeting", fallback: "Hello {{name}}!", type: :text)
-    def get_prompt(name, version: nil, label: nil, fallback: nil, type: nil)
-      # Validate fallback usage
-      if fallback && !type
-        raise ArgumentError, "type parameter is required when fallback is provided (use :text or :chat)"
-      end
+    def get_prompt(name, version: nil, label: nil, fallback: nil, type: nil, cache_ttl: nil)
+      get_prompt_result(
+        name,
+        version: version,
+        label: label,
+        fallback: fallback,
+        type: type,
+        cache_ttl: cache_ttl
+      ).prompt
+    end
 
-      # Try to fetch from API
-      prompt_data = api_client.get_prompt(name, version: version, label: label)
-      build_prompt_client(prompt_data)
+    # Fetch a prompt and return cache metadata.
+    #
+    # @param name [String] The name of the prompt
+    # @param version [Integer, nil] Optional specific version number
+    # @param label [String, nil] Optional label (e.g., "production", "latest")
+    # @param fallback [String, Array, nil] Optional fallback prompt to use on error
+    # @param type [Symbol, nil] Required when fallback is provided (:text or :chat)
+    # @param cache_ttl [Integer, nil] Optional TTL override for this fetch
+    # @return [PromptFetchResult] Prompt client plus cache metadata
+    # @raise [ArgumentError] if fallback is provided without type
+    # @raise [NotFoundError] if the prompt is not found and no fallback provided
+    # @raise [UnauthorizedError] if authentication fails and no fallback provided
+    # @raise [ApiError] for other API errors and no fallback provided
+    def get_prompt_result(name, version: nil, label: nil, fallback: nil, type: nil, cache_ttl: nil)
+      validate_fallback_usage!(fallback, type)
+
+      api_result = api_client.get_prompt_result(name, version: version, label: label, cache_ttl: cache_ttl)
+      build_client_fetch_result(api_result, build_prompt_client(api_result.prompt))
     rescue ApiError, NotFoundError, UnauthorizedError => e
       # If no fallback, re-raise the error
       raise e unless fallback
 
       # Log warning and return fallback
       config.logger.warn("Langfuse API error for prompt '#{name}': #{e.message}. Using fallback.")
-      build_fallback_prompt_client(name, fallback, type)
+      build_fallback_prompt_result(name, version, label, fallback, type, cache_ttl: cache_ttl, error: e)
+    end
+
+    # Refresh a prompt from the API, optionally writing through to cache.
+    #
+    # @param name [String] The name of the prompt
+    # @param version [Integer, nil] Optional specific version number
+    # @param label [String, nil] Optional label (e.g., "production", "latest")
+    # @param cache_ttl [Integer, nil] Optional TTL override for this refresh
+    # @return [PromptFetchResult] Prompt client plus cache metadata
+    # @raise [ArgumentError] if both version and label are provided
+    # @raise [NotFoundError] if the prompt is not found
+    # @raise [UnauthorizedError] if authentication fails
+    # @raise [ApiError] for other API errors
+    def refresh_prompt(name, version: nil, label: nil, cache_ttl: nil)
+      api_result = api_client.refresh_prompt(name, version: version, label: label, cache_ttl: cache_ttl)
+      build_client_fetch_result(api_result, build_prompt_client(api_result.prompt))
+    end
+
+    # Invalidate one exact logical prompt cache key.
+    #
+    # @param name [String] The prompt name
+    # @param version [Integer, nil] Optional specific version number
+    # @param label [String, nil] Optional label
+    # @return [PromptCacheKey] The invalidated key
+    def invalidate_prompt_cache(name, version: nil, label: nil)
+      api_client.invalidate_prompt_cache(name, version: version, label: label)
+    end
+
+    # Invalidate all cached variants for one prompt name.
+    #
+    # @param name [String] The prompt name
+    # @return [Integer, nil] New generation, or nil when cache is disabled
+    def invalidate_prompt_cache_by_name(name)
+      api_client.invalidate_prompt_cache_by_name(name)
+    end
+
+    # Logically clear the whole Langfuse prompt cache namespace.
+    #
+    # @return [Integer, nil] New global generation, or nil when cache is disabled
+    def clear_prompt_cache
+      api_client.clear_prompt_cache
+    end
+
+    # Return prompt cache statistics.
+    #
+    # @return [Hash] Cache statistics
+    def prompt_cache_stats
+      api_client.prompt_cache_stats
+    end
+
+    # Inspect the logical and generated cache keys for a prompt.
+    #
+    # @param name [String] The prompt name
+    # @param version [Integer, nil] Optional specific version number
+    # @param label [String, nil] Optional label
+    # @return [PromptCacheKey] Logical and generated cache keys
+    def prompt_cache_key(name, version: nil, label: nil)
+      api_client.prompt_cache_key(name, version: version, label: label)
     end
 
+    # Validate the configured prompt cache backend before first prompt fetch.
+    #
+    # @return [Boolean] true when the configured backend is usable
+    # @raise [ConfigurationError] if the backend is invalid
+    # rubocop:disable Naming/PredicateMethod
+    def validate_prompt_cache_backend!
+      api_client.cache&.validate! if api_client.cache.respond_to?(:validate!)
+      true
+    end
+    # rubocop:enable Naming/PredicateMethod
+
     # List all prompts in the Langfuse project
     #
     # Fetches a list of all prompt names available in your project.
@@ -126,6 +217,7 @@ def list_prompts(page: nil, limit: nil)
     # @param label [String, nil] Optional label (e.g., "production", "latest")
     # @param fallback [String, Array, nil] Optional fallback prompt to use on error
     # @param type [Symbol, nil] Required when fallback is provided (:text or :chat)
+    # @param cache_ttl [Integer, nil] Optional TTL override for this fetch
     # @return [String, Array<Hash>] Compiled prompt (String for text, Array for chat)
     # @raise [ArgumentError] if both version and label are provided
     # @raise [ArgumentError] if fallback is provided without type
@@ -148,10 +240,19 @@ def list_prompts(page: nil, limit: nil)
     #     fallback: "Hello {{name}}!",
     #     type: :text
     #   )
-    def compile_prompt(name, variables: {}, version: nil, label: nil, fallback: nil, type: nil)
-      prompt = get_prompt(name, version: version, label: label, fallback: fallback, type: type)
+    # rubocop:disable Metrics/ParameterLists
+    def compile_prompt(name, variables: {}, version: nil, label: nil, fallback: nil, type: nil, cache_ttl: nil)
+      prompt = get_prompt(
+        name,
+        version: version,
+        label: label,
+        fallback: fallback,
+        type: type,
+        cache_ttl: cache_ttl
+      )
       prompt.compile(**variables)
     end
+    # rubocop:enable Metrics/ParameterLists
 
     # Create a new prompt (or new version if name already exists)
     #
@@ -738,6 +839,66 @@ def resolve_experiment_items(data, dataset_name)
       list_dataset_items(dataset_name: dataset_name)
     end
 
+    def validate_fallback_usage!(fallback, type)
+      return unless fallback && !type
+
+      raise ArgumentError, "type parameter is required when fallback is provided (use :text or :chat)"
+    end
+
+    def build_client_fetch_result(api_result, prompt_client)
+      PromptFetchResult.new(
+        prompt: prompt_client,
+        logical_key: api_result.logical_key,
+        storage_key: api_result.storage_key,
+        cache_status: api_result.cache_status,
+        source: api_result.source,
+        name: prompt_client.name,
+        version: prompt_client.version,
+        label: api_result.label
+      )
+    end
+
+    def build_fallback_prompt_result(name, version, label, fallback, type, fetch_context)
+      prompt_client = build_fallback_prompt_client(name, fallback, type)
+      key = api_client.prompt_cache_key(name, version: version, label: label)
+      api_client.emit_prompt_cache_event(
+        :fallback,
+        fallback_event_payload(key, fetch_context, fetch_context.fetch(:error))
+      )
+      PromptFetchResult.new(
+        prompt: prompt_client,
+        logical_key: key.logical_key,
+        storage_key: key.storage_key,
+        cache_status: fallback_cache_status(fetch_context.fetch(:cache_ttl)),
+        source: :fallback,
+        name: name,
+        version: version || prompt_client.version,
+        label: key.label || (key.version ? nil : "production")
+      )
+    end
+
+    def fallback_event_payload(key, fetch_context, error)
+      {
+        name: key.name,
+        version: key.version,
+        label: key.label || (key.version ? nil : "production"),
+        logical_key: key.logical_key,
+        storage_key: key.storage_key,
+        backend: api_client.prompt_cache_stats.fetch(:backend),
+        cache_status: fallback_cache_status(fetch_context.fetch(:cache_ttl)),
+        source: :fallback,
+        error_class: error.class.name,
+        error_message: error.message
+      }
+    end
+
+    def fallback_cache_status(cache_ttl)
+      return :bypass if cache_ttl&.zero?
+      return :disabled unless api_client.cache
+
+      :miss
+    end
+
     # Check if caching is enabled in configuration
     #
     # @return [Boolean]
@@ -754,11 +915,17 @@ def create_cache
         create_memory_cache
       when :rails
         create_rails_cache_adapter
+      when :auto
+        rails_cache_available? ? create_rails_cache_adapter : create_memory_cache
       else
         raise ConfigurationError, "Unknown cache backend: #{config.cache_backend}"
       end
     end
 
+    def rails_cache_available?
+      defined?(Rails) && Rails.respond_to?(:cache) && Rails.cache
+    end
+
     # Create in-memory cache with SWR support if enabled
     #
     # @return [PromptCache]
diff --git a/lib/langfuse/config.rb b/lib/langfuse/config.rb
index c6ff5a6..8c9dac2 100644
--- a/lib/langfuse/config.rb
+++ b/lib/langfuse/config.rb
@@ -41,7 +41,7 @@ class Config
     # @return [Integer] Maximum number of cached items
     attr_accessor :cache_max_size
 
-    # @return [Symbol] Cache backend (:memory or :rails)
+    # @return [Symbol] Cache backend (:memory, :rails, or :auto)
     attr_accessor :cache_backend
 
     # @return [Integer] Lock timeout in seconds for distributed cache stampede protection
@@ -57,6 +57,9 @@ class Config
     # @return [Integer] Number of background threads for cache refresh
     attr_accessor :cache_refresh_threads
 
+    # @return [#call, nil] Observer called for prompt cache events
+    attr_accessor :prompt_cache_observer
+
     # @return [Boolean] Use async processing for traces (requires ActiveJob)
     attr_accessor :tracing_async
 
@@ -158,6 +161,7 @@ def initialize
       @cache_stale_while_revalidate = DEFAULT_CACHE_STALE_WHILE_REVALIDATE
       @cache_stale_ttl = 0 # Default to 0 (SWR disabled, entries expire immediately after TTL)
       @cache_refresh_threads = DEFAULT_CACHE_REFRESH_THREADS
+      @prompt_cache_observer = nil
       @tracing_async = DEFAULT_TRACING_ASYNC
       @batch_size = DEFAULT_BATCH_SIZE
       @flush_interval = DEFAULT_FLUSH_INTERVAL
@@ -189,6 +193,7 @@ def validate!
       validate_swr_config!
 
       validate_cache_backend!
+      validate_prompt_cache_observer!
       validate_sample_rate!
       validate_should_export_span!
       validate_mask!
@@ -240,13 +245,19 @@ def initialize_tracing_defaults
     end
 
     def validate_cache_backend!
-      valid_backends = %i[memory rails]
+      valid_backends = %i[memory rails auto]
       return if valid_backends.include?(cache_backend)
 
       raise ConfigurationError,
             "cache_backend must be one of #{valid_backends.inspect}, got #{cache_backend.inspect}"
     end
 
+    def validate_prompt_cache_observer!
+      return if prompt_cache_observer.nil? || prompt_cache_observer.respond_to?(:call)
+
+      raise ConfigurationError, "prompt_cache_observer must respond to #call"
+    end
+
     def validate_swr_config!
       validate_swr_stale_ttl!
       validate_refresh_threads!
diff --git a/lib/langfuse/prompt_cache.rb b/lib/langfuse/prompt_cache.rb
index df401de..011a122 100644
--- a/lib/langfuse/prompt_cache.rb
+++ b/lib/langfuse/prompt_cache.rb
@@ -1,6 +1,7 @@
 # frozen_string_literal: true
 
 require "monitor"
+require "base64"
 require_relative "stale_while_revalidate"
 
 module Langfuse
@@ -14,6 +15,7 @@ module Langfuse
   #   cache.set("greeting:1", prompt_data)
   #   cache.get("greeting:1") # => prompt_data
   #
+  # rubocop:disable Metrics/ClassLength
   class PromptCache
     include StaleWhileRevalidate
 
@@ -79,6 +81,8 @@ def initialize(ttl: 60, max_size: 1000, stale_ttl: 0, refresh_threads: 5, logger
       @stale_ttl = stale_ttl
       @logger = logger
       @cache = {}
+      @global_generation = 0
+      @name_generations = Hash.new(0)
       @monitor = Monitor.new
       @locks = {} # Track locks for in-memory locking
       initialize_swr(refresh_threads: refresh_threads) if swr_enabled?
@@ -98,24 +102,46 @@ def get(key)
       end
     end
 
+    # Read a raw cache entry, including stale entries.
+    #
+    # @param key [String] Cache key
+    # @return [CacheEntry, nil] Raw cache entry
+    def entry(key)
+      @monitor.synchronize do
+        @cache[key]
+      end
+    end
+
     # Set a value in the cache
     #
     # @param key [String] Cache key
     # @param value [Object] Value to cache
     # @return [Object] The cached value
-    def set(key, value)
+    def set(key, value, ttl: nil, stale_ttl: nil)
       @monitor.synchronize do
         # Evict oldest entry if at max size
         evict_oldest if @cache.size >= max_size
 
         now = Time.now
-        fresh_until = now + ttl
-        stale_until = fresh_until + stale_ttl
+        effective_ttl = ttl.nil? ? self.ttl : ttl
+        effective_stale_ttl = stale_ttl.nil? ? self.stale_ttl : stale_ttl
+        fresh_until = now + effective_ttl
+        stale_until = fresh_until + effective_stale_ttl
         @cache[key] = CacheEntry.new(value, fresh_until, stale_until)
         value
       end
     end
 
+    # Delete one generated storage key.
+    #
+    # @param key [String] Generated storage key
+    # @return [Boolean] true if an entry was removed
+    def delete(key)
+      @monitor.synchronize do
+        !@cache.delete(key).nil?
+      end
+    end
+
     # Clear the entire cache
     #
     # @return [void]
@@ -125,6 +151,57 @@ def clear
       end
     end
 
+    # Logically invalidate every generated storage key.
+    #
+    # @return [Integer] New global generation
+    def clear_logically
+      @monitor.synchronize do
+        @global_generation += 1
+      end
+    end
+
+    # Logically invalidate every cache variant for one prompt name.
+    #
+    # @param name [String] Prompt name
+    # @return [Integer] New name generation
+    def invalidate_name(name)
+      @monitor.synchronize do
+        @name_generations[name.to_s] += 1
+      end
+    end
+
+    # Build a generated storage key for the current cache generation.
+    #
+    # @param logical_key [String] Stable logical cache identity
+    # @param name [String] Prompt name
+    # @return [String] Generated storage key
+    def storage_key(logical_key, name:)
+      @monitor.synchronize do
+        self.class.storage_key(
+          logical_key,
+          name: name,
+          global_generation: @global_generation,
+          name_generation: @name_generations[name.to_s]
+        )
+      end
+    end
+
+    # @return [Hash] Prompt cache statistics
+    def stats
+      @monitor.synchronize do
+        counts = count_entries_by_generation
+        {
+          backend: "memory",
+          enabled: true,
+          current_generation_entries: counts.fetch(:current),
+          orphaned_entries: counts.fetch(:orphaned),
+          total_entries: @cache.size,
+          global_generation: @global_generation,
+          unsupported_counts: []
+        }
+      end
+    end
+
     # Remove expired entries from cache
     #
     # @return [Integer] Number of entries removed
@@ -154,6 +231,15 @@ def empty?
       end
     end
 
+    # Validate that the memory cache backend is usable.
+    #
+    # @return [Boolean]
+    # rubocop:disable Naming/PredicateMethod
+    def validate!
+      true
+    end
+    # rubocop:enable Naming/PredicateMethod
+
     # Build a cache key from prompt name and options
     #
     # @param name [String] Prompt name
@@ -168,6 +254,18 @@ def self.build_key(name, version: nil, label: nil)
       key
     end
 
+    # Build a generated storage key from generation metadata.
+    #
+    # @param logical_key [String] Stable logical cache identity
+    # @param name [String] Prompt name
+    # @param global_generation [Integer] Global cache generation
+    # @param name_generation [Integer] Prompt-name cache generation
+    # @return [String] Generated storage key
+    def self.storage_key(logical_key, name:, global_generation:, name_generation:)
+      encoded_name = Base64.urlsafe_encode64(name.to_s, padding: false)
+      "g#{global_generation}:n#{encoded_name}:#{name_generation}:#{logical_key}"
+    end
+
     private
 
     # Implementation of StaleWhileRevalidate abstract methods
@@ -187,7 +285,7 @@ def cache_get(key)
     # @param key [String] Cache key
     # @param value [PromptCache::CacheEntry] Value to cache
     # @return [PromptCache::CacheEntry] The cached value
-    def cache_set(key, value)
+    def cache_set(key, value, **_options)
       @monitor.synchronize do
         # Evict oldest entry if at max size
         evict_oldest if @cache.size >= max_size
@@ -230,6 +328,29 @@ def release_lock(lock_key)
       end
     end
 
+    def count_entries_by_generation
+      @cache.each_key.with_object({ current: 0, orphaned: 0 }) do |key, counts|
+        if current_generation_key?(key)
+          counts[:current] += 1
+        else
+          counts[:orphaned] += 1
+        end
+      end
+    end
+
+    def current_generation_key?(key)
+      parts = key.split(":", 4)
+      return false unless parts.size == 4
+      return false unless parts[0].start_with?("g") && parts[1].start_with?("n")
+
+      global = Integer(parts[0][1..])
+      name = Base64.urlsafe_decode64(parts[1][1..])
+      name_generation = Integer(parts[2])
+      global == @global_generation && name_generation == @name_generations[name]
+    rescue ArgumentError
+      false
+    end
+
     # In-memory cache helper methods
 
     # Evict the oldest entry from cache
@@ -250,4 +371,5 @@ def default_logger
       Logger.new($stdout, level: Logger::WARN)
     end
   end
+  # rubocop:enable Metrics/ClassLength
 end
diff --git a/lib/langfuse/prompt_fetch_result.rb b/lib/langfuse/prompt_fetch_result.rb
new file mode 100644
index 0000000..b11e511
--- /dev/null
+++ b/lib/langfuse/prompt_fetch_result.rb
@@ -0,0 +1,121 @@
+# frozen_string_literal: true
+
+module Langfuse
+  # Public metadata returned by prompt fetch operations.
+  class PromptFetchResult
+    # @return [Object] Prompt data or prompt client returned by the fetch
+    attr_reader :prompt
+
+    # @return [String] Stable logical cache identity
+    attr_reader :logical_key
+
+    # @return [String] Generated backend key for the current cache generation
+    attr_reader :storage_key
+
+    # @return [Symbol] Cache status (:hit, :miss, :stale, :refresh, :bypass, :disabled)
+    attr_reader :cache_status
+
+    # @return [Symbol] Source of the returned prompt (:cache, :api, :fallback)
+    attr_reader :source
+
+    # @return [String] Prompt name
+    attr_reader :name
+
+    # @return [Integer, nil] Prompt version
+    attr_reader :version
+
+    # @return [String, nil] Prompt label
+    attr_reader :label
+
+    # @param prompt [Object] Prompt data or prompt client
+    # @param logical_key [String] Stable logical cache identity
+    # @param storage_key [String] Generated backend key
+    # @param cache_status [Symbol] Cache status
+    # @param source [Symbol] Prompt source
+    # @param name [String] Prompt name
+    # @param version [Integer, nil] Prompt version
+    # @param label [String, nil] Prompt label
+    # @return [PromptFetchResult]
+    # rubocop:disable Metrics/ParameterLists
+    def initialize(prompt:, logical_key:, storage_key:, cache_status:, source:, name:, version: nil, label: nil)
+      @prompt = prompt
+      @logical_key = logical_key
+      @storage_key = storage_key
+      @cache_status = cache_status.to_sym
+      @source = source.to_sym
+      @name = name
+      @version = version
+      @label = label
+    end
+    # rubocop:enable Metrics/ParameterLists
+
+    # Compatibility alias for callers that already use "cache key" language.
+    #
+    # @return [String] Stable logical cache identity
+    def cache_key
+      logical_key
+    end
+
+    # @return [Boolean] Whether this result used caller-provided fallback content
+    def fallback?
+      source == :fallback || (prompt.respond_to?(:is_fallback) && prompt.is_fallback)
+    end
+
+    # @return [Hash] Result metadata as a hash
+    def to_h
+      {
+        logical_key: logical_key,
+        storage_key: storage_key,
+        cache_status: cache_status,
+        source: source,
+        name: name,
+        version: version,
+        label: label,
+        fallback: fallback?
+      }
+    end
+  end
+
+  # Public key inspection result for prompt cache operations.
+  class PromptCacheKey
+    # @return [String] Prompt name
+    attr_reader :name
+
+    # @return [Integer, nil] Prompt version
+    attr_reader :version
+
+    # @return [String, nil] Prompt label
+    attr_reader :label
+
+    # @return [String] Stable logical cache identity
+    attr_reader :logical_key
+
+    # @return [String] Generated backend key for the current cache generation
+    attr_reader :storage_key
+
+    # @param name [String] Prompt name
+    # @param logical_key [String] Stable logical cache identity
+    # @param storage_key [String] Generated backend key
+    # @param version [Integer, nil] Prompt version
+    # @param label [String, nil] Prompt label
+    # @return [PromptCacheKey]
+    def initialize(name:, logical_key:, storage_key:, version: nil, label: nil)
+      @name = name
+      @version = version
+      @label = label
+      @logical_key = logical_key
+      @storage_key = storage_key
+    end
+
+    # @return [Hash] Cache key data as a hash
+    def to_h
+      {
+        name: name,
+        version: version,
+        label: label,
+        logical_key: logical_key,
+        storage_key: storage_key
+      }
+    end
+  end
+end
diff --git a/lib/langfuse/rails_cache_adapter.rb b/lib/langfuse/rails_cache_adapter.rb
index 319b24c..8aaff06 100644
--- a/lib/langfuse/rails_cache_adapter.rb
+++ b/lib/langfuse/rails_cache_adapter.rb
@@ -14,6 +14,7 @@ module Langfuse
   #   adapter.set("greeting:1", prompt_data)
   #   adapter.get("greeting:1") # => prompt_data
   #
+  # rubocop:disable Metrics/ClassLength
   class RailsCacheAdapter
     include StaleWhileRevalidate
 
@@ -65,18 +66,36 @@ def get(key)
       Rails.cache.read(namespaced_key(key))
     end
 
+    # Read a raw cache entry, including stale entries.
+    #
+    # @param key [String] Cache key
+    # @return [Object, nil] Raw cache entry
+    def entry(key)
+      Rails.cache.read(namespaced_key(key))
+    end
+
     # Set a value in the cache
     #
     # @param key [String] Cache key
     # @param value [Object] Value to cache
     # @return [Object] The cached value
-    def set(key, value)
+    def set(key, value, ttl: nil, stale_ttl: nil)
       # Calculate expiration: use total_ttl if SWR enabled, otherwise just ttl
-      expires_in = swr_enabled? ? total_ttl : ttl
+      effective_ttl = ttl.nil? ? self.ttl : ttl
+      effective_stale_ttl = stale_ttl.nil? ? self.stale_ttl : stale_ttl
+      expires_in = swr_enabled? ? effective_ttl + effective_stale_ttl : effective_ttl
       Rails.cache.write(namespaced_key(key), value, expires_in:)
       value
     end
 
+    # Delete one generated storage key.
+    #
+    # @param key [String] Cache key
+    # @return [Boolean] true if an entry was removed
+    def delete(key)
+      Rails.cache.delete(namespaced_key(key))
+    end
+
     # Clear the entire Langfuse cache namespace
     #
     # Note: This uses delete_matched which may not be available on all cache stores.
@@ -88,6 +107,36 @@ def clear
       Rails.cache.delete_matched("#{namespace}:*")
     end
 
+    # Logically invalidate every generated storage key.
+    #
+    # @return [Integer] New global generation
+    def clear_logically
+      bump_generation(global_generation_key)
+    end
+
+    # Logically invalidate every cache variant for one prompt name.
+    #
+    # @param name [String] Prompt name
+    # @return [Integer] New name generation
+    def invalidate_name(name)
+      bump_generation(name_generation_key(name))
+    end
+
+    # Build a generated storage key for the current cache generation.
+    #
+    # @param logical_key [String] Stable logical cache identity
+    # @param name [String] Prompt name
+    # @return [String] Generated storage key
+    def storage_key(logical_key, name:)
+      generated = PromptCache.storage_key(
+        logical_key,
+        name: name,
+        global_generation: generation_value(global_generation_key),
+        name_generation: generation_value(name_generation_key(name))
+      )
+      namespaced_key(generated)
+    end
+
     # Get current cache size
     #
     # Note: Rails.cache doesn't provide a size method, so we return nil
@@ -98,6 +147,19 @@ def size
       nil
     end
 
+    # @return [Hash] Prompt cache statistics
+    def stats
+      {
+        backend: "rails",
+        enabled: true,
+        current_generation_entries: nil,
+        orphaned_entries: nil,
+        total_entries: nil,
+        global_generation: generation_value(global_generation_key),
+        unsupported_counts: %i[current_generation_entries orphaned_entries total_entries]
+      }
+    end
+
     # Check if cache is empty
     #
     # Note: Rails.cache doesn't provide an efficient way to check if empty,
@@ -108,6 +170,17 @@ def empty?
       false
     end
 
+    # Validate that Rails.cache is available for prompt caching.
+    #
+    # @return [Boolean]
+    # @raise [ConfigurationError] if Rails.cache is not available
+    # rubocop:disable Naming/PredicateMethod
+    def validate!
+      validate_rails_cache!
+      true
+    end
+    # rubocop:enable Naming/PredicateMethod
+
     # Build a cache key from prompt name and options
     #
     # @param name [String] Prompt name
@@ -135,7 +208,7 @@ def self.build_key(name, version: nil, label: nil)
     #   cache.fetch_with_lock("greeting:v1") do
     #     api_client.get_prompt("greeting")
     #   end
-    def fetch_with_lock(key)
+    def fetch_with_lock(key, ttl: nil)
       # 1. Check cache first (fast path - no lock needed)
       cached = get(key)
       return cached if cached
@@ -147,7 +220,7 @@ def fetch_with_lock(key)
         begin
           # We got the lock - fetch from source and populate cache
           value = yield
-          set(key, value)
+          set(key, value, ttl: ttl)
           value
         ensure
           # Always release lock, even if block raises
@@ -173,7 +246,7 @@ def fetch_with_lock(key)
     # @param key [String] Cache key
     # @return [Object, nil] Cached value
     def cache_get(key)
-      get(key)
+      entry(key)
     end
 
     # Set value in cache (SWR interface)
@@ -181,8 +254,9 @@ def cache_get(key)
     # @param key [String] Cache key
     # @param value [Object] Value to cache (expects CacheEntry)
     # @return [Object] The cached value
-    def cache_set(key, value)
-      set(key, value)
+    def cache_set(key, value, ttl: nil)
+      Rails.cache.write(namespaced_key(key), value, expires_in: ttl || total_ttl)
+      value
     end
 
     # Build lock key with namespace
@@ -248,7 +322,38 @@ def wait_for_cache(key)
     # @param key [String] Original cache key
     # @return [String] Namespaced cache key
     def namespaced_key(key)
-      "#{namespace}:#{key}"
+      key.start_with?("#{namespace}:") ? key : "#{namespace}:#{key}"
+    end
+
+    def global_generation_key
+      namespaced_key("__prompt_cache_generation__:global")
+    end
+
+    def name_generation_key(name)
+      encoded_name = Base64.urlsafe_encode64(name.to_s, padding: false)
+      namespaced_key("__prompt_cache_generation__:name:#{encoded_name}")
+    end
+
+    def generation_value(key)
+      Rails.cache.read(key).to_i
+    end
+
+    def bump_generation(key)
+      incremented = increment_generation(key)
+      return incremented if incremented
+
+      new_value = generation_value(key) + 1
+      Rails.cache.write(key, new_value)
+      new_value
+    end
+
+    def increment_generation(key)
+      return unless Rails.cache.respond_to?(:increment)
+
+      Rails.cache.write(key, 0, unless_exist: true)
+      Rails.cache.increment(key, 1)
+    rescue StandardError
+      nil
     end
 
     # Validate that Rails.cache is available
@@ -273,4 +378,5 @@ def default_logger
       end
     end
   end
+  # rubocop:enable Metrics/ClassLength
 end
diff --git a/lib/langfuse/stale_while_revalidate.rb b/lib/langfuse/stale_while_revalidate.rb
index 9d2e264..c469a95 100644
--- a/lib/langfuse/stale_while_revalidate.rb
+++ b/lib/langfuse/stale_while_revalidate.rb
@@ -41,6 +41,7 @@ module Langfuse
   #       # Implementation-specific lock release
   #     end
   #   end
+  # rubocop:disable Metrics/ModuleLength
   module StaleWhileRevalidate
     # Initialize SWR infrastructure
     #
@@ -72,7 +73,7 @@ def initialize_swr(refresh_threads: 5)
     #   cache.fetch_with_stale_while_revalidate("greeting:v1") do
     #     api_client.get_prompt("greeting")
     #   end
-    def fetch_with_stale_while_revalidate(key, &)
+    def fetch_with_stale_while_revalidate(key, ttl: nil, stale_ttl: nil, &)
       raise ConfigurationError, "fetch_with_stale_while_revalidate requires a positive stale_ttl" unless swr_enabled?
 
       entry = cache_get(key)
@@ -84,15 +85,50 @@ def fetch_with_stale_while_revalidate(key, &)
       elsif entry&.stale?
         # REVALIDATE - return stale + refresh in background
         logger.debug("CACHE STALE!")
-        schedule_refresh(key, &)
+        schedule_refresh(key, ttl: ttl, stale_ttl: stale_ttl, &)
         entry.data # Instant response!
       else
         # MISS - must fetch synchronously
         logger.debug("CACHE MISS!")
-        fetch_and_cache(key, &)
+        fetch_and_cache(key, ttl: ttl, stale_ttl: stale_ttl, &)
       end
     end
 
+    # Schedule a cache refresh without performing a read.
+    #
+    # @param key [String] Cache key
+    # @param ttl [Integer, nil] Optional fresh TTL override
+    # @param stale_ttl [Integer, nil] Optional stale TTL override
+    # @param on_success [#call, nil] Callback invoked after a successful write
+    # @param on_failure [#call, nil] Callback invoked when refresh raises
+    # @yield Block to execute to fetch fresh data
+    # @return [Boolean] true if a refresh was scheduled
+    def refresh_async(key, ttl: nil, stale_ttl: nil, on_success: nil, on_failure: nil, &)
+      raise ConfigurationError, "refresh_async requires a positive stale_ttl" unless swr_enabled?
+
+      schedule_refresh(
+        key,
+        ttl: ttl,
+        stale_ttl: stale_ttl,
+        on_success: on_success,
+        on_failure: on_failure,
+        &
+      )
+    end
+
+    # Write a value with stale-while-revalidate metadata.
+    #
+    # @param key [String] Cache key
+    # @param value [Object] Value to cache
+    # @param ttl [Integer, nil] Optional fresh TTL override
+    # @param stale_ttl [Integer, nil] Optional stale TTL override
+    # @return [Object] The cached value
+    def write_with_stale_while_revalidate(key, value, ttl: nil, stale_ttl: nil)
+      raise ConfigurationError, "write_with_stale_while_revalidate requires a positive stale_ttl" unless swr_enabled?
+
+      set_cache_entry(key, value, ttl: ttl, stale_ttl: stale_ttl)
+    end
+
     # Check if SWR is enabled
     #
     # SWR is enabled when stale_ttl is positive, meaning there's a grace period
@@ -138,29 +174,35 @@ def initialize_thread_pool(refresh_threads)
     # @param key [String] Cache key
     # @yield Block to execute to fetch fresh data
     # @return [void]
-    def schedule_refresh(key, &block)
+    # rubocop:disable Naming/PredicateMethod
+    def schedule_refresh(key, ttl: nil, stale_ttl: nil, on_success: nil, on_failure: nil, &block)
       # Prevent duplicate refreshes
       lock_key = build_lock_key(key)
-      return unless acquire_lock(lock_key)
+      return false unless acquire_lock(lock_key)
 
       @thread_pool.post do
         value = yield block
-        set_cache_entry(key, value)
+        set_cache_entry(key, value, ttl: ttl, stale_ttl: stale_ttl)
+        on_success&.call(value)
       rescue StandardError => e
+        on_failure&.call(e)
         logger.error("Langfuse cache refresh failed for key '#{key}': #{e.class} - #{e.message}")
       ensure
         release_lock(lock_key)
       end
+
+      true
     end
+    # rubocop:enable Naming/PredicateMethod
 
     # Fetch data and cache it with SWR metadata
     #
     # @param key [String] Cache key
     # @yield Block to execute to fetch fresh data
     # @return [Object] Freshly fetched value
-    def fetch_and_cache(key, &block)
+    def fetch_and_cache(key, ttl: nil, stale_ttl: nil, &block)
       value = yield block
-      set_cache_entry(key, value)
+      set_cache_entry(key, value, ttl: ttl, stale_ttl: stale_ttl)
     end
 
     # Set value in cache with SWR metadata (CacheEntry)
@@ -168,13 +210,15 @@ def fetch_and_cache(key, &block)
     # @param key [String] Cache key
     # @param value [Object] Value to cache
     # @return [Object] The cached value
-    def set_cache_entry(key, value)
+    def set_cache_entry(key, value, ttl: nil, stale_ttl: nil)
       now = Time.now
-      fresh_until = now + ttl
-      stale_until = fresh_until + stale_ttl
+      effective_ttl = ttl.nil? ? self.ttl : ttl
+      effective_stale_ttl = stale_ttl.nil? ? self.stale_ttl : stale_ttl
+      fresh_until = now + effective_ttl
+      stale_until = fresh_until + effective_stale_ttl
       entry = PromptCache::CacheEntry.new(value, fresh_until, stale_until)
 
-      cache_set(key, entry)
+      cache_set(key, entry, ttl: effective_ttl + effective_stale_ttl)
 
       value
     end
@@ -213,7 +257,7 @@ def cache_get(_key)
     # @param value [Object] Value to cache
     # @return [Object] The cached value
     # @raise [NotImplementedError] if not implemented by including class
-    def cache_set(_key, _value)
+    def cache_set(_key, _value, ttl: nil)
       raise NotImplementedError, "#{self.class} must implement #cache_set"
     end
 
@@ -259,4 +303,5 @@ def logger
       @logger || raise(NotImplementedError, "#{self.class} must provide @logger")
     end
   end
+  # rubocop:enable Metrics/ModuleLength
 end
diff --git a/spec/langfuse/api_client_spec.rb b/spec/langfuse/api_client_spec.rb
index d34b0dc..4e74873 100644
--- a/spec/langfuse/api_client_spec.rb
+++ b/spec/langfuse/api_client_spec.rb
@@ -974,6 +974,207 @@ def set(_key, value)
     # rubocop:enable RSpec/MultipleMemoizedHelpers
   end
 
+  # rubocop:disable RSpec/MultipleMemoizedHelpers
+  describe "public prompt cache operations" do
+    let(:prompt_name) { "greeting" }
+    let(:prompt_response) do
+      {
+        "id" => "prompt-123",
+        "name" => prompt_name,
+        "version" => 1,
+        "prompt" => "Hello {{name}}!",
+        "type" => "text",
+        "labels" => ["production"]
+      }
+    end
+    let(:cache) { Langfuse::PromptCache.new(ttl: 60) }
+    let(:cached_client) do
+      described_class.new(public_key: public_key, secret_key: secret_key, base_url: base_url, cache: cache)
+    end
+
+    it "returns fetch metadata while keeping get_prompt prompt-data compatible" do
+      stub_prompt(prompt_response)
+
+      result = cached_client.get_prompt_result(prompt_name)
+      prompt = cached_client.get_prompt(prompt_name)
+
+      expect(result).to be_a(Langfuse::PromptFetchResult)
+      expect(result.prompt).to eq(prompt_response)
+      expect(result.logical_key).to eq("greeting:production")
+      expect(result.storage_key).to include("g0:")
+      expect(result.cache_status).to eq(:miss)
+      expect(result.source).to eq(:api)
+      expect(prompt).to eq(prompt_response)
+      expect(a_request(:get, prompt_url)).to have_been_made.once
+    end
+
+    it "reports cache hits and emits cache events" do
+      events = []
+      client = described_class.new(
+        public_key: public_key,
+        secret_key: secret_key,
+        base_url: base_url,
+        cache: cache,
+        cache_observer: ->(event, payload) { events << [event, payload] }
+      )
+      stub_prompt(prompt_response)
+
+      miss = client.get_prompt_result(prompt_name)
+      hit = client.get_prompt_result(prompt_name)
+
+      expect(miss.cache_status).to eq(:miss)
+      expect(hit.cache_status).to eq(:hit)
+      expect(hit.source).to eq(:cache)
+      expect(events.map(&:first)).to include(:miss, :write, :hit)
+      expect(events.last.last).to include(logical_key: "greeting:production", source: :cache)
+      expect(a_request(:get, prompt_url)).to have_been_made.once
+    end
+
+    it "bypasses cache reads and writes when cache_ttl is zero" do
+      stub_request(:get, prompt_url)
+        .to_return(
+          { status: 200, body: prompt_response.merge("version" => 1).to_json,
+            headers: { "Content-Type" => "application/json" } },
+          { status: 200, body: prompt_response.merge("version" => 2).to_json,
+            headers: { "Content-Type" => "application/json" } }
+        )
+
+      first = cached_client.get_prompt_result(prompt_name, cache_ttl: 0)
+      second = cached_client.get_prompt_result(prompt_name, cache_ttl: 0)
+
+      expect(first.cache_status).to eq(:bypass)
+      expect(second.cache_status).to eq(:bypass)
+      expect(second.version).to eq(2)
+      expect(a_request(:get, prompt_url)).to have_been_made.twice
+    end
+
+    it "refreshes a prompt and writes the refreshed value through to cache" do
+      stub_request(:get, prompt_url)
+        .to_return(
+          { status: 200, body: prompt_response.merge("version" => 1).to_json,
+            headers: { "Content-Type" => "application/json" } },
+          { status: 200, body: prompt_response.merge("version" => 2, "prompt" => "Hi {{name}}!").to_json,
+            headers: { "Content-Type" => "application/json" } }
+        )
+
+      cached_client.get_prompt_result(prompt_name)
+      refresh = cached_client.refresh_prompt(prompt_name)
+      hit = cached_client.get_prompt_result(prompt_name)
+
+      expect(refresh.cache_status).to eq(:refresh)
+      expect(refresh.source).to eq(:api)
+      expect(hit.cache_status).to eq(:hit)
+      expect(hit.prompt["prompt"]).to eq("Hi {{name}}!")
+      expect(a_request(:get, prompt_url)).to have_been_made.twice
+    end
+
+    it "invalidates one exact logical key without prefix matching" do
+      other_response = prompt_response.merge("name" => "greeting-extra")
+      stub_prompt(prompt_response)
+      stub_request(:get, "#{base_url}/api/public/v2/prompts/greeting-extra")
+        .to_return(status: 200, body: other_response.to_json, headers: { "Content-Type" => "application/json" })
+
+      cached_client.get_prompt_result(prompt_name)
+      cached_client.get_prompt_result("greeting-extra")
+      invalidated = cached_client.invalidate_prompt_cache(prompt_name)
+      cached_client.get_prompt_result(prompt_name)
+      cached_client.get_prompt_result("greeting-extra")
+
+      expect(invalidated.logical_key).to eq("greeting:production")
+      expect(a_request(:get, prompt_url)).to have_been_made.twice
+      expect(a_request(:get, "#{base_url}/api/public/v2/prompts/greeting-extra")).to have_been_made.once
+    end
+
+    it "invalidates all variants for a prompt name after prompt mutation" do
+      stub_request(:get, prompt_url)
+        .to_return(
+          { status: 200, body: prompt_response.merge("version" => 1).to_json,
+            headers: { "Content-Type" => "application/json" } },
+          { status: 200, body: prompt_response.merge("version" => 2).to_json,
+            headers: { "Content-Type" => "application/json" } }
+        )
+      stub_request(:post, "#{base_url}/api/public/v2/prompts")
+        .to_return(status: 201, body: prompt_response.merge("version" => 2).to_json,
+                   headers: { "Content-Type" => "application/json" })
+
+      cached_client.get_prompt_result(prompt_name)
+      cached_client.create_prompt(name: prompt_name, prompt: "Hi", type: "text")
+      result = cached_client.get_prompt_result(prompt_name)
+
+      expect(result.version).to eq(2)
+      expect(a_request(:get, prompt_url)).to have_been_made.twice
+    end
+
+    it "uses Rails generation keys for name-wide and global invalidation" do
+      store = build_rails_cache_store
+      stub_const("Rails", Class.new)
+      allow(Rails).to receive(:cache).and_return(store)
+      rails_cache = Langfuse::RailsCacheAdapter.new(ttl: 60)
+      client = described_class.new(
+        public_key: public_key,
+        secret_key: secret_key,
+        base_url: base_url,
+        cache: rails_cache
+      )
+      stub_prompt(prompt_response)
+
+      first_key = client.prompt_cache_key(prompt_name).storage_key
+      client.get_prompt_result(prompt_name)
+      client.invalidate_prompt_cache_by_name(prompt_name)
+      second_key = client.prompt_cache_key(prompt_name).storage_key
+      client.clear_prompt_cache
+      third_key = client.prompt_cache_key(prompt_name).storage_key
+
+      expect(second_key).not_to eq(first_key)
+      expect(third_key).not_to eq(second_key)
+      expect(store.deleted_patterns).to be_empty
+    end
+
+    def prompt_url
+      "#{base_url}/api/public/v2/prompts/#{prompt_name}"
+    end
+
+    def stub_prompt(response)
+      stub_request(:get, prompt_url)
+        .to_return(status: 200, body: response.to_json, headers: { "Content-Type" => "application/json" })
+    end
+
+    def build_rails_cache_store
+      Class.new do
+        attr_reader :deleted_patterns
+
+        def initialize
+          @data = {}
+          @deleted_patterns = []
+        end
+
+        def read(key)
+          @data[key]
+        end
+
+        # rubocop:disable Naming/PredicateMethod
+        def write(key, value, **_options)
+          @data[key] = value
+          true
+        end
+
+        def delete(key)
+          !@data.delete(key).nil?
+        end
+        # rubocop:enable Naming/PredicateMethod
+
+        def delete_matched(pattern)
+          @deleted_patterns << pattern
+        end
+
+        def increment(key, amount)
+          @data[key] = @data.fetch(key, 0).to_i + amount
+        end
+      end.new
+    end
+  end
+  # rubocop:enable RSpec/MultipleMemoizedHelpers
+
   describe "#create_prompt" do
     let(:prompt_name) { "new-prompt" }
     let(:text_prompt_request) do
diff --git a/spec/langfuse/client_spec.rb b/spec/langfuse/client_spec.rb
index 66403ab..917b31d 100644
--- a/spec/langfuse/client_spec.rb
+++ b/spec/langfuse/client_spec.rb
@@ -154,6 +154,35 @@ class << self
       end
     end
 
+    context "with auto cache backend" do
+      let(:config_with_auto_cache) do
+        Langfuse::Config.new do |config|
+          config.public_key = "pk_test_123"
+          config.secret_key = "sk_test_456"
+          config.base_url = "https://cloud.langfuse.com"
+          config.cache_ttl = 60
+          config.cache_backend = :auto
+        end
+      end
+
+      it "uses memory cache when Rails.cache is unavailable" do
+        client = described_class.new(config_with_auto_cache)
+        expect(client.api_client.cache).to be_a(Langfuse::PromptCache)
+      end
+
+      it "uses Rails cache when Rails.cache is available" do
+        rails_class = Class.new do
+          def self.cache
+            Object.new
+          end
+        end
+        stub_const("Rails", rails_class)
+
+        client = described_class.new(config_with_auto_cache)
+        expect(client.api_client.cache).to be_a(Langfuse::RailsCacheAdapter)
+      end
+    end
+
     context "with invalid cache backend" do
       let(:config_invalid_backend) do
         Langfuse::Config.new do |config|
@@ -520,6 +549,27 @@ class << self
           a_request(:get, "#{base_url}/api/public/v2/prompts/greeting")
         ).to have_been_made.once
       end
+
+      it "returns prompt clients with cache metadata" do
+        result = cached_client.get_prompt_result("greeting")
+        cached_result = cached_client.get_prompt_result("greeting")
+
+        expect(result.prompt).to be_a(Langfuse::TextPromptClient)
+        expect(result.logical_key).to eq("greeting:production")
+        expect(result.cache_status).to eq(:miss)
+        expect(result.source).to eq(:api)
+        expect(cached_result.cache_status).to eq(:hit)
+        expect(cached_result.source).to eq(:cache)
+      end
+
+      it "exposes flat prompt cache inspection and validation APIs" do
+        key = cached_client.prompt_cache_key("greeting")
+
+        expect(key.logical_key).to eq("greeting:production")
+        expect(key.storage_key).to include("g0:")
+        expect(cached_client.prompt_cache_stats).to include(backend: "memory", enabled: true)
+        expect(cached_client.validate_prompt_cache_backend!).to be(true)
+      end
     end
 
     context "with fallback support" do
@@ -543,6 +593,16 @@ class << self
           expect(result.is_fallback).to be(true)
         end
 
+        it "returns fallback fetch metadata" do
+          result = client.get_prompt_result("missing", fallback: "Hello!", type: :text)
+
+          expect(result.prompt).to be_a(Langfuse::TextPromptClient)
+          expect(result.fallback?).to be(true)
+          expect(result.source).to eq(:fallback)
+          expect(result.cache_status).to eq(:miss)
+          expect(result.logical_key).to eq("missing:production")
+        end
+
         it "raises error when no fallback provided" do
           expect do
             client.get_prompt("missing")
diff --git a/spec/langfuse/config_spec.rb b/spec/langfuse/config_spec.rb
index 6c860b1..40b89cf 100644
--- a/spec/langfuse/config_spec.rb
+++ b/spec/langfuse/config_spec.rb
@@ -300,6 +300,26 @@
         config.cache_backend = :rails
         expect { config.validate! }.not_to raise_error
       end
+
+      it "allows :auto backend" do
+        config.cache_backend = :auto
+        expect { config.validate! }.not_to raise_error
+      end
+    end
+
+    context "when prompt_cache_observer is invalid" do
+      it "raises ConfigurationError when observer is not callable" do
+        config.prompt_cache_observer = Object.new
+        expect { config.validate! }.to raise_error(
+          Langfuse::ConfigurationError,
+          "prompt_cache_observer must respond to #call"
+        )
+      end
+
+      it "allows callable observers" do
+        config.prompt_cache_observer = ->(_event, _payload) {}
+        expect { config.validate! }.not_to raise_error
+      end
     end
 
     context "when sample_rate is invalid" do
diff --git a/spec/langfuse/prompt_cache_spec.rb b/spec/langfuse/prompt_cache_spec.rb
index c786933..33e5a14 100644
--- a/spec/langfuse/prompt_cache_spec.rb
+++ b/spec/langfuse/prompt_cache_spec.rb
@@ -224,6 +224,50 @@
     end
   end
 
+  describe "generated cache keys" do
+    it "keeps logical keys stable while changing storage keys by generation" do
+      logical_key = described_class.build_key("greeting")
+      first_storage_key = cache.storage_key(logical_key, name: "greeting")
+
+      cache.invalidate_name("greeting")
+      second_storage_key = cache.storage_key(logical_key, name: "greeting")
+
+      cache.clear_logically
+      third_storage_key = cache.storage_key(logical_key, name: "greeting")
+
+      expect(logical_key).to eq("greeting:production")
+      expect(second_storage_key).not_to eq(first_storage_key)
+      expect(third_storage_key).not_to eq(second_storage_key)
+    end
+
+    it "tracks current and orphaned entries after logical invalidation" do
+      logical_key = described_class.build_key("greeting")
+      old_storage_key = cache.storage_key(logical_key, name: "greeting")
+      cache.set(old_storage_key, test_data)
+
+      cache.invalidate_name("greeting")
+      new_storage_key = cache.storage_key(logical_key, name: "greeting")
+      cache.set(new_storage_key, test_data)
+
+      expect(cache.stats).to include(
+        current_generation_entries: 1,
+        orphaned_entries: 1,
+        total_entries: 2
+      )
+    end
+
+    it "deletes one generated storage key without touching sibling names" do
+      greeting_key = cache.storage_key(described_class.build_key("greeting"), name: "greeting")
+      sibling_key = cache.storage_key(described_class.build_key("greeting-extra"), name: "greeting-extra")
+      cache.set(greeting_key, test_data)
+      cache.set(sibling_key, { "name" => "greeting-extra" })
+
+      expect(cache.delete(greeting_key)).to be(true)
+      expect(cache.get(greeting_key)).to be_nil
+      expect(cache.get(sibling_key)).to eq({ "name" => "greeting-extra" })
+    end
+  end
+
   describe "TTL expiration" do
     it "returns nil for expired entries" do
       cache.set("key1", test_data)

From b7e2f2bbc81f74105d2647b3bc37deb468bfa252 Mon Sep 17 00:00:00 2001
From: kadekillary <kadekillary@pm.me>
Date: Mon, 4 May 2026 03:36:58 -0600
Subject: [PATCH 2/7] refactor(prompts): simplify prompt cache operations

- Centralize label-defaulting on PromptCacheKey#resolved_label
- Share event payload builder between ApiClient and Client
- Memoize cache backend name and ActiveSupport::Notifications lookup
- Skip event emission early when no observers are configured
- Drop dead to_sym coercions and uncalled fetch_with_simple_cache helpers
---
 lib/langfuse/api_client.rb          | 55 ++++++++++++-----------------
 lib/langfuse/client.rb              | 17 ++++-----
 lib/langfuse/prompt_fetch_result.rb | 12 +++++--
 3 files changed, 39 insertions(+), 45 deletions(-)

diff --git a/lib/langfuse/api_client.rb b/lib/langfuse/api_client.rb
index 0a2ea41..67f861b 100644
--- a/lib/langfuse/api_client.rb
+++ b/lib/langfuse/api_client.rb
@@ -63,6 +63,8 @@ def initialize(public_key:, secret_key:, base_url:, timeout: 5, logger: nil, cac
       @logger = logger || Logger.new($stdout, level: Logger::WARN)
       @cache = cache
       @cache_observer = cache_observer
+      @cache_backend_name = compute_cache_backend_name
+      @active_support_notifications = defined?(ActiveSupport::Notifications) ? ActiveSupport::Notifications : nil
     end
     # rubocop:enable Metrics/ParameterLists
 
@@ -237,11 +239,25 @@ def prompt_cache_stats
     # @param payload [Hash] Event payload
     # @return [void]
     def emit_prompt_cache_event(event, payload)
+      return if @cache_observer.nil? && @active_support_notifications.nil?
+
       normalized_payload = payload.merge(event: event.to_sym)
       notify_cache_observer(event, normalized_payload)
       notify_active_support(normalized_payload)
     end
 
+    # Build a prompt cache event payload for one logical key.
+    #
+    # @api private
+    # @param key [PromptCacheKey] Logical and storage cache key
+    # @param cache_status [Symbol] Cache status (:hit, :miss, :stale, :refresh, :bypass, :disabled)
+    # @param source [Symbol] Prompt source (:cache, :api, :fallback)
+    # @param extra [Hash] Extra fields to merge into the payload
+    # @return [Hash] Event payload
+    def prompt_event_payload(key, cache_status, source, extra = {})
+      event_payload(key, cache_status, source, extra)
+    end
+
     # Create a new prompt (or new version if prompt with same name exists)
     #
     # @param name [String] The prompt name
@@ -887,7 +903,7 @@ def build_prompt_result(key, prompt_data, cache_status, source)
         source: source,
         name: prompt_data["name"] || key.name,
         version: prompt_data["version"] || key.version,
-        label: key.label || (key.version ? nil : "production")
+        label: key.resolved_label
       )
     end
 
@@ -895,7 +911,7 @@ def event_payload(key, cache_status, source, extra = {})
       {
         name: key.name,
         version: key.version,
-        label: key.label || (key.version ? nil : "production"),
+        label: key.resolved_label,
         logical_key: key.logical_key,
         storage_key: key.storage_key,
         backend: cache_backend_name,
@@ -904,7 +920,9 @@ def event_payload(key, cache_status, source, extra = {})
       }.merge(extra)
     end
 
-    def cache_backend_name
+    attr_reader :cache_backend_name
+
+    def compute_cache_backend_name
       return "disabled" unless cache
       return "rails" if cache.is_a?(RailsCacheAdapter)
       return "memory" if cache.is_a?(PromptCache)
@@ -943,30 +961,13 @@ def notify_cache_observer(event, payload)
     end
 
     def notify_active_support(payload)
-      return unless defined?(ActiveSupport::Notifications)
+      return unless @active_support_notifications
 
-      ActiveSupport::Notifications.instrument("prompt_cache.langfuse", payload)
+      @active_support_notifications.instrument("prompt_cache.langfuse", payload)
     rescue StandardError => e
       logger.warn("Langfuse ActiveSupport cache notification failed: #{e.class} - #{e.message}")
     end
 
-    # Fetch prompt using the most appropriate caching strategy available
-    #
-    # @param cache_key [String] The cache key for this prompt
-    # @param name [String] The name of the prompt
-    # @param version [Integer, nil] Optional specific version number
-    # @param label [String, nil] Optional label
-    # @return [Hash] The prompt data
-    def fetch_with_appropriate_caching_strategy(cache_key, name, version, label)
-      if swr_cache_available?
-        fetch_with_swr_cache(cache_key, name, version, label)
-      elsif distributed_cache_available?
-        fetch_with_distributed_cache(cache_key, name, version, label)
-      else
-        fetch_with_simple_cache(cache_key, name, version, label)
-      end
-    end
-
     # Check if SWR cache is available
     def swr_cache_available?
       cache.respond_to?(:swr_enabled?) && cache.swr_enabled?
@@ -1062,16 +1063,6 @@ def fetch_with_distributed_cache(cache_key, name, version, label)
       end
     end
 
-    # Fetch with simple cache (in-memory cache)
-    def fetch_with_simple_cache(cache_key, name, version, label)
-      cached_data = cache.get(cache_key)
-      return cached_data if cached_data
-
-      prompt_data = fetch_prompt_from_api(name, version: version, label: label)
-      cache.set(cache_key, prompt_data)
-      prompt_data
-    end
-
     # Fetch a prompt from the API (without caching)
     #
     # @param name [String] The name of the prompt
diff --git a/lib/langfuse/client.rb b/lib/langfuse/client.rb
index de6df25..9c50cda 100644
--- a/lib/langfuse/client.rb
+++ b/lib/langfuse/client.rb
@@ -873,23 +873,18 @@ def build_fallback_prompt_result(name, version, label, fallback, type, fetch_con
         source: :fallback,
         name: name,
         version: version || prompt_client.version,
-        label: key.label || (key.version ? nil : "production")
+        label: key.resolved_label
       )
     end
 
     def fallback_event_payload(key, fetch_context, error)
-      {
-        name: key.name,
-        version: key.version,
-        label: key.label || (key.version ? nil : "production"),
-        logical_key: key.logical_key,
-        storage_key: key.storage_key,
-        backend: api_client.prompt_cache_stats.fetch(:backend),
-        cache_status: fallback_cache_status(fetch_context.fetch(:cache_ttl)),
-        source: :fallback,
+      api_client.prompt_event_payload(
+        key,
+        fallback_cache_status(fetch_context.fetch(:cache_ttl)),
+        :fallback,
         error_class: error.class.name,
         error_message: error.message
-      }
+      )
     end
 
     def fallback_cache_status(cache_ttl)
diff --git a/lib/langfuse/prompt_fetch_result.rb b/lib/langfuse/prompt_fetch_result.rb
index b11e511..3ef5021 100644
--- a/lib/langfuse/prompt_fetch_result.rb
+++ b/lib/langfuse/prompt_fetch_result.rb
@@ -41,8 +41,8 @@ def initialize(prompt:, logical_key:, storage_key:, cache_status:, source:, name
       @prompt = prompt
       @logical_key = logical_key
       @storage_key = storage_key
-      @cache_status = cache_status.to_sym
-      @source = source.to_sym
+      @cache_status = cache_status
+      @source = source
       @name = name
       @version = version
       @label = label
@@ -107,6 +107,14 @@ def initialize(name:, logical_key:, storage_key:, version: nil, label: nil)
       @storage_key = storage_key
     end
 
+    # Resolve the effective label, defaulting to "production" when neither
+    # an explicit label nor a version was specified.
+    #
+    # @return [String, nil] Effective label
+    def resolved_label
+      label || (version ? nil : "production")
+    end
+
     # @return [Hash] Cache key data as a hash
     def to_h
       {

From 36218b27cfd94de89f30002c58881bdfd7d6465f Mon Sep 17 00:00:00 2001
From: kadekillary <kadekillary@pm.me>
Date: Mon, 4 May 2026 04:01:14 -0600
Subject: [PATCH 3/7] fix(prompts): address prompt cache review comments

---
 lib/langfuse/client.rb                 | 12 +++++++-----
 lib/langfuse/stale_while_revalidate.rb |  4 ++--
 spec/langfuse/client_spec.rb           |  7 +++++++
 spec/langfuse/prompt_cache_spec.rb     | 18 ++++++++++++++++++
 4 files changed, 34 insertions(+), 7 deletions(-)

diff --git a/lib/langfuse/client.rb b/lib/langfuse/client.rb
index 9c50cda..f059f37 100644
--- a/lib/langfuse/client.rb
+++ b/lib/langfuse/client.rb
@@ -114,7 +114,7 @@ def get_prompt_result(name, version: nil, label: nil, fallback: nil, type: nil,
 
       # Log warning and return fallback
       config.logger.warn("Langfuse API error for prompt '#{name}': #{e.message}. Using fallback.")
-      build_fallback_prompt_result(name, version, label, fallback, type, cache_ttl: cache_ttl, error: e)
+      build_fallback_prompt_result(name, version, label, fallback, type, { cache_ttl: cache_ttl, error: e })
     end
 
     # Refresh a prompt from the API, optionally writing through to cache.
@@ -859,17 +859,19 @@ def build_client_fetch_result(api_result, prompt_client)
     end
 
     def build_fallback_prompt_result(name, version, label, fallback, type, fetch_context)
+      cache_ttl = fetch_context.fetch(:cache_ttl)
+      error = fetch_context.fetch(:error)
       prompt_client = build_fallback_prompt_client(name, fallback, type)
       key = api_client.prompt_cache_key(name, version: version, label: label)
       api_client.emit_prompt_cache_event(
         :fallback,
-        fallback_event_payload(key, fetch_context, fetch_context.fetch(:error))
+        fallback_event_payload(key, cache_ttl, error)
       )
       PromptFetchResult.new(
         prompt: prompt_client,
         logical_key: key.logical_key,
         storage_key: key.storage_key,
-        cache_status: fallback_cache_status(fetch_context.fetch(:cache_ttl)),
+        cache_status: fallback_cache_status(cache_ttl),
         source: :fallback,
         name: name,
         version: version || prompt_client.version,
@@ -877,10 +879,10 @@ def build_fallback_prompt_result(name, version, label, fallback, type, fetch_con
       )
     end
 
-    def fallback_event_payload(key, fetch_context, error)
+    def fallback_event_payload(key, cache_ttl, error)
       api_client.prompt_event_payload(
         key,
-        fallback_cache_status(fetch_context.fetch(:cache_ttl)),
+        fallback_cache_status(cache_ttl),
         :fallback,
         error_class: error.class.name,
         error_message: error.message
diff --git a/lib/langfuse/stale_while_revalidate.rb b/lib/langfuse/stale_while_revalidate.rb
index c469a95..11afa03 100644
--- a/lib/langfuse/stale_while_revalidate.rb
+++ b/lib/langfuse/stale_while_revalidate.rb
@@ -181,7 +181,7 @@ def schedule_refresh(key, ttl: nil, stale_ttl: nil, on_success: nil, on_failure:
       return false unless acquire_lock(lock_key)
 
       @thread_pool.post do
-        value = yield block
+        value = block.call
         set_cache_entry(key, value, ttl: ttl, stale_ttl: stale_ttl)
         on_success&.call(value)
       rescue StandardError => e
@@ -201,7 +201,7 @@ def schedule_refresh(key, ttl: nil, stale_ttl: nil, on_success: nil, on_failure:
     # @yield Block to execute to fetch fresh data
     # @return [Object] Freshly fetched value
     def fetch_and_cache(key, ttl: nil, stale_ttl: nil, &block)
-      value = yield block
+      value = block.call
       set_cache_entry(key, value, ttl: ttl, stale_ttl: stale_ttl)
     end
 
diff --git a/spec/langfuse/client_spec.rb b/spec/langfuse/client_spec.rb
index 917b31d..6903f0d 100644
--- a/spec/langfuse/client_spec.rb
+++ b/spec/langfuse/client_spec.rb
@@ -603,6 +603,13 @@ def self.cache
           expect(result.logical_key).to eq("missing:production")
         end
 
+        it "returns bypass metadata when fallback is used during a cache bypass" do
+          result = client.get_prompt_result("missing", fallback: "Hello!", type: :text, cache_ttl: 0)
+
+          expect(result.fallback?).to be(true)
+          expect(result.cache_status).to eq(:bypass)
+        end
+
         it "raises error when no fallback provided" do
           expect do
             client.get_prompt("missing")
diff --git a/spec/langfuse/prompt_cache_spec.rb b/spec/langfuse/prompt_cache_spec.rb
index 33e5a14..484a7e5 100644
--- a/spec/langfuse/prompt_cache_spec.rb
+++ b/spec/langfuse/prompt_cache_spec.rb
@@ -128,6 +128,24 @@
         cache.fetch_with_stale_while_revalidate("test") { "value" }
       end
 
+      it "fetches misses with strict zero-arity callables" do
+        cache = described_class.new(ttl: 60, stale_ttl: 120)
+        fetch_prompt = -> { "value" }
+
+        expect(cache.fetch_with_stale_while_revalidate("test", &fetch_prompt)).to eq("value")
+      end
+
+      it "refreshes stale entries with strict zero-arity callables" do
+        cache = described_class.new(ttl: 60, stale_ttl: 120)
+        thread_pool = cache.instance_variable_get(:@thread_pool)
+        allow(thread_pool).to receive(:post).and_yield
+        fetch_prompt = -> { "refreshed" }
+
+        cache.send(:schedule_refresh, "test", &fetch_prompt)
+
+        expect(cache.entry("test").data).to eq("refreshed")
+      end
+
       it "accepts custom refresh_threads parameter" do
         # Can't verify thread pool size directly, but can verify it doesn't error
         expect do

From dbca6ebfe363629dc814191ff52dd4706222f35e Mon Sep 17 00:00:00 2001
From: kadekillary <kadekillary@pm.me>
Date: Mon, 4 May 2026 04:12:39 -0600
Subject: [PATCH 4/7] refactor(prompts): use keyword args for fallback context

---
 lib/langfuse/client.rb | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/lib/langfuse/client.rb b/lib/langfuse/client.rb
index f059f37..7602d8e 100644
--- a/lib/langfuse/client.rb
+++ b/lib/langfuse/client.rb
@@ -114,7 +114,7 @@ def get_prompt_result(name, version: nil, label: nil, fallback: nil, type: nil,
 
       # Log warning and return fallback
       config.logger.warn("Langfuse API error for prompt '#{name}': #{e.message}. Using fallback.")
-      build_fallback_prompt_result(name, version, label, fallback, type, { cache_ttl: cache_ttl, error: e })
+      build_fallback_prompt_result(name, version, label, fallback, type, cache_ttl: cache_ttl, error: e)
     end
 
     # Refresh a prompt from the API, optionally writing through to cache.
@@ -858,9 +858,8 @@ def build_client_fetch_result(api_result, prompt_client)
       )
     end
 
-    def build_fallback_prompt_result(name, version, label, fallback, type, fetch_context)
-      cache_ttl = fetch_context.fetch(:cache_ttl)
-      error = fetch_context.fetch(:error)
+    # rubocop:disable Metrics/ParameterLists
+    def build_fallback_prompt_result(name, version, label, fallback, type, cache_ttl:, error:)
       prompt_client = build_fallback_prompt_client(name, fallback, type)
       key = api_client.prompt_cache_key(name, version: version, label: label)
       api_client.emit_prompt_cache_event(
@@ -878,6 +877,7 @@ def build_fallback_prompt_result(name, version, label, fallback, type, fetch_con
         label: key.resolved_label
       )
     end
+    # rubocop:enable Metrics/ParameterLists
 
     def fallback_event_payload(key, cache_ttl, error)
       api_client.prompt_event_payload(

From 88372675e1ee70e09cbf342cf5baec17ce4475df Mon Sep 17 00:00:00 2001
From: kadekillary <kadekillary@pm.me>
Date: Mon, 4 May 2026 14:01:39 -0600
Subject: [PATCH 5/7] fix(prompts): surface cache generation increment failures

---
 docs/CACHING.md                           |  2 ++
 lib/langfuse/rails_cache_adapter.rb       |  3 ++-
 spec/langfuse/rails_cache_adapter_spec.rb | 24 +++++++++++++++++++++++
 3 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/docs/CACHING.md b/docs/CACHING.md
index 7717fa2..621f2f0 100644
--- a/docs/CACHING.md
+++ b/docs/CACHING.md
@@ -73,6 +73,8 @@ Cache identity is prompt name plus version or label. When neither is supplied, t
 
 Name-wide invalidation and whole-cache clear use generation counters. Old Rails.cache entries are not physically scanned or deleted; they become unreachable under the new generated storage keys and expire by TTL.
 
+Automatic mutation invalidation only covers `create_prompt` and `update_prompt` calls made by the current SDK process. Prompt edits made in the Langfuse UI or by other SDKs become visible through TTL expiry, `refresh_prompt`, or explicit invalidation.
+
 ### Cache Events
 
 Set `prompt_cache_observer` to receive cache events without binding the SDK to your metric names:
diff --git a/lib/langfuse/rails_cache_adapter.rb b/lib/langfuse/rails_cache_adapter.rb
index 8aaff06..ffea7d8 100644
--- a/lib/langfuse/rails_cache_adapter.rb
+++ b/lib/langfuse/rails_cache_adapter.rb
@@ -352,7 +352,8 @@ def increment_generation(key)
 
       Rails.cache.write(key, 0, unless_exist: true)
       Rails.cache.increment(key, 1)
-    rescue StandardError
+    rescue StandardError => e
+      logger.warn("Langfuse prompt cache generation increment failed for key '#{key}': #{e.class} - #{e.message}")
       nil
     end
 
diff --git a/spec/langfuse/rails_cache_adapter_spec.rb b/spec/langfuse/rails_cache_adapter_spec.rb
index d4165f2..2e539cc 100644
--- a/spec/langfuse/rails_cache_adapter_spec.rb
+++ b/spec/langfuse/rails_cache_adapter_spec.rb
@@ -261,6 +261,30 @@ class << self
     end
   end
 
+  describe "#invalidate_name" do
+    let(:logger) { instance_double(Logger, warn: nil) }
+    let(:adapter) { described_class.new(logger: logger) }
+    let(:generation_key) do
+      encoded_name = Base64.urlsafe_encode64("greeting", padding: false)
+      "langfuse:__prompt_cache_generation__:name:#{encoded_name}"
+    end
+
+    it "warns when atomic generation increment fails" do
+      allow(mock_cache).to receive(:respond_to?).with(:increment).and_return(true)
+      expect(mock_cache).to receive(:write)
+        .with(generation_key, 0, unless_exist: true)
+        .and_raise(StandardError, "connection refused")
+      warning = "Langfuse prompt cache generation increment failed for key '#{generation_key}': " \
+                "StandardError - connection refused"
+      expect(logger).to receive(:warn)
+        .with(warning)
+      expect(mock_cache).to receive(:read).with(generation_key).and_return(2)
+      expect(mock_cache).to receive(:write).with(generation_key, 3).and_return(true)
+
+      expect(adapter.invalidate_name("greeting")).to eq(3)
+    end
+  end
+
   describe "#size" do
     let(:adapter) { described_class.new }
 

From bee4e78601d36fe75cce4735399f9aab7b4775c3 Mon Sep 17 00:00:00 2001
From: kadekillary <kadekillary@pm.me>
Date: Mon, 4 May 2026 14:23:40 -0600
Subject: [PATCH 6/7] perf(prompts): memoize Rails cache generations

---
 lib/langfuse/rails_cache_adapter.rb       | 47 ++++++++++++++++++++++-
 spec/langfuse/rails_cache_adapter_spec.rb | 36 +++++++++++++++++
 2 files changed, 81 insertions(+), 2 deletions(-)

diff --git a/lib/langfuse/rails_cache_adapter.rb b/lib/langfuse/rails_cache_adapter.rb
index ffea7d8..9986e7f 100644
--- a/lib/langfuse/rails_cache_adapter.rb
+++ b/lib/langfuse/rails_cache_adapter.rb
@@ -18,6 +18,8 @@ module Langfuse
   class RailsCacheAdapter
     include StaleWhileRevalidate
 
+    GENERATION_MEMO_TTL_SECONDS = 1.0
+
     # @return [Integer] Time-to-live in seconds
     attr_reader :ttl
 
@@ -55,6 +57,8 @@ def initialize(ttl: 60, namespace: "langfuse", lock_timeout: 10, stale_ttl: 0, r
       @lock_timeout = lock_timeout
       @stale_ttl = stale_ttl
       @logger = logger
+      @generation_memo = {}
+      @generation_memo_mutex = Mutex.new
       initialize_swr(refresh_threads: refresh_threads) if swr_enabled?
     end
 
@@ -105,6 +109,7 @@ def delete(key)
     def clear
       # Delete all keys matching the namespace pattern
       Rails.cache.delete_matched("#{namespace}:*")
+      clear_generation_memo
     end
 
     # Logically invalidate every generated storage key.
@@ -335,15 +340,25 @@ def name_generation_key(name)
     end
 
     def generation_value(key)
-      Rails.cache.read(key).to_i
+      now = monotonic_time
+      memoized = memoized_generation_value(key, now)
+      return memoized unless memoized.nil?
+
+      Rails.cache.read(key).to_i.tap do |value|
+        memoize_generation_value(key, value, now)
+      end
     end
 
     def bump_generation(key)
       incremented = increment_generation(key)
-      return incremented if incremented
+      if incremented
+        memoize_generation_value(key, incremented.to_i)
+        return incremented
+      end
 
       new_value = generation_value(key) + 1
       Rails.cache.write(key, new_value)
+      memoize_generation_value(key, new_value)
       new_value
     end
 
@@ -357,6 +372,34 @@ def increment_generation(key)
       nil
     end
 
+    def memoized_generation_value(key, now)
+      @generation_memo_mutex.synchronize do
+        entry = @generation_memo[key]
+        return nil unless entry
+
+        return entry.fetch(:value) if now < entry.fetch(:expires_at)
+
+        @generation_memo.delete(key)
+        nil
+      end
+    end
+
+    def memoize_generation_value(key, value, now = monotonic_time)
+      @generation_memo_mutex.synchronize do
+        @generation_memo[key] = { value: value, expires_at: now + GENERATION_MEMO_TTL_SECONDS }
+      end
+    end
+
+    def clear_generation_memo
+      @generation_memo_mutex.synchronize do
+        @generation_memo.clear
+      end
+    end
+
+    def monotonic_time
+      Process.clock_gettime(Process::CLOCK_MONOTONIC)
+    end
+
     # Validate that Rails.cache is available
     #
     # @raise [ConfigurationError] if Rails.cache is not available
diff --git a/spec/langfuse/rails_cache_adapter_spec.rb b/spec/langfuse/rails_cache_adapter_spec.rb
index 2e539cc..1c3bfdb 100644
--- a/spec/langfuse/rails_cache_adapter_spec.rb
+++ b/spec/langfuse/rails_cache_adapter_spec.rb
@@ -285,6 +285,42 @@ class << self
     end
   end
 
+  describe "#storage_key" do
+    let(:adapter) { described_class.new }
+    let(:logical_key) { "greeting:production" }
+    let(:encoded_name) { Base64.urlsafe_encode64("greeting", padding: false) }
+    let(:global_generation_key) { "langfuse:__prompt_cache_generation__:global" }
+    let(:name_generation_key) { "langfuse:__prompt_cache_generation__:name:#{encoded_name}" }
+
+    it "memoizes generation reads for repeated key inspection" do
+      expect(mock_cache).to receive(:read).with(global_generation_key).once.and_return(4)
+      expect(mock_cache).to receive(:read).with(name_generation_key).once.and_return(7)
+
+      first_key = adapter.storage_key(logical_key, name: "greeting")
+      second_key = adapter.storage_key(logical_key, name: "greeting")
+
+      expect(second_key).to eq(first_key)
+      expect(first_key).to eq("langfuse:g4:n#{encoded_name}:7:greeting:production")
+    end
+
+    it "updates memoized generations after same-process name invalidation" do
+      allow(mock_cache).to receive(:respond_to?).with(:increment).and_return(true)
+      expect(mock_cache).to receive(:read).with(global_generation_key).once.and_return(0)
+      expect(mock_cache).to receive(:read).with(name_generation_key).once.and_return(0)
+      expect(mock_cache).to receive(:write)
+        .with(name_generation_key, 0, unless_exist: true)
+        .and_return(true)
+      expect(mock_cache).to receive(:increment).with(name_generation_key, 1).and_return(1)
+
+      first_key = adapter.storage_key(logical_key, name: "greeting")
+      adapter.invalidate_name("greeting")
+      second_key = adapter.storage_key(logical_key, name: "greeting")
+
+      expect(first_key).to eq("langfuse:g0:n#{encoded_name}:0:greeting:production")
+      expect(second_key).to eq("langfuse:g0:n#{encoded_name}:1:greeting:production")
+    end
+  end
+
   describe "#size" do
     let(:adapter) { described_class.new }
 

From cf77981430aa7cbf5301c249a2a573e599f9dcce Mon Sep 17 00:00:00 2001
From: kadekillary <kadekillary@pm.me>
Date: Tue, 5 May 2026 00:29:46 -0600
Subject: [PATCH 7/7] docs(prompts): align cache operation guidance

---
 docs/API_REFERENCE.md  |  3 ++-
 docs/CACHING.md        | 34 +++++++++++++++++++++++-----------
 docs/CONFIGURATION.md  |  5 +++--
 docs/ERROR_HANDLING.md |  9 +++++----
 docs/MIGRATION.md      |  4 ++--
 docs/RAILS.md          | 11 +++++++----
 6 files changed, 42 insertions(+), 24 deletions(-)

diff --git a/docs/API_REFERENCE.md b/docs/API_REFERENCE.md
index 745d803..17309ec 100644
--- a/docs/API_REFERENCE.md
+++ b/docs/API_REFERENCE.md
@@ -42,11 +42,12 @@ Block receives a `Langfuse::Config` object with these properties:
 | `timeout`                      | Integer | No       | `5`                            | HTTP timeout (seconds)            |
 | `cache_ttl`                    | Integer | No       | `60`                           | Prompt cache TTL (seconds)        |
 | `cache_max_size`               | Integer | No       | `1000`                         | Max cached prompts                |
-| `cache_backend`                | Symbol  | No       | `:memory`                      | `:memory` or `:rails`             |
+| `cache_backend`                | Symbol  | No       | `:memory`                      | `:memory`, `:rails`, or `:auto`   |
 | `cache_lock_timeout`           | Integer | No       | `10`                           | Lock timeout (seconds)            |
 | `cache_stale_while_revalidate` | Boolean | No       | `false`                        | Advisory SWR intent flag (effective activation depends on `cache_stale_ttl`) |
 | `cache_stale_ttl`              | Integer or `:indefinite` | No | `0`                  | Stale TTL (seconds, `>0` enables SWR) |
 | `cache_refresh_threads`        | Integer | No       | `5`                            | Background refresh threads        |
+| `prompt_cache_observer`        | Callable | No      | `nil`                          | Prompt cache event hook           |
 | `batch_size`                   | Integer | No       | `50`                           | Score + trace export batch size   |
 | `flush_interval`               | Integer | No       | `10`                           | Score + trace export interval (s) |
 | `sample_rate`                  | Float   | No       | `1.0`                          | Trace + trace-linked score sampling rate (`0.0..1.0`) |
diff --git a/docs/CACHING.md b/docs/CACHING.md
index 621f2f0..cefbb65 100644
--- a/docs/CACHING.md
+++ b/docs/CACHING.md
@@ -566,10 +566,13 @@ RUN bundle exec rake langfuse:warm_cache_all
 
 See [CONFIGURATION.md](CONFIGURATION.md) for all cache-related configuration options:
 
-- `cache_backend` - `:memory` or `:rails`
+- `cache_backend` - `:memory`, `:rails`, or `:auto`
 - `cache_ttl` - Time-to-live in seconds
 - `cache_max_size` - Max prompts (in-memory only)
 - `cache_lock_timeout` - Lock timeout (Rails.cache only)
+- `cache_stale_ttl` - Stale serving window; `> 0` enables SWR
+- `cache_refresh_threads` - Background refresh worker count
+- `prompt_cache_observer` - Optional cache event hook
 
 ## Performance Considerations
 
@@ -626,12 +629,11 @@ config.cache_backend = :rails
 ### 2. Enable SWR for Production
 
 ```ruby
-# Development: disabled for predictable behavior
-config.cache_stale_while_revalidate = !Rails.env.development?
-
-# Production: enabled for best performance
 if Rails.env.production?
+  config.cache_stale_while_revalidate = true  # Advisory intent flag
   config.cache_stale_ttl = config.cache_ttl  # Set explicitly (common default)
+else
+  config.cache_stale_ttl = 0  # Disabled for predictable prompt iteration
 end
 ```
 
@@ -655,8 +657,14 @@ bundle exec rake langfuse:warm_cache_all
 ### 5. Monitor Cache Performance
 
 ```ruby
-# Log cache hits/misses
-Rails.logger.info "Fetching prompt: #{name} (cache: #{cache_hit? ? 'HIT' : 'MISS'})"
+config.prompt_cache_observer = lambda do |event, payload|
+  Rails.logger.info(
+    event: event,
+    prompt: payload[:name],
+    status: payload[:cache_status],
+    source: payload[:source]
+  )
+end
 ```
 
 ### 6. Handle Cache Failures Gracefully
@@ -674,9 +682,11 @@ prompt = Langfuse.client.get_prompt(
 
 ```ruby
 # Rails console
-Langfuse.client.api_client.cache&.clear
+Langfuse.client.invalidate_prompt_cache("greeting", label: "production")
+Langfuse.client.invalidate_prompt_cache_by_name("greeting")
+Langfuse.client.clear_prompt_cache
 
-# Or use rake task
+# Or use the rake task
 rake langfuse:clear_cache
 ```
 
@@ -725,8 +735,10 @@ end
 **Solutions**:
 
 1. Wait for TTL to expire
-2. Clear cache manually: `rake langfuse:clear_cache`
-3. Reduce `cache_ttl` in development
+2. Refresh one prompt now: `Langfuse.client.refresh_prompt("greeting")`
+3. Invalidate cached prompt entries: `Langfuse.client.invalidate_prompt_cache_by_name("greeting")`
+4. Clear the prompt cache namespace: `Langfuse.client.clear_prompt_cache`
+5. Reduce `cache_ttl` in development
 
 ### Stampede Protection Not Working
 
diff --git a/docs/CONFIGURATION.md b/docs/CONFIGURATION.md
index 6eb4714..ffbe2fd 100644
--- a/docs/CONFIGURATION.md
+++ b/docs/CONFIGURATION.md
@@ -618,8 +618,9 @@ Validation rules:
 
 - `public_key` must be present
 - `secret_key` must be present
-- `cache_backend` must be `:memory` or `:rails`
-- If `:rails`, Rails must be defined
+- `cache_backend` must be `:memory`, `:rails`, or `:auto`
+- If `:rails` is selected, or `:auto` resolves to `:rails`, Rails and `Rails.cache` must be available
+- `prompt_cache_observer` must respond to `#call` (if set)
 - `should_export_span` must respond to `#call` (if set)
 - `mask` must respond to `#call` (if set)
 
diff --git a/docs/ERROR_HANDLING.md b/docs/ERROR_HANDLING.md
index 9d536d4..82130ad 100644
--- a/docs/ERROR_HANDLING.md
+++ b/docs/ERROR_HANDLING.md
@@ -50,8 +50,8 @@ end
 **Validation checklist:**
 - `public_key` present and starts with `pk-lf-`
 - `secret_key` present and starts with `sk-lf-`
-- `cache_backend` is `:memory` or `:rails`
-- If `:rails`, Rails is defined
+- `cache_backend` is `:memory`, `:rails`, or `:auto`
+- If `:rails` is selected, or `:auto` resolves to `:rails`, Rails and `Rails.cache` are available
 
 ### `Langfuse::UnauthorizedError`
 
@@ -432,8 +432,9 @@ puts config.inspect
 ### Check Cache State
 
 ```ruby
-cache = Langfuse.client.api_client.cache
-puts "Cache backend: #{cache&.class || 'disabled'}"
+stats = Langfuse.client.prompt_cache_stats
+puts "Cache backend: #{stats[:backend]}"
+puts "Cache enabled: #{stats[:enabled]}"
 ```
 
 ### Test Credentials
diff --git a/docs/MIGRATION.md b/docs/MIGRATION.md
index 6abaca3..12b077c 100644
--- a/docs/MIGRATION.md
+++ b/docs/MIGRATION.md
@@ -685,8 +685,8 @@ prompts:
 ```ruby
 # In Rails console
 Langfuse.reset!  # Clears everything
-# Or just clear cache
-Langfuse.client.api_client.cache&.clear
+# Or just clear the prompt cache namespace
+Langfuse.client.clear_prompt_cache
 ```
 
 ### Problem: Variables not substituting correctly
diff --git a/docs/RAILS.md b/docs/RAILS.md
index 9b29bf9..154c0cd 100644
--- a/docs/RAILS.md
+++ b/docs/RAILS.md
@@ -253,7 +253,7 @@ Useful console checks:
 
 ```ruby
 Langfuse.configuration
-Langfuse.client.api_client.cache
+Langfuse.client.prompt_cache_stats
 ```
 
 ## Troubleshooting
@@ -263,11 +263,14 @@ Langfuse.client.api_client.cache
 The usual problem is stale cache, not a broken prompt API.
 
 1. Wait for `cache_ttl` to expire.
-2. Clear the cache entry store directly if you need to inspect fresh state now.
-3. Lower `cache_ttl` in development if you are iterating quickly.
+2. Refresh the specific prompt if you need fresh state now.
+3. Invalidate the prompt name or clear the prompt cache namespace.
+4. Lower `cache_ttl` in development if you are iterating quickly.
 
 ```ruby
-Langfuse.client.api_client.cache&.clear
+Langfuse.client.refresh_prompt("greeting", label: "production")
+Langfuse.client.invalidate_prompt_cache_by_name("greeting")
+Langfuse.client.clear_prompt_cache
 ```
 
 ### Traces Missing Entirely