diff --git a/docs/getting-started/installation.md b/docs/getting-started/installation.md
index 89c8964..e567704 100644
--- a/docs/getting-started/installation.md
+++ b/docs/getting-started/installation.md
@@ -79,7 +79,7 @@ Development dependencies include:
 ```python
 import featcopilot
 print(featcopilot.__version__)
-# Output: 0.1.0
+# Output: 0.3.7
 
 from featcopilot import AutoFeatureEngineer
 print("Installation successful!")
diff --git a/docs/user-guide/cli.md b/docs/user-guide/cli.md
new file mode 100644
index 0000000..ed68367
--- /dev/null
+++ b/docs/user-guide/cli.md
@@ -0,0 +1,236 @@
+# Command-Line Interface
+
+FeatCopilot ships a stable, agent-friendly `featcopilot` CLI for using the
+library from shells, CI pipelines, and **agentic / LLM tool-use** workflows
+without writing Python glue. All subcommands accept `--json` for
+machine-readable stdout; user-facing errors are written to **stderr** with
+a non-zero exit code so that automation can parse failures
+deterministically.
+
+The CLI is installed automatically with the package via the
+`[project.scripts]` entry point (`featcopilot = "featcopilot.cli:main"`),
+so after `pip install featcopilot` the `featcopilot` command is available
+on `$PATH`. The equivalent module form `python -m featcopilot ...` always
+works regardless of how the package was installed.
+
+## Subcommands
+
+| Command | Purpose |
+| --- | --- |
+| `featcopilot info` | Print version, supported engines, selection methods, leakage guards, I/O formats, and a runtime `parquet_available` flag. |
+| `featcopilot transform` | Read a CSV / Parquet / JSON file, run [`AutoFeatureEngineer`](../user-guide/overview.md), and write engineered features to an output file. |
+| `featcopilot explain` | Fit and print a JSON document with `{name, explanation, code}` per feature for downstream LLM consumption (no output file is written). |
+
+Run any subcommand with `--help` to see the full flag list:
+
+```bash
+featcopilot --help
+featcopilot transform --help
+featcopilot explain --help
+```
+
+## Output contract
+
+All three subcommands honor the same agent-friendly contract:
+
+* **`stdout`** carries the result. With `--json` (always implicit for
+  `explain`), exactly one JSON document is written.
+* **`stderr`** is reserved for failures. A successful run keeps `stderr`
+  empty even when `AutoFeatureEngineer` emits leakage warnings or
+  `verbose` logger output ─ those are surfaced via the JSON payload's
+  `warnings` field instead. This same contract covers warnings emitted
+  during pandas / pyarrow read or write phases (e.g. `DtypeWarning` on
+  mixed-type CSVs, `FutureWarning` from a successful Parquet write):
+  they are routed to the JSON `warnings` field, never to `stderr`.
+* **Exit codes**: `0` on success; `2` for user-input errors (missing
+  files, malformed config, unknown target, etc.); `1` for unexpected
+  internal errors.
+
+## `featcopilot info`
+
+Discover capabilities without running an engineer:
+
+```bash
+featcopilot info --json
+```
+
+Sample (truncated) output:
+
+```json
+{
+  "version": "0.3.7",
+  "supported_engines": ["llm", "relational", "tabular", "text", "timeseries"],
+  "supported_selection_methods": [
+    "chi2",
+    "correlation",
+    "f_test",
+    "importance",
+    "mutual_info",
+    "xgboost"
+  ],
+  "supported_leakage_guards": ["off", "raise", "warn"],
+  "supported_input_formats": ["csv", "json"],
+  "supported_output_formats": ["csv", "json"],
+  "parquet_available": false
+}
+```
+
+When a parquet engine (`pyarrow` or `fastparquet`) IS importable in the
+current environment, `"parquet"` is added to `supported_input_formats`
+and `supported_output_formats` (in source order, so the lists become
+`["csv", "parquet", "json"]`) and `parquet_available` flips to `true`.
+
+`parquet_available` reflects whether `pyarrow` or `fastparquet` is
+importable in the current environment. The base FeatCopilot install does
+not pin a parquet engine; install one with
+`pip install pyarrow` (or `fastparquet`) to enable Parquet I/O.
+
+## `featcopilot transform`
+
+Run feature engineering on a tabular input and write the engineered
+features to disk:
+
+```bash
+featcopilot transform \
+    --input data.csv --target label --output features.csv \
+    --engines tabular --max-features 50 \
+    --json
+```
+
+Common flags:
+
+| Flag | Purpose |
+| --- | --- |
+| `--input / -i` | Path to input file (CSV / Parquet / JSON). Required. |
+| `--output / -o` | Path to output file. Required. |
+| `--target / -t` | Target column. Required when feature selection is applied (i.e. when `--max-features` / config `max_features` is set). |
+| `--input-format` / `--output-format` | Override format detection (`csv` / `parquet` / `json`). |
+| `--engines` | One or more engines to enable (default: `tabular`). |
+| `--max-features N` | Cap on engine output / selection. Forwarded both to engine constructors and to the selector. |
+| `--no-selection` | Skip feature selection entirely (raw feature generation). |
+| `--selection-methods` | Override the default `mutual_info importance` selection set. |
+| `--leakage-guard` | How to handle suspicious column names: `warn` (default — log a warning and continue), `raise` (hard-fail with an error), or `off` (disable the check). |
+| `--include-target` | Re-attach the target column to the output file (collision-safe). |
+| `--task-description` | Free-form ML task description forwarded to LLM-aware engines. |
+| `--config FILE` | JSON config with nested keys (e.g. `llm_config`, `selection_methods`). CLI flags override config values. |
+| `--verbose / --no-verbose` | Toggle verbose logging. With `--json`, log records are routed to the JSON `warnings` field rather than `stderr`. |
+| `--gate-n-jobs` | Parallelism for the do-no-harm gate's RF (default 1; `-1` = all cores). |
+| `--json` | Emit a one-line JSON status object on stdout instead of human-readable text. |
+
+A successful `--json` run prints something like:
+
+```json
+{
+  "status": "ok",
+  "input": "data.csv",
+  "output": "features.csv",
+  "input_format": "csv",
+  "output_format": "csv",
+  "n_rows": 1000,
+  "n_features": 47,
+  "n_input_columns": 12,
+  "n_generated_features": 47,
+  "engines": ["tabular"],
+  "selection_methods": ["mutual_info", "importance"],
+  "max_features": 50,
+  "target": "label",
+  "selection_applied": true,
+  "warnings": []
+}
+```
+
+## `featcopilot explain`
+
+Fit the engineer (without writing any output file) and print a JSON
+catalog of generated features for downstream LLM consumption:
+
+```bash
+featcopilot explain --input data.csv --target label
+```
+
+Each entry in the `features` array contains the feature `name`, an
+LLM-style natural-language `explanation`, and the executable Python
+`code` used to produce it.
+
+`explain` defaults to running on the **full** input so the metadata is
+a faithful description of what a corresponding `transform` would
+generate. Some engines (notably the tabular engine's categorical
+encoding) consult per-row / per-category statistics when planning
+features, so blind subsampling can silently change results. For very
+large inputs where metadata-only `explain` should not pay full memory
+or compute cost, opt in with:
+
+```bash
+featcopilot explain --input big.csv --target label --explain-sample-size 5000
+```
+
+The cap is a deterministic *head slice* (the first N rows), threaded
+through `pd.read_csv(nrows=N)` for CSV so memory is bounded natively.
+For Parquet / JSON pandas has no native row-limit, so the file is
+fully read and then truncated; a `UserWarning` explaining the
+limitation is emitted (and surfaced in the JSON `warnings` field) only
+when the cap actually truncates the input.
+
+## Configuration files
+
+Pass `--config config.json` to provide nested keys that don't have
+matching CLI flags, such as the `llm_config` engine kwargs:
+
+```json
+{
+  "engines": ["tabular", "llm"],
+  "max_features": 80,
+  "selection_methods": ["mutual_info", "importance"],
+  "llm_config": {
+    "backend": "litellm",
+    "model": "gpt-4o",
+    "max_suggestions": 20
+  }
+}
+```
+
+Explicit CLI flags override values from the config file. Any malformed
+scalar (e.g. `"max_features": "5"`, `"verbose": "false"`) is rejected
+with a clean exit-2 error rather than failing later inside the
+engineer.
+
+## Parquet I/O
+
+The base FeatCopilot install does not pin a parquet engine. To use
+`--input file.parquet` / `--output file.parquet` (or the `parquet`
+value of `--input-format` / `--output-format`), install one of:
+
+```bash
+pip install pyarrow      # recommended
+# or
+pip install fastparquet
+```
+
+Confirm with `featcopilot info --json`:
+
+```json
+{ "parquet_available": true, ... }
+```
+
+If neither engine is installed, attempting Parquet I/O fails with a
+clean exit-2 error pointing at the missing dependency.
+
+## Agentic-usage tips
+
+* Always pass `--json`. Treat anything on `stderr` as a hard failure;
+  treat anything on `stdout` as the JSON result.
+* Treat the JSON `warnings` field as a list of human-readable
+  diagnostic strings ─ it is non-empty for `transform` runs that
+  generated leakage / mock-mode / sampling notices, and empty for
+  fully clean runs.
+* For long-running batch jobs, prefer `featcopilot transform` to
+  `python -m featcopilot transform` only because the former is shorter;
+  both invoke the exact same entry point.
+
+## See also
+
+* [Overview](overview.md) ─ the underlying `AutoFeatureEngineer` API.
+* [Engines](engines.md) ─ what each engine generates.
+* [LLM Features](llm-features.md) ─ configuring the LLM backend (provide
+  an `llm_config` object inside the JSON file passed to `--config`, as
+  shown in the [Configuration files](#configuration-files) section above).
diff --git a/featcopilot/core/feature.py b/featcopilot/core/feature.py
index 4e3e933..dc13209 100644
--- a/featcopilot/core/feature.py
+++ b/featcopilot/core/feature.py
@@ -12,6 +12,45 @@
 logger = get_logger(__name__)
 
 
+# Curated set of safe Python builtins exposed to ``Feature.compute``'s
+# stored code. Without this whitelist (i.e. with ``{"__builtins__": {}}``)
+# even basic idioms like ``len(df)``, ``range(...)``, ``sum(...)``, or
+# ``int(x)`` raise ``NameError`` at exec time, which means a feature whose
+# code legitimately uses a Python builtin crashes during ``compute`` even
+# though the snippet is otherwise valid. The set mirrors the one used by
+# :class:`featcopilot.core.transform_rule.TransformRule` so both code
+# execution paths agree on what is safe.
+_SAFE_BUILTINS: dict[str, Any] = {
+    "len": len,
+    "sum": sum,
+    "max": max,
+    "min": min,
+    "int": int,
+    "float": float,
+    "str": str,
+    "bool": bool,
+    "abs": abs,
+    "round": round,
+    "pow": pow,
+    "range": range,
+    "list": list,
+    "dict": dict,
+    "set": set,
+    "tuple": tuple,
+    "sorted": sorted,
+    "reversed": reversed,
+    "enumerate": enumerate,
+    "zip": zip,
+    "any": any,
+    "all": all,
+    "map": map,
+    "filter": filter,
+    "isinstance": isinstance,
+    "hasattr": hasattr,
+    "getattr": getattr,
+}
+
+
 class FeatureType(Enum):
     """Types of features."""
 
@@ -109,6 +148,42 @@ def compute(self, df: pd.DataFrame) -> pd.Series:
         """
         Compute feature values from DataFrame using stored code.
 
+        The stored ``code`` is executed in a single shared namespace
+        with ``df``, ``np`` and ``pd`` bound as names alongside a
+        curated set of safe Python builtins (``len``, ``range``,
+        ``sum``, numeric / sequence constructors, etc.) so common
+        idioms work without giving the snippet a Python import system
+        — ``__import__`` is intentionally NOT in the safe builtins, so
+        an ``import foo`` statement inside the snippet raises at exec
+        time. The snippet must bind its output to a name called
+        ``result``.
+
+        .. note::
+           This is **not** a security sandbox for untrusted code.
+           ``pd`` is in scope, which means the snippet can reach
+           pandas' file I/O helpers (``pd.read_csv``, ``pd.read_parquet``,
+           ``df.to_csv``, ...), and dunder attribute access on objects
+           reachable from ``df`` / ``np`` / ``pd`` is not blocked. The
+           builtin whitelist limits the *namespace* available to plain
+           Python idioms; it does not isolate FeatCopilot from the
+           ambient process. Stored snippets must therefore come from a
+           trusted source (your own code generator, a vetted feature
+           store, or a transform-rule registry you control).
+
+        A *fresh copy* of the safe-builtins dict is passed into ``exec``
+        on every call so that any mutation the snippet performs on
+        ``__builtins__`` (rebinding entries, ``del``, ``pop``) does not
+        bleed into subsequent ``compute`` calls. Likewise the
+        data-bound namespace is constructed fresh per call. Using a
+        SINGLE dict for both ``globals`` and ``locals`` is what makes
+        free variables inside comprehensions and lambdas — which Python
+        resolves against the enclosing function's globals, not the
+        caller's locals — see ``df``, ``np`` and ``pd`` correctly.
+        With separate ``locals`` and ``globals`` dicts a snippet such
+        as ``[df['c'].iloc[i] for i in range(len(df))]`` would
+        otherwise raise ``NameError`` because the implicit comprehension
+        function's body looks ``df`` up in the (empty) ``globals``.
+
         Parameters
         ----------
         df : DataFrame
@@ -118,14 +193,40 @@ def compute(self, df: pd.DataFrame) -> pd.Series:
         -------
         Series
             Computed feature values
+
+        Raises
+        ------
+        ValueError
+            * If ``self.code`` is empty / missing — message
+              ``"No code defined for feature ..."``.
+            * If ``self.code`` is present but did not bind a
+              ``result`` variable — message
+              ``"Feature ... code did not produce a 'result' variable"``.
+              These two cases produce DIFFERENT messages so a failing
+              snippet is distinguishable from an unset feature when
+              debugging.
         """
-        if self.code:
-            # Execute stored code to compute feature
-            local_vars = {"df": df, "np": np, "pd": pd}
-            exec(self.code, {"__builtins__": {}}, local_vars)
-            if "result" in local_vars:
-                return local_vars["result"]
-        raise ValueError(f"No code defined for feature {self.name}")
+        if not self.code:
+            raise ValueError(f"No code defined for feature {self.name}")
+
+        # Single shared namespace so comprehensions / lambdas /
+        # generator expressions inside the snippet see ``df``, ``np``,
+        # ``pd`` and the safe builtins. Fresh dicts per call so the
+        # snippet cannot pollute either the safe-builtins whitelist or
+        # the data bindings for later ``compute`` invocations.
+        namespace: dict[str, Any] = {
+            "__builtins__": dict(_SAFE_BUILTINS),
+            "df": df,
+            "np": np,
+            "pd": pd,
+        }
+        exec(self.code, namespace)
+        if "result" not in namespace:
+            raise ValueError(
+                f"Feature {self.name!r} code did not produce a 'result' variable. "
+                "Stored snippet must bind its output to a name called 'result'."
+            )
+        return namespace["result"]
 
 
 class FeatureSet:
diff --git a/featcopilot/llm/__init__.py b/featcopilot/llm/__init__.py
index d63780e..3039868 100644
--- a/featcopilot/llm/__init__.py
+++ b/featcopilot/llm/__init__.py
@@ -4,7 +4,7 @@
 """
 
 from featcopilot.llm.code_generator import FeatureCodeGenerator
-from featcopilot.llm.copilot_client import CopilotFeatureClient
+from featcopilot.llm.copilot_client import CopilotFeatureClient, SyncCopilotFeatureClient
 from featcopilot.llm.explainer import FeatureExplainer
 from featcopilot.llm.litellm_client import LiteLLMFeatureClient, SyncLiteLLMFeatureClient
 from featcopilot.llm.openai_client import OpenAIFeatureClient, SyncOpenAIFeatureClient
@@ -13,6 +13,7 @@
 
 __all__ = [
     "CopilotFeatureClient",
+    "SyncCopilotFeatureClient",
     "LiteLLMFeatureClient",
     "SyncLiteLLMFeatureClient",
     "OpenAIFeatureClient",
diff --git a/featcopilot/utils/cache.py b/featcopilot/utils/cache.py
index ad8e23e..1067727 100644
--- a/featcopilot/utils/cache.py
+++ b/featcopilot/utils/cache.py
@@ -8,6 +8,10 @@
 
 import pandas as pd
 
+from featcopilot.utils.logger import get_logger
+
+logger = get_logger(__name__)
+
 
 class FeatureCache:
     """
@@ -85,8 +89,18 @@ def get(self, key: str, data_hash: str | None = None) -> Any | None:
                     # Store in memory cache
                     self._memory_cache[cache_key] = value
                     return value
-                except Exception:
-                    pass
+                except Exception as exc:  # noqa: BLE001 - we re-surface via logger
+                    # Cache reads must never crash the caller (a corrupted
+                    # entry should fall back to recomputing the feature)
+                    # but the failure must not be silent: log a warning
+                    # so users can diagnose disk issues / pickle version
+                    # mismatches / partially-written files.
+                    logger.warning(
+                        "FeatureCache: failed to read cache entry %s (%s: %s)",
+                        cache_path,
+                        type(exc).__name__,
+                        exc,
+                    )
 
         return None
 
@@ -137,8 +151,18 @@ def set(
                 meta_path = self.cache_dir / f"{cache_key}.meta.json"
                 with open(meta_path, "w") as f:
                     json.dump(metadata or {}, f)
-            except Exception:
-                pass
+            except Exception as exc:  # noqa: BLE001 - we re-surface via logger
+                # Persisting to disk is best-effort: a write failure (e.g.
+                # disk full, read-only filesystem, unpicklable value)
+                # should not crash the caller, but it must not be silent
+                # either. Log a warning so operational issues with the
+                # cache directory surface in observability.
+                logger.warning(
+                    "FeatureCache: failed to persist cache entry %s (%s: %s)",
+                    cache_path,
+                    type(exc).__name__,
+                    exc,
+                )
 
     def has(self, key: str, data_hash: str | None = None) -> bool:
         """Check if key exists in cache."""
@@ -205,8 +229,17 @@ def get_metadata(self, key: str, data_hash: str | None = None) -> dict | None:
                 try:
                     with open(meta_path) as f:
                         return json.load(f)
-                except Exception:
-                    pass
+                except Exception as exc:  # noqa: BLE001 - we re-surface via logger
+                    # Same rationale as the cache-read path: a corrupted
+                    # metadata file should fall back to "no metadata
+                    # available" rather than crashing the caller, but
+                    # the failure must surface via the logger.
+                    logger.warning(
+                        "FeatureCache: failed to read cache metadata %s (%s: %s)",
+                        meta_path,
+                        type(exc).__name__,
+                        exc,
+                    )
 
         return None
 
diff --git a/mkdocs.yml b/mkdocs.yml
index 8136adf..48ca57f 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -80,6 +80,7 @@ nav:
     - LLM Features: user-guide/llm-features.md
     - Feature Selection: user-guide/selection.md
     - Feature Stores: user-guide/feature-stores.md
+    - Command-Line Interface: user-guide/cli.md
     - Benchmarks: user-guide/benchmarks.md
     - Best Practices: user-guide/best-practices.md
   - Examples:
diff --git a/tests/test_cli.py b/tests/test_cli.py
index 2f8c1f3..4cb0e9b 100644
--- a/tests/test_cli.py
+++ b/tests/test_cli.py
@@ -124,6 +124,86 @@ def test_info_text_mode_is_human_readable():
     assert __version__ in out
 
 
+def test_cli_doc_leakage_guard_values_match_source_of_truth():
+    """Regression for round-1 review: the user-guide CLI doc previously
+    listed ``--leakage-guard`` choices as ``warn`` / ``block`` / ``ignore``,
+    but the actual ``AutoFeatureEngineer.SUPPORTED_LEAKAGE_GUARDS`` set is
+    ``{off, warn, raise}``. Following the doc with
+    ``--leakage-guard block`` produced an argparse error rather than the
+    described behavior. This test pins the doc against the source of
+    truth so any future drift fails fast.
+    """
+    from featcopilot.transformers.sklearn_compat import AutoFeatureEngineer
+
+    expected = sorted(AutoFeatureEngineer.SUPPORTED_LEAKAGE_GUARDS)
+    cli_doc = (Path(__file__).resolve().parent.parent / "docs" / "user-guide" / "cli.md").read_text(encoding="utf-8")
+
+    # The ``info`` JSON sample MUST list the actual values (sorted).
+    expected_json_fragment = '"supported_leakage_guards": ' + json.dumps(expected)
+    assert (
+        expected_json_fragment in cli_doc
+    ), f"docs/user-guide/cli.md must show {expected_json_fragment!r} in the info sample; got mismatch with source of truth"
+
+    # The ``--leakage-guard`` row MUST mention each valid value as
+    # ```value`` and MUST NOT mention any value that is not actually
+    # accepted. Using a substring check on the literal backtick-quoted
+    # token avoids false positives from prose elsewhere on the page
+    # (e.g. ``warn`` appears in unrelated wording).
+    invalid_examples = ("`block`", "`ignore`")
+    for bad in invalid_examples:
+        assert bad not in cli_doc, (
+            f"docs/user-guide/cli.md must not advertise {bad} as a leakage_guard value; "
+            f"actual valid values are {expected}"
+        )
+    for value in expected:
+        assert f"`{value}`" in cli_doc, (
+            f"docs/user-guide/cli.md must advertise `{value}` as a valid leakage_guard value; "
+            f"current values are {expected}"
+        )
+
+
+def test_cli_doc_info_sample_matches_engines_and_selection_methods():
+    """Regression for round-2 review: the CLI doc's ``info`` JSON sample
+    must list the FULL sorted ``SUPPORTED_ENGINES`` and
+    ``SUPPORTED_SELECTION_METHODS`` sets, not a truncated subset.
+
+    Round-2 specifically caught that the sample omitted ``llm`` from
+    ``supported_engines`` and ``chi2`` / ``f_test`` / ``xgboost`` from
+    ``supported_selection_methods``. Even with a "truncated" caveat the
+    arrays look complete and steered users away from valid CLI choices,
+    so we now pin the entire sorted list against
+    ``AutoFeatureEngineer.SUPPORTED_*`` so future drift in either
+    direction (adding a new engine, renaming a method) fails fast.
+    """
+    from featcopilot.transformers.sklearn_compat import AutoFeatureEngineer
+
+    cli_doc = (Path(__file__).resolve().parent.parent / "docs" / "user-guide" / "cli.md").read_text(encoding="utf-8")
+
+    expected_engines = sorted(AutoFeatureEngineer.SUPPORTED_ENGINES)
+    engines_fragment = '"supported_engines": ' + json.dumps(expected_engines)
+    assert engines_fragment in cli_doc, (
+        f"docs/user-guide/cli.md ``info`` sample must show {engines_fragment!r}; "
+        "the JSON sample drifted away from AutoFeatureEngineer.SUPPORTED_ENGINES"
+    )
+
+    expected_methods = sorted(AutoFeatureEngineer.SUPPORTED_SELECTION_METHODS)
+    # The methods array is multi-line in the doc (one entry per line)
+    # for readability, so check that each value is present as a
+    # JSON-quoted string in the sample. We anchor the search inside the
+    # ``"supported_selection_methods": [`` block to avoid false
+    # positives from prose like "``mutual_info``".
+    sm_marker = '"supported_selection_methods": ['
+    sm_idx = cli_doc.find(sm_marker)
+    assert sm_idx != -1, "docs/user-guide/cli.md must contain a supported_selection_methods JSON sample block"
+    sm_block_end = cli_doc.find("]", sm_idx)
+    sm_block = cli_doc[sm_idx:sm_block_end]
+    for method in expected_methods:
+        assert f'"{method}"' in sm_block, (
+            f"docs/user-guide/cli.md ``info`` sample must list {method!r} in supported_selection_methods; "
+            f"current set is {expected_methods}"
+        )
+
+
 def test_top_level_version_flag(capsys):
     # ``--version`` (argparse action) prints to stdout; main() now traps the
     # SystemExit and returns the code so the API contract is consistent.
diff --git a/tests/test_core.py b/tests/test_core.py
index cbee907..a4c4b59 100644
--- a/tests/test_core.py
+++ b/tests/test_core.py
@@ -60,6 +60,157 @@ def test_feature_compute(self):
 
         assert list(result) == [1, 4, 9, 16]
 
+    def test_feature_compute_uses_safe_builtins(self):
+        """Regression: ``Feature.compute`` previously called
+        ``exec(code, {"__builtins__": {}}, locals)`` which broke any
+        snippet that used a Python builtin (``len``, ``range``, ``int``,
+        ``sum``, ...). The fix exposes a curated set of safe builtins
+        identical to ``TransformRule._get_safe_builtins`` so common
+        idioms work without giving the snippet unrestricted access.
+        """
+        df = pd.DataFrame({"col1": [10, 20, 30, 40, 50]})
+
+        # ``len`` (most common builtin used in a feature) — would NameError
+        # before the fix.
+        f_len = Feature(name="row_count", code="result = pd.Series([len(df)] * len(df))")
+        result = f_len.compute(df)
+        assert list(result) == [5, 5, 5, 5, 5]
+
+        # ``range`` + ``sum`` + numeric builtins — exercises a broader
+        # subset of the whitelist in one snippet.
+        f_range = Feature(
+            name="cumulative_index_sum",
+            code="result = pd.Series([sum(range(int(v))) for v in df['col1']])",
+        )
+        result = f_range.compute(df)
+        # sum(range(10))=45, sum(range(20))=190, sum(range(30))=435,
+        # sum(range(40))=780, sum(range(50))=1225
+        assert list(result) == [45, 190, 435, 780, 1225]
+
+        # ``abs`` + ``round`` + ``min`` / ``max`` — ensure the
+        # whitelist's full set is exposed.
+        f_clip = Feature(
+            name="clipped",
+            code="result = pd.Series([round(min(max(abs(v - 25), 0), 20), 2) for v in df['col1']])",
+        )
+        result = f_clip.compute(df)
+        assert list(result) == [15, 5, 5, 15, 20]
+
+    def test_feature_compute_resolves_df_inside_comprehension(self):
+        """Regression for round-2 review: when a snippet uses a
+        comprehension or lambda whose body references ``df``, ``np`` or
+        ``pd``, Python resolves those free variables against the
+        enclosing function's *globals*, not the caller's locals. With
+        a previous implementation that passed ``df`` only via the
+        ``locals`` argument to ``exec`` (and had ``globals`` set to
+        only ``__builtins__``), the inner reference would fail with
+        ``NameError`` and the call would either crash or be silently
+        dropped by ``FeatureSet.compute_all``. The fix uses a single
+        shared namespace for both ``globals`` and ``locals``.
+        """
+        df = pd.DataFrame({"col1": [1, 2, 3, 4, 5]})
+
+        # List comprehension whose body references ``df`` and uses
+        # ``range`` / ``len`` builtins — this is the exact pattern
+        # called out in the round-2 review.
+        f_compre = Feature(
+            name="from_comprehension",
+            code="result = pd.Series([df['col1'].iloc[i] * 2 for i in range(len(df))])",
+        )
+        result = f_compre.compute(df)
+        assert list(result) == [2, 4, 6, 8, 10]
+
+        # Lambda whose body references ``df`` and ``pd`` — same
+        # globals-resolution rule applies.
+        f_lambda = Feature(
+            name="from_lambda",
+            code=(
+                "fn = lambda i: int(df['col1'].iloc[i]) ** 2\n" "result = pd.Series([fn(i) for i in range(len(df))])"
+            ),
+        )
+        result = f_lambda.compute(df)
+        assert list(result) == [1, 4, 9, 16, 25]
+
+        # Generator expression piped into a builtin — exercises the
+        # iterator-protocol path through the same namespace.
+        f_gen = Feature(
+            name="from_generator",
+            code="result = pd.Series([sum(v for v in df['col1'] if v <= n) for n in df['col1']])",
+        )
+        result = f_gen.compute(df)
+        # cumulative sums for thresholds 1, 2, 3, 4, 5: 1, 3, 6, 10, 15
+        assert list(result) == [1, 3, 6, 10, 15]
+
+    def test_feature_compute_isolates_safe_builtins_across_calls(self):
+        """Regression: ``_SAFE_BUILTINS`` must NOT be passed to ``exec``
+        by reference. If it were, a snippet that rebinds entries in its
+        own ``__builtins__`` view (e.g. ``__builtins__['len'] = lambda
+        x: -999``) would leak that mutation into every subsequent
+        ``Feature.compute`` call in the same process, breaking unrelated
+        features. The fix passes ``dict(_SAFE_BUILTINS)`` (a fresh
+        shallow copy) per call so that side effects are local to that
+        single ``exec``.
+
+        Round-2 review note: an earlier version of this test started
+        the attacker snippet with ``import builtins``, but ``__import__``
+        is intentionally NOT in the safe-builtins whitelist, so the
+        snippet raised at the very first line and the surrounding
+        ``try/except`` swallowed the failure ─ meaning the test would
+        pass even if ``_SAFE_BUILTINS`` were still being shared by
+        reference. This rewrite mutates ``__builtins__`` directly via
+        the dict that ``exec`` sees as the snippet's builtins source,
+        AND asserts inside the snippet (via ``len(df) == -999``) that
+        the rebinding actually took effect. That dual assertion is
+        what makes the test a real regression check.
+        """
+        df = pd.DataFrame({"col1": [1, 2, 3]})
+
+        # Attacker snippet: rebind ``len`` in this call's __builtins__
+        # dict. The trailing ``result = pd.Series([len(df)])`` would
+        # yield 3 if the rebinding had failed, and -999 if the
+        # rebinding succeeded. That's the in-snippet proof that
+        # mutation IS possible within a single exec ─ which is what
+        # makes the per-call-copy isolation contract meaningful.
+        f_attacker = Feature(
+            name="poisoner",
+            code=("__builtins__['len'] = lambda x: -999\nresult = pd.Series([len(df)])\n"),
+        )
+        poisoned = f_attacker.compute(df)
+        assert list(poisoned) == [-999], (
+            "Attacker snippet failed to mutate __builtins__: the regression test "
+            "is no longer exercising the isolation path it claims to protect"
+        )
+
+        # A clean, well-behaved feature run AFTER the attacker must
+        # see the original ``len`` builtin, not the poisoned version.
+        f_victim = Feature(
+            name="clean_user",
+            code="result = pd.Series([len(df)] * len(df))",
+        )
+        clean = f_victim.compute(df)
+        assert list(clean) == [3, 3, 3], (
+            "Per-call dict copy of _SAFE_BUILTINS is broken: a previous " "feature's mutation leaked into a later call"
+        )
+
+    def test_feature_compute_distinguishes_missing_code_from_missing_result(self):
+        """Regression: the ``ValueError`` raised by ``Feature.compute``
+        must use DIFFERENT messages for the two distinct failure
+        modes (``self.code`` empty / missing vs. snippet ran but did
+        not bind ``result``). Previously both paths surfaced the
+        misleading ``"No code defined ..."`` message even when the
+        code was clearly present, making debugging harder.
+        """
+        # No code at all → "No code defined" branch.
+        f_empty = Feature(name="no_code")
+        with pytest.raises(ValueError, match="No code defined for feature no_code"):
+            f_empty.compute(pd.DataFrame({"x": [1]}))
+
+        # Code present but does not produce ``result`` → distinct
+        # message that points at the actual cause.
+        f_no_result = Feature(name="no_result", code="x = df['col1'] * 2")
+        with pytest.raises(ValueError, match=r"Feature 'no_result' code did not produce a 'result' variable"):
+            f_no_result.compute(pd.DataFrame({"col1": [1, 2, 3]}))
+
 
 class TestFeatureSet:
     """Tests for FeatureSet class."""
diff --git a/tests/test_llm_modules.py b/tests/test_llm_modules.py
index e1c426b..a42a2d8 100644
--- a/tests/test_llm_modules.py
+++ b/tests/test_llm_modules.py
@@ -2596,3 +2596,52 @@ def test_send_prompt_handles_concurrent_futures_timeout_error(self):
 
         result = asyncio.run(client.send_prompt("hi"))
         assert "timed out" in result.lower()
+
+
+# ---------------------------------------------------------------------------
+# featcopilot.llm package public-API consistency
+# ---------------------------------------------------------------------------
+
+
+class TestLLMPackageExports:
+    """Pin the public API of the ``featcopilot.llm`` package.
+
+    Regression: ``SyncCopilotFeatureClient`` was defined in
+    ``copilot_client.py`` and used by sibling modules (e.g.
+    ``code_generator.py``, ``explainer.py``) but was missing from
+    ``featcopilot/llm/__init__.py``'s imports and ``__all__`` while the
+    sister classes ``SyncOpenAIFeatureClient`` and
+    ``SyncLiteLLMFeatureClient`` WERE both exported. That made
+    ``from featcopilot.llm import SyncCopilotFeatureClient`` raise
+    ``ImportError`` even though the class itself was importable from the
+    submodule, which is a public-API surface inconsistency callers
+    legitimately notice.
+    """
+
+    def test_sync_clients_are_all_exported(self):
+        """All three sync LLM clients must be importable directly from
+        the ``featcopilot.llm`` package."""
+        from featcopilot.llm import (
+            SyncCopilotFeatureClient,
+            SyncLiteLLMFeatureClient,
+            SyncOpenAIFeatureClient,
+        )
+
+        # Identity check: the names exposed by the package must be the
+        # same objects defined in the underlying submodules.
+        from featcopilot.llm.copilot_client import SyncCopilotFeatureClient as _SCC
+        from featcopilot.llm.litellm_client import SyncLiteLLMFeatureClient as _SLC
+        from featcopilot.llm.openai_client import SyncOpenAIFeatureClient as _SOC
+
+        assert SyncCopilotFeatureClient is _SCC
+        assert SyncLiteLLMFeatureClient is _SLC
+        assert SyncOpenAIFeatureClient is _SOC
+
+    def test_sync_clients_are_listed_in_dunder_all(self):
+        """``__all__`` must mirror the imports so ``from featcopilot.llm
+        import *`` is consistent with explicit imports."""
+        import featcopilot.llm as llm_pkg
+
+        assert "SyncCopilotFeatureClient" in llm_pkg.__all__
+        assert "SyncLiteLLMFeatureClient" in llm_pkg.__all__
+        assert "SyncOpenAIFeatureClient" in llm_pkg.__all__
diff --git a/tests/test_utils.py b/tests/test_utils.py
index 75731d3..9996895 100644
--- a/tests/test_utils.py
+++ b/tests/test_utils.py
@@ -204,6 +204,81 @@ def test_list_keys_includes_disk(self):
             cache._memory_cache.clear()
             assert "disk_key" in cache.list_keys()
 
+    def test_corrupted_cache_read_logs_warning(self):
+        """Regression: ``FeatureCache.get`` previously had a bare
+        ``except Exception: pass`` that silently swallowed ``pickle``
+        errors when the on-disk cache entry was corrupted (truncated
+        write, version mismatch, partial flush). The fix logs a
+        warning via ``featcopilot.utils.cache``'s module logger so
+        users can diagnose disk issues. This test corrupts the on-disk
+        ``.pkl`` and asserts ``get()`` returns ``None`` AND emits a
+        warning containing the cache path.
+        """
+        from pathlib import Path
+
+        with tempfile.TemporaryDirectory() as tmpdir:
+            cache = FeatureCache(cache_dir=tmpdir)
+            cache.set("victim", {"value": 42})
+
+            # Find the persisted file and corrupt it.
+            pkl_path = next(Path(tmpdir).glob("*.pkl"))
+            pkl_path.write_bytes(b"\x00\x01\x02not-a-valid-pickle-stream")
+
+            # Drop the in-memory entry so ``get`` falls through to disk.
+            cache._memory_cache.clear()
+
+            with patch("featcopilot.utils.cache.logger") as mock_logger:
+                value = cache.get("victim")
+                assert value is None
+                assert mock_logger.warning.called, "FeatureCache must log on read failure"
+                msg = mock_logger.warning.call_args[0][0]
+                assert "failed to read cache entry" in msg
+
+    def test_unpicklable_value_logs_warning_on_set(self):
+        """Regression: ``FeatureCache.set`` previously had a bare
+        ``except Exception: pass`` that silently swallowed
+        ``pickle.PicklingError`` when the value could not be persisted
+        (e.g. lambda / open file handle). The fix logs a warning so
+        operational issues with the cache directory surface in
+        observability.
+        """
+        with tempfile.TemporaryDirectory() as tmpdir:
+            cache = FeatureCache(cache_dir=tmpdir)
+
+            # Lambdas cannot be pickled by the default protocol.
+            unpicklable = lambda x: x  # noqa: E731
+
+            with patch("featcopilot.utils.cache.logger") as mock_logger:
+                cache.set("bad", unpicklable)
+                assert mock_logger.warning.called, "FeatureCache must log on persist failure"
+                msg = mock_logger.warning.call_args[0][0]
+                assert "failed to persist cache entry" in msg
+
+    def test_corrupted_metadata_read_logs_warning(self):
+        """Regression: ``FeatureCache.get_metadata`` previously had a
+        bare ``except Exception: pass`` that hid corrupted JSON
+        metadata files. The fix logs a warning.
+        """
+        from pathlib import Path
+
+        with tempfile.TemporaryDirectory() as tmpdir:
+            cache = FeatureCache(cache_dir=tmpdir)
+            cache.set("k", "v", metadata={"src": "tabular"})
+
+            # Corrupt the metadata JSON.
+            meta_path = next(Path(tmpdir).glob("*.meta.json"))
+            meta_path.write_text("{not valid json")
+
+            # Force re-read from disk by clearing the in-memory metadata cache.
+            cache._metadata.clear()
+
+            with patch("featcopilot.utils.cache.logger") as mock_logger:
+                md = cache.get_metadata("k")
+                assert md is None
+                assert mock_logger.warning.called, "FeatureCache must log on metadata read failure"
+                msg = mock_logger.warning.call_args[0][0]
+                assert "failed to read cache metadata" in msg
+
 
 class TestFeatureCacheDataHash:
     """Tests for FeatureCache with data_hash parameter."""