thinkall · May 6, 2026
diff --git a/docs/getting-started/installation.md b/docs/getting-started/installation.md
@@ -79,7 +79,7 @@ Development dependencies include:
 ```python
 import featcopilot
 print(featcopilot.__version__)
-# Output: 0.1.0
+# Output: 0.3.7
 
 from featcopilot import AutoFeatureEngineer
 print("Installation successful!")

diff --git a/docs/user-guide/cli.md b/docs/user-guide/cli.md
@@ -0,0 +1,236 @@
+# Command-Line Interface
+
+FeatCopilot ships a stable, agent-friendly `featcopilot` CLI for using the
+library from shells, CI pipelines, and **agentic / LLM tool-use** workflows
+without writing Python glue. All subcommands accept `--json` for
+machine-readable stdout; user-facing errors are written to **stderr** with
+a non-zero exit code so that automation can parse failures
+deterministically.
+
+The CLI is installed automatically with the package via the
+`[project.scripts]` entry point (`featcopilot = "featcopilot.cli:main"`),
+so after `pip install featcopilot` the `featcopilot` command is available
+on `$PATH`. The equivalent module form `python -m featcopilot ...` always
+works regardless of how the package was installed.
+
+## Subcommands
+
+| Command | Purpose |
+| --- | --- |
+| `featcopilot info` | Print version, supported engines, selection methods, leakage guards, I/O formats, and a runtime `parquet_available` flag. |
+| `featcopilot transform` | Read a CSV / Parquet / JSON file, run [`AutoFeatureEngineer`](overview.md), and write engineered features to an output file. |
+| `featcopilot explain` | Fit and print a JSON document with `{name, explanation, code}` per feature for downstream LLM consumption (no output file is written). |
+
+Run any subcommand with `--help` to see the full flag list:
+
+```bash
+featcopilot --help
+featcopilot transform --help
+featcopilot explain --help
+```
+
+## Output contract
+
+All three subcommands honor the same agent-friendly contract:
+
+* **`stdout`** carries the result. With `--json` (always implicit for
+  `explain`), exactly one JSON document is written.
+* **`stderr`** is reserved for failures. A successful run keeps `stderr`
+  empty even when `AutoFeatureEngineer` emits leakage warnings or
+  `verbose` logger output; those are surfaced via the JSON payload's
+  `warnings` field instead. The same contract covers warnings emitted
+  during pandas / pyarrow read or write phases (e.g. `DtypeWarning` on
+  mixed-type CSVs, `FutureWarning` from a successful Parquet write):
+  they are routed to the JSON `warnings` field, never to `stderr`.
+* **Exit codes**: `0` on success; `2` for user-input errors (missing
+  files, malformed config, unknown target, etc.); `1` for unexpected
+  internal errors.
+
+## `featcopilot info`
+
+Discover capabilities without running an engineer:
+
+```bash
+featcopilot info --json
+```
+
+Sample (truncated) output:
+
+```json
+{
+  "version": "0.3.7",
+  "supported_engines": ["llm", "relational", "tabular", "text", "timeseries"],
+  "supported_selection_methods": [
+    "chi2",
+    "correlation",
+    "f_test",
+    "importance",
+    "mutual_info",
+    "xgboost"
+  ],
+  "supported_leakage_guards": ["off", "raise", "warn"],
+  "supported_input_formats": ["csv", "json"],
+  "supported_output_formats": ["csv", "json"],
+  "parquet_available": false
+}
+```
+
+`parquet_available` reports whether a Parquet engine (`pyarrow` or
+`fastparquet`) is importable in the current environment. When one is,
+`"parquet"` is added to `supported_input_formats` and
+`supported_output_formats` (in source order, so the lists become
+`["csv", "parquet", "json"]`) and the flag is `true`.
+
+The base FeatCopilot install does not pin a Parquet engine; install one
+with `pip install pyarrow` (or `fastparquet`) to enable Parquet I/O, as
+described under [Parquet I/O](#parquet-io).
+
+## `featcopilot transform`
+
+Run feature engineering on a tabular input and write the engineered
+features to disk:
+
+```bash
+featcopilot transform \
+    --input data.csv --target label --output features.csv \
+    --engines tabular --max-features 50 \
+    --json
+```
+
+Common flags:
+
+| Flag | Purpose |
+| --- | --- |
+| `--input / -i` | Path to input file (CSV / Parquet / JSON). Required. |
+| `--output / -o` | Path to output file. Required. |
+| `--target / -t` | Target column. Required when feature selection is applied (i.e. when `--max-features` / config `max_features` is set). |
+| `--input-format` / `--output-format` | Override format detection (`csv` / `parquet` / `json`). |
+| `--engines` | One or more engines to enable (default: `tabular`). |
+| `--max-features N` | Cap on engine output / selection. Forwarded both to engine constructors and to the selector. |
+| `--no-selection` | Skip feature selection entirely (raw feature generation). |
+| `--selection-methods` | Override the default `mutual_info importance` selection set. |
+| `--leakage-guard` | How to handle suspicious column names: `warn` (default: log a warning and continue), `raise` (hard-fail with an error), or `off` (disable the check). |
+| `--include-target` | Re-attach the target column to the output file (collision-safe). |
+| `--task-description` | Free-form ML task description forwarded to LLM-aware engines. |
+| `--config FILE` | JSON config with nested keys (e.g. `llm_config`, `selection_methods`). CLI flags override config values. |
+| `--verbose / --no-verbose` | Toggle verbose logging. With `--json`, log records are routed to the JSON `warnings` field rather than `stderr`. |
+| `--gate-n-jobs` | Parallelism for the do-no-harm gate's RF (default 1; `-1` = all cores). |
+| `--json` | Emit a one-line JSON status object on stdout instead of human-readable text. |
+
+A successful `--json` run prints something like:
+
+```json
+{
+  "status": "ok",
+  "input": "data.csv",
+  "output": "features.csv",
+  "input_format": "csv",
+  "output_format": "csv",
+  "n_rows": 1000,
+  "n_features": 47,
+  "n_input_columns": 12,
+  "n_generated_features": 47,
+  "engines": ["tabular"],
+  "selection_methods": ["mutual_info", "importance"],
+  "max_features": 50,
+  "target": "label",
+  "selection_applied": true,
+  "warnings": []
+}
+```
+
+## `featcopilot explain`
+
+Fit the engineer (without writing any output file) and print a JSON
+catalog of generated features for downstream LLM consumption:
+
+```bash
+featcopilot explain --input data.csv --target label
+```
+
+Each entry in the `features` array contains the feature `name`, an
+LLM-style natural-language `explanation`, and the executable Python
+`code` used to produce it.
+
+`explain` defaults to running on the **full** input so the metadata is
+a faithful description of what a corresponding `transform` would
+generate. Some engines (notably the tabular engine's categorical
+encoding) consult per-row / per-category statistics when planning
+features, so blind subsampling can silently change results. For very
+large inputs where metadata-only `explain` should not pay full memory
+or compute cost, opt in with:
+
+```bash
+featcopilot explain --input big.csv --target label --explain-sample-size 5000
+```
+
+The cap is a deterministic *head slice* (the first N rows), threaded
+through `pd.read_csv(nrows=N)` for CSV so memory is bounded natively.
+For Parquet / JSON, pandas has no native row limit, so the file is
+fully read and then truncated; a `UserWarning` explaining the
+limitation is emitted (and surfaced in the JSON `warnings` field) only
+when the cap actually truncates the input.
+
+## Configuration files
+
+Pass `--config config.json` to provide nested keys that don't have
+matching CLI flags, such as the `llm_config` engine kwargs:
+
+```json
+{
+  "engines": ["tabular", "llm"],
+  "max_features": 80,
+  "selection_methods": ["mutual_info", "importance"],
+  "llm_config": {
+    "backend": "litellm",
+    "model": "gpt-4o",
+    "max_suggestions": 20
+  }
+}
+```
+
+Explicit CLI flags override values from the config file. Any malformed
+scalar (e.g. `"max_features": "5"`, `"verbose": "false"`) is rejected
+with a clean exit-2 error rather than failing later inside the
+engineer.
+
+## Parquet I/O
+
+The base FeatCopilot install does not pin a parquet engine. To use
+`--input file.parquet` / `--output file.parquet` (or the `parquet`
+value of `--input-format` / `--output-format`), install one of:
+
+```bash
+pip install pyarrow      # recommended
+# or
+pip install fastparquet
+```
+
+Confirm with `featcopilot info --json`:
+
+```json
+{ "parquet_available": true, ... }
+```
+
+If neither engine is installed, attempting Parquet I/O fails with a
+clean exit-2 error pointing at the missing dependency.
+
+## Agentic-usage tips
+
+* Always pass `--json`. Treat anything on `stderr` as a hard failure;
+  treat anything on `stdout` as the JSON result.
+* Treat the JSON `warnings` field as a list of human-readable
+  diagnostic strings: it is non-empty for `transform` runs that
+  generated leakage / mock-mode / sampling notices, and empty for
+  fully clean runs.
+* `featcopilot transform` and `python -m featcopilot transform` invoke
+  the exact same entry point, so for long-running batch jobs the choice
+  is purely one of brevity; the former is simply shorter to type.
+
+## See also
+
+* [Overview](overview.md): the underlying `AutoFeatureEngineer` API.
+* [Engines](engines.md): what each engine generates.
+* [LLM Features](llm-features.md): configuring the LLM backend (provide
+  an `llm_config` object inside the JSON file passed to `--config`, as
+  shown in the [Configuration files](#configuration-files) section above).
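
Review note on `docs/user-guide/cli.md`: the output contract described in this file (one JSON document on stdout, stderr reserved for failures, exit codes `0` / `2` / `1`) could be consumed by an agent roughly as sketched below. This is a minimal illustration, not part of the PR; `CliResult`, `interpret`, and `run_featcopilot` are hypothetical names, and the only assumptions about the CLI are the ones the guide states.

```python
import json
import subprocess
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CliResult:
    """Hypothetical container for one CLI invocation's outcome."""
    ok: bool
    kind: str  # "ok", "user_error", or "internal_error"
    payload: Optional[dict] = None
    warnings: list = field(default_factory=list)
    error: str = ""


def interpret(returncode: int, stdout: str, stderr: str) -> CliResult:
    """Apply the documented contract to raw process output."""
    if returncode == 0:
        # Success: stdout carries exactly one JSON document.
        payload = json.loads(stdout)
        # `warnings` may be absent (e.g. in `info` output), so default to [].
        return CliResult(True, "ok", payload, list(payload.get("warnings", [])))
    # Exit code 2 is documented as a user-input error; anything else
    # non-zero is treated here as an unexpected internal error.
    kind = "user_error" if returncode == 2 else "internal_error"
    return CliResult(False, kind, error=stderr.strip())


def run_featcopilot(*args: str) -> CliResult:
    """Run the CLI with --json and classify the result (needs featcopilot on PATH)."""
    proc = subprocess.run(
        ["featcopilot", *args, "--json"],
        capture_output=True,
        text=True,
    )
    return interpret(proc.returncode, proc.stdout, proc.stderr)
```

Keeping `interpret` separate from the subprocess call means the contract handling can be unit-tested with synthetic output, without installing the package.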
diff --git a/featcopilot/core/feature.py b/featcopilot/core/feature.py
@@ -12,6 +12,45 @@
 logger = get_logger(__name__)
 
 
+# Curated set of safe Python builtins exposed to ``Feature.compute``'s
+# stored code. Without this whitelist (i.e. with ``{"__builtins__": {}}``)
+# even basic idioms like ``len(df)``, ``range(...)``, ``sum(...)``, or
+# ``int(x)`` raise ``NameError`` at exec time, which means a feature whose
+# code legitimately uses a Python builtin crashes during ``compute`` even
+# though the snippet is otherwise valid. The set mirrors the one used by
+# :class:`featcopilot.core.transform_rule.TransformRule` so both code
+# execution paths agree on what is safe.
+_SAFE_BUILTINS: dict[str, Any] = {
+    "len": len,
+    "sum": sum,
+    "max": max,
+    "min": min,
+    "int": int,
+    "float": float,
+    "str": str,
+    "bool": bool,
+    "abs": abs,
+    "round": round,
+    "pow": pow,
+    "range": range,
+    "list": list,
+    "dict": dict,
+    "set": set,
+    "tuple": tuple,
+    "sorted": sorted,
+    "reversed": reversed,
+    "enumerate": enumerate,
+    "zip": zip,
+    "any": any,
+    "all": all,
+    "map": map,
+    "filter": filter,
+    "isinstance": isinstance,
+    "hasattr": hasattr,
+    "getattr": getattr,
+}
+
+
 class FeatureType(Enum):
     """Types of features."""
 
@@ -109,6 +148,42 @@ def compute(self, df: pd.DataFrame) -> pd.Series:
         """
         Compute feature values from DataFrame using stored code.
 
+        The stored ``code`` is executed in a single shared namespace
+        with ``df``, ``np`` and ``pd`` bound as names alongside a
+        curated set of safe Python builtins (``len``, ``range``,
+        ``sum``, numeric / sequence constructors, etc.) so common
+        idioms work without giving the snippet a Python import system
+        — ``__import__`` is intentionally NOT in the safe builtins, so
+        an ``import foo`` statement inside the snippet raises at exec
+        time. The snippet must bind its output to a name called
+        ``result``.
+
+        .. note::
+           This is **not** a security sandbox for untrusted code.
+           ``pd`` is in scope, which means the snippet can reach
+           pandas' file I/O helpers (``pd.read_csv``, ``pd.read_parquet``,
+           ``df.to_csv``, ...), and dunder attribute access on objects
+           reachable from ``df`` / ``np`` / ``pd`` is not blocked. The
+           builtin whitelist limits the *namespace* available to plain
+           Python idioms; it does not isolate FeatCopilot from the
+           ambient process. Stored snippets must therefore come from a
+           trusted source (your own code generator, a vetted feature
+           store, or a transform-rule registry you control).
+
+        A *fresh copy* of the safe-builtins dict is passed into ``exec``
+        on every call so that any mutation the snippet performs on
+        ``__builtins__`` (rebinding entries, ``del``, ``pop``) does not
+        bleed into subsequent ``compute`` calls. Likewise the
+        data-bound namespace is constructed fresh per call. Using a
+        SINGLE dict for both ``globals`` and ``locals`` is what lets
+        free variables inside comprehensions and lambdas see ``df``,
+        ``np`` and ``pd``: Python resolves such names against the
+        globals dict passed to ``exec``, not against its locals dict.
+        With separate ``locals`` and ``globals`` dicts a snippet such
+        as ``[df['c'].iloc[i] for i in range(len(df))]`` can raise
+        ``NameError`` because the implicit comprehension function's
+        body looks ``df`` up in the ``globals`` dict, where it is absent.
+
         Parameters
         ----------
         df : DataFrame
@@ -118,14 +193,40 @@ def compute(self, df: pd.DataFrame) -> pd.Series:
         -------
         Series
             Computed feature values
+
+        Raises
+        ------
+        ValueError
+            * If ``self.code`` is empty / missing — message
+              ``"No code defined for feature ..."``.
+            * If ``self.code`` is present but did not bind a
+              ``result`` variable — message
+              ``"Feature ... code did not produce a 'result' variable"``.
+              These two cases produce DIFFERENT messages so a failing
+              snippet is distinguishable from an unset feature when
+              debugging.
         """
-        if self.code:
-            # Execute stored code to compute feature
-            local_vars = {"df": df, "np": np, "pd": pd}
-            exec(self.code, {"__builtins__": {}}, local_vars)
-            if "result" in local_vars:
-                return local_vars["result"]
-        raise ValueError(f"No code defined for feature {self.name}")
+        if not self.code:
+            raise ValueError(f"No code defined for feature {self.name}")
+
+        # Single shared namespace so comprehensions / lambdas /
+        # generator expressions inside the snippet see ``df``, ``np``,
+        # ``pd`` and the safe builtins. Fresh dicts per call so the
+        # snippet cannot pollute either the safe-builtins whitelist or
+        # the data bindings for later ``compute`` invocations.
+        namespace: dict[str, Any] = {
+            "__builtins__": dict(_SAFE_BUILTINS),
+            "df": df,
+            "np": np,
+            "pd": pd,
+        }
+        exec(self.code, namespace)
+        if "result" not in namespace:
+            raise ValueError(
+                f"Feature {self.name!r} code did not produce a 'result' variable. "
+                "Stored snippet must bind its output to a name called 'result'."
+            )
+        return namespace["result"]
 
 
 class FeatureSet:
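
Review note on `featcopilot/core/feature.py`: the behavioral difference between the old two-dict `exec` call and the new single-namespace call can be reproduced in isolation. The sketch below is an illustration, not the PR's code: `run_shared`, `run_split`, and the three-entry `SAFE_BUILTINS` are hypothetical stand-ins, and a plain list stands in for the DataFrame so the example needs no third-party imports. It uses a lambda, whose free variables are resolved against the `exec` globals dict.

```python
# Reduced stand-in for the PR's _SAFE_BUILTINS whitelist (assumption:
# only the builtins this demo needs).
SAFE_BUILTINS = {"len": len, "range": range, "sum": sum}


def run_shared(code, df):
    """New behavior: one dict serves as both globals and locals."""
    namespace = {"__builtins__": dict(SAFE_BUILTINS), "df": df}
    exec(code, namespace)
    if "result" not in namespace:
        raise ValueError("snippet did not bind 'result'")
    return namespace["result"]


def run_split(code, df):
    """Old behavior: data bindings live only in a separate locals dict."""
    local_vars = {"df": df}
    exec(code, {"__builtins__": dict(SAFE_BUILTINS)}, local_vars)
    return local_vars["result"]


# The lambda body looks `df` up in the exec globals, not the exec locals.
SNIPPET = "double = lambda: [x * 2 for x in df]\nresult = double()"

print(run_shared(SNIPPET, [1, 2, 3]))  # [2, 4, 6]

try:
    run_split(SNIPPET, [1, 2, 3])
except NameError as exc:
    print("split namespaces failed:", exc)

# Mutating __builtins__ inside a snippet only touches the per-call copy.
run_shared("__builtins__.pop('len')\nresult = 0", [])
print("len" in SAFE_BUILTINS)  # True

# __import__ is not whitelisted, so import statements fail at exec time.
try:
    run_shared("import math\nresult = 1", [])
except ImportError as exc:
    print("import blocked:", exc)
```

As the docstring's note says, this restricts the namespace only; it is not a sandbox for untrusted code.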

diff --git a/featcopilot/llm/__init__.py b/featcopilot/llm/__init__.py
@@ -4,7 +4,7 @@
 """
 
 from featcopilot.llm.code_generator import FeatureCodeGenerator
-from featcopilot.llm.copilot_client import CopilotFeatureClient
+from featcopilot.llm.copilot_client import CopilotFeatureClient, SyncCopilotFeatureClient
 from featcopilot.llm.explainer import FeatureExplainer
 from featcopilot.llm.litellm_client import LiteLLMFeatureClient, SyncLiteLLMFeatureClient
 from featcopilot.llm.openai_client import OpenAIFeatureClient, SyncOpenAIFeatureClient
@@ -13,6 +13,7 @@
 
 __all__ = [
     "CopilotFeatureClient",
+    "SyncCopilotFeatureClient",
     "LiteLLMFeatureClient",
     "SyncLiteLLMFeatureClient",
     "OpenAIFeatureClient",