jonross · jonross · May 12, 2026 · May 14, 2026 · May 14, 2026 · May 15, 2026
diff --git a/.claude/plans/row-source.md b/.claude/plans/row-source.md
@@ -0,0 +1,201 @@
+# Implementation Plan: Named Scopes + `from:` Unification
+
+Two related improvements to the YAML extension mechanism. They can be implemented
+sequentially on one branch or separately; Phase 1 is a prerequisite for Phase 2's
+scope-aware path resolution.
+
+Scope references use a consistent `in <name>` suffix, mirroring the `as <name>` suffix
+in `row_source` declarations: `as` binds a name, `in` references it.
+
+---
+
+## Phase 1: Named Scopes in `row_source`
+
+**Status: implemented with `<scope>.` prefix syntax — needs revision to `in <scope>` suffix.**
+
+### Goal
+
+Replace the `^` parent-hop syntax with named scope references.
+
+**Single row_source** (the common case — default `["items"]` or one explicit entry):
+path expressions resolve against the one implicit object; no scope qualifier required.
+
+**Multiple row_source entries**: every entry must carry `as <name>`, and every path /
+label expression must end with an explicit `in <name>` qualifier.  There is no implicit
+"current object" when more than one level exists.
+
+```yaml
+create:
+  - table: node_taints
+    resource: nodes
+    row_source:
+      - items as node
+      - spec.taints as taint
+    columns:
+      - name: node_uid
+        path: metadata.uid in node
+      - name: taint_key
+        path: key in taint
+```
+
+### Changes
+
+**`kugl/impl/tables.py` — `Itemizer`**
+
+- Parse `as <name>` suffix from row_source entries.  `"items as node"` yields
+  `Itemizer(expr="items", name="node", finder=..., unpack=False)`.
+- Store `name: Optional[str]` on the dataclass.
+
+**`kugl/impl/tables.py` — `RowContext`**
+
+- Add `_scopes: dict[int, dict[str, object]]`.  Key is `id(child)`; value is the
+  map of scope names visible at that child's level.
+- `set_scope(child, name, parent)` records the child's scope map, inheriting all
+  ancestor scopes from parent and adding `name → child`.
+- Add `get_scope(obj, name) -> Optional[object]` that looks up the named object.
+
+**`kugl/impl/tables.py` — `TableFromConfig._itemize`**
+
+- After calling `context.set_parent(child, item)`, also call
+  `context.set_scope(child, source.name, item)` when `source.name` is not None,
+  carrying forward all ancestor scopes so deeper levels can still reference `node`.
+
+**`kugl/impl/extract.py` — `FieldRef` / `PathExtractor` / `LabelExtractor`**
+
+- `FieldRef.parse`: remove `^` handling; detect a trailing ` in <word>` suffix as a
+  scope name.  Store as `scope_name: Optional[str]` and strip it from the target
+  before JMESPath compilation.
+- In `PathExtractor.extract` and `LabelExtractor.extract`, when `self._ref.scope_name`
+  is set, resolve the object via `context.get_scope(obj, scope_name)`.
+- Validation at table-build time (`TableFromConfig.__init__`): if `len(row_source) > 1`,
+  every `row_source` entry must have a name and every column path/label must carry an
+  `in <name>` qualifier; raise a clear `ConfigError` if either constraint is violated.
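Review note: the scope bookkeeping described for `RowContext` above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in `kugl/impl/tables.py`: the class below is a hypothetical stand-in that keeps only the `_scopes` map, and the sample `node` / `taint` objects are invented for the example.

```python
from typing import Optional


class RowContext:
    """Hypothetical stand-in for the plan's RowContext; only scope tracking is shown."""

    def __init__(self) -> None:
        # id(child) -> {scope name -> object}, as described in the plan.
        self._scopes: dict[int, dict[str, object]] = {}

    def set_scope(self, child: object, name: str, parent: object) -> None:
        # Inherit every scope visible at the parent's level, then bind this level's name.
        inherited = self._scopes.get(id(parent), {})
        self._scopes[id(child)] = {**inherited, name: child}

    def get_scope(self, obj: object, name: str) -> Optional[object]:
        # Returns None when the name is not visible from obj's level.
        return self._scopes.get(id(obj), {}).get(name)


# Usage: one node carrying one taint, mirroring the node_taints example.
taint = {"key": "dedicated", "effect": "NoSchedule"}
node = {"metadata": {"uid": "abc-123"}, "spec": {"taints": [taint]}}
response = {"items": [node]}

context = RowContext()
context.set_scope(node, "node", response)  # from "items as node"
context.set_scope(taint, "taint", node)    # from "spec.taints as taint"

node_uid = context.get_scope(taint, "node")["metadata"]["uid"]
taint_key = context.get_scope(taint, "taint")["key"]
```

Because the map is keyed by `id()`, this sketch assumes the parsed objects stay alive for the whole itemization pass; ids can be reused once an object is garbage collected.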
+
+### Builtin Update
+
+`kugl/builtins/schemas/kubernetes.yaml` — convert `node_taints` to use named scopes
+as a self-contained example:
+
+```yaml
+    row_source:
+      - items as node
+      - spec.taints as taint
+    columns:
+      - name: node_uid
+        path: metadata.uid in node
+      - name: taint_key
+        path: key in taint
+```
+
+### Tests
+
+- Update the existing `node_taints` test (wherever it lives) to verify the new
+  syntax produces the same output.
+- Add a new test with three levels of nesting (e.g. `pod → container → env`) using
+  three named scopes, verifying that both ancestor levels are reachable by name.
+- Add a test that `^` in a path raises a clear parse error.
+- Add a test that a multi-step `row_source` with a missing `as` name raises a `ConfigError`.
+- Add a test that a multi-step `row_source` with a bare (un-scoped) column path raises a `ConfigError`.
+
+---
+
+## Phase 2: `from:` Key Unification
+
+### Goal
+
+Replace the two-key `path:` / `label:` vocabulary with a single `from:` key that
+auto-detects extraction type.  Named scope qualifiers compose naturally via the same
+`in <name>` suffix.
+
+Single row_source (no scope qualifier needed):
+
+```yaml
+    columns:
+      - name: node_pool
+        from: karpenter.sh/nodepool   # auto-detected: label
+      - name: provider_id
+        from: spec.providerID         # auto-detected: JMESPath
+```
+
+Multi-step row_source (all entries named, all columns scoped):
+
+```yaml
+    row_source:
+      - items as pod
+      - spec.containers as container
+    columns:
+      - name: pod_name
+        from: metadata.name in pod            # JMESPath on pod scope
+      - name: pod_pool
+        from: karpenter.sh/nodepool in pod    # label on pod scope — unambiguous
+      - name: container_name
+        from: name in container               # JMESPath on container scope
+```
+
+### Auto-Detection Rule
+
+Strip any trailing ` in <word>` suffix first, then apply to the remainder:
+
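+- Matches `[a-zA-Z0-9.-]+/[a-zA-Z0-9._/-]+` (prefixed K8s label format: DNS domain + `/` +
+  key) → `LabelExtractor`
+- Otherwise → `PathExtractor`
+
+A value like `metadata.labels.foo/bar` is meant to be read as a path into the object, not
+as a label: the `/` appears inside a path segment, not as the label-domain separator.
+The regex alone does not make this distinction, because `metadata.labels.foo` consists
+only of characters in `[a-zA-Z0-9.-]` and so matches the domain part; this case needs an
+explicit extra check before the label test.  Unprefixed label keys such as `app` contain
+no `/`, fall through to `PathExtractor`, and would still need `label:`.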
+
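+Parsing a trailing ` in <word>` is safe: label keys cannot contain spaces, and although
+JMESPath expressions can (for example in filters or pipes), JMESPath has no `in` operator,
+so a valid expression never ends in ` in <word>` and the delimiter is unambiguous.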
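Review note: a minimal sketch of the detection rule, assuming the regex is applied as a full match after the scope suffix is stripped. The function name `classify` and the two compiled patterns are hypothetical names for illustration; the plan places this logic in `gen_extractor` and `FieldRef`.

```python
import re
from typing import Optional

# Label regex copied from the rule above; SCOPE_RE is one possible reading
# of "a trailing ` in <word>` suffix".
LABEL_RE = re.compile(r"[a-zA-Z0-9.-]+/[a-zA-Z0-9._/-]+")
SCOPE_RE = re.compile(r"^(?P<target>.*\S)\s+in\s+(?P<scope>\w+)$")


def classify(value: str) -> tuple[str, str, Optional[str]]:
    """Return (kind, target, scope_name) for a `from:` value."""
    value = value.strip()
    scope = None
    m = SCOPE_RE.match(value)
    if m:
        # Strip the scope suffix first, then classify the remainder.
        value, scope = m.group("target"), m.group("scope")
    kind = "label" if LABEL_RE.fullmatch(value) else "path"
    return kind, value, scope


print(classify("karpenter.sh/nodepool"))
print(classify("metadata.name in pod"))
```

Using `fullmatch` rather than `search` matters here: a partial match would let any value that merely contains a label-shaped substring be treated as a label.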
+
+### Changes
+
+**`kugl/impl/config.py` — `UserColumn`**
+
+- Add `from_: Optional[str] = Field(None, alias="from")` (Pydantic alias needed
+  because `from` is a Python keyword).
+- In `gen_extractor`, handle `from_` alongside `path` and `label`.
+  - If `from_` is set alongside `path` or `label`, raise `ValueError`.
+  - Strip any ` in <word>` suffix from `from_` to extract the scope name.
+  - Apply the label-vs-path regex to the remainder.
+  - Construct the appropriate extractor, passing the scope name through.
+- Keep `path:` and `label:` fully supported so existing configs are not broken.
+
+**`kugl/impl/extract.py` — `FieldRef`**
+
+- Centralise the ` in <scope>` parsing in `FieldRef.parse_scoped(s)`; both
+  `gen_extractor` (for `from:`) and `FieldRef.parse` (for `path:`/`label:`) delegate
+  to it.
+- Known scopes are not available at Pydantic parse time.  Use lazy validation: accept
+  any ` in <word>` suffix as a potential scope; fail at table-build time in
+  `TableFromConfig.__init__` if the referenced scope name is not declared in
+  `row_source`.
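Review note: the lazy validation step might look roughly like this. It is a sketch under assumptions: `ConfigError` here is a local stand-in for the exception named in the plan, `validate` is a hypothetical free function rather than the real `TableFromConfig.__init__`, and columns are passed as plain strings instead of parsed `FieldRef` objects.

```python
import re
from typing import Optional


class ConfigError(Exception):
    """Local stand-in for the ConfigError named in the plan."""


AS_RE = re.compile(r"^(?P<expr>.*\S)\s+as\s+(?P<name>\w+)$")
IN_RE = re.compile(r"^(?P<target>.*\S)\s+in\s+(?P<scope>\w+)$")


def suffix_name(text: str, pattern: re.Pattern) -> Optional[str]:
    """Return the name bound by a trailing `as <name>` or `in <name>`, if any."""
    m = pattern.match(text.strip())
    return m.group(2) if m else None


def validate(row_source: list[str], columns: list[str]) -> None:
    """Check scope declarations and references at table-build time."""
    names = [suffix_name(entry, AS_RE) for entry in row_source]
    multi = len(row_source) > 1
    if multi and None in names:
        raise ConfigError("every row_source entry needs 'as <name>' when there are several")
    for col in columns:
        scope = suffix_name(col, IN_RE)
        if scope is None:
            if multi:
                raise ConfigError(f"'{col}' needs an 'in <name>' qualifier")
        elif scope not in names:
            raise ConfigError(f"'{col}' references undeclared scope '{scope}'")


# Passes: both entries are named and both columns are scoped.
validate(["items as node", "spec.taints as taint"],
         ["metadata.uid in node", "key in taint"])
```

Deferring the check to build time keeps parsing independent of `row_source`, at the cost of reporting a misspelled scope name later than a syntax error.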
+
+### Tests
+
+- `from: karpenter.sh/nodepool` produces the same result as `label: karpenter.sh/nodepool`.
+- `from: spec.providerID` produces the same result as `path: spec.providerID`.
+- `from: metadata.name in pod` with a named `pod` scope resolves correctly.
+- `from: karpenter.sh/nodepool in pod` with a named `pod` scope resolves as a label
+  on the pod object.
+- Error: `from:` and `path:` both specified → validation error.
+- Error: `from: foo in unknownscope` where `unknownscope` is not in `row_source` → clear
+  error message at table-build time.
+
+---
+
+## Files Touched
+
+| File | Change |
+|---|---|
+| `kugl/impl/extract.py` | `FieldRef.parse`: detect ` in <scope>` suffix; extractors: resolve via scope |
+| `kugl/impl/tables.py` | `Itemizer`: parse `as <name>`; `RowContext`: track named scopes |
+| `kugl/impl/config.py` | `UserColumn`: add `from_` field and dispatch in `gen_extractor` |
+| `kugl/builtins/schemas/kubernetes.yaml` | Convert `node_taints` to named scope syntax |
+| `tests/` | Update node_taints test; add multi-level and `from:` tests |
+
+---
+
+## Out of Scope
+
+- The broader resource-coverage gaps from `discuss.md` (deployments, containers table,
+  etc.) are separate work and should not be bundled here.
diff --git a/.claude/plans/shortcomings.md b/.claude/plans/shortcomings.md
@@ -0,0 +1,113 @@
+# Kugl Discussion Summary
+
+## What Kugl Is
+
+Kugl is a Python CLI tool that queries Kubernetes resources using SQL (SQLite). It runs `kubectl get` commands, caches the JSON output, and loads it into an in-memory SQLite database. Users write SQL queries directly on the command line or via saved shortcuts.
+
+Built-in tables: `pods`, `jobs`, `nodes`, `node_labels`, `pod_labels`, `job_labels`, `node_taints`. Resource types, namespaces, and cache TTL are controlled via CLI flags (`-a`, `-n`, `-u`, `-c`, `-t`).
+
+Kugl automatically converts Kubernetes-specific value formats to queryable numerics: `50Mi` → bytes, `100m` CPU → float, ISO8601 timestamps → epoch seconds. Helper functions `to_size()`, `to_age()`, `to_utc()` convert back to human-readable strings for output.
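Review note: the conversions described above can be illustrated with a small sketch. This is not kugl's implementation; the function names are hypothetical, only a subset of Kubernetes quantity suffixes is handled, and timestamps are assumed to be in the `YYYY-MM-DDTHH:MM:SSZ` form.

```python
from datetime import datetime, timezone

# Subset of Kubernetes quantity suffixes: binary (Ki, Mi, ...) and decimal (k, M, ...).
_SIZE_FACTORS = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}


def parse_size(text: str) -> int:
    """Convert a quantity such as '50Mi' to a number of bytes."""
    for suffix, factor in _SIZE_FACTORS.items():
        if text.endswith(suffix):
            return int(float(text[: -len(suffix)]) * factor)
    return int(float(text))  # plain number, already in bytes


def parse_cpu(text: str) -> float:
    """Convert a CPU quantity such as '100m' (millicores) to cores."""
    if text.endswith("m"):
        return int(text[:-1]) / 1000
    return float(text)


def parse_timestamp(text: str) -> int:
    """Convert a UTC ISO 8601 timestamp to epoch seconds."""
    parsed = datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ")
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())


print(parse_size("50Mi"), parse_cpu("100m"), parse_timestamp("1970-01-02T00:00:00Z"))
```

Storing these values as numbers is what lets SQL expressions such as `SUM` and `ORDER BY` operate on them directly.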
+
+---
+
+## Strengths
+
+- **SQL is better than jq for aggregation.** Queries involving `GROUP BY`, `SUM`, `JOIN`, `ORDER BY`, and CTEs are dramatically more readable in SQL than in jq pipelines. The target use case — "how is compute distributed across node pools and taints?" — is well served.
+- **Automatic type coercion.** CPU, memory, and timestamp conversion is handled transparently. Steampipe's Kubernetes plugin likely exposes these as raw strings or JSONB; kugl makes them directly comparable numerically.
+- **Built-in caching.** A 2-minute TTL cache avoids hammering the API server during exploratory queries.
+- **Declarative extensions require no code.** Adding a label or nested field to an existing table takes 4 lines of YAML, no build step, no Go, no Python. Far more accessible than Steampipe's Go plugin model.
+- **Multi-schema queries.** Joining Kubernetes data with other JSON sources (files, exec output) via `kubernetes.nodes JOIN ec2.instances` is architecturally sound, even if the AWS side is experimental.
+
+---
+
+## Weaknesses
+
+### Priority (blocking credibility)
+
+1. **Narrow built-in resource coverage.** Only pods, jobs, and nodes are built in. Deployments, StatefulSets, DaemonSets, CronJobs, Services, Ingresses, Namespaces, PVs/PVCs are absent. Users can add them via YAML config, but requiring setup before querying standard resources is a significant barrier.
+
+2. **No per-container table.** Pod-level resource data aggregates across all containers. For multi-container pods (sidecars, init containers), individual container visibility is lost. A `containers` table (one row per container, joinable to `pods` via pod UID) is needed.
+
+3. **No context selection at invocation time.** Users must `kubectl config use-context` before running kugl. A `--context` flag is table stakes for anyone with more than one cluster.
+
+4. **No structured output.** Output is human-readable tabular text only. Without `--output csv` or `--output json`, kugl cannot participate in pipelines or feed dashboards.
+
+5. **No shortcut parameters.** Shortcuts are static query aliases. The docs acknowledge this gap and suggest wrapper scripts as the workaround. Named parameter substitution (e.g., `{{namespace}}`) is needed for real team adoption.
+
+### Nice-to-Have
+
+- **Events table.** `kubectl get events` is one of the most-used debugging commands; it should be built in.
+- **PVs/PVCs.** Important for stateful workloads.
+- **RBAC tables.** Roles, RoleBindings, ClusterRoles for security auditing.
+- **Metrics integration.** Joining `kubectl top pods` data with resource requests would enable requests-vs-actual-usage analysis.
+- **Shell completions,** especially for shortcuts.
+- **Richer `--schema` output** (columns, types, source paths).
+
+---
+
+## Comparison to Steampipe (Kubernetes plugin)
+
+| Capability | Kugl | Steampipe |
+|---|---|---|
+| Built-in resource types | pods, jobs, nodes + labels/taints | All standard K8s types |
+| SQL dialect | SQLite | PostgreSQL (full) |
+| CPU/memory type handling | Auto-converted to numerics | Likely raw strings/JSONB |
+| Adding a label column | 4 lines of YAML | Go code + rebuild + reinstall |
+| Adding a new resource type | YAML `create:` block | Go plugin with K8s client call |
+| Ecosystem integration | CLI output only | Postgres wire protocol (Grafana, psql, etc.) |
+| Multi-cluster | Not supported | Aggregator plugins |
+| Cross-source joins | Experimental | Core feature, 100+ plugins |
+| Caching | Built-in TTL cache | Plugin-level |
+| Maintenance | Personal project | Turbot-backed, active community |
+
+Steampipe's Kubernetes plugin likely does **not** pre-convert CPU/memory strings to numerics — this appears to be a genuine and specific kugl advantage for resource utilization queries.
+
+---
+
+## Extension Mechanism
+
+### Current model
+
+Users add columns via `~/.kugl/init.yaml` or `~/.kugl/kubernetes.yaml`:
+
+```yaml
+extend:
+  - table: nodes
+    columns:
+      - name: node_pool
+        type: text
+        label: karpenter.sh/nodepool      # shortcut for metadata.labels."..."
+      - name: provider_id
+        type: text
+        path: spec.providerID             # JMESPath expression
+```
+
+Special kugl types (`size`, `age`, `cpu`, `date`) handle K8s-specific string-to-numeric conversion.
+
+Multi-row-per-resource tables (e.g., one row per container or taint) use `row_source:`, a sequential JMESPath pipeline, with a `^` prefix to reference parent-level fields.
+
+### Friction points
+
+1. **Two-vocabulary system (`path:` vs `label:`).** Users who don't know about `label:` write awkward quoted JMESPath: `metadata.labels."karpenter.sh/nodepool"`. The shortcut is useful but invisible until you need it.
+2. **`path:` is a required key even when it's the only thing expressed.** Each column takes three keys (`name`, `type`, `path`) for what is conceptually a one-line mapping.
+3. **`row_source` + `^` parent references** are non-obvious, but affect only the minority of multi-row-per-resource cases.
+
+### Recommended improvement: unified `from:` key
+
+Replace `path:` / `label:` with a single `from:` key that auto-detects the extraction type:
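+- Value containing `/` with no leading dot-path segment → label name (matches prefixed K8s labels such as `karpenter.sh/nodepool`; unprefixed labels such as `app` contain no `/` and are not detected)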
+- Otherwise → JMESPath expression
+
+```yaml
+extend:
+  - table: nodes
+    columns:
+      - name: node_pool
+        type: text
+        from: karpenter.sh/nodepool      # auto-detected as label
+      - name: provider_id
+        type: text
+        from: spec.providerID            # auto-detected as JMESPath
+```
+
+**Implementation:** add `from_` field to `UserColumn` in `config.py`; dispatch to `LabelExtractor` or `PathExtractor` in `gen_extractor` validator. Keep `path:` and `label:` for backward compatibility. Change is small and non-breaking.
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -1,3 +1,34 @@
+## 0.8.0
+
+New tables in ``kubernetes`` schema:
+
+- ``events``
+- ``cronjobs`` and ``cronjob_labels``
+- ``services`` and ``service_labels``
+- ``deployments`` and ``deployment_labels``
+
+CLI changes (breaking):
+
+- Added ``-c``/``--context`` option to specify a Kubernetes context
+- Renamed ``-a`` option to ``-A`` for consistency with ``kubectl``
+- Renamed ``-c``/``--cache`` to ``-s``/``--stale``
+- Renamed ``-u``/``--update`` to ``-r``/``--refresh``
+- Renamed ``-r``/``--reckless`` to ``-q``/``--quiet`` (and ``reckless:`` in settings to ``quiet:``)
+
+Extending tables:
+
+- Breaking: Named scope syntax for multi-step ``row_source``: each entry takes ``as <name>`` and
+  columns reference ancestor objects with an ``in <name>`` suffix (e.g. ``metadata.uid in node``);
+  the old ``^`` parent-hop syntax is removed
+- New ``from:`` column key that auto-detects label vs JMESPath: values matching
+  ``domain/key`` format (e.g. ``karpenter.sh/nodepool``) use label extraction, everything
+  else uses JMESPath (``path:`` and ``label:`` to be removed in a future release)
+
+Documentation:
+
+- New masthead example of ``kugl`` vs ``kubectl | jq``
+
+
 ## 0.7.0
 
 - Add `init` subcommand to generate `kubernetes.yaml` per recommended post-install configuration
@@ -40,7 +71,7 @@
 - Allow environment variables in `file` resource paths
 - Fix the `exec` resource by adding a `cache_key` field; these resources would otherwise experience cache collisions
 - Resource cache paths and file formats have changed, and cache now lives in `~/.kuglcache`
-- `rm -r ~/.kugl/cache` is recommended to clear obsolete files
+- `rm -r ~/.kuglcache` is recommended to clear obsolete files
 
 ## 0.3.3