Discussion: should config models reject unknown fields (extra='forbid')? #138

@PolyphonyRequiem

Observed behavior

Conductor's Pydantic config models inherit Pydantic's default extra='ignore', and WorkflowConfig.model_validate(data) (config/loader.py:327) does not override it. None of the models in config/schema.py declare a model_config = ConfigDict(extra=...). The result is that unknown fields in workflow YAML — whether typos or wrong-level placement — are silently dropped at validation time.

Minimal repro:

# silent-typo.yaml
workflow:
  name: demo
  entry_point: writer
  runtime:
    provider: claude
agents:
  - name: writer
    provider: claude
    model: claude-sonnet-4-5-20250929
    context_window: 1000000   # <-- not a field on AgentDef; silently ignored
    prompt: "Hello"

$ conductor validate silent-typo.yaml
✓ Configuration is valid

AgentDef (schema.py:390-610) has no context_window field, but no warning or error is raised. The author's intent is dropped on the floor.
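
The same behavior can be reproduced outside conductor with plain Pydantic v2. This is a minimal sketch under assumptions: AgentSketch is a hypothetical stand-in that models only the fields used in the repro, not the real AgentDef.

```python
from pydantic import BaseModel


class AgentSketch(BaseModel):
    """Hypothetical stand-in for AgentDef; only the repro's fields are modeled."""

    # No model_config is declared, so Pydantic's default extra='ignore' applies.
    name: str
    provider: str
    model: str
    prompt: str


raw = {
    "name": "writer",
    "provider": "claude",
    "model": "claude-sonnet-4-5-20250929",
    "context_window": 1000000,  # not a declared field
    "prompt": "Hello",
}

# Validation succeeds without any warning or error.
agent = AgentSketch.model_validate(raw)

# The unknown key is absent from the validated model.
print("context_window" in agent.model_dump())  # prints False
```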

Why it matters

The two patterns I keep tripping over in our own workflow repo:

  1. Field name that doesn't exist on the model. context_window: 1000000 on an agent. Real example I found in our polyphony-conductor-workflows repo, present in feature-pr.yaml, github-pr.yaml, and implement-pg.yaml — six occurrences. The author (me) clearly meant it to do something, and conductor accepted it without complaint, so it's been sitting there inert.

  2. Right field name, wrong nesting level. runtime.limits.max_iterations instead of limits.max_iterations. runtime (schema.py:752) and limits (schema.py:761) are siblings on WorkflowDef, but it's easy to assume limits belongs under runtime because it controls runtime behavior. Nested incorrectly, the limit is silently ignored and you get the default max_iterations: 10 (schema.py:293) — which, for a recursive planning workflow, fails fast in a confusing way.

Both classes of mistake are particularly painful because the symptom shows up minutes into a long-running workflow (hitting the wrong iteration limit, or an agent behaving as if it has the default context window) rather than at validate time.
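
The wrong-nesting case can be sketched the same way. The three models below are hypothetical reductions of WorkflowDef, RuntimeConfig, and LimitsConfig; the only details taken from this issue are that runtime and limits are siblings and that max_iterations defaults to 10. How the real schema supplies the default limits object is an assumption.

```python
from pydantic import BaseModel, Field


class LimitsSketch(BaseModel):
    max_iterations: int = 10  # default cited in this issue (schema.py:293)


class RuntimeSketch(BaseModel):
    provider: str


class WorkflowSketch(BaseModel):
    name: str
    runtime: RuntimeSketch
    # Assumed: limits falls back to an all-defaults object when omitted.
    limits: LimitsSketch = Field(default_factory=LimitsSketch)


misnested = {
    "name": "demo",
    "runtime": {
        "provider": "claude",
        # Wrong level: limits should be a sibling of runtime.
        "limits": {"max_iterations": 50},
    },
}

wf = WorkflowSketch.model_validate(misnested)

# RuntimeSketch ignored the nested key, so the sibling default wins.
print(wf.limits.max_iterations)  # prints 10
```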

Trade-offs

I'd be wary of flipping extra='forbid' repo-wide without thinking through:

  • Forward compatibility for new fields. If a YAML file is written against a newer conductor version that adds a field, older conductor versions would suddenly reject it as unknown. Today they just ignore it, which is a softer failure mode for distributed workflows.
  • Tooling metadata. Some YAML files intentionally carry metadata for external tools (dashboards, CI helpers, docs generators). Strict mode would force those tools either to move their keys under metadata: (schema.py:770), which is already a dict[str, Any] escape hatch but one that not everyone knows to use, or to fork the schema.
  • Migration cost. Existing workflows out in the wild that have silent typos will start failing to validate after the change. That's arguably the point, but it's a breaking change in practice.
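
To make the first two trade-offs concrete, here is a rough sketch of how a strict model treats an unrecognized top-level key versus the same key placed under metadata. StrictWorkflowSketch and the dashboard_id key are hypothetical; only the metadata: dict[str, Any] shape comes from this issue. An older conductor meeting a field added in a newer release would hit the same rejection path as the first call.

```python
from typing import Any

from pydantic import BaseModel, ConfigDict, Field, ValidationError


class StrictWorkflowSketch(BaseModel):
    """Hypothetical strict model with a free-form metadata escape hatch."""

    model_config = ConfigDict(extra="forbid")

    name: str
    metadata: dict[str, Any] = Field(default_factory=dict)


def try_validate(data: dict[str, Any]) -> str:
    """Return 'ok', or the comma-joined Pydantic error types on failure."""
    try:
        StrictWorkflowSketch.model_validate(data)
    except ValidationError as exc:
        return ", ".join(err["type"] for err in exc.errors())
    return "ok"


# Tooling key at the top level: rejected under extra='forbid'.
top_level = try_validate({"name": "demo", "dashboard_id": "team-a"})

# Same key moved under metadata: accepted, since metadata is free-form.
under_metadata = try_validate({"name": "demo", "metadata": {"dashboard_id": "team-a"}})

print(top_level)       # prints extra_forbidden
print(under_metadata)  # prints ok
```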

Possible alternatives

A few directions worth weighing — not advocating for any specific one:

  1. extra='forbid' on all config models. Simple and strict, but it breaks existing workflows that have inert typos. Probably the right end-state, possibly behind a major version bump.
  2. Warnings-first staging. Emit a deprecation-style warning for unknown fields now (e.g. printed during conductor validate), promote to error in a later release. Gives users a window to clean up before the hard break.
  3. Allowlist of known tooling-key prefixes. Permit unknown fields whose key matches a configurable set (e.g. x-*, meta:*) and reject the rest. Borrows the OpenAPI extension convention.
  4. Strict mode confined to specific models. Keep WorkflowDef lenient (forward-compat with new top-level keys), but apply extra='forbid' to leaf config models like AgentDef, LimitsConfig, RuntimeConfig, RetryPolicy where typos are most painful and the field surface is well-understood.
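
Options 2 and 3 could be combined in a small pre-validation check. The sketch below is illustrative only: check_unknown_keys, ALLOWED_PATTERNS, and the strict flag are hypothetical names, not existing conductor code, and the patterns are simply the examples from option 3. It uses only the standard library and takes the set of known field names as an argument.

```python
import warnings
from fnmatch import fnmatchcase

# Assumed allowlist, taken from the examples in option 3.
ALLOWED_PATTERNS = ("x-*", "meta:*")


def check_unknown_keys(data, known_fields, model_name, strict=False):
    """Return unknown, non-allowlisted keys; warn about them, or raise if strict."""
    unknown = [
        key
        for key in data
        if key not in known_fields
        # str() guards against non-string YAML keys such as integers.
        and not any(fnmatchcase(str(key), pattern) for pattern in ALLOWED_PATTERNS)
    ]
    if unknown:
        message = f"{model_name}: unknown field(s) {sorted(map(str, unknown))}"
        if strict:
            raise ValueError(message)
        # FutureWarning is shown to end users by default, unlike DeprecationWarning.
        warnings.warn(message, FutureWarning, stacklevel=2)
    return unknown


agent = {"name": "writer", "context_window": 1000000, "x-dashboard": "team-a"}
flagged = check_unknown_keys(agent, {"name", "provider", "model", "prompt"}, "AgentDef")

# x-dashboard matches the allowlist; context_window does not.
print(flagged)  # prints ['context_window']
```

With Pydantic v2 models, the known_fields argument could plausibly be derived from the model class's model_fields mapping, and flipping strict to True in a later release would give the staged promotion from warning to error described in option 2.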

There may well be other framings I'm missing.

Question for maintainers

Is this a direction you've considered? Is there a preferred shape — strict everywhere, warnings-first, scoped to specific models, or "leave it as-is, here's why"? Happy to send a PR exploring whichever direction you'd like to see, once there's some agreement on the shape.
