flowctl: add catalog disable / enable / restart lifecycle commands #3082

Description

@jwhartley

Summary

Add first-class lifecycle commands to flowctl for pausing, resuming, and restarting a task:

  • flowctl catalog disable --name <task>
  • flowctl catalog enable --name <task>
  • flowctl catalog restart --name <task>

Today there is no disable/enable/restart subcommand. Pausing or restarting a connector requires hand-editing the spec and republishing.

Current workaround and why it hurts

To restart or pause a task you must:

  1. flowctl catalog pull-specs --name <task>
  2. Hand-edit the task YAML to add shards: { disable: true }
  3. flowctl catalog publish --source flow.yaml --auto-approve
  4. Edit again to remove the shards block (it must be removed, not set to false), and delete the now-stale expectPubId line
  5. Publish again
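For illustration, the two hand edits (steps 2 and 4) amount to the following transformations on a pulled task spec. This is only a sketch in Python over a plain dict: it assumes `shards` and `expectPubId` sit at the top level of the task spec, and the helper names `mark_disabled` and `clear_disabled` are hypothetical, not part of flowctl.

```python
import copy


def mark_disabled(task_spec: dict) -> dict:
    """Step 2: return a copy of the spec with shards.disable set to true."""
    spec = copy.deepcopy(task_spec)
    spec.setdefault("shards", {})["disable"] = True
    return spec


def clear_disabled(task_spec: dict) -> dict:
    """Step 4: remove the disable flag (rather than setting it to false)
    and drop the expectPubId that went stale after the first publish."""
    spec = copy.deepcopy(task_spec)
    shards = spec.get("shards", {})
    shards.pop("disable", None)
    if not shards:
        # Drop the block entirely if nothing else was configured in it.
        spec.pop("shards", None)
    spec.pop("expectPubId", None)
    return spec


# Hypothetical pulled spec; real specs carry endpoint config and many bindings.
pulled = {"bindings": ["a", "b"], "expectPubId": "0011223344556677"}
disabled = mark_disabled(pulled)
reenabled = clear_disabled(disabled)
```

Even in this form the whole spec travels through both edits, which is the overhead the proposed commands are meant to remove.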

Pain points:

  • Error-prone: setting disable: false instead of removing it misbehaves; a leftover expectPubId fails the publish.
  • Heavy for large specs: a capture with 500+ bindings round-trips its entire spec through the local filesystem just to flip one flag.
  • Risky on actively-published catalogs: if the tenant's own automation publishes between your two publishes, the second publish fails with "expected publication ID was not matched", and a stale local spec can clobber newer bindings the automation just added.
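To make the race concrete, here is a toy in-memory model of it. It is not the real control plane: `ToyCatalog`, its methods, the integer publication IDs, and the task name `acmeCo/capture` are all made up for illustration. The only assumptions taken from above are that a publish replaces the whole spec and is rejected when `expectPubId` does not match the last publication.

```python
import copy


class ToyCatalog:
    """In-memory stand-in for the control plane (illustrative only)."""

    def __init__(self):
        self._specs = {}       # task name -> (last publication id, spec)
        self._next_pub_id = 1

    def pull(self, name):
        pub_id, spec = self._specs[name]
        pulled = copy.deepcopy(spec)
        pulled["expectPubId"] = pub_id
        return pulled

    def publish(self, name, spec):
        spec = copy.deepcopy(spec)
        expected = spec.pop("expectPubId", None)
        current = self._specs.get(name, (None, None))[0]
        if expected is not None and expected != current:
            raise RuntimeError("expected publication ID was not matched")
        pub_id = self._next_pub_id
        self._next_pub_id += 1
        self._specs[name] = (pub_id, spec)  # whole-spec replace
        return pub_id

    def live(self, name):
        return copy.deepcopy(self._specs[name][1])


catalog = ToyCatalog()
catalog.publish("acmeCo/capture", {"bindings": ["a"]})

# Operator: pull, add the disable block, publish (first of two publishes).
mine = catalog.pull("acmeCo/capture")
mine["shards"] = {"disable": True}
catalog.publish("acmeCo/capture", mine)

# Tenant automation publishes a new binding in between.
theirs = catalog.pull("acmeCo/capture")
theirs["bindings"].append("b")
catalog.publish("acmeCo/capture", theirs)

# Operator: remove the block and publish the local copy again.
del mine["shards"]
try:
    catalog.publish("acmeCo/capture", mine)
    outcome = "published"
except RuntimeError as err:
    outcome = str(err)  # the stale expectPubId is rejected

# Deleting expectPubId lets the publish through, but drops binding "b".
del mine["expectPubId"]
catalog.publish("acmeCo/capture", mine)
clobbered_bindings = catalog.live("acmeCo/capture")["bindings"]
```

In this model the operator either gets the mismatch error or, after removing `expectPubId`, silently overwrites the automation's new binding. Neither outcome is acceptable for what should be a one-flag change.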

Motivating use cases

  1. Pick up new journal topology after flowctl collections split-journals. A split only takes effect when the writing task restarts and re-reads the journal list; today there's no clean way to trigger that, so a freshly split collection stays on its old single journal until the next unrelated republish. A restart command would pair directly with split-journals.
  2. Clear retry backoff immediately (retries can back off up to ~15 min); a disable/enable cycle bypasses it.
  3. Pause a task for upstream/source maintenance and resume from checkpoint.

Proposed behavior

  • disable / enable operate on the named task(s) only, server-side, without rewriting or re-publishing unrelated bindings — avoiding the clobber/expectPubId races that come with the pull-edit-publish flow.
  • restart bounces the task (shard term restart, or disable+enable) so it re-initializes and re-reads journals, ideally without requiring two publications.
  • Support multiple --name flags and/or --prefix, a --dry-run, and clear status output (e.g. confirm TASK_DISABLED / running).
  • Resume from checkpoint on enable/restart (no data loss), matching today's disable/enable behavior.
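As a rough sketch of the intended semantics (not a design for the actual endpoint), a targeted toggle would touch only the disable flag on the named tasks in the live specs and leave everything else as it is. The function `set_disabled`, its parameters, and the status strings are hypothetical; the live specs are modeled as a plain dict keyed by task name.

```python
import copy


def set_disabled(live_specs, names, disabled, dry_run=False):
    """Flip shards.disable on the named tasks only.

    Returns (specs, report). With dry_run=True the input specs are
    returned untouched and the report describes what would change.
    """
    updated = copy.deepcopy(live_specs)
    report = []
    for name in names:
        if name not in updated:
            report.append(f"{name}: not found")
            continue
        shards = updated[name].setdefault("shards", {})
        if disabled:
            shards["disable"] = True
        else:
            shards.pop("disable", None)  # remove the flag, never set false
        if not shards:
            del updated[name]["shards"]
        state = "TASK_DISABLED" if disabled else "running"
        prefix = "would be " if dry_run else ""
        report.append(f"{name}: {prefix}{state}")
    return (live_specs if dry_run else updated), report


live = {
    "acmeCo/capture": {"bindings": ["a", "b"]},
    "acmeCo/other": {"bindings": ["x"]},
}
after_disable, report = set_disabled(live, ["acmeCo/capture"], disabled=True)
after_enable, _ = set_disabled(after_disable, ["acmeCo/capture"], disabled=False)
_, preview = set_disabled(live, ["acmeCo/capture"], disabled=True, dry_run=True)
```

Because the change is applied to whatever is live at that moment, there is no local copy to go stale and no `expectPubId` to mismatch.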

Open question for the team

Is this best implemented purely as flowctl convenience wrappers over the existing publish path, or does a clean "disable/enable/restart this task only" need a dedicated control-plane endpoint (to avoid the whole-spec republish and the races above)? Filing under the control-plane project since the no-clobber version likely needs server-side support.
It may also be worth adding a restart button to the UI, since a restart is often what Disable + Enable is used for.

Context

Came up while activating collections split-journals results on a tenant whose automation republishes its pipeline frequently — the hand-edit-and-publish restart was both clumsy and unsafe against the concurrent publishes.

Metadata

Labels: flowctl (Issues related to the user facing CLI)
