Autoresearch v2: 42% faster via graph surgery + executor optimizations by sunapi386 · Pull Request #93 · aceteam-ai/workflow-engine

sunapi386 · 2026-03-21T18:25:14Z

Context

Why

The v1 autoresearch round optimized small arithmetic DAGs but didn't exercise the engine's heaviest code paths: ForEachNode sub-workflow expansion (graph surgery), large DAGs (100+ nodes), retry/backoff loops, and yield/resume. These are the patterns real production workflows hit.

What

Second autoresearch round targeting the 4 new heavy benchmarks. 16 experiments: 11 kept, 5 reverted. Total time drops about 41% (0.4151 s to 0.2432 s) and peak memory about 70% (1.06 MB to 0.32 MB) versus the v1 end state.

How

Focused on eliminating validation overhead in hot paths (_construct_trusted, model_construct), optimizing the ForEachNode.run() loop (pre-computed adapter templates, no per-iteration with_namespace call), and switching the topological executor to incremental, successor-based ready-node tracking.
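For reviewers who want the trusted-construction idea in isolation, here is a minimal stdlib-only sketch. It is not the engine's code: the `Workflow` and `Edge` classes below are hypothetical stand-ins, the validation is reduced to an unknown-node check, and the real change relies on Pydantic's `model_construct` rather than `__new__`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    source: str
    target: str


class Workflow:
    """Hypothetical stand-in for the engine's workflow class."""

    def __init__(self, nodes: list[str], edges: list[Edge]) -> None:
        self.nodes = list(nodes)
        self.edges = list(edges)
        self._validate()  # the public constructor always validates

    def _validate(self) -> None:
        # Simplified check; the real engine validates the full DAG.
        known = set(self.nodes)
        for edge in self.edges:
            if edge.source not in known or edge.target not in known:
                raise ValueError(f"edge references unknown node: {edge}")

    @classmethod
    def _construct_trusted(cls, nodes: list[str], edges: list[Edge]) -> "Workflow":
        # Bypass __init__ (and therefore validation). The caller guarantees
        # the graph was derived from an already-validated workflow.
        self = cls.__new__(cls)
        self.nodes = list(nodes)
        self.edges = list(edges)
        return self

    def with_namespace(self, prefix: str) -> "Workflow":
        # Prefixing every ID cannot invalidate a valid graph,
        # so re-validating here would be redundant work.
        return Workflow._construct_trusted(
            [f"{prefix}/{n}" for n in self.nodes],
            [Edge(f"{prefix}/{e.source}", f"{prefix}/{e.target}") for e in self.edges],
        )
```

The trade-off is that `_construct_trusted` will happily build an invalid graph, so it should only be reachable from code paths whose inputs were validated once up front.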

Summary

| Component | Change | Rationale |
| --- | --- | --- |
| `core/workflow.py` | `_construct_trusted` bypasses DAG validation; `defaultdict` for `edges_by_target`; `Edge.model_construct` in `expand_node` | Validation is redundant for programmatic construction from trusted code |
| `core/node.py` | Preserve cached properties in `with_namespace`; fast-path ShouldRetry; `model_construct` in `_cast_input` | Avoids re-creating `create_model` dynamic types for every namespaced copy |
| `core/edge.py` | `model_construct` in `with_namespace` | Skip Pydantic validation for simple ID prefix |
| `nodes/iteration.py` | Rewritten ForEachNode.run: skip `workflow.with_namespace` per iteration, pre-compute adapter templates | Biggest single target: was ~100ms, now ~39ms |
| `execution/topological.py` | Successor-based incremental ready-node tracking | Avoids scanning all edges on every node completion; `large_100` 3x faster |
| `execution/parallel.py` | Dispatch all initially-ready nodes | Was only dispatching input_node first |
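The successor-based tracking in `execution/topological.py` follows the same idea as Kahn's algorithm. The sketch below illustrates that idea only; the function name, its signature, and the `(source, target)` edge tuples are assumptions made for the example, and it omits the engine's fallback to a full scan after workflow expansion.

```python
from collections import defaultdict, deque


def run_topological(nodes, edges, execute):
    """Call execute(node) for every node in dependency order.

    nodes: iterable of hashable node IDs
    edges: iterable of (source, target) pairs
    """
    successors = defaultdict(list)
    pending = {node: 0 for node in nodes}  # unmet dependency count per node
    for source, target in edges:
        successors[source].append(target)
        pending[target] += 1

    # Nodes with no dependencies are ready immediately.
    ready = deque(node for node, count in pending.items() if count == 0)
    order = []
    while ready:
        node = ready.popleft()
        execute(node)
        order.append(node)
        # Only the successors of the node that just finished can become
        # ready, so there is no need to rescan the whole graph.
        for nxt in successors[node]:
            pending[nxt] -= 1
            if pending[nxt] == 0:
                ready.append(nxt)

    if len(order) != len(pending):
        raise ValueError("cycle detected: some nodes never became ready")
    return order
```

Each edge is visited once when its source completes, so the bookkeeping is linear in nodes plus edges instead of a full scan per completed node.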

Results

| Metric | v1 End | v2 End | Improvement |
| --- | --- | --- | --- |
| `total_time_s` | 0.4151 | 0.2432 | 41% faster |
| `peak_memory_mb` | 1.06 | 0.32 | 70% less memory |
| `correctness` | 16/16 | 16/16 | Maintained |
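For anyone checking the headline numbers, the improvement column can be reproduced from the v1 and v2 end values. The helper name below is illustrative and not part of the benchmark harness.

```python
def percent_reduction(before: float, after: float) -> float:
    """Relative reduction from `before` to `after`, in percent."""
    return (before - after) / before * 100.0


time_pct = percent_reduction(0.4151, 0.2432)
mem_pct = percent_reduction(1.06, 0.32)

print(f"total_time_s:   {time_pct:.1f}% lower")  # 41.4% lower
print(f"peak_memory_mb: {mem_pct:.1f}% lower")   # 69.8% lower
```

The time figure comes out at roughly 41.4%, matching the 41% in the table.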

Per-benchmark highlights (topological):

| Benchmark | v1 | v2 | Speedup |
| --- | --- | --- | --- |
| `foreach_expand` | ~100ms | 39ms | 2.6x |
| `large_100` | ~91ms | 31ms | 2.9x |
| `retry_chain` | ~27ms | 19ms | 1.4x |

Test plan

- All 413 existing tests pass
- Benchmark correctness: 16/16 across all workflows and both executors
- No new dependencies added
- No test files modified

🤖 Generated with Claude Code

…namespace Avoid clearing input_type, output_type, etc. when only the node ID changes. This prevents expensive create_model re-computation for namespaced copies. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
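A rough illustration of why keeping the cached properties matters: if the expensive derived types live in the instance's `__dict__`, a shallow copy that only changes the ID carries them along for free. This sketch assumes `functools.cached_property` and a hypothetical `Node` class; the engine's own caching mechanism may differ.

```python
import copy
from functools import cached_property


class Node:
    build_count = 0  # counts expensive builds, for demonstration only

    def __init__(self, node_id: str, fields: tuple[str, ...]) -> None:
        self.id = node_id
        self.fields = fields

    @cached_property
    def input_type(self) -> type:
        # Stand-in for an expensive dynamic type build such as create_model.
        Node.build_count += 1
        return type("Input", (), {name: None for name in self.fields})

    def with_namespace(self, prefix: str) -> "Node":
        # copy.copy duplicates __dict__, which is where cached_property
        # stores its computed value, so the cache survives the copy.
        clone = copy.copy(self)
        clone.id = f"{prefix}/{self.id}"
        return clone
```

This is only safe when the cached values do not depend on the field being changed, which holds here because the derived type is independent of the node ID.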

… via successor map Instead of scanning all nodes after each execution, only check successors of the just-completed node. Falls back to full scan after workflow expansion. Reduces large_100 topological from ~91ms to ~30ms. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…k formatting ShouldRetry is an expected transient error handled by the execution loop. Skip logger.exception and on_node_error for faster retry handling. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
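The retry fast path described above could look roughly like the following. This is a sketch under assumptions: `run_with_retries` and its `on_error` parameter are hypothetical names, the `ShouldRetry` class here is a local stand-in, and backoff is left out for brevity.

```python
import logging

logger = logging.getLogger("workflow")


class ShouldRetry(Exception):
    """Expected, transient failure (stand-in for the engine's class)."""


def run_with_retries(fn, max_attempts, on_error=None):
    """Call fn() until it succeeds or max_attempts is exhausted."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except ShouldRetry:
            # Fast path: this is expected control flow, so skip traceback
            # formatting and the error callback, then try again.
            if attempt == max_attempts:
                raise
        except Exception as exc:
            # Slow path: an unexpected failure gets full logging and the hook.
            logger.exception("node failed")
            if on_error is not None:
                on_error(exc)
            raise
```

Formatting a traceback on every retry is comparatively expensive, so skipping it for an exception that the loop already handles is where the `retry_chain` savings would plausibly come from.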

sunapi386 and others added 12 commits March 21, 2026 14:01

autoresearch: skip DAG validation in with_namespace, expand_node, For…

08e3423

…EachNode via _construct_trusted Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

autoresearch: optimize with_namespace for Edge and Node to skip re-va…

1fab673

…lidation Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

autoresearch: optimize ForEachNode.run - skip workflow.with_namespace…

d050a7c

… per iteration, use Edge.model_construct, pre-compute adapter templates Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

autoresearch: use Edge.model_construct for all edge creation in ForEa…

254d2e2

…chNode.run Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

autoresearch: use model_construct in _cast_input when all types alrea…

068e3df

…dy match Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

autoresearch: use Edge.model_construct in expand_node for rewired edges

6a464fc

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

autoresearch: dispatch all initially-ready nodes in parallel executor

8fc8c76

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

autoresearch: use defaultdict in edges_by_target to avoid pre-creatin…

97b9de4

…g empty dicts Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

autoresearch: update results.tsv with all v2 experiment records

1b84206

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

xujustinj self-assigned this Apr 6, 2026

sunapi386 wants to merge 12 commits into autoresearch/2026-03-21 from autoresearch/2026-03-21-v2
