A lightweight DAG-based orchestration engine for shell-based workflows.
MaxQ orchestrates multi-stage workflows using shell scripts and HTTP/JSON communication. Workflows are discovered from the filesystem, executed as processes, and coordinated through an SQLite-backed scheduler.
Design Philosophy:
- Filesystem-Based: Flows discovered from directory structure, not API registration
- Language Agnostic: Shell scripts can invoke any language or tool
- Zero External Dependencies: Embedded SQLite database, no external services required
- HTTP Protocol: All communication via REST API with JSON
- Stateful: SQLite is the source of truth for all state
- DAG Execution: Steps have dependencies and execute in topological order
- Flow: A workflow definition, represented as an executable shell script (flow.sh) in the filesystem. Flows orchestrate stages and are called back when each stage completes.
- Run: A single execution instance of a flow. Created when a flow is triggered via the API, tracked through pending → running → completed/failed states.
- Stage: A named batch of steps scheduled together by the flow (e.g., "data-fetch", "analysis"). Stages provide natural checkpoints and trigger flow callbacks when complete.
- Step: An individual unit of work within a stage. Steps are shell scripts that execute as processes, declare dependencies (forming the DAG), and post results via the HTTP API.
- Node.js 18+
- Bash 4.0+
- Standard Unix utilities: curl, jq
# Required
MAXQ_FLOWS_ROOT=/path/to/flows # Directory containing workflow definitions
# Optional
MAXQ_SQLITE_PATH=/path/to/maxq.db # SQLite database path (default: ./data/maxq.db)
MAXQ_SERVER_PORT=5003 # HTTP server port (default: 5003)
MAXQ_SCHEDULER_INTERVAL_MS=200 # Scheduler polling interval (default: 200ms)
MAXQ_SCHEDULER_BATCH_SIZE=10 # Steps per scheduler iteration (default: 10)
MAXQ_MAX_CONCURRENT_STEPS=10 # Max parallel step execution (default: 10)
LOG_LEVEL=info                     # Log level: debug, info, warn, error
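For example, a minimal local start needs only the required variable (the flows path below is a placeholder):

# Minimal startup sketch: set the one required variable, accept defaults
# for everything else, and launch with the start script from Installation below.
export MAXQ_FLOWS_ROOT=/srv/flows    # placeholder path to your flow definitions
LOG_LEVEL=debug ./scripts/start.sh   # optional: verbose logging while developing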
┌────────────────────────────────────────────────────┐
│                     MaxQ Server                    │
│                                                    │
│  ┌─────────────┐  ┌──────────────┐  ┌───────────┐  │
│  │  HTTP API   │  │  Scheduler   │  │  SQLite   │  │
│  │             │  │              │  │  Database │  │
│  └─────────────┘  └──────────────┘  └───────────┘  │
└────────────────────────────────────────────────────┘
                          │
                          ├── Spawns: flow.sh processes
                          │
                          └── Spawns: step.sh processes
Execution Flow:
- User triggers flow via POST /api/v1/runs
- MaxQ creates run record and spawns flow.sh process
- Flow schedules a stage by posting step definitions to API
- Scheduler claims pending steps and spawns step.sh processes
- Steps execute, post results via HTTP, exit with status code
- When stage completes, MaxQ calls flow back with completed stage name
- Flow schedules next stage or marks final stage
- Run completes when final stage finishes
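As a sketch of triggering a run and watching it progress (the id and status field names in the responses are assumptions here; see docs/specification.md for the exact schema):

# Trigger a run and capture its id from the response (assumed field name).
RUN_ID=$(curl -s -X POST http://localhost:5003/api/v1/runs \
  -H "Content-Type: application/json" \
  -d '{"flowName": "hello_world"}' | jq -r '.id')

# Poll the run until it reaches completed or failed (assumed field name).
curl -s "http://localhost:5003/api/v1/runs/$RUN_ID" | jq -r '.status'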
FLOWS_ROOT/
└── hello_world/
├── flow.sh # Flow orchestration
└── steps/
├── greet/
│ └── step.sh # Step implementation
└── farewell/
└── step.sh
hello_world/flow.sh:

#!/bin/bash
set -e
if [ -z "$MAXQ_COMPLETED_STAGE" ]; then
  # First call - schedule greeting stage
  curl -X POST "$MAXQ_API/runs/$MAXQ_RUN_ID/steps" \
    -H "Content-Type: application/json" \
    -d '{
      "stage": "greeting",
      "final": false,
      "steps": [{
        "id": "greet-step",
        "name": "greet",
        "dependsOn": [],
        "maxRetries": 0
      }]
    }'
elif [ "$MAXQ_COMPLETED_STAGE" = "greeting" ]; then
  # Second call - schedule farewell stage
  curl -X POST "$MAXQ_API/runs/$MAXQ_RUN_ID/steps" \
    -H "Content-Type: application/json" \
    -d '{
      "stage": "farewell",
      "final": true,
      "steps": [{
        "id": "farewell-step",
        "name": "farewell",
        "dependsOn": [],
        "maxRetries": 0
      }]
    }'
fi

steps/greet/step.sh:

#!/bin/bash
set -e
echo "Hello, World!"
# Post results via HTTP API
curl -X POST "$MAXQ_API/runs/$MAXQ_RUN_ID/steps/$MAXQ_STEP_ID/fields" \
-H "Content-Type: application/json" \
-d '{"fields": {"message": "Hello, World!", "timestamp": '$(date +%s)'}}'
exit 0  # Exit code determines success/failure

Trigger the flow:

curl -X POST http://localhost:5003/api/v1/runs \
-H "Content-Type: application/json" \
-d '{"flowName": "hello_world"}'

# Clone and build
git clone https://github.com/codespin-ai/maxq.git
cd maxq
./scripts/build.sh
# Start server (creates SQLite database automatically)
./scripts/start.sh

# Build image
./scripts/docker-build.sh
# Run
docker run -p 5003:5003 \
-v /path/to/flows:/app/flows \
-v /path/to/data:/app/data \
maxq:latest
# Test the image
./scripts/docker-test.sh

./scripts/build.sh # Build all packages
./scripts/clean.sh # Remove build artifacts and node_modules
./scripts/lint-all.sh # Run ESLint
./scripts/lint-all.sh --fix # Run ESLint with auto-fix
./scripts/format-all.sh # Format with Prettier
npm test # Run all tests
npm run test:grep -- "pattern" # Run tests matching a pattern

Flows control parallelism by generating multiple step IDs with the same script name:
{
"steps": [
{ "id": "scraper-0", "name": "scraper", "env": { "SHARD": "0" } },
{ "id": "scraper-1", "name": "scraper", "env": { "SHARD": "1" } },
{ "id": "scraper-2", "name": "scraper", "env": { "SHARD": "2" } }
]
}

All three execute steps/scraper/step.sh with unique MAXQ_STEP_ID environment variables.
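A matching steps/scraper/step.sh might look like the following minimal sketch; the actual work is elided and the posted field names are illustrative:

#!/bin/bash
set -e

# Each parallel copy runs this same script with a unique MAXQ_STEP_ID
# and its own SHARD value from the step's env block.
echo "scraper shard $SHARD (step $MAXQ_STEP_ID)"

# ... do this shard's slice of the work here ...

# Report per-shard results for downstream steps to query (illustrative fields).
curl -X POST "$MAXQ_API/runs/$MAXQ_RUN_ID/steps/$MAXQ_STEP_ID/fields" \
  -H "Content-Type: application/json" \
  -d "{\"fields\": {\"shard\": \"$SHARD\", \"pagesScraped\": 0}}"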
Steps specify dependencies using step IDs:
{
"steps": [
{ "id": "fetch-data", "name": "fetch", "dependsOn": [] },
{ "id": "process-1", "name": "process", "dependsOn": ["fetch-data"] },
{ "id": "process-2", "name": "process", "dependsOn": ["fetch-data"] }
]
}

The scheduler ensures process-1 and process-2 only execute after fetch-data completes.
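Dependencies compose into richer shapes; for instance, a hypothetical merge step could fan back in by depending on both branches:

{
  "steps": [
    { "id": "fetch-data", "name": "fetch", "dependsOn": [] },
    { "id": "process-1", "name": "process", "dependsOn": ["fetch-data"] },
    { "id": "process-2", "name": "process", "dependsOn": ["fetch-data"] },
    { "id": "merge", "name": "merge", "dependsOn": ["process-1", "process-2"] }
  ]
}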
Steps post arbitrary JSON data (fields) that downstream steps can query:
# Post results
curl -X POST "$MAXQ_API/runs/$MAXQ_RUN_ID/steps/$MAXQ_STEP_ID/fields" \
  -H "Content-Type: application/json" \
  -d '{"fields": {"articles": [...], "count": 42}}'
# Query results
curl "$MAXQ_API/runs/$MAXQ_RUN_ID/fields?stepId=fetch-data"
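A downstream step can combine this query with jq to consume upstream output; the response envelope (a top-level fields object) is an assumption here, so verify it against docs/specification.md:

#!/bin/bash
set -e

# Fetch the fields posted by the upstream fetch-data step.
UPSTREAM=$(curl -s "$MAXQ_API/runs/$MAXQ_RUN_ID/fields?stepId=fetch-data")

# Extract a single value (assumed response shape: {"fields": {"count": ...}}).
COUNT=$(echo "$UPSTREAM" | jq '.fields.count')
echo "upstream reported $COUNT articles"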
Steps are queued and claimed by a background scheduler that:
- Polls for pending steps at configurable intervals
- Respects dependency ordering (DAG)
- Supports horizontal scaling with worker IDs
- Provides atomic step claiming to prevent double-execution
# Abort running workflow
curl -X POST "$MAXQ_API/runs/$RUN_ID/abort"
# Retry failed or aborted workflow
curl -X POST "$MAXQ_API/runs/$RUN_ID/retry"

Retry resets incomplete work to pending and resumes execution.
- Complete Specification - HTTP API, database schema, workflow examples
- Coding Standards - Development guidelines and patterns
- Examples - Working example workflows
- Market Analysis - Multi-stage workflow with parallel processing
Base URL: http://localhost:5003/api/v1
# Trigger flow
POST /runs
Body: {"flowName": "my_flow"}
# Get run status
GET /runs/{runId}
# List runs
GET /runs?flowName={name}&status={status}
# Schedule stage (called by flow.sh)
POST /runs/{runId}/steps
# Post step results (called by step.sh)
POST /runs/{runId}/steps/{stepId}/fields
# Query step results
GET /runs/{runId}/fields?stepId={id}
# Abort run
POST /runs/{runId}/abort
# Retry run
POST /runs/{runId}/retry
# Create log entry
POST /runs/{runId}/logs
# List logs
GET /runs/{runId}/logs

See docs/specification.md for complete API documentation.
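For example, to list the failed runs of a flow and pretty-print the response (its exact shape is documented in the specification):

# Query runs filtered by flow name and status; jq pretty-prints the JSON.
curl -s "http://localhost:5003/api/v1/runs?flowName=hello_world&status=failed" | jq .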
MaxQ differs from systems like Metaflow, Prefect, and Argo:
- Language: Shell scripts instead of Python decorators or YAML
- Infrastructure: Zero external dependencies (embedded SQLite)
- Flow Definition: Filesystem discovery instead of code registration
- Orchestration: Callback pattern with explicit stages
- Execution: Native processes instead of containers or Python functions
MIT