Starter Pipeline Guide

Get your first Stellar data pipeline running in 2 minutes.

Prerequisites

  • Internet access to download components on first run
  • Optional: duckdb CLI if you want to query the DuckDB file from your shell

The default DuckDB starter path uses driver: process and downloads component binaries directly from OCI images. You do not need a local Docker daemon for this path.

Installation

# Recommended: install without cloning the repo
go install github.com/withobsrvr/flowctl@latest
export PATH="$HOME/go/bin:$PATH"
flowctl version

# Alternative: curl installer
curl -sSL https://flowctl.withobsrvr.com/install.sh | sh
flowctl version

If you are contributing to this repository, you can build from source instead:

git clone https://github.com/withobsrvr/flowctl.git
cd flowctl
make build
./bin/flowctl version
./scripts/quickstart.sh
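Whichever install method you use, a small preflight check can confirm that the tools are on your PATH before you continue. The sketch below is a minimal example; the helper name require_cmd is hypothetical and not part of flowctl.

```shell
# Hypothetical helper (not part of flowctl): succeed only if a command is on PATH.
require_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    return 0
  fi
  echo "missing required command: $1" >&2
  return 1
}

# Example usage (uncomment to run):
# require_cmd flowctl && flowctl version
# require_cmd duckdb || echo "duckdb CLI is optional; skipping shell queries"
```

Because the duckdb CLI is optional, a missing duckdb is treated as a notice in the usage lines rather than a failure.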

Create Your First Pipeline

Option 1: Interactive Mode (Recommended)

flowctl init

Follow the prompts:

  1. Network: Select testnet (recommended for learning) or mainnet
  2. Destination: Select where to store data:
    • duckdb - Embedded analytics database (easiest)
    • postgres - PostgreSQL database

This creates a stellar-pipeline.yaml file.

Option 2: Non-Interactive Mode

For automation or CI/CD:

# Recommended preset for first run
flowctl init --preset testnet-duckdb

# Equivalent explicit form
flowctl init --non-interactive --network testnet --destination duckdb

# Create a mainnet pipeline with PostgreSQL sink
flowctl init --non-interactive --network mainnet --destination postgres -o mainnet-pipeline.yaml
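In CI it can be convenient to generate the pipeline file only when it is not already present. The following is a minimal sketch that wraps the non-interactive command shown above; the function name ensure_pipeline is hypothetical, and it assumes you prefer to keep an existing file (how flowctl init itself treats an existing file is not covered here).

```shell
# Hypothetical wrapper: ensure_pipeline FILE NETWORK DESTINATION
# Runs flowctl init only when FILE does not exist yet.
ensure_pipeline() {
  pipeline_file=$1
  if [ -f "$pipeline_file" ]; then
    echo "keeping existing $pipeline_file"
    return 0
  fi
  # Flags are the ones documented above: --non-interactive, --network, --destination, -o
  flowctl init --non-interactive --network "$2" --destination "$3" -o "$pipeline_file"
}

# Example usage (uncomment to run):
# ensure_pipeline stellar-pipeline.yaml testnet duckdb
```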

Run the Pipeline

flowctl run stellar-pipeline.yaml

What happens:

  1. flowctl downloads required components from Docker Hub (first run only)
  2. Starts the embedded control plane
  3. Launches components: source -> contract-events-processor -> sink
  4. Data flows from Stellar network to your chosen destination

Press Ctrl+C to stop.

Verify Data

DuckDB

# Query the DuckDB file for contract events
duckdb stellar-pipeline.duckdb "SELECT * FROM contract_events LIMIT 5"

PostgreSQL

# Connect and query contract events
psql -h localhost -U postgres -d stellar_events -c "SELECT * FROM contract_events LIMIT 5"

Note: this path requires the postgres-consumer@v1.0.0 component image to be available in your registry. If it is not published yet, use the DuckDB starter path.

Sample Pipelines

This directory contains sample pipeline configurations generated by flowctl init:

File                            Network  Sink        Description
testnet-duckdb-pipeline.yaml    testnet  DuckDB      Easiest setup for learning
testnet-postgres-pipeline.yaml  testnet  PostgreSQL  Production-like setup

Understanding the Generated Configuration

A typical generated pipeline looks like:

apiVersion: flowctl/v1
kind: Pipeline
metadata:
  name: stellar-pipeline
  description: Process stellar contract events on testnet

spec:
  driver: process

  sources:
    - id: stellar-source
      type: stellar-live-source@v1.0.0
      config:
        network_passphrase: "Test SDF Network ; September 2015"
        backend_type: RPC
        rpc_endpoint: https://soroban-testnet.stellar.org
        start_ledger: 2187805

  processors:
    - id: contract-events
      type: contract-events-processor@v1.0.0
      config:
        network_passphrase: "Test SDF Network ; September 2015"
      inputs: ["stellar-source"]

  sinks:
    - id: duckdb-sink
      type: duckdb-consumer@v1.0.0
      config:
        database_path: ./stellar-pipeline.duckdb
      inputs: ["contract-events"]

Key points:

  • driver: process runs components as local processes
  • Components are automatically downloaded from Docker Hub
  • Pipeline has three stages: source → processor → sink
  • The contract-events processor extracts Soroban events from ledgers
  • inputs connects each component to its upstream data source
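To see how inputs wires the stages together, you can print each upstream/downstream pair from a generated file. This is a rough line-based sketch, not a YAML parser: it assumes the layout that flowctl init produces above (an "- id:" line followed later by an inline "inputs: [...]" list), and the function name show_wiring is hypothetical.

```shell
# Hypothetical helper: print "upstream -> component" for every inputs line.
# Assumes each component starts with "- id: NAME" and lists inputs inline.
show_wiring() {
  awk '
    /^[ \t]*- id:/ { id = $3 }            # remember the current component id
    /^[ \t]*inputs:/ {
      line = $0
      sub(/^[ \t]*inputs:[ \t]*/, "", line)  # drop the key
      gsub(/\[|\]|"| /, "", line)            # drop brackets, quotes, spaces
      print line " -> " id
    }
  ' "$1"
}

# Example usage (uncomment to run):
# show_wiring stellar-pipeline.yaml
```

For the configuration shown above, this prints `stellar-source -> contract-events` followed by `contract-events -> duckdb-sink`, which matches the source → processor → sink order.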

Troubleshooting

"Component not found" or "Image pull failed"

Check whether the component was cached locally:

find ~/.flowctl -maxdepth 4 -type f | head

Re-run with debug logging to see the exact pull failure:

flowctl run stellar-pipeline.yaml --log-level=debug

Common causes:

  • no network access
  • registry rate limiting or temporary registry errors
  • a typo in the component reference

"Connection refused" to control plane

Ensure port 8080 is available:

lsof -i :8080

No data appearing

  1. Check component logs:

    flowctl run stellar-pipeline.yaml --log-level=debug
  2. Verify network connectivity to the Stellar RPC endpoint set in rpc_endpoint (for example https://soroban-testnet.stellar.org)

DuckDB file not created

Check the working directory and ensure the path is writable:

ls -la .
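The same check can be scripted against the database_path value from your pipeline file. The sketch below is a minimal example under the assumption that the sink creates the file inside an existing directory; the function name check_db_path is hypothetical.

```shell
# Hypothetical helper: check_db_path PATH
# Returns 0 if the parent directory of PATH exists and is writable,
# 1 if the directory is missing, 2 if it is not writable.
check_db_path() {
  db_dir=$(dirname "$1")
  if [ ! -d "$db_dir" ]; then
    echo "directory does not exist: $db_dir" >&2
    return 1
  fi
  if [ ! -w "$db_dir" ]; then
    echo "directory is not writable: $db_dir" >&2
    return 2
  fi
  return 0
}

# Example usage (uncomment to run):
# check_db_path ./stellar-pipeline.duckdb && echo "path looks usable"
```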

Next Steps

  • Add processors: Transform data with processors between source and sink
  • Monitor pipelines: Use flowctl dashboard for real-time monitoring
  • Deploy to production: Use flowctl translate for image-based pipelines. Starter pipelines generated by flowctl init are primarily intended to be run directly with flowctl run using the process driver

Resources