Welcome to Flowbench, a free suite of micro-tools for automating common marketplace tasks.
- Node.js 18+
- pnpm 8+
- PostgreSQL 14+
- Supabase account (for file storage)
- Clone the repository:
git clone https://github.com/yourusername/flowbench.git
cd flowbench
- Install dependencies:
pnpm install
- Set up environment variables:
cp apps/web/.env.example apps/web/.env.local
Edit .env.local with your database and Supabase credentials.
- Run database migrations:
pnpm db:migrate
- Seed sample data (optional):
pnpm db:seed
- Start the development server:
pnpm dev
Visit http://localhost:3000
Purpose: Clean and normalize spreadsheets
Input: CSV, XLSX files
Operations:
- Deduplicate exact rows
- Trim whitespace
- Normalize case (lower, upper, title)
- Fix dates to ISO format
- Remove empty rows
Performance: Handles 100k rows in under 60 seconds
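The operations above can be sketched in a few lines of TypeScript. This is a minimal illustration, not Flowbench's actual implementation: `cleanRows`, `normalizeCase`, and `toTitleCase` are hypothetical names, rows are assumed to be already parsed into string arrays, and date fixing is left out because it depends on the input formats. Note that this sketch deduplicates after trimming and case normalization, so rows that differ only in whitespace or case are merged as well.

```typescript
// Hypothetical helpers for illustration only.
type Row = string[];
type CaseMode = "lower" | "upper" | "title";

// Title-case each whitespace-separated word, e.g. "aCME corp" -> "Acme Corp".
function toTitleCase(value: string): string {
  return value
    .toLowerCase()
    .replace(/(^|\s)\S/g, (match) => match.toUpperCase());
}

function normalizeCase(value: string, mode: CaseMode): string {
  if (mode === "lower") return value.toLowerCase();
  if (mode === "upper") return value.toUpperCase();
  return toTitleCase(value);
}

// Trim cells, normalize case, drop empty rows, then drop exact duplicates,
// keeping the first occurrence and the original row order.
function cleanRows(rows: Row[], mode: CaseMode): Row[] {
  const seen = new Set<string>();
  const cleaned: Row[] = [];
  for (const row of rows) {
    const cells = row.map((cell) => normalizeCase(cell.trim(), mode));
    if (cells.every((cell) => cell === "")) continue; // empty row
    const key = JSON.stringify(cells); // unambiguous key for exact comparison
    if (seen.has(key)) continue; // duplicate of an earlier row
    seen.add(key);
    cleaned.push(cells);
  }
  return cleaned;
}
```

Using a `Set` of serialized rows keeps deduplication linear in the number of rows, which matters at the 100k-row scale mentioned above.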
Purpose: Extract structured data from invoices
Input: PDF invoices, image receipts
Output: CSV of invoices, CSV of line items, per-file JSON
Accuracy: 95%+ on sample dataset
See individual tool documentation in the tools/ directory.
Clean up old files and jobs:
psql $DATABASE_URL -c "SELECT cleanup_old_data();"
Run this daily via cron job or Vercel Cron.
View recent jobs:
SELECT * FROM jobs ORDER BY created_at DESC LIMIT 100;
Check storage usage:
SELECT
tool_id,
-- The join yields one row per file, so count distinct jobs rather than rows
COUNT(DISTINCT j.id) as job_count,
SUM(size_bytes) as total_bytes
FROM jobs j
JOIN files f ON f.job_id = j.id
WHERE j.created_at > NOW() - INTERVAL '7 days'
GROUP BY tool_id;
Current limits:
- Per IP: 100 requests/hour
- Per User: 500 requests/hour
Adjust in .env:
RATE_LIMIT_PER_IP=100
RATE_LIMIT_PER_USER=500
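As a rough sketch of how these two settings might be consumed, the TypeScript below reads the limits from an env-like record and applies a fixed one-hour window per key. The limiter design, the class name `FixedWindowLimiter`, and the helper `readRateLimits` are assumptions made for illustration; the document does not say how Flowbench actually enforces its limits.

```typescript
// Hypothetical sketch: an in-memory, fixed-window limiter.
interface RateLimits {
  perIp: number;
  perUser: number;
}

// Read limits from an env-like record, falling back to the documented defaults.
function readRateLimits(env: Record<string, string | undefined>): RateLimits {
  const parse = (raw: string | undefined, fallback: number): number => {
    const value = Number.parseInt(raw ?? "", 10);
    return Number.isFinite(value) && value > 0 ? value : fallback;
  };
  return {
    perIp: parse(env["RATE_LIMIT_PER_IP"], 100),
    perUser: parse(env["RATE_LIMIT_PER_USER"], 500),
  };
}

const WINDOW_MS = 60 * 60 * 1000; // one hour, matching "requests/hour"

class FixedWindowLimiter {
  private readonly limit: number;
  private readonly windows = new Map<string, { start: number; count: number }>();

  constructor(limit: number) {
    this.limit = limit;
  }

  // Returns true if the request identified by `key` is allowed at time `now` (ms).
  allow(key: string, now: number): boolean {
    const current = this.windows.get(key);
    if (current === undefined || now - current.start >= WINDOW_MS) {
      // First request for this key, or the previous window has elapsed.
      this.windows.set(key, { start: now, count: 1 });
      return true;
    }
    if (current.count >= this.limit) return false;
    current.count += 1;
    return true;
  }
}
```

An in-memory map only works for a single process; a multi-instance deployment would need shared storage for the counters.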
Default: Files auto-delete after 24 hours
Extended: Users can opt into 7-day retention
Implementation:
- Soft delete via deleted_at timestamp
- Cleanup runs daily
- Physical deletion happens immediately after soft delete
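The retention rules above could be expressed roughly as follows. This is a sketch under stated assumptions: the `StoredFile` shape, the `extendedRetention` flag, and the function names are hypothetical, and only `deletedAt` corresponds to a field named in this document (`deleted_at`).

```typescript
const HOUR_MS = 60 * 60 * 1000;
const DEFAULT_RETENTION_MS = 24 * HOUR_MS; // default: 24 hours
const EXTENDED_RETENTION_MS = 7 * 24 * HOUR_MS; // opt-in: 7 days

// Hypothetical record shape for illustration.
interface StoredFile {
  id: string;
  createdAt: Date;
  extendedRetention: boolean; // assumed flag for the 7-day opt-in
  deletedAt: Date | null; // soft-delete timestamp
}

function isExpired(file: StoredFile, now: Date): boolean {
  const retention = file.extendedRetention
    ? EXTENDED_RETENTION_MS
    : DEFAULT_RETENTION_MS;
  return now.getTime() - file.createdAt.getTime() >= retention;
}

// Soft-delete pass: stamp deletedAt on expired files and return them,
// so the caller can physically delete them right afterwards.
function softDeleteExpired(files: StoredFile[], now: Date): StoredFile[] {
  const marked: StoredFile[] = [];
  for (const file of files) {
    if (file.deletedAt === null && isExpired(file, now)) {
      file.deletedAt = now;
      marked.push(file);
    }
  }
  return marked;
}
```

Passing `now` in explicitly keeps the logic deterministic and easy to test from a daily cleanup job.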
- Email addresses are redacted in logs
- Uploaded files are never logged
- Audit trails summarize but don't capture raw data
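Email redaction in logs can be illustrated with a small helper. This is only a sketch: the function name `redactEmails`, the placeholder text, and the pattern are assumptions, and the regular expression is a pragmatic approximation rather than a full address grammar.

```typescript
// Approximate email matcher; intentionally simple, not RFC-complete.
const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

// Replace anything that looks like an email address before the message is logged.
function redactEmails(message: string): string {
  return message.replace(EMAIL_PATTERN, "[redacted-email]");
}
```

Calling `redactEmails` at the logging boundary, rather than at each call site, makes it harder to forget.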
- File validation: Size limits, type checks, content scanning
- Rate limiting: Per-IP and per-user throttles
- Input sanitization: All user inputs validated with Zod
- No executable storage: Blocks .exe, .sh, .bat uploads
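A minimal sketch of the extension and size checks might look like the following. It uses plain TypeScript rather than Zod to stay self-contained; `validateUpload`, `UploadMeta`, and the 10 MB default are assumptions for illustration, since the actual size limit is not stated here. Content scanning is out of scope for this sketch.

```typescript
// Extensions named in the list above.
const BLOCKED_EXTENSIONS = new Set([".exe", ".sh", ".bat"]);
const MAX_SIZE_BYTES = 10 * 1024 * 1024; // assumed limit, not from the docs

// Hypothetical metadata shape for an uploaded file.
interface UploadMeta {
  name: string;
  sizeBytes: number;
}

type ValidationResult = { ok: true } | { ok: false; reason: string };

function validateUpload(
  file: UploadMeta,
  maxSizeBytes: number = MAX_SIZE_BYTES,
): ValidationResult {
  // Compare the final extension case-insensitively.
  const dot = file.name.lastIndexOf(".");
  const extension = dot === -1 ? "" : file.name.slice(dot).toLowerCase();
  if (BLOCKED_EXTENSIONS.has(extension)) {
    return { ok: false, reason: `blocked file type: ${extension}` };
  }
  if (file.sizeBytes > maxSizeBytes) {
    return { ok: false, reason: "file too large" };
  }
  return { ok: true };
}
```

An extension check alone is easy to bypass by renaming a file, which is why it is listed alongside type checks and content scanning.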
✅ Process your own data
✅ Automate repetitive tasks
✅ Clean and validate datasets
✅ Generate content from your own inputs
❌ Scraping third-party sites without permission
❌ Processing data you don't own
❌ Generating spam or malicious content
❌ Circumventing rate limits
AI-generated content (YouTube, Blog, Email tools) follows these rules:
- Temperature set to 0.2 for consistency
- System prompts checked into repo (no hidden instructions)
- Outputs are suggestions, not final copy
- User responsible for reviewing all generated content
Users can:
- Download all their job history as JSON
- Request immediate deletion via /api/user/delete
- Opt out of telemetry at any time
All tools expose a common API pattern:
POST /api/tools/{tool-slug}
// Request
FormData {
files: File[]
config: JSON
}
// Response
{
success: boolean
summary: Record<string, any>
auditSteps: AuditStep[]
downloadUrl: string
error?: string
}
See API documentation for details.
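A client for this pattern might look like the sketch below. It assumes Node.js 18+ or a browser, where `fetch` and `FormData` are available globally. The helper names (`toolEndpoint`, `isToolResponse`, `runTool`), the multipart field names `files` and `config`, and the open-ended `AuditStep` shape are assumptions inferred from the request and response outline above, not a confirmed contract.

```typescript
// AuditStep's fields are not specified in this document, so keep it open.
interface AuditStep {
  [key: string]: unknown;
}

interface ToolResponse {
  success: boolean;
  summary: Record<string, unknown>;
  auditSteps: AuditStep[];
  downloadUrl: string;
  error?: string;
}

// Build the endpoint path for a tool slug.
function toolEndpoint(slug: string): string {
  return `/api/tools/${encodeURIComponent(slug)}`;
}

// Runtime check that a parsed JSON body has the documented top-level fields.
function isToolResponse(value: unknown): value is ToolResponse {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v["success"] === "boolean" &&
    typeof v["summary"] === "object" &&
    v["summary"] !== null &&
    Array.isArray(v["auditSteps"]) &&
    typeof v["downloadUrl"] === "string" &&
    (v["error"] === undefined || typeof v["error"] === "string")
  );
}

// POST files plus a JSON config to a tool and validate the response shape.
async function runTool(
  baseUrl: string,
  slug: string,
  files: File[],
  config: Record<string, unknown>,
): Promise<ToolResponse> {
  const form = new FormData();
  for (const file of files) form.append("files", file); // assumed field name
  form.append("config", JSON.stringify(config)); // assumed field name
  const response = await fetch(baseUrl + toolEndpoint(slug), {
    method: "POST",
    body: form,
  });
  const body: unknown = await response.json();
  if (!isToolResponse(body)) throw new Error("Unexpected response shape");
  return body;
}
```

Validating the response at runtime means a caller can rely on `downloadUrl` and `auditSteps` being present before using them.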
vercel deploy --prod
Environment variables required:
- DATABASE_URL
- NEXTAUTH_URL
- NEXTAUTH_SECRET
- NEXT_PUBLIC_SUPABASE_URL
- SUPABASE_SERVICE_ROLE_KEY
On first deploy, run migrations:
pnpm --filter web db:migrate
Set up daily cleanup:
{
"crons": [{
"path": "/api/cron/cleanup",
"schedule": "0 2 * * *"
}]
}
This is an open-source project under MIT license. Contributions welcome!
- Fork the repo
- Create a feature branch
- Make your changes
- Run tests: pnpm test
- Submit a PR
- Issues: https://github.com/yourusername/flowbench/issues
- Docs: https://flowbench.app/docs
- Privacy: https://flowbench.app/privacy
MIT License - see LICENSE file