prompt2midi

Local-first AI co-producer for Ableton Live.

prompt2midi analyzes a reference track, turns the musical evidence into producer-readable structure, generates editable MIDI, creates an original inspired loop package, and prepares a Suno-ready prompt. The core path is designed to run locally: JUCE is the Ableton-facing client, Node owns the job API, and Python owns audio analysis and generation helpers.

The product goal is not to copy songs. It is a pre-production tool for making new material from reference traits: tempo, key, groove, drum feel, bass movement, arrangement energy, and production language.

prompt2midi is open source. Producers, artists, engineers, researchers, and tool builders are welcome to contribute.

There are two main ways to use it:

DAW inspiration starter: generate editable ideas, MIDI, arrangement notes, and prompts that a producer can continue shaping in Ableton, Logic, FL Studio, Bitwig, or any other DAW.
Pre-Suno tool: turn a reference or an inspired idea into a cleaner prompt, structure guide, and optional proxy package that an artist can finish in Suno or continue developing locally.

Current Status

The repo already contains a working local vertical slice:

JUCE plugin UI for WAV/MP3 selection, prompt entry, job polling, result display, and prompt copy.
Local Node backend on 127.0.0.1:47321.
Python WAV analysis with BPM, key, energy, loudness, spectral features, genre/groove hints, chords, structure, stems, transcription, composition, and optional audio generation.
Deterministic inspired-loop generator that writes bass.mid, drums.mid, chords.mid, melody.mid, full_loop.mid, summary.json, and prompt.txt.
Optional model paths for Basic Pitch, Demucs, CLAP genre detection, Gemini Suno prompt generation, ACE-Step, and MusicGen.

This is still an MVP/research codebase. Some outputs are already useful in production, but model transcription, stem splitting, and MIDI mapping are still weak in places and should be treated as editable evidence, not finished arrangements. The next major production-readiness work is better stem separation, source-aware MIDI mapping, and full JUCE AU/VST integration.

Branch Flow

develop is the default integration branch for daily work.
Feature branches start from develop and merge back into develop.
main is release-only. When develop is ready to ship, merge develop into main and create a version tag.
Do not open normal feature PRs directly into main.

Why It Exists

Producers often know what they like about a reference record but cannot quickly turn that into reusable production material. prompt2midi closes that gap:

Drop in a reference track.
Extract musical facts and evidence.
Generate original MIDI parts that match the useful traits, not the exact song.
Generate a clear AI-music prompt for Suno or similar tools.
Optionally render a local reference-inspired sample before uploading anything elsewhere.
Finish the idea either inside a DAW or inside Suno.

When a generation service rejects direct artist/song prompts, the correct workflow is not to bypass the filter. prompt2midi uses the reference to identify neutral production traits, then creates new musical assets and a prompt that avoids artist imitation, copied hooks, lyrics, and vocal likeness.

Example framing:

Avoid: make a Michael Jackson song or copy the bassline from this record.
Prefer: 1980s pop-funk feel, tight dance groove, bright chord stabs, syncopated bass movement, crisp drums, original melody, no copied lyrics, no vocal imitation.

The tool is designed to reduce copying risk by creating original material and by describing musical traits instead of requesting a clone. It does not guarantee legal clearance, does not replace rights review, and should not be used to bypass copyright or platform policies.

Architecture

prompt2midi is a local-first desktop production system. The plugin is only the DAW-facing client; the local backend owns job orchestration; Python owns audio intelligence and generation helpers; optional model services improve output quality without becoming required for the core workflow.

flowchart TD
  Producer["Producer in Ableton Live"] --> Plugin["JUCE plugin UI<br/>file/prompt input, progress, result display"]
  Plugin -->|POST /analyze| Node["Local Node backend<br/>127.0.0.1:47321"]
  Node --> Jobs["Job store + progress events<br/>queued/running/succeeded/failed"]
  Node --> Decode["Input validation + FFmpeg MP3 decode<br/>WAV passed to Python"]
  Decode --> Python["Python analysis package<br/>analysis/analyze.py"]
  Python --> Core["Core facts<br/>BPM, key, loudness, energy, spectral features"]
  Python --> Deep["Optional deeper analysis<br/>chords, drums, structure, stems, transcription"]
  Python --> Compose["Original composition package<br/>bass/drums/chords/melody/full_loop MIDI"]
  Python --> Arrange["Arrangement Lock / full-track proxy<br/>maps, reports, guide MIDI, proxy audio"]
  Compose --> Exports["Local exports<br/>MIDI, summary.json, prompt.txt"]
  Arrange --> Exports
  Python --> Node
  Node --> Prompt["Prompt layer<br/>deterministic local prompt + optional Gemini SUNO prompt"]
  Node --> Result["Aggregated result JSON<br/>analysis, warnings, assets, prompts, paths"]
  Result --> Plugin
  Plugin --> DAW["Producer actions<br/>audition, copy prompt, import MIDI, package for SUNO"]

  ACE["Optional local ACE-Step API<br/>127.0.0.1:8001"] -. audio candidates .-> Arrange
  Models["Optional local engines<br/>Basic Pitch, Demucs, CLAP, MusicGen, All-In-One Docker"] -. evidence .-> Deep
  Gemini["Optional cloud Gemini<br/>GEMINI_API_KEY"] -. SUNO prompt .-> Prompt

Ableton / JUCE plugin
        |
        | POST /analyze
        v
Local Node backend
        |
        | validates input, decodes MP3, creates job, calls Python
        v
Python analysis engine
        |
        | returns structured JSON, MIDI paths, composition package, optional audio sample
        v
Node aggregation
        |
        | deterministic producer prompt, optional Gemini Suno prompt
        v
JUCE result display

Runtime Flow

The producer selects a WAV/MP3 reference and/or enters a direction in the plugin.
The JUCE client posts the request to the localhost Node backend; audio processing on the audio thread stays pass-through.
Node validates local paths, decodes MP3 to WAV when needed, creates a job, and publishes progress events.
Python analyzes the WAV, writes structured JSON, MIDI evidence, composition assets, and optional arrangement/proxy artifacts.
Node aggregates the Python result with producer-facing prompt text and optional Gemini Suno text.
The plugin polls status/result and displays confidence-aware output paths, warnings, and copy/export actions.

Component Responsibilities

| Layer | Files | Responsibility |
| --- | --- | --- |
| JUCE plugin | `Source/PluginEditor.`, `Source/PluginProcessor.`, `Source/LocalApiClient.h` | UI only: choose/drop reference, send local job, poll status, show results. Audio processing stays pass-through. |
| Node backend | `backend/server.js`, `backend/lib/*` | Local API, job state, input validation, MP3 decode, Python invocation, prompt aggregation, error normalization. |
| Python analysis | `analysis/analyze.py`, `analysis/core/`, `analysis/detectors/` | Extract structured musical facts from WAV audio. Optional libraries improve results, but fallback paths keep the core running. |
| MIDI/transcription | `analysis/midi/*` | Write MIDI, run Basic Pitch, run Demucs, expose provenance and limitations for every MIDI asset. |
| Composition | `analysis/composition/*` | Generate a new original loop package from analysis hints. This is the main product output. |
| Prompting | `backend/lib/promptGenerator.js`, `backend/lib/geminiPromptGenerator.js` | Turn structured facts into producer-facing copy and Suno prompts. Gemini is optional. |
| Audio generation | `analysis/generation/*` | Optional local sample generation using ACE-Step first, then AudioCraft/MusicGen fallback paths. |
| Tooling scripts | `scripts/pipelines/`, `scripts/setup/`, `scripts/packaging/`, `scripts/services/`, `scripts/dev/*` | Local CLI runners, setup commands, package builders, service launchers, and developer refresh tools. |
| Requirements/docs | `requirements/`, `docs/pipelines/`, `docs/backend/`, `docs/qa/` | Optional engine dependency pins and topic-grouped operational docs. |

End-to-End Flow

1. User Input

The plugin accepts:

A local .wav / .wave / .mp3 reference file.
A text direction.
Or prompt-only mode when no audio file is supplied.

The plugin posts JSON to the local backend:

{
  "audioPath": "/absolute/path/to/reference.mp3",
  "prompt": "same groove, change the bass notes a little, replace the main stab"
}

2. Node Job Orchestration

backend/server.js exposes:

GET /health
POST /analyze
GET /status?id=<job_id>
GET /result?id=<job_id>

Node creates a job immediately so the plugin remains responsive. It validates absolute audio paths, rejects unsupported formats, enforces a size limit, and decodes MP3 input through ffmpeg into tmp/jobs/<job_id>/decoded-input.wav.
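The validation rules in this step can be sketched as follows. This is an illustration in Python, not the backend's code: the real checks live in Node, the function names are hypothetical, and the size cap is a placeholder because the actual limit is not stated here.

```python
import os

# Extensions listed as accepted inputs in "1. User Input".
SUPPORTED_EXTENSIONS = {".wav", ".wave", ".mp3"}

# Hypothetical cap; the backend's real size limit is not documented here.
MAX_BYTES = 200 * 1024 * 1024


def validate_audio_path(path, max_bytes=MAX_BYTES):
    """Return a list of problems; an empty list means the path looks usable."""
    problems = []
    if not os.path.isabs(path):
        problems.append("audioPath must be absolute")
    ext = os.path.splitext(path)[1].lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or '(none)'}")
    if os.path.isfile(path):
        if os.path.getsize(path) > max_bytes:
            problems.append("file exceeds size limit")
    else:
        problems.append("file not found")
    return problems


def needs_decode(path):
    """MP3 input is decoded to WAV before the Python engine sees it."""
    return os.path.splitext(path)[1].lower() == ".mp3"
```

Returning a list of problems instead of raising on the first one lets a caller report every issue in a single error response.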

Node then starts python -m analysis.analyze as a child process and records pipeline events so the UI can show progress.

3. Python Feature Analysis

The Python engine reads PCM WAV and returns structured JSON. The dependency-free base path extracts:

duration, sample rate, channel count
energy curve
loudness
zero-crossing rate and peak amplitude
approximate BPM
approximate key
warnings when confidence is low
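As a rough illustration of what a dependency-free base path can measure, the sketch below reads a 16-bit PCM WAV with only the standard library and reports duration, peak, RMS level, and zero-crossing rate. It is an assumption-laden example, not the code in `analysis/`: the function name and the returned keys are hypothetical, and BPM/key estimation is left out.

```python
import array
import math
import sys
import wave


def basic_wav_facts(path):
    """Read a 16-bit PCM WAV and return a few dependency-free measurements."""
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        sample_rate = wav.getframerate()
        frames = wav.getnframes()
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        samples = array.array("h", wav.readframes(frames))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian

    mono = samples[::channels]  # first channel only, for simplicity
    peak = max((abs(s) for s in mono), default=0) / 32768.0
    rms = math.sqrt(sum(s * s for s in mono) / len(mono)) / 32768.0 if mono else 0.0
    # Count sign changes between neighbouring samples.
    crossings = sum(1 for a, b in zip(mono, mono[1:]) if (a < 0) != (b < 0))
    return {
        "duration_s": frames / sample_rate,
        "sample_rate": sample_rate,
        "channels": channels,
        "peak": peak,
        "rms_dbfs": 20 * math.log10(rms) if rms > 0 else float("-inf"),
        "zero_crossing_rate": crossings / max(len(mono) - 1, 1),
    }
```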

Optional librosa/scipy paths improve:

BPM estimation
key estimation
chord progression detection
arrangement/section analysis
groove descriptors

Optional CLAP genre detection uses laion/larger_clap_music through transformers when available.
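One way to keep optional libraries from breaking the base path is to import them lazily and turn a missing or disabled engine into a warning. The sketch below shows that pattern under stated assumptions: `load_optional` is a hypothetical helper, and treating a flag value of `1` as "disabled" follows the flag list in Running Locally, not a confirmed implementation detail.

```python
import importlib
import os


def load_optional(module_name, disable_flag, warnings, env=None):
    """Import an optional engine, or record a warning and return None."""
    env = os.environ if env is None else env
    if env.get(disable_flag) == "1":
        warnings.append(f"{module_name} disabled by {disable_flag}")
        return None
    try:
        return importlib.import_module(module_name)
    except ImportError:
        warnings.append(f"{module_name} not installed; using fallback path")
        return None


warnings = []
librosa = load_optional("librosa", "PROMPT2MIDI_DISABLE_LIBROSA", warnings)
# When librosa is None, the caller keeps the stdlib estimate and ships the warning.
```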

4. Stem and MIDI Evidence

Every MIDI file is labeled by source and confidence. This area is intentionally conservative: stem splitting and MIDI mapping exist, but they are not yet production-grade. They are useful for evidence, sketching, and direction, but the next feature work should improve source separation, note assignment, timing cleanup, and DAW-ready mapping.

| Asset | How it is made | Meaning |
| --- | --- | --- |
| `reference-sketch.mid` | Deterministic pattern from estimated BPM/key | Generated sketch, not transcription. |
| `model-transcription.mid` | Basic Pitch on full mix | Model transcription candidate; needs ear correction. |
| `source-bass-transcription.mid` | Demucs bass stem + Basic Pitch | Stem-aware bass candidate; still may contain bleed. |
| `source-drum-groove.mid` | Demucs drums stem + onset detection | Quantized drum groove estimate. |
| `model-bass-transcription.mid` | Pitch-filtered Basic Pitch notes from full mix | Fallback bass candidate, not source-separated. |
| `bass-transcription.mid` | Monophonic low-frequency tracking | Legacy heuristic fallback. |

Recommended evidence exports are copied to:

tmp/jobs/<job_id>/exports/

The code intentionally distinguishes generated MIDI from transcription evidence. This matters because only assets derived from separated stems (the `source-*` files above) should be described as source-aware.

Current limitations:

Demucs-style stem splitting can bleed bass, drums, vocals, and harmonic material into each other.
Full-mix model transcription often produces extra notes and wrong instrument ownership.
Bass, drum, chord, and melody mappings still need stronger source-aware cleanup before they should be considered arrangement-ready.
All extracted MIDI should be auditioned and edited in Ableton before being used as final material.

Next work:

Improve stem-aware bass, drum, chord, and melody extraction.
Improve mapping from analysis evidence into separate DAW tracks.
Tighten quantization, note filtering, register selection, and confidence labels.
Complete JUCE integration for real AU/VST plugin workflows, including more polished import/export behavior.

5. Reference Transformation

analysis/reference/reference_groove.py fingerprints the reference for:

kick accents
bass accents
hat/percussion accents
swing
low-end weight
bass note tendencies
club energy

analysis/reference/reference_transform.py converts the user's direction into controls such as:

preserve groove similarity
keep bass rhythm but vary notes
replace a stab/timbre role
keep kick and hat feel while using new samples

This is the bridge between "I like this song" and "make a new production with similar traits."
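To make the idea of "direction into controls" concrete, here is a deliberately naive keyword sketch. Everything in it is hypothetical: the control names, the matching rules, and the function itself are illustrations, and the real logic in `analysis/reference/reference_transform.py` may work quite differently.

```python
def direction_to_controls(direction):
    """Map a free-text direction to coarse transform controls (illustrative)."""
    text = direction.lower()
    return {
        # "same groove" / "keep the groove" -> preserve groove similarity
        "preserve_groove": "same groove" in text or "keep the groove" in text,
        # bass mentioned together with a change verb -> keep rhythm, vary notes
        "vary_bass_notes": "bass" in text and any(w in text for w in ("change", "vary")),
        # stab mentioned together with "replace" -> swap the stab/timbre role
        "replace_stab": "stab" in text and "replace" in text,
    }


controls = direction_to_controls(
    "same groove, change the bass notes a little, replace the main stab"
)
```

For the example direction from the User Input step, all three controls come back `True`; an empty direction leaves them all `False`.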

6. Original Composition Package

analysis/composition/composition.py generates the main product output:

tmp/jobs/<job_id>/exports/
  midi/
    bass.mid
    drums.mid
    chords.mid
    melody.mid
    full_loop.mid
  summary.json
  prompt.txt

The generator is deterministic in structure but randomized in musical choices. It uses BPM, key, detected chords, drum pattern evidence, genre/style hints, and user direction to choose one of several composition modes:

house
techno
synth wave
hip hop
ambient

full_loop.mid is MIDI format type 1 so a DAW can import separate tracks.
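For readers unfamiliar with the format, a type 1 Standard MIDI File is a header chunk followed by one `MTrk` chunk per track. The sketch below builds such a file with only the standard library. It is a minimal illustration, not the project's writer in `analysis/midi/`: the helper names are hypothetical, the notes are arbitrary, and tempo and track-name meta events are omitted.

```python
import struct


def vlq(value):
    """Encode a non-negative int as a MIDI variable-length quantity."""
    out = [value & 0x7F]
    value >>= 7
    while value:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    return bytes(reversed(out))


def track_chunk(events):
    """events: list of (delta_ticks, raw_event_bytes)."""
    body = b"".join(vlq(delta) + data for delta, data in events)
    body += vlq(0) + b"\xff\x2f\x00"  # end-of-track meta event
    return b"MTrk" + struct.pack(">I", len(body)) + body


def note(pitch, ticks, channel=0, velocity=100):
    """One note-on followed by a note-off `ticks` later."""
    return [
        (0, bytes([0x90 | channel, pitch, velocity])),
        (ticks, bytes([0x80 | channel, pitch, 0])),
    ]


def type1_file(tracks, ticks_per_beat=480):
    """Build a format-1 file: header chunk plus one MTrk chunk per part."""
    header = b"MThd" + struct.pack(">IHHH", 6, 1, len(tracks), ticks_per_beat)
    return header + b"".join(track_chunk(t) for t in tracks)


bass = note(36, 480) + note(43, 480)   # two quarter notes
drums = note(36, 240, channel=9) * 4   # four eighth-note hits on channel 10
data = type1_file([bass, drums])       # bytes ready to write to a .mid file
```

Because each part lives in its own `MTrk` chunk, a DAW can place bass and drums on separate tracks at import time, which a single-track type 0 file would not allow.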

7. Suno Prompt Package

There are two prompt paths:

Python stub prompt from analysis/composition/composition.py, always local.
Optional Gemini prompt from backend/lib/geminiPromptGenerator.js when GEMINI_API_KEY is present.

The Gemini path uses gemini-2.0-flash by default and writes a single Suno paragraph from structured analysis and composition data. If Gemini is disabled, missing, times out, or fails, the job still succeeds with the local stub prompt.

The prompt contract ends with a protective instruction:

Instrumental, no vocals. Inspired by the reference groove and production style, not a cover and not a copy.
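The fallback behaviour described above can be sketched like this. The real logic sits in the Node backend, so treat this Python version as an illustration only: `choose_suno_prompt`, the `cloud_generate` callable, and the `source` labels are hypothetical names.

```python
PROTECTIVE_SUFFIX = (
    "Instrumental, no vocals. Inspired by the reference groove and "
    "production style, not a cover and not a copy."
)


def choose_suno_prompt(stub_prompt, cloud_generate=None, api_key=None):
    """Prefer the optional cloud prompt; fall back to the local stub on any failure."""
    prompt, source = stub_prompt, "local-stub"
    if cloud_generate is not None and api_key:
        try:
            candidate = cloud_generate(api_key)
            if candidate and candidate.strip():
                prompt, source = candidate.strip(), "gemini"
        except Exception:
            pass  # timeouts and API errors must not fail the job
    # Make sure the protective instruction closes the prompt either way.
    if not prompt.rstrip().endswith(PROTECTIVE_SUFFIX):
        prompt = prompt.rstrip() + " " + PROTECTIVE_SUFFIX
    return prompt, source
```

Appending the protective instruction after the provider choice means both paths end with the same contract line, regardless of what the cloud model returns.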

8. Optional Local Audio Sample

analysis/generation/audio_generation.py can prepare a 30-second local sample before the user uploads anything to Suno.

Provider order:

ACE-Step local API, enabled by PROMPT2MIDI_ENABLE_ACE_STEP=1.
AudioCraft MusicGen, enabled by PROMPT2MIDI_ENABLE_AUDIOCRAFT=1.
Transformers MusicGen Melody, enabled by PROMPT2MIDI_ENABLE_MUSICGEN=1.
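The provider order above amounts to "try each enabled engine in turn and stop at the first success". A minimal sketch of that selection, assuming each provider is wrapped in a plain callable (the provider names, the `backends` mapping, and both functions are hypothetical; only the flag names come from this README):

```python
import os

# (provider name, enabling flag), in the documented fallback order.
PROVIDERS = [
    ("ace-step", "PROMPT2MIDI_ENABLE_ACE_STEP"),
    ("audiocraft-musicgen", "PROMPT2MIDI_ENABLE_AUDIOCRAFT"),
    ("transformers-musicgen-melody", "PROMPT2MIDI_ENABLE_MUSICGEN"),
]


def enabled_providers(env=None):
    """Return provider names whose flag is set to 1, keeping the order."""
    env = os.environ if env is None else env
    return [name for name, flag in PROVIDERS if env.get(flag) == "1"]


def generate_sample(prompt, backends, env=None):
    """Try each enabled provider in order; return (name, result) or (None, None)."""
    for name in enabled_providers(env):
        backend = backends.get(name)
        if backend is None:
            continue
        try:
            return name, backend(prompt)
        except Exception:
            continue  # fall through to the next provider
    return None, None
```

Returning `(None, None)` instead of raising keeps sample generation optional: the job can still finish with MIDI and prompt output when no audio engine is available.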

ACE-Step uses a local API at 127.0.0.1:8001 by default, with model settings along these lines:

acestep-v15-turbo
acestep-5Hz-lm-0.6B
MLX backend by default on macOS

The sample is scored for duration, loudness, pulse consistency, clipping, harshness, and reference groove similarity. The score helps pick the best candidate, but listening is still required.

Models and Engines

| Engine | Required? | Purpose | Setup |
| --- | --- | --- | --- |
| Python stdlib WAV analyzer | Yes | Base BPM/key/energy/loudness/spectral analysis | Built in |
| `ffmpeg` | For MP3 | Decode MP3 to WAV and export reference sections | Install separately or set `PROMPT2MIDI_FFMPEG` |
| `librosa` / `scipy` | Optional | Better BPM/key, chords, structure, drums, groove | Python environment |
| CLAP `laion/larger_clap_music` | Optional | Zero-shot genre tags | Python ML deps |
| Basic Pitch | Optional | Model MIDI transcription | `npm run setup:transcription` |
| Demucs `htdemucs` | Optional | Bass/drum/other/vocal stems | `npm run setup:stems` |
| Gemini `gemini-2.0-flash` | Optional cloud | Higher quality Suno prompt | `GEMINI_API_KEY=...` |
| ACE-Step 1.5 | Optional local service | Reference-guided audio samples | `npm run setup:ace-step`, then `npm run ace-step:start` |
| AudioCraft MusicGen | Optional local | Fallback sample generation | `npm run setup:musicgen` |
| Transformers MusicGen Melody | Optional local | Legacy fallback sample generation | Set `PROMPT2MIDI_ENABLE_MUSICGEN=1` with deps installed |

Legal and Platform-Safety Position

prompt2midi is built around reference-inspired transformation, not cloning.

It should:

Analyze traits instead of copying a recording.
Generate new MIDI instead of exporting copyrighted melodies as final output.
Use neutral production language instead of artist-name prompting.
Avoid vocals, lyrics, artist likeness, copied hooks, and exact bass/melody sequences.
Keep every extracted/transcribed artifact labeled as evidence, not guaranteed clearance.

It should not:

Promise that any output is free of legal issues.
Claim to bypass Suno copyright filters.
Recreate a protected song, master recording, vocal likeness, lyric, or signature hook.
Tell users that a generated sample is automatically safe to upload commercially.

Use references you own, created, licensed, or are otherwise allowed to analyze. Treat the generated Suno prompt and local sample as a safer creative starting point, not legal advice.

Contributing

This is an open-source project and contributions are welcome. Useful areas include:

stronger stem separation and source-aware MIDI mapping
better AU/VST/JUCE host integration
Ableton, Logic, and other DAW workflow testing
prompt packaging for Suno and other music tools
audio-analysis fixtures and regression tests
documentation, examples, setup scripts, and UX polish

Branch from develop, keep changes local-first, and label extracted MIDI honestly when confidence is limited.

Running Locally

Install Node dependencies:

npm install

Start the backend:

npm start

Run the developer loop with backend logs:

npm run dev:refresh

Install optional engines:

npm run setup:transcription
npm run setup:stems
npm run setup:ace-step
npm run setup:musicgen

Start ACE-Step when using local sample generation:

npm run ace-step:start

Useful environment flags:

PROMPT2MIDI_DISABLE_MODEL=1
PROMPT2MIDI_DISABLE_STEMS=1
PROMPT2MIDI_DISABLE_SUNO=1
PROMPT2MIDI_DISABLE_LIBROSA=1
PROMPT2MIDI_DISABLE_GENRE=1
PROMPT2MIDI_DISABLE_CHORDS=1
PROMPT2MIDI_DISABLE_STRUCTURE=1
PROMPT2MIDI_DISABLE_DRUMS=1
PROMPT2MIDI_ENABLE_ACE_STEP=1
PROMPT2MIDI_ENABLE_AUDIOCRAFT=1
PROMPT2MIDI_ENABLE_MUSICGEN=1
GEMINI_API_KEY=...

API Example

curl -s -X POST http://127.0.0.1:47321/analyze \
  -H 'Content-Type: application/json' \
  -d '{"audioPath":"/absolute/path/to/reference.wav","prompt":"keep the groove, vary the bass notes, replace the stab sound"}'

Poll status:

curl -s 'http://127.0.0.1:47321/status?id=<job_id>'

Fetch result:

curl -s 'http://127.0.0.1:47321/result?id=<job_id>'
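The same submit, poll, fetch sequence can be scripted. The sketch below uses only the Python standard library; the endpoint paths and the `queued/running/succeeded/failed` states come from this README, while the `status` field name in the status payload and the helper names are assumptions that may not match the backend's actual response shape.

```python
import json
import time
import urllib.parse
import urllib.request

BASE_URL = "http://127.0.0.1:47321"


def build_analyze_request(audio_path, prompt):
    """Build the POST /analyze request without sending it."""
    body = json.dumps({"audioPath": audio_path, "prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_json(request_or_url):
    """Send a request (or GET a URL) and decode the JSON response."""
    with urllib.request.urlopen(request_or_url, timeout=10) as resp:
        return json.load(resp)


def wait_for_result(job_id, fetch=fetch_json, interval_s=1.0, max_polls=600):
    """Poll /status until the job leaves queued/running, then fetch /result."""
    query = urllib.parse.urlencode({"id": job_id})
    for _ in range(max_polls):
        status = fetch(f"{BASE_URL}/status?{query}")
        state = status.get("status")  # assumed field name
        if state == "succeeded":
            return fetch(f"{BASE_URL}/result?{query}")
        if state == "failed":
            raise RuntimeError(f"job {job_id} failed: {status}")
        time.sleep(interval_s)
    raise TimeoutError(f"job {job_id} did not finish")
```

A caller would send `build_analyze_request(...)` through `fetch_json`, read the job id from the response (the exact key is not documented here), and pass it to `wait_for_result`. The injectable `fetch` argument makes the polling loop testable without a running backend.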

Testing

Run Python tests:

python3 -m unittest analysis.tests.test_feature_extraction
python3 -m unittest analysis.tests.test_composition

Run Node tests:

npm test

Compile-check Python:

python3 -m compileall analysis

Important Invariants

Keep the core workflow local-first.
Do not run long jobs, subprocesses, network calls, or file-heavy analysis in JUCE processBlock.
Node owns orchestration and aggregation.
Python returns structured analysis JSON and file paths.
Generated MIDI is product output; extracted MIDI is evidence.
Optional engines must degrade to warnings, not hard job failure.
Producer-facing copy must describe confidence and limitations honestly.
