sinedied/am2mid

am2mid

am2mid = Audio Melody to MIDI. Tiny CLI that turns a YouTube track into a melody MIDI file (and/or an isolated melody stem you can drag into NeuralNote or any other audio→MIDI VST).

Pipeline:

YouTube URL ──► yt-dlp ──► (optional trim) ──► Demucs (htdemucs) ──► melody stem ──► Basic Pitch ──► .mid
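As a rough illustration of the pipeline above, the sketch below assembles the equivalent external commands without running them. It assumes the stock yt-dlp, ffmpeg, demucs and basic-pitch command-line tools; am2mid itself may well drive these libraries through their Python APIs instead. `build_commands` and the file names under `work` are hypothetical.

```python
from pathlib import Path
from typing import List, Optional


def build_commands(url: str, work: Path, start: Optional[str] = None,
                   end: Optional[str] = None, model: str = "htdemucs",
                   stem: str = "other") -> List[List[str]]:
    """Return the external commands for one run, in pipeline order.

    Nothing is executed here; pass each list to subprocess.run() to run it.
    File names under `work` are illustrative, not am2mid's real layout.
    """
    raw = work / "raw.wav"
    clip = work / "clip.wav"
    cmds = [
        # 1. Download the audio track and convert it to WAV.
        ["yt-dlp", "-x", "--audio-format", "wav",
         "-o", str(work / "raw.%(ext)s"), url],
    ]
    source = raw
    if start and end:
        # 2. Optional trim to the melody window (MM:SS timestamps).
        cmds.append(["ffmpeg", "-y", "-i", str(raw),
                     "-ss", start, "-to", end, str(clip)])
        source = clip
    # 3. Source separation. Demucs writes <out>/<model>/<track>/<stem>.wav.
    separated = work / "separated"
    cmds.append(["demucs", "-n", model, "-o", str(separated), str(source)])
    stem_wav = separated / model / source.stem / f"{stem}.wav"
    # 4. Audio-to-MIDI transcription: basic-pitch <output-dir> <input-audio>.
    #    The output directory is expected to exist already.
    cmds.append(["basic-pitch", str(work / "midi"), str(stem_wav)])
    return cmds
```

Calling `build_commands(url, Path(".work/demo"), "1:05", "2:05")` yields four commands; without a range the ffmpeg step is skipped and Demucs reads the raw download directly.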

Requirements

System tools (install once):

brew install ffmpeg

Python deps are managed with uv:

uv sync

That installs yt-dlp, demucs, basic-pitch, etc. into a local .venv.

Usage

Single track (default: the chosen melody stem and the full mix are saved as .wav, and MIDI is generated for the 'other' stem):

uv run am2mid "https://www.youtube.com/watch?v=zGDzdps75ns" \
  --range "1:05-2:05" \
  --out ./output

Skip transcription if you plan to use NeuralNote VST instead:

uv run am2mid "<URL>" --range "1:05-2:05" --no-midi

Transcribe every stem (not just other):

uv run am2mid "<URL>" --range "1:05-2:05" --midi-all

Quantize the transcription to a 1/16th grid (BPM auto-detected from the drums stem):

uv run am2mid "<URL>" --range "1:05-2:05" --quantize 1/16
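To make the grid concrete, here is a minimal sketch of snapping note times to a subdivision. It assumes a quarter note equals one beat, so a 1/16 step lasts a quarter of a beat; the function names are hypothetical and am2mid's own quantizer may treat durations differently.

```python
from typing import Tuple


def parse_subdivision(value: str) -> int:
    """Accept '1/16' or '16' and return the denominator (power of 2, 1..32)."""
    denom = int(value.split("/")[-1])
    if not 1 <= denom <= 32 or denom & (denom - 1):
        raise ValueError(f"subdivision must be a power of 2 in 1..32: {value!r}")
    return denom


def grid_seconds(bpm: float, denom: int) -> float:
    """Length of one grid step in seconds (assumes quarter note = one beat)."""
    return (60.0 / bpm) * (4.0 / denom)


def quantize_note(start_s: float, end_s: float, step: float) -> Tuple[float, float]:
    """Snap both note edges to the nearest grid line, keeping at least one step."""
    q_start = round(start_s / step) * step
    q_end = round(end_s / step) * step
    return q_start, max(q_end, q_start + step)


step = grid_seconds(120, parse_subdivision("1/16"))  # 0.125 s per step at 120 BPM
print(quantize_note(0.30, 0.52, step))               # (0.25, 0.5)
```

Forcing a minimum length of one step is a design choice in this sketch: it stops very short notes from collapsing to zero duration after rounding.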

Force a specific BPM and keep the melody stem .wav (--with-stems is already on by default):

uv run am2mid "<URL>" --range "1:05-2:05" --quantize 1/16 --bpm 138 --with-stems

Use the 6-stem model when the lead is a recognizable guitar/piano:

uv run am2mid "<URL>" --range "1:05-2:05" --model htdemucs_6s --stem piano

Batch mode from a YAML file (see example.yaml):

uv run am2mid --batch example.yaml

Options

| Flag | Default | Description |
| --- | --- | --- |
| url (positional) | | YouTube URL (omit if using --batch). |
| --range, -r | full track | Melody window, format MM:SS-MM:SS. |
| --batch, -b | | YAML file with a list of songs. |
| --out, -o | ./output | Output directory. |
| --name, -n | from video title | Folder name (single-URL mode). |
| --stem | other | Which stem gets transcribed to MIDI. By default only this stem is saved as .wav (see --with-stems and --all-stems). |
| --model | htdemucs | Demucs model (use htdemucs_6s for guitar/piano stems). |
| --midi-all | off | Transcribe every stem, not just --stem. |
| --no-midi | off | Skip basic-pitch (use a VST instead). |
| --quantize, -q | off | Snap MIDI notes to a grid. Subdivision as 1/16, 1/8, 16… (powers of 2, 1–32). Writes a sidecar *.q<N>.mid. |
| --bpm | auto (when --quantize) | Tempo. Either a number (138) or auto to detect from the drums stem. |
| --with-stems, -s / --no-stems | on | Save the chosen melody stem .wav (only --stem). |
| --all-stems, -a | off | Save ALL stems as .wav (for A/B comparison). |
| --with-full / --no-full | on | Save the un-separated full audio as midi/stems/<name>_full.wav. |
| --bpm-range | 120-160 | Window used to fold auto-detected BPM (e.g. 70 → 140). Format MIN-MAX. Ignored when --bpm is explicit. |
| --force, -f | off | Overwrite existing midi/stem files without prompting. |
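For reference, the MM:SS-MM:SS format accepted by --range can be converted to seconds along these lines. This is a minimal sketch with a hypothetical `parse_range` helper; the validation am2mid actually performs is not documented here.

```python
from typing import Tuple


def parse_range(spec: str) -> Tuple[int, int]:
    """Convert 'MM:SS-MM:SS' into (start_seconds, end_seconds)."""

    def to_seconds(stamp: str) -> int:
        minutes, seconds = stamp.strip().split(":")
        return int(minutes) * 60 + int(seconds)

    start_s, end_s = (to_seconds(part) for part in spec.split("-"))
    if end_s <= start_s:
        raise ValueError(f"range end must be after start: {spec!r}")
    return start_s, end_s


print(parse_range("1:05-2:05"))  # (65, 125)
```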

Output layout

output/
  midi/
    <name>.mid                    # primary MIDI (the --stem one, default 'other')
    <name>.q16.mid                # quantized sidecar (if --quantize)
    <name>_drums.mid              # extra MIDIs when --midi-all
    stems/                        # created when --with-stems / --all-stems / --with-full
      <name>_full.wav             # un-separated audio (default on)
      <name>_other.wav            # chosen melody stem (default on)
      <name>_drums.wav            # other stems only with --all-stems
      ...
  .work/<name>/                   # raw download + demucs intermediates

<name> is your --name, or the YouTube title sanitized into a readable, filesystem-safe form.
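The exact sanitization rules are not spelled out here. One plausible approach, shown purely as an assumption, is to strip accents, replace characters that are unsafe in file names, and collapse whitespace. `sanitize_name` and its `max_len` parameter are hypothetical.

```python
import re
import unicodedata


def sanitize_name(title: str, max_len: int = 80) -> str:
    """Turn a video title into a readable, filesystem-safe name (illustrative rules)."""
    # Decompose accented characters, then drop anything outside ASCII.
    text = unicodedata.normalize("NFKD", title)
    text = text.encode("ascii", "ignore").decode("ascii")
    # Replace characters that are unsafe or awkward in file names with spaces.
    text = re.sub(r"[^A-Za-z0-9._ -]+", " ", text)
    # Collapse runs of whitespace and trim leading/trailing spaces and dots.
    text = re.sub(r"\s+", " ", text).strip(" .")
    return text[:max_len].rstrip(" .") or "untitled"


print(sanitize_name("Artist - Track (Original Mix) / 2020?"))
# Artist - Track Original Mix 2020
```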

YAML batch + global defaults + per-song overrides

example.yaml:

defaults:                     # any CLI flag, applied to every song; CLI overrides win
  quantize: 1/16
  bpm: auto                   # auto-detected, folded into bpm_range
  bpm_range: 120-160          # trance-friendly window
  with_full: true

songs:
  - url: https://www.youtube.com/watch?v=...
    name: my-track
    range: "1:05-2:05"
  - url: https://www.youtube.com/watch?v=...
    name: tricky-tempo
    bpm: 140                  # per-song override beats defaults + CLI
    all_stems: true           # also dump every stem just for this track

Run with:

uv run am2mid --batch example.yaml
# Override a default at runtime (per-song overrides still win):
uv run am2mid --batch example.yaml --no-full
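The precedence described above (YAML defaults, then CLI flags, then per-song keys) amounts to a layered dictionary merge. The sketch below illustrates that ordering only; `resolve_options` is a hypothetical name, and it assumes CLI flags that were not passed arrive as None.

```python
def resolve_options(defaults: dict, cli: dict, song: dict) -> dict:
    """Merge option layers; later layers win: defaults < CLI < per-song."""
    merged = dict(defaults)
    # Only CLI flags the user actually passed (non-None) override the defaults.
    merged.update({key: value for key, value in cli.items() if value is not None})
    # Per-song keys from the YAML file have the final say.
    merged.update(song)
    return merged


options = resolve_options(
    defaults={"quantize": "1/16", "bpm": "auto", "with_full": True},
    cli={"with_full": False, "bpm": None},  # e.g. --no-full, no --bpm given
    song={"name": "tricky-tempo", "bpm": 140},
)
print(options["bpm"], options["with_full"])  # 140 False
```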

Auto BPM folding

--bpm auto (the default when --quantize is set) runs librosa's beat tracker on the drums stem, which is cleaner than the full mix. The raw estimate is then folded into --bpm-range (default 120-160) by repeated doubling or halving, so half-time detections like 70 BPM become 140 BPM. An explicit value such as --bpm 138 is never altered.
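The folding step can be sketched as follows. This is an illustration of the idea rather than am2mid's actual code: `fold_bpm` is a hypothetical helper, and the fallback for estimates that no doubling or halving can bring into the window is an assumption.

```python
def fold_bpm(bpm: float, low: float = 120.0, high: float = 160.0) -> float:
    """Fold a tempo estimate into [low, high] by doubling or halving it."""
    if bpm <= 0 or low <= 0 or high < low:
        raise ValueError("bpm and range bounds must be positive, with low <= high")
    # Candidate tempos: the estimate scaled by 1/16x .. 16x in powers of two.
    candidates = [bpm * 2.0 ** k for k in range(-4, 5)]
    inside = [c for c in candidates if low <= c <= high]
    if inside:
        # If the window spans more than an octave, the lowest match is returned.
        return inside[0]
    # Assumed fallback: no multiple fits, so pick the one nearest the window centre.
    centre = (low + high) / 2
    return min(candidates, key=lambda c: abs(c - centre))


print(fold_bpm(70))   # 140.0 (half-time detection doubled)
print(fold_bpm(280))  # 140.0 (double-time detection halved)
print(fold_bpm(138))  # 138.0 (already inside the window)
```

Because 120-160 spans less than an octave, some estimates (85 BPM, for example) have no power-of-two multiple inside the window; the sketch then returns the closest candidate, 170.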

Tips for trance leads

  • The synth lead almost always lands in the other stem of htdemucs (the default 4-stem model). If the track has a clear guitar or piano lead, switch to the 6-stem model with --model htdemucs_6s --stem guitar (or piano).
  • Pick a clean melodic section (no vocal chops, no big build-up FX) with --range — Basic Pitch is much happier with 30–60s of clear melody than the whole 7-minute track.
  • For best quality, keep the stem (--with-stems, on by default) and run it through NeuralNote in your DAW, then nudge octaves and quantize manually.

About

YouTube → isolated melody stem → MIDI pipeline for trance & co. (yt-dlp + Demucs + Basic Pitch)
