Rebuild Dembow as an event-token Transformer (v2.0.0) by baezor · Pull Request #1 · baezor/dembow

baezor · 2026-06-17T04:44:16Z

Dembow, rebuilt 🔥

The 2016 "first A.I. that generates reggaeton hits" was unrunnable (Python 2, TensorFlow 1.x, the dead python-midi) and, once revived, sounded like noise. This PR rebuilds the generator from the ground up around the modern recipe for symbolic music: a decoder-only Transformer over a REMI-style event language.

How it works now

Music is treated like language. Every song is tokenized into a stream of musical events:

BOS  BAR  POS_0  INST_drums DRUM_kick  DUR_1 VEL_5
              POS_0  INST_bass  PITCH_36   DUR_4 VEL_6
              POS_4  INST_drums DRUM_snare DUR_1 VEL_5  ...
     BAR  ...  EOS
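
As a rough illustration of the stream above, here is a minimal sketch of how notes might be flattened into event tokens. It is not the code in `tokenizer.py`: the `Note` type, the 16-step bar grid, the 8 velocity bins, the `GROUP_ORDER` sort order, and the small drum map are all assumptions made for the example.

```python
from dataclasses import dataclass

STEPS_PER_BAR = 16     # assumed grid: 16th notes in 4/4
VELOCITY_BINS = 8      # assumed: MIDI velocity 0-127 squeezed into 8 bins
GROUP_ORDER = {"drums": 0, "bass": 1, "mid": 2, "high": 3}   # assumed ordering
DRUM_NAMES = {36: "kick", 38: "snare", 42: "hat"}            # General MIDI drum keys


@dataclass
class Note:
    start: int      # onset in grid steps from the start of the song
    duration: int   # length in grid steps
    pitch: int      # MIDI note number
    velocity: int   # MIDI velocity, 0-127
    group: str      # "drums", "bass", "mid" or "high"


def encode(notes):
    """Turn a list of notes into a flat list of event tokens."""
    tokens = ["BOS"]
    current_bar = -1
    ordered = sorted(notes, key=lambda n: (n.start, GROUP_ORDER[n.group], n.pitch))
    for note in ordered:
        bar, pos = divmod(note.start, STEPS_PER_BAR)
        while current_bar < bar:          # one BAR token per bar, empty bars included
            tokens.append("BAR")
            current_bar += 1
        tokens.append(f"POS_{pos}")
        tokens.append(f"INST_{note.group}")
        if note.group == "drums":
            tokens.append(f"DRUM_{DRUM_NAMES.get(note.pitch, 'other')}")
        else:
            tokens.append(f"PITCH_{note.pitch}")
        tokens.append(f"DUR_{note.duration}")
        tokens.append(f"VEL_{note.velocity * VELOCITY_BINS // 128}")
    tokens.append("EOS")
    return tokens
```

Under these assumptions, a kick and a bass note on the downbeat followed by a snare on step 4 encode to the same kind of stream as the example above.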

Each note carries its instrument group (drums / bass / mid / high), pitch, duration, and velocity, so the model writes expressive, multi-instrument arrangements instead of a flat on/off grid. A small Transformer learns to predict the next event with masked self-attention, and generates autoregressively with temperature + nucleus (top-p) sampling.
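
To make the sampling step concrete, here is a minimal sketch of temperature plus nucleus (top-p) sampling over one vector of next-token logits. It uses only the standard library so it can run anywhere; the real `model.py` works on PyTorch tensors, and the function name `sample_top_p` and its defaults are assumptions for illustration.

```python
import math
import random


def sample_top_p(logits, temperature=1.0, top_p=0.9, rng=random):
    """Sample one token index from raw logits with temperature + nucleus filtering."""
    # Temperature: values below 1 sharpen the distribution, above 1 flatten it.
    scaled = [x / temperature for x in logits]

    # Numerically stable softmax.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Nucleus: keep the most likely tokens until their mass reaches top_p.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in ranked:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break

    # Sample among the kept tokens, weighted by their probabilities.
    return rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]
```

A low `top_p` collapses to greedy decoding (only the single most likely event survives), while a higher value lets less likely events through; generation would call something like this once per token and append the result to the context.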

What changed from the original

Then (2016)	Now
Python 2, TensorFlow 1.x	Python 3, PyTorch
Restricted Boltzmann Machine	Decoder-only Transformer
Binary piano roll (on/off only)	Event tokens: pitch + duration + velocity
All tracks flattened into one roll	Multi-instrument (drums / bass / mid / high)
No sense of time	Self-attention over the whole sequence
Trained on ~76 raw files	Pitch-augmented corpus (~7×)
`python-midi` (Py2, dead)	`mido`
Threw the weights away	Saves & loads checkpoints
One-shot script	A real CLI + installable package + CI

New layout

dembow/
  tokenizer.py   MIDI <-> event tokens (the REMI-style music language)
  model.py       the decoder-only Transformer (+ temperature / top-p sampling)
  data.py        corpus loading, pitch augmentation, windowing
  train.py       training loop + checkpointing
  generate.py    sample new songs and write MIDI
  cli.py         the `dembow` command
fire.py          one-shot entry point
tests/           a fast end-to-end smoke test
.github/workflows/ci.yml   runs the smoke tests on every PR

The legacy RBM / LSTM / groove / piano-roll modules are removed (preserved in git history).

Try it

pip install -r requirements.txt
dembow train       # -> dembow.pt
dembow generate    # -> generated/dembow_*.mid

Verified

✅ Smoke tests pass (tokenizer round-trip, pitch augmentation, windowing, and a tiny train→generate→decode run in which the loss decreases); they run in CI on every PR
✅ Tokenizer round-trips MIDI to a 5-track arrangement and back
✅ Generated token streams decode into valid multi-instrument MIDI

Honest note

The corpus is only ~76 short MIDI files, so even a Transformer is data-limited — it captures the feel (groove, instrumentation, key) more than polished songwriting. The biggest lever from here is more clean reggaeton MIDI in reggaeton_samples/.

🤖 Generated with Claude Code

https://claude.ai/code/session_01Ho9V6TBXozB23VjHqvXuT8

The original RBM reggaeton generator could no longer run: it used Python 2 syntax, the removed TensorFlow 1.x graph API, and the unmaintained Python-2-only python-midi library, and it never saved the weights it trained. This brings it back to life while keeping its essence -- a Restricted Boltzmann Machine that learns the dembow groove and Gibbs-samples new patterns:

- Reimplement the RBM in PyTorch (CD-k + Gibbs sampling), CPU/GPU capable, reproducible, with checkpoint save/load.
- Replace python-midi with mido for MIDI <-> piano-roll conversion; fix the glob that silently skipped uppercase .MID files in the corpus.
- Package it (dembow/) with a real CLI: `dembow train` / `dembow generate`, plus a nostalgic one-shot fire.py.
- Seed generation from real reggaeton grooves so output stays in the pocket.
- Add requirements.txt, pyproject.toml, a fast end-to-end smoke test, and a rewritten README documenting the revival.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Ho9V6TBXozB23VjHqvXuT8

The revived RBM ran, but its output sounded like noise rather than reggaeton. Two root causes: (1) all ~6-7 tracks of each song were flattened into one piano roll, scrambling the dembow drums (channel 9, ~45% of corpus notes) in with bass and melody; (2) the RBM models an unordered "bag of notes" with no sense of time and sampled every pitch independently, yielding 300+ simultaneous notes.

Fix the representation for both engines and add a sequence model as the default:

- dembow/representation.py: separate drums into musical classes (kick, snare, hats, ...), transpose pitched content to a common key, and reconstruct a 2-track (drums + pitched) MIDI on the way out.
- dembow/lstm.py: an LSTM that reads the song one 16th-note step at a time and predicts the next, so it learns the groove over time. Generation primes from a real song, keeps output sparse (top-k notes/step), and re-rolls if the beat drifts into silence.
- Keep the RBM as a "classic mode" (`--model rbm`); generation auto-detects the model type from the checkpoint.
- CLI: `dembow train --model lstm|rbm` plus generation knobs (num-steps, max-pitched, temperature). Default `dembow train` now trains the LSTM.
- Extend the smoke test to cover the representation round-trip and a tiny LSTM train+generate; rewrite the README to explain both engines, why the early output was noise, and how to push quality further.

Verified: LSTM training converges (loss 0.43 -> 0.18) and generation produces musically dense output (~1-2.5 drums/step, ~3-5 pitched/step, matching real songs) with consistent kick/snare/hat dembow grooves.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Ho9V6TBXozB23VjHqvXuT8

The LSTM grooved most of the time but could drop the snare or drift off the beat, and the dembow drum pattern is the signature of the genre. So pin it down:

- dembow/groove.py: derive the canonical one-bar drum pattern straight from the corpus (average drum onsets across every bar, keep the positions that fire often). The textbook dembow emerges -- kick on the downbeats, snare at steps 3/6/11/14 ("boom-ch-boom-chick"), steady hats -- with a hardcoded fallback.
- lstm.generate: accept a drum_track and lock the drums to it, so the model only improvises bass/melody, conditioned on a rock-solid beat.
- generate: new --groove auto|dembow|none (default auto, from the corpus). Refactor the per-sample roll-out with a guard that re-rolls if either the beat (when not locked) or the melody drifts into silence, so every track has both.
- Add a GitHub Actions CI workflow running the smoke tests on PRs, and extend the suite to cover groove extraction and drum-locking. Update the README.

Verified: with the groove on, all generated samples carry the identical steady dembow beat (kick/snare/hat every bar) plus a melody; 7/7 smoke tests pass under pytest.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Ho9V6TBXozB23VjHqvXuT8
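
The groove extraction described in this commit (count drum onsets across every bar, keep the positions that fire often) can be sketched roughly as follows. This is an illustration of the idea only, not the removed `dembow/groove.py`; the function name `canonical_pattern`, the `(drum_name, step)` onset format, and the 0.5 threshold are assumptions.

```python
from collections import Counter


def canonical_pattern(bars, steps_per_bar=16, threshold=0.5):
    """Return {drum: sorted steps} for onsets that fire in at least `threshold` of bars.

    `bars` is a list of bars; each bar is a set of (drum_name, step) onsets.
    """
    if not bars:
        return {}
    # Count in how many bars each (drum, step) onset appears.
    counts = Counter(onset for bar in bars for onset in set(bar))
    pattern = {}
    for (drum, step), n in counts.items():
        if 0 <= step < steps_per_bar and n / len(bars) >= threshold:
            pattern.setdefault(drum, []).append(step)
    return {drum: sorted(steps) for drum, steps in pattern.items()}
```

With a threshold like this, a stray fill that appears in one bar out of four is dropped, while a snare hit that is missing from a single bar still makes it into the pattern.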

The smoke job failed with `ModuleNotFoundError: No module named 'dembow'` because `pytest tests/` ran without the package installed or the repo root on the path. Add pytest's pythonpath config so the local package is importable in CI (and locally) without a separate install step.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Ho9V6TBXozB23VjHqvXuT8

Per the new direction -- improve the engine rather than preserve the original RBM soul -- replace the entire generation stack with the modern recipe for symbolic music: a decoder-only Transformer over a REMI-style event language.

- tokenizer.py: encode MIDI into event tokens (bar, position, instrument group, pitch, duration, velocity) and decode back to a multi-track MIDI. Captures per-note duration/velocity and a drums/bass/mid/high arrangement -- far richer than the old binary piano roll.
- model.py: a decoder-only Transformer (causal self-attention, weight-tied embeddings) with temperature + nucleus (top-p) / top-k sampling.
- data.py: corpus loader with pitch-shift augmentation (~7x the ~76-song corpus) and windowing for next-token training.
- train.py / generate.py / cli.py: rewritten around the Transformer; generation primes from a couple of real bars and samples a continuation.
- Remove the legacy RBM / LSTM / groove / piano-roll modules (kept in history).
- Rewrite the smoke tests (tokenizer round-trip, augmentation, windowing, tiny train+generate+decode); update README and pyproject to v2.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Ho9V6TBXozB23VjHqvXuT8
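
The pitch augmentation and windowing mentioned for `data.py` can be sketched as below. This is a hedged illustration, not the actual loader: the helper names `augment` and `windows` are hypothetical, and the choice of shifts from -3 to +3 semitones is only a guess at how a roughly 7x corpus could be produced.

```python
def augment(song, shifts=range(-3, 4)):
    """Return one transposed copy of `song` per semitone shift (7 copies by default).

    `song` is a list of token strings; only PITCH_<n> tokens are shifted, so
    drum hits, durations and velocities stay as they are.
    """
    copies = []
    for shift in shifts:
        shifted = []
        for tok in song:
            if tok.startswith("PITCH_"):
                pitch = int(tok[len("PITCH_"):]) + shift
                pitch = min(127, max(0, pitch))   # clamp to the MIDI range
                tok = f"PITCH_{pitch}"
            shifted.append(tok)
        copies.append(shifted)
    return copies


def windows(ids, block_size, stride):
    """Yield (inputs, targets) pairs where targets are inputs shifted by one token."""
    for start in range(0, len(ids) - block_size, stride):
        chunk = ids[start:start + block_size + 1]
        yield chunk[:-1], chunk[1:]
```

Leaving `DRUM_*` tokens untouched matters here: transposing drum hits would turn a kick into a different percussion sound rather than change the key.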

…data guide

Address several improvements for training quality and usability (v2.1.0):

- Validation + early stopping (train.py, data.py): hold out songs at the *song* level (so pitch-augmented copies don't leak), report val loss each epoch, save the checkpoint with the best val loss, and stop early when it plateaus. On a ~76-song corpus this is what separates generalizing from memorizing.
- Hardware presets (cli.py): `--preset cpu|gpu|auto` picks model size / epochs / augmentation sensibly (CPU is small + early-stops; GPU is bigger). Explicit flags still override. Fixes the default config timing out on CPU.
- Repetition control (model.py): `--repetition-penalty` gently down-weights recently used tokens and optional `--no-repeat-ngram` hard-bans exact repeats, so generation doesn't collapse into a degenerate loop -- while still allowing the musical repetition that makes a groove.
- reggaeton_samples/SOURCES.md: where to find more training MIDI (free libraries, open datasets, audio-to-MIDI), cleaning tips, and licensing notes.
- Extend smoke tests (split disjointness, repetition controls); update README.

Example outputs from a demo model are added in a follow-up commit.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Ho9V6TBXozB23VjHqvXuT8
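
The song-level hold-out and early stopping from the first bullet can be sketched like this. It is a minimal illustration under assumptions, not the code in `train.py` or `data.py`: `split_by_song`, `EarlyStopper`, the 10% validation fraction, and the patience value are all hypothetical.

```python
import random


def split_by_song(song_ids, val_fraction=0.1, seed=0):
    """Split *original* song ids into train/val sets before any augmentation."""
    ids = sorted(set(song_ids))
    random.Random(seed).shuffle(ids)
    n_val = max(1, int(len(ids) * val_fraction))
    return set(ids[n_val:]), set(ids[:n_val])


class EarlyStopper:
    """Track the best validation loss and signal when it has stopped improving."""

    def __init__(self, patience=3):
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def update(self, val_loss):
        """Return (is_best, should_stop) for this epoch's validation loss."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
            return True, False     # caller would save the checkpoint here
        self.bad_epochs += 1
        return False, self.bad_epochs >= self.patience
```

Splitting on song ids first and augmenting each side afterwards is what keeps a transposed copy of a validation song out of the training set.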

Three multi-track MIDI files (drums + bass + mid + high) so listeners can hear Dembow without training first. Generated with repetition-penalty 1.2 from a small demo model (val loss ~1.44) trained on the bundled corpus. See examples/README.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Ho9V6TBXozB23VjHqvXuT8

Make Dembow enjoyable without any setup:

- Bundle a small pretrained checkpoint (dembow/assets/dembow-pretrained.pt) and fall back to it when no local checkpoint exists, so `dembow generate` works out of the box -- no training step. Included as package data; gitignore exception keeps it tracked despite the global *.pt ignore.
- Add render.py + `--render`: turn generated MIDI into .wav so you can actually hear it. Uses FluidSynth + a SoundFont when available, otherwise a tiny dependency-free NumPy synth (oscillators for pitched parts, shaped noise for drums) so rendering always works.
- Wire `--render` / `--soundfont` into the CLI; extend smoke tests (bundled model loads, builtin render produces audio); update README.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Ho9V6TBXozB23VjHqvXuT8

claude added 5 commits June 17, 2026 04:43

baezor changed the title ~~Revive Dembow on a modern stack (Python 3 + PyTorch + mido)~~ Rebuild Dembow as an event-token Transformer (v2.0.0) Jun 17, 2026

claude added 3 commits June 17, 2026 05:55

baezor merged commit 862bd97 into master Jun 17, 2026
1 check passed

baezor merged 8 commits into master from claude/music-generation-ml-revival-iw1ll3

2 participants