EveryOneIsGross/CHANT


IMAGE

CHANTER

a python speech synthesizer inspired by IRCAM's CHANT system: FOF (fonction d'onde formantique, i.e. formant wave function) grains shaped by coarticulation and phrase dynamics.

what it does

transforms text into wavering, vowel-rich audio via:

  • FOF grain synthesis — each glottal pulse spawns decaying sinusoids at formant frequencies
  • coarticulation engine — phonemes influence neighbors, smoothing transitions like actual vocal tracts
  • FORMES-style scheduling — phrase contours, stress patterns, final lengthening emerge from rule systems
  • register modeling — formant tuning shifts with pitch, simulating soprano/alto/tenor/bass resonance
  • jitter & shimmer — multi-rate random wandering keeps it organic
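
a minimal sketch of the FOF idea, not the code in CHANTER.py: each grain is a sinusoid at a formant frequency with a raised-cosine attack and an exponential decay set by the bandwidth, and one grain per formant is fired at every glottal pulse. the function names, the 3 ms attack, the 20 ms grain length and the single-rate uniform jitter/shimmer are all assumptions made for illustration (the script describes its own wandering as multi-rate).

```python
import numpy as np

SR = 44100        # sample rate in Hz (assumed)
GRAIN_S = 0.02    # grain length in seconds (assumed)

def fof_grain(freq_hz, bandwidth_hz, amp_db, attack_s=0.003, sr=SR):
    """One FOF grain: decaying sinusoid with a smooth raised-cosine attack."""
    t = np.arange(int(GRAIN_S * sr)) / sr
    alpha = np.pi * bandwidth_hz           # decay rate derived from bandwidth
    beta = np.pi / attack_s                # attack lasts pi / beta seconds
    env = np.exp(-alpha * t)
    rising = t < attack_s
    env[rising] *= 0.5 * (1.0 - np.cos(beta * t[rising]))
    amp = 10.0 ** (amp_db / 20.0)          # dB to linear amplitude
    return amp * env * np.sin(2.0 * np.pi * freq_hz * t)

def render_vowel(formants, f0_hz=110.0, dur_s=0.5,
                 jitter=0.01, shimmer=0.05, sr=SR, seed=0):
    """Overlap-add one grain per formant at each (jittered) glottal pulse."""
    rng = np.random.default_rng(seed)
    n = int(dur_s * sr)
    out = np.zeros(n + int(GRAIN_S * sr))  # extra tail so the last grain fits
    t = 0.0
    while t < dur_s:
        start = int(t * sr)
        gain = 1.0 + shimmer * rng.uniform(-1.0, 1.0)   # per-pulse loudness wobble
        for freq, bw, db in formants:
            grain = fof_grain(freq, bw, db, sr=sr)
            out[start:start + len(grain)] += gain * grain
        # per-pulse pitch wobble: perturb f0, then advance by one period
        t += 1.0 / (f0_hz * (1.0 + jitter * rng.uniform(-1.0, 1.0)))
    return out[:n]

# formant values taken from the AA entry in the phoneme customization section
AA = [[700, 130, 0], [1220, 70, -6], [2600, 160, -24]]
signal = render_vowel(AA)
```

the result is an unnormalized float array; scale it before writing a wav.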

architecture


Text
↓
NLPParser
↓
FORMES (phrase + phoneme rules)
↓
Sparsegram (time-indexed control frames)
↓
CHANTEngine
↓
.wav
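
one way to picture "time-indexed control frames", as a sketch only: sparse keyframes sorted by time, with values interpolated linearly when the renderer asks for a time between two frames. `ControlFrame`, `SparseTrack` and their fields are hypothetical names, not the classes in CHANTER.py.

```python
from bisect import bisect_right
from dataclasses import dataclass, field

@dataclass
class ControlFrame:
    time_s: float   # when this frame applies
    f0_hz: float    # pitch target
    amp: float      # loudness target, 0-1

@dataclass
class SparseTrack:
    frames: list = field(default_factory=list)

    def add(self, frame):
        """Insert a frame and keep the track sorted by time."""
        self.frames.append(frame)
        self.frames.sort(key=lambda f: f.time_s)

    def at(self, time_s):
        """Return an interpolated frame; clamp outside the covered range.
        Assumes the track holds at least one frame."""
        times = [f.time_s for f in self.frames]
        i = bisect_right(times, time_s)
        if i == 0:
            return self.frames[0]
        if i == len(self.frames):
            return self.frames[-1]
        a, b = self.frames[i - 1], self.frames[i]
        w = (time_s - a.time_s) / (b.time_s - a.time_s)
        return ControlFrame(time_s,
                            a.f0_hz + w * (b.f0_hz - a.f0_hz),
                            a.amp + w * (b.amp - a.amp))

track = SparseTrack()
track.add(ControlFrame(1.0, 200.0, 1.0))
track.add(ControlFrame(0.0, 100.0, 0.0))
midpoint = track.at(0.5)
```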

quick start

python CHANTER.py "hello world" -v tenor -o hello.wav
python CHANTER.py "what is the sound of one hand clapping?" -v alto --speed 0.5

arguments

flag            default           meaning
-o/--out        auto              output wav path
-f/--pitch      voice-dependent   base f0 in Hz
-s/--speed      0.38              phoneme duration scale
-v/--voice      tenor             soprano/alto/tenor/bass
-t/--tempo      1.0               global tempo multiplier
-d/--dynamics   0.7               effort/loudness, 0-1
--lookahead     8                 coarticulation window
--phonemes      built-in          custom yaml/json phoneme table
--text-file                       read input from file
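
for reference, the flags above could be declared with `argparse` roughly like this. it is a sketch that mirrors the table, not the parser in CHANTER.py; the types, the `None` placeholders for "auto", "voice-dependent" and "built-in", and the help strings are assumptions.

```python
import argparse

def build_parser():
    """Argument parser mirroring the documented flags and defaults."""
    p = argparse.ArgumentParser(prog="CHANTER.py")
    p.add_argument("text", nargs="?", help="text to synthesize")
    p.add_argument("-o", "--out", default=None,
                   help="output wav path (None here stands for 'auto')")
    p.add_argument("-f", "--pitch", type=float, default=None,
                   help="base f0 in Hz (None here stands for 'voice-dependent')")
    p.add_argument("-s", "--speed", type=float, default=0.38,
                   help="phoneme duration scale")
    p.add_argument("-v", "--voice", default="tenor",
                   choices=["soprano", "alto", "tenor", "bass"])
    p.add_argument("-t", "--tempo", type=float, default=1.0,
                   help="global tempo multiplier")
    p.add_argument("-d", "--dynamics", type=float, default=0.7,
                   help="effort/loudness, 0-1")
    p.add_argument("--lookahead", type=int, default=8,
                   help="coarticulation window")
    p.add_argument("--phonemes", default=None,
                   help="custom yaml/json phoneme table")
    p.add_argument("--text-file", default=None,
                   help="read input from file")
    return p

# same arguments as the second quick start command
args = build_parser().parse_args(
    ["what is the sound of one hand clapping?", "-v", "alto", "--speed", "0.5"]
)
```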

dependencies

numpy, scipy, pyyaml (optional)

phoneme customization

drop a phonemes.yaml alongside the script or pass --phonemes path/to/custom.yaml:

AA:
  phoneme_type: vowel
  formants: [[700, 130, 0], [1220, 70, -6], [2600, 160, -24]]
  voicing: 1.0
  glottal_tension: 0.5

each formant tuple: [center_freq_hz, bandwidth_hz, amplitude_db]
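
a small sketch of how a custom table could be sanity-checked before use. it loads the same AA entry from json (which the --phonemes flag also accepts) so it runs without pyyaml, and converts each amplitude from dB to a linear gain. `check_phoneme` and `db_to_linear` are hypothetical helpers, and reading amplitude_db as 20*log10(amplitude) is an assumption about the script's convention.

```python
import json

ENTRY_JSON = """
{"AA": {"phoneme_type": "vowel",
        "formants": [[700, 130, 0], [1220, 70, -6], [2600, 160, -24]],
        "voicing": 1.0,
        "glottal_tension": 0.5}}
"""

def db_to_linear(db):
    """Convert an amplitude in dB to a linear gain (0 dB -> 1.0)."""
    return 10.0 ** (db / 20.0)

def check_phoneme(name, entry):
    """Validate the formant list and return (freq_hz, bw_hz, linear_gain) tuples."""
    formants = entry.get("formants", [])
    if not formants:
        raise ValueError(f"{name}: no formants given")
    checked = []
    for triple in formants:
        if len(triple) != 3:
            raise ValueError(f"{name}: expected [freq, bandwidth, db], got {triple}")
        freq, bw, db = triple
        if freq <= 0 or bw <= 0:
            raise ValueError(f"{name}: frequency and bandwidth must be positive")
        checked.append((float(freq), float(bw), db_to_linear(db)))
    return checked

table = json.loads(ENTRY_JSON)
linear = {name: check_phoneme(name, entry) for name, entry in table.items()}
```

with these numbers the second formant sits at roughly half the amplitude of the first (-6 dB) and the third at about 6% (-24 dB).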

how it works

  1. NLPParser splits text on punctuation into breath groups, maps letters→phonemes via digraph+simple lookup
  2. FORMES scheduler applies phrase dynamics, stress accents, legato rules
  3. CoarticulationEngine blends formant targets across phoneme boundaries
  4. RegisterModel shifts formants based on f0 and voice type
  5. CHANTEngine renders: for each glottal period, spawns FOF grains per formant, applies antiformants, mixes noise for fricatives/plosives
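
steps 1 and 3 can be sketched in a few lines. this is an illustration under stated assumptions, not the script's implementation: the lookup tables are toy tables invented for the example, letters missing from them are skipped, and the blend only looks at the immediate neighbors rather than a configurable lookahead window.

```python
import re
import numpy as np

# Toy lookup tables, invented for this sketch (not the script's real tables).
DIGRAPHS = {"sh": "SH", "th": "TH", "ee": "IY", "oo": "UW"}
LETTERS = {"a": "AA", "e": "EH", "i": "IH", "o": "OW", "u": "AH",
           "h": "HH", "l": "L", "w": "W", "r": "R", "d": "D", "t": "T"}

def breath_groups(text):
    """Split text on punctuation into non-empty, lowercased groups."""
    parts = re.split(r"[.,;:!?]+", text.lower())
    return [p.strip() for p in parts if p.strip()]

def to_phonemes(group):
    """Try a two-letter digraph first, then fall back to a single letter."""
    out, i = [], 0
    while i < len(group):
        pair = group[i:i + 2]
        if pair in DIGRAPHS:
            out.append(DIGRAPHS[pair])
            i += 2
        else:
            if group[i] in LETTERS:      # unknown characters are skipped
                out.append(LETTERS[group[i]])
            i += 1
    return out

def coarticulate(targets, strength=0.25):
    """Pull each phoneme's formant targets toward its two neighbors.
    targets: (n_phonemes, n_formants) array of center frequencies in Hz."""
    t = np.asarray(targets, dtype=float)
    padded = np.pad(t, ((1, 1), (0, 0)), mode="edge")   # repeat first/last row
    prev, nxt = padded[:-2], padded[2:]
    return (1.0 - 2.0 * strength) * t + strength * (prev + nxt)

groups = breath_groups("Hello, world!")
phonemes = to_phonemes(groups[0])
blended = coarticulate([[700.0], [300.0], [700.0]])
```

a larger `strength` smears targets more; at 0 the phonemes keep their isolated formant values.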

lineage

inspired by xavier rodet's CHANT (1984) and the FORMES composition environment—where synthesis parameters became musical gestures and vocal models sang ircam's electroacoustic dreams.

What is Chant?

The CHANT Project: From the Synthesis of the Singing Voice to Synthesis in General

to do

  1. add streaming
  2. reduce compute costs
  3. some of the frequency sweeps could be manually tightened up
  4. midi?

license

do what thou wilt

About

implementation of CHANT - text2speech
