Key & Dromos Detection: The Hard Part
This is where Rast does what off-the-shelf tools cannot. Most key detectors decide between 24 keys — twelve major, twelve minor. Greek music lives in a richer modal world: eleven dromoi (singular dromos, "road"), the Greek counterpart to Turkish makam and Arabic maqam. Telling those eleven scales apart from audio alone is hard, and it is the project's core IP.
The core problem in one example
Look at these two scales, one rooted on D and one rooted on A:
- D Harmonic Minor: D, E, F, G, A, B♭, C#
- A Hidjaz: A, B♭, C#, D, E, F, G
They contain exactly the same seven notes. The difference is which note feels like home and which chords resolve where. A chroma vector, the 12-pitch-class summary used by virtually every traditional key detector, is tonic-blind: it records which pitch classes sound but has no idea which one is the tonic. To a chroma matcher, those two keys look identical.
Greek music is full of this. D Niavent and A Hidjazkar share notes. D Nikriz and A Harmonic Minor share notes. The eleven dromoi cluster into a few families with overlapping pitch-class sets, and chroma alone cannot pick between members.
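The overlap is easy to check by reducing each scale to a pitch-class set. This is a minimal sketch assuming 12-TET interval templates; the names `pitch_class_set`, `HARMONIC_MINOR`, `HIDJAZ` and `NIKRIZ` are illustrative and are not taken from the Rast codebase.

```rust
/// Semitone offsets from the root (12-TET approximations, assumed here).
const HARMONIC_MINOR: [u8; 7] = [0, 2, 3, 5, 7, 8, 11];
const HIDJAZ: [u8; 7] = [0, 1, 4, 5, 7, 8, 10];
const NIKRIZ: [u8; 7] = [0, 2, 3, 6, 7, 9, 10];

/// Pitch classes are numbered C = 0, C# = 1, ... B = 11.
const D: u8 = 2;
const A: u8 = 9;

/// Reduce a rooted scale to a 12-bit mask, one bit per pitch class.
/// The root is deliberately forgotten, which is the same information
/// loss a chroma vector suffers.
fn pitch_class_set(root: u8, intervals: &[u8]) -> u16 {
    intervals
        .iter()
        .fold(0u16, |mask, &step| mask | (1u16 << ((root + step) % 12)))
}
```

With these templates, `pitch_class_set(D, &HARMONIC_MINOR)` equals `pitch_class_set(A, &HIDJAZ)`, and `pitch_class_set(D, &NIKRIZ)` equals `pitch_class_set(A, &HARMONIC_MINOR)`, so no comparison of the masks alone can tell the members of a pair apart.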
same notes ─────────────┐
                        │
D Harmonic Minor ◄──────► A Hidjaz (root differs)
D Hidjaz         ◄──────► G Harmonic Minor
D Nikriz         ◄──────► A Harmonic Minor
D Minor          ◄──────► F Major ◄──────► A Ousak
                        │
chroma cannot break this tie
The trick: chord qualities reveal the root
What chroma can't see, the chord stream from chord detection can. The qualities of the chords on the I and V scale degrees, plus whether a bII chord ever shows up, read the modal fingerprint directly. A major I with a diminished V points to Hidjaz; a major I with a major V and a bII present narrows the field to Hidjazkar and Peiraiotikos.
The strategy lives in rust/rast-analysis/src/key_detection.rs:
- Find the tonic. Build a duration-weighted histogram of chord roots. Boost first and last chords (songs tend to begin and end on the tonic) and chords in the first and last 10% of the song; pick the strongest pitch class.
- Read the I and V quality. Look at every chord rooted on the tonic and every chord rooted seven semitones above. Collapse their qualities into four practice buckets (Maj, Min, Dom7, Dim) and majority-vote per degree.
- Check for a bII. Is there a chord rooted one semitone above the tonic? That note is a Phrygian-family marker.
- Look up candidates. A disambiguation table maps (I-quality, V-quality, has-bII) to a shortlist of dromoi.
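The first three steps can be sketched roughly as follows. This is not the code in key_detection.rs: the `Chord` struct, the function names, the two boost factors, and the choice to vote by chord count are all assumptions made for illustration.

```rust
/// The four practice buckets chord qualities are collapsed into.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Quality {
    Maj,
    Min,
    Dom7,
    Dim,
}

/// One detected chord. Field names are hypothetical.
#[derive(Clone, Copy, Debug)]
struct Chord {
    root: u8, // pitch class, C = 0 .. B = 11
    quality: Quality,
    start: f32,    // seconds from the start of the song
    duration: f32, // seconds
}

// Assumed boost factors; the text does not give the real weights.
const EDGE_CHORD_BOOST: f32 = 2.0; // first and last chord
const EDGE_REGION_BOOST: f32 = 1.5; // chords in the first or last 10%

/// Step 1: duration-weighted histogram of chord roots, with boosts.
fn find_tonic(chords: &[Chord], song_len: f32) -> Option<u8> {
    if chords.is_empty() {
        return None;
    }
    let last = chords.len() - 1;
    let mut hist = [0.0f32; 12];
    for (i, c) in chords.iter().enumerate() {
        let mut weight = c.duration;
        if i == 0 || i == last {
            weight *= EDGE_CHORD_BOOST;
        }
        let midpoint = c.start + c.duration / 2.0;
        if midpoint < 0.1 * song_len || midpoint > 0.9 * song_len {
            weight *= EDGE_REGION_BOOST;
        }
        hist[(c.root % 12) as usize] += weight;
    }
    // Strongest pitch class wins; ties go to the lower pitch class.
    let mut best = 0usize;
    for pc in 1..12 {
        if hist[pc] > hist[best] {
            best = pc;
        }
    }
    Some(best as u8)
}

/// Step 2: majority vote (by chord count) over chords on one degree.
fn vote_quality(chords: &[Chord], degree_root: u8) -> Option<Quality> {
    const BUCKETS: [Quality; 4] = [Quality::Maj, Quality::Min, Quality::Dom7, Quality::Dim];
    let mut best: Option<(Quality, usize)> = None;
    for &q in BUCKETS.iter() {
        let n = chords
            .iter()
            .filter(|c| c.root % 12 == degree_root % 12 && c.quality == q)
            .count();
        if n > 0 && best.map_or(true, |(_, m)| n > m) {
            best = Some((q, n));
        }
    }
    best.map(|(q, _)| q)
}

/// Steps 1 to 3 combined: (tonic, I quality, V quality, has bII).
fn fingerprint(
    chords: &[Chord],
    song_len: f32,
) -> Option<(u8, Option<Quality>, Option<Quality>, bool)> {
    let tonic = find_tonic(chords, song_len)?;
    let i_quality = vote_quality(chords, tonic);
    let v_quality = vote_quality(chords, (tonic + 7) % 12);
    // Step 3: is any chord rooted one semitone above the tonic?
    let has_flat_two = chords.iter().any(|c| c.root % 12 == (tonic + 1) % 12);
    Some((tonic, i_quality, v_quality, has_flat_two))
}
```

For a progression such as D, E♭, A dim, G minor, D, `fingerprint` yields tonic D with a major I, a diminished V and a bII present. A degree that never appears in the chord stream yields `None` for its quality, so the caller can decide how to handle a missing V.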
The disambiguation table reflects the rules in the Greek music theory reference — see Music theory reference for the full mapping. A few illustrative rows:
(I=Maj, V=Maj, no-bII) → Major
(I=Maj, V=Dim, any) → Hidjaz
(I=Maj, V=Maj, has-bII) → Hidjazkar / Peiraiotikos
(I=Min, V=Min, no-bII) → Minor / Nikriz
(I=Min, V=Maj, no-bII) → Harmonic Minor / Niavent
(I=Min, V=Dim, has-bII) → Ousak
Where the table returns more than one candidate, chroma breaks the tie: a duration-weighted average chroma over the whole song, weighted by RMS energy so silence doesn't dilute it, matched against each candidate's reference profile. A practice palette score also rewards songs that use the candidate dromos's diatonic chords.
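A rough sketch of the lookup and the chroma tie-break, under stated assumptions: `shortlist` encodes only the illustrative rows above and not the full table, frames are assumed to be of equal length so a per-frame average is also duration-weighted, and plain scale membership stands in for the real reference profiles, which are not described here. All names are hypothetical.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Quality {
    Maj,
    Min,
    Dom7,
    Dim,
}

/// Shortlist lookup covering only the illustrative rows.
/// Combinations not listed there return an empty shortlist.
fn shortlist(i: Quality, v: Quality, has_flat_two: bool) -> &'static [&'static str] {
    use Quality::*;
    match (i, v, has_flat_two) {
        (Maj, Maj, false) => &["Major"],
        (Maj, Dim, _) => &["Hidjaz"],
        (Maj, Maj, true) => &["Hidjazkar", "Peiraiotikos"],
        (Min, Min, false) => &["Minor", "Nikriz"],
        (Min, Maj, false) => &["Harmonic Minor", "Niavent"],
        (Min, Dim, true) => &["Ousak"],
        _ => &[],
    }
}

/// Semitone offsets from the tonic (12-TET approximations, assumed).
const MINOR: [u8; 7] = [0, 2, 3, 5, 7, 8, 10];
const NIKRIZ: [u8; 7] = [0, 2, 3, 6, 7, 9, 10];

/// Average chroma over (chroma, rms) frames, weighted by RMS energy,
/// so silent frames contribute nothing.
fn weighted_chroma(frames: &[([f32; 12], f32)]) -> [f32; 12] {
    let mut acc = [0.0f32; 12];
    let mut total = 0.0f32;
    for (chroma, rms) in frames {
        for pc in 0..12 {
            acc[pc] += chroma[pc] * *rms;
        }
        total += *rms;
    }
    if total > 0.0 {
        for v in acc.iter_mut() {
            *v /= total;
        }
    }
    acc
}

/// Stand-in for profile matching: the share of chroma energy that
/// falls on the candidate scale's pitch classes (0.0 to 1.0).
fn scale_score(chroma: &[f32; 12], tonic: u8, intervals: &[u8]) -> f32 {
    let total: f32 = chroma.iter().sum();
    if total <= 0.0 {
        return 0.0;
    }
    let inside: f32 = intervals
        .iter()
        .map(|&step| chroma[((tonic + step) % 12) as usize])
        .sum();
    inside / total
}
```

Because the tonic is already fixed by the chord stream, the candidates within one row differ in at least one pitch class, so this comparison has something to work with. For a song on D whose energy sits on G and B♭ rather than G♯ and B, `scale_score` favours `MINOR` over `NIKRIZ`.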
Notes from Basic Pitch, when available
The pipeline runs an optional fourth model — Spotify's Basic Pitch — on the instrumental stem to transcribe a list of individual notes. From those notes Rast builds two pitch-class histograms: one over all notes and one restricted to the bass register (MIDI 55 and below). The bass histogram is a strong tonic signal because bass lines outline roots; it helps in edge cases where a song's V chord is louder than its I (Hidjaz V → I cadences can otherwise flip the detected tonic by a fifth). If Basic Pitch isn't available, key detection falls back to chroma-only.
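The two histograms might be built along these lines. This is a sketch only: the `Note` struct is hypothetical and is not Basic Pitch's output schema, and weighting each note by its duration is an assumption, since the text does not say whether notes are counted or weighted.

```rust
/// A transcribed note. Hypothetical fields, not Basic Pitch's schema.
struct Note {
    midi: u8,      // MIDI note number
    duration: f32, // seconds
}

/// Upper edge of the bass register ("MIDI 55 and below").
const BASS_CEILING: u8 = 55;

/// Returns (all-notes histogram, bass-only histogram), both indexed
/// by pitch class with C = 0 .. B = 11 and weighted by note duration.
fn pitch_class_histograms(notes: &[Note]) -> ([f32; 12], [f32; 12]) {
    let mut all = [0.0f32; 12];
    let mut bass = [0.0f32; 12];
    for n in notes {
        let pc = (n.midi % 12) as usize;
        all[pc] += n.duration;
        if n.midi <= BASS_CEILING {
            bass[pc] += n.duration;
        }
    }
    (all, bass)
}

/// Strongest pitch class in a histogram, or None if it is empty.
fn strongest(hist: &[f32; 12]) -> Option<u8> {
    let mut best: Option<(u8, f32)> = None;
    for (pc, &w) in hist.iter().enumerate() {
        if w > 0.0 && best.map_or(true, |(_, b)| w > b) {
            best = Some((pc as u8, w));
        }
    }
    best.map(|(pc, _)| pc)
}
```

In a toy case with a long A4 in the melody and a D2 pedal in the bass, the all-notes histogram peaks on A while the bass histogram peaks on D, which is the fifth-flip the bass signal is meant to guard against.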
What you get
Rast returns a primary key string ("D Hidjaz"), a secondary candidate when ambiguity is significant, all 132 (root × scale) candidates with scores, and a list of relatives — dromoi sharing the same pitch-class set. Confidence under 60% surfaces the top candidates rather than a single confident answer.
Honest caveats
- Greek musicians borrow notes from neighbouring dromoi mid-song; a piece may not fit any single template perfectly. That is a feature of the music.
- Chroma is direction-blind, so true seyir analysis (Rast ascending as Major, descending as Mixolydian) is approximated.
- Microtonal scales — Sabah and Huzam — are flagged with an "approximate, microtonal intervals not in 12-TET" warning. The profile is the closest 12-tone match.
- The whole approach depends on the chord stream being roughly right. A song where chord detection fails will produce a confidently wrong key. Fixing chords in the editor and reanalysing usually fixes the key.
For the full disambiguation rule table, the diatonic chord list of each dromos, and the relative-dromoi map, see Music theory reference.