Center for Listening Research Dispatches · Long-form
CLR-002 · v.1
Filed 29 · APR · 2026
Dispatch № 002 · Discovery feed · Goose The algorithm flagged it. We re-listened.

A Prince cover, pegged at one.

A method note, written around a candidate match. CLR-001 calibrated our subtype detector against a canonical Phish reference (UIC '98 Bathtub Gin); the natural follow-up is to point that detector at unseen tape from a different band and see whether it surfaces analogs that hold up under listening review. We ran it across forty-eight Goose performances from late 2025 onward. This one, an eighteen-minute Prince cover from 4/24/2026, head in D♯ minor, came back with all three detectable subtypes firing, with the harmonic-distance metric pinned at the ceiling (1.000) for a sustained five-and-a-half-minute stretch. Sits in the same A + B + C bucket as the UIC reference. We re-listened. We're writing it up.

FILE COPY Do · Not · Remove
Performance
2026 · 04 · 24
Band · song
Goose · I Would Die 4 U
Run time
18:00
Keys (head/jam/tail)
D♯ min / C♯ maj / F♯ min
Subtypes detected
A · B · C · (D ?)
Classification
Uncatalogued

This is a methods follow-up to CLR-001. That dispatch took a long-canonical Type II, UIC '98 Bathtub Gin, and used it to anchor our detector chain: this is what an A + B + C performance looks like, here is each metric on it, here is the threshold each one had to cross. Useful by itself, but the more interesting question is whether a detector calibrated against one band's reference cases generalises to a different band's catalog. That's a standard validation move in any pattern-detection project. This dispatch is that move.

The protocol. Take the same detector chain we anchored against UIC '98. Aim it at a corner of the catalog the original calibration didn't see, Goose 2025–2026, forty-eight performances we ingested via Archive.org for an ongoing discovery feed. Run the four subtype detectors over each one. Look at the candidates that fire all three. Then sit down and listen to them, the way the dispatches always end: with ears.

One came back as a triple-fire: an eighteen-minute Prince cover from 4/24/2026, head in D♯ minor. We'd been listening with our ears, but not to this. The detector hit it harder than any of the Phish references it was anchored against, sustained harmonic-distance pinned at 1.000 from 6:30 to 12:00, chord-vocabulary divergence reaching 0.61. Those are bigger numbers than UIC '98, on those two axes, in our small sample. (UIC '98 still leads on the sustained-drone metric.) What does that mean? Probably nothing, on its own. Two recordings is two recordings. What we can say with a straight face is: the detector calibrated against a 1998 Phish reference put a 2026 Goose Prince cover next to it, in the same algorithmic bucket, on a regular tour night. That's the kind of pattern a method should produce if the method works. So we wrote it up.

Couple of caveats up front. This is forty-eight performances of one band over six months, against a reference set of twelve Bathtub Gins. Phish has played 2,000+ shows; Goose has played several hundred. Neither sample is anything like comprehensive. Three of the forty-eight Goose performances came back as 2-of-3 partials, close but missing one detector, which we list later as runners-up. There are absolutely more triple-fires sitting on tape we haven't pulled yet, in either band's catalog. None of this is a ranking. The dispatch walks through what the detector saw on this recording, minute by minute, with the figures styled the same way as CLR-001 so you can read the two side by side. Hit play and follow along.

Hit play
0:00 / --:--
Figure 01 · The chord ribbon

Where the band sits, over eighteen minutes.

Each cell is the dominant chord during one thirty-second window. Color = chord identity. The thin white ticks underneath are minute marks. Read it left-to-right with the audio. The composed Prince head sits in D♯ minor for the first two and a half minutes, D♯ minor and F♯ major cells alternating. After that the ribbon shifts into a tight C♯ minor / C♯ major palette and stays there for most of the next fifteen minutes, with brief excursions through E major and F♯ major before returning to D♯ minor at the end. Five distinct chords across the eighteen-minute performance, none of which dominate the head. Click anywhere to seek the audio there.

FIG-01 · Dominant chord per 30-second window
Figure 02 · The energy contour

The peak sits on the way home.

Loudness over time. Different shape from a Phish Type II, the dynamic climax doesn't come during the deepest harmonic departure. It arrives near the end of the jam, around 16:00, after the band has already started working back toward home. The stretch where the harmonic-distance metric is highest (6:30 – 12:00) is actually quieter than the head was. The crescendo is a release, not a peak in itself.

FIG-02 · RMS envelope
Reference · The four subtypes

What the algorithm is looking for.

Before the detector-signal plot below. IWD4U lights up the first three on chord-only metrics. The fourth, originally written as “not yet measurable,” now has a candidate non-chroma metric from § 04 below: longest contiguous jam stretch with no multi-feature regime change. IWD4U sustains a single texture for 10 minutes (2:30 to 12:30), passing the provisional ≥ 8-minute threshold. So by this candidate metric IWD4U may be a four-of-four. The metric is provisional and a small-sample finding (n=10 BTGs, replicated in § 04). Threshold not yet calibrated.

A
Key departure

Leaving the home key.

The jam ventures to a non-home key. Detected by checking, window by window, how far the tonal center has drifted from the head's key.

Threshold 0.10  ·  cosine distance · 0=identical key, 1=orthogonal
IWD4U: 0.63 (peak window 1.000, sustained 5.5 min)
B
Vocabulary expansion

New chords, same key.

The jam stays near the home key but uses a chord palette the head never visits. Measured by the divergence between the head's and jam's chord histograms.

Threshold 0.18  ·  Jensen-Shannon divergence · 0=same chord palette, 0.69=disjoint
IWD4U: 0.30 (peak window 0.61)
C
Sustained drone

Sitting on one chord.

The band holds individual chords for ten to thirty-plus seconds. Measured by the ratio of average chord-segment length in the jam compared to the head.

Threshold 1.5×  ·  jam segments held this many times longer than head's average
IWD4U: 1.61× · weaker than UIC '98 (3.98×)
D
Textural commitment · candidate

Locking in a feel.

A non-chroma metric: the longest contiguous jam stretch with no multi-feature regime change (MFCC novelty + onset density + centroid + spectral flux). The probe is the one used in § 04 below. Type II performances stay textured-committed for ≥ 8 minutes; baselines do not (4/4 vs 0/2 in our 10-performance sample).

Threshold ≥ 8 min  ·  longest contiguous jam stretch with no multi-feature non-chroma regime change
IWD4U: 10.0 min · 2:30 → 12:30 · candidate D firing
Figure 03 · The subtype signals

The pink line, pegged at one.

The pink line is Subtype A (key-departure). The blue line is Subtype B (chord-vocabulary divergence). Each has a dashed threshold drawn. The bottom strip shades pink / blue / orange when each subtype is firing.

FIG-03 · Subtype detector signals + firing strip
The launch is noisy. Both signals first cross threshold at 2:30 (one window), then drop entirely at 3:00 as the band returns briefly to D♯ minor, then partial firing at 3:30, then sustained from 4:00 onward. Per-window dropouts in the launch period are typical, the chord-classifier briefly grabs home-key content during transitional moments. The same stutter pattern appears at the launch of UIC '98. From 6:30 to 12:00 the pink line is at or near 1.0 for most windows, the band is in territory orthogonal to D♯ minor for roughly five and a half minutes. The drone strip (orange) lights up only twice in the entire performance (5:00 and 11:30), patchy compared to UIC '98's steady wall.
03 · Minute by minute

The score, marked up.

A typographic transcription of the time-series above. Clock at left. Subtype state in the middle. Field note at right. The two intensity peaks fall at roughly 11:30 and 19:30, a textbook two-summit dynamic arc.

Hit play, then read along. Each row highlights as the audio passes through it. Click any timestamp at left to jump there.
0:00 to 2:30
composed head
Composed Prince cover in D♯ minor. The verse-chorus structure runs straight: alternating D♯ minor and F♯ major cells. Energy builds from a quiet opening to RMS ≈ 0.17 by 2:00. None of the subtype signals are firing, this is just the song.
2:30
A + B fire (noisy)
First launch attempt. Both A and B cross threshold in the same window. Band moves from D♯ minor into C♯ minor; chord-vocab divergence jumps from 0.03 in the previous window to 0.50. The launch is not clean: in the very next window (3:00 to 3:30) both signals drop entirely, kd back to 0.0, the chord-classifier reads D♯ minor again. The same single-window dropout pattern shows up at the launch of UIC '98. By 4:00 both signals lock in continuously.
4:00 to 6:30
wandering · A + B
Sustained departure begins. The chord-classifier sits in C♯ minor; harmonic-distance is climbing window by window: 0.89 → 0.94 → 0.99. Subtype C (drone) fires for one brief window at 5:00 to 5:30 when chord_diversity drops to 1, then the band keeps changing chords again.
6:30 to 12:00
A pegged at 1.0
Five and a half minutes orthogonal to home. The Subtype A signal is pinned at 1.000, the maximum possible value, sustained across eleven consecutive windows. The chord-classifier reads C♯ major / minor most of the time. Drone fires on and off in this stretch (chord_diversity drops to 1–2 for many windows). Energy drops through this central passage to RMS ≈ 0.10; the band is quieter than the head was.
10:30 to 11:00
B peaks
Peak chord-vocabulary divergence: Subtype B reaches 0.61. Top chord here is C♯maj, a chord that doesn't appear in the composed head's palette at all. Drone is not firing in this exact window (chord_diversity is 2); C fires one window later at 11:30 to 12:00.
12:30 to 16:00
working back · A + B
The harmonic-distance metric falls off the ceiling but stays high (0.70 → 0.50 → 0.66 → 0.63). Band steps through G♯ minor and F♯ minor en route home. Energy starts climbing again.
16:00 to 16:30
dynamic peak
The loudest moment of the song. RMS reaches 0.16, the dynamic climax doesn't come during the deepest harmonic departure, it comes during the return. The band releases on the way home. F♯ minor here. Subtype A still firing but moderating.
17:00 to 18:00
settles · B only
Subtype A signal drops below threshold. F♯ major and F♯ minor, close enough to home that the key-departure detector turns off. B (chord-vocabulary) stays slightly above threshold through to the end. Tail key reads as F♯ minor; not quite the D♯ minor we started in. The song ends in a different mode than it began.
04 · What the chroma misses

The signals our detector chain cannot see.

A reader of an early draft of this dispatch listened back and reported hearing things changing in the recording that the score didn't describe. That observation is a useful test of the framework. Our four-subtype taxonomy (A, B, C, plus the not-yet-built D) is built on chroma features alone. Chroma is a 12-bin pitch-class energy vector, harmony in, harmony out. By construction it cannot see timbre, brightness, instrument density, or rhythmic feel. So we ran a non-chroma probe over the same recording and looked for moments where a non-chroma feature moves while the binary A/B/C firing pills stay flat.

The probe is four standard audio features computed at the same 30-second resolution as the chroma data: MFCC novelty (frame-to-frame distance in MFCC space, a textural / timbral motion signal), onset density (note onsets per second, a playing-density signal), spectral centroid (the perceptual brightness of the sound, sensitive to instrument color and distortion), and spectral flux (total spectral change per window). Five boundaries in the recording show a non-chroma feature jumping more than one standard deviation while A/B/C does not change at all. The clearest one is at 12:00 to 12:30.

FIG-04 · MFCC novelty + onset density · normalised, per 30-second window

Two regimes inside one supposedly-flat A+B firing zone. From 4:00 to 12:00 the band plays dense and lower-register: high MFCC novelty (texture moves a lot frame to frame), high onset density (5 to 6 onsets per second), centroid in the 1300 to 1800 Hz range. From 12:30 to 17:00 all three flip together: novelty drops by a third, onset density drops by a third, centroid jumps to 2000 to 2400 Hz. Same chord (C♯ minor, mostly), same firing state (A=1, B=1), completely different texture. The chord-classifier shows kd dropping from 0.94 to 0.70 across the same boundary, so we know something changes; what we did not know before this probe was that three independent non-chroma features all pivot at the same window.

The other ABC-independent jumps the probe surfaced: a textural shift at 6:00 to 6:30 (MFCC novelty z=1.14), onset-density jumps at 6:30 to 7:00 (z=1.33) and 7:00 to 7:30 (z=1.19), and a centroid drop at 16:30 to 17:00 (z=1.36) as the band starts working back toward home. None of these move the existing A/B/C pills. All of them are things a careful listener notices.

What this was, on first writing: a single-performance demonstration that the chroma-only framework collapses real motion into a binary state, and that off-the-shelf timbre / density / brightness features expose at least some of the motion that gets collapsed.

Update, same day: we ran the probe on nine more BTG performances (the full reference set + partials + baselines from CLR-001's table). The expected result, “Type II performances have more non-chroma motion,” turned out backwards. Baselines have more regime changes per minute (0.41) than A+B+C performances (0.18). Direction surprise. The metric that does separate them is the inverse: longest contiguous jam stretch with no multi-feature regime change. Type II performances lock into a single textural feel for an extended period; baselines cycle between feels.

Performance Class Jam length Longest sustained-texture stretch ≥ 8 min?
UIC '98 BTG A+B+C 22.6m 13.5m
VB '98 BTG A+B+C 13.5m 11.5m
Cincinnati '03 BTG C only † 25.2m 10.0m
Bethel '24 BTG A+B+C 17.3m 8.0m
CMAC '14 BTG A+B partial 9.9m 8.0m
Sphere '24 BTG baseline 12.8m 7.5m
Worcester '95 BTG C only 9.6m 6.0m
Berkeley '10 BTG A+B partial 9.6m 5.5m
Alpine '24 BTG baseline 10.6m 5.0m

Cincinnati '03 is community-tagged Type II but our chord-only chain only catches its drone (C). It passes this metric, agreeing with the listener consensus against our chroma-only chain. That's a useful signal that the metric is pointing in the right direction.

At threshold ≥ 8 minutes, the metric agrees with all four A+B+C performances and disagrees with both baselines. Partials split. Subtype D candidate, by this name and shape: textural commitment, an axis somewhat orthogonal to A/B/C, capturing how long the band locks into a single feel during the jam. It is provisional, n=10 is small, threshold is not formally calibrated, and the metric is a sum of off-the-shelf features (MFCC novelty + onset density + centroid + spectral flux) that almost certainly leaves room for refinement. Logged in the research notebook as RQ-D.1 through RQ-D.4.

Era effect worth flagging: 1990s Phish A+B+C performances (UIC, VB) sustain texture longer (13.5m, 11.5m) than modern Phish A+B+C (Bethel '24, IWD4U at 8.0m, 10.0m). Modern jams may be textured-by-design even when triple-firing on harmony.

Where the metric breaks. We extended the probe to non-BTG performances, including the 1994 Bangor and Bozeman Tweezers (community-canonical “outside the box” cases) and the 2024-07-21 Great Woods Tweezer (the other uncatalogued match in CLR-001 § 05). Bozeman '94 passes (13.5min sustained); Great Woods '24 passes (11.8min). Bangor '94 fails (7.5min), even though listeners agree it is unmistakably Type II. Bangor's profile turns out to be: high regime-change rate (0.52/min, baseline-like), and low harmonic concentration (the dominant pitch class shifts across 10 different keys with no clear home base, a chord-stability profile no other performance in the sample shares). So: the metric we built captures one mode of Subtype D and misses another. Bangor is doing something neither our chroma chain nor our non-chroma probe captures cleanly. RQ-D.6 (per-window MERT, the pre-trained transformer audio embedding the project uses elsewhere for jam similarity) is the natural next probe; it might unify the two modes, or reveal a third axis we haven't named yet. Logged in the research notebook as RQ-D.10 (the Tweezer extension) and RQ-D.11 (the falsified D-shift hypothesis).

05 · Side-by-side with the calibration set

The reference cases, plus this one.

Four performances in our working sample cross all three thresholds. Three are the Phish Bathtub Gin reference cases CLR-001 was calibrated against, UIC '98 (the canonical anchor), Virginia Beach '98, Bethel '24. The fourth is this. Not “four out of two thousand”, that's not a number we have. It's four out of the ~120 performances we've ingested for method development, and the reference cases were specifically chosen because they were already known triple-fires. The point of the comparison is to look at the metric values side by side and see how the Goose performance shapes up against the cases the detector was anchored on. Different shape, same classification.

Performance Length Head key Jam key Key dep. Chord div. Drone ratio Notes
1998 · 08 · 09 Virginia BeachPhish · Bathtub Gin · jamcharts 15:02 G min C maj 0.24 0.12 2.72 1998-era jamchart classic
1998 · 11 · 09 UICPhish · Bathtub Gin · canonical reference 24:06 G maj C maj 0.34 0.21 3.98 The reference. → CLR-001
2024 · 08 · 11 BethelPhish · Bathtub Gin · recent 18:49 C maj F maj 0.43 0.11 1.57 Modern A+B+C, twenty-six years later

Two of these chord-vocabulary cells (Virginia Beach 0.12, Bethel 0.11) sit below the headline 0.18 threshold on the aggregate measurement, but the per-window aggregator catches their B firings during specific passages. We surfaced the call here for the same reason CLR-001 surfaced it for Bethel: borderline measurement-method choices should be visible in the dispatch, not buried.

Read across the “Key dep.” column: 0.24, 0.34, 0.43, 0.63. Read “Chord div.”: 0.12, 0.21, 0.11, 0.30. The Goose performance lands at the high end of both, in this small reference set. Read “Drone ratio”: 2.72, 3.98, 1.57, 1.61, drone is where IWD4U is weakest. What this means is intentionally narrow: against the four performances we're comparing, the Goose recording sits high on the harmonic axes and moderate on the textural one. Same classification, different shape. Wider claims would need a wider sample.

06 · The candidates, with their rough edges

The other near-misses in the same forty-eight.

When we ran the calibrated detector chain over the forty-eight Goose 2025-26 performances we'd ingested, IWD4U was the only triple-fire. Three more came back as 2-of-3 partials, performances where two detectors crossed threshold and one didn't. We list them here for transparency, not as a leaderboard. Each is a candidate, not a finding, they need their own close listening review before we'd say anything stronger. (And the sample itself is a working slice: forty-eight performances over six months, against a detector calibrated on twelve Bathtub Gins. More tape will land on the bench, and the picture will move.)

Runner-up · 01

Pancakes

2026 · 04 · 10 Asheville, 15:45

Key dep.
1.000
Chord div.
0.50
Drone ratio
0.81

Aggregate harmonic-distance pegged at 1.000; highest chord-vocab divergence among the forty-eight (0.50). Misses on drone, the band keeps changing chords throughout. A + B, no C. A candidate, pending listening review.

Runner-up · 02

Into The Myst

2026 · 04 · 24 to 11:24 · same night as IWD4U

Key dep.
0.872
Chord div.
0.444
Drone ratio
0.29

From the same show as IWD4U, two slots later. D maj → G min, strong on the harmonic axes. A + B, no C. Worth flagging that the 4/24 show has multiple candidates in it; whether that pattern means anything (or whether it's coincidence in a small sample) is a question for a later dispatch.

Runner-up · 03

Dustin Hoffman

2026 · 04 · 10 Asheville, 14:36

Key dep.
0.328
Chord div.
0.172
Drone ratio
2.29

Highest drone ratio in our forty-eight-show Goose 2025-26 sample (2.29). Modest key shift, B falls just below the 0.18 line. A + C, no B. Different texture from IWD4U, a candidate worth listening to with the drone hypothesis in mind.

07 · What follows

The dispatches queued behind this one.

A working notebook, written one performance at a time. Each entry develops or stress-tests a measurement; some will pan out, some won't, and the failures get filed too. The point is the method, not the catalog. A short list of what's on the bench next:

We post when something's worth filing, not on a schedule. If you want to hear about new dispatches, send a note: listen@zabriskie.app.

← Back to the Center