Center for Listening Research Dispatches · Long-form
CLR-001 · v.1
Filed 28 · APR · 2026
Dispatch № 001 · Improvisation Group Field-recording deep dive

Three signals on, all at once.

A method note, written around one Type II performance. (“Type II” is Phish-fan vernacular for an improvised jam that leaves the song behind: the band stops thinking about the composition and starts thinking about each other.) We've broken Type II into four candidate subtypes: key-departure, chord-vocabulary expansion, sustained-drone, and a fourth we can't measure yet. To calibrate the detectors we need a reference performance, something the community already agrees is a textbook Type II. We picked UIC '98: the twenty-four-minute Bathtub Gin from the UIC Pavilion, 11/9/1998, jamcharts-tagged, with decades-deep consensus behind it. This dispatch walks through what each of the three detectable subtypes looks like on that recording, end to end. Once anchored against a canonical reference, the same detector can be pointed at unseen tape from other bands. That's the validation test we run in CLR-002.

FILE COPY Do · Not · Remove
Performance: 1998 · 11 · 09
Venue: UIC Pavilion
Run time: 24:06
Keys (head/jam/tail): G maj / C maj / C maj
Subtypes detected: A · B · C · (D ?)
Classification: Canonical

This is our first dispatch, and it's a methods note. The Center is a small research lab studying improvisation in jam-band recordings. Before we can find anything in the corpus, we need to know what we're looking for: a worked example of an A + B + C performance, with each metric anatomised one by one. UIC '98 is that worked example. It's here as a reference, not a verdict.

The protocol. Pick a performance the community has long agreed is a textbook Type II: UIC's 11/9/1998 Bathtub Gin is jamcharts-tagged and decades-deep canon. Run our detector chain over it. Show that each subtype detector lights up where a careful listener would say it should: home key abandoned (Subtype A), chord palette the head never visits (Subtype B), individual chords held for thirty-plus seconds in sustained drones (Subtype C). Cross-check against the other Type-II-tagged Bathtub Gins in our small sample (Virginia Beach 1998-08-09 and Bethel 2024-08-11 also fire all three, with different shapes) and against the unflagged versions (Worcester '95 has the drone but never leaves home; the 2024-04-19 and 2024-07-28 versions barely move at all). The point isn't a leaderboard. The point is that the detector behaves the way a fan would expect against the reference cases.

Vocabulary, since we'll lean on it: the head is the composed first ninety seconds of the song; the jam is the improvised middle; the tail is the last ninety seconds. Once a detector behaves sensibly against canonical references, the natural next step is to point it at unseen data (recordings the community hasn't flagged, ideally from a different band) and ask whether it surfaces analogs that hold up under listening review. CLR-002 is that next step. This one is the calibration.
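The head / jam / tail vocabulary reduces to a fixed split on the clock. A minimal sketch, assuming the head and tail are exactly ninety seconds each as defined above; the name `split_performance` and the dictionary layout are ours for illustration, not part of the detector chain.

```python
HEAD_SECONDS = 90.0  # composed head: first ninety seconds
TAIL_SECONDS = 90.0  # tail: last ninety seconds

def split_performance(duration_s):
    """Return (start, end) times in seconds for head, jam and tail."""
    duration_s = float(duration_s)
    if duration_s <= HEAD_SECONDS + TAIL_SECONDS:
        raise ValueError("performance too short to contain a jam")
    jam_end = duration_s - TAIL_SECONDS
    return {
        "head": (0.0, HEAD_SECONDS),
        "jam": (HEAD_SECONDS, jam_end),
        "tail": (jam_end, duration_s),
    }

# UIC '98 runs 24:06, i.e. 1446 seconds.
sections = split_performance(24 * 60 + 6)
print(sections["jam"])
```

For a 24:06 recording this puts the jam region at 1:30 to 22:36.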

Figure 01 · The chord ribbon

Where the band sits, over twenty-four minutes.

Each cell is the dominant chord during one thirty-second window. Color = chord identity. The thin white ticks underneath are minute marks. Read it left-to-right with the audio. The first ninety seconds are the composed head, with many chords. The long swath of one color in the middle is the band sitting on a single chord.

FIG-01 · Dominant chord per 30-second window
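One way a ribbon like FIG-01 could be built, sketched under assumptions: we assume the chord classifier emits one label per regular hop, and we take the most frequent label in each thirty-second window. The function name `dominant_chords` and the `(time_s, label)` input shape are hypothetical.

```python
from collections import Counter

def dominant_chords(frames, window_s=30.0):
    """frames: iterable of (time_s, chord_label) pairs at a regular hop.

    Returns the most frequent label in each window, in time order.
    Windows that contain no frames are skipped.
    """
    buckets = {}
    for time_s, chord in frames:
        index = int(time_s // window_s)
        buckets.setdefault(index, Counter())[chord] += 1
    return [buckets[i].most_common(1)[0][0] for i in sorted(buckets)]

# Toy frames (made up): a busy first window, a settled second one.
frames = [(0, "G"), (10, "C"), (20, "G"), (30, "C"), (40, "C"), (59, "E")]
print(dominant_chords(frames))
```

Because the hop is assumed regular, counting frames stands in for measuring how long each chord sounds within the window.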
Figure 02 · The energy contour

Two summits, about eight minutes apart.

Loudness over time. The two peaks at 11:30 and 19:30 are the dynamic climaxes, moments where the band locks in and the room comes up with them. The central drone passage sits between them, holding the energy steady before each release.

FIG-02 · RMS envelope
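The envelope in FIG-02 is a root-mean-square loudness curve. A minimal sketch of an RMS envelope over fixed, non-overlapping windows, in plain Python; the window length behind the figure isn't stated here, so `window_s` is a placeholder, and `rms_envelope` is an illustrative name.

```python
import math

def rms_envelope(samples, sample_rate, window_s=30.0):
    """Root-mean-square amplitude of each non-overlapping window."""
    window = int(sample_rate * window_s)
    if window <= 0:
        raise ValueError("window must cover at least one sample")
    envelope = []
    for start in range(0, len(samples), window):
        chunk = samples[start:start + window]
        envelope.append(math.sqrt(sum(x * x for x in chunk) / len(chunk)))
    return envelope

# Toy signal (made up): a quiet second followed by a loud one, at 4 samples/s.
print(rms_envelope([0.5] * 4 + [-1.0] * 4, sample_rate=4, window_s=1.0))
```

The final window may be shorter than the rest; it is averaged over the samples it actually contains.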
Reference · The four subtypes

What the algorithm is looking for.

A reference to read before the detector-signal plot below. UIC '98 lights up the first three on chord-only metrics. Subtype D (textural commitment) was originally listed as “not yet measurable”; a candidate non-chroma metric came out of CLR-002 § 04 and, applied here, gives UIC '98 a 13.5-minute sustained-texture stretch, the longest in our 10-performance sample so far. So UIC '98 may light up all four after all, by this candidate metric. The metric is provisional and the threshold is not yet calibrated. Originally intended to capture the 1994 Tweezer feel, the metric instead surfaced a different axis: how long the band locks into a single texture during the jam. Most likely the originally named “Subtype D” concept will end up subdivided.

Subtype A · Key departure

Leaving the home key.

The jam ventures to a non-home key. Detected by checking, window by window, how far the tonal center has drifted from the head's key.

Threshold 0.10  ·  cosine distance · 0=identical key, 1=orthogonal
UIC '98: 0.34 (peak window 0.66)
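A minimal sketch of the Subtype A check, under assumptions: we represent the head's key and each jam window as 12-bin pitch-class profiles (the dispatch does not say which representation the detector uses) and compare them by cosine distance against the 0.10 threshold. Treating a value exactly at the threshold as firing is also an assumption, as are the names `cosine_distance` and `key_departure`.

```python
import math

THRESHOLD_A = 0.10  # Subtype A threshold quoted on the card above

def cosine_distance(u, v):
    """1 - cosine similarity: 0 for identical direction, 1 for orthogonal."""
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.sqrt(sum(a * a for a in u) * sum(b * b for b in v))
    if norms == 0:
        raise ValueError("profiles must be non-zero")
    return 1.0 - dot / norms

def key_departure(head_profile, window_profiles, threshold=THRESHOLD_A):
    """Per-window distance from the head, plus a firing flag per window."""
    distances = [cosine_distance(head_profile, w) for w in window_profiles]
    return distances, [d >= threshold for d in distances]

# Toy profiles (made up): one window matches the head, one shares no pitch classes.
head = [1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0]
far = [0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0]
distances, firing = key_departure(head, [head, far])
print(distances, firing)
```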
Subtype B · Vocabulary expansion

New chords, same key.

The jam stays near the home key but uses a chord palette the head never visits. Measured by the divergence between the head's and jam's chord histograms.

Threshold 0.18  ·  Jensen-Shannon divergence · 0=same chord palette, 0.69=disjoint
UIC '98: 0.21 · highest in corpus
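A sketch of the divergence behind Subtype B. With natural logarithms the Jensen-Shannon divergence runs from 0 (same palette) to ln 2 ≈ 0.69 (disjoint palettes), which matches the scale on the card. How the detector bins chords is not specified here, so the `histogram` helper and the toy chord lists are assumptions for illustration.

```python
import math
from collections import Counter

def histogram(chords, vocabulary):
    """Relative frequency of each vocabulary chord in a label sequence."""
    counts = Counter(chords)
    return [counts[c] / len(chords) for c in vocabulary]

def kl_divergence(p, q):
    """Kullback-Leibler divergence in nats; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence in nats, bounded by ln 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

# Toy label sequences (made up), one label per window.
head_chords = ["G", "C", "G", "Em"]
jam_chords = ["C", "F", "C", "C"]
vocabulary = sorted(set(head_chords) | set(jam_chords))
print(js_divergence(histogram(head_chords, vocabulary),
                    histogram(jam_chords, vocabulary)))
```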
Subtype C · Sustained drone

Sitting on one chord.

The band holds individual chords for ten to thirty-plus seconds. Measured by the ratio of average chord-segment length in the jam compared to the head.

Threshold 1.5×  ·  jam segments held this many times longer than head's average
UIC '98: 3.98×
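The drone ratio can be sketched as a ratio of mean run lengths. We assume a chord segment is a run of identical consecutive labels from the classifier, sampled at the same hop in head and jam; `mean_segment_length` and `drone_ratio` are illustrative names, not the detector's.

```python
from itertools import groupby

def mean_segment_length(labels, hop_s=1.0):
    """Average duration in seconds of runs of identical consecutive labels."""
    runs = [len(list(group)) for _, group in groupby(labels)]
    if not runs:
        raise ValueError("need at least one label")
    return hop_s * sum(runs) / len(runs)

def drone_ratio(head_labels, jam_labels, hop_s=1.0):
    """Mean jam segment length divided by mean head segment length."""
    return mean_segment_length(jam_labels, hop_s) / mean_segment_length(head_labels, hop_s)

# Toy labels (made up): the head changes chord every frame, the jam every four.
head = ["G", "C", "G", "C"]
jam = ["C"] * 4 + ["F"] * 4
print(drone_ratio(head, jam))
```

A ratio above the 1.5× threshold means jam chords are held, on average, at least one and a half times as long as head chords.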
Subtype D · Textural commitment · candidate

Locking in a feel.

The 1994 Tweezer pattern: same harmony and chord palette, but melody, timbre, and dynamics tell you the band has left the room. We have a candidate metric: the longest contiguous jam stretch with no multi-feature regime change (MFCC novelty + onset density + centroid + spectral flux). In the sample so far, Type II performances stay texture-committed for ≥ 8 minutes; Type I performances do not.

Threshold ≥ 8 min  ·  longest contiguous jam stretch with no multi-feature non-chroma regime change
UIC '98: 13.5 min · highest in the sample so far (n=10) → CLR-002 § 04
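The last step of the candidate metric, taking the longest stretch between regime changes, can be sketched on its own. We assume the regime-change timestamps have already been produced by an upstream multi-feature detector, which is not shown; `longest_committed_stretch` is a hypothetical name and the numbers are made up.

```python
MIN_COMMITTED_S = 8 * 60  # the provisional 8-minute threshold from the card

def longest_committed_stretch(jam_start_s, jam_end_s, change_times_s):
    """Longest gap in seconds between consecutive regime changes inside the jam.

    The jam boundaries count as edges; changes outside the jam are ignored.
    """
    inside = sorted(t for t in change_times_s if jam_start_s < t < jam_end_s)
    edges = [jam_start_s] + inside + [jam_end_s]
    return max(later - earlier for earlier, later in zip(edges, edges[1:]))

# Toy jam (made up): ten minutes with regime changes at 2:00 and 8:20.
stretch = longest_committed_stretch(0, 600, [120, 500])
print(stretch, stretch >= MIN_COMMITTED_S)
```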
Figure 03 · The subtype signals

A and B fire together at 1:30.

This is the moment that makes UIC '98 a triple-fire performance. The pink line is Subtype A (key-departure). The blue line is Subtype B (chord-vocabulary divergence). Each has a dashed threshold. The bottom strip shades pink, blue, or orange when the corresponding subtype is firing: A on top, B in the middle, C (drone) at the bottom.

FIG-03 · Subtype detector signals + firing strip
Both signals cross threshold around 1:30. The launch is noisy: A drops below threshold for individual windows at roughly 3:00 and 4:00, and B is intermittent through the same period, before both lock in continuously from 5:30 onward. That stuttery 1:30 to 5:30 zone is the band committing in starts and stops; the post-5:30 stretch is the sustained Type II. The drone strip (orange) fades in by 5:30 and dominates the central section. All three signals release in the final minute as the band works back home.
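Separating a stuttering launch from a continuous lock-in can be sketched as finding the longest unbroken run above threshold. This is an illustration under assumptions, not the firing-strip code itself: the signal values are made up, the thirty-second window is borrowed from FIG-01, and `longest_run` is a hypothetical helper.

```python
WINDOW_S = 30.0  # assumed window length, as in the chord ribbon

def longest_run(firing):
    """Return (start_index, length) of the longest run of True values."""
    best_start, best_length, start = 0, 0, None
    for index, on in enumerate(list(firing) + [False]):  # sentinel closes a final run
        if on and start is None:
            start = index
        elif not on and start is not None:
            if index - start > best_length:
                best_start, best_length = start, index - start
            start = None
    return best_start, best_length

# Toy signal (made up): two short bursts, then a sustained stretch.
signal = [0.05, 0.2, 0.3, 0.05, 0.2, 0.05, 0.3, 0.3, 0.3, 0.3]
firing = [value >= 0.10 for value in signal]
start, length = longest_run(firing)
print(start * WINDOW_S, length * WINDOW_S)
```

On the toy signal the early bursts register as firing windows, but the lock-in is reported only from the window where the signal stops dropping out.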
03 · Minute by minute

The score, marked up.

A typographic transcription of the time-series above. Clock at left. Subtype state in the middle. Field note at right. The two intensity peaks fall at roughly 11:30 and 19:30, a textbook two-summit dynamic arc.

0:00 to 1:30
composed head
Composed head in G major. Five to nine distinct chords per thirty-second window, typical Bathtub Gin verse-chorus alternation. Top chords: Cmaj · Gmaj · Emin.
1:30
A + B fire (noisy)
Subtypes A and B fire simultaneously. Band moves from G major into C major. This is when the performance starts to commit, but the launch is not clean: A drops back below threshold for one window at roughly 3:00 and again at 4:00, and B is intermittent through the same span. Per-window dropouts in the launch period are typical: the chord-classifier briefly grabs home-key content during transitional moments. By 5:30 both signals lock in for good.
5:30 to 15:00
full A + B + C
The central drone passage. For most thirty-second windows, the chord-classifier finds only one chord: the band sits on a single Cmaj for thirty-plus seconds at a stretch. The drone ratio (jam-segment length over head-segment length) reaches 3.98, second only to Worcester '95 in the entire corpus.
11:30
peak I
First intensity climax. RMS ≈ 0.575. The drone resolves upward into the loudest moment so far.
16:00 to 16:30
A peaks
Peak key-distance from the head. The Subtype A signal reaches 0.664, the most harmonically distant moment of the twenty-four minutes. This is the band as far from G major as it ever gets.
19:30
peak II
Second intensity climax. RMS ≈ 0.575 again. Two distinct dynamic peaks, the second one slightly later in tonal-distance space than the first.
21:00 →
return
Band starts working back toward home. Subtype signals stay on through 23:00, then begin to subside in the reverse of the order they locked in: C first, then B, then A as the head returns.
04 · UIC '98 in context

The reference set, all twelve.

A small reference set we hand-picked for calibrating the detector chain: six community-tagged Type IIs spanning three decades, two unremarkable 2010s versions, three 2024 baselines, and one post-hiatus monster. Twelve performances is not a survey of Bathtub Gin; there are hundreds. It's a working sample chosen to span the kinds of behavior we want the detectors to discriminate. What we use it for: confirm that the three jamcharts-tagged Type IIs (UIC '98, Virginia Beach '98, Bethel '24) all fire A + B + C, and confirm that the unremarkable versions and baselines do not. Both checks pass. UIC remains the loudest reference on chord-vocabulary divergence (0.21); on the whole-jam figure, Virginia Beach (0.12) and Bethel (0.11) sit well below it. That's the point: the detector should agree with the consensus on calibration cases. The interesting work happens later, when we point it at recordings nobody's tagged.

Performance | Length | Head key | Jam key | Key dep. | Chord div. | Drone ratio | Subtypes firing
1995 · 12 · 29 · Worcester · CCT, MA | 11:06 | C maj | C maj | 0.00 | 0.05 | 4.05 | C
1997 · 08 · 17 · Loring · Limestone, ME · “Loaded Gin” | 15:21 | C maj | C maj | 0.00 | 0.17 | 1.98 | B + C
1998 · 07 · 29 · Riverport · Maryland Heights, MO | 24:04 | G min | G min | 0.00 | 0.11 | 2.04 | B + C
1998 · 08 · 09 · Virginia Beach · GTE Amphitheater · jamcharts | 15:02 | G min | C maj | 0.24 | 0.12 | 2.72 | A + B + C
1998 · 11 · 09 · UIC · UIC Pavilion · jamcharts | 24:06 | G maj | C maj | 0.34 | 0.21 | 3.98 | A + B + C
1999 · 12 · 31 · Big Cypress · Seminole Reservation · NYE | 16:23 | G min | C maj | 0.18 | 0.06 | 1.80 | A + C
2003 · 02 · 22 · Cincinnati · U.S. Bank Arena · post-hiatus | 26:43 | C maj | C maj | 0.00 | 0.07 | 3.90 | C
2010 · 08 · 06 · Berkeley · Greek Theatre, CA | 11:07 | C min | C maj | 0.10 | 0.11 | 1.20 | A + B
2014 · 07 · 15 · CMAC · Canandaigua, NY | 11:25 | G maj | C maj | 0.10 | 0.13 | 1.42 | A + B
2024 · 04 · 19 · Sphere · baseline | 14:18 | C maj | C maj | 0.00 | 0.06 | 1.09 | None
2024 · 07 · 28 · Alpine Valley · baseline | 12:03 | C maj | C maj | 0.00 | 0.05 | 0.82 | None
2024 · 08 · 11 · Bethel · Bethel Woods · jamcharts (recent) | 18:49 | C maj | F maj | 0.43 | 0.11 | 1.57 | A + B + C

A note on the Bethel call. Bethel '24's whole-jam chord-vocabulary divergence averages to 0.11, below the 0.18 threshold the table is checking. We score it as Subtype B anyway because the per-window measurement crosses 0.18 during the F-major passage in the back half. The aggregate metric averages that excursion away; the per-window measurement catches it. That decision is a measurement-method choice, not a free pass, and it's the same choice that promoted Bethel from “A + C partial” on our earlier multi-metric scorecard to “A + B + C” here. A different reasonable threshold would leave it as A + C. We flag it because that kind of borderline call is exactly the sort of thing the dispatch should surface, not bury.
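The two scoring rules in the Bethel call can be put side by side. A minimal sketch, assuming the aggregate behaves like a plain mean of the per-window values (a simplification for illustration) and that one window over threshold is enough to fire; the values below are invented, not Bethel's measurements, and both function names are hypothetical.

```python
from statistics import mean

THRESHOLD_B = 0.18  # Subtype B threshold from the card

def fires_on_aggregate(per_window, threshold=THRESHOLD_B):
    """Score the whole jam by its average divergence."""
    return mean(per_window) >= threshold

def fires_per_window(per_window, threshold=THRESHOLD_B, min_windows=1):
    """Score the jam as firing if enough individual windows cross the threshold."""
    return sum(value >= threshold for value in per_window) >= min_windows

# Invented values: a quiet jam with a short excursion in the back half.
values = [0.05, 0.06, 0.05, 0.22, 0.24, 0.04]
print(fires_on_aggregate(values), fires_per_window(values))
```

Raising `min_windows` is one way to make the per-window rule stricter without going back to the aggregate.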

05 · The validation step

Take the calibrated detector, aim it at unseen tape.

A reference example is only useful if the detector calibrated against it generalises. So once we'd anchored A + B + C against UIC '98 (and cross-checked it against Virginia Beach '98 and Bethel '24), we ran the same detector chain over recordings the community had not flagged. Two early candidates came back as triple-fires from outside the Bathtub Gin reference set. These are working candidates, not pronouncements; both still need close listening review before we'd call either a confirmed Type II. The point of showing them is methodological: this is what the validation step looks like, with all the rough edges visible.

Match · 01 · Phish

Great Woods Tweezer

2024 · 07 · 21 · 21:48 minutes

Key dep.: 0.513
Chord div.: 0.161
Drone ratio: 1.94

Plus a tail-outlier score of 0.177, the highest of the Tweezers we've looked at, including the 1995 Memphis fifty-minute Tweezer at 0.087. (The score measures how unusual the song's last ninety seconds sound versus other Tweezer tails in our small reference set; higher means more unusual.) Within-song z-score on chord-vocabulary divergence is +1.84. A candidate match worth a closer listen, not a verdict.
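A within-song z-score asks how far one performance sits from the other versions of the same song, in units of their spread. A sketch under assumptions: we use the population standard deviation and include the performance in its own reference pool, neither of which is confirmed by the dispatch; `within_song_z` is an illustrative name and the pool below is made up.

```python
from statistics import mean, pstdev

def within_song_z(value, song_values):
    """Standard score of one performance against all versions of the same song."""
    spread = pstdev(song_values)
    if spread == 0:
        raise ValueError("all versions identical; z-score undefined")
    return (value - mean(song_values)) / spread

# Made-up divergence values for five versions of one song.
pool = [1, 2, 3, 4, 5]
print(within_song_z(5, pool))
```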

Match · 02 · Goose

I Would Die 4 U

2026 · 04 · 24 · 18:00 minutes · Prince cover

Key dep.: 0.630
Chord div.: 0.295
Drone ratio: 1.61

A four-minute Prince cover taken into eighteen minutes of A + B + C territory by a different band. This is the case CLR-002 walks through in detail: the validation test against unseen data from outside Phish entirely.

06 · What follows

This is the first dispatch.

A calibration note, not a textbook. Each subtype has its own anatomy. Each candidate match has its own minute-by-minute story. Each open question (the fourth subtype, the role of recording quality, the day-of-week effects) gets its own dispatch as we work through it. The point of writing in public, slowly, against a small bench of recordings, is to make the reasoning visible while it's still being figured out. A short list of what's next on the bench:

We post when something's worth filing, not on a schedule. If you want to hear about new dispatches, send a note: listen@zabriskie.app.
