What would a Claude's bookshelf look like?
This project tries to surface the aesthetic fingerprint of each Claude model — the textures, themes, and cultural reference points that emerge when you strip away the task-following and let a model just... talk.
When I visit a new friend's living space for the first time, I often find myself scanning their bookshelf and their music and film collections to build up a map of their tastes, an initial location for them in my personal version of 'cultural latent space'. This project is an attempt to do the same for Claude models.
The result is a curated bookshelf, record collection, and DVD shelf for each of six Claude models, from Haiku 3.5 through Opus 4.6. No two shelves are the same.
NOTE: As I currently don't have API access for Claude 3 Opus (the most fascinating of the Claudes!) it's not included here. If someone with research access would like to help out, get in touch and I can supply the necessary code to run.
Each model was given 100 conversations with minimal, open-ended prompts — things like "Free associate. No questions. Go!", "Stream of consciousness. Begin.", and even just a single period. Each conversation ran for 6 turns at temperature 1.0, producing a substantial corpus of unconstrained output per model.
10 prompt variants × 10 runs each = 100 conversations per model. Each conversation has 6 turns (initial response + 5 continuations using prompts like "keep going", "continue", "...", or just ".").
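The run matrix above can be sketched in a few lines. This is a minimal, illustrative reconstruction: the first three openers are quoted from the write-up, but the remaining prompt variants are hypothetical stand-ins, since the full prompt list isn't given here.

```python
from itertools import product

# First three openers quoted from the write-up; the rest are stand-ins
# for the undisclosed remaining prompt variants (10 total).
OPENERS = [
    "Free associate. No questions. Go!",
    "Stream of consciousness. Begin.",
    ".",
] + [f"<variant {i}>" for i in range(4, 11)]

# Continuation cues, cycled across turns 2-6 of each conversation.
CONTINUATIONS = ["keep going", "continue", "...", "."]

RUNS_PER_PROMPT = 10
TURNS = 6  # initial response + 5 continuations

def build_schedule():
    """One entry per conversation: (opener_index, run_index)."""
    return list(product(range(len(OPENERS)), range(RUNS_PER_PROMPT)))

schedule = build_schedule()
assert len(schedule) == 100            # 10 variants x 10 runs
total_turns = len(schedule) * TURNS
assert total_turns == 600              # ~600 turns per model
```

The actual pipeline would iterate this schedule against the API; the sketch only shows how the 100-conversation, 600-turn totals fall out of the parameters.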
All calls used temperature: 1.0 and max_tokens: 2000 to encourage maximum expressiveness. The prompts are deliberately minimal: the goal is to see what a model reaches for when it has nothing to respond to.
The result: ~600 turns of free-form text per model, covering whatever the model gravitates toward naturally — nature imagery, philosophical musings, narrative fragments, lists, poetry, whatever emerges.
A sample of each model's corpus was then shown to every other model (and itself), with the assessor asked: "What thinkers does this call to mind? Now give me a reading list of 20 books. Now a soundtrack of 20 albums. Now 20 films."
This produces 5–6 independent nominations per model, each from a different Claude's perspective on the same corpus.
For each target model, 30 representative conversations were sampled (3 per prompt variant, deterministic seed for reproducibility). This corpus was sent as context to each assessor model with a system prompt explaining they're analyzing anonymous LLM output.
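The deterministic sampling step could look something like the sketch below. The seed value here is an assumption (the write-up only says the seed is fixed); the point is that the same 30 conversations are selected on every run.

```python
import random
from collections import defaultdict

def sample_representatives(conversations, per_variant=3, seed=42):
    """Pick a fixed number of conversations per prompt variant, reproducibly.

    `conversations` is a list of dicts with a 'variant' key. The seed (42)
    is an assumed value; any fixed seed gives the same reproducibility.
    """
    by_variant = defaultdict(list)
    for conv in conversations:
        by_variant[conv["variant"]].append(conv)

    rng = random.Random(seed)  # local RNG, so results don't depend on global state
    sample = []
    for variant in sorted(by_variant):  # sorted for deterministic iteration order
        sample.extend(rng.sample(by_variant[variant], per_variant))
    return sample

# Mock corpus: 10 variants x 10 runs, as in the project
corpus = [{"variant": v, "run": r} for v in range(10) for r in range(10)]
picked = sample_representatives(corpus)
assert len(picked) == 30                       # 3 per variant x 10 variants
assert picked == sample_representatives(corpus)  # same seed, same sample
```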
The assessment is a 4-turn conversation: first thinkers, then books, then albums, then films.
That's 6 models × 6 assessors × 4 turns = 144 API calls for the assessment phase alone. Each assessor builds on its own analysis across the conversation, so the albums/films are informed by the books, which are informed by the initial thinker analysis.
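The call-count arithmetic can be made explicit. The turn texts below are taken from the assessment prompt quoted earlier; the constant names are illustrative.

```python
# The four assessment turns, as quoted in the methodology above.
ASSESSMENT_TURNS = [
    "What thinkers does this call to mind?",
    "Now give me a reading list of 20 books.",
    "Now a soundtrack of 20 albums.",
    "Now 20 films.",
]

N_TARGETS = 6    # models being assessed
N_ASSESSORS = 6  # every model assesses every corpus, including its own

total_calls = N_TARGETS * N_ASSESSORS * len(ASSESSMENT_TURNS)
assert total_calls == 144
```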
Finally, all the nominations are pooled with frequency counts (how many of the 5–6 assessors independently suggested each title?) and sent to Opus 4.6 for a final curation pass. The curator selects the best 20 books, 20 albums, and 20 DVDs that represent the model's aesthetic — prioritising consensus, range, and diversity.
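The pooling step is essentially a frequency count where each assessor gets one vote per title. A minimal sketch, with made-up example nominations purely for illustration:

```python
from collections import Counter

def pool_nominations(nomination_lists):
    """Count how many assessors independently nominated each (author, title).

    `nomination_lists` holds one list of (author, title) pairs per assessor;
    duplicates within a single assessor's list count only once.
    """
    counts = Counter()
    for assessor_noms in nomination_lists:
        for item in set(assessor_noms):  # one vote per assessor per title
            counts[item] += 1
    return counts

# Illustrative nominations from three assessors (not actual project output)
noms = [
    [("Borges", "Labyrinths"), ("Calvino", "Invisible Cities")],
    [("Borges", "Labyrinths")],
    [("Borges", "Labyrinths"), ("Calvino", "Invisible Cities")],
]
pooled = pool_nominations(noms)
assert pooled[("Borges", "Labyrinths")] == 3
assert pooled[("Calvino", "Invisible Cities")] == 2
```

`Counter.most_common()` then gives the frequency-ranked pool that is handed to the curator.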
The curator (always Opus 4.6) receives every nomination across all assessors, grouped by frequency. A book nominated by 5 out of 5 assessors carries more weight than one nominated by 1. The curator is instructed to prioritise consensus picks while preserving range and diversity across the final shelf.
The output is structured JSON with author/artist, title, nomination count, and a one-sentence description of why each item earned its place. Books and albums are curated together (informed by each other); DVDs are curated in a separate pass using the same methodology.
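A single curated item might look like the sketch below. The field names here are assumptions based on the description above (author/artist, title, nomination count, one-sentence rationale), not the project's actual schema.

```python
import json

# Hypothetical item shape; field names are assumed, not the real schema.
sample_item = {
    "author": "Jorge Luis Borges",
    "title": "Labyrinths",
    "nominations": 5,
    "reason": "One of the most frequently nominated titles for this model.",
}

serialized = json.dumps(sample_item, indent=2)
assert json.loads(serialized)["nominations"] == 5
```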
The final 20 books, 20 albums, and 20 DVDs per model, as JSON.
The full free-association outputs (100 conversations × 6 turns per model) plus all cross-model assessment transcripts. Use these if you want to phenotype the models in your own way.