Portfolio · An AI music teacher
I had a practice tool that listened to me play — highlighted where the timing drifted, where a note was wrong, where I was getting stronger. When it was removed, so was my progression. I wanted a teacher who could see my score, hear me play, and know me well enough to push where I was ready and back off where I wasn't. No product does this. So I'm building one.
§ Two — Today
Tom is running. I talk to him every day.
Beat 01
Tom reads the manual so I don't have to. My Roland Fantom 08 keyboard ships with 190 pages of reference documentation and 354 figures — switches, parameters, patch diagrams, MIDI tables. I ingested the whole manual into Tom's knowledge base: chunked by heading, images captioned by GPT-4o-mini in line with the text they belong to. Now when I need to know how the split point works or which patch is the Rhodes, Tom tells me — citing the page and showing the figure.
The pattern is general: ingest a vendor's technical documentation, emit an expert assistant for the product's user. It applies to any domain where the knowledge lives in dense, illustrated documents people don't read.
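The chunk-by-heading step can be sketched in a few lines. The heading patterns below (numbered sections, all-caps titles) are assumptions about how the PDF text extracts; the real manual's conventions would drive the regex:

```python
import re

def chunk_by_heading(manual_text: str) -> list[dict]:
    """Split extracted manual text into one chunk per heading section.

    Assumes headings are either numbered ('3.2 Split Point') or
    all-caps lines; real extraction would need the manual's own patterns.
    """
    heading = re.compile(r"^(?:\d+(?:\.\d+)*\s+\S.*|[A-Z][A-Z \-/]{3,})$")
    chunks, current = [], {"heading": "FRONT MATTER", "body": []}
    for line in manual_text.splitlines():
        if heading.match(line.strip()):
            chunks.append(current)
            current = {"heading": line.strip(), "body": []}
        else:
            current["body"].append(line)
    chunks.append(current)
    # Drop sections with no body text (e.g. an empty front-matter chunk).
    return [
        {"heading": c["heading"], "text": "\n".join(c["body"]).strip()}
        for c in chunks
        if "".join(c["body"]).strip()
    ]
```

Each chunk then gets embedded alongside its captioned figures, so a query about the split point retrieves both the paragraph and the diagram that sits next to it.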
Beat 02
Tom lives across surfaces. Telegram for daily check-ins and quick exchanges; Claude Desktop for longer teaching conversations. Both surfaces share the same memory — mention a piece in Telegram on Monday and Tom knows about it in Claude Desktop the following Sunday. Every incoming message is classified — coach mode for practice check-ins, teacher mode for deeper reasoning — and routed to the right model. Haiku is cheap and responsive; Sonnet is expensive and thoughtful. The right one picks the message up automatically.
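A minimal sketch of the classify-and-route step. The model names and keyword cues here are placeholders; in production the classifier would itself be a cheap model call rather than a keyword match:

```python
from dataclasses import dataclass

# Placeholder tier names for the cheap/expensive models.
COACH_MODEL = "claude-haiku"     # quick, responsive check-ins
TEACHER_MODEL = "claude-sonnet"  # slower, deeper reasoning

# Illustrative cues that mark a practice check-in.
COACH_CUES = {"practiced", "session", "logged", "minutes", "done"}

@dataclass
class Route:
    mode: str
    model: str

def classify(message: str) -> Route:
    """Route a message to coach or teacher mode.

    A keyword heuristic stands in for the real classifier.
    """
    words = set(message.lower().split())
    if words & COACH_CUES:
        return Route("coach", COACH_MODEL)
    return Route("teacher", TEACHER_MODEL)
```

The routing decision is per-message, not per-surface, so a deep question typed into Telegram still reaches the thoughtful model.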
Beat 03
Memory is not one store. Conversations and observations ("triplets now solid at 230bpm") are episodic — they live in mem0, semantically searchable. The weekly plan and the daily check-in state machine are structured — JSON in Qdrant with deterministic reads and writes, because "did I log my guitar session yet today" isn't a semantic question. The FANTOM reference material is its own vector collection. Three stores, each chosen for what the data actually is.
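The three-store split can be sketched as a facade. The backends below are plain Python stand-ins for mem0, Qdrant, and the reference collection; the point is the interface: deterministic lookups for state, search for observations:

```python
class Memory:
    """Facade over three stores, with in-process stand-ins for each backend."""

    def __init__(self):
        self.episodic = []   # stands in for mem0 (semantic search in production)
        self.state = {}      # stands in for Qdrant JSON payloads (exact reads/writes)
        self.reference = []  # stands in for the FANTOM vector collection

    def observe(self, note: str) -> None:
        """Record an episodic observation, e.g. a practice milestone."""
        self.episodic.append(note)

    def recall(self, query: str) -> list[str]:
        """Substring match stands in for semantic search here."""
        q = query.lower()
        return [n for n in self.episodic if q in n.lower()]

    def set_state(self, key: str, value) -> None:
        # "Did I log my guitar session today" is a key lookup,
        # not a similarity query — hence a deterministic store.
        self.state[key] = value

    def get_state(self, key: str, default=None):
        return self.state.get(key, default)
```

The episodic and state paths deliberately never mix: fuzzy recall over plans and check-in state would make the daily state machine unreliable.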
Tom runs daily and adapts to the pieces I'm actually working on — Paradise, For Whom The Bell Tolls, Slow Dancing in a Burning Room. Total cost to run: about £0.70 a month.
§ Three — Tomorrow
What Tom is today is a foundation. What he becomes is a teacher who can see the score, hear the performance, and close the loop between the two.
AppFactory
This is a large build — larger than I would attempt alone. AppFactory is what makes it feasible: the agent system that turns architecture decisions into shipped software. Tom is the first real demonstration of what one engineer can build with AppFactory behind them.
The stack
Six modalities cohering through one teacher.
Asymmetry
One subtle call: piano and guitar need different pipelines. Piano emits MIDI natively — a clean, millisecond-accurate signal. Guitar emits audio through a pickup — messier, polyphonic, needs inference. Same teacher, two capture paths. The architecture mirrors the instruments rather than flattening them into one pipeline.
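The asymmetry can be made concrete as two stage lists that converge on the same final step. The stage names are illustrative, not the production pipeline:

```python
def capture_pipeline(instrument: str) -> list[str]:
    """Return the hypothetical processing stages for an instrument.

    Piano arrives as MIDI, already symbolic; guitar arrives as audio
    and needs transcription before any note-level comparison.
    """
    if instrument == "piano":
        return ["midi_in", "quantise", "align_to_score"]
    if instrument == "guitar":
        return ["audio_in", "onset_detect", "pitch_estimate", "align_to_score"]
    raise ValueError(f"no capture path for {instrument!r}")
```

Both paths end at the same alignment stage: one teacher, two front ends.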
Score-Match
At the end of that pipeline: a score-aligned playback diff. Record a take, compare it to the score, colour each note as I played it — green correct, amber timing off, red wrong note, grey missed. The capability whose removal ended my RSL journey — rebuilt, and this time it lives inside a teacher who knows me.
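The per-note colouring rule is small enough to sketch. The 50 ms tolerance and the (pitch, onset) tuple representation are assumptions for illustration, not Tom's real values:

```python
def colour_note(played, expected, timing_tol_ms=50):
    """Colour one played note against the score.

    played/expected are (midi_pitch, onset_ms) tuples; None means no note.
    The tolerance is an illustrative default.
    """
    if played is None:
        return "grey"    # missed note
    if expected is None or played[0] != expected[0]:
        return "red"     # wrong note
    if abs(played[1] - expected[1]) > timing_tol_ms:
        return "amber"   # right note, timing off
    return "green"       # correct
```

Run over an aligned take, this produces the coloured overlay: one verdict per note of the score.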