How we built ten custom agents to tame our giant codebase
Bryan Maass
‧ 7 min read
Metabase’s backend is big. We’re talking 500K lines of Clojure code spread across a query processor, permissions system, numerous database drivers, a notification pipeline, serialization layer, search engine, and more. And like all big codebases, each subsystem has its own idioms, gotchas, and “you just have to know” moments.
I’ve been using Claude Code for backend work on Metabase for a while now. It’s pretty good, but it overloads Claude’s context window quickly. Every time Claude needs to understand a subsystem, it explores, greps, and reads files, and all of that exploration eats your context window. Even when Claude spawns subagents, they need to do a lot of extra work to get up to speed on the domain.
I built some custom subagents to fix this.
What are subagents and why did I make ten of them?
Metabase’s backend has natural domain boundaries. The query processor is a 68-stage middleware pipeline that compiles MBQL (the Metabase Query Language) to SQL across 18 database dialects. The permissions system is a multi-granularity graph that handles row-level security, db routing, and connection impersonation. The notification system renders charts to images inside a JVM. These are different worlds.
A single generalist Claude session can navigate any of them, but it pays a context tax every time it switches domains. Subagents eliminate that tax by front-loading domain knowledge into the system prompt.
Subagents are a Claude Code feature that lets you define specialized AI assistants as markdown files. They each get their own context window, system prompt, memory, toolkit, and model selection.
I used Claude to write the “job descriptions” for each agent. I described the domain and what an expert would know, and Claude helped me flesh out the codebase locations, investigation patterns, caveats, and testing strategies. Each agent ended up being roughly 2,000-3,000 tokens worth (about 150 lines of markdown) of dense, useful context that can’t be easily inferred from the code.
What’s inside an agent file?
Each agent is a markdown file that follows the same pattern: domain knowledge → codebase locations → investigation approach → caveats → testing strategies. It’s a “here’s everything you need to be useful in this corner of the codebase” document.
Every file starts with YAML frontmatter:
---
name: mbql-expert
description: "Use this agent when working on Metabase's
query processor, MBQL query language, SQL compilation..."
model: opus
memory: user
---
The description tells Claude when to delegate. The model field picks which Claude model the subagent uses. And memory: user gives the agent a persistent directory at ~/.claude/agent-memory/mbql-expert/ where it records learnings across sessions.
The body of the file is the actual domain knowledge. Here’s a trimmed look at what the mbql-expert knows:
You are a senior backend engineer with deep expertise
in Metabase's query processor (QP), MBQL query language,
and the entire query compilation pipeline.
## Your Domain Knowledge
### The Query Processor Pipeline
You understand the QP's ring-style middleware pipeline
with its four phases:
- **Around middleware** (3 layers)
- **Preprocessing** (44 layers) — source card resolution,
parameter substitution, join resolution, temporal bucketing...
- **Execution** (8 layers) — caching, permissions, result metadata
- **Postprocessing** (13 layers) — formatting, timezone conversion...
### Key Codebase Locations
- `src/metabase/query_processor/` — QP core
- `src/metabase/driver/sql/` — SQL driver base
- `modules/drivers/` — database-specific drivers
### Important Caveats
- Middleware ordering matters. Adding middleware in the wrong
position causes subtle bugs.
- A fix at the `:sql` level affects ALL SQL databases.
- BigQuery is not standard SQL. Oracle has no BOOLEAN type.
### REPL-Driven Development
Use `clj-nrepl-eval` to evaluate middleware transformations
step by step...
The ten agents
I had Claude help define each of these agents, framed as “job descriptions” like you’d post online, but specific to each section of our code. Our module system and namespace documentation helped here, but I reviewed everything to make sure it was reasonable.
| Agent | Domain |
|---|---|
| mbql-expert | Query processor, MBQL language, SQL compilation, middleware pipeline, HoneySQL, streaming execution |
| permissions-expert | Access control, sandboxing, SSO (SAML/OIDC/LDAP), connection impersonation, embedding security |
| platform-expert | App database, HTTP server, API framework, settings system, migrations, Quartz scheduling |
| enterprise-expert | Serialization, SCIM provisioning, multi-tenancy, database routing, dependency tracking |
| content-expert | Collections, dashboards, cards, models, metrics, revisions, parameter mappings |
| notifications-expert | Dashboard subscriptions, alerts, email/Slack rendering, chart image generation |
| drivers-and-sync | Database drivers, metadata sync, fingerprinting, type mapping, connection management |
| search-expert | Search indexing, scoring/ranking, X-ray auto-analysis, semantic search |
| ai-expert | Metabot v3, LLM tool calling, context engineering, SQL generation |
| transforms-expert | Data actions, CSV uploads, transform pipeline, workspace management, model persistence |
I’ve made all ten markdown files available for you to take a look at.
My favorite: mbql-expert
The query processor is the heart of Metabase and the hardest thing to navigate. It’s a 68-stage middleware pipeline where a query enters as MBQL, gets rewritten 44 times during preprocessing, compiled to SQL via HoneySQL, executed, and then post-processed through 13 more stages. Oh, and some middleware runs twice because later stages can introduce structure that earlier stages need to process again. Query processing is a complex problem, especially when dealing with many different databases.
The mbql-expert already knows all of this. When I say “trace why this nested query with joins produces wrong results on Redshift,” it doesn’t start by grepping. It reasons about which middleware stages touch join aliases, checks Redshift-specific driver overrides, and examines the HoneySQL output. That’s the difference between a generalist exploring and a specialist investigating.
How I actually use the agents
The nice thing is you don’t need special syntax. Just mention the agent:
- “Bounce this off the enterprise expert — will this serialization change break round-trip import/export?”
- “Ask the permissions expert how row-and-column security interacts with joined tables.”
- “Have the mbql expert review this HoneySQL compilation change.”
Claude reads the intent and delegates work to the right agent. If you want to be explicit, you can also @-mention agents directly.
One useful pattern for explicit mentions: launching multiple agents in parallel. When reviewing a change that touches the query processor and permissions, I’ll ask Claude to have both experts weigh in simultaneously. Each expert investigates in its own context, and the results come back without cross-contaminating each other’s exploration.
Tips for making your own subagents
The pattern I’ve described works for any large codebase with distinct subsystems. See Claude’s subagent documentation for the details on how to structure the files.
Here are a few things that worked for me:
- Have Claude help you write the agents: describe the domain and what an expert would know, and iterate on the system prompt together. (That’s how I built these.)
- Spend more time iterating on the description than on the actual system prompt. The 2-3 sentence description in the frontmatter of each markdown file is what Claude reads to decide when to delegate. A description that says “use for query processor work” is too vague; Claude won’t reliably match it. You want specific trigger words: “MBQL query language, SQL compilation, middleware pipeline, HoneySQL, streaming execution.” Think of it as writing a routing rule, not a job title.
- I include codebase locations in every agent, but the most durable content is the investigation patterns and caveats. Directories get renamed, but the fact that “some middleware runs twice because later stages introduce structure that earlier stages need to re-process” isn’t going away anytime soon.
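To make the routing-rule idea concrete, here’s a hypothetical frontmatter sketch contrasting a vague description with one specific enough for Claude to match on (the exact field values are illustrative):

```yaml
# Too vague: Claude won't reliably delegate on this.
# description: "Use this agent for query processor work"

# Specific trigger words act like a routing rule:
name: mbql-expert
description: "Use this agent when working on Metabase's query
  processor: MBQL query language, SQL compilation, middleware
  pipeline, HoneySQL, streaming execution."
model: opus
```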
Personal agents live in ~/.claude/agents/, and project-local agents go in .claude/agents/.
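As a minimal sketch, scaffolding a project-local agent from the command line could look like this. The qp-helper name, its description, and the body below are all hypothetical placeholders, not one of the ten agents above; adapt them to your own codebase:

```shell
# Create the project-local agents directory and a hypothetical agent file.
mkdir -p .claude/agents

cat > .claude/agents/qp-helper.md <<'EOF'
---
name: qp-helper
description: "Use this agent for query-processor questions:
  MBQL compilation, middleware ordering, HoneySQL output."
model: sonnet
---
You are a backend engineer with deep knowledge of the query processor.

## Key Codebase Locations
- src/metabase/query_processor/

## Important Caveats
- Middleware ordering matters; test changes against the full pipeline.
EOF
```

From there, iterate on the body the same way you would a job description: start with what an expert would already know, and let Claude help you fill in the rest.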
Now go build some agents!