playbook
Write the query and learn the SQL
Get the exact query for the question you actually have, a line-by-line plain-English walkthrough of every clause, and a reconciliation against a known total — the answer and the understanding to defend it.
when to reach for this
You need a number — monthly active users by plan, revenue by region, whatever your manager just asked for — and you know roughly what SQL is but not enough to write it cold, or to trust a query someone hands you. Copy-pasting a query you don't understand is how a wrong number ends up in a deck with your name on it. This system gets you the query, a walkthrough that teaches you what each clause does, and a reconciliation step — so you can both produce the answer and explain it when someone pushes back.
gather this first
- The schema or a sample: the table name and its columns — paste the
CREATE TABLEor aschema.sql, or just a few example rows fromevents.csv. - The question in one plain sentence ("active users by plan, last 6 months") and how you'd define the fuzzy words in it — what counts as 'active', where the month boundary falls.
- A known total to reconcile against — a row count from the *Interview a dataset before you touch it* profile, or a number from last month's report.
the workflow
-
State the question precisely before any SQL
Most wrong queries come from a fuzzy question, not bad syntax. Make Claude pin down the definitions first — 'active' and 'last 6 months' mean nothing until you agree on them — so the query answers the question you actually have.
you askI want monthly active users by plan for the last 6 months from the events table. Before you write any SQL, read it back to me: how are you defining 'active', where does each month start and end, and which timezone? List any assumption you're making. Don't write the query yet.what you get back A short read-back: "active = at least one event in the month; months are calendar months in UTC; last 6 means Jan–Jun." If 'active' should mean a *login* not any event, you fix that here — before it's baked into the query.
This 'read the question back to me first' step is the whole game. A query that answers the wrong question runs perfectly and lies confidently.
-
Get the query with a line-by-line explanation
Ask for the SQL and the walkthrough in the same breath. The explanation is what turns this from copy-paste into something you understand well enough to modify and defend.
you askNow write the query for that. Then walk me through it clause by clause in plain English: what the WHERE filter does, what GROUP BY plan, month groups together, why you used that date function, and what each column in the output means. Assume I'm not a SQL expert.what you get back The query, then a numbered walkthrough: "WHERE event_date >= '2026-01-01' keeps only the last 6 months; GROUP BY plan, month makes one row per plan per month; COUNT(DISTINCT user_id) counts each user once even if they fired ten events." You leave knowing why, not just what.
-
Reconcile the result against a known total
A number you can't trace back to something you already trust is a number you can't share. Tie the query's output to the row count or a prior report before you believe it.
you askRun it and show me the result. Then reconcile: does the sum of users across all plans and months line up with the distinct user count in the raw table? If they don't match, find the gap — a filter dropping rows, a NULL plan, a duplicate — and tell me which.what you get back The result table plus a reconciliation: "14,210 distinct users total, matches the raw table — except 38 rows have a NULL plan and were excluded; here's the count with and without them." Now the number is defensible, edge cases and all.
Reconcile against a *known* total — a row count or last month's figure. A result that matches nothing you already trust is just a confident guess.
-
Stress-test the edges and save the explained version
Have Claude attack its own query for the cases that quietly break aggregates — duplicates, NULLs, timezone boundaries — then keep the commented version so future-you (and a reviewer) can read it.
you askWhat could make this query wrong that the result wouldn't reveal? Check for double-counting, NULLs in the GROUP BY, off-by-one month boundaries, and timezone edge cases. Then give me the final query with a one-line comment above each clause explaining it, ready to save as active-users.sql.what you get back A short risk list ("a user who switched plans is counted under both — is that intended?") and a commented
active-users.sqlyou can reread in three months and still understand.
make it your own
- **Spreadsheet, not a database:** same flow against a CSV — Claude can compute the answer directly over
events.csvand still explain each step; you don't need a live database to learn the logic. - **The number moved and you don't know why:** when the question is 'why did this change,' hand the explained query straight into *Find out why the number moved* — having the query already understood makes the root-cause hunt far faster.
- **A query you run every month:** once it's explained and reconciled, save the prompt as a
/maucustom command or a skill (see the *Features* tab) so the recurring number comes back the same way each time — the foundation the *Build a recurring analysis report* capstone leans on.
watch out for
- A query can be syntactically perfect and answer the wrong question. The read-back in step 1 is non-negotiable — verify the *definition* of every fuzzy word ('active', 'churned', 'month') before you trust the *output*.
- If the table holds personal data, run the query locally and share only the aggregate. Paste a schema and a few sample rows when you can rather than dumping a raw table full of emails into the prompt.
- Claude writes and explains the query; you own whether the answer is right. Read the walkthrough, reconcile the total, and confirm the logic matches your intent — don't ship a number you couldn't explain out loud.
you'll end up with The exact query for your real question, a plain-English walkthrough that taught you why it works, a reconciliation that proves the number, and a commented `.sql` file you can rerun and defend.