playbook
Build a recurring analysis report, end to end
Profile the data, write and explain the queries, reconcile every result, chart the key views, and wrap it in a plain-English narrative with caveats — a report you re-run the same way every month so comparisons compound.
when to reach for this
Someone needs the monthly numbers — the metrics review, the board update, the ops dashboard — and right now it's a half-day scramble of fragile queries and a deck you rebuild from scratch each time, slightly differently, so this month's numbers don't even compare to last month's. This is the capstone: it strings the other Data playbooks into one repeatable pipeline — profile, query, reconcile, chart, narrate — so the report comes out the same way every month. The payoff isn't just saving the half-day; it's that because the method is fixed, the month-over-month comparisons are finally trustworthy.
gather this first
- The source data as files or a table —
events.csv,sales.csv,customers.csv— and last month's version of this report, so you can build for comparison from the start. - The locked list of metrics the report must carry and exactly how each is defined (what 'active' means, where the month boundary falls) — the definitions can't drift between months or the comparison is meaningless.
- Known totals to reconcile against — last month's figures, a row count, a finance number — and the audience and format (an exec one-pager reads nothing like an analyst appendix).
the workflow
-
Profile every source the same way (Interview a dataset before you touch it)
Start by running the profiling playbook on each input so you know this month's files are shaped like last month's. A silent change — a new blank column, a shifted date range — is what makes two months' reports secretly incomparable.
you askProfile each source for this month's report — events.csv, sales.csv, customers.csv — exactly as a dataset interview: row counts, column meanings, date ranges, blanks, duplicates. Then compare each against last month's file and flag anything that changed: a new column, a jump in blanks, a different date format. Don't compute any metrics yet.what you get back A per-file profile plus a change report: "events.csv this month 1.1M rows vs 980K last month; new column experiment_id; blank-rate on region steady. sales.csv unchanged." You catch a schema change before it quietly breaks the comparison.
This 'is the data shaped like last time' check is what makes month-over-month numbers comparable. Skip it and you're comparing two slightly different things and calling it a trend.
-
Write and explain every metric query (Write the query and learn the SQL)
Build each metric as an explained, reusable query — locked to the agreed definitions — so the report computes the *same* thing every month and you can defend each number when asked.
you askFor each metric in the locked list — active users by plan, revenue by region, new vs returning customers — write the query, walk me through each clause in plain English, and use the exact definitions we agreed (active = a login in the month, UTC month boundaries). Save each as a named, commented .sql file so next month reruns the identical logic.what you get back One commented
.sqlper metric plus a walkthrough of each, all pinned to the same definitions —active-users.sql,revenue-by-region.sql— so the report is computed identically every month, not improvised. -
Reconcile every result and resolve the surprises (Find out why the number moved)
Tie each metric to a known total, then for anything that moved meaningfully, run the root-cause discipline — baseline, rule out a data artifact, break down by dimension — so the narrative explains real changes, not query bugs.
you askReconcile each metric against last month and against the raw row counts — do the totals tie out? For any metric that moved more than 10%, before we report it: rule out a data artifact (missing days, a tracking change), then break it down by the right dimension to find what actually moved. Tell me which changes are real and which are data quirks.what you get back A reconciliation table ("active users 14,210, ties to raw count; revenue +18% MoM") and, for each big mover, a verified cause: "revenue jump is real, driven by West region; the churn 'spike' is a data artifact — last month's file was missing two days." Real changes separated from noise.
Reconcile against a known total every single month, and never let an unexplained move into the narrative. An unreconciled number in a recurring report compounds — it sits in the trend line misleading everyone until someone finally checks.
-
Chart the key views (From a raw CSV to a chart you can paste into a slide)
Render the handful of views that carry the story as labeled image files, using the same chart choices as last month so the visuals are comparable at a glance — and verify each chart's numbers against the reconciled tables.
you askChart the key views as PNGs at slide resolution, matching last month's chart types exactly: revenue-by-month line, active-users-by-plan bars, MoM growth line. Title and label each with units, and confirm each chart's numbers match the reconciled tables before saving. Then assemble them into one report-dashboard.png with a takeaway headline.what you get back A consistent set of labeled PNGs and a
report-dashboard.pngthat looks like last month's so a reader can compare instantly — each chart's numbers already confirmed against the reconciled results, not redrawn from scratch. -
Wrap it in a narrative with caveats — and save the runbook
Turn the reconciled numbers and charts into a plain-English story an exec can read in two minutes, with an explicit caveats section — then capture the whole pipeline as a runbook (or command) so next month is a rerun, not a rebuild.
you askWrite the report as a plain-English narrative for execs: the 3 headline takeaways, what moved and the verified reason, and an honest 'caveats and data limitations' section (the churn artifact, the missing days, anything we excluded). Then write a runbook listing every step, file, and query so I — or anyone — can rerun this exact report next month and get a comparable result.what you get back A two-minute narrative report ("Revenue up 18% MoM, led by West; active users flat; note: churn understated this month due to a data gap") plus a
report-runbook.mdthat turns the half-day pipeline into a repeatable checklist.The caveats section is what makes the report trustworthy — a confident report that hides its data gaps is worse than one that names them. State what you're unsure of.
make it your own
- **A new source needs joining first:** if this month's report pulls together two files (CRM + billing), run *Turn two messy files into one trustworthy dataset* before step 2, and feed the reconciled combined file into the queries — the same
match_statusand totals discipline keeps the merged input trustworthy. - **Make it actually recurring:** once the runbook is stable, wrap the pipeline as a
/monthly-reportcustom command or hand it to a scheduled agent (see the *Features* tab) that profiles the new data, reruns the saved queries, and drafts the narrative for you to review — turning a half-day into a 30-minute review pass. - **Different cadence or audience:** collapse to a weekly one-pager (profile → top 3 metrics → reconcile → one chart → three sentences) for a standup, or expand with a full analyst appendix of the explained queries — the five-step spine is the same at any size.
watch out for
- The whole value is *comparability over time*, so the threat is silent drift — a redefined metric, a changed chart type, an unprofiled new file. Hold the definitions and the method fixed across months; if you must change one, say so explicitly in the caveats so the trend line stays honest.
- Recurring reports carry sensitive data by nature — revenue, customers, headcount. Aggregate before charting, keep raw files local, never paste identifying rows into a prompt, and have a human confirm the numbers before the report is distributed.
- Claude runs the pipeline; a human owns the report. Every headline number gets reconciled, every big mover gets a verified cause, and a person signs off before it goes out — Claude takes you from raw files to a defensible draft fast, it doesn't remove the judgment of what to claim and what to flag.
you'll end up with A complete, repeatable monthly report — profiled sources, explained and reconciled queries, comparable charts, a plain-English narrative with honest caveats, and a runbook that lets you rerun it identically — so each month's numbers genuinely compare to the last.