playbook

Build a recurring analysis report, end to end

Profile the data, write and explain the queries, reconcile every result, chart the key views, and wrap it in a plain-English narrative with caveats — a report you re-run the same way every month so comparisons compound.

advanced ~half a day

when to reach for this

Someone needs the monthly numbers — the metrics review, the board update, the ops dashboard — and right now it's a half-day scramble of fragile queries and a deck you rebuild from scratch each time, slightly differently, so this month's numbers don't even compare to last month's. This is the capstone: it strings the other Data playbooks into one repeatable pipeline — profile, query, reconcile, chart, narrate — so the report comes out the same way every month. The payoff isn't just saving the half-day; it's that because the method is fixed, the month-over-month comparisons are finally trustworthy.

gather this first

The source data as files or a table — events.csv, sales.csv, customers.csv — and last month's version of this report, so you can build for comparison from the start.
The locked list of metrics the report must carry and exactly how each is defined (what 'active' means, where the month boundary falls) — the definitions can't drift between months or the comparison is meaningless.
Known totals to reconcile against — last month's figures, a row count, a finance number — and the audience and format (an exec one-pager reads nothing like an analyst appendix).

the workflow

01
Profile every source the same way (Interview a dataset before you touch it)

Start by running the profiling playbook on each input so you know this month's files are shaped like last month's. A silent change — a new blank column, a shifted date range — is what makes two months' reports secretly incomparable.
you ask
```
Profile each source for this month's report — events.csv, sales.csv, customers.csv — exactly as a dataset interview: row counts, column meanings, date ranges, blanks, duplicates. Then compare each against last month's file and flag anything that changed: a new column, a jump in blanks, a different date format. Don't compute any metrics yet.
```
what you get back A per-file profile plus a change report: "events.csv this month 1.1M rows vs 980K last month; new column experiment_id; blank-rate on region steady. sales.csv unchanged." You catch a schema change before it quietly breaks the comparison.

This 'is the data shaped like last time' check is what makes month-over-month numbers comparable. Skip it and you're comparing two slightly different things and calling it a trend.
02
Write and explain every metric query (Write the query and learn the SQL)

Build each metric as an explained, reusable query — locked to the agreed definitions — so the report computes the *same* thing every month and you can defend each number when asked.
you ask
```
For each metric in the locked list — active users by plan, revenue by region, new vs returning customers — write the query, walk me through each clause in plain English, and use the exact definitions we agreed (active = a login in the month, UTC month boundaries). Save each as a named, commented .sql file so next month reruns the identical logic.
```
what you get back One commented .sql per metric plus a walkthrough of each, all pinned to the same definitions — active-users.sql, revenue-by-region.sql — so the report is computed identically every month, not improvised.
03
Reconcile every result and resolve the surprises (Find out why the number moved)

Tie each metric to a known total, then for anything that moved meaningfully, run the root-cause discipline — baseline, rule out a data artifact, break down by dimension — so the narrative explains real changes, not query bugs.
you ask
```
Reconcile each metric against last month and against the raw row counts — do the totals tie out? For any metric that moved more than 10%, before we report it: rule out a data artifact (missing days, a tracking change), then break it down by the right dimension to find what actually moved. Tell me which changes are real and which are data quirks.
```
what you get back A reconciliation table ("active users 14,210, ties to raw count; revenue +18% MoM") and, for each big mover, a verified cause: "revenue jump is real, driven by West region; the churn 'spike' is a data artifact — last month's file was missing two days." Real changes separated from noise.

Reconcile against a known total every single month, and never let an unexplained move into the narrative. An unreconciled number in a recurring report compounds — it sits in the trend line misleading everyone until someone finally checks.
04
Chart the key views (From a raw CSV to a chart you can paste into a slide)

Render the handful of views that carry the story as labeled image files, using the same chart choices as last month so the visuals are comparable at a glance — and verify each chart's numbers against the reconciled tables.
you ask
```
Chart the key views as PNGs at slide resolution, matching last month's chart types exactly: revenue-by-month line, active-users-by-plan bars, MoM growth line. Title and label each with units, and confirm each chart's numbers match the reconciled tables before saving. Then assemble them into one report-dashboard.png with a takeaway headline.
```
what you get back A consistent set of labeled PNGs and a report-dashboard.png that looks like last month's so a reader can compare instantly — each chart's numbers already confirmed against the reconciled results, not redrawn from scratch.
05
Wrap it in a narrative with caveats — and save the runbook

Turn the reconciled numbers and charts into a plain-English story an exec can read in two minutes, with an explicit caveats section — then capture the whole pipeline as a runbook (or command) so next month is a rerun, not a rebuild.
you ask
```
Write the report as a plain-English narrative for execs: the 3 headline takeaways, what moved and the verified reason, and an honest 'caveats and data limitations' section (the churn artifact, the missing days, anything we excluded). Then write a runbook listing every step, file, and query so I — or anyone — can rerun this exact report next month and get a comparable result.
```
what you get back A two-minute narrative report ("Revenue up 18% MoM, led by West; active users flat; note: churn understated this month due to a data gap") plus a report-runbook.md that turns the half-day pipeline into a repeatable checklist.

The caveats section is what makes the report trustworthy — a confident report that hides its data gaps is worse than one that names them. State what you're unsure of.

make it your own

**A new source needs joining first:** if this month's report pulls together two files (CRM + billing), run *Turn two messy files into one trustworthy dataset* before step 2, and feed the reconciled combined file into the queries — the same match_status and totals discipline keeps the merged input trustworthy.
**Make it actually recurring:** once the runbook is stable, wrap the pipeline as a /monthly-report custom command or hand it to a scheduled agent (see the *Features* tab) that profiles the new data, reruns the saved queries, and drafts the narrative for you to review — turning a half-day into a 30-minute review pass.
**Different cadence or audience:** collapse to a weekly one-pager (profile → top 3 metrics → reconcile → one chart → three sentences) for a standup, or expand with a full analyst appendix of the explained queries — the five-step spine is the same at any size.

watch out for

The whole value is *comparability over time*, so the threat is silent drift — a redefined metric, a changed chart type, an unprofiled new file. Hold the definitions and the method fixed across months; if you must change one, say so explicitly in the caveats so the trend line stays honest.
Recurring reports carry sensitive data by nature — revenue, customers, headcount. Aggregate before charting, keep raw files local, never paste identifying rows into a prompt, and have a human confirm the numbers before the report is distributed.
Claude runs the pipeline; a human owns the report. Every headline number gets reconciled, every big mover gets a verified cause, and a person signs off before it goes out — Claude takes you from raw files to a defensible draft fast, it doesn't remove the judgment of what to claim and what to flag.

you'll end up with A complete, repeatable monthly report — profiled sources, explained and reconciled queries, comparable charts, a plain-English narrative with honest caveats, and a runbook that lets you rerun it identically — so each month's numbers genuinely compare to the last.

Profile every source the same way (Interview a dataset before you touch it)

Write and explain every metric query (Write the query and learn the SQL)

Reconcile every result and resolve the surprises (Find out why the number moved)

Chart the key views (From a raw CSV to a chart you can paste into a slide)

Wrap it in a narrative with caveats — and save the runbook