playbook

Audit the month's spend

Scan transactions for anything unusual — charges far above normal, duplicates, vendors you've never paid — suggest a category for every uncategorized row, and surface the one line that needs explaining to leadership.

medium ~40 min

when to reach for this

Spend creeps in quietly: a subscription that auto-renewed at 3x last year's price, a vendor charge nobody recognizes, the same invoice paid twice. By the time it shows up in a quarterly review, three months have leaked. A spend audit is the monthly loop that catches it early — scan the transactions for outliers, duplicates, and first-time vendors, tidy up the uncategorized rows, and pull out the one line you'd actually have to explain to leadership. It's a habit, not a fire drill.

gather this first

This month's transactions, profiled via *Make a messy export trustworthy* — transactions.csv with date, vendor, amount, and (where it exists) category.
A baseline to compare against — last month's or the trailing 3 months' transactions, so 'unusual' means something relative to your normal.
Your category list — the buckets you actually use (Payroll, Hosting, Software, Travel, Contractors) so suggested categories match your chart of accounts, not generic ones.

the workflow

01
Establish what normal looks like

You can't flag 'unusual' without defining 'usual.' Have Claude summarize the baseline first — typical spend per category and per vendor — so outliers are measured against your real pattern, not a guess.
you ask
```
Read transactions.csv and last-3-months.csv. Before flagging anything, tell me what normal looks like: average monthly spend per category, the top 10 vendors by spend, and the typical range for each. Don't flag outliers yet — just establish the baseline.
```
what you get back
A baseline snapshot: "Software averages 9,200/mo across 14 vendors; Hosting 18,000; top vendor is Payroll Co at 165,000. Most single charges fall between 50 and 4,000." Now 'unusual' has a yardstick.

Anomaly detection without a baseline is just guessing. The baseline step is what turns 'this feels high' into 'this is 4x the trailing average.'
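
If you want to sanity-check the baseline Claude reports, the arithmetic is simple enough to reproduce. Here is a minimal sketch using only Python's standard library. The sample rows, the tuple layout, and the function names are illustrative assumptions, not part of the playbook.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical baseline rows: (date "YYYY-MM-DD", vendor, amount, category).
baseline_rows = [
    ("2024-02-03", "Cloud Co", 4000.0, "Hosting"),
    ("2024-02-10", "Tool Co", 480.0, "Software"),
    ("2024-03-03", "Cloud Co", 4200.0, "Hosting"),
    ("2024-03-10", "Tool Co", 480.0, "Software"),
    ("2024-04-03", "Cloud Co", 3800.0, "Hosting"),
    ("2024-04-10", "Tool Co", 480.0, "Software"),
]

def monthly_average_by_category(rows):
    """Average monthly total per category, over the months that category appears in."""
    totals = defaultdict(lambda: defaultdict(float))  # category -> month -> total
    for day, _vendor, amount, category in rows:
        totals[category][day[:7]] += amount  # "YYYY-MM" is the month key
    return {cat: mean(months.values()) for cat, months in totals.items()}

def top_vendors(rows, n=10):
    """Vendors ranked by total spend, largest first."""
    spend = defaultdict(float)
    for _day, vendor, amount, _category in rows:
        spend[vendor] += amount
    return sorted(spend.items(), key=lambda item: item[1], reverse=True)[:n]

baseline = monthly_average_by_category(baseline_rows)
vendors = top_vendors(baseline_rows)
```

One caveat in this sketch: a month where a category has no spend at all is skipped rather than counted as zero, so averages for irregular categories will read high.
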
02
Flag the three kinds of anomaly

Name the specific patterns — large outliers, duplicates, new vendors — so Claude checks each deliberately instead of giving a vague 'nothing jumps out.' Each one is a different kind of risk.
you ask
```
Now scan this month for three things: (1) charges far above the normal range for their category or vendor, (2) likely duplicate charges — same vendor, amount, and near-same date, (3) vendors that don't appear in the trailing 3 months. List each with the row, the amount, and why it's flagged. Mark each as a flag to check, not a confirmed problem.
```
what you get back
A flag list: "Outlier: Cloud Co 14,200 vs usual ~4,000 (3.5x). Possible duplicate: Tool Co 480 charged 5/12 and 5/13. New vendor: Northwind Partners 6,800 (first appearance)." Each tagged as something to verify.
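
To make the three checks concrete, here is a rough sketch of how each one could be expressed as a rule. The amounts mirror the example flags above. The year, the `outlier_ratio` of 3.0, the two-day duplicate window, and the function name are assumptions chosen for illustration.

```python
from datetime import date

# Hypothetical data. "usual" amounts would come from the baseline step.
usual_by_vendor = {"Cloud Co": 4000.0, "Tool Co": 480.0}
this_month = [
    (date(2024, 5, 4), "Cloud Co", 14200.0),
    (date(2024, 5, 12), "Tool Co", 480.0),
    (date(2024, 5, 13), "Tool Co", 480.0),
    (date(2024, 5, 20), "Northwind Partners", 6800.0),
]

def find_flags(rows, usual, outlier_ratio=3.0, max_gap_days=2):
    """Return (kind, row_index, vendor, amount) tuples. Flags to check, not verdicts."""
    flags = []
    for i, (day, vendor, amount) in enumerate(rows):
        if vendor not in usual:
            # Check 3: vendor absent from the baseline period.
            flags.append(("new vendor", i, vendor, amount))
        elif amount >= outlier_ratio * usual[vendor]:
            # Check 1: charge far above this vendor's usual amount.
            flags.append(("outlier", i, vendor, amount))
        # Check 2: same vendor and amount as an earlier row, within a few days.
        for earlier_day, earlier_vendor, earlier_amount in rows[:i]:
            same_charge = earlier_vendor == vendor and earlier_amount == amount
            if same_charge and abs((day - earlier_day).days) <= max_gap_days:
                flags.append(("possible duplicate", i, vendor, amount))
                break
    return flags

flags = find_flags(this_month, usual_by_vendor)
for kind, row, vendor, amount in flags:
    print(f"{kind}: row {row}, {vendor}, {amount:,.0f} (verify before acting)")
```

Exact matching on vendor name is the weak point here: a renamed vendor shows up as new, which is one reason every flag still needs a human look.
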
03
Categorize the loose ends

Uncategorized rows are where spend hides from every report. Have Claude propose a category for each, matched to your real buckets, but as suggestions you confirm — a wrong category is a wrong report.
you ask
```
List every uncategorized row and suggest a category for each from my list (Payroll, Hosting, Software, Travel, Contractors), with a one-line reason from the vendor name or amount. Where you're unsure, say so and offer two options instead of guessing. Don't change the file — I'll confirm each.
```
what you get back
A suggestion table: "GitHub 21 → Software (dev tool). Delta Air 640 → Travel (airline). 'Northwind Partners 6,800' → uncertain: Contractors or Software? Needs your call." Honest uncertainty where it exists, not false confidence.
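
The "suggest, don't apply" rule can be pictured as a function that returns candidates plus a confidence marker and never touches the file. This is a deliberately crude sketch. The keyword table and the function name are hypothetical, and Claude's own reasoning from vendor names is richer than a word lookup.

```python
# Hypothetical keyword hints; a real list would come from your own vendor history.
KEYWORD_HINTS = {
    "github": "Software",
    "air": "Travel",
    "cloud": "Hosting",
    "payroll": "Payroll",
}

def suggest_category(vendor):
    """Return (suggestions, confident). Only suggests; nothing is written back."""
    words = set(vendor.lower().split())
    matches = sorted({KEYWORD_HINTS[w] for w in words if w in KEYWORD_HINTS})
    if len(matches) == 1:
        return matches, True
    # No match, or several: return up to two options and mark as uncertain.
    return matches[:2], False

for vendor in ["GitHub", "Delta Air", "Northwind Partners"]:
    options, confident = suggest_category(vendor)
    label = options[0] if confident else f"uncertain {options}, needs your call"
    print(f"{vendor} -> {label}")
```

The point of the shape is the second return value: anything not marked confident goes to a person rather than into a report.
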
04
Surface the one line for leadership

A list of 12 flags is noise; the single line that needs a sentence in the leadership update is signal. Force the audit down to the one thing someone above you would ask about.
you ask
```
Of everything flagged, which single line most needs explaining to leadership this month, and why? Give me one sentence I could put in the update, plus a short list of the items I should personally verify before we close. Save the full audit as spend-audit-may.md.
```
what you get back
A sharp callout ("Software jumped 8,400 this month, driven by a one-time Cloud Co charge of 14,200; worth a sentence in the update"), plus a verify-these list and a saved spend-audit-may.md you can use as the starting point for next month's run.

make it your own

**Feeds the summary:** the one-line callout drops straight into *Brief leadership in plain English* as the 'thing to watch' — run the audit first so the summary writes itself.
**Tune the thresholds:** if 'far above normal' flags too much noise, tell Claude to only flag charges over 2.5x the category average or above a dollar floor — calibrate it to your tolerance.
**Run it on a schedule:** once stable, a scheduled agent (see the *Features* tab) can run the audit when the export lands and surface only the flags — but a human investigates each one before acting.
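
For the threshold tuning above, it can help to see the rule written out so you know exactly what you are asking for. A minimal sketch, assuming a hypothetical `passes_thresholds` helper. Treating the multiple and the dollar floor as filters that must both pass when both are set is one possible reading, not the only one.

```python
def passes_thresholds(amount, category_average, min_ratio=2.5, dollar_floor=None):
    """True if a charge is large enough to be worth flagging.

    min_ratio: flag only charges over this multiple of the category average.
    dollar_floor: optional; charges below it are never flagged.
    """
    if dollar_floor is not None and amount < dollar_floor:
        return False
    return amount > min_ratio * category_average

# Cloud Co example from the workflow: 14,200 against a usual ~4,000.
print(passes_thresholds(14200, 4000))                # True: over 2.5x
print(passes_thresholds(9000, 4000))                 # False: only 2.25x
print(passes_thresholds(120, 40, dollar_floor=500))  # False: 3x, but under the floor
```

Raising `min_ratio` or `dollar_floor` trades missed small anomalies for a shorter list, so adjust one at a time and compare the flag counts.
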

watch out for

Every flag is a hypothesis, not a verdict. A 'new vendor' might be a renamed old one; a 'duplicate' might be two legitimate charges. Claude surfaces; a human verifies before anything is reversed.
Don't let suggested categories auto-apply. A miscategorized row quietly distorts every downstream total — confirm each, especially the ones Claude marked uncertain.
Transaction data is sensitive. Keep it in your approved workspace, swap any account numbers or personal detail for [placeholder] before sharing, and never let a flag become an accusation about a person without checking the facts.
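
Swapping account numbers for [placeholder] is easy to do before the file ever leaves your machine. A small sketch, assuming account numbers show up as unbroken runs of eight or more digits. That pattern is a guess about your export, so check it against real rows, and note that it will not catch names or other personal detail.

```python
import re

# Assumption: an account number is a run of 8+ consecutive digits.
# Hyphenated dates like 2024-05-12 and ordinary amounts are left alone.
ACCOUNT_PATTERN = re.compile(r"\b\d{8,}\b")

def redact(text):
    """Replace anything that looks like an account number with [placeholder]."""
    return ACCOUNT_PATTERN.sub("[placeholder]", text)

print(redact("Wire to acct 123456789012, Tool Co 480 on 2024-05-12"))
```
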

you'll end up with
A monthly spend audit that catches outliers, duplicates, and unknown vendors early, cleans up uncategorized rows with your sign-off, and hands you the one line leadership needs to hear, plus a saved audit you re-run next month.
