The metrics everyone tracks vs. the ones that matter

Jun 9, 2026

Over the past few months, several big tech companies started ranking employees on how many AI tokens they consume. Internal leaderboards. Nvidia's CEO said he'd be deeply alarmed if an engineer making $500K wasn't burning $250K of compute.

The logic felt sound: tokens are measurable, real-time, and tied directly to cost. Heavy usage must mean heavy adoption.

You can guess what happened. Engineers reportedly wrote scripts that ran models in loops overnight, burning tokens to climb the leaderboard. The Databricks CEO pointed out how easy the metric is to game: resubmit the same query ten times, run a loop that does nothing useful. Meta quietly shut its leaderboard down in April after the story broke.

There's a name for this. Goodhart's Law, after the economist Charles Goodhart: when a measure becomes a target, it stops being a good measure. Software learned this once already, in the 80s, with lines of code. Engineers wrote longer programs, not better ones. Tokenmaxxing is the same movie with a bigger budget.

I keep thinking about this when I look at eCom dashboards.

ROAS is the token counter. It's visible, it updates hourly, and it looks rigorous in a deck. So it becomes the target. A team can spend a full quarter moving it from 2.8 to 3.1 and feel productive, the same way an engineer burning tokens overnight felt productive.

Meanwhile the numbers that decide whether the store actually makes money mostly go unmeasured. Checkout-to-purchase rate. Revenue per visitor. The share of first-time buyers who used a discount they never needed. No dashboards.

So they never become targets. So nobody manages them.
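None of these numbers is hard to compute once the raw counts are in one place. Here is a minimal sketch in Python, assuming you can export period totals from your store and analytics tools. The field names and the sample week are made up for illustration, not taken from any particular platform. Note that the code can only count first-time buyers who used a discount; whether they needed it is a judgment the data alone can't make.

```python
from dataclasses import dataclass


@dataclass
class PeriodTotals:
    # Hypothetical field names; map them to whatever your exports provide.
    visitors: int
    checkouts_started: int
    purchases: int
    revenue: float
    first_time_buyers: int
    first_time_buyers_with_discount: int


def ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def store_metrics(t):
    """Compute the three rarely tracked metrics from one period's totals."""
    return {
        # Of the people who started checkout, how many finished?
        "checkout_to_purchase_rate": ratio(t.purchases, t.checkouts_started),
        # Revenue spread over everyone who showed up, not just buyers.
        "revenue_per_visitor": ratio(t.revenue, t.visitors),
        # Share of first-time buyers whose first order carried a discount.
        "first_time_discount_share": ratio(
            t.first_time_buyers_with_discount, t.first_time_buyers
        ),
    }


# Illustrative numbers only.
week = PeriodTotals(
    visitors=10_000,
    checkouts_started=400,
    purchases=200,
    revenue=15_000.0,
    first_time_buyers=120,
    first_time_buyers_with_discount=90,
)
metrics = store_metrics(week)
print(metrics)
```

With these sample totals, half of started checkouts end in a purchase, each visitor is worth $1.50, and three in four first-time buyers came in on a discount. Putting those three numbers next to ROAS is the first step to anyone managing them.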

The lesson from tokenmaxxing isn't that tokens are bad. It's that the easiest number to see is rarely the number that matters.
