Which AI vendors actually let you track what you spend?
The two admins at your company who track AI costs as a second job should not have to cobble together usage reports downloaded from a wonky dashboard. Non-aggregated, atomic units of spend should be available programmatically so they can feed into the major cloud cost optimization tools.
That brings us here: a comparison of cost APIs, usage endpoints, and billing exports across AI providers.
| Vendor | Cost API | Usage API | Billing Export | Granularity | Grade |
|---|---|---|---|---|---|
| | Yes | Yes | Yes | Per-model, per-token, per-invocation | A+ |
| | Yes | Yes | Yes | Per-model, per-token, per-deployment, subscription-level | A+ |
| | Yes | Yes | Yes | Per-model, per-prediction, SKU-level | A+ |
| | Yes | Yes | Partial | Per-model, per-request, tokens aggregated to API key* | A- |
| | Yes | Yes | Partial | Per-model, per-request tokens | A- |
| | Yes | No | Yes | SKU and character-transaction level detail | B |
| | Partial | Yes | Partial | Per-model, per-request tokens | B |
| | Partial | Yes | Partial | Per-model, per-request (input tokens, output tokens, search units) | B |
| | Partial | Yes | Partial | Per-character, per-product, per-voice, per-API-key | B |
| | Yes | Partial | Yes | Per-request USD cost in response (opt-in), credit balance | B |
| | Partial | Partial | No | Per-model, per-request tokens | B- |
| | Partial | Partial | No | Compute seconds per run; multiply by hardware rate for USD | B- |
| | Partial | Yes | No | Credit balance and per-generation dollar cost in response | B- |
| | Partial | Yes | No | Credit balance and 90-day usage history by model and day | B- |
| | Partial | Partial | No | Credit balance via API, per-operation dollar costs | B- |
| | Partial | Partial | No | Credit balance via billing endpoint; credits require rate conversion | B- |
| | Yes | Yes | No | Credit balance via API, per-generation credit costs | B- |
| | Partial | Partial | No | Per-request cost in response, credit balance via API | B- |
| | Partial | Yes | No | Per-request, per-model, per-user token and cost data | B- |
| | No | Partial | Partial | Per-request token counts in response only | C |
| | No | No | Yes | Per-request token counts | D |
| | No | Partial | No | Per-request token counts in API response | D |
| | No | Partial | No | Per-request compute time; separate pricing and usage platform APIs | D |
| | No | Partial | No | Index-level metrics | D |
| | No | No | No | None | F |
| | No | No | No | None | F |
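
Several rows in the table report usage in units other than dollars: compute seconds that must be multiplied by a hardware rate, credits that need a rate conversion, or raw token counts. The sketch below shows roughly what that conversion step could look like before the data reaches a cost tool. Every function name and every rate in it is a hypothetical placeholder, not any vendor's actual API or pricing.

```python
from decimal import Decimal

# NOTE: all function names and rates here are illustrative assumptions.
# Substitute the rates published by the vendor you are actually billing against.


def compute_seconds_to_usd(seconds: Decimal, usd_per_second: Decimal) -> Decimal:
    """Compute-time billing: seconds used multiplied by the hardware rate."""
    return seconds * usd_per_second


def credits_to_usd(credits: Decimal, usd_per_credit: Decimal) -> Decimal:
    """Credit-based billing: credits consumed multiplied by the credit price."""
    return credits * usd_per_credit


def tokens_to_usd(
    input_tokens: int,
    output_tokens: int,
    usd_per_million_input: Decimal,
    usd_per_million_output: Decimal,
) -> Decimal:
    """Token-based billing: input and output tokens are priced separately."""
    million = Decimal(1_000_000)
    input_cost = Decimal(input_tokens) / million * usd_per_million_input
    output_cost = Decimal(output_tokens) / million * usd_per_million_output
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical example values only.
    print(compute_seconds_to_usd(Decimal("12.5"), Decimal("0.0004")))
    print(credits_to_usd(Decimal("250"), Decimal("0.01")))
    print(tokens_to_usd(1_000_000, 500_000, Decimal("2"), Decimal("8")))
```

`Decimal` is used instead of `float` so that small per-request amounts do not pick up rounding drift when thousands of them are summed.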
Methodology
Grades
Grades are based on whether a vendor provides programmatic access to cost data, usage metrics, and billing exports.