Tracking AI vendor transparency

Which AI vendors actually let you track what you spend?

Nobody should force the two admins at your company, who already handle this as a second job, to cobble together usage reports downloaded from a wonky dashboard. Non-aggregated, atomic units of spend should be available programmatically so they can be integrated into the major cloud cost optimization tools.

That brings us to this comparison of cost APIs, usage endpoints, and billing exports across AI providers.

39 vendors

Vendor	Cost API	Usage API	Billing Export	Granularity	Grade
AWS	Yes	Yes	Yes	Per-model, per-token, per-invocation; plus per-IAM-principal, per-application-inference-profile, per-project/workspace, and per-request-metadata attribution (native dollar grain is per usage type per day)	A+
Azure	Yes	Yes	Yes	Per-model, per-token, per-deployment, subscription-level	A+
GCP	Yes	Yes	Yes	Per-model, per-prediction, SKU-level	A+
Alibaba Qwen	Yes	Yes	Yes	Cost/usage by billing-item, instance, model, API key, workspace, token type; daily or monthly cycle	A+
Modal	Yes	Yes	Yes	Per-App, per-day or per-hour cost; optional per-App-tag breakdown	A+
OpenAI	Yes	Yes	Partial	Per-model, per-request, tokens aggregated to API key*	A-
Anthropic	Yes	Yes	Partial	Per-model, per-workspace, per-API-key, per-service-tier, per-context-window; aggregated in 1m/1h/1d buckets, per-request tokens in the response usage object	A-
fal.ai	Yes	Yes	Partial	Per-request and per-billing-event cost in USD; aggregated usage by timeframe (minute/hour/day/week/month); on-demand FOCUS CSV report	A-
DeepInfra	Yes	Yes	Partial	Per-request (POST /v1/request-costs, costNanoUsd) and per-model-per-month aggregate (GET /payment/usage)	A-
Novita AI	Yes	Yes	Partial	Per-bill line with cycleType (Hour/Day/Week/Month) and productCategory (llm/gpu/storage); monthly bills; USD amounts expressed in units of 1/10000 USD	A-
RunPod	Yes	Yes	Partial	Per serverless-endpoint / pod / GPU-type, time-bucketed (hour to year) over an arbitrary startTime/endTime range	A-
Oracle	Yes	No	Yes	SKU and character-transaction level detail	B
Mistral AI	Partial	Yes	Partial	Per-model, per-request tokens	B
Cohere	Partial	Yes	Partial	Per-model, per-request (input tokens, output tokens, search units)	B
DeepSeek	Partial	Partial	Partial	Per-request token counts in response; per-currency balance via GET /user/balance; manual per-month export	B
Fireworks AI	Partial	Yes	Yes	Per-request response usage; daily buckets (billingUsage) grouped by model/api_key/deployment/tags; per-event export-metrics CSV (tokens-only)	B
SiliconFlow	Partial	Partial	Partial	Per-request token counts in response; account balance in USD; aggregated usage/cost dashboard-only	B
ElevenLabs	Partial	Yes	Partial	Per-character/credit, per-product, per-voice, per-API-key, per-user, per-model, per-region, per-workspace; USD spend via fiat_units_spent metric	B
Nebius	Partial	Partial	Yes	Hourly FOCUS 1.2 line items with per-resource attribution over a billing-period range; Token Factory per-request token counts	B
AI21 Labs	Partial	Partial	No	Per-request token usage in response; dashboard aggregates by model	B-
xAI	Yes	Yes	No	Per-request response usage (cost_in_usd_ticks, 1 USD = 1e10 ticks) plus aggregated history via Management API /usage with time buckets, group_by, and aggregations	B-
Groq	Partial	Partial	No	Per-model, per-request tokens	B-
Replicate	Partial	Partial	No	Compute seconds per run; multiply by hardware rate for USD	B-
Baseten	Yes	Yes	No	Per-deployment / per-model / per-training-job; daily; tokens and billable minutes	B-
Leonardo.ai	Partial	Yes	No	Per-generation cost object (amount plus unit, dollars or credits) in response; USD account balance under PAYG; credit balance via GET /me	B-
Runway	Partial	Yes	No	Credit balance and 90-day usage history by model and day	B-
Runware	Yes	Yes	No	Per-request USD cost in response (opt-in); account balance and aggregated usage (credits/requests) via accountManagement getDetails for today/7d/30d/lifetime	B-
Recraft	Partial	Partial	No	Credit balance via API, per-operation dollar costs	B-
Mureka	Partial	Partial	No	Credit balance via billing endpoint; credits require rate conversion	B-
Luma	Yes	Yes	No	Credit balance via API (returned in USD cents); per-generation credit costs; queryable generations list endpoint	B-
Black Forest Labs	Partial	Partial	No	Per-request cost in response, credit balance via API	B-
Cursor AI	Partial	Yes	No	Per-request, per-model, per-user token and cost data	B-
BytePlus	No	Yes	No	Per-request token counts in API response; aggregated token usage queryable by day/hour via GetUsage API, by project/endpoint	C
Together AI	No	Partial	Partial	Per-request token counts in response only	C
Cerebras	No	Partial	Partial	Per-request token usage in response; Prometheus per-minute metrics; Console cost by model/token type per month	C
MiniMax.io	No	Partial	No	Per-request token counts; current-window quota-remaining endpoint	D
Marqo	No	Partial	No	Index-level metrics; ecommerce event analytics (search/click/conversion)	D
Soundraw	No	No	No	None	F
Bria	No	No	No	None	F
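
The Granularity column shows that vendors report spend in different units: xAI in ticks (1 USD = 1e10 ticks), DeepInfra in a costNanoUsd field, Novita AI in 1/10000 USD, Luma in USD cents, and Replicate in compute seconds that must be multiplied by a hardware rate. Below is a minimal Python sketch of normalizing these to dollars before loading them into a cost tool. The unit keys, function names, and the example hardware rate are hypothetical, and the nano-USD factor (1e9 per dollar) is an assumption inferred from the field name rather than a confirmed vendor specification.

```python
from decimal import Decimal

# Units per one US dollar, based on the Granularity column above.
# The dictionary keys are illustrative labels, not vendor field names.
UNITS_PER_USD = {
    "xai_ticks": Decimal(10) ** 10,    # xAI: cost_in_usd_ticks, 1 USD = 1e10 ticks
    "nano_usd": Decimal(10) ** 9,      # DeepInfra: costNanoUsd (assumed 1e9 per USD)
    "novita_units": Decimal(10) ** 4,  # Novita AI: amounts in 1/10000 USD
    "usd_cents": Decimal(100),         # Luma: balance returned in USD cents
}


def to_usd(amount: int, unit: str) -> Decimal:
    """Convert an integer amount in a vendor-specific unit to US dollars."""
    return Decimal(amount) / UNITS_PER_USD[unit]


def compute_cost_usd(seconds: str, rate_per_second: str) -> Decimal:
    """Estimate spend for compute-second billing (the Replicate-style row).

    Both arguments are passed as strings so Decimal parses them exactly;
    the rate must come from the vendor's published hardware pricing.
    """
    return Decimal(seconds) * Decimal(rate_per_second)


if __name__ == "__main__":
    print(to_usd(25_000_000_000, "xai_ticks"))   # ticks from a response usage object
    print(to_usd(1_500_000, "nano_usd"))         # a per-request cost in nano-USD
    print(compute_cost_usd("12.5", "0.000225"))  # hypothetical hardware rate
```

Decimal is used instead of float so that very small per-request amounts add up without rounding drift when thousands of records are aggregated.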
