Consistent usage metrics on every model response

Provider-reported usage metrics, including token counts, are now propagated to the model response on both the sync and async paths. Cost tracking, quota enforcement, and observability can read usage directly from the response, with no custom plumbing.

Details:

Uniform access to usage data across sync and async responses
Simplifies building budgets, alerts, and chargeback reporting
No migration or code changes required
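
As a rough illustration of what uniform access makes possible, the sketch below feeds usage from a sync call and an async call into the same budget tracker. Everything in it is hypothetical: `ModelResponse`, `Usage`, the `input_tokens` and `output_tokens` fields, and the `generate` and `agenerate` stubs are stand-ins invented for this example, not the actual API, so check the real response object for the attribute names it exposes.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Usage:
    # Hypothetical field names; the real usage object may name these differently.
    input_tokens: int
    output_tokens: int

    @property
    def total_tokens(self) -> int:
        return self.input_tokens + self.output_tokens


@dataclass
class ModelResponse:
    # Hypothetical response shape: text plus the usage attached to it.
    text: str
    usage: Usage


def generate(prompt: str) -> ModelResponse:
    # Stand-in for a sync model call; fakes token counts from the word count.
    usage = Usage(input_tokens=len(prompt.split()), output_tokens=5)
    return ModelResponse(text="ok", usage=usage)


async def agenerate(prompt: str) -> ModelResponse:
    # Stand-in for an async model call returning the same response shape.
    await asyncio.sleep(0)
    return generate(prompt)


class TokenBudget:
    """Accumulates token usage and reports it against a fixed limit."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.spent = 0

    def record(self, response: ModelResponse) -> None:
        # The same attribute path works whether the call was sync or async.
        self.spent += response.usage.total_tokens

    @property
    def remaining(self) -> int:
        return self.limit - self.spent

    @property
    def exceeded(self) -> bool:
        return self.spent > self.limit


def run_demo() -> TokenBudget:
    budget = TokenBudget(limit=20)
    budget.record(generate("summarize this report"))
    budget.record(asyncio.run(agenerate("translate this short sentence")))
    return budget
```

Because both paths return the same shape, the tracker needs no branch for sync versus async; alerts or chargeback reports could hang off `record` in the same way.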

Who this is for: Platform owners, FinOps, and engineering teams who track spend, quotas, or performance across models.