v2.3.23
January 7, 2026
Consistent usage metrics on every model response
Provider-reported usage metrics, including token counts, are now propagated to the model response in both the sync and async paths. Cost tracking, quota enforcement, and observability can read usage directly from the response, without custom plumbing.
Details
- Uniform access to usage data across sync and async responses
- Simplifies building budgets, alerts, and chargeback reporting
- No migration or code changes required
Who this is for: Platform owners, FinOps, and engineering teams who track spend, quotas, or performance across models.
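Example: the sketch below illustrates how a single budget check could consume usage data from both sync and async calls once it is available on every response. All names here (`Usage`, `ModelResponse`, `invoke`, `ainvoke`, `TokenBudget`) are hypothetical stand-ins written for this illustration, not the library's actual API; the real response type and field names may differ.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class Usage:
    # Hypothetical shape; real field names depend on the library and provider.
    input_tokens: int = 0
    output_tokens: int = 0

    @property
    def total_tokens(self) -> int:
        return self.input_tokens + self.output_tokens


@dataclass
class ModelResponse:
    # Hypothetical response object that carries usage alongside the content.
    content: str
    usage: Usage = field(default_factory=Usage)


def invoke(prompt: str) -> ModelResponse:
    # Stand-in for a synchronous model call (no network access here).
    return ModelResponse("sync reply", Usage(input_tokens=12, output_tokens=30))


async def ainvoke(prompt: str) -> ModelResponse:
    # Stand-in for an asynchronous model call.
    await asyncio.sleep(0)
    return ModelResponse("async reply", Usage(input_tokens=8, output_tokens=20))


class TokenBudget:
    """Accumulates token usage and fails once a limit is exceeded."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.used = 0

    def record(self, response: ModelResponse) -> None:
        # The same code path handles sync and async responses,
        # because both expose usage in the same place.
        self.used += response.usage.total_tokens
        if self.used > self.limit:
            raise RuntimeError(f"token budget exceeded: {self.used}/{self.limit}")


budget = TokenBudget(limit=100)
budget.record(invoke("hello"))
budget.record(asyncio.run(ainvoke("hello")))
print(budget.used)
```

Because the budget only depends on the response object, it does not need to know whether a call was made synchronously or asynchronously.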
