v2.3.23

January 7, 2026

Consistent usage metrics on every model response

Provider usage metrics, including token counts, are now propagated to the model response in both sync and async paths. Cost tracking, quota enforcement, and observability can read usage directly from the response, without custom plumbing.

Details

  • Uniform access to usage data across sync and async responses
  • Simplifies building budgets, alerts, and chargeback reporting
  • No migration or code changes required
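
As a rough illustration of what uniform usage data makes possible, the sketch below accumulates token counts from one sync and one async call against a shared budget. Everything in it is an assumption made for the example: `Usage`, `ModelResponse`, the `input_tokens`/`output_tokens` field names, `TokenBudget`, and the two fake call functions are hypothetical stand-ins, not the library's actual API.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


@dataclass
class Usage:
    # Hypothetical shape; the real field names may differ.
    input_tokens: int = 0
    output_tokens: int = 0


@dataclass
class ModelResponse:
    # Hypothetical stand-in for the library's response object.
    content: str
    usage: Optional[Usage] = None


class TokenBudget:
    """Accumulates token usage and reports when a limit is exceeded."""

    def __init__(self, max_tokens: int) -> None:
        self.max_tokens = max_tokens
        self.used = 0

    def record(self, response: ModelResponse) -> None:
        # Responses without usage data are skipped rather than guessed at.
        if response.usage is None:
            return
        self.used += response.usage.input_tokens + response.usage.output_tokens

    @property
    def exceeded(self) -> bool:
        return self.used > self.max_tokens


def fake_sync_call() -> ModelResponse:
    # Placeholder for a synchronous model call.
    return ModelResponse("hello", Usage(input_tokens=12, output_tokens=30))


async def fake_async_call() -> ModelResponse:
    # Placeholder for an asynchronous model call.
    return ModelResponse("hello again", Usage(input_tokens=8, output_tokens=20))


def run_demo() -> TokenBudget:
    budget = TokenBudget(max_tokens=60)
    budget.record(fake_sync_call())                # sync path
    budget.record(asyncio.run(fake_async_call()))  # async path
    return budget
```

Because both paths carry the same usage fields, the same `record` method serves both; no separate handling is needed for async responses.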

Who this is for: Platform owners, FinOps, and engineering teams who track spend, quotas, or performance across models.