v2.4.1
January 21, 2026
Accurate cost and usage reporting for Perplexity streaming
We corrected streaming token accounting for Perplexity. For providers that return cumulative usage metrics while streaming, usage is now collected only from the final chunk. This prevents inflated token counts, so your dashboards, budgets, and alerts reflect actual usage.
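To illustrate why cumulative metrics inflate totals when every chunk is counted, here is a minimal sketch. The chunk shape, the field names (`usage`, `prompt_tokens`, `completion_tokens`), and both helper functions are hypothetical and used only for illustration; they are not the actual implementation or Perplexity's wire format.

```python
from typing import Dict, Iterable

# Hypothetical usage keys, assumed for this illustration only.
USAGE_KEYS = ("prompt_tokens", "completion_tokens")


def summed_usage(chunks: Iterable[dict]) -> Dict[str, int]:
    """Add up usage from every chunk (overcounts when metrics are cumulative)."""
    totals = {key: 0 for key in USAGE_KEYS}
    for chunk in chunks:
        usage = chunk.get("usage") or {}
        for key in USAGE_KEYS:
            totals[key] += usage.get(key, 0)
    return totals


def final_chunk_usage(chunks: Iterable[dict]) -> Dict[str, int]:
    """Keep only the last usage payload seen, which already holds the running total."""
    last: dict = {}
    for chunk in chunks:
        usage = chunk.get("usage")
        if usage:
            last = usage
    return {key: last.get(key, 0) for key in USAGE_KEYS}


# Each chunk repeats the prompt count and reports completion tokens so far.
example_chunks = [
    {"usage": {"prompt_tokens": 10, "completion_tokens": 1}},
    {"usage": {"prompt_tokens": 10, "completion_tokens": 2}},
    {"usage": {"prompt_tokens": 10, "completion_tokens": 3}},
]

print(summed_usage(example_chunks))       # {'prompt_tokens': 30, 'completion_tokens': 6}
print(final_chunk_usage(example_chunks))  # {'prompt_tokens': 10, 'completion_tokens': 3}
```

In this sketch, summing reports 36 tokens for a response that used 13, which is the kind of inflation that reading only the final chunk avoids.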
Details
- More accurate token and cost metrics for streaming responses
- Historical comparisons may show a step change, with lower reported Perplexity streaming usage after this release; adjust alert and budget thresholds as needed
- No application changes required
Who this is for: Platform, FinOps, and observability teams tracking model usage and spend.
