v2.2.2
October 23, 2025
Speed up development and reduce costs with response caching
You can now make iterative development faster and more cost-efficient by caching LLM responses. Set cache_response=True on your model to store responses and skip redundant API calls, which is especially useful while testing and refining workflows.
from agno.agent import Agent
from agno.models.openai import OpenAIChat

agent = Agent(
    model=OpenAIChat(
        id="gpt-4o",
        # Store responses so identical requests don't trigger new API calls
        cache_response=True,
    )
)
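Once caching is enabled, repeating an identical request should be served from the stored response rather than a fresh API call. A minimal sketch of what that looks like in practice (agent.print_response is used here purely for illustration):

from agno.agent import Agent
from agno.models.openai import OpenAIChat

agent = Agent(
    model=OpenAIChat(id="gpt-4o", cache_response=True)
)

# First call goes to the API and the response is cached
agent.print_response("What is response caching?")

# An identical second call should be answered from the cache,
# returning faster and without another API charge
agent.print_response("What is response caching?")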
Learn more about response caching.
