v2.2.2

October 23, 2025

Speed up development and reduce costs with response caching

You can now make iterative development faster and more cost-efficient by caching LLM responses. Set cache_response=True on your model to store responses and avoid redundant API calls, which is ideal when testing and refining workflows.

from agno.agent import Agent
from agno.models.openai import OpenAIChat

# Enable response caching on the model so repeated identical
# requests are served from the cache instead of a new API call
agent = Agent(
    model=OpenAIChat(
        id="gpt-4o",
        cache_response=True
    )
)
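
As a quick sketch of how the cache pays off during iteration, the snippet below runs the same prompt twice against the agent defined above. It assumes agno's agent.run method and the content attribute on its result, which are typical of the library but not shown in this release note; the second call should be served from the cache rather than triggering another API request.

# The first call reaches the OpenAI API and stores the response in the cache
first = agent.run("Summarize response caching in one sentence.")

# An identical second call should return the cached response,
# completing faster and without an additional API charge
second = agent.run("Summarize response caching in one sentence.")

print(first.content)
print(second.content)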

Learn more about response caching.