Changelog
Resolved an unhandled msg_too_long error in the Slack streaming path that caused the agent to fail silently or crash when a streamed response exceeded Slack's message length limit. Long responses are now handled gracefully rather than surfacing an error to the user.
Resolved a collection of bugs affecting agents deployed with Coda, including issues in CodingTools, Slack interface behavior, team streaming output, and the learning pipeline. These fixes restore correct end-to-end behavior for Coda-integrated agents across all affected surfaces.
Resolved an issue where server-side tool blocks in Claude conversations were not being preserved when building subsequent request messages. This caused Claude to lose track of tool interactions mid-conversation, breaking multi-turn flows that relied on server tool results being visible in history.
DoclingTools gives agents the ability to convert documents on demand using the Docling library — accepting PDFs, DOCX, PPTX, XLSX, HTML, images, audio, and video files as input and exporting to Markdown, plain text, HTML, JSON, YAML, DocTags, and VTT. Each output format is a separately togglable tool, so agents only expose the conversions they actually need. Advanced PDF handling is also available, with configurable OCR engines, language settings, table structure recognition, picture classification, and per-document timeouts for scanned or complex documents.
Example: The following agent converts a PDF to Markdown
from agno.agent import Agent
from agno.tools.docling import DoclingTools

agent = Agent(
    tools=[DoclingTools(all=True)],
    description="You are an agent that converts documents from all Docling parsers and exports to all supported output formats.",
)

agent.print_response(
    "Convert to Markdown: cookbook/07_knowledge/testing_resources/cv_1.pdf",
    markdown=True,
)
See the DoclingTools docs for more.
We’ve introduced GoogleSlidesTools to give agents full control over Google Slides. With it, you can create presentations, build out slides, and manage content end to end, all directly from your agent.
Agents can add and reorder slides, insert text boxes, tables, images, and videos, and read existing slide content to stay context-aware. Whether you are building decks from scratch or modifying existing ones, everything happens programmatically in a single workflow.
We support both OAuth and service account authentication, so you can use the toolkit in interactive setups or deploy it in server-side, multi-user environments.
from agno.agent import Agent
from agno.models.google import Gemini
from agno.tools.google.slides import GoogleSlidesTools

agent = Agent(
    model=Gemini(id="gemini-2.0-flash"),
    tools=[
        GoogleSlidesTools(
            oauth_port=8080,
        )
    ],
    instructions=[
        "You are a Google Slides assistant that helps users create and manage presentations.",
        "Always call get_presentation_metadata before modifying slides to get current slide IDs.",
        "Use slide_id values returned by the API -- never guess them.",
        "Return the presentation ID and URL after creating a presentation.",
    ],
    add_datetime_to_context=True,
    markdown=True,
)

agent.print_response(
    "Create a new Google Slides presentation titled 'Quarterly Business Review'. "
    "Then add the following slides: "
    "1. A TITLE slide with title 'Q3 2025 Business Review' and subtitle 'Prepared by the Strategy Team'. "
    "2. A TITLE_AND_BODY slide with title 'Agenda' and body listing: Revenue Overview, Key Metrics, Product Roadmap, Q4 Goals.",
    stream=True,
)
See the Google Slides docs for more.
Tool call schemas are now normalized across model providers, so switching an agent from one model to another no longer requires adjusting how tools are defined or how their outputs are parsed. This removes a common source of friction when benchmarking models, migrating providers, or running the same agent across multiple backends.
Details:
- Tool call inputs and outputs are translated into a consistent internal format regardless of the originating model provider
- Eliminates provider-specific edge cases in tool schema generation and response parsing
- Enables drop-in model swapping without changes to tool definitions or agent logic
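As an illustration of what this normalization buys you, the sketch below translates two hypothetical provider payloads into one internal shape. The field names mirror common provider formats and are assumptions for demonstration, not Agno's actual internals.

```python
import json
from typing import Any, Dict

def normalize_tool_call(raw: Dict[str, Any], provider: str) -> Dict[str, Any]:
    """Translate a provider-specific tool call into one internal shape.

    Illustrative only: the field names below mirror common provider
    payloads, not Agno's actual internal format.
    """
    if provider == "openai":
        # OpenAI-style payloads nest name/arguments under a "function" key
        fn = raw["function"]
        return {"name": fn["name"], "arguments": fn["arguments"]}
    if provider == "anthropic":
        # Anthropic-style tool_use blocks carry "name" and an "input" dict
        return {"name": raw["name"], "arguments": json.dumps(raw["input"])}
    raise ValueError(f"unknown provider: {provider}")

openai_call = {"function": {"name": "get_weather", "arguments": '{"city": "Paris"}'}}
anthropic_call = {"name": "get_weather", "input": {"city": "Paris"}}

a = normalize_tool_call(openai_call, "openai")
b = normalize_tool_call(anthropic_call, "anthropic")
assert a == b  # same internal shape regardless of provider
```

Because both calls reduce to the same internal dict, downstream tool dispatch never needs to know which provider produced them.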
See Fallback Models docs for more.
A new PerplexitySearch toolkit gives agents access to the Perplexity Search API, returning ranked web results with titles, URLs, snippets, and publication dates in a single tool call. Built-in filtering by recency and domain makes it straightforward to build agents that need up-to-date, source-controlled retrieval without additional post-processing.
Check out this example of basic search:
from agno.agent import Agent
from agno.tools.perplexity import PerplexitySearch

agent = Agent(tools=[PerplexitySearch()], markdown=True)
agent.print_response("What are the latest developments in AI?")
Details:
- `search` and `asearch` (async) functions return a JSON array of results with URL, title, snippet, and date per result
- `search_recency_filter` restricts results to content from the past `day`, `week`, `month`, or `year`
- `search_domain_filter` limits results to a specific list of domains (e.g., `reuters.com`, `bloomberg.com`)
- `search_language_filter` accepts ISO language codes for language-scoped retrieval
- `max_results` (default 5) and `max_tokens_per_page` (default 2048) give fine-grained control over result volume and content length
- Requires a `PERPLEXITY_API_KEY` environment variable; no other configuration needed
See the Perplexity docs for reference.
AgenticChunking now accepts a custom_prompt parameter, letting you override the default model-driven chunking instructions with domain-specific logic. Rather than relying solely on the built-in heuristics for finding semantic breakpoints, you can now describe exactly how the model should segment your documents — for example, splitting at major section boundaries, preserving clause integrity, or separating structured metadata from body content — making it straightforward to tune retrieval quality for specialized corpora.
Details:
- Pass any string to `custom_prompt` to override the default chunking behavior; custom prompts are prioritized over built-in instructions
- The default output format constraints are still enforced automatically — `custom_prompt` only needs to describe the chunking logic itself
- Always pair `custom_prompt` with `max_chunk_size` to bound output length; the default `max_chunk_size` is 5000 characters
- The `model` parameter accepts any Agno-compatible model, allowing you to route chunking to a smaller or cheaper model independently of your agent
See the Custom Prompts docs for more.
Resolved an issue where LanceDB's search() could return the same document multiple times when hybrid search retrieved it via both vector similarity and full-text search. Results are now deduplicated before being returned, ensuring each document appears only once regardless of which search path surfaced it.
Details:
- Fixes duplicate results in hybrid search caused by the same document matching both the vector and FTS indices
- Deduplication is applied automatically; no configuration changes required
- Improves result quality and reduces noise for agents and workflows using LanceDB hybrid search
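The deduplication behavior can be pictured with a small sketch. This is illustrative only; the `"id"` field stands in for LanceDB's row identifier.

```python
def dedupe_hybrid_results(results):
    """Keep the first occurrence of each document, preserving rank order.

    Illustrative sketch: assumes each result dict carries a unique "id",
    which stands in for LanceDB's row identifier.
    """
    seen = set()
    unique = []
    for row in results:
        if row["id"] not in seen:
            seen.add(row["id"])
            unique.append(row)
    return unique

# "doc-1" was surfaced by both the vector index and the FTS index
hybrid = [
    {"id": "doc-1", "score": 0.92},
    {"id": "doc-2", "score": 0.88},
    {"id": "doc-1", "score": 0.75},
]
assert [r["id"] for r in dedupe_hybrid_results(hybrid)] == ["doc-1", "doc-2"]
```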
The Seltz toolkit has been updated to align with the breaking changes introduced in the Seltz SDK 0.2.0 release, replacing the previous 0.1.x integration. Teams using the Seltz toolkit should update their Seltz SDK dependency to 0.2.0 alongside this release.
Details:
- Updates the Seltz toolkit integration from SDK `0.1.x` to `0.2.0`
- Ensures compatibility with the latest Seltz SDK API surface
- Upgrade the `seltz` package to `0.2.0` to avoid integration errors
Resolved an issue where tools from async toolkits were not included in the tool name list injected into the team system message, leaving the team unaware of those tools at the prompt level.
Resolved an additional case where hybrid search could surface the same document more than once when it matched across multiple search indices.
Fixed output_config not being applied correctly on Claude model wrappers, $defs being stripped from tool schemas, and file_ids and container information not being surfaced during streaming for skills.
Resolved a bug where streamed tool call data was overwriting accumulated state instead of appending to it, causing incomplete or incorrect tool calls to be dispatched.
Resolved an issue where empty string values in streamed LiteLLM responses could overwrite previously accumulated tool names, resulting in tool calls with missing identifiers.
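Both streaming fixes above come down to the same rule: append non-empty deltas to accumulated state, and never let an empty chunk overwrite a previously accumulated value. A minimal sketch of that rule (illustrative, not the actual streaming wrapper code):

```python
def accumulate_tool_call(state, delta):
    """Merge one streamed delta into the accumulated tool call state.

    Illustrative sketch of the fixed behavior: string fields are appended
    to, and empty deltas never clobber previously accumulated values.
    """
    for key, value in delta.items():
        if not value:            # empty chunk: keep what we already have
            continue
        if key in state and isinstance(state[key], str):
            state[key] += value  # append, don't overwrite
        else:
            state[key] = value
    return state

state = {}
for chunk in [
    {"name": "get_", "arguments": ""},
    {"name": "weather", "arguments": '{"city":'},
    {"name": "", "arguments": ' "Paris"}'},
]:
    state = accumulate_tool_call(state, chunk)

assert state == {"name": "get_weather", "arguments": '{"city": "Paris"}'}
```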
Added an early error when AWS_BEDROCK_API_KEY is set for Claude models on AWS Bedrock, which is not a supported authentication path, rather than failing silently later in the request lifecycle.
Overrode deepcopy behavior on the Azure OpenAI model class to preserve live client references, preventing connection failures that occurred when the model object was copied during agent or team setup.
Resolved an issue where empty reasoning blocks returned by OpenRouter for non-reasoning models were being processed unnecessarily, causing noise in parsed responses.
Resolved a failure in cache key generation when the input contained types that are not directly JSON-serializable, ensuring caching works reliably across a broader range of agent inputs.
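A common way to make cache keys robust to such inputs is to canonicalize the payload with a string fallback before hashing. The sketch below is illustrative, not Agno's actual implementation.

```python
import hashlib
import json
from datetime import datetime

def cache_key(payload) -> str:
    """Build a stable cache key from arbitrary agent input.

    Illustrative sketch: non-JSON-serializable values (datetimes, custom
    objects) fall back to their string form via default=str, and
    sort_keys=True makes the key independent of dict insertion order.
    """
    canonical = json.dumps(payload, sort_keys=True, default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# A datetime is not JSON-serializable, but a key is still produced
key = cache_key({"query": "latest news", "as_of": datetime(2025, 1, 1)})
assert len(key) == 64  # hex digest of SHA-256
```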
Resolved an incorrect import of the pymongo async modules that could cause runtime failures when using MongoDB with async agents or workflows. The import now correctly references the async-compatible pymongo interfaces.
Details:
- Fixes a broken import path for `pymongo` async modules in the MongoDB database backend
- Resolves runtime errors encountered when running async agents or workflows with MongoDB storage
- No configuration changes required; upgrading applies the fix automatically
Resolved a bug in parse_tool_calls where shared dictionary references across parsed tool calls would cause the same tool to be executed multiple times during streaming. Each tool call is now constructed from an independent copy, eliminating the duplication.
Details:
- Fixes duplicate tool executions that occurred in streaming mode when multiple tool calls were parsed in the same pass
- Caused by a mutable shared dict reference being reused across tool call objects in `parse_tool_calls`
- No configuration changes required; the fix applies automatically to all streaming tool call workflows
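The shape of the fix can be sketched as follows. This is illustrative; `parse_tool_calls_fixed` is a hypothetical stand-in, not the real function.

```python
import copy

def parse_tool_calls_fixed(raw_calls):
    """Give each parsed tool call its own independent dict.

    Illustrative sketch of the fix: without the copy, every entry would
    share one mutable dict, so a later mutation would bleed into every
    other parsed call and could dispatch the same tool repeatedly.
    """
    return [copy.deepcopy(call) for call in raw_calls]

template = {"name": "search", "arguments": {"q": "agno"}}
calls = parse_tool_calls_fixed([template, template])

calls[0]["arguments"]["q"] = "changed"
assert calls[1]["arguments"]["q"] == "agno"  # no shared state between calls
```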
Resolved an issue where structured output support was not correctly detected for certain Claude models, causing agents to fall back to less reliable output parsing strategies even when the model fully supports structured output. Affected models now use the correct path automatically.
Details:
- Fixes structured output capability detection across supported Claude model variants
- Improves reliability and consistency of structured output for agents using response schemas
- No configuration changes required; the fix applies automatically
Resolved a race condition in MCPTools where parallel tool calls using a header_provider would each independently spin up their own MCP session instead of sharing one, leaving the agent in a stuck state. Session creation is now correctly coordinated so that concurrent tool calls share a single session as intended.
Details:
- Fixes duplicate session creation when multiple MCP tool calls execute in parallel with `header_provider` configured
- Eliminates the agent hang caused by conflicting concurrent sessions
- No configuration changes required; the fix applies automatically to all `MCPTools` setups using `header_provider`
The Gemini model class now accepts a timeout parameter, giving teams explicit control over how long a request is allowed to run before being cancelled. This is particularly useful for production deployments where unbounded request durations can affect reliability and resource utilization.
Details:
- Set `timeout` (in seconds) directly on the `Gemini` model instance
- Applies to all request types made through the Gemini model class
- Falls back to the existing default behavior when not set; no migration required
See the docs for reference.
The Mistral model provider now supports the mistralai v2 SDK while continuing to work with v1. Teams can upgrade their SDK dependency and take advantage of v2 improvements without any changes to their agent or model configuration.
Details:
- Full support for the `mistralai` v2 SDK alongside continued v1 compatibility
- No migration required; existing configurations work without modification
- Enables access to v2 SDK features and performance improvements for teams ready to upgrade
The GET /workflows/{id} endpoint now accepts a version query parameter, allowing callers to fetch a specific version of a workflow rather than always receiving the latest. Workflows also now support run-level parameters — metadata, dependencies, add_dependencies_to_context, and add_session_state_to_context — bringing them to parity with agents and teams for consistent configuration across all execution types.
Details:
- Pass `?version=<version>` to `GET /workflows/{id}` to retrieve a specific workflow version
- `metadata`, `dependencies`, `add_dependencies_to_context`, and `add_session_state_to_context` are now available at the run level on workflows
- Aligns the workflow runtime configuration surface with agents and teams
- No breaking changes; existing workflow definitions and API calls are unaffected
AgentTools now includes ToolParallelAiSearch, a native integration with Vertex AI's Parallel AI Search that allows agents to issue multiple search queries concurrently and aggregate results. This brings Vertex AI search into the same parallel retrieval pattern as other search tools, reducing latency for knowledge-intensive tasks that benefit from broad, simultaneous retrieval.
Details:
- `ToolParallelAiSearch` integrates directly with Vertex AI's native parallel search API
- Enables concurrent multi-query search within a single tool call, reducing round-trip latency
- Consistent with existing parallel search patterns in the toolkit; no special agent configuration required
- Suitable for RAG workflows, research agents, and any use case requiring broad, fast retrieval from Vertex AI
View the cookbook.
The WhatsApp interface has been significantly extended in V2, adding support for rich media, interactive message types, teams, and workflows. Agents can now send and receive images, video, audio, and documents, and respond with structured interactive elements like reply buttons, list menus, location shares, and message reactions, moving beyond plain text into a full conversational interface.
Create an agent, expose it with the WhatsApp interface, and serve via AgentOS:
from agno.agent import Agent
from agno.models.openai import OpenAIResponses
from agno.os import AgentOS
from agno.os.interfaces.whatsapp import Whatsapp
from agno.tools.openai import OpenAITools

image_agent = Agent(
    model=OpenAIResponses(id="gpt-5.2"),  # Ensure OPENAI_API_KEY is set
    tools=[OpenAITools(image_model="gpt-image-1")],
    markdown=True,
    add_history_to_context=True,
)

agent_os = AgentOS(
    agents=[image_agent],
    interfaces=[Whatsapp(agent=image_agent)],
)
app = agent_os.get_app()

if __name__ == "__main__":
    agent_os.serve(app="basic:app", port=8000, reload=True)
View the WhatsApp docs for more.
The new Telegram interface mounts webhook endpoints directly on AgentOS, turning any agent, team, or workflow into a fully functional Telegram bot. Inbound messages — text, photos, voice notes, audio, video, documents, stickers, and animations — are handled natively and passed to the agent as structured inputs. Responses stream back in real time with live message edits, throttled to stay within Telegram's rate limits, so users see output as it is generated rather than waiting for a complete reply.
Create an agent, expose it with the Telegram interface, and serve via AgentOS:
from agno.agent import Agent
from agno.db.sqlite import SqliteDb
from agno.models.google import Gemini
from agno.os.app import AgentOS
from agno.os.interfaces.telegram import Telegram

agent_db = SqliteDb(session_table="telegram_sessions", db_file="tmp/telegram_basic.db")

telegram_agent = Agent(
    name="Telegram Bot",
    model=Gemini(id="gemini-2.5-pro"),
    db=agent_db,
    instructions=[
        "You are a helpful assistant on Telegram.",
        "Keep responses concise and friendly.",
    ],
    add_history_to_context=True,
    num_history_runs=3,
    add_datetime_to_context=True,
    markdown=True,
)

agent_os = AgentOS(
    agents=[telegram_agent],
    interfaces=[Telegram(agent=telegram_agent)],
)
app = agent_os.get_app()

if __name__ == "__main__":
    agent_os.serve(app="basic:app", port=7777, reload=True)
See the Telegram docs for more.
The DoclingReader provides a single, unified interface for processing the full range of document formats an AI agent encounters — PDFs, Word files, PowerPoint decks, Excel spreadsheets, images, and even audio and video files — all through the same reader, without format-specific ingestion logic or a sprawling set of dependencies. Built on IBM Research's open-source Docling library, it preserves document structure (headings, tables, hierarchies, formulas, and layout) during extraction, so context is not lost in translation before content reaches your vector store.
Details:
- Supports PDFs, `.docx`, `.pptx`, `.xlsx`, markup files, images (JPEG, PNG), and audio/video (MP4 and others via FFmpeg and Whisper)
- Structure-preserving extraction keeps tables, headings, and hierarchies intact for higher-quality RAG retrieval
- Outputs flow directly into Agno's chunking pipeline with no additional preprocessing required
- Configurable `output_format` supports Markdown (default), plain text, JSON, HTML, DocTags, and VTT for audio/video transcripts
- Load from local paths or directly from URLs with the same interface
Production agent systems demand visibility. Agno now integrates with MLflow to deliver complete, end-to-end trace observability across every model call, tool invocation, and agent step—without custom instrumentation or additional configuration overhead.
With a single call to mlflow.agno.autolog() at startup, all agent activity is automatically captured and surfaced in the MLflow UI. This applies to both individual agents and full AgentOS deployments.
Details:
- Full trace capture across model calls, tool use, and agent steps — out of the box
- Works with self-hosted and managed MLflow servers (AWS, Azure, GCP)
- Supports AgentOS applications with no additional setup beyond the single autolog call
- Traces are OpenTelemetry-native, making them compatible with existing observability pipelines
View the MLflow docs for more.
LearningMode.PROPOSE now automatically enables chat history for the session, ensuring that the multi-turn confirmation flow — where the agent proposes a learned fact and waits for user approval — has full conversational context available across rounds. Previously, history was not retained between turns, causing the agent to lose track of pending proposals mid-confirmation.
Details:
- Chat history is enabled automatically when `LearningMode.PROPOSE` is active; no manual configuration needed
- Ensures proposed facts and user responses remain in context throughout the confirmation loop
- Fully backward-compatible; no changes required for existing learning configurations
Updated the default base_url for the Siliconflow model provider from .com to .cn to match Siliconflow's actual API endpoint. Requests were previously routed to an incorrect domain, causing connection failures for users relying on the default configuration.
Details:
- Corrects the default `base_url` to `siliconflow.cn`
- Users who had already overridden `base_url` explicitly are unaffected
- No other configuration changes required
Fixed a formatting issue where tool parameter descriptions were incorrectly prefixed with (None) when no type annotation was present. Parameter descriptions now render cleanly in all contexts — tool schemas, AgentOS views, and model prompts — without extraneous noise that could confuse the model or degrade tool call accuracy.
Details:
- Removes the `(None)` prefix from parameter descriptions that lack explicit type annotations
- Improves the quality and readability of generated tool schemas
- No changes required; the fix applies automatically to all tools
Resolved a bug where add_history_to_context was not correctly applied during Human-in-the-Loop runs that involved multiple conversation rounds. Agents paused for human review and subsequently resumed now have access to the full conversation history in context, preventing gaps in reasoning across approval boundaries.
Details:
- Fixes history injection for HITL workflows using `add_history_to_context` across multiple rounds
- Ensures agents resuming after a pause have full conversational context available
- No configuration changes required; the fix applies automatically to existing HITL setups
A new datetime_format parameter on Agent and Team lets you control exactly how the current datetime is presented in the agent's context using any valid strftime format string. This removes the need to manually inject formatted timestamps through instructions and ensures consistent datetime representation across different locales, regions, and output requirements.
Details:
- Pass any `strftime`-compatible format string (e.g., `"%Y-%m-%dT%H:%M:%S"` for ISO-8601, `"%Y-%m-%d"` for date-only, or locale-specific patterns)
- Applies wherever datetime context is injected, including `add_datetime_to_context=True`
- Defaults to existing behavior when not set; no migration required
See cookbook.
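The snippet below shows what a few candidate format strings render to; these are the kinds of values you would pass as `datetime_format`. It is a pure `strftime` demonstration with no Agno imports.

```python
from datetime import datetime

# One fixed instant rendered under different strftime format strings,
# as you might pass to the datetime_format parameter:
now = datetime(2025, 6, 1, 14, 30, 0)

assert now.strftime("%Y-%m-%dT%H:%M:%S") == "2025-06-01T14:30:00"  # ISO-8601 style
assert now.strftime("%Y-%m-%d") == "2025-06-01"                    # date-only
assert now.strftime("%d %B %Y") == "01 June 2025"                  # human-friendly

# Hypothetical usage on an agent (per the description above):
# agent = Agent(datetime_format="%Y-%m-%d", add_datetime_to_context=True, ...)
```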
Tool pre- and post-hooks, as well as agent-level tool_hooks, can now read the current run's complete message history via run_context.messages. This makes it straightforward to build hooks that inspect prior conversation turns for auditing, conditional logic, prompt injection detection, or logging without needing to pass history through separate channels.
Details:
- `run_context.messages` is available in both pre- and post-hooks at the tool and agent level
- Enables hooks to make decisions based on the full conversation up to the current tool call
- No changes required to existing hooks that don't need message history; fully backward-compatible
See cookbook for reference.
GoogleCalendarTools has been extended with additional tools, a service account authentication path, and new cookbooks to help teams get started quickly. Agents can now handle a broader set of calendar workflows, from listing and creating events to managing multi-user deployments, without requiring per-user OAuth flows.
This agent will use GoogleCalendarTools to find today’s events:
from agno.agent import Agent
from agno.models.openai import OpenAIChat
from agno.tools.google.calendar import GoogleCalendarTools

agent = Agent(
    model=OpenAIChat(id="gpt-4o"),
    tools=[GoogleCalendarTools()],
    add_datetime_to_context=True,
    markdown=True,
)

agent.print_response("What meetings do I have tomorrow?", stream=True)
Details:
- New tools extend coverage beyond `list_events` and `create_event` for richer calendar management
- Service account authentication enables server-side and multi-tenant deployments without personal OAuth credentials
- An `allow_update` flag (default `False`) gates write operations, providing a safe default for read-heavy workflows
- `calendar_id`, `oauth_port`, `token_path`, and `access_token` parameters give fine-grained control over auth and calendar targeting
- Existing OAuth credential flows continue to work; no migration required
See the Google Calendar docs for more.
Agents and teams can now automatically generate actionable followup prompts after each response by setting followups=True.
A second model call produces a configurable number of short, context-aware suggestions based on the conversation, giving users a clear path to continue their work without having to formulate the next question themselves.
Details:
- Enable with `followups=True`; control the number of suggestions with `num_followups` (default 3)
- Use `followup_model` to route followup generation to a smaller, cheaper model independently of the main agent model
- Suggestions are available on `response.followups` for non-streaming runs
- Streaming surfaces suggestions via the `FollowupsCompleted` event, emitted after the main response finishes
- Works for both agents and teams with no additional configuration
See the Followup Suggestions docs for more.
Resolved a bug where each iteration of a Loop always received the original input rather than the output of the previous iteration, causing loops to repeat work instead of building on it. A new forward_iteration_output flag lets you explicitly opt in to passing each iteration's output forward as the next iteration's input.
Details:
- Fixes the default behavior where loop iterations were incorrectly re-receiving the original input
- Set `forward_iteration_output=True` to chain iteration outputs sequentially through the loop
- Default behavior remains unchanged for workflows that do not set the flag, preserving backward compatibility
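The two behaviors can be simulated in a few lines. This is an illustrative stand-in for the workflow Loop, with a plain callable playing the role of one iteration; it is not Agno's implementation.

```python
def run_loop(step, initial_input, iterations, forward_iteration_output=False):
    """Simulate the two Loop behaviors described above.

    `step` is any callable standing in for one loop iteration.
    """
    current = initial_input
    outputs = []
    for _ in range(iterations):
        result = step(current)
        outputs.append(result)
        if forward_iteration_output:
            current = result  # next iteration builds on this output
        # otherwise every iteration re-receives initial_input
    return outputs

def double(x):
    return x * 2

assert run_loop(double, 1, 3) == [2, 2, 2]                                 # default: repeats work
assert run_loop(double, 1, 3, forward_iteration_output=True) == [2, 4, 8]  # chained outputs
```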
See the docs for more.
A json_serializer is now passed during MySQL engine creation, ensuring that JSON fields are correctly serialized when reading from and writing to MySQL-backed databases. This resolves silent data corruption and type errors that could occur when storing structured agent state or session data.
Details:
- `json_serializer` is applied automatically at engine creation time
- Fixes incorrect handling of JSON columns in MySQL storage backends
- No application-level changes required
Resolved a bug in OpenAIResponses where combining external_execution tools with standard tools caused incorrect dispatch behavior. Mixed tool configurations now route correctly, allowing both tool types to coexist in the same agent without unexpected failures or skipped calls.
Details:
- Fixes tool call handling for agents using both `external_execution` and regular tools simultaneously
- No configuration changes required; the fix applies automatically
- Improves reliability for agents with hybrid tool setups using the OpenAI Responses API
serve() now reads AGENT_OS_HOST and AGENT_OS_PORT environment variables as fallbacks when explicit values are not passed. This removes the need to hardcode host and port configuration at the call site, making containerized and orchestrated deployments cleaner to manage.
Details:
- Set `AGENT_OS_HOST` and `AGENT_OS_PORT` in your environment to configure `serve()` without code changes
- Explicit arguments passed to `serve()` continue to take precedence
- No changes required for existing deployments that pass host and port directly
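The precedence rules can be sketched like this. The `resolve_serve_config` helper and the `localhost`/`7777` defaults are assumptions for illustration, not the actual `serve()` internals.

```python
import os

def resolve_serve_config(host=None, port=None):
    """Resolve host/port the way serve() is described above.

    Illustrative sketch: explicit arguments win, then the
    AGENT_OS_HOST / AGENT_OS_PORT environment variables, then
    assumed defaults.
    """
    host = host or os.environ.get("AGENT_OS_HOST", "localhost")
    port = port or int(os.environ.get("AGENT_OS_PORT", "7777"))
    return host, port

os.environ["AGENT_OS_HOST"] = "0.0.0.0"
os.environ["AGENT_OS_PORT"] = "8080"

assert resolve_serve_config() == ("0.0.0.0", 8080)           # env fallback applies
assert resolve_serve_config(port=9000) == ("0.0.0.0", 9000)  # explicit argument wins
```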
See the environment fallback docs for more.
Images and audio generated during a run are now consistently included in run output regardless of the store_media setting. Media is scrubbed before being written to the database, keeping storage lean while ensuring callers always receive the full output they expect.
Details:
- Generated media (images, audio) is present in run output in all cases
- Media is stripped from the payload prior to database storage, decoupling output completeness from persistence behavior
- `store_media` continues to control persistence; output delivery is no longer tied to it
Agents and teams are now assigned human-readable IDs (e.g., brave-falcon-7x3k) instead of raw UUIDs.
This makes it significantly easier to identify and track specific runs at a glance across logs, traces, and monitoring dashboards without needing to cross-reference opaque identifier strings.
Details:
- Human-readable IDs are generated automatically for all agents and teams
- Existing workflows referencing explicit session or agent IDs are unaffected
- Improves legibility across AgentOS trace views, logs, and debugging output
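A readable ID in this style can be generated along these lines. This is an illustrative sketch; Agno's actual word lists and suffix alphabet may differ (the example `brave-falcon-7x3k` suggests a suffix alphabet broader than hex).

```python
import secrets

# Hypothetical word lists for demonstration only
ADJECTIVES = ["brave", "calm", "swift", "quiet"]
NOUNS = ["falcon", "otter", "maple", "comet"]

def readable_id() -> str:
    """Generate an adjective-noun-suffix ID in the style described above."""
    suffix = secrets.token_hex(2)  # 4 random hex chars for uniqueness
    return f"{secrets.choice(ADJECTIVES)}-{secrets.choice(NOUNS)}-{suffix}"

rid = readable_id()
adjective, noun, suffix = rid.split("-")
assert adjective in ADJECTIVES and noun in NOUNS and len(suffix) == 4
```

The point of the pattern: a short random suffix keeps IDs unique while the word pair makes them easy to spot and say aloud when scanning logs.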
GmailTools has been extended with a broader set of email management functions and a new service account authentication path. Teams can now handle more of the Gmail workflow, including reading, drafting, sending, replying, labeling, and searching, directly from an agent, while platform teams gain a credential-free auth option suited for server-side and multi-user deployments.
from agno.agent import Agent
from agno.tools.google.gmail import GmailTools

agent = Agent(tools=[GmailTools()])
agent.print_response("Show me my latest 5 unread emails", markdown=True)
Details:
- New tools cover the full email lifecycle: `get_emails_by_date`, `get_emails_by_thread`, `send_email_reply`, `create_draft_email`, `mark_email_as_read`, `mark_email_as_unread`, `list_custom_labels`, `apply_label`, `remove_label`, and `delete_custom_label`
- Service account authentication provides a governed, non-personal credential path for automated and multi-tenant workflows
- Existing OAuth credential flows (`creds`, `credentials_path`, `token_path`) continue to work with no migration required
- Use `include_tools` or `exclude_tools` to expose only the subset of tools your agent needs
See Gmail docs for reference.
Engineering and platform teams using GitLab can now connect agents directly to their repositories. GitlabTools brings read-focused GitLab access to Agno agents, covering projects, merge requests, and issues, with async support and granular control over which tools are exposed.
This makes it straightforward to build agents that monitor repository activity, triage open issues, summarize merge request pipelines, or answer questions about project state, without custom API wrappers or manual data fetching.
Details:
- Covers five read operations out of the box: list and inspect projects, merge requests, and issues
- Supports both GitLab.com and self-hosted GitLab instances via a configurable base URL
- Each tool can be toggled on or off individually using `enable_*` parameters, giving teams precise control over what the agent can access
- Async support ensures GitLab operations don't block agent execution in concurrent or high-throughput deployments
- Authentication via a GitLab access token set through environment variables — no code changes needed to rotate credentials
from agno.agent import Agent
from agno.tools.gitlab import GitlabTools

agent = Agent(
    instructions=[
        "Use GitLab tools to answer repository questions.",
        "Use read-only operations unless explicitly asked to modify data.",
    ],
    tools=[GitlabTools()],
)

agent.print_response(
    "List open merge requests for project 'gitlab-org/gitlab' and summarize the top 5 by recency.",
    markdown=True,
)
The built-in session search tool has been upgraded from a single-pass lookup to a two-step process: the agent first calls search_past_sessions() to retrieve lightweight previews of recent sessions, then selectively fetches the full conversation for a specific session with read_past_session(session_id).
This reduces unnecessary data loading and gives the agent a clearer, more structured path to locating relevant history.
Details:
- `search_past_sessions()` returns per-run previews across recent sessions without loading full message histories
- `read_past_session(session_id)` fetches the complete conversation for a targeted session on demand
- Control scope with `num_past_sessions_to_search` (default 20) and `num_past_session_runs_in_search` (default 3) to tune preview depth
- Session history is scoped per user — agents cannot surface another user's sessions
- Enable with `search_past_sessions=True` on the `Agent`; no other changes required
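The two-step flow can be mocked with an in-memory store to show the shape of the interaction. These are illustrative stand-ins for the built-in tools, not their real implementations.

```python
# A toy in-memory session store standing in for the agent's session database
SESSIONS = {
    "s1": ["Planned the Q3 launch", "Drafted the announcement", "Reviewed pricing"],
    "s2": ["Debugged the ETL job", "Fixed the schema migration"],
}

def search_past_sessions(num_runs_in_preview=3):
    """Step 1: return lightweight previews, without loading full histories."""
    return [
        {"session_id": sid, "preview": runs[:num_runs_in_preview]}
        for sid, runs in SESSIONS.items()
    ]

def read_past_session(session_id):
    """Step 2: fetch the complete conversation for one chosen session."""
    return SESSIONS[session_id]

previews = search_past_sessions(num_runs_in_preview=2)
assert all(len(p["preview"]) <= 2 for p in previews)  # previews stay small
assert read_past_session("s1") == SESSIONS["s1"]      # full fetch is on demand
```

The design choice mirrors the changelog entry: cheap, bounded previews first, then a targeted full read only for the session that matters.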
See cookbook for reference.
JSON schema generation now handles Literal types, ensuring that agents and tools using constrained value sets produce valid, complete schemas. This closes a gap that could cause schema generation to fail or produce incomplete type definitions for structured outputs.
Details:
- `Literal` types are now correctly represented in generated JSON schemas
- Improves reliability of structured output validation and tool definitions
- No migration required
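Conceptually, a `Literal` maps onto a JSON-schema `enum`. The sketch below shows that mapping; it is illustrative, not Agno's actual schema generator.

```python
from typing import Literal, get_args, get_origin

def literal_to_schema(tp):
    """Map a Literal type to a JSON-schema enum fragment.

    Illustrative sketch: adds a "type" only when all literal values share
    the string type, since mixed-type Literals have no single JSON type.
    """
    if get_origin(tp) is not Literal:
        raise TypeError("not a Literal type")
    values = list(get_args(tp))
    schema = {"enum": values}
    if all(isinstance(v, str) for v in values):
        schema["type"] = "string"
    return schema

assert literal_to_schema(Literal["low", "medium", "high"]) == {
    "enum": ["low", "medium", "high"],
    "type": "string",
}
```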
OpenAIResponses now supports input_file, letting you pass files directly into OpenAI Responses API calls. This simplifies document-aware workflows by removing the need to pre-process or separately upload files before invoking a model.
Details:
- Pass files directly as input alongside text prompts
- Reduces pipeline complexity for document analysis, extraction, and summarization tasks
See the cookbook for reference.
We resolved a race condition in OpenAI Responses where file_search could silently return empty results due to eventual consistency in OpenAI's vector store file listing API. Polling now correctly waits for file readiness, ensuring that retrieval queries return complete, accurate results from the start.
Details:
- Eliminates silent empty results caused by premature polling
- No application changes required; the fix applies automatically
- Improves reliability for RAG and document-grounded workflows using OpenAI file search
Google tools have been restructured into a dedicated agno.tools.google sub-package (e.g., from agno.tools.google import GmailTools). This organizes a growing set of Google integrations under a single, predictable namespace. Existing import paths continue to work, so no migration is required.
Details:
- All Google tools consolidated under agno.tools.google
- Backward-compatible; legacy imports remain functional
- Establishes a consistent pattern as Google tool coverage expands
File upload endpoints now accept image/heic and image/heif formats, removing the need to convert Apple-native image formats before ingestion. This reduces friction for teams processing user-submitted or mobile-captured content and ensures broader device compatibility out of the box.
Details:
- Native support for HEIC/HEIF alongside existing image formats
- No configuration changes required
- Improves throughput for field, support, and on-site capture flows
A new approval status endpoint lets you query where a paused run stands in the approval process, and admin-gated enforcement ensures that only authorized users can continue execution. Together, these changes give teams auditable, policy-driven control over Human-in-the-Loop workflows, closing gaps between initiating a pause and resuming work.
Details:
- Query approval status programmatically to build dashboards, alerts, or integration triggers
- Admin-gated continue-run enforcement prevents unauthorized resumption of paused executions
- Strengthens governance for high-stakes or compliance-sensitive workflows
See docs for reference
AgentOS now supports an advanced filtering DSL for traces, letting you construct precise, composable queries to isolate specific runs, models, components, or behaviors. This replaces broad, manual trace inspection with targeted retrieval, accelerating debugging, audit workflows, and performance analysis.
Details:
- Composable filter expressions for fine-grained trace queries
- Reduces time-to-resolution when diagnosing issues across complex agent and workflow executions
- Available through AgentOS trace endpoints
See cookbook for reference or view the docs
Knowledge sources now support GitHub App authentication (app_id, installation_id, private_key) in addition to personal access tokens. This gives platform and security teams a more governed authentication path, with scoped permissions, no personal credentials in the pipeline, and thread-safe token caching that handles expiration automatically. Both sync and async variants are supported.
Details:
- Authenticate as a GitHub App for fine-grained, org-managed access to private repositories
- Thread-safe token caching eliminates redundant auth requests and simplifies concurrent workloads
- Personal access tokens continue to work; no migration required
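A hedged sketch of the new credentials (the class name and exact parameter spelling are illustrative; see the cookbook for the real source definition):

```
# Hypothetical source class; personal access tokens still work unchanged
github_source = GitHubSource(
    repo="acme/internal-docs",
    app_id="123456",                           # GitHub App credentials
    installation_id="7890123",
    private_key=open("github-app.pem").read(),
)
```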
See cookbook for reference
ModelsLabTools now supports text-to-image generation with PNG/JPG outputs, an image fetch endpoint, and sizing options. Teams can add visual generation to agents and workflows without custom integrations, standardize on common file types, and speed prototyping. This reduces integration effort and helps deliver richer end-user experiences with minimal setup.
Details:
- Generate images from text prompts with configurable dimensions
- PNG and JPG outputs for compatibility with existing storage and delivery pipelines
- Unified tool interface works across agents and workflows
See docs for reference
Slack integrations now support real-time streaming with live progress cards, so end users see assistant activity as it happens rather than waiting for a final response. Each Slack installation can also supply its own token and signing secret, giving platform teams clean credential separation across workspaces and tenants. We also resolved an issue where the initial message state appeared blank, ensuring clean session starts from the first interaction.
Details:
- Live progress cards surface step-by-step updates during execution
- Per-instance credentials improve security posture and simplify multi-tenant deployments
- Modular internals reduce operational complexity and ease troubleshooting
See Cookbook for reference
Stop getting stale or irrelevant results. You now have direct control over how your agent scopes and retrieves web search results.
Getting started is simple
Pass the new parameters when initializing DuckDuckGoTools:
from agno.agent import Agent
from agno.tools.duckduckgo import DuckDuckGoTools
agent = Agent(
tools=[DuckDuckGoTools(
timelimit="d", # Limit results to the past day
region="us-en", # Scope results to a region
backend="api", # Control how results are fetched
)],
)
agent.print_response("What's happening in France?", markdown=True)
Need even more flexibility across multiple search backends? Check out WebSearchTools for Google, Bing, Brave, and more. Or view the DuckDuckGoTools docs.
The metrics system has been redesigned to provide granular, per-model and per-component tracking across the entire agent, team, and workflow lifecycle. Instead of a single aggregate view, you now get a clear breakdown of where resources are consumed — improving cost attribution, capacity planning, and performance optimization.
Details:
- Per-model metrics for accurate cost allocation and provider comparison
- Per-component tracking across agents, teams, and workflows for end-to-end visibility
- Consistent data across sync, async, and streaming execution paths
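Reading the new breakdown might look like this (field names are indicative; see the metrics docs for the exact shape):

```
from agno.agent import Agent
from agno.models.openai import OpenAIChat

agent = Agent(model=OpenAIChat(id="gpt-4o"))
run = agent.run("Summarize our Q3 goals.")

print(run.metrics)  # per-run totals, with per-model details available on the run output
```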
View the metrics docs for more.
SeltzTools now uses Seltz SDK 0.1.3, incorporating the latest fixes and improvements from the upstream SDK. This keeps your Seltz-powered search integrations current and stable with no action required.
Details:
- Bumped dependency to Seltz SDK 0.1.3
- No breaking changes or migration steps
Who this is for: Teams using SeltzTools for semantic search; upgrade to pick up the latest SDK improvements automatically.
View the SeltzTools docs for reference.
Stream real-time events during autonomous team task execution.
Watch your agent team work in real time as each member completes their task and sends updates immediately, without waiting for the full pipeline to finish.
Getting started is simple
Expose your tasks mode team via AgentOS, then hit the streaming endpoint:
from agno.models.openai import OpenAIChat
from agno.os import AgentOS
from agno.team.mode import TeamMode
from agno.team.team import Team

team = Team(
id="research-team",
name="Research Team",
mode=TeamMode.tasks,
model=OpenAIChat(id="gpt-5-mini"),
members=[researcher, summarizer],
db=db,
)
agent_os = AgentOS(teams=[team])

curl -X POST http://0.0.0.0:7777/v1/teams/research-team/runs/stream \
-H "Content-Type: application/json" \
-d '{"message": "What are the key benefits of microservices architecture?"}'
The events arrive in sequence, with the researcher reporting first, followed by the summarizer, and finally the completed summary.
Learn more & explore examples in this Cookbook.
Stop returning low-quality matches. Set a quality floor on your vector search results so your agents only work with context that's actually relevant.
Getting started is simple
Pass similarity_threshold when initializing PgVector. Anything below the threshold gets filtered out automatically.
from agno.vectordb.pgvector import PgVector
vector_db = PgVector(
table_name="vectors",
db_url=db_url,
similarity_threshold=0.2,
)
See it in action in this cookbook.
Workflows now support Human-in-the-Loop at the individual Step level, letting you pause execution to collect confirmation or user input before proceeding. This gives you fine-grained control over when and where a human needs to weigh in during an automated flow.
Details:
- Pause any workflow step to await human confirmation or freeform input
- Resume execution seamlessly once input is provided
Who this is for: Teams building workflows with high-stakes or ambiguous decision points that require human oversight before continuing.
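A sketch of a step-level pause (the requires_confirmation flag and import paths are assumptions; check the docs for the exact fields):

```
from agno.workflow import Step, Workflow

publish = Step(
    name="publish_release",
    agent=release_agent,             # an Agent defined elsewhere
    requires_confirmation=True,      # assumed flag: pause here until confirmed
)
workflow = Workflow(name="release", steps=[draft_step, review_step, publish])
```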
View the Human-in-the-Loop in Workflows docs for more.
WorkflowRunOutput now exposes a files field and uses consistent JSON (de)serialization. This fixes prior serialization errors and makes file artifacts first-class, enabling teams to programmatically consume outputs and chain file-based steps across workflows.
Details:
- Files returned directly in API responses for simpler integration
- Consistent serialization eliminates client-side errors and custom plumbing
- No migration required
See Cookbook for reference.
PDF extraction now includes a sanitize_content option (default: True) to normalize fragmented text by collapsing excessive whitespace. By improving the default output quality, you’ll spend less time cleaning documents and deliver more reliable retrieval and analysis downstream.
Details:
- New parameter: sanitize_content=True by default
- Disable when exact layout/spacing must be preserved for specialized parsing
- No breaking changes; review behavior if you depend on raw spacing in existing pipelines
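For example (the PDFReader import path is an assumption):

```
from agno.knowledge.reader.pdf_reader import PDFReader

reader = PDFReader()                            # sanitize_content=True by default
raw_reader = PDFReader(sanitize_content=False)  # keep raw spacing for layout-sensitive parsing
```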
We’ve added remote content sources including Amazon S3, Google Cloud Storage, Azure Blob, GitHub, and SharePoint, along with new APIs to list sources and browse files before ingestion. This removes friction when bringing existing data into Agno and shortens time to value for knowledge-driven use cases.
Details:
- Discover and browse files via new list/browse endpoints to target exactly what you ingest
- Reserved _agno metadata for system use; status now includes content_id for auditability and traceability
- No migration required
View the Cloud Storage Sources docs for more.
A new isolate_vector_search option on the Knowledge class lets multiple agents or teams share the same vector database while keeping their search results isolated. Each agent only retrieves documents relevant to its own knowledge scope, even when the underlying storage is shared.
This simplifies infrastructure by reducing the number of vector databases you need to manage, while preserving the accuracy and separation required for multi-agent or multi-tenant deployments.
Details:
- Enable with isolate_vector_search=True on any Knowledge instance
- Shared vector database, isolated query results per agent or team
- Reduces infrastructure cost and operational overhead for multi-agent systems
- No changes required to existing vector database configuration
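Two knowledge bases sharing one vector store might be wired like this (the Knowledge import path is an assumption):

```
from agno.knowledge.knowledge import Knowledge

# shared_vector_db is a single vector database instance, e.g. PgVector
support_knowledge = Knowledge(vector_db=shared_vector_db, isolate_vector_search=True)
sales_knowledge = Knowledge(vector_db=shared_vector_db, isolate_vector_search=True)
# Each agent attached to one of these only retrieves its own documents
```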
View the Vector Search docs.
Tools, knowledge sources, and team members can now be defined as callable factories that are resolved at runtime. Each factory receives full RunContext access and supports caching, allowing dynamic configuration based on the current user, session, or environment.
This removes the need to pre-instantiate every component at startup and enables more flexible, multi-tenant, and context-aware agent configurations without custom wiring.
Details:
- Define tools, knowledge, and team members as functions that return the configured instance
- Factories are invoked at runtime with access to RunContext (user, session, environment)
- Built-in caching prevents redundant initialization across runs
- Supports multi-tenant and dynamic configuration patterns out of the box
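A sketch (the exact factory signature is an assumption; see the Callable Factories docs):

```
from agno.agent import Agent

def tools_factory(run_context):
    # Resolve tools per request; results can be cached by the runtime
    if run_context.user_id in ADMIN_USERS:   # ADMIN_USERS defined elsewhere
        return [reporting_tools, admin_tools]
    return [reporting_tools]

agent = Agent(tools=tools_factory)  # invoked at runtime with RunContext access
```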
Learn more in the Callable Factories docs.
A new approval system lets you require human sign-off before agents execute sensitive actions. Using the @approval decorator alongside a HITL primitive (requires_confirmation, requires_user_input, or external_execution), you can define blocking or audit-mode approval gates with persistent database records for review and governance.
This means high-stakes operations—such as financial transactions, data modifications, or external API calls—don't proceed without explicit admin authorization. Every approval decision is recorded, creating a clear audit trail for compliance and operational review.
Details:
- Blocking mode (@approval or @approval(type="required")) pauses execution and writes a pending record to your database. The run only resumes once an admin resolves it via db.update_approval(...) and agent.continue_run() is called.
- Audit mode (@approval(type="audit")) is non-blocking: the run continues immediately after the HITL interaction is resolved, while an audit log is created for compliance and traceability.
- Persistent approval records are stored in your configured database for compliance, audit, and post-incident review.
- Programmatic resolution is handled through your DB provider, with expected_status checks to prevent race conditions, and can be integrated into existing review workflows and dashboards.
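A sketch of a blocking gate (import paths and decorator order are assumptions; see the approval docs for the canonical form):

```
from agno.tools import tool
from agno.tools.approval import approval   # assumed import path

@approval                            # blocking by default; pauses until resolved
@tool(requires_confirmation=True)    # the HITL primitive the gate attaches to
def transfer_funds(amount: float, account: str) -> str:
    """Move funds between accounts; requires admin sign-off."""
    return f"Transferred {amount} to {account}"
```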
Learn more in the Agno approval docs.
Teams can accumulate and persist knowledge over time, improving their responses and decisions across sessions and runs.
This brings organizational learning to multi-agent systems. Teams retain context about goals, constraints, policies, and past outcomes, leading to better performance and consistency as usage scales.
Details:
- Teams now support LearningMachine for persistent knowledge retention
- Learned knowledge persists across sessions and is shared among team members
- Enables continuous improvement without manual re-prompting or context injection
Who this is for: Organizations running long-lived multi-agent teams that benefit from accumulated context, such as customer support, internal operations, or advisory systems.
store_history_messages now defaults to False. Previously, agent run history messages were stored automatically. If your application relies on persisted message history for context, continuity, or audit purposes, you will need to explicitly enable this setting.
This change reduces default storage usage and improves performance for the majority of use cases where history persistence is not required.
Details:
- Action required: If you rely on stored history, set store_history_messages=True on your agent configuration
- Default behavior now skips history storage, reducing write volume and storage cost
- No impact if you were already managing history explicitly
Who this is for: All teams upgrading to this version. Review your agent configurations to ensure history persistence is enabled where needed.
We updated the sessions, component configurations, and component links tables to use proper primary key constraints, replacing the previous unique constraint approach. This improves data integrity, query performance, and alignment with database best practices.
Details:
- Primary key constraints now enforce uniqueness and indexing at the database level
- Improves query performance for session lookups and component resolution
- No action required for most deployments; migrations are applied automatically
Who this is for: All Agno users, particularly those running at scale where database performance and integrity are critical.
Learn more about sessions and component configuration in the Agno docs.
Agno infrastructure now supports AWS Elastic File System (EFS) for persistent, shared storage across agent deployments. This enables agents and workflows to read and write files on durable, scalable storage without managing custom volume mounts or storage adapters.
Details:
- Native EFS integration for persistent file storage across agent instances
- Supports shared access patterns for multi-agent or multi-container deployments
- Operates within your existing AWS environment and security controls
The human-in-the-loop (HITL) system for teams has been significantly expanded. New run requirements support tool confirmation, user input collection, and external tool execution, giving you fine-grained control over when and how humans interact with running teams.
This means teams can pause mid-execution to confirm a tool call, collect additional input from a user, or wait for an external action to complete before continuing. The result is safer, more controllable multi-agent workflows that keep humans in the loop exactly where it matters.
Details:
- Tool confirmation pauses before executing a tool and waits for human approval
- User input collection requests information from a user mid-run
- External tool execution allows a human or external system to perform a step and return the result
- All controls work within team orchestration, not just single agents
Who this is for: Teams deploying multi-agent workflows in customer-facing, regulated, or high-stakes environments where human oversight at specific decision points is required.
Agno now includes a built-in scheduler for running agents, teams, and workflows on a recurring basis. Define cron schedules with support for retries, timeouts, and timezone configuration. No external scheduler or infrastructure required.
This simplifies operations for teams that need periodic agent runs, such as daily data processing, recurring report generation, or scheduled monitoring tasks, without introducing external dependencies like Airflow or cron jobs on separate infrastructure.
Details:
- Cron-based scheduling with standard cron expressions
- Configurable retry policies and execution timeouts
- Timezone-aware scheduling for globally distributed teams
- Works with agents, teams, and workflows
from agno.db.sqlite import SqliteDb
from agno.scheduler import ScheduleManager
db = SqliteDb(id="scheduler-demo", db_file="tmp/scheduler.db")
mgr = ScheduleManager(db)
schedule = mgr.create(
name="weekday-report",
cron="0 9 * * 1-5",
endpoint="/agents/reporter/runs",
payload={"message": "Generate morning report"},
timezone="UTC",
max_retries=2,
retry_delay_seconds=30,
)
for s in mgr.list(enabled=True):
    print(s.name, s.next_run_at)
mgr.disable(schedule.id)
Teams now support four distinct execution modes (coordinate, route, broadcast, and tasks), giving you explicit control over how agents collaborate. Instead of building custom orchestration logic, you select the mode that matches your use case: route requests to a specialist, coordinate across multiple agents, broadcast to all members in parallel, or decompose work into discrete tasks.
This reduces orchestration boilerplate, makes team behavior predictable and auditable, and lets you switch strategies without rewriting agent logic.
Details:
- Coordinate mode lets a lead agent delegate and synthesize across team members
- Route mode directs each request to the best-fit agent based on the task
- Broadcast sends the same input to all members and collects parallel responses
- Tasks mode decomposes work into discrete, trackable units assigned to individual agents
- Configured via the mode parameter (a TeamMode value) when defining a team
from agno.agent import Agent
from agno.team.mode import TeamMode
from agno.team.team import Team
from agno.models.openai import OpenAIChat
# --- Specialist Agents (Team Members) ---
researcher = Agent(
name="Researcher",
model=OpenAIChat(id="gpt-4o"),
instructions="You are a research specialist. Find and summarize factual information on any given topic.",
)
analyst = Agent(
name="Analyst",
model=OpenAIChat(id="gpt-4o"),
instructions="You are a data analyst. Interpret findings and extract key insights from research.",
)
writer = Agent(
name="Writer",
model=OpenAIChat(id="gpt-4o"),
instructions="You are a technical writer. Turn insights into clear, concise reports.",
)
# --- Team with Coordinate Mode ---
# Swap TeamMode.coordinate for .route, .broadcast, or .tasks as needed
team = Team(
name="Research Team",
mode=TeamMode.coordinate, # Leader delegates and synthesizes across all members
members=[researcher, analyst, writer],
model=OpenAIChat(id="gpt-4o"), # Model used by the team leader
instructions="Coordinate the specialist agents to produce a well-researched, clearly written report.",
)
# --- Run the Team ---
team.print_response("Give me a report on the current state of quantum computing.", stream=True, show_member_responses=True)
Explore the Team Modes docs.
Agno now supports Neosantara, an Indonesian LLM gateway that provides an OpenAI-compatible API. This allows teams to use Neosantara models without changing existing agent or workflow integrations.
This expands deployment options for organizations operating in or serving Southeast Asia, enabling better regional performance, data residency alignment, and vendor flexibility.
Why this matters:
- Faster adoption with OpenAI-compatible interfaces
- More choice in model providers and regional infrastructure
- Reduced integration and switching costs
View the Neosantara docs.
The Workflow Router step now supports returning the name of a step instead of the step object itself, and can route to a group of steps as a single choice.
This makes routing logic easier to define, reason about, and maintain—especially in larger workflows with shared paths or reusable subflows. Named and grouped routing reduces tight coupling between steps and lowers the operational cost of evolving workflows over time.
Why this matters:
- Cleaner, more maintainable routing logic
- Easier reuse of shared workflow paths
- Reduced refactoring risk as workflows grow
Workflow Condition, Loop, and Router steps now support CEL (Common Expression Language) evaluators. Expressions are defined as strings, making workflows fully serializable and easier to store, review, and move across environments.
This change improves portability and governance for teams managing workflows as configuration rather than code. CEL provides a consistent, readable way to express branching and decision logic without embedding runtime objects, reducing coupling and deployment risk.
Why this matters:
- Easier workflow versioning and auditing
- Improved portability across environments and runtimes
- Less friction when storing workflows in configuration or policy systems
Agno now aligns with LanceDB’s latest API, replacing the deprecated table_names() with list_tables() and updating the minimum LanceDB version to 0.26.0. This avoids deprecation-related breakages and keeps integrations stable.
Details:
- Action required: upgrade LanceDB to version 0.26.0 or later
- If you call LanceDB directly, update usage to list_tables(); Agno’s adapter handles this internally
Who this is for: Teams using LanceDB as a vector store and relying on stable, forward-compatible storage integrations.
Human-in-the-loop confirmation now correctly applies to MCP Function tools via toolkit-level settings (e.g., requires_confirmation_tools). This restores predictable approval gates before tool execution, improving safety and oversight.
Details:
- Centralized policy control at the toolkit level
- No code changes required to enable HITL confirmation
Who this is for: Organizations enforcing governance, compliance, or risk controls in agentic workflows.
AwsBedrockEmbedder now supports Cohere Embed v4, including configurable output dimensions and multimodal (text + image) embeddings, with async variants. This expands what you can index and search while tuning for cost, latency, and quality.
Details:
- Control vector size via output_dimension for performance and cost management
- Operates through AWS Bedrock for governance and consolidated operations
Who this is for: Teams standardizing on Bedrock that need scalable, multimodal semantic search and RAG.
We introduced smarter retrieval with the new AwsBedrockReranker, supporting Cohere Rerank 3.5 and Amazon Rerank 1.0. By scoring and reordering retrieved passages, you can boost precision and reduce noise in generated answers.
Details:
- Plug-and-play integration for existing retrieval pipelines
- Convenience classes streamline setup and adoption on AWS
Who this is for: Teams building retrieval-augmented generation on AWS that need higher-quality, production-grade ranking.
Condition steps now support else_steps, allowing you to define a clear alternative path when a condition evaluates to false. This makes complex automations easier to express and maintain without extra workaround steps.
Details:
- First-class true/false branching directly in workflows
- Backward compatible; no changes required to existing flows
Who this is for: Teams orchestrating complex, decision-heavy workflows that need clearer control flow and easier maintenance.
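For example (step and evaluator names are illustrative):

```
from agno.workflow import Condition, Workflow

triage = Condition(
    evaluator=is_high_priority,    # callable or expression defined elsewhere
    steps=[escalate_step],         # runs when the condition is true
    else_steps=[queue_step],       # runs when the condition is false
)
workflow = Workflow(name="ticket-triage", steps=[intake_step, triage])
```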
Async text_reader.aread() now returns an empty list ([]) for empty files, aligning behavior with the sync API. This removes special-case handling and simplifies downstream pipelines.
Details:
- Consistent return types across sync and async code paths
- Reduced edge-case logic and clearer semantics for job orchestration
- Action required: Update any logic that expects a placeholder empty document
Who this is for: Teams running ingestion pipelines or ETL workflows that need predictable document handling.
We changed the WebsiteReader deduplication model to compute content hashes per page. This aligns skip_if_exists with page-level updates and ensures accurate re-crawls.
Details:
- Behavior change: Deduplication occurs at page granularity, not aggregate level
- Action required: Clear existing website crawl entries before re-indexing to prevent duplicates
- Benefits: Higher correctness, predictable re-crawls, and lower operational overhead
Who this is for: Engineering teams managing recurring website crawls and large content refreshes.
WebsiteReader now computes a unique content hash per crawled URL, fixing skip_if_exists for multi-page crawls. This ensures accurate per-page deduplication, reduces redundant ingestion, and saves processing cost during re-crawls.
Details:
- Correct per-page deduplication for predictable skip_if_exists behavior
- Fewer unnecessary writes and tokens when re-indexing multi-page sites
- Action required: Clear existing website crawl entries in your knowledge store before re-indexing to avoid duplicates
Who this is for: Teams maintaining search indexes, documentation portals, or knowledge bases sourced from websites.
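This is not WebsiteReader's actual hashing code, but the per-page model is easy to sketch: each crawled URL gets its own content hash, so skip_if_exists can compare pages individually rather than against one aggregate digest:

```python
import hashlib

def page_hash(url: str, content: str) -> str:
    # One hash per crawled URL, so unchanged pages can be skipped individually
    return hashlib.sha256(f"{url}\n{content}".encode("utf-8")).hexdigest()

pages = {
    "https://example.com/a": "Alpha content",
    "https://example.com/b": "Beta content",
}
hashes = {url: page_hash(url, text) for url, text in pages.items()}

# Distinct pages produce distinct keys; re-crawling an unchanged page
# reproduces the same key, which is what per-page deduplication checks.
assert len(set(hashes.values())) == len(pages)
```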
We’ve added a first-class SeltzTools toolkit that brings Seltz-powered semantic search directly into Agno. Teams can now plug high-quality semantic retrieval into Agents and Workflows without building custom adapters, improving response relevance and cutting integration time from days to minutes.
Details:
- Standard Tool interface works seamlessly across Agents and Workflows for consistent, composable search.
- Reduces maintenance by relying on a supported integration instead of bespoke connectors.
- Quick start: pip install seltz, then set SELTZ_API_KEY in your environment
Who this is for: Teams building RAG assistants, enterprise search, or knowledge-heavy automation that want robust semantic retrieval with minimal integration effort.
Learning is now simpler and more effective. When learning=True, user memory is enabled by default, and the LearnedKnowledgeStore captures organizational context (goals, constraints, policies) to guide agent behavior. We also improved prompts, streamlined tool parameter handling, and updated status messages. New quickstart cookbooks help teams adopt faster.
Details:
- Faster time-to-value with sensible defaults — no extra setup to persist memory
- Better outcomes via richer organizational context and improved prompt quality
- Reduced integration effort with simpler tool parameter handling
- Clearer operational visibility with improved status text and cookbooks
Who this is for: Teams piloting or scaling learning agents that need strong governance signals, faster setup, and consistent outcomes.
Agno now supports Moonshot.ai as a model provider with initial models and examples to help you get started quickly. This broadens your options for performance/cost trade-offs and lets you evaluate or deploy Moonshot models using the same configuration patterns you use today. The provider integrates seamlessly, so you can swap or A/B test models without refactoring agents or workflows.
Details:
- Standardized configuration and invocation across providers
- Ready-to-use examples to accelerate evaluation and onboarding
- Compatible with Agents, Tools, and Workflows
Who this is for: Teams optimizing their model portfolio for accuracy, latency, budget, or regional availability.
We added UnsplashTools, a first-class toolkit for discovering and retrieving high-quality, royalty-free images directly in Agno. Teams can now search, fetch by ID, request a random image, and download assets without building or maintaining custom integrations. This streamlines image sourcing across agents and workflows, reduces time-to-value for media-heavy features, and lowers ongoing integration overhead.
Details:
- Turnkey tools: search_photos, get_photo, get_random_photo, download_photo
- Consistent interface usable from agents and workflows
- Eliminates custom API wrappers and reduces maintenance
Who this is for: Product, content, and AI assistant teams needing on-demand images for generation, prototyping, or production experiences.
We introduced a dedicated ExcelReader for .xls/.xlsx with sheet filtering, options to skip hidden sheets, and chunking controls. ReaderFactory now routes Excel files to ExcelReader automatically. This eliminates CSV conversions and reliance on CSVReader, reducing setup time and avoiding common formatting pitfalls. Teams gain more predictable ingestion of large workbooks and can tune performance and cost via chunk sizing.
Details:
- Automatic routing of .xls/.xlsx to ExcelReader; minimal code changes for common cases
- Include/exclude specific sheets and optionally skip hidden tabs to control what’s ingested
- Chunking controls to handle large files reliably and at scale
- Migration: Projects that used CSVReader for Excel should switch to ExcelReader and install the extra: pip install "agno[excel]"
Who this is for: Teams ingesting spreadsheets into knowledge bases or agent workflows; platform owners standardizing document ingestion.
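For example (the import path and parameter names are indicative; see the docs for the exact options):

```
from agno.knowledge.reader.excel_reader import ExcelReader

reader = ExcelReader(
    include_sheets=["Revenue", "Forecast"],  # ingest only these sheets
    skip_hidden_sheets=True,                 # ignore hidden tabs
    chunk_size=2000,                         # tune chunking for large workbooks
)
documents = reader.read("reports/fy24.xlsx")
```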
A fix restores reliable table creation across AsyncSQLiteDb, AsyncPostgresDb, AsyncMySQLDb, and FirestoreDb. This removes a blocker that could prevent schema setup during initialization, improving startup reliability and reducing manual intervention across environments.
Details:
- Unblocks table creation during provisioning and cold starts
- Applies consistently across multiple async backends in one upgrade
- No application changes required
Who this is for: Teams using async database backends for storage who need predictable deployment and operations.
We introduced OpenAI Responses API–compatible clients, including a base OpenResponses and provider-specific clients for Ollama and OpenRouter. This gives teams a consistent request/response schema across local and hosted models, simplifying migrations and reducing provider-specific branching. The result is faster adoption, cleaner integrations, and more flexibility to switch or mix models without refactoring.
Details:
- One API shape across multiple providers for better portability and governance
- Supports self-hosted (Ollama) and hosted marketplaces (OpenRouter)
- No breaking changes — upgrade and start using Responses-compatible clients
Who this is for: Platform teams running hybrid model stacks and organizations seeking vendor flexibility with minimal integration overhead.
Knowledge now connects to private Azure Blob Storage as a first-class source — alongside SharePoint and GitHub — so Azure-centric organizations can centralize content without custom ETL. This enables teams to index documents securely from private containers and make them available to agents and workflows for retrieval-augmented generation and search.
Details:
- Works with private Azure Blob Storage containers under your existing access controls
- Parity with existing SharePoint and GitHub loaders for consistent operations
- Reduces setup time and ongoing maintenance for Azure-first environments
Who this is for: Teams standardizing on Azure that need governed, scalable ingestion for internal content.
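As a sketch of what a governed blob-container source involves (scoping indexing to a container and an optional prefix under your existing credentials), using a hypothetical local stand-in rather than the agno loader API:

```python
from dataclasses import dataclass

@dataclass
class BlobSource:
    """Hypothetical stand-in for a private blob-container source."""
    account_url: str   # e.g. https://<account>.blob.core.windows.net
    container: str
    prefix: str = ""   # index only blobs under this path

    def select(self, blob_names: list[str]) -> list[str]:
        # Narrow a container listing to the governed subset to index.
        return [n for n in blob_names if n.startswith(self.prefix)]

source = BlobSource(
    account_url="https://example.blob.core.windows.net",  # placeholder
    container="knowledge",
    prefix="docs/",
)
```

Scoping by container and prefix keeps indexing inside the boundaries your existing access controls already define.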
Async generator tools now capture and surface errors on the tool call, matching synchronous behavior, instead of re-raising exceptions. This delivers more predictable orchestration and fewer unexpected failures in long-running or streaming tool workflows. If your implementation relied on exceptions being raised, update your handlers accordingly.
Details:
- Aligns async error handling with sync tools for consistent behavior
- Reduces unexpected cancellations caused by unhandled async exceptions
- Improves reliability in streaming and long-running workflows
Who this is for: Teams building automation with async or streaming tools.
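The behavior change can be pictured as wrapping the async generator so an exception becomes a terminal error event on the tool call rather than propagating. A minimal illustration (not the agno internals):

```python
import asyncio

async def flaky_tool():
    yield {"type": "chunk", "data": "partial result"}
    raise RuntimeError("upstream timeout")

async def run_tool(gen):
    """Collect events; convert an exception into an error event."""
    events = []
    try:
        async for event in gen:
            events.append(event)
    except Exception as exc:  # surfaced on the tool call, not re-raised
        events.append({"type": "error", "error": str(exc)})
    return events

events = asyncio.run(run_tool(flaky_tool()))
```

The orchestrator sees the partial output plus a structured error, instead of an unhandled exception cancelling the run.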
We corrected streaming token accounting for Perplexity by collecting usage only on the final chunk for providers that return cumulative metrics. This change prevents inflated token counts so your dashboards, budgets, and alerts reflect actual usage.
Details:
- More accurate token and cost metrics for streaming responses
- Historical comparisons may show a step change; adjust thresholds as needed
- No application changes required
Who this is for: Platform, FinOps, and observability teams tracking model usage and spend.
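For providers that report cumulative usage on every chunk, summing across chunks double-counts tokens; the corrected approach reads usage from the final chunk only. A minimal sketch with made-up numbers:

```python
# Each chunk reports *cumulative* usage so far (Perplexity-style).
chunks = [
    {"text": "Hel", "usage": {"total_tokens": 5}},
    {"text": "lo",  "usage": {"total_tokens": 9}},
    {"text": "!",   "usage": {"total_tokens": 12}},
]

def stream_usage(chunks):
    """Take usage from the final chunk instead of summing every chunk."""
    return chunks[-1]["usage"]["total_tokens"]

inflated = sum(c["usage"]["total_tokens"] for c in chunks)  # sums to 26
actual = stream_usage(chunks)  # 12, the true total
```

This is why dashboards tracking the old behavior may show a downward step change after upgrading.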
Knowledge can now ingest content from private GitHub repositories and SharePoint, via both the SDK and the API. This enables organizations to consolidate code, docs, and operational knowledge from private systems while maintaining governance, reducing manual exports, and improving coverage for enterprise RAG and analytics.
Details:
- Supports authenticated access to private GitHub and SharePoint sources
- Preserves structure and basic metadata to enhance retrieval relevance
- Reduces integration effort by using a single ingestion pathway
Who this is for: Enterprises with critical content in private repos and SharePoint who need secure, governed ingestion.
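The single-pathway claim can be sketched as normalizing documents from different private sources into one record shape before indexing. The types below are hypothetical illustrations, not the agno SDK:

```python
from dataclasses import dataclass

@dataclass
class KnowledgeDoc:
    """Hypothetical normalized record produced by ingestion."""
    source: str    # e.g. "github" or "sharepoint"
    path: str
    text: str

def normalize(source: str, path: str, text: str) -> KnowledgeDoc:
    # Both sources funnel through the same shape, so retrieval and
    # governance only have to handle one record type downstream.
    return KnowledgeDoc(source=source, path=path, text=text)

docs = [
    normalize("github", "repo/README.md", "internal setup guide"),
    normalize("sharepoint", "sites/ops/runbook.docx", "on-call runbook"),
]
```

One normalized shape is what lets a single ingestion pathway cover both source types with preserved structure and metadata.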
