v2.2.0
October 23, 2025
Agents now support media-only inputs
Agents can now process inputs that contain only media, with no accompanying text. This supports use cases such as camera-to-answer, voice-only prompts, and file-first interactions, since clients no longer have to supply text alongside the media.
Details
- Accepts images, audio, video, or files without requiring text
- Simplifies mobile and voice-first integrations
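As a rough illustration of what "media-only" means for a client, the sketch below models an input that carries media parts and no text, and validates it. It is not the product's API: `MediaPart`, `AgentInput`, `validate_input`, the `kind`/`uri` fields, and the example URI are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical media kinds, mirroring the list in this release note.
MEDIA_KINDS = {"image", "audio", "video", "file"}


@dataclass
class MediaPart:
    kind: str  # one of MEDIA_KINDS
    uri: str   # location of the media (placeholder value in this sketch)


@dataclass
class AgentInput:
    text: Optional[str] = None  # text is optional in this sketch
    media: List[MediaPart] = field(default_factory=list)


def validate_input(inp: AgentInput) -> None:
    """Raise ValueError unless the input has text, media, or both."""
    for part in inp.media:
        if part.kind not in MEDIA_KINDS:
            raise ValueError(f"unsupported media kind: {part.kind}")
    has_text = bool(inp.text and inp.text.strip())
    if not has_text and not inp.media:
        raise ValueError("input must contain text or at least one media part")


# A media-only input: no text is supplied at all.
photo_only = AgentInput(media=[MediaPart(kind="image", uri="file:///tmp/photo.jpg")])
validate_input(photo_only)  # passes without raising
```

The only rule this sketch enforces is that an input cannot be completely empty; text on its own, media on its own, and both together are all accepted.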
Who this is for: Teams shipping multimodal or mobile-first experiences that prioritize media over text.
