Turning your agents into learning machines

Ashpreet Bedi
January 15, 2026
7 min read
Why AI memory hasn’t been solved and what we should be building instead

"Agents are changing how we work." For most people, this is aspirational. For me, it’s daily reality.

I do spec-first development and I use agents for all of it. They write the specs, generate the code, and keep everything in lockstep as the system evolves. They use the same tools I use. I haven’t touched Linear in weeks.

But, even within all this innovation, something fundamental is missing.

There’s no continuity. Each session starts cold.

I’m forced to provide the same context over and over: the projects, the constraints, the way I think, test, debug, and make decisions. I tried every workaround I could think of, including skills, MCP, and a massive collection of painstakingly honed prompts that was becoming increasingly difficult to maintain.

None of it really worked.

I tested memory systems as well. They let an agent remember that I live in New York, but not how I build, test, debug, or think through problems.

That frustration led me down a rabbit hole: hundreds (maybe thousands) of papers, blog posts, and systems on agent memory. And after all that, I came to a simple conclusion: AI memory hasn’t been solved because memory is the wrong abstraction.

The problem with “memory”

Most AI memory systems follow the same pattern: Message → Extract → Store → Retrieve → Dump into prompt → Repeat
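In pseudocode, that loop looks something like this (every name below is illustrative; no particular library’s API is implied):

# The standard memory loop, sketched end to end. All names are illustrative.
def handle_message(message, memory_db, llm):
    facts = llm.extract_facts(message)                    # Extract
    memory_db.store(facts)                                # Store
    relevant = memory_db.retrieve(message)                # Retrieve
    prompt = f"Known facts: {relevant}\nUser: {message}"  # Dump into prompt
    return llm.respond(prompt)                            # ...and repeat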

This design has two fundamental problems:

1. They collect the wrong kind of information

Memory systems focus on static facts like the user’s name, preferences, and summaries of past conversations. That information isn’t useless, but it’s shallow. It captures what a user says, not how they think, build, test, debug, or make decisions.

2. They don’t know how to use what they collect

Even when information is stored correctly, memory systems still don’t know how to integrate it. When does learning actually happen? Before a response, after it, or in parallel? Is memory updated automatically, or does the agent deliberately control it?

And once something is “remembered,” how does the agent know when to use it, how much weight to give it, or how to act on it naturally?

You can’t just tell an agent, “You know XYZ about the user.” You have to teach it how to apply that knowledge, how to prioritize it, and how to reason with it in context. Without that, you don’t get an intelligent partner. You get a machine reciting facts from a database.

The shift: from memory to learning

I kept trying to solve the memory problem. Better extraction. Smarter retrieval. More context. It felt like tightening a screw with a hammer.

Then it clicked.

I was asking, “What should the agent remember?” when I should have been asking, “What should the agent learn?”

This might seem like wordplay, but it’s a fundamental shift in perspective. Memory is static: a database of facts about the user. Learning is dynamic: it evolves, compounds, sharpens, and serves a purpose.

This realization led me to build something different.

Introducing learning machines

A Learning Machine is an agent that continuously integrates information from its environment and improves over time—across users, sessions, and tasks.

These aren’t just agents with memory. They aren’t just really good prompts with big databases attached. A learning machine actively participates in its own learning: it curates what it learns and integrates that knowledge back into every response.

Here’s how it compares to a memory system: 

Traditional "Memory":
Message → Extract → Store → Retrieve → Dump into Prompt → Repeat

Learning Machine:
User Message ──────► Recall from Stores ◄────────┐
                            │                    │
                            ▼                    │
                      Build Context              │
                            │                    │
                            ▼                    │ LearningMachine
                Agent Responds (with tools)      │
                            │                    │
                            ▼                    │
                   Extract & Process             │
                            │                    │
                            ▼                    │
              Update Stores (agent learns) ──────┴──► Periodic Curation

The goal is for the agent to be fundamentally better on its thousandth interaction than on its first, with improvements that apply across the board, not just for the same user.

Learning as infrastructure

Let’s define some terms: 

  • Agent = LLM + Tools + Instructions
  • Learning Machine = Agent + Learning Stores

The key innovation behind the Learning Machine is a shared learning protocol that coordinates a set of extensible learning stores.

Learning stores

Learning stores run quietly in the background, capturing different kinds of knowledge and picking up details that become useful in future runs.

Here are some stores I'm currently working on:

  • User Profile: Preferences, memories, personal context
  • Session Context: Goal, plan, progress, summary
  • Entity Memory: Facts, events, relationships
  • Learned Knowledge: Insights, patterns, best practices
  • Decision Logs: Why decisions were made
  • Behavioral Feedback: What worked, what didn't
  • Self-Improvement: Evolving instructions

The real breakthrough, though, is that these stores are extensible by design. You’re not limited to a fixed set. You can create custom stores that match your domain, your workflow, and your team’s weirdly specific way of doing things.

For example, when I’m coding, I want my agent to learn where the source code for a feature lives, where the testing cookbooks are, how to run the tests, and which environment to use. Once the agent learns how to test a feature, that knowledge shouldn’t stay siloed. It should benefit everyone on the team.

That’s the power of extensibility. You can extend the LearningStore protocol to fit your exact needs. Need shared project context? Build a ProjectContextStore. Need to track customers or accounts? Build an AccountStore. The system learns with you, capturing the knowledge that actually matters to how you work.

Learning protocol

The learning protocol defines how the learning stores behave: how they capture, process, and integrate knowledge. 

Each store is configured independently, so you can mix and match as needed. The Learning Machine orchestrates it all.
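Put together, one turn through the loop in the diagram above looks roughly like this, using the three protocol methods from the custom-store example later in this post. This is a sketch of my mental model, not Agno’s internals; machine.stores and agent.respond are illustrative names:

# One turn through the Learning Machine loop (illustrative sketch).
def run_turn(machine, agent, message, user_id):
    # 1. Recall from stores and build context
    recalled = {store: store.recall(user_id=user_id) for store in machine.stores}
    context = "\n".join(
        store.build_context(data) for store, data in recalled.items()
    )

    # 2. Agent responds (with tools), with the learned context injected
    response = agent.respond(context=context, message=message)

    # 3. Extract & process: each store updates itself from the conversation
    for store in machine.stores:
        store.process(messages=[message, response], user_id=user_id)

    return response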

What learning looks like in practice

I've been testing a prototype with my team for a few days. Here's a sample session:

Session 1

Engineer A tells the agent: "We're trying to cut down on egress costs. Any ideas?"

The agent learns:

  • The team is cost-sensitive
  • Egress is a concern

Session 2

Engineer B, in an entirely new conversation, asks: “I’m deciding between Datadog and Grafana. Any recommendations?”

The agent responds: “Datadog’s egress costs can add up. Since the team is actively trying to reduce egress, Grafana might be a better fit here.”

No shared prompt. No handoff. No explicit memory call. One person taught the agent something and another person benefited from it.

That’s learning.

Let’s try another one. A new employee asks: “I’m starting work on the cloud migration project. What should I know?”

The agent responds with full context even though it’s never spoken to this person before.

It knows:

  • Acme is migrating from AWS to GCP
  • Alex (CTO) is leading it
  • Phase 2 is the most compute-heavy
  • Pricing changes are coming next quarter

And it even shares patterns from similar past migrations, unprompted. Why?

Because it learned.

How to code learning machines

In this example, one agent draws from multiple learning stores, each configured independently and all orchestrated by the Learning Machine.

from agno.agent import Agent
from agno.db.postgres import PostgresDb
from agno.learn import (
    EntityMemoryConfig,
    LearnedKnowledgeConfig,
    LearningMachine,
    LearningMode,
    SessionContextConfig,
    UserProfileConfig,
)
from agno.models.openai import OpenAIResponses

agent = Agent(
    model=OpenAIResponses(id="gpt-5.2"),
    db=PostgresDb(db_url="postgresql://..."),
    learning=LearningMachine(
        knowledge=my_vector_store,  # or graph if that's your thing
        user_profile=UserProfileConfig(
            mode=LearningMode.BACKGROUND,
            enable_agent_tools=True,
        ),
        session_context=SessionContextConfig(
            enable_planning=True,
        ),
        learned_knowledge=LearnedKnowledgeConfig(
            mode=LearningMode.PROPOSE,
        ),
        entity_memory=EntityMemoryConfig(
            mode=LearningMode.BACKGROUND,
        ),
    ),
)
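With this agent in place, the Session 1 / Session 2 exchange from earlier is just two ordinary runs with different user IDs. A sketch (I’m assuming run accepts user_id and session_id keywords here):

# Engineer A teaches the agent something in one session...
agent.run(
    "We're trying to cut down on egress costs. Any ideas?",
    user_id="engineer_a",
    session_id="session_1",
)

# ...and Engineer B benefits from it in a completely separate one.
agent.run(
    "I'm deciding between Datadog and Grafana. Any recommendations?",
    user_id="engineer_b",
    session_id="session_2",
)
# → "Datadog's egress costs can add up. Since the team is actively trying
#    to reduce egress, Grafana might be a better fit here."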

And if the default stores aren’t enough? Just build your own.

from typing import List

from agno.learn import LearningMachine, LearningStore
from agno.models.message import Message

class ProjectContextStore(LearningStore):
    """Learns the structure and workflow of a specific project."""

    def recall(self, project_id: str, **kwargs):
        # What has the agent learned about this project?
        ...

    def process(self, messages: List[Message], **kwargs):
        # Extract learnings from every conversation:
        # - Where does the source code live?
        # - How do we run tests?
        # - Which environment to use?
        # - What are the gotchas?
        ...

    def build_context(self, data) -> str:
        # Inject learnings into the agent's context
        return f"<project_context>\n{data}\n</project_context>"

# Plug it in
learning = LearningMachine(
    custom_stores={
        "project": ProjectContextStore(
            context={
                "project_id": "learning-machine",
            },
        ),
    },
)

Define these three methods, and the agent gains a domain-specific learning store. The default stores get you started, while custom stores let you teach your agent anything.
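Plugging the result into an agent works the same way as before; a sketch reusing the imports and names from the examples above:

# Attach the LearningMachine carrying the custom "project" store.
agent = Agent(
    model=OpenAIResponses(id="gpt-5.2"),
    db=PostgresDb(db_url="postgresql://..."),
    learning=learning,  # the LearningMachine defined above
)

# Whatever one session teaches the ProjectContextStore is recalled in the next.
agent.run("How do we run the tests for this feature?")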

Learning agents in the real world

When agents learn across users, sessions, and time:

  • Support agents get better with every ticket
  • Customer success agents know every account, across teams
  • Healthcare agents retain long-term patient history beyond individual visits or providers
  • Financial advisors remember goals, risk tolerance, and years of “what ifs”

And eventually:

Agents learn from their own failures, propose changes to their instructions, and evolve with human approval.

That’s the endgame.

Learning machines roadmap

Learning Machine is already part of Agno. Here’s a look at where we are and where we’re going: 

  • Phase 1 (we’re testing this now)
    User profiles, session context, entity memory, and learned knowledge.
  • Phase 2
    Decision logs and behavioral feedback let agents learn from what actually happened.
  • Phase 3
    Self-improvement enables agents to refine their own instructions over time.

As for how learning stores will expand, here’s what I know: the most valuable ones don’t exist yet. And they shouldn’t come from us. They’ll come from developers who understand their domains—legal, medical, finance, and operations—better than we ever could.

My bet isn’t that we built the right stores; it’s that we built the right protocol for others to build theirs.

Learning was always the goal, we just didn’t know it

Memory was never the goal. Learning was. What sounds like a semantic distinction at first glance isn’t one at all: it’s the difference between an agent that passively stores facts and one that actively improves over time.

If this resonates, dig into the code, build a store, break something, and tell us what’s missing. That’s how Learning Machines will get better, just like the agents themselves.