Overview
Microsoft Foundry Agent Service now includes a native 3-tier memory architecture that gives enterprise AI agents the persistent, structured memory that production workloads require. This is not a bolt-on feature: it is built directly into the Foundry runtime, integrated with Azure identity, and designed to work at scale.
If you have been building AI agents and wrestling with how to manage state across sessions, users, and runs, this is the architectural piece you have been waiting for.
The 3 Memory Tiers Explained
1. Procedural Memory — How-To Knowledge Across Runs
Procedural memory stores how the agent solved a problem. When an agent successfully handles a support ticket, resolves an infrastructure issue, or completes a workflow, it encodes that approach and retrieves it on future similar tasks.
The economic implication is significant: on repeatable work, the 100th ticket should cost a fraction of the first, because the agent retrieves and applies a known solution instead of rediscovering it. This is the cost advantage for teams running high-volume, repetitive agent workflows.
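To make the record-then-retrieve idea concrete, here is a minimal sketch in plain Python. It is not the Foundry API: `Procedure`, `ProceduralMemory`, `record`, `recall`, and the word-overlap similarity are all hypothetical names and simplifications chosen for illustration. A real system would more likely use embedding similarity.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Procedure:
    task: str         # description of the task that was solved
    steps: List[str]  # the approach that worked


def _tokens(text: str) -> Set[str]:
    return set(text.lower().split())


def similarity(a: str, b: str) -> float:
    """Jaccard overlap between word sets (a stand-in for embedding similarity)."""
    ta, tb = _tokens(a), _tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


@dataclass
class ProceduralMemory:
    procedures: List[Procedure] = field(default_factory=list)

    def record(self, task: str, steps: List[str]) -> None:
        """Store the approach after a task has been completed successfully."""
        self.procedures.append(Procedure(task, list(steps)))

    def recall(self, task: str, threshold: float = 0.5) -> Optional[Procedure]:
        """Return the most similar stored procedure, or None if nothing is close enough."""
        best = max(self.procedures, key=lambda p: similarity(task, p.task), default=None)
        if best is not None and similarity(task, best.task) >= threshold:
            return best
        return None


# Hypothetical tickets, for illustration only.
memory = ProceduralMemory()
memory.record("reset vpn password for user",
              ["verify identity", "reset credential", "confirm login"])
hit = memory.recall("reset vpn password for contractor")  # similar task: reuse the steps
miss = memory.recall("provision new kubernetes cluster")  # novel task: nothing to reuse
```

The threshold is the design choice that matters here: set it too low and the agent applies an approach that does not fit, set it too high and it rediscovers solutions it already has.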
2. User Memory — Per-Individual Persistence
User memory persists information about a specific individual across all interactions. Preferences, communication style, past decisions, role context — all of it survives between sessions. This enables agents to behave as genuine long-term assistants rather than starting from zero every conversation.
For enterprise deployments this is powerful: a financial services agent that knows a client manager's portfolio preferences, or an IT helpdesk agent that remembers a specific user's environment and past issues.
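The helpdesk example can be sketched as a store keyed by user identity. This is an illustrative model only, not Foundry's interface: the `UserMemory` class, its method names, and the sample values are assumptions.

```python
from collections import defaultdict
from typing import Dict


class UserMemory:
    """Toy per-user store: facts are keyed by user ID and never shared across users."""

    def __init__(self) -> None:
        self._facts: Dict[str, Dict[str, str]] = defaultdict(dict)

    def remember(self, user_id: str, key: str, value: str) -> None:
        self._facts[user_id][key] = value  # a later value overwrites an earlier one

    def recall(self, user_id: str) -> Dict[str, str]:
        # Return a copy so callers cannot mutate the store by accident.
        return dict(self._facts.get(user_id, {}))

    def forget(self, user_id: str) -> bool:
        """Delete everything held about one user, e.g. for an erasure request."""
        return self._facts.pop(user_id, None) is not None


# Hypothetical user and facts, for illustration only.
user_memory = UserMemory()
user_memory.remember("alice@example.com", "environment", "managed laptop, corporate VPN")
user_memory.remember("alice@example.com", "past_issue", "VPN drops after sleep")

alice = user_memory.recall("alice@example.com")  # survives across sessions
bob = user_memory.recall("bob@example.com")      # a different user sees nothing
```

Keying every read and write by user ID keeps the isolation boundary explicit, and a single `forget` call gives deletion requests one place to land.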
3. Session Memory — Per-Conversation Context
Session memory is scoped to a single conversation. It gives the agent working context for the current interaction — what has been discussed, what decisions were made, what the user is trying to accomplish right now. When the session ends, this memory can optionally be promoted to user or procedural memory.
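Session scope and optional promotion can be modelled as follows. This is a sketch under stated assumptions, not how Foundry implements it: the `Session` class, the `promote` flag, and the `end` method are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Session:
    """Working context for a single conversation."""
    user_id: str
    turns: List[str] = field(default_factory=list)
    # Facts noted during the conversation; the flag marks them for promotion.
    facts: List[Tuple[str, bool]] = field(default_factory=list)

    def add_turn(self, text: str) -> None:
        self.turns.append(text)

    def note(self, fact: str, promote: bool = False) -> None:
        self.facts.append((fact, promote))

    def end(self) -> List[str]:
        """Close the session: return facts flagged for promotion, discard the rest."""
        promoted = [fact for fact, keep in self.facts if keep]
        self.turns.clear()
        self.facts.clear()
        return promoted


# Hypothetical conversation, for illustration only.
session = Session(user_id="u-123")
session.add_turn("Please review our storage spend.")
session.note("wants the review limited to storage accounts")  # only relevant right now
session.note("prefers Reserved Instances", promote=True)      # worth keeping long term
long_term = session.end()
```

Promotion here is opt-in per fact, so nothing leaves the conversation unless something explicitly marks it as worth keeping.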
How They Work Together in a Real Agent Run
Consider a cloud infrastructure agent handling an Azure cost optimisation request:
- Session memory tracks the current conversation — what resources the user wants to review, what constraints they mentioned
- User memory recalls that this user prefers Reserved Instances over Savings Plans and manages a healthcare workload with strict compliance requirements
- Procedural memory retrieves the approach used successfully for similar cost reviews — which APIs to query, how to structure the recommendation, which approval workflow to trigger
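One way to picture the three tiers combining in a single run is as sections of the context handed to the model. The sketch below is an assumption about how such assembly could look, not Foundry's actual behaviour: `build_context` and the section titles are hypothetical, and the sample values paraphrase the scenario above.

```python
from typing import Dict, List


def build_context(session: List[str], user: Dict[str, str], procedure: List[str]) -> str:
    """Merge the three tiers into one prompt preamble, most stable knowledge first."""
    sections = [
        ("Known approach (procedural memory)", procedure),
        ("About this user (user memory)", [f"{k}: {v}" for k, v in user.items()]),
        ("Current conversation (session memory)", session),
    ]
    lines: List[str] = []
    for title, items in sections:
        if not items:
            continue  # skip empty tiers, e.g. no stored procedure for a novel task
        lines.append(f"## {title}")
        lines.extend(f"- {item}" for item in items)
    return "\n".join(lines)


context = build_context(
    session=["Review compute spend, keeping existing compliance constraints"],
    user={
        "purchase preference": "Reserved Instances over Savings Plans",
        "workload": "healthcare, strict compliance requirements",
    },
    procedure=[
        "Query the cost APIs",
        "Structure the recommendation",
        "Trigger the approval workflow",
    ],
)
print(context)
```

Keeping each tier in its own labelled section makes it easy to see, and to audit, which memory contributed which part of the agent's context.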
Foundry vs Other Memory Approaches
Custom memory stacks (LangChain Memory, Mem0, custom vector DBs) require you to build and maintain the retrieval logic, handle identity boundaries, and manage storage separately. Foundry's memory layer handles all of this natively, with Azure identity governing access at each tier.
AWS Bedrock Agents has memory capabilities but lacks the 3-tier separation and the deep Microsoft 365 identity integration that matter for enterprise deployments.
Key Trade-offs to Understand
- Foundry-only: Memory is tied to the Foundry runtime and Azure storage, so it is not portable to other platforms
- Novel task overhead: On tasks the agent has not seen before, the procedural memory lookup adds latency with no benefit
- Privacy and compliance: User memory needs clear governance over what is stored, who can access it, and how it is deleted on request
- Cost model: Memory retrieval adds token overhead, so high-frequency agents need to limit how much is retrieved per run
- Memory drift: Procedural memory needs periodic review, because old approaches become outdated as systems change
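Two of these trade-offs, token overhead and memory drift, can be handled in the retrieval policy itself. The following is a minimal sketch, assuming each memory entry carries a relevance score and a last-verified date. `MemoryEntry`, `select_memories`, and the 4-characters-per-token estimate are illustrative assumptions, not Foundry features.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Tuple


@dataclass
class MemoryEntry:
    text: str
    relevance: float         # assumed to come from the retrieval step, 0.0 to 1.0
    last_verified: datetime  # when the entry was last confirmed to still be valid


def estimate_tokens(text: str) -> int:
    # Rough heuristic (about 4 characters per token); a real tokenizer will differ.
    return max(1, len(text) // 4)


def select_memories(
    entries: List[MemoryEntry],
    now: datetime,
    max_age: timedelta,
    token_budget: int,
) -> Tuple[List[MemoryEntry], List[MemoryEntry]]:
    """Return (selected, stale): fresh entries that fit the budget, and entries due for review."""
    fresh = [e for e in entries if now - e.last_verified <= max_age]
    stale = [e for e in entries if now - e.last_verified > max_age]

    selected: List[MemoryEntry] = []
    used = 0
    # Greedy fill by relevance; an entry that does not fit is skipped, not truncated.
    for entry in sorted(fresh, key=lambda e: e.relevance, reverse=True):
        cost = estimate_tokens(entry.text)
        if used + cost <= token_budget:
            selected.append(entry)
            used += cost
    return selected, stale
```

Stale entries are returned rather than silently dropped, so they can be routed to whatever review process the team uses instead of quietly disappearing.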
Key Takeaways
- Foundry now has a production-grade 3-tier memory architecture built into the runtime
- Procedural memory is the cost efficiency play for high-volume, repeatable agent workflows
- User memory enables genuinely personalised, long-term AI assistants at enterprise scale
- Native Azure identity integration makes access control straightforward for compliance teams
- For teams evaluating agent platforms, this closes a significant gap that previously required custom infrastructure


