Most teams try to fix weak agents by rewriting prompts. That is usually the wrong move.
The real issue is that the agent has no durable memory model. It can answer the current turn, but it cannot reliably carry forward user preferences, prior decisions, task context, or the small facts that make an interaction feel intelligent instead of stateless. Microsoft Agent Framework already gives you the extension points to solve this. By default, it uses in-memory chat history or service-managed history depending on the underlying provider, and it separates conversation history from richer context injection through ChatHistoryProvider and AIContextProvider. That separation is exactly what you want if you are serious about building agents that behave like products rather than demos. (Microsoft Learn)
As of February 2026, Microsoft Agent Framework is in Release Candidate for both .NET and Python, with the API surface described as stable on the road to GA. Microsoft also positions it as the successor to Semantic Kernel and AutoGen for agent development, which makes it the right place to invest if you are building new agent systems on the Microsoft stack. (Microsoft for Developers)
The important architectural move is this: do not treat “memory” as one blob. In Microsoft Agent Framework, short-term conversational recall belongs in ChatHistoryProvider. Durable user and workflow memory belongs in an AIContextProvider backed by your own memory service. External knowledge belongs in retrieval, not memory. The framework pipeline reflects that design. On each run, history is loaded first, then context providers add messages, tools, or instructions, then the LLM executes, and afterward both history and context providers are notified so they can store new information. (Microsoft Learn)
That means a good memory design has three layers.
First, keep recent turns available so the model can follow the conversation naturally. Second, maintain a durable memory store that saves distilled facts such as user preferences, entities, project state, or decisions. Third, use retrieval for external documents and domain knowledge. Microsoft’s RAG support already follows this pattern through TextSearchProvider and Semantic Kernel vector store integrations, which is why you should resist the temptation to dump everything into “memory.” Retrieval is for knowledge. Memory is for continuity. (Microsoft Learn)
The cleanest way to add a memory service in .NET is to create a custom AIContextProvider. Microsoft’s docs explicitly describe AIContextProvider as the extension point for memory and context enrichment, and they show a memory-service pattern that loads relevant memories before invocation and stores new memories after invocation. They also make an important point that many teams miss: the same provider instance is reused across sessions, so session-specific state must not live on the provider object itself. It should live in the AgentSession, typically through ProviderSessionState<TState>. (Microsoft Learn)
Here is the pattern I would use.
```csharp
using System.Linq;
using System.Text;
using Microsoft.Agents.AI;
using Microsoft.Extensions.AI; // ChatMessage and ChatRole live here

public interface IMemoryService
{
    Task<string> CreateMemoryContainerAsync(CancellationToken cancellationToken = default);

    Task<IReadOnlyList<MemoryFact>> SearchAsync(
        string memoryContainerId,
        string query,
        int top = 5,
        CancellationToken cancellationToken = default);

    Task StoreAsync(
        string memoryContainerId,
        IEnumerable<ChatMessage> messages,
        CancellationToken cancellationToken = default);
}

public sealed record MemoryFact(string Text, double Score);

internal sealed class DurableMemoryProvider : AIContextProvider
{
    private readonly IMemoryService _memoryService;
    private readonly ProviderSessionState<State> _sessionState;

    public DurableMemoryProvider(IMemoryService memoryService)
        : base(null, null)
    {
        _memoryService = memoryService;
        _sessionState = new ProviderSessionState<State>(
            _ => new State(),
            stateKey: nameof(DurableMemoryProvider));
    }

    public override string StateKey => _sessionState.StateKey;

    protected override async ValueTask<AIContext> ProvideAIContextAsync(
        InvokingContext context,
        CancellationToken cancellationToken = default)
    {
        var state = _sessionState.GetOrInitializeState(context.Session);
        if (string.IsNullOrWhiteSpace(state.MemoryContainerId))
        {
            return new AIContext();
        }

        // Search durable memory using only the new external input, not the full prompt.
        var userInput = string.Join(
            "\n",
            context.AIContext.Messages?
                .Where(m => m.GetAgentRequestMessageSourceType() == AgentRequestMessageSourceType.External)
                .Select(m => m.Text) ?? []);

        if (string.IsNullOrWhiteSpace(userInput))
        {
            return new AIContext();
        }

        var memories = await _memoryService.SearchAsync(
            state.MemoryContainerId,
            userInput,
            top: 5,
            cancellationToken);

        if (memories.Count == 0)
        {
            return new AIContext();
        }

        var memoryBlock = new StringBuilder();
        memoryBlock.AppendLine("Relevant durable memory for this conversation:");
        foreach (var memory in memories)
        {
            memoryBlock.AppendLine($"- {memory.Text}");
        }
        memoryBlock.AppendLine("Use this only when relevant. Do not invent details.");

        return new AIContext
        {
            Messages =
            [
                new ChatMessage(ChatRole.System, memoryBlock.ToString())
            ]
        };
    }

    protected override async ValueTask StoreAIContextAsync(
        InvokedContext context,
        CancellationToken cancellationToken = default)
    {
        var state = _sessionState.GetOrInitializeState(context.Session);
        if (string.IsNullOrWhiteSpace(state.MemoryContainerId))
        {
            state.MemoryContainerId = await _memoryService.CreateMemoryContainerAsync(cancellationToken);
            _sessionState.SaveState(context.Session, state);
        }

        var newMessages = context.RequestMessages.Concat(context.ResponseMessages ?? []);
        await _memoryService.StoreAsync(state.MemoryContainerId!, newMessages, cancellationToken);
    }

    private sealed class State
    {
        public string? MemoryContainerId { get; set; }
    }
}
```
This pattern does two things that matter. Before the model runs, it searches durable memory using only the new external input, not the full accumulated prompt. After the model runs, it stores the latest exchange back into the memory service. That lines up with the framework’s documented ProvideAIContextAsync and StoreAIContextAsync lifecycle, and it keeps the memory container ID in session state where it belongs. (Microsoft Learn)
Then wire it into your agent the way the framework intends:
```csharp
using Azure.AI.OpenAI;
using Azure.Identity;
using Microsoft.Agents.AI;

var endpoint = Environment.GetEnvironmentVariable("AZURE_OPENAI_ENDPOINT")
    ?? throw new InvalidOperationException("Set AZURE_OPENAI_ENDPOINT");
var deploymentName = Environment.GetEnvironmentVariable("AZURE_OPENAI_DEPLOYMENT_NAME")
    ?? "gpt-4o-mini";

IMemoryService memoryService = new YourMemoryServiceImplementation();

AIAgent agent = new AzureOpenAIClient(new Uri(endpoint), new AzureCliCredential())
    .GetChatClient(deploymentName)
    .AsAIAgent(new ChatClientAgentOptions
    {
        ChatOptions = new() { Instructions = "You are a helpful assistant." },
        ChatHistoryProvider = new InMemoryChatHistoryProvider(),
        AIContextProviders =
        [
            new DurableMemoryProvider(memoryService)
        ]
    });

AgentSession session = await agent.CreateSessionAsync();

Console.WriteLine(await agent.RunAsync("My preferred reporting format is concise.", session));
Console.WriteLine(await agent.RunAsync("How should you format future updates for me?", session));
```
That registration model is straight out of the framework’s current design: one history provider, plus a list of context providers for memory, RAG, or dynamic instructions. (Microsoft Learn)
There is one more design choice that separates solid implementations from messy ones. Do not store raw transcripts forever and call that memory. A real memory service should distill turns into durable facts. “User prefers concise updates.” “Client uses Workday.” “Case escalated on March 12.” “Do not recommend option B because legal rejected it.” Your StoreAsync method should extract candidate facts, score them, deduplicate them, and set an expiration policy where appropriate. The framework gives you the hook points, but the product quality comes from what you choose to remember.
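To make the distillation idea concrete, here is a minimal, dependency-free sketch of what could run inside your StoreAsync before anything hits the backing store. The types FactDistiller and DistilledFact are hypothetical, not framework types, and the marker-based scoring stands in for what would realistically be an LLM extraction pass:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical illustration of the distill-score-dedupe-expire step.
public sealed record DistilledFact(string Text, double Score, DateTimeOffset? ExpiresAt);

public static class FactDistiller
{
    // Split turns into candidate facts (one per sentence), score them naively by
    // the presence of "durable" markers, deduplicate on normalized text, and
    // attach an expiry to time-sensitive facts.
    public static IReadOnlyList<DistilledFact> Distill(
        IEnumerable<string> turnTexts, DateTimeOffset now)
    {
        string[] durableMarkers = ["prefer", "always", "never", "uses", "decided"];

        var facts = new Dictionary<string, DistilledFact>(StringComparer.OrdinalIgnoreCase);
        foreach (var text in turnTexts)
        {
            foreach (var raw in text.Split('.', '!', '?'))
            {
                var candidate = raw.Trim();
                if (candidate.Length < 8) continue; // too short to be a useful fact

                var score = durableMarkers.Count(m =>
                    candidate.Contains(m, StringComparison.OrdinalIgnoreCase))
                    / (double)durableMarkers.Length;
                if (score == 0) continue; // nothing durable worth remembering

                // Time-sensitive facts expire; stable preferences do not.
                DateTimeOffset? expiresAt =
                    candidate.Contains("today", StringComparison.OrdinalIgnoreCase)
                        ? now.AddDays(1)
                        : null;

                // Dedupe on normalized text, keeping the higher-scored duplicate.
                var key = candidate.ToLowerInvariant();
                if (!facts.TryGetValue(key, out var existing) || existing.Score < score)
                {
                    facts[key] = new DistilledFact(candidate, score, expiresAt);
                }
            }
        }

        return facts.Values.OrderByDescending(f => f.Score).ToList();
    }
}
```

With input like "I prefer concise updates. The weather is nice.", only the preference sentence survives distillation; the small talk is dropped rather than stored as transcript.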
You also need to manage context growth. Microsoft’s compaction guidance is useful here. If you use CompactionProvider, be deliberate about where you register it. When registered through ChatClientAgentOptions, synthetic summaries can become part of persisted history. If you only want to compact the in-flight context window while preserving the original stored history, register compaction on the chat client builder with UseAIContextProviders(...) instead. That is a subtle detail, but it matters once your agent starts running real workloads instead of toy chats. (Microsoft Learn)
For production hosting, this pattern becomes even more powerful when paired with durable execution. Microsoft’s Azure Functions integration for Agent Framework supports durable thread management and persistent state for long-running, reliable workloads, with automatic endpoints and state handling. That is the right direction if your memory-backed agent is part of a business workflow rather than a single-turn assistant. (Microsoft Learn)
The broader lesson is simple. Memory should not be a prompt hack. It should be an architectural layer.
Microsoft Agent Framework is opinionated in the right way here. It gives you a dedicated history mechanism, a dedicated context mechanism, and a clean place to plug in your own memory service. Use those primitives as intended and you get an agent that remembers the right things, forgets the wrong things, and behaves consistently across sessions. Ignore them and you end up with an expensive autocomplete loop pretending to be an agent.
