Memory system in SRE Agent preview

The SRE Agent memory system gives agents the knowledge they need to troubleshoot effectively. By adding runbooks, team standards, and service-specific context, you help agents provide better answers during incidents. The system learns from each session to improve over time.

Memory components

The memory system consists of four complementary components:

| Component | Purpose | Setup | Best for |
| --- | --- | --- | --- |
| User memories | Quick chat commands for team knowledge | Instant (chat commands) | Team standards, service configurations, workflow patterns |
| Knowledge base | Direct document uploads for runbooks | Quick (file upload) | Static runbooks, troubleshooting guides, internal documentation |
| Documentation connector | Automated Azure DevOps synchronization | Configuration required | Living documentation, frequently updated guides |
| Session insights | Agent-generated memories from sessions | Automatic | Learned troubleshooting patterns, past incident resolutions |

How agents retrieve memory

During conversations, agents retrieve information from memory sources through configured tools.

Diagram of the Azure SRE Agent memory system loop.

Tool configuration

The SearchMemory tool retrieves information from all memory components. It searches user memories, the knowledge base, session insights, and the documentation connector simultaneously.

  • SRE Agent (default): SearchMemory is built in
  • Custom subagents: Add SearchMemory tool to your configuration
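
As a rough illustration of what searching several memory sources at once can look like, here is a minimal Python sketch. Everything in it is hypothetical: the `SOURCES` dictionary, the `search_memory` function, and the word-overlap scoring are stand-ins chosen for this example. They are not the SearchMemory implementation or its API.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-ins for the four memory components.
SOURCES = {
    "user_memories": ["For latency issues, check Redis cache health first"],
    "knowledge_base": ["Runbook: restart app-service-prod after failed deployment"],
    "documentation_connector": ["Deployment guide: production deployments happen Tuesdays"],
    "session_insights": ["Past incident: Redis cache eviction caused latency spike"],
}


def score(query: str, text: str) -> int:
    """Toy relevance score: number of distinct query words that appear in the text."""
    text_words = set(text.lower().split())
    return sum(1 for word in set(query.lower().split()) if word in text_words)


def search_source(name: str, query: str) -> list[tuple[int, str, str]]:
    """Search one source and return (score, source name, text) for each hit."""
    return [(score(query, t), name, t) for t in SOURCES[name] if score(query, t) > 0]


def search_memory(query: str) -> list[tuple[int, str, str]]:
    """Query every source in parallel and merge the hits, best score first."""
    with ThreadPoolExecutor() as pool:
        per_source = list(pool.map(lambda name: search_source(name, query), SOURCES))
    merged = [hit for hits in per_source for hit in hits]
    # Sort by descending score, then by source name for a stable order.
    return sorted(merged, key=lambda hit: (-hit[0], hit[1]))


results = search_memory("redis latency")
```

In this sketch, a single query returns hits from more than one component in one ranked list, which is the behavior the tool description implies.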

Important

Don't store secrets, credentials, API keys, or sensitive data in any memory component. Memories are shared across your team and indexed for search.
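
One way to reduce the risk of saving a secret by accident is to check text locally before you paste it into a memory. The following Python sketch is illustrative only. The pattern list and the `find_secret_hints` helper are assumptions made for this example, not a feature of SRE Agent, and a real scanner would need far broader rules.

```python
import re

# Illustrative patterns only; they catch a few obvious shapes, not every secret.
SECRET_PATTERNS = {
    "password assignment": re.compile(r"(?i)\b(password|passwd|pwd)\s*[:=]\s*\S+"),
    "API key or token assignment": re.compile(r"(?i)\b(api[_-]?key|secret|token)\s*[:=]\s*\S+"),
    "storage account key": re.compile(r"(?i)\bAccountKey=[A-Za-z0-9+/=]{20,}"),
    "private key block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def find_secret_hints(text: str) -> list[str]:
    """Return the names of the patterns that match the text."""
    return [name for name, pattern in SECRET_PATTERNS.items() if pattern.search(text)]


safe = find_secret_hints("Team owns app-service-prod in East US region")
risky = find_secret_hints("Redis password: hunter2-example")
```

An empty result doesn't prove that the text is safe to store. Treat the check as a reminder, not a guarantee.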

Quick start

Begin by establishing foundational knowledge with user memories, and then expand to document storage and automated synchronization as your needs grow.

1. Start with user memories

Use chat commands to save immediate team knowledge:

#remember Team owns services: app-service-prod, redis-cache-prod, and sql-db-prod

#remember For latency issues, check Redis cache health first

#remember Production deployments happen Tuesdays at 2 PM PST

These facts are now available across all conversations.

2. Upload key documents

Add critical runbooks and guides to the knowledge base:

  1. Open your SRE Agent in the Azure portal.

  2. Go to Settings > Knowledge base.

  3. Select Add file or drag and drop files into the upload area.

  4. Upload .md or .txt files (up to 16 MB each).

  5. The system indexes files and makes them available for retrieval through SearchMemory.

3. Review session insights

After troubleshooting sessions, check Settings > Session insights to see what went well and where the agent needs more context. Use the insights to identify knowledge gaps and add targeted memories or documentation.

4. Connect repositories (optional)

For teams with existing documentation in Azure DevOps:

  1. Go to Settings > Connectors.

  2. Select Add connector, and then select Documentation connector.

  3. Enter your Azure DevOps repository URL and select a managed identity.

    The connector starts indexing automatically.

User memories

User memories let you save team facts, standards, and context that agents remember across all conversations. By using simple chat commands (#remember, #forget, #retrieve), you can build a persistent store of team knowledge that automatically enhances agent responses.
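
All three commands share the same shape: a keyword followed by free text. The following Python sketch shows one way a client could recognize them. The `MemoryCommand` type and `parse_memory_command` function are hypothetical names used only for illustration, and the rule that a bare command with no text is rejected is an assumption of this example.

```python
from typing import NamedTuple, Optional

COMMANDS = ("#remember", "#forget", "#retrieve")


class MemoryCommand(NamedTuple):
    name: str      # "remember", "forget", or "retrieve"
    argument: str  # the free text that follows the command


def parse_memory_command(message: str) -> Optional[MemoryCommand]:
    """Return the parsed command, or None if the message isn't a memory command."""
    stripped = message.strip()
    for command in COMMANDS:
        if stripped == command or stripped.startswith(command + " "):
            argument = stripped[len(command):].strip()
            if not argument:
                return None  # assumption: a command with no text is invalid
            return MemoryCommand(command[1:], argument)
    return None


parsed = parse_memory_command("#remember Team owns app-service-prod in East US region")
```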

Chat commands

Save information by using #remember

Save facts, standards, or context for future conversations.

Syntax:

#remember [content to save]

Examples:

#remember Team owns app-service-prod in East US region
#remember For app-service-prod latency issues, check Redis cache health first
#remember Team uses Kusto for logs. Workspace is "myteam-prod-logs"

Content is embedded by using OpenAI, stored in Azure AI Search, and becomes available for automatic retrieval across all conversations. You see a confirmation: ✅ Agent Memory saved.

Remove memories by using #forget

Delete previously saved memories by searching for them.

Syntax:

#forget [description of what to forget]

Examples:

#forget NSG rules information
#forget production environment location

The system searches your memories semantically for the best match, shows you the content, and deletes it. You see a confirmation: ✅ Agent Memory forgotten: [deleted content]

Query memories by using #retrieve

Explicitly search and display saved memories without triggering agent reasoning.

Syntax:

#retrieve [search query]

Examples:

#retrieve production environment
#retrieve deployment process

The system searches memories semantically, and then uses the top five matches to synthesize a response. It displays both the individual memories and the synthesized answer.
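
To make "top five matches" concrete, here is a toy Python sketch of similarity-ranked retrieval. It's a simplification under stated assumptions: the `embed` function uses plain word counts instead of a learned embedding model, and `top_matches` is a hypothetical helper, not part of the product.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts. A real system uses a learned model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_matches(query: str, memories: list[str], k: int = 5) -> list[str]:
    """Return up to k memories, most similar first, skipping zero-similarity ones."""
    query_vec = embed(query)
    scored = [(cosine(query_vec, embed(m)), m) for m in memories]
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True)
    return [memory for _, memory in ranked[:k]]


memories = [
    "Team owns app-service-prod in East US region",
    "Production deployments happen Tuesdays at 2 PM PST",
    "Production environment runs in East US",
]
matches = top_matches("production environment", memories)
```

Word counts only match exact terms, whereas semantic search can also match related wording. The ranking and cutoff logic is the part this sketch is meant to show.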

Scope and storage

  • Shared across the team: All users of the SRE Agent can access saved memories.

  • Persistent across all conversations: Save a memory once, and it stays available until you remove it by using #forget.

  • Automatically retrieved when relevant: Agents search memories semantically during reasoning.

Knowledge base

The knowledge base provides direct document upload capabilities for runbooks, troubleshooting guides, and internal documentation that agents can retrieve during conversations.

Supported file types and limits

  • Formats: .md (markdown, recommended), .txt (plain text)
  • Per file: 16 MB maximum (Azure AI Search limit)
  • Per request: 100 MB total for all files in a single upload
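
If you prepare uploads with a script, you can check a batch against these limits before you open the portal. The following Python sketch is a local pre-check only. The `validate_upload` helper is hypothetical, and it assumes that 1 MB means 1,048,576 bytes, which the limits above don't specify.

```python
from pathlib import Path

ALLOWED_EXTENSIONS = {".md", ".txt"}
MAX_FILE_BYTES = 16 * 1024 * 1024      # 16 MB per file (assumes 1 MB = 1,048,576 bytes)
MAX_REQUEST_BYTES = 100 * 1024 * 1024  # 100 MB total per upload request


def validate_upload(files: dict[str, int]) -> list[str]:
    """Check a batch of {file name: size in bytes} and return a list of problems."""
    problems = []
    for name, size in files.items():
        if Path(name).suffix.lower() not in ALLOWED_EXTENSIONS:
            problems.append(f"{name}: unsupported file type")
        if size > MAX_FILE_BYTES:
            problems.append(f"{name}: larger than 16 MB")
    if sum(files.values()) > MAX_REQUEST_BYTES:
        problems.append("batch: larger than 100 MB in total")
    return problems


ok = validate_upload({"redis-runbook.md": 48_000, "oncall-notes.txt": 2_000})
```

An empty list means the batch passes this local check. The portal still performs its own validation during upload.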

Upload documents

  1. Go to Settings > Knowledge Base.

  2. Select Add file or drag and drop files into the upload area.

    The portal automatically validates, uploads, and indexes files.

Manage documents

  • View: Go to Settings > Knowledge Base to see all uploaded documents.

  • Update: To overwrite the previous version, upload a file with the same name.

  • Delete: Select documents and use the delete action. Changes take effect immediately.

Session insights

As the agent handles your incidents, it learns. Session insights capture what worked, what didn't, and key learnings from each session. The agent automatically applies that knowledge to help with similar issues in the future.

Automatic improvement

The agent learns from every session without any manual effort:

  • The agent handles an issue autonomously or works with you directly.
  • The agent captures symptoms, resolution steps, root cause, and pitfalls.
  • These insights become searchable memories.
  • Future sessions automatically retrieve relevant past insights.

The result: the agent gets better over time, suggesting proven resolutions and avoiding known pitfalls.

Discover opportunities

While session insights work automatically, reviewing them can surface valuable patterns you might want to act on.

| Pattern you might discover | Potential action |
| --- | --- |
| Same issue keeps recurring | Fix the underlying code or configuration |
| Agent lacks context about your service | Create a custom subagent with domain knowledge |
| Troubleshooting steps aren't documented | Update or create a runbook |
| Telemetry gaps made diagnosis harder | Improve logging or add metrics |
| Alert triggered but wasn't actionable | Tune the alert or add runbook links |

Think of session insights as a window into what the agent learns. You might find something worth acting on, or you might just let the agent handle any surfaced issues.

How it works

Session insights create a continuous improvement loop: the agent captures symptoms, steps, root cause, and pitfalls from each session, then retrieves relevant past insights when similar issues arise. This automatic cycle helps the agent resolve problems faster over time.

Diagram of Azure SRE Agent memory system loop.

What the agent captures

The agent captures a series of data points from each session to improve future troubleshooting.

| Captured | How the agent uses it |
| --- | --- |
| Symptoms observed | Recognizes similar patterns in future problems |
| Steps that worked | Suggests proven resolution paths |
| Root cause found | Jumps to likely causes faster |
| Pitfalls encountered | Avoids repeating mistakes |
| Context you provided | Remembers facts about your environment |
| Resources involved | Connects past problems on same resources |
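
As one illustration of how captured data can connect past problems on the same resources, the following Python sketch groups hypothetical insights by resource name. The sample records and the `index_by_resource` helper are invented for this example. The agent's internal storage format isn't documented here.

```python
from collections import defaultdict

# Hypothetical past insights: (session title, resources involved, root cause).
PAST_INSIGHTS = [
    ("Latency spike", {"redis-cache-prod", "app-service-prod"}, "Cache eviction storm"),
    ("Failed deployment", {"app-service-prod"}, "Missing app setting"),
    ("Slow queries", {"sql-db-prod"}, "Missing index"),
]


def index_by_resource(insights):
    """Map each resource name to the (title, root cause) pairs that mention it."""
    index = defaultdict(list)
    for title, resources, root_cause in insights:
        for resource in sorted(resources):
            index[resource].append((title, root_cause))
    return index


index = index_by_resource(PAST_INSIGHTS)
related = index["app-service-prod"]
```

With an index like this, a new problem on app-service-prod can surface both earlier sessions and their root causes in one lookup.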

When insights are generated

The system generates insights automatically after conversations finish, or you can request them on demand.

  • Automatically: A periodic job runs approximately every 30 minutes and processes finished conversations.
  • On demand: Select Generate Session insights in the chat footer for immediate results (about 30 seconds).

Browse insights

Go to Settings > Session insights to see what the agent learned:

  • Total count in the header
  • List of insights with session title and timestamp
  • Detail view with expandable Timeline and Agent Performance sections
  • Go to Thread to revisit the original conversation

Note

While periodic manual browsing of insights can surface recurring patterns worth addressing, the agent benefits from these insights whether you review them or not.

Insight structure

Each insight includes:

  • Timeline: Chronological milestones of the troubleshooting session (up to eight)
  • Agent Performance: What went well, areas for improvement, and key learnings
  • Investigation quality score: 1-5 rating for investigation completeness
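
If you work with insights in your own tooling, a small data model can make these constraints explicit. The following Python sketch is an assumption-based illustration. The class and field names are hypothetical, and treating the score as an integer is a guess, because only the 1-5 range is stated above.

```python
from dataclasses import dataclass, field

MAX_TIMELINE_ENTRIES = 8  # "up to eight" milestones


@dataclass
class AgentPerformance:
    went_well: list[str] = field(default_factory=list)
    areas_for_improvement: list[str] = field(default_factory=list)
    key_learnings: list[str] = field(default_factory=list)


@dataclass
class SessionInsight:
    session_title: str
    timeline: list[str]               # chronological milestones
    performance: AgentPerformance
    investigation_quality_score: int  # assumed to be a whole number from 1 to 5

    def __post_init__(self) -> None:
        # Enforce the documented limits when an insight is constructed.
        if len(self.timeline) > MAX_TIMELINE_ENTRIES:
            raise ValueError("timeline holds at most eight milestones")
        if not 1 <= self.investigation_quality_score <= 5:
            raise ValueError("investigation quality score must be between 1 and 5")


insight = SessionInsight(
    session_title="Latency spike on app-service-prod",
    timeline=["Alert fired", "Checked Redis cache health", "Restarted cache node"],
    performance=AgentPerformance(key_learnings=["Check Redis first for latency issues"]),
    investigation_quality_score=4,
)
```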