Managed agent memory

Important

This feature is in Beta. Workspace admins can control access to this feature from the Previews page. See Manage Azure Databricks previews.

Managed agent memory gives your AI agents long-term memory across conversations. Azure Databricks runs the infrastructure and isolates each scope's memories, so you don't manage storage or partitioning yourself.

With managed memory, your agents can:

  • Remember user preferences, past decisions, and accumulated context across conversations.
  • Secure that knowledge with Unity Catalog governance.
  • Share it across agents and projects.
  • Improve in accuracy and efficiency over time.

Requirements

  • An Azure Databricks workspace with Unity Catalog enabled.
  • The CREATE MEMORY STORE privilege on the parent schema to create memory stores.

How managed memory works

Managed memory has two levels:

  • A memory store is a Unity Catalog securable that acts as a container for memory entries. A memory store inherits the same governance, access control, and lineage as any other Unity Catalog asset.
  • A memory entry is an individual piece of content stored inside a memory store. Each entry is identified by a scope and a path. The scope determines whose memories an entry belongs to, and the path organizes entries within a scope, similar to a file path (for example, /memories/preferences.md).
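
The two levels can be pictured with a small in-process model. This is only an illustrative sketch, not the managed service or its API: the MemoryEntry and InMemoryStore names are hypothetical, and the overwrite-on-same-key behavior is an assumption of the model rather than documented behavior.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class MemoryEntry:
    """One piece of content inside a memory store (illustrative only)."""
    scope: str             # whose memories the entry belongs to
    path: str              # organizes entries within a scope, like a file path
    contents: str          # the full memory text
    description: str = ""  # short summary of the entry


class InMemoryStore:
    """Hypothetical stand-in for a memory store: a container of entries."""

    def __init__(self, full_name: str) -> None:
        self.full_name = full_name  # for example "main.default.support_agent_memory"
        self._entries: Dict[Tuple[str, str], MemoryEntry] = {}

    def put(self, entry: MemoryEntry) -> None:
        # Entries are keyed by (scope, path). Assumption of this model:
        # writing the same key again replaces the earlier entry.
        self._entries[(entry.scope, entry.path)] = entry

    def get(self, scope: str, path: str) -> Optional[MemoryEntry]:
        return self._entries.get((scope, path))

    def paths(self, scope: str) -> List[str]:
        # Only paths that belong to the requested scope are listed.
        return sorted(p for (s, p) in self._entries if s == scope)
```

Because the key is the (scope, path) pair, two users can each have an entry at /memories/preferences.md without colliding.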

Scope

Scope is how managed memory keeps one agent's memories separate for different users or groups. Every memory entry belongs to exactly one scope, and a search returns only entries within the scope you query.

  • Personal memory: Use an end-user ID as the scope so each user gets their own private memory, such as their preferences and past decisions. Users only see their own entries.
  • Organizational knowledge: Use a shared key, such as an organization or team ID, to store knowledge that any user of the agent can draw on, such as company facts, glossaries, and best practices.

A single agent can use both at once: read from a user's personal scope and a shared organizational scope in the same conversation. The scope is required on every memory entry request.

Warning

Scope is the isolation boundary between users. Configure the scope in trusted code, and never let the model set it. The app service principal can read every scope.
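
One way to follow this guidance is to fix the scopes in application code and expose only a query parameter to the model. The sketch below assumes a hypothetical search callable that takes (scope, query) and returns a list of memory texts; make_memory_tool and SearchFn are illustrative names, not part of any Databricks SDK.

```python
from typing import Callable, List

# Hypothetical search function type: (scope, query) -> list of memory texts.
SearchFn = Callable[[str, str], List[str]]


def make_memory_tool(search: SearchFn, user_id: str, org_id: str) -> Callable[[str], List[str]]:
    """Return a recall tool that accepts a query only.

    The scopes are captured here, in trusted application code, for example
    from the authenticated session. The model can choose the query text but
    has no argument through which it could change the scope.
    """
    scopes = (user_id, org_id)  # personal scope first, then the shared scope

    def recall(query: str) -> List[str]:
        results: List[str] = []
        for scope in scopes:
            results.extend(search(scope, query))
        return results

    return recall
```

The same closure also shows an agent reading from a personal scope and a shared organizational scope in one call, while entries in any other scope are never requested.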

Get started with managed memory

The easiest way to add managed memory to an agent is the managed-memory Claude Code skill. It handles the whole setup for you. The skill works with both the OpenAI Agents SDK and LangGraph.

Get the skill into your project in one of two ways:

Start from a template

The skill ships inside the Databricks app templates. Scaffold a new agent from one of the agent templates, then find the skill under .claude/skills/managed-memory/.

  1. Clone the templates repository:

    git clone https://github.com/databricks/app-templates.git
    
  2. Browse the app-templates directory and select an agent template to start from. For example, to use the OpenAI Agents SDK template:

    cd app-templates/agent-openai-agents-sdk
    

    Note

    For "advanced" app templates, you must grant the app service principal Lakebase Postgres privileges after you deploy. Otherwise, session setup returns a 502 error.

  3. Once the skill is in your project, describe what you want and your coding assistant takes care of the rest:

    Tip

    Add Databricks managed long-term memory to my agent.
    

Add the skill to an existing project

If you already have an agent project, add the skill to it.

  1. Create the skills directory if it doesn't exist:

    mkdir -p .claude/skills/managed-memory
    
  2. Download the SKILL.md file from the managed-memory skill directory and save it to .claude/skills/managed-memory/.

  3. Once the skill is in your project, describe what you want and your coding assistant takes care of the rest:

    Tip

    Add Databricks managed long-term memory to my agent.
    

Create and use a memory store

The following example sets up managed memory for a customer support agent that stores a user's preferences and retrieves them in a later conversation.

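  1. Generate an OAuth token with the Databricks CLI to call the APIs. The curl commands in the following steps read the workspace host from DATABRICKS_HOST and this token from DATABRICKS_TOKEN: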

    databricks auth login --host ${DATABRICKS_HOST}
    databricks auth token
    
  2. Create a memory store to hold your agent's memories:

    curl -X POST "https://${DATABRICKS_HOST}/api/2.1/unity-catalog/memory-stores" \
      -H "Authorization: Bearer ${DATABRICKS_TOKEN}" \
      -H "Content-Type: application/json" \
      -d '{
        "name": "support_agent_memory",
        "catalog_name": "main",
        "schema_name": "default",
        "description": "Long-term memory for the customer support agent"
      }'
    
  3. Write a memory entry after the agent learns something about a user. The scope query parameter (user-123 in this example) assigns the entry to a single user. Use the contents field for the full memory text and the description field for a short summary that improves retrieval:

    curl -X POST \
      "https://${DATABRICKS_HOST}/api/2.1/unity-catalog/memory-stores/main.default.support_agent_memory/entries?scope=user-123" \
      -H "Authorization: Bearer ${DATABRICKS_TOKEN}" \
      -H "Content-Type: application/json" \
      -d '{
        "path": "/memories/preferences.md",
        "contents": "Prefers email communication. Timezone: PST. Has an Enterprise subscription.",
        "description": "User 123 communication preferences and account details"
      }'
    
  4. Search memory entries for that user in a later conversation to retrieve what the agent learned:

    curl -X POST \
      "https://${DATABRICKS_HOST}/api/2.1/unity-catalog/memory-stores/main.default.support_agent_memory/entries:search" \
      -H "Authorization: Bearer ${DATABRICKS_TOKEN}" \
      -H "Content-Type: application/json" \
      -d '{
        "scope": "user-123",
        "query": "communication preferences"
      }'
    

For the full REST API, including endpoints, request fields, and response fields, see Memory API reference.
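
Because there is no Python SDK for these endpoints, an agent written in Python has to issue the same HTTP calls itself. The following is a minimal sketch using only the standard library that mirrors the write and search curl commands above. The helper names (build_request, write_entry_request, search_request) are hypothetical, and the functions only build the requests without sending them.

```python
import json
import urllib.parse
import urllib.request

# Path prefix taken from the curl examples above.
API_BASE = "/api/2.1/unity-catalog/memory-stores"


def build_request(host: str, token: str, path: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) an authenticated JSON POST request."""
    return urllib.request.Request(
        url=f"https://{host}{path}",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def write_entry_request(host, token, store, scope, path, contents, description):
    """Request that writes one entry; scope travels as a query parameter."""
    query = urllib.parse.urlencode({"scope": scope})
    body = {"path": path, "contents": contents, "description": description}
    return build_request(host, token, f"{API_BASE}/{store}/entries?{query}", body)


def search_request(host, token, store, scope, query):
    """Request that searches entries; scope travels in the JSON body."""
    body = {"scope": scope, "query": query}
    return build_request(host, token, f"{API_BASE}/{store}/entries:search", body)
```

To send a request, pass it to urllib.request.urlopen and decode the JSON response. Building the request in one place also makes it easy to keep the scope value under the control of application code.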

Add memory to an agent with conversations

The REST workflow above calls the memory store and entry APIs directly. When you build an agent on an Azure Databricks model serving endpoint, connect a memory store to a conversation with the OpenAI-compatible client in the databricks-openai SDK instead.

A conversation is OpenAI-compatible conversation state (the running history of messages and tool calls), backed by a memory store and pinned to a single scope. Reuse the same conversation across requests to give the agent memory of earlier turns.

  1. Bind an existing memory store and a scope to a new conversation. memory_store.name is the three-level name of the store, and scope partitions the conversation's state, typically by end user:

    from databricks.sdk import WorkspaceClient
    from databricks_openai import DatabricksOpenAI
    
    workspace_client = WorkspaceClient()
    user_id = str(workspace_client.current_user.me().id)
    
    client = DatabricksOpenAI(workspace_client=workspace_client, use_ai_gateway=True)
    
    conversation = client.conversations.create(
        extra_body={
            "memory_store": {"name": "main.default.support_agent_memory"},
            "scope": {"kind": "user", "value": user_id},
        },
    )
    
  2. Pass the conversation ID to responses.create. The agent reads and writes the conversation's state in the bound memory store under that scope:

    response = client.responses.create(
        model="databricks-gpt-5-2",
        conversation=conversation.id,
        input=[{"type": "message", "role": "user", "content": "What is the average NYC taxi price?"}],
        stream=True,
    )
    
    for event in response:
        if event.type == "response.output_text.delta":
            print(event.delta, end="", flush=True)
    
  3. Reuse the same conversation ID on later requests so the agent remembers earlier turns. Do not create a new conversation per turn:

    followup = client.responses.create(
        model="databricks-gpt-5-2",
        conversation=conversation.id,
        input=[{"type": "message", "role": "user", "content": "Restate the average taxi price you found, and how it was calculated."}],
        stream=True,
    )
    
    for event in followup:
        if event.type == "response.output_text.delta":
            print(event.delta, end="", flush=True)
    

For the conversation endpoints and request fields, see Conversation APIs.
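
To avoid creating a new conversation on every turn, an application can keep one conversation ID per user and look it up before each request. The sketch below is one possible way to do that, not a prescribed pattern: ConversationRegistry is a hypothetical helper, and create_conversation is any callable you supply that creates a conversation for a user ID and returns its ID.

```python
from typing import Callable, Dict


class ConversationRegistry:
    """Map each user ID to one conversation ID and reuse it across turns."""

    def __init__(self, create_conversation: Callable[[str], str]) -> None:
        # Hypothetical callable: takes a user ID, creates a conversation bound
        # to that user's scope, and returns the new conversation's ID.
        self._create = create_conversation
        self._ids: Dict[str, str] = {}

    def conversation_for(self, user_id: str) -> str:
        # Create a conversation only the first time a user is seen.
        if user_id not in self._ids:
            self._ids[user_id] = self._create(user_id)
        return self._ids[user_id]
```

With the client from step 1, create_conversation could wrap client.conversations.create with the same extra_body and return the resulting conversation's id. The dictionary here lives in process memory, so a deployed agent would likely need to persist the mapping somewhere durable.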

Memory access control

Memory stores are Unity Catalog securables. The following privileges control access:

Privilege            Applies to     Description
-------------------  -------------  ------------------------------------------------------------------------
CREATE MEMORY STORE  Parent schema  Create new memory stores under a schema.
READ MEMORY STORE    Memory store   Read a memory store's metadata and its entries.
WRITE MEMORY STORE   Memory store   Create, update, and delete memory entries in a store.
MANAGE               Memory store   Update or delete the memory store itself. Grant permissions to other users.
USE SCHEMA           Parent schema  List memory stores in a schema.
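
The table can be encoded as a lookup to check, before deployment, which privileges an agent's identity still needs. This is a rough sketch: the operation names and the missing_privileges helper are hypothetical, the grants are supplied by hand rather than read from Unity Catalog, and the check does not assume that one privilege implies another.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical operation names mapped to (privilege, securable) from the table.
REQUIRED: Dict[str, Tuple[str, str]] = {
    "create_memory_store": ("CREATE MEMORY STORE", "schema"),
    "list_memory_stores": ("USE SCHEMA", "schema"),
    "read_entries": ("READ MEMORY STORE", "memory_store"),
    "write_entries": ("WRITE MEMORY STORE", "memory_store"),
    "delete_memory_store": ("MANAGE", "memory_store"),
}


def missing_privileges(operations: Iterable[str], grants: Dict[str, Set[str]]) -> List[str]:
    """Return the privileges that the given operations need but grants lack."""
    missing: List[str] = []
    for op in operations:
        privilege, securable = REQUIRED[op]
        if privilege not in grants.get(securable, set()):
            missing.append(f"{privilege} on {securable}")
    return missing
```

For example, an agent that only reads and writes entries needs READ MEMORY STORE and WRITE MEMORY STORE on the store, and does not need MANAGE.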

Implement short-term memory

The memory entry APIs provide long-term memory only. To give your agent short-term memory in a session, use one of the following:

  • Persist session state server-side with a conversation.
  • Keep your agent framework's session memory, such as the OpenAI session= parameter or a LangGraph checkpointer.
  • Use self-managed agent memory.

Limitations

  • Memory entries provide long-term memory only. For the difference between short-term and long-term memory, see Short-term and long-term memory.
  • Memory stores and entries are created and managed through the Unity Catalog REST API only; there is no Python SDK for these APIs. To use a memory store from an agent, connect it to a conversation with the OpenAI-compatible client. See Add memory to an agent with conversations.

Next steps