Integrate Unity Catalog tools with third-party generative AI frameworks

Unity Catalog AI agent tools can be used in popular gen AI libraries like LangChain, LlamaIndex, OpenAI, and Anthropic. These integrations combine Unity Catalog tool governance with the capabilities of third-party agent authoring frameworks. For example:

  • In LangChain, Unity Catalog functions can be part of an agent's workflow to perform tasks like querying or transforming data.
  • In OpenAI or Anthropic integrations, the functions are called directly by the AI model during execution.

Choose your framework from the following sections to create a Unity Catalog tool and use it with that framework. Run the code in an Azure Databricks notebook or Python script.

Requirements

  • Install Python 3.10 or above.

LangChain

Use Azure Databricks Unity Catalog to integrate SQL and Python functions as tools in LangChain and LangGraph workflows. This integration combines the governance of Unity Catalog with LangChain capabilities to build powerful LLM-based applications.

In this example, you create a Unity Catalog tool, test its functionality, and add it to an agent.

Install dependencies

Install the Unity Catalog AI package with the Databricks extra, and install the Databricks LangChain integration package.

# Install the Unity Catalog AI integration package with the Databricks extra
%pip install unitycatalog-langchain[databricks]

# Install the Databricks LangChain integration package
%pip install databricks-langchain
dbutils.library.restartPython()

Initialize the Databricks Function Client

Create the client that you use to register and run Unity Catalog functions.

from unitycatalog.ai.core.base import get_uc_function_client

client = get_uc_function_client()

Define the tool's logic

Create a Unity Catalog function containing the tool's logic.


CATALOG = "my_catalog"
SCHEMA = "my_schema"

def add_numbers(number_1: float, number_2: float) -> float:
  """
  A function that accepts two floating point numbers, adds them,
  and returns the resulting sum as a float.

  Args:
    number_1 (float): The first of the two numbers to add.
    number_2 (float): The second of the two numbers to add.

  Returns:
    float: The sum of the two input numbers.
  """
  return number_1 + number_2

function_info = client.create_python_function(
  func=add_numbers,
  catalog=CATALOG,
  schema=SCHEMA,
  replace=True
)

Test the function

Test your function to check it works as expected:

result = client.execute_function(
  function_name=f"{CATALOG}.{SCHEMA}.add_numbers",
  parameters={"number_1": 36939.0, "number_2": 8922.4}
)

result.value # OUTPUT: '45861.4'
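
The output shown above suggests that result.value comes back as a string rather than a float. The following is a minimal local sketch, assuming that behavior, of converting the value before comparing it with a number. The returned_value variable is a stand-in for result.value so that the snippet runs without a workspace.

```python
import math


def add_numbers(number_1: float, number_2: float) -> float:
    """Local copy of the function registered above."""
    return number_1 + number_2


# Stand-in for result.value, which the example above shows as the string '45861.4'
returned_value = "45861.4"

# Convert the string to a float before using it in further arithmetic
as_float = float(returned_value)
local_sum = add_numbers(36939.0, 8922.4)

# Compare with a tolerance instead of == because these are floating point values
matches = math.isclose(as_float, local_sum)
print(matches)
```

Running the plain Python function locally like this is also a quick way to check the logic before registering it in Unity Catalog.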

Wrap the function using the UCFunctionToolkit

Wrap the function using the UCFunctionToolkit to make it accessible to agent authoring libraries. The toolkit ensures consistency across different libraries and adds helpful features like auto-tracing for retrievers.

from databricks_langchain import UCFunctionToolkit

# Create a toolkit with the Unity Catalog function
func_name = f"{CATALOG}.{SCHEMA}.add_numbers"
toolkit = UCFunctionToolkit(function_names=[func_name])

tools = toolkit.tools

Use the tool in an agent

Add the tool to a LangChain agent using the tools property from UCFunctionToolkit.

To keep the example short, it authors a simple agent with LangChain's AgentExecutor API. For production workloads, use the agent authoring workflow described in Author an AI agent and deploy it on Databricks Apps.

from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain.prompts import ChatPromptTemplate
from databricks_langchain import (
  ChatDatabricks,
  UCFunctionToolkit,
)
import mlflow

# Initialize the LLM (replace with your LLM of choice, if desired)
LLM_ENDPOINT_NAME = "databricks-meta-llama-3-3-70b-instruct"
llm = ChatDatabricks(endpoint=LLM_ENDPOINT_NAME, temperature=0.1)

# Define the prompt
prompt = ChatPromptTemplate.from_messages(
  [
    (
      "system",
      "You are a helpful assistant. Make sure to use tools for additional functionality.",
    ),
    ("placeholder", "{chat_history}"),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
  ]
)

# Enable automatic tracing
mlflow.langchain.autolog()

# Define the agent, specifying the tools from the toolkit above
agent = create_tool_calling_agent(llm, tools, prompt)

# Create the agent executor
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
agent_executor.invoke({"input": "What is 36939.0 + 8922.4?"})

LlamaIndex

Use Azure Databricks Unity Catalog to integrate SQL and Python functions as tools in LlamaIndex workflows. This integration combines Unity Catalog governance with LlamaIndex's capabilities to index and query large datasets for LLMs.

  1. Install the Databricks Unity Catalog integration package for LlamaIndex.

    %pip install unitycatalog-llamaindex[databricks]
    dbutils.library.restartPython()
    
  2. Create an instance of the Unity Catalog functions client.

    from unitycatalog.ai.core.base import get_uc_function_client
    
    client = get_uc_function_client()
    
  3. Create a Unity Catalog function written in Python.

    CATALOG = "your_catalog"
    SCHEMA = "your_schema"
    
    func_name = f"{CATALOG}.{SCHEMA}.code_function"
    
    def code_function(code: str) -> str:
      """
      Runs Python code.
    
      Args:
        code (str): The Python code to run.
      Returns:
        str: The result of running the Python code.
      """
      import sys
      from io import StringIO
      stdout = StringIO()
      sys.stdout = stdout
      exec(code)
      return stdout.getvalue()
    
    client.create_python_function(
      func=code_function,
      catalog=CATALOG,
      schema=SCHEMA,
      replace=True
    )
    
  4. Create an instance of the Unity Catalog function as a toolkit, and run it to verify that the tool behaves properly.

    from unitycatalog.ai.llama_index.toolkit import UCFunctionToolkit
    import mlflow
    
    # Enable traces
    mlflow.llama_index.autolog()
    
    # Create a UCFunctionToolkit that includes the UC function
    toolkit = UCFunctionToolkit(function_names=[func_name])
    
    # Fetch the tools stored in the toolkit
    tools = toolkit.tools
    python_exec_tool = tools[0]
    
    # Run the tool directly
    result = python_exec_tool.call(code="print(1 + 1)")
    print(result)  # Outputs: {"format": "SCALAR", "value": "2\n"}
    
  5. Use the tool in a LlamaIndex ReActAgent by defining the Unity Catalog function as part of a LlamaIndex tool collection. Then verify that the agent behaves properly by calling the LlamaIndex tool collection.

    from llama_index.llms.openai import OpenAI
    from llama_index.core.agent import ReActAgent
    
    llm = OpenAI()
    
    agent = ReActAgent.from_tools(tools, llm=llm, verbose=True)
    
    agent.chat("Please run the following python code: `print(1 + 1)`")
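
The code_function tool in step 3 captures printed output by assigning a StringIO object to sys.stdout. As a local illustration of the same idea, the following sketch uses contextlib.redirect_stdout from the standard library, which restores the original stream when the block exits. The run_and_capture name is hypothetical and this is not the registered Unity Catalog function. Because exec runs arbitrary code, only pass it input you trust.

```python
import contextlib
import io


def run_and_capture(code: str) -> str:
    """Run Python code and return whatever it printed (hypothetical local helper)."""
    buffer = io.StringIO()
    # redirect_stdout puts the original sys.stdout back when the block exits,
    # even if the executed code raises an exception
    with contextlib.redirect_stdout(buffer):
        exec(code, {})
    return buffer.getvalue()


captured = run_and_capture("print(1 + 1)")
print(repr(captured))
```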
    

OpenAI

Use Azure Databricks Unity Catalog to integrate SQL and Python functions as tools in OpenAI workflows. This integration combines the governance of Unity Catalog with OpenAI to create powerful gen AI apps.

  1. Install the Databricks Unity Catalog integration package for OpenAI.

    %pip install unitycatalog-openai[databricks]
    %pip install mlflow -U
    dbutils.library.restartPython()
    
  2. Create an instance of the Unity Catalog functions client.

    from unitycatalog.ai.core.base import get_uc_function_client
    
    client = get_uc_function_client()
    
  3. Create a Unity Catalog function written in Python.

    CATALOG = "your_catalog"
    SCHEMA = "your_schema"
    
    func_name = f"{CATALOG}.{SCHEMA}.code_function"
    
    def code_function(code: str) -> str:
      """
      Runs Python code.
    
      Args:
        code (str): The Python code to run.
      Returns:
        str: The result of running the Python code.
      """
      import sys
      from io import StringIO
      stdout = StringIO()
      sys.stdout = stdout
      exec(code)
      return stdout.getvalue()
    
    client.create_python_function(
      func=code_function,
      catalog=CATALOG,
      schema=SCHEMA,
      replace=True
    )
    
  4. Create an instance of the Unity Catalog function as a toolkit and verify that the tool behaves properly by running the function.

    from unitycatalog.ai.openai.toolkit import UCFunctionToolkit
    import mlflow
    
    # Enable tracing
    mlflow.openai.autolog()
    
    # Create a UCFunctionToolkit that includes the UC function
    toolkit = UCFunctionToolkit(function_names=[func_name])
    
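    # Fetch the tools stored in the toolkit
    tools = toolkit.tools
    
    # Run the function directly to verify that it works
    result = client.execute_function(func_name, {"code": "print(1 + 1)"})
    print(result.value)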
    
  5. Submit the request to the OpenAI model along with the tools.

    import openai
    
    messages = [
      {
        "role": "system",
        "content": "You are a helpful customer support assistant. Use the supplied tools to assist the user.",
      },
      {"role": "user", "content": "What is the result of 2**10?"},
    ]
    response = openai.chat.completions.create(
      model="gpt-4o-mini",
      messages=messages,
      tools=tools,
    )
    # check the model response
    print(response)
    
  6. After OpenAI returns a response, invoke the Unity Catalog function call to generate the response answer back to OpenAI.

    import json
    
    # OpenAI sends only a single request per tool call
    tool_call = response.choices[0].message.tool_calls[0]
    # Extract arguments that the Unity Catalog function needs to run
    arguments = json.loads(tool_call.function.arguments)
    
    # Run the function based on the arguments
    result = client.execute_function(func_name, arguments)
    print(result.value)
    
  7. Once the answer has been returned, you can construct the response payload for subsequent calls to OpenAI.

    # Create a message containing the result of the function call
    function_call_result_message = {
      "role": "tool",
      "content": json.dumps({"content": result.value}),
      "tool_call_id": tool_call.id,
    }
    assistant_message = response.choices[0].message.to_dict()
    completion_payload = {
      "model": "gpt-4o-mini",
      "messages": [*messages, assistant_message, function_call_result_message],
    }
    
    # Generate final response
    openai.chat.completions.create(
      model=completion_payload["model"], messages=completion_payload["messages"]
    )
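
Steps 6 and 7 read the first tool call from the response and build a single "tool" message. The following sketch shows the same parsing logic as a loop that also covers a response with no tool calls. It runs without an OpenAI account or a workspace: fake_message stands in for response.choices[0].message, fake_execute stands in for client.execute_function, and the ID, function name, and returned value are made up for illustration.

```python
import json
from types import SimpleNamespace


def build_tool_messages(message, execute) -> list:
    """Run every requested tool call and return one "tool" message per call."""
    tool_messages = []
    # tool_calls is None or empty when the model answers without using a tool
    for tool_call in message.tool_calls or []:
        # The arguments arrive as a JSON string, as in step 6
        arguments = json.loads(tool_call.function.arguments)
        value = execute(arguments)
        tool_messages.append(
            {
                "role": "tool",
                "content": json.dumps({"content": value}),
                "tool_call_id": tool_call.id,
            }
        )
    return tool_messages


# Stand-in for response.choices[0].message (made-up ID and function name)
fake_message = SimpleNamespace(
    tool_calls=[
        SimpleNamespace(
            id="call_123",
            function=SimpleNamespace(
                name="code_function", arguments='{"code": "print(2**10)"}'
            ),
        )
    ]
)


# Stand-in for client.execute_function(func_name, arguments).value
def fake_execute(arguments: dict) -> str:
    return "1024\n"


tool_messages = build_tool_messages(fake_message, fake_execute)
print(tool_messages)
```

In a real run, each returned message would be appended after the assistant message, as in step 7.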
    

Utilities

To simplify the process of crafting the tool response, the unitycatalog-openai package has a utility, generate_tool_call_messages, that converts OpenAI ChatCompletion response messages so that they can be used for response generation.

from unitycatalog.ai.openai.utils import generate_tool_call_messages

messages = generate_tool_call_messages(response=response, client=client)
print(messages)

Note

If the response contains multiple choice entries, pass the choice_index argument to generate_tool_call_messages to select which entry to use. Processing more than one choice entry in a single call is not currently supported.

Anthropic

Use Azure Databricks Unity Catalog to integrate SQL and Python functions as tools in Anthropic SDK LLM calls. This integration combines the governance of Unity Catalog with Anthropic models to create powerful gen AI apps.

Note

The Anthropic integration requires Databricks Runtime 15.0 or above.

  1. Install the Databricks Unity Catalog integration package for Anthropic.

    %pip install unitycatalog-anthropic[databricks]
    dbutils.library.restartPython()
    
  2. Create an instance of the Unity Catalog functions client.

    from unitycatalog.ai.core.base import get_uc_function_client
    
    client = get_uc_function_client()
    
  3. Create a Unity Catalog function written in Python.

    CATALOG = "your_catalog"
    SCHEMA = "your_schema"
    
    func_name = f"{CATALOG}.{SCHEMA}.weather_function"
    
    def weather_function(location: str) -> str:
      """
      Fetches the current weather from a given location in degrees Celsius.
    
      Args:
        location (str): The location to fetch the current weather from.
      Returns:
        str: The current temperature for the location provided in Celsius.
      """
      return f"The current temperature for {location} is 24.5 celsius"
    
    client.create_python_function(
      func=weather_function,
      catalog=CATALOG,
      schema=SCHEMA,
      replace=True
    )
    
  4. Create an instance of the Unity Catalog function as a toolkit.

    from unitycatalog.ai.anthropic.toolkit import UCFunctionToolkit
    
    # Create an instance of the toolkit
    toolkit = UCFunctionToolkit(function_names=[func_name], client=client)
    
  5. Use a tool call in Anthropic.

    import anthropic
    
    # Initialize the Anthropic client with your API key
    anthropic_client = anthropic.Anthropic(api_key="YOUR_ANTHROPIC_API_KEY")
    
    # User's question
    question = [{"role": "user", "content": "What's the weather in New York City?"}]
    
    # Make the initial call to Anthropic
    response = anthropic_client.messages.create(
      model="claude-3-5-sonnet-20240620",  # Specify the model
      max_tokens=1024,  # Use 'max_tokens' instead of 'max_tokens_to_sample'
      tools=toolkit.tools,
      messages=question  # Provide the conversation history
    )
    
    # Print the response content
    print(response)
    
  6. Construct a tool response. The response from the Claude model contains a tool request metadata block if a tool needs to be called.

    from unitycatalog.ai.anthropic.utils import generate_tool_call_messages
    
    # Call the UC function and construct the required formatted response
    tool_messages = generate_tool_call_messages(
      response=response,
      client=client,
      conversation_history=question
    )
    
    # Continue the conversation with Anthropic
    tool_response = anthropic_client.messages.create(
      model="claude-3-5-sonnet-20240620",
      max_tokens=1024,
      tools=toolkit.tools,
      messages=tool_messages,
    )
    
    print(tool_response)
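
Step 5 passes the API key as a placeholder string. One common alternative, sketched here as an assumption rather than a requirement of the integration, is to read the key from an environment variable so that it does not appear in notebook source. The load_api_key helper and the DEMO_API_KEY variable are hypothetical names used only to make the snippet runnable.

```python
import os


def load_api_key(env_var: str = "ANTHROPIC_API_KEY") -> str:
    """Return the key stored in an environment variable, or fail with a clear error."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key


# Demonstration with a hypothetical variable so the snippet runs anywhere
os.environ["DEMO_API_KEY"] = "placeholder-key"
demo_key = load_api_key("DEMO_API_KEY")
print(len(demo_key))

# In step 5 this would become:
# anthropic_client = anthropic.Anthropic(api_key=load_api_key())
```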
    

The unitycatalog-anthropic package includes a message handler utility, generate_tool_call_messages, to simplify the parsing and handling of a call to the Unity Catalog function. The utility does the following:

  1. Detects tool calling requirements.
  2. Extracts tool calling information from the query.
  3. Performs the call to the Unity Catalog function.
  4. Parses the response from the Unity Catalog function.
  5. Crafts the next message in the format needed to continue the conversation with Claude.

Note

The entire conversation history must be provided in the conversation_history argument to the generate_tool_call_messages API. Claude models require the initialization of the conversation (the original user input question) and all subsequent LLM-generated responses and multi-turn tool call results.
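
To make the five steps and the conversation history requirement concrete, the following sketch walks through them with plain dictionaries. It is an illustration only, not the package's implementation. The next_messages and fake_run_tool names and the tool ID are hypothetical, and the response is modeled as a dictionary with "tool_use" and "tool_result" content blocks, whereas the SDK returns objects.

```python
def next_messages(response: dict, history: list, run_tool) -> list:
    """Build the message list for the follow-up call to the model."""
    # 1. Detect tool calling requirements
    tool_blocks = [b for b in response["content"] if b["type"] == "tool_use"]
    if not tool_blocks:
        return list(history)

    results = []
    for block in tool_blocks:
        # 2-3. Extract the tool name and input, then run the function
        value = run_tool(block["name"], block["input"])
        # 4. Wrap the function output in a tool_result block
        results.append(
            {"type": "tool_result", "tool_use_id": block["id"], "content": value}
        )

    # 5. Keep the full history, then add the assistant turn and the tool results
    return [
        *history,
        {"role": "assistant", "content": response["content"]},
        {"role": "user", "content": results},
    ]


question = [{"role": "user", "content": "What's the weather in New York City?"}]

# Stand-in for the model response from step 5 (made-up tool ID)
fake_response = {
    "stop_reason": "tool_use",
    "content": [
        {
            "type": "tool_use",
            "id": "toolu_123",
            "name": "weather_function",
            "input": {"location": "New York City"},
        }
    ],
}


# Stand-in for running the Unity Catalog function
def fake_run_tool(name: str, tool_input: dict) -> str:
    return f"The current temperature for {tool_input['location']} is 24.5 celsius"


messages = next_messages(fake_response, question, fake_run_tool)
print(messages[-1])
```

The returned list starts with the original user question, which is why the history has to be passed in along with the response.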