This tutorial shows how to build a Databricks app using the Databricks SQL Connector for Python and Streamlit. You'll learn how to develop an app that does the following:
- Reads a Unity Catalog table and displays it in a Streamlit interface.
- Edits data and writes it back to the table.
Step 1: Configure privileges
These examples assume that your app uses app authorization. Your app’s service principal must have:
- SELECT privilege on the Unity Catalog table
- MODIFY privilege on the Unity Catalog table
- CAN USE privilege on the SQL warehouse
For more information, see Unity Catalog privileges and securable objects and SQL warehouse ACLs.
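If you manage table privileges in SQL, grants like the following can be issued by a metastore admin or the table owner. This is a sketch: the helper name, principal name, and table are illustrative, and note that CAN USE on a SQL warehouse is granted through warehouse permissions, not a SQL GRANT statement.

```python
# Sketch: build the GRANT statements for the table privileges listed above.
# The principal and table names are placeholders; CAN USE on the SQL
# warehouse is set through warehouse permissions rather than SQL GRANT.
def table_grant_statements(principal: str, table: str) -> list[str]:
    return [
        f"GRANT SELECT ON TABLE {table} TO `{principal}`",
        f"GRANT MODIFY ON TABLE {table} TO `{principal}`",
    ]

for stmt in table_grant_statements("my-app-sp", "main.default.sales"):
    print(stmt)
```

A user with sufficient privileges can run these statements in a SQL editor or notebook against the workspace.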
Step 2: Install dependencies
Create a requirements.txt file and include the following packages:
databricks-sdk
databricks-sql-connector
streamlit
pandas
Step 3: Configure app execution
Create an app.yaml file to define how your app starts in Azure Databricks Apps.
command: ['streamlit', 'run', 'app.py']
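If you prefer not to prompt for connection details at runtime, app.yaml also accepts an env section for passing configuration to the app; the variable name and value here are illustrative:

```yaml
command: ['streamlit', 'run', 'app.py']
env:
  - name: 'DATABRICKS_WAREHOUSE_ID'
    value: 'your-warehouse-id'
```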
Step 4: Read a Unity Catalog table
This example code demonstrates how to read data from a Unity Catalog table and display it using Streamlit. Create an app.py file that fulfills the following objectives:
- Uses app service principal authentication.
- Prompts the user for the SQL warehouse HTTP path and Unity Catalog table name.
- Executes a SELECT * query on the specified table.
- Displays the result in a Streamlit st.dataframe.
app.py
import pandas as pd
import streamlit as st
from databricks import sql
from databricks.sdk.core import Config

cfg = Config()

# Use app service principal authentication
def get_connection(http_path):
    server_hostname = cfg.host
    if server_hostname.startswith('https://'):
        server_hostname = server_hostname.replace('https://', '')
    elif server_hostname.startswith('http://'):
        server_hostname = server_hostname.replace('http://', '')
    return sql.connect(
        server_hostname=server_hostname,
        http_path=http_path,
        credentials_provider=lambda: cfg.authenticate,
        _use_arrow_native_complex_types=False,
    )

# Read data from a Unity Catalog table and return it as a pandas DataFrame
def read_table(table_name: str, conn) -> pd.DataFrame:
    with conn.cursor() as cursor:
        cursor.execute(f"SELECT * FROM {table_name}")
        return cursor.fetchall_arrow().to_pandas()

# Use Streamlit input fields to accept user input
http_path_input = st.text_input(
    "Enter your Databricks HTTP Path:", placeholder="/sql/1.0/warehouses/xxxxxx"
)
table_name = st.text_input(
    "Specify a Unity Catalog table name:", placeholder="catalog.schema.table"
)

# Display the result in a Streamlit DataFrame
if http_path_input and table_name:
    conn = get_connection(http_path_input)
    df = read_table(table_name, conn)
    st.dataframe(df)
else:
    st.warning("Provide both the warehouse path and a table name to load data.")
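The scheme-stripping that get_connection performs on cfg.host can be factored into a small helper and checked on its own. A minimal sketch; the function name is ours:

```python
# Sketch: normalize a workspace URL to the bare hostname that
# sql.connect expects, mirroring the logic inside get_connection.
def normalize_hostname(host: str) -> str:
    for scheme in ('https://', 'http://'):
        if host.startswith(scheme):
            return host[len(scheme):]
    return host

print(normalize_hostname("https://adb-123456789.7.azuredatabricks.net"))
# adb-123456789.7.azuredatabricks.net
```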
Step 5: Edit a Unity Catalog table
This example code lets users read, edit, and write changes back to a Unity Catalog table using Streamlit's data editing features. Add the following functionality to the app.py file:
- Displays the table in an editable st.data_editor and detects unsaved changes.
- Uses INSERT OVERWRITE to write the updated data back to the table.
app.py
import pandas as pd
import streamlit as st
from databricks import sql
from databricks.sdk.core import Config
import math

cfg = Config()

# Use app service principal authentication
def get_connection(http_path):
    server_hostname = cfg.host
    if server_hostname.startswith('https://'):
        server_hostname = server_hostname.replace('https://', '')
    elif server_hostname.startswith('http://'):
        server_hostname = server_hostname.replace('http://', '')
    return sql.connect(
        server_hostname=server_hostname,
        http_path=http_path,
        credentials_provider=lambda: cfg.authenticate,
        _use_arrow_native_complex_types=False,
    )

# Read data from a Unity Catalog table and return it as a pandas DataFrame
def read_table(table_name: str, conn) -> pd.DataFrame:
    with conn.cursor() as cursor:
        cursor.execute(f"SELECT * FROM {table_name}")
        return cursor.fetchall_arrow().to_pandas()

# Format values for SQL, handling NaN/None as NULL
def format_value(val):
    if val is None or (isinstance(val, float) and math.isnan(val)):
        return 'NULL'
    return repr(val)

# Use `INSERT OVERWRITE` to update existing rows and insert new ones
def insert_overwrite_table(table_name: str, df: pd.DataFrame, conn):
    progress = st.empty()
    with conn.cursor() as cursor:
        rows = list(df.itertuples(index=False))
        values = ",".join([f"({','.join(map(format_value, row))})" for row in rows])
        with progress:
            st.info("Calling Databricks SQL...")
        cursor.execute(f"INSERT OVERWRITE {table_name} VALUES {values}")
    progress.empty()
    st.success("Changes saved")

# Use Streamlit input fields to accept user input
http_path_input = st.text_input(
    "Enter your Databricks HTTP Path:", placeholder="/sql/1.0/warehouses/xxxxxx"
)
table_name = st.text_input(
    "Specify a Unity Catalog table name:", placeholder="catalog.schema.table"
)

# Display the result in an editable Streamlit data editor
if http_path_input and table_name:
    conn = get_connection(http_path_input)
    if conn:
        st.success("✅ Connected successfully!")
        original_df = read_table(table_name, conn)
        edited_df = st.data_editor(original_df, num_rows="dynamic", hide_index=True)
        df_diff = pd.concat([original_df, edited_df]).drop_duplicates(keep=False)
        if not df_diff.empty:
            st.warning(f"⚠️ You have {len(df_diff) // 2} unsaved changes")
            if st.button("Save changes"):
                insert_overwrite_table(table_name, edited_df, conn)
                st.rerun()
else:
    st.warning("Provide both the warehouse path and a table name to load data.")
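The format_value helper and the VALUES clause construction inside insert_overwrite_table can be exercised without a warehouse connection. A sketch using a toy DataFrame of our own; note that repr-based quoting is a simplification for this tutorial, and for untrusted input you would prefer parameterized queries:

```python
import math
import pandas as pd

# Same helper as in app.py: render Python values as SQL literals,
# mapping None and NaN to SQL NULL.
def format_value(val):
    if val is None or (isinstance(val, float) and math.isnan(val)):
        return 'NULL'
    return repr(val)

# Toy data standing in for an edited Unity Catalog table.
df = pd.DataFrame({"name": ["a", None], "city": ["x", "y"]})

# Build the VALUES clause exactly as insert_overwrite_table does.
rows = list(df.itertuples(index=False))
values = ",".join(f"({','.join(map(format_value, row))})" for row in rows)
print(values)
# ('a','x'),(NULL,'y')
```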
Next steps
- Deploy the app to your workspace. See Deploy a Databricks app.
- Restrict access to the appropriate users. See Configure permissions for a Databricks app.