This tutorial shows how to build a Databricks app using the Databricks SQL Connector for Python and Streamlit. You'll learn how to develop an app that does the following:
- Reads a Unity Catalog table and displays it in a Streamlit interface.
- Edits data and writes it back to the table.
Steps:
- Step 1: Configure privileges
- Step 2: Install dependencies
- Step 3: Read a Unity Catalog table
- Step 4: Edit a Unity Catalog table
- Next steps
Step 1: Configure privileges
These examples assume that your app uses app authorization. Your app’s service principal must have:
- `SELECT` privilege on the Unity Catalog table
- `MODIFY` privilege on the Unity Catalog table
- `CAN USE` privilege on the SQL warehouse
For more information, see Unity Catalog privileges and securable objects and SQL warehouse ACLs.
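For example, you could grant the table privileges by running SQL `GRANT` statements as a principal that can manage grants on the table. The sketch below is illustrative, not part of the app: `<application-id>` is a placeholder for your app's service principal application ID, and the table and warehouse values are examples to replace with your own.

```python
# Illustrative one-time setup (not part of the app): grant the app's service
# principal SELECT and MODIFY on the table. Run as a user who can manage
# grants. <application-id> is a placeholder for the service principal's
# application ID.
from databricks import sql
from databricks.sdk.core import Config

cfg = Config()
with sql.connect(
    server_hostname=cfg.host,
    http_path="/sql/1.0/warehouses/xxxxxx",  # your SQL warehouse HTTP path
    credentials_provider=lambda: cfg.authenticate,
) as conn, conn.cursor() as cursor:
    cursor.execute("GRANT SELECT ON TABLE catalog.schema.table TO `<application-id>`")
    cursor.execute("GRANT MODIFY ON TABLE catalog.schema.table TO `<application-id>`")
```

The `CAN USE` permission on the SQL warehouse is a workspace ACL rather than a Unity Catalog grant; set it from the warehouse's Permissions page or the Permissions API instead of a SQL `GRANT`.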
Step 2: Install dependencies
Include the following packages in your `requirements.txt` file:

```text
databricks-sdk
databricks-sql-connector
streamlit
pandas
```
Note

`pandas` is only required if you're editing table data.
Step 3: Read a Unity Catalog table
This example demonstrates how to read data from a Unity Catalog table and display it using Streamlit. The app performs the following steps:
- Prompts the user for the SQL warehouse HTTP path and Unity Catalog table name.
- Establishes a cached SQL connection using the Databricks SQL Connector for Python.
- Executes a `SELECT *` query on the specified table.
- Displays the result in a Streamlit `st.dataframe`.
app.py

```python
import streamlit as st
from databricks import sql
from databricks.sdk.core import Config

cfg = Config()  # Set the DATABRICKS_HOST environment variable when running locally


@st.cache_resource  # connection is cached
def get_connection(http_path):
    return sql.connect(
        server_hostname=cfg.host,
        http_path=http_path,
        credentials_provider=lambda: cfg.authenticate,
    )


def read_table(table_name, conn):
    with conn.cursor() as cursor:
        query = f"SELECT * FROM {table_name}"
        cursor.execute(query)
        return cursor.fetchall_arrow().to_pandas()


http_path_input = st.text_input(
    "Enter your Databricks HTTP Path:", placeholder="/sql/1.0/warehouses/xxxxxx"
)
table_name = st.text_input(
    "Specify a Unity Catalog table name:", placeholder="catalog.schema.table"
)

if http_path_input and table_name:
    conn = get_connection(http_path_input)
    df = read_table(table_name, conn)
    st.dataframe(df)
```
Note
- This example uses `st.cache_resource` to cache the database connection across sessions and reruns.
- Use Streamlit input fields (`st.text_input`) to accept user input.
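If the table is large or reruns are frequent, you could also cache the query results themselves with `st.cache_data`. The sketch below is an optional extension, not part of the tutorial's app: the `ttl` value is an arbitrary assumption, and the leading underscore on `_conn` tells Streamlit to exclude the connection object from the cache key.

```python
import pandas as pd
import streamlit as st


# Optional extension (an assumption, not in the original app): cache rows per
# table name so reruns don't re-execute the same SELECT.
@st.cache_data(ttl=300)  # Arbitrary 5-minute TTL; tune to your freshness needs
def read_table_cached(table_name: str, _conn) -> pd.DataFrame:
    # The underscore prefix on _conn keeps Streamlit from hashing the connection.
    with _conn.cursor() as cursor:
        cursor.execute(f"SELECT * FROM {table_name}")
        return cursor.fetchall_arrow().to_pandas()
```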
Step 4: Edit a Unity Catalog table
This example lets users read, edit, and write back changes to a Unity Catalog table using Streamlit’s data editing features. The app performs the following steps:
- Reads the original table into a Pandas DataFrame.
- Displays the table in a Streamlit editor (`st.data_editor`).
- Detects changes between the original and edited DataFrames.
- Uses `INSERT OVERWRITE` to write the updated data back to the table.
app.py

```python
import pandas as pd
import streamlit as st
from databricks import sql
from databricks.sdk.core import Config

cfg = Config()  # Set the DATABRICKS_HOST environment variable when running locally


@st.cache_resource
def get_connection(http_path):
    return sql.connect(
        server_hostname=cfg.host,
        http_path=http_path,
        credentials_provider=lambda: cfg.authenticate,
    )


def read_table(table_name: str, conn) -> pd.DataFrame:
    with conn.cursor() as cursor:
        cursor.execute(f"SELECT * FROM {table_name}")
        return cursor.fetchall_arrow().to_pandas()


def insert_overwrite_table(table_name: str, df: pd.DataFrame, conn):
    progress = st.empty()
    with conn.cursor() as cursor:
        rows = list(df.itertuples(index=False))
        values = ",".join([f"({','.join(map(repr, row))})" for row in rows])
        with progress:
            st.info("Calling Databricks SQL...")
        cursor.execute(f"INSERT OVERWRITE {table_name} VALUES {values}")
        progress.empty()
        st.success("Changes saved")


http_path_input = st.text_input(
    "Specify the HTTP Path to your Databricks SQL Warehouse:",
    placeholder="/sql/1.0/warehouses/xxxxxx",
)
table_name = st.text_input(
    "Specify a Unity Catalog table name:", placeholder="catalog.schema.table"
)

if http_path_input and table_name:
    conn = get_connection(http_path_input)
    original_df = read_table(table_name, conn)
    edited_df = st.data_editor(original_df, num_rows="dynamic", hide_index=True)

    df_diff = pd.concat([original_df, edited_df]).drop_duplicates(keep=False)
    if not df_diff.empty:
        if st.button("Save changes"):
            insert_overwrite_table(table_name, edited_df, conn)
else:
    st.warning("Provide both the warehouse path and a table name to load data.")
```
Note
- The app determines whether updates are required by computing differences between the original and edited tables.
- Status messages give feedback during the write operation using `st.info` and `st.success`.
- This approach replaces all rows in the table. For partial updates, use a different write strategy, such as the `MERGE` sketch below.
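As one possible partial-update strategy, the following sketch merges only the changed rows back into the table. It's an illustration, not part of the tutorial: it assumes the table has a unique key column named `id` (hypothetical), and it reuses the connection and DataFrames from the app above.

```python
import pandas as pd


def merge_changed_rows(table_name: str, original_df: pd.DataFrame,
                       edited_df: pd.DataFrame, conn):
    """Sketch of a partial update: MERGE only the rows that changed.

    Assumes a unique key column named "id" (hypothetical) and that the
    edited DataFrame has the same columns as the target table.
    """
    # Rows that appear in edited_df but not in original_df are new or changed;
    # stacking original_df twice makes keep=False drop everything unchanged.
    changed = pd.concat([edited_df, original_df, original_df]).drop_duplicates(keep=False)
    if changed.empty:
        return

    cols = list(changed.columns)
    values = ",".join(
        f"({','.join(map(repr, row))})" for row in changed.itertuples(index=False)
    )
    set_clause = ", ".join(f"t.{c} = s.{c}" for c in cols if c != "id")

    with conn.cursor() as cursor:
        cursor.execute(
            f"""
            MERGE INTO {table_name} AS t
            USING (SELECT * FROM VALUES {values} AS v({','.join(cols)})) AS s
            ON t.id = s.id
            WHEN MATCHED THEN UPDATE SET {set_clause}
            WHEN NOT MATCHED THEN INSERT *
            """
        )
```

Note that this sketch doesn't handle rows deleted in the editor; a fuller implementation would also delete rows missing from the edited DataFrame, for example with a `WHEN NOT MATCHED BY SOURCE THEN DELETE` clause where supported.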
Next steps
- Deploy the app to your workspace. See Deploy a Databricks app.
- Restrict access to the appropriate users. See Configure permissions for your Databricks app.