Workspace feature store (Legacy)

Note

This documentation covers the workspace feature store. Use this page only if your workspace is not enabled for Unity Catalog.

Databricks recommends using Feature Engineering in Unity Catalog. Workspace feature store will be deprecated in the future.

Why use workspace feature store?

Workspace feature store is fully integrated with other components of Azure Databricks.

  • Discoverability. The Feature Store UI, accessible from the Databricks workspace, lets you browse and search for existing features.
  • Lineage. When you create a feature table in Azure Databricks, the data sources used to create the feature table are saved and accessible. For each feature in a feature table, you can also access the models, notebooks, jobs, and endpoints that use the feature.
  • Integration with model scoring and serving. When you use features from Feature Store to train a model, the model is packaged with feature metadata. When you use the model for batch scoring or online inference, it automatically retrieves features from Feature Store. The caller does not need to know about them or include logic to look up or join features to score new data. This makes model deployment and updates much easier.
  • Point-in-time lookups. Feature Store supports time series and event-based use cases that require point-in-time correctness.
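To illustrate what point-in-time correctness means, here is a minimal sketch in plain Python. It does not use the Feature Store client; the `feature_history` data, the `point_in_time_lookup` helper, and the integer timestamps are hypothetical and exist only for this example. For each labeled event, the lookup returns the most recent feature value recorded at or before the event time, so no value from the future leaks into training data.

```python
from bisect import bisect_right

# Hypothetical feature history: for each key, (timestamp, value) pairs
# sorted by timestamp. Real feature tables would hold this in Delta tables.
feature_history = {
    "user_1": [(1, 10.0), (5, 12.5), (9, 20.0)],
    "user_2": [(3, 7.0)],
}


def point_in_time_lookup(history, key, event_ts):
    """Return the latest value recorded at or before event_ts, or None."""
    rows = history.get(key, [])
    timestamps = [ts for ts, _ in rows]
    # Index of the first timestamp strictly greater than event_ts.
    idx = bisect_right(timestamps, event_ts)
    return rows[idx - 1][1] if idx > 0 else None


# Labeled events (key, event timestamp) to be joined with features.
events = [("user_1", 4), ("user_1", 9), ("user_2", 2)]
training_rows = [
    (key, ts, point_in_time_lookup(feature_history, key, ts))
    for key, ts in events
]
```

In this sketch, the event for `user_2` at timestamp 2 gets no feature value, because the only recorded value arrives later at timestamp 3.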

How does workspace feature store work?

The typical machine learning workflow using Feature Store follows this path:

  1. Write code to convert raw data into features and create a Spark DataFrame containing the desired features.
  2. Write the DataFrame as a feature table in the workspace feature store.
  3. Train a model using features from the feature store. When you do this, the model stores the specifications of features used for training. When the model is used for inference, it automatically joins features from the appropriate feature tables.
  4. Register the model in Model Registry.

You can now use the model to make predictions on new data. For batch use cases, the model automatically retrieves the features it needs from Feature Store.

Feature Store workflow for batch machine learning use cases.
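The idea that a model carries its own feature specifications, so the caller supplies only lookup keys, can be sketched with a toy example. This is a conceptual illustration in plain Python, not the Feature Store client API: the `PackagedModel` class, the `feature_table` dictionary, and the column names are all hypothetical.

```python
# Toy feature table keyed by customer_id (hypothetical names and values).
feature_table = {
    101: {"avg_spend": 40.0, "num_orders": 3},
    102: {"avg_spend": 15.0, "num_orders": 1},
}


class PackagedModel:
    """A model bundled with the specification of the features it uses."""

    def __init__(self, predict_fn, lookup_key, feature_names):
        self.predict_fn = predict_fn
        self.lookup_key = lookup_key
        self.feature_names = feature_names

    def score_batch(self, rows, table):
        """Join stored features onto each input row by key, then predict."""
        results = []
        for row in rows:
            features = table[row[self.lookup_key]]
            inputs = [features[name] for name in self.feature_names]
            results.append({**row, "prediction": self.predict_fn(inputs)})
        return results


model = PackagedModel(
    predict_fn=lambda x: x[0] * x[1],  # stand-in for a trained model
    lookup_key="customer_id",
    feature_names=["avg_spend", "num_orders"],
)

# The caller passes only the lookup keys; the model fetches the features.
scored = model.score_batch(
    [{"customer_id": 101}, {"customer_id": 102}], feature_table
)
```

Because the feature names and lookup key travel with the model, the scoring code needs no join logic of its own.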

For real-time serving use cases, publish the features to an online store. See Third-party online stores.

At inference time, the model reads pre-computed features from the online store and joins them with the data provided in the client request to the model serving endpoint.

Feature Store flow for machine learning models that are served.
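As a rough sketch of the serving-time join described above, the following plain-Python example merges pre-computed features with the fields sent in a request. The `online_store` dictionary, the `build_model_input` helper, and the field names are hypothetical stand-ins; a real online store would be an external database, and letting request fields win on a name collision is an arbitrary choice made for this sketch.

```python
# Hypothetical online store: primary key -> pre-computed feature values.
online_store = {
    "user_7": {"avg_trip_distance": 3.2, "trips_last_week": 5},
}


def build_model_input(request, store, key_field="user_id"):
    """Merge pre-computed features with the fields sent in the request."""
    precomputed = store.get(request[key_field], {})
    # In this sketch, request fields overwrite stored ones on a collision.
    return {**precomputed, **request}


# The client sends only the key and the real-time context.
request = {"user_id": "user_7", "pickup_hour": 18}
model_input = build_model_input(request, online_store)
```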

Start using workspace feature store

To get started, try these example notebooks. The basic notebook walks you through creating a feature table, using it to train a model, and then performing batch scoring with automatic feature lookup. It also introduces the Feature Engineering UI and shows how to use it to search for features and understand how features are created and used.

Basic Workspace Feature Store example notebook

The taxi example notebook illustrates the process of creating features, updating them, and using them for model training and batch inference.

Workspace Feature Store taxi example notebook

Supported data types

For supported data types, see Supported data types.