Is it possible to get a pandas DataFrame from the Azure Feature Store client, or can we only get a Spark DataFrame when running the job on serverless Spark compute?

Hargurjeet Singh 0 Reputation points
2024-11-29T11:56:34.6266667+00:00

I built a DSL pipeline to train ML models in AML studio. It runs on standard compute, "Standard_DS3_v2".

We also set up Azure Feature Store.

But when I fetch feature data with the code below, the to_spark_dataframe() method raises this error: "RuntimeError: Fail to get spark session, please check if spark environment is set up."

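from azureml.featurestore import FeatureStoreClient

# credential and the feature_store_* values are defined earlier in the pipeline
feature_store = FeatureStoreClient(
    credential=credential,
    subscription_id=feature_store_subscription_id,
    resource_group_name=feature_store_resource_group_name,
    name=feature_store_name,
)

# Fetch version "1" of the feature set
feature_set = feature_store.feature_sets.get("Feature_store_name", "1")

# Raises the RuntimeError above when no Spark session is available
df = feature_set.to_spark_dataframe()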

Is there a way to get data from Azure Feature Store without using a Spark DataFrame and serverless Spark compute? Can we run this on standard compute?

Azure Machine Learning

1 answer

  1. Azar 24,035 Reputation points MVP
    2024-11-29T13:03:00.41+00:00

    Hi Hargurjeet Singh,

    Thanks for using the Q&A platform.

    The error indicates that the to_spark_dataframe() method is trying to use a Spark session, but none is available in your environment, because you are using a standard compute instance instead of Spark-based compute.
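
One way to confirm this before calling to_spark_dataframe() is to check whether the job has an active Spark session. The following is a minimal sketch, assuming PySpark 3.0 or later (where SparkSession.getActiveSession() is available). The helper name has_active_spark_session is hypothetical and is not part of the Feature Store SDK.

```python
def has_active_spark_session() -> bool:
    """Return True only if PySpark is importable and a session is already active."""
    try:
        # PySpark may not be installed at all on a non-Spark image.
        from pyspark.sql import SparkSession
    except ImportError:
        return False
    # getActiveSession() returns None when no session has been started.
    return SparkSession.getActiveSession() is not None
```

If this returns False inside the pipeline step, the step is most likely not running on Spark compute, which would be consistent with the RuntimeError in the question.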

    As far as I know, you cannot directly bypass Spark and get pandas DataFrames from the Azure Feature Store API in the current setup. To use pandas, you would need to convert the Spark DataFrame to a pandas DataFrame. That conversion still needs a Spark session and collects all rows onto the driver, so it is only practical for smaller datasets. For larger data, Spark compute is a must.
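
The conversion mentioned above could look like the sketch below. It assumes the job already runs on Spark compute, so that to_spark_dataframe() succeeds. limit() and toPandas() are standard PySpark DataFrame methods, while the helper name spark_to_pandas and the max_rows default are hypothetical choices for illustration. The row cap is there because toPandas() collects every row onto the driver.

```python
def spark_to_pandas(spark_df, max_rows=100_000):
    """Convert a Spark DataFrame to pandas, refusing inputs above max_rows.

    max_rows is an arbitrary illustrative cap, not an SDK limit.
    """
    # Pull at most one row beyond the cap so oversize inputs are detected
    # without collecting the full dataset onto the driver.
    pandas_df = spark_df.limit(max_rows + 1).toPandas()
    if len(pandas_df) > max_rows:
        raise ValueError(
            f"More than {max_rows} rows; keep this data in Spark instead."
        )
    return pandas_df


# Usage on Spark compute (feature_set comes from the question's code):
# pandas_df = spark_to_pandas(feature_set.to_spark_dataframe())
```

For feature sets that exceed the cap, it is probably better to filter or aggregate in Spark first and convert only the reduced result.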

    If this helps, kindly accept the answer. Thanks much.

