Cannot convert an azureml.core.Dataset object into a PySpark DataFrame

Ameya Bhave 0 Reputation points
2024-02-28T00:09:00.1733333+00:00

I have created an Azure ML dataset and want to convert it into a PySpark DataFrame, but the conversion fails with a ParseException.

import mltable
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential
from azureml.core import Dataset, Workspace, Datastore
from pyspark.sql import SparkSession


spark = SparkSession.builder.appName("MySparkApp").config("spark.some.config.option", "some-value").getOrCreate()

datastore_name = 'datastore_name'
workspace = Workspace.from_config()  # loads an existing workspace from its config.json
datastore = Datastore.get(workspace, datastore_name)
df = Dataset.Tabular.from_parquet_files(path=(datastore,'path/file_name.parquet'))
spark_df = df.to_spark_dataframe()


When I run this, I get the following error:

AzureMLException: AzureMLException:
    Message: Execution failed unexpectedly due to: ParseException
    InnerException Syntax error at or near '['(line 1, pos 276)
    == SQL ==
    <all the column names are written here>
    -----------------^^^
    ErrorResponse
    {
        "error": {
            "message": "Execution failed unexpectedly due to: ParseException"
        }
    }

Why is this happening?
Please help
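
One thing that may be worth checking, offered as an assumption and not a confirmed diagnosis: the message points at a '[' inside the generated SQL that lists the column names, so one or more column names in the parquet file might contain brackets or other characters the SQL parser does not accept. The sketch below is plain Python and does not touch Azure ML or Spark. The helper names (find_suspect_columns, sanitize_columns) and the example column names are hypothetical, and the "safe" character set is a guess.

```python
import re

# Assumption: anything outside letters, digits and underscore is treated as
# potentially problematic. This set is a guess, not taken from the error output.
_UNSAFE = re.compile(r"[^0-9A-Za-z_]")


def find_suspect_columns(columns):
    """Return the column names that contain characters outside [0-9A-Za-z_]."""
    return [name for name in columns if _UNSAFE.search(name)]


def sanitize_columns(columns):
    """Map each original name to a copy with unsafe characters replaced by '_'."""
    return {name: _UNSAFE.sub("_", name) for name in columns}


# Hypothetical column names for illustration only; substitute the real ones.
example_columns = ["id", "price[usd]", "order date"]

suspects = find_suspect_columns(example_columns)
renames = sanitize_columns(example_columns)

print("Suspect columns:", suspects)
print("Proposed renames:", renames)
```

If the real column list does contain such names, one option would be to load a small sample with df.take(1).to_pandas_dataframe() to inspect the columns, then rename them (for example with pandas DataFrame.rename(columns=renames)) before handing the data to Spark. Whether this resolves the ParseException in this particular case is not verified.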

Azure Machine Learning