Hi @Clover J
Thanks for the question and for using the MS Q&A platform.
The `pd.read_parquet()` function does not support the wildcard character `*` in the path, which is why you are getting the `FileNotFoundError`. Instead, you can use the `spark.read.parquet()` function, which does support wildcards, to read all the files under the specified folder.
Here's the corrected code snippet:

```python
# Read every Parquet file under test/year={yyyy}/month={MM}/day={dd}/
df = spark.read.parquet('abfss://******@xxx.dfs.core.windows.net/test/*/*/*/*')
df.show()
```
This code will read all the Parquet files under the `test` folder, with the folder structure `year={yyyy}/month={MM}/day={dd}`. The `df.show()` function will display the contents of the DataFrame.
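If you prefer to stay in pandas, note that the `year=`/`month=`/`day=` layout is a Hive-style partitioned dataset, which `pyarrow` can read directly without wildcards. Below is a minimal sketch, assuming a local copy of the `test` folder (for an `abfss://` path you would additionally supply an Azure-capable filesystem such as `adlfs`):

```python
import pyarrow.dataset as ds

# Hive-style partition folders (year=.../month=.../day=...) are
# discovered automatically; year, month, and day become columns.
dataset = ds.dataset('test/', format='parquet', partitioning='hive')

# Materialize the dataset into a pandas DataFrame.
df = dataset.to_table().to_pandas()
print(df.head())
```

Similarly, `pd.read_parquet()` accepts the base directory (rather than a wildcard path) when the `pyarrow` engine is installed, and Spark will likewise discover the partitions if you point `spark.read.parquet()` at the base `test` folder instead of using wildcards.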
Hope this helps. Do let us know if you have any further queries.
If this answers your query, do click **Accept Answer** and **Yes** for "Was this answer helpful?".