databricks - save to .parquet

Question

Hello,
Using PySpark, I run a select and then save the result to .parquet.
The problem is that it saves the .parquet file along with other files such as _committed, _SUCCESS, etc.
Question:
How can I change the PySpark code so that it saves only the .parquet file and no other files?
Thanks

df = spark.sql('select * from viewName limit 100')
df.write.parquet('dbfs:/mnt/temp/foldername', mode='overwrite')

Accepted Answer

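I have the same problem.
For anybody looking for a quick fix in the meantime, I run this after writing the output with Python to pull the part file out of the output folder. It takes the first .parquet file it finds there, so it assumes the data was written as a single part file: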

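outDir = f"{path}{fileName}.parquet"  # Spark writes a directory at this path
nameFile = [x.name for x in dbutils.fs.ls(outDir) if x.name.endswith('.parquet')][0]
# Copy the part file aside, delete the directory, then move the copy to the final name
dbutils.fs.cp(f"{outDir}/{nameFile}", f"{outDir}.tmp")
dbutils.fs.rm(outDir, recurse=True)
dbutils.fs.mv(f"{outDir}.tmp", outDir)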
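
If you need this in more than one place, the same idea can be wrapped in a small helper. This is only a sketch: the names pick_part_file and flatten_parquet_dir and the ".tmp" suffix are my own, not part of any Databricks API, and the fs argument is assumed to behave like dbutils.fs (ls, cp, rm, mv). It also assumes the output directory holds exactly one part file, and raises an error otherwise instead of silently picking one.

```python
def pick_part_file(names):
    """Return the single Spark part file among the given directory entry names."""
    parts = [n for n in names if n.startswith("part-") and n.endswith(".parquet")]
    if len(parts) != 1:
        raise ValueError(f"expected exactly one part file, found {len(parts)}")
    return parts[0]


def flatten_parquet_dir(fs, out_dir):
    """Replace the directory out_dir with the one parquet file inside it.

    fs is assumed to behave like dbutils.fs (ls, cp, rm, mv).
    """
    part = pick_part_file([entry.name for entry in fs.ls(out_dir)])
    tmp = out_dir + ".tmp"  # hypothetical temporary name next to the output
    fs.cp(f"{out_dir}/{part}", tmp)   # copy the part file out of the directory
    fs.rm(out_dir, recurse=True)      # drop the directory and its marker files
    fs.mv(tmp, out_dir)               # move the copy to the original name
    return out_dir
```

In a notebook you would call it as flatten_parquet_dir(dbutils.fs, out_dir) after writing with something like df.coalesce(1).write.parquet(out_dir, mode='overwrite'), so that only one part file is produced.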
