I can't debug this. I copied it from a Databricks video, so maybe it does not transfer over?
import sys
from pyspark import SparkContext
from pyspark.sql import SparkSession
from pyspark.sql.types import *
spark = SparkSession \
    .builder \
    .appName("Validate") \
    .getOrCreate
custumSchema = StructType([
    StructField("A#", IntegerType(), True),
    StructField("FirstName", StringType(), True),
    StructField("LastName", StringType(), True),
    StructField("DOB", DateType(), True),
    StructField("Gender", StringType(), True),
    StructField("corrupt_record", StringType(), True)
])
df = spark.read\
    .format='csv' \
    .option("badRecordsPath", 'abfssXXXXXXCSV/BadCSV/*.csv')\
    .option("mode", "PERMISSIVE")\
    .options(header='true', delimiter=',') \
    .option("columnNameOfCorruptRecord", "corrupt_record") \
    .load('abfss://synapXXXXXXXCSV/*.csv', schema = custumSchema)
df.show()
Here's my error
----------
AttributeError Traceback (most recent call last)
/tmp/ipykernel_7707/2196056541.py in <module>
21
22 df = spark.read\
---> 23 .format='csv' \
24 .option("badRecordsPath", 'abfss://synapseqadatalakegen2fs@synapseqadatalakegen2.dfs.core.windows.net/DataLakehouse/CSV/BadCSV/*.csv')\
25 .option("mode", "PERMISSIVE")\
AttributeError: 'str' object has no attribute 'option'
----------
I'm stumped on this one. I have tried many variations to fix it. Any help appreciated.
Thanks, Mike
Have you tried .format("csv") instead of .format='csv'? I ran your code as posted and got the same error, and when I changed it to .format("csv") in Databricks it worked.
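As for why the error mentions a 'str' object: with the backslash continuations, Python reads the whole chain as a single assignment, roughly spark.read.format = 'csv'.option(...). The right-hand side is evaluated first, so .option is looked up on the string 'csv'. Here is a minimal sketch of that behaviour in plain Python. FakeReader is a hypothetical stand-in I made up for illustration; it is not part of PySpark and only mimics the chaining style of spark.read.

```python
class FakeReader:
    """Hypothetical stand-in for spark.read: each method returns self so calls can be chained."""

    def __init__(self):
        self.source = None
        self.options = {}

    def format(self, source):
        self.source = source
        return self

    def option(self, key, value):
        self.options[key] = value
        return self


reader = FakeReader()
error_message = None

# Buggy form: "=" turns the chain into an assignment, and the right-hand side
# 'csv'.option(...) is evaluated first, which fails on the plain string.
try:
    reader \
        .format='csv' \
        .option("mode", "PERMISSIVE")
except AttributeError as exc:
    error_message = str(exc)

print(error_message)  # 'str' object has no attribute 'option'

# Fixed form: format is called as a method, so the chain keeps returning the reader.
configured = FakeReader() \
    .format("csv") \
    .option("mode", "PERMISSIVE")

print(configured.source, configured.options)  # csv {'mode': 'PERMISSIVE'}
```

In your snippet the equivalent change is .format("csv"). It also looks as though .getOrCreate is missing its parentheses, so once the format call is fixed you may need .getOrCreate() as well, otherwise spark would be a method rather than a session.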
---------------------
If this helps please mark as correct answer.