Hi, I've added a more detailed explanation to my previous answer.
Azure Databricks offers three modes for handling malformed records when reading CSV files (a short sketch comparing them follows the list):
- PERMISSIVE (default): Inserts nulls for fields that couldn't be parsed correctly.
- DROPMALFORMED: Drops lines with fields that couldn't be parsed.
- FAILFAST: Aborts reading if any malformed data is found.
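Here is a minimal sketch comparing the three modes. It assumes a Databricks notebook where `spark` is already defined; the file path and schema are hypothetical. Note that malformation is judged against the schema, so with `inferSchema` (where most fields end up as strings) few rows will be treated as malformed:

    from pyspark.sql.types import StructType, StructField, StringType, IntegerType

    # Hypothetical schema; a value like "abc" in the id column is malformed
    schema = StructType([
        StructField("id", IntegerType(), True),
        StructField("name", StringType(), True),
    ])

    # PERMISSIVE: unparseable fields become null, all rows are kept
    permissive_df = (spark.read
        .format("csv")
        .schema(schema)
        .option("mode", "PERMISSIVE")
        .load("/tmp/example.csv"))  # hypothetical path

    # DROPMALFORMED: rows with unparseable fields are silently dropped
    dropped_df = (spark.read
        .format("csv")
        .schema(schema)
        .option("mode", "DROPMALFORMED")
        .load("/tmp/example.csv"))

    # FAILFAST: the first malformed row raises an exception at read time
    failfast_df = (spark.read
        .format("csv")
        .schema(schema)
        .option("mode", "FAILFAST")
        .load("/tmp/example.csv"))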
As described in the documentation (https://learn.microsoft.com/en-us/azure/databricks/query/formats/csv), you can set the mode and the encoding at the same time:
df = (spark.read
    .format("csv")
    .option("mode", "PERMISSIVE")  # set the mode for handling malformed records
    .option("charset", "UTF-8")    # set the encoding
    .load("path_to_your_csv_file"))
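As a follow-up, in PERMISSIVE mode Spark can also keep the raw text of malformed rows in a dedicated column via the `columnNameOfCorruptRecord` option, which is handy for auditing bad input. A hedged sketch, assuming the same Databricks context; the schema, path, and column name `_corrupt_record` (the conventional default) are illustrative:

    from pyspark.sql.types import StructType, StructField, StringType, IntegerType

    # The corrupt-record column must be declared in the schema as a string
    schema = StructType([
        StructField("id", IntegerType(), True),
        StructField("name", StringType(), True),
        StructField("_corrupt_record", StringType(), True),  # receives raw malformed lines
    ])

    df = (spark.read
        .format("csv")
        .schema(schema)
        .option("mode", "PERMISSIVE")
        .option("columnNameOfCorruptRecord", "_corrupt_record")
        .load("path_to_your_csv_file"))

    # Caching first avoids a known Spark restriction on queries that
    # reference only the corrupt-record column
    df.cache()
    bad_rows = df.filter(df["_corrupt_record"].isNotNull())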