How to handle failure points in Databricks notebook executions

Birajdar, Sujata 61 Reputation points
2021-11-18T07:13:58.427+00:00

Hi Team,

How can we handle Databricks notebook failures in a particular pipeline, and how can we restart it from the failure point?

Example: We started a pipeline with 10 JSON files and it failed while reading the 4th file. How can we re-trigger it from the 4th file's untransformed records? Also, how can we capture the Databricks notebook error logs?

Please add your comments here.

Thank you

Azure Databricks
An Apache Spark-based analytics platform optimized for Azure.

1 answer

Sort by: Most helpful
  1. KranthiPakala-MSFT 46,422 Reputation points Microsoft Employee
    2021-11-19T04:17:06.887+00:00

    Hi @Birajdar, Sujata ,

    Thanks for your query.

    You may try Structured Streaming (for example, with Auto Loader) with checkpointing enabled; on restart, the stream resumes from the last checkpoint instead of reprocessing files that already succeeded.

    Ref doc: https://learn.microsoft.com/en-us/azure/databricks/spark/latest/structured-streaming/auto-loader-gen2
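    As a rough illustration of the checkpoint idea only (not the actual Structured Streaming implementation, which does this bookkeeping internally), the sketch below processes a list of files in order, records each successfully processed file in a checkpoint file, and skips already-processed files on restart. A rerun after a failure on the 4th file therefore resumes at the 4th file. The file names and the `process` callback are hypothetical.

    ```python
    import json
    from pathlib import Path

    def run_pipeline(files, process, checkpoint_path):
        """Process files in order, skipping any already recorded in the checkpoint.

        If process() raises, the failed file is not added to the checkpoint,
        so a rerun resumes from the file that failed (mimicking how a
        streaming checkpoint avoids reprocessing completed input).
        """
        cp = Path(checkpoint_path)
        done = set(json.loads(cp.read_text())) if cp.exists() else set()
        for f in files:
            if f in done:
                continue          # already processed in a previous run
            process(f)            # may raise; checkpoint not updated for this file
            done.add(f)
            cp.write_text(json.dumps(sorted(done)))  # persist progress after each file
    ```

    In a real workload, Auto Loader (the `cloudFiles` source) plus the `checkpointLocation` option handles this tracking for you with exactly-once guarantees, per the doc linked above.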

    Hope this info helps.
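    Regarding capturing the notebook error logs: one common pattern (a generic sketch, not a Databricks-specific API) is to wrap each step of the notebook's driver logic in try/except, log the full traceback, and re-raise so the orchestrating pipeline still sees the run as failed:

    ```python
    import logging
    import traceback

    logging.basicConfig(level=logging.INFO)
    log = logging.getLogger("pipeline")

    def run_with_error_capture(step, *args):
        """Run one pipeline step; on failure, log the full traceback and re-raise.

        Re-raising keeps the notebook run marked as failed, so the calling
        pipeline (e.g., an ADF Databricks Notebook activity) can alert or retry.
        """
        try:
            return step(*args)
        except Exception:
            log.error("Step %s failed:\n%s", step.__name__, traceback.format_exc())
            raise
    ```

    In a Databricks notebook, output logged this way appears in the driver logs; you could also write it to a file on DBFS or return a short status via `dbutils.notebook.exit` (Databricks-specific, mentioned here only as a pointer).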
