Generate new rows based on date series in databricks delta table

NIKHIL KUMAR 101 Reputation points
2022-10-20T18:52:05.013+00:00

Have a table which has below sample data as

Table A:
Id, start date, end date
1, 2022-10-10, 2022-10-12

I want a new column with individual date of the series as below in the output as below:

Id, start date, end date, NewDerivedCol
1, 2022-10-10, 2022-10-12, 2022-10-10
1,2022-10-10,2022-10-12, 2022-10-11
1,2022-10-10,2022-10-12, 2022-10-12

Azure Databricks
Azure Databricks
An Apache Spark-based analytics platform optimized for Azure.
2,311 questions
0 comments No comments
{count} votes

Accepted answer
  1. ShaktiSingh-MSFT 16,171 Reputation points
    2022-10-21T14:03:41.603+00:00

    Hi @NIKHIL KUMAR ,

    Thanks for posting this question in Microsoft Q&A platform and for using Azure Services.

    Regarding the above Input dataset and required Output Dataset with New Column as Date Series, we can use below code in Databricks using Pyspark SQL functions:

    from pyspark.sql import functions as F  
    df = df \  
      .withColumn('StartDate', F.col('StartDate').cast('date')) \  
      .withColumn('EndDate', F.col('EndDate').cast('date'))  
      
    display(df)  
      
    df.withColumn('NewDerivedCol', F.explode(F.expr('sequence(StartDate, EndDate, interval 1 day)')))\  
      .show()  
    

    253033-image.png

    Here we have used Sequence function which generates an array of elements from start to stop (inclusive), incrementing by step. The type of the returned elements is the same as the type of argument expressions.
    Supported types are byte, short, integer, long, date, timestamp.
    It is used to create an array containing all dates between StartDate and EndDate.

    This array can then be exploded using explode function which returns a new row for each element in the given array.

    Reference Links: pyspark.sql.functions.explode.html
    pyspark.sql.functions.sequence.html

    Hope this will help. Please let us know if any further queries.

    ------------------------------

    • Please don't forget to click on 130616-image.png or upvote 130671-image.png button whenever the information provided helps you.
      Original posters help the community find answers faster by identifying the correct answer. Here is how
    • Want a reminder to come back and check responses? Here is how to subscribe to a notification

0 additional answers

Sort by: Most helpful

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.