How to maintain a log table for Data Lake file data count

Mahesh Madhusanka 151 Reputation points
2020-09-08T12:36:11.727+00:00

Hi Team,

Currently we have a Data Factory pipeline that copies specific data from a database and appends it to CSV files in the data lake. We now have a requirement to maintain a log table for the data synchronized from a third-party database to the Data Lake CSV files via the ADF pipeline. How can we do that? Is there a feature in Data Factory for this, or what methods are available? I want to maintain the data row count, etc.

Could you please guide and support us on this?

Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

2 answers

  1. Mahesh Madhusanka 151 Reputation points
    2020-09-09T03:47:06.393+00:00

    Hi @HimanshuSinha-msft and Team,

    Yes, I want to maintain a table that includes the row count from the pipeline run and the total row count of the relevant destination Data Lake CSV file. Could you please help me get those details, and could you please show how to do this at the visual connector level?

    Thanks
    Mahesh


  2. HimanshuSinha-msft 17,566 Reputation points Microsoft Employee
    2020-09-09T23:35:00.577+00:00

    Hello,
    I am sharing an implementation below; I am assuming that when you mention data lake, you mean Data Lake Storage. First, some background on the implementation: whenever the Copy activity completes, it exposes some properties regarding the rows copied, and we will use those. Note that there is no ADF function that tells you the number of rows in a file. We could use Azure Databricks, etc., to get that, but that is a different implementation.

    How to get the row count

    Since you mentioned that you are copying the data from SQL to CSV, you can use the Copy activity's rowsCopied output property directly.

    **How to get the row count on the destination file**
    We copy the destination file to a dummy file; once we do that, we can read the same output properties and then delete the dummy file.

    We take the row count in both cases and insert the values into a table.

    Dynamic expression used: @{int(activity('COPY the main file to dummy file').output.rowsCopied)}
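    To write both counts into the log table, a Stored Procedure activity can run after the two Copy activities and pass the counts as parameters. A sketch of the parameter expressions, assuming the first Copy activity is named 'Copy SQL to csv' (adjust the activity names to match your pipeline):

        RowsReadfromSmallerFile: @{int(activity('Copy SQL to csv').output.rowsCopied)}
        RowsReadfromBiggerFile:  @{int(activity('COPY the main file to dummy file').output.rowsCopied)}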

    SQL Script


    CREATE TABLE Logging
    (
        RowsReadfromSmallerFile int
        ,RowsReadfromBiggerFile int
        ,InsertedTime datetime DEFAULT GETDATE()
    )
    GO
    CREATE PROCEDURE Ins_Logging
        @RowsReadfromSmallerFile int
        ,@RowsReadfromBiggerFile int
    AS
    INSERT INTO Logging (RowsReadfromSmallerFile, RowsReadfromBiggerFile)
    VALUES (@RowsReadfromSmallerFile, @RowsReadfromBiggerFile)
    GO
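    To verify the logging objects before wiring them into the pipeline, you can call the procedure manually with sample values (the numbers below are placeholders, not real counts):

        EXEC Ins_Logging @RowsReadfromSmallerFile = 120, @RowsReadfromBiggerFile = 120
        SELECT * FROM Logging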

    23613-loggingissue.gif

    Thanks Himanshu

    Please consider clicking "Accept Answer" and "Up-vote" on the post that helped you, as it can be beneficial to other community members.
