Copy data from Oracle to Azure SQL Server

Keerthi Reddy Jagapathi 20 Reputation points
2024-12-13T01:08:48.56+00:00

I have a use case where I need to copy data from an on-premises Oracle database to Azure SQL Server. I need to upsert the data into the destination table along with a timestamp column that indicates when a record was inserted or last updated in the destination table. The timestamp is not available in the source table.

Can we do this without a staging table? Data flows are not an option, as Oracle is not supported as a data flow source.

Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

2 answers

  1. Nandan Hegde 36,146 Reputation points MVP Volunteer Moderator
    2024-12-13T02:34:53.9733333+00:00

Even though Oracle is not supported in data flows, you can copy the data into a blob file and use that file as the source for your data flow transformations.

If you don't want to use a data flow, you can also use the Copy activity's upsert feature to upsert data from the source into the destination (note that a temporary staging table is automatically created in the backend).
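For reference, the sink side of a Copy activity with upsert enabled looks roughly like this (a sketch of the Azure SQL sink settings; `ID` is a placeholder for your table's key column):

```json
"sink": {
    "type": "AzureSqlSink",
    "writeBehavior": "upsert",
    "upsertSettings": {
        "useTempDB": true,
        "keys": [ "ID" ]
    }
}
```

With `useTempDB` set to true, the auto-created interim table lives in the database's tempDB and is cleaned up when the copy finishes.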


  2. Chandra Boorla 14,510 Reputation points Microsoft External Staff Moderator
    2024-12-16T12:20:49.39+00:00

    Hi @Keerthi Reddy Jagapathi

    Greetings & Welcome to Microsoft Q&A forum! Thanks for posting your query!

    Is the temp backend table deleted automatically?

    Yes, the temporary backend table created during the upsert operation in Azure Data Factory’s Copy Activity is managed by ADF and is deleted automatically after the process completes. This ensures that you don’t have to manually handle the cleanup of temporary tables.

I want to have a timestamp column in the destination table that indicates the date when the data was loaded. It should change if a new record is inserted or an existing record is updated; otherwise, it should remain unchanged. Note - there is no timestamp column coming from the source. How can I implement this without staging the source data?

    Here are the steps to effectively maintain a timestamp in the destination table during data loading, without staging the source data.

Upsert Behavior in ADF Copy Activity - By configuring the Copy Activity with "Upsert" write behavior, you ensure that records are inserted or updated in the destination table based on the unique key you specify. This eliminates the need to stage the source data yourself.
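Conceptually, the upsert behaves like a MERGE keyed on the unique column. A sketch, assuming ID is the key and Col1/Col2 are placeholder data columns (the interim table name is illustrative; ADF generates its own):

```sql
MERGE DestinationTable AS target
USING #InterimTable AS source   -- placeholder for ADF's auto-created interim table
    ON target.ID = source.ID
WHEN MATCHED THEN
    UPDATE SET target.Col1 = source.Col1,
               target.Col2 = source.Col2
WHEN NOT MATCHED THEN
    INSERT (ID, Col1, Col2)
    VALUES (source.ID, source.Col1, source.Col2);
```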

    Adding a Timestamp Column to the Destination Table - Adding a LastUpdatedTimestamp column to the destination table is essential. This column will store the timestamp whenever a new record is inserted or an existing record is updated.

    Example SQL:

    ALTER TABLE DestinationTable
    ADD LastUpdatedTimestamp DATETIME NULL;

SQL Trigger for Timestamp Management - A trigger on the destination table updates the LastUpdatedTimestamp whenever a record is inserted or updated, and leaves it unchanged for rows that aren't touched.

    Example SQL for trigger:

    CREATE OR ALTER TRIGGER trg_SetTimestamp
    ON DestinationTable
    AFTER INSERT, UPDATE
    AS
    BEGIN
        SET NOCOUNT ON;
        -- Stamp the affected rows with the current load time.
        -- With RECURSIVE_TRIGGERS OFF (the database default), this
        -- UPDATE does not re-fire the trigger.
        UPDATE d
        SET LastUpdatedTimestamp = GETDATE()
        FROM DestinationTable AS d
        INNER JOIN inserted AS i ON d.ID = i.ID;
    END;

    (Replace ID with the primary key or unique identifier column of your table.)

    How it Works -

    • New Records - When new records are inserted, the trigger sets the LastUpdatedTimestamp to the current date and time.
    • Updated Records - The trigger updates the timestamp for any records that are updated.
    • Unchanged Records - Rows in the destination that are not part of the load keep their existing timestamp. One caveat: the ADF upsert updates every matched source row regardless of whether its values actually changed, so the trigger fires (and the timestamp refreshes) for every row present in the source batch. If the timestamp must change only when column values differ, compare the inserted and deleted pseudo-tables inside the trigger.
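You can verify the trigger's behavior with a quick test (table and column names are illustrative and assume the schema sketched above):

```sql
-- Insert a new row: the trigger stamps LastUpdatedTimestamp
INSERT INTO DestinationTable (ID, Col1) VALUES (1, 'first load');

-- Update the row: the trigger refreshes the timestamp
UPDATE DestinationTable SET Col1 = 'second load' WHERE ID = 1;

-- Rows not touched by an insert/update keep their old timestamp
SELECT ID, Col1, LastUpdatedTimestamp FROM DestinationTable;
```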

    Additional Considerations:

    • Indexing - If the LastUpdatedTimestamp column will be frequently queried, it’s a good idea to index it for better performance.
    • Concurrency - Ensure that the SQL trigger handles high concurrency efficiently in your environment.
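For the indexing point, a minimal sketch (nonclustered index; the name is illustrative):

```sql
CREATE NONCLUSTERED INDEX IX_DestinationTable_LastUpdatedTimestamp
ON DestinationTable (LastUpdatedTimestamp);
```

This helps queries that filter on the load date, e.g. incremental downstream extracts that select rows changed since the last run.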

    I hope this information helps. Please do let us know if you have any further questions.

    Thank you.

