Hello @bk ,
Thanks for the being so detailed in the ask , believe me that helps .
So in your case the sink is flat files and the source is SQLMI . This is what I was thinking , do you
really need to have the new columns ? if not you can use the tabular source , please read about that here
(https://learn.microsoft.com/en-us/azure/data-factory/copy-activity-schema-and-type-mapping#tabular-source-to-tabular-sink)
Now let go ahead with the second possiblity where you need to append new columns to the SQL . As far I know
you can use the autocreate table option in data flow to create the staging table and then you can use the Stored proc
to read the Staging table and Alter the actual table .
let me know if this helps .
Updated on 11/17/2020
After talking to few members in the team we are sure that the schema drft(supported) in the source does not mean the we will have the schema drift on the sink side(not supported)
. But then I did tried the below work around in ADF and as per the testing it seems to work . I assume that the file is like below today date
11,22,33,44,55,66,77,88
and tmmrw it will be
11,22,33,44,55,66,77,88,999,10000,11
21,22,23,24,25,26,27,28,299,20000,21
Here is the logic
- Copy the data to an intermediate table with the option with "Auto create table" .
- The second copy will copy the data from intermediate table to the main table , but we will use the pre-copy script option to columns
check if the schema to the intermediate and final table is not same , if it is not we will add the missing columns .
Here is below piece of SQL scripts ( in my example the name of the final table is foofinalTable and intermediate table is fooSourceTable.
DECLARE @SQlCOmmand table ( id int identity , command varchar(100) )
INSERT INTO @SQlCOmmand ( command)
SELECT 'ALTER TABLE ' + Table_Schema +'.foofinalTable ADD ' +COLUMN_NAME + ' nvarchar(100)' as command
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = N'fooSourceTable'
and COLUMN_NAME not in (
SELECT COLUMN_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = N'foofinalTable')
DECLARE @I INT
DECLARE @CMD VARCHAR(1000)
SELECT @I = 1
DECLARE @Max INT
SELECT @Max = MAX(ID) FROM @SQLCOMMAND
WHILE @I <= @Max BEGIN
SELECT @CMD = COMMAND FROM @SQLCOMMAND WHERE ID = @I
EXEC (@CMD )
SET @I = @I + 1
END
Thanks Himanshu
Please do consider to click on "Accept Answer" and "Upvote" on the post that helps you, as it can be beneficial to other community members.