Data factory copy with partition

sakuraime 2,316 Reputation points
2021-04-30T15:03:10.383+00:00

I have some questions about the Data Factory copy activity. I would like to write an Azure Synapse table to Parquet.
In the source I write a query, and I can choose 'Dynamic range'. The table has a date column. If I would like to make
each year a partition, for example 2010 to 2021,
how should I write the 'Partition upper bound' and 'Partition lower bound'?

92986-image.png

For the sink, how do I put each partition in a separate folder?

/table/yyyy/*parquet.......

Does the Oracle data source also support Dynamic range?

Azure Data Factory

1 answer

  1. MartinJaffer-MSFT 26,011 Reputation points
    2021-04-30T21:06:23.587+00:00

    Hello @sakuraime and welcome back.

    If you want to make each year a separate partition / file, I think you would have an easier time using the Data Flow sink with the Partition type set to Key (see the image below).

    The partition bounds in the copy activity do not work that way. The Dynamic range partition option combines the Degree of copy parallelism in Settings with the Partition options in ways that are not obvious.

    93043-image.png

    The partition bounds seem to be used to divide the values of the column into ranges, not to filter rows.
    This means that if I have Year populated with values from 2000 to 2015,
    and I choose a lower bound of 2003 and an upper bound of 2006,
    there will be one file for everything below 2003, one file for everything above 2006, and the years between 2003 and 2006 will be divided up into ranges like 2003-2004, 2004-2005, 2005-2006.
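
To make the example above concrete, here is a minimal Python sketch of the splitting as described in this answer. It is not the copy activity's actual algorithm: the function name `sketch_ranges` and the fixed `stride` of 1 are assumptions for illustration, and the real service appears to derive the range width from the bounds together with the Degree of copy parallelism.

```python
from typing import List, Optional, Tuple

# (start, end) pair; None marks an open-ended side of a range.
Range = Tuple[Optional[int], Optional[int]]


def sketch_ranges(lower: int, upper: int, stride: int = 1) -> List[Range]:
    """Hypothetical helper: split a column's values the way the answer describes.

    One open range below `lower`, fixed-width ranges between the bounds,
    and one open range above `upper`. Rows outside the bounds are still
    copied; they just land in the two open-ended ranges.
    """
    if lower >= upper or stride <= 0:
        raise ValueError("need lower < upper and a positive stride")

    ranges: List[Range] = [(None, lower)]  # everything below the lower bound
    start = lower
    while start < upper:
        end = min(start + stride, upper)
        ranges.append((start, end))
        start = end
    ranges.append((upper, None))  # everything above the upper bound
    return ranges


ranges = sketch_ranges(2003, 2006)
for r in ranges:
    print(r)
```

With bounds 2003 and 2006 this yields five ranges, so years such as 2000 to 2002 share one file instead of each getting its own.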

    This doesn't lend itself to what you want, so I suggest the Data Flow partition.
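
For contrast, key partitioning creates one partition per distinct value of the chosen column. The Python sketch below only illustrates that idea with the /table/yyyy layout from the question. It is not Data Flow code: the function `group_by_year`, the `order_date` field, and the sample rows are all hypothetical.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, List


def group_by_year(rows: Iterable[dict], root: str = "/table") -> Dict[str, List[dict]]:
    """Hypothetical helper: bucket rows by the year of their date column.

    Each distinct year becomes its own folder path, which mirrors the
    idea of one partition per key value.
    """
    folders: Dict[str, List[dict]] = defaultdict(list)
    for row in rows:
        year = date.fromisoformat(row["order_date"]).year
        folders[f"{root}/{year}"].append(row)
    return dict(folders)


# Made-up sample rows, for illustration only.
sample_rows = [
    {"id": 1, "order_date": "2010-05-01"},
    {"id": 2, "order_date": "2010-11-23"},
    {"id": 3, "order_date": "2021-02-14"},
]

folders = group_by_year(sample_rows)
for path, members in sorted(folders.items()):
    print(path, [m["id"] for m in members])
```

Here every year present in the data gets exactly one folder, regardless of any bounds or parallelism setting.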
