Load data from SFTP to ADLS in Azure Synapse using a data flow activity?

Anonymous
2022-07-15T01:09:47.963+00:00

Hi all,

I am loading data from an SFTP server to ADLS using a data flow activity. I chose a data flow because I also want to move the files from my SFTP source folder to an archive folder, and I have multiple CSV files at the source. Moving the files from the source folder to the archive folder works, but I have two problems when loading the files into ADLS.

First, the file names do not come through correctly in ADLS. For example, they look like this: part-00000-4bdb94ba-0994-4232-9a03-036e115d471b-c000.csv

![220943-image.png](/api/attachments/220943-image.png?platform=QnA)

Second, the same file is created twice in ADLS.

Kindly help! Thanks in advance.

Tags: Azure Monitor, Azure Data Lake Storage, Azure Synapse Analytics, Azure Databricks, Azure Data Factory

1 answer

  1. PRADEEPCHEEKATLA-MSFT 73,811 Reputation points Microsoft Employee
    2022-07-18T09:11:51.433+00:00

    Hello anonymous user,

    Thanks for the question and for using the MS Q&A platform.

    To write the output to a single file, set the partition option to Single partition under the Optimize section:

    Note: Single partition combines all the distributed data into a single partition. This is a very slow operation that also significantly affects all downstream transformation and writes. This option is strongly discouraged unless there is an explicit business reason to use it.

    [Image: 221799-image.png]
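
The part-00000-... names in the question are the default names Spark gives to each partition's output file, which is why more partitions means more files. As a rough illustration only of what "one file instead of one file per partition" means (this is not how the data flow does it internally), here is a minimal Python sketch that merges per-partition CSV files into a single named file. The function name `merge_part_files` and the target name `customers.csv` are hypothetical, and the sketch assumes every part file starts with the same header row.

```python
import csv
from pathlib import Path


def merge_part_files(folder: Path, target_name: str) -> Path:
    """Combine every part-*.csv in `folder` into one CSV named `target_name`.

    Assumption: each part file begins with the same header row, so the
    header is written once and skipped in the remaining files.
    """
    # Collect the inputs first so the target file is never read as an input.
    part_files = sorted(folder.glob("part-*.csv"))
    target = folder / target_name

    with target.open("w", newline="") as out:
        writer = csv.writer(out)
        header_written = False
        for part in part_files:
            with part.open(newline="") as src:
                reader = csv.reader(src)
                header = next(reader, None)
                if header is None:
                    continue  # skip empty part files
                if not header_written:
                    writer.writerow(header)
                    header_written = True
                writer.writerows(reader)  # remaining data rows
    return target


if __name__ == "__main__":
    # Hypothetical local folder holding downloaded part files.
    merged = merge_part_files(Path("."), "customers.csv")
    print(f"Wrote {merged}")
```

Merging after the fact keeps the write itself parallel, whereas Single partition moves that cost into the data flow, which is the trade-off the note above warns about.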

    For more details, refer to the Mapping data flows performance and tuning guide.

    Hope this helps. Please let us know if you have any further queries.
