CopyData in ForEach

Zhuoqiong Mo 40 Reputation points
2024-06-27T09:48:48.74+00:00

Dear all,

I want to use the Copy Data activity in my pipeline.

I have a CSV file in my storage account that lists the values I want to pass into the Copy activity. I created a Lookup activity first, then added a ForEach activity and put the Copy Data activity inside it. However, I am struggling to pass the output of the Lookup into the Copy Data activity. Could anyone help me with this?

  • The Lookup activity runs successfully, and its output is:
{ "count": 2, "value": [ { "TempoKey": "apple" }, { "TempoKey": "banana" }]}

So, in the settings of the ForEach activity, I set Items to:

@activity('Get ProjectKey List').output.value

Then, in the Copy Data activity, the source uses an integration dataset whose linked service is an HTTP connection. It is an API connector that requires an API token and a project key. I made the project key and the token dataset parameters, and the project key should be passed in from the Lookup activity (apple and banana).

So, as I understand it, I could pass the value like this:

[image: dynamic content setting for the parameter]

When I try to preview the data and run the pipeline, it returns this error:

"message": "ErrorCode=ParquetJavaInvocationException,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=An error occurred when invoking java, message: org.apache.parquet.schema.InvalidSchemaException:Cannot write a schema with an empty group: message adms_schema {\n}\n\ntotal entry:11\r\norg.apache.parquet.schema.TypeUtil$1.visit(TypeUtil.java:27)\r\norg.apache.parquet.schema.TypeUtil$1.visit(TypeUtil.java:37)\r\norg.apache.parquet.schema.MessageType.accept(MessageType.java:55)\r\norg.apache.parquet.schema.TypeUtil.checkValidWriteSchema(TypeUtil.java:23)\r\norg.apache.parquet.hadoop.ParquetFileWriter.<init>(ParquetFileWriter.java:233)\r\norg.apache.parquet.hadoop.ParquetWriter.<init>(ParquetWriter.java:280)\r\norg.apache.parquet.hadoop.ParquetWriter.<init>(ParquetWriter.java:227)\r\norg.apache.parquet.hadoop.ParquetWriter.<init>(ParquetWriter.java:192)\r\ncom.microsoft.datatransfer.bridge.parquet.ParquetWriterBuilderBridge.build(ParquetWriterBuilderBridge.java:175)\r\ncom.microsoft.datatransfer.bridge.parquet.ParquetWriterBridge.open(ParquetWriterBridge.java:13)\r\ncom.microsoft.datatransfer.bridge.parquet.ParquetFileBridge.createWriter(ParquetFileBridge.java:27)\r\n.,Source=Microsoft.DataTransfer.Richfile.ParquetTransferPlugin,''Type=Microsoft.DataTransfer.Richfile.JniExt.JavaBridgeException,Message=,Source=Microsoft.DataTransfer.Richfile.HiveOrcBridge,'", "failureType": "UserError", "target": "Copy Jira", "details": [] }

As I understand it, this means I am passing an empty result to Parquet. Therefore, I think the issue is with the dynamic property: it does not pass apple or banana through.

Here is my integration dataset setting:

[image: integration dataset settings]

Could anyone help me with that?

Many thanks in advance,

Joan

Azure Synapse Analytics
Azure Data Factory

2 answers

  1. Sina Salam 10,261 Reputation points
    2024-06-27T14:33:06.2133333+00:00

    Hello Zhuoqiong Mo,

    Welcome to the Microsoft Q&A and thank you for posting your questions here.

    Problem

    I understand that you are having an issue with your Copy Data activity in Azure Data Factory.

    Solution

    After reviewing the information you provided, I see that the error message points to Parquet schema validation: the writer was handed a schema with no fields (message adms_schema {}). This suggests that the schema is empty or incorrectly defined because the dynamic property you use to pass the TempoKey values to the Copy Data activity is not resolving as expected.

    Since you have already configured the dynamic property in your integration dataset settings, please follow these steps to make sure the TempoKey values are passed correctly to the Copy Data activity:

    • Inside your ForEach activity, make sure the dynamic content for the project key dataset parameter on the Copy Data activity is set to @item().TempoKey. This passes each value from the Lookup output into the Copy Data activity, one per iteration.
    • In the Copy Data activity, check the mapping settings between the source and sink datasets. In your pipeline the source is the HTTP (API connector) dataset and the sink is Parquet; the CSV file is read only by the Lookup activity. Verify that the project key parameter is used in the correct place in the API request.
    • Before running the pipeline, use the data preview feature with a sample value (for example, apple) to verify that the source returns data and that it looks as expected.
    • Since you are encountering a Parquet-related error, double-check the format settings of the sink dataset. If you are writing to a Parquet file, make sure the source actually returns columns and that the schema matches the expected structure.
    • Finally, make sure that the integration runtime (IR) associated with both the source and sink data stores is correctly configured and accessible.
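
    As a rough illustration of the first step, the pipeline JSON could look something like the fragment below. This is only a sketch under assumptions: the activity names "Get ProjectKey List" and "Copy Jira" come from your post, but the dataset names HttpApiDataset and ParquetSinkDataset and the parameter name ProjectKey are hypothetical placeholders for whatever you defined, and the token parameter and the Copy activity's own source and sink settings are left out for brevity.

    ```json
    {
      "name": "ForEach ProjectKey",
      "description": "Sketch only: dataset names and the ProjectKey parameter name are assumptions, not taken from the original pipeline.",
      "type": "ForEach",
      "dependsOn": [
        {
          "activity": "Get ProjectKey List",
          "dependencyConditions": ["Succeeded"]
        }
      ],
      "typeProperties": {
        "items": {
          "value": "@activity('Get ProjectKey List').output.value",
          "type": "Expression"
        },
        "activities": [
          {
            "name": "Copy Jira",
            "type": "Copy",
            "inputs": [
              {
                "referenceName": "HttpApiDataset",
                "type": "DatasetReference",
                "parameters": {
                  "ProjectKey": {
                    "value": "@item().TempoKey",
                    "type": "Expression"
                  }
                }
              }
            ],
            "outputs": [
              {
                "referenceName": "ParquetSinkDataset",
                "type": "DatasetReference"
              }
            ]
          }
        ]
      }
    }
    ```

    Inside the dataset itself, the parameter would then be referenced with @dataset().ProjectKey (again assuming that parameter name), for example in the relative URL. Note that @item() is only available inside the ForEach activity, so it cannot be resolved when you preview data from the dataset editor; the preview prompts you for a parameter value instead.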

    I hope this is helpful! Do not hesitate to let me know if you have any other questions.


    Best Regards,

    Sina Salam


  2. Zhuoqiong Mo 40 Reputation points
    2024-07-01T14:27:01.7666667+00:00

    I fixed the problem myself.

