Calling a Snowflake stored procedure from a Synapse pipeline Script activity returns an EOF error

Greg Graham 0 Reputation points
2024-12-06T20:25:56.04+00:00

Getting the following error message when calling a Snowflake stored procedure from a Synapse pipeline Script activity. Note: if I change the stored procedure to return 500 rows instead of 1000 rows, everything works fine. Any suggestions or workarounds?

{ "errorCode": "10001", "message": "Failed to execute script. Exception: ''Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Failed to execute the query command during read operation.,Source=Microsoft.DataTransfer.Execution.ScriptActivityExecutor,''Type=Apache.Arrow.Adbc.AdbcException,Message=[Snowflake] arrow/ipc: could not read message schema: arrow/ipc: could not read message metadata: unexpected EOF,Source=Apache.Arrow.Adbc,''", "failureType": "UserError", "target": "Script1", "details": [] }

Here is the Snowflake SP:

create or replace procedure ods.test.return_rows_4_strings()
  returns table (a varchar(26), b varchar(26), c varchar(26), d varchar(26))
  language sql
as $$
declare
  retval resultset default (select 'abcdefghijklmnopqrstuvwxyz'
                                  ,'abcdefghijklmnopqrstuvwxyz'
                                  ,'abcdefghijklmnopqrstuvwxyz'
                                  ,'abcdefghijklmnopqrstuvwxyz' 
                              from table(generator(rowcount => 1000)) t);
begin
  return table(retval);
end;
$$;

2 answers

  1. Vinodh247 34,661 Reputation points MVP Volunteer Moderator
    2024-12-08T13:23:17.5933333+00:00

    Hi,

    Thanks for reaching out to Microsoft Q&A.

    The error you are encountering likely stems from a limitation or misconfiguration in the Apache Arrow data serialization layer when handling larger result sets (1,000 rows in your case). Here are several workarounds and optimizations to try:

    1. Adjust the Data Chunk Size
    • Snowflake's Arrow serialization may struggle with large data chunks. Reduce the number of rows returned per request.
    • In Synapse, configure your Script Activity to fetch data in smaller chunks, if such an option exists.
    2. Paginate the Results
    • Modify the stored procedure to support pagination by adding parameters for OFFSET and LIMIT. For example:

      create or replace procedure ods.test.return_rows_4_strings_paginated(offset_param integer, limit_param integer)
        returns table (a varchar(26), b varchar(26), c varchar(26), d varchar(26))
        language sql
      as $$
      declare
        retval resultset default (select 'abcdefghijklmnopqrstuvwxyz'
                                        ,'abcdefghijklmnopqrstuvwxyz'
                                        ,'abcdefghijklmnopqrstuvwxyz'
                                        ,'abcdefghijklmnopqrstuvwxyz'
                                    from table(generator(rowcount => 1000))
                                   limit limit_param offset offset_param);
      begin
        return table(retval);
      end;
      $$;

    • Call the procedure multiple times in the pipeline with different offset_param values (see the example calls after this list).
    3. Reduce the Data Size
    • If possible, reduce the size of each row returned. For instance, truncate the strings to smaller lengths if they are placeholders:

      select left('abcdefghijklmnopqrstuvwxyz', 10) as a, ...
    4. Switch to Traditional Query Execution
    • Instead of returning data directly from the stored procedure, consider executing a query that fetches the data using the Copy Activity or Lookup Activity in Synapse.
    • Example query:

      select 'abcdefghijklmnopqrstuvwxyz' as a, ...
        from table(generator(rowcount => 1000));
    5. Upgrade the Apache Arrow Version (If Possible)
    • Ensure your Snowflake ADBC driver and Synapse environment use the latest versions. This might resolve serialization bugs related to Arrow.
    6. Debug the Serialization Format
    • Investigate whether a specific Arrow setting or schema mismatch is causing the issue. Check the Snowflake driver and Synapse pipeline configurations for compatibility settings.
    7. Intermediate Staging
    • Write the result of the stored procedure to an intermediate table in Snowflake or a file in Azure Blob Storage (see the sketch after this list).
    • Use Synapse to fetch data from the staging table or file.
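
    As a minimal sketch of the pagination approach in item 2, the pipeline could run two Script activities, each calling the paginated procedure with a different offset (the page size of 500 is an arbitrary choice; anything below the failing threshold should do):

      -- first page: rows 1-500
      call ods.test.return_rows_4_strings_paginated(0, 500);
      -- second page: rows 501-1000
      call ods.test.return_rows_4_strings_paginated(500, 500);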
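
    And for the staging approach in item 7, one way to materialize the result in Snowflake first (ods.test.staged_rows is a placeholder name; the query mirrors the generator query from your original procedure):

      -- stage the rows in a regular table instead of returning them from the procedure
      create or replace table ods.test.staged_rows as
        select 'abcdefghijklmnopqrstuvwxyz' as a
              ,'abcdefghijklmnopqrstuvwxyz' as b
              ,'abcdefghijklmnopqrstuvwxyz' as c
              ,'abcdefghijklmnopqrstuvwxyz' as d
          from table(generator(rowcount => 1000));

      -- Synapse then reads from ods.test.staged_rows with a Copy or Lookup activity.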

    Additional Suggestions

    • Check Synapse and Snowflake logs for additional details about the error.
    • If none of the above resolves the issue, consider reaching out to Microsoft support or Snowflake support for assistance with the Apache Arrow EOF error.

    Please feel free to click the 'Upvote' (Thumbs-up) button and 'Accept as Answer'. This helps the community by allowing others with similar queries to easily find the solution.


  2. Chandra Boorla 14,510 Reputation points Microsoft External Staff Moderator
    2024-12-12T14:25:42.4066667+00:00

    @Greg Graham

    Apologies for the inconvenience.

    I agree that this issue looks strange, and I wasn't able to reproduce it. If you have a support plan, could you please file a support ticket for deeper investigation and share the SR# with us?

    Thank you.

