Hi,
Thanks for reaching out to Microsoft Q&A.
The error you are encountering likely stems from a limitation or misconfiguration in the Apache Arrow data serialization layer when handling larger datasets (1000 rows in your case). To address this issue, you can try several workarounds or optimizations:
- Adjust the Data Chunk Size
  - Snowflake's Arrow serialization may struggle with large data chunks. Reduce the number of rows returned per request.
  - In Synapse, configure your **Script Activity** to fetch data in smaller chunks, if such an option exists.
- Paginate the Results
  - Modify the stored procedure to support pagination by adding parameters for `OFFSET` and `LIMIT`. For example:

    ```sql
    create or replace procedure ods.test.return_rows_4_strings_paginated(offset_param integer, limit_param integer)
    returns table (a varchar(26), b varchar(26), c varchar(26), d varchar(26))
    language sql
    as
    $$
    declare
      retval resultset default (
        select 'abcdefghijklmnopqrstuvwxyz'
             , 'abcdefghijklmnopqrstuvwxyz'
             , 'abcdefghijklmnopqrstuvwxyz'
             , 'abcdefghijklmnopqrstuvwxyz'
        from table(generator(rowcount => 1000)) t
        limit limit_param offset offset_param
      );
    begin
      return table(retval);
    end;
    $$;
    ```

  - Call the procedure multiple times in the pipeline with different `offset_param` values.
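As a minimal sketch of the calling pattern, each pipeline iteration would issue one call per page; the page size of 100 here is purely illustrative:

```sql
-- One call per pipeline iteration; 100 is an illustrative page size.
call ods.test.return_rows_4_strings_paginated(0, 100);    -- rows 1-100
call ods.test.return_rows_4_strings_paginated(100, 100);  -- rows 101-200
call ods.test.return_rows_4_strings_paginated(200, 100);  -- rows 201-300
```

In Synapse you could drive these calls from an Until or ForEach activity that increments the offset until a call returns fewer rows than the page size.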
- Reduce the Data Size
  - If possible, reduce the size of each row returned. For instance, truncate the strings to smaller lengths if they are placeholders: `select left('abcdefghijklmnopqrstuvwxyz', 10) as a, ...`
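Spelled out for all four columns of the generator query from the procedure, the truncated version might look like this (the 10-character length is illustrative):

```sql
-- Truncate each placeholder string to shrink the serialized payload.
select left('abcdefghijklmnopqrstuvwxyz', 10) as a
     , left('abcdefghijklmnopqrstuvwxyz', 10) as b
     , left('abcdefghijklmnopqrstuvwxyz', 10) as c
     , left('abcdefghijklmnopqrstuvwxyz', 10) as d
from table(generator(rowcount => 1000));
```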
- Switch to Traditional Query Execution
  - Instead of returning data directly from the stored procedure, consider executing a query that fetches the data using the **Copy Activity** or **Lookup Activity** in Synapse.
  - Example query: `select 'abcdefghijklmnopqrstuvwxyz' as a, ... from table(generator(rowcount => 1000));`
- Upgrade the Apache Arrow Version (If Possible)
  - Ensure your Snowflake driver and Synapse environment are on the latest versions; serialization bugs in the Arrow layer are often fixed in driver updates.
- Debug the Serialization Format
- Investigate whether a specific Arrow setting or schema mismatch is causing the issue. Check the Snowflake driver or Synapse pipeline configurations for compatibility settings.
- Intermediate Staging
- Write the result of the stored procedure to an intermediate table in Snowflake or a file in Azure Blob Storage.
- Use Synapse to fetch data from the staging table or file.
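One way to sketch the staging approach in Snowflake, assuming your original procedure is named `ods.test.return_rows_4_strings` and `my_azure_stage` is an external stage you have already pointed at your Blob Storage container (both names are hypothetical placeholders):

```sql
-- Run the procedure, then persist its result set to a staging table
-- via RESULT_SCAN on the previous statement's query ID.
call ods.test.return_rows_4_strings();
create or replace table ods.test.staged_rows as
  select * from table(result_scan(last_query_id()));

-- Optionally unload the staging table to Blob Storage so Synapse can
-- read files instead of a live result set.
copy into @my_azure_stage/staged_rows/
  from ods.test.staged_rows
  file_format = (type = csv);
```

Synapse can then read `ods.test.staged_rows` with a plain query, or pick up the unloaded files, avoiding Arrow serialization of the procedure's result set entirely.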
Additional Suggestions
- Check Synapse and Snowflake logs for additional details about the error.
- If none of the above resolves the issue, consider reaching out to Microsoft support or Snowflake support for assistance with the Apache Arrow EOF error.
Please feel free to click the 'Upvote' (Thumbs-up) button and 'Accept as Answer'. This helps the community by allowing others with similar queries to easily find the solution.