How to pass DataFrame/table data from one activity to another in Synapse

Ravi Kumar (Capgemini America Inc) 20 Reputation points Microsoft External Staff
2025-02-23T16:44:52.77+00:00

I need to pass a table record set from one activity (i.e., a notebook) to another activity, such as a Web activity. How can I do that?

Azure Synapse Analytics

Accepted answer
  1. Chandra Boorla 14,510 Reputation points Microsoft External Staff Moderator
    2025-02-24T06:32:37.52+00:00

    Hi @Ravi Kumar (Capgemini America Inc)

    Thank you for posting your query!

    You can pass data between activities in an Azure Synapse pipeline through intermediate storage (such as Azure Blob Storage), through values passed between activities, or through a SQL pool. Here is a quick summary of each approach, with its trade-offs:

    Approach 1 - Using Azure Blob Storage

    Export Data from Notebook - You save the data to Azure Blob Storage in a format like CSV or Parquet.

    Pass File Path - In your Synapse pipeline, you can use the file path from Blob Storage as a parameter and pass it to the next activity.

    Access Data in Web Activity - The web activity can then access the file at that location for further processing.
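
    The first two steps can be sketched roughly as follows. This assumes the storage account is ADLS Gen2 (hence the abfss:// scheme) and that the notebook hands the path back through its exit value. The container, account, folder, and run ID are placeholders, and build_output_path is a hypothetical helper, not a Synapse API. The Spark and mssparkutils calls are left as comments so the snippet also runs outside a notebook.

```python
def build_output_path(container, account, folder, run_id):
    """Return an abfss:// URI for a run-specific folder (hypothetical helper)."""
    folder = folder.strip("/")  # tolerate leading/trailing slashes in the folder name
    return (
        f"abfss://{container}@{account}.dfs.core.windows.net/"
        f"{folder}/run_id={run_id}"
    )


# Placeholder values; substitute your own container, account, and run ID.
output_path = build_output_path(
    "staging", "mystorageacct", "/handoff/orders/", "demo-run-001"
)
print(output_path)

# Inside a Synapse notebook, where `df` is the Spark DataFrame to hand off:
# df.write.mode("overwrite").parquet(output_path)
# mssparkutils.notebook.exit(output_path)  # exposes the path to the pipeline
```

    A later activity could then read the path with an expression such as @activity('Notebook1').output.status.Output.result.exitValue, where Notebook1 stands for whatever the notebook activity is named.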

    Pros:
    • Suitable for large datasets
    • Persistent storage allows data to be reused across activities

    Cons:
    • Requires reading/writing data from Blob Storage, which may introduce extra steps
    • Additional costs for storage and data movement

    Approach 2 - Using Pipeline Parameters

    Convert Data to String - If the data is small, convert the DataFrame to a JSON or CSV string and pass it as a parameter.

    Pass Data as Parameter - Return the converted string from the notebook (for example, as the notebook's exit value) and reference that activity output in the next activity.

    Use Data in Web Activity - The Web activity can send the string in its request, and the receiving service can parse it back to its original format (e.g., a DataFrame or JSON object).
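
    A minimal sketch of the serialize-and-parse round trip described above, using only the standard library. The sample rows stand in for a small DataFrame, rows_to_payload and payload_to_rows are hypothetical helper names, and the 256 KB threshold is an assumed safety margin rather than a documented Synapse limit.

```python
import json

# Assumed safety threshold for illustration, not a documented Synapse limit.
MAX_PAYLOAD_BYTES = 256 * 1024


def rows_to_payload(rows, max_bytes=MAX_PAYLOAD_BYTES):
    """Serialize a list of row dicts to compact JSON, refusing oversized results."""
    # default=str turns values JSON cannot encode (dates, decimals) into strings.
    payload = json.dumps(rows, separators=(",", ":"), default=str)
    size = len(payload.encode("utf-8"))
    if size > max_bytes:
        raise ValueError(
            f"payload is {size} bytes; write it to storage and pass the path instead"
        )
    return payload


def payload_to_rows(payload):
    """What the service receiving the Web activity request might do with the body."""
    return json.loads(payload)


# Stand-in for [row.asDict() for row in df.collect()] in a notebook.
sample_rows = [
    {"id": 1, "name": "alpha", "amount": 10.5},
    {"id": 2, "name": "beta", "amount": 7.0},
]

payload = rows_to_payload(sample_rows)
print(payload)

# In a Synapse notebook, the final cell could then return the string:
# from notebookutils import mssparkutils
# mssparkutils.notebook.exit(payload)
```

    In the pipeline, the Web activity body could reference the string with an expression such as @activity('Notebook1').output.status.Output.result.exitValue, where Notebook1 is a placeholder for the notebook activity's name.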

    Pros:
    • Simple and quick for small datasets
    • Avoids the need for intermediate storage

    Cons:
    • Not suitable for large datasets due to the size limit on pipeline parameters
    • Requires serialization and deserialization

    Approach 3 - Using SQL Pools

    Write to SQL Pool - Write the data to a dedicated SQL pool (Data Warehouse) from your notebook.

    Query Data in Next Activity - A downstream activity (for example, a Lookup activity) can query the SQL pool to retrieve the data, and its output can then be passed to the Web activity.

    Pros:
    • Well-suited for structured data with frequent querying
    • Suitable for larger datasets, leveraging SQL's performance capabilities

    Cons:
    • Adds complexity of managing a SQL pool
    • May incur additional costs for storage and querying

    Summary:

    Using Azure Blob Storage - Ideal for larger datasets, allows persistent storage and scalability.

    Using Pipeline Parameters - Good for small datasets, avoids extra storage and read/write costs but has size limitations.

    Using SQL Pools - Suitable for structured data and frequent querying, but involves additional complexity and costs associated with managing SQL pools.

    Considerations:

    • Performance - Choose a method based on data size and performance needs. For large datasets, Blob Storage or SQL Pools are preferable.
    • Cost - Consider the costs associated with Azure Blob Storage and SQL Pools (storage, compute, and data movement).
    • Security - Ensure controlled access to data, especially if sensitive, using Managed Identity, encryption, and secure access methods.
    • Data Serialization - Converting data to string formats (JSON, CSV) is useful for small datasets but may not be efficient for larger ones.

    By using the appropriate approach based on your data volume and pipeline requirements, you can effectively pass data between activities in Azure Synapse.

    I hope this information helps. Please do let us know if you have any further queries.


    If this answers your query, please click Accept Answer and select Yes for "Was this answer helpful".

