Output SQL server data into Parquet files

Srikanth Sudharma 11 Reputation points
2022-01-25T05:15:48.447+00:00

Hello,

My requirements need me to export data from sql server ( on prem) db to an external source. Currently, the need is to create parquet files of this data for consumption. My question is,
Is this possible using polybase/External file format/OpenRowset ? If yes, are there articles indicating how to? I was reading about polybase and external files but most of these examples indicates reading data from external files/ data sources and not writing to them.

Thanks
Sri

SQL Server
SQL Server
A family of Microsoft relational database management and analysis systems for e-commerce, line-of-business, and data warehousing solutions.
13,362 questions
0 comments No comments
{count} votes

2 answers

Sort by: Most helpful
  1. YufeiShao-msft 7,091 Reputation points
    2022-01-25T06:33:51.913+00:00

    Hi @Srikanth Sudharma ,

    Data factory is a good way to do this if you have
    One Way to Create a Parquet File from SQL Server Data

    or try to use Spark

    -------------

    If the answer is the right solution, please click "Accept Answer" and kindly upvote it. If you have extra questions about this answer, please click "Comment".

    Note: Please follow the steps in our documentation to enable e-mail notifications if you want to receive the related email notification for this thread.

    1 person found this answer helpful.

  2. Fritz 0 Reputation points
    2024-07-02T10:27:43.1366667+00:00

    You can do this with Sling (https://slingdata.io). See below.

    # set connection via env var
    export mssql='sqlserver://...'
    
    # test connection
    sling conns test mssql
    
    # run export for many tables
    sling run --src-conn mssql --src-stream 'my_schema.*' --tgt-object 'file://{stream_schema}/{stream_table}.parquet'
    
    # run export for one table
    sling run --src-conn mssql --src-stream 'my_schema.my_table' --tgt-object 'file://my_folder/my_table.parquet'
    
    # run export for custom SQL
    sling run --src-conn mssql --src-stream 'select col1, col2 from my_schema.my_table where col3 > 0' --tgt-object 'file://my_folder/my_table.parquet'
    
    0 comments No comments