ORC format in Data Factory in Microsoft Fabric
This article outlines how to configure ORC format in the data pipeline of Data Factory in Microsoft Fabric.
Supported capabilities
ORC format is supported for the following activities and connectors as a source and destination.
| Category | Connector/Activity |
|---|---|
| Supported connector | Amazon S3 |
| | Amazon S3 Compatible |
| | Azure Blob Storage |
| | Azure Data Lake Storage Gen1 |
| | Azure Data Lake Storage Gen2 |
| | Azure Files |
| | File system |
| | FTP |
| | Google Cloud Storage |
| | HTTP |
| | Lakehouse Files |
| | Oracle Cloud Storage |
| | SFTP |
| Supported activity | Copy activity (source/destination) |
| | Lookup activity |
| | GetMetadata activity |
| | Delete data activity |
ORC format in copy activity
To configure ORC format, choose your connection in the source or destination of a data pipeline copy activity, and then select ORC in the File format drop-down list. Select Settings for further configuration of this format.
ORC format as source
After you select Settings in the File format section, the following properties are shown in the pop-up File format settings dialog box.
- Compression type: The compression codec used to read ORC files. In the drop-down list, you can choose from None, zlib, or snappy.
ORC format as destination
After you select Settings, the following properties are shown in the pop-up File format settings dialog box.
- Compression type: The compression codec used to write ORC files. In the drop-down list, you can choose from None, zlib, or snappy.
Under Advanced settings in the Destination tab, the following ORC format-related properties are displayed.
- Max rows per file: When writing data into a folder, you can choose to write to multiple files by specifying the maximum number of rows to write per file.
- File name prefix: Applicable when Max rows per file is configured. Specify the file name prefix when writing data to multiple files, resulting in this pattern: `<fileNamePrefix>_00000.<fileExtension>`. If not specified, the file name prefix is auto generated. This property doesn't apply when the source is a file-based store or a partition-option-enabled data store.
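For example, assuming a file name prefix of `part` (a hypothetical value) and enough rows to spill across three files, the destination folder would contain files named following the pattern above, with an incrementing index:

```
part_00000.orc
part_00001.orc
part_00002.orc
```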
Table summary
ORC as source
The following properties are supported in the copy activity Source section when using ORC format.
| Name | Description | Value | Required | JSON script property |
|---|---|---|---|---|
| File format | The file format that you want to use. | ORC | Yes | type (under `datasetSettings`): Orc |
| Compression type | The compression codec used to read ORC files. | None<br>zlib<br>snappy | No | orcCompressionCodec:<br>none<br>zlib<br>snappy |
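To show where these properties sit in the generated pipeline definition, here is a minimal JSON sketch of a copy activity source that reads snappy-compressed ORC files. The `type: Orc` setting under `datasetSettings` and the `orcCompressionCodec` value come from the table above; the source `type`, the `typeProperties`/`location` nesting, and the Azure Blob Storage location details are illustrative assumptions, so verify them against the JSON view of your own pipeline.

```json
"source": {
    "type": "OrcSource",
    "datasetSettings": {
        "type": "Orc",
        "typeProperties": {
            "location": {
                "type": "AzureBlobStorageLocation",
                "container": "<your container>",
                "folderPath": "<your folder path>"
            },
            "orcCompressionCodec": "snappy"
        }
    }
}
```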
ORC as destination
The following properties are supported in the copy activity Destination section when using the ORC format.
| Name | Description | Value | Required | JSON script property |
|---|---|---|---|---|
| File format | The file format that you want to use. | ORC | Yes | type (under `datasetSettings`): Orc |
| Compression type | The compression codec used to write ORC files. | None<br>zlib<br>snappy | No | orcCompressionCodec:<br>none<br>zlib<br>snappy |
| Max rows per file | When writing data into a folder, you can choose to write to multiple files by specifying the maximum number of rows to write per file. | `<your max rows per file>` | No | maxRowsPerFile |
| File name prefix | Applicable when Max rows per file is configured. Specify the file name prefix when writing data to multiple files, resulting in this pattern: `<fileNamePrefix>_00000.<fileExtension>`. If not specified, the file name prefix is auto generated. This property doesn't apply when the source is a file-based store or a partition-option-enabled data store. | `<your file name prefix>` | No | fileNamePrefix |
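As a companion sketch for the destination side, the following hedged JSON fragment writes zlib-compressed ORC split into multiple files. `maxRowsPerFile` and `fileNamePrefix` come from the table above; the `formatSettings` block, its `OrcWriteSettings` type, the sink `type`, and the location details are assumptions for illustration and may differ from the JSON your pipeline generates.

```json
"sink": {
    "type": "OrcSink",
    "datasetSettings": {
        "type": "Orc",
        "typeProperties": {
            "location": {
                "type": "AzureBlobStorageLocation",
                "container": "<your container>",
                "folderPath": "<your folder path>"
            },
            "orcCompressionCodec": "zlib"
        }
    },
    "formatSettings": {
        "type": "OrcWriteSettings",
        "maxRowsPerFile": 1000000,
        "fileNamePrefix": "<your file name prefix>"
    }
}
```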