Delimited text format in Data Factory in Microsoft Fabric

Article
06/25/2024

This article outlines how to configure delimited text format in the data pipeline of Data Factory in Microsoft Fabric.

Supported capabilities

Delimited text format is supported for the following activities and connectors as source and destination.

Category	Connector/Activity
Supported connector	Amazon S3
	Amazon S3 Compatible
	Azure Blob Storage
	Azure Data Lake Storage Gen1
	Azure Data Lake Storage Gen2
	Azure Files
	File system
	FTP
	Google Cloud Storage
	HTTP
	Lakehouse Files
	Oracle Cloud Storage
	SFTP
Supported activity	Copy activity (source/destination)
	Lookup activity
	GetMetadata activity
	Delete activity

Delimited text format in copy activity

To configure delimited text format, choose your connection in the source or destination of data pipeline copy activity, and then select DelimitedText in the drop-down list of File format. Select Settings for further configuration of this format.

Screenshot showing file format settings.

Delimited text format as source

After selecting Settings in File format section, following properties are shown up in the pop-up File format settings dialog box.

Screenshot showing source file format settings.

Compression type: The compression codec used to read delimited text files. You can choose from None, bzip2, gzip, deflate, ZipDeflate, TarGzip or tar type in the drop-down list.

If you select ZipDeflate as compression type, Preserve zip file name as folder will show up under Advanced settings in Source tab.
- Preserve zip file name as folder: Indicates whether to preserve the source zip file name as folder structure during copy.
  - If this box is checked (default), the service writes unzipped files to <specified file path>/<folder named as source zip file>/.
  - If this box is unchecked, the service writes unzipped files directly to <specified file path>. Make sure you don't have duplicated file names in different source zip files to avoid racing or unexpected behavior.
If you select TarGzip/tar as compression type, Preserve compression file name as folder will show up under Advanced settings in Source tab.
- Preserve compression file name as folder: Indicates whether to preserve the source compressed file name as folder structure during copy.
  - If this box is checked (default), the service writes decompressed files to <specified file path>/<folder named as source compressed file>/.
  - If this box is unchecked, the service writes decompressed files directly to <specified file path>. Make sure you don't have duplicated file names in different source zip files to avoid racing or unexpected behavior.
Compression level: Specify the compression ratio when you select a compression type. You can choose from Optimal or Fastest.
- Fastest: The compression operation should complete as quickly as possible, even if the resulting file is not optimally compressed.
- Optimal: The compression operation should be optimally compressed, even if the operation takes a longer time to complete. For more information, see Compression Level topic.
Column delimiter: The character(s) used to separate columns in a file. The default value is comma (,).
Row delimiter: Specify the character used to separate rows in a file. Only one character is allowed. The default value is line feed \n.
Encoding: The encoding type used to read/write test files. The default value is UTF-8.
Escape character: The single character to escape quotes inside a quoted value. The default value is backslash \. When escape character is defined as empty string, the Quote character must be set as empty string as well, in which case make sure all column values don't contain delimiters.
Quote character: The single character to quote column values if it contains column delimiter. The default value is double quotes ". When Quote character is defined as empty string, it means there is no quote char and column value is not quoted, and escape character is used to escape the column delimiter and itself.
First row as header: Specifies whether to treat/make the first row as a header line with names of columns. Allowed values are selected and unselected (default). When first row as header is unselected, note UI data preview and lookup activity output auto generate column names as Prop_{n} (starting from 0), copy activity requires explicit mapping from source to destination and locates columns by ordinal (starting from 1).
Null value: Specifies the string representation of null value. The default value is empty string.

Under Advanced settings in Source tab, other delimited text format related properties are exposed.

Delimited text format as destination

After selecting Settings in File format section, following properties are shown up in the pop-up File format settings dialog box.

Screenshot showing destination file format settings.

Compression type: The compression codec used to write delimited text files. You can choose from None, bzip2, gzip, deflate, ZipDeflate, TarGzip or tar type in the drop-down list.
Compression level: Specify the compression ratio when you select a compression type. You can choose from Optimal or Fastest.
- Fastest: The compression operation should complete as quickly as possible, even if the resulting file is not optimally compressed.
- Optimal: The compression operation should be optimally compressed, even if the operation takes a longer time to complete. For more information, see Compression Level topic.
Column delimiter: The character(s) used to separate columns in a file. The default value is comma (,).
Row delimiter: The character used to separate rows in a file. Only one character is allowed. The default value is line feed \n.
Encoding: The encoding type used to write test files. The default value is UTF-8.
Escape character: The single character to escape quotes inside a quoted value. The default value is backslash \. When escape character is defined as empty string, the Quote character must be set as empty string as well, in which case make sure all column values don't contain delimiters.
Quote character: The single character to quote column values if it contains column delimiter. The default value is double quotes ". When Quote character is defined as empty string, it means there is no quote char and column value is not quoted, and escape character is used to escape the column delimiter and itself.
First row as header: Specifies whether to treat/make the first row as a header line with names of columns. Allowed values are selected and unselected (default). When first row as header is unselected, note UI data preview and lookup activity output auto generate column names as Prop_{n} (starting from 0), copy activity requires explicit mapping from source to destination and locates columns by ordinal (starting from 1).
Null value: Specifies the string representation of null value. The default value is empty string.

Under Advanced settings in Destination tab, further delimited text format related property are shown up.

Quote all text: Enclose all values in quotes.
File extension: The file extension used to name the output files, for example, .csv, .txt.
Max rows per file: When writing data into a folder, you can choose to write to multiple files and specify the max rows per file.
File name prefix: Applicable when Max rows per file is configured. Specify the file name prefix when writing data to multiple files, resulted in this pattern: <fileNamePrefix>_00000.<fileExtension>. If not specified, file name prefix will be auto generated. This property does not apply when source is file based store or partition option enabled data store.

Table summary

Delimited text as source

The following properties are supported in the copy activity Source section when using delimited text format.

Name	Description	Value	Required	JSON script property
File format	The file format that you want to use.	DelimitedText	Yes	type (under `datasetSettings`): DelimitedText
Compression type	The compression codec used to read delimited text files.	Choose from: None bzip2 gzip deflate ZipDeflate TarGzip tar	No	type (under `compression`): bzip2 gzip deflate ZipDeflate TarGzip tar
Preserve zip file name as folder	Indicates whether to preserve the source zip file name as folder structure during copy. Applies when you select ZipDeflate compression.	Selected or unselect	No	preserveZipFileNameAsFolder (under `compressionProperties`->`type` as `ZipDeflateReadSettings`)
Preserve compression file name as folder	Indicates whether to preserve the source compressed file name as folder structure during copy. Applies when you select TarGzip/tar compression.	Selected or unselect	No	preserveCompressionFileNameAsFolder (under `compressionProperties`->`type` as `TarGZipReadSettings` or `TarReadSettings`)
Compression level	The compression ratio. Allowed values are Optimal or Fastest.	Optimal or Fastest	No	level (under `compression`): Fastest Optimal
Column delimiter	The character(s) used to separate columns in a file.	< the selected column delimiter > comma `,` (by default)	No	columnDelimiter
Row delimiter	The character used to separate rows in a file.	< the selected row delimiter > `\r`,`\n` (by default), or `r\n`	No	rowDelimiter
Encoding	The encoding type used to read/write test files.	"UTF-8" (by default),"UTF-8 without BOM", "UTF-16", "UTF-16BE", "UTF-32", "UTF-32BE", "US-ASCII", "UTF-7", "BIG5", "EUC-JP", "EUC-KR", "GB2312", "GB18030", "JOHAB", "SHIFT-JIS", "CP875", "CP866", "IBM00858", "IBM037", "IBM273", "IBM437", "IBM500", "IBM737", "IBM775", "IBM850", "IBM852", "IBM855", "IBM857", "IBM860", "IBM861", "IBM863", "IBM864", "IBM865", "IBM869", "IBM870", "IBM01140", "IBM01141", "IBM01142", "IBM01143", "IBM01144", "IBM01145", "IBM01146", "IBM01147", "IBM01148", "IBM01149", "ISO-2022-JP", "ISO-2022-KR", "ISO-8859-1", "ISO-8859-2", "ISO-8859-3", "ISO-8859-4", "ISO-8859-5", "ISO-8859-6", "ISO-8859-7", "ISO-8859-8", "ISO-8859-9", "ISO-8859-13", "ISO-8859-15", "WINDOWS-874", "WINDOWS-1250", "WINDOWS-1251", "WINDOWS-1252", "WINDOWS-1253", "WINDOWS-1254", "WINDOWS-1255", "WINDOWS-1256", "WINDOWS-1257", "WINDOWS-1258"	No	encodingName
Escape character	The single character to escape quotes inside a quoted value. When escape character is defined as empty string, the Quote character must be set as empty string as well, in which case make sure all column values don't contain delimiters.	< your selected escape character > backslash `\` (by default)	No	escapeChar
Quote character	The single character to quote column values if it contains column delimiter. When Quote character is defined as empty string, it means there is no quote char and column value is not quoted, and escape character is used to escape the column delimiter and itself.	< your selected quote character > double quotes `"` (by default)	No	quoteChar
First row as header	Specifies whether to treat the first row in the given worksheet/range as a header line with names of columns.	Selected or unselected	No	firstRowAsHeader: true or false (default)
Null value	Specifies the string representation of null value. The default value is empty string.	< the string representation of null value > empty string (by default)	No	nullValue

Delimited text as destination

The following properties are supported in the copy activity Destination section when using delimited text format.

Name	Description	Value	Required	JSON script property
File format	The file format that you want to use.	DelimitedText	Yes	type (under `datasetSettings`): DelimitedText
Compression type	The compression codec used to write delimited text files.	Choose from: None bzip2 gzip deflate ZipDeflate TarGzip tar	No	type (under `compression`): bzip2 gzip deflate ZipDeflate TarGzip tar
Preserve zip file name as folder	Indicates whether to preserve the source zip file name as folder structure during copy.	Selected or unselect	No	preserveZipFileNameAsFolder (under `compressionProperties`->`type` as `ZipDeflateReadSettings`)
Preserve compression file name as folder	Indicates whether to preserve the source compressed file name as folder structure during copy.	Selected or unselect	No	preserveCompressionFileNameAsFolder (under `compressionProperties`->`type` as `TarGZipReadSettings` or `TarReadSettings`)
Compression level	The compression ratio. Allowed values are Optimal or Fastest.	Optimal or Fastest	No	level (under `compression`): Fastest Optimal
Column delimiter	The character(s) used to separate columns in a file.	< the selected column delimiter > comma `,` (by default)	No	columnDelimiter
Row delimiter	The character used to separate rows in a file.	< the selected row delimiter > `\r`,`\n` (by default), or `r\n`	No	rowDelimiter
Encoding	The encoding type used to read/write test files.	"UTF-8" (by default),"UTF-8 without BOM", "UTF-16", "UTF-16BE", "UTF-32", "UTF-32BE", "US-ASCII", "UTF-7", "BIG5", "EUC-JP", "EUC-KR", "GB2312", "GB18030", "JOHAB", "SHIFT-JIS", "CP875", "CP866", "IBM00858", "IBM037", "IBM273", "IBM437", "IBM500", "IBM737", "IBM775", "IBM850", "IBM852", "IBM855", "IBM857", "IBM860", "IBM861", "IBM863", "IBM864", "IBM865", "IBM869", "IBM870", "IBM01140", "IBM01141", "IBM01142", "IBM01143", "IBM01144", "IBM01145", "IBM01146", "IBM01147", "IBM01148", "IBM01149", "ISO-2022-JP", "ISO-2022-KR", "ISO-8859-1", "ISO-8859-2", "ISO-8859-3", "ISO-8859-4", "ISO-8859-5", "ISO-8859-6", "ISO-8859-7", "ISO-8859-8", "ISO-8859-9", "ISO-8859-13", "ISO-8859-15", "WINDOWS-874", "WINDOWS-1250", "WINDOWS-1251", "WINDOWS-1252", "WINDOWS-1253", "WINDOWS-1254", "WINDOWS-1255", "WINDOWS-1256", "WINDOWS-1257", "WINDOWS-1258"	No	encodingName
Escape character	The single character to escape quotes inside a quoted value. When escape character is defined as empty string, the Quote character must be set as empty string as well, in which case make sure all column values don't contain delimiters.	< your selected escape character > backslash `\` (by default)	No	escapeChar
Quote character	The single character to quote column values if it contains column delimiter. When Quote character is defined as empty string, it means there is no quote char and column value is not quoted, and escape character is used to escape the column delimiter and itself.	< your selected quote character > double quotes `"` (by default)	No	quoteChar
First row as header	Specifies whether to treat the first row in the given worksheet/range as a header line with names of columns.	Selected or unselected	No	firstRowAsHeader: true or false (default)
Quote all text	Enclose all values in quotes.	Selected (default) or unselected	No	quoteAllText: true (default) or false
File extension	The file extension used to name the output files.	< your file extension > `.txt` (by default)	No	fileExtension
Max rows per file	When writing data into a folder, you can choose to write to multiple files and specify the max rows per file.	< your max rows per file >	No	maxRowsPerFile
File name prefix	Applicable when Max rows per file is configured. Specify the file name prefix when writing data to multiple files, resulted in this pattern: `<fileNamePrefix>_00000.<fileExtension>`. If not specified, file name prefix will be auto generated. This property does not apply when source is file based store or partition option enabled data store.	< your file name prefix >	No	fileNamePrefix

Connectors overview

Share via

Delimited text format in Data Factory in Microsoft Fabric

Supported capabilities

Delimited text format in copy activity

Delimited text format as source

Delimited text format as destination

Table summary

Delimited text as source

Delimited text as destination

Feedback

Additional resources

Share via

Delimited text format in Data Factory in Microsoft Fabric

Supported capabilities

Delimited text format in copy activity

Delimited text format as source

Delimited text format as destination

Table summary

Delimited text as source

Delimited text as destination

Related content

Feedback

Additional resources