JSON format in Data Factory in Microsoft Fabric
This article outlines how to configure JSON format in the data pipeline of Data Factory in Microsoft Fabric.
JSON format is supported for the following activities and connectors as a source and destination.
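Category | Connector/Activity |
---|---|
Supported connector | Amazon S3 |
Supported connector | Amazon S3 Compatible |
Supported connector | Azure Blob Storage |
Supported connector | Azure Data Lake Storage Gen1 |
Supported connector | Azure Data Lake Storage Gen2 |
Supported connector | Azure Files |
Supported connector | File system |
Supported connector | FTP |
Supported connector | Google Cloud Storage |
Supported connector | HTTP |
Supported connector | Lakehouse Files |
Supported connector | Oracle Cloud Storage |
Supported connector | SFTP |
Supported activity | Copy activity (source/destination) |
Supported activity | Lookup activity |
Supported activity | GetMetadata activity |
Supported activity | Delete activity |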
To configure JSON format, choose your connection in the source or destination of the data pipeline copy activity, and then select JSON in the File format drop-down list. Select Settings for further configuration of this format.
For a source, after you select Settings in the File format section, the following properties are shown in the pop-up File format settings dialog box.
Compression type: Choose the compression codec used to read JSON files in the drop-down list. You can choose from None, bzip2, gzip, deflate, ZipDeflate, TarGzip, or tar.
If you select ZipDeflate as the compression type, Preserve zip file name as folder is displayed under the Advanced settings in the Source tab.
- Preserve zip file name as folder: Indicates whether to preserve the source zip file name as a folder structure during copy.
  - If this box is checked (default), the service writes unzipped files to <specified file path>/<folder named as source zip file>/.
  - If this box is unchecked, the service writes unzipped files directly to <specified file path>. Make sure you don't have duplicated file names in different source zip files to avoid race conditions or unexpected behavior.
If you select TarGzip or tar as the compression type, Preserve compression file name as folder is displayed under the Advanced settings in the Source tab.
- Preserve compression file name as folder: Indicates whether to preserve the source compressed file name as a folder structure during copy.
  - If this box is checked (default), the service writes decompressed files to <specified file path>/<folder named as source compressed file>/.
  - If this box is unchecked, the service writes decompressed files directly to <specified file path>. Make sure you don't have duplicated file names in different source files to avoid race conditions or unexpected behavior.
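The effect of the Preserve zip file name as folder option on output paths can be sketched in Python. This is only an illustration, not the service's implementation: the helper `unzipped_target`, the `output` path, and the file names are hypothetical, and using the zip file name verbatim as the folder name is an assumption.

```python
from posixpath import join


def unzipped_target(base_path, zip_name, entry_name, preserve_zip_name_as_folder=True):
    """Return the path where an unzipped entry would land under each setting.

    Hypothetical helper: zip_name is used verbatim as the folder name, which is
    an assumption made for illustration, not documented service behavior.
    """
    if preserve_zip_name_as_folder:
        return join(base_path, zip_name, entry_name)
    return join(base_path, entry_name)


# Two source zip files that both contain an entry named data.json.
entries = [("a.zip", "data.json"), ("b.zip", "data.json")]

preserved = [unzipped_target("output", z, e, True) for z, e in entries]
flattened = [unzipped_target("output", z, e, False) for z, e in entries]

print(preserved)  # two distinct paths, one folder per zip file
print(flattened)  # the same path twice, so the two files would collide
```

With the option cleared, both entries map to the same target path, which is the duplicate file name situation the warning above describes.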
Compression level: The compression ratio. You can choose from Fastest or Optimal.
Fastest: The compression operation should complete as quickly as possible, even if the resulting file isn't optimally compressed.
Optimal: The resulting file should be optimally compressed, even if the operation takes longer to complete. For more information, go to the Compression Level article.
Encoding: Specify the encoding type used to read text files. Select one type from the drop-down list. The default value is UTF-8.
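Why the encoding setting has to match the stored file can be shown with a small Python sketch. The codec names below are Python's own, the sample record is made up, and nothing here reflects how the service decodes files internally.

```python
import json

# Hypothetical record with non-ASCII characters.
record = {"switch1": "Köln", "switch2": "東京"}

# Bytes as they might be stored in a file written as UTF-16LE.
stored = json.dumps(record, ensure_ascii=False).encode("utf-16-le")

# Decoding with the matching encoding recovers the record.
decoded = json.loads(stored.decode("utf-16-le"))

# Decoding with a mismatched encoding fails before any JSON is parsed.
try:
    json.loads(stored.decode("utf-8"))
    mismatch_ok = True
except (UnicodeDecodeError, json.JSONDecodeError):
    mismatch_ok = False

print(decoded == record, mismatch_ok)
```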
For a destination, after you select Settings, the following properties are shown in the pop-up File format settings dialog box.
Compression type: Choose the compression codec used to write JSON files in the drop-down list. You can choose from None, bzip2, gzip, deflate, ZipDeflate, TarGzip, or tar.
Compression level: The compression ratio. You can choose from Optimal or Fastest.
Fastest: The compression operation should complete as quickly as possible, even if the resulting file isn't optimally compressed.
Optimal: The resulting file should be optimally compressed, even if the operation takes longer to complete. For more information, go to the Compression Level article.
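The speed versus size trade-off between the two levels can be illustrated with Python's `gzip` module. Mapping Fastest to level 1 and Optimal to level 9 is an assumption made for this sketch; the codec parameters the service actually uses are not stated in this article, and the sample records are invented.

```python
import gzip
import json

# Build some sample JSON Lines data (hypothetical records).
records = [
    {
        "time": f"2015-04-29T07:{i % 60:02d}:20Z",
        "callingnum1": str(100000000 + i * 7919),
        "switch1": "China",
    }
    for i in range(2000)
]
raw = "\n".join(json.dumps(r) for r in records).encode("utf-8")

# Assumed mapping for illustration only: Fastest -> 1, Optimal -> 9.
fastest = gzip.compress(raw, compresslevel=1)
optimal = gzip.compress(raw, compresslevel=9)

# Both levels are lossless; they differ only in time spent and output size.
print(len(raw), len(fastest), len(optimal))
```

A higher level typically yields a smaller file at the cost of more CPU time, and either output decompresses to the same bytes.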
Encoding: Specify the encoding type used to write text files. Select one type from the drop-down list. The default value is UTF-8.
Under Advanced settings in the Destination tab, the following JSON format related properties are displayed.
- File pattern: Specify the pattern of data stored in each JSON file. Allowed values are Set of objects (JSON Lines) and Array of objects. The default value is Set of objects. See the JSON file patterns section for details about these patterns.
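The difference between the two file patterns can be sketched as follows. The `serialize` function is a hypothetical helper that mimics the two output shapes with Python's `json` module; it is not the service's writer.

```python
import json


def serialize(records, file_pattern="setOfObjects"):
    """Serialize records in one of the two destination file patterns."""
    if file_pattern == "setOfObjects":
        # JSON Lines: one object per line, no enclosing array.
        return "\n".join(json.dumps(r) for r in records)
    if file_pattern == "arrayOfObjects":
        # A single JSON array containing every object.
        return json.dumps(records)
    raise ValueError(f"Unsupported file pattern: {file_pattern}")


rows = [
    {"switch1": "China", "switch2": "Germany"},
    {"switch1": "US", "switch2": "UK"},
]

print(serialize(rows))                    # two lines, one object each
print(serialize(rows, "arrayOfObjects"))  # one array holding both objects
```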
When copying data from JSON files, copy activity can automatically detect and parse the following patterns of JSON files. When writing data to JSON files, you can configure the file pattern on copy activity destination.
Type I: setOfObjects
Each file contains a single object, JSON Lines, or concatenated objects.
Single object JSON example
{ "time": "2015-04-29T07:12:20.9100000Z", "callingimsi": "466920403025604", "callingnum1": "678948008", "callingnum2": "567834760", "switch1": "China", "switch2": "Germany" }
JSON Lines (default for destination)
{"time":"2015-04-29T07:12:20.9100000Z","callingimsi":"466920403025604","callingnum1":"678948008","callingnum2":"567834760","switch1":"China","switch2":"Germany"}
{"time":"2015-04-29T07:13:21.0220000Z","callingimsi":"466922202613463","callingnum1":"123436380","callingnum2":"789037573","switch1":"US","switch2":"UK"}
{"time":"2015-04-29T07:13:21.4370000Z","callingimsi":"466923101048691","callingnum1":"678901578","callingnum2":"345626404","switch1":"Germany","switch2":"UK"}
Concatenated JSON example
{ "time": "2015-04-29T07:12:20.9100000Z", "callingimsi": "466920403025604", "callingnum1": "678948008", "callingnum2": "567834760", "switch1": "China", "switch2": "Germany" } { "time": "2015-04-29T07:13:21.0220000Z", "callingimsi": "466922202613463", "callingnum1": "123436380", "callingnum2": "789037573", "switch1": "US", "switch2": "UK" } { "time": "2015-04-29T07:13:21.4370000Z", "callingimsi": "466923101048691", "callingnum1": "678901578", "callingnum2": "345626404", "switch1": "Germany", "switch2": "UK" }
Type II: arrayOfObjects
Each file contains an array of objects.
[ { "time": "2015-04-29T07:12:20.9100000Z", "callingimsi": "466920403025604", "callingnum1": "678948008", "callingnum2": "567834760", "switch1": "China", "switch2": "Germany" }, { "time": "2015-04-29T07:13:21.0220000Z", "callingimsi": "466922202613463", "callingnum1": "123436380", "callingnum2": "789037573", "switch1": "US", "switch2": "UK" }, { "time": "2015-04-29T07:13:21.4370000Z", "callingimsi": "466923101048691", "callingnum1": "678901578", "callingnum2": "345626404", "switch1": "Germany", "switch2": "UK" } ]
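One way a reader could handle all of these patterns with a single routine is sketched below. It is a minimal illustration built on Python's `json.JSONDecoder.raw_decode`, not a description of how copy activity detects patterns; `parse_json_file` is a hypothetical name and the sample strings are shortened stand-ins for the examples above.

```python
import json


def parse_json_file(text):
    """Return a list of objects from any of the file patterns shown above."""
    decoder = json.JSONDecoder()
    records = []
    pos = 0
    length = len(text)
    while True:
        # Skip whitespace (spaces or newlines) between top-level values.
        while pos < length and text[pos].isspace():
            pos += 1
        if pos >= length:
            break
        # Decode one top-level value and get the index where it ended.
        value, pos = decoder.raw_decode(text, pos)
        if isinstance(value, list):
            records.extend(value)   # arrayOfObjects
        else:
            records.append(value)   # single object, JSON Lines, concatenated
    return records


single = '{ "switch1": "China", "switch2": "Germany" }'
json_lines = '{"switch1":"China"}\n{"switch1":"US"}'
concatenated = '{ "switch1": "China" } { "switch1": "US" }'
array = '[ { "switch1": "China" }, { "switch1": "US" } ]'

for sample in (single, json_lines, concatenated, array):
    print(len(parse_json_file(sample)), "record(s)")
```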
The following properties are supported in the copy activity Source section when using the JSON format.
Name | Description | Value | Required | JSON script property |
---|---|---|---|---|
File format | The file format that you want to use. | JSON | Yes | type (under datasetSettings): Json |
Compression type | The compression codec used to read JSON files. | Choose from: None, bzip2, gzip, deflate, ZipDeflate, TarGzip, tar | No | type (under compression): bzip2, gzip, deflate, ZipDeflate, TarGzip, tar |
Compression level | The compression ratio. | Fastest, Optimal | No | level (under compression): Fastest, Optimal |
Encoding | The encoding type used to read text files. | "UTF-8" (by default), "UTF-8 without BOM", "UTF-16LE", "UTF-16BE", "UTF-32LE", "UTF-32BE", "US-ASCII", "UTF-7", "BIG5", "EUC-JP", "EUC-KR", "GB2312", "GB18030", "JOHAB", "SHIFT-JIS", "CP875", "CP866", "IBM00858", "IBM037", "IBM273", "IBM437", "IBM500", "IBM737", "IBM775", "IBM850", "IBM852", "IBM855", "IBM857", "IBM860", "IBM861", "IBM863", "IBM864", "IBM865", "IBM869", "IBM870", "IBM01140", "IBM01141", "IBM01142", "IBM01143", "IBM01144", "IBM01145", "IBM01146", "IBM01147", "IBM01148", "IBM01149", "ISO-2022-JP", "ISO-2022-KR", "ISO-8859-1", "ISO-8859-2", "ISO-8859-3", "ISO-8859-4", "ISO-8859-5", "ISO-8859-6", "ISO-8859-7", "ISO-8859-8", "ISO-8859-9", "ISO-8859-13", "ISO-8859-15", "WINDOWS-874", "WINDOWS-1250", "WINDOWS-1251", "WINDOWS-1252", "WINDOWS-1253", "WINDOWS-1254", "WINDOWS-1255", "WINDOWS-1256", "WINDOWS-1257", "WINDOWS-1258" | No | encodingName |
Preserve zip file name as folder | Indicates whether to preserve the source zip file name as a folder structure during copy. | Selected (default) or unselected | No | preserveZipFileNameAsFolder (under compressionProperties -> type as ZipDeflateReadSettings): true (default) or false |
Preserve compression file name as folder | Indicates whether to preserve the source compressed file name as a folder structure during copy. | Selected (default) or unselected | No | preserveCompressionFileNameAsFolder (under compressionProperties -> type as TarGZipReadSettings or TarReadSettings): true (default) or false |
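The JSON script properties in the table above could be assembled as in the following Python sketch. Only the property names and values come from the table. The surrounding nesting, including the `typeProperties` and `formatSettings` containers, is an assumption made for illustration, so compare it with the JSON that your own pipeline generates before relying on it.

```python
import json

# Property names and values are taken from the source table above.
# The containers marked "assumed" are NOT confirmed by this article.
source_fragment = {
    "datasetSettings": {
        "type": "Json",
        "typeProperties": {  # assumed container
            "encodingName": "UTF-8",
            "compression": {"type": "ZipDeflate", "level": "Optimal"},
        },
    },
    "formatSettings": {  # assumed container
        "compressionProperties": {
            "type": "ZipDeflateReadSettings",
            "preserveZipFileNameAsFolder": True,
        },
    },
}

# Render the fragment as JSON text for comparison with a real pipeline.
rendered = json.dumps(source_fragment, indent=2)
print(rendered)
```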
The following properties are supported in the copy activity Destination section when using the JSON format.
Name | Description | Value | Required | JSON script property |
---|---|---|---|---|
File format | The file format that you want to use. | JSON | Yes | type (under datasetSettings): Json |
Compression type | The compression codec used to write JSON files. | Choose from: None, bzip2, gzip, deflate, ZipDeflate, TarGzip, tar | No | type (under compression): bzip2, gzip, deflate, ZipDeflate, TarGzip, tar |
Compression level | The compression ratio. | Fastest, Optimal | No | level (under compression): Fastest, Optimal |
Encoding | The encoding type used to write text files. | "UTF-8" (by default), "UTF-8 without BOM", "UTF-16LE", "UTF-16BE", "UTF-32LE", "UTF-32BE", "US-ASCII", "UTF-7", "BIG5", "EUC-JP", "EUC-KR", "GB2312", "GB18030", "JOHAB", "SHIFT-JIS", "CP875", "CP866", "IBM00858", "IBM037", "IBM273", "IBM437", "IBM500", "IBM737", "IBM775", "IBM850", "IBM852", "IBM855", "IBM857", "IBM860", "IBM861", "IBM863", "IBM864", "IBM865", "IBM869", "IBM870", "IBM01140", "IBM01141", "IBM01142", "IBM01143", "IBM01144", "IBM01145", "IBM01146", "IBM01147", "IBM01148", "IBM01149", "ISO-2022-JP", "ISO-2022-KR", "ISO-8859-1", "ISO-8859-2", "ISO-8859-3", "ISO-8859-4", "ISO-8859-5", "ISO-8859-6", "ISO-8859-7", "ISO-8859-8", "ISO-8859-9", "ISO-8859-13", "ISO-8859-15", "WINDOWS-874", "WINDOWS-1250", "WINDOWS-1251", "WINDOWS-1252", "WINDOWS-1253", "WINDOWS-1254", "WINDOWS-1255", "WINDOWS-1256", "WINDOWS-1257", "WINDOWS-1258" | No | encodingName |
File pattern | Indicates the pattern of data stored in each JSON file. | Set of objects, Array of objects | No | filePattern: setOfObjects, arrayOfObjects |
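A small builder that validates destination settings against the allowed values in the table above might look like this. The function name `destination_fragment`, the `typeProperties` and `formatSettings` containers, and the choice to omit `compression` when no codec is selected are all assumptions for this sketch; only the property names and allowed values come from the table.

```python
ALLOWED_FILE_PATTERNS = {"setOfObjects", "arrayOfObjects"}
ALLOWED_COMPRESSION_TYPES = {"bzip2", "gzip", "deflate", "ZipDeflate", "TarGzip", "tar"}
ALLOWED_LEVELS = {"Fastest", "Optimal"}


def destination_fragment(file_pattern="setOfObjects", compression_type=None, level="Optimal"):
    """Build a hypothetical destination fragment from validated settings."""
    if file_pattern not in ALLOWED_FILE_PATTERNS:
        raise ValueError(f"Unsupported filePattern: {file_pattern}")
    type_properties = {"encodingName": "UTF-8"}
    if compression_type is not None:
        if compression_type not in ALLOWED_COMPRESSION_TYPES:
            raise ValueError(f"Unsupported compression type: {compression_type}")
        if level not in ALLOWED_LEVELS:
            raise ValueError(f"Unsupported compression level: {level}")
        type_properties["compression"] = {"type": compression_type, "level": level}
    return {
        # Nesting below is assumed, not confirmed by this article.
        "datasetSettings": {"type": "Json", "typeProperties": type_properties},
        "formatSettings": {"filePattern": file_pattern},
    }


fragment = destination_fragment("arrayOfObjects", compression_type="gzip", level="Fastest")
print(fragment)
```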