I'm trying to load data from a relational model into Azure SQL into CosmosDB to be consumed by an application. First, I built a JSON model with some nested properties and some hierarchies and I'm able to copy them from a limited number of rows into Cosmos. I get an error when trying to upload 10 or more documents into my collection. Then I started to make some tests to troubleshoot. Let me provide some details below you you can see:
This is the pipeline I built to copy data to Cosmos:
First I created a complex query to copy data from relational model to JSON file with nested properties and it broke. Then I checked the error 2200:
It's saying sth about one of the properties I created into the model. For testing reasons I commented this column and I still got a lot more with messagens like "message: Unexpected character encountered while parsing value: n.". Then I googled it and people had issues with this and no clear solution without hard coding. I mean, I'm not able to copy data from a JSON valid file to Cosmos because sth is breaking while JSON objects are being parsed. I was not able to find all those unexpected characters and I'm not sure if the best way to resolve it would be that I had to look for and clean up data before load them into cosmos, because they are strings at most.
When troubleshooting, I created a query to select columns from tables from sources and I built a simpler model with JSON with only a few nested properties also using "FOR JSON PATH, INCLUDE_NULL_VALUES" and then I copied data to my data lake fine. JSON file was checked and it is valid:
select
newid() as "_id",
[name] as "name.name",
case when charindex(' ',[name]) > 0 then
left([name],charindex(' ',[name])-1)
else
[name]
end as "name.firstName",
case when charindex(' ',[name]) > 0 then
reverse(substring(reverse([name]), 1, charindex(' ', reverse([name]))-1))
else
[name]
end as "name.lastName",
null as "name.username"
from vw_Employees for json path, include_null_values
Next, I copied data from JSON file into my ADLS as a source to CosmosDB as a sink using ADF. For the first TOP 10 rows it works fine and then when I try to load a greater amount of rows (10+) I get error code 2200 and the following message: "Message=Error occurred when deserializing source JSON file. Check if the data is valid JSON object format. Unexpected character encountered while parsing value: n".
I've used jsonlint to validate JSON files when it is built and it's valid.
I saved it into ADLS and it looks fine. When looking deeper and trying to find this unexpected character encountered I wasn't able to find it within the JSON file containing 10 objects or more. As I mentioned, the first 10 rows were copied to Cosmos within documents as designed originally in my data model.
Do you have any ideas or can you help me to figure out how to fix this?
I'm also considering further options to copy data from a relational database to document model into Cosmos meanwhile this way is not working because I need to load more than 250 thousand documents into cosmos for the application.
Thanks in advance.