ADF - Using a dataflow activity, how can you remove NULL/EMPTY objects from a JSON hierarchy

Wardie;) 0 Reputation points
2023-05-18T16:41:25.6333333+00:00

Within Dataflow activity, I've got a JSON file at source with various hierarchical objects within it.

User's image

Next, I have a dataflow activity which anonymises the data using the same hierarchical structure as the source JSON file (e.g. addressLine1 =

case(isNull(event.payload.content.payload.subject.requester.address.addressLine1), toString(null()),'****')

Then it rolls up into the upper hierarchical object of 'address'

@(addressLine1=case(isNull(event.payload.content.payload.subject.requester.address.addressLine1), toString(null()),'****'),
		addressLine2=case(isNull(event.payload.content.payload.subject.requester.address.addressLine2), toString(null()),'****'),
		addressLine3=case(isNull(event.payload.content.payload.subject.requester.address.addressLine3), toString(null()),'****'),
		country=event.payload.content.payload.subject.requester.address.country,
		county=case(isNull(event.payload.content.payload.subject.requester.address.county), toString(null()),'****'),
		knownCareHome=event.payload.content.payload.subject.requester.address.knownCareHome,
		localAuthorityCode=event.payload.content.payload.subject.requester.address.localAuthorityCode,
		lsoa=event.payload.content.payload.subject.requester.address.lsoa,
		odsLocationCode=case(isNull(event.payload.content.payload.subject.requester.address.odsLocationCode), toString(null()),'****'),
		postcode=case(isNull(event.payload.content.payload.subject.requester.address.postcode),toString(null()),
case(length(event.payload.content.payload.subject.requester.address.postcode)==8,rpad(left(event.payload.content.payload.subject.requester.address.postcode,4),8,'*'),
case(length(event.payload.content.payload.subject.requester.address.postcode)==7,rpad(left(event.payload.content.payload.subject.requester.address.postcode,4),7,'*'),
case(length(event.payload.content.payload.subject.requester.address.postcode)==6,rpad(left(event.payload.content.payload.subject.requester.address.postcode,4),6,'*'),
case(length(event.payload.content.payload.subject.requester.address.postcode)==5,rpad(left(event.payload.content.payload.subject.requester.address.postcode,4),5,'*'),
case(length(event.payload.content.payload.subject.requester.address.postcode)<=4,trim(left(event.payload.content.payload.subject.requester.address.postcode,4)))))))),
		postcodeSource=event.payload.content.payload.subject.requester.address.postcodeSource,
		town=case(isNull(event.payload.content.payload.subject.requester.address.town), toString(null()),'****'))

Then, further it rolls up into the upper hierarchical object of 'requester', and so forth

@(address=@(addressLine1=case(isNull(event.payload.content.payload.subject.requester.address.addressLine1), toString(null()),'****'),
		addressLine2=case(isNull(event.payload.content.payload.subject.requester.address.addressLine2), toString(null()),'****'),
		addressLine3=case(isNull(event.payload.content.payload.subject.requester.address.addressLine3), toString(null()),'****'),
		country=event.payload.content.payload.subject.requester.address.country,
		county=case(isNull(event.payload.content.payload.subject.requester.address.county), toString(null()),'****'),
		knownCareHome=event.payload.content.payload.subject.requester.address.knownCareHome,
		localAuthorityCode=event.payload.content.payload.subject.requester.address.localAuthorityCode,
		lsoa=event.payload.content.payload.subject.requester.address.lsoa,
		odsLocationCode=case(isNull(event.payload.content.payload.subject.requester.address.odsLocationCode), toString(null()),'****'),
		postcode=case(isNull(event.payload.content.payload.subject.requester.address.postcode),toString(null()),
case(length(event.payload.content.payload.subject.requester.address.postcode)==8,rpad(left(event.payload.content.payload.subject.requester.address.postcode,4),8,'*'),
case(length(event.payload.content.payload.subject.requester.address.postcode)==7,rpad(left(event.payload.content.payload.subject.requester.address.postcode,4),7,'*'),
case(length(event.payload.content.payload.subject.requester.address.postcode)==6,rpad(left(event.payload.content.payload.subject.requester.address.postcode,4),6,'*'),
case(length(event.payload.content.payload.subject.requester.address.postcode)==5,rpad(left(event.payload.content.payload.subject.requester.address.postcode,4),5,'*'),
case(length(event.payload.content.payload.subject.requester.address.postcode)<=4,trim(left(event.payload.content.payload.subject.requester.address.postcode,4)))))))),
		postcodeSource=event.payload.content.payload.subject.requester.address.postcodeSource,
		town=case(isNull(event.payload.content.payload.subject.requester.address.town), toString(null()),'****')),
		organisationContactFirstName=case(isNull(event.payload.content.payload.subject.requester.organisationContactFirstName),toString(null()),'****'),
		organisationContactLastName=case(isNull(event.payload.content.payload.subject.requester.organisationContactLastName),toString(null()),'****'),
		organisationId=event.payload.content.payload.subject.requester.organisationId,
		organisationName=event.payload.content.payload.subject.requester.organisationName,
		organisationRole=event.payload.content.payload.subject.requester.organisationRole,
		organisationType=event.payload.content.payload.subject.requester.organisationType)

Now, the issue I'm having is when it produces/writes the JSON file at Sink. I want to essentially, tidy the JSON file. If there are any NULL/Empty objects within the file then to remove these completely. i.e. 'address' object is empty/NULL so needs to be removed from the JSON file. Because 'address' is empty/NULL, 'requester' object should also be removed from the JSON file too, and so forth. The point is, if the lower object(s) are NULL/Empty then remove the object as you go up the object hierarchy

User's image

Any advise to assist me on this matter would be greatly appreciated ;)

Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
11,639 questions
{count} votes

1 answer

Sort by: Most helpful
  1. AnnuKumari-MSFT 34,556 Reputation points Microsoft Employee Moderator
    2023-05-19T10:16:04.27+00:00

    Hi Wardie;) ,

    Welcome to Microsoft Q&A platform and thanks for posting your query here.

    From the description of your question, it seems you want to remove json object if it is having empty value or NULL value. Please let me know if that is not the correct understanding.

    First of all , you need to flatten the json to convert it into tabular structure so that you filter out the unwanted row using filter transformation.

    For flattening the json , we need the column in array format , here in your case , it is complex datatype. You can use 'collect' function in aggregate transformation to convert the complex datatype into array datatype and then use flatten transformation on top of it to convert it into individual rows and columns and use filter transformation to write the expression to filter out the null records for the column .

    Kindly checkout this video from the specified time to convert json into array format : How to convert a JSON record into NESTED JSON using mapping dataflow

    In case you still have any doubts in the implementation, I would recommend you to share the complete source json as an attachment for better understanding.

    Hope it helps. Kindly accept the answer by clicking on Accept answer button as accepted answer helps community.


Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.