Hi @Pravalika-randstad,
Thank you for reaching out to us with your query.
To delete an entire folder with Azure Data Factory, you can try using the following approach:
- This is my folder structure:
raw
    MainFolder
        SubfolderA
            20230425
                //files
            20230427
                //files
            20230429
                //files
            20230523
                //files
        SubfolderB
            20230425
                //files
            20230427
                //files
            20230429
                //files
            20230523
                //files
- As you want to delete the folders that are more than 7 days old, I first created a dates array using a ForEach over the expression below, which gives the array [0,1,2,3,4,5,6]:
@range(0,7)
- Inside the ForEach, I used an Append variable activity to append each date, in yyyyMMdd format, to an array variable with the expression below:
@formatDateTime(subtractFromTime(utcNow(),item(),'Day'),'yyyyMMdd')
- This gives an array of the dates for the last 7 days (today and the 6 days before it).
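To make the result of these two expressions concrete, here is a minimal Python sketch of the same calculation. It is only an illustration of the logic, not ADF code; the function name last_n_days and the fixed run date of 2023-05-29 are assumptions chosen for the example.

```python
from datetime import datetime, timedelta, timezone

def last_n_days(n=7, now=None):
    """Mirror @range(0,n) plus formatDateTime(subtractFromTime(utcNow(), i, 'Day'), 'yyyyMMdd')."""
    now = now or datetime.now(timezone.utc)
    # offset 0 is today, offset n-1 is the oldest date that is kept
    return [(now - timedelta(days=offset)).strftime("%Y%m%d") for offset in range(n)]

# Hypothetical run date, fixed so the result is reproducible
days = last_n_days(7, now=datetime(2023, 5, 29, tzinfo=timezone.utc))
print(days)
```

With that assumed run date, the array runs from 20230529 back to 20230523, so 20230523 is the oldest folder name that will be kept.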
- This is my pipeline flow:
- Use a Get Metadata activity first to get the subfolders list (SubfolderA, SubfolderB) and pass this child items array to a ForEach.
- Inside the ForEach, use another Get Metadata activity (with @item().name appended to the path) to get the list of date folders.
- Now, use a Filter activity on these child items. The filter keeps only the date folders whose names are not in the dates array.
- The filter output is the list of date folders that fall outside the last 7 days. We need to iterate through this array, but a ForEach cannot be nested inside another ForEach, so use an Execute Pipeline activity and pass it the current subfolder path and its corresponding filtered child items array.
- In the child pipeline, iterate through the child items and use a Delete activity on each one.
Use a dataset with a parameter for the folder path (it is named folderpath in the pipeline JSON below).
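The flow above can be sketched locally before running it against real storage. The Python below is a rough simulation under stated assumptions: the dictionary stands in for the Get Metadata child items, the keep set assumes a run date of 2023-05-29, and the function name folders_to_delete is hypothetical. It only computes the paths that the Delete activity would receive; it does not delete anything.

```python
def folders_to_delete(tree, root, keep_dates):
    """tree maps each subfolder name to the list of its date-folder names."""
    paths = []
    for subfolder, date_folders in tree.items():                   # outer ForEach over subfolders
        stale = [d for d in date_folders if d not in keep_dates]   # Filter: not(contains(daysarr, name))
        for name in stale:                                         # child pipeline ForEach
            paths.append(f"{root}/{subfolder}/{name}")             # folderpath given to Delete
    return paths

# Sample structure from this answer
tree = {
    "SubfolderA": ["20230425", "20230427", "20230429", "20230523"],
    "SubfolderB": ["20230425", "20230427", "20230429", "20230523"],
}
# Last 7 days for an assumed run date of 2023-05-29
keep = {"20230529", "20230528", "20230527", "20230526",
        "20230525", "20230524", "20230523"}

paths = folders_to_delete(tree, "MainFolder", keep)
for p in paths:
    print(p)
```

Under those assumptions, the 20230425, 20230427 and 20230429 folders in both subfolders are selected, and 20230523 is kept.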
My Parent pipeline JSON:
{
    "name": "parent",
    "properties": {
        "activities": [
            {
                "name": "get subfolders",
                "type": "GetMetadata",
                "dependsOn": [
                    {
                        "activity": "ForEach1",
                        "dependencyConditions": [
                            "Succeeded"
                        ]
                    }
                ],
                "policy": {
                    "timeout": "0.12:00:00",
                    "retry": 0,
                    "retryIntervalInSeconds": 30,
                    "secureOutput": false,
                    "secureInput": false
                },
                "userProperties": [],
                "typeProperties": {
                    "dataset": {
                        "referenceName": "sourcecsv",
                        "type": "DatasetReference",
                        "parameters": {
                            "folderpath": "MainFolder"
                        }
                    },
                    "fieldList": [
                        "childItems"
                    ],
                    "storeSettings": {
                        "type": "AzureBlobFSReadSettings",
                        "enablePartitionDiscovery": false
                    },
                    "formatSettings": {
                        "type": "DelimitedTextReadSettings"
                    }
                }
            },
            {
                "name": "iterate subfolders",
                "type": "ForEach",
                "dependsOn": [
                    {
                        "activity": "get subfolders",
                        "dependencyConditions": [
                            "Succeeded"
                        ]
                    }
                ],
                "userProperties": [],
                "typeProperties": {
                    "items": {
                        "value": "@activity('get subfolders').output.childItems",
                        "type": "Expression"
                    },
                    "isSequential": true,
                    "activities": [
                        {
                            "name": "get date folders",
                            "type": "GetMetadata",
                            "dependsOn": [],
                            "policy": {
                                "timeout": "0.12:00:00",
                                "retry": 0,
                                "retryIntervalInSeconds": 30,
                                "secureOutput": false,
                                "secureInput": false
                            },
                            "userProperties": [],
                            "typeProperties": {
                                "dataset": {
                                    "referenceName": "sourcecsv",
                                    "type": "DatasetReference",
                                    "parameters": {
                                        "folderpath": {
                                            "value": "@concat('MainFolder/',item().name)",
                                            "type": "Expression"
                                        }
                                    }
                                },
                                "fieldList": [
                                    "childItems"
                                ],
                                "storeSettings": {
                                    "type": "AzureBlobFSReadSettings",
                                    "enablePartitionDiscovery": false
                                },
                                "formatSettings": {
                                    "type": "DelimitedTextReadSettings"
                                }
                            }
                        },
                        {
                            "name": "Execute Pipeline1",
                            "type": "ExecutePipeline",
                            "dependsOn": [
                                {
                                    "activity": "Filter1",
                                    "dependencyConditions": [
                                        "Succeeded"
                                    ]
                                }
                            ],
                            "userProperties": [],
                            "typeProperties": {
                                "pipeline": {
                                    "referenceName": "child",
                                    "type": "PipelineReference"
                                },
                                "waitOnCompletion": true,
                                "parameters": {
                                    "date_folder": {
                                        "value": "@activity('Filter1').output.value",
                                        "type": "Expression"
                                    },
                                    "path": {
                                        "value": "@concat('MainFolder/',item().name)",
                                        "type": "Expression"
                                    }
                                }
                            }
                        },
                        {
                            "name": "Filter1",
                            "type": "Filter",
                            "dependsOn": [
                                {
                                    "activity": "get date folders",
                                    "dependencyConditions": [
                                        "Succeeded"
                                    ]
                                }
                            ],
                            "userProperties": [],
                            "typeProperties": {
                                "items": {
                                    "value": "@activity('get date folders').output.childItems",
                                    "type": "Expression"
                                },
                                "condition": {
                                    "value": "@not(contains(variables('daysarr'),item().name))",
                                    "type": "Expression"
                                }
                            }
                        }
                    ]
                }
            },
            {
                "name": "ForEach1",
                "type": "ForEach",
                "dependsOn": [],
                "userProperties": [],
                "typeProperties": {
                    "items": {
                        "value": "@range(0,7)",
                        "type": "Expression"
                    },
                    "isSequential": true,
                    "activities": [
                        {
                            "name": "Append variable1",
                            "type": "AppendVariable",
                            "dependsOn": [],
                            "userProperties": [],
                            "typeProperties": {
                                "variableName": "daysarr",
                                "value": {
                                    "value": "@formatDateTime(subtractFromTime(utcNow(),item(),'Day'),'yyyyMMdd')",
                                    "type": "Expression"
                                }
                            }
                        }
                    ]
                }
            }
        ],
        "variables": {
            "counter": {
                "type": "String"
            },
            "daysarr": {
                "type": "Array"
            },
            "temp": {
                "type": "String"
            },
            "new": {
                "type": "Array"
            }
        },
        "annotations": [],
        "lastPublishTime": "2023-05-02T07:27:09Z"
    },
    "type": "Microsoft.DataFactory/factories/pipelines"
}
Child Pipeline JSON:
{
    "name": "child",
    "properties": {
        "activities": [
            {
                "name": "ForEach1",
                "type": "ForEach",
                "dependsOn": [],
                "userProperties": [],
                "typeProperties": {
                    "items": {
                        "value": "@pipeline().parameters.date_folder",
                        "type": "Expression"
                    },
                    "isSequential": true,
                    "activities": [
                        {
                            "name": "Delete1",
                            "type": "Delete",
                            "dependsOn": [],
                            "policy": {
                                "timeout": "0.12:00:00",
                                "retry": 0,
                                "retryIntervalInSeconds": 30,
                                "secureOutput": false,
                                "secureInput": false
                            },
                            "userProperties": [],
                            "typeProperties": {
                                "dataset": {
                                    "referenceName": "sourcecsv",
                                    "type": "DatasetReference",
                                    "parameters": {
                                        "folderpath": {
                                            "value": "@concat(pipeline().parameters.path,'/',item().name)",
                                            "type": "Expression"
                                        }
                                    }
                                },
                                "enableLogging": false,
                                "storeSettings": {
                                    "type": "AzureBlobFSReadSettings",
                                    "recursive": true,
                                    "enablePartitionDiscovery": false
                                }
                            }
                        }
                    ]
                }
            }
        ],
        "parameters": {
            "date_folder": {
                "type": "array"
            },
            "path": {
                "type": "string"
            }
        },
        "annotations": []
    }
}
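One thing to keep in mind about the Filter1 condition in the parent pipeline: it selects every child item whose name is not one of the last 7 dates. That means a future-dated folder, or an item that is not named as a date at all, would also pass the filter and be deleted. If that matters in your case, the check could parse the folder name and compare dates instead. The Python below is only a sketch of that stricter logic, not an ADF expression; the function name is_stale, the keep_days parameter and the run date are assumptions for illustration.

```python
from datetime import date, datetime, timedelta

def is_stale(name, today, keep_days=7):
    """True only for yyyyMMdd-named folders older than the last keep_days dates."""
    if len(name) != 8 or not name.isdigit():
        return False  # not a date-named folder, leave it alone
    try:
        folder_date = datetime.strptime(name, "%Y%m%d").date()
    except ValueError:
        return False  # eight digits but not a valid calendar date
    oldest_kept = today - timedelta(days=keep_days - 1)
    return folder_date < oldest_kept  # future dates are never stale

today = date(2023, 5, 29)  # hypothetical run date
names = ["20230425", "20230523", "20230601", "archive"]
stale = [n for n in names if is_stale(n, today)]
print(stale)
```

With these sample names only 20230425 is selected, whereas a plain "not in the last 7 dates" check would also select 20230601 and archive.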
After the pipeline execution, you can see that the folders which were more than 7 days old were deleted.
Hope this helps. If this answers your query, please click Accept Answer and Yes for "Was this answer helpful". If you have any further queries, do let us know.