Issue with Azure Functions Blob Storage trigger for delete events

Su Myat Hlaing 75 Reputation points
2024-05-22T07:53:23.0733333+00:00

I have an Azure Function app configured with the Azure Functions Blob Storage trigger. The goal is to update an Azure Cognitive Search index when a file is uploaded to Azure Blob Storage, and to delete the corresponding document from the search index when a file is deleted from Azure Blob Storage. The upload trigger fires correctly when a file is uploaded to Blob Storage.

Issue: The delete trigger, however, never fires.

  1. How can I modify my Azure Function app to ensure that when a file is deleted from Azure Blob Storage, the corresponding document is deleted from Azure Cognitive Search?
  2. Is there a way to achieve this using Azure Functions and Blob Storage trigger, or do I need to use a different approach?

Any help or guidance on this issue would be greatly appreciated. Thank you!

Code:


const { app } = require('@azure/functions');
const { BlobServiceClient, StorageSharedKeyCredential } = require('@azure/storage-blob');
const { SearchClient, AzureKeyCredential } = require('@azure/search-documents');
const fs = require('fs');
const path = require('path');
const pdf = require('pdf-parse');
// Azure Blob Storage configuration
const accountName = "";
const accountKey = "";
const containerName = '';

// Azure Cognitive Search configuration
const searchServiceName = "";
const indexName = "";
const apiKey = "";

const blobServiceClient = new BlobServiceClient(
    `https://${accountName}.blob.core.windows.net`,
    new StorageSharedKeyCredential(accountName, accountKey)
);

const containerClient = blobServiceClient.getContainerClient(containerName);

const searchClient = new SearchClient(
    `https://${searchServiceName}.search.windows.net/`,
    indexName,
    new AzureKeyCredential(apiKey)
);

async function indexBlob(blobTrigger) {
    try {
        const blobClient = containerClient.getBlobClient(blobTrigger.triggerMetadata.name);
        const downloadResponse = await blobClient.download();
        console.log("downloadResponse", downloadResponse);
        const blobName = blobTrigger.triggerMetadata.name;
        const encodedName = Buffer.from(blobTrigger.triggerMetadata.name).toString('base64');
        const properties = blobTrigger.triggerMetadata.properties;
        const pdfBuffer = await streamToBuffer(downloadResponse.readableStreamBody);

        const pdfText = await pdf(pdfBuffer);
        const blobContent = pdfText.text;

        const document = {
            id: encodedName,
            content: blobContent,
            metadata_storage_content_type: properties.contentType || null,
            metadata_storage_size: properties.contentLength || null,
            metadata_storage_last_modified: properties.lastModified ? new Date(properties.lastModified).toISOString() : null,
            metadata_storage_content_md5: properties.contentMD5 ? Buffer.from(properties.contentMD5).toString('base64') : null,
            metadata_storage_name: blobName,
            metadata_storage_path: blobTrigger.uri,
            metadata_storage_file_extension: path.extname(blobName),
            metadata_content_type: properties.contentType || null,
            metadata_language: null, 
            metadata_author: null,
            metadata_creation_date: properties.creationTime ? new Date(properties.creationTime).toISOString() : null,
        };


        await searchClient.uploadDocuments([document]);
        console.log(`Document "${document.id}" has been indexed`);
    } catch (error) {
        console.error(`Error indexing document: ${error}`);
    }
}


app.storageBlob('process-blob-for-search', {
    path: 'content/{name}', 
    connection: 'AzureWebJobsStorage',
    direction: 'in',
    dataType: 'binary',
    async handler(context, blobTrigger) {
        console.log(`Blob "${blobTrigger.triggerMetadata.name}" has been uploaded`);
        console.log(blobTrigger);
        await indexBlob(blobTrigger);
    }
});

async function streamToBuffer(readableStream) {
    return new Promise((resolve, reject) => {
        const chunks = [];
        readableStream.on("data", (data) => {
            chunks.push(data instanceof Buffer ? data : Buffer.from(data));
        });
        readableStream.on("end", () => {
            resolve(Buffer.concat(chunks));
        });
        readableStream.on("error", reject);
    });
}

app.storageBlob('delete-blob-from-search', {
    path: 'content/{name}', 
    connection: 'AzureWebJobsStorage',
    direction: 'out',
    async handler(context, blobTrigger) {
        console.log(`Blob "${blobTrigger.triggerMetadata.name}" has been deleted`);
        const blobName = blobTrigger.triggerMetadata.name;
        const encodedName = Buffer.from(blobName).toString('base64');
        try {
            await searchClient.deleteDocuments([encodedName]); 
            console.log(`Document "${encodedName}" has been deleted from the index`);
        } catch (error) {
            console.error(`Error deleting document from the index: ${error}`);
        }
    }
});



1 answer

  1. MayankBargali-MSFT 69,846 Reputation points
    2024-05-22T14:19:29.34+00:00

    @Su Myat Hlaing Thanks for reaching out. The Blob Storage trigger fires when a blob is created or updated, not when one is deleted, so instead of the storage trigger you should use an Event Grid trigger.

    Flow:

    Storage account (Event Grid source) --> Event Grid subscription (subscribed to the Microsoft.Storage.BlobCreated and Microsoft.Storage.BlobDeleted events) --> Azure Function with an Event Grid trigger

    In your function code, inspect the request body to see which event type triggered the function, then run the business logic that matches that event type.
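
The event-type check described above could look roughly like the sketch below. It assumes events arrive in the Event Grid schema, where the type is in `eventType` (the CloudEvents schema uses `type` instead). `routeBlobEvent` and the `onCreated` / `onDeleted` callbacks are hypothetical names, and the commented-out registration assumes the v4 Node.js programming model already used in the question.

```javascript
// Event types named in the flow above.
const BLOB_CREATED = 'Microsoft.Storage.BlobCreated';
const BLOB_DELETED = 'Microsoft.Storage.BlobDeleted';

/**
 * Hypothetical helper: picks the callback that matches the event type.
 * Returns whatever the callback returns, or undefined for other event types.
 */
function routeBlobEvent(event, handlers) {
    // Event Grid schema uses `eventType`; CloudEvents schema uses `type`.
    const eventType = event.eventType || event.type;
    switch (eventType) {
        case BLOB_CREATED:
            return handlers.onCreated(event);
        case BLOB_DELETED:
            return handlers.onDeleted(event);
        default:
            return undefined; // not an event this function cares about
    }
}

// Registration sketch (requires @azure/functions v4, so it is left commented out):
// const { app } = require('@azure/functions');
// app.eventGrid('blob-events-to-search', {
//     handler: async (event, context) => {
//         await routeBlobEvent(event, {
//             onCreated: (e) => indexBlobFromEvent(e),     // hypothetical
//             onDeleted: (e) => deleteDocumentForEvent(e), // hypothetical
//         });
//     },
// });
```

Keeping the routing in a plain function means it can be exercised with hand-built event objects, without deploying the function app.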

    If you need to read the content of the blob, use the Storage SDK inside the function to download it, then continue with your business workflow.
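
As one way to do this, the sketch below derives the container and blob name from the blob URL that storage events carry in `data.url`, then downloads the blob with `@azure/storage-blob`. `parseBlobUrl` and `readBlobFromEvent` are hypothetical names, the URL is assumed to use the default `https://<account>.blob.core.windows.net/<container>/<blob>` layout, and a `BlobServiceClient` like the one in the question is assumed to be passed in.

```javascript
/**
 * Hypothetical helper: splits a blob URL into container and blob name.
 * Assumes the default endpoint layout:
 * https://<account>.blob.core.windows.net/<container>/<blob>
 */
function parseBlobUrl(blobUrl) {
    const { pathname } = new URL(blobUrl);
    // pathname looks like "/<container>/<blob name, possibly with slashes>"
    const [containerName, ...blobParts] = pathname.replace(/^\//, '').split('/');
    // Decode percent-encoding so names with spaces match the stored blob name.
    const blobName = blobParts.map((part) => decodeURIComponent(part)).join('/');
    return { containerName, blobName };
}

/**
 * Downloads the blob an event refers to and resolves to its bytes as a Buffer.
 * `blobServiceClient` is assumed to be a BlobServiceClient from @azure/storage-blob.
 */
async function readBlobFromEvent(event, blobServiceClient) {
    const { containerName, blobName } = parseBlobUrl(event.data.url);
    const blobClient = blobServiceClient
        .getContainerClient(containerName)
        .getBlobClient(blobName);
    return blobClient.downloadToBuffer(); // available in the Node.js runtime only
}
```

A deleted blob can no longer be downloaded, so a helper like this is only useful for Microsoft.Storage.BlobCreated events.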

    The recommendation to use the Event Grid trigger rather than the storage trigger is documented in the tip section here:

    If you choose to use the Blob storage trigger, note that there are two implementations offered: a polling-based one (referenced in this article) and an event-based one. It is recommended that you use the event-based implementation as it has lower latency than the other.

    If you still want to keep the storage trigger for uploads, you can create another function app with an Event Grid trigger that subscribes only to the Microsoft.Storage.BlobDeleted event.
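
A delete-only handler might be sketched as below. It assumes the index key field is named `id` and holds the base64-encoded blob name, as in the question's indexing code. `keyForDeletedBlob` is a hypothetical helper, and the commented-out wiring assumes the `app` and `searchClient` objects from the question. The blob name is read from the event `subject`, which for storage events has the form `/blobServices/default/containers/<container>/blobs/<blob>`.

```javascript
// Subject layout of Blob Storage events.
const SUBJECT_PATTERN = /^\/blobServices\/default\/containers\/([^/]+)\/blobs\/(.+)$/;

/**
 * Hypothetical helper: returns the search document key for a BlobDeleted event,
 * or null when the event has another type, an unexpected subject,
 * or belongs to a container other than `expectedContainer` (if given).
 */
function keyForDeletedBlob(event, expectedContainer) {
    if (event.eventType !== 'Microsoft.Storage.BlobDeleted') return null;
    const match = SUBJECT_PATTERN.exec(event.subject || '');
    if (!match) return null;
    const [, container, blobName] = match;
    if (expectedContainer && container !== expectedContainer) return null;
    // Same encoding as the question's indexBlob(), so the keys line up.
    return Buffer.from(blobName).toString('base64');
}

// Wiring sketch (requires @azure/functions and @azure/search-documents,
// so it is left commented out):
// app.eventGrid('delete-blob-from-search', {
//     handler: async (event, context) => {
//         const key = keyForDeletedBlob(event, 'content');
//         if (key === null) return;
//         // Overload that takes the key field name and the key values to delete.
//         await searchClient.deleteDocuments('id', [key]);
//         context.log(`Deleted document "${key}" from the index`);
//     },
// });
```

Even when the Event Grid subscription is filtered to Microsoft.Storage.BlobDeleted, the type check in the helper is a cheap guard against a misconfigured subscription.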

    For the list of events that Blob Storage publishes to Event Grid, refer to this document.

    Feel free to get back to me if you have any queries or concerns.
