Function App Blob Upload Form Recogniser

John Hay 0 Reputation points
2023-04-15T03:19:35.7833333+00:00

Hi, I am new to coding and the Azure packages and am trying to get my first function app going, but I am stuck at a couple of stages: my blob is not defined, and neither are the table elements. Is there code I need to add? Once this is operational, will it be an automated process, or do I need to add more triggers? Ideally I am hoping to upload and read mostly PDFs, then output the results into a table so I can start to use the AI capabilities of Cognitive Search as well as ingesting into Power BI. I am hoping you could give me some pointers to finish this off so it operates successfully and I can continue.


Azure Functions
An Azure service that provides an event-driven serverless compute platform.
Azure Blob Storage
An Azure service that stores unstructured data in the cloud as blobs.

1 answer

  1. Ryan Hill 26,236 Reputation points Microsoft Employee
    2023-04-23T01:05:01.58+00:00

    I can't say for certain because you only posted a portion of the code, but the error you're getting means Python doesn't know what num_table is. It appears you defined a function that returns an object composed of h, num_table, and p, but you then jump straight into a for loop inside the main function without calling that function and assigning its result, so num_table is never defined there. There are various Python samples that illustrate how to use blob storage. One way to get started is the snippet below: an HTTP-triggered function that accepts a PDF file in the request body, extracts the text using PyPDF2, and uploads the file to a blob container.

    import io
    import logging
    import os

    import azure.functions as func
    import PyPDF2
    from azure.storage.blob import BlobServiceClient

    def main(req: func.HttpRequest) -> func.HttpResponse:
        logging.info('Python HTTP trigger function processed a request.')

        # Read the storage connection string from an app setting instead of hard-coding the key.
        # AzureWebJobsStorage points at the function app's own storage account;
        # use a different setting name if the PDFs belong in another account.
        connect_str = os.environ["AzureWebJobsStorage"]

        # Set the name of the container you want to use
        container_name = "<your_container_name>"

        # Get the PDF file (raw bytes) from the request body
        file = req.get_body()
        if not file:
            return func.HttpResponse("Request body must contain a PDF file.", status_code=400)

        # Read the text off the PDF file. PdfReader needs a file-like object, so wrap the bytes.
        # (PdfFileReader, getNumPages, getPage and extractText were removed in PyPDF2 3.0.)
        pdf_reader = PyPDF2.PdfReader(io.BytesIO(file))
        text = ""
        for page in pdf_reader.pages:
            text += page.extract_text() or ""

        # Create a BlobServiceClient object using the connection string
        blob_service_client = BlobServiceClient.from_connection_string(connect_str)

        # Create a ContainerClient object for the container you want to use
        container_client = blob_service_client.get_container_client(container_name)

        # Create a BlobClient object for the PDF file you want to upload
        blob_client = container_client.get_blob_client("<your_blob_name>.pdf")

        # Upload the PDF file to Azure Storage, replacing any existing blob with the same name
        blob_client.upload_blob(file, overwrite=True)

        logging.info('PDF file uploaded successfully!')

        return func.HttpResponse(text)
    
    

    You can declare a function for reading the elements from your PDF, like you have above, and return the contents that were read. You also don't have to use an HTTP trigger; a Queue or Blob trigger works just as easily, and a Blob trigger runs automatically whenever a file is uploaded to the container. As for Cognitive Search, have a look at the overview information on using Azure Search with Python.
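To illustrate the point about returning the contents from a helper, here is a minimal sketch in plain Python. The names h, num_table, and p come from the error discussed above; what they hold is an assumption, since the original code was only partly visible. The helper read_elements and its rules (all-caps lines as headings, lines containing "|" as table rows) are hypothetical stand-ins, not what Form Recognizer or PyPDF2 actually returns.

```python
from typing import List, Tuple


def read_elements(lines: List[str]) -> Tuple[List[str], int, List[str]]:
    """Split raw text lines into headings, a table-row count, and paragraphs.

    Hypothetical stand-in for the PDF-reading helper; the rules below are
    illustrative only.
    """
    h = [line for line in lines if line.isupper()]        # ALL-CAPS lines treated as headings
    num_table = sum(1 for line in lines if "|" in line)   # lines that look like table rows
    p = [line for line in lines if line not in h and "|" not in line]
    return h, num_table, p


def main(lines: List[str]) -> str:
    # Call the helper and unpack its result before using the values.
    # Without this line, num_table does not exist inside main and
    # Python raises NameError as soon as the loop below touches it.
    h, num_table, p = read_elements(lines)

    summary = []
    for name, count in (("headings", len(h)), ("table rows", num_table), ("paragraphs", len(p))):
        summary.append(f"{count} {name}")
    return ", ".join(summary)
```

Calling main(["INVOICE", "Item | Price", "Thanks for your order."]) returns a one-line summary of the counts. The same pattern applies inside an Azure Function: main calls the helper, assigns what it returns, and only then loops over the values.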
