I can't say for certain because you only posted a portion of the code, but the error you're getting means the Python interpreter doesn't know what num_table is at the point where you use it. It appears you defined a function that returns an object composed of h, num_table, and p, and then jump straight into the for loop inside the main function, apparently without first calling that function and assigning its result.
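Since I can't see the rest of your code, here is only a minimal sketch of the pattern I mean. The function name read_pdf_elements and the values it returns are made up for illustration; only h, num_table, and p come from your post.

```python
def read_pdf_elements():
    """Hypothetical stand-in for the function in the question."""
    h = "header"
    num_table = 3
    p = ["row 0", "row 1", "row 2"]
    # Returning several values packs them into a tuple
    return h, num_table, p


def main():
    # Call the function and unpack its result first. Without this line,
    # num_table is not defined inside main() and the loop raises NameError.
    h, num_table, p = read_pdf_elements()

    rows = []
    for i in range(num_table):
        rows.append(p[i])
    return rows
```

The names inside read_pdf_elements are local to that function, so main only sees them once it has called the function and bound the returned values to its own variables.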
There are various Python samples that illustrate how to use Blob Storage. To get you started, here is a snippet: an HTTP-triggered function that accepts a PDF file in the request body, extracts its text using PyPDF2, and uploads the file to a container.
    import io
    import logging

    import azure.functions as func
    import PyPDF2
    from azure.storage.blob import BlobServiceClient


    def main(req: func.HttpRequest) -> func.HttpResponse:
        logging.info('Python HTTP trigger function processed a request.')

        # Set the connection string to your storage account
        connect_str = "DefaultEndpointsProtocol=https;AccountName=<your_account_name>;AccountKey=<your_account_key>;EndpointSuffix=core.windows.net"
        # Set the name of the container you want to use
        container_name = "<your_container_name>"

        # Get the PDF file (raw bytes) from the request body
        file = req.get_body()

        # Read the elements off the PDF file. The reader expects a file-like
        # object, so wrap the bytes in BytesIO. This uses the PyPDF2 3.x names;
        # older releases used PdfFileReader, getNumPages, getPage, extractText.
        pdf_reader = PyPDF2.PdfReader(io.BytesIO(file))
        text = ""
        for page in pdf_reader.pages:
            text += page.extract_text()

        # Create a BlobServiceClient object using the connection string
        blob_service_client = BlobServiceClient.from_connection_string(connect_str)
        # Create a ContainerClient object for the container you want to use
        container_client = blob_service_client.get_container_client(container_name)
        # Create a BlobClient object for the PDF file you want to upload
        blob_client = container_client.get_blob_client("<your_blob_name>.pdf")
        # Upload the PDF file to Azure Storage
        blob_client.upload_blob(file)
        logging.info('PDF file uploaded successfully!')

        return func.HttpResponse(text)
You can declare a function for reading the elements from your PDF, as you have above, and return the contents that were read. You also don't have to use an HTTP trigger; a Queue or Blob trigger works just as well. As for Cognitive Search, have a look at this overview information on using Azure Search with Python.
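To make the first suggestion concrete, here is a rough sketch of such a reading function. The name extract_text is my own, and it assumes the reader object behaves like PyPDF2's PdfReader (a pages sequence whose items have an extract_text() method). The fake reader at the bottom is only a stand-in so the sketch runs without a real PDF.

```python
from types import SimpleNamespace


def extract_text(reader) -> str:
    """Return the text of every page in `reader`, one page per line group.

    Assumes `reader` exposes `.pages`, and each page has `extract_text()`,
    as PyPDF2's PdfReader does.
    """
    parts = []
    for page in reader.pages:
        # Fall back to an empty string if a page yields no text
        parts.append(page.extract_text() or "")
    return "\n".join(parts)


# Stand-in reader for a quick check. In the function app you would pass
# something like PyPDF2.PdfReader(io.BytesIO(req.get_body())) instead.
fake_reader = SimpleNamespace(
    pages=[
        SimpleNamespace(extract_text=lambda: "page one"),
        SimpleNamespace(extract_text=lambda: "page two"),
    ]
)
demo_text = extract_text(fake_reader)
```

Because the function only depends on the reader it is given, the same code can be called from an HTTP, Queue, or Blob trigger; each trigger just has to obtain the PDF bytes in its own way and build the reader.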