Thanks for the question and for using the Microsoft Q&A platform.
I understand that you’re trying to read a PDF file from Azure Blob Storage and convert it to base64 format using Azure Data Factory (ADF). However, you’re encountering an issue where the base64 value returned by the Web activity doesn’t match the expected value.
One possible reason for this issue is that the PDF file is not being read correctly from the blob storage. PDF files are binary files, and they need to be read in binary mode to ensure that the contents are not corrupted.
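To illustrate why the read mode matters, here is a minimal Python sketch. It is not ADF code, and the sample bytes are a made-up stand-in for a real PDF. It only shows that once binary content passes through a text decoding step, the base64 output no longer matches the base64 of the original bytes.

```python
import base64

# Hypothetical sample bytes standing in for a PDF file (an assumption for
# illustration; a real PDF also starts with "%PDF-" followed by binary data).
pdf_bytes = b"%PDF-1.7\n\x9c\xff\x00\xe2binary payload"

# Correct: base64-encode the raw bytes exactly as stored.
expected = base64.b64encode(pdf_bytes).decode("ascii")

# Incorrect: treat the bytes as UTF-8 text first. Byte sequences that are not
# valid UTF-8 are replaced, so the original content is lost.
as_text = pdf_bytes.decode("utf-8", errors="replace")
corrupted = base64.b64encode(as_text.encode("utf-8")).decode("ascii")

# The two base64 strings differ because the underlying bytes differ.
print(expected == corrupted)
```

If the value you get back differs from the expected one in this way, a text conversion somewhere in the pipeline is a likely suspect.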
Unfortunately, as of now, ADF has no PDF-aware format and cannot read or transform the contents of a PDF file. ADF can get metadata about your files, no matter the format, but it does not include document or image manipulation tools, and for this kind of unstructured data it does no more than move or compress/uncompress the files.
However, there might be a workaround. Azure Synapse Analytics, which contains the functionality of Data Factory, allows for a much more free-form workload. For example, you could use a library/module for the base64 conversion in a Spark notebook. The notebook would load the file, do the transformation, and write the result back to blob storage. This does require some level of comfort with writing code.
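As a rough sketch of what the notebook logic could look like, the Python below loads a file in binary form, base64-encodes it, and writes the result out. The function name `pdf_to_base64_file` and the file paths are hypothetical, and the demo uses a local temporary folder. In a Synapse notebook you would need to replace the paths with however your storage account is mounted or accessed, which is not shown here.

```python
import base64
import tempfile
from pathlib import Path


def pdf_to_base64_file(src: Path, dst: Path) -> str:
    """Read src as raw bytes, base64-encode it, and write the text to dst."""
    raw = src.read_bytes()  # binary read, no text decoding
    encoded = base64.b64encode(raw).decode("ascii")
    dst.write_text(encoded, encoding="ascii")
    return encoded


# Local demonstration only. The paths and sample content are assumptions;
# in a notebook they would point at the mounted blob storage location.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "sample.pdf"
    dst = Path(tmp) / "sample.pdf.b64"
    src.write_bytes(b"%PDF-1.7\n\x00\xff sample")

    encoded = pdf_to_base64_file(src, dst)

    # Decode the written file again to confirm nothing was altered.
    original = src.read_bytes()
    round_trip = base64.b64decode(dst.read_text(encoding="ascii"))
    print(round_trip == original)
```

Decoding the output and comparing it with the source bytes, as the demo does, is a quick way to confirm the conversion is lossless before you wire it into a pipeline.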
Hope this helps. Do let us know if you have any further queries.
If this answers your query, please click Accept Answer and select Yes for "Was this answer helpful".