Hello @GrigoropoulosMichail-7458 ,
Thanks for the question and for using the Microsoft Q&A platform.
I would like to break your ask up into several elements:
1. Get the metadata for a variety of formats in blob storage
2. Apply a transformation to a binary format, specifically jpeg or pdf
3. Apply the transformation to the file as a whole, as is, as opposed to the data inside.
Data Factory can get metadata about your files regardless of format, for example with the Get Metadata activity. So the first element is entirely doable.
Data Factory does not include image manipulation tools. For this kind of unstructured binary data it can only move, compress, or decompress files. So the second element cannot be done in Data Factory.
Data Flow has a toBase64 function, but it transforms the data contained within a file as that data is written to another format. It does not transform the file as a whole. So the third element is not really doable there either.
However, your asks may be possible in Azure Synapse Analytics. Synapse has Spark notebooks, which allow a much more free-form workload. For example, you could find a library or module for the base64 conversion and use it in the notebook. You could have the notebook load the file, do the transformation, and write the result back to blob storage. Treating each file as a byte stream rather than as a specific file type lets you handle all the files the same way. This does require some level of comfort with writing code.
Azure Synapse also includes Data Factory's pipeline functionality, so you can take care of the other parts of the job there as well.
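To give a rough idea of the byte-stream approach, here is a minimal Python sketch that uses only the standard library `base64` module. The function name `encode_stream` and the chunk size are my own choices for illustration. The in-memory `io.BytesIO` objects stand in for whatever file handles your storage client gives you in the notebook; reading from and writing to blob storage is not shown here and depends on how your workspace is set up.

```python
import base64
import io


def encode_stream(src, dst, chunk_size=3 * 1024 * 1024):
    """Base64-encode everything readable from src and write it to dst.

    src and dst are binary file-like objects. The content is never
    interpreted, so a jpeg and a pdf are handled identically.
    """
    # Base64 maps every 3 input bytes to 4 output characters. Keeping the
    # chunk size a multiple of 3 means padding only appears at the very end,
    # so the encoded chunks concatenate into one valid Base64 string.
    if chunk_size <= 0 or chunk_size % 3 != 0:
        raise ValueError("chunk_size must be a positive multiple of 3")
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(base64.b64encode(chunk))


# Hypothetical stand-in for a file loaded from blob storage.
original = b"\xff\xd8\xff\xe0" + b"arbitrary binary payload" * 100
source = io.BytesIO(original)
target = io.BytesIO()
encode_stream(source, target, chunk_size=3 * 16)

# Decoding the result gives back the original bytes unchanged.
assert base64.b64decode(target.getvalue()) == original
```

Reading in chunks keeps memory use flat for large files. For small files, a single `base64.b64encode(data)` call on the whole content would do the same job.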
There may be other Azure services also capable of better meeting your needs.
Please do let me know if you have any queries.
Thanks
Martin