Process Parquet files in an Azure Function

Suraj Somani 1 Reputation point
2023-01-09T13:11:52.25+00:00

I need to create an Azure Function API that reads multiple Parquet files from a data lake, applies some filters to the data, and responds with the result. However, I am not able to deserialize the Parquet file content. I tried the Parquet.Net NuGet package with the code below, and it results in an empty records object. I suspect the problem is that the stream I receive from the data lake is not seekable, so when I copy it to a local stream no content is copied and the result is empty. Please let me know if anything is missing here:

            if (fileFormat == "parquet")  
            {  
                // The stream read from Azure Data Lake is not seekable, so copy it into a local MemoryStream before processing.
                var ms = new MemoryStream();  

                data.Content.CopyTo(ms);  
                ms.Position = 0;  
                  
                var records = (await ParquetConvert.DeserializeAsync
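
The buffering step described above can be checked on its own, independent of Parquet.Net. The following is a minimal sketch, assuming the downloaded content is exposed as a `System.IO.Stream`; the class and method names (`StreamBuffering`, `ToSeekableAsync`) are hypothetical and used only for illustration.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class StreamBuffering
{
    // Hypothetical helper: copies a forward-only stream into a seekable
    // MemoryStream and rewinds it so a reader can start from byte 0.
    public static async Task<MemoryStream> ToSeekableAsync(Stream source)
    {
        var buffer = new MemoryStream();

        // CopyToAsync reads from the source's current position to its end.
        await source.CopyToAsync(buffer);

        // Rewind; otherwise the next reader starts at the end and sees no data.
        buffer.Position = 0;
        return buffer;
    }
}
```

Checking `Length` on the returned stream before deserializing may help narrow things down: if it is 0, no bytes were copied and the download step is the likely culprit rather than the Parquet reader.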
Azure Functions