Process Parquet files in an Azure Function
Suraj Somani
I have a requirement: create an Azure Function API that reads multiple Parquet files from the data lake, applies some filters to the data, and responds with the result. However, I am not able to deserialize the Parquet file content. I tried the NuGet package Parquet.Net; the code I used is below. It results in an empty records object. In my opinion, the problem is that the stream I receive from the data lake is not seekable, so when I copy it to a local stream no content is copied, which gives an empty result. Kindly suggest if anything is missing here:
if (fileFormat == "parquet")
{
// The stream read from Azure Data Lake is not seekable, so copy it into a local memory stream before processing
var ms = new MemoryStream();
data.Content.CopyTo(ms);
ms.Position = 0;
var records = (await ParquetConvert.DeserializeAsync
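One way to test the non-seekable-stream hypothesis in isolation is to run the buffering step against a simulated forward-only stream and inspect the copy. The following is a minimal sketch using only the .NET base class library; `ForwardOnlyStream` and `StreamBuffering.BufferAsync` are hypothetical names introduced for illustration and are not part of Parquet.Net or the Azure SDK.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

// Hypothetical wrapper that simulates a forward-only (non-seekable) network stream.
public sealed class ForwardOnlyStream : Stream
{
    private readonly Stream _inner;
    public ForwardOnlyStream(Stream inner) => _inner = inner;

    public override bool CanRead => true;
    public override bool CanSeek => false;   // the property under test
    public override bool CanWrite => false;
    public override long Length => throw new NotSupportedException();
    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }

    public override int Read(byte[] buffer, int offset, int count) =>
        _inner.Read(buffer, offset, count);
    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) =>
        throw new NotSupportedException();
    public override void SetLength(long value) =>
        throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) =>
        throw new NotSupportedException();
}

public static class StreamBuffering
{
    // Copies any readable stream into a seekable MemoryStream
    // and rewinds it so a reader starts at the first byte.
    public static async Task<MemoryStream> BufferAsync(Stream source)
    {
        var ms = new MemoryStream();
        await source.CopyToAsync(ms);
        ms.Position = 0;
        return ms;
    }
}
```

Logging `ms.Length` right after the copy in the function itself gives the same check against the real data lake stream: if the length is greater than zero, the buffering step is working and the empty result would have to come from a later step.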
Azure Functions