Hi,
I am having trouble listing the directories in Azure Data Lake Storage. I am essentially using the default template for listing the directories on a file system from data-lake-storage-directory-file-acl-dotnet. I have wrapped the code in a unit test, but I get what looks like a JSON parsing error. This is the async task I am calling from a unit test method.
```csharp
public async Task<List<string>> ListFilesInDirectory(string Directory)
{
    IAsyncEnumerator<PathItem> enumerator = _dataLakeFileSystemClient.GetPathsAsync(Directory).GetAsyncEnumerator();
    await enumerator.MoveNextAsync();
    List<string> somelist = new List<string>();
    PathItem item = enumerator.Current;
    while (item != null)
    {
        Console.WriteLine(item.Name);
        somelist.Add(item.Name);
        if (!await enumerator.MoveNextAsync())
        {
            break;
        }
        item = enumerator.Current;
    }
    return somelist;
}
```
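For comparison, the same enumeration could also be written with `await foreach`, which disposes the enumerator automatically. The sketch below is self-contained and does not call Azure: `FakePathNames` is a hypothetical stand-in for `_dataLakeFileSystemClient.GetPathsAsync(Directory)` and `CollectAsync` is a helper name I made up, so on its own it would not reproduce the error.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical stand-in for GetPathsAsync(...): yields a few made-up path names.
static async IAsyncEnumerable<string> FakePathNames()
{
    foreach (string name in new[] { "dir/a.csv", "dir/b.csv", "dir/sub" })
    {
        await Task.Yield(); // simulate each item arriving asynchronously
        yield return name;
    }
}

// Collects every item of an async sequence into a list.
// await foreach handles MoveNextAsync, Current and disposal of the enumerator.
static async Task<List<T>> CollectAsync<T>(IAsyncEnumerable<T> source)
{
    var items = new List<T>();
    await foreach (T item in source)
    {
        items.Add(item);
    }
    return items;
}

List<string> names = await CollectAsync(FakePathNames());
Console.WriteLine(string.Join(", ", names));
```

With the real client, the loop body would presumably add `item.Name` for each `PathItem` instead of collecting strings directly.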
This is the error message I get when the test calls ListFilesInDirectory.
System.AggregateException
  HResult=0x80131500
  Message=One or more errors occurred. ('<' is an invalid start of a value. LineNumber: 0 | BytePositionInLine: 0.)
  Source=System.Private.CoreLib
  StackTrace:
   at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions)
   at System.Threading.Tasks.Task.Wait(Int32 millisecondsTimeout, CancellationToken cancellationToken)
   at System.Threading.Tasks.Task.Wait()
   at HistoricMarketDataLibTests.HistoricIceDataManagerTests.DirectoryListing(String Directory) in C:\Users\bbf22\source\repos\HistoricMarketDataClient\BarDefinitionTests\HistoricIceDataManagerTests.cs:line 143

This exception was originally thrown at this call stack:
    System.Text.Json.ThrowHelper.ThrowJsonReaderException(ref System.Text.Json.Utf8JsonReader, System.Text.Json.ExceptionResource, byte, System.ReadOnlySpan)
    System.Text.Json.Utf8JsonReader.ConsumeValue(byte)
    System.Text.Json.Utf8JsonReader.ReadFirstToken(byte)
    System.Text.Json.Utf8JsonReader.ReadSingleSegment()
    System.Text.Json.Utf8JsonReader.Read()
    System.Text.Json.JsonDocument.Parse(System.ReadOnlySpan, System.Text.Json.Utf8JsonReader, ref System.Text.Json.JsonDocument.MetadataDb, ref System.Text.Json.JsonDocument.StackRowStack)
    System.Text.Json.JsonDocument.Parse(System.ReadOnlyMemory, System.Text.Json.JsonReaderOptions, byte[])
    System.Text.Json.JsonDocument.Parse(System.ReadOnlyMemory, System.Text.Json.JsonDocumentOptions)
    System.Text.Json.JsonDocument.Parse(string, System.Text.Json.JsonDocumentOptions)
    Azure.Storage.Files.DataLake.ErrorExtensions.CreateException(string, Azure.Core.Pipeline.ClientDiagnostics, Azure.Response)
    ... [Call Stack Truncated]

Inner Exception 1:
JsonReaderException: '<' is an invalid start of a value. LineNumber: 0 | BytePositionInLine: 0.
I get item = null, so the loop never iterates over the PathItems at all. I wonder whether this is related to service request limits, since the data store is big, but then I would expect a reasonable error code or message. I also wonder what lies behind the JSON parsing issue. Is it related to the async tasks being invoked? I should mention that I can check the existence of specific folders or files without problems, but whenever I invoke GetPathsAsync or GetPaths I run into this error. The files involved are large and numerous, though. I wonder whether this causes some sort of service request issue, and whether I should be thinking about mapping file locations in a SQL-based backend instead.
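One thing I can read from the stack trace is that the exception comes from `JsonDocument.Parse` being called inside `Azure.Storage.Files.DataLake.ErrorExtensions.CreateException`, so it looks as if the SDK was trying to read an error response body as JSON and the body started with `<`. The sketch below reproduces only that parsing step, under the unconfirmed assumption that the body was XML or HTML rather than JSON. It does not touch Azure; the helper name `TryParseAsJson` and both sample bodies are made up for illustration.

```csharp
using System;
using System.Text.Json;

// Returns "parsed OK" if the body is valid JSON, otherwise the parser's message.
static string TryParseAsJson(string body)
{
    try
    {
        using JsonDocument doc = JsonDocument.Parse(body);
        return "parsed OK";
    }
    catch (JsonException ex)
    {
        return ex.Message;
    }
}

// Both bodies are made-up samples, not real service responses.
string jsonBody = "{\"error\":{\"code\":\"SampleCode\",\"message\":\"sample\"}}";
string xmlBody = "<?xml version=\"1.0\"?><Error><Code>SampleCode</Code></Error>";

Console.WriteLine(TryParseAsJson(jsonBody)); // valid JSON
Console.WriteLine(TryParseAsJson(xmlBody));  // fails on the leading '<'
```

If that assumption holds, the JSON error would only be masking whatever the service actually returned, which I have not been able to inspect yet.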
I should also say that I have had a bit more success using AzureRMR in R to query the contents of some subfolders, although querying any of their parent folders takes a long time. I have had no success querying any subfolder from .NET with the code above, and I still suspect it may be because some tasks are not returning their results in time. But then again, this is just me making uneducated guesses.
I would appreciate any help. I am far from an expert on the Azure .NET APIs and async tasks.