Issue accessing Delta table in Data Lake Gen2 storage account with Databricks cluster (Latest Stable version)

Keat Ooi 31 Reputation points
2020-06-30T00:14:53.667+00:00

Recently I have been encountering an issue where the Databricks cluster cannot access an unmanaged Delta table whose Parquet files are stored in an Azure Data Lake Storage Gen2 storage account. The cluster can neither read from nor update the Delta table. For example, running the Spark SQL query select * from test123 ends in an error. Please refer to the error message below.

Here is the Databricks cluster runtime that has the issue:
Databricks Cluster Mode : High Concurrency
Databricks Runtime Version : Latest Stable (Scala 2.11)

The same Spark SQL query works fine on the following Databricks cluster runtime:
Databricks Cluster Mode : High Concurrency
Databricks Runtime Version : 6.6 (includes Apache Spark 2.4.5, Scala 2.11)

It seems to me that the Latest Stable Databricks cluster runtime has an issue accessing Delta tables whose Parquet files are stored in a Data Lake Gen2 storage account.
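
For context, here is a minimal sketch of the kind of setup being described, run from a notebook cell. The table name test123 comes from the question; the mount point and storage path are hypothetical placeholders:

    # Hypothetical unmanaged Delta table: the Parquet files live under a DBFS
    # mount that points at an ADLS Gen2 storage account (path is illustrative).
    spark.sql("""
        CREATE TABLE IF NOT EXISTS test123
        USING DELTA
        LOCATION '/mnt/datalake/test123'
    """)

    # This query succeeds on runtime 6.6 but fails on Latest Stable with the
    # AzureException shown in the error message that follows.
    spark.sql("SELECT * FROM test123").show()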

Error Message:

com.databricks.backend.common.rpc.DatabricksExceptions$SQLExecutionException: java.io.UncheckedIOException: shaded.databricks.org.apache.hadoop.fs.azure.AzureException: com.microsoft.azure.storage.StorageException: The server encountered an unknown failure: OK
at shaded.databricks.org.apache.hadoop.fs.azure.NativeAzureFileSystem$2.fetchMoreResults(NativeAzureFileSystem.java:2450)
at shaded.databricks.org.apache.hadoop.fs.azure.NativeAzureFileSystem$2.<init>(NativeAzureFileSystem.java:2438)
at shaded.databricks.org.apache.hadoop.fs.azure.NativeAzureFileSystem.listStatusAsIterator(NativeAzureFileSystem.java:2428)
at com.databricks.backend.daemon.data.client.DBFS$.listStatusAsIteratorForWasbInternal(DatabricksFileSystem.scala:397)
at com.databricks.backend.daemon.data.client.DBFS$.listStatusAsIteratorForWasb(DatabricksFileSystem.scala:351)
at com.databricks.backend.daemon.data.client.DBFSV2$$anonfun$listStatusAsIterator$1$$anonfun$apply$6.apply(DatabricksFileSystemV2.scala:196)
at com.databricks.backend.daemon.data.client.DBFSV2$$anonfun$listStatusAsIterator$1$$anonfun$apply$6.apply(DatabricksFileSystemV2.scala:176)
at com.databricks.s3a.S3AExeceptionUtils$.convertAWSExceptionToJavaIOException(DatabricksStreamUtils.scala:119)
at com.databricks.backend.daemon.data.client.DBFSV2$$anonfun$listStatusAsIterator$1.apply(DatabricksFileSystemV2.scala:176)
at com.databricks.backend.daemon.data.client.DBFSV2$$anonfun$listStatusAsIterator$1.apply(DatabricksFileSystemV2.scala:176)
at com.databricks.logging.UsageLogging$$anonfun$recordOperation$1.apply(UsageLogging.scala:369)
at com.databricks.logging.UsageLogging$$anonfun$withAttributionContext$1.apply(UsageLogging.scala:238)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
at com.databricks.logging.UsageLogging$class.withAttributionContext(UsageLogging.scala:233)
at com.databricks.backend.daemon.data.client.DatabricksFileSystemV2.withAttributionContext(DatabricksFileSystemV2.scala:450)
at com.databricks.logging.UsageLogging$class.withAttributionTags(UsageLogging.scala:271)
at com.databricks.backend.daemon.data.client.DatabricksFileSystemV2.withAttributionTags(DatabricksFileSystemV2.scala:450)
at com.databricks.logging.UsageLogging$class.recordOperation(UsageLogging.scala:350)
at com.databricks.backend.daemon.data.client.DatabricksFileSystemV2.recordOperation(DatabricksFileSystemV2.scala:450)
at com.databricks.backend.daemon.data.client.DBFSV2.listStatusAsIterator(DatabricksFileSystemV2.scala:175)
at com.databricks.tahoe.store.EnhancedDatabricksFileSystemV2.listStatus(EnhancedFileSystem.scala:187)
at com.databricks.tahoe.store.AzureLogStore.listFrom(AzureLogStore.scala:64)
at com.databricks.tahoe.store.DelegatingLogStore.listFrom(DelegatingLogStore.scala:132)
at com.databricks.sql.transaction.tahoe.DeltaLog$$anonfun$com$databricks$sql$transaction$tahoe$DeltaLog$$updateInternal$1$$anonfun$apply$5.apply(DeltaLog.scala:279)
at com.databricks.sql.transaction.tahoe.DeltaLog$$anonfun$com$databricks$sql$transaction$tahoe$DeltaLog$$updateInternal$1$$anonfun$apply$5.apply(DeltaLog.scala:274)
at com.databricks.backend.daemon.driver.ProgressReporter$.withStatusCode(ProgressReporter.scala:372)
at com.databricks.backend.daemon.driver.ProgressReporter$.withStatusCode(ProgressReporter.scala:358)
at com.databricks.spark.util.SparkDatabricksProgressReporter$.withStatusCode(ProgressReporter.scala:34)
at com.databricks.sql.transaction.tahoe.util.DeltaProgressReporterEdge$class.withStatusCode(DeltaProgressReporterEdge.scala:29)
at com.databricks.sql.transaction.tahoe.DeltaLog.withStatusCode(DeltaLog.scala:65)
at com.databricks.sql.transaction.tahoe.DeltaLog$$anonfun$com$databricks$sql$transaction$tahoe$DeltaLog$$updateInternal$1.apply(DeltaLog.scala:274)
at com.databricks.sql.transaction.tahoe.DeltaLog$$anonfun$com$databricks$sql$transaction$tahoe$DeltaLog$$updateInternal$1.apply(DeltaLog.scala:274)
at com.databricks.logging.UsageLogging$$anonfun$recordOperation$1.apply(UsageLogging.scala:369)
at com.databricks.logging.UsageLogging$$anonfun$withAttributionContext$1.apply(UsageLogging.scala:238)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
at com.databricks.logging.UsageLogging$class.withAttributionContext(UsageLogging.scala:233)
at com.databricks.spark.util.PublicDBLogging.withAttributionContext(DatabricksSparkUsageLogger.scala:18)
at com.databricks.logging.UsageLogging$class.withAttributionTags(UsageLogging.scala:271)
at com.databricks.spark.util.PublicDBLogging.withAttributionTags(DatabricksSparkUsageLogger.scala:18)
at com.databricks.logging.UsageLogging$class.recordOperation(UsageLogging.scala:350)
at com.databricks.spark.util.PublicDBLogging.recordOperation(DatabricksSparkUsageLogger.scala:18)
at com.databricks.spark.util.PublicDBLogging.recordOperation0(DatabricksSparkUsageLogger.scala:55)
at com.databricks.spark.util.DatabricksSparkUsageLogger.recordOperation(DatabricksSparkUsageLogger.scala:94)
at com.databricks.spark.util.UsageLogger$class.recordOperation(UsageLogger.scala:66)
at com.databricks.spark.util.DatabricksSparkUsageLogger.recordOperation(DatabricksSparkUsageLogger.scala:63)
at com.databricks.spark.util.UsageLogging$class.recordOperation(UsageLogger.scala:297)
at com.databricks.sql.transaction.tahoe.DeltaLog.recordOperation(DeltaLog.scala:65)
at com.databricks.sql.transaction.tahoe.metering.DeltaLogging$class.recordDeltaOperation(DeltaLogging.scala:108)
at com.databricks.sql.transaction.tahoe.DeltaLog.recordDeltaOperation(DeltaLog.scala:65)
at com.databricks.sql.transaction.tahoe.DeltaLog.com$databricks$sql$transaction$tahoe$DeltaLog$$updateInternal(DeltaLog.scala:273)
at com.databricks.sql.transaction.tahoe.DeltaLog$$anonfun$update$2.apply(DeltaLog.scala:231)
at com.databricks.sql.transaction.tahoe.DeltaLog$$anonfun$update$2.apply(DeltaLog.scala:231)
at com.databricks.sql.transaction.tahoe.DeltaLog.lockInterruptibly(DeltaLog.scala:201)
at com.databricks.sql.transaction.tahoe.DeltaLog.update(DeltaLog.scala:230)
at com.databricks.sql.transaction.tahoe.DeltaLog.<init>(DeltaLog.scala:189)
at com.databricks.sql.transaction.tahoe.DeltaLog$$anon$3$$anonfun$call$1$$anonfun$apply$8.apply(DeltaLog.scala:749)
at com.databricks.sql.transaction.tahoe.DeltaLog$$anon$3$$anonfun$call$1$$anonfun$apply$8.apply(DeltaLog.scala:749)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:194)
at com.databricks.sql.transaction.tahoe.DeltaLog$$anon$3$$anonfun$call$1.apply(DeltaLog.scala:748)
at com.databricks.sql.transaction.tahoe.DeltaLog$$anon$3$$anonfun$call$1.apply(DeltaLog.scala:748)
at com.databricks.logging.UsageLogging$$anonfun$recordOperation$1.apply(UsageLogging.scala:369)
at com.databricks.logging.UsageLogging$$anonfun$withAttributionContext$1.apply(UsageLogging.scala:238)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
at com.databricks.logging.UsageLogging$class.withAttributionContext(UsageLogging.scala:233)
at com.databricks.spark.util.PublicDBLogging.withAttributionContext(DatabricksSparkUsageLogger.scala:18)
at com.databricks.logging.UsageLogging$class.withAttributionTags(UsageLogging.scala:271)
at com.databricks.spark.util.PublicDBLogging.withAttributionTags(DatabricksSparkUsageLogger.scala:18)
at com.databricks.logging.UsageLogging$class.recordOperation(UsageLogging.scala:350)
at com.databricks.spark.util.PublicDBLogging.recordOperation(DatabricksSparkUsageLogger.scala:18)
at com.databricks.spark.util.PublicDBLogging.recordOperation0(DatabricksSparkUsageLogger.scala:55)
at com.databricks.spark.util.DatabricksSparkUsageLogger.recordOperation(DatabricksSparkUsageLogger.scala:94)
at com.databricks.spark.util.UsageLogger$class.recordOperation(UsageLogger.scala:66)
at com.databricks.spark.util.DatabricksSparkUsageLogger.recordOperation(DatabricksSparkUsageLogger.scala:63)
at com.databricks.spark.util.UsageLogging$class.recordOperation(UsageLogger.scala:297)
at com.databricks.sql.transaction.tahoe.DeltaLog$.recordOperation(DeltaLog.scala:650)
at com.databricks.sql.transaction.tahoe.metering.DeltaLogging$class.recordDeltaOperation(DeltaLogging.scala:108)
at com.databricks.sql.transaction.tahoe.DeltaLog$.recordDeltaOperation(DeltaLog.scala:650)
at com.databricks.sql.transaction.tahoe.DeltaLog$$anon$3.call(DeltaLog.scala:747)
at com.databricks.sql.transaction.tahoe.DeltaLog$$anon$3.call(DeltaLog.scala:745)
at com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4724)
at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3522)
at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2315)
at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2278)
at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2193)
at com.google.common.cache.LocalCache.get(LocalCache.java:3932)
at com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4721)
at com.databricks.sql.transaction.tahoe.DeltaLog$.apply(DeltaLog.scala:745)
at com.databricks.sql.transaction.tahoe.DeltaLog$.forTable(DeltaLog.scala:687)
at org.apache.spark.sql.execution.datasources.FindDataSourceTable.org$apache$spark$sql$execution$datasources$FindDataSourceTable$$readDeltaTable(DataSourceStrategy.scala:287)
at org.apache.spark.sql.execution.datasources.FindDataSourceTable$$anonfun$apply$2.applyOrElse(DataSourceStrategy.scala:322)
at org.apache.spark.sql.execution.datasources.FindDataSourceTable$$anonfun$apply$2.applyOrElse(DataSourceStrategy.scala:299)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$$anonfun$resolveOperatorsDown$1$$anonfun$2.apply(AnalysisHelper.scala:108)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$$anonfun$resolveOperatorsDown$1$$anonfun$2.apply(AnalysisHelper.scala:108)
at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:77)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$$anonfun$resolveOperatorsDown$1.apply(AnalysisHelper.scala:107)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$$anonfun$resolveOperatorsDown$1.apply(AnalysisHelper.scala:106)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:194)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$class.resolveOperatorsDown(AnalysisHelper.scala:106)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsDown(LogicalPlan.scala:29)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$$anonfun$resolveOperatorsDown$1$$anonfun$apply$6.apply(AnalysisHelper.scala:113)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$$anonfun$resolveOperatorsDown$1$$anonfun$apply$6.apply(AnalysisHelper.scala:113)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$8.apply(TreeNode.scala:354)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:208)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:352)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$$anonfun$resolveOperatorsDown$1.apply(AnalysisHelper.scala:113)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$$anonfun$resolveOperatorsDown$1.apply(AnalysisHelper.scala:106)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:194)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$class.resolveOperatorsDown(AnalysisHelper.scala:106)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsDown(LogicalPlan.scala:29)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$$anonfun$resolveOperatorsDown$1$$anonfun$apply$6.apply(AnalysisHelper.scala:113)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$$anonfun$resolveOperatorsDown$1$$anonfun$apply$6.apply(AnalysisHelper.scala:113)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$8.apply(TreeNode.scala:354)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:208)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:352)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$$anonfun$resolveOperatorsDown$1.apply(AnalysisHelper.scala:113)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$$anonfun$resolveOperatorsDown$1.apply(AnalysisHelper.scala:106)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:194)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$class.resolveOperatorsDown(AnalysisHelper.scala:106)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsDown(LogicalPlan.scala:29)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$class.resolveOperators(AnalysisHelper.scala:73)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperators(LogicalPlan.scala:29)
at org.apache.spark.sql.execution.datasources.FindDataSourceTable.apply(DataSourceStrategy.scala:299)
at org.apache.spark.sql.execution.datasources.FindDataSourceTable.apply(DataSourceStrategy.scala:226)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1$$anonfun$apply$1.apply(RuleExecutor.scala:112)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1$$anonfun$apply$1.apply(RuleExecutor.scala:109)
at scala.collection.LinearSeqOptimized$class.foldLeft(LinearSeqOptimized.scala:124)
at scala.collection.immutable.List.foldLeft(List.scala:84)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1.apply(RuleExecutor.scala:109)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1.apply(RuleExecutor.scala:101)
at scala.collection.immutable.List.foreach(List.scala:392)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:101)
at org.apache.spark.sql.catalyst.analysis.Analyzer.org$apache$spark$sql$catalyst$analysis$Analyzer$$executeSameContext(Analyzer.scala:136)
at org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:130)
at org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:102)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$executeAndTrack$1.apply(RuleExecutor.scala:80)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$executeAndTrack$1.apply(RuleExecutor.scala:80)
at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:88)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.executeAndTrack(RuleExecutor.scala:79)
at org.apache.spark.sql.catalyst.analysis.Analyzer$$anonfun$executeAndCheck$1.apply(Analyzer.scala:114)
at org.apache.spark.sql.catalyst.analysis.Analyzer$$anonfun$executeAndCheck$1.apply(Analyzer.scala:113)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.markInAnalyzer(AnalysisHelper.scala:201)
at org.apache.spark.sql.catalyst.analysis.Analyzer.executeAndCheck(Analyzer.scala:113)
at org.apache.spark.sql.execution.QueryExecution$$anonfun$analyzed$1.apply(QueryExecution.scala:83)
at org.apache.spark.sql.execution.QueryExecution$$anonfun$analyzed$1.apply(QueryExecution.scala:80)
at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:111)
at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:80)
at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:80)
at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:72)
at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:88)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:696)
at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:716)
at com.databricks.backend.daemon.driver.SQLDriverLocal$$anonfun$1.apply(SQLDriverLocal.scala:88)
at com.databricks.backend.daemon.driver.SQLDriverLocal$$anonfun$1.apply(SQLDriverLocal.scala:34)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.immutable.List.foreach(List.scala:392)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
at scala.collection.immutable.List.map(List.scala:296)
at com.databricks.backend.daemon.driver.SQLDriverLocal.executeSql(SQLDriverLocal.scala:34)
at com.databricks.backend.daemon.driver.SQLDriverLocal.repl(SQLDriverLocal.scala:141)
at com.databricks.backend.daemon.driver.DriverLocal$$anonfun$execute$8.apply(DriverLocal.scala:373)
at com.databricks.backend.daemon.driver.DriverLocal$$anonfun$execute$8.apply(DriverLocal.scala:350)
at com.databricks.logging.UsageLogging$$anonfun$withAttributionContext$1.apply(UsageLogging.scala:238)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
at com.databricks.logging.UsageLogging$class.withAttributionContext(UsageLogging.scala:233)
at com.databricks.backend.daemon.driver.DriverLocal.withAttributionContext(DriverLocal.scala:48)
at com.databricks.logging.UsageLogging$class.withAttributionTags(UsageLogging.scala:271)
at com.databricks.backend.daemon.driver.DriverLocal.withAttributionTags(DriverLocal.scala:48)
at com.databricks.backend.daemon.driver.DriverLocal.execute(DriverLocal.scala:350)
at com.databricks.backend.daemon.driver.DriverWrapper$$anonfun$tryExecutingCommand$2.apply(DriverWrapper.scala:644)
at com.databricks.backend.daemon.driver.DriverWrapper$$anonfun$tryExecutingCommand$2.apply(DriverWrapper.scala:644)
at scala.util.Try$.apply(Try.scala:192)
at com.databricks.backend.daemon.driver.DriverWrapper.tryExecutingCommand(DriverWrapper.scala:639)
at com.databricks.backend.daemon.driver.DriverWrapper.getCommandOutputAndError(DriverWrapper.scala:485)
at com.databricks.backend.daemon.driver.DriverWrapper.executeCommand(DriverWrapper.scala:597)
at com.databricks.backend.daemon.driver.DriverWrapper.runInnerLoop(DriverWrapper.scala:390)
at com.databricks.backend.daemon.driver.DriverWrapper.runInner(DriverWrapper.scala:337)
at com.databricks.backend.daemon.driver.DriverWrapper.run(DriverWrapper.scala:219)
at java.lang.Thread.run(Thread.java:748)
Caused by: shaded.databricks.org.apache.hadoop.fs.azure.AzureException: com.microsoft.azure.storage.StorageException: The server encountered an unknown failure: OK
at shaded.databricks.org.apache.hadoop.fs.azure.AzureNativeFileSystemStore.listDB(AzureNativeFileSystemStore.java:2275)
at shaded.databricks.org.apache.hadoop.fs.azure.NativeAzureFileSystem$2.fetchMoreResults(NativeAzureFileSystem.java:2443)
... 178 more
Caused by: com.microsoft.azure.storage.StorageException: The server encountered an unknown failure: OK
at com.microsoft.azure.storage.StorageException.translateException(StorageException.java:101)
at com.microsoft.azure.storage.core.ExecutionEngine.executeWithRetry(ExecutionEngine.java:202)
at com.microsoft.azure.storage.blob.CloudBlobContainer.listBlobsSegmented(CloudBlobContainer.java:1420)
at shaded.databricks.org.apache.hadoop.fs.azure.StorageInterfaceImpl$CloudBlobContainerWrapperImpl.listBlobs(StorageInterfaceImpl.java:284)
at shaded.databricks.org.apache.hadoop.fs.azure.AzureNativeFileSystemStore.listDB(AzureNativeFileSystemStore.java:2209)
... 179 more
Caused by: org.xml.sax.SAXException: The response received is invalid or improperly formatted.
at com.microsoft.azure.storage.blob.BlobListHandler.setProperties(BlobListHandler.java:254)
at com.microsoft.azure.storage.blob.BlobListHandler.endElement(BlobListHandler.java:182)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.endElement(AbstractSAXParser.java:609)
at com.sun.org.apache.xerces.internal.parsers.AbstractXMLDocumentParser.emptyElement(AbstractXMLDocumentParser.java:183)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(XMLNSDocumentScannerImpl.java:351)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2784)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:602)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:112)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:505)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:842)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:771)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1213)
at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:643)
at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl.parse(SAXParserImpl.java:327)
at javax.xml.parsers.SAXParser.parse(SAXParser.java:195)
at com.microsoft.azure.storage.blob.BlobListHandler.getBlobList(BlobListHandler.java:74)
at com.microsoft.azure.storage.blob.CloudBlobContainer$7.postProcessResponse(CloudBlobContainer.java:1473)
at com.microsoft.azure.storage.blob.CloudBlobContainer$7.postProcessResponse(CloudBlobContainer.java:1437)
at com.microsoft.azure.storage.core.ExecutionEngine.executeWithRetry(ExecutionEngine.java:149)
... 182 more
at com.databricks.backend.daemon.driver.SQLDriverLocal.executeSql(SQLDriverLocal.scala:126)
at com.databricks.backend.daemon.driver.SQLDriverLocal.repl(SQLDriverLocal.scala:141)
at com.databricks.backend.daemon.driver.DriverLocal$$anonfun$execute$8.apply(DriverLocal.scala:373)
at com.databricks.backend.daemon.driver.DriverLocal$$anonfun$execute$8.apply(DriverLocal.scala:350)
at com.databricks.logging.UsageLogging$$anonfun$withAttributionContext$1.apply(UsageLogging.scala:238)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
at com.databricks.logging.UsageLogging$class.withAttributionContext(UsageLogging.scala:233)
at com.databricks.backend.daemon.driver.DriverLocal.withAttributionContext(DriverLocal.scala:48)
at com.databricks.logging.UsageLogging$class.withAttributionTags(UsageLogging.scala:271)
at com.databricks.backend.daemon.driver.DriverLocal.withAttributionTags(DriverLocal.scala:48)
at com.databricks.backend.daemon.driver.DriverLocal.execute(DriverLocal.scala:350)
at com.databricks.backend.daemon.driver.DriverWrapper$$anonfun$tryExecutingCommand$2.apply(DriverWrapper.scala:644)
at com.databricks.backend.daemon.driver.DriverWrapper$$anonfun$tryExecutingCommand$2.apply(DriverWrapper.scala:644)
at scala.util.Try$.apply(Try.scala:192)
at com.databricks.backend.daemon.driver.DriverWrapper.tryExecutingCommand(DriverWrapper.scala:639)
at com.databricks.backend.daemon.driver.DriverWrapper.getCommandOutputAndError(DriverWrapper.scala:485)
at com.databricks.backend.daemon.driver.DriverWrapper.executeCommand(DriverWrapper.scala:597)
at com.databricks.backend.daemon.driver.DriverWrapper.runInnerLoop(DriverWrapper.scala:390)
at com.databricks.backend.daemon.driver.DriverWrapper.runInner(DriverWrapper.scala:337)
at com.databricks.backend.daemon.driver.DriverWrapper.run(DriverWrapper.scala:219)
at java.lang.Thread.run(Thread.java:748)

2 answers

  1. DE JONG Paul 11 Reputation points
    2020-07-06T08:30:11.167+00:00

    Since Thursday, July 2nd 2020 we have been facing a similar issue using Talend Big Data.
    We can only list directories that contain files; subdirectories that are empty or contain only folders are not listed.
    Until July 1st there were no issues.

    java.util.NoSuchElementException: An error occurred while enumerating the result, check the original exception for details.
    at com.microsoft.azure.storage.core.LazySegmentedIterator.hasNext(LazySegmentedIterator.java:113)
    at org.talend.components.azurestorage.blob.runtime.AzureStorageListReader.start(AzureStorageListReader.java:79)
    at org.talend.codegen.flowvariables.runtime.FlowVariablesReader.start(FlowVariablesReader.java:73)
    at dwh_b2b.test_azure_connection_0_1.test_azure_connection.tAzureStorageList_1Process(test_azure_connection.java:951)
    at dwh_b2b.test_azure_connection_0_1.test_azure_connection.tAzureStorageConnection_1Process(test_azure_connection.java:506)
    at dwh_b2b.test_azure_connection_0_1.test_azure_connection.runJobInTOS(test_azure_connection.java:1710)
    at dwh_b2b.test_azure_connection_0_1.test_azure_connection.main(test_azure_connection.java:1510)
    Caused by: com.microsoft.azure.storage.StorageException: The server encountered an unknown failure: OK
    at com.microsoft.azure.storage.StorageException.translateException(StorageException.java:101)
    at com.microsoft.azure.storage.core.ExecutionEngine.executeWithRetry(ExecutionEngine.java:199)
    at com.microsoft.azure.storage.core.LazySegmentedIterator.hasNext(LazySegmentedIterator.java:109)
    ... 6 more
    Caused by: org.xml.sax.SAXException: The response received is invalid or improperly formatted.
    at com.microsoft.azure.storage.blob.BlobListHandler.setProperties(BlobListHandler.java:254)
    at com.microsoft.azure.storage.blob.BlobListHandler.endElement(BlobListHandler.java:182)
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.endElement(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.AbstractXMLDocumentParser.emptyElement(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl.parse(Unknown Source)
    at javax.xml.parsers.SAXParser.parse(Unknown Source)
    at com.microsoft.azure.storage.blob.BlobListHandler.getBlobList(BlobListHandler.java:74)
    at com.microsoft.azure.storage.blob.CloudBlobContainer$7.postProcessResponse(CloudBlobContainer.java:1473)
    at com.microsoft.azure.storage.blob.CloudBlobContainer$7.postProcessResponse(CloudBlobContainer.java:1437)
    at com.microsoft.azure.storage.core.ExecutionEngine.executeWithRetry(ExecutionEngine.java:146)


  2. KranthiPakala-MSFT 46,432 Reputation points Microsoft Employee
    2020-07-06T18:32:55.967+00:00

    Hi @keataunooi, @IgnacyJakowiecki-4596, @ZhangJosonZX-3612, @AnanthanarayananSeeEXT-5775, @DEJONGPaul-9886,

    Sorry you are experiencing this, and apologies for the delayed response. Could you please confirm whether you used the WASB driver when you created the mount point? The WASB driver is not supported for ADLS Gen2 and can cause failures like this. (This is a limitation on the storage end. It might work in a few scenarios, but because it is unsupported, it can fail in exactly this way. The recommended approach here is to use abfss.)
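
    As a quick check, you can list the workspace's mount points from a notebook and look for any whose source still uses the wasb/wasbs scheme (a small sketch; the output depends entirely on your workspace):

        # List all DBFS mount points and flag any still backed by the WASB driver.
        for m in dbutils.fs.mounts():
            if m.source.startswith("wasb"):  # matches wasb:// and wasbs://
                print("WASB-backed mount:", m.mountPoint, "->", m.source)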

    Known issues with Azure Data Lake Storage Gen2: Windows Azure Storage Blob (WASB) driver (unsupported with Data Lake Storage Gen2)

    The solution is to recreate the mount point using abfss:

    Mount an Azure Data Lake Storage Gen2 account using a service principal and OAuth 2.0
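
    For reference, the remount has the shape below (a sketch based on the linked documentation; the service principal values, secret scope, container, storage account, and the /mnt/datalake mount point are all placeholders you must replace with your own):

        # OAuth 2.0 / service principal configuration for the ABFS driver.
        configs = {
            "fs.azure.account.auth.type": "OAuth",
            "fs.azure.account.oauth.provider.type":
                "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
            "fs.azure.account.oauth2.client.id": "<application-id>",
            "fs.azure.account.oauth2.client.secret":
                dbutils.secrets.get(scope="<scope-name>", key="<key-name>"),
            "fs.azure.account.oauth2.client.endpoint":
                "https://login.microsoftonline.com/<directory-id>/oauth2/token",
        }

        # Drop the old WASB-based mount, then recreate it over abfss.
        dbutils.fs.unmount("/mnt/datalake")
        dbutils.fs.mount(
            source="abfss://<container-name>@<storage-account-name>.dfs.core.windows.net/",
            mount_point="/mnt/datalake",
            extra_configs=configs,
        )

    Tables whose LOCATION resolves through the recreated mount point will then be accessed via the supported ABFS driver rather than WASB.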

    After recreating the mount point, you will need to grant permissions at the container level for all the files.

    Please try recreating the mount point using abfss and let us know whether that resolves the issue.

    If you still encounter the issue and need deeper investigation and immediate assistance: if you have a support plan, you may file a support ticket; otherwise, please send an email to AzCommunity@Microsoft.com with the details below so that we can enable one-time free support to work closely with you on this matter.

    • Subject of the email: <Attn - Kranthi : Microsoft Q&A Thread title>
    • Thread URL: <Microsoft Q&A Thread URL>
    • Subscription ID: <your subscription id>
    • Databricks region: <Databricks region>
    • Notebook URL

    Thank you