Troubleshoot HDFS

Important

The Microsoft SQL Server 2019 Big Data Clusters add-on will be retired. Support for SQL Server 2019 Big Data Clusters will end on February 28, 2025. All existing users of SQL Server 2019 with Software Assurance will be fully supported on the platform and the software will continue to be maintained through SQL Server cumulative updates until that time. For more information, see the announcement blog post and Big data options on the Microsoft SQL Server platform.

This article contains troubleshooting scenarios for HDFS errors in SQL Server 2019 Big Data Clusters.

Troubleshoot HDFS heap size

Symptom

In SQL Server Big Data Clusters, the nmnode pods go down with the error Failed to start namenode. java.lang.OutOfMemoryError: Java heap space and the warning WARN util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC).
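
To check whether a namenode log contains these signatures, a small filter along the following lines may help. This is a minimal sketch: the function name has_heap_exhaustion is hypothetical, and it assumes the log text has already been retrieved (for example, with kubectl logs against the nmnode pod in your cluster's namespace) and is piped in on standard input.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads namenode log text on stdin and reports
# whether either heap-related signature from the symptom above appears.
has_heap_exhaustion() {
  # -E: extended regex; -q: quiet, exit status 0 on the first match.
  grep -Eq 'java\.lang\.OutOfMemoryError: Java heap space|JvmPauseMonitor: Detected pause in JVM or host machine'
}

# Example with an inline sample line; replace it with real log output.
if printf '%s\n' 'java.lang.OutOfMemoryError: Java heap space' | has_heap_exhaustion; then
  echo "heap exhaustion signature found"
else
  echo "no heap exhaustion signature found"
fi
```

The function only reports a match through its exit status, so it can be combined with other checks in a larger script.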

Cause

The HDFS heap size might not be configured properly. The proper setting for the namenode's JVM heap depends on many factors, such as the number of files and blocks and the load on the HDFS system. For more information on calculating the heap size, see Configuring namenode heap size.

Resolution

In SQL Server Big Data Clusters, the heap size of the HDFS namenode process is controlled by the big data cluster configuration setting hdfs-env.HDFS_NAMENODE_OPTS. The default value is 2 GB, as specified in HDFS configuration properties. This workaround increases the heap size, which is a global configuration change for the entire big data cluster.

The SQL Server Big Data Clusters runtime configuration feature is enabled by default starting with SQL Server 2019 CU9. To proceed, upgrade your cluster to CU9 or later, preferably to the latest available version. For more information, see SQL Server Big Data Clusters Release Notes.

To increase the heap size of the HDFS namenode, follow the post-deployment configuration guide.

The following sample uses azdata to increase the HDFS namenode heap to 4 GB. This operation is available only in CU9 or later.

azdata bdc hdfs settings set --settings hdfs-env.HDFS_NAMENODE_OPTS="-Dhadoop.security.logger=INFO,RFAS -Xmx4g"
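
If the target heap size varies between clusters, the option string can be built from a variable. The following is a minimal sketch, not part of the product tooling: the function build_namenode_opts and the variable HEAP_GB are hypothetical names, and the logger option is copied from the sample above. It only prints the resulting azdata command so that it can be reviewed before it is run.

```shell
#!/usr/bin/env bash
# Hypothetical helper: builds the HDFS_NAMENODE_OPTS value for a heap size
# given in whole gigabytes. The logger option is taken from the sample above.
build_namenode_opts() {
  local heap_gb="$1"
  # Reject empty input, non-digits, zero, and values with a leading zero.
  case "$heap_gb" in
    ''|*[!0-9]*|0*) echo "heap size must be a positive integer (GB)" >&2; return 1 ;;
  esac
  printf '%s' "-Dhadoop.security.logger=INFO,RFAS -Xmx${heap_gb}g"
}

HEAP_GB=4  # assumed target heap size in GB; adjust for your cluster
opts="$(build_namenode_opts "$HEAP_GB")" || exit 1

# Print the command for review instead of running it.
echo "azdata bdc hdfs settings set --settings hdfs-env.HDFS_NAMENODE_OPTS=\"${opts}\""
```

Validating the input before building the string keeps a typo from being written into a cluster-wide setting.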

To review the pending change, apply it, and monitor the update status:

# (Optional) View the pending change
azdata bdc settings show --filter-option=pending --include-details --recursive
 
# Apply the pending settings
azdata bdc settings apply
 
# Monitor the configuration update status
azdata bdc status show --all
