Monitor Big Data Clusters by using azdata and kubectl
This article explains how to view the status of a big data cluster using azdata and kubectl.
Important
The Microsoft SQL Server 2019 Big Data Clusters add-on will be retired. Support for SQL Server 2019 Big Data Clusters will end on February 28, 2025. All existing users of SQL Server 2019 with Software Assurance will be fully supported on the platform and the software will continue to be maintained through SQL Server cumulative updates until that time. For more information, see the announcement blog post and Big data options on the Microsoft SQL Server platform.
Use azdata
You can use azdata commands to view both the endpoints and the cluster status.
Service endpoints
Authenticate to the big data cluster with azdata login. Set the --endpoint parameter to the external IP address of the controller endpoint.
azdata login --endpoint https://<ip-address-of-controller-svc-external>:30080 --username <user-name>
Specify the username and password that you configured for the controller (AZDATA_USERNAME and AZDATA_PASSWORD) during deployment.
For AD authentication, the command is:
azdata login --endpoint https://<control_domain_name>:30080 --auth ad
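For scripted use, the login step can be wrapped in a small helper. The following is a minimal sketch, not an official pattern: the function names controller_endpoint and bdc_login are hypothetical, and it assumes that azdata reads the password from the AZDATA_PASSWORD environment variable instead of prompting when that variable is exported.

```shell
# Build the controller URL from an IP address or DNS name.
# $1: address of the controller service; $2: port (defaults to 30080, as in this article)
controller_endpoint() {
  printf 'https://%s:%s\n' "$1" "${2:-30080}"
}

# Log in with basic authentication using credentials taken from the environment.
# Assumption: azdata picks up AZDATA_PASSWORD on its own when it is exported.
bdc_login() {
  if [ -z "${AZDATA_USERNAME:-}" ] || [ -z "${AZDATA_PASSWORD:-}" ]; then
    echo "AZDATA_USERNAME and AZDATA_PASSWORD must be exported first" >&2
    return 1
  fi
  azdata login --endpoint "$(controller_endpoint "$1")" --username "$AZDATA_USERNAME"
}
```

A call such as bdc_login <ip-address-of-controller-svc-external> then replaces the interactive command shown above.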
Run azdata bdc endpoint list to get a list with a description of each endpoint and its corresponding IP address and port values.
azdata bdc endpoint list -o table
The following list shows sample output from this command:
Description                                             Endpoint                                                   Ip              Name               Port    Protocol
------------------------------------------------------  ---------------------------------------------------------  --------------  -----------------  ------  ----------
Gateway to access HDFS files, Spark                     https://11.111.111.111:30443                               11.111.111.111  gateway            30443   https
Spark Jobs Management and Monitoring Dashboard          https://11.111.111.111:30443/gateway/default/sparkhistory  11.111.111.111  spark-history      30443   https
Spark Diagnostics and Monitoring Dashboard              https://11.111.111.111:30443/gateway/default/yarn          11.111.111.111  yarn-ui            30443   https
Application Proxy                                       https://11.111.111.111:30778                               11.111.111.111  app-proxy          30778   https
Management Proxy                                        https://11.111.111.111:30777                               11.111.111.111  mgmtproxy          30777   https
Log Search Dashboard                                    https://11.111.111.111:30777/kibana                        11.111.111.111  logsui             30777   https
Metrics Dashboard                                       https://11.111.111.111:30777/grafana                       11.111.111.111  metricsui          30777   https
Cluster Management Service                              https://11.111.111.111:30080                               11.111.111.111  controller         30080   https
SQL Server Master Instance Front-End                    11.111.111.111,31433                                       11.111.111.111  sql-server-master  31433   tcp
HDFS File System Proxy                                  https://11.111.111.111:30443/gateway/default/webhdfs/v1    11.111.111.111  webhdfs            30443   https
Proxy for running Spark statements, jobs, applications  https://11.111.111.111:30443/gateway/default/livy/v1       11.111.111.111  livy               30443   https
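To pick a single endpoint out of that table in a script, the output can be filtered by the Name column. This is a minimal sketch under one assumption: that the table output separates columns with at least two spaces, as in the sample above. The helper name endpoint_for is hypothetical.

```shell
# Print the Endpoint column for the row whose Name column matches $1.
# Reads the table produced by "azdata bdc endpoint list -o table" on stdin.
# Assumption: columns are separated by two or more spaces, so descriptions
# that contain single spaces stay in one field.
endpoint_for() {
  awk -F'  +' -v name="$1" '$4 == name { print $2 }'
}
```

For example, azdata bdc endpoint list -o table | endpoint_for metricsui would print the Grafana URL from the sample output.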
View cluster status
You can view the status of the cluster with the azdata bdc status show command.
azdata bdc status show
Tip
To run the status commands, you must first log in with the azdata login command, which was shown in the previous endpoints section.
The following shows sample output from this command:
Bdc: ready Health Status: healthy
===========================================================================================================================================================================================================================================
Services: ready Health Status: healthy
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Servicename State Healthstatus Details
spark ready healthy -
sql ready healthy -
hdfs ready healthy -
control ready healthy -
gateway ready healthy -
app ready healthy -
Spark Services: ready Health Status: healthy
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Resourcename State Healthstatus Details
sparkhead ready healthy StatefulSet sparkhead is healthy
storage-0 ready healthy StatefulSet storage-0 is healthy
Sql Services: ready Health Status: healthy
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Resourcename State Healthstatus Details
master ready healthy StatefulSet master is healthy
compute-0 ready healthy StatefulSet compute-0 is healthy
data-0 ready healthy StatefulSet data-0 is healthy
storage-0 ready healthy StatefulSet storage-0 is healthy
Hdfs Services: ready Health Status: healthy
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Resourcename State Healthstatus Details
nmnode-0 ready healthy StatefulSet nmnode-0 is healthy
zookeeper ready healthy StatefulSet zookeeper is healthy
storage-0 ready healthy StatefulSet storage-0 is healthy
sparkhead ready healthy StatefulSet sparkhead is healthy
Control Services: ready Health Status: healthy
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Resourcename State Healthstatus Details
controldb ready healthy StatefulSet controldb is healthy
control ready healthy ReplicaSet control is healthy
metricsdc ready healthy DaemonSet metricsdc is healthy
metricsui ready healthy ReplicaSet metricsui is healthy
metricsdb ready healthy StatefulSet metricsdb is healthy
logsui ready healthy ReplicaSet logsui is healthy
logsdb ready healthy StatefulSet logsdb is healthy
mgmtproxy ready healthy ReplicaSet mgmtproxy is healthy
controlwd ready healthy ReplicaSet controlwd is healthy
Gateway Services: ready Health Status: healthy
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Resourcename State Healthstatus Details
gateway ready healthy StatefulSet gateway is healthy
App Services: ready Health Status: healthy
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Resourcename State Healthstatus Details
appproxy ready healthy ReplicaSet appproxy is healthy
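Because the status report is long, a script may only care about rows that are not healthy. The sketch below filters the text output shown above. It assumes the layout stays as in the sample (name, state, and health status as the first three whitespace-separated fields), and the helper name unhealthy_rows is hypothetical.

```shell
# Print name, state, and health status for every row that is not "healthy".
# Reads the text output of "azdata bdc status show" on stdin.
unhealthy_rows() {
  awk '
    /Health Status:/     { next }   # section summary lines
    /^[-=]+$/            { next }   # horizontal rules
    $3 == "Healthstatus" { next }   # column header rows
    NF >= 3 && $3 != "healthy" { print $1, $2, $3 }
  '
}
```

Running azdata bdc status show | unhealthy_rows prints nothing when every service and resource reports healthy, as in the sample output.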
Specific resource status
You can view the status of a specific resource within the cluster with the azdata bdc status show command. To filter the output, use the --resource parameter. A few example values for the --resource parameter are:
- master
- control
- compute-0
- storage-0
- gateway
For example, the following command displays the status of the storage pool:
azdata bdc status show --all --resource storage-0
To see the status of all components that are running a specific service, use the corresponding command group, azdata bdc <serviceName> status show. For example:
azdata bdc sql status show --all
azdata bdc hdfs status show --all
azdata bdc spark status show --all
Sample output:
Storage-0: ready Health Status: healthy
===========================================================================================================================================================================================================================================
Instances: running Health Status: healthy
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Instancename State Healthstatus Details
storage-0-0 running healthy Pod storage-0-0 is healthy
storage-0-1 running healthy Pod storage-0-1 is healthy
Dashboards
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Name Url
nodeMetricsUrl https://13.91.50.9:30777/api/v1/bdc/instances/storage-0-1/status/nodemetrics/ui
sqlMetricsUrl https://13.91.50.9:30777/api/v1/bdc/instances/storage-0-1/status/sqlmetrics/ui
logsUrl https://13.91.50.9:30777/api/v1/bdc/instances/storage-0-1/status/logs/ui
Tip
Run the status command with the --all parameter for additional health details, including links to the metrics and logs dashboards for each instance. Here is sample output when the --all parameter is used:
Spark: ready Health Status: healthy
===========================================================================================================================================================================================================================================
Resources: ready Health Status: healthy
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Resourcename State Healthstatus Details
sparkhead ready healthy StatefulSet sparkhead is healthy
storage-0 ready healthy StatefulSet storage-0 is healthy
Sparkhead Resources: running Health Status: healthy
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Instancename State Healthstatus Details
sparkhead-0 running healthy Pod sparkhead-0 is healthy
sparkhead-1 running healthy Pod sparkhead-1 is healthy
Dashboards
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Name Url
nodeMetricsUrl https://13.91.50.9:30777/api/v1/bdc/instances/sparkhead-1/status/nodemetrics/ui
sqlMetricsUrl https://13.91.50.9:30777/api/v1/bdc/instances/sparkhead-1/status/sqlmetrics/ui
logsUrl https://13.91.50.9:30777/api/v1/bdc/instances/sparkhead-1/status/logs/ui
Storage-0 Resources: running Health Status: healthy
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Instancename State Healthstatus Details
storage-0-0 running healthy Pod storage-0-0 is healthy
storage-0-1 running healthy Pod storage-0-1 is healthy
Dashboards
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Name Url
nodeMetricsUrl https://13.91.50.9:30777/api/v1/bdc/instances/storage-0-1/status/nodemetrics/ui
sqlMetricsUrl https://13.91.50.9:30777/api/v1/bdc/instances/storage-0-1/status/sqlmetrics/ui
logsUrl https://13.91.50.9:30777/api/v1/bdc/instances/storage-0-1/status/logs/ui
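The dashboard links in the --all output can also be collected by a script. The following sketch assumes the Dashboards rows keep the two-column "Name Url" layout shown in the sample output. The helper name dashboard_urls is hypothetical.

```shell
# Print every URL listed under the given dashboard name.
# $1: value from the Name column, for example logsUrl
# Reads the text output of a status command run with --all on stdin.
dashboard_urls() {
  awk -v name="$1" 'NF == 2 && $1 == name && $2 ~ /^https?:\/\// { print $2 }'
}
```

For example, azdata bdc status show --all --resource storage-0 | dashboard_urls logsUrl would print the logs dashboard link for the storage pool. When the output contains several Dashboards sections, one URL is printed per section.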
View controller status
You can view the controller status with the azdata bdc control status show command. It provides similar links to the monitoring dashboards related to the controller components of the big data cluster.