I've been having trouble with a feature in Azure Databricks. The Spark UI is not shown for job clusters that have recently completed ("terminated"). There is a link to the Spark UI in the Databricks portal, but when you click the link, you are presented with a status message that just says "Loading":
Loading old UI for cluster "whatever"... This may take a few minutes.
This screen remains the same for a long period of time (hours), and I eventually lose patience and close the window. I haven't yet tried waiting overnight ... and even if I did, I'm not sure it would be reasonable to wait that long for the UI to respond.
When I previously encountered this issue and opened a support ticket with Databricks/Azure Databricks, they were not able to confirm any outages during the period in question. So far we have established that there is a "Spark History Server UI", which is a shared resource that can become congested with requests from multiple customers. I assume this implies the issue affects multiple customers simultaneously ... although we haven't yet established that for certain.
I've been using Azure Databricks in production for a few months now, and I'm not familiar enough with it to know whether the issue could be specific to us, or whether it might be a chronic issue that also affects others in the same region. I googled the problem and was unable to find any relevant results. So I thought it would be good to start a new discussion about it here in the Q&A.
Please let me know if anyone has an explanation, or has experienced this themselves. I'm also eager to hear whether there are any tricks to get the workspace working properly again. Whenever I've encountered this issue, the problem wouldn't go away on its own until a day or two had passed. I haven't yet gotten any acknowledgement of these outages from Microsoft's side.
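For what it's worth, while waiting I usually run a quick sanity check against the Databricks Clusters REST API (`GET /api/2.0/clusters/get`) to confirm the cluster really is in a TERMINATED state and that the workspace API itself is responding, so I can rule out a problem on my side before blaming the History Server. This is just a sketch; the `DATABRICKS_HOST` / `DATABRICKS_TOKEN` environment variables and the cluster id are placeholders for your own workspace URL, personal access token, and cluster.

```python
import json
import os
import urllib.request


def cluster_state(payload: dict) -> str:
    """Extract the cluster state (e.g. RUNNING, TERMINATED) from a
    clusters/get API response body."""
    return payload.get("state", "UNKNOWN")


def fetch_cluster(host: str, token: str, cluster_id: str) -> dict:
    """Call the Databricks Clusters API: GET /api/2.0/clusters/get."""
    url = f"{host}/api/2.0/clusters/get?cluster_id={cluster_id}"
    req = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


if "DATABRICKS_HOST" in os.environ and "DATABRICKS_TOKEN" in os.environ:
    # Hypothetical cluster id -- substitute the one from the job run page.
    info = fetch_cluster(
        os.environ["DATABRICKS_HOST"],
        os.environ["DATABRICKS_TOKEN"],
        "0123-456789-abcdef",
    )
    print(cluster_state(info))
```

If the API answers promptly with `"state": "TERMINATED"` while the Spark UI page still hangs on "Loading", that at least suggests the workspace control plane is healthy and the bottleneck is on the History Server side.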