Suddenly Slow Azure Analysis Services Tabular Model Process with Data from Azure Data Lake Storage Gen 1

Jared Buschkopf 211 Reputation points
2022-06-21T15:57:59.937+00:00

At my organization, we have had a tabular model running in Azure analysis services which pulls data from ADLS Gen 1 for some time (years). Doing a full process daily on the AAS database to refresh the data in the model from the data in ADLS Gen 1 typically takes ~30-40 minutes but suddenly is taking hours as of 6/9/2022.

Messages returned from the REST API used to process the model:

messages : {@{message=Retry attempts for failures while executing the refresh exceeded the retry limit set on
the request.; type=error}, @{code=0xC112001C; message=Timeout expired. The timeout period elapsed
prior to completion of the operation.. The exception was raised by the IDbCommand interface.;
type=error; location=}, @{code=0xC112001A; message=The command has been canceled.. The exception
was raised by the IDataReader interface. Please review the error message and provider
documentation for further information and corrective action.; type=error; location=},
@{code=0xC112001A; message=The command has been canceled.. The exception was raised by the
IDataReader interface. Please review the error message and provider documentation for further
information and corrective action.; type=error; location=}...}

We have a staging environment using a separate analysis services instance that pulls data from a separate staging folder on the same ADLS Gen 1 instance that the production AAS instance uses. This staging AAS instance also experiencing the same slowness/timeouts suddenly. This leads me to believe that perhaps there is some kind of issue with ADLS Gen 1. Has something changed recently impacting read performance between AAS and ADLS Gen 1?

I was able to get the staging environment AAS model to process successfully in ~1 hr 20 min after increasing the ExternalCommandTimeout on the staging analysis services instance from 3600 seconds to 7200 seconds (2 hours). However production still isn't working after increasing the timeout in the production AAS to 10800 seconds (3 hours).

I do not believe this is a data issue for the following reasons:

  • our data collection process that pulls data into the data lake staging folder has been off so the data has not changed since we last processed the AAS tabular model in the staging environment yet the same process is taking significantly longer there just like production. Staging used to process in ~20 min but now takes > 1 hour with the same data.
  • there are no other errors beyond just the timeout eventually so it seems like it's processing OK just taking forever
  • there doesn't seem to be abnormally large amounts of data being pulled into production data lake folder by the production data collection process as of 6/9.

Any help with this would be greatly appreciated as I cannot explain why suddenly it's taking so long to refresh the data in our tabular model. Has anyone similar performance issues recently?

I also understand that ADLS Gen 1 is going away and we will need to migrate at Gen 2 at some point but Gen 1 should still be working for some time. I have other priorities at the moment preventing me from digging into the migration to Gen 2.

I do also see recommendations in the best practices for ADLS from Microsoft (https://learn.microsoft.com/en-us/azure/data-lake-store/data-lake-store-best-practices?WT.mc_id=Portal-Microsoft_Azure_Support#performance-and-scale-considerations) to use fewer big files rather than lots of little files. We do have > 30,000 files used in this process in production (> 10,000 in staging), most of which are small from what I've seen (< 1 MB) but it's worked fine for ages so I'm struggling to understand what could have happened on 6/9 to cause such a drastic drop in performance.

Note: we're running our analysis services at B1 normally and scale up to B2 for the processing. I don't want to have to move to standard tiers if we can avoid it as we can't scale back down to B1 then after the model refresh is done and our costs will go up.

Azure Data Lake Storage
Azure Data Lake Storage
An Azure service that provides an enterprise-wide hyper-scale repository for big data analytic workloads and is integrated with Azure Blob Storage.
1,426 questions
Azure Analysis Services
Azure Analysis Services
An Azure service that provides an enterprise-grade analytics engine.
444 questions
{count} votes

1 answer

Sort by: Most helpful
  1. Jared Buschkopf 211 Reputation points
    2022-06-27T13:26:09.747+00:00

    @HimanshuSinha-msft
    I have no idea what changed or who did it or when, but the job that refreshes the analysis services model from the data in the data lake magically worked this morning. It completed successfully in the normal amount of time (~30 min). Thanks to anyone on your side that may have helped.