SSAS Processing Delays with Multiple Key Columns on a Dimension

Bala Murali 1 Reputation point
2020-09-24T10:26:47.783+00:00

Hi,

We recently changed a dimension key to be based on multiple columns (previously the key was based on 1 column; now it is based on 3 columns), which means we had to add additional columns to all measure groups so they join the dimension on the 3-column combination.
With this change we have noticed that a ProcessUpdate of the dimension takes much longer than usual.
Secondly, our cubes are partitioned. During the daily process we run ProcessUpdate on dimensions and ProcessAdd on measure groups, and during these operations many partition folders and files get updated. We have 2 instances of SSAS servers: we process data on the primary server and move all changed partition folders to the secondary instance (this is a Windows file operation). Because so many files are updated, the sync between the primary and secondary has also gotten longer.
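For reference, the daily dimension ProcessUpdate is issued as an XMLA command roughly like the following (a minimal sketch; `MyCubeDB` and `MyDimension` are placeholder IDs, not our real object names):

```xml
<!-- Minimal XMLA ProcessUpdate for a single dimension.
     DatabaseID and DimensionID below are placeholders. -->
<Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Process>
    <Object>
      <DatabaseID>MyCubeDB</DatabaseID>
      <DimensionID>MyDimension</DimensionID>
    </Object>
    <Type>ProcessUpdate</Type>
  </Process>
</Batch>
```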

So, can someone please suggest why so many files are updated during ProcessAdd/ProcessUpdate? Is there any workaround to limit this?
Also, are there any good blogs or docs on the pros and cons of implementing multiple key columns in cube dimensions?

Regards,
Bala

SQL Server Analysis Services

5 answers

  1. Lukas Yu -MSFT 5,816 Reputation points
    2020-09-25T07:03:38.237+00:00

    Hi Bala,
    Could you indicate which files you are referring to in this scenario? They could be the aggregations and other data being updated in the cube.

    As for a composite key in an OLAP cube, from the data-aggregation perspective (which is the central function of SSAS), a composite key and a single primary key should perform much the same. (I did not find documentation on this; it is my own opinion.) Other considerations are much like those for a normal relational database. You could check this: What are the pros and cons of using multi column primary keys?

    Regards,
    Lukas


    If the answer is helpful, please click "Accept Answer" and upvote it.

    Note: Please follow the steps in our documentation to enable e-mail notifications if you want to receive the related email notification for this thread.


  2. Bala Murali 1 Reputation point
    2020-09-26T14:32:12.59+00:00

    Hi Lukas,

    My bad, I was referring to the partition folders of the measure groups and the files within those folders that hold the mappings and aggregated multidimensional data. As you mentioned, a composite key and a single primary key should behave the same, but there could still be processing or performance delays.

    However, my main concern is that when we do a ProcessUpdate on a dimension, all the partition folders are stamped with the current timestamp, even though there are no new dimension members to update in the cubes and no changes to the data files in those folders. Is this expected behavior?

    As I mentioned earlier, we process data on one server and move the changed partition folders to the other server based on modified time through a Windows operation. Since every partition appears to be updated every day, our process simply tries to move all partitions, which takes a lot of time for cubes over 500 GB in size.


  3. Lukas Yu -MSFT 5,816 Reputation points
    2020-09-28T02:50:54.1+00:00

    What do you mean by "move" in this scenario? Are you trying a synchronization task?
    You could try the built-in synchronization method instead, which may be better optimized: Synchronize Analysis Services Databases
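    As a sketch, that built-in synchronization is driven by an XMLA Synchronize command executed on the secondary (target) instance, which pulls only the changed files from the primary; the server name and database ID below are placeholders:

    ```xml
    <!-- Run on the secondary (target) instance.
         ConnectionString and DatabaseID are placeholders. -->
    <Synchronize xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
      <Source>
        <ConnectionString>Provider=MSOLAP;Data Source=PrimaryServer;Integrated Security=SSPI</ConnectionString>
        <Object>
          <DatabaseID>MyCubeDB</DatabaseID>
        </Object>
      </Source>
      <SynchronizeSecurity>CopyAll</SynchronizeSecurity>
      <ApplyCompression>true</ApplyCompression>
    </Synchronize>
    ```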




  4. Bala Murali 1 Reputation point
    2020-09-28T06:57:12.697+00:00

    No, it is not a sync task. We use the Windows robocopy utility to synchronize data between the primary and secondary servers. Instead of processing on both servers, we process the Analysis Services databases on the primary server and then run robocopy. Robocopy compares all folders/files between the primary and secondary servers based on timestamp, size, etc., and copies the data files that differ.

    Hope this explanation makes sense.
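    A robocopy invocation of the kind described might look like this (the share paths and log location are placeholder assumptions, not our actual setup):

    ```bat
    :: Mirror the SSAS Data folder from primary to secondary.
    :: Paths, shares, and log file below are placeholders.
    :: /MIR       mirror the tree (copy changes, delete extras on target)
    :: /FFT       use 2-second timestamp granularity (fewer spurious "changed" files)
    :: /R:3 /W:5  retry 3 times, waiting 5 seconds between retries
    robocopy "\\PRIMARY\OLAP$\Data" "\\SECONDARY\OLAP$\Data" /MIR /FFT /R:3 /W:5 /LOG:"D:\Logs\ssas_sync.log"
    ```

    Because robocopy selects files by timestamp, a ProcessUpdate that re-stamps every partition folder makes it copy far more than actually changed.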


  5. Lukas Yu -MSFT 5,816 Reputation points
    2020-10-01T07:14:35.763+00:00

    I am not quite sure about robocopy, since I am not familiar with it. I think what you mention in the original post could be a result of how robocopy compares files, rather than something caused by SSAS. But I am also not certain here.
