Hello Aakash CK(P),
Welcome to the Microsoft Q&A and thank you for posting your questions here.
Sequel to your questions, I understand that you are having questions for certainty about migrating a large database to Azure SQL Managed Instance.
To answer this questions, I will break it down according to each question.
Question 1:
Is it necessary to run update statistics with full scan for all the databases immediately post-cutover? I have a total of 40 databases.
Answer 1:
Running UPDATE STATISTICS
with a full scan is not strictly necessary immediately after cutover, but it is highly recommended to ensure optimal performance.
Question 2:
If I am not performing the update statistics immediately, will it cause any performance issues? Can I take it up one by one during non-peak hours?
Answer 2:
If you do not update statistics immediately, you may experience suboptimal query performance because the statistics from your on-premises environment might not accurately reflect the data distribution in Azure SQL Managed Instance. However, you can mitigate this by updating statistics during non-peak hours, spreading the load over time. This approach will allow you to maintain system performance while gradually improving the accuracy of query plans.
Question 3:
After cutover through Azure Data Studio, Azure also takes DB backup of all DBs immediately, which causes an IO spike. Is there any way to stop it? Will this also slow down update statistics?
Answer 3:
Azure SQL Managed Instance performs automated backups to ensure data protection and recovery. This process can cause an IO spike, especially if it coincides with other IO-intensive operations like updating statistics. Unfortunately, you cannot disable these automated backups, but you can manage the timing of other intensive operations to minimize the impact. Plan to run updates and other maintenance tasks during off-peak hours to distribute the IO load.
Question 4:
If update statistics is mandate, can we at least avoid full scan? Can we go with sample with some percent? what would be the appropriate percent if so?
Answer 4:
If a full scan is not feasible due to time or resource constraints, you can use a sampled update instead. The appropriate sample size depends on the specific workload and data distribution of your databases. Typically, a sample rate of 10% to 30% is used, but you might need to experiment to find the optimal balance between speed and accuracy for your environment. Start with a lower sample rate and monitor performance, adjusting as necessary.
Accept Answer
I hope this is helpful! Do not hesitate to let me know if you have any other questions.
** Please don't forget to close up the thread here by upvoting and accept it as an answer if it is helpful ** so that others in the community facing similar issues can easily find the solution.
Best Regards,
Sina Salam