Hi @Anuw Malik,
First of all, PolyBase exists to join data from external sources with relational tables in a SQL Server instance, and statistics let the SQL Server query optimizer make the best possible decision about how to execute a query. So as far as creating statistics is concerned, it is beneficial.
However, the documentation on CREATE STATISTICS also says:
Statistics for external tables
When creating external table statistics, SQL Server imports the external table into a temporary SQL Server table, and then creates the statistics. For sampled statistics, only the sampled rows are imported. If you have a large external table, it will be much faster to use the default sampling instead of the full scan option.
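To make the sampling choice concrete, here is a minimal T-SQL sketch of both options. The external table dbo.DimCustomer_ext, its columns, and the statistics names are hypothetical placeholders, not objects from your environment; substitute your own.

```sql
-- Assumption: dbo.DimCustomer_ext is an existing PolyBase external table
-- with CustomerKey and EmailAddress columns (hypothetical names).

-- Default sampling: only the sampled rows are imported into the
-- temporary table, so this is usually much faster on a large table.
CREATE STATISTICS CustomerStats_Sampled
    ON dbo.DimCustomer_ext (CustomerKey, EmailAddress);

-- Full scan: every row is imported before the statistics are built.
-- More accurate, but can be slow for a large external table.
CREATE STATISTICS CustomerStats_Full
    ON dbo.DimCustomer_ext (CustomerKey, EmailAddress)
    WITH FULLSCAN;
```

In practice you would normally pick one of the two statements for a given column set rather than keep both.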
The same documentation also lists this limitation:
Updating statistics is not supported on external tables. To update statistics on an external table, drop and re-create the statistics.
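As a rough illustration of the drop-and-re-create approach, the refresh might look like the sketch below. The table dbo.DimCustomer_ext, the column list, and the statistics name CustomerStats are hypothetical; use the names from your own database.

```sql
-- Assumption: CustomerStats already exists on the external table
-- dbo.DimCustomer_ext (hypothetical names).

-- UPDATE STATISTICS is not supported on external tables,
-- so remove the stale statistics object first...
DROP STATISTICS dbo.DimCustomer_ext.CustomerStats;

-- ...then build it again from the current external data
-- (default sampling here; add WITH FULLSCAN if you need it).
CREATE STATISTICS CustomerStats
    ON dbo.DimCustomer_ext (CustomerKey, EmailAddress);
```

Because each re-creation imports the data again, you may want to schedule this only when the external data has changed significantly.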
If you have sufficient permissions on the remote server, the remote optimizer has full access to the statistics on that server, but you may still need plan guides and maintenance procedures to get the results you want.
In any case, statistics are useful, but external data sources bring more trouble than local ones.