Hello,
Yes, you're interpreting it correctly.
When you query an external table in an Azure Synapse Analytics dedicated SQL pool, the engine reads the data from the external data source, such as Azure Data Lake Storage, and imports it into temporary tables inside the dedicated pool. The imported rows are spread evenly across the pool's distributions using a Round Robin distribution.
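As a rough sketch, this is what such an external table might look like in a dedicated SQL pool. The storage account, object names, and schema below are made up for illustration, and the authentication setup (a database scoped credential) is omitted:

```sql
-- Hypothetical external data source pointing at an ADLS Gen2 container;
-- in practice a database scoped credential is usually referenced here too.
CREATE EXTERNAL DATA SOURCE LakeSource
WITH (
    TYPE = HADOOP,
    LOCATION = 'abfss://data@mystorageaccount.dfs.core.windows.net'
);

-- File format describing the Parquet files in the lake
CREATE EXTERNAL FILE FORMAT ParquetFormat
WITH (FORMAT_TYPE = PARQUET);

-- External table: only the metadata lives in the dedicated pool,
-- the data itself stays in the lake.
CREATE EXTERNAL TABLE dbo.SalesExternal
(
    SaleId   BIGINT,
    SaleDate DATE,
    Amount   DECIMAL(18, 2)
)
WITH (
    LOCATION = '/sales/',
    DATA_SOURCE = LakeSource,
    FILE_FORMAT = ParquetFormat
);
```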
This approach allows Synapse to leverage its distributed processing capabilities to perform queries on the data, which is crucial for achieving high performance on large datasets.
When you query an external table backed by Parquet files in Azure Data Lake Storage, Synapse Analytics reads the Parquet files and loads only the columns the query needs into those temporary tables in the dedicated pool. The query then operates on these temporary tables as if the data were stored within the dedicated pool itself.
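Continuing the hypothetical table above, a query like the following only references SaleDate and Amount, so only those columns need to be read from the Parquet files and imported for the duration of the query. You can also prefix a statement with EXPLAIN to look at the distributed plan, which can help confirm how the external data is moved:

```sql
-- Only the columns referenced by the query are read from the Parquet
-- files and imported into the temporary, round-robin distributed tables.
SELECT SaleDate, SUM(Amount) AS TotalAmount
FROM dbo.SalesExternal
WHERE SaleDate >= '2023-01-01'
GROUP BY SaleDate;

-- Optional: return the distributed query plan instead of running the query.
EXPLAIN
SELECT SaleDate, SUM(Amount) AS TotalAmount
FROM dbo.SalesExternal
GROUP BY SaleDate;
```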
It's important to note that the imported data is not persisted in the dedicated pool after the query completes. This gives you a balance between performance and flexibility: you can query data stored in external sources efficiently without having to load and store it in the dedicated pool permanently.
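If you do want the data persisted inside the dedicated pool (for example because the same external data is queried repeatedly), one common option is to materialize it with CTAS. This is only a sketch, and the distribution and index choices below are illustrative, not a recommendation:

```sql
-- Hypothetical CTAS: copies the external data into a regular,
-- permanently stored distributed table in the dedicated pool.
CREATE TABLE dbo.Sales
WITH
(
    DISTRIBUTION = HASH(SaleId),
    CLUSTERED COLUMNSTORE INDEX
)
AS
SELECT SaleId, SaleDate, Amount
FROM dbo.SalesExternal;
```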