Synapse dedicated SQL pool external table failure with COPY schema discovery error after restart

Question

Michael Clemans 140

Hello,

We encountered an issue where external Parquet-based tables in a dedicated SQL pool started failing with the following error:

COPY statement input file schema discovery failed: Cannot process the file https://

SAI JAGADEESH KUDIPUDI 3,465 Reputation points Microsoft External Staff Moderator

2026-05-14T23:57:05.3966667+00:00

Hi **Michael Clemans**,
I hope you had a chance to review the information shared earlier and that it has been helpful. If you still have questions, please let us know in the comments what is needed so the question can be answered.

2 answers

Answer 1

Hi Michael Clemans,

Yes, transient Azure service or networking outages can sometimes trigger this behavior in Synapse Dedicated SQL Pool external table access.

External tables in dedicated SQL pools rely on multiple backend components working together, including:

- PolyBase / COPY engine services
- Data Movement Service (DMS)
- ADLS Gen2 connectivity
- Authentication/token validation
- Internal Azure networking between Synapse compute nodes and storage

If any of those layers is temporarily disrupted (for example, by a regional networking issue, a storage-access interruption, a backend service update, or a transient platform outage), the SQL pool may enter a degraded state where external file reads start failing with errors like:

“COPY statement input file schema discovery failed … file could not be opened.”

In some cases, the affected pool does not automatically recover its storage connectivity/session state even after the underlying issue clears. Restarting the dedicated SQL pool, or pausing and resuming it, forces the pool to reinitialize its compute nodes, PolyBase services, and storage connections, which is why the issue resolves immediately afterward.
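
As a rough illustration of the pause/resume step described above, the sketch below builds the two Azure CLI calls (`az synapse sql pool pause` and `az synapse sql pool resume`) from Python. The pool, workspace, and resource group names are placeholders, not values from this thread, and the sketch assumes the Azure CLI is installed and signed in. By default it only returns the commands and does not run them.

```python
import subprocess
from typing import List


def build_pool_command(action: str, pool: str, workspace: str, resource_group: str) -> List[str]:
    """Return the Azure CLI argument list for pausing or resuming a dedicated SQL pool."""
    if action not in ("pause", "resume"):
        raise ValueError(f"unsupported action: {action!r}")
    return [
        "az", "synapse", "sql", "pool", action,
        "--name", pool,
        "--workspace-name", workspace,
        "--resource-group", resource_group,
    ]


def restart_pool(pool: str, workspace: str, resource_group: str, dry_run: bool = True) -> List[List[str]]:
    """Build the pause and resume commands, and run them in order unless dry_run is set."""
    commands = [
        build_pool_command(action, pool, workspace, resource_group)
        for action in ("pause", "resume")
    ]
    if not dry_run:
        for cmd in commands:
            # check=True stops the sequence if the CLI reports a failure.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Placeholder names: replace with your own pool, workspace, and resource group.
    for cmd in restart_pool("examplepool", "exampleworkspace", "example-rg"):
        print(" ".join(cmd))
```

Keeping `dry_run=True` as the default is a deliberate choice here: pausing a pool interrupts running queries, so it is safer to review the generated commands before executing them.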

Your observation that the always-running pool was affected, while the periodically restarted pool was not, also aligns with this type of transient runtime-state issue.

Additionally, not every transient backend issue or short-duration platform disruption results in a public Azure status announcement. Some temporary service-side or networking issues may self-recover quickly or affect only a subset of infrastructure, so customers can sometimes observe intermittent failures even when there is no active public incident posted.

At this time there is no public Microsoft documentation confirming a direct relationship between the Microsoft 365 incident and Synapse connectivity, but temporary Azure infrastructure or networking disruptions can indirectly impact external table access behavior.

Relevant documentation:

https://learn.microsoft.com/azure/synapse-analytics/sql/develop-tables-external-tables

https://learn.microsoft.com/azure/synapse-analytics/known-issues

https://learn.microsoft.com/sql/t-sql/statements/copy-into-transact-sql?view=azure-sqldw-latest

Hope this helps. If you have any follow-up questions, please let me know. I would be happy to help.

Smaran Thoomu 35,375 Reputation points Microsoft External Staff Moderator

2026-05-21T09:07:40.2733333+00:00

@Michael Clemans One other point worth checking is whether the external data source authentication/session state became stale after the transient interruption. In some scenarios, restarting or pausing/resuming the dedicated SQL pool refreshes the internal connectivity, credential tokens, and PolyBase runtime state used for external table access.

It may also help to validate:

- ADLS Gen2 accessibility from Synapse
- Managed Identity / SAS / credential validity
- External file schema consistency across Parquet files
- Whether newly added Parquet files had schema drift

Since the issue resolved immediately after restarting the dedicated SQL pool, this does appear consistent with a transient runtime or connectivity state issue rather than permanent corruption of the external tables themselves.
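
Of the checks listed above, schema consistency is the easiest to sketch offline. The example below is a minimal illustration, not a Synapse feature: it assumes you have already collected each file's schema as a plain mapping of column name to type name (for instance with a Parquet reader of your choice), and the file names and columns shown are made up. It treats the most common schema as the baseline and reports how the other files differ. Column order is ignored.

```python
from collections import Counter
from typing import Dict, List

Schema = Dict[str, str]  # column name -> type name


def find_schema_drift(schemas: Dict[str, Schema]) -> Dict[str, List[str]]:
    """Compare each file's schema against the most common one and list the differences."""
    if not schemas:
        return {}
    # Use the most frequent schema across all files as the baseline.
    counts = Counter(tuple(sorted(s.items())) for s in schemas.values())
    baseline = dict(counts.most_common(1)[0][0])

    drift: Dict[str, List[str]] = {}
    for path, schema in schemas.items():
        issues: List[str] = []
        for col in sorted(baseline.keys() - schema.keys()):
            issues.append(f"missing column {col}")
        for col in sorted(schema.keys() - baseline.keys()):
            issues.append(f"unexpected column {col}")
        for col in sorted(baseline.keys() & schema.keys()):
            if baseline[col] != schema[col]:
                issues.append(f"type of {col} is {schema[col]}, expected {baseline[col]}")
        if issues:
            drift[path] = issues
    return drift


if __name__ == "__main__":
    # Hypothetical schemas for three files in the external table's folder.
    example = {
        "part-0001.parquet": {"id": "int64", "amount": "double"},
        "part-0002.parquet": {"id": "int64", "amount": "double"},
        "part-0003.parquet": {"id": "string", "amount": "double", "note": "string"},
    }
    for path, issues in find_schema_drift(example).items():
        print(path, issues)
```

An empty result suggests the files agree with each other, which would support the transient-state explanation over schema drift.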

Answer 2

The behavior described is consistent with a transient issue in the dedicated SQL pool engine or its connectivity to storage, rather than a permanent problem with the external table definitions or Parquet files.

From the available guidance:

- External tables over files in Azure Storage (Blob/ADLS) are known to be sensitive to storage access and network conditions. When storage is protected by firewalls or network rules, failures like “file cannot be opened” or “directory cannot be listed” are explicitly called out as connectivity issues between Synapse and the storage account, not schema or metadata problems.
- For some service-side issues, Microsoft explicitly recommends pausing and resuming (effectively restarting) the dedicated SQL pool to restore normal operation. For example, when the Distribution_nn_Cache database cannot be opened due to inaccessible files or insufficient memory/disk, the mitigation is to pause and resume the pool, which resets the service state.
- There is also a general pattern in Synapse known issues and troubleshooting guidance where pausing/resuming is used to clear internal state or recover from transient service health issues.

Given that:

- Multiple external tables failed with the same schema discovery / file open error.
- No configuration changes were made to the external tables or storage.
- The issue disappeared immediately after restarting the dedicated SQL pool.

The most likely explanations, based on documented patterns, are:

- A transient service health or engine state issue in the dedicated SQL pool (for example, an internal cache or metadata/database state problem similar in nature to the Distribution_nn_Cache issue) that was cleared by the restart.
- A temporary connectivity or network path problem between the dedicated SQL pool and the ADLS Gen2 account, which was resolved when the pool was paused/resumed and re-established its connections.

Why a restart helps in these scenarios:

Pausing/resuming a dedicated SQL pool tears down and reinitializes the compute nodes, caches, and internal databases. This can:
- Clear corrupted or stale internal state related to external table metadata or file handles.
- Re‑establish network connections and authentication paths to external storage.
- Recover from transient service health issues that affect access to internal or external storage.

Is it transient engine behavior or storage connectivity?

The error message “file could not be opened” and the fact that other pools and the storage account continued to work point toward transient engine or connectivity behavior, not a persistent storage configuration problem.
The documented mitigations for similar “inaccessible files” / service health issues explicitly include pausing and resuming the dedicated SQL pool, which aligns with what resolved the issue in this case.
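
Because the pool in this case did not recover on its own, one way to act on the reasoning above is to watch for the error signature and escalate only after repeated failures. The sketch below is an assumption-laden illustration: `ProbeTracker`, its threshold, and the idea of a periodic probe query are hypothetical, and only the two error fragments come from the messages quoted in this thread.

```python
from dataclasses import dataclass
from typing import Optional

# Fragments taken from the error messages quoted in this thread.
SIGNATURES = (
    "schema discovery failed",
    "could not be opened",
)


def looks_like_external_read_failure(message: str) -> bool:
    """Return True if the error text matches one of the known fragments."""
    text = message.lower()
    return any(sig in text for sig in SIGNATURES)


@dataclass
class ProbeTracker:
    """Counts consecutive matching failures from a periodic external-table probe."""
    threshold: int = 3
    consecutive_failures: int = 0

    def record(self, error_message: Optional[str] = None) -> bool:
        """Record one probe result; return True once pause/resume looks warranted."""
        if error_message is None:
            # A successful probe resets the streak.
            self.consecutive_failures = 0
        elif looks_like_external_read_failure(error_message):
            self.consecutive_failures += 1
        # Unrelated errors leave the streak unchanged.
        return self.consecutive_failures >= self.threshold


if __name__ == "__main__":
    tracker = ProbeTracker(threshold=2)
    print(tracker.record("COPY statement input file schema discovery failed"))
    print(tracker.record("file could not be opened"))
```

Requiring several consecutive matches avoids pausing a pool over a single blip, while a success resets the count so that isolated errors spread over time do not accumulate.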

In summary, this pattern is aligned with known transient issues in Synapse dedicated SQL pools where pausing/resuming is the recommended recovery step. It is typically attributed to temporary engine/service health or storage connectivity conditions rather than permanent schema or configuration problems.
