Different SQL result between Databricks SQL warehouse and Databricks runtime 18

Question

Nathan Eckert 0 Reputation points

Hello,

I have an issue with the latest version of the Databricks Runtime 18.

The following query (sorry for the screenshot; when I tried to post it as markdown, the site blocked me with the message "You are not authorized to make this response. If you believe this to be in error, please refresh the page and try again.")

User's image

returns the result: "1, null, null, null"

However, on a Databricks SQL warehouse and on the previous runtime version, it returned: "1, null, 2, null"

This seems to be a regression.

Regards,

Manoj Kumar Boyini 18,355 Reputation points Microsoft External Staff Moderator

2026-07-01T14:09:48.23+00:00

Hi @Nathan Eckert

Thank you for providing the reproducible query. To help investigate further, could you please provide the following details?

Which Databricks Runtime 18.x version are you using (for example, 18.0, 18.1, etc.)?

Which previous Databricks Runtime version returns the expected result (1, null, 2, null)?

Are you running the query on a Photon-enabled cluster? If so, does the behavior change when Photon is disabled?

Could you share the output of EXPLAIN FORMATTED for the query when executed on both the Databricks Runtime 18 cluster and the SQL Warehouse?

Which SQL Warehouse type are you using (Serverless, Pro, or Classic)?

Manoj Kumar Boyini 18,355 Reputation points Microsoft External Staff Moderator

2026-07-02T05:15:52.47+00:00

Hi @Nathan Eckert

We haven’t received a response. Could you please share the requested details so that we can investigate further?

Nathan Eckert 0 Reputation points

2026-07-02T08:05:40.43+00:00

Databricks Runtime 18 version: I selected the 18 channel on Databricks (not 18.1, not 18.2, just 18). Yes, it has Photon acceleration.

Databricks Runtime 17 version: 17.3, no Photon acceleration.

Databricks SQL warehouse: Serverless

I do not have time today to experiment more, but with all these details and the reproducible query, which does not depend on any data (it took me some time to create this minimal reproducible example), you should be able to investigate on your end.

Answer 1

Hi,

Thanks for reaching out to Microsoft Q&A.

The difference appears to come from how IS NOT DISTINCT FROM is evaluated in the join condition on Runtime 18.

In your query, T0.b is NULL and T1.d is also NULL, so the join condition (T0.b IS NOT DISTINCT FROM T1.d) should evaluate to true, because this operator treats two NULLs as equal. The previous runtime and the SQL warehouse honour this and produce the match, hence you see 2 in o_2. In Runtime 18, the optimizer is likely rewriting or pushing down the join in a way that breaks this null-safe equality handling, so that it behaves like a standard equality join during execution: NULL = NULL evaluates to NULL rather than true, the row does not match, and you get NULL instead of 2. I have not confirmed this against the query plans. Since your Runtime 18 cluster has Photon enabled and the 17.3 cluster does not, Photon could also be a factor; comparing the EXPLAIN FORMATTED output with Photon on and off would help narrow it down.
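To make the two behaviours concrete, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in engine. It is not Databricks and it is not your original query (that was only posted as a screenshot): the tables t0(a, b) and t1(c, d) and their contents are hypothetical, chosen only to mirror the T0.b / T1.d aliases. SQLite's IS operator plays the role of null-safe equality here.

```python
import sqlite3

# Hypothetical tables standing in for T0 and T1; the original query is not
# available in text form, so this only illustrates the semantics.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t0 (a INTEGER, b INTEGER);
    CREATE TABLE t1 (c INTEGER, d INTEGER);
    INSERT INTO t0 VALUES (1, NULL);
    INSERT INTO t1 VALUES (2, NULL);
""")

# Plain equality: NULL = NULL is NULL, so the join key never matches.
plain = conn.execute(
    "SELECT t0.a, t0.b, t1.c, t1.d FROM t0 LEFT JOIN t1 ON t0.b = t1.d"
).fetchall()

# Null-safe equality (SQLite spells it IS): NULL matches NULL.
null_safe = conn.execute(
    "SELECT t0.a, t0.b, t1.c, t1.d FROM t0 LEFT JOIN t1 ON t0.b IS t1.d"
).fetchall()

print(plain)      # [(1, None, None, None)]
print(null_safe)  # [(1, None, 2, None)]
```

The null-safe join gives the same shape as the "1, null, 2, null" you get on the SQL warehouse, while the plain equality join gives the same shape as the "1, null, null, null" you see on Runtime 18. That is consistent with the hypothesis above, but it does not prove it.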

I feel this is very likely a regression or optimizer bug rather than intended behaviour. As a workaround, avoid relying on IS NOT DISTINCT FROM in join predicates for now and write the condition out explicitly as (T0.b = T1.d OR (T0.b IS NULL AND T1.d IS NULL)). As a join predicate this is logically equivalent, and it may avoid the problematic code path and give consistent results across runtimes, although I have not verified that on Runtime 18.
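If you want to convince yourself that the explicit form accepts exactly the same rows, the sketch below models SQL's three-valued logic in plain Python (None stands for NULL/UNKNOWN) and compares the two predicates over every combination of a few sample values. The helper names are my own and purely illustrative; nothing here calls Databricks.

```python
from itertools import product

def sql_eq(a, b):
    """SQL `a = b`: UNKNOWN (None) when either operand is NULL."""
    if a is None or b is None:
        return None
    return a == b

def sql_and(x, y):
    """Three-valued AND."""
    if x is False or y is False:
        return False
    if x is None or y is None:
        return None
    return True

def sql_or(x, y):
    """Three-valued OR."""
    if x is True or y is True:
        return True
    if x is None or y is None:
        return None
    return False

def is_not_distinct_from(a, b):
    """Null-safe equality: always True or False, never UNKNOWN."""
    if a is None or b is None:
        return a is None and b is None
    return a == b

def rewritten(a, b):
    """a = b OR (a IS NULL AND b IS NULL); IS NULL is never UNKNOWN."""
    return sql_or(sql_eq(a, b), sql_and(a is None, b is None))

# A join keeps a row pair only when the predicate is exactly TRUE.
values = [None, 1, 2]
mismatches = [
    (a, b)
    for a, b in product(values, repeat=2)
    if (rewritten(a, b) is True) != is_not_distinct_from(a, b)
]
print(mismatches)  # []
```

One caveat: when exactly one side is NULL, the rewritten form evaluates to UNKNOWN rather than FALSE. That makes no difference in an ON or WHERE clause, where only TRUE keeps the row, but it would matter if you wrapped the expression in NOT.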