Query with distinct sort and column alias produces error column not found

Anonymous
2021-09-27T09:12:50.44+00:00

I’m trying to use sql query on azure-databricks with distinct sort and aliases

SELECT DISTINCT album.ArtistId AS my_alias 
    FROM album ORDER BY album.ArtistId

The problem is that if I add an alias then I can not use not aliased name in the order by clause.

ORDER BY album.ArtistId part produces an error. ORDER BY my_alias works.

If I remove distinct it also works.

Seems like after Project step original column name is lost. That behavior is unexpected for SQL compared to other SQL dialects. And I can not find any documentation about it.

Error in SQL statement: AnalysisException: cannot resolve '`album.ArtistId`' given input columns: [my_alias]; line 2 pos 22;
    'Sort ['album.ArtistId ASC NULLS FIRST], true
    +- Distinct
       +- Project [ArtistId#2506 AS my_alias#2500]
          +- SubqueryAlias spark_catalog.chinook.album
             +- Relation[AlbumId#2504,Title#2505,ArtistId#2506] parquet

The where clause works fine

SELECT DISTINCT album.ArtistId AS my_alias 
    FROM album WHERE album.ArtistId = 1

This looks like a bug.
Is there any way to make this query run as is?

The query is generated by sqlachemy and works on other databases fine, I prefer to not edit it for databricks.

The schema used for testing is Chinook database.

Azure Databricks
Azure Databricks
An Apache Spark-based analytics platform optimized for Azure.
2,263 questions
0 comments No comments
{count} votes

1 answer

Sort by: Most helpful
  1. HimanshuSinha-msft 19,476 Reputation points Microsoft Employee
    2021-09-27T22:17:35.32+00:00

    Hello @,
    Thanks for the ask and using Microsoft Q&A platform .
    I can confirm that this is indeed the case . I did checked it against Azure SQL and it worked , but not on Adb . I will escalate this to internally , but I think you can work around this with the

    SELECT DISTINCT album.ArtistId AS my_alias
    FROM album ORDER BY 1

    I will go ahead and escalate this , but it will be great if you log the same with sqlachemy .

    Thanks
    Himanshu

    -------------------------------------------------------------------------------------------------------------------------

    • Please don't forget to click on 130616-image.png or upvote 130671-image.png button whenever the information provided helps you. Original posters help the community find answers faster by identifying the correct answer. Here is how
    • Want a reminder to come back and check responses? Here is how to subscribe to a notification
    • If you are interested in joining the VM program and help shape the future of Q&A: Here is how you can be part of Q&A Volunteer Moderators
    0 comments No comments

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.