dataframe not showing field

arkiboys 9,706 Reputation points
2022-04-08T10:42:20.737+00:00

Hello,
In ADLS Gen2, I have loaded parquet files into folders such as:

containername/folderName/year=2022/month=03/day=05
containername/folderName/year=2022/month=03/day=06
...
containername/folderName/year=2022/month=04/day=06
containername/folderName/year=2022/month=04/day=07
containername/folderName/year=2022/month=04/day=08
...

In Databricks PySpark, I am creating a dataframe to read the data in the folders.
Example:
folder_path = "/folderName/*"
df = spark.read.parquet(f"abfss://{container_name}@{storage_account_name}.dfs.core.windows.net{folder_path}")

df_final = df.filter("month=04 and day=08") --> this is OK and returns data.
Question:
The filter does not seem to see the year field. For example, I cannot use:
df_final = df.filter("year = 2022 and month=04 and day=08")

df returns all fields, including month and day at the end of the display, but I do not see the year field.

Any thoughts?
Thank you

Azure Databricks
An Apache Spark-based analytics platform optimized for Azure.

Answer accepted by question author
  1. PRADEEPCHEEKATLA 91,656 Reputation points Moderator
    2022-04-11T10:45:37.957+00:00

    Hello @arkiboys ,

    Thanks for the question and for using the MS Q&A platform.

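    Error: When you use folder_path = "/folderName/*", the filter throws AnalysisException: Column 'year' does not exist. Did you mean one of the following? [day, _c0, _c1, _c2, _c3, month]; line 1 pos 0. The wildcard resolves to the year=2022 folder itself, so Spark's partition discovery starts below that level and infers only month and day as columns. The column 'year' therefore does not exist in the dataframe, as shown below: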

    191788-image.png
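
To make the missing column easier to picture, here is a toy model in plain Python. It is not Spark's implementation, only a simplified sketch of the rule that partition columns are taken from the key=value folders lying below the path handed to the reader. The function name partition_columns and the file name part-0000.parquet are made up for illustration.

```python
from pathlib import PurePosixPath


def partition_columns(given_path, file_path):
    """Collect key=value folder segments of file_path that lie below given_path.

    Simplified model only: segments at or above given_path are ignored,
    which mirrors why 'year' is lost when the path already includes year=2022.
    """
    relative = PurePosixPath(file_path).relative_to(PurePosixPath(given_path))
    columns = {}
    for segment in relative.parent.parts:
        if "=" in segment:
            key, value = segment.split("=", 1)
            columns[key] = value
    return columns


# Hypothetical file name inside one of the day folders from the question.
file_path = "/folderName/year=2022/month=04/day=08/part-0000.parquet"

# Reading "/folderName/": discovery starts at folderName.
print(partition_columns("/folderName", file_path))

# Reading "/folderName/*": the glob expands to /folderName/year=2022.
print(partition_columns("/folderName/year=2022", file_path))
```

The first call reports year, month and day; the second reports only month and day, which matches the column list in the error message.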

    Success: When you use folder_path = "/folderName/", it works as expected because the column 'year' does exist. Partition discovery then starts at folderName, so year, month and day are all inferred as columns.

    191894-image.png
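
A minimal sketch of the two ways to keep the year column, assuming the container_name and storage_account_name variables from the question and the spark session that Databricks notebooks provide. The helper names abfss_uri and read_partitioned are hypothetical, and the "year=2022/*" glob is only an illustration. The basePath reader option is Spark's documented way to say where partition discovery should start when a narrower path or glob is read.

```python
def abfss_uri(container_name, storage_account_name, folder_path):
    """Build the abfss:// URI in the same shape the question uses."""
    return (
        f"abfss://{container_name}@{storage_account_name}"
        f".dfs.core.windows.net{folder_path}"
    )


def read_partitioned(spark, container_name, storage_account_name,
                     base_folder="/folderName/", glob=None):
    """Read the parquet folders so year, month and day all become columns.

    base_folder is where partition discovery should start. If glob is given,
    only the matching sub-folders are read, and the basePath option keeps
    the partition columns above the glob (such as year) in the dataframe.
    """
    base = abfss_uri(container_name, storage_account_name, base_folder)
    if glob is None:
        # Option 1: read the base folder itself, with no trailing wildcard.
        return spark.read.parquet(base)
    # Option 2: keep a glob, but tell Spark where the partitioned table starts.
    return spark.read.option("basePath", base).parquet(base + glob)


# Usage in a Databricks notebook (left as comments because it needs storage access):
# df = read_partitioned(spark, container_name, storage_account_name)
# df_final = df.filter("year = 2022 and month = 04 and day = 08")
#
# df_2022 = read_partitioned(spark, container_name, storage_account_name,
#                            glob="year=2022/*")
```

Reading the base folder is the simpler choice. The basePath variant is mainly useful when only part of the folder tree should be scanned.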

    Hope this helps. Please let us know if you have any further queries.

