No code found for {source} ~> {derivedColumn}

Bansal, Nimish 60 Reputation points
2024-10-18T13:07:57.0133333+00:00

Hi,

I created a flowlet in ADF which reads various sources and does appropriate transformations. This flowlet is called from a data flow and when previewing data, I get the error "No code found for stream DataFile ~> derivedColumn1". The data file has no schema and I am using derived column to add two new columns to this dataset. I tried restarting my debug cluster and it didn't change the result. I even tried executing the data flow via a pipeline but the pipeline returned the same error.

Can anyone explain why this is happening and how I can fix it?

Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

Accepted answer
  1. Amira Bedhiafi 32,756 Reputation points Volunteer Moderator
    2024-10-18T19:15:06.69+00:00

    This error most likely comes from how the schema is handled, or from the transformations applied, in your data flow.

    Here are some possible reasons for the error and steps you can take to resolve it:

    1. No Schema in the Source File: Since the data file has no schema, ADF may struggle to understand how to process the data through transformations. When using a file with no schema, ensure that you're explicitly defining the schema in the source transformation. You can either:
      • Import the schema manually (if you know the column names and types).
      • Use a schema drift option to handle files with dynamic schemas.
    2. Derived Column Misconfiguration: The derived column transformation you're using to add two new columns may contain an invalid expression, or it may be unable to read the incoming data because of the missing schema. To resolve this:
      • Double-check the expressions in your derived column transformation to ensure they are valid.
      • Try explicitly defining the schema for the data coming into the derived column transformation.
    3. Schema Propagation: ADF relies on schema propagation to move data from source to sink. Since your file has no schema, the flowlet may not correctly propagate the data columns through each transformation. In this case, you can:
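      • Enable "Allow schema drift" in the source transformation. The option is set on source (and sink) transformations; a derived column has no such setting and instead reads drifted columns by name or through column patterns.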
      • Explicitly define the schema in each step of your flowlet, especially in the source and derived column transformations.
    4. Debug Cluster and Cache: Sometimes, ADF debug clusters can cache older versions of transformations. You have already tried restarting the cluster, but it might help to clear the cache or ensure that all changes are saved properly before rerunning the pipeline.
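
    As a rough illustration of the explicit-schema option, the data flow script behind the flowlet could look something like the sketch below. The stream names DataFile and derivedColumn1 come from your error message; the column names (customerId, amount) and the two derived columns (loadDate, sourceName) are hypothetical placeholders, so substitute your own.

    ```
    source(output(
            customerId as string,
            amount as double
        ),
        allowSchemaDrift: false,
        validateSchema: false) ~> DataFile
    DataFile derive(loadDate = currentTimestamp(),
        sourceName = 'DataFile') ~> derivedColumn1
    ```

    With a projection defined in the source, the derived column has concrete columns to work with, which is the situation points 1 to 3 are trying to reach.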

    What to do:

    1. Go to your data flow and check the source transformation. Make sure the Allow Schema Drift option is turned on.
    2. Define or import the schema explicitly if possible, especially for the file you're processing.
    3. Review the derived column transformation and verify the expressions for the new columns. You can preview each transformation step to see where the error occurs.
    4. Test the data flow with schema drift enabled, and ensure the derived column logic can operate on the dynamically propagated columns.
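
    If you go the schema drift route instead, a minimal sketch of the script might look like this. It assumes the source has no projection at all and that one of the drifted columns is called customerId (a hypothetical name); loadDate and customerKey are also made up for illustration. Drifted columns are not part of the projection, so the expression looks the column up with byName() and casts it.

    ```
    source(allowSchemaDrift: true,
        validateSchema: false) ~> DataFile
    DataFile derive(loadDate = currentTimestamp(),
        customerKey = toString(byName('customerId'))) ~> derivedColumn1
    ```

    You can compare this against the Script view of your own flowlet to check whether the source and derived column are actually wired together the way you expect.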
