Third party Python package installed on Databricks cluster gives different results than other Python stacks

Hans Geurtsen 1 Reputation point
2020-06-18T07:56:40.197+00:00

We received a Python package developed by a third party. The package implements a standard mathematical model, with no machine learning and no randomization. The model returns incorrect results when installed on a Databricks cluster. We tried several runtime versions, including 6.2, 6.6 and even 7.0. When tested on other Python stacks (Python 3.8 on Windows 10, Python 3.8 on Ubuntu 20.04, and Python 3.7 on Ubuntu 16.04, the same as Databricks), it works as it should. Does anyone have an explanation for why a package installed on a Databricks cluster behaves differently from the same package installed on other Python stacks?

Azure Databricks
An Apache Spark-based analytics platform optimized for Azure.

1 answer

  1. HimanshuSinha-msft 19,381 Reputation points Microsoft Employee
    2020-06-19T22:24:23.437+00:00

    Hello @HansGeurtsen-2054,
    Welcome to Q&A.

    Without knowing more about the package, I think it will be very tough to give a definitive answer. But since Azure Databricks uses a driver-worker architecture, I think that may be worth considering.

    Thanks & stay safe

    Himanshu
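
One way to narrow this down, offered as a rough sketch rather than a known fix: print an environment fingerprint on each stack (and, on Databricks, on both the driver and a worker) and compare the outputs. The function names `environment_fingerprint` and `diff_fingerprints` are made up for this illustration, and the module list (`numpy`, `scipy`, `pandas`) is only an assumption, since the thread does not say what the third-party package depends on.

```python
import importlib
import json
import locale
import platform
import sys


def environment_fingerprint(module_names=("numpy", "scipy", "pandas")):
    """Collect settings that could plausibly differ between Python stacks.

    The default module names are placeholders (an assumption); replace them
    with the actual dependencies of the package under test.
    """
    versions = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
            versions[name] = getattr(module, "__version__", "unknown")
        except ImportError:
            versions[name] = None  # not installed in this environment
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "byteorder": sys.byteorder,
        "float_epsilon": sys.float_info.epsilon,
        "preferred_encoding": locale.getpreferredencoding(False),
        "modules": versions,
    }


def diff_fingerprints(a, b):
    """Return {key: (value_in_a, value_in_b)} for every key that differs."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


if __name__ == "__main__":
    # Run this on each stack and compare the printed JSON side by side.
    print(json.dumps(environment_fingerprint(), indent=2))
```

If the fingerprints differ, for example in the version of a numeric dependency, that difference is a natural first suspect. If they match on the driver, running the same function inside a Spark task would show whether the workers see a different environment.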
