2,038 questions with Azure Databricks tags

0 answers

I/O operations with Azure Databricks REST Jobs API

I have run into problems passing arguments via the Jobs API. I've described the problems in detail on Stack Overflow: https://stackoverflow.com/questions/62758094/i-o-operations-with-azure-databricks-rest-jobs-api I would…

Azure Databricks
An Apache Spark-based analytics platform optimized for Azure.
2,038 questions
asked 2020-07-06T15:38:42.997+00:00
Galas, Michal 1 Reputation point
commented 2020-07-13T17:59:33.667+00:00
HimanshuSinha-msft 19,386 Reputation points Microsoft Employee
1 answer

Machine Learning Model Deployment

I am new to ML models and am researching the use of Azure Databricks and MLflow to train a model. My question is: once the model is created, is there a way to host it so that it can be downloaded and used for inference remotely? I am looking for options other than…

Azure Machine Learning
An Azure machine learning service for building and deploying models.
2,683 questions
Azure Databricks
asked 2020-07-01T16:26:17.527+00:00
Mahesh Sivan 1 Reputation point
commented 2020-07-13T12:45:09.187+00:00
romungi-MSFT 43,616 Reputation points Microsoft Employee
1 answer

Azure Web Application with computationally intensive tasks in Dask and TensorFlow

Hello, I'm developing a data analysis tool for the processing of data from Hydrogen-Deuterium exchange mass spectrometry. We would like to accompany our publication with a deployment of the code on Microsoft Azure so that other researchers can quickly…

Azure Databricks
Azure App Service
Azure App Service is a service used to create and deploy scalable, mission-critical web apps.
7,244 questions
asked 2020-06-24T10:22:29.26+00:00
Jochem Smit 1 Reputation point
answered 2020-07-08T05:16:30.38+00:00
brtrach-MSFT 15,701 Reputation points Microsoft Employee
0 answers

Spark Connector in ADF

Hi, I have created a Spark connector to connect to Azure Databricks. In the copy activity, the source is the Spark connector and the sink is Azure SQL DB. In the Spark connector query, CreatedDate is being converted to a string and throws an error, whereas it is a timestamp…

Azure Databricks
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
9,977 questions
asked 2020-06-19T16:53:27.433+00:00
Mounica 1 Reputation point
commented 2020-07-07T18:22:07.117+00:00
HimanshuSinha-msft 19,386 Reputation points Microsoft Employee
0 answers

org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 7.0 failed 4 times, most recent failure: java.lang.NoClassDefFoundError: Could not initialize class

Hi, I am getting this error despite defining the class. The first time I execute the notebook it works fine, but when I execute the same notebook again without any code change, it starts throwing this error. According to the error the class is not defined, but trust me, the class…

Azure Databricks
asked 2020-06-23T23:20:13.487+00:00
Rajaniesh Kaushikk 476 Reputation points
commented 2020-07-02T22:39:19.027+00:00
KranthiPakala-MSFT 46,437 Reputation points Microsoft Employee
0 answers

Databricks Scala: DataFrame column encoding from UTF-8 to Windows-1252

Hi, I am working with Databricks, where I have data in Parquet and am generating smaller files from it. One column is a string containing various characters, and I have to encode this string value to Windows-1252 or Windows…
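
A minimal sketch of the encoding step this question asks about, written in plain Python rather than the Scala/Spark the asker uses, so it illustrates the conversion only. The function name `to_windows_1252` and the choice to replace unmappable characters with `?` are assumptions; the question does not say how such characters should be handled.

```python
def to_windows_1252(text: str, errors: str = "replace") -> bytes:
    """Encode a Python string as Windows-1252 (code page 1252) bytes.

    Characters with no Windows-1252 mapping are handled according to
    `errors`; the default "replace" substitutes "?" (an assumption here,
    not something stated in the question).
    """
    return text.encode("cp1252", errors=errors)


# "é" takes two bytes in UTF-8 but a single byte (0xE9) in Windows-1252.
utf8_bytes = "café".encode("utf-8")
cp1252_bytes = to_windows_1252("café")
print(len(utf8_bytes), len(cp1252_bytes))
```

In a Spark job the same conversion would typically be applied per value, for example through a user-defined function; that wiring is not shown here.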

Azure Databricks
Azure Data Factory
asked 2020-06-22T20:29:26.43+00:00
ManojMathe 1 Reputation point
commented 2020-06-25T22:21:25.563+00:00
HimanshuSinha-msft 19,386 Reputation points Microsoft Employee
1 answer

Third party Python package installed on Databricks cluster gives different results than other Python stacks

We get a Python package developed by a third party. The package implements a standard mathematical model, no machine learning, no randomization. The model turned out to return incorrect results when installed on a Databricks cluster. We tried different…

Azure Databricks
asked 2020-06-18T07:56:40.197+00:00
Hans Geurtsen 1 Reputation point
commented 2020-06-23T12:14:01.153+00:00
Hans Geurtsen 1 Reputation point
1 answer

Databricks Notebook Activity parameter problem

I feel this is a bug but not sure if it is with ADF or Databricks. I am running a notebook using ADF notebook activity. My notebook has a widget for which I pass the value from ADF. As I need to manually enter the parameter name while configuring…

Azure Databricks
Azure Data Factory
asked 2020-06-18T09:07:10.317+00:00
TDPPNR 6 Reputation points
commented 2020-06-22T15:09:06.74+00:00
TDPPNR 6 Reputation points
2 answers One of the answers was accepted by the question author.

Spark SQL: how to get the 5th column from a Spark SQL query

Hi, I have a headerless file that I read with spark.read to create a DataFrame. Now I want to get the value of the 5th column from the file. The file is comma-separated. How can I achieve this? I know it is possible in T-SQL but am not sure how to…
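
One way to picture the positional access this question asks about, as a plain-Python sketch over an in-memory sample rather than a Spark DataFrame. The sample rows and the helper name `nth_field` are made up for illustration. In Spark itself, reading a CSV without a header assigns default column names `_c0`, `_c1`, and so on, so the fifth column would be `_c4`.

```python
import csv
import io


def nth_field(lines, n):
    """Yield the n-th (1-based) comma-separated field of each row.

    Rows with fewer than n fields raise IndexError; this sketch does
    not try to handle ragged input.
    """
    for row in csv.reader(lines):
        yield row[n - 1]


# Hypothetical headerless sample: two rows, six fields each.
sample = io.StringIO("a,b,c,d,e,f\n1,2,3,4,5,6\n")
fifth = list(nth_field(sample, 5))
print(fifth)
```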

Azure Databricks
asked 2020-06-15T13:48:04.437+00:00
Rajaniesh Kaushikk 476 Reputation points
commented 2020-06-16T18:28:18.347+00:00
Rajaniesh Kaushikk 476 Reputation points
2 answers One of the answers was accepted by the question author.

SparkException: Job aborted due to stage failure: Task not serializable: java.io.NotSerializableException:

Hi, I am running this code, but it throws this error: SparkException: Job aborted due to stage failure: Task not serializable: java.io.NotSerializableException:

Azure Databricks
asked 2020-06-14T03:55:51.703+00:00
Rajaniesh Kaushikk 476 Reputation points
accepted 2020-06-16T09:55:30.627+00:00
Rajaniesh Kaushikk 476 Reputation points
0 answers

Azure Databricks - Split column based on special characters in Databricks

I have a column in my CSV file that can have values in the formats below. "Q1_1__Value_-_10_counts" "Value_10_counts" "Q1_1__1__value_yes" These have to be split as follows, respectively: "Value_-_10_counts" …

Azure Databricks
asked 2020-06-08T10:37:46.403+00:00
Jothi 11 Reputation points
commented 2020-06-15T20:24:28.467+00:00
HimanshuSinha-msft 19,386 Reputation points Microsoft Employee
1 answer

More convenient service to read avro files from Azure Data Lake Gen2

Hi, I have to read lots of avro files created by an Event Hub Capture in a Data Lake Gen2. Data must be filtered, processed and then applied to train a machine learning model. I'm considering Azure Databricks and the Azure Machine Learning service…

Azure Data Lake Storage
An Azure service that provides an enterprise-wide hyper-scale repository for big data analytic workloads and is integrated with Azure Blob Storage.
1,403 questions
Azure Machine Learning
Azure Event Hubs
An Azure real-time data ingestion service.
583 questions
Azure Databricks
asked 2020-06-08T22:37:56.897+00:00
Ariel Cedola 21 Reputation points
commented 2020-06-15T20:20:36.227+00:00
HimanshuSinha-msft 19,386 Reputation points Microsoft Employee
3 answers One of the answers was accepted by the question author.

Azure IoT - Query Data from IoT Files

Hello, I am using Azure (Azure Databricks, IoT Hub) to stream unstructured data from IoT devices (i.e. wind turbine), in the form of thousands of files with millions of data captured over a period of 10 years. How do I extract a variety of metadata…

Azure IoT
A category of Azure services for internet of things devices.
390 questions
Azure Data Explorer
An Azure data analytics service for real-time analysis on large volumes of data streaming from sources including applications, websites, and internet of things devices.
501 questions
Azure Databricks
asked 2020-06-15T00:51:10.89+00:00
Sarosh Niazi 21 Reputation points
accepted 2020-06-15T19:37:26.247+00:00
Sarosh Niazi 21 Reputation points
2 answers

File(filePath).exists does not work in Azure Databricks

Hi, how can I check whether a file exists at a path in the data lake? Regards, Rajaniesh

Azure Databricks
asked 2020-06-11T20:12:45.693+00:00
Rajaniesh Kaushikk 476 Reputation points
answered 2020-06-14T04:01:39.583+00:00
Rajaniesh Kaushikk 476 Reputation points
2 answers One of the answers was accepted by the question author.

Accessing dataframe created in Scala from Python command

Is there a way to create a Spark DataFrame in a Scala command and then access it in Python, without explicitly writing it to disk and re-reading it? In Databricks I can run dfFoo.createOrReplaceTempView("temp_df_foo") in Scala, and it then in…

Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.
4,597 questions
Azure Databricks
asked 2020-06-09T00:31:27.66+00:00
Dimitri B 66 Reputation points
accepted 2020-06-11T03:55:51.013+00:00
Dimitri B 66 Reputation points
1 answer One of the answers was accepted by the question author.

Standard configuration components of Azure Databricks

Hello, could you please tell me the standard configuration components of Azure Databricks? Which Azure components (storage?) are required to configure Azure Databricks? Thank you. Sincerely, Kenjiro Majima

Azure Databricks
asked 2020-05-29T04:20:47.017+00:00
Kenjiro Majima 21 Reputation points
commented 2020-06-03T05:07:39.423+00:00
Kenjiro Majima 21 Reputation points
1 answer One of the answers was accepted by the question author.

How to integrate/add more metrics & info into Ganglia UI in Databricks Jobs

As per https://learn.microsoft.com/en-us/azure/databricks/clusters/clusters-manage#monitor-performance, Ganglia metrics Collection Period Snapshot modifications can be done using init scripts. Could you please help with pointers to modify by default…

Azure Databricks
asked 2020-05-08T08:37:57.927+00:00
Ramya Harinarthini_MSFT 5,311 Reputation points Microsoft Employee
accepted 2020-05-08T09:13:16.457+00:00
Ramya Harinarthini_MSFT 5,311 Reputation points Microsoft Employee
2 answers One of the answers was accepted by the question author.

Azure Databricks is not available in free trial subscription

If I have understood correctly, Azure Databricks is not available on a free-tier account. I currently have a free-tier, 12-month subscription. So if I need to play around with Azure Databricks, I need to get a second subscription under my Azure account…

Azure Databricks
asked 2019-11-21T03:36:32.62+00:00
ARR 41 Reputation points
commented 2019-11-25T13:48:34.707+00:00
ARR 41 Reputation points