2,091 questions with Azure Databricks tags

1 answer One of the answers was accepted by the question author.

Effective method of loading to sqldb

I have to transfer about 10 million rows to Azure SQL Database (not SQL DW) from Databricks. Can you please tell me an effective way of doing it using Python or PySpark? Is JDBC an effective method for a large volume such as 10 million rows?

Azure SQL Database
Azure Databricks
An Apache Spark-based analytics platform optimized for Azure.
2,091 questions
asked 2021-03-11T16:21:24.4+00:00
vishwanath jangam 21 Reputation points
accepted 2021-03-12T09:20:48.767+00:00
vishwanath jangam 21 Reputation points
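For the JDBC question above, a minimal sketch of how a bulk write from a Spark DataFrame is often configured. The helper names, server, database, table, and default values are hypothetical; `batchsize` and `numPartitions` are standard Spark JDBC writer options, but suitable values depend on the cluster and the database tier, and the accepted answer may recommend a different connector.

```python
def build_jdbc_options(server, database, table, user, password,
                       batch_size=10000, num_partitions=8):
    """Build Spark JDBC writer options for an Azure SQL Database target.

    All argument values are placeholders supplied by the caller.
    """
    url = (
        f"jdbc:sqlserver://{server}.database.windows.net:1433;"
        f"database={database};encrypt=true;loginTimeout=30"
    )
    return {
        "url": url,
        "dbtable": table,
        "user": user,
        "password": password,
        # Rows sent per JDBC batch; larger batches mean fewer round trips.
        "batchsize": str(batch_size),
        # Upper bound on parallel connections used for the write.
        "numPartitions": str(num_partitions),
    }


def write_to_sql(df, options):
    """Append a Spark DataFrame to the target table over JDBC."""
    df.write.format("jdbc").options(**options).mode("append").save()
```

In a notebook this might be called as `write_to_sql(df, build_jdbc_options("myserver", "mydb", "dbo.target", user, password))`, with the credentials read from a secret scope rather than hard-coded.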
2 answers

Read multiline json string using Spark dataframe in azure databricks

I am reading the contents of an API into a dataframe using the PySpark code below in a Databricks notebook. I validated the JSON payload and the string is in valid JSON format. I guess the error is due to the multiline JSON string. The below code worked fine…

Azure Databricks
asked 2021-03-10T04:45:29.32+00:00
Raj D 586 Reputation points
commented 2021-03-12T08:05:50.34+00:00
PRADEEPCHEEKATLA-MSFT 86,131 Reputation points Microsoft Employee
1 answer

Getting different results when I run a Notebook in Data Factory vs manually.

Hi, I have a pipeline with seven notebooks, all of which execute different SQL scripts and generate CSV files. Two notebooks work correctly, but the other five just create CSV files with headers only (without any…

Azure Databricks
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
10,244 questions
asked 2021-02-17T22:26:41.68+00:00
Ufuktepe, Eren 1 Reputation point
answered 2021-03-10T18:00:11.9+00:00
MartinJaffer-MSFT 26,066 Reputation points
4 answers One of the answers was accepted by the question author.

extra SQL tables

Hi, is it possible to add more tables to an existing dataset in Azure Data Factory? We have a pipeline that copies some on-prem SQL tables to the Azure DB and everything works, but now I want to add some more tables to the datasets and don't see any…

Azure Databricks
asked 2021-02-05T12:43:18.487+00:00
Shahin Mortazave 491 Reputation points
accepted 2021-03-10T11:37:56.237+00:00
Shahin Mortazave 491 Reputation points
0 answers

Is the PiiEntitiesRecognitionTask available to use in the azure.ai.textanalytics python package?

I'm trying to use the PiiEntitiesRecognitionTask function from the azure.ai.textanalytics python package to perform asynchronous calls but I get the message "cannot import name "PiiEntitiesRecognitionTask" from 'azure.ai.textanalytics'…

Azure Databricks
Azure AI Language
An Azure service that provides natural language capabilities including sentiment analysis, entity extraction, and automated question answering.
394 questions
asked 2021-03-04T12:24:55.623+00:00
Jay Tuck 126 Reputation points
commented 2021-03-10T08:50:00.547+00:00
YutongTie-MSFT 48,821 Reputation points
1 answer One of the answers was accepted by the question author.

Azure SQL Database & Azure Databricks

Are there scenarios (time/cost) where it is more efficient to replicate SQL stored procedures using Databricks? To clarify: you may have a stored procedure that takes 15 minutes in SQL (level P4), whereas using Azure Databricks would offer a quicker…

Azure SQL Database
Azure Databricks
asked 2021-02-21T19:24:00.787+00:00
jase jackson USA 201 Reputation points
accepted 2021-03-09T09:43:08.34+00:00
jase jackson USA 201 Reputation points
1 answer

databricks cli bad request

I am getting a Bad Request error when trying to connect to Databricks using the CLI. Any idea how to fix this error? Attached is a screenshot of the token configuration.

Azure Databricks
asked 2021-02-27T07:39:15.337+00:00
Abhishek Gaikwad 191 Reputation points
commented 2021-03-08T18:14:11.103+00:00
Saurabh Sharma 23,791 Reputation points Microsoft Employee
3 answers

Using Service Principal (OID), Not Able to Access Azure Data Lake Storage from Azure Databricks Notebooks

Hi All, I am just mounting a directory of Azure Data Lake Gen2 instance in a Notebook cell using Service Principal. I fetched the Object ID (OID) of the Service Principal using the command "az ad sp show" and using the OID, I provided…

Azure Data Lake Storage
An Azure service that provides an enterprise-wide hyper-scale repository for big data analytic workloads and is integrated with Azure Blob Storage.
1,430 questions
Azure Databricks
asked 2020-12-30T12:14:39.077+00:00
Oindrila Chakraborty 6 Reputation points
answered 2021-03-06T17:57:13.667+00:00
ashok gupta 16 Reputation points
0 answers

RevoScaleR on Azure Synapse or Databricks

Is this possible? I know the RevoScaleR package runs on SQL; are there any roadmap plans or workarounds to get it running on Databricks and/or Synapse?

Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.
4,712 questions
Azure Databricks
asked 2021-02-03T18:48:52.737+00:00
Jeremy Otsap 1 Reputation point
commented 2021-03-04T19:24:33.563+00:00
KranthiPakala-MSFT 46,447 Reputation points Microsoft Employee
1 answer One of the answers was accepted by the question author.

cannot delete azure databricks workspace

I am getting the following error message every time I try to delete the workspace: The workspace 'myworkspace' is in a failed state and hence cannot be launched. Please delete and re-create the workspace.

Azure Databricks
asked 2021-02-26T20:57:48.217+00:00
Rudrani Bhadra 21 Reputation points
accepted 2021-03-04T15:33:47.023+00:00
Rudrani Bhadra 21 Reputation points
1 answer One of the answers was accepted by the question author.

display images in databricks

I am trying to write a plot to the data lake and then later display the plot. However, the plot does not get displayed. Any suggestions? import matplotlib.pyplot as plt plt.scatter(x=[1,2,3], y =…

Azure Databricks
asked 2021-02-27T08:15:14.89+00:00
Abhishek Gaikwad 191 Reputation points
commented 2021-03-03T17:37:21.87+00:00
Saurabh Sharma 23,791 Reputation points Microsoft Employee
2 answers

How to read a file at folder level ignoring the sub-folders within #Azure-data-lake-storage using databricks

Hi Team, in the data lake, I have a folder called "AA" and there is a sub-folder called "BB" within folder "AA". I have a file named "One.parquet" at folder level, i.e. inside "AA" but outside "BB".…

Azure Data Lake Storage
Azure Databricks
asked 2021-02-09T14:13:14.183+00:00
Goutham Kannekanti 1 Reputation point
answered 2021-02-27T19:50:14.217+00:00
Pranay 291 Reputation points
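For the folder-level question above, a minimal sketch of selecting only the files that sit directly inside a folder while skipping its sub-folders. It uses plain Python on a local-style path (on Databricks that could be a path under a `/dbfs/mnt/...` mount, which is an assumption here); the function name is hypothetical, and the posted answers may instead use a Spark path pattern.

```python
import os


def files_at_top_level(folder, suffix=".parquet"):
    """Return sorted paths of files directly inside `folder`.

    Sub-folders (such as "BB" inside "AA") are not descended into.
    """
    with os.scandir(folder) as entries:
        return sorted(
            entry.path
            for entry in entries
            if entry.is_file() and entry.name.endswith(suffix)
        )
```

The resulting list can then be handed to whichever reader is in use, so that only "One.parquet" is loaded and nothing under "BB" is picked up.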
1 answer One of the answers was accepted by the question author.

Unable to view parquet files

When I run the code below in Scala in Databricks, it runs successfully, and I am able to read the file back from the location. However, when I run display(dbutils.fs.ls ("mnt/Datalake3/feature/")), I cannot see any of the parquet files. …

Azure Databricks
asked 2021-02-26T14:55:58.643+00:00
Abhishek Gaikwad 191 Reputation points
commented 2021-02-26T15:44:15.95+00:00
Abhishek Gaikwad 191 Reputation points
4 answers One of the answers was accepted by the question author.

How to setup Hands-on environment for test purposes

We would like to set up a hands-on test environment for testing job candidates in data engineering, particularly Data Factory, Databricks, etc. One option is to create a test login and allocate a test resource group, but we don't have access to AAD it's…

Azure DevTest Labs
An Azure service that is used for provisioning development and test environments.
268 questions
Azure Databricks
Azure Lab Services
An Azure service that is used to set up labs for classrooms, trials, development and testing, and other scenarios.
290 questions
Azure Data Factory
asked 2021-02-24T21:43:30.52+00:00
Vic D 21 Reputation points
answered 2021-02-26T09:55:26.163+00:00
BhargaviAnnadevara-MSFT 5,466 Reputation points
1 answer One of the answers was accepted by the question author.

How to leverage existing spark cluster in Synapse Workspace

We have some legacy computing resources in Cosmos which is Spark on Cosmos. I'd like to know if we could connect the existing computing resources on cosmos.

Azure Synapse Analytics
Azure Databricks
asked 2021-02-22T13:29:17.087+00:00
Catherine Meng 41 Reputation points
accepted 2021-02-25T10:23:36.14+00:00
Catherine Meng 41 Reputation points
1 answer

Is it possible to recover a Databricks resource?

If I deleted a Databricks resource that I created earlier, is it possible to recover it or create a new one in the free subscription?

Azure Databricks
asked 2021-02-15T11:35:23.11+00:00
AzureStudyTest 1 Reputation point
commented 2021-02-24T20:58:13.393+00:00
HimanshuSinha-msft 19,461 Reputation points Microsoft Employee
1 answer

I want to install a Python package on Databricks job clusters; how do I include this utility in the "ini" file?

Hi Team, how do I install a Python package on a Databricks job cluster? Requirement 1: there are as many as 30 job clusters in my environment, and I don't want to install the package individually on each job cluster. Is there any way to install the package in…

Azure Databricks
asked 2021-02-12T11:39:13.357+00:00
Rohit Boddu 466 Reputation points
commented 2021-02-22T17:23:19.07+00:00
Saurabh Sharma 23,791 Reputation points Microsoft Employee
1 answer

how to edit/modify files in databricks

Hi Team, I have an init file stored at /dbfs/FileStore/script/init.bash. I now want to append a new line to this script, such as: pip install cobutils. Please tell me how we can edit a file in Databricks. Thanks & Regards, Rohit

Azure Databricks
asked 2021-02-17T11:49:14.997+00:00
Rohit Boddu 466 Reputation points
commented 2021-02-18T14:45:55.737+00:00
Saurabh Sharma 23,791 Reputation points Microsoft Employee
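For the init-script question above, a minimal sketch of adding one line to an existing script from Python. It reads the file and writes it back in full instead of opening it in append mode, on the assumption that the mounted path may not handle in-place appends reliably; the function name is hypothetical and this is not necessarily the approach given in the posted answer.

```python
from pathlib import Path


def append_line(path, line):
    """Append `line` to the text file at `path` by rewriting the whole file."""
    target = Path(path)
    text = target.read_text(encoding="utf-8") if target.exists() else ""
    # Make sure the new command starts on its own line.
    if text and not text.endswith("\n"):
        text += "\n"
    target.write_text(text + line + "\n", encoding="utf-8")
```

With the path from the question, the call would look like `append_line("/dbfs/FileStore/script/init.bash", "pip install cobutils")`.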
1 answer One of the answers was accepted by the question author.

BMC vs Azure for Master Data Management

It is urgent. Please help.

Azure Databricks
asked 2021-02-14T14:20:56.367+00:00
Ankita Awasthi 21 Reputation points
commented 2021-02-18T09:14:34.43+00:00
Ankita Awasthi 21 Reputation points
1 answer One of the answers was accepted by the question author.

differences in row counting using spark and pandas readers

I'm reading the same CSV once in Scala with Spark and once in Python with Pandas. This is the code that I'm using: val tabella = spark.read.option("header",true).option("mode",…

Azure Databricks
asked 2021-02-13T00:19:00.837+00:00
Auricchio Valerio 21 Reputation points
accepted 2021-02-17T13:45:16.643+00:00
Auricchio Valerio 21 Reputation points
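For the row-count question above, one possible cause (an assumption, since the full question and its accepted answer are not shown here) is a quoted field that contains a line break: a reader that treats every physical line as a record counts more rows than one that honours the quotes. A minimal sketch with made-up sample data, using only the Python standard library:

```python
import csv
import io

# Hypothetical sample: the first record has a quoted field spanning two lines.
SAMPLE = 'id,comment\n1,"first line\nsecond line"\n2,plain\n'


def count_physical_lines(text):
    """Count data rows by splitting on line breaks (header excluded)."""
    return len(text.splitlines()) - 1


def count_csv_records(text):
    """Count data rows with a quote-aware CSV parser (header excluded)."""
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row
    return sum(1 for _ in reader)


print(count_physical_lines(SAMPLE), count_csv_records(SAMPLE))
```

If the two counts differ for a real file, checking how each reader is configured to handle multi-line and malformed records is a reasonable next step.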