2,091 questions with Azure Databricks tags

2 answers

Are there any other free streaming sources like Twitter for practicing Spark and Kafka?

Friends, I am learning Kafka and Spark. I worked on Kafka and Spark integration using the Twitter API, but I want more practice. Are there any other free streaming sources like Twitter for practicing Spark and Kafka?

Azure Databricks
An Apache Spark-based analytics platform optimized for Azure.
2,091 questions
asked 2020-08-25T18:56:42.463+00:00
Surendiran Balasubramanian 1 Reputation point
commented 2020-08-31T04:31:30.637+00:00
PRADEEPCHEEKATLA-MSFT 85,981 Reputation points Microsoft Employee
1 answer One of the answers was accepted by the question author.

Insert data into a table using a PySpark DataFrame as input

Hello, I am working on inserting data into a SQL Server table, dbo.Employee. When I use the PySpark code below, I run into this error: org.apache.spark.sql.AnalysisException: Table or view not found: dbo.Employee;. The table exists, but I am not able to insert…

Azure Databricks
asked 2020-08-24T17:57:14.12+00:00
Raj D 586 Reputation points
accepted 2020-08-26T16:56:44.947+00:00
Raj D 586 Reputation points
1 answer

Write JSON document to Azure table

Hi, I am using the code below to write a JSON document from an Azure Data Lake Gen2 container into a SQL Server table. Code: df =…

Azure Data Lake Storage
An Azure service that provides an enterprise-wide hyper-scale repository for big data analytic workloads and is integrated with Azure Blob Storage.
1,430 questions
Azure Databricks
asked 2020-08-21T05:07:09.597+00:00
Raj D 586 Reputation points
commented 2020-08-26T10:55:25.62+00:00
PRADEEPCHEEKATLA-MSFT 85,981 Reputation points Microsoft Employee
1 answer

How to dynamically explode an array-type column in PySpark or Scala

Hi, I have a parquet file with complex column types, including nested structs and arrays. I am using the script from the link below to flatten my parquet file. https://learn.microsoft.com/en-us/azure/synapse-analytics/how-to-analyze-complex-schema …

Azure Databricks
asked 2020-08-19T18:34:33.73+00:00
reddy 41 Reputation points
commented 2020-08-26T10:54:33.947+00:00
PRADEEPCHEEKATLA-MSFT 85,981 Reputation points Microsoft Employee
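
The flattening idea in the question above can be sketched in plain Python. This is only an illustration on dict/list records, not the Spark DataFrame API and not the script from the linked Synapse article. The function name `flatten_record` and the underscore naming scheme are assumptions made for this example. Nested dicts (structs) become prefixed columns, and lists (arrays) are exploded into one output row per element. An empty list keeps the row with a null value, similar in spirit to Spark's `explode_outer`.

```python
def flatten_record(record, prefix=""):
    """Flatten nested dicts into prefixed keys and explode lists into rows.

    Returns a list of flat dicts (one per exploded row).
    Lists nested directly inside lists are left as values, not exploded.
    """
    rows = [{}]
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            # Struct: recurse and prefix the child column names.
            parts = flatten_record(value, prefix=f"{name}_")
        elif isinstance(value, list):
            # Array: every element contributes its own row(s).
            parts = []
            for item in value:
                if isinstance(item, dict):
                    parts.extend(flatten_record(item, prefix=f"{name}_"))
                else:
                    parts.append({name: item})
            if not parts:
                parts = [{name: None}]  # keep the row when the array is empty
        else:
            parts = [{name: value}]
        # Cross-join the rows built so far with the rows from this column.
        rows = [{**row, **part} for row in rows for part in parts]
    return rows
```

Because the function walks whatever structure it finds, the column names do not have to be known in advance, which is the "dynamic" part of the question. Exploding several arrays in one record produces their cross product, so row counts can grow quickly.
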
1 answer

How to move compressed parquet files using ADF or Databricks

Hi, I have a requirement to move parquet files from AWS S3 into Azure and then convert them to CSV using ADF. I tried downloading a few of the files to my local file system and copying them via the Copy activity within ADF. The files are in this format …

Azure Databricks
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
10,243 questions
asked 2020-08-11T14:40:54.08+00:00
reddy 41 Reputation points
commented 2020-08-24T06:01:31.98+00:00
PRADEEPCHEEKATLA-MSFT 85,981 Reputation points Microsoft Employee
1 answer

How to transform files in subfolders with one script in Databricks

I have an ADLS Gen2 folder with subfolders, each containing parquet files. My requirement is to transform all parquet files in the subfolders and load them into another folder in ADLS Gen2 with the same folder structure, using one script. Is it possible to do or…

Azure Data Lake Storage
Azure Databricks
Azure Data Factory
asked 2020-08-14T16:55:40.797+00:00
reddy 41 Reputation points
commented 2020-08-21T15:37:42.363+00:00
HarithaMaddi-MSFT 10,136 Reputation points
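
One way to think about the "same folder structure" requirement above is as a path-mapping step that runs before any transformation. The sketch below uses plain Python and the local filesystem as a stand-in for ADLS Gen2. The function name `mirrored_targets` and the `*.parquet` pattern are assumptions for illustration, and the actual read, transform, and write steps are deliberately left out.

```python
from pathlib import Path


def mirrored_targets(src_root, dst_root, pattern="*.parquet"):
    """Pair each matching file under src_root with a destination path
    under dst_root that keeps the same relative subfolder structure."""
    src_root, dst_root = Path(src_root), Path(dst_root)
    pairs = []
    for src in sorted(src_root.rglob(pattern)):
        # relative_to() strips the source root, leaving e.g. "b/c/y.parquet".
        pairs.append((src, dst_root / src.relative_to(src_root)))
    return pairs
```

A single script could then loop over the returned pairs, reading each source file, applying the transformation, and writing to the paired destination, creating parent folders as needed.
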
0 answers

How to view a parquet file with no data and export headers to CSV

I have a parquet file with no data in it. When I create a notebook and create a DataFrame, it does not show me the columns. I can see the root folder structure, though. The file has nested objects and arrays in its columns and I want to transform it. How…

Azure Databricks
asked 2020-08-18T02:23:56.807+00:00
reddy 41 Reputation points
commented 2020-08-20T09:10:56.837+00:00
PRADEEPCHEEKATLA-MSFT 85,981 Reputation points Microsoft Employee
1 answer

Databricks notebooks drp

I would like to know what happens to my Azure Databricks notebooks in case of a region outage. For example, if my primary zone is CentralUS and it happens to be down, can I still log in to centralusdatabricks.net and see my notebooks? If not, I would…

Azure Databricks
asked 2020-08-13T12:42:34.847+00:00
Anonymous
commented 2020-08-20T09:09:29.76+00:00
PRADEEPCHEEKATLA-MSFT 85,981 Reputation points Microsoft Employee
1 answer One of the answers was accepted by the question author.

How to transform all files in a folder and export them as separate files in one notebook

I have an ADLS Gen2 folder with multiple parquet files that share the same structure. I want to transform all the files at once, each one separately, with one script in the same notebook, then convert each file to CSV and write it to another folder in ADLS. How can I achieve this? …

Azure Databricks
asked 2020-08-18T03:02:22.36+00:00
reddy 41 Reputation points
accepted 2020-08-19T14:36:07.86+00:00
reddy 41 Reputation points
1 answer

Is .NET for Apache Spark in Preview?

I have read many articles while exploring Azure Data Factory and Azure Databricks. I stumbled upon an article (https://learn.microsoft.com/en-us/dotnet/spark/how-to-guides/databricks-deploy-methods) where it is mentioned in the notes that .NET for Apache…

Azure Databricks
Azure Data Factory
asked 2020-08-05T13:58:20.25+00:00
nikhil.sharma3 1 Reputation point
commented 2020-08-17T22:15:45.967+00:00
HimanshuSinha-msft 19,461 Reputation points Microsoft Employee
1 answer

Move Delta table data from Databricks into Azure SQL Database

Hi friends, I have a requirement: my source data is in a Delta table in Databricks, and I want to move it into the destination, Azure SQL DB. Can you please suggest the best way to move the data from source to destination?…

Azure Databricks
asked 2020-08-03T12:15:30.333+00:00
chandrasekhar munagala 21 Reputation points
commented 2020-08-17T22:11:50.903+00:00
HimanshuSinha-msft 19,461 Reputation points Microsoft Employee
1 answer

Recover table data in Databricks.

I accidentally deleted data from a table in production Databricks. Is there a way to recover the data?

Azure Databricks
asked 2020-07-31T23:06:06.913+00:00
naga perni 1 Reputation point
commented 2020-08-17T22:11:10.253+00:00
HimanshuSinha-msft 19,461 Reputation points Microsoft Employee
1 answer One of the answers was accepted by the question author.

How to perform distributed combinatorial (N choose K) in Spark .NET?

I have a project with a large number of combinations, C(100,20), and minor work being done for each combination set. I am using Spark .NET with Visual Studio as my technology (see setup below):…

Azure Databricks
asked 2020-08-13T17:20:33.797+00:00
Robert Hogue 96 Reputation points
commented 2020-08-17T03:45:35.32+00:00
PRADEEPCHEEKATLA-MSFT 85,981 Reputation points Microsoft Employee
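
For the N choose K question above, one common way to distribute an enumeration is to give each worker a contiguous range of ranks and have it rebuild the combinations locally. The sketch below shows that idea in plain Python rather than Spark .NET, which the asker uses. The function names are hypothetical, and the ranking scheme (lexicographic order over `range(n)`) is one choice among several. Note that C(100,20) is on the order of 10^20, so the sketch only shows how the work can be partitioned, not that a full enumeration is practical.

```python
from itertools import combinations  # used only to cross-check small cases
from math import comb


def unrank_combination(rank, n, k):
    """Return the rank-th k-subset of range(n) in lexicographic order."""
    combo = []
    x = 0
    for remaining in range(k, 0, -1):
        while True:
            # Number of combinations that start with x at this position.
            block = comb(n - x - 1, remaining - 1)
            if rank < block:
                break
            rank -= block
            x += 1
        combo.append(x)
        x += 1
    return tuple(combo)


def chunk_ranges(total, parts):
    """Split range(total) into `parts` contiguous, nearly equal (start, stop) ranges."""
    base, extra = divmod(total, parts)
    ranges, start = [], 0
    for i in range(parts):
        stop = start + base + (1 if i < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def combinations_for_chunk(start, stop, n, k):
    """Lazily yield the combinations whose ranks fall in [start, stop)."""
    return (unrank_combination(r, n, k) for r in range(start, stop))
```

Each `(start, stop)` pair is small enough to ship to a worker as a task description, so no combination lists have to be materialized on the driver.
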
2 answers

Databricks monitoring using Azure Monitor

Hi Team, I want to monitor Azure Databricks metrics and other information such as quota, cluster capacity, and number of nodes, and I want to put all of this information on an Azure dashboard. How can I send the Databricks logs to Azure Monitor without Grafana? Thanks &…

Azure Monitor
An Azure service that is used to collect, analyze, and act on telemetry data from Azure and on-premises environments.
3,073 questions
Azure Databricks
asked 2020-08-04T09:24:26.763+00:00
Rohit 61 Reputation points
commented 2020-08-14T08:57:47.803+00:00
PRADEEPCHEEKATLA-MSFT 85,981 Reputation points Microsoft Employee
2 answers One of the answers was accepted by the question author.

Transform table results to JSON in Azure Databricks

Hi, I am working on a data transformation that turns SQL table results into a JSON string and saves them as JSON documents. I am stuck on how to proceed from here. I can query sale but am not able to create a JSON string of the table data and eventually save as a…

Azure Databricks
asked 2020-08-07T18:46:26.077+00:00
Raj D 586 Reputation points
commented 2020-08-12T18:01:08.62+00:00
Raj D 586 Reputation points
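
As a rough illustration of the rows-to-JSON step asked about above, here is a minimal sketch in plain Python. It assumes the query results have already been collected as a list of dicts, and the column names in the test data are hypothetical. It is not the accepted answer from the thread. The one detail worth showing is `default=str`, which lets `json.dumps` handle values such as dates and decimals that SQL results often contain.

```python
import json
from datetime import date
from decimal import Decimal


def rows_to_json(rows):
    """Serialize a list of row dicts to a JSON string.

    default=str converts values json cannot encode natively
    (for example Decimal and date) to their string form.
    """
    return json.dumps(rows, default=str, sort_keys=True)
```

Converting decimals to strings preserves their exact digits. If numeric JSON values are required instead, the rows would need an explicit conversion before serialization.
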
2 answers One of the answers was accepted by the question author.

Import JSON payload from a REST API and save as JSON documents in ADLS Gen2

Hi, I am trying to import a JSON payload from a REST API GET method and save JSON documents into ADLS Gen2 using Azure Databricks. GET: https://myapi.com/api/v1/city GET method output: [ {"id":2643743, …

Azure Data Lake Storage
Azure Databricks
Windows Server PowerShell
Windows Server: A family of Microsoft server operating systems that support enterprise-level management, data storage, applications, and communications. PowerShell: A family of Microsoft task automation and configuration management frameworks consisting of a command-line shell and associated scripting language.
5,466 questions
asked 2020-08-03T17:53:23.913+00:00
Raj D 586 Reputation points
commented 2020-08-11T22:56:26.15+00:00
Raj D 586 Reputation points
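
The "save as JSON documents" part of the question above can be sketched as splitting the array payload into one file per element. This is a minimal plain-Python sketch under stated assumptions: the payload text has already been fetched (the HTTP call is omitted), the output directory is a local path standing in for an ADLS Gen2 location, and the function name `write_documents` is hypothetical. Only the `id` field comes from the question; any other fields used to exercise it are made up.

```python
import json
from pathlib import Path


def write_documents(payload_text, out_dir, key="id"):
    """Write each element of a JSON array to its own <key>.json file.

    Returns the list of paths written, in payload order.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for item in json.loads(payload_text):
        path = out_dir / f"{item[key]}.json"
        path.write_text(json.dumps(item), encoding="utf-8")
        written.append(path)
    return written
```

Naming each file after its `id` makes reruns overwrite the same documents instead of creating duplicates.
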
1 answer

Databricks 7.0 load to Azure Synapse Analytics fails when using useAzureMSI = true and writeSemantics = copy

When I try to execute a script on Databricks 7.0 to write data to a table in Azure Synapse Analytics, I get an error: Parse error at line: 7, column: 30: Incorrect syntax near ''Managed Service Identity''. I have the useAzureMSI option set to true. …

Azure Databricks
asked 2020-06-29T21:51:55.107+00:00
Tom Smith 1 Reputation point
answered 2020-08-11T15:21:53.627+00:00
Nolan Walker 1 Reputation point
2 answers

How to run .NET Spark jobs on Databricks from Azure Data Factory?

In Azure Data Factory, there is a Databricks Activity. This activity supports running Python, JAR, and notebooks, and these notebooks may be written in Scala, Python, Java, and R, but not C#/.NET. Is there inherent or direct support where I can write my…

Azure Databricks
Azure Data Factory
asked 2020-08-05T06:43:23.207+00:00
nikhil.sharma3 1 Reputation point
answered 2020-08-10T15:53:06.977+00:00
HimanshuSinha-msft 19,461 Reputation points Microsoft Employee
1 answer

FileNotFoundException when using abfss to list files in Azure Databricks!

Hi team, I am trying to connect to ADLS Gen2 using Hadoop configurations. But when I try to use FS commands to list all the files on the path, I get a FileNotFoundException: import org.apache.hadoop.fs.{FileSystem, Path} …

Azure Data Lake Storage
Azure Databricks
asked 2020-08-04T18:22:18.177+00:00
Goel, Akanksha 66 Reputation points
commented 2020-08-06T06:44:29.74+00:00
PRADEEPCHEEKATLA-MSFT 85,981 Reputation points Microsoft Employee
1 answer

How to pass a column list as an argument from Databricks Spark for the copy write semantics

Is there a way to pass a column list argument for column mapping between Spark and a Synapse table from Databricks Spark when using the copy write semantics, as we can when running the COPY command from Synapse?

Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.
4,711 questions
Azure Databricks
asked 2020-07-23T06:54:16.693+00:00
Rishabh 11 Reputation points
commented 2020-08-05T00:32:38.243+00:00
HimanshuSinha-msft 19,461 Reputation points Microsoft Employee