Microsoft Q&A

Azure HDInsight

176 questions

An Azure managed cluster service for open-source analytics.


176 questions with Azure HDInsight tags

2 answers

Trino VM access to HDI Hive metastore - NSGs wide open, destination host unreachable

Within the same resource group, I created a Trino instance and an HDI cluster, in different VNets of course. I set wide-open NSG rules (inbound/outbound, all ports, all protocols). The Trino node cannot connect to the HDI Presto metacatalog service (connection timeout in…

Azure HDInsight
asked 2023-05-17T17:48:49.35+00:00
DR 0 Reputation points
commented 2023-05-28T14:28:08+00:00
PRADEEPCHEEKATLA-MSFT 59,326 Reputation points Microsoft Employee
1 answer

Azure HDInsight SQL integration - lack of Managed Identity authentication

Hello, while creating an HDInsight cluster with Spark, I found that the SQL database for metadata cannot be connected using a managed identity. Can you improve that? Note: there is a need to use managed identity for Storage Account (or Lake) access.

Azure Active Directory
An Azure enterprise identity service that provides single sign-on and multi-factor authentication.
14,710 questions
Azure SQL Database
Azure HDInsight
asked 2023-05-16T09:48:07.7133333+00:00
Krzysztof Świdrak 166 Reputation points
commented 2023-05-26T19:59:06.03+00:00
James Hamil 14,346 Reputation points Microsoft Employee
1 answer

How to set a static IP for head nodes in an HDInsight cluster?

Hi, I am working with an HDInsight cluster of type Hadoop. I am using ARM templates to create and destroy the cluster every day (8am to 6pm). For networking, I am using an Azure virtual network with a subnet for HDInsight. We are…

Azure HDInsight
asked 2023-05-08T20:29:37.1033333+00:00
Federico Sardo 41 Reputation points
commented 2023-05-23T12:10:32.36+00:00
ShaikMaheer-MSFT 31,636 Reputation points Microsoft Employee
1 answer One of the answers was accepted by the question author.

Can we upgrade commons-io in HDInsight?

Hello Team, we're currently running HDInsight 5.0 for the Spark runtime. We're facing an issue with the commons-io library that ships as part of HDInsight 5.0, since it is an older version (2.5). In our project, we're dependent on a lib…

Azure HDInsight
asked 2023-05-02T16:08:39.0766667+00:00
Sharath 20 Reputation points
commented 2023-05-02T18:22:38.36+00:00
Sharath 20 Reputation points
1 answer

Where is the Hue service and URL in HDInsight?

Hi, I created an HDInsight (Hadoop) cluster and used a script action to install Hue. The installation succeeded, but I was unable to find the Hue service or URL in Ambari. Could you please help me? Regards, Fede

Azure HDInsight
asked 2023-03-13T20:14:16.37+00:00
Federico Sardo 41 Reputation points
edited a comment 2023-04-25T04:51:32.5566667+00:00
PRADEEPCHEEKATLA-MSFT 59,326 Reputation points Microsoft Employee
2 answers

How to set up a custom database for Ambari in HDInsight?

Hi, I am creating a Hadoop cluster in HDInsight. For Ambari I created an Azure SQL Database, and during creation I selected not to use existing data. But when I create the cluster and try to select the database, I get this warning: I…

Azure HDInsight
asked 2023-03-27T19:50:48.9866667+00:00
Federico Sardo 41 Reputation points
commented 2023-04-10T06:44:25.6333333+00:00
PRADEEPCHEEKATLA-MSFT 59,326 Reputation points Microsoft Employee
1 answer

Big data access

Hi, I've created a student account because I'm studying for a master's degree, but that type of account doesn't work for big data. What do I need to access Hadoop and Apache Spark on Azure, and what are the costs? Thanks, Joao

Azure HDInsight
asked 2023-03-26T10:39:06.7133333+00:00
Joao Ribeiro 20 Reputation points
commented 2023-03-29T07:57:04.3266667+00:00
KranthiPakala-MSFT 39,002 Reputation points Microsoft Employee
3 answers One of the answers was accepted by the question author.

How to access Hadoop and Apache Spark in Azure

Hi, I've created my Azure student account, but I can't access HDInsight to use Hadoop and Apache Spark for my master's. How do I do that? Cheers, Joao. Account: (sba22203@student.cct.ie)

Azure HDInsight
asked 2023-03-21T09:56:05.49+00:00
Joao Ribeiro 20 Reputation points
answered 2023-03-22T08:54:53.7633333+00:00
彬 陈 0 Reputation points
0 answers

Can I connect to an Azure HDInsight Hive DB using a Python script?

Hello everyone! I have a task to perform on an Azure HDInsight Hive DB and Azure Databricks. I have to connect to the Azure HDInsight Hive DB and automatically export all the data in Excel or CSV format on a daily basis, to be stored in my Storage account…

Azure HDInsight
Azure Databricks
An Apache Spark-based analytics platform optimized for Azure.
1,391 questions
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
7,159 questions
asked 2023-03-07T06:14:44.0466667+00:00
Vikrant Vikrant 0 Reputation points
commented 2023-03-13T05:30:06.67+00:00
PRADEEPCHEEKATLA-MSFT 59,326 Reputation points Microsoft Employee
1 answer

Migration from AWS EMR to Azure

We are trying to move our Spark steps code from an AWS EMR cluster to Azure. We are using the add-steps option with command-runner.jar in EMR. Each step starts a Python script that reads a large text file from S3 storage and manipulates it with Spark. Example…

Azure HDInsight
Azure Databricks
asked 2023-03-05T09:37:29.0066667+00:00
BoazD 0 Reputation points
commented 2023-03-13T05:27:50.4933333+00:00
PRADEEPCHEEKATLA-MSFT 59,326 Reputation points Microsoft Employee
3 answers One of the answers was accepted by the question author.

Can I use a Student subscription in Azure to create an HDInsight Spark cluster?

Hi all, I am trying to create a Spark cluster in HDInsight (the name of the resource is Azure HDInsight) with my Student subscription. I have tried googling but couldn't find clues in the Microsoft documentation. I have my $100 unused, but when I go…

Azure HDInsight
asked 2020-11-28T18:43:18.027+00:00
Pablo J 21 Reputation points
edited the question 2023-03-08T07:37:59.0566667+00:00
PRADEEPCHEEKATLA-MSFT 59,326 Reputation points Microsoft Employee
1 answer One of the answers was accepted by the question author.

How to create an HDInsight Interactive Query cluster with an additional storage account?

Hi community, I am new to HDInsight and am asking for help with this situation. Preconditions: I have a Data Lake Gen2 account (hierarchical namespace enabled) with my business data (CSV and Parquet files). I need to create 2 clusters. Interactive…

Azure HDInsight
asked 2023-02-14T15:59:33.2366667+00:00
Federico Sardo 41 Reputation points
accepted 2023-03-02T18:02:13.48+00:00
Federico Sardo 41 Reputation points
0 answers

Configuration-related exception while trying to run a Spark app in an HDInsight 5.0 cluster

I am migrating from HDInsight 4.0 to 5.0. Locally, it works. However, when I ran Spark jobs in the HDInsight cluster, I got the error below. Any idea why "spark.nonjvm.error.forwarding.enabled" is registered multiple times? Command to run spark…

Azure HDInsight
asked 2023-02-22T22:08:19.9266667+00:00
Ben Asmare 0 Reputation points
commented 2023-03-01T07:11:15.49+00:00
PRADEEPCHEEKATLA-MSFT 59,326 Reputation points Microsoft Employee
1 answer

How to execute Hive queries in Synapse Spark

Hello! I am replacing an HDI cluster with Azure Synapse. My current HDI Spark cluster executes some Hive queries for data transformation. Is it possible to execute the same Hive queries in an Azure Synapse Spark pool? Thanks, DR

Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.
3,102 questions
Azure HDInsight
asked 2023-02-03T02:56:53.8133333+00:00
Dharmesh Rathod 0 Reputation points
commented 2023-02-20T06:50:03.16+00:00
PRADEEPCHEEKATLA-MSFT 59,326 Reputation points Microsoft Employee
1 answer One of the answers was accepted by the question author.

Differences between HDInsight and Azure Databricks?

I know that HDInsight has several types of clusters, whereas Databricks is only for Spark clusters. I believe there must be some significant differences that will influence which one to choose for an implementation. [Note: As we migrate from MSDN,…

Azure HDInsight
asked 2020-05-08T23:12:21.613+00:00
KranthiPakala-MSFT 39,002 Reputation points Microsoft Employee
edited the question 2023-02-13T18:52:26.1066667+00:00
HimanshuSinha-msft 19,196 Reputation points Microsoft Employee
1 answer

Does Azure have a service to migrate data from AWS MSK to Azure Kafka HDInsight?

I am looking for a way to migrate data from AWS MSK to Azure Kafka. Is there any service available to do that, and what are its pre-migration prerequisites?

Azure Migrate
A central hub of Azure cloud migration services and tools to discover, assess, and migrate workloads to the cloud.
590 questions
Azure HDInsight
asked 2020-11-18T13:57:36.94+00:00
Sarvesh Pandey 141 Reputation points
commented 2023-01-29T01:35:24.1133333+00:00
Srihareendra Bodduluri 1 Reputation point Microsoft Employee
1 answer

Azure HDInsight

What is the Resource Provider connection in Azure HDInsight? (In the portal, when deploying an HDInsight cluster, it gives 2 options: first Inbound, which has no private link tickbox, and then Outbound, which has a private link tickbox.) I want to know about both with…

Azure HDInsight
asked 2022-12-20T11:56:08.33+00:00
Ishan Kapoor 1 Reputation point
commented 2022-12-22T17:00:48.407+00:00
BhargavaGunnam-MSFT 13,326 Reputation points Microsoft Employee
0 answers

How to fix an error in a pipeline with an HDI activity?

I am trying to run a pipeline with a Hive activity, and I get the error: Response status code indicates server error: 500 (InternalServerError), with the code 2300. I couldn't find that error in the solution guide, so I don't really know where to go from here. …

Azure HDInsight
Azure Data Factory
asked 2022-11-18T19:31:19.24+00:00
Valeria Ortiz Cervantes 1 Reputation point
commented 2022-11-28T22:50:03.61+00:00
BhargavaGunnam-MSFT 13,326 Reputation points Microsoft Employee
0 answers

Convert a result of collect_list into JSON using Spark with Scala

Please find the sample below. After using the code below -- val df4 = df3.groupBy("shop_id").agg(collect_list(map($"variant_id",$"variants1")) as ("variants")) -- I got data like: …

Azure Databricks
Azure HDInsight
asked 2022-11-11T15:55:04.023+00:00
vijendra singh 1 Reputation point
commented 2022-11-12T00:52:35.043+00:00
vijendra singh 1 Reputation point
1 answer One of the answers was accepted by the question author.

HDInsight startup yields linked service error: The storage connection string is invalid.

I'm getting an error when trying to run the demo Spark word count in Data Factory using HDInsight and a Spark activity. All services were successfully created and tested. But when the Spark pipeline is triggered, the following error is displayed: …

Azure Data Factory
Azure HDInsight
asked 2022-09-23T15:42:24.073+00:00
Scoot-3223 71 Reputation points
accepted 2022-11-08T14:37:39.45+00:00
Scoot-3223 71 Reputation points