205 questions with Azure HDInsight tags

1 answer One of the answers was accepted by the question author.

How to use a UA Managed Identity in a Data Factory On-Demand HDInsight Linked Service

When creating an on-demand HDInsight linked service, there is no detail on how to configure a user-assigned (UA) managed identity instead of a service principal. Steps are shown for adding a UA managed identity to the Data Factory, but what values…

Azure HDInsight
An Azure managed cluster service for open-source analytics.
205 questions
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
10,055 questions
asked 2022-09-21T22:36:31.707+00:00
Scoot-3223 91 Reputation points
commented 2022-09-23T01:37:38.683+00:00
BhargavaGunnam-MSFT 28,851 Reputation points Microsoft Employee
2 answers

Spark DataFrame writing issue in Azure: One of the request inputs is not valid

I am able to read data from Azure Blob Storage, but writing back to Azure Storage throws the error below. I am running this program on my local machine. Can someone please help me with this? Program: val config = new SparkConf(); …

Azure HDInsight
Azure Databricks
An Apache Spark-based analytics platform optimized for Azure.
2,053 questions
asked 2022-08-25T04:24:33.64+00:00
Abdul Hafiz A.G A ID(RITM0203509) 11 Reputation points
answered 2022-09-14T00:02:23.46+00:00
Junjie Cao 1 Reputation point Microsoft Employee
0 answers

How to add a sub-queue in YARN

I already have queues set up on YARN on HDInsight; they were set up with the Ambari UI. I have a queue for Sqoop that takes up 70% of the cluster. However, I have a few huge Sqoop jobs with a lot of mappers that take up 100% of the queue and block…

Azure HDInsight
asked 2022-08-10T19:52:34.937+00:00
Vamsi Anamaneni 1 Reputation point
commented 2022-09-08T19:06:11.337+00:00
HimanshuSinha-msft 19,381 Reputation points Microsoft Employee
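
As a rough illustration of the sub-queue idea in the question above: in the YARN Capacity Scheduler, a parent queue lists its children under a `queues` property, and the children's `capacity` percentages must add up to 100 at that level. The sketch below only builds the property names and values. The queue path `root.sqoop` and the child names `large` and `small` are hypothetical, and the percentages are made up for the example.

```python
def subqueue_properties(parent_path, children):
    """Build Capacity Scheduler properties that split one queue into sub-queues.

    parent_path: dotted queue path, e.g. "root.sqoop" (hypothetical here).
    children: dict mapping sub-queue name -> (capacity %, maximum-capacity %),
              both relative to the parent queue.
    """
    total = sum(cap for cap, _ in children.values())
    if total != 100:
        raise ValueError(f"sub-queue capacities must sum to 100, got {total}")

    prefix = "yarn.scheduler.capacity"
    # The parent queue names its children as a comma-separated list.
    props = {f"{prefix}.{parent_path}.queues": ",".join(children)}
    for name, (cap, max_cap) in children.items():
        if max_cap < cap:
            raise ValueError(f"maximum-capacity of {name} is below its capacity")
        props[f"{prefix}.{parent_path}.{name}.capacity"] = str(cap)
        props[f"{prefix}.{parent_path}.{name}.maximum-capacity"] = str(max_cap)
    return props


# Hypothetical split: cap the huge jobs at 60% of the sqoop queue, and let
# the small jobs grow into idle capacity.
props = subqueue_properties("root.sqoop", {"large": (60, 60), "small": (40, 100)})
for key, value in props.items():
    print(f"{key}={value}")
```

Setting `maximum-capacity` equal to `capacity` on the queue for large jobs is one way to stop them from taking the whole parent queue; whether that suits a given workload is a judgment call.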
1 answer

HDInsight HBase vs Databricks

Hi, this is probably answered or perhaps a tall question. What would be the differences/benefits of using HDInsight HBase vs Databricks? Azure Storage is definitely one. If the aim is to have the convenience of traditional tables and SQL with…

Azure HDInsight
Azure Databricks
asked 2022-09-01T16:06:11.67+00:00
AKM 1 Reputation point
commented 2022-09-08T08:41:11.13+00:00
PRADEEPCHEEKATLA-MSFT 84,781 Reputation points Microsoft Employee
1 answer

To use Azure Data Lake Storage Gen2 with Azure HDInsight clusters, do I have to attach the storage to clusters as linked additional storage?

To use Azure Data Lake Storage Gen2 with Azure HDInsight clusters, do I have to attach the storage to clusters as linked additional storage? Or, as long as permission is granted to the managed identity, can a Spark Scala application access the storage using…

Azure Data Lake Storage
An Azure service that provides an enterprise-wide hyper-scale repository for big data analytic workloads and is integrated with Azure Blob Storage.
1,415 questions
Azure HDInsight
asked 2022-08-30T20:05:46.367+00:00
Summer 1 Reputation point
commented 2022-09-08T08:33:39.86+00:00
PRADEEPCHEEKATLA-MSFT 84,781 Reputation points Microsoft Employee
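
For context on the ADLS Gen2 question above, paths in a Gen2 account are usually addressed with the `abfss://` scheme against the account's `dfs.core.windows.net` endpoint. The sketch below only shows the URI shape; the container, account, and path names are hypothetical, and it does not settle whether the account has to be attached to the cluster first.

```python
def abfss_uri(container, account, path=""):
    """Return an abfss:// URI for a path in an ADLS Gen2 container.

    All three arguments are placeholders supplied by the caller.
    """
    # Drop leading slashes so the result has exactly one "/" after the host.
    return f"abfss://{container}@{account}.dfs.core.windows.net/{path.lstrip('/')}"


# Hypothetical names, for illustration only.
uri = abfss_uri("data", "mystorageacct", "/raw/events")
print(uri)
```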
2 answers

Creating an HDInsight Spark 4.0 cluster with managed identity and a Data Lake Storage Gen2 storage account

Hi! I am trying to create an HDI cluster with an ADLS Gen2 storage account as the primary storage account. I have created multiple containers inside my storage account, and I want to limit the access of the managed identity to the other containers. …

Azure Data Lake Storage
Azure Storage Accounts
Globally unique resources that provide access to data management services and serve as the parent namespace for the services.
2,887 questions
Azure HDInsight
asked 2022-08-16T12:48:09.72+00:00
LHIND, CARSTEN 1 Reputation point
commented 2022-08-29T10:53:43.143+00:00
PRADEEPCHEEKATLA-MSFT 84,781 Reputation points Microsoft Employee
1 answer

HDInsight cluster is in the Error status, even when the user-assigned Managed Identity is assigned the Storage Blob Data Owner role.

HDInsight cluster is in the Error status, even when the user-assigned Managed Identity is assigned the Storage Blob Data Owner role.

Azure HDInsight
asked 2022-08-05T19:40:37.83+00:00
Akash Chopra 36 Reputation points
commented 2022-08-22T05:17:00.84+00:00
PRADEEPCHEEKATLA-MSFT 84,781 Reputation points Microsoft Employee
1 answer One of the answers was accepted by the question author.

HDInsight cluster, worker node, E32_v3 (256 GB), memory issue

HDInsight cluster, worker node E32_v3 (256 GB). The Ambari portal shows 911 GB of memory. The Microsoft documentation for the E32_v3 HDInsight worker node shows 1600 GB of space. Why is there a discrepancy?

Azure HDInsight
asked 2022-06-15T06:46:45.273+00:00
Thakur, Prabhat 81 Reputation points
accepted 2022-07-25T14:12:00.207+00:00
Thakur, Prabhat 81 Reputation points
1 answer One of the answers was accepted by the question author.

Unable to create HDInsight cluster through free Azure subscription

There are not enough cores available to support the selected number of nodes. Please adjust the number of nodes selected, pick a different region, or open a support case to request additional HDInsight cores. You have reached your subscription's…

Azure HDInsight
asked 2022-07-24T06:46:22.037+00:00
Vrunda 21 Reputation points
accepted 2022-07-25T05:16:24.583+00:00
Vrunda 21 Reputation points
1 answer

Azure HDInsight: There are not enough cores available

I want to create an HDInsight cluster in my pay-as-you-go subscription, but I get the error: There are not enough cores available to support the selected number of nodes. I checked my subscription's usage and quotas for compute, and usage is for every processor…

Azure HDInsight
asked 2022-07-06T21:41:26.09+00:00
Ales Ventus 46 Reputation points
commented 2022-07-11T05:39:06.973+00:00
PRADEEPCHEEKATLA-MSFT 84,781 Reputation points Microsoft Employee
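
The "not enough cores" error in the two questions above comes down to simple arithmetic: the cores of every node in the cluster are added up and compared with the HDInsight core quota left in the region. A minimal sketch of that check follows. The node counts, cores per node, and quota values are hypothetical, and the real per-size core counts should be taken from the VM size documentation.

```python
def cores_required(node_groups):
    """Total cores for a cluster given (node_count, cores_per_node) pairs."""
    return sum(count * cores for count, cores in node_groups)


def fits_quota(node_groups, cores_in_use, quota):
    """True if the cluster fits in the remaining core quota."""
    return cores_in_use + cores_required(node_groups) <= quota


# Hypothetical layout: 2 head nodes with 4 cores each, 4 workers with 8 cores each.
layout = [(2, 4), (4, 8)]
needed = cores_required(layout)  # 2*4 + 4*8 = 40
print(needed, fits_quota(layout, cores_in_use=0, quota=12))
```

With these made-up numbers the cluster needs 40 cores, so a quota of 12 would be rejected even with nothing else running; reducing the node count or requesting more HDInsight cores are the options the error message itself lists.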
1 answer One of the answers was accepted by the question author.

Files not getting saved in Azure Blob Storage using Spark in HDInsight cluster

We've set up an HDInsight cluster on Azure with Blob Storage as the storage for Hadoop. We tried uploading files to Hadoop using the hadoop CLI, and the files were getting uploaded to Azure Blob Storage. Command used to upload: hadoop fs -put somefile…

Azure Blob Storage
An Azure service that stores unstructured data in the cloud as blobs.
2,590 questions
Azure HDInsight
asked 2022-06-14T11:44:29.873+00:00
Saif Ahmad 21 Reputation points
commented 2022-06-21T13:09:24.653+00:00
Saif Ahmad 21 Reputation points
1 answer One of the answers was accepted by the question author.

Connect Synapse Spark Pool with Kafka on HDInsight

I have created a Kafka on HDInsight cluster. I have also created an Azure Synapse Analytics Spark pool in the same region as HDInsight. I need guidance on how to consume topics from Kafka into Spark Structured Streaming. Any documentation or steps will be of…

Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.
4,643 questions
Azure HDInsight
asked 2022-06-07T01:52:19.007+00:00
sql-seek 61 Reputation points
accepted 2022-06-15T04:38:39.04+00:00
sql-seek 61 Reputation points
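
For the Kafka-to-Structured-Streaming question above, the Spark Kafka source is configured through a small set of string options. The sketch below only assembles those options in plain Python so the shape is visible. The broker host names and the topic name are hypothetical, and it assumes the Spark pool can actually reach the Kafka brokers over the network, which the question does not confirm.

```python
def kafka_source_options(brokers, topic, starting_offsets="latest"):
    """Build the option map for Spark's Kafka structured streaming source."""
    if not brokers:
        raise ValueError("at least one broker address is required")
    return {
        # Comma-separated host:port list of Kafka brokers.
        "kafka.bootstrap.servers": ",".join(brokers),
        # Name of the topic to subscribe to.
        "subscribe": topic,
        # Where to start reading when no checkpoint exists yet.
        "startingOffsets": starting_offsets,
    }


# Hypothetical broker addresses and topic name.
opts = kafka_source_options(
    ["wn0-kafka.example.internal:9092", "wn1-kafka.example.internal:9092"],
    "events",
)
print(opts)

# With a SparkSession named `spark`, these would typically be passed as:
#   df = spark.readStream.format("kafka").options(**opts).load()
```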
1 answer One of the answers was accepted by the question author.

HDInsight - Kafka - Version 3.2

Hi all, is there a roadmap to release a cluster with a Kafka version higher than 2.4.1 in the near future? Thanks in advance for the info. Best regards, Michael

Azure HDInsight
asked 2022-05-30T13:56:47.557+00:00
Michael Ahrens 21 Reputation points
accepted 2022-06-07T09:26:48.037+00:00
Michael Ahrens 21 Reputation points
1 answer One of the answers was accepted by the question author.

Can Azure Stream Analytics read from Kafka on HDInsight and write to a Delta Lake table on Synapse lake?

Hello, I am looking for guidance on building a new event-driven platform. The options we are exploring for processing are: Azure Stream Analytics; Apache Spark Structured Streaming in Synapse. The source is likely going to be Kafka on HDInsight …

Azure Synapse Analytics
Azure HDInsight
Azure Stream Analytics
An Azure real-time analytics service designed for mission-critical workloads.
342 questions
asked 2022-06-01T21:16:30.287+00:00
sql-seek 61 Reputation points
commented 2022-06-03T05:46:55.343+00:00
sql-seek 61 Reputation points
1 answer

What is the best way to copy data from my on-prem Hadoop cluster to the Azure HDInsight cluster?

Hi experts, what is the best way to copy data from my on-prem Hadoop cluster to the Azure HDInsight cluster? We recently deployed a new HDInsight cluster, and now I would like to copy some data from my on-prem cluster to HDInsight. Thanks,

Azure HDInsight
asked 2022-05-16T20:57:25.917+00:00
Richmond Yu 1 Reputation point
commented 2022-06-01T05:47:08.503+00:00
PRADEEPCHEEKATLA-MSFT 84,781 Reputation points Microsoft Employee
1 answer

How to run HDFS commands from my on-prem cluster against Azure?

Hi experts, how can I run HDFS commands from my on-prem cluster against Azure? I have an on-prem cluster from which I would like to run HDFS commands to read files that are on my Azure HDInsight cluster. How can I do this? Thanks,

Azure HDInsight
asked 2022-05-16T20:56:10.407+00:00
Richmond Yu 1 Reputation point
commented 2022-06-01T05:46:30.057+00:00
PRADEEPCHEEKATLA-MSFT 84,781 Reputation points Microsoft Employee
1 answer

HDInsight monitoring

Hi friends, I am new to HDInsight. Any ideas about monitoring HDInsight clusters? What are the major things we need to observe and monitor? We are using Apache Ambari.

Azure HDInsight
Azure R Server for HDInsight
An Azure service that provides predictive analytics, machine learning, and statistical modeling for big data.
13 questions
asked 2022-05-10T08:17:22.893+00:00
Anshal 2,086 Reputation points
commented 2022-05-15T11:30:21.27+00:00
Luis Rodriguez 6,196 Reputation points Microsoft Employee
1 answer

Looking for HDInsight support team alias

I am running into an error while running a Spark job in an Azure Data Factory pipeline and would like to connect with the HDInsight support team for further assistance. Could you please provide the alias?

Azure HDInsight
asked 2022-04-29T20:18:15.62+00:00
Harsha Deshmukh 1 Reputation point Microsoft Employee
commented 2022-05-10T07:31:36+00:00
PRADEEPCHEEKATLA-MSFT 84,781 Reputation points Microsoft Employee
1 answer

Can't create HDInsight cluster (Hadoop)

Dear all, I am struggling to create an HDInsight cluster. After reviewing documents and other posts, I upgraded my free trial to a paid subscription and also created a new paid subscription. However, regardless of subscription type (both are paid), I…

Azure HDInsight
asked 2022-05-03T15:57:25.88+00:00
Jongmin Lee 6 Reputation points
commented 2022-05-09T05:15:42.647+00:00
PRADEEPCHEEKATLA-MSFT 84,781 Reputation points Microsoft Employee
1 answer One of the answers was accepted by the question author.

How to leverage Azure Key Vault secrets from an HDInsight Jupyter notebook?

Hi, I am trying to store the user ID and password in secrets and retrieve them in an HDInsight Jupyter notebook. Any guidance?

Azure Key Vault
An Azure service that is used to manage and protect cryptographic keys and other secrets used by cloud apps and services.
1,179 questions
Azure HDInsight
asked 2022-04-20T03:08:50.533+00:00
Jeeva 161 Reputation points
accepted 2022-05-02T12:04:55.727+00:00
Jeeva 161 Reputation points