Trino VM access to HDI Hive metastore - NSGs wide open, destination host unreachable
Within the same resource group, I created a Trino instance and an HDI cluster, in different VNets of course. I set wide-open NSG rules (inbound/outbound, all ports, all protocols). The Trino node cannot connect to the HDI Presto metacatalog service (connection timeout in…


Azure HDInsight SQL integration - lack of managed identity authentication
Hello, while creating an HDInsight cluster with Spark, I found that the SQL database for metadata cannot be connected using a managed identity. Can you improve that? Note: a managed identity has to be used for Storage Account (or Lake) access.


How to set a static IP for head nodes in an HDInsight cluster?
Hi, I am working with an HDInsight cluster of type Hadoop. I am using ARM templates to create and destroy the cluster every day (8am to 6pm). Regarding networking, I am using an Azure virtual network with a subnet for HDInsight. We are…


Can we upgrade commons-io in HDInsight?
Hello Team, We're currently running HDInsight 5.0 for the Spark runtime. We're facing an issue with the commons-io library that ships with HDInsight 5.0, since it is an older version (2.5). In our project, we're dependent on a lib…


Where is the Hue service and URL in HDInsight?
Hi, I created an HDInsight (Hadoop) cluster and used a script action to install Hue. The installation succeeded, but I was unable to find the Hue service or URL in Ambari. Could you please help me? Regards, Fede


How to set up a custom database for Ambari in HDInsight?
Hi, I am creating a Hadoop cluster in HDInsight. For Ambari I created an Azure SQL Database, and during creation I selected not to use existing data. But when I create the cluster and try to select the database, I get this warning: I…


Big data access
Hi, I've created a student account because I'm studying for a master's degree, but that type of account doesn't work for big data. What do I need to access Hadoop and Apache Spark on Azure, and what are the costs? Thanks, Joao


How to access Hadoop and Apache Spark in Azure
Hi, I've created my Azure student account, but I can't access HDInsight to use Hadoop and Apache Spark for my master's. How do I do that? Cheers, Joao. Account (sba22203@student.cct.ie)


Can I connect to Azure HDInsight Hive DB using a Python script?
Hello everyone! I have a task to perform on Azure HDInsight Hive DB and Azure Databricks. I have to connect to the Azure HDInsight Hive DB and export all the data in Excel or CSV format automatically on a daily basis, to be stored in my Storage account…


Migration from AWS EMR to Azure
We are trying to move our Spark steps code from an AWS EMR cluster to Azure. We are using the add-steps option with command-runner.jar in EMR. Each step starts a Python script that takes a large text file in S3 storage and manipulates it with Spark. Example…


Can I use a Student subscription in Azure to create an HDInsight Spark cluster?
Hi all, I am trying to create a Spark cluster in HDInsight (the name of the resource is Azure HDInsight) with my Student subscription. I have tried googling but couldn't find clues in the Microsoft documentation. I have my $100 unused, but when I go…


How to create an HDInsight Interactive Query cluster with an additional storage account?
Hi community, I am new to HDInsight and am asking for help with this situation. Preconditions: I have a Data Lake Gen2 (hierarchical namespace enabled) with my business data (CSV and Parquet files). I need to create 2 clusters. Interactive…


Configuration-related exception while trying to run a Spark app in an HDInsight 5.0 cluster
I am migrating from HDInsight 4.0 to 5.0. Locally, it works. However, when I ran Spark jobs in the HDInsight cluster, I got the error below. Any idea why "spark.nonjvm.error.forwarding.enabled" is registered multiple times? Command to run spark…


How to execute Hive queries in Synapse Spark
Hello! I am replacing an HDI cluster with Azure Synapse. My current HDI Spark cluster executes some Hive queries for data transformation. Is it possible to execute the same Hive queries in an Azure Synapse Spark pool? Thanks, DR


Differences between HDInsight and Azure Databricks?
I know that HDInsight has several types of clusters, whereas Databricks is only for Spark clusters. I believe there must be some significant differences that will influence which one to choose for implementation. [Note: As we migrate from MSDN,…


Does Azure have a service to migrate data from AWS MSK to Azure Kafka HDInsight?
I am looking for a way to migrate data from AWS MSK to Azure Kafka. Is there any service available to do that, and what are its pre-migration prerequisites?


Azure HDInsight
What is the resource provider connection in Azure HDInsight? (In the portal, when deploying an HDInsight cluster, it gives 2 options: first Inbound, which has no private link checkbox, and the other Outbound, which has a private link checkbox.) I want to know about both with…


How to fix an error in a pipeline with an HDI activity?
I try to run a pipeline with a Hive activity and get the error: Response status code indicates server error: 500 (InternalServerError), with the code 2300. I couldn't find that error in the solution guide, so I don't really know where to go from here. …


Convert a result of collect_list into JSON using Spark with Scala
Please find the sample below. After using the code below: val df4 = df3.groupBy("shop_id").agg(collect_list(map($"variant_id",$"variants1")) as ("variants")) I got data like: …


HDInsight startup yields linked service error: The storage connection string is invalid.
I'm getting an error when trying to run the demo Spark word count in Data Factory using HDInsight and a Spark activity. All services were successfully created and tested. But when the Spark pipeline is triggered, the following error is displayed: …

