Scenario: Poor performance in Apache Hive LLAP queries in Azure HDInsight
This article describes troubleshooting steps and possible resolutions for issues when using Interactive Query components in Azure HDInsight clusters.
Issue
The default cluster configurations are not sufficiently tuned for your workload. Queries in Hive LLAP are executing slower than expected.
Cause
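This can happen for a variety of reasons. One common reason, addressed in the resolution below, is a workload dominated by point queries, which the default LLAP configuration isn't tuned for.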
Resolution
LLAP is optimized for queries that involve joins and aggregates. Queries like the following don’t perform well in an Interactive Hive cluster:
select * from table where column = "columnvalue"
To improve point query performance in Hive LLAP, set the following configurations:
hive.llap.io.enabled=false; (disable LLAP IO)
hive.optimize.index.filter=false; (disable ORC row index)
hive.exec.orc.split.strategy=BI; (to avoid recombining splits)
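As a minimal sketch, the three settings above could be applied to a single session (for example, from Beeline) before running a point query. This assumes the cluster allows these properties to be overridden at session level; if it doesn't, they would have to be changed in the Hive configuration through Ambari instead. The table and column names (`sales`, `order_id`) are hypothetical.

```sql
-- Session-level overrides for point queries.
-- Assumption: the cluster permits setting these properties per session.

-- Disable LLAP IO
SET hive.llap.io.enabled=false;

-- Disable ORC row index
SET hive.optimize.index.filter=false;

-- Avoid recombining splits
SET hive.exec.orc.split.strategy=BI;

-- Hypothetical point query; replace the table and column with your own
SELECT * FROM sales WHERE order_id = "12345";
```

Settings applied with `SET` last only for the current session, so other workloads on the cluster keep the default configuration.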
You can also increase usage of the LLAP cache to improve performance with the following configuration change:
hive.fetch.task.conversion=none
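To try this change for one session, the property can be set and then read back. In Hive, `SET` followed by a property name alone prints that property's current value. As a sketch, this again assumes the cluster allows the property to be overridden at session level.

```sql
-- Assumption: session-level override of this property is permitted.
SET hive.fetch.task.conversion=none;

-- Print the current value to confirm the change took effect
SET hive.fetch.task.conversion;
```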
Next steps
If you didn't see your problem or are unable to solve your issue, visit one of the following channels for more support:
Get answers from Azure experts through Azure Community Support.
Connect with @AzureSupport, the official Microsoft Azure account for improving customer experience by connecting the Azure community to the right resources: answers, support, and experts.
If you need more help, you can submit a support request from the Azure portal. Select Support from the menu bar or open the Help + support hub. For more detailed information, review How to create an Azure support request. Access to Subscription Management and billing support is included with your Microsoft Azure subscription, and Technical Support is provided through one of the Azure Support Plans.