Big Data Support

This is the team blog for the Big Data Analytics & NoSQL Support team at Microsoft. We support HDInsight, which is Hadoop running on Azure, as well as other big data analytics features.

Rerunning many slices and activities in Azure Data Factory

Today someone asked me how to run all the data slices in their data factory on-demand in an ad-hoc...

Date: 08/31/2016

Capture Microsoft Azure Stream Analytics logs

Microsoft Azure Stream Analytics makes building real-time solutions very easy. Developers can build...

Date: 08/24/2016

HDFS gets full in Azure HDInsight with many Hive temporary files

Sometimes when Hive is using temporary files, and a VM is restarted in an HDInsight cluster in...

Date: 08/15/2016

How to Find and Kill a running Yarn Application Master in HDInsight with and without SSH access

Today we faced a challenge in HDInsight not knowing the SSH user password to terminal into the...

Date: 06/11/2016

How to Lock a Resource Group to prevent accidental deletion of resources like HDInsight

Did you know it is possible to prevent accidental deletion of resources in Azure? This could apply...

Date: 05/16/2016

HDInsight Name Node can stay in Safe mode after a Scale Down

This week we worked on an HDInsight cluster where the Name Node has gone into Safe mode and didn't...

Date: 03/16/2016

HDInsight Hive Metastore fails when the database name has dashes or hyphens

Working in Azure HDInsight support today, we see a failure when trying to run a Hive query on a...

Date: 02/24/2016

How to call a Azure Machine Learning Web Service from NodeJS

Azure machine learning allows data scientists and developers to embed predictive analytics into...

Date: 02/18/2016

Encoding 101 - Exporting from SQL Server into flat files, to create a Hive external table

Today in Microsoft Big Data Support we faced the issue of how to correctly move Unicode data from...

Date: 02/05/2016

Encoding the Hive query file in Azure HDInsight

Today at Microsoft we were using Azure Data Factory to run Hive Activities in Azure HDInsight on a...

Date: 02/05/2016

Incremental data load from Azure Table Storage to Azure SQL using Azure Data Factory

Azure Data Factory is a cloud based data integration service. The service not only helps to move...

Date: 01/23/2016

How to allow Spark to access Microsoft SQL Server

Today we will look at configuring Spark to access Microsoft SQL Server through JDBC. On HDInsight...

Date: 10/22/2015

Using Azure SDK for Python

Python is a great scripting tool with a large user base. In a recent support case I needed a way to...

Date: 10/02/2015

A KMeans example for Spark MLlib on HDInsight

Today we will take a look at Spark's MLlib module, its built-in machine learning library...

Date: 09/24/2015

Dealing with RequestRateTooLarge errors in Azure DocumentDB and testing performance

In Azure DocumentDB support, one of the most common errors we have seen as reported by our customers...

Date: 09/02/2015

How to configure Hortonworks HDP to access Azure Windows Storage

Recently I was asked how to configure a Hortonworks HDP 2.3 cluster to access Azure Windows Storage....

Date: 09/01/2015

Troubleshooting Oozie or other Hadoop errors with DEBUG logging

In troubleshooting Hadoop issues, we often need to review the logging of a specific Hadoop...

Date: 08/21/2015

Some things to consider for your Spark on HDInsight workload

When it comes time to provision your Spark cluster on HDInsight we all want our workloads to execute...

Date: 08/19/2015

How to Access HDInsight Linux Web UI's using SSH Dynamic Tunneling

Scenario: One of the most important features of Azure HDInsight Linux (currently in preview) is the...

Date: 08/12/2015

Why is my spark application running out of disk space?

In your Zeppelin notebook you have Scala code that loads Parquet data from two folders that is...

Date: 08/12/2015

Using cross/outer apply in Azure Stream Analytics

Recently I got involved in working with a problem where JSON data events contain an array of values....

Date: 08/05/2015

Azure Data Factory JSON Changes in July 2015

Azure Data Factory factories are designed with a series of fairly simple JSON documents and uploaded...

Date: 07/21/2015

Spark on Azure HDInsight is available

Spark on Azure HDInsight (public preview) is now available! The following components are included as...

Date: 07/14/2015

How to access Hive using JDBC on HDInsight

While following up on a customer question recently on this topic, I realized that we have seen the...

Date: 06/09/2015

How to install Splunk on HDINSIGHT with a custom action script

Recently I worked with a customer that wanted to use Splunk Enterprise and Splunk Forwarder to...

Date: 06/01/2015

Why are the Hadoop services disabled on my HDInsight cluster

I came across this question while working with a few customers recently and thought I would share a...

Date: 05/31/2015

Understanding HDInsight Custom Node VM Sizes

With the 02/18/2015 update to HDInsight and Azure PowerShell 0.8.14 we introduced a lot more...

Date: 05/11/2015

Azure PowerShell 0.8.14 Released, fixes problems with pipelining HDInsight configuration cmdlets

We recently pushed out the 0.8.14 release of Azure PowerShell. This release includes some updates to...

Date: 02/16/2015

Problems When Using a Shared Default Storage Container with Multiple HDInsight Clusters

We have seen several cases come in to Microsoft Support that ended up being caused by having...

Date: 02/12/2015

Some Commonly Used Yarn Memory Settings

We were recently working on an out of memory issue that was occurring with certain workloads on...

Date: 11/11/2014

How to use parameter substitution with Pig Latin and PowerShell

When running Pig in a production environment, you'll likely have one or more Pig Latin scripts that...

Date: 08/12/2014

HDInsight: - Creating, Deploying and Executing Pig UDF

During my developer experience, I always look for how customization (write my own processing) can be...

Date: 07/07/2014

How to use a Custom JSON Serde with Microsoft Azure HDInsight

I had a recent need to parse JSON files using Hive. There were a couple of options that I could use....

Date: 06/18/2014

Some Frequently Asked Questions on Microsoft Azure HDInsight

We have seen some common questions on HDInsight when interacting with customers and partners. On...

Date: 05/22/2014

HDInsight News - New Videos to watch - HDInsight Provisioning demonstrations

Check out these two recent video demos regarding HDInsight provisioning. These videos complement the...

Date: 05/09/2014

HDInsight: - backup and restore hive table

Introduction: My name is Sudhir Rawat and I work on the Microsoft HDInsight support team. In this...

Date: 05/01/2014

Sliding Window Data Partitioning on Microsoft Azure HDInsight

HCatalog is a table and storage management layer for Hadoop that enables users with different data...

Date: 04/23/2014

Querying HDInsight Job Status with WebHCat via Native PowerShell or Node.js

One of the great things about HDInsight is that under the covers, it has the same capabilities as...

Date: 04/22/2014

Customizing HDInsight Cluster provisioning

In my last blog, I discussed how we can specify Hadoop configurations for a job on an HDInsight...

Date: 04/15/2014

Using Apache Flume with HDInsight

Gregory Suarez – 03/18/2014 (This blog posting assumes some basic knowledge of Apache Flume)...

Date: 03/18/2014
