Let there be Windows Azure HDInsight
Windows Azure HDInsight Service, formerly known as Hadoop on Windows Azure, is now available inside the Windows Azure Preview portal. Hadoop-based big data tools are what I call the WMD(P), or Weapons of Mass Data Processing. (You heard it here first!) This is a very exciting development, and I would like to take a moment to recognize the great work our HDInsight team has done to pull this off.
Signing up
1. To sign up, log in to your Windows Azure account at https://www.windowsazure.com. At the bottom of the portal, click New (+).
2. Then click DATA SERVICES, followed by HDInsight. Use the preview program link to navigate to the sign-up page.
On the Preview features page, also accessible from https://account.windowsazure.com/PreviewFeatures/, click try it now next to Azure HDInsight Preview. Be warned that getting access might take up to a few days.
Getting Started and More Learning Content
Overview presentation
Please send me feedback on this presentation via Twitter: @wenmingye
Get a copy from https://github.com/WindowsAzure-TrainingKit/PRESENTATION-WindowsAzureHDInsight (download it using the zip link).
Or download it from a direct link: https://bit.ly/YPstrp
Hands-on Lab
https://github.com/WindowsAzure-TrainingKit/HOL-WindowsAzureHDInsight/blob/master/HOL.md
New Channel 9 Video Series
https://channel9.msdn.com/Series/Getting-started-with-Windows-Azure-HDInsight-Service/
Direct links to the Windows Azure Portal Documentation Section
Quick Start
Visit these articles first to get started using the HDInsight Service.
Tutorial: Getting Started with the Windows Azure HDInsight Service
Tutorial: Using MapReduce with HDInsight
Tutorial: Using Hive with HDInsight
Tutorial: Using Pig with HDInsight
Explore
Guidance: Introduction to the Windows Azure HDInsight Service
A conceptual overview of the components of HDInsight.
Tutorial: Get Started with the HDInsight Service
Learn the fundamentals of working with the HDInsight service.
Plan
Guidance: What version of Hadoop is in Windows Azure HDInsight?
Learn what Hadoop components and versions are included in HDInsight.
Guidance: HDInsight Interactive JavaScript and Hive Consoles
HDInsight comes with interactive consoles for JavaScript and Hive that you can use as an alternative to remoting into the head node of a Hadoop cluster. Learn about how you can use the consoles to enter expressions, evaluate them, and then query and display the results of a MapReduce job immediately.
Upload
How to: Upload data to HDInsight
Learn how to upload and access data in HDInsight using Azure Storage Explorer, the interactive console, the Hadoop command line, or Sqoop.
Analyze
Tutorial: Connect Excel to Windows Azure HDInsight via HiveODBC
One key feature of Microsoft’s Big Data Solution is solid integration of Apache Hadoop with Microsoft Business Intelligence (BI) components. A good example of this is the ability for Excel to connect to the Hive data warehouse framework in the Hadoop cluster. This topic walks you through using Excel via the Hive ODBC driver.
Tutorial: Simple recommendation engine using Apache Mahout (please provide feedback to @wenmingye on Twitter)
Apache Mahout is a machine learning library built for use in scalable machine learning applications. Recommender engines are some of the most immediately recognizable machine learning applications in use today. In this tutorial you download data from the Million Song Dataset site and use it to create song recommendations for users based on their past listening habits.
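To give a feel for what "recommendations based on past listening habits" means, here is a minimal plain-Python sketch of item co-occurrence, one simple form of collaborative filtering. It does not use Mahout or its API, and it is not necessarily the algorithm the tutorial applies. The `recommend` function, the user names, and the song names are all made up for illustration.

```python
from collections import defaultdict
from itertools import permutations


def recommend(listens, user, top_n=2):
    """Rank songs the user has not heard by co-occurrence with songs they have."""
    # Count how often each ordered pair of songs shares a listener.
    co_counts = defaultdict(int)
    for songs in listens.values():
        for a, b in permutations(songs, 2):
            co_counts[(a, b)] += 1

    heard = listens[user]
    scores = defaultdict(int)
    for (a, b), count in co_counts.items():
        # Credit an unheard song b for every listener it shares with a heard song a.
        if a in heard and b not in heard:
            scores[b] += count

    # Highest score first; ties are broken alphabetically for stable output.
    ranked = sorted(scores, key=lambda song: (-scores[song], song))
    return ranked[:top_n]


# Hypothetical listening histories (not taken from the Million Song Dataset).
listens = {
    "ann": {"song_a", "song_b", "song_c"},
    "bob": {"song_a", "song_b"},
    "cat": {"song_b", "song_c", "song_d"},
    "dan": {"song_a"},
}
print(recommend(listens, "dan"))  # ['song_b', 'song_c']
```

On a real dataset the pair counting is the expensive part, which is why it is the kind of work that gets distributed across a Hadoop cluster.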
Tutorial: Analyzing Twitter Data with Hive (please provide feedback to @wenmingye on Twitter)
In this tutorial you will query, explore, and analyze data from Twitter using the Apache Hadoop-based HDInsight Service for Windows Azure and a complex Hive example.
Tutorial: Using HDInsight to process Blob Storage data and write the results to a SQL Database
This tutorial will show you how to use the HDInsight service to process data stored in Windows Azure Blob Storage and move the results to a Windows Azure SQL Database.
Manage
How to: Administer HDInsight
Learn how to create an HDInsight cluster and use the administrative tools available through the Windows Azure Management Portal.
How to: Monitor HDInsight
Learn how to monitor an HDInsight cluster and view Hadoop job history through the Windows Azure Management Portal.
How to: Deploy an HDInsight Cluster Programmatically
Learn how to use the Windows Azure service management REST API to create, list, and delete HDInsight clusters programmatically.
How to: Execute Remote Jobs on Your HDInsight Cluster Programmatically
Learn how to use the WebHCat REST API to provide metadata management and remote job submission to your Hadoop cluster.
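As a rough idea of what remote job submission looks like, the sketch below builds (but does not send) a WebHCat request that submits a Hive query. The `templeton/v1/hive` resource and the `user.name`, `execute`, and `statusdir` form fields come from the WebHCat API. The cluster address, credentials, table name, status directory, and the use of HTTP basic authentication are assumptions for illustration, so check the linked article for what your cluster expects.

```python
import base64
import urllib.parse
import urllib.request


def build_hive_job_request(cluster_url, user, password, query, status_dir):
    """Build (but do not send) a WebHCat request that submits a Hive query."""
    # WebHCat (formerly Templeton) accepts Hive jobs as a form-encoded POST.
    form = urllib.parse.urlencode({
        "user.name": user,
        "execute": query,
        "statusdir": status_dir,
    }).encode("utf-8")
    request = urllib.request.Request(
        cluster_url.rstrip("/") + "/templeton/v1/hive",
        data=form,
        method="POST",
    )
    # Assumption: the cluster endpoint is protected with HTTP basic authentication.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request.add_header("Authorization", "Basic " + token)
    return request


request = build_hive_job_request(
    "https://example-cluster.invalid",  # hypothetical cluster address
    "admin",                            # hypothetical user name
    "secret",                           # hypothetical password
    "SELECT COUNT(*) FROM tweets;",     # hypothetical Hive table
    "/example/status",                  # hypothetical status directory
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would actually submit the job. It is left out here so the example runs without a cluster.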
Coming Soon
More content is coming within the next few days to two weeks.