Building My HDInsight Server Cluster

24 04 2013

After all the hype about Big Data, Hadoop, and now HDInsight, I decided to build out my own big data cluster on HDInsight. My overall goal is to have a cluster I can use with Excel and Data Explorer. After all, I needed more data in my mashups. I am not going to get into the details or definitions of Big Data; there are entire books on the subject. I will discuss any issues or tidbits I run into during the process.

Setting Up the Environment

I am actually doing this on a VM on my Windows 8 laptop. I created a Windows Server 2012 VM with 1 GB of RAM and 50 GB of storage. (Need some help creating a VM in Windows 8? Check out my post on the subject.)

Installing the HDInsight Server

First, this product is still in Preview at the time of this writing, so mileage will vary and likely change over the next few months.  You will find the installer at http://www.microsoft.com/web/gallery/install.aspx?appid=HDINSIGHT-PREVIEW.  This uses the Microsoft Web Platform Installer.  When prompted I just ran the installer.  This took about one hour to complete on my VM setup. Once it completed, it opened up the dashboard view in IE.


At this point we have installed a cluster called “local (hdfs)”.

Exploring My Local Cluster

Well, things did not go well at first. Whenever I clicked the big gray box to view my dashboard, I received the following error: “Your cluster ‘local (hdfs)’ is not responding.  Please click here to navigate to cluster.” I clicked “here” and ended up on an IIS start page. Not really effective. Let the troubleshooting begin.

Based on this forum issue response, I opened the services window to find that none of my Apache Hadoop services were running after a restart AND they were set to manual.  To resolve this I took two steps.  First, I changed all of my services to run automatically.  This makes sense for my situation because the VM would be running when I wanted to use HDInsight.  Second, I used the command line option to restart all of the services as also noted in the forum post above.
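I made the startup-type change by hand in the services window, but the first step could also be scripted. Here is a minimal sketch in Python that only builds the `sc config` command lines (it does not run them) so they can be reviewed before being pasted into an elevated command prompt. The service names below are hypothetical placeholders, not the names the installer actually uses; substitute the Apache Hadoop service names shown in your own services window.

```python
def build_autostart_commands(service_names):
    """Return one `sc config` command line per Windows service name."""
    commands = []
    for name in service_names:
        # sc.exe expects a space after "start=" and before the value.
        commands.append(f'sc config "{name}" start= auto')
    return commands


# Hypothetical placeholder names; replace them with the Hadoop
# service names listed in the services window on your machine.
example_services = ["namenode", "datanode"]

for line in build_autostart_commands(example_services):
    print(line)
```

Printing the commands rather than executing them keeps the sketch safe to run anywhere and lets you check each line before changing a service.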

From a command prompt execute the following code to restart all Hadoop services:

c:\hadoop\start-onebox

And, VOILA!, my cluster is now running.


Maybe we can get a better error message next time.

At this point I walked through the Getting Started option on the home screen and proceeded to do “Hello World”. I used these samples as intended to get data into my cluster and start working with the various tools. Stay tuned for more posts in the future on my Big Data adventures.

Why Not HDInsight Service on Azure?

The primary reason I did not use the HDInsight Service on Azure was that I did not want to risk the related charges.  Once I have a good understanding of how HDInsight Server works, I will be more comfortable working with HDInsight Service.

Other Resources

Here are some of the resources I used throughout the build.

HDInsight Service Quick Start and Tutorials

Getting Started With Microsoft HDInsight





Are You Signed Up for 24 Hours of PASS–Business Analytics?

29 01 2013

If you have not signed up for the 24 Hours of PASS-Business Analytics, you should. This is a great chance to hear 12 speakers (they will be repeated in the following 12 hours). Topics vary from Big Data to Strategy to Collaboration. Most importantly, you can’t beat the price to hear speakers like Denny Lee, Peter Meyers, and Stacia Misner, to name a few.

I get the privilege of moderating two of the sessions: Session 8:  What Is Big Data? by Mark Whitehorn and Session 10: Visualizing Data with Power View by Sean Boon.

Finally, I heard Marc Reguera talk about how Microsoft Finance uses Power View at a different event.  If you want to see Power View put into practical use by a business user, I highly recommend you check out his session.  I think it is the final piece of the puzzle to join the technology with the business.

I hope you all take the opportunity to join us for this compelling and free event preview to the PASS Business Analytics Conference in Chicago on April 10-12, 2013.





Why I am excited about SQL Server 2012 (Part 2)

28 03 2012

Earlier this month I published a blog entry on this same subject. In honor of the local Minneapolis launch event, I decided to expand the list. You can find five more reasons I am excited out on Magenic’s blog.

Here is the link. Enjoy SQL Server 2012.

http://magenic.com/Blog/WhyIAmExcitedaboutSQLServer2012Part2.aspx







