Category Archives: Speaking

Minnesota BI User Group – Powering Up HDInsight with Power BI (December 2015)

On Wednesday, December 16, I presented on this topic at the Minnesota BI User Group.  This session is based on five blog posts that I created in August 2015.

You can find the presentation here: Powering Up HDInsight with Power BI (pdf).

The details can be found in the blog posts noted below:

Setting Up an HDInsight Cluster (No Scripts Required)

Exploring the Microsoft Azure HDInsight Query Console (No Scripting Required)

Uploading Files to an HDInsight Cluster (No Scripting Required)

Using Power BI with HDInsight Part 1: Power Query and Files

Using Power BI with HDInsight Part 2: Power BI Desktop and Hive

My goals for this series

1. Document using Power BI with HDInsight

2. Prove that you can set up an HDInsight cluster with no scripts

Other References from the Session

Azure: http://azure.microsoft.com/en-us/

Cloud Berry: http://www.cloudberrylab.com/free-microsoft-azure-explorer.aspx

 

Thanks for attending my session.

SQL Saturday #437–Boston BI Edition 2015–You Can Still Analyze Data with T-SQL

Thanks for attending my session on analyzing data with T-SQL. I hope you learned something you can take back and use in your projects or at work. You will find a link to the session and the code I used below. If you have any questions about the session, post them in the comments and I will try to get you the answers.

The presentation can be found here: Analyzing with TSQL

The code was put into a Word document that you can get here: Code to support the analysis with TSQL Sessions

This session is also backed by an existing blog series I have written.

T-SQL Window Functions – Part 1- The OVER() Clause

T-SQL Window Functions – Part 2- Ranking Functions

T-SQL Window Functions – Part 3: Aggregate Functions

T-SQL Window Functions – Part 4- Analytic Functions

Microsoft Resources:

Powering Up HDInsight with Power BI–

On Tuesday, September 15, I presented on this topic for Pragmatic Works. You can find that session here. This session is based on five blog posts that I created in August 2015.

You can find the presentation here: Powering Up HDInsight with Power BI (pdf): https://dataonwheels.files.wordpress.com/2016/02/powering-up-hdinsight-with-power-bi.pdf

The details can be found in the blog posts noted below:

Setting Up an HDInsight Cluster (No Scripts Required)

Exploring the Microsoft Azure HDInsight Query Console (No Scripting Required)

Uploading Files to an HDInsight Cluster (No Scripting Required)

Using Power BI with HDInsight Part 1: Power Query and Files

Using Power BI with HDInsight Part 2: Power BI Desktop and Hive

My goals for this series

1. Document using Power BI with HDInsight

2. Prove that you can set up an HDInsight cluster with no scripts

Other References from the Session

Azure: http://azure.microsoft.com/en-us/

Cloud Berry: http://www.cloudberrylab.com/free-microsoft-azure-explorer.aspx

Wrap Up from the Session

A few questions were asked during the session and I wanted to handle some of them here.

Why did you not use Azure Resource Manager to deploy storage?

I kept the setup as simple as possible and did not need Azure Resource Manager for my demos. However, if you need to rebuild the cluster quickly, Azure Resource Manager would be a good option. Find out more here: https://azure.microsoft.com/en-us/documentation/articles/hdinsight-provision-clusters/. This site also walks through scripts and other options for setting up HDInsight clusters.

Why didn’t the table structure show up in the Power Query demo?

The Power Query demo read the data directly from files. This approach is more “raw”: the files did not have column headers, so no headers were created in the table. In the Power BI Desktop demo, however, I used Hive. The table was defined in Hive, so its columns were easy to see. This is another case for using Hive or something similar to define the schema for ease of use.
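To make the difference concrete, here is a minimal Python sketch. It is not from the session: the sample rows, the column names, and the `read_raw` and `read_with_schema` helpers are all made up for illustration. It only mimics the idea that a headerless file yields unnamed string values, while a schema (the role a Hive table definition plays) supplies names and types.

```python
import csv
import io

# Hypothetical contents of a headerless file, standing in for raw files in cluster storage.
RAW = "1001,Widget,4\n1002,Gadget,7\n"

# Hypothetical schema, standing in for the column definitions a Hive table would supply.
SCHEMA = [("order_id", int), ("product", str), ("quantity", int)]


def read_raw(text):
    """Read rows with no schema: every value is an unnamed string."""
    return list(csv.reader(io.StringIO(text)))


def read_with_schema(text, schema):
    """Attach a column name and type to each value, as a table definition would."""
    rows = []
    for record in csv.reader(io.StringIO(text)):
        rows.append({name: cast(value) for (name, cast), value in zip(schema, record)})
    return rows


if __name__ == "__main__":
    print(read_raw(RAW))                  # lists of strings, no column names
    print(read_with_schema(RAW, SCHEMA))  # dictionaries keyed by column name
```

With the raw read, a consumer has to know that the third value is a quantity. With the schema applied, each value carries its name and type, which is roughly what Power BI Desktop gets from a Hive table.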

What are the differences between Hadoop, Hortonworks, and HDInsight?

Starting from the top, Hadoop is the Apache open source project. All of the products listed above are based on Hadoop.

Hortonworks and Cloudera are examples of Hadoop distributions. These companies have worked with the various versions of open source technologies around Hadoop and created a supported distribution as a result.

Finally, HDInsight is Microsoft’s cloud-based Hadoop implementation. They continue to add functionality including Spark, R, Giraph, and Solr. You can expect Microsoft to continue to grow the capabilities of HDInsight as part of their cloud-based analytics solutions.

Thanks for attending my session.