Category Archives: Business Intelligence

Because business should be, but usually needs our help

Consumption Based Architecture for Modern Data Analytics

Throughout many years of working with BI solutions and data warehouse solutions, we have strived to put all the data in one location so it could be easily consumed by reporting and analysis tools from SQL Server Reporting Services to Microsoft Excel. We have followed the dimensional modeling processes promoted by Ralph Kimball and others. These techniques were developed to turn relational data platforms into viable and well-performing reporting platforms. They worked.

Throughout the years, I have built many star schemas, enterprise data warehouses, and reporting databases based on these techniques. However, they were not without their flaws. In the early days, it would take years to create the enterprise data warehouse. When it was done, the business had already moved on. So, we started creating data marts which were departmentally focused. This allowed us to shorten the development lifecycle to more quickly meet the needs of the business. But the speed of business continued to outpace IT’s ability to deliver effective BI solutions when they were needed.

During this same time, the worlds most ubiquitous BI tool, Microsoft Excel, ran more and more businesses, both large and small. Why was Excel so popular? It allowed the user, notExcel-2013-Icon_thumb.png IT, to do the analysis and produce results when the business needed them. Self-service BI is not new, we just refuse to accept Excel as a “real” BI tool in the industry. Inevitably, no matter how good your BI or reporting tool is, users want to know, “How do I export this to Excel?” I found it humorous that Microsoft suffered the same problem with Power BI. The ability to extract the data is hugely important to users. Why? So they could do this on their own. They feel empowered with Excel.

In today’s world, we are also witnessing a shift to a more mobile, tech savvy group of users. As my teenage and young adult children begin to enter the workforce, I still see a disconnect with enterprise BI solutions in most cases. They want the data at their fingertips and easily consumable to solve the question they have now, not in three weeks. That brings us to the architecture I have been promoting for a few years. It’s not new, but I needed a way to talk about modern data and BI solutions that focused on one the most significant needs in the business – consumable data.

The Consumption Based Architecture is based on the following key concepts:

  1. Keep the data close to the source
  2. Data interfaces should be easy to use
  3. Modern, in-memory tools make this possible

Keep the data close to the source

In the normal enterprise data warehouse solution, we process data to clean it up, reshape it, and generally make it “better”. However, the reality is that the users don’t see it as sourceoftruthbetter if it does not match the source. The only “source of truth” that matters in reality is the one that is closes to the data entry as possible. If that data is wrong, it needs to be corrected there. Why? Because users will always verify that the data warehouse or any BI solution is correct by checking the system of record.

Besides the source of truth issue, any time data is transformed or shaped differently than the source, documentation and maintenance are required. In most cases, documentation is lacking in data warehouse solutions. It’s hard and changes often. Furthermore, the amount of maintenance required to support a traditional data warehouse solution is a problem. When the business needs to change a field or the source changes, how do you measure how long it will take to get that in the data warehouse? Days? Weeks? Months? Years? By the time we have the change in place, tested and ready for use, the business has moved on.

In Consumption Based Architecture, the goal is to move the data as needed and transform it as little as possible. Operational Data Stores which are replicas or copies of the data from the source systems are the best mechanisms to move the data to a reporting area. Only transform if absolutely necessary. In most cases, our transforms will involve data type cleanup or other in cases where fields have changed use over time maybe adding a column to clarify the data. Each time you move data or transform data, it should be to make the data more consumable.

The primary exception to this rule is dimensional data. Dimensional data has a special place in the architecture. While it is not necessary to transform this data, the overall solution is improved by combining similar dimensions and using natural keys for relationships. For example, if you have a customer record in CRM and shipping data, you can create a “golden record” or conformed dimension which contains both natural keys. This will allow you to cross reference data easily in both solutions. Master data solutions help with this but are not required to be successful in this architecture.

Data Interfaces Should Be Easy

Having done a lot of work in the application development field where interfaces are used to simplify programming, I think we should have a similar concept in working with data. We have dabbled in the area for some time. I know that I have used views with schemas (or users in Oracle) that were specifically designed to support SQL Server Analysis Services. We used views to support a consistent data set to the cubes for processing and allow changes in the back end data as needed. The views operated as an interface between the data warehouse and the cube.

In a similar fashion, these interfaces need to be identified and used throughout the Consumption Based Architecture. If you are referencing a relational data structure, views continue to make sense. They allow consumers to interact with the data in a known fashion. They are also logical constructs which can be deprecated with a timeline for users to move off of them to the newer versions.

Analytic models such as those found in Power BI, Qlik, and SQL Server Analysis Services create a similar tool for consumers. Usually those models can be used in the tool itself or even in other tools to produce reports and dashboards. Depending on how the model is viewsmodvirtcreated, they are often a table based view of data. For instance, Power BI can turn a folder of files into a table structure for easy consumption in Power BI reports. Power Pivot models created in Excel can be shared in SharePoint and in Power BI. These are just a few examples of using analytic models as interfaces.

The third option is data virtualization. This tends to be fairly expensive. The two that I am aware of are Cisco Data Virtualization and RedHat OpenShift virtualization. The concept of data virtualization is perfect for Consumption Based Architecture; however, I have not seen this used much due to cost implications. Microsoft may be changing the landscape of data visualization. During PASS Summit 2016, they announced expanding the use of Polybase to reference other data sources such as Oracle and Teradata. Currently, Polybase supports Hadoop data, but this change could allow it to become a virtualization tool that is cost effective (included with SQL Server) and simple to use.  I will be reviewing these tools in a later post as I get more information on them.

Modern, In-Memory Tools

The most significant technological improvement in the past few years to make this architecture really possible is in-memory data tools. While this revolution has occurred in more technologies than Microsoft, Microsoft tools are what I am most familiar with. In the Microsoft arena, this started with Power Pivot in Excel. The ability to mashup various data sources using in-memory models is awesome.

When Power Pivot came on the scene, I was doing a lot of work with SQL Server Analysis Services cubes. While these cubes provided a great analytic layer, they were very fragile in my experience. However, Power Pivot allowed us create better performing models more quickly. At that point, I knew a shift was coming. Two of the biggest pain points with cube design were tackled – speed of development and ease of use. We could solve problems quicker and easier with Power Pivot.

speed-and-ease

Since the release of Power Pivot, Microsoft has also improved their overall in-memory solution set by adding SQL Server Analysis Services Tabular Models and OLTP in-memory and columnstore functionality in SQL Server. These improvements continue to make it easier than ever to build out consumable models in memory. You can now use columnstore in SQL Server without an analytics model, simplifying your architecture without performance penalties. With the release of SQL Server 2016, Microsoft has created a data engine capable of in-memory OLTP to improve transactional loads while also supporting columnstore indexes for reporting loads in the same database. While your mileage may vary on implementation, it is easy to see that Consumption Based Architectures are best able to take advantage of these advances and making data more easily accessible and consumable to our business users.

What’s Next

Over the next few weeks, I will be digging in on some of the topics that support Consumption Based Architecture. The goal is to help you begin to take advantage of this architecture in your business and build out a flexible, easily consumed data and analytics platform.

cba-featured-pic

Upcoming Topics:

  • Interface Layers
  • Data Dictionaries
  • Dimensional or Master Data
  • Moving and Transforming Data
  • Modeling in MSBI
  • Reporting with Consumption Based Architecture

Power BI Is Finally in the Azure Trust Center

With the most recent announcement of Power BI’s inclusion in the Azure Trust Center, it is a good time to review where we are today with Power BI security and compliance as it relates to various customer needs. I do a lot of work with financial, energy, and medical customers. These groups represent a large amount of compliance and regulation needs. I wanted to understand where we are today and this announcement is significant.

What’s in the Announcement?

One the primary roadblocks to accepting the Power BI service has been the lack of compliance and concerns around security. Microsoft has been making a number of enterprise level improvement to the Power BI service and desktop. Power BI now has the following compliance certifications:

PowerBI Compliance 2016

This announcement shows Microsoft’s continued commitment to security and compliance in its cloud based products. While Power BI is not yet to the level of Office 365, some key compliance areas are now covered.

I think the most significant compliance certification is HIPAA/HITECH which removes barriers related for the medical industry. As hospitals, insurance companies, and providers scramble to meet reporting demands from their customers and the government, Power BI gives them a flexible reporting and visualization platform to meet those needs. It will empower self-service in the organizations and departmental or enterprise collaboration with data. The HIPAA/HITECH certification will allow them to use the platform with more confidence and security.

Beyond medical, more institutions will be able to rely on Power BI in a manner that is compliant and safe. As Microsoft continues this journey with Power BI and its other Azure based offerings, customers will be able to react more quickly to the changing business and regulatory environments with confidence in the security and management of their data.

The Reality – You Are as Secure as You Choose to Be

Even with this significant move by Microsoft, you are still responsible for implementing a secure, compliant solution. Microsoft is merely providing tools that are secure and will comply with regulations if implemented correctly. The key to a secure environment will always be you. The data you use and analyze with Power BI is ultimately your responsibility.

I encourage you to review the following resources in addition to the ones above as you determine your security and compliance within the Power BI product:

 

Excel Tip #30: Excel Services Visual Limitations – Displaying Images

As I mentioned in my original post, Exploring Excel 2013 as Microsoft’s BI Client, I will be posting tips regularly about using Excel 2013 and later.  Much of the content will be a result of my daily interactions with business users and other BI devs.  In order to not forget what I learn or discover, I write it down … here.  I hope you too will discover something new you can use.  Enjoy!

Introducing Brian Wright – Guest Blogger

Brian Wright

Today, I am happy to announce that Brian will be joining DataOnWheels as a guest blogger. I have worked with Brian over the past couple of years and his Excel visualization skills are great. I look forward to his contributions to the Excel Tips series and other BI related topics. Thanks Brian.

Hello Data on Wheels Readers! Let’s start this blog post by letting you know a few things about myself. First, I am not a professional writer, blogger, or ever social guru, but I am passionate about what I do. I love data visualization. Watching boring data come to life in a visual report or dashboard is my “thing”. Secondly, when things don’t work the way I think they should, I become obsessed in finding out a way around it.

Images Are Not Displayed in Excel Services

That is what leads us to this blog post today. In the limited environment I work within, Excel Services is used quite often in our BI suite of tools. When I realized that the ever so important images I was adding in my Excel workbooks would not show on Excel Services, my obsession kicked in.

Here is the trick or hack. (Using the word hack makes me look much cooler in my kid’s eyes). Wherever you want your picture within your workbook, simply add a chart. Yes, you read correct, simply add a chart.

clip_image002

Using Charts to Display Images

The trick here is not to link the chart to any type of data at all. Just leave it blank. Right Click on the blank chart and navigate to “Format Chart Area”. Navigate to the fill area and select “Pattern or Texture Fill”.

Next, click on the File Button and select your image. Your image will now show as a background image in your chart. Save and then voila!

clip_image004

Once Excel Services displays your workbook, you will be pleasantly surprised to see your image right where you want it!

Boston BI User Group Meeting–Dashboard Design with Microsoft: Power BI vs Datazen (10/13/15)

Boston BI User Group

Thanks for joining Anthony Martin (@SQLMartini) and I at the Boston BI User Group Meeting in October. During the session, we demo’d and built dashboards in Power BI Desktop and Datazen Publisher.

Power BI

image

www.powerbi.com

Couple of thoughts from our demo:

  • Power BI is a way to get data, model data, and visualize it
  • Power BI Desktop allows you to work with data on your PC
  • Power BI is experiencing a lot of change – for example over 40 changes were applied in September 2015
  • Power BI has an open API that allows you to customize the experience

Datazen

image

www.datazen.com

Couple of thoughts from our demo:

  • Design first scenario – make it look good, then shape data to match
  • Datazen publisher allows us to create dashboard for many different profiles
  • Datazen handles custom shapes

Additional Training from Pragmatic Works

Questions from the Session

Can you use links in Datazen to support drillthrough?

Yes. You can find more information here: Drill-throughs to Other Dashboards or Custom URLs.

Power BI API Development

You have the ability customize Power BI. Check out the contest winners to get some ideas of what is possible.

image

Details about the solutions can be found on the Power BI blog: http://community.powerbi.com/t5/Best-Visual-Contest/con-p/best_visual_contest/tab/entries.

You can find more about custom visuals here: https://powerbi.microsoft.com/en-us/custom-visuals.

Thanks again for joining us.

SQL Saturday #437–Boston BI Edition 2015–You Can Still Analyze Data with T-SQL

image3

Thanks for attending my session on analyzing data with TSQL. I hope you learned something you can take back and use in your projects or at your work. You will find an link to the session and code I used below. If you have any questions about the session post them in comments and I will try to get you the answers.

The presentation can be found here: Analyzing with TSQL

The code was put into a Word document that you can get here: Code to support the analysis with TSQL Sessions

This session is also backed by an existing blog series I have written.

T-SQL Window Functions – Part 1- The OVER() Clause

T-SQL Window Functions – Part 2- Ranking Functions

T-SQL Window Functions – Part 3: Aggregate Functions

T-SQL Window Functions – Part 4- Analytic Functions

Microsoft Resources: