Category Archives: SQL Saturday

SQL Saturday #796 – Minnesota, 2018

sqlsat796_header

First, many thanks to the SQL Saturday and MNPass team for putting on another great event and letting me participate.

I co-presented with Joshuha Owen (@JoshSQL) in a precon on Modern Enterprise Data Warehousing on Azure. Thanks to those who attended and participated in the conversations around changing the way we implement data warehouse capabilities in Azure. Josh and I will be talking more about this in the future.

Now, those of you who attended my Saturday presentation on Consumption Based Architecture, I wanted to get you the slide deck and reference materials here. Thanks again for attending.

The slide deck from the session is here.

cba-featured-pic

I also have blog post around this topic: Consumption Based Architecture for Modern Data Analytics. Feel free to join the conversation there around this.

ERPs and the Consumption Based Architecture Conversation

One of the key topics that came up during the session was related to handling ERPs with minimal change. The key issue surrounding ERP solutions is with the data structure in those systems. Whether you work with SAP, JackHenry, or Dynamics, you have a situation where the data model is very complex and definitely not user friendly. In Consumption Based Architecture, we try to minimize data transformation and reshaping, but ERP solutions are by nature cryptic and complex. By definition, they are not consumable. So in the consumable space, we typically recommend using the vendor supplied solutions such as JHKnow, SAP BW and so on. These solutions provide a vendor managed interpretation of the data in the ERP for reporting and other solutions.

Security in this Architecture

The question was raised during the session around how to secure this. This does not have a simple answer. Each solution may have implemented security differently. For instance, an Oracle database may use user names and not have AD integration. This means that you need to determine how to secure your consumable space. For instance, if you pick Azure Active Directory, you would move data to AAD compliant structures in Azure such as Azure SQL Database, Azure SQL Datawarehouse, and Azure Databricks. This means you might need to use a tool like Goldengate with CDC to update a SQL DB which you can apply security to. This will allow you to centralize security for your consumable data. You will need to plan for security in whatever you do and create what you need to support it.

Thanks again everyone for joining us at SQL Saturday.

 

 

Advertisements

Do You Need a Relational Data Warehouse?

Are you looking to do a major update to your data warehouse or looking to modernize? Many technologies have come about that are changing the landscape of what data warehouses are made of. In this Azure Every Day session, I’d like to talk about 3 new technologies in Azure and HDInsight that break the rules.

1. HDInsight (with Spark in particular)

You don’t have to use SSIS to get ETL into your big data storage. HDInsight with Spark can give us ETL, as well as bring to light a lot of machine learning and other technologies, so we no longer have a dependency on SSIS. Earlier this year, Microsoft released Azure Databricks which improves of what is in HDInsight with expanded Spark capabilities.

2. Azure Data Lake

This is a great place to store your data, and no, it won’t cost you a lot of money and it’s not hard to work with. Azure Data Lake gives you the ability to store all your data, regardless of where it comes from or how it looks, in that space – including real time data.

Have you ever considered putting real time data into your data warehouse? Streaming data into your data warehouse would break everything we consider conventional. So, don’t put it in a data warehouse, move it into a Hadoop structure in Azure Data Lake, and avoid the issue in a structured data warehouse of: I got it in there, how do I get it out?

Check out Azure Data Week coming in October 2018

3. Interactive Hive and Spark SQL (also part of HDInsight)

The everyday report writers and users are not going to learn how to do map reduce or all those other technologies that make us the cool big data/data science people. These users just need to build a report.

These technologies bring us to a place where we can write SQL against those data structures and not care where the data came from, how it got there, or the type of data format it’s in. We apply the schema after the fact and it means the same data that we put in our data lake, can be used in multiple scenarios.

SQL Saturday – Dallas – May 2018

sqlsat-dallas-2018I was able to present at SQL Saturday Dallas this year. Thanks to those of you who were able to attend. As I noted in the meeting you can find details related to Power BI Data Security in the following posts on my site.

Power BI Is Finally in the Azure Trust Center

Power BI Data Security – Sharing in Email

Power BI Data Security – Sharing

Power BI and Data Security – App Workspaces and Power BI AppsPower BI Security Logo

Power BI and Data Security – Free User’s Cannot Share, Read Only in Premium

Power BI and Data Security – Row Level Security (RLS)

Power BI and Data Security – Data Classification and Privacy Levels

Power BI and Data Security – On-premises Data Gateway

Power BI and Data Security – Sharing Data

Power BI and Data Security – Compliance and Encryption

I have also added the presentation here if you want to review it as well.

Thanks again for joining me in Dallas.

SQL Saturday #492 Follow Up – A Window into Your Data

sqlsat492_web

Thanks for attending my session on window functions in TSQL. I hope you learned something you can take back and use in your projects or at your work. You will find an link to the session and code I used below. If you have any questions about the session post them in comments and I will try to get you the answers.

Questions and Comments

  1. Does RATIO_TO_REPORT exist in SQL Server? It is in Oracle.
    • Currently this function is not available in SQL Server
    • Here is the equivalent functionality using existing functions in SQL Server:
      • OrderAmt / SUM(OrderAmt) OVER (PARTITION BY OrderDate)
      • This example can use the source code I have referenced below. It uses the current value as the numerator and the sum by partition as the denominator. While not a simple function, the equivalent is still fairly simple using window functions to help.
  2. Demo issues with Azure SQL Database
    • During the session I ran into an issue with Azure SQL Database. It turns out that the following two functions are not supported there.
      • PERCENTILE_CONT
      • PERCENTILE_DISC

Slides, Code, and Follow Up Posts

The presentation can be found here: A Window into Your Data

The code was put into a Word document that you can get here: TSQL Window Function Code

This session is also backed by an existing blog series I have written.

T-SQL Window Functions – Part 1- The OVER() Clause

T-SQL Window Functions – Part 2- Ranking Functions

T-SQL Window Functions – Part 3: Aggregate Functions

T-SQL Window Functions – Part 4- Analytic Functions

Microsoft Resources:

SQL Saturday #486 Richmond – A Window Into Your Data

 

sqlsat486_web

Thanks for attending my session on window functions in TSQL. I hope you learned something you can take back and use in your projects or at your work. You will find an link to the session and code I used below. If you have any questions about the session post them in comments and I will try to get you the answers.

Questions

  1. Can an OVER clause be used in the WHERE clause?
    • No. The OVER clause can only be used in SELECT and ORDER BY clauses.
  2. Some follow up on ROWS and RANGE with context to CURRENT ROW.
    • We had a lot of discussion around this. In our examples below, RANGE aggregated all the data that fit into the ORDER BY clause. ROWS only referenced the row it was in. So, RANGE looks at everything that meets the criteria established by the PARTITION BY and ORDER BY clauses. ROWS is bound to the physical row.
    • Code examples:
      • OVER (PARTITION BY CustomerName ORDER BY OrderDate RANGE CURRENT ROW)
        • Summed two rows of data for the customer with the date. Both rows had the same date.
      • OVER (PARTITION BY CustomerName ORDER BY OrderDate ROWS CURRENT ROW)
        • Each row only contained the data for the row it was in.

Slides, Code, and Follow Up Posts

The presentation can be found here: A Window into Your Data

The code was put into a Word document that you can get here: TSQL Window Function Code

This session is also backed by an existing blog series I have written.

T-SQL Window Functions – Part 1- The OVER() Clause

T-SQL Window Functions – Part 2- Ranking Functions

T-SQL Window Functions – Part 3: Aggregate Functions

T-SQL Window Functions – Part 4- Analytic Functions

Microsoft Resources: