Category Archives: Pragmatic Works

SQL Saturday #492 Follow Up – A Window into Your Data

sqlsat492_web

Thanks for attending my session on window functions in TSQL. I hope you learned something you can take back and use in your projects or at your work. You will find an link to the session and code I used below. If you have any questions about the session post them in comments and I will try to get you the answers.

Questions and Comments

  1. Does RATIO_TO_REPORT exist in SQL Server? It is in Oracle.
    • Currently this function is not available in SQL Server
    • Here is the equivalent functionality using existing functions in SQL Server:
      • OrderAmt / SUM(OrderAmt) OVER (PARTITION BY OrderDate)
      • This example can use the source code I have referenced below. It uses the current value as the numerator and the sum by partition as the denominator. While not a simple function, the equivalent is still fairly simple using window functions to help.
  2. Demo issues with Azure SQL Database
    • During the session I ran into an issue with Azure SQL Database. It turns out that the following two functions are not supported there.
      • PERCENTILE_CONT
      • PERCENTILE_DISC

Slides, Code, and Follow Up Posts

The presentation can be found here: A Window into Your Data

The code was put into a Word document that you can get here: TSQL Window Function Code

This session is also backed by an existing blog series I have written.

T-SQL Window Functions – Part 1- The OVER() Clause

T-SQL Window Functions – Part 2- Ranking Functions

T-SQL Window Functions – Part 3: Aggregate Functions

T-SQL Window Functions – Part 4- Analytic Functions

Microsoft Resources:

Check Out Josh’s Blog on Creating Percentile in DAX

I would like to introduce a “new” blogger to you. Joshuha Owen has restarted his blog and will be covering topics on business intelligence and data. I have worked with Josh for years at a Magenic and now Pragmatic Works. I look forward to seeing what he will be writing about in the future as well on Bits, Bytes, and Words.

Here is his most current post. Enjoy!

Replicating Excel Percentile in DAX

Currently, DAX has no native percentile function so if you want to replicate a version that matches what the Excel Percentile.INC (inclusive) function does you have to jump through a few hoops. This will involve having to create several measures to hold some intermediate values to apply a final formula. In theory you could do it all in one DAX expression but it would very difficult to read and test.

Continue reading…

Intro to Data Factory–Training on the T’s Follow Up Post

PragmaticWorks-LogoThis is a follow up blog post based on the Intro to Data Factory session I gave on the Training on the T’s with Pragmatic Works. Find more free training from the past and upcoming here. I did my session on January 13, 2015.

 Intro To Data Factory

In this session, I gave a simple introduction to new Azure Data Factory using a CopyActivity pipeline between Azure Blob Storage and Azure SQL Database. Below is a diagram illustrating the factory that is created in the demo.

image

I have published my presentation materials here. This includes the sample JSON files, the Movies.csv, and PowerShell scripts.

Q & A

Here are a few questions that were answered during the session.

1. Does Availability refer to when data that has been transferred will be available? Or when the data source is actually available for query?

Availability refers to when the datasets will make a slice available. This is the when the dataset can be consumed as an input or be targeted as an output. This means you can consume data hourly but choose to push it to its final destination on a different cadence to prevent issues on the receiving end.

2. What pre-requisites are must haves?…e.g.(Azure account, HDInsight, Blob Storage Accounts, etc.)

    • An Azure Account is the only real must have. You could use two on premise SQL Server instances.
    • HDInsight if you want to use the HDInsight activitities
    • An Azure Storage account to use blob or table storage

3. How do you decide to use a Factory or Warehouse?

The factory is more of a data movement tool. A warehouse could be a source or target of a factory pipeline.

4. Is this similar to SSIS in SQL Server?

Yes and no. SSIS is definitely more mature and has more tooling available such as data sources and transformations. SSIS also have a good workflow constructor. The focus of the Data Factory initially was to load HDInsight tables from a variety of sources with more flexibility. The other note here is that Data Factory is being built from the ground up to support the scale of the cloud or Azure.

5. Can this be used for Big Data?

Absolutely. I would say that it is one of the primary reasons for the tool. In reference to the previous question, it will likely be the tool of choice for big data operations because it will be able to scale with Azure.

Links to Additional Resources on Data Factory or tools that were used in the presentation:

Azure Data Factory on Azure’s Website

Azure Data Factory Documentation

Azure Data Factory Pricing

Azure Storage Explorer

Azure PowerShell Documentation

Thanks for joining me for this presentation. We look forward to seeing you at the next Free Training on the T’s.