SQL Saturday Baton Rouge Recap

Thank you to everyone who came out to see me present at SQL Saturday in Baton Rouge! It was a blast to see smiling faces and not be the only one laughing at my jokes. HUGE thank you to the team of volunteers who made this event possible and to the sponsors for enabling such an amazing event.

My presentation, Power BI Meets Programmability, had 58 people in attendance, the largest crowd of the whole weekend! Out of it came many great conversations about a potential shift in IT: moving away from separating reporting teams from application development and toward playing to each other's strengths to deliver sustainable, scalable reporting to the end users of your application. As promised, below is all the code used in the session. Thank you again to everyone who attended, and happy coding!

using System;
using Microsoft.AnalysisServices.Tabular;

namespace TOM_Testing_Live
{
    internal class Program
    {
        static void Main(string[] args)
        {
//-------------- Connect to PBI Premium Workspace ---------------- //

            // create the connect string - powerbi://api.powerbi.com/v1.0/myorg/WORKSPACE_NAME
            string workspaceConnection = "powerbi://api.powerbi.com/v1.0/myorg/Happy%20Coding";
            string connectString = $"DataSource={workspaceConnection};";

            // connect to the Power BI workspace referenced in connect string
            Server server = new Server();
            server.Connect(connectString);


//---------------- List out current state before we make changes -------------------//

            // enumerate through datasets in workspace to display their names
            foreach (Database database in server.Databases)
            {
                Console.WriteLine($"ID : {database.ID}, Name : {database.Name}, CompatibilityLevel: {database.CompatibilityLevel}, Last Updated : {database.LastSchemaUpdate}");
            }

            // get the model from one dataset (use a database ID from the previous step);
            // if you don't specify a database ID, you will only get the model from the first database in the list
            Model model = server.Databases["5d161e6b-697a-49b6-a3e6-7d19b940a8fd"].Model;

            // enumerate through the tables in that model
            foreach (Table table in model.Tables)
            {
                Console.WriteLine($"Table : {table.Name} IsHidden? : {table.IsHidden}");
            }

            // Specify a single table in the dataset
            Table table_product = model.Tables["Product"];

            // List out the columns in the product table
            foreach (Column column in table_product.Columns)
            {
                Console.WriteLine($"Columns: {column.Name}");
            }

            // List out the measures in the product table
            foreach (Measure productmeasures in table_product.Measures)
            {
                Console.WriteLine($"Measures: {productmeasures.Name}");
            }

            // List of translations on the model
            foreach (Culture culture in model.Cultures)
            {
                Console.WriteLine($"Existing Culture: {culture.Name}");
            }

            // List out the hierarchies in the product table
            foreach (Hierarchy hierarchy in table_product.Hierarchies)
            {
                Console.WriteLine($"Hierarchies: {hierarchy.Name}, Lineage Tag = {hierarchy.LineageTag}");
            }

            // List out the levels in our category hierarchy
            if (table_product.Hierarchies.ContainsName("Category Hierarchy"))
            {
                Hierarchy hierarchy_category_ = table_product.Hierarchies["Category Hierarchy"];
                foreach (Level level_categoryhierarchy in hierarchy_category_.Levels)
                {
                    Console.WriteLine($"Category Hierarchy Level {level_categoryhierarchy.Ordinal}: {level_categoryhierarchy.Name} Lineage Tag: {level_categoryhierarchy.LineageTag} from {level_categoryhierarchy.Column.Name}");
                }
            }
            else
            {
                Console.WriteLine("No Category Hierarchy");
            }


        
 //----------------Deleting columns and measures ----------------//
   /*         
           // Deleting a column if it exists
             if (table_product.Columns.ContainsName("Testing"))
             //this looks to see if there is a column already named "Testing"
             {
                 Console.WriteLine($"Column Exists");
                 table_product.Columns.Remove("Testing"); //if the column exists, this will remove it
                 Console.WriteLine($"Column Deleted");
             }
             else
             {
                 Console.WriteLine($"Column Does Not Exist");
             } 

             // Deleting a measure if it exists
             if (table_product.Measures.ContainsName("Test Measure"))
             //this looks to see if there is a measure already named "Test Measure"
             {
                  Console.WriteLine($"Measure Exists");
                  table_product.Measures.Remove("Test Measure"); //if the measure exists, this will remove it
                  Console.WriteLine($"Measure Deleted");
              }
              else
              {
                  Console.WriteLine($"Measure Does Not Exist");
              }

*/
//--------------- Adding columns and measures --------------------//            

                // Adding our column if it doesn't already exist
                if (table_product.Columns.ContainsName("Testing")) //this looks to see if there is a column already named "Testing"
                {
                    Console.WriteLine($"Column Exists");
                    //table_product.Columns.Remove("Testing"); //if the column exists, this will remove it
                    //Console.WriteLine($"Column Deleted");
                    //Column column_testing = new DataColumn() //this will add back the deleted column
                    //{
                    //    Name = "Testing",
                    //    DataType = DataType.String,
                    //    SourceColumn = "Product"
                    //};
                    //table_product.Columns.Add(column_testing);
                    //Console.WriteLine($"Column Re-created!");
                }
                else
                {
                    Column column_testing = new DataColumn() //this will add the column
                    {
                        Name = "Testing",  //name your column for Power BI
                        DataType = DataType.String, //set the data type
                        SourceColumn = "Product" //this must match the name of the column your source 
                    };
                    table_product.Columns.Add(column_testing);
                    Console.WriteLine($"Column Created!");
                }

                //List out the data sources in the model
                foreach (DataSource datasource in model.DataSources)
                {
                    Console.WriteLine($"Datasource : {datasource.Name}");
                }

                // Add a measure if it doesn't already exist in a specified table called Product
                if (table_product.Measures.ContainsName("Test Measure"))
                {
                    Measure measure = table_product.Measures["Test Measure"];
                    measure.Expression = "\"Hello there\""; //you can update an existing measure using this script
                    Console.WriteLine($"Measure Exists");
                }
                else
                {
                    Measure measure = new Measure()
                    {
                        Name = "Test Measure",
                        Expression = "\"Hello World\"" //you can also use DAX here
                    };
                    table_product.Measures.Add(measure);
                    Console.WriteLine($"Measure Added");
                }


//------------------- Manipulating Hierarchies ---------------------//

            //Editing an existing hierarchy originally called Category Hierarchy
            Hierarchy hierarchy_category = table_product.Hierarchies.ContainsName("Category Hierarchy")
                ? table_product.Hierarchies["Category Hierarchy"] //first run: grab it by its original name
                : table_product.Hierarchies["Category Hierarchy Rename"]; //later runs: it has already been renamed
            hierarchy_category.Name = "Category Hierarchy Rename"; //this renames the hierarchy; note the lineage tag will remain unchanged
            Console.WriteLine($"Category Hierarchy Renamed");
            
            //Editing existing hierarchy levels (the lineage tags below are specific to this model)
            Level level_Category = hierarchy_category.Levels.FindByLineageTag("ca792793-d3c3-4b5d-9bee-2b46c01833bb");
            Level level_Subcategory = hierarchy_category.Levels.FindByLineageTag("4304e645-bb8f-4d7e-b25c-f629be2110d8");
            level_Category.Name = "Category";
            level_Category.Ordinal = 0;
            level_Subcategory.Name = "Subcategory";
            level_Subcategory.Ordinal = 1;
            Console.WriteLine($"Category Hierarchy Levels Renamed & Reordered");

            //Adding a new level to the hierarchy if it doesn't already exist
              if (hierarchy_category.Levels.ContainsName("Model"))
              {
                  Console.WriteLine($"Hierarchy Level Exists");
              }
              else
              {
                  Level level_Model = new Level()
                  {
                      Name = "Model",
                      Ordinal = 2,
                      Column = table_product.Columns.Find("Model")
                  };
                  hierarchy_category.Levels.Add(level_Model);
                  Console.WriteLine($"Hierarchy Level Added");
              }
          
 
             //Add a new hierarchy if it doesn't already exist
               if (table_product.Hierarchies.ContainsName("New Hierarchy"))
               {
                   Console.WriteLine($"New Hierarchy Exists");
               }
               else
               {
                   Hierarchy hierarchy_new = new Hierarchy()
                   {
                       Name = "New Hierarchy",
                   };
                   table_product.Hierarchies.Add(hierarchy_new);
                   Console.WriteLine($"Hierarchy Added");
                   //Creating levels to add to the new hierarchy
                   Level level_one = new Level()
                   {
                       Name = "Model",
                       Ordinal = 0,
                       Column = table_product.Columns.Find("Model")
                   };
                   Level level_two = new Level()
                   {
                       Name = "Product",
                       Ordinal = 1,
                       Column = table_product.Columns.Find("Product")
                   };
                   hierarchy_new.Levels.Add(level_one);
                   hierarchy_new.Levels.Add(level_two);
                   Console.WriteLine($"Levels added to new hierarchy");
               }

         
//-------------------------- Translations ------------------------------//

//Translations can be used to rename existing columns without rebuilding the model. This also updates any visuals that use that column.          

            // List of translations on the model
            foreach (Culture culture in model.Cultures)
            {
                Console.WriteLine($"Existing Culture: {culture.Name}");
            }
            // Let's get a list of the existing translations within the en-US culture
            Culture enUsCulture = model.Cultures.Find("en-US");

            foreach (ObjectTranslation objectTranslation in enUsCulture.ObjectTranslations)
            {
                Console.WriteLine($"Translated Object: {objectTranslation.Value}");
            }
            // Narrow down what column within this culture/language you would like to add the translation to
            MetadataObject dataColumn = table_product.Columns.Find("SKU"); //this needs to always be the original column name within the data model.
            ObjectTranslation proposedTranslation = enUsCulture.ObjectTranslations[dataColumn, TranslatedProperty.Caption];

            // Only one translation per entity per culture.
            if (proposedTranslation != null)
            {
               Console.WriteLine($"Translation Exists for this Culture & Column combo");
               enUsCulture.ObjectTranslations.Remove(proposedTranslation); //need to remove the existing translation to overwrite it
               ObjectTranslation overwriteTranslation = new ObjectTranslation()
               {
                   Object = dataColumn,
                   Property = TranslatedProperty.Caption,
                   Value = "Cool Stuff"
               };
               enUsCulture.ObjectTranslations.Add(overwriteTranslation);
            }
            else
            {
               ObjectTranslation newTranslation = new ObjectTranslation()
               {
                   Object = dataColumn,
                   Property = TranslatedProperty.Caption,
                   Value = "Total Rad"
               };
               enUsCulture.ObjectTranslations.Add(newTranslation);
            }             

           // List out the translations to see what they are now that we have run the script    
           foreach (ObjectTranslation objectTranslation in enUsCulture.ObjectTranslations)
           {
               Console.WriteLine($"Final Translated Object: {objectTranslation.Value}");
           }

//------------------- List out end state --------------------------------//

            // List out the columns in the product table one more time to make sure our column is added
            foreach (Column column in table_product.Columns)
            {
                Console.WriteLine($"Columns: {column.Name}");
            }         

            // List out the measures in the product table one more time to make sure our measure is added
            foreach (Measure productmeasures in table_product.Measures)
            {
                Console.WriteLine($"Measures: {productmeasures.Name}");
            }

            // List out the hierarchies in the product table
            foreach (Hierarchy hierarchy in table_product.Hierarchies)
            {
                Console.WriteLine($"Hierarchies: {hierarchy.Name}, Lineage Tag: {hierarchy.LineageTag}");
                foreach (Level level_hierarchy in hierarchy.Levels)
                {
                    Console.WriteLine($"Level {level_hierarchy.Ordinal}: {level_hierarchy.Name}, Lineage Tag: {level_hierarchy.LineageTag} from {level_hierarchy.Column.Name}");
                }
            }

//-------------- Refresh our version of the data model then push/save changes back to the model in the PBI service -------------//
            table_product.RequestRefresh(RefreshType.Full);
            model.RequestRefresh(RefreshType.Full);
            model.SaveChanges(); 
            //make sure this is the last line! Note, this line will not work unless your dataset has proper data source credentials connected in the service

            Console.WriteLine($"Script Complete!");


        }
    }
}

SQL Saturday #492 Follow Up – A Window into Your Data


Thanks for attending my session on window functions in T-SQL. I hope you learned something you can take back and use in your projects or at your work. You will find a link to the session and the code I used below. If you have any questions about the session, post them in the comments and I will try to get you the answers.

Questions and Comments

  1. Does RATIO_TO_REPORT exist in SQL Server? It does in Oracle.
    • Currently this function is not available in SQL Server
    • Here is the equivalent functionality using existing functions in SQL Server:
      • OrderAmt / SUM(OrderAmt) OVER (PARTITION BY OrderDate)
      • This example can be used with the demo code referenced below. It takes the current row's value as the numerator and the sum over the partition as the denominator. While SQL Server lacks a single function for this, the equivalent is still fairly simple using window functions; see the sketch after this list.
  2. Demo issues with Azure SQL Database
    • During the session I ran into an issue with Azure SQL Database. It turns out that the following two functions are not supported there.
      • PERCENTILE_CONT
      • PERCENTILE_DISC
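
To make the RATIO_TO_REPORT workaround concrete, here is a minimal sketch. It assumes a hypothetical Orders table with OrderID, OrderDate, and OrderAmt columns; the CAST guards against integer division in case OrderAmt is a whole-number type.

SELECT OrderID,
       OrderDate,
       OrderAmt,
       CAST(OrderAmt AS decimal(18, 4))
           / SUM(OrderAmt) OVER (PARTITION BY OrderDate) AS RatioToReport
FROM dbo.Orders;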

Slides, Code, and Follow Up Posts

The presentation can be found here: A Window into Your Data

The code was put into a Word document that you can get here: TSQL Window Function Code

This session is also backed by an existing blog series I have written.

T-SQL Window Functions – Part 1- The OVER() Clause

T-SQL Window Functions – Part 2- Ranking Functions

T-SQL Window Functions – Part 3: Aggregate Functions

T-SQL Window Functions – Part 4- Analytic Functions


SQL Saturday #437–Boston BI Edition 2015–You Can Still Analyze Data with T-SQL


Thanks for attending my session on analyzing data with T-SQL. I hope you learned something you can take back and use in your projects or at your work. You will find a link to the session and the code I used below. If you have any questions about the session, post them in the comments and I will try to get you the answers.

The presentation can be found here: Analyzing with TSQL

The code was put into a Word document that you can get here: Code to support the analysis with TSQL Sessions

This session is also backed by an existing blog series I have written.

T-SQL Window Functions – Part 1- The OVER() Clause

T-SQL Window Functions – Part 2- Ranking Functions

T-SQL Window Functions – Part 3: Aggregate Functions

T-SQL Window Functions – Part 4- Analytic Functions


SQL Saturday #453–Minnesota 2015–A Window Into Your Data


Thanks for attending my session on window functions in T-SQL. I hope you learned something you can take back and use in your projects or at your work. You will find a link to the session and the code I used below. If you have any questions about the session, post them in the comments and I will try to get you the answers.

The presentation can be found here: A Window into Your Data

The code was put into a Word document that you can get here: TSQL Window Function Code

This session is also backed by an existing blog series I have written.

T-SQL Window Functions – Part 1- The OVER() Clause

T-SQL Window Functions – Part 2- Ranking Functions

T-SQL Window Functions – Part 3: Aggregate Functions

T-SQL Window Functions – Part 4- Analytic Functions


Power Testing ETL with Power BI – Creating the Tests with Power Pivot

This is the second deep dive into Power Testing ETL with Power BI. At this point, we have created the source table that will be used in our testing. The next step is to bring in the destination table and create the tests that will be “run” against the data. In its simplest form, a test is a logical condition that checks whether source data matches destination data and whether calculations applied to those data sets also match. When they don't match, you have a data load error, which results in a failed test.

How to Calculate Success and Failure

The basic approach is to turn the results into numbers and calculate whether, and by how much, we succeeded or failed. Typically, every test will result in a 1 or a 0. Whether you assign 1 to success or to failure largely depends on how you plan to display your results. If you plan to use the KPIs built into the Power Pivot model, you will be comparing the number of successful tests against the number of rows expected to be imported, primarily because you cannot target zero when using KPIs. In this scenario, successful tests result in a 1 and are therefore easily compared to the number of expected rows, which would be 100% successful if they matched.

The other scenario is to measure failures. In this case, we assign 1 to each failed test and count the number of failed tests. This is easily handled in visualizations such as conditional formatting, where 0 can be displayed as green and a growing number of failures changes the state from green to yellow and then red. This helps identify the most commonly failed tests.

The method you choose is up to you and how you prefer to see the results. We will cover both variations in the visualizations, but for the sake of brevity here, we will measure success against our row count. Success = 1; Failure = 0.
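
As a concrete illustration, here is a minimal sketch of the two conventions in Power Pivot DAX. It borrows the table and column names introduced later in this post (Media Library – Books and Book Name Matches) plus an assumed Book Count measure; treat these as illustrative, not the exact measures from the workbook.

Book Count:=COUNTROWS('Media Library – Books')
Passed Tests:=SUM('Media Library – Books'[Book Name Matches])
Failed Tests:=[Book Count]-[Passed Tests]

With the success convention, Passed Tests is compared to Book Count in a KPI; with the failure convention, Failed Tests drives conditional formatting, where anything above 0 changes the state.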

Creating the Power Pivot Tests

In order to create the tests, you need to open the Power Pivot window and add the destination table to the model. In our case, we have created a table called Books in the HughesMediaLibrary database as our target. Here is the syntax for the target table.

CREATE TABLE dbo.Books(
BookID int IDENTITY(1,1) NOT NULL
CONSTRAINT pk_Books PRIMARY KEY CLUSTERED,
BookName varchar(100) NOT NULL,
Publisher varchar(100) NULL,
Genre varchar(50) NULL,
CopyrightYear smallint NULL,
AuthorFName1 varchar(100) NULL,
AuthorLName1 varchar(100) NULL,
AuthorFName2 varchar(100) NULL,
AuthorLName2 varchar(100) NULL,
AuthorFName3 varchar(100) NULL,
AuthorLName3 varchar(100) NULL,
AuthorFName4 varchar(100) NULL,
AuthorLName4 varchar(100) NULL,
AuthorFName5 varchar(100) NULL,
AuthorLName5 varchar(100) NULL,
PageCount int NULL
)

While I realize this is not a well-normalized table, it serves our purposes well for building out the tests. This table needs to be added to the Power Pivot model before we can do the next steps.

Relating the Source and Destination

The next step is to relate the source and destination. In our case, the only data that will work is the book name. We will use the Source table as the primary table in this relationship. The idea is that all the data in the source table should exist in the target; since that is not always the case, the source serves as the “source of truth” for the testing scenario.

Building the Tests

The tests are composed of calculated columns, which handle the row-level data analysis, and calculated measures, which summarize the results.

Validating Data Field by Field, Row by Row

This is the primary reason that we worked with Power BI. One of the most common testing scenarios is whether the data came over correctly. In the previous post, we shaped the data with Power Query. Now we will compare it with the results from our ETL process in SSIS. We will use Book Name as the example. Every field you wish to test can follow this pattern. The test consists of a calculated column and a calculated measure.

We create a column in the destination table called Book Name Matches. (Remember, we are tracking success, not failures.) In each row of the data we need to determine whether the book name in the destination is an exact match for the book name in our source. We used the following DAX for that calculation:

=IF(RELATED('Booklist Source Fixes'[BookName])='Media Library – Books'[BookName],1,0)

It looks at the related table to determine whether the field values match. If they match, the test is assigned a 1 for that row; if not, a 0. (The table names are how I named the source and destination; they may not match your solution if you are following along.) Once we have the rows evaluated, we will sum the values with a Book Name Matches measure:

Book Name Matches (34):=SUM('Media Library – Books'[Book Name Matches])

We will use the Book Name Matches (34) measure to compare with the book count. If they match, all tests passed. If they do not, then some or all rows have failed.

The number after the measure, 34, is the test case key from TFS. I added it to the measure name to make it easier to identify which test case the measure evaluates. In some cases, you may have multiple measures that are required to complete a test. You can either evaluate them independently or create an additional measure that summarizes them for your use, as sketched below.
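
For example, if test case 34 required the name match to cover every row, a summary measure might look like the following sketch (Book Count is an assumed measure that counts the destination rows):

Test 34 Passed:=IF([Book Name Matches (34)]=[Book Count],1,0)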

Other Validations or Tests

Some other basic validations can be created as well. A common one would be the book count. In my scenario, I return the book count and then evaluate it using a KPI. Another way to do this is to add another measure that checks for equality between the two book count measures in the source and destination. If they match, success. If not, failure.
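
Here is a sketch of that equality check as a single measure, using the table names from earlier in this post:

Book Count Matches:=IF(COUNTROWS('Media Library – Books')=COUNTROWS('Booklist Source Fixes'),1,0)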

You can also use measures to validate expected totals the same way we were working with counts. This is particularly helpful in financial data loads, where you would want to verify a sum of balances to make sure the results match. The point is that you can add any other measures you want to compare in order to meet the unique needs of your situation. You can also compare to entered values: if you know that 100 widgets are to be imported, you can have the measure evaluate against 100 instead of a measure in the source.
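
Here is a sketch of both variations. It assumes the source table also carries a PageCount column (the destination has one per the DDL above; the source column name is my assumption), and Widget Count stands in for whatever count measure you are validating against the entered value:

Page Count Total Matches:=IF(SUM('Media Library – Books'[PageCount])=SUM('Booklist Source Fixes'[PageCount]),1,0)
Widget Count Correct:=IF([Widget Count]=100,1,0)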

Recording the Results in TFS

In order to bring the process full circle, we enter test results into TFS or Visual Studio Online. This gives us the ability to track test results, bugs, and fixes in a development lifecycle tool, and it is also the best way to track testing history. One caveat here is that query results from TFS do not permit you to set test results in Excel. Ideally, we would be able to link the tests to the results, update the results in the query, and push them back; this is NOT supported at the moment. As a result, you will need to open the tests in TFS to update your results. This is not a significant issue, because you should also create bugs for failed tests; it’s primarily a nuisance.

An added side effect of using this method to test is that we are able to collaborate with developers to determine what the bug actually is. Because all the data is loaded into Excel, reviewing results is fairly simple and may actually be easier than trying to look at the destination system.

Quick Look at SSIS

Up to this point, we have focused on how a non-developer can set up the source and destination and proceed to test. I wanted to call out the author name work done in Power Query to highlight why Power BI is a great choice: when splitting author names, the work was done using right-click operations. By contrast, here is an example of the SSIS expression code used to split out the second author name column:

(DT_STR,200,1252)TRIM((FINDSTRING(AuthorNames,",",1) == 0 ? NULL(DT_WSTR,200) : TRIM(SUBSTRING(AuthorNames,FINDSTRING(AuthorNames,",",1) + 1,FINDSTRING(AuthorNames,",",2) == 0 ? LEN(AuthorNames) : 1 + LEN(AuthorNames) - FINDSTRING(AuthorNames,",",2)))))

Compared to Power Query, this is complex and not intuitive. While Power Query is not intended for enterprise ETL use, its simplicity helps test complex scenarios, such as our author name split, without having to create an equally complex SQL statement or expression.

The next post will take a look at some of the visualization options for the test results.