Power BI: Dynamically Removing Errors From Columns in M

If you work with Excel data or user-entered data, you have likely experienced the frustration of repeatedly seeing the error message below upon hitting “Close & Apply” or refreshing your data model. It can be exhausting to see failed refresh emails come in every time a user types a text value into a number field or adds a formula that results in #N/A.

End users who are not trained in data governance but are actively involved in maintaining a data set can easily make data entry mistakes or create inconsistent data types within columns. For example, you may have a column in the dataset called “Sales” where, instead of 0, someone types “None” or “NA”. When this gets loaded into Power BI, Power BI will not know how to convert the text value “None” to a number, and it will throw an error on refresh.

One way to mitigate the impact of user-entered data is to replace errors with null values. This is not ideal since it doesn’t fix the data entry issues, but it does enable reports to be refreshed and used while the data issues are addressed. In Power Query, you can manually replace the errors with null by going to the “Transform” tab, selecting the drop-down for “Replace Values”, and choosing “Replace Errors”.

After selecting “Replace Errors”, type null (all lowercase) into the value field to replace your errors with a null value that will allow the report to refresh no matter the data type.
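Behind the scenes, that UI action writes a Table.ReplaceErrorValues step into your query. For a single column, it generates something like the sketch below (assuming your previous step is named #"Changed Type" and the column is named "Sales"):

//Replace errors in the "Sales" column with null
#"Replaced Errors" = Table.ReplaceErrorValues(#"Changed Type", {{"Sales", null}})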

As you can imagine, it can get quite tedious to apply this step to every column in every query in your report. Good news! I have a set of steps you can add to the end of your applied steps that will replace errors with nulls in every single column of your query. Let’s walk through it.

Final Query:

let
    Source = Excel.Workbook(File.Contents("C:\Users\KristynaHughes\OneDrive - DataOnWheels\GitHub\AdventureWorks Sales w Errors.xlsx"), null, true),
    Sales_data_Sheet = Source{[Item="Sales_data",Kind="Sheet"]}[Data],
    #"Promoted Headers" = Table.PromoteHeaders(Sales_data_Sheet, [PromoteAllScalars=true]),
    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"SalesOrderLineKey", Int64.Type}, {"ResellerKey", Int64.Type}, {"CustomerKey", Int64.Type}, {"ProductKey", Int64.Type}, {"OrderDateKey", Int64.Type}, {"DueDateKey", Int64.Type}, {"ShipDateKey", Int64.Type}, {"SalesTerritoryKey", Int64.Type}, {"Order Quantity", Int64.Type}, {"Unit Price", type number}, {"Extended Amount", type number}, {"Unit Price Discount Pct", Int64.Type}, {"Product Standard Cost", type number}, {"Total Product Cost", type number}, {"Sales Amount", type number}, {"Large Sales", type number}}),
    //ColumnsInTable grabs all the column names in your existing table dynamically
    ColumnsInTable = Table.ColumnNames(#"Changed Type"), 
    //Converting this list of columns to a table lets us add a column for what we will replace errors with
    #"Converted to Table" = Table.FromList(ColumnsInTable, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    //This adds a column to show what we will replace errors with
    #"Added ReplacementColumn" = Table.AddColumn(#"Converted to Table", "Replacement", each null),
    #"Transposed Table" = Table.Transpose(#"Added ReplacementColumn"),
    //Transforms the table back into a list so we can use it in our replace error step
    ColumnList = Table.ToColumns(#"Transposed Table"),
    //Takes our last step in our original table called Changed Type and replaces all errors in all our columns with null
    #"Replaced Errors" = Table.ReplaceErrorValues(#"Changed Type",ColumnList)
in
    #"Replaced Errors"

In the example above, #"Changed Type" is the last step in my original query, but it resulted in some errors in my table (see screenshot below).

In order to apply a replace-errors step to all the columns in the table, we need to understand a bit more about the Table.ReplaceErrorValues function. This function takes two arguments: (table as table, errorReplacement as list). The table argument will be the name of our last step, because that is the final table we want to manipulate (#"Changed Type" in this example). Each entry in the errorReplacement list requires two values: the column name and the value we want to replace errors with.
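For reference, hard-coded for just two columns, the finished call would look something like this sketch (using column names from the query above):

Table.ReplaceErrorValues(#"Changed Type", {{"Sales Amount", null}, {"Order Quantity", null}})

Hard-coding every column like that gets tedious fast. To build the list dynamically for every column, follow the steps below.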

1. Table.ColumnNames(#"Changed Type") : this function looks at the last step in your query (#"Changed Type" in this example) and generates a list of its column names.

2. Table.FromList(ColumnsInTable, Splitter.SplitByNothing(), null, null, ExtraValues.Error) : this function converts our list of columns into a table. The reason we convert the list to a table and then back to a list is so we can easily add a column of nulls alongside the column names.

3. Table.AddColumn(#"Converted to Table", "Replacement", each null) : this function adds a column called "Replacement" that contains the null values we want to replace our errors with.

4. Table.Transpose(#"Added ReplacementColumn") : transposing the table makes it easier to shape our values into lists. We need each final entry to be a two-item list containing the column name followed by the replacement null value.

5. Table.ToColumns(#"Transposed Table") : this function flips our transposed table into a list of lists that we can use in our final function. By clicking on a cell in the list of lists, you can see that each inner list contains a column name and its replacement value (see the sketch after this list). Pretty handy for our replace-errors function.

6. Table.ReplaceErrorValues(#"Changed Type", ColumnList) : time to replace our errors. Notice that we pass #"Changed Type" as the first argument. This tells Power Query to use the last step of our original table, not the helper table we just built. ColumnList is the name of the step that contains our list of lists with the column names and null values.
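For reference, in this example ColumnList evaluates to a list of lists shaped like this (truncated to the first few columns):

{
    {"SalesOrderLineKey", null},
    {"ResellerKey", null},
    {"CustomerKey", null},
    {"ProductKey", null}
    //...and so on for every remaining column in the table
}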

And there ya have it! To use the query above, you’ll need to swap #"Changed Type" for the name of the final step in your original query. This is pretty handy when dealing with user-entered data, but keep in mind the ideal solution is to put more data governance around the original data entry. For example, a simple application that forces people to enter the correct data type, or a drop-down of valid options (a great option for zip codes or states), would prevent data entry issues in the first place and raise the quality of your data.


Power BI DAX Getting the Value of Previous Non-NULL Row

If you are an avid report designer or user, you may have wanted to see the percent difference between one row and the previous one. Sounds easy, right? Well, let’s break it down. Percent difference follows a consistent formula: (current value – previous value)/previous value. Current value? Easy peasy. Previous value? That’s where this blog comes into play. It’s fairly simple to get the previous month or previous date period value because Power BI knows exactly what the filter should be (September minus one month is always August). However, there may not have been any invoices in August, so how can we compare September with the last month that had an invoice?

Let’s set the scene for this example. A business user has requested to compare each day’s orders with the previous day that had orders. To get the previous day with orders, we can create a measure called Last Day Order Quantity (see below). Within this measure, you’ll notice we pull the Order Quantity measure into a variable. Capturing it in a variable saves its value for the current row, so we can make sure we are only looking at rows that have orders. The next variable (LastDayOrdered) filters the ship date table to grab the last date where (1) there are orders (Order Quantity is not blank) and (2) the current day (aka the row we are currently on) also has orders. The final variable does the work of calculating our order quantity for the last day that contained orders.

Last Day Order Quantity = 
VAR Orders = [Order Quantity] //grab the order quantity for the current row
VAR LastDayOrdered = MINX(FILTER(ALL('ShipDate'),'ShipDate'[DaysAgo] > MAX('ShipDate'[DaysAgo]) && [Order Quantity] <> BLANK() && Orders <> BLANK()),'ShipDate'[DaysAgo])
VAR LastDayOrders = CALCULATE([Order Quantity],FILTER(ALL('ShipDate'),'ShipDate'[DaysAgo]=LastDayOrdered))
RETURN  
LastDayOrders

As you can see in the screenshot above, there is a gap in ship dates between 1/25/2014 and 1/28/2014, and the Last Day Order Quantity measure pulls in the amount from 1/25/2014 (1) instead of grabbing it from 1/27/2014 (0). Sweet! Now that makes finding the % difference in order quantity very simple. Below is the full code from this example and also a more parameterized version with tons of comments to help you use it as needed. Happy coding!

Final Measure:

% Difference Order Quantity = 
VAR Orders = [Order Quantity] //grab the order quantity for the current row
VAR LastDayOrdered = MINX(FILTER(ALL('ShipDate'),'ShipDate'[DaysAgo] > MAX('ShipDate'[DaysAgo]) && [Order Quantity] <> BLANK() && Orders <> BLANK()),'ShipDate'[DaysAgo])
VAR LastDayOrders = CALCULATE([Order Quantity],FILTER(ALL('ShipDate'),'ShipDate'[DaysAgo]=LastDayOrdered))
RETURN  
DIVIDE(Orders-LastDayOrders,LastDayOrders,0)

Commented version:

% Difference = 
//Make a variable to grab the value of your measure at your current line (we will use this later)
VAR Desired_Measure = [Order Quantity]

//Now we need to grab the time where this desired measure was not null
//In my example, ship date is the column that I want to see the previous order quantity sliced by
VAR Last_Time_Measure_Had_Value = 
    // I use MINX here to ensure I get the previous day, not the next day. Feel free to flip this as desired.
    MINX(FILTER(ALL('ShipDate') 
        //this makes sure that I'm only grabbing ship dates that are before my current ship date.
        ,'ShipDate'[DaysAgo] > MAX('ShipDate'[DaysAgo])
        //this makes sure that the options for days ago only include ones that we had orders on, AKA the desired measure is not NULL for this day. 
        //DO NOT USE Desired_Measure here because desired_measure will only look at your current row and we are trying to look at all the rows to make sure we are grabbing a non-NULL one. 
        &&[Order Quantity] <> BLANK() 
        //this checks that the current row is not blank. 
        && Desired_Measure<>BLANK())
    //I need this variable to return the smallest number of days ago (hence the MINX at the beginning) that meets the criteria above
    //For your use, you should swap daysago with whatever field you are hoping to slice and dice by
    ,'ShipDate'[DaysAgo])

//This final variable calculates my desired measure (order quantity) and filters my table for the last time that measure had a value.
VAR Last_Instance_Of_Calculated_Measure = CALCULATE([Order Quantity],FILTER(ALL('ShipDate'),'ShipDate'[DaysAgo]=Last_Time_Measure_Had_Value))

//Easy peasy now! We take our current day's orders minus the number of orders on the last day we had orders, divided by the number of orders on the last day we had orders, to get the % growth day over day. Phewf!
RETURN  
DIVIDE(Desired_Measure - Last_Instance_Of_Calculated_Measure,Last_Instance_Of_Calculated_Measure,0)

Power BI Adding Translations to Rename Columns – XMLA, TOM, C#

If you are new to using C# and the Tabular Object Model (TOM), please check out the previous blog post (https://dataonwheels.wordpress.com/2021/10/15/power-bi-meets-programmability-tom-xmla-and-c/) for both an introduction to the topic and detailed instructions on getting the demo stood up.

For the TOM and XMLA experts, imagine this: your customer wants to dynamically rename columns without using Power BI Desktop and would prefer that existing report visuals not break because of the new name. Impossible? Not with TOM, XMLA, and translations within Power BI.

If you’ve ever tried to change a column name in a Power BI source, you’ve likely run into this error on any visuals that contained the renamed column. And when you hit “See Details”, it tells you that the column you renamed is no longer available for your visual.

So how do we get around that? Translations. Translations are typically used to translate report entities into other languages, changing depending on what language the end user has set in their browser. However, we can hijack this functionality to rename columns without having to impact the data model. It can be confusing why this works, so imagine this: you build a Lego pyramid, but learn that one of the blocks needs to be changed from blue to green. You have a couple of options: you can take apart the entire pyramid (akin to reopening the PBIX in Power BI Desktop and changing all of your visuals), or you can take a green marker and color that blue brick green (adding a translation from blue to green).

If you don’t need to put this code into C#, Tabular Editor is an excellent tool for adding translations to your data model (https://tabulareditor.com/creating-multilingual-power-bi-datasets/). However, if you would like to programmatically update column names using C#, feel free to use the script below in your solution.

At a high level, here’s the hierarchy of entities used:
Workspace – Dataset – Data Model – Cultures – Object Translations
Workspace – Dataset – Data Model – Table – Column – Translated Properties

Note: There can only be one translation per object property per culture.

To add translations, we first need to set which culture this translation belongs in. For this example, we will use “en-US” because that is the default browser language we want these names applied to. The code snippet below will list out all the cultures (aka website language codes) that are configured in this data model and all the translated objects (data columns in this case) that already exist.
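Here is that snippet, excerpted from the full script at the end of this post:

// List of translations on the model
foreach (Culture culture in model.Cultures)
{
    Console.WriteLine($"Existing Culture: {culture.Name}");
}

// Get a list of the existing translations within the en-US culture
Culture enUsCulture = model.Cultures.Find("en-US");

foreach (ObjectTranslation objectTranslation in enUsCulture.ObjectTranslations)
{
    Console.WriteLine($"Translated Object: {objectTranslation.Value}");
}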

After setting the culture/language, narrow down the column that this translation will be applied to and create a variable for the translation object. The translation object consists of two parts: the metadata object (a column in this example) and the property of that metadata that we want to translate (the caption in this example, which is essentially the display name).
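From the full script, those lines look like this; table_product and enUsCulture come from the earlier steps, and "Description" is the original column name in the data model:

// Narrow down which column within this culture/language the translation applies to
MetadataObject dataColumn = table_product.Columns.Find("Description"); //always the original column name within the data model
ObjectTranslation proposedTranslation = enUsCulture.ObjectTranslations[dataColumn, TranslatedProperty.Caption];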

Once we have these elements, we can check to see if this column already has a translation for this culture. If it does, this script will remove the old translation to allow for overwriting. If it does not, it will add the new translation to the culture within the data model.
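Condensed slightly from the full script, the overwrite-or-add logic looks like this ("Blue" is the new display name being applied to the column):

// Only one translation is allowed per entity per culture
if (proposedTranslation != null)
{
    Console.WriteLine($"Translation Exists for this Culture & Column combo");
    enUsCulture.ObjectTranslations.Remove(proposedTranslation); //remove the existing translation so it can be overwritten
}
ObjectTranslation newTranslation = new ObjectTranslation()
{
    Object = dataColumn,
    Property = TranslatedProperty.Caption,
    Value = "Blue" //the new display name for the column
};
enUsCulture.ObjectTranslations.Add(newTranslation);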

And that’s it!

Here’s what it looks like in the service. Don’t forget to refresh your report page if you have it open for the new name to appear. There’s no need to refresh the dataset.

Full C# code:

using System;
using Microsoft.AnalysisServices.Tabular;



namespace PowerBI_TOM_Testing
{
    class Program
    {
        static void Main()
        {

            // create the connect string - powerbi://api.powerbi.com/v1.0/myorg/WORKSPACE_NAME
            string workspaceConnection = "powerbi://api.powerbi.com/v1.0/myorg/YOURWORKSPACE";
            string connectString = $"DataSource={workspaceConnection};";

            // connect to the Power BI workspace referenced in connect string
            Server server = new Server();
            server.Connect(connectString);
            // enumerate through datasets in workspace to display their names
            foreach (Database database in server.Databases)
            {
                Console.WriteLine($"ID : {database.ID}, Name : {database.Name}, CompatibilityLevel: database.CompatibilityLevel}, Last Updated : {database.LastSchemaUpdate}");
            }
            
            // enumerate through tables in one database (use the database ID from previous step)
            Model model = server.Databases["bb44a298-f82c-4ec3-a510-e9c1a9a28af2"].Model; 
            
            //if you don't specify a database, it will only grab models from the first database in the list
            foreach (Table table in model.Tables)
            {
                Console.WriteLine($"Table : {table.Name} IsHidden? : {table.IsHidden}");

            }
           
            // Specify a single table in the dataset
            Table table_product = model.Tables["Product"];

            
            
            // List out the columns in the product table
            foreach (Column column in table_product.Columns)
            {
                Console.WriteLine($"Columns: {column.Name}");
             }


            //Translations can be used to rename existing columns without rebuilding the model. This also updates any visuals that use that column. 
            // List of translations on the model
            foreach (Culture culture in model.Cultures)
            {
                Console.WriteLine($"Existing Culture: {culture.Name}"); 
            }

            // Let's get a list of the existing translations within the en-US culture
            Culture enUsCulture = model.Cultures.Find("en-US");
            
            foreach (ObjectTranslation objectTranslation in enUsCulture.ObjectTranslations) 
            {
                Console.WriteLine($"Translated Object: {objectTranslation.Value}");
            }
            // Narrow down what column within this culture/language you would like to add the translation to
            MetadataObject dataColumn = table_product.Columns.Find("Description"); //this needs to always be the original column name within the data model.
            ObjectTranslation proposedTranslation = enUsCulture.ObjectTranslations[dataColumn, TranslatedProperty.Caption];

            // Only one translation per entity per culture.
            if (proposedTranslation != null)
            {
                Console.WriteLine($"Translation Exists for this Culture & Column combo");
                enUsCulture.ObjectTranslations.Remove(proposedTranslation); //need to remove the existing translation to overwrite it
                ObjectTranslation overwriteTranslation = new ObjectTranslation()
                {
                    Object = dataColumn,
                    Property = TranslatedProperty.Caption,
                    Value = "Blue"
                };
                enUsCulture.ObjectTranslations.Add(overwriteTranslation);
            }
            else
            {
                ObjectTranslation newTranslation = new ObjectTranslation()
                {
                    Object = dataColumn,
                    Property = TranslatedProperty.Caption,
                    Value = "Blue"
                };
                enUsCulture.ObjectTranslations.Add(newTranslation);
            }

            

            // List out the translations to see what they are now that we have run the script    
            foreach (ObjectTranslation objectTranslation in enUsCulture.ObjectTranslations)
            {
                Console.WriteLine($"Final Translated Object: {objectTranslation.Value}");
            }

            model.SaveChanges(); //make sure this is the last line!
       


        }
    }
}

Additional Resources:

https://www.kasperonbi.com/setting-up-translations-for-power-bi-premium/
https://tabulareditor.com/creating-multilingual-power-bi-datasets/
https://www.sqlbi.com/tools/ssas-tabular-translator/
https://docs.microsoft.com/en-us/analysis-services/tabular-models/translations-in-tabular-models-analysis-services?view=asallproducts-allversions
https://docs.microsoft.com/en-us/dotnet/api/microsoft.analysisservices.tabular.culture?view=analysisservices-dotnet
https://docs.microsoft.com/en-us/dotnet/api/microsoft.analysisservices.tabular.culture.objecttranslations?view=analysisservices-dotnet#Microsoft_AnalysisServices_Tabular_Culture_ObjectTranslations

Saving the Day from Delay: Recap & Series Kick-Off

Ironically, I’m a few months behind on a recap for my presentation on building and using a sustainable, dynamic date query in M code – Saving the Day from Delay. On a positive note, the delay means I can do a series on the importance of date tables, how to effectively use them in DAX, and what I mean by sustainable and dynamic practices. At the end of this post is a link to a GitHub repository containing my most up-to-date M query referenced in this series and the PowerPoint used in the aforementioned presentations.

Presentation Recap

Using a date query, whether from a SQL table or a Power Query (M) table, can aid in optimizing and honing your data model. A main culprit behind sluggish Power BI data models is the native date hierarchy. Native date hierarchies are the default setting in Power BI and provide hierarchies you can use in visuals for easy drilldowns through year, quarter, month, and day. You can turn off this default under File > Options > Current File > Data Load (see below). You can also do this in global settings, but I recommend doing it file by file, as the benefits reach diminishing returns on datasets containing under 500,000 rows.

Where to change your default settings

If you’re unsure what I mean by native date hierarchy, the image below contains an example of a date hierarchy built by Power BI. To rebuild a date hierarchy from a standard date table, drag and drop columns on top of each other to create a custom date hierarchy. By right-clicking on the hierarchy title, you can rename and reorder your hierarchy accordingly. One of the biggest reasons to turn off the native date hierarchy is that you end up with a single date table in your data model instead of a hidden DAX date table for every date field in your model (I’ve seen some models with up to 15 date fields!).

Native Date Hierarchy

In subsequent posts, I’ll cover the benefits of turning off this feature, reconnecting your data model, how to customize the date query, custom columns in M, and best practices for a lean, mean data model machine.
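As a preview, a bare-bones date query in M might start like the sketch below. This is a simplified stand-in, not the full query from the GitHub link that follows, and the start date is a placeholder you would adjust:

let
    StartDate = #date(2020, 1, 1),
    EndDate = Date.From(DateTime.LocalNow()),
    DayCount = Duration.Days(EndDate - StartDate) + 1,
    //Generate one row per day between StartDate and EndDate
    Dates = List.Dates(StartDate, DayCount, #duration(1, 0, 0, 0)),
    DateTable = Table.FromColumns({Dates}, {"Date"}),
    #"Changed Type" = Table.TransformColumnTypes(DateTable, {{"Date", type date}}),
    //Add the columns that feed a custom year/quarter/month/day hierarchy
    #"Added Year" = Table.AddColumn(#"Changed Type", "Year", each Date.Year([Date]), Int64.Type),
    #"Added Quarter" = Table.AddColumn(#"Added Year", "Quarter", each "Q" & Text.From(Date.QuarterOfYear([Date])), type text),
    #"Added Month" = Table.AddColumn(#"Added Quarter", "Month Name", each Date.MonthName([Date]), type text)
in
    #"Added Month"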

https://github.com/AnytsirkGTZ/DateTable_MCode/blob/main/MQuery%20Calendar

Congratulations to Pragmatic Works – Microsoft Partner of the Year Finalists, Again

Pragmatic Works has done it again. Microsoft has recognized us for our work this year with two finalist awards: Power BI, and PowerApps and Power Automate. This recognizes work we have done for our customers with these products.

This follows awards over the last three years for Data Analytics (2019) and Data Platform (2017). I am proud to work for a great company that strives to be an industry leader in data and analytics in the cloud and on-premises. I am truly excited to see where we go from here!