Power BI: Data Quality Checks Using Python & SQL

Picture this: you have a report in Power BI that someone passes off to you for data quality checks. There are a few ways to make sure your measures match what is in the source data system, but for this demo we are going to use Python, SQL, and Excel to perform our data quality checks in one batch. To do that, we are going to build a Python script that can run the Power BI REST APIs, connect to a SQL Server, and connect to Excel to grab the formulas and push the quality check results back into Excel for final review. To find a sample Excel file and the final Python script, please refer to my GitHub.

To build this, you’ll need a few pieces:

  1. A way to run the Power BI REST APIs
    • Requires an Azure AD app – if you can’t create one yourself, you will need someone in your company to create one and give you its client ID along with credentials to use it with
  2. Ability to run python scripts (Visual Studio Code will be used in this demo)
  3. Access to a Power BI dataset that is published to the Power BI service (Premium is not needed)
  4. Access to a SQL Server
  5. A list of DAX functions that you want to test in Excel
  6. A list of SQL queries that should have the same result as your DAX functions in Excel

Authentication

To start, you’ll need to register an application for Power BI within Azure. First, create & register your app in the Azure portal. This link will take you through the manual process step-by-step, including how to add permissions so the application can access the Power BI service. You’ll need to give your app the delegated permission Dataset.ReadWrite.All or Dataset.Read.All, as well as Tenant.Read.All or Tenant.ReadWrite.All, which will require an admin to grant (check the required scope in the API documentation). One important limitation for accessing the Azure AD app: the user account you use to access it must not have MFA enabled. If that makes your IT security team squeamish, remind them that the app can be given read-only access to only Power BI’s metadata. No proprietary data is at risk, and write back is not necessary to accomplish our goals.

Let’s test out running the REST API using a Python script! To do this, we will first generate an access token, then call the REST API using that token.

Adding Libraries in Python

If you have not used some of the libraries that will be imported for this demo, you will need to run “pip install [library name]” in your command prompt. If you don’t have pip, follow the directions in this link and/or modify your Python installation to include pip. Newer versions of Python often enable pip by default, but double check that you have added Python to your environment variables, or you will get errors saying Python isn’t installed when you try to run commands in the command prompt.
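For reference, the libraries used throughout this post can be installed in one pass (assuming pip is already on your PATH; json comes with the standard library):

pip install adal requests pymssql pandas openpyxl pypiwin32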

Getting the REST API Token

Alrighty, let’s go to Visual Studio Code and build out the authentication portion of our Python script. We are going to use the adal library to make a REST API call that will get the token for us. Make sure to put single or double quotes around your Client ID, Username, and Password to format them as text. Keep in mind, our final code will get these values from the Excel file instead of hard coding them into our code. That also means you can keep these values in a separate Excel file if your IT team would prefer to keep them out of the final code (which is highly recommended).

import adal
authority_url = 'https://login.windows.net/common'
resource_url = 'https://analysis.windows.net/powerbi/api'
client_id = '<INSERT CLIENT ID>' #wrap the values in quotes so they are treated as text
username = '<INSERT USERNAME>'
password = '<INSERT PASSWORD>'
context = adal.AuthenticationContext(authority=authority_url,
                                     validate_authority=True,
                                     api_version=None)
token = context.acquire_token_with_username_password(resource=resource_url,
                                                     client_id=client_id,
                                                     username=username,
                                                     password=password)
access_token = token.get('accessToken')
print(access_token)

Awesome, now we have a way to generate an access token to use for our future API calls!

Alrighty, let’s try this out with a simple REST API call – getting a list of the workspaces (called groups in the API for some reason) in my tenant. I only have a couple workspaces, but this will let us test out passing our token to get back a response. Add the following lines to the end of the code above:

import requests #needed for the HTTP call below

get_workspaces_url = 'https://api.powerbi.com/v1.0/myorg/groups'
header = {'Authorization': f'Bearer {access_token}'}
r = requests.get(url=get_workspaces_url, headers=header)
r.raise_for_status()
print(r.text)

You should see something similar to the screenshot below. You’ll notice there are a lot of elements in this response, but we can look at the name section and confirm that the code is working as expected. Exciting stuff!
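If you’d rather not scan the raw JSON, here is a minimal sketch (assuming the standard response shape, where the workspaces come back in a "value" array) that prints just the workspace names:

import json

workspaces = json.loads(r.text).get('value', [])
for workspace in workspaces:
    print(workspace.get('name')) #print just the name of each workspace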

Connecting to a SQL Server database

To connect to SQL, we are going to use the pymssql library. See below for the format we will need to connect to our SQL Server database and run a sample query.

#SQL Server Authentication
import pymssql 

sql_server = 'server.database.windows.net'
sql_user = 'Sample'
sql_password= 'Password'
sql_database = 'DatabaseName'

sql_con = pymssql.connect(sql_server,sql_user,sql_password,sql_database)
sql_cursor = sql_con.cursor(as_dict=True)

#execute SQL script
sql_cursor.execute('SELECT TOP 1 SalesKey FROM dbo.FactSales')
for row in sql_cursor: 
    print(row)

Your result will be a dictionary with the column name as the key and the query result as the value. This will be important when we try to compare it to our Power BI DAX query result.
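For example, a row from the query above comes back shaped roughly like the dictionary below (the value is made up for illustration), and the result can be pulled out either by column name or by position:

sample_row = {'SalesKey': 1} #illustrative shape only, not real data
print(sample_row['SalesKey']) #grab the value by column name
print(list(sample_row.items())[0][1]) #grab the value by position, the approach used later in this post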

Comparing DAX Query with SQL Query

Alrighty, now to the main event! Running a DAX query against our Power BI data model and comparing it to a SQL query to ensure our measure is working as expected. For this demo, I’ll keep it pretty simple. I have a measure in my dataset, Sales Amount = SUM(FactSales[SalesAmount]). This should equal the result of the SQL query SELECT SUM(SalesAmount) 'Sales Amount' FROM FactSales. If you have errors using the Execute Queries REST API, please review the limitations section of the Microsoft Execute Queries documentation.

Python Code to Run SQL Script

#SQL Script Execution
sql_cursor.execute('SELECT SUM(SalesAmount) as SalesAmount FROM dbo.FactSales')
for row in sql_cursor: 
    print(row["SalesAmount"])

Python Code to Run PBI Execute Queries REST API

#PBI Execute Queries 
import json #part of the standard library, used to parse the JSON response below

pbi_dataset_id = '10c3f357-9b2a-42f4-8438-748295b47b9b'
execute_queries_url = f'https://api.powerbi.com/v1.0/myorg/datasets/{pbi_dataset_id}/executeQueries'
dax_query_json_body = {
                "queries":
                        [
                            {
                            "query": 'EVALUATE ROW("Sales_Amount", \'_Measures\'[Sales Amount])'
                            }
                        ]
                        
                }
eq = requests.post(url=execute_queries_url, headers=header,json=dax_query_json_body)
eq.raise_for_status()
query_json = eq.text.encode().decode('utf-8-sig')
print("JSON Response ", json.loads(query_json))

Awesome! Looks like now I can visually confirm that my Power BI DAX measure returns the same result as my SQL query. But what if I want to do more than one at a time? In the next section, we will flip the script to use an Excel file as our input and our output. That way, all you would need to do is have two columns for input – one with the SQL query and one with the corresponding DAX formula. Then the output would be a third column that returns the difference between the two (ideally, 0).
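As a quick aside, the response from the code above can also be parsed with json.loads instead of reading it by eye. This is just a sketch assuming the results/tables/rows structure that the Execute Queries API returns; the variable names are illustrative:

parsed = json.loads(query_json)
#the single-row DAX result lives in the first row of the first table of the first result
pbi_row = parsed['results'][0]['tables'][0]['rows'][0]
pbi_result = list(pbi_row.values())[0]
print(pbi_result)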

Use Excel File as Input and Output

To start, we are going to build an Excel template to use with this script. For ease of use, we are going to use formatted tables so end users can append new measures as needed (a copy is available on our GitHub for download). We will make two sheets – one for credentials and one for the data quality checks. On the credentials sheet, make one table that holds both the SQL and the Power BI credentials (the final script expects columns named SQL_Server, SQL_Database, SQL_User, SQL_Password, PBI_ClientID, PBI_Username, PBI_Password, and PBI_DatasetID). I’m going to name the table “Creds”.

On the Quality Check tab, we will make one more table with columns for the Power BI DAX query, the SQL query, the result of the DAX query, the result of the SQL query, and the difference between the two (the script below expects the query and result columns to be named PBI_DAX_Query, SQL_Query, PBI_Result, and SQL_Result). Let’s avoid using spaces in our column and table names to keep our coding simple. We’ll call this table “Quality”.

Alrighty, now to pull these values into Python! Let’s start with grabbing our creds and pulling them in. For this section, we need to import the openpyxl library and use it to pull in our tables. To start, we are going to open our file, then go to the worksheet named “Credentials”. From there, we will see what tables are in that worksheet.

#Grab data from our excel file
from openpyxl import load_workbook

#Read file
wb = load_workbook(r"C:\Users\KristynaHughes\Downloads\PythonDataQualityChecker.xlsx") 
#we need the r at the beginning since "\" is a special character in python
sheets = wb.sheetnames
print(sheets) #see the sheets in our workbook
#access specific sheet called Credentials
cred_ws = wb['Credentials']
print(cred_ws.title)
cred_tables = cred_ws.tables.values()
for table in cred_tables:
    print(table.displayName)

Next, we are going to loop through the worksheet’s tables and build a mapping object that pairs each table name with a dataframe built from our Excel file. If you’re not familiar with Python, a dataframe is essentially Python’s version of a SQL view or Excel table that can store and manipulate tabular data. Once we get our tables into dataframes, we can start to pull out values as inputs for the code we wrote earlier.

Don’t worry, at the end of this blog is the entire final script.

import pandas #needed to build the dataframe below

cred_mapping = {} #this will help us build a dataframe from our table
for entry, data_boundary in cred_ws.tables.items():
    #parse the data within the ref boundary
    data = cred_ws[data_boundary]
    #extract the data 
    #the inner list comprehension gets the values for each cell in the table
    content = [[cell.value for cell in ent] #ent makes our code cleaner by handling nested data sources
               for ent in data
          ]
    header = content[0]
    #the contents excluding the header (aka column names)
    rest = content[1:]
    #create dataframe with the column names
    #and pair table name with dataframe
    cred_df = pandas.DataFrame(rest, columns = header)
    cred_mapping[entry] = cred_df
print(cred_df)

server = cred_df._get_value(0,"SQL_Server")
db = cred_df._get_value(0,"SQL_Database")
print(server)
print(db)

Sweet, now we can get our SQL_Server and SQL_Database values directly from the spreadsheet instead of hard coding them in Python! To code out the rest of the variables, simply copy the code above and replace “server” and “SQL_Server” with the variable name and column name that you want to return.
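Following that pattern, the rest of the credential variables used later in this post end up looking like the snippet below (the column names match the Creds table read by the final script):

sqluser = cred_df._get_value(0,"SQL_User")
sqlpassword = cred_df._get_value(0,"SQL_Password")
pbiclientid = cred_df._get_value(0,"PBI_ClientID")
pbiusername = cred_df._get_value(0,"PBI_Username")
pbipassword = cred_df._get_value(0,"PBI_Password")
pbidatasetid = cred_df._get_value(0,"PBI_DatasetID")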

Alrighty, now we need to pull in our data quality spreadsheet and set up some for loops that can help iterate through the list of Power BI and SQL queries and return some outputs to our Excel sheet.

Note: Your SQL query will need to have every column named (aliased), or else you will get an error when you run the cursor over it. Also, for this section we will need to import a couple more libraries. If you don’t already have win32com installed, you’ll need to run pip install pypiwin32 in your command prompt.

This script is pretty awesome to see work because we will finally get some values back into our Excel file! The first section uses the win32com library to essentially open Excel in edit mode instead of read-only like we have been doing. After we open the worksheet to edit it, we will grab our SQL queries, then iterate through them to get each result and put it back into Excel. If that doesn’t get the nerd in you excited, I’m not sure what will!

from win32com.client import Dispatch; import os
#Using the win32com library, we can open excel in edit mode
xl = Dispatch("Excel.Application") #opens excel for us to edit
xl.Visible = True
edit_wb = xl.Workbooks.Open(excel_file)
edit_qc_sh = edit_wb.Worksheets("Quality Check")

#Running SQL query and putting result back into Excel
sql_df = quality_df.dropna() #removing any blank rows so we don't run blank sql queries lol
sql_queries = sql_df.loc[:,"SQL_Query"] 
#the first argument (:) signifies which rows we want to index (: indicates all columns), and the second argument lets us index the column we want
#print(sql_queries) #this returns the rows in our sql query column
rownumber = 1
for query in sql_queries:
    sql_cursor.execute(str(query))
    #print (sql_cursor.fetchall()[0])
    for row in sql_cursor:
        #print(row) #this is our dictionary created from the sql query result
        key = list(row.items())[0]
        rownumber += 1 #puts our sql responses in the right rows
        sql_result = key[1]
        quality_df.at[rownumber,"SQL_Result"] = sql_result #this will put our results into the proper cell in our dataframe
        edit_qc_sh.Range(f"D{rownumber}").Value = sql_result #this will put our results in the right excel cell
        #print(key[1]) #returns just our result values
        
edit_wb.Save() #saving our values back to excel

Okay, time to do the same thing with our DAX queries and wrap this project up. You may have noticed that the DAX queries in the sample data are not exactly the same as what you would use in your measure. What we are doing is telling DAX how we want the data back; in this case, we want it to return one row (hence the EVALUATE ROW at the beginning) based on measures I have in a table called “_Measures”. If you aren’t sure how to make the DAX query work for this API, feel free to put the measure you want to test in a card visual, then use the performance analyzer to get the query. A good way to see if your DAX query will work is to test it out in DAX Studio.

Python time! We are going to tweak our code from earlier by renaming our variable “header” to “pbiheader” so we can reference it more clearly in our JSON request code. Again, feel free to skip to the end of this post for the final code if you have any questions (or just need this to work, and fast). First, we will need all the pieces to execute our POST request. Once we have the URL, the DAX query, and the JSON body, we can iterate through the list of queries from our Excel file.

#Running queries and putting results back into Excel
qdf = quality_df.dropna() #removing any blank rows so we don't run blank sql queries lol
sqlrownumber = 1
pbirownumber = -1

#Running SQL Queries
sql_queries = qdf.loc[:,"SQL_Query"] 
for query in sql_queries:
    sql_cursor.execute(str(query))
    #print (sql_cursor.fetchall()[0])
    for row in sql_cursor:
        #print(row) #this is our dictionary created from the sql query result
        key = list(row.items())[0] #gets the first item from the row dictionary
        sqlrownumber += 1 #puts our sql responses in the right rows
        sql_result = key[1] #grabs just the result of our query
        quality_df.at[sqlrownumber,"SQL_Result"] = sql_result #this will put our results into the proper cell in our dataframe
        edit_qc_sh.Range(f"D{sqlrownumber}").Value = sql_result #this will put our results in the right excel cell
        #print(key[1]) #returns just our result values
        
#Running PBI Queries
pbi_queries = quality_df.loc[:,"PBI_DAX_Query"]
execute_queries_url = f'https://api.powerbi.com/v1.0/myorg/datasets/{pbidatasetid}/executeQueries'
for items in pbi_queries:
    pbirownumber += 1
    pbiexcelrownumber = pbirownumber+2
    list_pbiquery = list(pbi_queries.items())[pbirownumber]
    #print(list_pbiquery)
    item_pbiquery = list_pbiquery[1]
    dax_query_json_body = {
                "queries":
                        [
                            {
                            "query": item_pbiquery
                            }
                        ]  
                }
    pbi_response = requests.post(url=execute_queries_url, headers=pbiheader,json=dax_query_json_body)
    query_json = pbi_response.text.encode().decode('utf-8-sig') #allows us to read the json response
    pbi_values = query_json.replace("}]}]}]}","") #drops the trailing encoding from json
    #print(query_json)
    pbi_result = pbi_values.split(":")[4] #grabs just the result of our query
    print(pbi_result) #grabs just the result of our query
    quality_df.at[pbiexcelrownumber,"PBI_Result"] = pbi_result #this will put our results into the proper cell in our dataframe
    edit_qc_sh.Range(f"C{pbiexcelrownumber}").Value = pbi_result #this will put our results in the right excel cell

edit_wb.Save() #saving our values back to excel

Pretty great! Now we have a functioning data quality testing tool where Excel holds all our inputs and outputs and all we need to do in Python is hit Run.
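If you prefer the command prompt over hitting Run in VS Code, the final script can also be saved to a file (for example, a hypothetical data_quality_checker.py) and run with:

python data_quality_checker.py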

Final Script

'''---------------README------------------
Full instructions available: https://dataonwheels.wordpress.com/?s=Power+BI%3A+Data+Quality+Checks+Using+Python+%26+SQL
The goal of this script is to compare DAX measures to corresponding SQL queries. 
To accomplish this, it uses an excel file containing authentication variables and query values.
This idea came from seeing the Execute Queries REST API in Power BI and wanting to use python to perform our data quality checks efficiently.
To connect to a Power BI Data Model, we need to pass an authentication token generated through an Azure AD App. 
To run a DAX query and SQL query, we need to loop through our excel file then put the queries into the API and SQL calls respectively.
Finally, this script takes those query results and puts it back into the excel file.

I have left some print() lines commented out, feel free to uncomment them to troubleshoot or walk through the code step-by-step.

You will need to swap out the excel_file variable with your excel file path. Other than that, the rest of the variables are managed inside your excel file. 
Once you have your variables and queries in excel, hit run and you're good to go. 
'''

#---------------------------------------#
#       Import libraries needed  
#---------------------------------------#
import requests
import adal
import json
import pymssql 
import pandas
import openpyxl
from win32com.client import Dispatch; import os

#---------------------------------------#
#    Build out authentication steps  
#---------------------------------------#

#----- Authentication Variables from Excel -----#

#Grab authentication data from our excel file
from openpyxl import load_workbook
#we need the r at the beginning since "\" is a special character in python
excel_file = r"C:\Users\KristynaHughes\Downloads\PythonDataQualityChecker.xlsx"
wb = load_workbook(excel_file) 
sheets = wb.sheetnames
#print(sheets) #see the sheets in our workbook
cred_ws = wb['Credentials'] #access specific sheet called Credentials
cred_mapping = {} #this will help us build a dataframe from our table
for entry, data_boundary in cred_ws.tables.items():
    #parse the data within the ref boundary
    data = cred_ws[data_boundary]
    #extract the data 
    #the inner list comprehension gets the values for each cell in the table
    content = [[cell.value for cell in ent] #ent makes our code cleaner by handling nested data sources
               for ent in data
          ]
    header = content[0] #the header row (aka the column names)
    rest = content[1:] #the table contents excluding the header
    cred_df = pandas.DataFrame(rest, columns = header) #this is the dataframe we will use to get our creds
    cred_mapping[entry] = cred_df
#Use the dataframe to set up credential variables we can use later in the script
sqlserver = str(cred_df._get_value(0,"SQL_Server"))
sqldb = str(cred_df._get_value(0,"SQL_Database"))
sqluser = str(cred_df._get_value(0,"SQL_User"))
sqlpassword = str(cred_df._get_value(0,"SQL_Password"))
pbiclientid = str(cred_df._get_value(0,"PBI_ClientID"))
pbiusername = str(cred_df._get_value(0,"PBI_Username"))
pbipassword = str(cred_df._get_value(0,"PBI_Password"))
pbidatasetid = str(cred_df._get_value(0,"PBI_DatasetID"))
#check to make sure your variables are correct by uncommenting the next line
#print(sqlserver,sqldb,sqluser,sqlpassword,pbiclientid,pbiusername,pbipassword,pbidatasetid)


#----- Power BI REST API Authentication -----#

authority_url = 'https://login.windows.net/common'
resource_url = 'https://analysis.windows.net/powerbi/api'
context = adal.AuthenticationContext(authority=authority_url,
                                     validate_authority=True,
                                     api_version=None)
token = context.acquire_token_with_username_password(resource=resource_url,                                                     
                                                     username=pbiusername,
                                                     password=pbipassword,
                                                     client_id=pbiclientid)
access_token = token.get('accessToken')
pbiheader = {'Authorization': f'Bearer {access_token}'}
#try out the get workspaces REST API using our access token by uncommenting the next few lines
#get_workspaces_url = 'https://api.powerbi.com/v1.0/myorg/groups'
#r = requests.get(url=get_workspaces_url, headers=pbiheader)
#r.text will give us the response text for our get request, pretty neat!
#print(r.text)


#----- SQL Server Authentication -----#

try:
    sql_con = pymssql.connect(sqlserver,sqluser,sqlpassword,sqldb)
    sql_cursor = sql_con.cursor(as_dict=True)
except Exception as e:
    raise Exception(e)


#---------------------------------------#
#  Build out data quality check steps 
#---------------------------------------#

#----- Read excel to get quality check queries into a dataframe -----#

quality_ws = wb['Quality Check']
quality_mapping = {} #this will help us build a dataframe from our table
for entry, data_boundary in quality_ws.tables.items(): #grabs data dynamically from our table
    data = quality_ws[data_boundary] #parse the data within the ref boundary
    #the inner list comprehension gets the values for each cell in the table
    content = [[cell.value for cell in ent] #ent makes our code cleaner by handling nested data sources
               for ent in data
          ]
    header = content[0] #the header row (aka the column names)
    rest = content[1:] #the table contents excluding the header
    qualitydf = pandas.DataFrame(rest, columns = header) #this is the dataframe we will use to get our quality check queries
    quality_df = qualitydf.fillna(' ') #helps remove blank records from our excel file
    quality_mapping[entry] = quality_df
#print(quality_df)


#----- Open excel file in edit mode -----#

xl = Dispatch("Excel.Application") #opens excel for us to edit
xl.Visible = True
edit_wb = xl.Workbooks.Open(excel_file)
edit_qc_sh = edit_wb.Worksheets("Quality Check")

#----- Set variables to use in our iterators -----#

qdf = quality_df.dropna() #removing any blank rows so we don't run blank sql queries lol
sqlrownumber = 1
pbirownumber = -1

#----- Run SQL queries and put results back into excel -----#

sql_queries = qdf.loc[:,"SQL_Query"] 
for query in sql_queries:
    sql_cursor.execute(str(query))
    #print (sql_cursor.fetchall()[0])
    for row in sql_cursor:
        #print(row) #this is our dictionary created from the sql query result
        key = list(row.items())[0] #gets the first item from the row dictionary
        sqlrownumber += 1 #puts our sql responses in the right rows
        sql_result = key[1] #grabs just the result of our query
        quality_df.at[sqlrownumber,"SQL_Result"] = sql_result #this will put our results into the proper cell in our dataframe
        edit_qc_sh.Range(f"D{sqlrownumber}").Value = sql_result #this will put our results in the right excel cell
        #print(key[1]) #returns just our result values

#----- Run PBI DAX queries and put results back into excel -----#

pbi_queries = quality_df.loc[:,"PBI_DAX_Query"]
#print(pbi_queries)
execute_queries_url = f'https://api.powerbi.com/v1.0/myorg/datasets/{pbidatasetid}/executeQueries'
for items in pbi_queries:
    pbirownumber += 1
    pbiexcelrownumber = pbirownumber+2
    list_pbiquery = list(pbi_queries.items())[pbirownumber]
    #print(list_pbiquery)
    item_pbiquery = list_pbiquery[1]
    dax_query_json_body = {
                "queries":
                        [
                            {
                            "query": item_pbiquery
                            }
                        ]  
                }
    pbi_response = requests.post(url=execute_queries_url, headers=pbiheader,json=dax_query_json_body)
    query_json = pbi_response.text.encode().decode('utf-8-sig') #allows us to read the json response
    pbi_values = query_json.replace("}]}]}]}","") #drops the trailing encoding from json
    #print(query_json)
    pbi_result = pbi_values.split(":")[4] #grabs just the result of our query
    #print(pbi_result) #grabs just the result of our query
    quality_df.at[pbiexcelrownumber,"PBI_Result"] = pbi_result #this will put our results into the proper cell in our dataframe
    edit_qc_sh.Range(f"C{pbiexcelrownumber}").Value = pbi_result #this will put our results in the right excel cell

#----- Save our changes back to the excel file -----#

edit_wb.Save() #saving our values back to excel
print("All done")

Additional Resources:

This was my first experience with Python, and below are all the links that helped me build out this demo and learn more about how Python operates. Hope you find these as useful as I did!

Power BI: Dynamically Removing Errors From Columns in M

If you have Excel data, or user-entered data, you have likely experienced frustration from repeatedly seeing the error message below upon hitting “Close & Apply” or refreshing your data model. It can be extremely frustrating to see failed refresh emails come in every time a user types a text value into a number field or adds a formula to the report that results in #N/A.

End users that are not trained in data governance but are actively involved in maintaining a data set can easily make data entry mistakes or create inconsistent data types within columns. For example, you may have a column in the dataset called “Sales” and instead of 0, someone may type “None” or “NA”. When this gets loaded into Power BI, Power BI will not know how to convert the text value “None” to a number, and it will throw an error on the refresh of the report.

One way to mitigate the impact of user-entered data is to replace errors with null values. This is not ideal since it doesn’t fix the data entry issues, but it does enable reports to still be refreshed and used while the data issues are addressed. In Power Query, you can manually replace the errors with null by going to the “Transform” tab then selecting the drop down for “Replace Values” and choosing “Replace Errors”.

After selecting “Replace Errors”, type null (all lowercase) into the value field to replace your errors with a null value that will allow the report to refresh no matter the data type.

As you can imagine, it can get quite tedious to apply this step to every column in every query in your report. Good news! I have a query that you can add to the end of your applied steps that will make sure every single column in your query will replace errors with nulls. Let’s walk through it.

Final Query:

let
    Source = Excel.Workbook(File.Contents("C:\Users\KristynaHughes\OneDrive - DataOnWheels\GitHub\AdventureWorks Sales w Errors.xlsx"), null, true),
    Sales_data_Sheet = Source{[Item="Sales_data",Kind="Sheet"]}[Data],
    #"Promoted Headers" = Table.PromoteHeaders(Sales_data_Sheet, [PromoteAllScalars=true]),
    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"SalesOrderLineKey", Int64.Type}, {"ResellerKey", Int64.Type}, {"CustomerKey", Int64.Type}, {"ProductKey", Int64.Type}, {"OrderDateKey", Int64.Type}, {"DueDateKey", Int64.Type}, {"ShipDateKey", Int64.Type}, {"SalesTerritoryKey", Int64.Type}, {"Order Quantity", Int64.Type}, {"Unit Price", type number}, {"Extended Amount", type number}, {"Unit Price Discount Pct", Int64.Type}, {"Product Standard Cost", type number}, {"Total Product Cost", type number}, {"Sales Amount", type number}, {"Large Sales", type number}}),
    //ColumnsInTable grabs all the column names in your existing table dynamically
    ColumnsInTable = Table.ColumnNames(#"Changed Type"), 
    //Converting this list of columns to a table lets us add a column for what we will replace errors with
    #"Converted to Table" = Table.FromList(ColumnsInTable, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    //This adds a column to show what we will replace errors with
    #"Added ReplacementColumn" = Table.AddColumn(#"Converted to Table", "Replacement", each null),
    #"Transposed Table" = Table.Transpose(#"Added ReplacementColumn"),
    //Transforms the table back into a list so we can use it in our replace error step
    ColumnList = Table.ToColumns(#"Transposed Table"),
    //Takes our last step in our original table called Changed Type and replaces all errors in all our columns with null
    #"Replaced Errors" = Table.ReplaceErrorValues(#"Changed Type",ColumnList)
in
    #"Replaced Errors"

In the example above, #"Changed Type" is the last step in my original query, but this resulted in some errors in my table (see screenshot below).

In order to apply a replace-errors step to all the columns in the table, we need to understand a bit more about the Table.ReplaceErrorValues function. This function takes two arguments: (table as table, errorReplacement as list). The table portion will be the name of our last step, because that is the final table that we want to manipulate (#"Changed Type" in this example). The error replacement list will require two values – the column name and the value we want to replace errors with. To build the list, follow the steps below.

  1. Table.ColumnNames(#"Changed Type") : this function will look at the last step in your query (#"Changed Type" in this example) and generate a list of the columns.

  2. Table.FromList(ColumnsInTable, Splitter.SplitByNothing(), null, null, ExtraValues.Error) : this function converts our list of columns into a table. The reason we are converting a list to a table and then back to a list is so we can easily add a column of nulls to our list.

  3. Table.AddColumn(#"Converted to Table", "Replacement", each null) : this function adds a column called "Replacement" that contains the null values we want to replace our errors with.

  4. Table.Transpose(#"Added ReplacementColumn") : transposing the table will make it easier to put our values into a list appropriately. We need the final result to be a comma separated record that has the column name then the replacement null value.

  5. Table.ToColumns(#"Transposed Table") : this function flips our transposed table back into a list of lists that we can use in our final function. By clicking on a cell in the list of lists, you can see that each inner list is made up of a column name and our replacement value (for example, {"SalesOrderLineKey", null}). Pretty handy for our replace-errors function.

  6. Table.ReplaceErrorValues(#"Changed Type", ColumnList) : time to replace our errors. Notice that we have #"Changed Type" as our first argument. This forces Power Query to grab our original last step and use that version of our table for the function. ColumnList is the name of the step that contains our list of lists with the column names and null values inside.

And there ya have it! To use the query above, you’ll need to swap #"Changed Type" with the name of the final step in your original query. Pretty handy when dealing with user-entered data, but keep in mind the ideal solution would be to put more data governance around the original data entry. For example, a simple application that forces people to enter the correct data type (or else throws an error), or a drop-down selection with valid options (a great option for zip codes or states), would prevent data entry issues in the first place and raise the quality of your data.

Additional Resources

Typing with Your Tongue – Voice to Text Technologies

This is the second in the series of tools and technologies that I use to deal with the loss of functionality in my hands and arms. Check out this article for the lead up to this series.

Setting the stage

The issue I’m dealing with involves muscle atrophy in my hands and my arms. As a result, I’ve lost a lot of strength in my hands and arms, including my fingers. Some of the unintended or unplanned impacts included the inability to successfully type at times or a diminished amount of time I can actually spend typing. I had previously used a Logitech split keyboard, which I loved. I consider myself a fairly good typist and used to be able to type and code very effectively. With the onset of the atrophy, I encountered situations where my hands would actually stop working. I would be typing and then I couldn’t type anymore. Some of it is definitely related to physical exhaustion and the effort required given my condition. The first time this happened was the first time I was concerned about my career.

As my condition has worsened, I have tried a variety of software solutions that support voice to text. In this blog I’m going to separate my voice to text solutions into two primary groups. The first group is those tools which I can use for dictation, like creating this blog post or working with documents. The primary focus of this group of tools is to support the ability to add text while working on a computer with a mic. The second group of tools is primarily focused around note taking and using mobile tools on my phone or similar devices where I may not have access to the dictation tools I would use in my normal work day. The one area that I am not going to cover in this blog post is voice automation tools, or those tools which provide voice command capability. What I have found is that they are not the same. Currently I have not found a voice command solution that I like. As I do some more discovery in that area, I will share what I find.

Dictation tools

When my condition first surfaced, I immediately started thinking about how to do voice to text. The first software that came to mind was Dragon by Nuance. I started using Dragon as soon as we were able to get a professional account through work. The first thing I noticed about Dragon was that it felt like I had gone backwards in time, as it was not an updated or modernized piece of software. Dragon has been around a long time and serves a lot of different areas of business, including law and medical. It is a highly valuable tool in those spaces and has specialty products for some of those with specific terminology support.

DragonBar

What I liked about Dragon is that it has extensive editing capability built into the software. This is particularly true if you use their special dialog box to create most of your content. That being said, you really need a good microphone to run Dragon efficiently. The other issue I had was that when we upgraded to Windows 11, it was not supported. This will likely change, as Microsoft has purchased the product in recent months and will likely incorporate a lot of it into its own platform. I reverted to Windows 10 to determine how much I would use it. The biggest issue I had was the requirement for a high-quality microphone that would likely need to be on a headset to operate well.

With the switch to Windows 11, I needed to find alternative options, and I turned to Microsoft to see what they had available. It turns out that Microsoft has two voice to text solutions that work in Windows 10 and 11. (These solutions may work in other versions of Windows, but I don’t use them.) The first tool I explored and worked with was Dictate, which is available with Microsoft 365.

Dictate In Word

In particular, Dictate inside of Word. I immediately liked this tool because it is built in to the Office platform. It also seemed to learn more quickly than Dragon did through general use, which is likely due to the AI behind it. I also appreciate the fact that I could use an open microphone effectively without making changes to my environment. I am writing this blog post in Word first because of the capabilities of Dictate. It is not without flaws, and the biggest issue I have with Microsoft 365 Dictate is that it does not know how to capitalize mid-sentence or let you choose a word to capitalize. This seems like a significant oversight that many have complained about through the years of using this product; hopefully Microsoft will resolve it soon. I did discover that there is a change case option in text editing available in Word that has allowed me to handle this situation easily.

Change case in Word

I’m still learning Dictate and its capabilities but overall, it has been the most fluid solution I’ve used to date.

Dictate is not available outside of the Office 365 suite. In those cases, I use Microsoft voice typing, which you can open by hitting Windows+H.

Windows voice typing box

This will allow you to do voice dictation in any text box – well, most text boxes. I use this for dictating messages in Teams, forms on websites, and similar types of functionality. This is not as capable as Dictate in Office; for example, delete does not work the same way in the two tools. However, it too seems to learn my speech and responds well to the open mic, which is why I have chosen to use it.

Before I move away from the dictation tooling, I want to add that in the Office suite I’ve been able to effectively use Dictate in Outlook. This has been very helpful in creating emails. Depending on where you are in Outlook, you may or may not have Dictate available to you, in which case you can always use voice typing. Dictate also works effectively in OneNote. The functionality in PowerPoint is severely lacking, and I don’t know why. It does not seem to figure out what I’m trying to say most of the time when I’m working in PowerPoint. So, this is kind of frustrating when creating presentations, but overall the effectiveness in Outlook and Word has kept me quite productive.

In summary, if Dragon works for you and how you work, it is likely the best tool for the job. With Microsoft’s purchase of Dragon, my expectation is that we will see some of that functionality move into the Office suite or into Windows directly. If you are like me and prefer using an open mic, you will find that Microsoft 365 Dictate and Windows voice typing are likely a better fit, but they still have significant gaps to fill.

Notetaking and mobile

I kind of grouped these together because of how I function. One of the immediate impacts of my condition is that I am no longer able to take handwritten notes. This has been a huge hit, as I used a lot of pen and paper for design work, notetaking, etc. Losing this capability was a significant hit to my productivity. As a result, I needed to find alternatives.

Otter on Android

The first tool I added to my toolbox on my phone was Otter. This product was introduced to me by a peer at 3Cloud. It allows you to record and transcribe conversations so that you have notes from the conversation as well as the recording. It does a pretty good job at transcription, frankly. I’ve used it to take notes during meetings, to take notes while working with my doctors, and just for self-transcribed notes. I use it exclusively on my phone and then transfer the notes to OneNote when I want to use them with other tools. This has been a lifesaver, in particular with regard to doctors’ appointments. It has helped me keep track of that information, and because of the transcription we can transfer it into other documents or even onto my CaringBridge site when we need exact details.

On my phone, I also use Google’s built-in voice to text technology and Samsung’s technology, as I have a Galaxy phone. I will say these are hit or miss, and they often add a little bit of fun to my texts with my family for sure. However, it is still easier to use voice to text as opposed to typing on the device itself. So, I’m thankful that it works, even if it stumbles a lot more than some of these other tools. Dragon does have a mobile option as well, but I did not get it working, so I can’t really speak to its functionality at this point.

Summary of my new world

I still need to type to do my job. Part of my job entails building technical labs which require coding. Coding is not easily done with voice to text – or maybe we should say it should not be done with voice to text. However, as IntelliSense and similar functionality has become more prevalent in the tools, it has reduced the stress on my hands when creating code. There’s new functionality from Microsoft in GitHub called Copilot, and similar tools that use AI to suggest code. For the moment I haven’t had a chance to test these out, but I’m looking forward to seeing how they improve my work environment. I would always recommend that you let people know that you’re using voice to text, in particular when you’re using it in Teams or other chatting type environments. That way you don’t have to go back and correct everything you do all the time. People are forgiving, and occasionally we get some really good fun like calling “Dennis” “dentist”. He wasn’t one, or so he says.

Before I end, I would like to say that this is not just helpful for those of us who struggle typing. You may find the dictation tools for example in Word to be a way to generate documents rather quickly. Just keep in mind:

  • plan to edit some
  • take your time
  • learn the tool
  • find success

I hope this helps someone out there. If you have found a tool that does voice to text more efficiently or differently than what I’ve talked about, I’d love to hear about it. Just add it in the comments below. Thanks for reading!

When is a mouse not a mouse?

This is the first in the series of tools and technologies that I use to deal with the loss of functionality in my hands and arms. Check out this article for the lead up to this series.

Setting the stage

The issue I’m dealing with involves muscle atrophy in my hands and my arms. As a result, I’ve lost a lot of strength in my hands and arms, including my fingers. Some of the unintended or unplanned impacts included the inability to successfully type at times (more on that later) and what feels like cramping in my hand when holding a mouse for too long. I was using a nice Logitech mouse for most of my work. I have also used the Surface Arc mouse from Microsoft. I liked the Arc mouse because it traveled very well since you can flatten it. The Logitech mouse and split keyboard were part of my standard home office setup.

An older picture of my setup with the Logitech mouse and keyboard

As my condition worsened, I found myself struggling to use a mouse for longer than half an hour to an hour at a time. I would work with the mouse, and then eventually I would start unintentionally clicking the buttons and be unable to actually move it. The level of frustration caused by working that way is pretty high.

Large trackpad try out

Because the problem appeared to be around the fact that I was holding the mouse, my first thought was to try a larger trackpad. This would allow me to use both hands and fingers effectively to manipulate the mouse on the computer. What I found was it just transferred the problem. My lack of finger control and the cramp in my hand still existed while using the trackpad.

Enter the roller bar mouse

A friend of ours works with hand specialists back in Minneapolis. She suggested we check out something called the RollerMouse. It’s an interesting tool. (One thing to note here is that all of these tools cost something. Trying to find the right tools to work with is not inexpensive. Hopefully some of this information will help you save some money.) I went online and did some research about how it’s supposed to help. A lot of the reviews talked about folks with arthritis or other similar conditions which impact their ability to work with a mouse for an extended period of time, including the fact that they could have shoulder issues and so on. That being said, I decided to give it a try.

The RollerMouse Red that I purchased

It is a completely different way to work with the mouse functionality on your computer. Many of the reviews talked about the two-week learning cycle that was required to be effective. I found that I was effective using the mouse within a couple of hours. I do think this was related to the fact that I struggled so much to work with a traditional mouse or trackpad. The roller bar effectively supports a three-screen solution including two 34-inch monitors. The side-to-side motion of the bar navigates seamlessly from one screen to the other.

Beyond that, the buttons are awesome! Naturally there are the left and right click buttons in the center of the bar. The bar itself functions as a left click, which is a very natural motion when working with the mouse. There is a scroll wheel in the middle which works like the scroll wheel on a normal mouse. Now it gets interesting. At the center on the bottom is a double click button. I did not know if I’d actually use this, but I find myself using it especially when switching hands while using the mouse. The steepest part of my learning curve has been effectively using the buttons. I must keep in mind that the cursor placement is where the clicks will occur, not the last place I had clicked. Up next, they have the copy and paste buttons right above the left and right clicks. In a later blog I will talk about speech to text functionality, but having shortcut buttons for copy and paste means I do not have to do the Ctrl+C or Ctrl+V patterns on the keyboard. Which is great! All the buttons are programmable. This means you can change them if you have a better pattern that works for you. So far, I’ve only changed one button, the scroll button click. I use it to turn on dictation in Word.

It changed the way I work

To say that this has changed the way that I work and allowed me to work longer is the understatement of the year. I am still working through other tools and devices to continue to help me be productive. But to date this is by far the best investment I have made. If you’re experiencing issues manipulating a mouse in the traditional fashion, or anything that requires you to switch hands to give yourself a break or that simply makes it hard to hold and move a mouse, this is a great solution for you. The mouse is stationary and has a nice wrist pad for you to work on. The ability to switch hands allows me to get the break I need on either hand at any given time. With this mouse I am able to stay more productive than I ever thought I was going to be able to when this started. As you can see, I’m a real fan.

My new desk set up with the RollerMouse in the middle

For those of you who aren’t dealing with these issues and are wondering whether you should use it: if you want to experience a different method of working with a mouse and get rid of moving a mouse around on your desk or touching a trackpad, this is a great solution. I personally think I would have fallen in love with this solution much earlier had I known about it. Now you know about it!

I’d love to hear from you if this was helpful and if you’ve decided to give this a try. For those of you suffering with carpal tunnel or arthritis, come on, give this serious consideration. It is on the pricey side, but I will tell you in this case it may be worth it.

The Impact of Change

I started this summary on LinkedIn. Check out that article here. I only used LinkedIn to introduce what is happening, if you’re interested in learning more keep reading.

Kristyna Hughes

First, I have to say thank you to my daughter Kristyna, who has been contributing regularly to our Data on Wheels blog in my absence. I’m amazed at the quality and depth of the content she has provided to the community, including blogs on the tabular object model and C#. I’m glad she is able to provide great content for the data community.

3Cloud has been a key partner in my journey so far. Early in the diagnosis we were not sure if this was ALS. Frankly, we are still not sure if it will lead to that eventually. 3Cloud has stepped in and helped me find a place where I can contribute and support our teams as we continue to grow our business in data and analytics. For that I can say I am supremely thankful. This has allowed me to adapt to my new circumstances and continue to contribute in a meaningful way.

Now we get a little more personal: I can say nothing but great things about my wife, Sheila. She has had to step up as a caregiver in ways that we were not expecting. Sheila has been my rock through this, as well as my support when I really needed it. From helping me get ready for the day to keeping me going through the day, she has been magnificent through it all. What we do not know is what God has planned for us or what the next stage will be for me as we move forward. We can only trust him every day for what is next.

As a part of this process, we have moved from Minnesota to Kentucky to be closer to my family. That move was interesting. We had a lot of ups and downs as we went along, and I have my adult children to thank for a lot of the help throughout those weeks. We also had many friends help us with packing and prepping because I could not do much at all. They were very gracious and gave us time on their weekends and helped us make this move. When we got to Kentucky my kids did the lion’s share of unloading which was unexpected. With their help we were able to get moved in quickly. They have helped with painting, clean up, and unpacking throughout it all. I have watched my wife and daughters pick up new skills to fill the void where I would have normally done the work. It’s pretty impressive really. Other family members have also stepped up to help as needed which has been great as well.

We are in Frankfort

So, what does that mean for the blog? Well, Kristyna will continue to write on topics that she loves. I will contribute when I can on technical content. But I will also be providing reviews of various tools and technologies I have tried or used through this journey. I hope that some of these reviews will help some of you out there who may have similar or related issues. And maybe someone will find a tool that will be super helpful to them. I look forward to sharing this journey as we move forward.

If you made it this far, thank you for giving me some of your time and lending me your ear. If you want to know more about the personal side of this journey, you can check it out on CaringBridge. I look forward to getting back out in community and connecting with many of you along the way.