PASSMN June 2020 – Data Classification with SQL Server and Azure

I presented at the virtual Minnesota SQL Server User Group meeting on June 16, 2020. The topic was data classification with SQL Server 2019 and Azure SQL Database.

Data Classification Basics

Data classification in both SQL Server and Azure allows you to discover and label data based on information type and sensitivity. Information type describes the content of the data at a high level and includes types such as Address, Name, Networking, and Credit Card. By tagging your columns with types, you can easily see what kinds of data are stored in your tables. You can also label the sensitivity, using labels such as Confidential and Confidential - GDPR.

Using SQL Server 2019 and SSMS 18.4+

For on-premises implementations, you can use SQL Server Management Studio. I recommend SSMS 18.4 or greater, as it has the most capability. SQL Server 2019 includes the sys.sensitivity_classifications system catalog view, so you can query to see which fields have been labeled.
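Here is a minimal sketch of such a query (the joins assume you only care about user tables; adjust as needed):

-- List classified columns with their labels and information types
SELECT
    SCHEMA_NAME(t.schema_id) AS schema_name,
    t.name                   AS table_name,
    c.name                   AS column_name,
    sc.label,
    sc.information_type,
    sc.rank_desc
FROM sys.sensitivity_classifications AS sc
JOIN sys.tables  AS t ON t.object_id = sc.major_id
JOIN sys.columns AS c ON c.object_id = sc.major_id
                     AND c.column_id = sc.minor_id;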

To get started, open SSMS. Right-click the database and choose Tasks > Data Discovery and Classification > Classify Data. This opens the Data Classification window in SQL Server.

Finding the Data Discovery and Classification Options in SSMS

You will get a list of recommendations and the ability to add custom classifications in your SQL Server database.

The Data Classification view in SSMS

Once you have classified some of your data, you are able to view a report that shows the coverage of the classification work you have done.

Data Classification Report in SSMS

Adding Data Classification in Azure SQL Database

Azure SQL Database supports similar functionality for discovering and classifying data. The primary differences are (1) it requires Advanced Data Security which costs $15/month per server and (2) audit logging support is built in.

You can find this in the Azure portal with your SQL Database.

Advanced Data Security in Azure SQL Database

As you can see above, Advanced Data Security initially gives you a visual summary. Click the Data Discovery & Classification panel to open a classification window similar to the one in SSMS, which allows you to discover and classify your data.

The key difference is the ability to turn on auditing and log information about who is querying the classified data. In the Security section of your SQL Database view in the Azure portal, choose Auditing. You can add auditing at the server or database level. I chose to send the audit logs to Log Analytics, which is in preview. Log Analytics has a dashboard that shows activity against the classified data in your database.

Log Analytics Dashboard which Shows Access to Sensitive Data

You can click into the dashboard to dig into details. You can also use the Log Analytics query features to build your own queries to further analyze the data. The details contain who accessed the information, their IP address, and what was accessed. You can build more reports from that information to support more sophisticated auditing.
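For example, if your audit logs land in the AzureDiagnostics table, a Kusto query roughly like the one below pulls the events that touched classified columns. This is only a sketch; the exact table and column names (such as data_sensitivity_information_s) depend on how your diagnostic settings are configured, so adjust it to what you see in your workspace.

// Audit events that accessed classified (sensitive) data
AzureDiagnostics
| where Category == "SQLSecurityAuditEvents"
| where isnotempty(data_sensitivity_information_s)
| project event_time_t, server_principal_name_s, client_ip_s, statement_s, data_sensitivity_information_s
| order by event_time_t desc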

Final Thoughts

I think there is still work to be done in SQL Server to better support auditing; Azure is ahead of the game in this area. More importantly, Azure logging is a platform-level solution, so you should be able to integrate logging from your applications through to the database in Azure.

You do have the ability to update the information protection policy in SQL Server with a JSON file; I recommend you export the existing file and modify it rather than starting from scratch. In Azure, you can update the information protection policy in the Security Center. Updating this policy allows you to discover and classify data based on rules you set up, and it should be part of your data governance plan.

One other follow-up from the meeting: a question was raised about Visual Studio support in database projects. The answer is "sort of". First, you need to make sure your project is targeting SQL Server 2019 or Azure SQL Database. Once that is set, you can use the following code to add the classification manually, or you can apply it to your database and do a schema compare to bring it in.

ADD SENSITIVITY CLASSIFICATION TO
    [SalesLT].[Customer].[FirstName]
WITH (
    LABEL = 'Confidential - GDPR',
    LABEL_ID = 'fe62dcde-72c0-475c-b1af-fb8de4c8fc7e',
    INFORMATION_TYPE = 'Name',
    INFORMATION_TYPE_ID = '57845286-7598-22f5-9659-15b24aeb125e',
    RANK = MEDIUM
);

You will need to know the GUIDs for the labels and types in your solution to do this manually. However, once this is done, you can see the information in the Properties window for the field as well.

Data Classification Properties in Visual Studio

The key thing to be aware of is that these properties are read-only. You have to use code to change them, or make the changes in the database and use Schema Compare to bring them in.

Thanks again to those of you who joined us at the meeting. Here is the slide deck from that meeting. I look forward to sharing more with all of you later.

Power BI DAX Dynamic Measures

Dynamic measures are an effective way to avoid crowding your report with different versions of the same visual without using bookmarks. No disrespect to the versatility of bookmarks, but they can quickly clog up reports with the number of visuals used and can be a headache to modify depending on the complexity. This post will explore when to use dynamic measures, how to set up your data model to accommodate them, how to implement them, and their limitations.

A primary use case for dynamic measures is when you have multiple business owners looking at a report who each want to see the report in slightly different ways.

For example, you may have three major stakeholders looking at a report: CEO, CFO, and COO. The CEO may be more interested in future profits and looks at trends in the number of loads booked to see where we are headed. The CFO wants to see profit trends on loads that have already been completed (actualized profit). Additionally, the COO would rather see what’s currently in transit and look at only picked up loads. You could make three pages, one based on booked loads for the CEO, another based on delivered loads for the CFO, and one based on picked up loads for the COO. But what happens if your report is already going to be multiple pages and needs room to grow? You would need to triple your page count to accommodate the needs of your users or build three completely separate reports! Don’t worry, dynamic measures can solve this problem for you without clogging up your reports or workspaces with extra measures, visuals, and reports.

In order to use dynamic measures, you will need to set up your data model with a table whose only purpose is selecting the dynamic measure. You will also need a few measures that correspond with those options. For example, if you want the end users to choose between seeing load count, profit, and revenue, you will need to make a table that contains all three of those options as rows. The easiest way to do this is to go into the Power Query editor and use the "Enter Data" option. Make sure that your options are spelled exactly as you'd like them displayed for your end users and are in a table that's easy to reference.

Hit Close & Apply. Once your new table is loaded, you DO NOT need to connect it to your other data in the data model. Dynamic measures work because they do not rely on relationships between the filter table and the data, but on the selection of the measure to be used. You have a couple of options here: you can build each of these measures separately, or build them once within the dynamic measure itself. There are reasons to do both, so we will build out both examples. Having separate measures offers you lots of flexibility down the road, but if you are only using these measures within the dynamic measure, then it's best to keep everything within the dynamic measure to save some space and increase efficiency in your DAX.
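As a side note, if you would rather skip the Power Query step, the same disconnected table can also be created as a DAX calculated table. Here is a minimal sketch, assuming you name the table Metric Selection and the column Metric:

Metric Selection =
DATATABLE (
    "Metric", STRING,
    { { "Load Count" }, { "Profit" }, { "Revenue" } }
)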

Building the Dynamic Measure

After you have your desired measures built out, you can reference them inside your dynamic measure either as separate measures or as variables. The key expression in dynamic measures is an IN ALLSELECTED check inside a SWITCH(TRUE()) statement. It's vital to know exactly how your options were spelled inside your recently made custom table, because you will need to spell them out the same way within the dynamic measure. The basic syntax is as follows:
Selected Measure =
SWITCH( TRUE(),
    "Option1" IN ALLSELECTED( CustomTable[ColumnName] ), [Option1Measure],
    "Option2" IN ALLSELECTED( CustomTable[ColumnName] ), [Option2Measure],
    "Option3" IN ALLSELECTED( CustomTable[ColumnName] ), [Option3Measure],
    [DefaultMeasure]
)

If your dynamic measure is being used to filter a selected field instead of choosing a completely different metric, I recommend using a variation of the following syntax:
Selected Measure =
VAR A =
    SWITCH( TRUE(),
        "Option1" IN ALLSELECTED( CustomTable[ColumnName] ), [Filter1],
        "Option2" IN ALLSELECTED( CustomTable[ColumnName] ), [Filter2],
        "Option3" IN ALLSELECTED( CustomTable[ColumnName] ), [Filter3],
        [DefaultFilter]
    )
RETURN
CALCULATE( [Measure], FILTER( Table, Table[FilteredColumn] = A ) )

Using separate measures:

Using variables:

Using the SWITCH function in conjunction with IN ALLSELECTED allows the end user to "switch" to the measure associated with their selection.

Using the Dynamic Measure

To use a dynamic measure, you'll first need to build a slicer using the custom table you made earlier and referenced inside your dynamic measure.

Depending on how many options you have, it may be wise to format this slicer as a drop down or as a horizontal list to make it appear more like buttons.

To test your dynamic measure, let's throw it into a card visual, then select a couple of different options to make sure it's choosing the correct measure.

Perfect, working like a charm! From here, any visuals you build using these dynamic measures will adjust when users select their metric. Be sure to carefully label your visuals so it's clear that what they are looking at will change depending on what they have selected. If you would like your profit and revenue metrics to have dollar signs ($) in front of them, just build that formatting into the measure itself using FORMAT. In the example below, FORMAT is used to set profit and revenue as currency while load count is a standard number (comma separated). For more formatting options, see the following documentation from Microsoft: https://docs.microsoft.com/en-us/dax/pre-defined-numeric-formats-for-the-format-function.
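Here is a rough sketch of that pattern, using the placeholder measure and table names from the syntax above (swap in your own):

Selected Measure Formatted =
SWITCH( TRUE(),
    "Profit" IN ALLSELECTED( CustomTable[ColumnName] ), FORMAT( [ProfitMeasure], "$#,##0" ),
    "Revenue" IN ALLSELECTED( CustomTable[ColumnName] ), FORMAT( [RevenueMeasure], "$#,##0" ),
    FORMAT( [LoadCountMeasure], "#,##0" )
)

Keep in mind that FORMAT returns text, so visuals will treat the result of this measure as a string rather than a number.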

Now when you go to build your report, instead of three duplicate pages, use this dynamic measure in all your visuals and place the slicer at the top of the page for your various users to interact with at will.

Data On Wheels – Merging Two Great Minds

I am very excited to announce that my daughter, Kristyna Hughes, will be joining the Data On Wheels team. Our fates and paths to this point seem parallel though not sequential. Both of us went to college and studied cultural anthropology, which turned out to be a great foundation for working with customers in our respective technology fields. If you have followed me for a while, you know that my career has focused on data and analytics for the past 20 years. During that time, I worked for XATA Corporation, a transportation management company. We were a cutting-edge SaaS technology company that specialized in electronic driver logs. I created my blog while working there, thus Data On Wheels. I have since gone back into consulting but kept the name. Fast forward to today: Kristyna started her career in logistics, which led her to her current employer, GlobalTranz. While there, she created a couple of reports which were well received and got noticed by the business intelligence team. She has since helped them grow their Power BI practice and deliver awesome reports. I am very proud of the work she has done and look forward to her joining Data On Wheels as a co-contributor. Now, let's hear from her.

Kristyna and Steve at Kristyna’s Graduation from Biola University

Kristyna here: I am thrilled to be joining my dad in the BI world and in writing up the lessons learned along the way. Like he said, I have been deep in the world of Power BI, from administration to building the executive morning report. I believe the people in the business are our best assets as data analysts, so I appreciate having learned my logistics knowledge through experience, and I continue to learn from my colleagues in the field. Our mutual background in cultural anthropology allows us to get to the heart of problems efficiently through discussions with stakeholders and to bring them valuable insights from their data. While not all companies literally have their data on wheels like logistics companies do, data can turn the wheels of profitable change for any company.

Buffalo Trace Distillery, Frankfort KY

One other thing my dad and I enjoy together is good whiskey from rye to bourbon to scotch. Look for some of our favorites in the Fun Info page as we sip and savor some soul-warming whiskeys.

Starting and Stopping SQL Server with PowerShell

Have you ever had this issue? You are trying to explore new features of SQL Server or you want to install the latest version for some testing, but SQL Server is a resource hog and kills the performance of your PC. I have been evaluating SQL Server 2019, including both flavors of Analysis Services, which means I am installing the Developer edition of SQL Server on my laptop (a Yoga 900) running Windows 10.


I have dealt with this issue in the past by opening the Services window, setting the Startup Type to Manual, and then turning the services on when I wanted to work with them. In my current use case, I did not want to manage that with a lot of clicks. So my thought was that this should be easy with PowerShell. Of course, nothing is easy when it is not something you do every day, so I started digging.

I found it is definitely possible, but you need to know the steps that matter, including making sure that PowerShell will execute the scripts on your PC. So here are the steps and the code.

Empowering Your PC to Run Your PowerShell

The first issue is that PowerShell will not execute scripts on your PC until you allow it. You will need to open PowerShell as an Administrator. You can leave the window open once we are done here. The following command sets the execution policy to RemoteSigned, which allows scripts created on your PC to be executed locally (downloaded scripts still need to be signed).

Set-ExecutionPolicy RemoteSigned

Shutting Down SQL Server with PowerShell

I chose to create two PowerShell files that allow me to execute these steps for all the services I am managing. In my examples, I have three SQL Server instances: the Database Engine, Analysis Services in Multidimensional mode, and Analysis Services in Tabular mode.

The code required to shut down SQL Server is noted below.

Set-Service 'MSSQL$DOWSQL2019' -StartupType Disabled
Stop-Service -Name 'MSSQL$DOWSQL2019' -Force

Let's break it down. The first line sets the StartupType to Disabled, which prevents the service from restarting on a reboot; you will need to be intentional about restarting it. The second line is the command to stop the service. The Force flag will shut down dependent services, like SQL Agent, if you have them running.

You will need to know the service name. SQL Server Database Engine services are typically named "MSSQL$" followed by the instance name. If you are using Analysis Services, the name uses "MSOLAP$" as the prefix.
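If you are not sure of the exact names on your machine, a quick check like this lists the Database Engine and Analysis Services instances (run it in your PowerShell window):

# List SQL Server Database Engine and Analysis Services instances on this machine
Get-Service -Name 'MSSQL$*', 'MSOLAP$*' | Select-Object Name, Status, StartType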

You can run these commands directly in your PowerShell window; I did this while testing them. I wrote them in Visual Studio Code but had issues executing them from there. You may be fine, but I wanted to share my experience. Once I had my three instances scripted, I saved the commands in a file called SQLServerOff.ps1.
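For reference, here is a sketch of what that file could look like for all three instances. The Analysis Services service names below are placeholders I made up; substitute the names reported on your machine.

# SQLServerOff.ps1 - disable and stop the SQL Server services
# Database Engine
Set-Service 'MSSQL$DOWSQL2019' -StartupType Disabled
Stop-Service -Name 'MSSQL$DOWSQL2019' -Force

# Analysis Services - Multidimensional (placeholder instance name)
Set-Service 'MSOLAP$DOWMD2019' -StartupType Disabled
Stop-Service -Name 'MSOLAP$DOWMD2019' -Force

# Analysis Services - Tabular (placeholder instance name)
Set-Service 'MSOLAP$DOWTAB2019' -StartupType Disabled
Stop-Service -Name 'MSOLAP$DOWTAB2019' -Force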

Turning Them on with PowerShell

The process and the code are similar. You first need to enable the services and then start them up.

Set-Service 'MSSQL$DOWSQL2019' -StartupType Manual
Start-Service -Name 'MSSQL$DOWSQL2019' 

I chose the startup type Manual, so I still need to turn the services on myself. It is possible to set that to Automatic if you want them to start up on a reboot, for instance. I then saved these commands in a file called SQLServerOn.ps1.

Running the PS1 Files

There are a couple of options to execute your file. While I was testing, I used the PowerShell window I had open. To execute your file, replace "YOUR PATH" with the full path to the script.

PS C:\WINDOWS\system32> & "YOUR PATH\SQLServerOn.ps1"

While that is a good pattern, I was still looking for a "one-click" solution. Some more digging turned up a way to execute a PowerShell file from a shortcut. I did this by going to the desktop, right-clicking an empty space, and creating a new, blank shortcut. Here is the command you can use for the shortcut:

powershell.exe -noexit -File "YOUR PATH\SQLServerOn.ps1"

Add the snippet above, with the proper path for your script, as the shortcut location. Once you have named your shortcut and saved it, open the properties on the shortcut, go to the Advanced settings, and set it to run as Administrator (this will not work without that setting turned on). You can now pin the shortcut somewhere convenient, such as your taskbar.

While I focused on SQL Server services, this will work for other services as well. Enjoy!

Power BI, Dataflows, CDM, and Dynamics 365 – My Journey

Journey into Dataflows, Dynamics, and CDM

My company, Pragmatic Works, is working through a Dynamics 365 migration. As part of that migration, we need reports. We have made the decision, seemingly obvious given who we are, to use Power BI for all of our internal reporting. Throughout the process, I have been working with our migration team a lot and have been tasked with handling key reporting for the consulting team. We are implementing both CRM and PSA (Project Service Automation) as part of the rollout. I am responsible for reporting that supports the non-Sales operations of the consulting organization. This series of posts will follow my journey to get a good solution in place. I will give you the resources I used, the advice from the pros on my team, and anything else I can share to help your journey along as well.

I want to caveat that this is my journey through the process. I am sure some mistakes will be made along the way, but we should all learn together. I am not doing this in a vacuum; I have some very talented Power BI team members helping me with strategy and design. Their contributions will be reflected throughout the process, but I want to give them credit up front.

Evaluating Power BI Dataflows vs Power BI Shared Datasets

I started the process by trying to determine the best option for building data models that can be used for ongoing report creation within our company. I was familiar with shared datasets, and with the latest improvements in the service, they were a good place to start. However, I have been talking with the team about Power BI Dataflows (and Azure Data Factory Dataflows, but those are not relevant here). I put it out to the group mentioned above to discuss the pros and cons. Overall, the team pointed out that Dataflows with Dynamics would be the best fit; however, there is not much out there on Dataflows in action.

Brian Knight and I were having a different conversation about the Common Data Model (CDM) and PowerApps for other projects. During that conversation, Brian also mentioned that the CDM was ideal for working with Dynamics data.

Coming full circle, the team agreed that CDM with Dynamics and Dataflows is a good way to go. And so starts the journey. I will be documenting the research and progress along the way here. I will be posting whenever I can. Here we go!