PASSMN June 2020 – Data Classification with SQL Server and Azure

I presented at the virtual Minnesota SQL Server User Group meeting on June 16, 2020. The topic was data classification with SQL Server 2019 and Azure SQL Database.

Data Classification Basics

Data classification in both SQL Server and Azure allows you to discover and label data based on information type and sensitivity. Information type describes the content of the data at a high level and includes types such as Address, Name, Networking, and Credit Card. By tagging your columns with types, you can easily see what kinds of data are stored in your tables. You can also label the sensitivity, with labels such as Confidential and Confidential - GDPR.

Using SQL Server 2019 and SSMS 18.4+

For on-premises implementations, you can use SQL Server Management Studio. I recommend SSMS 18.4 or greater, which has the most capability. SQL Server 2019 includes the sys.sensitivity_classifications system catalog view, so you can query to see which fields have been labeled.
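For example, a quick way to see everything that has been labeled is to join the catalog view back to the table and column metadata. This is a sketch against SQL Server 2019, using the documented columns of the view:

```sql
-- List every classified column with its table, information type, label, and rank.
SELECT
    SCHEMA_NAME(t.schema_id) AS schema_name,
    t.name                   AS table_name,
    c.name                   AS column_name,
    sc.information_type,
    sc.label,
    sc.rank_desc
FROM sys.sensitivity_classifications AS sc
JOIN sys.columns AS c
    ON c.object_id = sc.major_id
   AND c.column_id = sc.minor_id
JOIN sys.tables AS t
    ON t.object_id = sc.major_id;
```

The view stores the classified object as major_id (the table) and minor_id (the column), which is why the joins above are needed to get readable names.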

To get started, open SSMS. Right-click the database and choose Tasks > Data Discovery and Classification > Classify Data.

Finding the Data Discovery and Classification Options in SSMS

This opens the Data Classification window in SQL Server, where you get a list of recommendations and the ability to add custom classifications in your SQL Server database.

The Data Classification view in SSMS

Once you have classified some of your data, you are able to view a report that shows the coverage of the classification work you have done.

Data Classification Report in SSMS

Adding Data Classification in Azure SQL Database

Azure SQL Database supports similar functionality for discovering and classifying data. The primary differences are that (1) it requires Advanced Data Security, which costs $15 per server per month, and (2) audit logging support is built in.

You can find this in the Azure portal with your SQL Database.

Advanced Data Security in Azure SQL Database

As you can see above, you initially get a visual summary. Click the Data Discovery & Classification panel to open a classification window similar to the one in SSMS. This will allow you to discover and classify your data.

The key difference is the ability to turn on auditing and log information about who is querying the classified data. In the Security section of your SQL Database view in the Azure portal, choose Auditing. You can now add auditing to your server or database. (Click here for information about setting up Auditing.) I chose to use Log Analytics, which is in preview. Log Analytics has a dashboard that shows activity against this data in your database.

Log Analytics Dashboard which Shows Access to Sensitive Data

You can click into the dashboard to dig into details, and you can use the Log Analytics query features to build your own queries to further analyze the data. The details include who accessed the information, their IP address, and what was accessed. You can build more reports from that information to support more sophisticated auditing.

Final Thoughts

I think there is still work to be done on SQL Server to better support auditing; Azure is ahead of the game in this area. More importantly, Azure logging is a platform-level solution, so you should be able to integrate logging from your applications through to the database in Azure.

You do have the ability to update the policy in SQL Server with a JSON file. I recommend you export the file and modify it. In Azure, you can update the information protection policy in Security Center. Updating this policy allows you to discover and classify data based on rules you set up. This should be part of your data governance plan.

One other follow-up from the meeting. The question was raised about Visual Studio support in database projects. The answer is “sort of”. First, you need to make sure your project is targeting SQL Server 2019 or Azure SQL Database. Once that is set, you can use the following code to add the classification manually, or you can apply it to your database and do a schema compare to bring it in.

    -- The table and column below are placeholders; substitute your own.
    ADD SENSITIVITY CLASSIFICATION TO [dbo].[Customer].[FirstName]
    WITH (LABEL = 'Confidential - GDPR', LABEL_ID = 'fe62dcde-72c0-475c-b1af-fb8de4c8fc7e', INFORMATION_TYPE = 'Name', INFORMATION_TYPE_ID = '57845286-7598-22f5-9659-15b24aeb125e', RANK = MEDIUM);

You will need to know the GUIDs for the labels and types in your solution to do this manually. However, once this is done, you can see the information in the Properties window for the field as well.

Data Classification Properties in Visual Studio

The key thing to be aware of is that the properties are read-only. You have to use code to change them, or make the changes in the database and use Schema Compare to bring them in.

Thanks again to those of you who joined us at the meeting. Here is the slide deck from that meeting. I look forward to sharing more with all of you later.

Data On Wheels – Merging Two Great Minds

I am very excited to announce that my daughter, Kristyna Hughes, will be joining the Data On Wheels team. Our fates and paths to this point seem parallel though not sequential. Both of us went to college and studied cultural anthropology, which turned out to be a great foundation for working with customers in our respective technology fields. If you have followed me for a while, you know that my career has focused on data and analytics for the past 20 years. During that time, I worked for XATA Corporation, a transportation management company. We were a cutting-edge SaaS technology company that specialized in electronic driver logs. I created my blog while working there, thus Data On Wheels. I have since gone back into consulting but kept the name. Fast forward to today: Kristyna started her career in logistics, which led her to her current employer, GlobalTranz. While there, she created a couple of reports that were well received and got noticed by the business intelligence team. She has since helped them grow their Power BI practice and deliver awesome reports. I am very proud of the work she has done and look forward to her joining Data On Wheels as a co-contributor. Now, let’s hear from her.

Kristyna and Steve at Kristyna’s Graduation from Biola University

Kristyna here – I am thrilled to be joining my dad in the BI world and in writing up the lessons learned along the way. Like he said, I have been deep in the world of Power BI, from administration to building the executive morning report. I believe people in the business are our best assets as data analysts, so I appreciate having learned my logistics knowledge through experience, and I continue to learn from my colleagues in the field. Our mutual background in cultural anthropology allows us to get to the heart of problems efficiently through discussions with stakeholders and bring them valuable insights from their data. While not all companies literally have their data on wheels like logistics, data can turn the wheels of profitable change for any company.

Buffalo Trace Distillery, Frankfort KY

One other thing my dad and I enjoy together is good whiskey from rye to bourbon to scotch. Look for some of our favorites in the Fun Info page as we sip and savor some soul-warming whiskeys.

Starting and Stopping SQL Server with PowerShell

Have you ever had this issue? You are trying to explore new features of SQL Server, or you want to install the latest version for some testing, but SQL Server is a resource hog and kills the performance of your PC. I have been evaluating SQL Server 2019, including both flavors of Analysis Services. That means I am installing the Developer edition of SQL Server on my laptop (a Yoga 900) running Windows 10.


I have dealt with this issue in the past by opening the Services window, setting the Startup Type to Manual, and then turning the services on when I wanted to work with them. In my current use case, I did not want to manage that with a lot of clicks. So, my thought was this should be easy with PowerShell. Of course, nothing is easy when it is not something you do every day. So I started digging.

I found it is definitely possible, but you need to know the steps that matter, including making sure that PowerShell will execute the scripts on your PC. So here are the steps and code.

Empowering Your PC to Run Your PowerShell

The first issue is that PowerShell will not execute scripts on your PC until you allow it. You will need to open PowerShell as an Administrator (you can leave the window open once we are done here). The following code will allow scripts created on your PC to be executed on your PC.

Set-ExecutionPolicy RemoteSigned

Shutting Down SQL Server with PowerShell

I chose to create two PowerShell files that allow me to execute these steps for all the services I am managing. In my examples, I have three instances of SQL Server – Database Engine Services, Analysis Services in Multidimensional mode, and Analysis Services in Tabular mode.

The code required to shut down SQL Server is noted below.

Set-Service 'MSSQL$DOWSQL2019' -StartupType Disabled
Stop-Service -Name 'MSSQL$DOWSQL2019' -Force

Let’s break it down. The first line sets the StartupType to Disabled, which prevents the service from restarting on a reboot; you will need to be intentional about restarting it. The second line stops the service. The “Force” flag will shut down dependent services, like SQL Agent, if you have them running.

You will need to know the service name. SQL Server Database Engine services are typically named “MSSQL$” followed by the instance name. For Analysis Services, the name uses “MSOLAP$” as the prefix.
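If you are not sure of the exact names on your machine, you can list them with a wildcard lookup. This is a quick check I find handy, not something from the steps above:

```powershell
# List SQL Server Database Engine and Analysis Services instances
# installed on this machine, along with their current state.
Get-Service -Name 'MSSQL$*', 'MSOLAP$*' |
    Select-Object Name, DisplayName, Status, StartType
```

The part after the “$” in each Name is the instance name you will plug into the scripts below.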

You can run these commands directly in your PowerShell window; I did this while testing them. I wrote them in Visual Studio Code but had issues executing them there. You may be fine, but I wanted to share my experience. Once I had my three instances scripted, I saved them in a file called SQLServerOff.ps1.
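Rather than repeating the two lines per instance, the whole file can be a simple loop. This is a sketch of my SQLServerOff.ps1; the instance names are just examples from my laptop, so substitute your own:

```powershell
# SQLServerOff.ps1 - disable and stop each SQL Server service.
# The instance names below are examples; replace them with yours.
$services = 'MSSQL$DOWSQL2019', 'MSOLAP$DOWSQL2019', 'MSOLAP$DOWTABULAR'

foreach ($service in $services) {
    Set-Service $service -StartupType Disabled   # prevent restart on reboot
    Stop-Service -Name $service -Force           # stop it now, with dependents
}
```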

Turning Them on with PowerShell

The process and the code are similar. You first need to enable the services and then start them.

Set-Service 'MSSQL$DOWSQL2019' -StartupType Manual
Start-Service -Name 'MSSQL$DOWSQL2019' 

I chose the startup type Manual, so I still need to turn the services on myself. You can set it to Automatic if you want them to start on a reboot, for instance. I then saved these commands in a file called SQLServerOn.ps1.
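Mirroring the shutdown file, SQLServerOn.ps1 can be the same loop with the enable and start steps. Again, the instance names are my examples, so use your own:

```powershell
# SQLServerOn.ps1 - re-enable and start each SQL Server service.
# The instance names below are examples; replace them with yours.
$services = 'MSSQL$DOWSQL2019', 'MSOLAP$DOWSQL2019', 'MSOLAP$DOWTABULAR'

foreach ($service in $services) {
    Set-Service $service -StartupType Manual   # or Automatic to start on reboot
    Start-Service -Name $service
}
```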

Running the PS1 Files

There are a couple of options for executing your file. While I was testing, I used the PowerShell window I had open. To execute your file, replace “YOUR PATH” with the full path to the script.

PS C:\WINDOWS\system32> & "YOUR PATH\SQLServerOn.ps1"

While that is a good pattern, I was still looking for a “one-click” solution. Some more digging turned up a way to execute a PowerShell file from a shortcut. I did this by going to the desktop, right-clicking an empty space, and creating a new, blank shortcut. Here is the command you can use for the shortcut:

powershell.exe -noexit -File "YOUR PATH\SQLServerOn.ps1"

Add the snippet above, with the proper path to your code, as the shortcut location. Once you have named your shortcut and saved it, open the shortcut’s properties, go to the advanced settings, and set it to run as Administrator (this will not work without that setting turned on). You can now put the shortcut somewhere convenient, such as your taskbar.
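If you would rather not rely on the shortcut’s run-as-Administrator setting, a common alternative (not part of my original setup, just a pattern you may find useful) is to have the script elevate itself when needed:

```powershell
# Re-launch this script elevated if it is not already running as Administrator.
$identity  = [Security.Principal.WindowsIdentity]::GetCurrent()
$principal = New-Object Security.Principal.WindowsPrincipal($identity)
if (-not $principal.IsInRole([Security.Principal.WindowsBuiltInRole]::Administrator)) {
    Start-Process powershell.exe -Verb RunAs -ArgumentList "-NoExit -File `"$PSCommandPath`""
    exit
}
```

Placed at the top of SQLServerOn.ps1 or SQLServerOff.ps1, this prompts for elevation (UAC) and then re-runs the same file.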

While I focused on SQL Server services, this will work for other services as well. Enjoy!

Power BI, Dataflows, CDM, and Dynamics 365 – My Journey

Journey into Dataflows, Dynamics, and CDM

My company, Pragmatic Works, is working through a Dynamics 365 migration. As part of that migration, we need reports. We have made the decision, seemingly obvious given who we are, to use Power BI for all of our internal reporting. Through the process, I have been working with our migration team a lot and have been tasked to handle key reporting for the consulting team. We are implementing both CRM and PSA (Project Service Automation) as part of the rollout. I am responsible for reporting that supports the non-Sales operations for the consulting organization. This series of posts will follow my journey to get a good solution in place. I will give you the resources I used, the advice from the pros on my team, and anything else I can share to help your journey along as well.

I want to caveat that this is my journey through the process. I am sure some mistakes will be made along the way, but we should all learn together. I am not doing this in a vacuum; I have some very talented Power BI team members helping me with strategy and design. Their contributions will be reflected throughout the process, but I want to give them credit now for sure.

Evaluating Power BI Dataflows vs Power BI Shared Datasets

I started the process by trying to determine the best option for building data models that can be used for ongoing report creation within our company. I was familiar with shared datasets, and with the latest improvements in the service, they were a good place to start. However, I had been talking with the team about Power BI Dataflows (and Azure Data Factory data flows, but those are not relevant here). I put it out to the group above to discuss pros and cons. Overall, the team pointed out that Dataflows with Dynamics would be the best fit; however, there is not much out there on Dataflows in action.

Brian Knight and I were having a different conversation about the Common Data Model (CDM) and PowerApps for other projects. During that conversation, Brian also mentioned that the CDM was ideal for working with Dynamics data.

Coming full circle, the team agreed that CDM with Dynamics and Dataflows is a good way to go. And so starts the journey. I will be documenting the research and progress along the way here. I will be posting whenever I can. Here we go!

Power Platform Conference – 7/23-24/2019

Pragmatic Works is hosting a 2-day Power Platform online conference. If you are unable to get out to an onsite conference but need some help with Microsoft Flow, PowerApps, or Power BI, this is a great opportunity to learn from experts from around the world.

Information About the Conference

This is a 2-day virtual conference covering Power BI, PowerApps and Flow. The cost is $49 for all sessions, recordings and bonus materials. There will be 12 sessions each day.

Dates: July 23 & 24, 2019

Times: 7 a.m. to 7 p.m. ET

I hope to see many of you there virtually!