Power BI Embedded: Using Row-Level Security for Multilingual Reports

Building multilingual reports in Power BI involves several moving parts. Translations can be added to PBIX files to translate column names, visual titles, etc., but these translations will not impact the data itself. One way to translate the dimension values themselves (star schema introduction) is to have a translation table that contains a display language column and a translated value column for each ID of the dimension. In the example below, the translation table would replace the original table and we could apply RLS to the Language column.
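
For illustration, a hypothetical Product translation table might look like this (the table, column names, and values are made up for this example):

ProductID   Language   Product Name
1           en-US      Mountain Bike
1           fr-FR      Vélo de montagne
2           en-US      Helmet
2           fr-FR      Casque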

Data Model Edits

To make RLS work in this manner, that is, to filter down one table and not the entire data model, there are a few things that need to be set up. First, you will need a bridge table for every translated table, containing a unique list of that table's IDs. Second, the RLS filter needs to be added to each table that has a translation. Lastly, the relationships need to be re-established so that each translated table is related only to its bridge table.
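
Continuing the hypothetical example above, the bridge table is simply the distinct list of IDs from the translated table:

Product Bridge
ProductID
1
2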

Add RLS Filters

In order to apply RLS dynamically in an embedded Power BI scenario, the locale needs to be passed through the embed token as an argument within the UPN (dynamic RLS instructions). Unfortunately, Power BI can only receive one value per embed token, so if you need true RLS on the report along with dynamic filtering for the translation, you will need to pass both arguments through the embed token and separate them later. For example, if the UPN of the user was example@test.com, you can program the embed token to send the default browser locale sent by that user and the UPN of the signed-in user in this format: UPN|locale.
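
Below is a minimal sketch of what the token generation side of this can look like using the Microsoft.PowerBI.Api .NET client library. This is not code from the original solution: the helper name, the role name "Translations", and the way the workspace, report, dataset, UPN, and locale are passed in are all placeholders for illustration.

using System;
using System.Collections.Generic;
using Microsoft.PowerBI.Api;
using Microsoft.PowerBI.Api.Models;

// Hypothetical helper: combines the signed-in user's UPN and browser locale
// into a single "UPN|locale" value and requests an embed token with that
// value as the effective identity's username.
public static EmbedToken GenerateTranslationEmbedToken(
    PowerBIClient client, Guid workspaceId, Guid reportId, string datasetId,
    string upn, string locale)
{
    // e.g. "example@test.com" + "en-US" becomes "example@test.com|en-US",
    // which PATHITEM() splits apart inside the RLS role filter
    string effectiveUsername = $"{upn}|{locale}";

    var rlsIdentity = new EffectiveIdentity(
        username: effectiveUsername,
        datasets: new List<string> { datasetId },
        roles: new List<string> { "Translations" }); // assumed RLS role name

    var tokenRequest = new GenerateTokenRequest(
        accessLevel: "View",
        identities: new List<EffectiveIdentity> { rlsIdentity });

    return client.Reports.GenerateTokenInGroup(workspaceId, reportId, tokenRequest);
}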

It looks a bit goofy, but the pipe delimiter allows us to separate the two RLS arguments via the DAX function PATHITEM(). With PATHITEM(), we can tell the RLS filter to pick either what's before the pipe (position 1, the UPN in this case) or what's after it (position 2, the locale) to pass through to the role filter. The filter for the example above would be:

[Language] = PATHITEM(USERPRINCIPALNAME(),2)

Once you have applied this filter to all of the tables with the Language field, you’ll need to make some adjustments to your data model.

Reconnect Relationships

Per the diagram below, you will need to connect the bridge table into your existing data model. Remember, the ONLY relationship your translation table should have is to its bridge table; every fact table joins to the bridge table instead. The joins are as follows:

  • Translation table to bridge table: one many-to-one (* to 1) relationship with bidirectional cross filtering.
  • Bridge table to each fact table: a one-to-many (1 to *) relationship with single-direction cross filtering, so the bridge table filters the fact tables.
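
If you would rather script these relationship changes than click through Power BI Desktop, the same joins can be created with the Tabular Object Model (the Microsoft.AnalysisServices.Tabular library used in the C# code further down this page). The sketch below assumes hypothetical table and column names (Product Translation, Product Bridge, Sales, ProductID) and assumes the old relationships to the translation table have already been removed:

// model is an already-connected Microsoft.AnalysisServices.Tabular.Model

// Translation table -> bridge table: * to 1 with bidirectional cross filtering
model.Relationships.Add(new SingleColumnRelationship()
{
    Name = "ProductTranslation to ProductBridge", // placeholder name
    FromColumn = model.Tables["Product Translation"].Columns["ProductID"],
    FromCardinality = RelationshipEndCardinality.Many,
    ToColumn = model.Tables["Product Bridge"].Columns["ProductID"],
    ToCardinality = RelationshipEndCardinality.One,
    CrossFilteringBehavior = CrossFilteringBehavior.BothDirections
});

// Bridge table -> fact table: 1 to * with single-direction cross filtering,
// so the bridge table filters the fact table
model.Relationships.Add(new SingleColumnRelationship()
{
    Name = "ProductBridge to Sales", // placeholder name
    FromColumn = model.Tables["Sales"].Columns["ProductID"],
    FromCardinality = RelationshipEndCardinality.Many,
    ToColumn = model.Tables["Product Bridge"].Columns["ProductID"],
    ToCardinality = RelationshipEndCardinality.One,
    CrossFilteringBehavior = CrossFilteringBehavior.OneDirection
});

model.SaveChanges();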

NOTE: do not use CROSSFILTER() in measures that reference tables with RLS applied. Those measures will break in the online Power BI service.

And that’s all there is to it! Hopefully this saves you time when building a multilingual, embedded report. Happy coding!

Additional Resources

2022 PASS Data Community Summit Recap: Power BI Meets Programmability – XMLA, TOM, C#

Thank you to everyone who made it out to PASS Data Community Summit! I have greatly enjoyed meeting so many fellow data nerds and have learned so much from all of you. Speaking in person is always a joy, but this crowd was one of the best. Thank you for laughing with me and asking great, engaging questions. If you haven't already, please connect with me on LinkedIn and reach out with any additional questions or fun stories as you begin to scale your Power BI! If you have seen this presentation before, please check out the section where we use actual DAX inside a calculated column and measure instead of just text. It was an adventure to update the code the day of presenting, but well worth it to add a little more flair to the presentation.

As promised, below is the code used in my presentation as well as a link to my GitHub with the presentation and zip file to run the code on your own laptop. Happy coding everyone!

Abstract

Tune in to learn how to programmatically add columns and measures to Power BI data models using TOM, XMLA, and C#! It is rare to find a Power BI developer who has a background in C#, but C# and other programming languages offer a lot of automation and scalability that is lacking in Power BI development. The XMLA endpoint is a powerful tool available in the online Power BI service that allows report developers to connect to their data model and adjust a variety of entities outside the Power BI Desktop application. For example, the XMLA endpoints can be used within a pipeline triggered by an application to update a Power BI model schema. This allows end users to create custom UDFs (user defined fields) on the fly and delete them. Similarly, developers can create and use translations for customer specific column renames without worrying about breaking visuals and complicated data models.

Links

Final C# Code (now with DAX calculated columns!)

using System;
using Microsoft.AnalysisServices.Tabular;

namespace TOM_Testing_Live
{
        internal class Program
        {
        static void Main(string[] args)
        {
//-------------- Connect to PBI Premium Workspace ---------------- //

            // create the connect string - powerbi://api.powerbi.com/v1.0/myorg/WORKSPACE_NAME
            string workspaceConnection = "powerbi://api.powerbi.com/v1.0/myorg/Happy%20Coding";
            string connectString = $"DataSource={workspaceConnection};";
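            // note: with no credentials supplied in the connect string, Connect() will typically prompt for an interactive Azure AD sign-in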

            // connect to the Power BI workspace referenced in connect string
            Server server = new Server();
            server.Connect(connectString);


//---------------- List out current state before we make changes -------------------//

            // enumerate through datasets in workspace to display their names
            foreach (Database database in server.Databases)
            {
                Console.WriteLine($"ID : {database.ID}, Name : {database.Name}, CompatibilityLevel: {database.CompatibilityLevel}, Last Updated : {database.LastSchemaUpdate}");
            }

            // enumerate through tables in one database (use the database ID from previous step)
            Model model = server.Databases["5d161e6b-697a-49b6-a3e6-7d19b940a8fd"].Model;

            //if you don't specify a database, it will only grab models from the first database in the list
            foreach (Table table in model.Tables)
            {
                Console.WriteLine($"Table : {table.Name} IsHidden? : {table.IsHidden}");
            }

            // Specify a single table in the dataset
            Table table_product = model.Tables["Product"];

            // Specify a single table in the dataset (our measure table)
            Table table_measures = model.Tables["_Measures"];

            // List out the columns in the product table
            foreach (Column column in table_product.Columns)
            {
                Console.WriteLine($"Columns: {column.Name}");
            }

            // List out the measures in the product table
            foreach (Measure productmeasures in table_product.Measures)
            {
                Console.WriteLine($"Measures: {productmeasures.Name}");
            }

            // List of translations on the model
            foreach (Culture culture in model.Cultures)
            {
                Console.WriteLine($"Existing Culture: {culture.Name}");
            }

            // List out the hierarchies in the product table
            foreach (Hierarchy hierarchy in table_product.Hierarchies)
            {
                Console.WriteLine($"Hierarchies: {hierarchy.Name}, Lineage Tag = {hierarchy.LineageTag}");
            }

            // List out the levels in our category hierarchy
            if (table_product.Hierarchies.ContainsName("Category Hierarchy"))
                {
                Hierarchy hierarchy_category_ = table_product.Hierarchies["Category Hierarchy"];
                foreach (Level level_categoryhierarchy in hierarchy_category_.Levels)
                {
                    Console.WriteLine($"Category Hierarchy Level {level_categoryhierarchy.Ordinal}: {level_categoryhierarchy.Name} Lineage Tag: {level_categoryhierarchy.LineageTag} from {level_categoryhierarchy.Column.Name}");
                }
            }
            else
                {
                Console.WriteLine($"No Category Hierarchy");
                }


        
 //----------------Deleting columns and measures ----------------//
   /*         
           // Deleting a column if it exists
             if (table_product.Columns.ContainsName("Testing"))
             //this looks to see if there is a column already named "Testing"
             {
                 Console.WriteLine($"Column Exists");
                 table_product.Columns.Remove("Testing"); //if the column exists, this will remove it
                 Console.WriteLine($"Column Deleted");
             }
             else
             {
                 Console.WriteLine($"Column Does Not Exist");
             } 

             // Deleting a measure if it exists
             if (table_measures.Measures.ContainsName("Test Measure"))
             //this looks to see if there is a measure already named "Test Measure"
             {
                  Console.WriteLine($"Measure Exists");
                  table_measures.Measures.Remove("Test Measure"); //if the measure exists, this will remove it
                  Console.WriteLine($"Measure Deleted");
              }
              else
              {
                  Console.WriteLine($"Measure Does Not Exist");
              }

*/
//--------------- Adding columns and measures --------------------//            

              // Adding our column if it doesn't already exist
                if (table_product.Columns.ContainsName("Testing")) //this looks to see if there is a column already named "Testing"
                {
                    Console.WriteLine($"Column Exists");
                }
                else
                {
                    Column column_testing = new DataColumn() //this will add the column
                    {
                        Name = "Testing",  //name your column for Power BI
                        DataType = DataType.String, //set the data type
                        SourceColumn = "Product", //this must match the name of the column your source 
                        LineageTag = "product-testing-column"
                    };
                    table_product.Columns.Add(column_testing);
                    Console.WriteLine($"Column Created!");
                }

                //Get the partition sources for each table
                foreach (DataSource datasource in model.DataSources)
                {
                    Console.WriteLine($"Datasource : {datasource.Name}");

                };

                // Add a measure if it doesn't already exist in a specified table 
                if (table_measures.Measures.ContainsName("Test Measure"))
                {
                    Measure measure = table_measures.Measures["Test Measure"];
                    measure.Expression = "\"Hello there\""; //you can update an existing measure using this script
                    Console.WriteLine($"Measure Exists");
                }
                else
                {
                    Measure measure = new Measure()
                    {
                        Name = "Test Measure",
                        Expression = "\"Hello World\"", //you can also use DAX here
                        LineageTag = "test-measure"
                    };
                    table_measures.Measures.Add(measure);
                    Console.WriteLine($"Measure Added");
                }

//--------------- Adding columns and measures with DAX functions! --------------------//            

            // Adding our column if it doesn't already exist
            if (table_product.Columns.ContainsName("List Price w Tax")) //this looks to see if there is a column already named "List Price w Tax"
            {
                Console.WriteLine($"Calculated Column Exists");
            }
            else
            {
                Column column_testing = new CalculatedColumn() //this will add the column
                {
                    Name = "List Price w Tax",  //name your column for Power BI
                    DataType = DataType.Decimal, //set the data type
                    Expression = "'Product'[List Price] * 1.05", //this is the DAX used to create your new column!
                    LineageTag = "testing_w_dax-column"
                };
                table_product.Columns.Add(column_testing);
                Console.WriteLine($"Calculated Column Created!");
            }


            // Add a measure if it doesn't already exist 
            if (table_measures.Measures.ContainsName("DAX Measure Example"))
            {
                Measure measure = table_measures.Measures["DAX Measure Example"];
                measure.Expression = "SUM(Sales[Total Product Cost])+100"; //you can update an existing measure using this script
                Console.WriteLine($"DAX Measure Exists");
            }
            else
            {
                Measure measure = new Measure()
                {
                    Name = "DAX Measure Example",
                    Expression = "SUM(Sales[Total Product Cost])", 
                    LineageTag = "test-measure-with-dax"
                };
                table_measures.Measures.Add(measure);
                Console.WriteLine($"DAX Measure Added");
            }




//------------------- Manipulating Hierarchies ---------------------//

                        //Editing an existing hierarchy originally called Category Hierarchy
                        //Once you rename the hierarchy, you'll have to adjust this code to the renamed Hierarchy
                        Hierarchy hierarchy_category = table_product.Hierarchies["Category Hierarchy Rename"];
                        {
                            hierarchy_category.Name = "Category Hierarchy Rename"; //this renames the hierarchy, note the lineage tag will remain unchanged
                            Console.WriteLine($"Category Hierarchy Renamed");
                         }

                        //Editing an existing hierarchy level 
                          Level level_Category = hierarchy_category.Levels.FindByLineageTag("ca792793-d3c3-4b5d-9bee-2b46c01833bb");
                          Level level_Subcategory = hierarchy_category.Levels.FindByLineageTag("4304e645-bb8f-4d7e-b25c-f629be2110d8");
                          {
                              level_Category.Name = "Category";
                              level_Category.Ordinal = 1;
                              level_Subcategory.Name = "Subcategory";
                              level_Subcategory.Ordinal = 0;
                              Console.WriteLine($"Category Hierarchy Levels Renamed & Reordered");
                          }

                        //Adding a new level to the hierarchy if it doesn't already exist
                          if (hierarchy_category.Levels.ContainsName("Model"))
                          {
                              Console.WriteLine($"Hierarchy Level Exists");
                          }
                          else
                          {
                              Level level_Model = new Level()
                              {
                                  Name = "Model",
                                  Ordinal = 2,
                                  Column = table_product.Columns.Find("Model"),
                                  LineageTag = "hierarchy-model-column"
                              };
                              hierarchy_category.Levels.Add(level_Model);
                              Console.WriteLine($"Hierarchy Level Added");
                          }


                         //Add a new hierarchy if it doesn't already exist
                           if (table_product.Hierarchies.ContainsName("New Hierarchy"))
                           {
                               Console.WriteLine($"New Hierarchy Exists");
                           }
                           else
                           {
                               Hierarchy hiearchy_new = new Hierarchy()
                               {
                                   Name = "New Hierarchy",
                                   LineageTag = "new-hierarchy-lineage-tag"
                               };
                               table_product.Hierarchies.Add(hiearchy_new);
                               Console.WriteLine($"Hierarchy Added");
                         //Creating levels to add to the new hierarchy
                               Level level_one = new Level()
                               {
                                   Name = "Model",
                                   Ordinal = 0,
                                   Column = table_product.Columns.Find("Model"),
                                   LineageTag = "product-model-column"
                               };
                               Level level_two = new Level()
                               {
                                   Name = "Product",
                                   Ordinal = 1,
                                   Column = table_product.Columns.Find("Product"),
                                   LineageTag = "product-product-column"
                               };
                               hiearchy_new.Levels.Add(level_one);
                               hiearchy_new.Levels.Add(level_two);
                               Console.WriteLine($"Levels added to new hiearchy");
                           };

             
            //-------------------------- Translations ------------------------------//

            //Translations can be used to rename existing columns without rebuilding the model. This also updates any visuals that use that column.          
            
                        // List of translations on the model
                        foreach (Culture culture in model.Cultures)
                        {
                            Console.WriteLine($"Existing Culture: {culture.Name}");
                        }
                        // Let's get a list of the existing translations within the en-US culture
                        Culture enUsCulture = model.Cultures.Find("en-US");

                        foreach (ObjectTranslation objectTranslation in enUsCulture.ObjectTranslations)
                        {
                            Console.WriteLine($"Translated Object: {objectTranslation.Value}");
                        }
                        // Narrow down what column within this culture/language you would like to add the translation to
                        MetadataObject dataColumn = table_product.Columns.Find("SKU"); //this needs to always be the original column name within the data model.
                        ObjectTranslation proposedTranslation = enUsCulture.ObjectTranslations[dataColumn, TranslatedProperty.Caption];

                        // Only one translation per entity per culture.
                        if (proposedTranslation != null)
                        {
                           Console.WriteLine($"Translation Exists for this Culture & Column combo");
                           enUsCulture.ObjectTranslations.Remove(proposedTranslation); //need to remove the existing translation to overwrite it
                           ObjectTranslation overwriteTranslation = new ObjectTranslation()
                           {
                               Object = dataColumn,
                               Property = TranslatedProperty.Caption,
                               Value = "Cool Stuff"
                           };
                           enUsCulture.ObjectTranslations.Add(overwriteTranslation);
                        }
                        else
                        {
                           ObjectTranslation newTranslation = new ObjectTranslation()
                           {
                               Object = dataColumn,
                               Property = TranslatedProperty.Caption,
                               Value = "Total Rad"
                           };
                           enUsCulture.ObjectTranslations.Add(newTranslation);
                        }             

                       // List out the translations to see what they are now that we have run the script    
                       foreach (ObjectTranslation objectTranslation in enUsCulture.ObjectTranslations)
                       {
                           Console.WriteLine($"Final Translated Object: {objectTranslation.Value}");
                       }
            
            //------------------- List out end state --------------------------------//

            // List out the columns in the product table one more time to make sure our column is added
            foreach (Column column in table_product.Columns)
            {
                Console.WriteLine($"Columns: {column.Name}");
            }         

            // List out the measures in the table one more time to make sure our measure is added
            foreach (Measure productmeasures in table_measures.Measures)
            {
                Console.WriteLine($"Measures: {productmeasures.Name}");
            }

            // List out the hierarchies in the product table
            foreach (Hierarchy hierarchy in table_product.Hierarchies)
            {
                Console.WriteLine($"Hierarchies: {hierarchy.Name}, Lineage Tag: {hierarchy.LineageTag}");
                foreach (Level level_hierarchy in hierarchy.Levels)
                {
                    Console.WriteLine($"Level {level_hierarchy.Ordinal}: {level_hierarchy.Name}, Lineage Tag: {level_hierarchy.LineageTag} from {level_hierarchy.Column.Name}");
                }
            }

//-------------- Refresh our version of the data model then push/save changes back to the model in the PBI service -------------//
            table_product.RequestRefresh(RefreshType.Full);
            table_measures.RequestRefresh(RefreshType.Full);
            model.RequestRefresh(RefreshType.Full);
            model.SaveChanges(); 
            //make sure this is the last line! Note, this line will not work unless your dataset has proper data source credentials connected in the service

            Console.WriteLine($"Script Complete!");


        }
    }
}



250th Blog Post

Kristyna and I were working through some updates to our site and realized that this was going to be our 250th blog post on Data on Wheels. I thought this would be a good time to reminisce about where we have come from, what has happened through the years, and what is next for Data on Wheels.


Where We Have Come From

On December 7th, 2010, I created a new blog called Data on Wheels on WordPress. My first post was appropriately named "Time to start that blog." It was a nice little paragraph that I'm sure no one actually read, so now is your opportunity to check it out and like that first post. More interestingly, my first series of posts was on the nature of SQL Azure. That was back when SQL Azure only had Web and Business SKUs. Not sure how many of you remember those days, when the max size was 50 gigabytes and we weren't sure if it would go anywhere. The promise of greatness was there, but there was still much to come.

Before we reflect on what has happened and what has changed through the years, I would like to emphasize how my blogging has started and matured through the years. Whenever I’m asked by others “Why do you blog?”, my response is “I blog for me and no one else.” This is important because we want to continue writing about something we enjoy. I turned my blog into a reference dictionary of things I’ve done, things I’ve heard, and things I did not want to forget. I have also written numerous tribute blogs through the years which highlighted mentors and others who have shaped my career.

What Has Happened Through the Years

We are coming up on year 13 of blogging. There was a period of a few years where my readership was really high because everyone wanted to figure out how to make Excel cool when working with SharePoint. Some of those posts are still being hit regularly today. I’ve also posted on many events that I have attended or presented at. I use my blog as a location for code and presentation materials.

I started blogging when I returned to Magenic. Since then, my journey has taken me to Pragmatic Works and from there to 3Cloud, where I am today. Along the way I have written blogs on XMLA, Excel, SQL Server, Azure, window functions, and Power BI. I'm sure I've covered many more topics than that, including highlighting mentors such as my parents and my wife. Looking back at the various posts, it is interesting to see the number of events I either wrote about or presented at.

Me and My Wife Sheila
My Father-in-Law Ed
My Parents Jeanine and Terry

As my role changed through the years from a highly technical focus to more of a people focus, my writing diminished in count. It seemed I moved away from the technology more than I would have liked, but managing was definitely a new area for me to work on. On June 8th of 2020, in the middle of a pandemic, I asked my daughter Kristyna to join me in writing this blog. She was just starting her career in Power BI and had done some great things that she wanted to write about. Having her join the Data on Wheels team did a couple of things. First, it gave us fresh technical content for the blog readers. Second, it gave Kristyna an opportunity to work with a large audience from the beginning. She now holds the record for the most hits in one day for a Data on Wheels post. That is correct, she's beaten my all-time record, which is great!

As you may have read on LinkedIn or in some of my posts, I have been diagnosed with ALS. This has obviously made writing blog posts a little more challenging as I have to use voice to text technology in most cases. I have decided that part of what I will be writing about as we move forward is what technology is working for me to keep me active in the workplace. Some days it’s very much a changing environment as the disease takes over more of my body and we try to figure out how to accommodate what’s next.

What’s Next

We look forward to continuing to provide relevant technical content to our data community. It is an awesome place to be for me, having my daughter ready to take over the leadership of this blog and keep it moving forward. She will continue to write about things that excite her and that she wants to share with this community, as will I. I will focus a lot more on workplace enablement, but as you can see from the notebooks post, sometimes that bleeds over into more technical topics as well.

We are so excited that you have chosen to join us in this journey and look forward to putting together hundreds more posts that have value, or at least some fun, for everyone. Just recently we updated our site and cleaned up some of the artifacts from the last few years. We also added a direct link to Working with ALS at the top of the page for those who are not focused on technology or are only interested in the products and services I am discovering in my journey. Right next to that link is a link about Distilleries. Kristyna and I both enjoy our whiskey and, in her case, some cigars (you should ask her). Our recent move to Kentucky gave us a unique opportunity to visit distilleries and try new whiskeys along the way. Our first post there simply shares which distilleries we have visited and whether we did a tour or a tasting. We'll provide more opinionated information as we move along. This is purely for fun, and we hope you enjoy it and provide feedback along the way as well.

Thanks again for joining us and we hope that you all either get a good laugh or learn something new!

Memphis SQL Saturday 2022 & a Notebook

Back in person again! It is awesome to be able to get back into the SQL community and see fellow data professionals. A huge shout out to the Memphis data community leaders, in particular Zach Golden and Rob Demotsis, who put on a great event for their first one coming out of the pandemic. I was also able to get together with fellow 3Clouders – Dawn Clement and Kristyna Hughes.

Steve, Dawn, Kristyna

A new but different opportunity

For me this was a very special event. Not only is it the first event I’ve been able to do in person since COVID started, but it is also the first event that I have presented at since being diagnosed with ALS. There are times I think I talk about this too much, but it is front and center of who I am now. I want to encourage others who have similar disabilities to remain active as they work in their new reality.

So how did this change for me? Well, having presented on SQL many times through the years, I typically use a method of highlighting code in SQL Server Management Studio and executing it. That, however, would not work in this case. I moved all my code over to a notebook in Azure Data Studio. This allowed me to execute the code a step at a time with a simple button push. To read more about the experience of creating a notebook, check out my previous blog post here.

The other key thing that changed for me was having my wife, Sheila, join me on the platform to push the buttons that I needed for the presentation and the demo. This was definitely a new experience for her and me. She did a great job following my cues, and sometimes a lack thereof. She was able to get us through the demos, leveraging the clever new notebook I used. This is the new normal for us, and I look forward to presenting for as long as I am able.

Sheila and I co-presenting

Azure SQL Elasticity

This was the topic that I spoke on. We covered elastic queries, elastic jobs, and elastic transactions. As promised to the attendees and those of you who are reading this or are following up on my post about notebooks, I have published the notebook on the Data on Wheels GitHub which you can find here.

After you have downloaded the folders from GitHub, open Azure Data Studio and browse to the Notebooks section. Click the Open Jupyter Book button as shown below.

This will open a File Explorer dialog. Choose the Azure SQL database elasticity folder and then click Select Jupyter Book.

This will open the Jupyter book which contains the markdown files with information and the notebooks you need to set up and run the demos. Enjoy!

Thanks to those of you who were able to attend. I hope you enjoyed the event as much as I did!

My experience working with notebooks in Azure Data Studio

I’ve seen notebooks used in Azure Data Studio on multiple occasions. I really like the concept of notebooks, having done some work within Azure Databricks notebooks, but not extensively. As I go into the process that I went through, it’s important to understand that I am not a data scientist and have not done extensive development or spent a lot of time in Python or Jupyter notebooks. Furthermore, my interest in the notebooks was elevated when I realized I wanted to continue presenting while working through my current ALS diagnosis. I have limited use of my hands and arms so highlighting and executing code, especially in front of a crowd, was going to be problematic. (If you want to learn more about my condition and tools I’m using to maintain my ability to work, please check out this series of articles on our blog.)

Let’s start with the core problem that I’m trying to solve today. I will be presenting a session on elastic queries in Azure SQL database. Most of the code is ready to go since I have done this presentation a few times. As I was working through testing my demo, I found executing code by highlighting and pushing “run” in either Data Studio or in SQL Server Management Studio was difficult because I struggled to control highlighting the code. I was also looking for better ways to automate the process, but more about that later. I watched a couple of demos on using notebooks and found some of the notebooks that have been created by Microsoft. I realized I could put together my entire demo package to share with the attendees and build the demo so that I could execute it a step at a time without highlighting. Now that you have the background of what I was trying to accomplish, let’s look at the process I went through getting this done.

How in the world do you work with notebooks in Azure Data Studio?

One of the interesting things about working with notebooks is that if you want to work with them, it's likely that you already have and already prefer to use them. This means that the instructions for how to create, organize, and use notebooks within Azure Data Studio are a bit lacking. For example, it was not entirely clear to me that one part of the process is creating a folder to store your notebooks with your markdown files and other content. So, let's go through the process of creating your first notebook step by step, with explanations about what's happening.

The organization of notebooks and files in Azure Data Studio

Part of my struggle in understanding what was happening was that each time I tried to create a notebook, it asked me for locations and files. I thought it should know where they should go. So, as a newbie with notebooks and organization in Azure Data Studio, I created a notebook and a Jupyter book so I could see how the files are organized. Then I could go back and create the Jupyter book correctly from the beginning. While I may not get all of the terminology correct in this process, this is my discovery as I moved forward through the process.

Once I started working with the notebook process in Azure Data Studio, I realized there were multiple components involved:

  • Jupyter book
  • Markdown file
  • Notebook
  • Section

While I am sure there are simpler ways to create what we would like to do, I'm coming at this entirely from Azure Data Studio as a data developer, not a data scientist. When I first created a Jupyter book, I didn't understand what its purpose was. When you create a Jupyter book, it looks like you're creating a folder. That folder will also contain several helper files to organize your notebooks, markdown files, and sections. Before we leave the structure and organization section here, I want to clarify that the book is the parent folder, and the section is a subfolder within the book. Markdown files and notebooks are files created and organized for particular purposes. The markdown file is effectively a document that allows you to create a nicely formatted informational component for your notebook. The notebook files are actual Jupyter notebook files, which are split into cells for code and text.

Here is the high level organization of the Jupyter book we are going to create:

  • Jupyter book: Azure SQL database elasticity
    • Markdown file: README
    • Section: Setting up the demo
      • Markdown file: Set up instructions
      • Notebook: Prepping the demo
    • Section: Elastic query demo
      • Markdown file: Elastic query demo instructions
      • Notebook: Elastic query demo
    • Section: Elastic job demo
      • Markdown file: Elastic job demo instructions
      • Notebook: Elastic job demo

For the purposes of this blog post, we will walk through the process of creating the original Jupyter book and the elastic query demo section. That section has a good mix of code and text to illustrate the power and capabilities of notebooks.

Creating your first notebook in Azure Data Studio

Let's begin creating our first notebook in Azure Data Studio. Before we dive into this process too deeply, I want to be clear that we are going to create a Jupyter book to add our notebooks to. This is not required, as you can create a new notebook from the File menu or with the shortcut noted on the screen in Azure Data Studio. What confused me about this initially is that you cannot create a simple notebook from the Notebooks section in Azure Data Studio. When you create your notebook, you can save it as a file in the location of your choosing, but it will not show up in the Notebooks section. Once you create a notebook, if you are not using a Jupyter book to host it, you can reopen it just by choosing Open File from the menu. While this may make sense to others, it was not entirely intuitive to me in the beginning. I had to do some mucking around to figure out that process.

So, we will start our process by creating a Jupyter book to host all our notebooks and markdown files. This Jupyter book will also be readily displayed in the Notebooks section of Azure Data Studio. From the More Actions (…) menu in the Notebooks section, choose Create Jupyter Book.

Create new Jupyter book

In the dialog, give your new Jupyter book a name and specify the location you want to store it in. I have not used the optional content folder for this exercise and would recommend that you do not either.

New Jupyter book dialogue

If you go to the folder location you created your Jupyter book in, you will see that it also created three files in the folder named the same as your Jupyter book:

  • _config.yml
  • _toc.yml
  • README.md

In the Notebooks section of Azure Data Studio, you should see your Jupyter book with a README markdown file in it. For now, we will leave the README file as an introduction to what is in your notebook. (Be aware that you can remove the file by deleting it, but you will need to update the TOC file to reflect the changes you made. If you do not update the TOC file, you may see missing file error messages in Azure Data Studio.)

New Jupyter book with README

I will not take time in this post to review what is possible in a markdown file. The key here is that you can update the README file that was created with headers and formatting to provide instructions on how to use the various contents of your Jupyter book. If you double click within the README file, it will open the README.md file in a new tab in Azure Data Studio. This view has line numbers and will allow you to update and add content.

The following code gives you an example of some markdown syntax:

# Welcome to the Jupyter book on Azure SQL Database elasticity
This book contains 3 sections
* The first section contains instructions on how to set up the demo
* The second section contains the demo for elastic queries
* The third section contains a demo for elastic jobs

This will result in the following look and feel in your README file

Formatted README markdown file

Adding a section

The next thing we will do is add a section where we will host the executable demo code. Right click on your Jupyter book and choose Add Section. We will add the title as Elastic query.

Adding the notebook

Up to this point, we have been building the framework to support our first notebook. While all these steps are not required, this is the most complete approach. Right click on your section and choose New Notebook. This will create a Jupyter notebook in the subfolder of your section.

New section with a notebook

Once you create the notebook, it will open a tab in Azure Data Studio with the notebook. You will notice that it has something called Kernel. The kernel allows you to set the default language used for the notebook. For the work that we are doing we will be using the SQL kernel. This will allow us to execute SQL code against a database. In the Attach to dropdown, you will see databases that you can use to execute code. The Cell dropdown allows you to add cells which can contain code or text.

Azure Data Studio supports other kernels that can be used for executing code against various workloads. These include Python, Spark, PySpark, and PowerShell.

Now let us get down to the business of creating a notebook with executable code. Before we add executable code, let us add a text cell as an introduction to the code. You can do this by clicking the Cell dropdown and choosing a text cell. Once you add the text cell, you will notice there is a formatting bar, which ironically is missing in the markdown file editor. This means it is easier to create formatted text in a cell in a notebook than in the markdown file itself. Keep this in mind as you create your notebooks and add content to your Jupyter book. These cells are easier to work with at times than the full file. This is particularly true if you are not knowledgeable about formatting markdown.

At this point, let us add a quick introduction to what we are about to do in the following code cells.

Formatted text cell

Next, we will add a code cell. From the dropdown menu for cell, choose Code Cell. This will add a code cell to your notebook which uses the language selected in your kernel. There is also a play button which allows you to execute the code.

Empty code cell

I am going to add the code that is required to clean up the tables for the demo. The resulting code cell will look like the following:

Code cell with DDL code

As a last step to understanding how notebooks and code work in the environment, we can execute the code by pushing the play button in the code cell. This will return the result of that execution as shown below:

Code cell with results

Congratulations, you have created your first notebook with executable code against a SQL Server database! You can continue to add more text cells and code cells as needed. One of the reasons I like this pattern is that it allows me to execute the code without having to highlight it while doing demos. Each cell can be run independently. You will also notice there is a Run All button if you choose to run all the scripts in your notebook at the same time. This could be valuable if you have a set of maintenance operations or related items that you have collected in a notebook and want to run together.

Another key thing to remember is that notebooks are shareable. Because the connection is outside of the notebook, whoever you share the notebook with will have to connect to an environment that allows them to execute the same code. You can add your notebooks to GitHub or similar source control to manage change and share common resources easily without just distributing SQL files.

Before we wrap up

I feel I would be remiss if I did not also demonstrate what happens when you get data results in a notebook. In my case I have a database I can connect to which has WideWorldImporters loaded into it. I am going to select the top 1000 rows from the DimSupplier table. Once I run the code cell, I get the rows affected, the execution time, and a table with results as shown here:

Code cell with data results

As you can see in the results window, you have several export options and a chart option that you can use to further visualize or work with the data that you have retrieved. I would encourage you to explore these options as it depends on the type of data you are working with whether they work well for you or not. For example, supplier data does not chart very well, whereas if I had used fact data there may have been some interesting charting options. A notebook could be a straightforward way to demonstrate some simple reporting for a technically savvy audience.

Wrapping it up

There are many more functions that I did not cover around notebooks, and I assume that Microsoft will continue to make improvements to the overall capabilities here. I look forward to using notebooks more as a terrific way to share code and run demos. I hope you find this as valuable as well.

For those of you who are not sure about using notebooks, this is an effective way to build your skills without trying to learn a new language if you are familiar with SQL. My first exposure was using Python in a Databricks environment. That was a lot to learn while also trying to understand how notebooks functioned. As the data environment continues to expand and require new skill sets, understanding how to use and leverage notebooks on a regular basis is a good skill to have. Microsoft has done us a great favor by using standard Jupyter notebooks, which are used in data science, Databricks, and other areas of data practice.

If you are following my work enablement series, you know one of the things that I am passionate about is simplifying how I work, in order to stay working while continuing to lose functionality in my arms. Notebooks help with this by allowing me to execute code without highlighting it when doing demos. Because highlighting code and executing it in a tool like SQL Server Management Studio requires multiple touches on the keyboard and mouse, I struggle to do it efficiently. The ability to organize my demo around code cells and then have a self-documenting notebook to pass along to attendees is a huge win for me. I hope this helps others who struggle in the same way. And I hope this was helpful to those who have not used or seen notebooks in their current work environment but may in the future.

I will be creating and sharing a completed notebook for the demos related to my presentation on elastic capabilities with Azure SQL. Look for that presentation follow up from the Memphis SQL Saturday in October 2022. I will publish a follow up blog post with a link to the completed notebook used with that demo.