Back in person again! It is awesome to be able to get back into the SQL community and see fellow data professionals. A huge shout out to the Memphis data community leaders in particular Zach Golden and Rob Demotsis who put on a great event for their first one out of the pandemic. I was also able to get together with fellow 3Clouders – Dawn Clement and Kristyna Hughes.
A new but different opportunity
For me this was a very special event. Not only is it the first event I’ve been able to do in person since COVID started, but it is also the first event that I have presented at since being diagnosed with ALS. There are times I think I talk about this too much, but it is front and center of who I am now. I want to encourage others who have similar disabilities to remain active as they work in their new reality.
So how did this change for me? Well, having presented on SQL many times through the years, I typically use a method of highlighting code in management studio and executing it. That however would not work in this case. I moved all my code over to a notebook in Azure Data Studio. This allowed me to execute the code a step at a time with a simple button push. To read more about the experience of creating a notebook, check out my previous blog post here.
The other key thing that changed for me was having my wife, Sheila, join me on the platform to push the buttons that I needed for the presentation and the demo. This was definitely a new experience for her and me. She did a great job following my cues and sometimes a lack thereof. She was able to get us through the demos and leveraging the clever new notebook I used. This is the new normal for us and I look forward to presenting for as long as I am able.
Azure SQL Elasticity
This was the topic that I spoke on. We covered elastic queries, elastic jobs, and elastic transactions. As promised to the attendees and those of you who are reading this or are following up on my post about notebooks, I have published the notebook on the Data on Wheels GitHub which you can find here.
After you have downloaded the folders from GitHub, Open Azure Data Studio and browse to the Notebooks section. Click the Open Jupyter Book button has shown below.
This will open a File Explorer dialog. Choose azure SQL database elasticity folder and then click Select Jupyter Book.
This will open the Jupyter book which contains the markdown files with information and the notebooks you need to set up and run the demos. Enjoy!
Thanks to those of you who are able to attend. I hope you enjoyed the event as much as I did!
I’ve seen notebooks used in Azure Data Studio on multiple occasions. I really like the concept of notebooks, having done some work within Azure Databricks notebooks, but not extensively. As I go into the process that I went through, it’s important to understand that I am not a data scientist and have not done extensive development or spent a lot of time in Python or Jupyter notebooks. Furthermore, my interest in the notebooks was elevated when I realized I wanted to continue presenting while working through my current ALS diagnosis. I have limited use of my hands and arms so highlighting and executing code, especially in front of a crowd, was going to be problematic. (If you want to learn more about my condition and tools I’m using to maintain my ability to work, please check out this series of articles on our blog.)
Let’s start with the core problem that I’m trying to solve today. I will be presenting a session on elastic queries in Azure SQL database. Most of the code is ready to go since I have done this presentation a few times. As I was working through testing my demo, I found executing code by highlighting and pushing “run” in either Data Studio or in SQL Server Management Studio was difficult because I struggled to control highlighting the code. I was also looking for better ways to automate the process, but more about that later. I watched a couple of demos on using notebooks and found some of the notebooks that have been created by Microsoft. I realized I could put together my entire demo package to share with the attendees and build the demo so that I could execute it a step at a time without highlighting. Now that you have the background of what I was trying to accomplish, let’s look at the process I went through getting this done.
How in the world do you work with notebooks in Azure Data Studio?
One of the interesting things about working with notebooks, is that if you want to work with notebooks, it’s likely that you already have and you prefer to use them. This means that the instructions for how to create, organize, and use notebooks within Azure Data Studio is a bit lacking. For example, it was not entirely clear to me that one part of the process is creating a folder to store your notebooks with your markdown files and other content. So, let’s go through the process of creating your first notebook step by step with explanations about what’s happening.
The organization of notebooks and files in Azure Data Studio
Part of my struggle in understanding what was happening is each time I tried to create a notebook it asked me for locations and files. I thought it should know where they should go. So, as a newbie with notebooks and organization with Azure Data Studio, I created a notebook and a Jupyter book so I could see how the files are organized. Then I could go back and create the Jupyter book correctly from the beginning. While I may not get all of the terminology correct in this process, this is my discovery as I move forward through the process.
Once I started working with the notebook process in Azure Data Studio, I realized there were multiple components involved:
While I am sure there are simple ways to create what we would like to do, I’m coming at this entirely from Azure Data Studio as a data developer not a data scientist. Each time I tried to create my first Jupyter book, I didn’t understand what its purpose was in the beginning. When you create a Jupyter book, it looks like you’re creating a folder. That folder will also contain several helper files to organize your notebooks, markdown files, and sections. Before we leave the structure and organization section here, I want to clarify that the book is the parent folder, and the section is a sub folder within the book. Markdown files and notebooks are files created that are organized for particular purposes. The markdown file is effectively a document that allows you to create a nicely formatted informational component for your notebook. The notebook files are actual Jupyter notebook files which are split into sections for code and text.
Here is the high level organization of the Jupyter book we are going to create:
Jupyter book: Azure SQL database elasticity
Markdown file: README
Section: Setting up the demo
Markdown file: Set up instructions
Notebook: Prepping the demo
Section: Elastic query demo
Markdown file: Elastic query demo instructions
Notebook: Elastic query demo
Section: Elastic job demo
Markdown file: Elastic job demo instructions
Notebook: Elastic job demo
For the purposes of this blog post, we will walk through the process of creating the original Jupyter book and the elastic query demo section. That section has a good mix of code and text to illustrate the power and capabilities of notebooks.
Creating your first notebook in Azure Data Studio
Let’s begin creating our first notebook in Azure Data Studio. Before we dive into this process too deeply, I want to be clear that we are going to create a Jupyter book to add our notebooks to. This is not required as you can create a new notebook from the file menu or with the shortcut as noted on the screen in Azure Data Studio. What confused me about this initially is that you cannot create a simple notebook from the notebooks section in Azure Data Studio. When you create your notebook, you can save it as a file in the location of your choosing, but it will not show up in the notebook section. Once you create a notebook, if you are not using a Jupyter book to host it in, you can reopen it just by choosing OpenFile from the menu. While this may make sense to others, it was not entirely intuitive to me in the beginning. I had to do some mucking around to figure out that process.
So, we will start our process by creating a Jupyter book to host all our notebooks and markdown files. This Jupyter book will also be readily displayed in the notebook section on Azure Data Studio. Using the … to get to the MoreActions menu, choose CreateJupyterBook.
In the dialogue give your new Jupyter book a name and specify the location you want to store it in. I have not used the optional content folder for this exercise and will recommend that you do not either.
If you go to the folder location you created your Jupyter book in, you will see that it also created three files in the folder named the same as your Jupyter book:
In the notebook section of Azure Data Studio, you should see your Jupyter book with a README markdown file in it. For now, we will leave the README file as an introduction to what is in your notebook. (Be aware, that you can remove the file by deleting it, but you will need to update the TOC file to reflect the changes you made. If you do not update the TOC file, you may see missing file error messages in Azure Data Studio.)
I will not take time in this post to review what is possible in a markdown file. The key here is you can update the README file that was created with headers and formatting to provide instructions on how to use the various contents of your Jupyter book. If you double click within the README file, it will open up the readme.md file in a new tab in Azure Data Studio. This has a line number and will allow you to update and add content.
The following code gives you an example of some markdown syntax:
# Welcome to the Jupyter book on Azure SQL Database elasticity
This book contains 3 sections
* The first section contains instructions on how to set up the demo
* The second section contains the demo for elastic queries
* The third section contains a demo for elastic jobs
This will result in the following look and feel in your README file
Adding a section
The next thing we will do is add a section where we will host the executable demo code. Right click on your notebook and choose AddSection. We will add the title as Elastic query.
Adding the notebook
Up to this point, we have been building the framework to support our first notebook. While all these steps are not required, this is the most complete approach. Right click on your section and choose NewNotebook. This will create a Jupyter notebook in the subfolder of your section.
Once you create the notebook, it will open a tab in Azure Data Studio with the notebook. You will notice that it has something called Kernel. The kernel allows you to set the default language used for the notebook. For the work that we are doing we will be using the SQL kernel. This will allow us to execute SQL code against a database. In the Attach to dropdown, you will see databases that you can use to execute code. The Cell dropdown allows you to add cells which can contain code or text.
Now let us get down to the business of creating a notebook with executable code. Before we add executable code, let us add a text cell as an introduction to the code. You can do this by clicking the cell dropdown and choosing text. Once you add the text cell you will notice there is a formatting bar which ironically is missing in the markdown files editor. This means it is easier to create formatted text in a cell in a notebook rather than in the markdown file itself. Keep this in mind as you create your notebooks and add content to your Jupyter book. These cells are easier to work with at times than the full file. This is particularly true if you are not knowledgeable on formatting markdown.
At this point, let us add a quick introduction to what we are about to do in the in the following code cells.
Next, we will add a code cell. From the dropdown menu for cell, choose CodeCell. This will add a code cell to your notebook which uses the language selected in your kernel. There is also a play button which allows you to execute the code.
I am going to add the code that is required to clean up the tables for the demo. The resulting code cell will look like the following:
As a last step to understanding how notebooks and code work in the environment, we can execute the code by pushing the play button in the code cell. This will return the result of that execution as shown below:
Congratulations, you have created your first notebook with executable code against a SQL Server database! You can continue to add more text cells and code cells as needed. One of the reasons I like this pattern is that it allows me to execute the code without having to highlight it while doing demos. Each cell can be run independently. You will also notice there is a Run All button if you choose to run all the scripts at the same time that you have in your notebook. This could be valuable if you have a set of maintenance operations or related items you want to run and you have collected in a notebook for use.
Another key thing to remember is that notebooks are shareable. Because the connection is outside of the notebook, once you share the notebook, they will have to connect to an environment that allows them to execute the same code. You can add your notebooks to GitHub or similar source control to manage change and allow you to share common resources easily without just distributing SQL files.
Before we wrap up
I feel I would be remiss if I did not also demonstrate what happens when you get data results in a notebook. In my case I have a database I can connect to which has WideWorldImporters loaded into it. I am going to select the top 1000 rows from the DimSupplier table. Once I run the code cell, I get the rows affected, the execution time, and a table with results as shown here:
As you can see in the results window, you have several export options and a chart option that you can use to further visualize or work with the data that you have retrieved. I would encourage you to explore these options as it depends on the type of data you are working with whether they work well for you or not. For example, supplier data does not chart very well, whereas if I had used fact data there may have been some interesting charting options. A notebook could be a straightforward way to demonstrate some simple reporting for a technically savvy audience.
Wrapping it up
There are many more functions that I did not cover around notebooks, and I assume that Microsoft will continue to make improvements to the overall capabilities here. I look forward to using notebooks more as a terrific way to share code and run demos. I hope you find this as valuable as well.
For those of you who are not sure about using notebooks, this is an effective way to build your skills while not trying to learn a new language if you are familiar with SQL. My first exposure was using Python in a Databricks environment. That was much to learn while also trying to understand how notebooks functioned. As the data environment continues to expand and require new skill sets, understanding how to use and leverage notebooks on a regular basis is a good skill to have. Microsoft has done us a great favor by using standard Jupyter notebooks which are used in data science, Databricks, and other areas of data practice.
If you are following my work enablement series, you know one of the things that I am passionate about is simplifying how I work, in order to stay working while continuing to lose functionality in my arms. Notebooks help with this by allowing me to execute code without highlighting it when doing demos. Because highlighting code and executing it in a tool like SQL Server Management Studio requires multiple touches on the keyboard and mouse, I struggle to do it efficiently. The ability to organize my demo around code cells and then have a self-documenting notebook to pass along to attendees is a huge win for me. I hope this helps others who struggle in the same way. And I hope this was helpful to those who have not used or seen notebooks in their current work environment but may in the future.
I will be creating and sharing a completed notebook for the demos related to my presentation on elastic capabilities with Azure SQL. Look for that presentation follow up from the Memphis SQL Saturday in October 2022. I will publish a follow up blog post with a link to the completed notebook used with that demo.