It was a lot of fun to speak at the Louisville Data Technology Group in February. Sheila and I presented on Jupyter notebooks in Azure Data Studio. The session was lively, with a lot of interesting interaction from attendees who were looking at both the developer and administration tooling within Azure Data Studio and learning how to use Jupyter notebooks, most of them for the first time.
The session started with a general introduction to Jupyter notebooks and Jupyter books. You can find the short slide deck here. I think the key takeaway from the introduction was that Jupyter notebooks have been around for a long time and have often been used in data science as well as data engineering with Python. For example, my first exposure to notebooks was working with Databricks on more of a data engineering workload. One interesting note is that Jupyter books appear to be a nonstandard or, as far as I can tell, poorly understood component. A Jupyter book is in fact a folder structure used to organize content, including markdown files, subfolders, and notebooks. Jupyter books allow you to store and organize your content and share it in an organized way.
My first real hands-on use of Jupyter notebooks came when I wanted a platform on which my wife could help with presentations more simply than with SQL Server Management Studio alone. As a result, I began to dig into how Jupyter notebooks could help us during presentations. We have since used Jupyter notebooks at two different SQL Saturdays, and in this session at the user group we presented on how to use notebooks themselves. You can read about my first experience with notebooks in this post.
As part of our presentation at the Louisville Data Technology Group, my wife and I worked through a step-by-step walkthrough of the demo. I’ve made some updates to the instructions to help any of you recreate the demo that we did during the presentation. You can find that step-by-step guide here. Besides the demo instructions, a copy of the completed sample notebook is also stored in that GitHub location.
Questions from the group
Can we mix Python and SQL in an SQL kernel notebook?
This is not possible at this time. Currently, the notebook attaches to a single kernel. While there is an option to change the language of the code in a cell, the only choice available when you click SQL in the lower right corner of the cell is SQL.
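One way to see why a notebook is tied to one kernel is to look at the file itself. A notebook file is JSON, and in the standard notebook format the kernel is recorded once, at the notebook level, under metadata.kernelspec rather than on each cell. The sketch below builds a tiny in-memory notebook and reads that field. The kernelspec values and the kernel_name helper are illustrative assumptions, not copied from a real Azure Data Studio file.

```python
import json

# A minimal notebook document in the nbformat v4 layout.
# The kernelspec values below are illustrative assumptions.
notebook_json = """
{
  "metadata": {
    "kernelspec": {"name": "SQL", "display_name": "SQL", "language": "sql"}
  },
  "nbformat": 4,
  "nbformat_minor": 2,
  "cells": [
    {"cell_type": "markdown", "metadata": {}, "source": ["# Demo"]},
    {"cell_type": "code", "metadata": {}, "source": ["SELECT 1;"],
     "outputs": [], "execution_count": null}
  ]
}
"""


def kernel_name(nb_text: str) -> str:
    """Return the kernel name stored in the notebook-level metadata."""
    nb = json.loads(nb_text)
    return nb["metadata"]["kernelspec"]["name"]


print(kernel_name(notebook_json))
```

Opening one of your own notebooks in a text editor and looking for the kernelspec entry is a quick way to confirm which kernel it will attach to.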
When working with an SQL notebook, does each cell that is run create one or more sessions?
When working with Azure Data Studio, each notebook or file creates a new session when it connects. In our case, each notebook has its own session, and the queries from every cell in that notebook run within that one session. If you open separate notebooks, each notebook gets its own separate session to operate in.
In a SQL tab on Azure Data Studio, can you use the same charting functions with your result sets?
We were able to demonstrate this during the user group meeting. The charting and export functions that are available with the results of a notebook code cell execution are also available for results from a traditional SQL execution. The image below shows where you can find the charting and export options for a result set in a traditional SQL tab.
Insert image here
What is the best way to share notebooks with your team?
During our demo, we illustrated how to connect to remote Jupyter books. That, however, is an approach best suited to content you want to share with the general public. If you are working with a team and are managing a set of code in notebooks, the preferred approach would be to use GitHub. This would allow each of you to clone the repo, commit your changes to the notebooks back to it, and retrieve updates made by other team members.
Converting existing SQL files to notebooks
If you open a .sql file in Azure Data Studio, you have the option to convert the SQL file to a notebook. Typically, this will take comments and try to put them into text cells, and separate your code as best it can into code cells that make sense. Be aware that it’s not always consistent, and you will likely want to go through your notebook to verify that the result is what you want. If you want to be proactive, you can use markdown formatting in your comments, which will then be converted to proper markdown when the file is converted to a notebook.
/*
# This is an example of a header
Here is an example of **bolded text**
*/
The code above would look like this when converted.
It is also possible to convert a notebook to SQL. This performs the reverse process, turning the text cells into comments that keep their markdown tagging and leaving the code cells as SQL.
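To make the idea of the SQL-to-notebook conversion concrete, here is a minimal Python sketch of the general technique: block comments become markdown cells and everything between them becomes code cells. This is not how Azure Data Studio implements the conversion. The function names (sql_to_notebook, markdown_cell, code_cell) are hypothetical, and the sketch ignores edge cases such as a /* that appears inside a string literal or single-line -- comments.

```python
import json
import re

# Matches /* ... */ block comments, including multi-line ones.
BLOCK_COMMENT = re.compile(r"/\*(.*?)\*/", re.DOTALL)


def markdown_cell(text: str) -> dict:
    """Build a markdown cell in the nbformat v4 layout."""
    return {"cell_type": "markdown", "metadata": {}, "source": text}


def code_cell(code: str) -> dict:
    """Build an unexecuted code cell in the nbformat v4 layout."""
    return {"cell_type": "code", "metadata": {}, "source": code,
            "outputs": [], "execution_count": None}


def sql_to_notebook(sql_text: str) -> dict:
    """Split SQL text into markdown cells (comments) and code cells."""
    cells = []
    pos = 0
    for match in BLOCK_COMMENT.finditer(sql_text):
        # Any SQL before this comment becomes a code cell.
        code = sql_text[pos:match.start()].strip()
        if code:
            cells.append(code_cell(code))
        # The comment body (without /* and */) becomes a markdown cell.
        text = match.group(1).strip()
        if text:
            cells.append(markdown_cell(text))
        pos = match.end()
    # Any SQL after the last comment becomes a final code cell.
    tail = sql_text[pos:].strip()
    if tail:
        cells.append(code_cell(tail))
    return {"metadata": {}, "nbformat": 4, "nbformat_minor": 2, "cells": cells}


sample = (
    "/*\n"
    "# This is an example of a header\n"
    "Here is an example of **bolded text**\n"
    "*/\n"
    "SELECT 1;\n"
)
nb = sql_to_notebook(sample)
print(json.dumps(nb, indent=2))
```

Running this on the sample comment from above yields one markdown cell followed by one code cell, which matches the behavior described: markdown written inside the comment survives as markdown in the text cell.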