About 5 Minutes with Microsoft Fabric

On September 14, 2023 By Steve In Azure, Microsoft Fabric

Introducing the Fabric 5 from Data on Wheels

Microsoft Fabric went to public preview this past summer, and plenty of content has been created around the various cool features that the product brings together. As I watched and learned from so many great presenters from Microsoft and the community, I really wanted to take a few minutes to get people thinking about how they would implement on Microsoft Fabric. For example, what would be the best way to implement a medallion architecture in the Fabric ecosystem? What are the options we could look at? Why would we pick a lakehouse over a data warehouse? I hope to give a few minutes of thought to these and similar questions and spur on a conversation.

This will be a weekly video series posted on our YouTube channel. There may be a couple spot additions here and there if it makes sense and is relevant. Overall, the goal is to strike up the conversation that will hopefully lead to all of us thinking through the best implementations of our data estates on Microsoft Fabric. We plan to drop a video on the fifth day of the week at 5:00 o’clock somewhere that is about 5 minutes long.

A Quick Word about Our YouTube Channel

The Data on Wheels YouTube channel has a blend of technical and personal content. The technical content will consist of things like the Fabric 5 as well as content related to Working with ALS. I love to talk about technology in the data and analytics space, where I have been working for years. I hope that you will enjoy this content as much as I enjoyed creating it.

As many of you know, I am battling ALS. I continue to work, and I share the hardware and software that I use to keep working with the technology I love. While this may not be interesting to you, if you know someone who could benefit from learning about these tools, please share that content as well.

I have also chosen to share personal content here. This is primarily content related to my ALS journey that is not directly related to working with ALS. Frankly, this is just a great place to share this with family and friends and you can count yourselves among them if you choose to watch it.

Wrapping it up

I hope you enjoy this Fabric 5 video series. The plan is to provide 5 minutes of content on Fabric on the fifth day of the week around 5:00 PM somewhere. As always, I am happy to share and hope that you find some value in the content we created.

Effectively Integrating FHIR Data from Azure Health Services

On December 13, 2022 By Steve In 3Cloud, ADLS, Azure, Azure Data Studio, FHIR, JSON, Jupyter Notebook, Microsoft Azure, SQL Saturday, Synapse, T-SQL

This blog is intended as a follow-up to my session at SQL Saturday 2022 in Oregon & SW Washington. In that session I presented an introduction to FHIR and the JSON data produced by the Azure Health Services APIs.

With the recent updated mandates in the healthcare environment in the United States, Microsoft has continued to expand its capability to support the FHIR standard for integrating healthcare data. While the standard is well documented and Microsoft’s capabilities are expansive, it falls on data professionals to interpret that data and build meaningful reports and produce meaningful insights from the data as it is collected and integrated across environments. This requires a good working knowledge of JSON in SQL to manipulate complex data models. In the session, we did a short review of the FHIR standard and the overall implementation of FHIR in Azure. From there we reviewed the resulting data in the data lake and in Synapse. That was followed up with an overview into the heart of complex SQL using JSON functions in Synapse. Whether or not you are active in healthcare today, this will be an enlightening session on how to use JSON SQL functions within the Azure SQL platforms.

What is FHIR and why should you care?

FHIR stands for Fast Healthcare Interoperability Resources. This is the latest specification for interoperability in healthcare produced by HL7. To be clear, the word fast has nothing to do with performance; it refers to the ability to implement and integrate data quickly. With the latest regulations around the world in health care, this is the established standard for integrating healthcare data, and it will continue to be at the forefront of this work. If you do any work in health care, you will need to understand FHIR because you will likely run across data formatted to the standard from many different sources.

FHIR is very well documented. In many ways when the standard is properly followed the JSON documents or other supported formats are effectively self-documenting. It is commonly understood that the core FHIR specification handles about 80% of the use cases in healthcare. It is designed to be flexible so that it can support specialized needs within regions or healthcare areas. For example, in the US there is a need to support race and ethnicity. The U.S. Core Implementation Guide provides guidance on the specification enhancements to support this need for U.S. healthcare organizations. You will find similar support for other countries as well as specific implementations for healthcare vendors such as Epic.

Neither the notebook, the presentation, nor this blog is meant to be exhaustive coverage of FHIR. Before we move on to some of the other implementation pieces, it is important to understand one key aspect of FHIR: the basic building block called a resource. A resource is the core exchangeable content within the specification. All resources share the following characteristics:

A common way to define and represent the resource including data types and patterns
A common set of metadata which can be discovered easily
A human readable part

For more detailed information on the supported resources and other details around FHIR implementation, you should visit the following website:

Azure Health Services and the FHIR API

I will not be digging into a lot of the health care services information nor the FHIR support within Azure in this post. The important thing to understand is that Microsoft has made a concerted effort to support this specification. That support includes technology and architectures for extracting data from various healthcare systems. The FHIR APIs then standardize that extracted data into the FHIR spec, typically as JSON files in the data lake. Because of the standardized format, Microsoft is able to supply a set of common schemas that can be used in Synapse serverless to create external tables and views, which accelerates the implementation and usage of data produced from the APIs. It is from this starting point that we are able to start working with the data in reporting and analytics solutions.

At this point I want to put a plug in for the company I work for. If you're interested in learning how Azure Health Services and the FHIR specification can be implemented at your company, we have FHIR Quick Start and FHIR Data Blueprint solutions. These solutions have been used by many other customers to achieve high levels of integration in their health care data estate. If you're interested in learning more, please reach out to us at: https://3cloudsolutions.com/get-started/

Working with the data from the FHIR API using JSON in SQL

As noted in the previous section, Azure Health Services comes with serverless tables and views already set up for use with the extracted data. However, due to the complexity of FHIR, there are a number of columns within those tables and views which still contain JSON snippets. For example, there is one field for name which has several objects and arrays to support the specification. You cannot simply select the name from the table and use that as you move forward. There are many different fields like this throughout the data. For the rest of this blog and in the notebook, we will work through a number of scenarios to build a view of the patient resource that can be used for simple reporting. This view uses a few JSON functions from SQL Server and works through scenarios ranging from simple to complex.

The functions we will be using:

ISJSON
JSON_VALUE
OPENJSON

In addition to these functions, we will also be using the CROSS APPLY operator in SQL to join our data with relational data.
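As a minimal sketch of how these pieces fit together, the query below runs the three functions and CROSS APPLY against a single made-up row. The inline VALUES row, its id, and the JSON text are illustrative assumptions rather than data from the FHIR API, and the shape only loosely mirrors the name column described above.

-- Hypothetical sample row standing in for a patient record; not real data.
SELECT p.id
    , ISJSON(p.[name]) AS NameIsJson                      -- 1 when the text is valid JSON
    , JSON_VALUE(p.[name], '$[0].family') AS LastName     -- scalar value from the first name entry
    , JSON_VALUE(p.[name], '$[0].given[0]') AS FirstName  -- first element of the given array
    , g.[key] AS GivenNamePosition                        -- array index returned by OPENJSON
    , g.[value] AS GivenName                              -- array element returned by OPENJSON
FROM (VALUES
        ('example-1', N'[{"use":"official","family":"Everyman","given":["Adam","A."]}]')
     ) AS p (id, [name])
CROSS APPLY OPENJSON(p.[name], '$[0].given') AS g         -- one output row per given name
WHERE ISJSON(p.[name]) = 1;

OPENJSON returns one row per element of the array it is pointed at, so the sample patient comes back once for each given name, with the same LastName and FirstName repeated on every row.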

The examples in the notebook are built on the tables resulting from working with the Azure FHIR API. I am unable to provide a sample of the data to use with the set of information in the notebook currently. However, the SQL will work if you have your own FHIR implementation and a Patient resource to work with. Rather than rewrite the entire contents of the notebook in the blog post, here is a link to the notebook.

If you plan to implement this in the same way, you will need Azure Data Lake, Azure Synapse serverless, and Azure Data Studio. The notebook can be opened in Azure Data Studio. If you are unfamiliar with working with notebooks inside of Azure Data Studio, you are not alone. Check out this post which discusses how to implement your first notebook in Azure Data Studio.

Building our view and SQL with JSON functions

If you decide not to open the notebook but are curious what the view looks like, here is the finished product that we created in the notebook.

SELECT TOP (20) p.resourceType + '/' +  p.id as PatientResourceID
    , p.resourceType as ResourceType
    , p.id as ResourceID 
    , cast(p.[meta.versionId] as int) as VersionID 
    , cast(p.[meta.lastUpdated] as DATETIME2(7)) as LastUpdated 
    , JSON_VALUE(p.[name], '$[0].family') as LastName
    , JSON_VALUE(p.[name], '$[0].given[0]') as FirstName
    , cast(p.active as bit) as IsActive
    , p.gender as Gender 
    , CAST(p.birthDate as date) as BirthDate
    , CASE WHEN p.[maritalStatus.coding] is null THEN NULL
           WHEN  JSON_VALUE(p.[maritalStatus.coding], '$[0].system') = 'http://terminology.hl7.org/CodeSystem/v3-MaritalStatus' 
                    THEN JSON_VALUE(p.[maritalStatus.coding], '$[0].code')
           ELSE NULL
           END as MaritalStatus 
    -- Home state or province: only the first four address entries are checked for use = 'home'
    , CASE WHEN JSON_VALUE(p.[address], '$[0].use') = 'home' THEN JSON_VALUE(p.[address], '$[0].state')
            WHEN JSON_VALUE(p.[address], '$[1].use') = 'home' THEN JSON_VALUE(p.[address], '$[1].state')
            WHEN JSON_VALUE(p.[address], '$[2].use') = 'home' THEN JSON_VALUE(p.[address], '$[2].state')
            WHEN JSON_VALUE(p.[address], '$[3].use') = 'home' THEN JSON_VALUE(p.[address], '$[3].state')
            ELSE NULL
            END as HomeStateOrProvince
    , e.Ethnicity
    , r.Race
FROM fhir.Patient p
-- Keep only the latest version of each patient resource
INNER JOIN (SELECT id, max([meta.versionId]) as currentVersion FROM fhir.Patient GROUP BY id) cp
    ON p.[meta.versionId] = cp.currentVersion
    AND p.id = cp.id
-- Ethnicity, read from the US Core ethnicity extension
LEFT JOIN 
    (SELECT p.id
        , CASE WHEN JSON_VALUE(ext.value,'$.extension[0].url') = 'ombCategory'
            THEN
            CASE WHEN JSON_VALUE(ext.value, '$.extension[1].valueString') IS NOT NULL  THEN JSON_VALUE(ext.value, '$.extension[1].valueString')
                    WHEN JSON_VALUE(ext.value, '$.extension[0].valueString') IS NOT    NULL THEN JSON_VALUE(ext.value, '$.extension[0].valueString')
                    ELSE JSON_VALUE(ext.value, '$.extension[0].valueCoding.display')
                    END
            ELSE JSON_VALUE(ext.value, '$.valueCodeableConcept.coding[0].display')
            END AS Ethnicity 
        FROM 
        (
            SELECT fp.id, fp.extension FROM fhir.Patient fp
            INNER JOIN (SELECT id, max([meta.versionId]) as currentVersion FROM fhir.Patient GROUP BY id) cp
                ON fp.[meta.versionId] = cp.currentVersion
                AND fp.id = cp.id
            WHERE ISJSON(fp.extension) =1
        ) p 
        CROSS APPLY 
            OPENJSON(p.extension,'$'
            ) as ext
        WHERE JSON_VALUE(ext.value,'$.url') = 'http://hl7.org/fhir/us/core/StructureDefinition/us-core-ethnicity'
    ) e on e.id = p.id 
-- Race, read from the US Core race extension
LEFT JOIN 
    (SELECT p.id
        , CASE WHEN JSON_VALUE(ext.value,'$.extension[0].url') = 'ombCategory'
            THEN
            CASE WHEN JSON_VALUE(ext.value, '$.extension[3].valueString') IS NOT NULL THEN JSON_VALUE(ext.value, '$.extension[3].valueString')
                    WHEN JSON_VALUE(ext.value, '$.extension[2].valueString') IS NOT NULL THEN JSON_VALUE(ext.value, '$.extension[2].valueString')
                    WHEN JSON_VALUE(ext.value, '$.extension[1].valueString') IS NOT NULL THEN JSON_VALUE(ext.value, '$.extension[1].valueString')
                    WHEN JSON_VALUE(ext.value, '$.extension[0].valueString') IS NOT NULL THEN JSON_VALUE(ext.value, '$.extension[0].valueString')
                    ELSE JSON_VALUE(ext.value, '$.extension[0].valueCoding.display')
                    END
            ELSE JSON_VALUE(ext.value, '$.valueCodeableConcept.coding[0].display')
            END AS Race 
        FROM 
        (
            SELECT fp.id, fp.extension FROM fhir.Patient fp
            INNER JOIN (SELECT id, max([meta.versionId]) as currentVersion FROM fhir.Patient GROUP BY id) cp
                ON fp.[meta.versionId] = cp.currentVersion
                AND fp.id = cp.id
            WHERE ISJSON(fp.extension) =1
        ) p 
        CROSS APPLY 
            OPENJSON(p.extension,'$'
            ) as ext
        WHERE JSON_VALUE(ext.value,'$.url') = 'http://hl7.org/fhir/us/core/StructureDefinition/us-core-race'
    ) as r on r.id = p.id
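The HomeStateOrProvince column in the view checks array positions 0 through 3 one at a time. As a hedged alternative, not the approach used in the notebook, the sketch below uses OPENJSON to look through every address regardless of position. The two inline rows and their JSON are made-up assumptions, and the address shape is simplified.

-- Hypothetical sample rows; the second patient has no home address.
SELECT p.id
    , a.HomeStateOrProvince
FROM (VALUES
        ('example-1', N'[{"use":"work","state":"MN"},{"use":"home","state":"WI"}]')
      , ('example-2', N'[{"use":"work","state":"MN"}]')
     ) AS p (id, [address])
OUTER APPLY                                   -- keeps patients that have no home address
    (SELECT TOP (1) JSON_VALUE(addr.[value], '$.state') AS HomeStateOrProvince
     FROM OPENJSON(p.[address]) AS addr       -- one row per address object
     WHERE JSON_VALUE(addr.[value], '$.use') = 'home'
     ORDER BY CAST(addr.[key] AS int)         -- first home address by array position
    ) AS a;

With these sample rows, example-1 returns WI and example-2 returns NULL. If a column could hold text that is not valid JSON, an ISJSON check like the one in the view would still be needed before calling OPENJSON.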

Here is a sample of the results from that view:

Columns: PatientResourceID | ResourceType | ResourceID | VersionID | LastUpdated | LastName | FirstName | IsActive | Gender | BirthDate | MaritalStatus | HomeStateOrProvince | Ethnicity | Race

Empty cells are not shown, so each row lists only its populated values, in column order.

Patient/d8af7bfa-5008-4a0f-85d1-0af3448a31dd | Patient | d8af7bfa-5008-4a0f-85d1-0af3448a31dd | 2022-05-31 18:07:03.2150000 | DUCK | DONALD | male | 1965-07-14 | NULL
Patient/78cf7725-a0e1-44a4-94d4-055482781afb | Patient | 78cf7725-a0e1-44a4-94d4-055482781afb | 2022-05-31 18:07:30.7490000 | Gretzky | Wayne | NULL | 1990-05-31 | NULL
Patient/9e909e52-61a1-be50-1878-a12ef8c36346 | Patient | 9e909e52-61a1-be50-1878-a12ef8c36346 | 2022-05-31 18:39:58.1780000 | EVERYMAN | ADAM | NULL | male | 1988-08-18 | NULL | Non Hispanic or Latino | White+Asian
Patient/585f3cc0-c727-4989-9214-a7a7b60a2ade | Patient | 585f3cc0-c727-4989-9214-a7a7b60a2ade | 2022-05-31 13:14:57.0640000 | DUCK | DONALD | male | 1965-07-15 | NULL
Patient/29a819c4-f553-8189-2354-9441b86d37ef | Patient | 29a819c4-f553-8189-2354-9441b86d37ef | 2022-05-18 15:18:40.1560000 | FORD | ELAINE | NULL | female | 1992-03-10 | NULL
Patient/d5fe6802-a680-e762-8f43-9659340b00ac | Patient | d5fe6802-a680-e762-8f43-9659340b00ac | 2022-05-18 14:39:52.2550000 | EVERYMAN | ADAM | NULL | male | 1961-06-15 | NULL
Patient/4d661053-a8d0-148c-7023-54508fd04a52 | Patient | 4d661053-a8d0-148c-7023-54508fd04a52 | 2022-05-21 13:48:24.9720000 | EVERYMAN | sam | NULL | male | 1966-05-07 | NULL | Not Hispanic or Latino | White

Wrapping it up

As you can see, working effectively with FHIR requires understanding the specification well enough to build a complex SQL statement using JSON functions. Due to the complex nature of the nested JSON, you may not be able to reconcile this in tools such as Power BI. Building this out in SQL means you have provided your report writers and analysts with a solid result set which can be used with confidence.

Resources summary:

Link to notebook
Link to FHIR spec
Link to 3Cloud FHIR
Link to Azure Health Services

My experience working with notebooks in Azure Data Studio

On October 13, 2022 (updated October 21, 2022) By Steve In Azure, Azure Data Studio, Jupyter Notebook, Work Enablement

I’ve seen notebooks used in Azure Data Studio on multiple occasions. I really like the concept of notebooks, having done some work within Azure Databricks notebooks, but not extensively. As I go into the process that I went through, it’s important to understand that I am not a data scientist and have not done extensive development or spent a lot of time in Python or Jupyter notebooks. Furthermore, my interest in the notebooks was elevated when I realized I wanted to continue presenting while working through my current ALS diagnosis. I have limited use of my hands and arms so highlighting and executing code, especially in front of a crowd, was going to be problematic. (If you want to learn more about my condition and tools I’m using to maintain my ability to work, please check out this series of articles on our blog.)

Let’s start with the core problem that I’m trying to solve today. I will be presenting a session on elastic queries in Azure SQL database. Most of the code is ready to go since I have done this presentation a few times. As I was working through testing my demo, I found executing code by highlighting and pushing “run” in either Data Studio or in SQL Server Management Studio was difficult because I struggled to control highlighting the code. I was also looking for better ways to automate the process, but more about that later. I watched a couple of demos on using notebooks and found some of the notebooks that have been created by Microsoft. I realized I could put together my entire demo package to share with the attendees and build the demo so that I could execute it a step at a time without highlighting. Now that you have the background of what I was trying to accomplish, let’s look at the process I went through getting this done.

How in the world do you work with notebooks in Azure Data Studio?

One of the interesting things about notebooks is that if you want to work with them, you have likely used them already and prefer them. This means that the instructions for how to create, organize, and use notebooks within Azure Data Studio are a bit lacking. For example, it was not entirely clear to me that one part of the process is creating a folder to store your notebooks with your markdown files and other content. So, let’s go through the process of creating your first notebook step by step with explanations about what’s happening.

The organization of notebooks and files in Azure Data Studio

Part of my struggle in understanding what was happening is each time I tried to create a notebook it asked me for locations and files. I thought it should know where they should go. So, as a newbie with notebooks and organization with Azure Data Studio, I created a notebook and a Jupyter book so I could see how the files are organized. Then I could go back and create the Jupyter book correctly from the beginning. While I may not get all of the terminology correct in this process, this is my discovery as I move forward through the process.

Once I started working with the notebook process in Azure Data Studio, I realized there were multiple components involved:

Jupyter book
Markdown file
Notebook
Section

While I am sure there are simple ways to create what we would like to do, I’m coming at this entirely from Azure Data Studio as a data developer not a data scientist. Each time I tried to create my first Jupyter book, I didn’t understand what its purpose was in the beginning. When you create a Jupyter book, it looks like you’re creating a folder. That folder will also contain several helper files to organize your notebooks, markdown files, and sections. Before we leave the structure and organization section here, I want to clarify that the book is the parent folder, and the section is a sub folder within the book. Markdown files and notebooks are files created that are organized for particular purposes. The markdown file is effectively a document that allows you to create a nicely formatted informational component for your notebook. The notebook files are actual Jupyter notebook files which are split into sections for code and text.

Here is the high level organization of the Jupyter book we are going to create:

Jupyter book: Azure SQL database elasticity
- Markdown file: README
- Section: Setting up the demo
  - Markdown file: Set up instructions
  - Notebook: Prepping the demo
- Section: Elastic query demo
  - Markdown file: Elastic query demo instructions
  - Notebook: Elastic query demo
- Section: Elastic job demo
  - Markdown file: Elastic job demo instructions
  - Notebook: Elastic job demo

For the purposes of this blog post, we will walk through the process of creating the original Jupyter book and the elastic query demo section. That section has a good mix of code and text to illustrate the power and capabilities of notebooks.

Creating your first notebook in Azure Data Studio

Let’s begin creating our first notebook in Azure Data Studio. Before we dive into this process too deeply, I want to be clear that we are going to create a Jupyter book to add our notebooks to. This is not required as you can create a new notebook from the file menu or with the shortcut as noted on the screen in Azure Data Studio. What confused me about this initially is that you cannot create a simple notebook from the notebooks section in Azure Data Studio. When you create your notebook, you can save it as a file in the location of your choosing, but it will not show up in the notebook section. Once you create a notebook, if you are not using a Jupyter book to host it in, you can reopen it just by choosing Open File from the menu. While this may make sense to others, it was not entirely intuitive to me in the beginning. I had to do some mucking around to figure out that process.

So, we will start our process by creating a Jupyter book to host all our notebooks and markdown files. This Jupyter book will also be readily displayed in the notebook section on Azure Data Studio. Using the … to get to the More Actions menu, choose Create Jupyter Book.

In the dialogue give your new Jupyter book a name and specify the location you want to store it in. I have not used the optional content folder for this exercise and will recommend that you do not either.

If you go to the folder location you created your Jupyter book in, you will see that it also created three files in the folder named the same as your Jupyter book:

_config.yml
_toc.yml
README.md

In the notebook section of Azure Data Studio, you should see your Jupyter book with a README markdown file in it. For now, we will leave the README file as an introduction to what is in your notebook. (Be aware that you can remove the file by deleting it, but you will need to update the TOC file to reflect the changes you made. If you do not update the TOC file, you may see missing file error messages in Azure Data Studio.)

I will not take time in this post to review what is possible in a markdown file. The key here is you can update the README file that was created with headers and formatting to provide instructions on how to use the various contents of your Jupyter book. If you double click within the README file, it will open the README.md file in a new tab in Azure Data Studio. This tab shows line numbers and allows you to update and add content.

The following code gives you an example of some markdown syntax:

# Welcome to the Jupyter book on Azure SQL Database elasticity
This book contains 3 sections
* The first section contains instructions on how to set up the demo
* The second section contains the demo for elastic queries
* The third section contains a demo for elastic jobs

This will result in the following look and feel in your README file

Adding a section

The next thing we will do is add a section where we will host the executable demo code. Right click on your notebook and choose Add Section. We will add the title as Elastic query.

Adding the notebook

Up to this point, we have been building the framework to support our first notebook. While all these steps are not required, this is the most complete approach. Right click on your section and choose New Notebook. This will create a Jupyter notebook in the subfolder of your section.

Once you create the notebook, it will open a tab in Azure Data Studio with the notebook. You will notice that it has something called Kernel. The kernel allows you to set the default language used for the notebook. For the work that we are doing we will be using the SQL kernel. This will allow us to execute SQL code against a database. In the Attach to dropdown, you will see databases that you can use to execute code. The Cell dropdown allows you to add cells which can contain code or text.

Azure Data Studio supports other kernels that can be used for executing code against various workloads. These include Python, Spark, PySpark, and PowerShell.

Now let us get down to the business of creating a notebook with executable code. Before we add executable code, let us add a text cell as an introduction to the code. You can do this by clicking the cell dropdown and choosing text. Once you add the text cell you will notice there is a formatting bar which ironically is missing in the markdown files editor. This means it is easier to create formatted text in a cell in a notebook rather than in the markdown file itself. Keep this in mind as you create your notebooks and add content to your Jupyter book. These cells are easier to work with at times than the full file. This is particularly true if you are not knowledgeable on formatting markdown.

At this point, let us add a quick introduction to what we are about to do in the following code cells.

Next, we will add a code cell. From the dropdown menu for cell, choose Code Cell. This will add a code cell to your notebook which uses the language selected in your kernel. There is also a play button which allows you to execute the code.

I am going to add the code that is required to clean up the tables for the demo. The resulting code cell will look like the following:

As a last step to understanding how notebooks and code work in the environment, we can execute the code by pushing the play button in the code cell. This will return the result of that execution as shown below:

Congratulations, you have created your first notebook with executable code against a SQL Server database! You can continue to add more text cells and code cells as needed. One of the reasons I like this pattern is that it allows me to execute the code without having to highlight it while doing demos. Each cell can be run independently. You will also notice there is a Run All button if you choose to run all the scripts at the same time that you have in your notebook. This could be valuable if you have a set of maintenance operations or related items you want to run and you have collected in a notebook for use.
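To make the idea of an independently runnable code cell concrete, here is a minimal sketch of what a reset cell could contain. It is not the cleanup code from my demo; the table name and columns are hypothetical, and the syntax assumes SQL Server 2016 or later, or Azure SQL Database.

-- Hypothetical demo table; drop it first so the cell can be run repeatedly.
DROP TABLE IF EXISTS dbo.NotebookDemo;

CREATE TABLE dbo.NotebookDemo
(
    DemoId   int         NOT NULL PRIMARY KEY,
    DemoName varchar(50) NOT NULL
);

INSERT INTO dbo.NotebookDemo (DemoId, DemoName)
VALUES (1, 'Elastic query'), (2, 'Elastic job');

-- The two inserted rows appear in the result grid under the cell.
SELECT DemoId, DemoName
FROM dbo.NotebookDemo
ORDER BY DemoId;

Because the cell drops and recreates its own table, it gives the same result whether it is run on its own with the play button or as part of Run All.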

Another key thing to remember is that notebooks are shareable. Because the connection is outside of the notebook, anyone you share it with will have to connect to an environment that allows them to execute the same code. You can add your notebooks to GitHub or similar source control to manage change and allow you to share common resources easily without just distributing SQL files.

Before we wrap up

I feel I would be remiss if I did not also demonstrate what happens when you get data results in a notebook. In my case I have a database I can connect to which has WideWorldImporters loaded into it. I am going to select the top 1000 rows from the DimSupplier table. Once I run the code cell, I get the rows affected, the execution time, and a table with results as shown here:

As you can see in the results window, you have several export options and a chart option that you can use to further visualize or work with the data that you have retrieved. I would encourage you to explore these options as it depends on the type of data you are working with whether they work well for you or not. For example, supplier data does not chart very well, whereas if I had used fact data there may have been some interesting charting options. A notebook could be a straightforward way to demonstrate some simple reporting for a technically savvy audience.

Wrapping it up

There are many more functions that I did not cover around notebooks, and I assume that Microsoft will continue to make improvements to the overall capabilities here. I look forward to using notebooks more as a terrific way to share code and run demos. I hope you find this as valuable as well.

For those of you who are not sure about using notebooks, this is an effective way to build your skills without having to learn a new language if you are familiar with SQL. My first exposure was using Python in a Databricks environment. That was a lot to learn while also trying to understand how notebooks functioned. As the data environment continues to expand and require new skill sets, understanding how to use and leverage notebooks on a regular basis is a good skill to have. Microsoft has done us a great favor by using standard Jupyter notebooks which are used in data science, Databricks, and other areas of data practice.

If you are following my work enablement series, you know one of the things that I am passionate about is simplifying how I work, in order to stay working while continuing to lose functionality in my arms. Notebooks help with this by allowing me to execute code without highlighting it when doing demos. Because highlighting code and executing it in a tool like SQL Server Management Studio requires multiple touches on the keyboard and mouse, I struggle to do it efficiently. The ability to organize my demo around code cells and then have a self-documenting notebook to pass along to attendees is a huge win for me. I hope this helps others who struggle in the same way. And I hope this was helpful to those who have not used or seen notebooks in their current work environment but may in the future.

I will be creating and sharing a completed notebook for the demos related to my presentation on elastic capabilities with Azure SQL. Look for that presentation follow up from the Memphis SQL Saturday in October 2022. I will publish a follow up blog post with a link to the completed notebook used with that demo.

Moving Synapse Databases Between Subscriptions – Practical Guidance

On June 1, 2021 By Steve In Azure, Azure SQL Database, Microsoft Azure, Synapse, Syndicated

One of the tasks we often do in migration projects is move large volumes of data. Depending on how you are configured, you may need to do the migration project in a development or UAT environment as opposed to a production environment. This is particularly true if you have policies in place on your production subscription that don’t allow the individuals doing the migration and validation tasks to work in that subscription.

Just Copy It… Nope

You can copy an Azure SQL Database using the Azure Portal, PowerShell, Azure CLI, or T-SQL. However, this functionality is limited to Azure SQL Database and does not work for Azure Synapse databases (a.k.a. SQL Pools). As of early 2021, the copy functionality also supports copying databases between subscriptions, but it requires security work to make sure the permissions on the database servers and the networking allow that to happen.

You can find out more about copying Azure SQL Database in this Microsoft Doc.
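For reference, the T-SQL form of the copy looks roughly like the sketch below. SourceServer, SourceDb, and DestDb are placeholder names, and this applies to Azure SQL Database only, not to Synapse SQL pools.

-- Run while connected to the master database of the destination server.
-- SourceServer, SourceDb, and DestDb are placeholder names.
CREATE DATABASE DestDb AS COPY OF SourceServer.SourceDb;

-- The statement returns before the copy finishes, so check the state of the new database.
SELECT name, state_desc
FROM sys.databases
WHERE name = 'DestDb';

The copy runs asynchronously, so the second query is a simple way to see when the new database has come online.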

Just Restore It… Nope

You can restore to the current server or another server on the same subscription. However, you are unable to restore across subscription boundaries at this time. If you need to move to another server in the current subscription, the process is straightforward: you can use the restore process in Synapse to restore to the current server using a different name. You can also restore to a different server in either the current or a different resource group in the same subscription. The restore technique is used in our move process, so details on how to restore a Synapse database will be in the next section.

Let’s Move a Synapse Database

The process to move a Synapse database to another subscription requires some planning and pre-work. The first thing you need to do is create a new SQL Server in the same subscription as your current Synapse environment. Because you can’t simply create servers, I would recommend that you add an Azure SQL Database to the server as a placeholder. An S0 should be sufficient to keep this server in place for what we are doing. DO NOT ADD anything to this server that will not be migrated. This is a temporary holding place for migrating databases. (This also works for other SQL Databases, although other options may work as well; those are not the focus of this post.)

Now that you have the migration server created, the next step is to create a restore point. While this is not required because you can use the automatic restore points, creating a user-defined restore point is recommended. A user-defined restore point allows you to choose the state of the database you want to migrate, rather than relying on the automatic points and trying to make sure you pick the right time (in UTC of course).
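If you do end up relying on an automatic restore point, the time conversion is easy to get wrong. Here is a minimal Python sketch of that conversion. The fixed UTC-5 offset and the timestamp are assumptions made up for illustration, so swap in your own values.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the team works at a fixed UTC-5 offset; adjust for your own zone.
LOCAL_OFFSET = timezone(timedelta(hours=-5))


def to_utc(local_naive: datetime) -> datetime:
    """Attach the assumed local offset and convert the time to UTC."""
    return local_naive.replace(tzinfo=LOCAL_OFFSET).astimezone(timezone.utc)


# Hypothetical moment when the database was in the state we want to migrate.
desired_local = datetime(2021, 3, 15, 18, 30)
desired_utc = to_utc(desired_local)

# Look for the automatic restore point closest to (and not before) this time.
print(desired_utc.isoformat())
```

A user-defined restore point avoids this lookup entirely, which is one more reason to prefer it.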

Once you have set the restore point, open the database you want to migrate and select Restore to open the restore panel.

On the restore page, you have a number of options to complete.

Restore point type: Choose User-defined restore points
SQL pool name: This is not a big deal. The name is the database name used during the migration process and is not the final name used in the target server. Make sure it is something you know.
Restore point: Select the restore point you created for this purpose.
Server: Choose the migration server you created as the target.
Performance level: This one is more interesting. I typically choose a smaller performance level for this restore. Keep in mind that Azure needs to allocate resources to support the restore. Because this is not a final deployment, smaller may go faster. However, NO SLAs exist for this process. That means your mileage will vary. We have seen restores happen in 30 minutes one day and over 5 hours the next. It will be very dependent on the data center and how busy it is. This time variation must be accounted for in your planning.

The next step is to move the server using the Move operation on the server page. You have the option to move to another resource group or another subscription. In our case, we will choose another subscription. IMPORTANT: You will need Contributor permissions in the target subscription in order to move the server to that subscription.

After you have moved the server to the target subscription, you need to set a restore point for that database on the migration server. Then you can restore that database to the target server. It is very important that you use the naming convention and performance level that you need for this restore, as it is the final step in the process. Once again, the restore process has no Microsoft SLA and as a result may take longer than planned. You need to have contingencies in place if you are working in a deployment window or have time restrictions.

Finally, you need to clean up the migration server. I would recommend either scaling down or pausing the Synapse database to give you a backup for a while if needed. Once the database is validated on the target server, you can remove the Synapse database (removes storage costs). I would recommend keeping this server as your migration server to use in the future. You can use this process to create copies of databases for development and UAT or similar needs from production instances.
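Pulling the steps above together, the whole move can be written down as an ordered checklist. The Python sketch below is only a paraphrase of this section for tracking progress during a deployment. The names `MOVE_STEPS` and `next_step` are made up for illustration and it does not call Azure at all.

```python
# Ordered checklist of the move described above (wording paraphrased).
MOVE_STEPS = [
    "Create the migration server, with an S0 placeholder database, in the source subscription",
    "Create a user-defined restore point on the source Synapse database",
    "Restore the database to the migration server (no SLA on duration)",
    "Move the migration server to the target subscription (needs Contributor there)",
    "Create a restore point on the database on the migration server",
    "Restore to the target server with the final name and performance level (no SLA)",
    "Validate on the target, then pause or scale down and later remove the migration copy",
]


def next_step(completed: int) -> str:
    """Return the next step, given how many steps are already done."""
    if completed >= len(MOVE_STEPS):
        return "All steps complete"
    return f"Step {completed + 1} of {len(MOVE_STEPS)}: {MOVE_STEPS[completed]}"


# Example: the first restore has finished, so three steps are done.
print(next_step(3))
```

Keeping the two restores as separate steps makes it obvious where the unpredictable waits sit in the plan.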

Other Thoughts and Considerations

Here are my final thoughts on this process. First, the lack of an SLA from Microsoft on the restore process has created issues for us in some cases. We have had to extend deployment windows during production deployments on occasion. My recommendation is to plan for the worst case and finish early if all comes together on time.
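To make "plan for the worst case" concrete, here is a rough Python sketch of the arithmetic. It uses the 30 minute and 5 hour restore times mentioned earlier and the two restores in this process. The one hour allowance for everything else is a hypothetical number, and since we saw restores run over 5 hours, treat the worst case as a floor rather than a guarantee.

```python
from datetime import timedelta

# Restore times observed in this post; "over 5 hours" means 5 is a floor.
FASTEST_RESTORE = timedelta(minutes=30)
SLOWEST_RESTORE = timedelta(hours=5)

# Source -> migration server, then migration server -> target server.
RESTORES_IN_MOVE = 2


def restore_window(other_work: timedelta = timedelta(0)):
    """Return (best_case, worst_case) duration for the whole move."""
    best = FASTEST_RESTORE * RESTORES_IN_MOVE + other_work
    worst = SLOWEST_RESTORE * RESTORES_IN_MOVE + other_work
    return best, worst


# Hypothetical allowance for the server move, validation, and cleanup.
best, worst = restore_window(other_work=timedelta(hours=1))
print(f"best case:  {best}")
print(f"worst case: {worst}")
```

The gap between the two numbers is the point: size the deployment window from the worst case, not the average.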

This process works! You can use it with other SQL assets and you can use it in multiple directions. Keep the migration server around so you can support other processes. If you clean most of it out, the cost of maintaining it is the S0 SQL Database.

One final thought: this is Azure. Thus, this guidance could change tomorrow. We had been using this process for about 12 months when this was written. I hope this helps some of you move these databases to support your business and development needs.

Welcome to 3Cloud …

On October 23, 2020 By Steve In 3Cloud, Azure, Pragmatic Works

A little over a month ago, Pragmatic Works Consulting was a part of a merger that included 3Cloud and Applied Cloud Services over a period of a few months. Let’s look at the journey.

June 30, 2020: 3Cloud Receives Growth Equity from Gryphon Investors
July 30, 2020: 3Cloud Acquires Applied Cloud Services
September 9, 2020: 3Cloud Acquires Pragmatic Works Consulting

From June through September, 3Cloud went from about 70 people to 170. We are now the largest Azure “pure-play” consulting company in the United States. And this is just the beginning…

So, what are my thoughts on this?

I am truly excited about the opportunity to grow a consulting company that is focused on Azure. Sure, there is a lot of change, but change is not bad. As we bring our three companies together to become one, there are challenges and successes. All of us have already benefited from the merging of skills and teams to create a more complete solution team for our customers.

I look forward to seeing how we evolve over the next few months and years. Exciting times are ahead!

Customer Impact

While working for a data-centric company like Pragmatic Works was great, the shift to cloud technologies and Azure data services required us to expand our capabilities beyond data and SQL Server. This merger allows us to immediately add value to our customers by adding application development and infrastructure capabilities to our toolbox. Beyond that, 3Cloud and ACS bring a mature managed services offering, including the ability to host and manage customer resources in Azure (CSP).

I think our customers get a significant boost in services as we become a more complete Azure company.

Some Final Thoughts

I will miss working directly with Brian Knight and Tim Moolic, two of the founding partners at Pragmatic Works. Their vision helped shape a great organization over 12 years. In case you did not realize it, Pragmatic Works will continue on as a training organization. You can still expect excellent technical training from the team there. We all will continue to learn and grow with their support.

If you are interested in joining our team or learning more about what 3Cloud offers, reach out to me at shughes@3cloudsolutions.com. I look forward to seeing you on a webinar or working with you in the future.

Data on Wheels – Kristyna Ferris & Steve Hughes
