
JSON in SQL Server 2016 – The Good, The Bad, and The Ugly


Microsoft introduced support for JSON data in SQL Server 2016 and in Azure SQL Database. I was excited to see this functionality coming. As an early user of XML when it was introduced, my expectations were high. Microsoft has done some really good work with JSON support, but I find that it is not as comprehensive as I had hoped.

The Good: Functions to Work with JSON Data

SQL Server now has some built-in functions that support working with JSON. These are on par with XML data type methods. Here is the rundown on what they are and what they do.

OPENJSON

OPENJSON allows us to convert JSON to a tabular format. This function supports two output formats: default and explicit. Default returns one row per element with key, value, and type columns, whereas explicit lets us define the output columns and their JSON paths in a WITH clause. Here is a sample of the syntax that can be used.

-- one output row per seat in each restaurant's SeatsJSON document
select Restaurant.RestaurantName, Restaurant.City, Restaurant.SeatsJSON, Seats.*
from dbo.Restaurant
cross apply openjson(Restaurant.SeatsJSON, '$.Seats')
-- explicit schema: each column is mapped to a JSON path
with (SeatNumber int '$."Seat Number"'
 , SeatCode varchar(50) '$.SeatCode'
 , TableNumber int '$.TableNumber'
) as Seats
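
To make the difference between the two output formats concrete, here is a minimal sketch that runs against a variable instead of a table. The document in @SeatsJSON is made up for illustration and is not the data behind dbo.Restaurant. Keep in mind that OPENJSON requires the database to be at compatibility level 130 or higher.

```sql
-- Hypothetical sample document, used only for this illustration.
DECLARE @SeatsJSON nvarchar(max) = N'{
  "Seats": [
    { "SeatCode": "SSMSP-1-1", "TableNumber": 1, "Seat Number": 1 },
    { "SeatCode": "SSMSP-1-2", "TableNumber": 1, "Seat Number": 2 }
  ]
}';

-- Default schema: one row per array element with key, value, and type columns.
SELECT [key], [value], [type]
FROM OPENJSON(@SeatsJSON, '$.Seats');

-- Explicit schema: the WITH clause maps JSON paths to typed columns.
SELECT SeatNumber, SeatCode, TableNumber
FROM OPENJSON(@SeatsJSON, '$.Seats')
WITH (
    SeatNumber  int         '$."Seat Number"',
    SeatCode    varchar(50) '$.SeatCode',
    TableNumber int         '$.TableNumber'
);
```

The first query returns each seat as a JSON object in the value column. The second returns typed columns that can be joined and filtered like any other rowset.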

FOR JSON

FOR JSON works in reverse. This allows us to convert tabular data into JSON. AUTO automatically formats the results as arrays and nested arrays when joins are used, and you can use table aliases as the names of the nested objects. PATH lets you control the shape of the results yourself by using dot-separated column aliases as JSON paths.
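
Here is a small sketch of both modes. The temp tables, their columns, and the rows in them are hypothetical stand-ins that I made up so the example can run on its own.

```sql
-- Hypothetical temp tables so the example is self-contained.
CREATE TABLE #Restaurant (RestaurantID int, RestaurantName varchar(100), City varchar(50));
CREATE TABLE #Seat (RestaurantID int, SeatCode varchar(50), SeatNumber int);

INSERT INTO #Restaurant VALUES (1, 'Sensational Servings MSP', 'Minneapolis');
INSERT INTO #Seat VALUES (1, 'SSMSP-1-1', 1), (1, 'SSMSP-1-2', 2);

-- AUTO: nesting follows the join; the alias Seats names the nested array.
SELECT Restaurant.RestaurantName, Seats.SeatCode, Seats.SeatNumber
FROM #Restaurant AS Restaurant
JOIN #Seat AS Seats ON Seats.RestaurantID = Restaurant.RestaurantID
FOR JSON AUTO;

-- PATH: dot-separated column aliases define the nesting explicitly.
SELECT r.RestaurantName AS [Restaurant.Name],
       r.City           AS [Restaurant.City]
FROM #Restaurant AS r
FOR JSON PATH, ROOT('Restaurants');

DROP TABLE #Seat;
DROP TABLE #Restaurant;
```

With AUTO the shape of the output changes whenever the join or the column order changes. PATH is more work, but the output stays predictable.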

ISJSON

ISJSON can be used against a string value to determine if it contains valid JSON. This can be handy when working with JSON functions or even in a CHECK constraint to make sure the column has properly formatted data.
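
A CHECK constraint of that kind might look like the sketch below. The table dbo.RestaurantDemo and the constraint name are hypothetical and exist only for this illustration.

```sql
-- Hypothetical table; the constraint rejects strings that are not valid JSON.
CREATE TABLE dbo.RestaurantDemo (
    RestaurantID   int           NOT NULL PRIMARY KEY,
    RestaurantName varchar(100)  NOT NULL,
    SeatsJSON      nvarchar(max) NULL,
    CONSTRAINT CK_RestaurantDemo_SeatsJSON CHECK (ISJSON(SeatsJSON) = 1)
);

-- Succeeds: the string is valid JSON.
INSERT INTO dbo.RestaurantDemo (RestaurantID, RestaurantName, SeatsJSON)
VALUES (1, 'Sensational Servings MSP', N'{"Seats": []}');

-- Fails with a CHECK constraint violation: the array is never closed.
INSERT INTO dbo.RestaurantDemo (RestaurantID, RestaurantName, SeatsJSON)
VALUES (2, 'Sensational Servings LAS', N'{"Seats": [');
```

A NULL value still passes, because ISJSON returns NULL for NULL input and a CHECK constraint only rejects rows where the condition is false.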

JSON_VALUE

JSON_VALUE returns a scalar value from a JSON string using a JSON path. The key here is that the value returned is scalar, so working with arrays can be an issue if you cannot identify the position in the array.

JSON_VALUE (SeatsJSON, '$.Seats[0].SeatCode')

JSON_QUERY

JSON_QUERY returns a JSON formatted array or object using JSON path.

JSON_QUERY (SeatsJSON, '$.Seats')
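
The two functions complement each other, and it is easy to reach for the wrong one. The sketch below uses a made-up document in a variable to show what each returns. It assumes the default lax path mode, in which a mismatch returns NULL instead of raising an error.

```sql
-- Hypothetical sample document, used only for this illustration.
DECLARE @SeatsJSON nvarchar(max) =
    N'{"Seats":[{"SeatCode":"SSMSP-1-1","Seat Number":1}]}';

SELECT
    -- Scalar value at a known array position.
    JSON_VALUE(@SeatsJSON, '$.Seats[0].SeatCode')      AS FirstSeatCode,
    -- A key that contains a space must be wrapped in double quotes.
    JSON_VALUE(@SeatsJSON, '$.Seats[0]."Seat Number"') AS FirstSeatNumber,
    -- JSON fragment: the whole array comes back as JSON text.
    JSON_QUERY(@SeatsJSON, '$.Seats')                  AS SeatsArray,
    -- NULL in lax mode, because the path points to an array, not a scalar.
    JSON_VALUE(@SeatsJSON, '$.Seats')                  AS NotScalar,
    -- NULL in lax mode, because the path points to a scalar, not a fragment.
    JSON_QUERY(@SeatsJSON, '$.Seats[0].SeatCode')      AS NotFragment;
```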

JSON_MODIFY

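JSON_MODIFY allows us to change the value at the JSON path specified. It does not change the input in place. Instead, it returns the updated JSON string, which you assign back to the variable or column.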

JSON_MODIFY(@RestJSON, '$.Restaurant.ZIP', '55337')
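
Here is a short sketch of the common cases. The starting document, the original ZIP value, and the State key are hypothetical, and the behavior shown assumes the default lax path mode.

```sql
-- Hypothetical starting document, used only for this illustration.
DECLARE @RestJSON nvarchar(max) =
    N'{"Restaurant":{"RestaurantName":"Sensational Servings MSP","ZIP":"55401"}}';

-- Update an existing value; the result must be assigned back.
SET @RestJSON = JSON_MODIFY(@RestJSON, '$.Restaurant.ZIP', '55337');

-- Add a key that does not exist yet.
SET @RestJSON = JSON_MODIFY(@RestJSON, '$.Restaurant.State', 'MN');

-- Passing NULL removes the key.
SET @RestJSON = JSON_MODIFY(@RestJSON, '$.Restaurant.State', NULL);

SELECT @RestJSON AS UpdatedJSON;
```

The same pattern works in an UPDATE statement by setting the column to the result of JSON_MODIFY on itself.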

I know that I highlighted only the basics of these functions. I plan to follow up on them later. These functions represent the good parts of the JSON implementation in SQL Server.

The Bad: Not a Datatype

My key disappointment with the implementation is the fact that JSON is not a native data type like XML. At first I did not think this would be an issue, but after working with the new functionality, it feels incomplete. We can add a constraint with the ISJSON function to make sure our data is valid JSON, but XML is simply a data type and needs no such workaround. The biggest miss from not having a data type is likely indexing, which I talk about next.

The Ugly: Indexes

This is the worst part of the JSON solution. Because it is not a data type, no native indexing is supported. The current recommendation is to create a computed column using the JSON_VALUE function and index that column. However, JSON_VALUE returns a single scalar, so a computed column can only target a fixed position in an array, which makes the indexes of limited value. In a simple set of data, such as seats in a restaurant shown below, you cannot index the seats, only the restaurants.

{
    "Restaurants": [
        {
            "Restaurant": {
                "Restaurant ID": 1,
                "RestaurantName": "Sensational Servings MSP",
                "Seats": [
                    {
                        "SeatCode": "SSMSP-1-1",
                        "TableType": "Bar",
                        "Seat Number": 1
                    },
                    {
                        "SeatCode": "SSMSP-1-2",
                        "TableType": "Bar",
                        "Seat Number": 2
                    }
                ]
            }
        },
        {
            "Restaurant": {
                "Restaurant ID": 2,
                "RestaurantName": "Sensational Servings LAS",
                "Seats": [
                    {
                        "SeatCode": "SSLAS-1-101",
                        "TableType": "Bar",
                        "Seat Number": 101
                    },
                    {
                        "SeatCode": "SSLAS-1-102",
                        "TableType": "Bar",
                        "Seat Number": 102
                    }
                ]
            }
        }
    ]
}

So if I am looking for seats with the TableType Bar, I cannot answer that from an index without storing the JSON differently, because each restaurant contains an array of seats. With the attribute inside the array, the index cannot contain every instance of the seats. This will result in a table scan in most cases.
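
The query itself is easy enough to write. A sketch against a trimmed copy of the document above, held in a variable, could look like this. Every document has to be shredded with OPENJSON before the filter on TableType can be applied, which is why an index does not help.

```sql
-- Trimmed copy of the sample document, held in a variable for illustration.
DECLARE @doc nvarchar(max) = N'{
  "Restaurants": [
    { "Restaurant": {
        "Restaurant ID": 1,
        "RestaurantName": "Sensational Servings MSP",
        "Seats": [
          { "SeatCode": "SSMSP-1-1", "TableType": "Bar", "Seat Number": 1 },
          { "SeatCode": "SSMSP-1-2", "TableType": "Bar", "Seat Number": 2 }
        ]
    } }
  ]
}';

SELECT r.RestaurantName, s.SeatCode, s.SeatNumber
FROM OPENJSON(@doc, '$.Restaurants')
     WITH (
         RestaurantName varchar(100)  '$.Restaurant.RestaurantName',
         -- AS JSON keeps the nested array as JSON text; it requires nvarchar(max).
         Seats          nvarchar(max) '$.Restaurant.Seats' AS JSON
     ) AS r
CROSS APPLY OPENJSON(r.Seats)
     WITH (
         SeatCode   varchar(50) '$.SeatCode',
         TableType  varchar(20) '$.TableType',
         SeatNumber int         '$."Seat Number"'
     ) AS s
WHERE s.TableType = 'Bar';
```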

We can add indexes to simple JSON snippets, but complex JSON will result in table scans due to the nature of the functions. You should test your solutions to determine if the index is sufficient to support your query pattern.
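
For the simple case, the computed column approach can be sketched as follows. The table dbo.RestaurantDoc, its columns, and the index name are hypothetical. I cast the result because JSON_VALUE returns nvarchar(4000), which is wider than an index key should be.

```sql
-- Hypothetical table holding one restaurant document per row.
CREATE TABLE dbo.RestaurantDoc (
    RestaurantDocID int IDENTITY(1,1) PRIMARY KEY,
    RestaurantJSON  nvarchar(max) NOT NULL,
    -- Scalar property: one value per document, so it indexes cleanly.
    RestaurantName  AS CAST(JSON_VALUE(RestaurantJSON, '$.Restaurant.RestaurantName') AS varchar(100)),
    -- Array element: only a fixed position can be exposed this way.
    FirstSeatCode   AS CAST(JSON_VALUE(RestaurantJSON, '$.Restaurant.Seats[0].SeatCode') AS varchar(50))
);

CREATE INDEX IX_RestaurantDoc_RestaurantName
    ON dbo.RestaurantDoc (RestaurantName);

-- This predicate can use the index on the computed column.
SELECT RestaurantDocID, RestaurantName
FROM dbo.RestaurantDoc
WHERE RestaurantName = 'Sensational Servings MSP';
```

The FirstSeatCode column shows the limitation. It exposes only the seat at position 0, so an index on it would miss every other seat in the array.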

The Moral of the Story

The JSON functionality is similar to XML data type methods. The lack of real index support will likely cause issues with the functionality at scale. Use the functions to help make JSON more usable in your environment, but be aware of its limitations as well.


VS Live 2016 – Las Vegas Follow Up


I spoke at Visual Studio Live in Vegas on two topics. While the presentations were uploaded to the site and made available to attendees, the code had not yet been distributed, which was an oversight on my part. In this post, I will do a quick summary of the sessions and make sample code available. I will be writing more on these topics throughout the year and will tag VS Live in the notices.


JSON & SQL Server Finally Together

JSON is now part of SQL Server 2016. SQL Server now includes functions to generate and shred JSON. Here are the basics:

  • OPENJSON: Used to convert JSON data into a tabular format
  • FOR JSON: Used to create JSON from tabular data
  • ISJSON: Determines if the data in question is JSON
  • JSON_VALUE: Returns scalar values from JSON data
  • JSON_QUERY: Returns JSON formatted arrays or objects
  • JSON_MODIFY: Used to modify JSON data and properties

Even with all of this support, JSON is still not a native data type in SQL Server the way XML is.

You can download supporting files and code here.


Using Hive and Hive ODBC with HDInsight and Power BI

During this session I went through the process of setting up HDInsight and loading data into the cluster. Once the cluster was in place, I created Hive tables and wrote queries that were then used with Power BI to analyze the results.

You can find the details here.