SQL Azure’s place in the data world (Part 2)

In Part 1, I discussed the synchronization of data and availability of data on Microsoft’s SQL Azure platform.  In Part 2, I would like to discuss the next set of questions related to using SQL Azure with your data solution.  Here are the questions I will address in this post:

  • How much data can be supported by SQL Azure?
  • Is it really cost effective?

How much data can be supported by SQL Azure?

The answer to this question is entirely dependent on how and, in some cases, who you ask.  SQL Azure currently only supports databases up to 50 GB in size.  Because this is subscription based, you will only pay for what you use or allocate.  I will delve into the subscription and their cost in the next section.  For our purposes, it is important to realize that any database up to 50 GB is supported by Microsoft.  This limitation leads to a couple of additional questions.  First, how is the size measured?  Keep in mind that you are managing by the database not the server, so the system databases are not included in this size.  Furthermore, Microsoft does not include the size of the log in this measurement.  So when you determining how much space you will need, focus on the size of the MDF file and be sure to check how much is actually in use.

The second question that comes to mind is what happens if you need more than 50 GB? While not supported directly, it is possible to distribute or to “shard” a larger database across multiple smaller databases.  Currently this is a custom solution, but Federation is coming.  With Federation, SQL Azure will natively support sharding a large database and be able to present the database as a single unit which will make development easier.  If you can’t wait for SQL Azure Federation, check out the SQL Server and SQL Azure Shard Library available on CodePlex (http://enzosqlshard.codeplex.com/).

So you can see that if you would like to support or scale out a larger database solution it is possible with SQL Azure.

Is it cost effective?

At the time of this writing SQL Azure subscriptions break out as follows (http://www.microsoft.com/windowsazure/pricing/#sql)

Edition Max Size Cost per Month
Web Edition 1 GB $9.99
Web Edition 5 GB $49.99
Business Edition 10 GB $99.99
Business Edition 20 GB $199.98
Business Edition 30 GB $299.97
Business Edition 40 GB $399.96
Business Edition 50 GB $499.95

Microsoft also charges for data transfers.  Currently it is $0.10/GB for data in and $0.15/GB for data out.

For those of you who don’t have the privilege of working with software licensing, may not be able to appreciate the complexity of determining the cost effectiveness of the solution.  For comparison let’s review the MSRP cost for SQL Server Standard and Enterprise licenses.  (You check out other licensing options here:  http://www.microsoft.com/sqlserver/2008/en/us/pricing.aspx.)  SQL Server 2008 R2 Enterprise Edition costs $27,595/processor and SQL Server 2008 R2 Standard Edition costs $7,171/per processor.  These are the listed prices.  Customers can get a variety of discounts or other pricing options depending on their agreements with Microsoft including Software Assurance, Enterprise Agreements, and Volume Licensing.  In any case, there is is a significant license cost when ramping up a production SQL Server.

Besides licensing, you have to consider the cost of the hardware and personnel to maintain the servers.  By the time you have premium servers and a disk solution in place you could easily have spent around $100,000 for a simple production environment.  The next piece of the puzzle is related to additional manpower required to support the environment.

Now we have a bunch of numbers, but how can you compare solutions?  There are some significant differences between a SQL Azure solution and a SQL Server solution.  First, and most glaring to me, is the fact that SQL Azure is managed at the database level while SQL Server is managed at the server or instance level.  When you are trying to determine the cost-effectiveness for your application in either solution, you need to determine the number of databases required for your solution.

Next, you need to determine the cost of an on premise solution on a monthly basis so you can compare the solutions over an expected period of time.  Generally speaking, we expect hardware and software licenses to last around three to four years before an upgrade will be required for the solution.  In the case of SQL Azure, you have a known monthly cost with hardware and software licensing and related upgrades “included” in the cost.

Finally, when looking at SQL Azure, you might think that the fact it does not include Analysis Services or Integration Services in the license will make it cost-prohibitive in your solution.  However, since best practices dictate that separate servers be used for these services, they really do not factor into the cost-effectiveness of the solution.

So, what does this all mean?  It means you have some work to do to determine how cost-effective a SQL Azure solution will be for your application.  So let’s assume the following hardware/software solution to provide disaster recovery and high availability for your on premise database:

Component Cost factor Total
Dell PowerEdge with PowerVault disk – 2 Servers (2 proc/16GB) $65,000 with Rack $65,000
SQL Enterprise Licenses $25,000 / processor (Active/Passive Cluster) $50,000
Install and setup $100/hour $4,000
Total $119,000
$ Per month for 3 years $3305

As you can see, a simple on premise solution without considering the cost of maintenance and without considering the setup time will cost you around $3300 a month.  At this point, $500 a month plus data transfer costs should be less on SQL Azure for your database.  So, like me, you might think that dropping Enterprise to Standard licensing will make up the difference, but it still costs $2300 a month.  That means that you can get Enterprise capabilities with HA built in for less than standing up your own Standard environment.

Where this will break down is when your databases are larger than 50GB.  At that point you will have to scale both solutions and determine what is the more cost-effective solution for you.  You could reasonably use sharding to increase your space to 200 GB and still be cost-effective.  I admittedly did not think the cost savings would be that significant for SQL Azure.  Microsoft appears to be on the right track to provide reasonable pricing for SQL Azure.  I would be interested in hearing additional cost factors for on premise versus off premise database solutions.  (Please comment on this blog with additional factors you see that would impact the cost-effectiveness analysis of the SQL Azure solution.)

In the final blog entry on this topic, I will delve into solutions that make sense on SQL Azure and yes there are some cases for using SQL Azure sooner rather than later.

SQL Azure’s place in the data world (Part 1)

I recently presented on SQL Azure at Code Mastery in Minneapolis.  The presentation focused on using SQL Azure in application development.  We took the time to create a new server and a new SQL Azure database.  After that part of the demo, we hooked up various management tools (SQL Server 2008 R2 Management Studio and SQL Azure Database Manager) and applications (e.g.  SQL Server 2008 Integration Services, Microsoft Access, Microsoft Excel).

At the end of that presentation, I started discussing how developers and architects could use SQL Azure.  That raised a number of questions on where SQL Azure fits into the data or information architecture in the real world.  Some of the questions that matter to implementers were:

  • Can you back up databases?
  • What is Microsoft’s SLA for SQL Azure?
  • How easily can you keep the data up to date between on premise and cloud databases?
  • How much data can be supported by SQL Azure?
  • Is it really cost effective?
  • What are some use cases that make sense based on the answers to the questions above?

I will tackle these questions in this and some following posts. Hopefully you will get a perspective of where SQL Azure is today and how it may or may not fit in your environment.

Can you backup databases?
What is Microsoft’s SLA for SQL Azure?

The question behind these questions is what are my disaster recovery and high availability options from service and data standpoints.  One of the coolest features of SQL Azure is the fact that it has built in disaster recovery and high availability at the service level.  Microsoft creates three copies of your database within their data centers.  In this way, Microsoft guarantees 99.9% availability of SQL Azure databases.  (See http://www.microsoft.com/windowsazure/sla/ for more details.)

However, this only covers the service availability.  What happens if your data becomes suspect or corrupt?  Well for starters, SQL Azure has no support for the standard backup and restore operations we are used to.  Microsoft is currently pointing us to creating a copy of the database within the SQL Azure platform.  (Details: http://msdn.microsoft.com/en-us/library/ff951631.aspx.)  This will be a new SQL Azure database which is a copy of the existing database.  You can use SQL Azure Data Sync (CTP1) to keep these databases in sync.  This still does not have a way to bring that database backup on premise for storage and historical retention.  Your best option at this point is to use SQL Server Integration Services until SQL Azure Data Sync supports on-premise to cloud synchronization which is expected in CTP2.

How easily can you keep the data up to date between on premise and cloud databases?

So the last questions lead into how to keep the data up to date between on premise and cloud databases.  The best option is most likely yet to come in the SQL Azure Data Sync CTP2.  When that is released, you will be able to exchange data between on premise and cloud databases with built in tools.  Until then, you do have some options.  First, you can look at using SQL Server Integration Services to keep the data moving between the two databases.  It is very straightforward to create an ADO.NET data connection and point it to a SQL Azure database.  This can be a source or destination for the data and it works very well.  Beyond that you can use BCP or T-SQL scripts to move data back and forth, but this involves more development work to complete.

As you can see, until Data Sync support is ready for primetime, keeping data synchronized between on premise and Azure databases will require some development on your end.  You will need to review the current options and stay alert for new options for syncing your data at the moment.

Next post

In the next post, we will discuss database size, cost, and effective use cases for SQL Azure.  Stay tuned.