Steps to Preload Data into Tables with SSDT

6 03 2013

I am working as the data architect and developer on a modern appMALL13_Badge_See125x125 build with a the team from Modern Apps Live! in Vegas.  The goal of the project is to provide guidance to build modern applications and use this application as a reference.  While the conference is focused on the why of the build, we have learned some interesting things about how as well.  This is one of those how items.

In this post, I needed to preload some data into the database.  I wanted to include this process in the database project I had created.  However, I quickly found out that this was not a straightforward as I thought it would be.  Here are the steps I followed and any of the gotchas along the way.

1. Create Scripts for the Load Queries.

I started out with scripts that included a DELETE statement followed by an INSERT statement.  However, this created problems when data existed, particularly when the table is a list table used as a foreign key.

Next, I tried MERGE.  This worked great.  This gives me a way to handle new records that are required for the lookup or any changes made to existing data.  Here is the script I used:

merge dbo.MVCategory as target 
using ( 
        select 1, 'Fun' 
        union 
        select 2, 'Technology' 
        union 
        select 3, 'Entertainment' 
        union 
        select 4, 'News' 
        union 
        select 5, 'Sports' 
        union 
        select 6, 'Off-Topic' 
    ) as source (CategoryID, CategoryName) 
    on target.CategoryID = source.CategoryID 
when matched then 
    update set target.CategoryName = source.CategoryName 
when not matched then 
    insert (CategoryID, CategoryName) values (source.CategoryID, source.CategoryName) 
;

After going through this process on my own, I also found the same recommendation from the SSDT team at Microsoft as noted here: http://blogs.msdn.com/b/ssdt/archive/2012/02/02/including-data-in-an-sql-server-database-project.aspx

2. Add the Scripts to Your Project

This step is pretty straight forward.  You can either create the script files and add them to your project or you can create them within your project as script files.

3. Change the Build Action to None

This was one of the key pieces I missed.  After I added the scripts to the project and then ran a build, it was broke the build.  Each of these files which were merge scripts reported an error during the build.  It turns out this is called out in the article I reference above as well.  SSDT (SQL Server Data Tools) is designed to build database objects not manipulate data.  One other area of grief caused by this is that you can break the build in the solution if your project is part of a bigger solution such as mine.  As a result, you will get grief from the other developers, you can trust me on this one.

The image below shows where to set the Build Action property to NONE.  This will exclude these files from the build in this format.

image

4. Add a PostDeployment Script to Your Project

If you do not already have a PostDeployment Script, you need to do this at this point.  This is a specific type of script task that can be found in the Add menu.

image

5. Add SQLCMD Statements to the PostDeployment Script

The final part of the process is to add SQLCMD statement to the PostDeployment script to execute the files you have created.  As noted in the help in the template, you can execute the scripts by calling a single SQLCMD statement for each script.

:r .\PreLoadMVCategory.sql

The :r {filename} syntax will expand the script for execution during a publish call or DACPAC creation.

I hope you find this useful as well.  This is a common task required in creating solutions.





T-SQL Window Functions on LessThanDot and at SQL Saturday 149

26 09 2012

LessThanDot Sit LogoI recently completed a series of blog posts on www.lessthandot.com on T-SQL Window functions.  The enhancements to SQL Server 2012 in this area are phenomenal.  They solve a myriad of issues including calculating running totals with SQL.  Check it out if you want to learn more and get some simple examples related to the functions and structure related to the window functions.  Here is the series outline and links to each section.

T-SQL Window Functions:

I do a presentation related to T-SQL functions for SQL Saturdays and am presenting it at the PASS Summit this year.  Maybe I will see you there.

I recently presented this at SQL Saturday #149 in Minnesota.  Here is the presentation and the demo code. Thanks for attending.

 

Finally, if you use Oracle, you will find this series helpful as well.  Most of the syntax is supported in Oracle as well.  Look for an Oracle tip with the Oracle samples for your use soon.





SQL Saturday #149 and CodeMastery–Minnesota Events

18 09 2012

sqlsat149_webWe are less than two weeks away from SQL Saturday #149 in Minneapolis on September 29, 2012 with two preconference sessions on September 28.  In case you haven’t heard, we are having the main event on a Saturday.  Yes, the precons are on Friday this year.  Check out the details here.  I am really excited about this event as we have a great group of local, regional, and national speakers at this event.  There are nine rooms being used for this event, so go out to the site and build your schedule.

cm-logoThe following Tuesday, Magenic is hosting CodeMastery with a BI track at the Microsoft Technology Center in Edina, MN.  This event includes a sessions on managing the BI stack in SharePoint and xVelocity.  The other track is Windows 8 development with sessions on WinRT and Game Development.

I’m a Speaker at Both Events

Besides plugging these two awesome events on their own, I am also a speaker for both events.  Here is what I will be speaking on at each event:

SQL Saturday #149: A Window into Your Data: Using SQL Window Functions

In this session, I will walk through the window functions enabled by the OVER clause in SQL Server.  Come join me as we celebrate the SQL Server 2012 release of analytic functions and expansion of aggregate functionality to support tasks such as running totals and previous row values.  Thankfully, this is a demo heavy session as it is one of the last sessions of the day.

CodeMastery: Data Mining with the Tools You Already Have

The next week, I will be presenting on data mining tools which Microsoft has made available to us in SSAS and Excel.  The goal of this session is to help developers understand how to implement data mining algorithms into their business intelligence solutions.

I look forward to seeing you at both events.  They are priced right, FREE!





O, There’s the Data: Using OData in SSIS

23 07 2012

image

The Open Data Protocol (OData) is an open specification created Microsoft to enable exposing data in a standard way from a variety of sources.  OData is natively supported in many of Microsoft’s products including PowerPivot, Excel 2013, SQL Server 2012 Analysis Services Tabular Model, Windows Communication Foundation (WCF), and Entity Framework to name a few.  Furthermore, Microsoft uses OData to expose data feeds from the Windows Azure Data Marketplace as well.

I pursued adding an OData source to SSIS as a result of Mark Souza’s presentation at the Minnesota SQL Server User Group in April 2012.  I posed a question about easier interaction with Oracle.  He mentioned that OData would be a good way to solve that issue.  This led me to put together a presentation which I delivered for PASSMN in July 2012 entitled O, There’s My Data: The Open Data Protocol.  At that presentation, I reviewed the “pain and agony” of a data pro putting together a data feed using Entity Framework in C# and WCF to expose it.  For the most part, with the help of .NET pros at Magenic including Dave Stienessen ( B ) and Sergey Barskiy ( B ), I was able to create my first entity model and expose it using WCF.  After that I worked on how to consume the feed without purchasing a 3rd party tool.  Here is the rest of the story.

Using ATOM as Shown in a Channel 9 Exercise

While looking for solutions that allowed me to implement an OData feed into an SSIS package, I came across a Hands on Lab on Channel 9.  While the focus was on Reporting Services, I was able to use the steps to create a package that would read a feed and make the data available to the ETL process.  In a nutshell, this exercise involved three tasks – creating an ATOM file, processing the ATOM file and loading the data using an HTTP connection manager pointed to the OData feed.  While you are creating this package, you should run each step after you have created it in order to use the files created in the following steps.

image

Task 1 – Create ATOM File (Script Task)

In the Main method,  I used the following code which was copied and adapted from the Channel 9 exercise. (NOTE: The code for this script has an error.  The object declaration should be condensed to one line to work properly.)

public void Main()
 {
 // Get the unmanaged connection
 object nativeObject = Dts.Connections["TestSvc"].AcquireConnection(null);
    // Create a new HTTP client connection
 HttpClientConnection connection = new HttpClientConnection(nativeObject);
    // Save the file from the connection manager to the local path specified
 string filename = "C:\\Source\\SSIS 2012 Projects\\ODataIntegration\\Departments.atom";
 connection.DownloadFile(filename, true);
Dts.TaskResult = (int)ScriptResults.Success;

}

This task will create an ATOM file that will be used in the next step to retrieve the data.

Task 2 – Process ATOM File (XML Task)

This task will use the new ATOM file to create an XML file with the data.  It uses the XSLT operation type pointing to the File Connection Manager created in the previous step as the source.  This will result in another File Connection Manager to support the destination XML file with the data.  Finally, in the exercise as second operand set of XML is used to clear unsupported headers.  Admittedly, I just copied this straight from the example and still am not sure of the details of what it does.

Here is a look at the XML Task Editor so you can see the settings I used.

image

Here is the code from the Channel 9 exercise used in the SecondOperand property:

<?xml version="1.0" encoding="utf-8" ?>
  <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="no" />
 <xsl:template match="/|comment()|processing-instruction()">
 <xsl:copy>
 <xsl:apply-templates />
  </xsl:copy>
  </xsl:template>
  <xsl:template match="*">
  <xsl:element name="{local-name()}">
  <xsl:apply-templates select="@*|node()" /> </xsl:element>
  </xsl:template>
  <xsl:template match="@*">
  <xsl:attribute name="{local-name()}">
  <xsl:value-of select="." />
  </xsl:attribute>
  </xsl:template>
  </xsl:stylesheet> 

Task 3 – Load Data (Data Flow Task)

The final task is a straightforward data load using the XML Source Component pointed at the file XML file I created.  Then I created a matching table in a database which I used as the destination.image

Wrap Up on the ATOM Feed Option

This will work with SSIS 2008 and SSIS 2012.  I tested most of the work in 2012, but the code in the illustration supports 2008.  This option does require that the package write at least two files to the server to work correctly.  In some cases, this will not work in enterprise environments as the data will now rest on the server for a period of time or the admins do not want files created on the server.

Using a Custom SSIS Source to Get the Data

NOTE: This is the preferred solution, but is not available in SSIS 2008 which uses the .NET 2.0 Framework.  This solution requires the .NET 3.5 Framework.

This version uses a custom SSIS source to connect to the OData feed and populate the data flow pipeline.  I did not find this option illustrated anywhere and used help from the Dave and Sergey to put this together.  I spent many hours trying to solve this issue and at the end of the day, it is fairly simple.  So, hopefully, this will save you some time as well.

This package only has one workflow task – a data flow task which contains the rest of the code.  In the data flow task, I have a Script Component implemented as a source and a Row Count with a data viewer on the pipeline to check results.

image

This was my first experience creating a custom source.  I used a post from SSIS Talk – SSIS Using a Script Component as a Source as a reference.  If you need help creating your first script source check it out.

Be sure to set your outputs prior to creating the script or you will not have them available to map to in the code.  You also need to add the HTTP Connection Manager you are using to point to your OData feed.

Add References, Using Statements, and Declarations

Once you have the basics set up, you need to add some references including the targeted data service and System.Data.Services.Client.  These are the key references for the code we are implementing.

image

Once you have these references you will need to add the following to the Using statements to the Namespaces region.

using System.Data.Services.Client;
 using SC_68e99fec2dce4cd794450383662f6ac7.TestSvc;

The SC_ reference is the internal name for your script component and will be different from mine, although it will likely be in the same format.

Next, you need to add the following declarations in the ScriptMain class as shown here.

public class ScriptMain : UserComponent
 {
private Uri svcUri = new Uri  (http://localhost/ODataSQL/TestSvc1.svc);
 private AdventureWorksEntities context;

The AdventureWorksEntities is from the service reference I created. You will need to know the context name for the service reference you are using.

The Working Code: Using DataServiceQuery

In the CreateNewOutputRows method in the SSIS script you will add code that runs a DataServiceQuery which adds the data to the data flow pipeline. In my case, my Output was called Departments and created the buffer reference you see in the code.  It has the output fields I defined for my source.  Here is the code I used to implement the solution.

public override void CreateNewOutputRows()
 {
 context = new AdventureWorksEntities(svcUri);
 DataServiceQuery<Department> dept = context.Departments;
    foreach (Department d in dept)
 {
 DepartmentsBuffer.AddRow();
        DepartmentsBuffer.DeptID = d.DepartmentID;
 DepartmentsBuffer.DeptName = d.Name;
 DepartmentsBuffer.GroupName = d.GroupName;
 }

This will query the service and return the rows. Alas, that is all it really took to solve this problem.  While this solution does not work in SSIS 2008, if you are planning to use a lot of OData, I would recommend using this as another reason to upgrade to SQL Server 2012.

SSIS Needs an OData Source Component

What I found interesting is that Microsoft does not have a native method to load OData feeds into the Data Flow Task in SSIS.  I have since created an Connect item to see if we can get this added.  Vote here if you agree.

Resources Used throughout the Process

Connecting to Windows Azure SQL Database Through WCF

Loading Data from an ATOM Data Feed into SQL Server

SSIS – Using a Script Component as a Source

DataServiceContext Class

Chris Woodruff – 31 Days of OData Blog Series

PASSMN Presentation – July 17, 2012

Consuming SharePoint Lists via OData and SSIS – Uses Linq





Presenting at PASS Summit 2012

22 06 2012

I am pleased to announce that I will be speaking at the PASS Summit again after a four-year “break”.  I get the privilege to present on some awesome new technologies released in SQL Server 2012.

Here are my sessions for this year’s Summit:

A Window into Your Data:  Using SQL Window Functions

Application & Database Development Track

Window functions are an underused feature in T-SQL that have been greatly expanded in SQL Server 2012. These functions can help you solve complex business problems such as running totals and ranking. If you have never used these functions or are looking to solve ranking and aggregate types of calculations without using GROUP BY, join us for this demo-filled session on how and when to use SQL window functions.

Building a Tabular Model Database

BI Platform, Architecture, Development & Administration Track

Come learn about the tabular model in SQL Server 2012 Analysis Services and watch a model built from “Create Project” to deployment. Microsoft introduced the tabular model in SSAS 2012. Its supporting technology is based on the xVelocity query processor, which was introduced in PowerPivot. In this session, we will create a tabular model database in SSAS using SQL Server Data Tools. We will add data from various types of data sources and build relationships between them, then extend the solution by adding calculations, measures, and hierarchies. Finally, we will cover some of the management techniques required to support a tabular model in an enterprise solution.

Check out the other sessions that have been announced and get signed up soon to take advantage of the current price $1,395 which is good until June 30, 2012, at http://www.sqlpass.org/summit/2012/.  I will look forward to seeing you there and at other community events this year.








Follow

Get every new post delivered to your Inbox.

Join 538 other followers

%d bloggers like this: