
Setting Up Tabular Models on Windows Azure

In my last post, I discussed how to set up Oracle in Windows Azure. During a customer call, questions came up about using SQL Server Analysis Services (SSAS) tabular models with Azure. This post walks through setting up an Azure VM and deploying a tabular model to that VM.

If you do not have a Windows Azure account, you can use a trial account with your Microsoft or Live account. Check out http://www.windowsazure.com for details on how to “try it free.”

Setting Up the VM in Azure

From the Management Portal on your Azure account, select Virtual Machines on the left, then Add at the bottom. On the next screen, choose to create your VM from the gallery. You should see the Choose an Image option; I chose the SQL Server 2012 SP1 Enterprise image. You will need to use the Enterprise edition, as Tabular does not run on Standard. Note that the Data Warehousing image is NOT the Business Intelligence Edition of SQL Server.

You can also choose to create a “blank” VM and load up SQL Server on your own. I chose to use the image to speed up the process – which it did substantially.

After selecting the image, the next few steps guide you through setting up the VM. For the most part, the defaults will work fine. Be aware that once this is turned on, you will be charged for it running. It is up to you to make sure you understand the costs, even if you are using the free trial.

During the setup steps, you will create the VM and its related cloud service. Remember that the account you create is the admin account for the VM, and you will need those credentials to Remote Desktop into the VM. The last setup page covers Endpoints. Leave the defaults; we will add an endpoint for our tabular model later.

At this point, it will take a few minutes to set up your new VM. Once it is set up, open a Remote Desktop session into it. If you look at the services or at SQL Server Configuration Manager, you will notice that everything except a tabular instance has been set up for you. As a result, I would not recommend using this gallery image for a production deployment. You should look at creating your own template if you want a more locked down and refined setup.

Setting Up the Tabular Instance in Azure

As noted before, the tabular instance is not set up. The installation media is on the server, so you can run that to install your tabular instance. I won’t walk through the install process, but finding the media there was great because it meant I did not have to copy it to my VM.

Making the Tabular Instance Available

This section covers the tedious tasks required to make your tabular instance available for querying outside of the VM. There are three basic steps to getting your instance “online”: setting the port number in SSAS, updating the firewall, and adding endpoints. I will walk you through the steps I used to get this done followed by some references that helped me get here.

Setting the Port Number in SSAS

By default, SSAS instances, both multidimensional and tabular, use dynamic ports. In order to allow connections through the firewall and endpoints, the port number needs to be fixed. I used guidance from TechNet and followed these steps to set the port.

    1. Opened Task Manager to get the PID for MSOLAP$<<instance name>>.
    2. Ran netstat -ao -p TCP to get a list of ports used by current processes. Once I had identified my port number, I also noted the server IP address, which is required in the next step.
    3. Confirmed that I had the correct port by connecting to the instance using the IP address and port number.
    4. Went old school and modified the msmdsrv.ini file. The typical install path for this file is C:\Program Files\Microsoft SQL Server\<<instance name>>\OLAP\Config\msmdsrv.ini.
    5. Opened the file in Notepad and found the <Port>0</Port> tag.
    6. Changed the port number to the port identified above. (Technically, any open port number would work; I chose this one because I knew it was available.)
    7. Saved the changes and restarted the service.
    8. Confirmed once more that I could connect to the server with SSMS using the IP address and port number.

Now you have set up the SSAS instance to use a fixed port number.
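For reference, here is roughly what the relevant line in msmdsrv.ini looks like after the change. The port number 49300 is just a placeholder I will reuse in the examples below; substitute the port you identified.

<!-- msmdsrv.ini excerpt: 0 means a dynamic port; set it to the fixed port you identified -->
<Port>49300</Port>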

Updating the Firewall

Now that we have a port number, we can create a firewall rule. We access the firewall rules from Server Manager. In the Windows Firewall console, we will create a new Inbound Rule.

  1. The rule type is Port.
  2. Apply the rule to TCP and specify the port defined above.
  3. On the Action tab, select Allow the Connection. (If you are planning to use this in a production environment, verify your connection requirements.)
  4. Clear any profiles you don’t want the rule to apply to.
  5. Finally, give the rule a descriptive name.

Now we have the firewall rule in place to allow external connections for the tabular instance.
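If you prefer to script the rule rather than click through the console, a command like the following from an elevated prompt should create an equivalent rule. The rule name and the port 49300 are placeholders from my example setup.

netsh advfirewall firewall add rule name="SSAS Tabular 49300" dir=in action=allow protocol=TCP localport=49300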

Adding Endpoints

The final step to making the instance available is to add an endpoint in Azure. In the Windows Azure portal, go to the VMs page again, select the VM, and open the ENDPOINTS tab. From here we create a new endpoint.

  1. We are creating a standalone endpoint.
  2. Next, we select the TCP protocol and add the port number to both the private and public port textboxes.
  3. Finally, we apply the changes.

We are now ready to test connectivity.

References

Setting up VM Endpoints

Configuring Windows Firewall

Configuring Windows Firewall with SSAS

Connecting to the Tabular Instance in Azure

So, to verify this works, I wanted to connect to the model with SSMS on my desktop. However, SSAS uses Windows authentication and I am not on the same domain. (My IT staff is likely still laughing about me joining my VM to our domain.)

Thankfully, Greg Galloway (blog) reminded me of how to set up runas to use different credentials to connect to SSAS. Here is the syntax I used to connect to the tabular instance on Azure using a command window:

runas /netonly /user:<<VM name>>\<<username>> "C:\Program Files (x86)\Microsoft SQL Server\110\Tools\Binn\ManagementStudio\Ssms.exe"

This allowed me to open SSMS using those credentials. Next, I used the VM address and port to connect. You will see that the Windows credentials in the connection dialog are not what you entered on the command line. This is expected; the credentials you entered on the command line are passed through to the VM.
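For clarity, the server name I typed into the SSMS connection dialog followed this pattern, where the cloud service name comes from the VM setup and 49300 is the example port from earlier:

<<cloud service name>>.cloudapp.net:49300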

Deploying a Tabular Model to Azure

I attempted three different ways to deploy my model to the Azure VM. Two worked, one did not.

Deploying from Visual Studio on My Desktop

My first attempt was to deploy from Visual Studio on my desktop. I set the deployment properties to point to the Azure instance using the same credentials I had used with SSMS. I also set up a command line execution to use runas like with SSMS.

It appeared I could connect, but I continually got a permissions error. After much frustration, I gave up on it and moved on to the next option.

Using XMLA to Deploy

This is the most straightforward way to deploy an SSAS database to a different server. I used SSMS to generate the Create Database XMLA statement. Because I had not deployed it locally, I needed to modify the XMLA statement to remove the user name and GUID from the database name and database ID (AdvWorksTab1_SteveH_<<Some GUID>>).
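For reference, here is a trimmed sketch of the shape of that scripted command after the cleanup. The real ObjectDefinition generated by SSMS is much longer, and the database name below is just my example.

<Create xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <ObjectDefinition>
    <Database>
      <ID>AdvWorksTab1</ID>
      <Name>AdvWorksTab1</Name>
      <!-- data sources, dimensions, cubes, roles, and the rest of the definition scripted by SSMS -->
    </Database>
  </ObjectDefinition>
</Create>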

In a bit of irony, I could use the SSMS connection from my desktop (via runas) to deploy the database to the VM.

The reality is that this is an easy and acceptable way to deploy SSAS databases to production environments, whether in Azure or in your own datacenter.

Deploying from Visual Studio on the VM

The final method I used was deploying Visual Studio onto the VM and deploying from there. I used VisualStudio.com (TFS online) to transfer the source code from my desktop to the VM. I had to install the TFS client on the VM, but SSDT with BI tools is already there.

  1. Installed the VS 2010 TFS Client: http://www.microsoft.com/en-us/download/details.aspx?id=329
  2. Then installed Visual Studio SP1  http://social.msdn.microsoft.com/Forums/vstudio/en-US/4e4851dc-eb29-4081-9484-d38a6efa07ee/unable-to-connect-to-tfs-online?forum=TFService
  3. Finally installed VS2010 Team Foundation Server Compatibility GDR (KB2662296) http://www.microsoft.com/en-us/download/details.aspx?id=29082

With these installed, Visual Studio connected to TFS Online. I got the latest code for my project and pointed the project at my tabular instance.

Be sure to check the impersonation settings.

Next, I deployed the project to the local tabular instance on the VM and it worked great. This might make sense for development, but I would not use this method in a production environment.

Some Closing Thoughts

I was amazed at how simple it was to create the VM and set up tabular in Azure. Knowing what I know now, I would be able to set up a usable instance fairly quickly and deploy a database using XMLA without much effort. That was very nice.

Doesn’t work with Office 365

I started this project to determine the connectivity capability with Office 365. Well, that does not work in my current configuration. I was able to create a workbook on my desktop using my Azure tabular model and Excel. It works just as you would expect. However, when I deployed the workbook to Office 365, data refresh always failed.


My next thought was to use a data gateway, but those only work with the SQL Server database engine and Oracle, not SSAS. This is still a significant obstacle to Power BI fully taking advantage of existing BI assets at a company.

Using Azure Active Directory

My next step would be to use Azure Active Directory services to try to get Windows Authentication to work throughout. But that is for a later project and post.


T-SQL Window Functions on LessThanDot and at SQL Saturday 149

I recently completed a series of blog posts on www.lessthandot.com on T-SQL window functions.  The enhancements in SQL Server 2012 in this area are phenomenal.  They solve a myriad of issues, including calculating running totals in SQL.  Check it out if you want to learn more and get some simple examples of the functions and structure related to window functions.  Here is the series outline and links to each section.

T-SQL Window Functions:

I do a presentation related to T-SQL functions for SQL Saturdays and am presenting it at the PASS Summit this year.  Maybe I will see you there.

I recently presented this at SQL Saturday #149 in Minnesota.  Here is the presentation and the demo code. Thanks for attending.
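If you want a quick taste of what the series covers, here is a minimal running-total example against the AdventureWorks Sales.SalesOrderHeader table. It is not the demo code from the session, just a sketch of the pattern enabled by the SQL Server 2012 framing options.

SELECT SalesOrderID,
       CustomerID,
       OrderDate,
       TotalDue,
       SUM(TotalDue) OVER (PARTITION BY CustomerID
                           ORDER BY OrderDate, SalesOrderID
                           ROWS UNBOUNDED PRECEDING) AS RunningTotal
FROM Sales.SalesOrderHeader;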

 

Finally, if you use Oracle, you will find this series helpful as well, since most of the syntax is also supported in Oracle.  Look for an Oracle tip with Oracle samples soon.

X-XMLA: vi. Processing and Out-of-Line Bindings in XMLA

In this post, I will progressively go through processing a partition, from a full process, to an incremental process, and finally to an incremental process with out-of-line bindings.  This will provide you with additional samples for examining the results of partition processing with XMLA.

Note: these examples use the Adventure Works DW 2008R2 SE sample database available on CodePlex (http://msftdbprodsamples.codeplex.com/).

Processing with XMLA

Processing is a command that can be executed in a batch and in parallel with similar tasks.  This is particularly useful when processing multiple objects that are not dependent on each other, such as dimensions.  The basic structure is as follows:

<Batch>
  <Parallel>
    <Process>
      <Object>
        ~object definition~
      </Object>
      <Type>…</Type>
      <ErrorConfiguration>…</ErrorConfiguration>
      <WriteBackTableCreation>…</WriteBackTableCreation>
    </Process>
  </Parallel>
</Batch>

Processing Type Options

This is a list of the most commonly used types when processing SSAS objects.

  • ProcessDefault:  This will determine the best option based on the condition of the object.  Be aware that if SSAS determines that a full process is required, it will do that. I would not recommend using this process type in most production situations.
  • ProcessFull: This option completely rebuilds the dimension, partition, cube or even database by removing existing data and reloading the data with aggregations and indexes.  In the case of cubes, measure groups and partitions, this is the equivalent of ProcessData + ProcessIndex.
  • ProcessClear: This option drops the data and related structures such as indexes and aggregations from the object.  Once this runs, no data is available in the object.
  • ProcessData:  This option drops the data and reloads it, but does not rebuild the aggregations or indexes. It works with cubes, measure groups, partitions, and dimensions.
  • ProcessAdd: This option adds new data to a partition or dimension.  It is not available as an option when processing dimensions from SQL Server Management Studio, but is available in XMLA scripts.
  • ProcessUpdate:  This option rereads the data and updates the dimension attributes.  It is only available for dimensions and will drop flexible aggregations and indexes on related partitions.
  • ProcessIndex: This option rebuilds indexes and aggregations for partitions.  It requires that the partition has already been loaded with data (ProcessData).

Of these options, the ones I have used most often when working with XMLA have been ProcessFull, ProcessData, ProcessAdd, and ProcessIndex.  I will use each throughout the examples below and describe the cases where they have applied in my projects over the years.

Processing with XMLA

The Process command in XMLA lets you define the object to be processed, the type of processing to implement, and how to handle any writeback partitions.  (NOTE: the ErrorConfiguration element is optional.) Overall, the structure is fairly simple.  Here is a simple example of XMLA that will fully process the Fact Internet Sales measure group.

<Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Parallel>
    <Process xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ddl2="http://schemas.microsoft.com/analysisservices/2003/engine/2" xmlns:ddl2_2="http://schemas.microsoft.com/analysisservices/2003/engine/2/2" xmlns:ddl100_100="http://schemas.microsoft.com/analysisservices/2008/engine/100/100" xmlns:ddl200="http://schemas.microsoft.com/analysisservices/2010/engine/200" xmlns:ddl200_200="http://schemas.microsoft.com/analysisservices/2010/engine/200/200" xmlns:ddl300="http://schemas.microsoft.com/analysisservices/2011/engine/300" xmlns:ddl300_300="http://schemas.microsoft.com/analysisservices/2011/engine/300/300">
      <Object>
        <DatabaseID>Adventure Works DW 2008R2 SE</DatabaseID>
        <CubeID>Adventure Works DW</CubeID>
        <MeasureGroupID>Fact Internet Sales 1</MeasureGroupID>
      </Object>
      <Type>ProcessFull</Type>
      <WriteBackTableCreation>UseExisting</WriteBackTableCreation>
    </Process>
  </Parallel>
</Batch>

The syntax is identical for processing dimensions.  The <Object> element identifies the object to be processed, starting with the database.  Next, the <Type> element specifies the processing type using the options from the list above.  Finally, the <WriteBackTableCreation> element instructs Analysis Services on how to handle a writeback partition that has not been fully processed.  The default is UseExisting, which will use the existing table or create a new table if needed.  This is only relevant if your cube supports writeback.
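For example, here is a sketch of a ProcessUpdate against a dimension, with the namespace declarations trimmed for readability. The DimensionID below is an assumption; script one of your own dimensions from SSMS to confirm the exact IDs.

<Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Parallel>
    <Process>
      <Object>
        <DatabaseID>Adventure Works DW 2008R2 SE</DatabaseID>
        <DimensionID>Dim Customer</DimensionID>
      </Object>
      <Type>ProcessUpdate</Type>
    </Process>
  </Parallel>
</Batch>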

Processing Affected Objects

This is a setting that accompanies the Process command.  By default it is set to True, which is the behavior of the example above even though the setting is not shown explicitly.  This means that any objects affected by the processing will be processed as well.  The best example: if you process a dimension and it ends up being fully processed, all related measure groups will be fully processed as well, keeping the data in a consistent, usable state.  However, if you are tightly managing your processing jobs, you may need to set this to False to reduce unexpected and blocking processing jobs.
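For what it is worth, when I have scripted processing with this option it has appeared as an attribute on the batch, as sketched below; verify the placement against a script generated from your own environment.

<Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine" ProcessAffectedObjects="false">
  <!-- Parallel and Process elements as in the examples above -->
</Batch>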

Using Out-of-Line Binding with Incremental Processing

A great feature of XMLA is the ability to dynamically bind data and even connections when processing SSAS objects.  This is called “Out-of-Line” binding (http://msdn.microsoft.com/en-us/library/ms128523).   One of the core concepts behind out-of-line bindings is that you can specify the connection and the data source query or table at run time.  This is particularly useful when you want to do incremental processing.

This first example illustrates how to use a query to change the data being added to the partition.

<Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Parallel>
    <Process xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ddl2="http://schemas.microsoft.com/analysisservices/2003/engine/2" xmlns:ddl2_2="http://schemas.microsoft.com/analysisservices/2003/engine/2/2" xmlns:ddl100_100="http://schemas.microsoft.com/analysisservices/2008/engine/100/100" xmlns:ddl200="http://schemas.microsoft.com/analysisservices/2010/engine/200" xmlns:ddl200_200="http://schemas.microsoft.com/analysisservices/2010/engine/200/200" xmlns:ddl300="http://schemas.microsoft.com/analysisservices/2011/engine/300" xmlns:ddl300_300="http://schemas.microsoft.com/analysisservices/2011/engine/300/300">
      <Object>
        <DatabaseID>Adventure Works DW 2008R2 SE</DatabaseID>
        <CubeID>Adventure Works DW</CubeID>
        <MeasureGroupID>Fact Internet Sales 1</MeasureGroupID>
        <PartitionID>Internet_Sales</PartitionID>
      </Object>
      <Type>ProcessAdd</Type>
      <WriteBackTableCreation>UseExisting</WriteBackTableCreation>
    </Process>
  </Parallel>
  <Bindings>
    <Binding>
      <DatabaseID>Adventure Works DW 2008R2 SE</DatabaseID>
      <CubeID>Adventure Works DW</CubeID>
      <MeasureGroupID>Fact Internet Sales 1</MeasureGroupID>
      <PartitionID>Internet_Sales</PartitionID>
      <Source xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ddl2="http://schemas.microsoft.com/analysisservices/2003/engine/2" xmlns:ddl2_2="http://schemas.microsoft.com/analysisservices/2003/engine/2/2" xmlns:ddl100_100="http://schemas.microsoft.com/analysisservices/2008/engine/100/100" xmlns:ddl200="http://schemas.microsoft.com/analysisservices/2010/engine/200" xmlns:ddl200_200="http://schemas.microsoft.com/analysisservices/2010/engine/200/200" xmlns:ddl300="http://schemas.microsoft.com/analysisservices/2011/engine/300" xmlns:ddl300_300="http://schemas.microsoft.com/analysisservices/2011/engine/300/300" xsi:type="QueryBinding">
        <DataSourceID>Adventure Works DW</DataSourceID>
        <QueryDefinition>Select 1 from SomeTable</QueryDefinition>
      </Source>
    </Binding>
  </Bindings>
</Batch>

The second example shows the syntax for changing a table or view in the bindings.

<Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Parallel>
    <Process xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ddl2="http://schemas.microsoft.com/analysisservices/2003/engine/2" xmlns:ddl2_2="http://schemas.microsoft.com/analysisservices/2003/engine/2/2" xmlns:ddl100_100="http://schemas.microsoft.com/analysisservices/2008/engine/100/100" xmlns:ddl200="http://schemas.microsoft.com/analysisservices/2010/engine/200" xmlns:ddl200_200="http://schemas.microsoft.com/analysisservices/2010/engine/200/200" xmlns:ddl300="http://schemas.microsoft.com/analysisservices/2011/engine/300" xmlns:ddl300_300="http://schemas.microsoft.com/analysisservices/2011/engine/300/300">
      <Object>
        <DatabaseID>Adventure Works DW 2008R2 SE</DatabaseID>
        <CubeID>Adventure Works DW</CubeID>
        <MeasureGroupID>Fact Internet Sales 1</MeasureGroupID>
        <PartitionID>Internet_Sales</PartitionID>
      </Object>
      <Type>ProcessAdd</Type>
      <WriteBackTableCreation>UseExisting</WriteBackTableCreation>
    </Process>
  </Parallel>
  <Bindings>
    <Binding>
      <DatabaseID>Adventure Works DW 2008R2 SE</DatabaseID>
      <CubeID>Adventure Works DW</CubeID>
      <MeasureGroupID>Fact Internet Sales 1</MeasureGroupID>
      <PartitionID>Internet_Sales</PartitionID>
      <Source xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ddl2="http://schemas.microsoft.com/analysisservices/2003/engine/2" xmlns:ddl2_2="http://schemas.microsoft.com/analysisservices/2003/engine/2/2" xmlns:ddl100_100="http://schemas.microsoft.com/analysisservices/2008/engine/100/100" xmlns:ddl200="http://schemas.microsoft.com/analysisservices/2010/engine/200" xmlns:ddl200_200="http://schemas.microsoft.com/analysisservices/2010/engine/200/200" xmlns:ddl300="http://schemas.microsoft.com/analysisservices/2011/engine/300" xmlns:ddl300_300="http://schemas.microsoft.com/analysisservices/2011/engine/300/300" xsi:type="TableBinding">
        <DataSourceID>Adventure Works DW</DataSourceID>
        <DbTableName>Fact_InternetSales</DbTableName>
      </Source>
    </Binding>
  </Bindings>
</Batch>

In the end, by using the Bindings definitions you can choose what data is getting processed into the SSAS object.  Here are some ways that I have used this technique to support my projects.

  • Changed the data source used with the object.  I have used this to point to a different data source to test whether it works with the cube without overwriting the existing data source in the cube definition.
  • Changed the table to a view for processing.  Because I use XMLA when scheduling most of my SSAS processing jobs, I can dynamically set the table or view as needed.
  • Changed the query definition to limit the data to new rows only (see the sketch below).  This is the one I have used the most.  I have changed the query definition to reflect the latest data for both dimensions and partitions.  More importantly, I was able to leverage this in environments where I did not have control over the data source itself, specifically Oracle data warehouses.
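To illustrate that last point, the only piece that changes from the first out-of-line example above is the QueryDefinition. The table and column below are placeholders for whatever drives your incremental logic; note that reserved XML characters such as > must be escaped.

<QueryDefinition>SELECT * FROM dbo.FactInternetSales WHERE OrderDateKey &gt;= 20120101</QueryDefinition>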

Out-of-line bindings are very powerful when controlling your SSAS process options and allow you to maximize the efficiency of your jobs.

SQL Saturday #149 and CodeMastery–Minnesota Events

We are less than two weeks away from SQL Saturday #149 in Minneapolis on September 29, 2012, with two preconference sessions on September 28.  In case you haven’t heard, we are having the main event on a Saturday.  Yes, the precons are on Friday this year.  Check out the details here.  I am really excited about this event as we have a great group of local, regional, and national speakers.  There are nine rooms being used for this event, so go out to the site and build your schedule.

The following Tuesday, Magenic is hosting CodeMastery with a BI track at the Microsoft Technology Center in Edina, MN.  This event includes sessions on managing the BI stack in SharePoint and on xVelocity.  The other track is Windows 8 development, with sessions on WinRT and game development.

I’m a Speaker at Both Events

Besides plugging these two awesome events on their own, I am also a speaker for both events.  Here is what I will be speaking on at each event:

SQL Saturday #149: A Window into Your Data: Using SQL Window Functions

In this session, I will walk through the window functions enabled by the OVER clause in SQL Server.  Come join me as we celebrate SQL Server 2012’s release of analytic functions and expansion of aggregate functionality to support tasks such as running totals and previous-row values.  Thankfully, this is a demo-heavy session, as it is one of the last sessions of the day.

CodeMastery: Data Mining with the Tools You Already Have

The next week, I will be presenting on data mining tools which Microsoft has made available to us in SSAS and Excel.  The goal of this session is to help developers understand how to implement data mining algorithms into their business intelligence solutions.

I look forward to seeing you at both events.  They are priced right, FREE!

O, There’s the Data: Using OData in SSIS


The Open Data Protocol (OData) is an open specification created by Microsoft to enable exposing data in a standard way from a variety of sources.  OData is natively supported in many of Microsoft’s products, including PowerPivot, Excel 2013, the SQL Server 2012 Analysis Services Tabular Model, Windows Communication Foundation (WCF), and Entity Framework, to name a few.  Furthermore, Microsoft uses OData to expose data feeds from the Windows Azure Data Marketplace as well.

I pursued adding an OData source to SSIS as a result of Mark Souza’s presentation at the Minnesota SQL Server User Group in April 2012.  I posed a question about easier interaction with Oracle, and he mentioned that OData would be a good way to solve that issue.  This led me to put together a presentation, which I delivered for PASSMN in July 2012, entitled O, There’s My Data: The Open Data Protocol.  At that presentation, I reviewed the “pain and agony” of a data pro putting together a data feed using Entity Framework in C# and WCF to expose it.  For the most part, with the help of .NET pros at Magenic, including Dave Stienessen and Sergey Barskiy, I was able to create my first entity model and expose it using WCF.  After that, I worked on how to consume the feed without purchasing a 3rd party tool.  Here is the rest of the story.

Using ATOM as Shown in a Channel 9 Exercise

While looking for solutions that would let me bring an OData feed into an SSIS package, I came across a hands-on lab on Channel 9.  While its focus was on Reporting Services, I was able to use the steps to create a package that reads a feed and makes the data available to the ETL process.  In a nutshell, the exercise involves three tasks: creating an ATOM file using an HTTP connection manager pointed at the OData feed, processing the ATOM file, and loading the data.  While you are creating this package, run each task after you have created it so that the files it produces are available to the following steps.


Task 1 – Create ATOM File (Script Task)

In the Main method,  I used the following code which was copied and adapted from the Channel 9 exercise. (NOTE: The code for this script has an error.  The object declaration should be condensed to one line to work properly.)

public void Main()
{
    // Get the unmanaged connection
    object nativeObject = Dts.Connections["TestSvc"].AcquireConnection(null);

    // Create a new HTTP client connection
    HttpClientConnection connection = new HttpClientConnection(nativeObject);

    // Save the file from the connection manager to the local path specified
    string filename = "C:\\Source\\SSIS 2012 Projects\\ODataIntegration\\Departments.atom";
    connection.DownloadFile(filename, true);

    Dts.TaskResult = (int)ScriptResults.Success;
}

This task will create an ATOM file that will be used in the next step to retrieve the data.

Task 2 – Process ATOM File (XML Task)

This task uses the new ATOM file to create an XML file with the data.  It uses the XSLT operation type, pointing to the File Connection Manager created in the previous step as the source.  This results in another File Connection Manager to support the destination XML file with the data.  Finally, in the exercise, a second operand set of XML is used to clear unsupported headers.  Admittedly, I just copied this straight from the example and still am not sure of the details of what it does.


Here is the code from the Channel 9 exercise used in the SecondOperand property:

<?xml version="1.0" encoding="utf-8" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="no" />
  <xsl:template match="/|comment()|processing-instruction()">
    <xsl:copy>
      <xsl:apply-templates />
    </xsl:copy>
  </xsl:template>
  <xsl:template match="*">
    <xsl:element name="{local-name()}">
      <xsl:apply-templates select="@*|node()" />
    </xsl:element>
  </xsl:template>
  <xsl:template match="@*">
    <xsl:attribute name="{local-name()}">
      <xsl:value-of select="." />
    </xsl:attribute>
  </xsl:template>
</xsl:stylesheet>

Task 3 – Load Data (Data Flow Task)

The final task is a straightforward data load using the XML Source component pointed at the XML file I created.  I then created a matching table in a database, which I used as the destination.

Wrap Up on the ATOM Feed Option

This approach works with both SSIS 2008 and SSIS 2012.  I tested most of the work in 2012, but the code in the illustration supports 2008.  This option does require that the package write at least two files to the server to work correctly.  In some enterprise environments this will not be acceptable, because the data now rests on the server for a period of time or because the admins do not want files created on the server.

Using a Custom SSIS Source to Get the Data

NOTE: This is the preferred solution, but is not available in SSIS 2008 which uses the .NET 2.0 Framework.  This solution requires the .NET 3.5 Framework.

This version uses a custom SSIS source to connect to the OData feed and populate the data flow pipeline.  I did not find this option illustrated anywhere and used help from Dave and Sergey to put this together.  I spent many hours trying to solve this issue, and at the end of the day it is fairly simple.  So, hopefully, this will save you some time as well.

This package only has one control flow task, a Data Flow Task, which contains the rest of the code.  In the data flow task, I have a Script Component implemented as a source and a Row Count, with a data viewer on the pipeline to check results.


This was my first experience creating a custom source.  I used a post from SSIS Talk – SSIS Using a Script Component as a Source as a reference.  If you need help creating your first script source check it out.

Be sure to set your outputs prior to creating the script or you will not have them available to map to in the code.  You also need to add the HTTP Connection Manager you are using to point to your OData feed.

Add References, Using Statements, and Declarations

Once you have the basics set up, you need to add some references including the targeted data service and System.Data.Services.Client.  These are the key references for the code we are implementing.


Once you have these references, you will need to add the following using statements to the Namespaces region.

using System.Data.Services.Client;
using SC_68e99fec2dce4cd794450383662f6ac7.TestSvc;

The SC_ reference is the internal name for your script component and will be different from mine, although it will likely be in the same format.

Next, you need to add the following declarations in the ScriptMain class as shown here.

public class ScriptMain : UserComponent
{
    private Uri svcUri = new Uri("http://localhost/ODataSQL/TestSvc1.svc");
    private AdventureWorksEntities context;

The AdventureWorksEntities is from the service reference I created. You will need to know the context name for the service reference you are using.

The Working Code: Using DataServiceQuery

In the CreateNewOutputRows method of the SSIS script, you add code that runs a DataServiceQuery and pushes the results into the data flow pipeline.  In my case, my output was called Departments, which created the buffer reference you see in the code.  It has the output fields I defined for my source.  Here is the code I used to implement the solution.

public override void CreateNewOutputRows()
{
    // Connect to the OData service and query the Departments feed
    context = new AdventureWorksEntities(svcUri);
    DataServiceQuery<Department> dept = context.Departments;

    // Add a row to the output buffer for each department returned by the feed
    foreach (Department d in dept)
    {
        DepartmentsBuffer.AddRow();
        DepartmentsBuffer.DeptID = d.DepartmentID;
        DepartmentsBuffer.DeptName = d.Name;
        DepartmentsBuffer.GroupName = d.GroupName;
    }
}

This will query the service and return the rows.  And that is really all it took to solve the problem.  While this solution does not work in SSIS 2008, if you are planning to use a lot of OData, I would recommend using this as another reason to upgrade to SQL Server 2012.

SSIS Needs an OData Source Component

What I found interesting is that Microsoft does not have a native method for loading OData feeds into the Data Flow Task in SSIS.  I have since created a Connect item to see if we can get this added.  Vote here if you agree.

Resources Used throughout the Process

Connecting to Windows Azure SQL Database Through WCF

Loading Data from an ATOM Data Feed into SQL Server

SSIS – Using a Script Component as a Source

DataServiceContext Class

Chris Woodruff – 31 Days of OData Blog Series

PASSMN Presentation – July 17, 2012

Consuming SharePoint Lists via OData and SSIS – Uses Linq