Steps to Preload Data into Tables with SSDT

6 03 2013

I am working as the data architect and developer on a modern app build with the team from Modern Apps Live! in Vegas.  The goal of the project is to provide guidance on building modern applications and to use this application as a reference.  While the conference is focused on the why of the build, we have learned some interesting things about the how as well.  This is one of those how items.

For this project, I needed to preload some data into the database.  I wanted to include this process in the database project I had created.  However, I quickly found out that this was not as straightforward as I thought it would be.  Here are the steps I followed and the gotchas I hit along the way.

1. Create Scripts for the Load Queries.

I started out with scripts that included a DELETE statement followed by an INSERT statement.  However, this created problems when data already existed, particularly when the table is a lookup table referenced by a foreign key.

Next, I tried MERGE.  This worked great.  This gives me a way to handle new records that are required for the lookup or any changes made to existing data.  Here is the script I used:

merge dbo.MVCategory as target 
using ( 
        select 1, 'Fun' 
        union 
        select 2, 'Technology' 
        union 
        select 3, 'Entertainment' 
        union 
        select 4, 'News' 
        union 
        select 5, 'Sports' 
        union 
        select 6, 'Off-Topic' 
    ) as source (CategoryID, CategoryName) 
    on target.CategoryID = source.CategoryID 
when matched then 
    update set target.CategoryName = source.CategoryName 
when not matched then 
    insert (CategoryID, CategoryName) values (source.CategoryID, source.CategoryName) 
;

After going through this process on my own, I also found the same recommendation from the SSDT team at Microsoft as noted here: http://blogs.msdn.com/b/ssdt/archive/2012/02/02/including-data-in-an-sql-server-database-project.aspx
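To see why the DELETE-then-INSERT script falls over while an update-or-insert pass can be rerun safely, here is a small sketch in Python.  It is an illustration only and rests on a few assumptions: it uses an in-memory SQLite database rather than SQL Server, SQLite has no MERGE statement so an UPDATE followed by a conditional INSERT stands in for the matched / not matched branches, and the Post table is a hypothetical child table that is not part of the project.

```python
import sqlite3

# Seed rows taken from the first three categories in the MERGE script above.
SEED_ROWS = [(1, "Fun"), (2, "Technology"), (3, "Entertainment")]


def make_database():
    """Build an in-memory database with a lookup table and a referencing table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs on request
    conn.executescript(
        """
        CREATE TABLE MVCategory (
            CategoryID   INTEGER PRIMARY KEY,
            CategoryName TEXT NOT NULL
        );
        -- Hypothetical child table that references the lookup table.
        CREATE TABLE Post (
            PostID     INTEGER PRIMARY KEY,
            CategoryID INTEGER NOT NULL REFERENCES MVCategory (CategoryID)
        );
        INSERT INTO MVCategory (CategoryID, CategoryName) VALUES (2, 'Tech');
        INSERT INTO Post (PostID, CategoryID) VALUES (100, 2);
        """
    )
    return conn


def delete_then_insert(conn, rows):
    """First attempt: wipe the lookup table and reload it."""
    conn.execute("DELETE FROM MVCategory")
    conn.executemany(
        "INSERT INTO MVCategory (CategoryID, CategoryName) VALUES (?, ?)", rows
    )


def merge_like(conn, rows):
    """MERGE stand-in: update matched rows, insert the ones not matched."""
    for category_id, name in rows:
        cursor = conn.execute(
            "UPDATE MVCategory SET CategoryName = ? WHERE CategoryID = ?",
            (name, category_id),
        )
        if cursor.rowcount == 0:  # no existing row matched this key
            conn.execute(
                "INSERT INTO MVCategory (CategoryID, CategoryName) VALUES (?, ?)",
                (category_id, name),
            )


def category_names(conn):
    """Return the category names ordered by key."""
    query = "SELECT CategoryName FROM MVCategory ORDER BY CategoryID"
    return [row[0] for row in conn.execute(query)]


if __name__ == "__main__":
    conn = make_database()
    try:
        delete_then_insert(conn, SEED_ROWS)
    except sqlite3.IntegrityError as error:
        print("DELETE + INSERT failed:", error)
    merge_like(conn, SEED_ROWS)
    merge_like(conn, SEED_ROWS)  # a second run leaves the table unchanged
    print(category_names(conn))
```

The delete is rejected because a Post row still points at category 2, while the update-or-insert pass corrects the existing name and adds the missing rows without touching the referenced key.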

2. Add the Scripts to Your Project

This step is pretty straightforward.  You can either create the script files and add them to your project, or you can create them within your project as script files.

3. Change the Build Action to None

This was one of the key pieces I missed.  After I added the scripts to the project and ran a build, the build broke.  Each of the merge script files reported an error during the build.  It turns out this is called out in the article I referenced above as well.  SSDT (SQL Server Data Tools) is designed to build database objects, not to manipulate data.  One other area of grief caused by this is that you can break the build for the whole solution if your project is part of a bigger solution, as mine is.  As a result, you will get grief from the other developers.  You can trust me on this one.

The image below shows where to set the Build Action property to NONE.  This will exclude these files from the build in this format.

[Image: the Build Action property set to None]

4. Add a PostDeployment Script to Your Project

If you do not already have a PostDeployment script, you need to add one at this point.  This is a specific type of script item that can be found in the Add menu.

[Image: the Add menu with the post-deployment script item]

5. Add SQLCMD Statements to the PostDeployment Script

The final part of the process is to add SQLCMD statements to the PostDeployment script to execute the files you have created.  As noted in the help text in the template, you can execute the scripts by calling a single SQLCMD statement for each script.

:r .\PreLoadMVCategory.sql

The :r {filename} syntax will expand the script for execution during a publish call or DACPAC creation.
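To make the expansion concrete, here is a toy model in Python of what an include line does: each :r line is replaced by the text of the file it names.  This is only a sketch of the idea, not how SSDT or SQLCMD implement it.  The function names, the file name Script.PostDeployment.sql, and the rule that paths resolve relative to the including script are assumptions made for the illustration.

```python
import os
import tempfile


def expand_includes(script_path):
    """Return the script text with every ':r <file>' line replaced by that file's text."""
    base_dir = os.path.dirname(os.path.abspath(script_path))
    expanded = []
    with open(script_path, encoding="utf-8") as script:
        for line in script:
            stripped = line.strip()
            if stripped.lower().startswith(":r "):
                # Assumption: the path is relative to the including script.
                relative = stripped[3:].strip().replace("\\", "/")
                include_path = os.path.normpath(os.path.join(base_dir, relative))
                expanded.append(expand_includes(include_path))
            else:
                expanded.append(line.rstrip("\n"))
    return "\n".join(expanded)


def write_demo_project(folder):
    """Create a two-file demo: one load script and a post-deployment script."""
    load_path = os.path.join(folder, "PreLoadMVCategory.sql")
    with open(load_path, "w", encoding="utf-8") as load:
        load.write("-- the MERGE statement for dbo.MVCategory goes here\n")
    post_path = os.path.join(folder, "Script.PostDeployment.sql")
    with open(post_path, "w", encoding="utf-8") as post:
        post.write("PRINT 'start';\n:r .\\PreLoadMVCategory.sql\nPRINT 'done';\n")
    return post_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        print(expand_includes(write_demo_project(folder)))
```

The point to take away is ordering: the included scripts run in the order their :r lines appear, so lookup tables should be listed before the tables that reference them.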

I hope you find this useful as well.  This is a common task required in creating solutions.





Join Me at Modern Apps Live! Las Vegas

5 02 2013


If you are familiar at all with Visual Studio Live!, then you should check out this new conference.  At the Vegas conference, I am participating in Modern Apps Live!, a unique three-day event made up of a series of sessions that build on each other as the conference progresses.  The goal is to do an end-to-end application build during which the presenters build the application while focusing on Application Lifecycle Management, best practices, tools, and a variety of technologies.

I will be presenting on the database design and business intelligence components of the solution and would love to see you there.  To make this even sweeter, if you register with the following code, MVSPK3, you will qualify for one of the following offers:

  • The 5 day all-access Best Value Conference Package for just $1,595 – a savings of $500.00 off the standard price of $2,095! (*Includes pre and post conference events.)

OR:

  • The 3 day Modern Apps Live! Conference for just $1,295 – a savings of $400.00 off the standard price of $1,695

This is for new registrants only.  Make sure to use the code by February 27, 2013, to take advantage of the full discount.

Also keep in mind that you will be able to take advantage of the Visual Studio Live! sessions as well, as this is a co-hosted conference.

I hope to see many of you there!





PASS Summit 2012 Wrap Up

13 11 2012

Wow, what a week. Once again, PASS put on a great event that provided much in the way of events and training for the SQL Server community. If you followed my countdown you know some of what I love about PASS. Last year I blogged every day, but I did not do that this week. So, what was different for me? Well, for one, I volunteered much more this year than last, and I was privileged to speak twice. I spent more time meeting new people and catching up with friends, and that was great as well. Enjoy my wrap up from my week.

Tuesday – Leadership Meetings, Welcome Reception, and some Karaoke

Before the event officially kicked off, I joined community leaders from around the world for a series of leadership meetings. First we had a meeting on SQL Saturdays which was an opportunity to see the immense growth of these free training events around the U.S. and throughout the world. What a great opportunity for SQL Server professionals to improve their skills and for those passionate about the community to improve their abilities by leading these events. Many ideas were shared among the team including a panel on how to effectively run a SQL Saturday on a tight budget.

Once that was completed, the Regional Mentors enjoyed a lunch together and an opportunity to share what we do to support the user groups in our regions. I particularly enjoyed the fact that I was able to spend some time with Regional Mentors from Germany, Holland, and Portugal. This highlighted further the international scope and reach of PASS. This was followed by the Chapter Leaders meeting. That meeting was held as a series of round tables that the chapter leaders could move through. I was working at the table focused on leadership with Ryan Adams from the North Texas SQL Server User Group (NTSSUG). We had a number of good conversations around building leadership teams for user groups and what is needed to have an effectively led user group. Check out the NTSSUG site for the by-laws sample we discussed multiple times.

All of these meetings were followed by the Welcome Reception, of which I made only a small portion as I was trying to drop my backpack at my hotel and work my way back there. After the reception, I headed out to Bush Gardens with a number of others. During that time, Jes Borland managed to get a microphone in my hand and I had my first round of karaoke. Yes, I actually did sing and had fun doing it. All in all, a good time was had by all.

Wednesday – SQL Around the World, Microsoft Announcements, Tabular Models, and Magenic Team Dinner

This was the true kickoff to the event. Many looked at the keynote as the kickoff. Before that even began, I was working in the Community Zone encouraging people to participate in the SQL Around the World community activity. It was a great game. You needed to find 10 people from 10 different countries and find out something interesting about them or their country. I found a dancer and someone who had ridden a cheetah as a kid. I also surprised someone from the Czech Republic: when she mentioned her home town, I let her know I had been there many years ago. It was a fun conversation. If you did this and have other cool stories, let me know. It was amazing that well over 50 different countries were represented at PASS.

Next, Ted Kummert had the first keynote session of the day. His keynote was filled with announcements concerning SQL Server including the following:

  • Hekaton: the project code name for a new in-memory OLTP engine
  • The Columnstore Index will be updateable
  • Next version of PDW will be out in H1 2013
  • Polybase: allows you to query across multiple types of data sources such as SQL Server and Hadoop with T-SQL
  • DAX Queries will be able to query SSAS Cubes

He also highlighted some recent announcements related to the SQL Server stack:

  • Microsoft HDInsight Server CTP: Hadoop for Windows Server
  • Windows Azure HDInsight Service Preview: Hadoop for Azure
  • Power View and PowerPivot fully implemented in Excel 2013

After the keynote, I hit a session on BigData and Hive which was put on by SQL CAT and was very informative. My big takeaway was to use EXTERNAL tables, not INTERNAL tables, when working with Hive. I then went to do final prep for my Tabular Model session. In this session, “Building a Tabular Model Database”, I presented on what tabular and in-memory are, and then proceeded to open up a Visual Studio project and create a database. I think it went well and the attendees seemed to enjoy the upbeat nature for an end of day session. The night wrapped up with dinner with the Magenic team (7 of us). It was a good chance to grow relationships across offices from around the country.

Thursday – Community Zone and DAX as a Query Language

Thursday was a fairly low key day for me. Once again I spent time in the Community Zone. I had the opportunity to talk with a few people on creating a user group in their area. As always, I like to see people interested in growing their local community.

I also attended Alberto Ferrari’s session on DAX. I think the biggest surprise to me was that you can now query DAX directly from SSMS. I am not sure that I am convinced that it is a full query language yet, but it is definitely closer. The key to it all is the EVALUATE expression which allows you to create the DAX query ironically in the MDX window. Here is just a taste of DAX as a query:

EVALUATE
    'DimCurrency'
ORDER BY
    'DimCurrency'[CurrencyAlternateKey]

What I found interesting is that you can create columns, build measures, and perform many other operations against the tabular model using DAX. In the end, it will not increase the memory used as storage as it is all calculated. Look for some more on this in later blog posts as I delve more into the in-memory storage and usage when working with DAX.

Friday – More Community Zone, HDInsight, Paul White, and Window Functions

Last day. I spent more time in the Zone. I really did enjoy my time there as I continued to meet more people. I was even present when a contract was completed for the Shanghai user group. Very cool indeed. I then attended a session on HDInsight by Mike Flasco from Microsoft. This is very cool stuff as you can create a simple Hadoop cluster on your desktop to test the technology. Microsoft and Hortonworks have done a great job of bringing Hadoop data into the Microsoft stack.

On my way to present my final session of the day and the conference, I stopped in for the second half of Paul White’s optimization presentation. In a word (or two), mind-blowing! Wow, who knew that the optimizer did all those things? I was highly impressed and think he should look at a precon on the subject next year. Unlike some three-hour presentations, he could have gone longer as he was not stretching his content out. Nice work, Paul. So, I got to follow that with a presentation on Window Functions in T-SQL. For the second time, I had the last slot of the last day. I think this presentation went well even though we were all worn out from a content-filled week. It was fun to try some ideas from the audience in the demos. That always makes for a more interesting demo. I will be doing a follow up post on what I learned from some of the attendees on the subject as well, proving once again this is a user community event. We all have something to contribute! (If you attended this session, you will find links to the blogs on the subject here.)

What’s Next?

Coming in April is the new Business Analytics conference in Chicago followed by the PASS Summit in Charlotte, North Carolina. Of course, your local user groups will continue to meet with regional SQL Saturdays sprinkled throughout the year as well. How will you participate and contribute in 2013? We look forward to seeing you all again, soon.





X-XMLA: vii. Partition Management with XMLA

10 10 2012

In the 7th article of this series, I will discuss how to manage your partitions with XMLA.   We will be focused on the partition structures as opposed to partition processing, which was covered in the previous post.

Creating or altering a partition

I have already discussed object creation with XMLA in part 3 of this series.  Remember that when altering the partition you need to have all of the partition definition you wish to keep as well as what you are changing in the XMLA.  If you have a subset of the definition, SSAS will assume you want to remove the missing parts.

Adding a partition to an existing measure group

One area where a partition is different from other objects is that you can add a partition to an existing measure group.  This is often done to improve processing performance or to match the underlying data architecture (especially if table partitioning is already in place for the source tables).  Often the partitions are based on date and correspond to the date the data is loaded.  For instance, you may have separate partitions for each year.  So at the beginning of the next year you need to add a partition.  In this case, you create a new partition with a specific data source definition for that partition.

If the measure group only has one partition that is based on a single table, you may need to change the source definition so you can properly add a new partition.  There are two common ways to manage the data sources to support partitions.  The first is to create views that map to the partition structure in the cube and then the data source for each partition can point directly to a particular table or view.  The second way is to specify a query definition that usually points at one table and filters the data based on one or more keys.

In either case, the XMLA is constructed as a Create command with the ParentObject specifying the measure group that the partition will be added to.  The Object definition contains the ID, Name, Source and other properties that make up the partition.  The example below shows the definition for adding a partition that covers the month of January using a filter in the QueryDefinition for the Source.

<Create xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
    <ParentObject>
        <DatabaseID>Adventure Works DW 2008R2 SE</DatabaseID>
        <CubeID>Adventure Works DW</CubeID>
        <MeasureGroupID>Fact Internet Sales 1</MeasureGroupID>
    </ParentObject>
    <ObjectDefinition>
        <Partition xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ddl2="http://schemas.microsoft.com/analysisservices/2003/engine/2" xmlns:ddl2_2="http://schemas.microsoft.com/analysisservices/2003/engine/2/2" xmlns:ddl100_100="http://schemas.microsoft.com/analysisservices/2008/engine/100/100" xmlns:ddl200="http://schemas.microsoft.com/analysisservices/2010/engine/200" xmlns:ddl200_200="http://schemas.microsoft.com/analysisservices/2010/engine/200/200" xmlns:ddl300="http://schemas.microsoft.com/analysisservices/2011/engine/300" xmlns:ddl300_300="http://schemas.microsoft.com/analysisservices/2011/engine/300/300">
            <ID>Internet_Sales_Jan2011</ID>
            <Name>Internet_Sales_Jan2011</Name>
            <Source xsi:type="QueryBinding">
                <DataSourceID>Adventure Works DW</DataSourceID>
                <QueryDefinition>SELECT * 
                        FROM dbo.FactInternetSales 
                        WHERE OrderDateKey BETWEEN 20110101 AND 20110131</QueryDefinition>
            </Source>
            <StorageMode>Molap</StorageMode>
            <ProcessingMode>Regular</ProcessingMode>
            <ProactiveCaching>
                <SilenceInterval>-PT1S</SilenceInterval>
                <Latency>-PT1S</Latency>
                <SilenceOverrideInterval>-PT1S</SilenceOverrideInterval>
                <ForceRebuildInterval>-PT1S</ForceRebuildInterval>
                <AggregationStorage>MolapOnly</AggregationStorage>
                <Source xsi:type="ProactiveCachingInheritedBinding">
                    <NotificationTechnique>Server</NotificationTechnique>
                </Source>
            </ProactiveCaching>
            <EstimatedRows>60398</EstimatedRows>
            <AggregationDesignID>Internet Sales</AggregationDesignID>
        </Partition>
    </ObjectDefinition>
</Create>
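Since a new partition is needed every period, the Create command is a natural candidate for scripting.  The sketch below, written in Python purely for illustration, renders the command for a given year and month.  The function name monthly_partition_create is hypothetical, the IDs are the ones from the example above, and only the properties that vary per month plus the storage settings are emitted; the remaining properties (ProactiveCaching, EstimatedRows, AggregationDesignID) are left out and would need to be added to match the full definition.

```python
import calendar
import xml.etree.ElementTree as ET

ENGINE_NS = "http://schemas.microsoft.com/analysisservices/2003/engine"
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"
ET.register_namespace("", ENGINE_NS)
ET.register_namespace("xsi", XSI_NS)

MONTH_ABBR = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
              "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def add_child(parent, name, text=None):
    """Append an element in the Analysis Services engine namespace."""
    child = ET.SubElement(parent, f"{{{ENGINE_NS}}}{name}")
    child.text = text
    return child


def monthly_partition_create(year, month):
    """Render a Create command for one month's partition as an XMLA string."""
    last_day = calendar.monthrange(year, month)[1]
    first_key = f"{year}{month:02d}01"
    last_key = f"{year}{month:02d}{last_day:02d}"
    partition_id = f"Internet_Sales_{MONTH_ABBR[month - 1]}{year}"

    create = ET.Element(f"{{{ENGINE_NS}}}Create")
    parent = add_child(create, "ParentObject")
    add_child(parent, "DatabaseID", "Adventure Works DW 2008R2 SE")
    add_child(parent, "CubeID", "Adventure Works DW")
    add_child(parent, "MeasureGroupID", "Fact Internet Sales 1")

    definition = add_child(create, "ObjectDefinition")
    partition = add_child(definition, "Partition")
    add_child(partition, "ID", partition_id)
    add_child(partition, "Name", partition_id)
    source = add_child(partition, "Source")
    source.set(f"{{{XSI_NS}}}type", "QueryBinding")
    add_child(source, "DataSourceID", "Adventure Works DW")
    add_child(
        source,
        "QueryDefinition",
        "SELECT * FROM dbo.FactInternetSales "
        f"WHERE OrderDateKey BETWEEN {first_key} AND {last_key}",
    )
    add_child(partition, "StorageMode", "Molap")
    add_child(partition, "ProcessingMode", "Regular")
    return ET.tostring(create, encoding="unicode")


if __name__ == "__main__":
    print(monthly_partition_create(2011, 2))
```

Deriving the date keys from the calendar keeps the filter boundaries and the partition name in step, which is where hand-edited copies tend to drift.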

Merging partitions

In some cases, you will find the need to merge partitions.  In our example above we added a partition for the month of January.  However, as your partition count becomes larger, there is benefit to merging partitions into historical or archive partitions.  Or you may even decide that merging will help with maintenance or processing.  In either case, the logic is fairly simple.  You have one or more source partitions being merged into a target partition.  The XMLA for this is pretty straightforward.

There are a couple of nuances to be aware of, including the fact that the structure and the aggregation designs need to be the same to support merging.  If you plan to merge, you should take this into account when building partitions and applying aggregation designs.  Furthermore, if you use partition slices you will likely need to modify the target’s slice after the merge to account for the new data.

Here is the code required to merge two source partitions into a target partition.  The result will be one partition.

<MergePartitions xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Sources>
    <Source>
      <DatabaseID>Adventure Works DW 2008R2 SE2</DatabaseID>
      <CubeID>Adventure Works DW</CubeID>
      <MeasureGroupID>Fact Internet Sales 1</MeasureGroupID>
      <PartitionID>Internet_Sales_Jan2011</PartitionID>
    </Source>
    <Source>
        <DatabaseID>Adventure Works DW 2008R2 SE2</DatabaseID>
        <CubeID>Adventure Works DW</CubeID>
        <MeasureGroupID>Fact Internet Sales 1</MeasureGroupID>
        <PartitionID>Internet_Sales_Feb2011</PartitionID>
    </Source>
  </Sources>
  <Target>
    <DatabaseID>Adventure Works DW 2008R2 SE2</DatabaseID>
    <CubeID>Adventure Works DW</CubeID>
    <MeasureGroupID>Fact Internet Sales 1</MeasureGroupID>
    <PartitionID>Internet_Sales</PartitionID>
  </Target>
</MergePartitions>

As you can see in the example above, all that you need to specify is the full address for each partition and identify which are the sources and which one is the target.

Setting the Slice Property

Before digging into the XMLA, a couple of notes on the data slice in cubes.  First, slices are not required for MOLAP partitions; however, they can help with query performance.  While current versions of SSAS (2005 and later) will determine the slice based on the data, the result is not always accurate.  Keep in mind that a slice does not affect processing or the source data for a partition.  That is determined by the Source element and related definition.

Refer to the following blog posts by Jesse Orosz and Mosha Pasumansky for more information, as the details around the value of assigning slices are beyond the scope of this post:

http://jesseorosz.wordpress.com/?s=slices

http://sqlblog.com/blogs/mosha/archive/2008/10/14/get-most-out-of-partition-slices.aspx

Assuming you need to set the slice, or need to change it due to a merge, here is the XMLA to alter a partition’s slice property.

<Alter AllowCreate="true" ObjectExpansion="ObjectProperties" xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Object>
    <DatabaseID>Adventure Works DW 2008R2 SE2</DatabaseID>
    <CubeID>Adventure Works DW</CubeID>
    <MeasureGroupID>Fact Internet Sales 1</MeasureGroupID>
    <PartitionID>Internet_Sales_Feb2011</PartitionID>
  </Object>
  <ObjectDefinition>
    <Partition xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ddl2="http://schemas.microsoft.com/analysisservices/2003/engine/2" xmlns:ddl2_2="http://schemas.microsoft.com/analysisservices/2003/engine/2/2" xmlns:ddl100_100="http://schemas.microsoft.com/analysisservices/2008/engine/100/100" xmlns:ddl200="http://schemas.microsoft.com/analysisservices/2010/engine/200" xmlns:ddl200_200="http://schemas.microsoft.com/analysisservices/2010/engine/200/200" xmlns:ddl300="http://schemas.microsoft.com/analysisservices/2011/engine/300" xmlns:ddl300_300="http://schemas.microsoft.com/analysisservices/2011/engine/300/300">
      <ID>Internet_Sales_Feb2011</ID>
      <Name>Internet_Sales_Feb2011</Name>
      <Source xsi:type="QueryBinding">
        <DataSourceID>Adventure Works DW</DataSourceID>
        <QueryDefinition>SELECT * 
                        FROM dbo.FactInternetSales 
                        WHERE OrderDateKey BETWEEN 20110201 AND 20110228</QueryDefinition>
      </Source>
      <StorageMode>Molap</StorageMode>
      <ProcessingMode>Regular</ProcessingMode>
      <Slice>{[Date].[Calendar].[Month].&amp;[2011]&amp;[2]}</Slice>
      <ProactiveCaching>
        <SilenceInterval>-PT1S</SilenceInterval>
        <Latency>-PT1S</Latency>
        <SilenceOverrideInterval>-PT1S</SilenceOverrideInterval>
        <ForceRebuildInterval>-PT1S</ForceRebuildInterval>
        <AggregationStorage>MolapOnly</AggregationStorage>
        <Source xsi:type="ProactiveCachingInheritedBinding">
          <NotificationTechnique>Server</NotificationTechnique>
        </Source>
      </ProactiveCaching>
      <EstimatedRows>60398</EstimatedRows>
      <AggregationDesignID>Internet Sales</AggregationDesignID>
    </Partition>
  </ObjectDefinition>
</Alter>

If we were to merge the January and February partitions we created previously, we could change the slice definition to include both months.

<Slice>{[Date].[Calendar].[Month].&amp;[2011]&amp;[1],[Date].[Calendar].[Month].&amp;[2011]&amp;[2]}</Slice>
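If the slice has to be rebuilt after each merge, the member list can be generated instead of typed.  Here is a minimal sketch in Python, assuming the same [Date].[Calendar].[Month] key format used above; the function names are hypothetical.  The one detail worth noticing is that the MDX member keys use a bare ampersand, which has to be escaped when the set is placed inside the XMLA.

```python
from xml.sax.saxutils import escape


def month_slice(months):
    """Build an MDX set of month members from (year, month) pairs."""
    members = [
        f"[Date].[Calendar].[Month].&[{year}]&[{month}]"
        for year, month in months
    ]
    return "{" + ",".join(members) + "}"


def slice_element(months):
    """Wrap the set in a Slice element, escaping '&' for XML."""
    return f"<Slice>{escape(month_slice(months))}</Slice>"


if __name__ == "__main__":
    print(slice_element([(2011, 1), (2011, 2)]))
```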

Setting the Storage Location

Partitions also support the ability to change the storage location for each partition further capitalizing on hardware investments such as SSDs and low cost disks.  This is set in the XMLA using the StorageLocation element with a valid path.

<Alter AllowCreate="true" ObjectExpansion="ObjectProperties" xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Object>
    <DatabaseID>Adventure Works DW 2008R2 SE2</DatabaseID>
    <CubeID>Adventure Works DW</CubeID>
    <MeasureGroupID>Fact Internet Sales 1</MeasureGroupID>
    <PartitionID>Internet_Sales_Jan2011</PartitionID>
  </Object>
  <ObjectDefinition>
    <Partition xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ddl2="http://schemas.microsoft.com/analysisservices/2003/engine/2" xmlns:ddl2_2="http://schemas.microsoft.com/analysisservices/2003/engine/2/2" xmlns:ddl100_100="http://schemas.microsoft.com/analysisservices/2008/engine/100/100" xmlns:ddl200="http://schemas.microsoft.com/analysisservices/2010/engine/200" xmlns:ddl200_200="http://schemas.microsoft.com/analysisservices/2010/engine/200/200" xmlns:ddl300="http://schemas.microsoft.com/analysisservices/2011/engine/300" xmlns:ddl300_300="http://schemas.microsoft.com/analysisservices/2011/engine/300/300">
      <ID>Internet_Sales_Jan2011</ID>
      <Name>Internet_Sales_Jan2011</Name>
      <Source xsi:type="QueryBinding">
        <DataSourceID>Adventure Works DW</DataSourceID>
        <QueryDefinition>SELECT * 
                        FROM dbo.FactInternetSales 
                        WHERE OrderDateKey BETWEEN 20110101 AND 20110131</QueryDefinition>
      </Source>
      <StorageMode>Molap</StorageMode>
      <ProcessingMode>Regular</ProcessingMode>
      <StorageLocation>C:\Data\SSASOptimized</StorageLocation>
      <ProactiveCaching>
        <SilenceInterval>-PT1S</SilenceInterval>
        <Latency>-PT1S</Latency>
        <SilenceOverrideInterval>-PT1S</SilenceOverrideInterval>
        <ForceRebuildInterval>-PT1S</ForceRebuildInterval>
        <AggregationStorage>MolapOnly</AggregationStorage>
        <Source xsi:type="ProactiveCachingInheritedBinding">
          <NotificationTechnique>Server</NotificationTechnique>
        </Source>
      </ProactiveCaching>
      <EstimatedRows>60398</EstimatedRows>
      <AggregationDesignID>Internet Sales</AggregationDesignID>
    </Partition>
  </ObjectDefinition>
</Alter>

As you can see, there are many partition management operations that can be done using XMLA.  As you work to develop a management strategy keep in mind that these operations can be scripted for automated execution when managing your Analysis Services solution.





T-SQL Window Functions on LessThanDot and at SQL Saturday 149

26 09 2012

I recently completed a series of blog posts on www.lessthandot.com on T-SQL window functions.  The enhancements to SQL Server 2012 in this area are phenomenal.  They solve a myriad of issues including calculating running totals with SQL.  Check it out if you want to learn more and get some simple examples of the functions and the structure of the window functions.  Here is the series outline and links to each section.

T-SQL Window Functions:

I do a presentation related to T-SQL functions for SQL Saturdays and am presenting it at the PASS Summit this year.  Maybe I will see you there.

I recently presented this at SQL Saturday #149 in Minnesota.  Here is the presentation and the demo code. Thanks for attending.

 

Finally, if you use Oracle, you will find this series helpful as well, since most of the syntax is also supported there.  Look for an Oracle tip with the Oracle samples soon.





X-XMLA: vi. Processing and Out-of-Line Bindings in XMLA

19 09 2012

In this post, I will progressively go through processing a partition from full process, to incremental process, and finally to incremental process with out-of-line bindings.  This will provide you with additional samples to examine the results of partition processing with XMLA.

Note: these examples use the Adventure Works DW 2008R2 SE sample database available on CodePlex (http://msftdbprodsamples.codeplex.com/).

Processing with XMLA

Processing is a command that can be executed in a batch and in parallel with similar tasks.  This is particularly useful when processing multiple objects that are not dependent on each other, such as dimensions.  The basic structure is as follows:

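<Batch>
  <Parallel>
    <Process>
      <Object>~object definition~</Object>
      <Type>~processing type~</Type>
      <ErrorConfiguration>~optional~</ErrorConfiguration>
      <WriteBackTableCreation>~writeback option~</WriteBackTableCreation>
    </Process>
  </Parallel>
</Batch>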

Processing Type Options

This is a list of the most commonly used types when processing SSAS objects.

  • ProcessDefault:  This will determine the best option based on the condition of the object.  Be aware that if SSAS determines that a full process is required, it will do that. I would not recommend using this process type in most production situations.
  • ProcessFull: This option completely rebuilds the dimension, partition, cube or even database by removing existing data and reloading the data with aggregations and indexes.  In the case of cubes, measure groups and partitions, this is the equivalent of ProcessData + ProcessIndex.
  • ProcessClear: This option drops the data and related structures such as indexes and aggregations from the object.  Once this is run, no data is available in the object.
  • ProcessData:  This option drops the data and reloads it, but does not rebuild the aggregations or indexes. It works with cubes, measure groups, partitions, and dimensions.
  • ProcessAdd: This option adds new data to a partition or dimension.  It is not available as an option when processing dimensions from SQL Server Management Studio, but is available in XMLA scripts.
  • ProcessUpdate:  This option rereads the data and updates the dimension attributes.  It is only available for dimensions and will drop flexible aggregations and indexes on related partitions.
  • ProcessIndex: This option rebuilds indexes and aggregations for partitions.  It requires that the partition has already been loaded with data (ProcessData).  In the XMLA <Type> element, the value is spelled ProcessIndexes.

Of these options, the ones I have used most when working with XMLA are ProcessFull, ProcessData, ProcessAdd and ProcessIndex.  I will use each throughout the examples below and describe the cases where they have applied in my projects over the years.

Processing with XMLA

The Process command in XMLA lets you define the object to be processed, the type of processing to implement, and how to handle any writeback partitions.  (NOTE: the ErrorConfiguration element is optional.)  Overall the structure is fairly simple.  Here is a simple example of XMLA that will fully process the Fact Internet Sales measure group.

<Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Parallel>
    <Process xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ddl2="http://schemas.microsoft.com/analysisservices/2003/engine/2" xmlns:ddl2_2="http://schemas.microsoft.com/analysisservices/2003/engine/2/2" xmlns:ddl100_100="http://schemas.microsoft.com/analysisservices/2008/engine/100/100" xmlns:ddl200="http://schemas.microsoft.com/analysisservices/2010/engine/200" xmlns:ddl200_200="http://schemas.microsoft.com/analysisservices/2010/engine/200/200" xmlns:ddl300="http://schemas.microsoft.com/analysisservices/2011/engine/300" xmlns:ddl300_300="http://schemas.microsoft.com/analysisservices/2011/engine/300/300">
      <Object>
        <DatabaseID>Adventure Works DW 2008R2 SE</DatabaseID>
        <CubeID>Adventure Works DW</CubeID>
        <MeasureGroupID>Fact Internet Sales 1</MeasureGroupID>
      </Object>
      <Type>ProcessFull</Type>
      <WriteBackTableCreation>UseExisting</WriteBackTableCreation>
    </Process>
  </Parallel>
</Batch>

The syntax is identical for processing dimensions.  The <Object> element identifies the object to be processed, starting with the database.  Next, the <Type> element specifies the processing type using the options from the list above.  Finally, the <WriteBackTableCreation> element instructs Analysis Services on how to handle a writeback partition that has not been fully processed.  The default is UseExisting, which will use an existing table or create a new table if needed.  This is only relevant if your cube supports writeback.
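Because independent objects such as dimensions can be processed side by side, a batch with one Process command per dimension is easy to generate.  The following is a rough sketch in Python, not a definitive implementation: the function name parallel_process_batch and the dimension IDs in the usage lines are hypothetical, and the processing type is passed straight through without validation.

```python
import xml.etree.ElementTree as ET

ENGINE_NS = "http://schemas.microsoft.com/analysisservices/2003/engine"
ET.register_namespace("", ENGINE_NS)


def add_child(parent, name, text=None):
    """Append an element in the Analysis Services engine namespace."""
    child = ET.SubElement(parent, f"{{{ENGINE_NS}}}{name}")
    child.text = text
    return child


def parallel_process_batch(database_id, dimension_ids, process_type="ProcessFull"):
    """Render a Batch that processes each dimension inside one Parallel element."""
    batch = ET.Element(f"{{{ENGINE_NS}}}Batch")
    parallel = add_child(batch, "Parallel")
    for dimension_id in dimension_ids:
        process = add_child(parallel, "Process")
        target = add_child(process, "Object")
        add_child(target, "DatabaseID", database_id)
        add_child(target, "DimensionID", dimension_id)
        add_child(process, "Type", process_type)
        add_child(process, "WriteBackTableCreation", "UseExisting")
    return ET.tostring(batch, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical dimension IDs; use the IDs from your own database.
    print(parallel_process_batch(
        "Adventure Works DW 2008R2 SE",
        ["Dim Customer", "Dim Product"],
    ))
```

Each Process element carries its own object address, so the same function works for a single dimension or a long list.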

Processing Affected Objects

This setting controls whether objects that depend on the one you process are processed too.  In XMLA it is the ProcessAffectedObjects attribute on the Batch element; the example above does not set it explicitly.  When it is true, any objects that are affected will be processed as well.  The best example is a dimension that ends up being fully processed: all related measure groups are then fully processed as well, keeping the data in a consistent, usable state.  However, if you are tightly managing your processing jobs, you may need to set this to false to reduce unexpected and blocking processing jobs.

Using Out-of-Line Binding with Incremental Processing

A great feature of XMLA is the ability to dynamically bind data and even connections when processing SSAS objects.  This is called “Out-of-Line” binding (http://msdn.microsoft.com/en-us/library/ms128523).   One of the core concepts behind out-of-line bindings is that you can specify the connection and the data source query or table at run time.  This is particularly useful when you want to do incremental processing.

This first example illustrates how to use a query to change the data being added to the partition.

<Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Parallel>
    <Process xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ddl2="http://schemas.microsoft.com/analysisservices/2003/engine/2" xmlns:ddl2_2="http://schemas.microsoft.com/analysisservices/2003/engine/2/2" xmlns:ddl100_100="http://schemas.microsoft.com/analysisservices/2008/engine/100/100" xmlns:ddl200="http://schemas.microsoft.com/analysisservices/2010/engine/200" xmlns:ddl200_200="http://schemas.microsoft.com/analysisservices/2010/engine/200/200" xmlns:ddl300="http://schemas.microsoft.com/analysisservices/2011/engine/300" xmlns:ddl300_300="http://schemas.microsoft.com/analysisservices/2011/engine/300/300">
      <Object>
        <DatabaseID>Adventure Works DW 2008R2 SE</DatabaseID>
        <CubeID>Adventure Works DW</CubeID>
        <MeasureGroupID>Fact Internet Sales 1</MeasureGroupID>
        <PartitionID>Internet_Sales</PartitionID>
      </Object>
      <Type>ProcessAdd</Type>
      <WriteBackTableCreation>UseExisting</WriteBackTableCreation>
    </Process>
  </Parallel>
  <Bindings>
    <Binding>
      <DatabaseID>Adventure Works DW 2008R2 SE</DatabaseID>
      <CubeID>Adventure Works DW</CubeID>
      <MeasureGroupID>Fact Internet Sales 1</MeasureGroupID>
      <PartitionID>Internet_Sales</PartitionID>
      <Source xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ddl2="http://schemas.microsoft.com/analysisservices/2003/engine/2" xmlns:ddl2_2="http://schemas.microsoft.com/analysisservices/2003/engine/2/2" xmlns:ddl100_100="http://schemas.microsoft.com/analysisservices/2008/engine/100/100" xmlns:ddl200="http://schemas.microsoft.com/analysisservices/2010/engine/200" xmlns:ddl200_200="http://schemas.microsoft.com/analysisservices/2010/engine/200/200" xmlns:ddl300="http://schemas.microsoft.com/analysisservices/2011/engine/300" xmlns:ddl300_300="http://schemas.microsoft.com/analysisservices/2011/engine/300/300" xsi:type="QueryBinding">
        <DataSourceID>Adventure Works DW</DataSourceID>
        <QueryDefinition>Select 1 from SomeTable</QueryDefinition>
      </Source>
    </Binding>
  </Bindings>
</Batch>

The second example shows the syntax for changing a table or view in the bindings.

<Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Parallel>
    <Process xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ddl2="http://schemas.microsoft.com/analysisservices/2003/engine/2" xmlns:ddl2_2="http://schemas.microsoft.com/analysisservices/2003/engine/2/2" xmlns:ddl100_100="http://schemas.microsoft.com/analysisservices/2008/engine/100/100" xmlns:ddl200="http://schemas.microsoft.com/analysisservices/2010/engine/200" xmlns:ddl200_200="http://schemas.microsoft.com/analysisservices/2010/engine/200/200" xmlns:ddl300="http://schemas.microsoft.com/analysisservices/2011/engine/300" xmlns:ddl300_300="http://schemas.microsoft.com/analysisservices/2011/engine/300/300">
      <Object>
        <DatabaseID>Adventure Works DW 2008R2 SE</DatabaseID>
        <CubeID>Adventure Works DW</CubeID>
        <MeasureGroupID>Fact Internet Sales 1</MeasureGroupID>
        <PartitionID>Internet_Sales</PartitionID>
      </Object>
      <Type>ProcessAdd</Type>
      <WriteBackTableCreation>UseExisting</WriteBackTableCreation>
    </Process>
  </Parallel>
  <Bindings>
    <Binding>
      <DatabaseID>Adventure Works DW 2008R2 SE</DatabaseID>
      <CubeID>Adventure Works DW</CubeID>
      <MeasureGroupID>Fact Internet Sales 1</MeasureGroupID>
      <PartitionID>Internet_Sales</PartitionID>
      <Source xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ddl2="http://schemas.microsoft.com/analysisservices/2003/engine/2" xmlns:ddl2_2="http://schemas.microsoft.com/analysisservices/2003/engine/2/2" xmlns:ddl100_100="http://schemas.microsoft.com/analysisservices/2008/engine/100/100" xmlns:ddl200="http://schemas.microsoft.com/analysisservices/2010/engine/200" xmlns:ddl200_200="http://schemas.microsoft.com/analysisservices/2010/engine/200/200" xmlns:ddl300="http://schemas.microsoft.com/analysisservices/2011/engine/300" xmlns:ddl300_300="http://schemas.microsoft.com/analysisservices/2011/engine/300/300" xsi:type="TableBinding">
        <DataSourceID>Adventure Works DW</DataSourceID>
        <DbTableName>Fact_InternetSales</DbTableName>
      </Source>
    </Binding>
  </Bindings>
</Batch>

In the end, by using the Bindings definitions you can choose what data is getting processed into the SSAS object.  Here are some ways that I have used this technique to support my projects.

  • Changed the data source used with the object.  I have used this to point to a different data source to test whether it works with the cube without overwriting the existing data source in the cube definition.
  • Changed the table to a view for processing.  Because I use XMLA when scheduling most of my SSAS processing jobs, I can dynamically set the table or view as needed.
  • Changed the query definition to limit the data to new rows only.  I have used this the most.  I have changed the query definition to reflect the latest data for both dimensions and partitions.  More importantly, I was able to leverage this in environments where I have not had control over the data source itself, specifically Oracle data warehouses.

Out-of-line bindings are very powerful when controlling your SSAS process options and allow you to maximize the efficiency of your jobs.
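To make the "new rows only" idea concrete, here is a minimal Python sketch that builds a ProcessAdd batch with a QueryBinding at run time.  It is an illustration under stated assumptions rather than the scheduling code behind these projects: the function name build_process_add_batch, the OrderDateKey column, and the cutoff value are all hypothetical, and the extra ddl namespace declarations that SSMS emits are omitted.

```python
import xml.etree.ElementTree as ET

ENGINE_NS = "http://schemas.microsoft.com/analysisservices/2003/engine"
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"


def _add(parent, name, text=None):
    """Append a child element in the engine namespace and return it."""
    child = ET.SubElement(parent, f"{{{ENGINE_NS}}}{name}")
    child.text = text
    return child


def build_process_add_batch(object_ids, data_source_id, query):
    """Return an XMLA Batch that runs ProcessAdd with an out-of-line query."""
    ET.register_namespace("", ENGINE_NS)
    ET.register_namespace("xsi", XSI_NS)
    batch = ET.Element(f"{{{ENGINE_NS}}}Batch")

    # The Process command names the partition and the processing type.
    process = _add(_add(batch, "Parallel"), "Process")
    obj = _add(process, "Object")
    for name, value in object_ids:
        _add(obj, name, value)
    _add(process, "Type", "ProcessAdd")

    # The Binding repeats the same IDs and swaps in the run-time query.
    binding = _add(_add(batch, "Bindings"), "Binding")
    for name, value in object_ids:
        _add(binding, name, value)
    source = _add(binding, "Source")
    source.set(f"{{{XSI_NS}}}type", "QueryBinding")
    _add(source, "DataSourceID", data_source_id)
    _add(source, "QueryDefinition", query)
    return ET.tostring(batch, encoding="unicode")


if __name__ == "__main__":
    partition = [
        ("DatabaseID", "Adventure Works DW 2008R2 SE"),
        ("CubeID", "Adventure Works DW"),
        ("MeasureGroupID", "Fact Internet Sales 1"),
        ("PartitionID", "Internet_Sales"),
    ]
    # Hypothetical filter: only rows loaded after the last processed date key.
    query = "SELECT * FROM Fact_InternetSales WHERE OrderDateKey > 20120630"
    print(build_process_add_batch(partition, "Adventure Works DW", query))
```

The serializer escapes the > in the query as &gt;, which is required for the script to remain valid XML.  A scheduler would compute the cutoff from its own load metadata before each run.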





SQL Saturday #149 and CodeMastery–Minnesota Events

18 09 2012

We are less than two weeks away from SQL Saturday #149 in Minneapolis on September 29, 2012, with two preconference sessions on September 28.  In case you haven’t heard, we are having the main event on a Saturday.  Yes, the precons are on Friday this year.  Check out the details here.  I am really excited about this event as we have a great group of local, regional, and national speakers.  There are nine rooms being used for this event, so go out to the site and build your schedule.

The following Tuesday, Magenic is hosting CodeMastery with a BI track at the Microsoft Technology Center in Edina, MN.  This event includes sessions on managing the BI stack in SharePoint and xVelocity.  The other track is Windows 8 development with sessions on WinRT and Game Development.

I’m a Speaker at Both Events

Besides plugging these two awesome events on their own, I am also a speaker for both events.  Here is what I will be speaking on at each event:

SQL Saturday #149: A Window into Your Data: Using SQL Window Functions

In this session, I will walk through the window functions enabled by the OVER clause in SQL Server.  Come join me as we celebrate the SQL Server 2012 release of analytic functions and expansion of aggregate functionality to support tasks such as running totals and previous row values.  Thankfully, this is a demo heavy session as it is one of the last sessions of the day.

CodeMastery: Data Mining with the Tools You Already Have

The next week, I will be presenting on data mining tools which Microsoft has made available to us in SSAS and Excel.  The goal of this session is to help developers understand how to implement data mining algorithms into their business intelligence solutions.

I look forward to seeing you at both events.  They are priced right, FREE!





Upcoming BI and Data Events with Magenic in Fall 2012

14 08 2012

As I started looking over my event calendar, I was amazed at how many events Magenic is sponsoring, hosting or providing speakers for over the next couple of months.  As always, one of my goals is to make everyone aware of opportunities to increase their knowledge in BI and data or even network with other data pros.  Not all of the events are Magenic events, but Magenic has stepped up to sponsor or provide speakers for the event.  I will update this blog with more details as they become available, but I want to get you thinking about the possibilities to learn and connect.

BI Summit (Chicago, August 23, 2012)

This event is coming up very soon.  Register today to hear about BI topics that matter to you as a customer and in particular as a business user.  Ted Krueger ( B | T ) and Aaron Lowe ( B | T ) will be showing off some of the new Microsoft BI capabilities.  I will be joining them for the round table at the end of the day.  Sign up today as space is limited!

Code Mastery (Chicago, September 12, 2012)

Not to be outdone, a few weeks later Magenic’s Code Mastery event comes to Chicago.  There is an opening session by Ted Krueger discussing query tuning for developers and admins alike.  Then, a full track will be offered on a variety of BI topics.  The speakers include yours truly, James Phillips ( T ), and Aaron Lowe.

SQL Saturday #156 (Providence, September 15, 2012)

Join James Phillips at this event as well.  If you see him at Code Mastery in Chicago and then at SQL Saturday in Providence, let him know.  I am sure we all would be interested in how and why you traveled the country to hear him.  He will be speaking on SSIS 2012 and loading data warehouses.

SQL Saturday #149 (Minneapolis, September 28-29, 2012)

We are loaded for bear for this event.  Steve Hughes and Ted Krueger will be conducting preconference sessions.  Beyond that, a number of Magenicons have submitted sessions.  We expect the final list before the end of the week and I will update this post when the list is finalized.  Magenic is also a Gold Sponsor for this event.  Oh, and this year the precons are on Friday with the full event on Saturday.  Thought we would try something new this year.

8/15/2012 UPDATE:  Magenic speakers announced:  Ted Krueger, Aaron Lowe, Jeff Prom and I have sessions at this event.

Code Mastery (Minneapolis, October 2, 2012)

This event is still being finalized, but hold the date!  We have a BI track that includes implementing BI with SharePoint.  Our speakers include Jeff Prom ( B ), Josh Owens ( T ), Curtis Smith and Greg Moser ( T ).  Look for more details coming soon.

Minnesota SQL Server User Group (Minneapolis, October 14, 2012)

Magenic is the sponsor for October’s user group meeting and Jeff Prom will be speaking on Data Quality Services.

PASS Summit (Seattle, November 6-9, 2012)

And the Summit wraps up a busy fall season.  I have two sessions at this conference.  Ted will be a part of Ask the Experts and Birds of a Feather.  A number of other Magenicons will also be in attendance.  We look forward to wrapping up this fall’s data and BI community events at the Summit.

Join Magenic’s data pros at any of these conferences or meetings.  Magenic is a huge promoter of business intelligence and data related education within the SQL Server community.  We look forward to seeing you at any of these events.





O, There’s the Data: Using OData in SSIS

23 07 2012


The Open Data Protocol (OData) is an open specification created by Microsoft to enable exposing data in a standard way from a variety of sources.  OData is natively supported in many of Microsoft’s products including PowerPivot, Excel 2013, SQL Server 2012 Analysis Services Tabular Model, Windows Communication Foundation (WCF), and Entity Framework to name a few.  Furthermore, Microsoft uses OData to expose data feeds from the Windows Azure Data Marketplace as well.

I pursued adding an OData source to SSIS as a result of Mark Souza’s presentation at the Minnesota SQL Server User Group in April 2012.  I posed a question about easier interaction with Oracle.  He mentioned that OData would be a good way to solve that issue.  This led me to put together a presentation which I delivered for PASSMN in July 2012 entitled O, There’s My Data: The Open Data Protocol.  At that presentation, I reviewed the “pain and agony” of a data pro putting together a data feed using Entity Framework in C# and WCF to expose it.  For the most part, with the help of .NET pros at Magenic including Dave Stienessen ( B ) and Sergey Barskiy ( B ), I was able to create my first entity model and expose it using WCF.  After that I worked on how to consume the feed without purchasing a 3rd party tool.  Here is the rest of the story.

Using ATOM as Shown in a Channel 9 Exercise

While looking for solutions that allowed me to implement an OData feed into an SSIS package, I came across a Hands on Lab on Channel 9.  While the focus was on Reporting Services, I was able to use the steps to create a package that would read a feed and make the data available to the ETL process.  In a nutshell, this exercise involved three tasks: creating an ATOM file from the OData feed by way of an HTTP connection manager, processing the ATOM file, and loading the data.  While you are creating this package, you should run each task after you create it so that the files it produces are available when you build the following tasks.

image

Task 1 – Create ATOM File (Script Task)

In the Main method,  I used the following code which was copied and adapted from the Channel 9 exercise. (NOTE: The code for this script has an error.  The object declaration should be condensed to one line to work properly.)

public void Main()
{
    // Get the unmanaged connection from the HTTP connection manager named "TestSvc"
    object nativeObject = Dts.Connections["TestSvc"].AcquireConnection(null);

    // Create a new HTTP client connection
    HttpClientConnection connection = new HttpClientConnection(nativeObject);

    // Save the file from the connection manager to the local path specified,
    // overwriting the file if it already exists
    string filename = "C:\\Source\\SSIS 2012 Projects\\ODataIntegration\\Departments.atom";
    connection.DownloadFile(filename, true);

    Dts.TaskResult = (int)ScriptResults.Success;
}

This task will create an ATOM file that will be used in the next step to retrieve the data.

Task 2 – Process ATOM File (XML Task)

This task will use the new ATOM file to create an XML file with the data.  It uses the XSLT operation type pointing to the File Connection Manager created in the previous step as the source.  This will result in another File Connection Manager to support the destination XML file with the data.  Finally, in the exercise a second operand, an XSLT stylesheet, is used to clear unsupported headers.  Admittedly, I just copied this straight from the example and still am not sure of the details of what it does.

Here is a look at the XML Task Editor so you can see the settings I used.

image

Here is the code from the Channel 9 exercise used in the SecondOperand property:

<?xml version="1.0" encoding="utf-8" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="no" />
  <xsl:template match="/|comment()|processing-instruction()">
    <xsl:copy>
      <xsl:apply-templates />
    </xsl:copy>
  </xsl:template>
  <xsl:template match="*">
    <xsl:element name="{local-name()}">
      <xsl:apply-templates select="@*|node()" />
    </xsl:element>
  </xsl:template>
  <xsl:template match="@*">
    <xsl:attribute name="{local-name()}">
      <xsl:value-of select="." />
    </xsl:attribute>
  </xsl:template>
</xsl:stylesheet>
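Reading the stylesheet, each element and attribute is rebuilt under its local-name() only, so the output carries the same data with the namespaces removed.  The Python sketch below reproduces that effect outside of SSIS so the result is easy to inspect.  It is an illustration, not part of the Channel 9 exercise: the function name strip_namespaces and the sample entry are made up, and the sample assumes the usual OData ATOM layout with d: and m: prefixes.

```python
import xml.etree.ElementTree as ET


def local_name(qualified):
    """Drop the "{namespace-uri}" part that ElementTree puts before a name."""
    return qualified.rsplit("}", 1)[-1]


def strip_namespaces(xml_text):
    """Return the document with every element and attribute un-namespaced."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        element.tag = local_name(element.tag)
        # Rebuild the attributes under their local names as well.
        attributes = list(element.attrib.items())
        element.attrib.clear()
        for name, value in attributes:
            element.set(local_name(name), value)
    return ET.tostring(root, encoding="unicode")


# Hypothetical single-entry payload in the shape of an OData ATOM feed.
SAMPLE = (
    '<entry xmlns="http://www.w3.org/2005/Atom" '
    'xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices" '
    'xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata">'
    '<content type="application/xml"><m:properties>'
    '<d:DepartmentID m:type="Edm.Int16">1</d:DepartmentID>'
    '<d:Name>Engineering</d:Name>'
    '</m:properties></content></entry>'
)

if __name__ == "__main__":
    print(strip_namespaces(SAMPLE))
```

One caveat with this approach, in either language: two attributes that differ only by namespace collapse into one name, and only one value survives.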

Task 3 – Load Data (Data Flow Task)

The final task is a straightforward data load using the XML Source Component pointed at the XML file I created.  Then I created a matching table in a database which I used as the destination.

image

Wrap Up on the ATOM Feed Option

This will work with SSIS 2008 and SSIS 2012.  I tested most of the work in 2012, but the code in the illustration supports 2008.  This option does require that the package write at least two files to the server to work correctly.  In some cases, this will not work in enterprise environments as the data will now rest on the server for a period of time or the admins do not want files created on the server.

Using a Custom SSIS Source to Get the Data

NOTE: This is the preferred solution, but is not available in SSIS 2008 which uses the .NET 2.0 Framework.  This solution requires the .NET 3.5 Framework.

This version uses a custom SSIS source to connect to the OData feed and populate the data flow pipeline.  I did not find this option illustrated anywhere and used help from Dave and Sergey to put this together.  I spent many hours trying to solve this issue and at the end of the day, it is fairly simple.  So, hopefully, this will save you some time as well.

This package only has one workflow task – a data flow task which contains the rest of the code.  In the data flow task, I have a Script Component implemented as a source and a Row Count with a data viewer on the pipeline to check results.

image

This was my first experience creating a custom source.  I used a post from SSIS Talk – SSIS Using a Script Component as a Source as a reference.  If you need help creating your first script source check it out.

Be sure to set your outputs prior to creating the script or you will not have them available to map to in the code.  You also need to add the HTTP Connection Manager you are using to point to your OData feed.

Add References, Using Statements, and Declarations

Once you have the basics set up, you need to add some references including the targeted data service and System.Data.Services.Client.  These are the key references for the code we are implementing.

image

Once you have these references, you will need to add the following using statements to the Namespaces region.

using System.Data.Services.Client;
using SC_68e99fec2dce4cd794450383662f6ac7.TestSvc;

The SC_ reference is the internal name for your script component and will be different from mine, although it will likely be in the same format.

Next, you need to add the following declarations in the ScriptMain class as shown here.

public class ScriptMain : UserComponent
{
    // Address of the OData service; the Uri constructor needs a quoted string
    private Uri svcUri = new Uri("http://localhost/ODataSQL/TestSvc1.svc");
    private AdventureWorksEntities context;

The AdventureWorksEntities is from the service reference I created. You will need to know the context name for the service reference you are using.

The Working Code: Using DataServiceQuery

In the CreateNewOutputRows method in the SSIS script you will add code that runs a DataServiceQuery which adds the data to the data flow pipeline.  In my case, my output was called Departments, which created the DepartmentsBuffer reference you see in the code.  It has the output fields I defined for my source.  Here is the code I used to implement the solution.

public override void CreateNewOutputRows()
{
    // Connect to the service and query the Departments entity set
    context = new AdventureWorksEntities(svcUri);
    DataServiceQuery<Department> dept = context.Departments;

    // Add one pipeline row per department returned by the service
    foreach (Department d in dept)
    {
        DepartmentsBuffer.AddRow();
        DepartmentsBuffer.DeptID = d.DepartmentID;
        DepartmentsBuffer.DeptName = d.Name;
        DepartmentsBuffer.GroupName = d.GroupName;
    }
}

This will query the service and return the rows. Alas, that is all it really took to solve this problem.  While this solution does not work in SSIS 2008, if you are planning to use a lot of OData, I would recommend using this as another reason to upgrade to SQL Server 2012.
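For anyone curious about what that loop amounts to, here is a rough Python sketch of the same pattern: one output row per entry returned by the feed.  It assumes the service answers with an ATOM payload in the usual OData layout (properties under m:properties with a d: prefix).  The function name rows_from_feed and the sample data are hypothetical, and in the SSIS script the client library does this work for you.

```python
import xml.etree.ElementTree as ET

# Namespaces used by OData ATOM payloads (assumed for this sketch).
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "d": "http://schemas.microsoft.com/ado/2007/08/dataservices",
    "m": "http://schemas.microsoft.com/ado/2007/08/dataservices/metadata",
}


def rows_from_feed(feed_xml):
    """Yield one dict per ATOM entry, keyed by OData property name."""
    root = ET.fromstring(feed_xml)
    for props in root.iterfind("atom:entry/atom:content/m:properties", NS):
        # Strip the "{namespace-uri}" part so keys are plain property names.
        yield {child.tag.rsplit("}", 1)[-1]: child.text for child in props}


# Hypothetical two-entry feed for illustration only.
SAMPLE_FEED = (
    '<feed xmlns="http://www.w3.org/2005/Atom" '
    'xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices" '
    'xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata">'
    '<entry><content type="application/xml"><m:properties>'
    '<d:DepartmentID>1</d:DepartmentID><d:Name>Engineering</d:Name>'
    '<d:GroupName>Research and Development</d:GroupName>'
    '</m:properties></content></entry>'
    '<entry><content type="application/xml"><m:properties>'
    '<d:DepartmentID>2</d:DepartmentID><d:Name>Tool Design</d:Name>'
    '<d:GroupName>Research and Development</d:GroupName>'
    '</m:properties></content></entry>'
    '</feed>'
)

if __name__ == "__main__":
    for row in rows_from_feed(SAMPLE_FEED):
        print(row)
```

All values come back as text here; a real load would convert them to the column types defined on the output.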

SSIS Needs an OData Source Component

What I found interesting is that Microsoft does not have a native method to load OData feeds into the Data Flow Task in SSIS.  I have since created a Connect item to see if we can get this added.  Vote here if you agree.

Resources Used throughout the Process

Connecting to Windows Azure SQL Database Through WCF

Loading Data from an ATOM Data Feed into SQL Server

SSIS – Using a Script Component as a Source

DataServiceContext Class

Chris Woodruff – 31 Days of OData Blog Series

PASSMN Presentation – July 17, 2012

Consuming SharePoint Lists via OData and SSIS – Uses Linq





X-XMLA: v. Creating XMLA Script from Visual Studio and Deploy the Database

26 06 2012


This will be the easiest entry in this series.  This is a reblog of an article I wrote last year.

Simple batch script to generate XMLA and deploy SSAS DB

As I was working through some issues with my cube design, I realized that I wanted to deploy my SSAS databases using XMLA in SSMS. This can be easily scripted from SSMS, but I had not deployed the database yet. As a result, I created a batch file to build the XMLA used to deploy a database. This script can be used for automated builds or even automated deployments if that fits into your model. I don’t write a lot of batch files so you may want to update the syntax as needed.  Read more…







