Our Blogs
English
Español
Skip Navigation Links
News
Our Team
Mentoring
Erik Veerman
Go to home page
Go Search

 
Categories
There are no items in this list.
Other Blogs
There are no items in this list.
Links
There are no items in this list.

Our Blogs > English > Erik Veerman
Virtual Conference and other happenings...

Here's a quick update on happenings!  I have some great things coming down the pipe for me and also its been a busy 12 months!  But yes, blogging hasn't been my forte, but it's not for a lack of interest.  I hope to catch up a little in the next few months.

 

Looking to the future

  • Check out the SSWUG Business Intelligence Virtual Conference… http://www.vconferenceonline.com/business-intelligence/speakers.asp It looks to be a *great* conference and is September 24-26.  I’m presenting 3 sessions on Designing, Architecting, and Tuning your BI solution.  Don’t miss out!
  • Our new WROX SSIS 2008 Pro book is finished! (http://www.amazon.com/Professional-Microsoft-Server-Integration-Services/dp/0470247959/ref=sr_1_2?ie=UTF8&s=books&qid=1216041624&sr=8-2) Expect it to be on the shelves in a couple months.  We’ve added a lot of great stuff in there… I wrote the Data Warehouse ETL chapter plus there is a new chapter on leveraging the SQL 2008 RDBMS with SSIS that Grant Dickinson from MS authored.
  • We (Solid Quality) has a new seminar series called SQL Directions… it’s like a pre-conference seminars, deep-dive, lots of material without labs.  I am doing a SQL Server 2008 BI seminar in Atlanta from Oct 1-2, http://learning.solidq.com/na/CourseDetail.aspx?CourseScheduleId=302. It’s a lot cheaper and less time than a class. 
  • PASS this year looks to be very exciting, and they are expecting 3000 people!  I was accepted as a pre-conference speaker and will be presenting on Designing, planning, and Tuning a BI solution, http://summit2008.sqlpass.org/precon-erik-veerman.html I’m really looking forward to this and I have some great content to share about overall BI solution architecture.
  • A new Training Kit is in process!  This is the SQL 2008 BI Implementation TK and we just started.  It’s for the 70-448 Exam and we are building upon the TK 70 445 material… as the lead author, I’ve made a lot of decisions on the chapter content and layout to make this TK better for anyone planning on using SQL 2008 BI.  No links yet, but I’ll keep you posted!

 

What I have been doing the last few months?

  • Well, for starters, my wife and I welcomed a new baby into the family in Feb (Caleb James Veerman).  He’s our 4th kid! It’s a little chaotic around the house, as you can imagine.
  • I did a series of SQL Mag road-shows in May… on high-performance BI.  In fact, one week, I was in NYC on Monday, LA on Tuesday, Chicago on Wednesday, and Boston on Thursday.  Look for some future ones this fall!
  • TK 70-445 was released last fall… http://www.amazon.com/MCTS-Self-Paced-Training-Exam-70-445/dp/0735623414/ref=pd_bbs_sr_1?ie=UTF8&s=books&qid=1216041624&sr=8-1 I was the lead author for the book…  this one really stretched me and boy what a relief it was to get it out the door.
  • Also, if you’ve read some of my earlier posts, I did some ground breaking work with the Many to Many relationships in SSAS and co-authored a White Paper with the SQL CAT team on how to optimize them.  Check out the paper here: http://sqlcat.com/whitepapers/archive/2008/05/03/analysis-services-many-to-many-dimensions-query-performance-optimization-techniques.aspx
  • Wow, last fall was full of conferences… SQL PASS, SQL Connections, and a series of SQL Mag road-shows on Project REAL.
  • Of course, project work and classes.  I’ve taught probably 8 classes in the last 8 months and have been working with a client in Atlanta implement SQL 2005 BI… they love it!  I always hear comments like “Wow, we’ve never had access to data like this before”

I’ve got some more things in the pipeline that I can’t share yet, but stay tuned!

 

Erik

...on practical BI

Maximizing Attribute Relationship Aggregations in Changing Dimensions
I wrote an article for the Solid Quality Newsletter on understanding the impact of dimension changes on Analysis Services Attribute Relationships...
 
Read the full article here!
 
BTW, if you want to sign up for our newsletter, you can sign up here:
 
Erik
SSAS Many to Many Optimization Technique #2 (using partitions)
Yep, I am finally getting around to posting my *second* idea on optimizing many-to-many relationships.  Really, its more than an idea as I have used the technique successfully on a couple client projects.
 
Here's the summary, then I'll walk through how to implement this with Adventureworks...
 
Partition your intermediate MGs by date and include the date dim relationship in the dimension usage.
 
Again, the same goal is in play, which is how to reduce the number of records that are in the intermediate measure group.  In the prior post, I walked through the Matrix idea to actually compress the number of rows in the intermediate measure group.  This approach is a little different:
 
  • M2M queries use *ALL* the common dimensions between the intermediate measure group and the other measure groups needed when a M2M dimension relationship is used.
  • The date dimension is the most common dimension used to filter or slice in MDX queries
  • THEREFORE: If you add the Date/time dimension to the intermediate measure group AND partition by the date dimension, then when M2M queries are run that use the date dimension to filter/slice, the number of rows needed for the runtime join *can* be significantly reduced.
 
That's it!  And it works, I worked with a client where we had an intermediate measure group with 150M rows. Including this reduced queries times in some cases by 10-50X.  Here's the example for AdventureWorks and the proof.
 
Implementation Example:
 
1.) In DSV, right-click on Fact Sales Reason
replace table -> with new names query
 
SELECT FISR.SalesOrderNumber, FISR.SalesOrderLineNumber, FISR.SalesReasonKey,
FIS.OrderDateKey, FIS.DueDateKey, FIS.ShipDateKey
FROM dbo.FactInternetSalesReason FISR
INNER JOIN dbo.FactInternetSales FIS
ON FISR.SalesOrderNumber = FIS.SalesOrderNumber
AND FISR.SalesOrderLineNumber = FIS.SalesOrderLineNumber
 
save/close DSV
 
2) Go to Partitions and expand the "Sales Reasons" partitions
rename Internet_Sales_Reasons to "Internet_Sales_Reasons_2001"
 
Change the source to be a query binding and use the following query:
 
SELECT FISR.SalesOrderNumber, FISR.SalesOrderLineNumber, FISR.SalesReasonKey,
FIS.OrderDateKey, FIS.DueDateKey, FIS.ShipDateKey
FROM dbo.FactInternetSalesReason FISR
INNER JOIN dbo.FactInternetSales FIS
ON FISR.SalesOrderNumber = FIS.SalesOrderNumber
AND FISR.SalesOrderLineNumber = FIS.SalesOrderLineNumber
WHERE FIS.OrderDateKey <= '184'
 
Create 3 new identical partitions with different SQL WHERE filters...
 
"Internet_Sales_Reasons_2002"
...WHERE FIS.OrderDateKey >= '185' AND FIS.OrderDateKey <= '549'
 
Internet_Sales_Reasons_2003
...WHERE FIS.OrderDateKey >= '550' AND FIS.OrderDateKey <= '914'
 
Internet_Sales_Reasons_2004
...WHERE FIS.OrderDateKey >= '915' AND FIS.OrderDateKey <= '1280'
 
Here's what your partitions will now look like:
 
SSAS Partitions
 
3.) Go to the dimension usage tab and add relationships between the Sales Reasons measure group and create regular relationships to the Date, Ship Date, and Delivery Date dimensions.
Use the Time key... for the relationship.  Here's what it looks like:
Dim Usage
 
Optional Step: Change the Internet Sales Detail to MOLAP.  The only thing this does is make it easier to see the, otherwise the M2M queries will submit SQL to the DB to get the result.  It still works both ways, but a little cleaner with MOLAP.
 
Here's the XMLA code of the project.
 
The Results
Now, turn on SQL Server Profiler and connect to SSAS.  Run the following query against the updated version of ADW...
 
SELECT [Measures].[Internet Order Quantity] ON COLUMNS
, [Sales Reason].[Sales Reasons].[Sales Reason] ON ROWS
FROM [Adventure Works]
WHERE [Date].[Calendar].[Calendar Quarter].&[2002]&[3]
 
Check out the trace.  Specifically look at the Partition entries for Sales Reasons partitions.
 
"Finished reading data from the 'Internet_Sales_Reasons_2002' partition."
 
Partitions 2003 and 2004 are not being scanned for the query... (NOTE: 2001 actually IS read and the only reason is because it only has 1454 rows, and SSAS will always read it)
 
Multiply this out to millions of records and dozens of partitions and your M2M queries that include a date filter or slice will perform way better.
 
The 2 keys to implementing this are:
1.) Include partitions in your M2M measure groups on Date ranges AND
2.) be sure to add measure group relationships to the date dimension(s) in the dimension usage.
 
That's it.
 
Regards,
Erik
 
Old blog entries
I have yet to move my old blog entries over to our new blog site... but you can find them here:
 
 

Erik Veerman

...on practical SSIS (and SSAS)

erik (at) solidq.com 

 

SSAS Many to many optimization technique #1
This has been something brewing for a few weeks that I am just getting to writing down… and yes, naming this idea #1 does imply that I have another idea :) so stay tuned.  Before diving into the details, the summary is that this approach when applied against the AdventureWorksDW database was able to take the intermediate measure group rowcount down from approx 16,000 to only 19.  Wow… because the size of the intermediate measure group has a big impact on performance.
 
Lets call this the “Many to Many Matrix Relationship Optimization” approach for SSAS M:N relationships… I’ve been working on this with Dan Hardan (Msft) on this because of a need to make the Many to Many relationships perform better.
 
As a background, there are many kinds of many to many relationships, but if I can summarize two of the common kinds…
 
1.)  If you have a many to many relationship that only relates one dimension to another without the m-n tied to the main fact table, then you are lucky!  For example, you have a vendor dimension that contains all your company vendors and then a second dimension that is a vendor role dimension that has different roles (such as shipper, buyer, etc) and a vendor ties to one or more roles directly through an intermediate fact table.  Your main fact table, say inventory fact ties directly to the vendor dimension but you don’t know which specific vendor role that an inventory record relates to, so the vendor to vendor role relationship is only between the two dimension.  In this case, your intermediate fact table will be very small as is.
 
2.) The second many to many relationship is what this (long) blog entry is about… and you can look at the AdventureWorksDW SalesReason implementation to see it.   It’s the situation when you have a fact table to dimension M:N relationship with an intermediate fact table that has just as many rows (if not more) than your main fact table… such as through a degenerate dimension.  A single sale can be tied to zero, one or more Sales Reasons… and there is no “normal” dimension that the sales fact table has to go through to get to the intermediate fact table (measure group).  Instead, the intermediate fact table also has the degenerate sales key included in it (so its in both fact tables).  I have run into this situation 2-3 times at difference clients.  In SSAS, a “fact dimension” is created and is tied to each measure group (the sales measure group and sales reason intermediate measure group).  The sales reason dimension is directly tied to the intermediate measure group and tied to the sales measure group as a many to many dimension relationship.  Look at the AdventureWorks cubes to see how this works.  So essentially the shared dimension is huge (the same number of rows as the sales fact) AND the intermediate measure group is potentially bigger than the sales fact table because a single sale can have more than one sales reason!
The solution described in scenario 2 surely works, but only for smaller size cubes where the fact tables don’t exceed approx 20-50M records (depending on your hardware and SLA).
 
There’s a much better solution for this scenario… and its based on the premise that the relationships can be collapsed to common sets of the many to many dimension records.  For example, if in my sales fact table, I may have 25 records that have the same combination of sales reasons… such as “internet” and “friend.”  In the current solution, there are 50 records in the intermediate fact table for those 25 records.  One per sale for the “internet” reason and one per sale for the “friend” reason.  However, this can be collapsed to 2 records if you are able to associate all the 25 sales fact records together and identify them as having these two specific combinations of sales reasons.
 
How?  Through creating a matrix relationship key, meaning that you get all the different combinations of the sales reason members (not the cross product of them, only the ones that are actually used) and then assign a key to the “matrix” of records.  Here's a great visualization that Dan Hardan put together:
 
M2M Matrix
 
These are the steps to make it happen (and the code for AdventureWorks is below).
 
1.) Create a new dimension the called Sales Reason Matrix Relationship Dimension and it will need to have a column that has a delimited list of surrogate key combinations from the Sales Reason AND an IDENTITY column that is the Matrix Relationship Key.
 
2.) Add a new dimension key to the fact table that ties the sales records to this new dimension (meaning, look at the old intermediate fact table and find the right combination and then lookup the right new dimension record)
 
3.) Create an intermediate measure group that has two SKs… the Matrix Dimension Key and the associated Sales Reason Key.  So you will have potentially more than one record per Matrix Dimension Key if you have.
 
4.) Finally, redo your SSAS design to use the new intermediate measure group and the new Matrix Dimension.  Hide the Matrix Dimension as it’s only there to resolve the many to many relationship.  Here's what the Dimension Usage l;ooks like in SSAS... and the XMLA is below.

DimUsage

 
OK!  Now, here’s some TSQL code and the SSAS XMLA to test it out in AdventureWorks:
 
 
FYI, I’ve only made this work for the Sales fact table… but obviously this can extend beyond that.
 
I am very interested in hearing your feedback on this.
 
Kind regards,
Erik

 ‭(Hidden)‬ Admin Links