Gautham KAMATH

BI Developer specializing in SSAS and MDX

CoordinatorExecutionMode and Process MaxThreads


Just recently I ran a processing performance test against my Analysis Services database and noticed that only a little over 50% of the available system resources were being used. On investigation I found that the counters I watch during processing (processing pool busy threads, processing pool idle threads and processing pool job queue length) were way below the SSAS defaults.

This did not seem right. On closer investigation I found that, during the processing cycle, small batches of data were being processed (ranging from 1,000 to 100,000 records per batch) and ProcessAdd was being used to process the partition data.

I changed the processing cycle to process all the records using ProcessFull and the data processed significantly faster. Yes, I know ProcessFull should have been used instead of ProcessAdd all along, but as per our earlier requirements this was not possible; besides, that is not the topic of discussion today. (A minimal sketch of the switch is shown below.)
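
For reference, the processing type is just the Type element of the XMLA Process command that the processing cycle sends to the server. The following is only a sketch, with placeholder object IDs:

    <Process xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
      <Object>
        <DatabaseID>MyDatabase</DatabaseID>
        <CubeID>MyCube</CubeID>
        <MeasureGroupID>MyMeasureGroup</MeasureGroupID>
        <PartitionID>MyPartition</PartitionID>
      </Object>
      <!-- previously ProcessAdd -->
      <Type>ProcessFull</Type>
    </Process>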

As I mentioned, the processing time did reduce, but there was still a significant job queue. This, as a matter of fact, was not surprising, as I was using the SSAS default settings for CoordinatorExecutionMode and Process MaxThreads.

Just for information, let me define the two Analysis Services server properties and the three performance counters that I will be referring to in this post.

Analysis Services server properties

CoordinatorExecutionMode: Processing of dimensions and partitions, and querying of partitions, are handled by job operations. A job operation is a unit of execution of a command fired against the SSAS server. The property that controls how many of these job operations run in parallel is CoordinatorExecutionMode. It can take the following values:

  • A negative number indicates the absolute maximum number of parallel jobs per core. CoordinatorExecutionMode defaults to -4, indicating 4 x number of cores parallel jobs; i.e. if you have 8 cores on the server, the absolute maximum number of parallel jobs would be 32.
  • A positive number indicates the absolute maximum number of parallel jobs per server. If you set it to 10, the absolute maximum number of parallel jobs for the server would be 10.
  • Zero lets the server determine the maximum number of parallel jobs, based on workload and available system resources.

Process MaxThreads: This server property determines the maximum number of threads available to SSAS during processing (and for accessing the IO system during querying). It defaults to 64 or 10 times the number of cores, whichever is greater.

Performance Monitor (PerfMon) counters

Processing pool BUSY threads: Indicates the number of threads in use.
Processing pool IDLE threads: Indicates the number of threads that are idle.
Processing pool JOB QUEUE LENGTH: Indicates the number of jobs that are waiting for threads to be available.

Now coming back to the issue, as a test case, I changed the CoordinatorExecutionMode to zero.

Changing the CoordinatorExecutionMode to zero meant that I was forcing the SSAS engine to use all the available system resources. Ironically, this slowed processing significantly and increased the processing time by 50%.

This was because all the available threads (in excess of 300) were now either fighting or queuing up to do the same portion of processing work. The system resources were being hammered, and the Process MaxThreads setting, with its default value of 64, was effectively being ignored.

I changed the CoordinatorExecutionMode back to -4 and the system processed as before. But I still saw a significant job queue, so I increased Process MaxThreads from the default value of 64 to 128. The queue was nullified and the processing time improved marginally; the important thing here is that changing Process MaxThreads to 128 zeroed the job queue length.
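
For anyone wanting to repeat this: both settings are advanced server properties, changeable from the SSMS server properties dialog (with advanced properties shown) or via an XMLA Alter command. The following is only a sketch, with a placeholder server name, assuming the usual property path ThreadPool\Process\MaxThreads:

    <Alter AllowCreate="true" ObjectExpansion="ObjectProperties"
           xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
      <Object />
      <ObjectDefinition>
        <Server xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
          <ID>MyServer</ID>
          <Name>MyServer</Name>
          <ServerProperties>
            <!-- keep the default: up to 4 parallel jobs per core -->
            <ServerProperty>
              <Name>CoordinatorExecutionMode</Name>
              <Value>-4</Value>
            </ServerProperty>
            <!-- raise the processing thread pool from 64 to 128 -->
            <ServerProperty>
              <Name>ThreadPool\Process\MaxThreads</Name>
              <Value>128</Value>
            </ServerProperty>
          </ServerProperties>
        </Server>
      </ObjectDefinition>
    </Alter>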

I went slightly adventurous and reduced the CoordinatorExecutionMode to -1. The processing pool idle threads reduced significantly and the processing pool busy threads increased, while the processing pool job queue length remained 0. But for me, the default CoordinatorExecutionMode of -4 with Process MaxThreads at 128 worked best.

To summarize:

  • DO NOT change CoordinatorExecutionMode to 0.
  • Try to keep the queue lengths to a minimum. Playing with Process MaxThreads is a good idea, but do not be over-generous as it may backfire. Please note that CoordinatorExecutionMode and Process MaxThreads complement each other; if you change CoordinatorExecutionMode, make sure you have enough threads to play with.
  • CPU is underutilized when:
    • Processing pool job queue length is greater than 0. This means that SSAS wants to process more but does not have available threads.
    • Processing pool idle threads = 0. This means there is potential to use more threads, so increase Process MaxThreads. Increase gradually and check; ideally the job queue length should not go above 0, and a gradual increase will give you a clear indication of capacity.
    • Processing pool idle threads greater than 0. Increase the CoordinatorExecutionMode from the default value of -4 to -5 or -6. (Yes, this is an increase, not a decrease: the negative sign (-) just indicates a per-core setting, while a positive value indicates a per-server setting.) Idle threads greater than 0 does not mean the server does not have enough threads; it means the server does not have enough jobs.

Please note this test was carried out against:

  • SQL Server Analysis Services (SSAS) 2008R2 64 Bit
  • Windows Server 2008R2 64Bit
  • 10GB RAM
  • 4 Cores

This behaviour improves in Analysis Services 2012 (Multidimensional model), with changes to the way the Multidimensional model uses Non-Uniform Memory Access (NUMA).

Rahul Dravid, I salute you.


“If I have to name anyone to bat for my life, it’ll be Jacques Kallis or Rahul Dravid,” said Brian Lara. The personification of “The Wall” in Rahul Dravid could not have been made clearer. Rahul Dravid is one of the greatest cricketers to have graced Indian cricket through its golden era. A true gentleman and an excellent sportsperson, he has, all through his career, displayed great modesty and been an honest cricketer.

Let’s flash back to 1996, particularly the back-to-back series in Singapore, where Dravid made his ODI debut, Sharjah, and England, where he made his Test debut. A few new faces were drafted into the Indian team for those series. One was the dashing Vikram Rathour, with a tall and rough physique; the second was someone going by the name of Rahul Dravid, who looked more like a chocolate hero. Between the two, I was confident that Vikram Rathour would eat into this Karnataka batsman’s opportunities. How wrong I was, and in this instance it felt particularly good.

Dravid was given more chances before he had to relinquish his place to Vikram Rathour in Sharjah. They continued to swap places till the second Test against England, which almost sealed Dravid’s place in the Indian team. With his classy 95 in his debut Test match at Lord’s, Dravid truly arrived on the international scene as a confident, enigmatic, textbook Test cricketer. In many ways he reminded us of the forgotten art of Test match batting. Seldom since then has he let himself, the team or cricket down.

Having scored 24,208 runs across all forms of international cricket, some of his great innings have played second fiddle to greater contributions by his partners. For instance, his innings of 180 at Calcutta in March 2001 was overshadowed by VVS Laxman’s 281, but was just as significant in leading India to their greatest ever victory in Test cricket. In ODIs he has been involved in both 300+ partnerships, only to be overshadowed by his partner’s innings. Nonetheless, in all such partnerships and in many others, he stood at the crease like “The Wall”.

Rahul Dravid batting during his last ODI at Cardiff

Elegantly batting at number 3, he shielded the Indian batting line-up with a determination and temperament that fed on the opposition’s patience. He excelled at leaving balls outside the line of off stump, often frustrating the bowlers and forcing them to bowl at his wicket. His defence was immaculate, as if scripted by God himself. If the bowlers were a little off line or length, and I really mean fractionally, the ball would be punished. It always seemed that it was not Dravid who was getting six chances to survive; instead it was Dravid who was giving the bowler six attempts to bowl accurately. His was always the prized wicket, as he had an innate temperament that frustrated bowlers into making mistakes.

Apart from his batting, he manned the slip cordon, almost all the time alone at first slip. He was the safest pair of hands, and has now held more catches than anyone else in the history of Test cricket (excluding wicketkeepers).

He was modest and a very accomplished cricketer, often an inspiration and a mentor to many who played around him. His experience and words of wisdom have been, and will always be, cherished and respected. The greatness of this man lay in his honesty towards cricket; in many ways he became the Gandhi of cricket. The spirit with which he played has had a profound impact on the game.

The retirement of Rahul Dravid truly marks the end of the golden era of Indian cricket. Sad but inevitable. I wish Rahul Dravid good luck, and hope to see him in an administrative role.

Rahul Sharad Dravid, I salute you.

Vikram Kamath, Nalin Pai and Gautham Kamath at Cardiff, 16 September 2011

On a personal note, my closest encounter with Rahul Dravid was at Cardiff, when I, along with two of my friends, had the privilege of watching him play his last ODI. The ground had an electrifying buzz, more so because it was Rahul Dravid’s last ODI match and perhaps the last time the fans could see him play in England. For us it was more of a celebration, a celebration of Dravid’s glorious ODI career. That day too, Dravid did not disappoint, scoring a masterful 69 runs.

The only reason we are here is to show our gratitude to Rahul Dravid

Like many, we had made a poster that read “The only reason we are here is to show our gratitude to Rahul”. The most important thing for me that day was that Dravid, just before India’s bowling innings, saw the poster and raised his hand to acknowledge, and possibly say thank you for, the message. The fact that he acknowledged it was enough for us.

Rahul, if you happen to read this blog, just so you know, I still have that poster, in a “not so good state”, and would very much appreciate it if you could sign it for us.

Written by Gautham KAMATH

13/03/2012 at 22:04

Processing using Snapshot Isolation in Analysis Services (SSAS)


Just a few months ago I was hit with a dimension processing failure. The error read:

Errors in the OLAP storage engine: The attribute key cannot be found when processing: Table: ‘<<TableName>>’, Column: ‘<<ColumnName>>’, Value: ‘XXXXXXX’. The attribute is ‘<<AttributeName>>’.

On investigation I found that constant updates to the dimension table were the main cause. One of my dimensions had as many as 15 attributes, and processing failed whenever an update occurred during dimension processing. Generally speaking, the first attribute’s SELECT DISTINCT statement would conflict with the SELECT DISTINCTs of the other attributes because of the constant updates.

One way to get around this was to change the ProcessingGroup property of the dimension to ByTable (instead of the default value of ByAttribute). As the ByTable option takes the entire table data for the dimension into memory, it processes the same snapshot of data without having to worry about the constant updates. The new updates then get incorporated in the next processing cycle.

This option did not work, as ByTable only helps when the dimension draws its data from a solitary table. Multiple tables in a star or snowflake schema, or an RDBMS view that encapsulates multiple tables, will still cause multiple queries to be issued against the data source.

P/S Try processing the Product and Geography dimensions using the ByAttribute and ByTable options. The profiler trace in ByTable ProcessingGroup mode for the Product dimension will show multiple SELECT DISTINCT statements being fired, whereas for the Geography dimension, because of its single-table data source, only one SELECT DISTINCT statement is fired.

As I am using RDBMS database views that use multiple table joins, this option also went to pits for me. The only feasible solution was to use Snapshot Isolation.

Snapshot Isolation

Snapshot Isolation works on the row versioning framework.

Row versioning-based isolation levels reduce the number of locks acquired by a transaction by eliminating the use of shared locks on read operations.

For the Analysis Services database to work in Snapshot Isolation mode, the source RDBMS database needs the READ_COMMITTED_SNAPSHOT and ALLOW_SNAPSHOT_ISOLATION options set to ON.

  • When READ_COMMITTED_SNAPSHOT is ON, READ COMMITTED transactions provide statement-level read consistency using row versioning.
  • When ALLOW_SNAPSHOT_ISOLATION is ON, SNAPSHOT transactions provide transaction-level read consistency using row versioning.

The following SQL commands enable the mentioned options in your RDBMS database.

        USE <<DatabaseName>>
        GO
        ALTER DATABASE <<DatabaseName>>
        SET READ_COMMITTED_SNAPSHOT ON
        GO
        ALTER DATABASE <<DatabaseName>>
        SET ALLOW_SNAPSHOT_ISOLATION ON
        GO
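
To verify that both options have taken effect, you can query sys.databases:

        SELECT name,
               is_read_committed_snapshot_on,
               snapshot_isolation_state_desc
        FROM sys.databases
        WHERE name = '<<DatabaseName>>'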

Enabling these options causes all data modifications in the database to be versioned: a copy of each row, as it stood before the update (or insert), is stored in tempdb, and reads during processing are served from these row versions. After the dimension processing operation (under Snapshot Isolation) finishes, tempdb frees the space occupied by the versioned rows.

Please note, tempdb must have enough space to hold the row-versioned data generated for the source RDBMS database.

Another important thing is to enable the Multiple Active Result Sets (MARS) connection property on the DataSource in the Analysis Services database.

Not enabling MARS will give you the following error:

OLE DB error: OLE DB or ODBC error: Cannot create new connection because in manual or distributed transaction mode.
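
For example, with the SQL Server Native Client OLE DB provider, the data source connection string carries a MARS keyword roughly as follows (a sketch: server and database names are placeholders, and the exact keyword varies by provider; for the ADO.NET SqlClient it is MultipleActiveResultSets=True):

        Provider=SQLNCLI10.1;Data Source=<<ServerName>>;Initial Catalog=<<DatabaseName>>;Integrated Security=SSPI;MarsConn=Yes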

Since enabling Snapshot Isolation for my Analysis Services database, dimension processing has not had any errors due to constant updates at source. The tempdb size has also stayed well within acceptable limits.

Written by Gautham KAMATH

24/02/2012 at 16:34

Examining MDX Query performance using Block Computation


In 2007 I read a very interesting blog by Mosha outlining how to optimize a Count(Filter(…)) expression in MDX. Since then I have been particularly careful in writing MDX queries, and I also make sure to run them past MDX Studio. But quite frequently I come across queries on forums and blogs that take the machine and its resources for a ride, whose authors still ask why their queries aren’t faster when they have enough resources (disk space, RAM, paging, processors etc.). At times it is quite simple to change the query, but on most occasions it is simply not possible to change the way the query returns data.

Let’s first get a sense of how Analysis Services processes MDX queries. There are two major parts to the Analysis Services engine.

Formula Engine: The Formula Engine (FE) parses the incoming MDX query and figures out what data is required to satisfy it. This information is then forwarded to the Storage Engine.

Storage Engine: Apart from reading and writing data, the Storage Engine (SE) retrieves and aggregates the data that the Formula Engine requests.

In a nutshell, when an MDX query is executed it is passed to the Formula Engine, where it is parsed, and then on to the Storage Engine. The raw data comes back from the Storage Engine (with the necessary aggregations) and the Formula Engine performs any calculations required on that data before publishing it.

The Formula Engine uses a mechanism called “SONAR” to create subcube requests to the Storage Engine (in MDX Studio they appear as Sonar subcubes). The Storage Engine answers these subcube queries (in a profiler trace they appear as the Query Subcube event class; in MDX Studio you see them as SE queries) from the aggregation data cache or the aggregation data registry. The data then comes back to the Formula Engine, where the necessary calculations are applied before publishing.

Now let me take up a simple query that returns, by Date, the number of products with Product Line = Accessory and Product Color = Black that were sold.

    WITH MEMBER [Measures].[Count Of Products] AS
        IIF({NonEmpty(
                    Exists([Product].[Product].[Product].Members,
                        ([Product].[Product Line].[Accessory], [Product].[Color].&[Black]))
                    , [Measures].[Internet Sales Amount]) AS [ProductSet]}.Count = 0
            , NULL, [ProductSet].Count)
    SELECT {[Measures].[Count Of Products]} ON 0,
    NON EMPTY [Date].[Date].[Date].Members ON 1
    FROM [Adventure Works]

The Performance Monitor counters on the MDX Studio show the following information.

On cold cache…..

        Time              : 1 sec 453 ms
        Calc covers       : 0
        Cells calculated  : 427
        Sonar subcubes    : 2393
        NON EMPTYs        : 1189
        Autoexists        : 1194
        EXISTINGs         : 0
        SE queries        : 1198
        Flat cache insert : 0
        Cache hits        : 2378
        Cache misses      : 27
        Cache inserts     : 11
        Cache lookups     : 2405
        Memory Usage KB   : 160

On warm cache…..

	Time              : 1 sec 78 ms
	Calc covers       : 0
	Cells calculated  : 390
	Sonar subcubes    : 2377
	NON EMPTYs        : 1189
	Autoexists        : 1188
	EXISTINGs         : 0
	SE queries        : 1188
	Flat cache insert : 0
	Cache hits        : 2379
	Cache misses      : 0
	Cache inserts     : 0
	Cache lookups     : 2379
	Memory Usage KB   : 4

As you can see, there is no significant improvement in query performance even on a warm cache. The number of Sonar subcubes and subcube event queries (SE queries) fired does not change. The explanation is that having NonEmpty or Exists inside an MDX calculation is not a good idea: in our case the set inside the Exists does not change and the context remains fixed, yet almost all the Formula Engine requests (SONAR) end up as cache hits and a significant number of subcube event queries (SE queries) are still fired. Now let’s change the query and derive the calculated member using the SUM function.

    WITH MEMBER [Measures].[Count Of Products] AS
        SUM([Product].[Product].[Product].Members,
            IIF(([Product].[Product Line].[Accessory],
                [Product].[Color].&[Black],
                [Measures].[Internet Sales Amount]) > 0, 1, NULL)
            )
    SELECT {[Measures].[Count Of Products]} ON 0,
    NON EMPTY [Date].[Date].[Date].Members ON 1
    FROM [Adventure Works]

The Performance Monitor counters show the following information.

On cold cache…..

	Time              : 875 ms
	Calc covers       : 0
	Cells calculated  : 427
	Sonar subcubes    : 14
	NON EMPTYs        : 1
	Autoexists        : 6
	EXISTINGs         : 0
	SE queries        : 10
	Flat cache insert : 0
	Cache hits        : 4
	Cache misses      : 29
	Cache inserts     : 12
	Cache lookups     : 33
	Memory Usage KB   : 848

As you can see, there is a significant reduction in the number of Formula Engine requests (SONAR subcubes) and subcube event queries (SE queries), even though the number of cells calculated remains the same. This is because the query now executes in block-computation mode rather than cell-by-cell mode.

Block computation allows an MDX query to calculate a cell value once and reuse it numerous times. Please note that this does not work in all situations, but it should be used wherever possible.

Now, if you look at the same query in the warm cache scenario, there is only one SONAR subcube request and one SE query; cache hits fill in the rest.

On warm cache…..

	Time              : 578 ms
	Calc covers       : 0
	Cells calculated  : 390
	Sonar subcubes    : 1
	NON EMPTYs        : 1
	Autoexists        : 0
	EXISTINGs         : 0
	SE queries        : 1
	Flat cache insert : 0
	Cache hits        : 7
	Cache misses      : 0
	Cache inserts     : 0
	Cache lookups     : 7
	Memory Usage KB   : 0

EXISTING and Block-Computation mode

EXISTING is an operator that is also best avoided, but certain situations may force you to use it. Block computation with EXISTING is also possible and will give the query a performance boost. Consider the following query:

    WITH MEMBER [Measures].[Count Of Products] AS
        COUNT(EXISTING NONEMPTY([Product].[Product].[Product].Members
                                , [Measures].[Internet Sales Amount]))
    SELECT { [Measures].[Count Of Products], [Measures].[Internet Sales Amount]} ON 0,
    NONEMPTY(
        {[Product].[Product Line].&[S]}
        * {[Product].[Color].&[Black]}
        * {[Date].[Date].[Date].Members}
    ,[Measures].[Internet Sales Amount]) ON 1
    FROM [Adventure Works]

The Performance Monitor counters on the MDX Studio show the following information.

	Time              : 1 sec 328 ms
	Calc covers       : 0
	Cells calculated  : 817
	Sonar subcubes    : 798
	NON EMPTYs        : 391
	Autoexists        : 6
	EXISTINGs         : 390
	SE queries        : 403
	Flat cache insert : 390
	Cache hits        : 422
	Cache misses      : 746
	Cache inserts     : 370
	Cache lookups     : 1168
	Memory Usage KB   : 80

A warm cache does not change the numbers for this query much. Now let’s rewrite the query to use block-computation mode.

    WITH MEMBER [Measures].[Count Of Products] AS
        SUM(EXISTING [Product].[Product].[Product].Members ,
            IIF([Measures].[Internet Sales Amount] > 0, 1, NULL))
    SELECT { [Measures].[Count Of Products], [Measures].[Internet Sales Amount]} ON 0,
    NONEMPTY(
        {[Product].[Product Line].&[S]}
        * {[Product].[Color].&[Black]}
        * {[Date].[Date].[Date].Members}
    ,[Measures].[Internet Sales Amount]) ON 1
    FROM [Adventure Works]

On cold cache the following are the perfmon numbers from MDX Studio.

	Time              : 421 ms
	Calc covers       : 0
	Cells calculated  : 817
	Sonar subcubes    : 19
	NON EMPTYs        : 1
	Autoexists        : 6
	EXISTINGs         : 2
	SE queries        : 13
	Flat cache insert : 0
	Cache hits        : 8
	Cache misses      : 32
	Cache inserts     : 12
	Cache lookups     : 40
	Memory Usage KB   : 80

Note how the EXISTING count has dropped from 390 to 2. On the warm cache, even though the EXISTING count remains the same, most of the query data is drawn from the cache.

	Time              : 62 ms
	Calc covers       : 0
	Cells calculated  : 780
	Sonar subcubes    : 5
	NON EMPTYs        : 1
	Autoexists        : 0
	EXISTINGs         : 2
	SE queries        : 2
	Flat cache insert : 0
	Cache hits        : 10
	Cache misses      : 2
	Cache inserts     : 0
	Cache lookups     : 12
	Memory Usage KB   : 0

To summarize, it is important to understand how the Analysis Services engine evaluates MDX queries, and very important to find ways to use block computation wherever possible.

Written by Gautham KAMATH

08/02/2012 at 10:07

Too many Attributes in the SSAS dimension can spoil the MDX query


All of us have heard about the importance of having correct and appropriate attribute relationships, and their impact on MDX query performance. But there are a few instances where having too many attributes and attribute relationships can backfire. One such instance is with Explicit and Implicit overwrites. Let me explain.

Attribute relationships:

Attribute relationships create dependencies between attributes. In other words, if City has a RELATED attribute State, then City has a many-to-one relation with State, and conversely State has a one-to-many relation with City. More specifically, if the current city is Seattle, then we know the State must be Washington.

Related and Relating relationship:

If State is RELATED to City, then City is said to be RELATING to State.

Because a RELATING attribute has a many-to-one relation with its RELATED attribute, the coordinate for the RELATING attribute (City in this case) moves to the [All] member. For the RELATED attribute, which has a one-to-many relation with the RELATING attribute, the coordinate moves to one specific member.

Explicit and Implicit Overwrite:

An Explicit overwrite takes place in an expression; Related and Relating attributes undergo an Implicit overwrite. Consider the following query:

     WITH MEMBER [Measures].[x] AS ([Measures].[Internet Sales Amount]),
                                    FORMAT_STRING = "currency"
     SELECT {[Measures].[x], [Measures].[Internet Sales Amount]} on 0
     FROM [Adventure Works]
     WHERE [Customer].[Customer Geography].[City].&[Seattle]&[WA]

It is a simple MDX query that returns the Internet Sales Amount for the City of Seattle in Washington.

x              Internet Sales Amount
$75,164.86     $75,164.86

Now let’s play with it and add [Customer].[State-Province].CurrentMember to the calculated member expression:

     WITH MEMBER [Measures].[x] AS ([Measures].[Internet Sales Amount],
                                    [Customer].[State-Province].CurrentMember),
                                    FORMAT_STRING = "currency"
     SELECT {[Measures].[x], [Measures].[Internet Sales Amount]} on 0
     FROM [Adventure Works]
     WHERE [Customer].[Customer Geography].[City].&[Seattle]&[WA]

The query returns the following results

x                Internet Sales Amount
$2,467,248.34    $75,164.86

As we know, the WHERE clause reduces the dimensionality of the cube space and hence changes the cube context for the query. What has happened here is:

  • [City].[Seattle] is the current context (the WHERE clause reduces the dimensionality of the cube space and performs an implicit EXISTS with the axes).
  • Explicit conversion of [State-Province].CURRENTMEMBER to [State-Province].[Washington] (one member from State).
  • Because [State-Province] is RELATED to [City], [State-Province].[Washington] remains as it is (one member from State).
  • Because [City] is RELATING to [State-Province], [City].[Seattle] gets implicitly converted to [City].[ALL Customers].
  • Hence the tuple expression for [Measures].[x] becomes ([State-Province].[Washington], [City].[ALL Customers], [Measures].[Internet Sales Amount]).

Out of curiosity, let us see what happens when we use a subselect instead of the WHERE clause.

     WITH MEMBER [Measures].[x] AS ([Measures].[Internet Sales Amount],
                                    [Customer].[State-Province].CurrentMember),
                                    FORMAT_STRING = "currency"
     SELECT {[Measures].[x], [Measures].[Internet Sales Amount]} on 0
     FROM  (SELECT [Customer].[Customer Geography].[City].&[Seattle]&[WA] ON 0
           FROM [Adventure Works]) 

x                 Internet Sales Amount
$29,358,677.22    $75,164.86

Subselects do not reduce the dimensionality of the cube space and therefore do not change the cube context in which MDX expressions are evaluated. In this case there is no implicit EXISTS between the subselect and the axes, so [State-Province].CurrentMember gets explicitly converted to the All member, [State-Province].[ALL Customers]. The tuple expression for [Measures].[x] in this query becomes ([State-Province].[ALL Customers], [Measures].[Internet Sales Amount]).

So far so good; this is as per design and how it should behave. Now let us look at the same scenario with a different dimension ([Product]) and different members. Consider the following query:

     WITH MEMBER [Measures].[x] AS ([Measures].[Internet Sales Amount]),
                                    FORMAT_STRING = "currency"
     SELECT {[Measures].[x], [Measures].[Internet Sales Amount]} on 0
     FROM [Adventure Works]
     WHERE [Product].[Product Categories].[Product].[Mountain-100 Silver, 38]

Again, a plain and simple MDX query that returns the Internet Sales Amount for the product “Mountain-100 Silver, 38”.

x              Internet Sales Amount
$197,199.42    $197,199.42

If we tweak the expression:

     WITH MEMBER [Measures].[x] AS ([Measures].[Internet Sales Amount],
                                    [Product].[Subcategory].CurrentMember),
                                    FORMAT_STRING = "currency"
     SELECT {[Measures].[x], [Measures].[Internet Sales Amount]} on 0
     FROM [Adventure Works]
     WHERE [Product].[Product Categories].[Product].[Mountain-100 Silver, 38]

We get the same result as the previous expression.

x              Internet Sales Amount
$197,199.42    $197,199.42

The result does not seem correct: we were expecting $9,952,759.56 but we are getting $197,199.42. The query is supposed to be doing the same thing as I described before:

  • [Product].[Mountain-100 Silver, 38] is the current context (the WHERE clause performs an implicit EXISTS with the axes).
  • Explicit conversion of [Subcategory].CURRENTMEMBER to [Subcategory].[Mountain Bikes] (one member from Subcategory).
  • Because [Subcategory] is RELATED to [Product], [Subcategory].[Mountain Bikes] remains as it is (one member from Subcategory).
  • Because [Product] is RELATING to [Subcategory], [Product].[Mountain-100 Silver, 38] gets implicitly converted to [Product].[ALL Products].
  • Hence the tuple for [Measures].[x] becomes ([Subcategory].[Mountain Bikes], [Product].[ALL Products]).

Yet the result seems to be incorrect. Why? This is because there are TOO MANY ATTRIBUTES in the [Product] dimension, and most of them are RELATED to the Product level (the leaf level). All those attributes get set by auto-exists when filtering to a specific product. The following is the subcube information from profiler when the query is fired:

00000000, 000, 000, 010111111111111111111, 000000000000000000000, 00, 100, 000000000000000000000, 000000000000000000000, 000000000000000000000, 00

It equates to the following.

     Dimension:   Product (21)
          Attribute:    Standard Cost
          Attribute:    Color
          Attribute:    Safety Stock Level
          Attribute:    Reorder Point
          Attribute:    List Price
          Attribute:    Size
          Attribute:    Size Range
          Attribute:    Weight
          Attribute:    Days to Manufacture
          Attribute:    Dealer Price
          Attribute:    Class
          Attribute:    Style
          Attribute:    Model Name
          Attribute:    Product Line
          Attribute:    Subcategory
          Attribute:    Status
          Attribute:    Start Date
          Attribute:    End Date
          Attribute:    Large Photo
     Dimension:   Destination Currency (3)
          Attribute:    Destination Currency

There are 21 attributes in the Product dimension, of which 15 are related to the Product level. What this means is that, because of the attribute relationships, those 15 attributes are set by auto-exists along with the Product level (leaf level) instead of remaining at their respective All levels. In the earlier query featuring City and State there was only one related attribute (State) at the City level, and hence the issue did not arise there.

ROOT function:

The desired behaviour is restored by applying the ROOT function. As defined by MSDN, the ROOT function returns a tuple consisting of the All members from each attribute hierarchy within the current scope in a cube, dimension, or tuple. So, by modifying the above query in the following way:

     WITH MEMBER [Measures].[x] AS ([Measures].[Internet Sales Amount],
                                    ROOT([Product].[Subcategory].CurrentMember)),
                                    FORMAT_STRING = "currency"
     SELECT {[Measures].[x], [Measures].[Internet Sales Amount]} on 0
     FROM [Adventure Works]
     WHERE [Product].[Product Categories].[Product].[Mountain-100 Silver, 38]

we get the desired results

x                Internet Sales Amount
$9,952,759.56    $197,199.42

The profiler subcube trace reads the following

00000000, 000, 000, 000000000000000010000, 000000000000000000000, 00, 100, 000000000000000000000, 000000000000000000000, 000000000000000000000, 00

and it equates to

     Dimension:   Product (21)
          Attribute:    Subcategory
     Dimension:   Destination Currency (3)
          Attribute:    Destination Currency

P/S To read the profiler subcube trace I have used a tool created by James Snape.

To summarize:

  • Have the correct attribute relationships.
  • Be particularly careful with CurrentMember and know where the Explicit and Implicit overwrite takes place.
  • As far as possible, keep the number of attributes to a minimum.

Written by Gautham KAMATH

04/12/2011 at 17:27

Populate “All Caption” property of Analysis Services 2000 Dimension using DSO


My Analysis Services 2000 post was brought up just yesterday. In that post I had proposed a solution to populate the “All Caption” property of an Analysis Services 2000 dimension using DSO and C#, as I could not find an answer in Books Online (BOL).

The following is that solution:

Using just the MemberKeyColumn property solves the problem.

// Create the (All) level on the dimension
DSO.Level allLevelDepartment = (DSO.Level)dsoDimDepartment.Levels.AddNew("(All)", DSO.SubClassTypes.sbclsRegular);
allLevelDepartment.LevelType = DSO.LevelTypes.levAll;   // mark it as the All level
allLevelDepartment.EstimatedSize = 1;                   // the All level has exactly one member
allLevelDepartment.MemberKeyColumn = "All Department";  // this literal becomes the All caption

This C# snippet is enough to populate the “All Caption” property of an Analysis Services 2000 dimension using DSO.

Written by Gautham KAMATH

15/11/2011 at 21:56

An open letter to Sachin


Dear Sachin

It is my first letter to you in the 22 years that I have known you, and needless to say, you keep fascinating me. It’s not only because you are a great player, but also because you are a great athlete. It’s hard for me to imagine any athlete staying at the top (or at the very least near the top) for so long; 22 years is by far a very long time. To me those 22 years hold far more value than the 33,207 runs (as of 15 November 2011) that you have scored, which very nicely brings me to the main subject of this letter: your hundredth century.

Your hundredth century will come, and there is nothing stopping it. There is a lot of media chatter, or, politely speaking, bitching, going on about it, with topics like “the hundredth 100 is getting the better of Sachin” and “Do tell Tendulkar how it is done”. While I am sure that you are not affected by this, you are still a mortal human being, and a part of me thinks it might be true.

Let me divert you from this topic. A good few years ago I wanted to pass my driving test here in the UK. The UK driving test is possibly the most difficult to pass, and possibly very expensive too. I failed the first time (it was my bad luck, as I sincerely thought I did nothing wrong). The desperation to pass on my second attempt, and the possibility of losing another good chunk of money, made me so nervous that I made a mess of my test. Even before the driving instructor told me that I had not passed, I had failed myself; I had deliberately allowed myself to be nervous. This was a lesson to me. On my third attempt I went in with a clear mind and told myself that if I failed again, I would pay again and go for it again. The pain of losing money was no longer part of the feeling. On my third attempt I passed.

The reason I am telling you this is self-explanatory. Dear Sachin, you will get your hundred. Go and bat with the attitude “Aaj nahi, kal sahi” (if not today, then tomorrow).

Eagerly looking forward to your 100.

Warm regards
Gautham

Written by Gautham KAMATH

15/11/2011 at 21:39

Always use {} and ()


When SSAS 2005 was released, MDX got a very nice change (at least it looked nice at the time): it was no longer necessary to be strict about curly-bracing {} or bracketing () the MDX syntax. This leniency sounded fun back then, but after years of relying on the intrinsic conversion between members, tuples and sets, it does not seem all that good now. Let me explain.

Consider the following simple query

     SELECT {[Measures].[Internet Sales Amount]} ON 0,
     {[Customer].[Customer Geography].[City].&[Berkshire]&[ENG]
     * [Product].[Product Categories].[Category].Members
     * [Date].[Calendar].[Calendar Year].[CY 2008]} ON 1
     FROM [Adventure Works]

The results are good.

But if we swap two of the hierarchies ([Date] and [Product]) on the ROWS axis, it bangs.

     SELECT {[Measures].[Internet Sales Amount]} ON 0,
     {[Customer].[Customer Geography].[City].&[Berkshire]&[ENG]
     * [Date].[Calendar].[Calendar Year].[CY 2008]
     * [Product].[Product Categories].[Category].Members} ON 1
     FROM [Adventure Works]

The error reads…

Executing the query …
Query (2, 2) The * function expects a tuple set expression for the 1 argument. A string or numeric expression was used.
Execution complete

There is so much internal conversion going on between members, tuples, and sets that the interpretation and construction of MDX statements has become more complex.
The row and column axes require sets. The CrossJoin operator (*) also requires sets, and neither of the above queries properly declares them.
The lesson: always declare your sets with {} and tuples with ().

So if the above query is written as

     SELECT {[Measures].[Internet Sales Amount]} ON 0,
     {[Customer].[Customer Geography].[City].&[Berkshire]&[ENG]}
     * {[Date].[Calendar].[Calendar Year].[CY 2008]}
     * {[Product].[Product Categories].[Category].Members} ON 1
     FROM [Adventure Works]

…everyone is happy

Please note that, by contrast, the WHERE clause is not as lenient as the rest of the query. A single member in the WHERE clause without round brackets is acceptable, as it is implicitly converted to a tuple. Reference more than one member, though, and it must be declared properly: one member from each of several hierarchies needs round brackets () to form a tuple, while multiple members from the same hierarchy need curly braces {} to form a set.
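
To illustrate with three skeleton queries (my own examples, not from above, assuming the usual Adventure Works [Date].[Calendar Year] and [Product].[Color] attribute hierarchies):

     -- fine: a single member is implicitly converted to a tuple
     SELECT {[Measures].[Internet Sales Amount]} ON 0
     FROM [Adventure Works]
     WHERE [Date].[Calendar Year].&[2008]

     -- one member from each of two hierarchies: round brackets () form the tuple
     SELECT {[Measures].[Internet Sales Amount]} ON 0
     FROM [Adventure Works]
     WHERE ([Date].[Calendar Year].&[2008], [Product].[Color].&[Black])

     -- two members of the same hierarchy: curly braces {} form the set
     SELECT {[Measures].[Internet Sales Amount]} ON 0
     FROM [Adventure Works]
     WHERE {[Date].[Calendar Year].&[2007], [Date].[Calendar Year].&[2008]}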

Written by Gautham KAMATH

03/11/2011 at 18:36


CommitTimeout and ForceCommitTimeout in SSAS


There has been some confusion over CommitTimeout and ForceCommitTimeout in the past. Hopefully this will be of some use.

SSAS processing requires a write lock on the cube to commit the processing transaction. To acquire the write lock there must be no read locks present on the cube; read locks are acquired when MDX queries are fired against the cube. The COMMITTIMEOUT property controls how long the commit waits for the read locks to be released. For example, say the commit timeout is 30 seconds: if the read locks are not released within that time, the processing transaction is rolled back.

The FORCECOMMITTIMEOUT property is used to avoid rolling back processing transactions. FORCECOMMITTIMEOUT is how long the pending commit waits for long-running queries (which hold read locks) to complete their work before cancelling them, so that the write lock for processing can be acquired.

[Diagram: a general representation of how locks for processing and querying work.]

To summarize:
COMMITTIMEOUT – cancels processing (after the specified time) if the write lock cannot be obtained.
FORCECOMMITTIMEOUT – cancels long-running queries (after the specified time) so that the write lock for processing can be acquired. When these queries are cancelled the user gets an error, “Server action is cancelled”.

A COMMITTIMEOUT of 0 will never let the processing operation time out.

A COMMITTIMEOUT of 0 with a low FORCECOMMITTIMEOUT means user queries get cancelled more often. A high FORCECOMMITTIMEOUT means user queries (even badly performing ones) get more time to finish their operations, while processing operations get delayed waiting for all the read locks (held by the querying operations) to be released, giving a perception of bad processing performance.

If we set both COMMITTIMEOUT and FORCECOMMITTIMEOUT to zero, the write lock on the cube will wait indefinitely until all the read locks are released. There will be no errors or timeouts, but the system will generally appear to be performing badly. Please note, if users query the cube during the commit phase of processing, their queries get queued up.

If you want processing to take priority, set COMMITTIMEOUT to 0 and FORCECOMMITTIMEOUT to 5000 (the values are in milliseconds, so the default of 30000 is also OK). A FORCECOMMITTIMEOUT of 5000 will force user queries to be cancelled after 5 seconds.

If querying has to take precedence, set COMMITTIMEOUT to 5000, which forces a processing operation to be cancelled after waiting 5 seconds for the write lock. Setting COMMITTIMEOUT and FORCECOMMITTIMEOUT to 0 also gives precedence to the queries, but without any timeouts of processing operations; this will generally give an impression of the server performing badly.
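
Both values are advanced server properties. For reference, a processing-takes-priority setup could be applied with an XMLA Alter command roughly like the following sketch (again only a sketch, with a placeholder server name):

    <Alter AllowCreate="true" ObjectExpansion="ObjectProperties"
           xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
      <Object />
      <ObjectDefinition>
        <Server xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
          <ID>MyServer</ID>
          <Name>MyServer</Name>
          <ServerProperties>
            <!-- never time the commit out... -->
            <ServerProperty>
              <Name>CommitTimeout</Name>
              <Value>0</Value>
            </ServerProperty>
            <!-- ...and cancel blocking queries after 5 seconds -->
            <ServerProperty>
              <Name>ForceCommitTimeout</Name>
              <Value>5000</Value>
            </ServerProperty>
          </ServerProperties>
        </Server>
      </ObjectDefinition>
    </Alter>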

To avoid such unwanted situations, it is best to scale out the Analysis Services operations by separating the processing and querying instances.

Written by Gautham KAMATH

02/11/2011 at 09:42