Gautham KAMATH

BI Developer specializing in SSAS and MDX

Archive for the ‘Analysis Services’ Category

CoordinatorExecutionMode and Process MaxThreads


Just recently I ran a processing performance test against my Analysis Services database and noticed that only a little over 50% of the available system resources were being used. On investigation I found that the processing pool counters used during processing, busy (Processing pool busy threads), idle (Processing pool idle threads) and queue (Processing pool job queue length), were all way below the levels the SSAS defaults allow for.

This did not seem right. On closer investigation I found that, during the processing cycle, small batches of data were being processed (ranging from 1,000 to 100,000 records per batch) and ProcessAdd was being used to process the partition data.

I changed the processing cycle to process all the records using ProcessFull and the data processed significantly faster. Yes, I know ProcessFull should have been used instead of ProcessAdd all along, but our earlier requirements did not allow it, and besides, that is not the topic of discussion today.

As I mentioned, the processing time did come down, but a significant number of jobs were still queuing. This was not surprising, as I was using the SSAS default settings for CoordinatorExecutionMode and Process MaxThreads.

Just for information let me define a couple of Analysis Services server properties and three performance counters that I will be referring to in this post.

Analysis Services server properties

CoordinatorExecutionMode: Processing of dimensions and partitions and querying of partitions are handled by job operations. A job is a unit of execution for a command fired against the SSAS server. The CoordinatorExecutionMode property controls how many of these jobs can run in parallel. It can have the following values:

  • A negative number indicates the absolute maximum number of parallel jobs per core. The default of -4 means 4 × the number of cores; on an 8-core server the absolute maximum number of parallel jobs would be 32.
  • A positive number indicates the absolute maximum number of parallel jobs per server. If you set it to 10, the server will run at most 10 parallel jobs.
  • Zero tells the server to determine the maximum number of parallel jobs itself, based on workload and available system resources.

Process MaxThreads: This server property determines the maximum number of threads available to SSAS during processing (and for accessing the I/O system during querying). It defaults to 64 or 10 times the number of cores, whichever is greater (so 64 on the 4-core test server used here).
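
Both of these are server-level settings; they can be changed from the Analysis Server Properties dialog in SSMS (with advanced properties shown) or directly in msmdsrv.ini. The fragment below is purely illustrative, with element names following the documented property paths (CoordinatorExecutionMode and ThreadPool\Process\MaxThreads), so verify it against your own config file and take a backup before editing:

        <!-- msmdsrv.ini fragment (SSAS 2008 R2): only the relevant elements shown -->
        <ConfigurationSettings>
          <CoordinatorExecutionMode>-4</CoordinatorExecutionMode>  <!-- default: 4 parallel jobs per core -->
          <ThreadPool>
            <Process>
              <MaxThreads>64</MaxThreads>  <!-- default; raised to 128 later in this post -->
            </Process>
          </ThreadPool>
        </ConfigurationSettings>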

Performance Monitor (PerfMon) counters

Processing pool BUSY threads: Indicates the number of threads in use.
Processing pool IDLE threads: Indicates the number of threads that are idle.
Processing pool JOB QUEUE LENGTH: Indicates the number of jobs that are waiting for threads to be available.

Now coming back to the issue, as a test case, I changed the CoordinatorExecutionMode to zero.

Changing the CoordinatorExecutionMode to zero meant letting the SSAS engine decide the degree of parallelism for itself which, in practice, meant it tried to use all the available system resources. Ironically, this slowed processing significantly and increased the processing time by 50%.

This was because all the available threads (in excess of 300) were now either fighting over or queuing up for the same portion of processing work. The system resources were being hammered, and the Process MaxThreads setting, still at its default of 64, was effectively being ignored.

I changed the CoordinatorExecutionMode back to -4 and the system processed as before. But I still saw a significant number of queued jobs, so I increased Process MaxThreads from its default of 64 to 128. Processing time improved only marginally, but the important thing was that the job queue dropped to zero.

I went slightly adventurous and decreased the CoordinatorExecutionMode to -1 (one parallel job per core). The processing pool idle threads reduced significantly and the processing pool busy threads increased, while the processing pool job queue length remained 0. But for my workload, the default CoordinatorExecutionMode of -4 combined with Process MaxThreads of 128 worked best.

To summarize:

  • DO NOT change CoordinatorExecutionMode to 0.
  • Try to keep the job queue length to a minimum. Tuning Process MaxThreads is a good idea, but do not be over-generous as it may backfire. Note that CoordinatorExecutionMode and Process MaxThreads complement each other; if you change CoordinatorExecutionMode, make sure you have enough threads to go with it.
  • CPU is underutilized when
    • Processing pool job queue length is greater than 0. This means that SSAS wants to process more but does not have available threads.
    • Processing pool idle threads = 0. This means there is potential to use more threads, so increase Process MaxThreads. Increase it gradually and re-check; ideally the job queue length should stay at 0, and a gradual increase will give you a clear indication of the server's capacity.
    • Processing pool idle threads greater than 0. Increase the CoordinatorExecutionMode from the default of -4 to -5 or -6 (yes, this is an increase, not a decrease; the negative sign simply indicates a per-core setting, whereas a positive value is a per-server setting). Idle threads greater than 0 does not mean the server is short of threads; it means it is short of jobs.

Please note this test was carried out against:

  • SQL Server Analysis Services (SSAS) 2008R2 64 Bit
  • Windows Server 2008R2 64Bit
  • 10GB RAM
  • 4 Cores

This behaviour improves in Analysis Services 2012 (Multidimensional model) with changes to the way the Multidimensional engine uses Non-Uniform Memory Access (NUMA).


Processing using Snapshot Isolation in Analysis Services (SSAS)


Just a few months ago I was hit with a dimension processing failure. The error read:

Errors in the OLAP storage engine: The attribute key cannot be found when processing: Table: ‘<<TableName>>’, Column: ‘<<ColumnName>>’, Value: ‘XXXXXXX’. The attribute is ‘<<AttributeName>>’.

On investigation I found that constant updates to the dimension table were the main cause. One of my dimensions had as many as 15 attributes, and processing failed whenever an update occurred during dimension processing. In essence, the data returned by the first attribute's SELECT DISTINCT statement would no longer be consistent with the data returned by the later attributes' SELECT DISTINCT statements because of the updates arriving in between.

One way to get around this was to change the ProcessingGroup property of the dimension to ByTable (instead of the default ByAttribute). Because the ByTable option takes the entire table data for that dimension into memory in a single query, all attributes would be processed from the same data without having to worry about the constant updates; the new updates would then get incorporated in the next processing cycle.

This option did not work for me because ByTable only behaves that way when the dimension draws its data from a single table. With multiple tables in a star or snowflake schema, or with an RDBMS view that encapsulates multiple tables, SSAS will still issue multiple queries against the data source.

P/S: Try processing the Product and Geography dimensions using the ByAttribute and ByTable options. The profiler trace for the Product dimension in ByTable mode will show multiple SELECT DISTINCT statements being fired, whereas the Geography dimension, because of its single-table data source, fires only one SELECT DISTINCT statement.

As I am using RDBMS views that join multiple tables, this option also went to the pits for me. The only feasible solution was to use Snapshot Isolation.

Snapshot Isolation

Snapshot Isolation works on SQL Server's row versioning framework.

Row versioning-based isolation levels reduce the number of locks acquired by transactions by eliminating the use of shared locks on read operations.

For the Analysis Services database to process under Snapshot Isolation, the source RDBMS database needs the READ_COMMITTED_SNAPSHOT and ALLOW_SNAPSHOT_ISOLATION options to be ON.

  • When READ_COMMITTED_SNAPSHOT is ON, READ COMMITTED transactions provide statement-level read consistency using row versioning.
  • When ALLOW_SNAPSHOT_ISOLATION is ON, SNAPSHOT transactions provide transaction-level read consistency using row versioning.

The following SQL commands enable these options on the source RDBMS database.

        USE <<DatabaseName>>
        GO
        ALTER DATABASE <<DatabaseName>>
        SET READ_COMMITTED_SNAPSHOT ON
        GO
        ALTER DATABASE <<DatabaseName>>
        SET ALLOW_SNAPSHOT_ISOLATION ON
        GO
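
Once these have run, it is worth confirming that both options really are ON; the READ_COMMITTED_SNAPSHOT change cannot complete while other connections are active in the database, so it can easily end up sitting blocked. A quick check against sys.databases:

        -- Verify the snapshot-related settings on the source database
        SELECT name,
               snapshot_isolation_state_desc,    -- expect ON
               is_read_committed_snapshot_on     -- expect 1
        FROM   sys.databases
        WHERE  name = '<<DatabaseName>>';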

Enabling these options causes all data modifications in the database to be versioned: before a row is updated (or deleted), a copy of it is stored in the version store in tempdb, and reads made under Snapshot Isolation are served from those versioned rows. Once the dimension processing transaction (running under Snapshot Isolation) finishes, tempdb frees the space occupied by the versioned rows.

Please note that tempdb must have enough space to hold the row-versioned data generated for the source RDBMS database.
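
If you want to keep an eye on how much of tempdb the version store is actually consuming during processing, a query along these lines does the job (sys.dm_db_file_space_usage reports 8 KB pages):

        -- Approximate size of the row version store in tempdb
        SELECT SUM(version_store_reserved_page_count) * 8 / 1024 AS version_store_mb
        FROM   tempdb.sys.dm_db_file_space_usage;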

Another important thing is to enable the Multiple Active Result Sets (MARS) connection property in the DataSource in the Analysis Services database.

Not enabling MARS will give you the following error

OLE DB error: OLE DB or ODBC error: Cannot create new connection because in manual or distributed transaction mode.

Since enabling Snapshot Isolation for the data source of my Analysis Services database, dimension processing has not had a single error caused by constant updates at the source, and the tempdb size has stayed well within acceptable limits.

Written by Gautham KAMATH

24/02/2012 at 16:34

Examining MDX Query performance using Block Computation


In 2007 I read a very interesting blog post by Mosha outlining how to optimize a Count(Filter(…)) expression in MDX. Since then I have been particularly careful when writing MDX queries, and I make sure to run them past MDX Studio to check them. But quite frequently I come across queries on forums and blogs that take the machine and its resources for a ride, with the authors still asking why their queries are not faster when they have plenty of resources (disk space, RAM, paging, processors, etc.). Sometimes it is quite simple to change the query, but on many occasions it is simply not possible to change the way the query must return its data.

Let's first get a sense of how Analysis Services processes MDX queries. There are two major parts of the Analysis Services engine.

Formula Engine: The Formula Engine (FE) parses the incoming MDX query and figures out what data is required to satisfy it. This information is then forwarded to the Storage Engine.

Storage Engine: Apart from reading and writing data, the Storage Engine (SE) retrieves and aggregates the data that the Formula Engine requests.

In a nutshell, when an MDX query is executed it is passed to the Formula Engine, where it is parsed, and the resulting data requests are passed to the Storage Engine. The raw data comes back from the Storage Engine (with the necessary aggregations) and the Formula Engine performs any calculations required on that data before returning the results.

The Formula Engine uses a mechanism called "SONAR" to create subcube requests to the Storage Engine (in MDX Studio these appear as Sonar subcubes). The Storage Engine answers them with subcube queries (the Query Subcube event class in a profiler trace; SE queries in MDX Studio), using the aggregation data cache or the aggregation data registry to retrieve the aggregated data. The data then comes back to the Formula Engine, where the necessary calculations are applied before the results are returned.

Now let me take a simple query that returns, by Date, the number of products sold where Product Line = Accessory and Product Color = Black.

    WITH MEMBER [Measures].[Count Of Products] AS
        IIF({NonEmpty(
                    Exists([Product].[Product].[Product].Members,
                        ([Product].[Product Line].[Accessory], [Product].[Color].&[Black]))
                    , [Measures].[Internet Sales Amount]) AS [ProductSet]}.Count = 0
            , NULL, [ProductSet].Count)
    SELECT {[Measures].[Count Of Products]} ON 0,
    NON EMPTY [Date].[Date].[Date].Members ON 1
    FROM [Adventure Works]

The performance counters reported by MDX Studio show the following information.

On cold cache…..

        Time              : 1 sec 453 ms
        Calc covers       : 0
        Cells calculated  : 427
        Sonar subcubes    : 2393
        NON EMPTYs        : 1189
        Autoexists        : 1194
        EXISTINGs         : 0
        SE queries        : 1198
        Flat cache insert : 0
        Cache hits        : 2378
        Cache misses      : 27
        Cache inserts     : 11
        Cache lookups     : 2405
        Memory Usage KB   : 160

On warm cache…..

	Time              : 1 sec 78 ms
	Calc covers       : 0
	Cells calculated  : 390
	Sonar subcubes    : 2377
	NON EMPTYs        : 1189
	Autoexists        : 1188
	EXISTINGs         : 0
	SE queries        : 1188
	Flat cache insert : 0
	Cache hits        : 2379
	Cache misses      : 0
	Cache inserts     : 0
	Cache lookups     : 2379
	Memory Usage KB   : 4

As you can see, there is no significant improvement in query performance even on a warm cache: the number of Sonar subcubes and subcube queries (SE queries) fired does not change. The explanation is that having NonEmpty or Exists inside an MDX calculation is not a good idea. The calculation is evaluated cell by cell, so even though the set inside the Exists does not change (the context remains fixed), a Formula Engine request (SONAR) and an SE query are issued for every cell; almost all of them are answered from cache, but they are issued nonetheless. Now let's change the query and derive the calculated member using the SUM function.

    WITH MEMBER [Measures].[Count Of Products] AS
        SUM([Product].[Product].[Product].Members,
            IIF(([Product].[Product Line].[Accessory],
                [Product].[Color].&[Black],
                [Measures].[Internet Sales Amount]) > 0, 1, NULL)
            )
    SELECT {[Measures].[Count Of Products]} ON 0,
    NON EMPTY [Date].[Date].[Date].Members ON 1
    FROM [Adventure Works]

The Performance Monitor counters show the following information.

On cold cache…..

	Time              : 875 ms
	Calc covers       : 0
	Cells calculated  : 427
	Sonar subcubes    : 14
	NON EMPTYs        : 1
	Autoexists        : 6
	EXISTINGs         : 0
	SE queries        : 10
	Flat cache insert : 0
	Cache hits        : 4
	Cache misses      : 29
	Cache inserts     : 12
	Cache lookups     : 33
	Memory Usage KB   : 848

As you can see, there is a significant reduction in the number of Formula Engine requests (Sonar subcubes) and subcube queries (SE queries), even though the number of cells calculated remains the same. This is because the query is now executing in block-computation mode rather than cell-by-cell mode.

Block computation allows the engine to evaluate an expression once for a whole block (subcube) of cells instead of once per cell. Please note that this does not work in all situations, but wherever possible it should be used.

Now if you look at the same query in the warm-cache scenario, there is only one Sonar subcube request and one SE query, and cache hits fill in the rest.

On warm cache…..

	Time              : 578 ms
	Calc covers       : 0
	Cells calculated  : 390
	Sonar subcubes    : 1
	NON EMPTYs        : 1
	Autoexists        : 0
	EXISTINGs         : 0
	SE queries        : 1
	Flat cache insert : 0
	Cache hits        : 7
	Cache misses      : 0
	Cache inserts     : 0
	Cache lookups     : 7
	Memory Usage KB   : 0

EXISTING and Block-Computation mode

EXISTING is an operator that is also best avoided, but certain situations may force you to use it. Block computation is still possible with EXISTING and gives the query a performance boost. Consider the following query:

    WITH MEMBER [Measures].[Count Of Products] AS
        COUNT(EXISTING NONEMPTY([Product].[Product].[Product].Members
                                , [Measures].[Internet Sales Amount]))
    SELECT { [Measures].[Count Of Products], [Measures].[Internet Sales Amount]} ON 0,
    NONEMPTY(
        {[Product].[Product Line].&[S]}
        * {[Product].[Color].&[Black]}
        * {[Date].[Date].[Date].Members}
    ,[Measures].[Internet Sales Amount]) ON 1
    FROM [Adventure Works]

The performance counters reported by MDX Studio show the following information.

	Time              : 1 sec 328 ms
	Calc covers       : 0
	Cells calculated  : 817
	Sonar subcubes    : 798
	NON EMPTYs        : 391
	Autoexists        : 6
	EXISTINGs         : 390
	SE queries        : 403
	Flat cache insert : 390
	Cache hits        : 422
	Cache misses      : 746
	Cache inserts     : 370
	Cache lookups     : 1168
	Memory Usage KB   : 80

A warm cache does not improve these numbers much. Now let's change the query to use block-computation mode.

    WITH MEMBER [Measures].[Count Of Products] AS
        SUM(EXISTING [Product].[Product].[Product].Members ,
            IIF([Measures].[Internet Sales Amount] > 0, 1, NULL))
    SELECT { [Measures].[Count Of Products], [Measures].[Internet Sales Amount]} ON 0,
    NONEMPTY(
        {[Product].[Product Line].&[S]}
        * {[Product].[Color].&[Black]}
        * {[Date].[Date].[Date].Members}
    ,[Measures].[Internet Sales Amount]) ON 1
    FROM [Adventure Works]

On cold cache the following are the perfmon numbers from MDX Studio.

	Time              : 421 ms
	Calc covers       : 0
	Cells calculated  : 817
	Sonar subcubes    : 19
	NON EMPTYs        : 1
	Autoexists        : 6
	EXISTINGs         : 2
	SE queries        : 13
	Flat cache insert : 0
	Cache hits        : 8
	Cache misses      : 32
	Cache inserts     : 12
	Cache lookups     : 40
	Memory Usage KB   : 80

Notice how the EXISTING count has dropped from 390 to 2. On a warm cache, even though the EXISTING count remains the same, most of the query data is drawn from the cache.

	Time              : 62 ms
	Calc covers       : 0
	Cells calculated  : 780
	Sonar subcubes    : 5
	NON EMPTYs        : 1
	Autoexists        : 0
	EXISTINGs         : 2
	SE queries        : 2
	Flat cache insert : 0
	Cache hits        : 10
	Cache misses      : 2
	Cache inserts     : 0
	Cache lookups     : 12
	Memory Usage KB   : 0

To summarize, it is important to understand how the Analysis Services engine evaluates MDX queries, and even more important to find a way to use block computation wherever possible.

Written by Gautham KAMATH

08/02/2012 at 10:07

Too many Attributes in the SSAS dimension can spoil the MDX query


All of us have heard about the importance of having correct and appropriate attribute relationships and their impact on MDX query performance. There are, however, a few instances where having too many attributes and attribute relationships can backfire. One such instance involves explicit and implicit conversions (overwrites). Let me explain.

Attribute relationships:

Attribute relationships create dependencies between attributes. In other words, if City has a RELATED attribute State, then City has a many to one relation with State and conversely State has a one to many relation with City. More specifically, if the current city is Seattle, then we know the State must be Washington.

Related and Relating relationship:

If State is RELATED to City, then City is said to be RELATING to State.

Because the RELATING attribute has a many-to-one relation with the RELATED attribute, the coordinate of the RELATING attribute (City in this case) moves to the [All] member when it is overwritten. For the RELATED attribute, which has a one-to-many relation with the RELATING attribute, the coordinate moves to one specific member.

Explicit and Implicit Overwrite:

An explicit overwrite takes place when an attribute is referenced directly in an expression; its related and relating attributes then undergo an implicit overwrite. Consider the following query:

     WITH MEMBER [Measures].[x] AS ([Measures].[Internet Sales Amount]),
                                    FORMAT_STRING = "currency"
     SELECT {[Measures].[x], [Measures].[Internet Sales Amount]} on 0
     FROM [Adventure Works]
     WHERE [Customer].[Customer Geography].[City].&[Seattle]&[WA]

It is a simple MDX query that returns the Internet Sales Amount for the City of Seattle in Washington.

x                    Internet Sales Amount
$75,164.86           $75,164.86

Now let's play with it and add [Customer].[State-Province].CurrentMember to the calculated member expression.

     WITH MEMBER [Measures].[x] AS ([Measures].[Internet Sales Amount],
                                    [Customer].[State-Province].CurrentMember),
                                    FORMAT_STRING = "currency"
     SELECT {[Measures].[x], [Measures].[Internet Sales Amount]} on 0
     FROM [Adventure Works]
     WHERE [Customer].[Customer Geography].[City].&[Seattle]&[WA]

The query returns the following results

x                    Internet Sales Amount
$2,467,248.34        $75,164.86

As we know, the WHERE clause reduces the dimensionality of cube space and hence changes the cube context for the query. What has happened here is:

  • [City].[Seattle] is current context (WHERE Clause reduces the dimensionality of cube space and performs an implicit EXISTS with the AXES).
  • Explicit conversion of [State-Province].CURRENTMEMBER to [State-Province].[Washington] (one member from State)
  • Because [State-Province] is related to [City], [State-Province].[Washington] remains as it is (one member from State).
  • Now because [City] is relating to [State-Province], [City].[Seattle] gets implicitly converted to [City].[ALL Customers]
  • Hence the tuple expression for [Measures].[x] becomes ([State-Province].[Washington], [City].[ALL Customers], [Measures].[Internet Sales Amount])

Out of curiosity, let us see what happens when we use a subselect instead of the WHERE clause.

     WITH MEMBER [Measures].[x] AS ([Measures].[Internet Sales Amount],
                                    [Customer].[State-Province].CurrentMember),
                                    FORMAT_STRING = "currency"
     SELECT {[Measures].[x], [Measures].[Internet Sales Amount]} on 0
     FROM  (SELECT [Customer].[Customer Geography].[City].&[Seattle]&[WA] ON 0
           FROM [Adventure Works]) 

x                    Internet Sales Amount
$29,358,677.22       $75,164.86

Subselects do not reduce the dimensionality of cube space and therefore do not change the cube context in which MDX expressions are evaluated. In this case there is no implicit EXISTS between the subselect and the axes, so [State-Province].CurrentMember evaluates to the All member, [State-Province].[All Customers]. The tuple for [Measures].[x] in this query therefore becomes ([State-Province].[All Customers], [Measures].[Internet Sales Amount]).

So far so good; this is by design and how it should behave. Now let us look at the same scenario with a different dimension ([Product]) and different members. Consider the following query:

     WITH MEMBER [Measures].[x] AS ([Measures].[Internet Sales Amount]),
                                    FORMAT_STRING = "currency"
     SELECT {[Measures].[x], [Measures].[Internet Sales Amount]} on 0
     FROM [Adventure Works]
     WHERE [Product].[Product Categories].[Product].[Mountain-100 Silver, 38]

Again a plain and simple MDX query that returns the Internet Sales Amount for the Product “Mountain-100 Silver, 38”.

x                    Internet Sales Amount
$197,199.42          $197,199.42

If we tweak the expression

     WITH MEMBER [Measures].[x] AS ([Measures].[Internet Sales Amount],
                                    [Product].[Subcategory].CurrentMember),
                                    FORMAT_STRING = "currency"
     SELECT {[Measures].[x], [Measures].[Internet Sales Amount]} on 0
     FROM [Adventure Works]
     WHERE [Product].[Product Categories].[Product].[Mountain-100 Silver, 38]

We get the same result as the previous expression.

x                    Internet Sales Amount
$197,199.42          $197,199.42

The result does not seem correct: we were expecting $9,952,759.56 (the total for the Mountain Bikes subcategory) but we are getting $197,199.42. It is supposed to be doing the same thing as before:

  • [Product].[Mountain-100 Silver, 38] is current context (WHERE Clause performs an implicit EXISTS with the AXES).
  • Explicit conversion of [Subcategory].CURRENTMEMBER to [Subcategory].[Mountain Bikes] (one member from Subcategory)
  • Because [Subcategory] is related to [Product], [Subcategory].[Mountain Bikes] remains as it is (one member from Subcategory).
  • Also because [Product] is relating to [Subcategory], [Product].[Mountain-100 Silver, 38] gets implicitly converted to [Product].[ALL Products]
  • Hence the tuple for [Measures].[x] becomes ([Subcategory].[Mountain Bikes], [Product].[ALL Products]).

Yet the result is not what we expected. Why? Because there are TOO MANY ATTRIBUTES in the [Product] dimension, and most of them are RELATED to the Product (leaf) level. Filtering to a specific product fixes the current members of all those related attributes through auto-exists, and they stay fixed in the calculation. The following is the subcube information from the profiler when the query is fired.

00000000, 000, 000, 010111111111111111111, 000000000000000000000, 00, 100, 000000000000000000000, 000000000000000000000, 000000000000000000000, 00

It equates to the following.

     Dimension:   Product (21)
          Attribute:    Standard Cost
          Attribute:    Color
          Attribute:    Safety Stock Level
          Attribute:    Reorder Point
          Attribute:    List Price
          Attribute:    Size
          Attribute:    Size Range
          Attribute:    Weight
          Attribute:    Days to Manufacture
          Attribute:    Dealer Price
          Attribute:    Class
          Attribute:    Style
          Attribute:    Model Name
          Attribute:    Product Line
          Attribute:    Subcategory
          Attribute:    Status
          Attribute:    Start Date
          Attribute:    End Date
          Attribute:    Large Photo
     Dimension:   Destination Currency (3)
          Attribute:    Destination Currency

There are 21 attributes in the Product dimension, of which 15 are related to the Product (leaf) level. Because of the attribute relationships, slicing to a specific product also fixes the coordinates of those 15 related attributes, and overwriting [Subcategory] does not release them; only [Product] itself moves to its All member, so the context effectively stays pinned to "Mountain-100 Silver, 38". In the query featuring City and State there was only one related attribute (State) at the City level, and hence the issue did not arise there.

ROOT function:

The desired behaviour can be recovered by applying the ROOT function. As defined on MSDN, the ROOT function returns a tuple consisting of the All members from each attribute hierarchy within the current scope in a cube, dimension, or tuple. So, modifying the above query as follows,

     WITH MEMBER [Measures].[x] AS ([Measures].[Internet Sales Amount],
                                    ROOT([Product].[Subcategory].CurrentMember)),
                                    FORMAT_STRING = "currency"
     SELECT {[Measures].[x], [Measures].[Internet Sales Amount]} on 0
     FROM [Adventure Works]
     WHERE [Product].[Product Categories].[Product].[Mountain-100 Silver, 38]

we get the desired results

x                    Internet Sales Amount
$9,952,759.56        $197,199.42

The profiler subcube trace reads the following

00000000, 000, 000, 000000000000000010000, 000000000000000000000, 00, 100, 000000000000000000000, 000000000000000000000, 000000000000000000000, 00

and it equates to

     Dimension:   Product (21)
          Attribute:    Subcategory
     Dimension:   Destination Currency (3)
          Attribute:    Destination Currency

P/S: To decode the profiler subcube trace I used a tool created by James Snape.

To summarize:

  • Have the correct attribute relationships.
  • Be particularly careful with CurrentMember and know where the Explicit and Implicit overwrite takes place.
  • As far as possible, keep the number of attributes to a minimum.

Written by Gautham KAMATH

04/12/2011 at 17:27

Populate “All Caption” property of Analysis Services 2000 Dimension using DSO


My Analysis Services 2000 post was brought up just yesterday. In that post I had proposed a solution for populating the "All Caption" property of an Analysis Services 2000 dimension using DSO and C#, as I could not find an answer in Books Online (BOL).

The following is that solution:

Using just the MemberKeyColumn property solves the problem.

        DSO.Level allLevelDepartment = (DSO.Level)dsoDimDepartment.Levels.AddNew("(All)", DSO.SubClassTypes.sbclsRegular);
        allLevelDepartment.LevelType = DSO.LevelTypes.levAll;
        allLevelDepartment.EstimatedSize = 1;
        allLevelDepartment.MemberKeyColumn = "All Department";

This C# snippet is enough to populate the "All Caption" property of an Analysis Services 2000 dimension using DSO.

Written by Gautham KAMATH

15/11/2011 at 21:56

CommitTimeout and ForceCommitTimeout in SSAS


There has been some confusion over CommitTimeout and ForceCommitTimeout  in the past. Hopefully this will be of some use.

SSAS processing requires a write lock on the cube to commit the processing transaction. To acquire the write lock there must be no read locks on the cube, and read locks are acquired whenever an MDX query is fired against the cube. The COMMITTIMEOUT property controls how long the commit waits for the read locks to be released; for example, if the commit timeout is 30 seconds and the read locks are not released within that time, the processing transaction is rolled back.

The FORCECOMMITTIMEOUT property is used to avoid rolling back processing transactions. It specifies how long the pending commit will wait for long-running queries (which hold the read locks) to complete before cancelling them so that the write lock for processing can be acquired.

In general the locks interact like this: a query's read locks block the write lock needed by the processing commit, and once the write lock is pending, new queries queue up behind it until the commit completes or times out.
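
If you want to see this happening on your own server, the DISCOVER_LOCKS schema rowset can be queried as a DMV from an MDX query window (SSAS 2008 and later). It lists the current locks with their type and status, so a write lock waiting behind query read locks is easy to spot:

        -- Current locks held or requested on the SSAS instance
        SELECT * FROM $SYSTEM.DISCOVER_LOCKS;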

To summarize:
COMMITTIMEOUT – Cancels processing (after the specified time) if the write lock cannot be obtained.
FORCECOMMITTIMEOUT – Cancels long-running queries (after the specified time) so that the write lock for processing can be acquired. When these queries are cancelled the user gets an error along the lines of “Server action is cancelled”.

A COMMITTIMEOUT of 0 will never let the processing commit time out; it waits indefinitely for the read locks to be released.

A COMMITTIMEOUT of 0 combined with a low FORCECOMMITTIMEOUT means user queries get cancelled more often. A high FORCECOMMITTIMEOUT means user queries (even badly performing ones) get more time to finish, and processing commits get delayed because they have to wait until all the read locks held by those queries are released, giving a perception of bad processing performance.

If we set both COMMITTIMEOUT and FORCECOMMITTIMEOUT to zero, the write lock on the cube will wait indefinitely until all the read locks are released. There will be no errors or timeouts, but the system will generally appear to perform badly. Please note that if users query the cube during the commit phase of processing, their queries get queued up.

If you want processing to take priority, set COMMITTIMEOUT to 0 and FORCECOMMITTIMEOUT to 5000 (the default of 30000, i.e. 30 seconds, is also OK). Both values are in milliseconds, so a FORCECOMMITTIMEOUT of 5000 forces user queries to be cancelled after 5 seconds.
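
Both properties are server-wide and live in msmdsrv.ini (they can also be set from the Analysis Server Properties dialog in SSMS). As an illustration only, with element names taken from the documented property names, the "processing takes priority" combination above would look like this; check it against your own config file before changing anything:

        <!-- msmdsrv.ini fragment: processing takes priority (values in milliseconds) -->
        <CommitTimeout>0</CommitTimeout>                 <!-- wait indefinitely for the write lock -->
        <ForceCommitTimeout>5000</ForceCommitTimeout>    <!-- cancel blocking queries after 5 seconds -->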

If querying has to take precedence, set COMMITTIMEOUT to 5000, which forces the processing commit to be rolled back after waiting 5 seconds for the write lock. Setting both COMMITTIMEOUT and FORCECOMMITTIMEOUT to 0 also gives precedence to queries, but without any processing timeouts, which again gives the impression of a badly performing server.

To avoid such unwanted situations, it is best to scale out Analysis Services by separating the processing and querying operations onto different servers.

Written by Gautham KAMATH

02/11/2011 at 09:42