SQL2k Analysis Services Notes

Home Up

Contents

  1. Analysis Manager

  2. Calculated Cells

  3. Calculated Measures

  4. Cubes

  5. Cube Editor Properties

  6. Custom Member Formulae

  7. Custom Roll-Ups

  8. Data Sources

  9. Derived Measures

  10. Dimension Tables

  11. Dimensions

  12. Dimension Editor Properties.

  13. Drillthrough

  14. Fundamentals

  15. Local Cubes

  16. MDX

  17. Member Properties

  18. Metadata Repository

  19. MSDN

  20. PivotTable List (OWC)

  21. PivotTable Service

  22. Processing (Updates)

  23. Query Performance

  24. Security

  25. Star/Snowflake Schema

  26. Virtual Cubes

  27. Writeback

 

Analysis Manager

1.      Analysis Manager is the administrator tool used to communicate with the Analysis Server
2.
      Analysis Manager communicates with AS via a layer called DSO (Decision Support Objects)
        a. SQL Distributed Management Objects (SQL-DMO) is the object model for SQL Server
        b. ActiveX Data Objects Multidimensional (ADO MD) is the object model for PivotTable Services

 

Calculated Cells

 

1.      Apply formulas to cells or sets of cells to perform simple or complex calculations.

2.      Financial models  usually require calculated cells

 

Calculated Members/Measures

1.      A calculated measure is calculated after aggregations are created
2.
      A calculated measure is not stored in the cube
3.
      If a calculation contains only addition or subtraction it is possible to use a derived measure or a calculated measure;
      or if the calculation involves multiplying by a constant, then
       
a.  If used frequently then use a derived measure
        b. If infrequently use a calculated measure
4.
      If a calculation involves multiplying or dividing one column by another the choice between a calculated measure & a derived measure is critical to getting the correct values. However, if aggregation is SUM, rather than Max or Min, then you will usually use a calculated measure
5.
      VBA functions are available for calculated members

 

Cubes

1.      1 to 64 dimensions
2.
      Data stored in proprietary structure
3.
      F2 to change name in cube wizard

 

Cube Editor Properties

1.      Display Format Property – doesn’t change the way the values are stored, just the way they are displayed
2.
      Level Naming Property
        a.
      Non-leaf data requires parent-child dimension
        b.
      Use the “Level Naming” option on the advanced tab for naming the levels in a parent-child dimension
3.
      Aggregate Function Property – Sum, Min, Max, Count, and Distinct Count

 

Custom Member Formulas

      1.      Use the LookupCube function to retrieve a value from another cube
            e.g. LookupCube("Sales","([Sales Dollars].[All Product],"+[Time].[Calendar].CurrentMember.UniqueName + ")")
      2.
      The LookupCube function is useful when building multi-cube business models
     
3.      It can be slow though!

 

Custom Roll-Ups

1.      When using parent-child dimensions it may be that you don't want some members to be summed to their parent
2.   This is the case in finance where some ‘accounts’ sum to a parent, some subtract from a parent & others don't get summed or subtracted, that is they have no impact on the parent value
3.
      Each member needs its own aggregation rule: custom rollup operators +, - or ~

 

Data Sources

 An Analysis Services database can contain multiple data sources, but a single cube will only use a single data source

 

Derived Measures

1.      See Calculated Measures, above, for when to use derived measures
2.   A derived measure is calculated before aggregations are created
3.
      Derived measures are stored in the cube file
4.   To create a derived measure amend the source column property of a measure
      e.g. “SalesFact”.”Sales_Sterling”*0.175 & then rename the measure to e.g.“VAT”
5.
      All columns in a derived measure must come from the fact table
6.   The only way to create a derived measure that includes columns from multiple tables is to create a view in the relational data source & use that view as the fact table
 

 

Dimension Tables

1.      Unique key = primary key
2.      Indexing can reduce cube reprocessing time
3.      Foreign key in the fact table

 

Dimensions

1.      Balanced – If a member has children then so must all its sibling members. It is a standard dimension e.g. time
2.
      Unbalanced – A member may have children, but its sibling members need not, e.g. organisation chart
3.
      Ragged e.g. geography: misses levels in some cases
        a.
      In a ragged standard dimension, a parent is hidden & extra space is removed
                i. Use Hide Member If property on the advanced tab of Dimension Editor
        b.
      In a ragged parent-child dimension, extra space is added in the place of one or more parents
                i. Use Skipped Levels Column property on the advanced tab of Dimension Editor, which requires an additional column in the dimension table specifying number of rows to skip
4.
      Parent-child dimensions allows unbalanced dimensions to be created
        a.
      A member with children can be a sibling of a member without children
        b.
      Non-leaf data is permitted; parents have data entries as well as their children
5.
      Multiple hierarchies – alternate ways of grouping members
6.
      Member Properties – further information about members
7.
      Private & shared dimensions
        a.  Private belong to just one cube
        b.  It is usually best to stick with shared dimensions
8.
      Time dimension
        a.  Can be created directly off fact table
        b.  However, if fact table is large it is better to have an integer key & time dimension table
9.
      The top level of a dimension must always have unique key values

 

Dimension Editor Properties

  1. Member Keys/Names
    1. The member key identifies each individual member
    2. The member name is what the end-user sees
    3. Member keys & names can be the same
    4. Member key & member name column properties can consist of more than a single column name
    5. Any SQL expression that the relational data source understands can be used, provided it returns a value or a string
    6. e.g. Member name column property – “Product”.”Brand”+’ ‘+”Product”.”Product_Name”
    7. Also, DatePart(‘m’,”TimeMonth”.”Month”)
    8. Or, Format(“TimeMonth”.”Month”,’mmmm’)
    9. Or, DatePart(‘yyyy’,DateAdd(‘m’,-2,”TimeMonth”.”Month”))
    10. Or, Right(“DateTable”.”DateString”,4)
  2. Grouping Property
    1. Automatic grouping for very large dimensions
    2. Uses the square root of the number of members
    3. Can be controlled by creating an expression in Member Key/Name Column, e.g. Left(“Visits”.”Visitor_ID”,3) groups the visitors by the first three digits.
  3. Order By Property – Name or Key
  4. Skipped Levels Option – For ragged dimensions
  5. Type Property – Dimension (& Level) type property: has effect on certain MDX functions & client applications
  6. Write-enable dimensions (Enterprise edition only)

 

Drillthrough

    1.  It is possible to build a cube without taking all the individual values in the fact table into the cube
    2.  Then using drillthrough see individual rows from a fact table that went into a specific value in a cube
    3.  Analysis Services cubes can be big & still perform well, so you can usually build a cube based on all the values in the fact table

 

Fundamentals

1.      Data is aggregate through SUM, COUNT, MIN, or MAX
        a.
      Ratios are usually calculated after aggregation of composite values; use calculated measures
        b.
      Some dimension hierarchies don’t sum, e.g. accounts dimension; use custom roll-ups
2.
      Create dimensions from dimension tables not fact tables 
3.   
Use surrogate keys
       
a.  Instead of a column entry such as "England", the fact table should have a small integer or integer data type, such as "44"
        b.  This means more ETL work when creating/updating the fact tables, but makes cube processing quicker & dealing with slowly changing dimensions easier

 

Local Cubes

1.     Usually smaller than server based cubes
2.     They don’t have:
        a.      Aggregations,
        b.      Member properties,
        c.      Shared dims,
        d.      Parent-Child dims,
        e.      Virtual dims,
        f.       Virtual cubes,
        g.      Partitions or
        h.      Writeback tables.
3.     
Two approaches to creating local cubes:
        a.      Use Excel Pivot Tables to create a local cube which is a subset of the server-based cube
                i.      Choose Client-Server Settings on the PivotTable menu
                ii.     Create Local Data File
                iii.    Create Cube File wizard
        b.      Use Microsoft Query to define a local cube based on a relational data source
                i.      Even without an AS server it is possible to create a local cube by creating one directly from a relational data source using PTS in Microsoft Office
                ii.      Three options:
                        1.      A .cub file
                        2.      ROLAP cube that only exists as long as the client application remains connected to the cube (completely in PTS)
                        3.      MOLAP cube with no aggregations completely in PTS

 

MDX

1.      MDX is part of the OLE DB for OLAP Specification; it is the SQL of the OLAP world
2.
      MDX Expressions
        a.
      MDX expression = single value
        b.
      MDX = value + value;  where value = constant, tuple or another MDX expression
        c.
      MDX can return either a number or a string
        d.
      “//” for adding comments
3.
      MDX Functions
        a.
      Aggregate –
                i.
      Returns a calculated value using the appropriate aggregate function, based on the aggregation type of the member.
       
ii.
  In the following expression, the calculated member Total is displayed first against the measure SumSales and then against the measure MaxSales. In the former case, Total is calculated by adding (with Sum). In the latter case, Total is calculated by taking the maximum.
WITH MEMBER Geography.Total AS  ‘AGGREGATE({USA, France})’
SELECT {Measures.SumSales, Measures.MaxSales} ON COLUMNS,
{USA, France, Total} ON ROWS
FROM SalesCube
WHERE ([1998])
     
iii.      In general, when creating an calculated member on a nonmeasure dimension use the aggregate function rather than the sum function
        b.
      CDbl’ (VBA function) converts a string into a number
        c.
      Crossjoin function – to combine 2 sets from two dims into a single set
        d.
      CurrentMember
        e.
      IIf(«Logical Expression», «Numeric Expression1», «Numeric Expression2») or  IIf(«Logical Expression», «String Expression1», «String Expression2»)
        f.
        LookupCube function
                i.
      e.g. LookupCube("YOAMeasures", "([Policies],"+[Cover].CurrentMember.uniquename+","+[Product].CurrentMember.uniquename+","+[Version].CurrentMember.uniquename+","+[YOA].CurrentMember.uniquename+")")*LookupCube("ArrivalCurveMeasures", "([ACPolicies],"+[Cover].CurrentMember.uniquename+","+[Product].CurrentMember.uniquename+","+[Version].CurrentMember.uniquename+","+[YOA].CurrentMember.uniquename+","+[WorkedMonth].CurrentMember.uniquename+")")
               
 ii.      IIF(ISEMPTY(LookupCube(...))<>NULL,X/LookupCube(...),DoNothing) – Is this possible?
        g.
      OR & AND, & XOR & NOT can be used in logical expressions
        h.
      NextMember, PrevMember, Lag functions
        i.
        PeriodsToDate function
        j.
        Union, Intersect & Except functions
4.
      MDX Query = rpt with many values
        a.
      Office Pivot Table list & Excel Pivot Table reports write the MDX queries
        b.
      Report – row axis, column axis, filter axis
        c.
      An axis can include labels from >1 dim
        d.
      Aside from the cube name in the FROM clause, a member is the only object from a cube that appears directly in an MDX query
        e.
      Query files – can contain >1  MDX queries
5.
      Members, Tuples, Sets
        a.
      A set includes one or more tuples & a tuple includes one or more members
        b.
      {}set ()tuple [ ] member e.g. {([Sales Dollars]), ([Sales Units])}
        c.
      Tuple = coordinates from >1 dim
        d.
      Tuples appear either in a set or in the WHERE clause to create the filter axis
        e.
      Sets appear on the axes of a report

 

Member Properties

1.      Member properties: in the dimension tables
2.
      Any column that isn’t a member name or member key is a potential member property.
3.
      When the dimension tables appear in a snowflake schema, where each level of the hierarchy has its own table, it is easy to identify the appropriate level for the member property: simply the level that corresponds to the members from the same source table.

 

Metadata Repository

1.      Metadata repository is msmdrep.mdb stored in the Bin folder in the Analysis Services folder
2.
      Possible to migrate it to SQL server but can’t then move it back

 

PivotTable List (OWC)

1.      The browser must be able to use ActiveX controls to be able to browse a PivotTable list
2.
      An Excel Pivot Table offers a greater degree of control over label & measure formatting, but the PT List overall is a more flexible tool for working with AS cubes
3.
      In Office Web Components different fonts & backgrounds can only be applied to three groups of item:
        a.
      Changing a single dim or level label changes all dim & level labels
        b.
      Changing a single item label changes all item labels
        c.
      Changing the font or background for a single measure changes the format of all measures
4.
      Office PT list can handle events (VB)

 

MSDN (Microsoft Developer Network)

1.      Analysis Services Overview - http://msdn.microsoft.com/library/default.asp?url=/library/en-us/olapdmad/aggettingstart_80xj.asp
2.
      OLAP Services: Optimizing Cube Performance Using Microsoft OLAP Services - http://www.msdn.microsoft.com/library/default.asp?url=/library/en-us/dnolap/html/olapunisys.asp
3.
      Optimizing the Data Warehouse Database for Analysis Services Performance - http://msdn.microsoft.com/library/default.asp?url=/library/en-us/olapdmad/agoptimizing_3gyt.asp
4.
      How to configure Analysis Services for the Web - http://msdn.microsoft.com/library/default.asp?url=/library/en-us/olapdmad/aghtconfig_4giq.asp
5.
      Analysis Services: Choosing Dimension Types in SQL Server 2000 Analysis Services - http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnsql2k/html/dimmsdn.asp
6.
      Using Microsoft Excel 2002 with SQL Server Analysis Services - http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnexcl2k2/html/odc_xlsql.asp

 

PivotTable Service

1.      Both Analysis Services & PTS maintain storage buffers called caches, which retain recently retrieved values. This makes use of the client/server set-up
2.
      PTS communicates with the server every few seconds to make sure the values in its cache are still valid
3.
      Excel uses OLE DB for OLAP to communicate with PTS
4.
      The OLE DB for OLAP specification is a published interface for interacting with multidimensional data sources; all third party applications that use this specification can access AS cubes
5.
      OLE DB for OLAP comes with a wrapper layer called Active X Data Objects (multidimensional), i.e. ADO MD
            a.
      Neither are proprietary to Microsoft.
6.
      Microsoft OLE DB provider for ODBC drivers (a native OLE DB provider can be up to twice as fast as using OLE DB provider for ODBC drivers)
7.
      All clients must have PTS; so thin-client

 

Processing (Updates)

1.      Replace all dimensions & cubes
2.
      Use Full Process
3.
      Importance of dimension optimisation on cube processing (Optimise Schema Command) – this changes the member key from being the primary key in the dimension table to the foreign key in the fact table
4.
      Use member keys unique TRUE when possible (big saving – more important than Optimise Schema)

 

Query Performance

    1.   Server-Side Cache – User queries, metadata, & data are stored in the OLAP server’s cache, making it possible to answer new queries by calculating answers from cached data rather than retrieving data from the disk
      2.
  Client-Side Cache – e.g. if client cache contains values for the four quarters of a specific year & the user asks for the total for the same year, PivotTable Services calculates the answer from the cached data.
      3.
  Client/Server Data Management – PivotTable Services shares much of the same functionality as the server, enabling it to bring the server’s multidimensional calculation engine, caching features, & query management directly to the client’s computer to optimise performance & minimise network traffic
      4.   Important to remember difference between cold & warm cache; i.e. more queries stored in cache as the day goes on so queries become faster

 

Security

  1. At the server, database, cube, and mining model level, administrators can offer users "all-or-nothing" access
  2. Dimension-level security, administrators can offer groups of users access to specific dimensions or parts of dimensions; a budgeting application could be designed so that users see all data but can only write back for their own departments
  3. Cell-level security provides even finer-grained control over access to data. Administrators can develop complex rules to limit users to specific set of cells at various levels in the cube. For example, users might see detailed data for their own departments, but only see summarized data for other departments

 

Star/Snowflake Schema

1.      Analysis Services query performance is unaffected whether star or snowflake schema are used, because if MOLAP all aggregates are stored in its own internal structure
2.
      The data storage compression of Analysis Services stores a cube in 25%-40% of the space required for the same data (without indexes) in the relational database, and the cube includes indexes and aggregations

 

Virtual Cubes

1.      A virtual cube is the multidimensional equivalent of a relational view
2.
      It is more restrictive though: a virtual cube cannot be created from a portion of a dimension, the entire dimension is either included or excluded, just as measures are either included or excluded.
3.
      Useful for combining measures from more than one cube or removing them from a cube.

 

Writeback

1.      Empty fact table for writeback is usual for forecast cubes; change fact table size in advanced properties to ‘1’
2.
      Only possible to writeback at the lowest level; however, it is possible to splash values so it appears that entry is at an aggregated level
3.
      The incremental changes are written to either the client cache in PTS (what-if analysis: temporary writeback) or to a special writeback table (permanent write-back)
4.
      The writeback table doesn’t have to be in the same database as the fact table; it is possible to use an Access database if desired
5.
      It is possible to append the writeback table values to the fact table & delete the writeback table data. This is more efficient
6.
      UpdateCube MDX statement allocates a high level input to the necessary lower levels