SSAS 2008 Unleashed - Chapter 6. Cubes and Multidimensional Analysis

2012-8-16 15:00| 发布者: demo| 查看: 430| 评论: 0

摘要: Chapter 6. Cubes and Multidimensional Analysis In This Chapter Cube DimensionsThe Dimension CubePerspectives The main object you use in multidimensional analysis is the cube. As opposed to the geomet ...

Chapter 6. Cubes and Multidimensional Analysis

In This Chapter

The main object you use in multidimensional analysis is the cube. As opposed to the geometric cube, our multidimensional cube (hypercube) can have any number of dimensions, and its dimensions aren’t necessarily of equal size. It’s not really a cube as we understood “cube” from our geometry classes, but we’ve borrowed the term for multidimensional analysis.

A cube defines a multidimensional space, the data available for analysis, the methods for calculating the data, and restrictions to data access. To define all this, a cube has two main collections: a collection of dimensions that defines the dimensions in our multidimensional space, and a collection of data values (measures) that reside along those dimensions and that we’ll analyze. In Chapter 12 , “Cube-Based MDX Calculations,” we discuss the way cubes define the rules for calculating the values of the cube. In the chapters of Part 8 , “Security,” we examine the rules that govern access to the data in the cube.

The cube is a major object and accordingly has all the parameters that characterize the major object: Identifier, Name, Description, Language, Collation, Annotation, and a collection of Translations. The Language and Collation parameters of the cube define the language for all the elements that don’t have those properties specified.

In the following list, we define the most important parameters of the conceptual model of the cube:

Visible determines whether the cube is visible to the user or whether users accesses the cube’s data through other objects.
Dimensions defines the dimensionality of the cube, the most important characteristic of a cube. This collection includes all the dimensions except the Measure dimension. (We provide more information about the Measure dimension in Chapter 7, “Measures and Multidimensional Analysis.”)
MeasureGroups defines the measures in the cube that are accessible for analysis.
CubePermissions defines access to data. (For more information about security architecture, see Chapter 35, “Security Model for Analysis Services.”)
Perspectives defines different views of the cube. The Perspectives collection enables you to limit the visibility to some elements of the cube and to simplify the model for the user.
MDXScripts defines the methods for using the physical space (see Chapter 2 , “Multidimensional Space”) to calculate the theoretical space of the cube. The rules and language for building the script are covered in Chapter 12, “Cube-Based MDX Calculations.”
KPIs defines a list of objects that support key performance indicators that are available for the client application. We cover the KPIs in Chapter 15, “Key Performance Indicators, Actions, and the DRILLTHROUGH Statement.”
Actions is a list of objects that enable you to define the actions that must occur in an application when the user accesses specific cells of the cube. We cover actions in Chapter 15.

A cube has a large number of parameters that define the behavior of its elements during its life cycle or that define default parameters for various elements of the physical model. (For information about those parameters, see the chapters in Part 5 , “Bringing Data into Analysis Services,” and Part 6, “Analysis Server Architecture”.)

In Listing 6.1 , we show the definition of the Warehouse and Sales cube, without expanding the Dimensions collection and the Measures collection. This definition of the cube from the point of view of the conceptual data model is very simple.

Listing 6.1. A Cube Definition


<Cube xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <ID>FoodMart2008</ID>
  <Name>Warehouse and Sales</Name>
  <Language>1033</Language>
  <Translations>
    <Translation>
      <Language>1049</Language>
      <Caption>

</Caption>
    </Translation>
  </Translations>
  <Dimensions />
  <MeasureGroups />
</Cube>

Cube Dimensions

The multidimensional space of a cube is defined by a list of its dimensions, which is a subset of the dimensions in the database. In Listing 6.2 , you can see the definition of the Dimensions collection for our Warehouse and Sales cube.

Listing 6.2. The `Dimensions` Collection for the Cube `Warehouse and Sales`

  <Dimensions>
    <Dimension>
      <ID>Product</ID>
      <Name>Product</Name>
      <DimensionID>Product</DimensionID>
      <Attributes>
        <Attribute>
          <AttributeID>Product</AttributeID>
        </Attribute>
        <Attribute>
          <AttributeID>Brand Name</AttributeID>
        </Attribute>
        <Attribute>
          <AttributeID>SKU</AttributeID>
        </Attribute>
        <Attribute>
          <AttributeID>SRP</AttributeID>
        </Attribute>
        <Attribute>
          <AttributeID>Product Subcategory</AttributeID>
        </Attribute>
        <Attribute>
          <AttributeID>Product Category</AttributeID>
        </Attribute>
        <Attribute>
          <AttributeID>Product Department</AttributeID>
        </Attribute>
        <Attribute>
          <AttributeID>Product Family</AttributeID>
        </Attribute>
      </Attributes>
      <Hierarchies>
      <Hierarchy>
        <HierarchyID>Hierarchy</HierarchyID>
      </Hierarchy>
    </Hierarchies>
  </Dimension>
  <Dimension>
    <ID>Time</ID>
    <Name>Time</Name>
    <DimensionID>Time By Day</DimensionID>
    <Attributes>
      <Attribute>
        <AttributeID>Time By Day</AttributeID>
      </Attribute>
      <Attribute>
        <AttributeID>The Day</AttributeID>
      </Attribute>
      <Attribute>
        <AttributeID>The Month</AttributeID>
      </Attribute>
      <Attribute>
        <AttributeID>The Year</AttributeID>
      </Attribute>
      <Attribute>
        <AttributeID>Week Of Year</AttributeID>
      </Attribute>
      <Attribute>
        <AttributeID>Quarter</AttributeID>
      </Attribute>
    </Attributes>
    <Hierarchies>
      <Hierarchy>
        <HierarchyID>Hierarchy</HierarchyID>
      </Hierarchy>
      <Hierarchy>
        <HierarchyID>Hierarchy 1</HierarchyID>
      </Hierarchy>
    </Hierarchies>
  </Dimension>
</Dimensions>

In most cases when you define a cube, it’s enough to enumerate the database dimensions used in the cube, and you don’t need to include any additional information. When you enumerate the database dimensions, all the database dimension objects (attributes and hierarchies) that were available at the time the cube was created are copied to the cube dimensions.

However, sometimes you need to make the availability of objects in the cube different from the availability of the objects in the database. To do this, you can define some parameters of dimension on the cube level. For example, you can hide an attribute by setting its parameters Enable or Visible to true. You can’t expand the space. On the cube level, you can only reduce the space defined by the dimensions. It means you cannot enable an attribute or hierarchy in the cube, if it was disabled in a database.

Cube dimensions are minor objects and can be changed only with the cube itself. (You can’t send a command that will change only a dimension of a cube; your command must change the whole cube.) In the following list, we define the most important parameters of the conceptual model of the cube dimension:

DimensionID defines the database dimension on which the cube dimension is based. All the parameters of the cube dimension will be the same as those of the database dimension at the time they were included in the cube. If the name of database dimension changes later, that change won’t affect the cube dimension.
ID is the identifier of the cube dimension. This identifier can be used for reference by other elements of the cube. (You can find more information on this in Chapter 7.)
Name is the name by which this dimension will be available to the cube user. In most cases, this name is the same as the name of the database dimension, but it can differ if necessary.
The Translations collection defines the translation to different languages of the name of the dimension. If the name of the dimension is the same in the cube as it is in the database, you don’t need the Translations collection. If you specify a Translations collection for the cube, any translation defined for the database dimension is no longer available.
HierarchyUniqueNameStyle defines the rules for creating a unique name for the hierarchy. When the name of the hierarchy is unique across all the dimensions in the cube, you can use that hierarchy name without qualifying it with the dimension name. This makes it possible to use hierarchy-oriented applications that don’t reference dimensions.
MemberUniqueNameStyle defines the rules for creating unique names for the members of a hierarchy in a cube. This parameter enables the server to create member unique names according to the full path to the member. We don’t recommend using this parameter because the server would have to use a relatively inefficient code path to create the unique names.
Note

It’s best that you treat the unique name of a member as an internal name, and that you don’t base any application that you might write on the rules of unique name generation. The unique name can change as the structure of the model changes, or during an update to another version of Analysis Services. The name can also change when some internal setting that controls the rules for defining unique names changes.

Therefore, we recommend that you avoid building your architecture based on what you know about the unique name structure, because it’s internally controlled. Don’t try to parse a unique name and don’t store data based on the unique name of a member. Use the member key or the fully qualified path (by their names or keys) to the member in the hierarchy.
The Attributes collection defines the list of attributes, with parameters, that differ from the attributes in the database dimension. If all the parameters are the same in the cube dimension as they are in the database dimension, this collection remains empty. If you can, avoid filling such a collection; otherwise, there may be some confusion as to why the same dimension behaves differently in different cubes from the same database.
The Hierarchies collection defines the list of hierarchies whose parameters differ in the cube dimension from those in the database dimension. If all the parameters are the same in the cube dimension as they are in the database dimension, this collection remains empty. As with the Attributes collection, avoid filling such a collection, if possible; otherwise, there may be some confusion as to why the same dimension behaves differently in different cubes from the same database.

The collections of attributes and hierarchies of the cube dimension have a great effect on the data model. In addition, they can affect the performance of the system while it’s filling the model with data and querying that data.

Cube Dimension Attributes

A cube dimension, by our specifications, contains references to database dimensions. When you create a cube, the database dimensions are copied to the cube along with all the information they contain. When the database dimensions are copied to the cube, the cube gets them as they are at that moment. If you change the dimensions later, those changes won’t appear in the cube until you make any change to the cube.

Most of the time, that’s all you need. Sometimes, however, there’s more information in the database dimension properties than the cube needs. For example, the Time dimension often contains attributes for days and hours or for the days of the week. But, the current cube doesn’t need those attributes. If we leave all these attributes in the dimension, they are visible to the user and will just make his work more difficult. In addition, supporting those attributes requires considerable system resources. When you define the cube, you can use the following parameters to exclude the attributes the cube doesn’t need (or to reduce the resources spent on them):

AttributeID references the ID of the attribute in the database dimension. You can’t rename this attribute in the cube. Cube attributes don’t have their own IDs or names in the cube; they inherit all the properties of the cube dimension attribute, including translations. However, you can define annotations on the cube attribute.
AttributeHierarchyEnable enables you to turn off the attribute hierarchy in the cube, even if it’s turned on in the dimension. When this property is turned off, the attribute is unavailable to the client application, just as if the attribute hierarchy were turned off in the database dimension. You can use this parameter to ensure that additional resources won’t be spent to support this attribute in the cube. However, if the attribute is used in other hierarchies in the cube, you can’t use this parameter to turn off the attribute hierarchy.
AttributeHierarchyVisible enables you to make the hierarchy attribute invisible to the user. In contrast to the use of the AttributeHierarchyEnable parameter, with this parameter the hierarchy is still available to any user who knows how to access it. And the cube will require the same resources to support it as it would for a visible hierarchy.
AttributeHierarchyOptimizedState enables you to turn off some of the resources spent supporting an attribute on the cube (such as turn off building aggregations and indexes for this attribute). From the standpoint of the cube model, nothing changes; but access to data through this attribute will take more time. We recommend that you turn off optimization for any attribute hierarchy that you’ve defined as invisible. If you turn off the attribute hierarchy rather than making it invisible and not optimized, the server spends fewer resources to support it.

Changing the parameters of attributes can make a big impact on the structure of the model and on the performance of the system. However, use those settings carefully because they change behavior from cube to cube.

Dimension attributes that don’t participate in the data definition of the measure group (for a definition of measure groups and granularity, see Chapter 7 ) will be automatically turned off by the system. This means that the system doesn’t have to spend resources to support them. For example, if the Time dimension has Day and Hours attributes, but the data in the cube is stored only by date, the system won’t use the Hours attribute, and so it won’t spend resources to support it. Therefore, you don’t need to set the AttributeHierarchyEnable property yourself—the system will do this for you automatically.

Cube Dimension Hierarchies

A lot of what we said earlier about dimension attributes, particularly about attribute hierarchies, applies to all the other hierarchies of a cube dimension. The multidimensional model enables you to control the user’s access to cube hierarchies and attribute hierarchies. In addition, you can similarly set parameters to limit the system resources spent on supporting cube hierarchies.

The following list describes the parameters of the cube hierarchy. (They’re similar to those of attribute hierarchies.)

HierarchyID is a reference to the ID of the dimension hierarchy. You can’t rename hierarchies; they inherit properties from the database dimension, including translation. However, you can define annotations.
Enabled enables you to turn on or off the dimension hierarchy in the cube. However, if it’s turned off in the database, you can’t turn it on in the cube. In addition, if one of the attributes in the cube hierarchy has the attribute hierarchy turned off, you can’t turn this hierarchy on in the cube.
Visible enables you to make the hierarchy invisible to users. However, users can access the hierarchy if they know how to call it.
OptimizedState enables you to redefine optimization as it’s defined for different attributes, thereby turning off some of the resources used to support the cube hierarchy. Nonetheless, if some other hierarchy optimizes this attribute, this attribute will be optimized.

Role-Playing Dimensions

In the multidimensional model, the same database dimension can play multiple roles. For example, the typical use of a Time dimension for a travel application is to store data for the time of arrival and departure. The Time dimension can be used twice. In one case, it’s used to define departure time, and in another it’s used to define arrival time. You can just rename the two Time dimensions to Departure Time and Arrival Time. We show you how to do just that in Listing 6.3.

Listing 6.3. A Cube with Two `Time` Dimensions


<Cube xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <ID>Flight</ID>
  <Name>Flight</Name>
  <Annotations />
  <Language>1033</Language>
  <Translations />
  <Dimensions>
    <Dimension>
      <ID>Departure Time</ID>
      <Name>Departure Time</Name>
      <DimensionID>Time By Day</DimensionID>
    </Dimension>
    <Dimension>
      <ID>Arrival Time</ID>
      <Name>Arrival Time</Name>
      <DimensionID>Time By Day</DimensionID>
    </Dimension>
  </Dimensions>
  <MeasureGroups>
  </MeasureGroups>
</Cube>

You have to give these two versions of the Time dimension different IDs and names (Departure Time and Arrival Time) in the cube so that each is a completely independent dimension, with all the information it contains. Because of this independence, you can change any attribute in one of the dimensions and it won’t change in the other.

Note

If you use a role-playing dimension in your cube, your hierarchy names are not unique across the cube. Therefore, if you reference an element of the dimension, you have to qualify the hierarchy name by the name of the dimension.

Hierarchy-based applications can’t work with role-playing dimensions.

It’s efficient to use role-playing dimensions because all the information for both of the dimensions is stored in the database only once. In addition, because each is a completely independent dimension in the cube, with independent management, you can optimize attributes in one differently than you do in the other. For example, you might optimize the Arrival Time dimension on the level of hours, and the Departure Time dimension on the level of days.

The Dimension Cube

Database dimensions are very important in the conceptual model. But the application model (MDX) recognizes only cube dimensions. In a client application, you can’t access information in a database dimension if the dimension isn’t included in a cube. Even if it’s included in a cube, your application will have access to the data in the database dimension according to permissions specified in the cube.

To make it possible for a client application to gain access to all the information in the database, Analysis Services automatically creates a dimension cube when you create a database dimension. That cube contains only one dimension and no measures. You can use the dimension in the dimension cube to access information that’s contained in the database dimension. Just use the same application that you use to gain access to information inside any of the other cubes.

The only difference is that the name of the dimension cube is the name of the dimension preceded by a dollar sign. For example, $Customers would be the name of the dimension cube for the Customer dimension. Security for the dimension cube is exactly the same as specified for the database dimension. The current version of Analysis Services allows only the database administrator access to the dimension cube.

Perspectives

Analysis Services introduces perspectives to simplify data models that can be used in applications. A cube can contain hundreds of hierarchies and tens of different measures, which are available for analysis by hierarchies. Analysis Services perspectives enable you to define a subset of the objects of the application’s model into an independent model. The perspective acts like a window into one area of the cube, one that the application can handle.

A perspective is defined by a name (that you specify) and the objects (that you select) it inherits from a cube. Any object that you don’t select are hidden. The objects that are available in the perspective are the following:

Dimensions (hierarchies and attributes)
Measure groups (measures)
KPIs
Calculated members
Actions

Perspectives don’t slow down the system and don’t require additional resources. All their use does is create an additional layer of metadata, which specifies the part of the cube that is visible to the user. No real data is stored in the perspective; the data is retrieved from the original cube.

You use perspectives to control the scope of a cube exposed to users. To the user, a perspective looks like any other cube available for the application. With perspectives, you can reduce clutter and make it possible for different types of users to see different parts of the cube.

You can’t use perspectives as a security tool to limit access to cube information. Even though the data the user sees is limited by the perspective, if he knows how, he can access any information in the cube.

A perspective is a major object and has all the properties of major objects: ID, Name, Description, Translations, and Annotations. A perspective doesn’t have its own Language or Collation properties. It uses these properties of the cube.

To define a perspective, you specify five collections with names similar to the collections in a cube:

Dimensions specifies the list of dimensions that are visible in the perspective.
MeasureGroups specifies the list of measure groups that are visible in the perspective.
Calculations specifies the list of calculations that are visible in the perspective.
KPIs specifies the list of key performance indicators available through the perspective.
Actions specifies the list of actions available through the perspective.

None of the elements of a collection is derived from the cube. You have to define all the collections when you create a perspective. If you don’t define a collection for any type of element (say hierarchy), that element won’t be visible in the perspective. If you don’t specify an attribute in the perspective, that attribute isn’t visible through the perspective.

Listing 6.4 shows a definition of a perspective for Warehouse and Sales cube, with only one dimension and one measure visible.

Listing 6.4. A Perspective for a Cube

<Perspectives>
     <Perspective>
          <ID>Perspective</ID>
          <Name>Product Sales</Name>
          <Dimensions>
               <Dimension>
                    <CubeDimensionID>Product</CubeDimensionID>
                    <Attributes>
                         <Attribute>
                              <AttributeID>Product</AttributeID>
                         </Attribute>
                         <Attribute>
                              <AttributeID>Brand Name</AttributeID>
                         </Attribute>
                         <Attribute>
                              <AttributeID>SKU</AttributeID>
                         </Attribute>
                         <Attribute>
                              <AttributeID>SRP</AttributeID>
                         </Attribute>
                         <Attribute>
                              <AttributeID>Product Subcategory</AttributeID>
                         </Attribute>
                         <Attribute>
                              <AttributeID>Product Category</AttributeID>
                         </Attribute>
                         <Attribute>
                              <AttributeID>Product Department</AttributeID>
                         </Attribute>
                         <Attribute>
                              <AttributeID>Product Family</AttributeID>
                         </Attribute>
                    </Attributes>
                    <Hierarchies>
                         <Hierarchy>
                              <HierarchyID>Hierarchy</HierarchyID>
                         </Hierarchy>
                    </Hierarchies>
               /Dimension>
          </Dimensions>
          <MeasureGroups>
               <MeasureGroup>
                    <MeasureGroupID>Sales Fact 1997</MeasureGroupID>
                    <Measures>
                         <Measure>
                              <MeasureID>Unit Sales</MeasureID>
                         </Measure>
                    </Measures>
               </MeasureGroup>
          </MeasureGroups>
     </Perspective>
</Perspectives>