Chapter 5. Dimensions in the Conceptual Model

2012-8-16 14:57| 发布者: demo| 查看: 604| 评论: 0

摘要: Chapter 5. Dimensions in the Conceptual Model In This Chapter Dimension AttributesDimension HierarchiesAttribute Hierarchies You define multidimensional space by specifying a set of dimensions. There ...

Attribute Member Keys

Each member of an attribute is identified by its key; each member key is unique within the attribute. The keys of all the members of an attribute have the same type. As a member is loaded into the collection that makes up the attribute, it receives a unique number, called Data ID (mostly used internally).

Simple and Composite Keys

In contrast to the “domain” of the relational model, where the scalar is always a simple value, the members of an attribute in the multidimensional model can have either simple keys or composite keys:

A simple key is defined by a single value of any data type allowed in the multidimensional database.
A composite key is defined by a combination of values of various data types. A composite key, like any key, must be unique for each attribute member.

You need a composite key when you can’t rely on the uniqueness of the key to identify a specific member. For example, if you have an attribute (collection) of world cities, to uniquely identify any one city you need to include in the key the country and perhaps even province or state. (There could be a number of cities with the same name located in one country—the United States is notorious for that.) Thus, the composite key.

Now, if you also had an attribute of countries, the value of a country would be present in both attributes (cities and countries). Composite keys, then, lead to duplication of data and consume more resources.

The Unknown Member

In a relational database, you have a value with its own set of rules—NULL—that takes the place of an unspecified value. In our implementation of the multidimensional database, the concept of the Unknown Member serves a similar purpose: to define a dimension member with an unspecified key. Every attribute has an unknown member, even though it might be inaccessible. Even an empty attribute has one member, the unknown member.

An unknown member has no specified key. It’s possible that some other member of the attribute would have a key with the value NULL. Such an attribute would have both an unknown member and a member with a NULL key value. We talk more about unknown members in Chapter 7 , “Measures and Multidimensional Analysis.”

Data Specification

To define keys and other data in the multidimensional model, you can use the DDL object DataItem. DataItem defines the primary properties of multidimensional data such as key, name, and so on. Table 5.2 describes the properties of DataItem.

Table 5.2. Properties of `DataItem`
Name (DDL)	Description
`DataType`	Defines the data type of the item. Data types in multidimensional databases are a subset of the data types supported by relational databases. From version to version of Analysis Services, we’ve extended the set of data in the multidimensional model to such an extent that it is pretty close to the full set of relational data types. Nonetheless, some data types are still supported by the relational model, but not by the multidimensional model.
`MimeType`	Defines the logical type of binary data: a picture, sound track, video track, the output of an application such as an editor, and so forth.
`Data Size`	Defines size for text or binary data, in characters and bytes, respectively. If you don’t specify a size, by default it will equal 255 characters for text data and 255 bytes for binary data.

`Data Format`	Defines the rules of transformation of data from numeric format into text, if such a transformation is required. Analysis Services uses the format types used in the `Format` function of Visual Basic.
`Collation`	Defines the rules for comparing strings of text data. Collation determines whether two text values are equal or different from each other, and how they should be ordered. For example, `Collation` could specify whether to ignore case in determining if two strings of text are the same or different.
`NullProcessing`	Defines the rules for processing `NULL` data. `NullProcessing` can be set to one of five values:
	If you set it to `Preserv`e, Analysis Services preserves the `NULL` value. `Preserve` takes additional resources to store and process `NULL` data. (We look at the question of the resources required for `Preserve` in more detail when we discuss the physical data model in Chapter 20 , “The Physical Data Model.”)
	If you set it to `ZeroOrBlank`, Analysis Services converts the `NULL` value to 0 if the data type is not a string, and to a blank if the data type is a string.
	If you set it to `Unknown Member`, the `NULL` value is associated with an unknown member.
	If you set it to `Error`, the `NULL` value is not allowed and the server will show an error message.
	If you set it to `Automatic`, the server will choose the best value, depending on the context.
`Trimming`	Defines the rules for deleting trailing spaces at the beginning and end of text. You can use `Trimming` to avoid the repetition of two strings of text that differ only by leading or trailing spaces.
`InvalidXmlCharacters`	Defines the rules of processing invalid XML characters. This property is useful if you think your users will receive data in XML format. In those cases, you can use `InvalidXmlCharacters`, with one of three possible values: `Preserve`, which doesn’t change the character; `Remove`, which removes the characters from the text; or `Replace`, which replaces each invalid character with a question mark.

Listing 5.4 shows the DDL definition of a composite key, which uses DataItems.

Listing 5.4. DDL Definition of a Composite Key

<KeyColumns>
     <KeyColumn>
          <DataType>WChar</DataType>
          <DataSize>50</DataSize>
          <Source>
               <TableID>dbo_customer</TableID>
               <ColumnID>city</ColumnID>
          </Source>
     </KeyColumn>
     <KeyColumn>
          <DataType>WChar</DataType>
          <DataSize>50</DataSize>
          <Source>
               <TableID>dbo_customer</TableID>
               <ColumnID>state_province</ColumnID>
          </Source>
     </KeyColumn>
</KeyColumns>

Attribute Member Names

An attribute member’s identifier is its name. The attribute member name is used pretty much where you want to reference a specific attribute member. The name of the attribute member can be either unique inside of the attribute or not. However, if the name is unique within an attribute, Analysis Services can support it more efficiently. You can use a property such as the MemberNameUnique. Otherwise, the property will come from the key. The MemberNameUnique property is uniqueness of the attribute members’ names, Analysis Services uses the names of members to generate the member unique name; otherwise, it uses member keys.

We use the DataItem object to specify the attribute member name. Unlike the data type for the key, the data type for the member name can only be text or a format that can be converted to text.

When you define a member name, avoid using spaces and special characters; they complicate the use of member names. It’s a good idea to avoid long names, too, because they take more resources to store and retrieve and can therefore decrease system performance. Storing the attribute member names can require gigabytes of disk space. Loading the names into memory can be the main cause of a slowdown.

Collation is important part of the member name specification. The collation of a name can change whether the name is unique or not. For example, suppose you have some name that uses capital letters whereas another, similar name uses lowercase letters (for instance, DeForge and Deforge). Depending on the Collation property value, the system might treat these similar names as either different or the same, resulting in confusion and unpredictable results for the user.

In addition, Collation affects the order the members will be sorted in. Sorting is an important aspect of how attribute members are used in the system. Defining the right ordering scheme is not always a simple task. For example, months in the year are usually not sorted by name; it’s typical to sort months by the numbers that indicate their place in the year.

In a typical sort order, you can use the key to define the order of the attribute members. If the key is defined by the text names of the months, the typical sort order would begin with August, which wouldn’t make sense in most cases. To solve this problem, you create another attribute, related to the original one that contains the numbers that indicate the order the months appear in the year. Then, you order by the key value of that related attribute.

The locale for the attribute member name defines the language of the member. When a certain locale is specified, the attribute member name itself is in the language of that locale. To specify that the name of the attribute member should be translated when it appears in other locales, use the Translations property.

Everything we’ve said about the member attribute name is also true for the Translations property, with one exception. Uniqueness of a translation is not defined in the model because the translation is not used to identify the member, but only for displaying the name of the member. When a user works on a client computer with a specific language, the Language property enables the server to use the right translation to create a Member Caption (used in the application model) in the language of the user.

Relationships Between Attributes

The Relationship between attributes in a dimension defines the possible associations that one attribute can have with another. Relationship affects all the functions of Analysis Services. It defines the properties of association that the current attribute has with other attributes, including whether an attribute can be accessed through the current attribute.

If one attribute can be accessed through another, that other attribute is treated as a member property of the current attribute. For example, Age and Marital Status are additional properties for our Customer attribute. Table 5.3 describes the properties of Relationship that can be used in the conceptual model.

Table 5.3. Properties of `Relationship`
Property	Description
`RelationshipType`	Defines the rules for modifying the key value of a member of related attribute. This property can be set to one of the two values:
	`Rigid`: The key value of the related attribute and current attribute are bound together and can’t change without a full reprocessing of the dimension. In our example, `Gender` can be defined as a dependent attribute with a rigid relationship, because it won’t change in the dimension.
	`Flexible`: The key of a dependent attribute, and therefore the whole member of the dependent attribute, can be changed at any time. In our example, the `Marital Status` property is a dependent attribute with a flexible relationship because it will periodically change (unfortunately).
`Cardinality`	Defines the nature of the relationship of the key of related attributes when those members are used as member properties.
	`One-to-One`: There is one (and only one) member of the current attribute that relates to each member of the related attribute. For example, if we were associating the names of months with the numbers that represent their order in the year, we would have a one-to-one relationship.
	`One-to-Many`: One of the members of the related attribute can be used as a property of various members from the current attribute. For example, member `Married` from attribute `Marital Status`, can be applied to many members of `Customer` attribute. One-to-many cardinality is much more frequently used than one-to-one.
`Optionality`	Defines the relationship of sets of members in the related attribute and in the current attribute. You can’t set this property through the Dimension Editor; use DDL.
	`Mandatory`: For each member of the related attribute, there is at least one member of the current attribute that references that member. For example, each state has to have at least one city.
	`Optional`: For some of the members of the related attribute, there might not be any member of the current attribute that references that attribute. For example, there are cities that don’t have any stores.
`Name` and `Translations`	When the related attribute is used as a member property of the current attribute, usually the name of the property of the current attribute is the same as the name of the related attribute. For example, when the `Gender` attribute is used as a property of the `Customer` attribute, we say that the `Customer` attribute has a property, `Gender`. However, if we want the property of the `Customer` attribute to be known as `Sex` rather than `Gender`, we define the name in the relationship as “Sex.”
`Visibility`	Determines whether the related attribute is accessible to the user as a member property for the current attribute.
	`False:` The related attribute can’t be used as a member property of the current attribute.
	`True`: The user can access the related attribute as a member property of the current attribute.

In Figure 5.3 you can see a diagram of the Customer dimension that shows the relationships of attributes and the properties of those relationships:

In terms of Type, you can see a Rigid relationship, indicated by a solid line, between Customer ID and Gender.
A Flexible type is indicated by a dotted line. You can see a Flexible relationship between Customer ID and Marital Status.
In terms of Cardinality, One-to-Many is indicated by a triangle in the middle of the relationship line. You have a one-to-many relationship for most attribute relationships.
A One-to-One cardinality is indicated by a diamond in the middle of the line. There is only one example of a one-to-one relationship, that between the Customer ID and Address attributes. (That’s true in our FoodMart 2008 database, although in reality you could have two customers who live at the same address.)
A Mandatory relationship is indicated by a V sign and the end of a line. You can see a mandatory relationship between the City and State Province attributes.
An Optional relationship (most of the relationships in Figure 5.3 ) is indicated by an absence of symbols.

Figure 5.3. The Customer dimension has different attributes with different types of relationships.