PK
,Aoa, mimetypeapplication/epub+zipPK ,A iTunesMetadata.plist=
A cube always returns summary data to a query as needed. While the cube may store data at the day level, for example, it can return a result at the quarter or year level without requiring a calculation in the query. This chapter explains how to optimize the unique aggregation subsystem of Oracle OLAP to provide the best performance for both data maintenance and querying.
This chapter contains the following topics:
Aggregation is the process of consolidating multiple values into a single value. For example, data can be collected on a daily basis and aggregated into a value for the week, the weekly data can be aggregated into a value for the month, and so on. Aggregation allows patterns in the data to emerge, and these patterns are the basis for analysis and decision making. When you define a data model with hierarchical dimensions, you are providing the framework in which aggregate data can be calculated.
Aggregation is frequently called summarization, and aggregate data is called summary data. While the most frequently used aggregation operator is Sum, there are many other operators, such as Average, First, Last, Minimum, and Maximum. Oracle OLAP also supports weighted and hierarchical methods. Following are some simple diagrams showing how the basic types of operators work. For descriptions of all the operators, refer to "Aggregation Operators" .
Figure 9-1 shows a simple hierarchy with four children and one parent value. Three of the children have values, while the fourth is empty. This empty cell has a null or NA
value. The Sum operator calculates a value of (2 + 4 + 6)=12 for the parent value.
Figure 9-1 Summary Aggregation in a Simple Hierarchy
The Average operator calculates the average of all real data, producing an aggregate value of ((2 + 4 + 6)/3)=4, as shown in Figure 9-2.
Figure 9-2 Average Aggregation in a Simple Hierarchy
The hierarchical operators include null values in the count of cells. In Figure 9-3, the Hierarchical Average operator produces an aggregate value of ((2 + 4 + 6 +NA)/4)=3.
Figure 9-3 Hierarchical Average Aggregation in a Simple Hierarchy
The weighted operators use the values in another measure to generate weighted values before performing the aggregation. Figure 9-4 shows how the simple sum of 12 in Figure 9-1 changes to 20 by using weights ((3*2) + (2*4) + (NA*6) +(4*NA)).
Figure 9-4 Weighted Sum Aggregation in a Simple Hierarchy
Analytic workspaces provide an extensive list of aggregation methods, including weighted, hierarchical, and weighted hierarchical methods.
The following are descriptions of the basic aggregation operators:
Average: Adds non-null data values, then divides the sum by the number of data values.
First Non-NA Data Value: Returns the first real data value.
Last Non-NA Data Value: Returns the last real data value.
Maximum: Returns the largest data value among the children of each parent.
Minimum: Returns the smallest non-null data value among the children of each parent.
Nonadditive: Does not aggregate the data.
Sum: Adds data values.
These operators require a measure providing the weight or scale values in the same cube. In a weight measure, an NA (null) is calculated like a 1
. In a scale measure, an NA is calculated like a 0
.
The weighted operators use outer joins, as described in "When Does Aggregation Order Matter?".
These are the scaled and weighted aggregation operators:
Scaled Sum: Adds the value of a weight object to each data value, then adds the data values.
Weighted Average: Multiplies each data value by a weight factor, adds the data values, and then divides that result by the sum of the weight factors.
Weighted First: Multiplies the first non-null data value by its corresponding weight value.
Weighted Last: Multiplies the last non-null data value by its corresponding weight value.
Weighted Sum: Multiplies each data value by a weight factor, then adds the data values.
The following are descriptions of the hierarchical operators. They include all cells identified by the hierarchy in the calculations, whether or not the cells contain data.
Hierarchical Average and the Hierarchical Weighted operators use outer joins.
Hierarchical Average: Adds data values, then divides the sum by the number of the children in the dimension hierarchy. Unlike Average, which counts only non-null children, hierarchical average counts all of the children of a parent, regardless of whether each child does or does not have a value.
Hierarchical First Member: Returns the first data value in the hierarchy, even when that value is null.
Hierarchical Last Member: Returns the last data value in the hierarchy, even when that value is null.
Hierarchical Weighted Average: Multiplies non-null child data values by their corresponding weight values, then divides the result by the sum of the weight values. Unlike Weighted Average, Hierarchical Weighted Average includes weight values in the denominator sum even when the corresponding child values are null.
Hierarchical Weighted First: Multiplies the first data value in the hierarchy by its corresponding weight value, even when that value is null.
Hierarchical Weighted Last: Multiplies the last data value in the hierarchy by its corresponding weight value, even when that value is null.
The OLAP engine aggregates a cube across one dimension at a time. When the aggregation operators are the same for all dimensions, the order in which they are aggregated may or may not make a difference in the calculated aggregate values, depending on the operator.
You should specify the order of aggregation when a cube uses multiple aggregation methods. The only exceptions are that you can combine Sum and Weighted Sum, or Average and Weighted Average, when the weight measure is only aggregated over the same dimension. For example, a weight measure used to calculate weighted averages across Customer is itself only aggregated across Customer.
The weight operators are incompressible for the specified dimension and all preceding dimensions. For a compressed cube, you should list the weighted operators as early as possible to minimize the number of outer joins. For example, suppose that a cube uses Weighted Sum across Customer, and Sum across all other dimensions. Performance is best if Customer is aggregated first.
The following information provides guidelines for when you must specify the order of the dimensions as part of defining the aggregation rules for a cube.
When these operators are used for all dimension of a cube, the order does not affect the results:
Maximum
Minimum
Sum
Hierarchical First Member
Hierarchical Last Member
Hierarchical Average
Even though you can use the Sum and Maximum operators alone without ordering the dimensions, you cannot use them together without specifying the order. The following figures show how they calculate different results depending on the order of aggregation. Figure 9-5 shows a cube with two dimensions. Sum is calculated first across one dimension of the cube, then Maximum is calculated down the other dimension.
Figure 9-5 Sum Method Followed by Maximum Method
Figure 9-6 shows the same cube, except Maximum is calculated first down one dimension of the cube, then Sum is calculated across the other dimension. The maximum value of the sums in Figure 9-5 is 15, while the sum of the maximum values in Figure 9-6 is 19.
Figure 9-6 Max Method Followed by Sum Method
This example describes changes to the default aggregation of the Units cube in the GLOBAL
analytic workspace. These changes take effect in the next data refresh.
Analytic Workspace Manager initially sets all dimensions to use the Sum operator and aggregates all levels of all dimensions. To change these default settings, use the Rules subtab of the Aggregation tab.
Figure 9-7 shows the operators for the Units Cube. Time is now set to Last Non-NA Data Value, and it is aggregated after the other dimensions. For operators like First and Last, the order in which the dimensions are aggregated can change the results.
Another change is that only the Shipments hierarchy of the Customer dimension is aggregated during data maintenance. Because the Segment hierarchy is seldom queried, the Global DBA chose not to calculate these aggregate values in order to save maintenance time and storage space. However, response time is slower for queries that request Segment aggregations.
Figure 9-7 Selecting the Aggregation Operators
Analytic Workspace Manager initially chooses cost-based aggregation with 35% precomputed values for the bottom partitions and 0% for the top partition. An unpartitioned cube is also set to 35%. This setting means that 35% of the aggregate values is calculated and stored during data maintenance, and 65% is calculated in response to a query. These settings optimize data maintenance.
Increasing the materialization of the bottom partitions improves querying of both the bottom and the top partitions. Increasing the materialization of the top partition improves querying of the most aggregate data and any other hierarchies of the partitioned dimension.
Figure 9-8 shows the settings for the Units Cube. In this case, the Global DBA chose to keep the top partition at 0%, and to increase the bottom partitions from 35 to 50%. This change increases maintenance costs in time and storage space, but improves run-time performance of all partitions.
Figure 9-8 Setting Cost-Based Presummarization
This preface identifies the major enhancements to the OLAP option of Oracle Database.
Several changes in this release make the powerful analytics of Oracle OLAP more accessible.
OLAP DML Support
Analytic Workspace Manager has improved support of the OLAP DML, which makes the powerful analytics of Oracle OLAP, such as forecasts, allocations, and models, more accessible. You can define custom calculated measures using the OLAP DML. You can also create programs directly in Analytic Workspace Manager instead of OLAP Worksheet. You can invoke these programs in a calculated measure, a cube script, or a maintenance script. The OLAP DML object definitions and programs are saved in the XML when creating a template of the analytic workspace.
Several new OLAP DML statements are dimensionally aware, generating all of the physical objects used to support cubes, cube dimensions, and other dimensional database objects.
Features in Analytic Workspace Manager
Several changes in Analytic Workspace Manager support ease-of-use and improved user control.
Calculated measure generator enables you to quickly specify all of the standard, predefined calculated measures for a cube. See "Creating Calculated Measures".
Automatic data type detection ensures the correct setting while simplifying the process of defining attributes and measures.
You can set hierarchy data constraints to the desired level of consistency.
Maintenance scripts provide you with greater control over the maintenance process for an analytic workspace. See "Creating and Executing Maintenance Scripts".
Detailed maintenance reports enable you to track the progress of a maintenance job, detect problems quickly, and correct any issues. See "Loading Data Into Dimensions" and "Loading Data Into Cubes".
You can delete scheduled maintenance jobs, and stop or delete running jobs.
Enhanced mapping support enables you to map to constants, expressions, and aggregate data.
You can identify a different base level for cube aggregation, above the level at which the data is loaded.
Object security enhancements automatically extend the permissions set on an object to its relational views and the analytic workspace.
Materialized views can contain a count of the number of children contributing to an aggregate value in a measure, as needed by some applications.
In Release 2, the OLAP Option to Oracle Database 11g provides numerous ease-of-use features.
Upgrade Support
You can easily upgrade an analytic workspace from CWM or standard form (AWXML) metadata in Oracle OLAP 10g. You can then incorporate all the features of Oracle OLAP 11g as described in this Guide.
Improved Manageability in Analytic Workspace Manager
Several new features in Analytic Workspace Manager simplify the creation and modification of dimensional objects. You can now rename all objects, save partial mappings, drop or modify dimensions, levels, hierarchies and attributes, and use WHERE
conditions in mappings. You can preview the contents of XML templates, fix duplicate names, and modify existing objects by merging them with the template definitions.
SQL Aggregation Management
SQL Aggregation Management is a group of PL/SQL subprograms in DBMS_CUBE
that supports the rapid deployment of cube materialized views from existing relational materialized views. A single cube materialized view can replace many of the relational materialized views of summaries on a fact table, providing uniform response time to all summary data through query rewrite.
In the process of creating the cube materialized views, DBMS_CUBE
also creates a fully functional analytic workspace including a cube and the cube dimensions. Thus, SQL Aggregation Management can also be used to create the initial metadata for a multidimensional data store enhanced with calculated measures and queried directly by analytic applications.
See Also: Oracle Database PL/SQL Packages and Types Reference |
Build Logs
OLAP now provides several logs: a build log, an operations log, a rejected records log, and a dimension compilation log. You can view the logs either in Analytic Workspace Manager or using the PL/SQL DBMS_CUBE_LOG
package.
These logs enable you to track the progress of long running processes, then use the results to profile performance characteristics. They provide information to help you diagnose and remedy problems that may occur during development and maintenance of a cube: Hierarchies that are improperly structured in the relational source tables, records that fail to load, or data refreshes that take too long to complete. They also help diagnose performance problems in querying cubes.
See Also: Oracle Database PL/SQL Packages and Types Reference |
The OLAP Option to Oracle Database 11g continues the development trends of Oracle9i and Oracle Database 10g, especially in deepening integration with the database and enhancing SQL access to cubes, security, and metadata. The power of OLAP is easily accessible to SQL applications. Oracle Database 11g also introduces the cube as a summary management solution for relational OLAP (ROLAP) implementations.
OLAP Metadata Integration
All metadata for cubes and dimensions is stored in the Oracle database and revealed in the data dictionary views, so that you can query the entire business model in SQL. Use of the data dictionary to store the metadata officially codifies the dimensional model in the database, provides significant improvements for metadata queries, and supports other new features such as SQL object security for cubes and dimensions.
Automatic Maintenance of Cube and Dimension Views
Oracle Database 11g automatically creates and maintains relational views for every cube, dimension, and hierarchy in the database. If you modify a dimensional object, such as adding a calculated measure to a cube, the view is immediately re-created to reflect the change. Oracle Database defines these views using the CUBE_TABLE
function, which enables the SQL Optimizer enhancements.
Cube Scripts
A cube script is an ordered list of commands that prepare a cube for querying, such as Clear Data, Load Data, Aggregate, Execute PL/SQL, and Execute OLAP DML. For many applications, cube scripts eliminate the use of procedural programs for processing cubes.
Cost-Based Aggregation
Fast updates and uniform querying performance are two hallmarks of the OLAP option. Cost-based aggregation enhances performance in both areas by executing a fine-grained pre-aggregation strategy and storing sparse data sets very efficiently.
Calculation Expression Syntax
OLAP calculation expressions extend the syntax of the SQL analytic functions. This syntax is familiar to SQL developers and DBAs, so that it is easier for them to adopt than proprietary OLAP languages and APIs.
This syntax is used to define calculations that are embedded in the cube, such as dynamically calculated facts or measures.
Cube Materialized Views
Cube materialized views are cubes that have been enhanced to use the automatic refresh and query rewrite features of Oracle Database.
Cube materialized views bring the fast update and fast query capabilities of the OLAP option to applications that query detail relational tables. Summary data is generated and stored in a cube, and query rewrite automatically redirects queries to the cube materialized views. Applications experience excellent query performance.
Object and Data Security
Oracle Database 11g introduces both object security and data security to OLAP cubes and dimensions. Both types of security are granted to database users and roles.
Object security controls access to analytic workspaces, cubes, and dimensions using standard SQL GRANT
and REVOKE
syntax.
Data security controls access to the data in a cube or a dimension. You can grant SELECT
, INSERT
, UPDATE
, and DELETE
privileges to dimension members (keys) either globally or for a particular cube to control access to the data in a cube.