Table 4-4 lists the views that contain information specific to partitioned tables and indexes:
Table 4-4 Views With Information Specific to Partitioned Tables and Indexes
View | Description |
---|---|
DBA_PART_TABLES, ALL_PART_TABLES, USER_PART_TABLES | Display partitioning information for partitioned tables. |
DBA_TAB_PARTITIONS, ALL_TAB_PARTITIONS, USER_TAB_PARTITIONS | Display partition-level partitioning information, partition storage parameters, and partition statistics generated by the DBMS_STATS package or the ANALYZE statement. |
DBA_TAB_SUBPARTITIONS, ALL_TAB_SUBPARTITIONS, USER_TAB_SUBPARTITIONS | Display subpartition-level partitioning information, subpartition storage parameters, and subpartition statistics generated by the DBMS_STATS package or the ANALYZE statement. |
DBA_PART_KEY_COLUMNS, ALL_PART_KEY_COLUMNS, USER_PART_KEY_COLUMNS | Display the partitioning key columns for partitioned tables. |
DBA_SUBPART_KEY_COLUMNS, ALL_SUBPART_KEY_COLUMNS, USER_SUBPART_KEY_COLUMNS | Display the subpartitioning key columns for composite-partitioned tables (and local indexes on composite-partitioned tables). |
DBA_PART_COL_STATISTICS, ALL_PART_COL_STATISTICS, USER_PART_COL_STATISTICS | Display column statistics and histogram information for the partitions of tables. |
DBA_SUBPART_COL_STATISTICS, ALL_SUBPART_COL_STATISTICS, USER_SUBPART_COL_STATISTICS | Display column statistics and histogram information for subpartitions of tables. |
DBA_PART_HISTOGRAMS, ALL_PART_HISTOGRAMS, USER_PART_HISTOGRAMS | Display the histogram data (end-points for each histogram) for histograms on table partitions. |
DBA_SUBPART_HISTOGRAMS, ALL_SUBPART_HISTOGRAMS, USER_SUBPART_HISTOGRAMS | Display the histogram data (end-points for each histogram) for histograms on table subpartitions. |
DBA_PART_INDEXES, ALL_PART_INDEXES, USER_PART_INDEXES | Display partitioning information for partitioned indexes. |
DBA_IND_PARTITIONS, ALL_IND_PARTITIONS, USER_IND_PARTITIONS | Display the following for index partitions: partition-level partitioning information, storage parameters for the partition, and statistics collected by the DBMS_STATS package or the ANALYZE statement. |
DBA_IND_SUBPARTITIONS, ALL_IND_SUBPARTITIONS, USER_IND_SUBPARTITIONS | Display the following information for index subpartitions: partition-level partitioning information, storage parameters for the partition, and statistics collected by the DBMS_STATS package or the ANALYZE statement. |
DBA_SUBPARTITION_TEMPLATES, ALL_SUBPARTITION_TEMPLATES, USER_SUBPARTITION_TEMPLATES | Display information about existing subpartition templates. |
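For example, to list the partitions of a table together with their upper bounds and tablespaces, you can query DBA_TAB_PARTITIONS (the owner and table names here are illustrative):

SELECT partition_name, high_value, tablespace_name
FROM dba_tab_partitions
WHERE table_owner = 'SH' AND table_name = 'SALES'
ORDER BY partition_position;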
See Also: Oracle Database Reference for complete descriptions of these views
This chapter describes new features in Oracle Database to support very large databases (VLDB).
Note: This functionality is available starting with Oracle Database 11g Release 2 (11.2.0.2).
These are the new features in Oracle Database 11g Release 2 (11.2.0.2) to support very large databases:
Enhancements for managing segments
Segment creation on demand for partitioned tables
This feature enables the creation of partitioned tables with deferred segment creation. With this feature, on-disk segments are not created for a subpartition and its dependent objects until the first row is inserted.
For information about partitioning, refer to "Using Partitioning with Segments".
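As a minimal sketch (table, column, and partition names are hypothetical), a range-partitioned table with deferred segment creation can be created as follows:

CREATE TABLE sales_hist
( id        NUMBER,
  sale_date DATE )
SEGMENT CREATION DEFERRED
PARTITION BY RANGE (sale_date)
( PARTITION p2010 VALUES LESS THAN (TO_DATE('01-01-2011','DD-MM-YYYY')),
  PARTITION p2011 VALUES LESS THAN (TO_DATE('01-01-2012','DD-MM-YYYY')) );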
Enhanced TRUNCATE functionality

If a partition or subpartition has a segment, the truncate feature drops the segment if the DROP ALL STORAGE clause is specified.

For information about partitioning, refer to "Using Partitioning with Segments".
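A sketch of the clause, assuming the partitioned table sales_hist from the previous example:

ALTER TABLE sales_hist TRUNCATE PARTITION p2010 DROP ALL STORAGE;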
Maintenance package for segment creation on demand
New procedures are added to the PL/SQL DBMS_SPACE_ADMIN package to enable you to maintain segment creation on demand.
For information about partitioning, refer to "Using Partitioning with Segments".
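Two of the procedures added for this purpose are MATERIALIZE_DEFERRED_SEGMENTS and DROP_EMPTY_SEGMENTS. A minimal sketch, with schema and table names assumed:

BEGIN
  -- Create the on-disk segments for a table created with
  -- deferred segment creation (names are illustrative)
  DBMS_SPACE_ADMIN.MATERIALIZE_DEFERRED_SEGMENTS('SH', 'SALES_HIST');
  -- Remove segments that no longer contain data
  DBMS_SPACE_ADMIN.DROP_EMPTY_SEGMENTS('SH', 'SALES_HIST');
END;
/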
Managing Parallel Statement Queuing
By default, parallel statements are dequeued from the parallel statement queue in a simple first in, first out (FIFO) order. This feature enables you to use resource manager to manage the parallel statement queue by configuring a resource plan that controls the order in which parallel statements are dequeued. For example, you can ensure that parallel statements associated with a high-priority workload or consumer group are dequeued ahead of parallel statements from low-priority consumer groups. Alternatively, you could implement a fair-share policy that dequeues parallel statements based on the resource allocations configured for each consumer group.
For information about parallel statement queuing, refer to "Parallel Statement Queuing". For information about managing parallel statement queuing, refer to "Managing Parallel Statement Queuing with Resource Manager".
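As a hedged sketch (the plan and consumer-group names are hypothetical), a resource plan that dequeues parallel statements from a high-priority group ahead of a low-priority group could be configured with DBMS_RESOURCE_MANAGER:

BEGIN
  DBMS_RESOURCE_MANAGER.CREATE_PENDING_AREA();
  DBMS_RESOURCE_MANAGER.CREATE_CONSUMER_GROUP('HIGH_PRIORITY', 'critical workload');
  DBMS_RESOURCE_MANAGER.CREATE_CONSUMER_GROUP('LOW_PRIORITY', 'batch workload');
  DBMS_RESOURCE_MANAGER.CREATE_PLAN('PQ_PLAN', 'controls parallel statement dequeuing');
  -- The management shares (mgmt_p1) determine the order in which
  -- queued parallel statements are dequeued.
  DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE('PQ_PLAN', 'HIGH_PRIORITY',
    'dequeued first', mgmt_p1 => 70);
  DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE('PQ_PLAN', 'LOW_PRIORITY',
    'dequeued after high-priority work', mgmt_p1 => 20);
  DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE('PQ_PLAN', 'OTHER_GROUPS',
    'everything else', mgmt_p1 => 10);
  DBMS_RESOURCE_MANAGER.VALIDATE_PENDING_AREA();
  DBMS_RESOURCE_MANAGER.SUBMIT_PENDING_AREA();
END;
/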
VLDB and Partitioning Guide
11g Release 2 (11.2)
E25523-01
September 2011
Oracle Database VLDB and Partitioning Guide, 11g Release 2 (11.2)
E25523-01
Copyright © 2008, 2011, Oracle and/or its affiliates. All rights reserved.
Contributors: Hermann Baer, Eric Belden, Jean-Pierre Dijcks, Steve Fogel, Lilian Hobbs, Paul Lane, Sue K. Lee, Diana Lorentz, Valarie Moore, Tony Morales, Mark Van de Wiel
This software and related documentation are provided under a license agreement containing restrictions on use and disclosure and are protected by intellectual property laws. Except as expressly permitted in your license agreement or allowed by law, you may not use, copy, reproduce, translate, broadcast, modify, license, transmit, distribute, exhibit, perform, publish, or display any part, in any form, or by any means. Reverse engineering, disassembly, or decompilation of this software, unless required by law for interoperability, is prohibited.
The information contained herein is subject to change without notice and is not warranted to be error-free. If you find any errors, please report them to us in writing.
If this is software or related documentation that is delivered to the U.S. Government or anyone licensing it on behalf of the U.S. Government, the following notice is applicable:
U.S. GOVERNMENT RIGHTS Programs, software, databases, and related documentation and technical data delivered to U.S. Government customers are "commercial computer software" or "commercial technical data" pursuant to the applicable Federal Acquisition Regulation and agency-specific supplemental regulations. As such, the use, duplication, disclosure, modification, and adaptation shall be subject to the restrictions and license terms set forth in the applicable Government contract, and, to the extent applicable by the terms of the Government contract, the additional rights set forth in FAR 52.227-19, Commercial Computer Software License (December 2007). Oracle America, Inc., 500 Oracle Parkway, Redwood City, CA 94065.
This software or hardware is developed for general use in a variety of information management applications. It is not developed or intended for use in any inherently dangerous applications, including applications that may create a risk of personal injury. If you use this software or hardware in dangerous applications, then you shall be responsible to take all appropriate fail-safe, backup, redundancy, and other measures to ensure its safe use. Oracle Corporation and its affiliates disclaim any liability for any damages caused by use of this software or hardware in dangerous applications.
Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners.
Intel and Intel Xeon are trademarks or registered trademarks of Intel Corporation. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. AMD, Opteron, the AMD logo, and the AMD Opteron logo are trademarks or registered trademarks of Advanced Micro Devices. UNIX is a registered trademark of The Open Group.
This software or hardware and documentation may provide access to or information on content, products, and services from third parties. Oracle Corporation and its affiliates are not responsible for and expressly disclaim all warranties of any kind with respect to third-party content, products, and services. Oracle Corporation and its affiliates will not be responsible for any loss, costs, or damages incurred due to your access to or use of third-party content, products, or services.
Modern enterprises frequently run mission-critical databases containing upwards of several hundred gigabytes, and often several terabytes of data. These enterprises are challenged by the support and maintenance requirements of very large databases (VLDB), and must devise methods to meet those challenges. This chapter contains an overview of VLDB topics, with emphasis on partitioning as a key component of the VLDB strategy.
Partitioning addresses key issues in supporting very large tables and indexes by decomposing them into smaller and more manageable pieces called partitions, which are entirely transparent to an application. SQL queries and Data Manipulation Language (DML) statements do not need to be modified to access partitioned tables. However, after partitions are defined, Data Definition Language (DDL) statements can access and manipulate individual partitions rather than entire tables or indexes. This is how partitioning can simplify the manageability of large database objects.
Each partition of a table or index must have the same logical attributes, such as column names, data types, and constraints, but each partition can have separate physical attributes, such as compression enabled or disabled, physical storage settings, and tablespaces.
Partitioning is useful for many different types of applications, particularly applications that manage large volumes of data. OLTP systems often benefit from improvements in manageability and availability, while data warehousing systems benefit from performance and manageability.
Partitioning offers these advantages:
It enables data management operations such as data loads, index creation and rebuilding, and backup and recovery at the partition level, rather than on the entire table. This results in significantly reduced times for these operations.
It improves query performance. Often the results of a query can be achieved by accessing a subset of partitions, rather than the entire table. For some queries, this technique (called partition pruning) can provide order-of-magnitude gains in performance.
It significantly reduces the impact of scheduled downtime for maintenance operations.
Partition independence for partition maintenance operations lets you perform concurrent maintenance operations on different partitions of the same table or index. You can also run concurrent SELECT and DML operations against partitions that are unaffected by maintenance operations.
It increases the availability of mission-critical databases if critical tables and indexes are divided into partitions to reduce the maintenance windows, recovery times, and impact of failures.
Parallel execution provides specific advantages to optimize resource utilization, and minimize execution time. Parallel execution against partitioned objects is key for scalability in a clustered environment. Parallel execution is supported for queries and for DML and DDL.
Partitioning enables faster data access within an Oracle database. Whether a database has 10 GB or 10 TB of data, partitioning can improve data access by orders of magnitude. Partitioning can be implemented without requiring any modifications to your applications. For example, you could convert a nonpartitioned table to a partitioned table without needing to modify any of the SELECT statements or DML statements that access that table. You do not need to rewrite your application code to take advantage of partitioning.
A very large database has no minimum absolute size. Although a VLDB is a database like smaller databases, there are specific challenges in managing a VLDB. These challenges are related to the sheer size and the cost-effectiveness of performing operations against a system of that size.
Several trends have been responsible for the steady growth in database size:
For a long time, systems have been developed in isolation. Companies have started to see the benefits of combining these systems to enable cross-departmental analysis while reducing system maintenance costs. Consolidation of databases and applications is a key factor in the ongoing growth of database size.
Many companies face regulations for storing data for a minimum amount of time. The regulations generally result in more data being stored for longer periods of time.
Companies grow by expanding sales and operations or through mergers and acquisitions, causing the amount of generated and processed data to increase. At the same time, the user population that relies on the database for daily activities increases.
Partitioning is a critical feature for managing very large databases. Growth is the basic challenge that partitioning addresses for very large databases, and partitioning enables a divide and conquer technique for managing the tables and indexes in the database, especially as those tables and indexes grow. Partitioning is the feature that allows a database to scale for very large data sets while maintaining consistent performance, without unduly increasing administrative or hardware resources. Chapter 3, "Partitioning for Availability, Manageability, and Performance" provides availability, manageability, and performance considerations for partitioning implementations.
Chapter 9, "Backing Up and Recovering VLDBs" addresses the challenges surrounding backup and recovery for a VLDB.
Storage is a key component of a very large database. Chapter 10, "Storage Management for VLDBs" focuses on best practices for storage in a VLDB.
Information Lifecycle Management (ILM) is a set of processes and policies for managing data throughout its useful life. One important component of an ILM strategy is determining the most appropriate and cost-effective medium for storing data at any point during its lifetime: newer data used in day-to-day operations is stored on the fastest, most highly-available storage tier, while older data which is accessed infrequently may be stored on a less expensive and less efficient storage tier. Older data may also be updated less frequently so it makes sense to compress and store the data as read-only.
Oracle Database provides the ideal environment for implementing your ILM solution. Oracle supports multiple storage tiers, and because all of the data remains in the Oracle database, multiple storage tiers are completely transparent to the application and the data continues to be completely secure. Partitioning provides the fundamental technology that enables data in tables to be stored in different partitions.
Although multiple storage tiers and sophisticated ILM policies are most often found in enterprise-level systems, most companies and most databases need some degree of information lifecycle management. The most basic of ILM operations, archiving older data and purging or removing that data from the database, can be orders of magnitude faster when using partitioning.
For more information about ILM, see Chapter 5, "Using Partitioning for Information Lifecycle Management".
The benefits of partitioning are not limited to very large databases; every database, even a small one, can benefit from partitioning. Even a database whose size is measured in megabytes can gain the same type of performance and manageability benefits from partitioning as the largest multi-terabyte systems.
For more information about how partitioning can provide benefits in a data warehouse environment, see Chapter 6, "Using Partitioning in a Data Warehouse Environment".
For more information about how partitioning can provide benefits in an OLTP environment, see Chapter 7, "Using Partitioning in an Online Transaction Processing Environment".
This section contains some ideas for improving performance in a parallel execution environment.
Oracle Database cannot return results to a user process in parallel. If a query returns a large number of rows, execution of the query might indeed be faster. However, the user process can receive the rows only serially. To optimize parallel execution performance for queries that retrieve large result sets, use PARALLEL CREATE TABLE ... AS SELECT or direct-path INSERT to store the result set in the database. At a later time, users can view the result set serially.
Performing the SELECT in parallel does not influence the CREATE statement. If the CREATE statement is executed in parallel, however, the optimizer tries to make the SELECT run in parallel also.
When combined with the NOLOGGING option, the parallel version of CREATE TABLE ... AS SELECT provides a very efficient intermediate table facility, for example:

CREATE TABLE summary PARALLEL NOLOGGING AS
SELECT dim_1, dim_2 ..., SUM (meas_1)
FROM facts
GROUP BY dim_1, dim_2;
These tables can also be incrementally loaded with parallel INSERT. You can take advantage of intermediate tables using the following techniques:

Common subqueries can be computed once and referenced many times. This can allow some queries against star schemas (in particular, queries without selective WHERE-clause predicates) to be better parallelized. Note that star queries with selective WHERE-clause predicates using the star-transformation technique can be effectively parallelized automatically without any modification to the SQL.
Decompose complex queries into simpler steps to provide application-level checkpoint or restart. For example, a complex multitable join on a one-terabyte database could run for dozens of hours. A failure during this query would mean starting over from the beginning. Using CREATE TABLE ... AS SELECT or PARALLEL INSERT AS SELECT, you can rewrite the query as a sequence of simpler queries that run for a few hours each. If a system failure occurs, the query can be restarted from the last completed step.
Implement manual parallel delete operations efficiently by creating a new table that omits the unwanted rows from the original table, and then dropping the original table. Alternatively, you can use the convenient parallel delete feature, which directly deletes rows from the original table.
Create summary tables for efficient multidimensional drill-down analysis. For example, a summary table might store the sum of revenue grouped by month, brand, region, and salesman.
Reorganize tables, eliminating chained rows, compressing free space, and so on, by copying the old table to a new table. This is much faster than export/import and easier than reloading.
Be sure to use the DBMS_STATS package to gather optimizer statistics on newly created tables. To avoid I/O bottlenecks, specify a tablespace that is striped across at least as many physical disks as CPUs. To avoid fragmentation in allocating space, the number of files in a tablespace should be a multiple of the number of CPUs. See Oracle Database Data Warehousing Guide, for more information about bottlenecks.
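For instance (schema and table names assumed):

BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(ownname => 'SH', tabname => 'SUMMARY');
END;
/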
Use the EXPLAIN PLAN statement to see the execution plans for parallel queries. The EXPLAIN PLAN output shows optimizer information in the COST, BYTES, and CARDINALITY columns. You can also use the utlxplp.sql script to present the EXPLAIN PLAN output with all relevant parallel information.
There are several ways to optimize the parallel execution of join statements. You can alter system configuration, adjust parameters as discussed earlier in this chapter, or use hints, such as the DISTRIBUTION hint.

The key points when using EXPLAIN PLAN are to:
Verify optimizer selectivity estimates. If the optimizer thinks that only one row is produced from a query, it tends to favor using a nested loop. This could be an indication that the tables are not analyzed or that the optimizer has made an incorrect estimate about the correlation of multiple predicates on the same table. Extended statistics or a hint may be required to provide the optimizer with the correct selectivity or to force the optimizer to use another join method.
Use hash join on low cardinality join keys. If a join key has few distinct values, then a hash join may not be optimal. If the number of distinct values is less than the degree of parallelism (DOP), then some parallel query servers may be unable to work on the particular query.
Consider data skew. If a join key involves excessive data skew, a hash join may require some parallel query servers to work more than others. Consider using a hint to cause a BROADCAST distribution method if the optimizer did not choose it. Note that the optimizer considers the BROADCAST distribution method only if the OPTIMIZER_FEATURES_ENABLE initialization parameter is set to 9.0.2 or higher. See "V$PQ_TQSTAT" for more information.
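One way to request a broadcast distribution is the PQ_DISTRIBUTE hint. In this sketch (table aliases assumed), the rows of the smaller outer table customers are broadcast to each query server:

SELECT /*+ ORDERED PARALLEL(c) PARALLEL(s)
           USE_HASH(s) PQ_DISTRIBUTE(s BROADCAST, NONE) */
       c.cust_last_name, s.amount_sold
FROM customers c, sales s
WHERE c.cust_id = s.cust_id;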
The following example illustrates how the optimizer intends to execute a parallel query:

explain plan for
  SELECT /*+ PARALLEL */ cust_first_name, cust_last_name
  FROM customers c, sales s
  WHERE c.cust_id = s.cust_id;

------------------------------------------------------------
| Id | Operation                        | Name             |
------------------------------------------------------------
|  0 | SELECT STATEMENT                 |                  |
|  1 |  PX COORDINATOR                  |                  |
|  2 |   PX SEND QC (RANDOM)            | :TQ10000         |
|  3 |    NESTED LOOPS                  |                  |
|  4 |     PX BLOCK ITERATOR            |                  |
|  5 |      TABLE ACCESS FULL           | CUSTOMERS        |
|  6 |     PARTITION RANGE ALL          |                  |
|  7 |      BITMAP CONVERSION TO ROWIDS |                  |
|  8 |       BITMAP INDEX SINGLE VALUE  | SALES_CUST_BIX   |
------------------------------------------------------------

Note
-----
- Computed Degree of Parallelism is 2
- Degree of Parallelism of 2 is derived from scan of object SH.CUSTOMERS
When you want to refresh your data warehouse using parallel insert, update, or delete operations, there are additional issues to consider when designing the physical database. These considerations do not affect parallel execution operations. These issues are:
If a parallel restriction is violated, the operation is simply performed serially. If a direct-path INSERT restriction is violated, then the APPEND hint is ignored and a conventional insert operation is performed. No error message is returned.
For tables that do not have the parallel DML itl invariant property (tables created before Oracle Database release 9.2 or tables that were created with the COMPATIBLE initialization parameter set to less than 9.2), the degree of parallelism (DOP) equals the number of partitions or subpartitions. That means that, if the table is not partitioned, the query runs serially. To see which tables do not have this property, issue the following statement:

SELECT u.name, o.name
FROM obj$ o, tab$ t, user$ u
WHERE o.obj# = t.obj#
  AND o.owner# = u.user#
  AND bitand(t.property, 536870912) != 536870912;
For information about the interested transaction list (ITL), also called the transaction table, refer to Oracle Database Concepts.
If you have global indexes, a global index segment and global index blocks are shared by server processes of the same parallel DML statement. Even if the operations are not performed against the same row, the server processes can share the same index blocks. Each server transaction needs one transaction entry in the index block header before it can make changes to a block. Therefore, in the CREATE INDEX or ALTER INDEX statements, you should set INITRANS, the initial number of transactions allocated within each data block, to a large value, such as the maximum DOP against this index.
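For example (the index name and value are assumed):

ALTER INDEX orders_global_ix INITRANS 32;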
There is a limitation on the available number of transaction free lists for segments in dictionary-managed tablespaces. After a segment has been created, the number of process and transaction free lists is fixed and cannot be altered. If you specify a large number of process free lists in the segment header, you might find that this limits the number of transaction free lists that are available. You can abate this limitation the next time you re-create the segment header by decreasing the number of process free lists; this leaves more room for transaction free lists in the segment header.
For UPDATE and DELETE operations, each server process can require its own transaction free list. The parallel DML DOP is thus effectively limited by the smallest number of transaction free lists available on the table and on any of the global indexes the DML statement must maintain. For example, if the table has 25 transaction free lists and the table has two global indexes, one with 50 transaction free lists and one with 30 transaction free lists, the DOP is limited to 25. If the table had 40 transaction free lists, the DOP would have been limited to 30.
The FREELISTS parameter of the STORAGE clause is used to set the number of process free lists. By default, no process free lists are created.
The default number of transaction free lists depends on the block size. For example, if the number of process free lists is not set explicitly, a 4 KB block has about 80 transaction free lists by default. The minimum number of transaction free lists is 25.
Parallel DDL and parallel DML operations can generate a large number of redo logs. A single ARCH process to archive these redo logs might not be able to keep up. To avoid this problem, you can spawn multiple archiver processes manually or by using a job queue.
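One way to raise the number of archiver processes is the LOG_ARCHIVE_MAX_PROCESSES initialization parameter, which can be changed dynamically; for example:

ALTER SYSTEM SET LOG_ARCHIVE_MAX_PROCESSES = 4;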
Parallel DML operations use a large number of data, index, and undo blocks in the buffer cache during a short interval. For example, suppose you see a high number of free_buffer_waits after querying the V$SYSTEM_EVENT view, as in the following query:
SELECT TOTAL_WAITS FROM V$SYSTEM_EVENT WHERE EVENT = 'FREE BUFFER WAITS';
In this case, you should consider increasing the number of DBWn processes. If there are no waits for free buffers, the query does not return any rows.
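The number of database writer processes is controlled by the DB_WRITER_PROCESSES initialization parameter, which can be changed only with an instance restart; for example:

ALTER SYSTEM SET DB_WRITER_PROCESSES = 4 SCOPE = SPFILE;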
The [NO]LOGGING clause applies to tables, partitions, tablespaces, and indexes. Virtually no log is generated for certain operations (such as direct-path INSERT) if the NOLOGGING clause is used. The NOLOGGING attribute is not specified at the INSERT statement level but is instead specified when using the ALTER or CREATE statement for a table, partition, index, or tablespace.
When a table or index has NOLOGGING set, neither parallel nor serial direct-path INSERT operations generate redo logs. Processes running with the NOLOGGING option set run faster because no redo is generated. However, after a NOLOGGING operation against a table, partition, or index, if a media failure occurs before a backup is performed, then all tables, partitions, and indexes that have been modified might be corrupted.
Direct-path INSERT operations (except for dictionary updates) never generate redo logs if the NOLOGGING clause is used. The NOLOGGING attribute does not affect undo, only redo. To be precise, NOLOGGING allows the direct-path INSERT operation to generate a negligible amount of redo (range-invalidation redo, as opposed to full image redo).
For backward compatibility, [UN]RECOVERABLE is still supported as an alternate keyword with the CREATE TABLE statement. This alternate keyword might not be supported, however, in future releases.
At the tablespace level, the logging clause specifies the default logging attribute for all tables, indexes, and partitions created in the tablespace. When an existing tablespace logging attribute is changed by the ALTER TABLESPACE statement, then all tables, indexes, and partitions created after the ALTER statement have the new logging attribute; existing ones do not change their logging attributes. The tablespace-level logging attribute can be overridden by the specifications at the table, index, or partition level.
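For example (the tablespace name is assumed):

ALTER TABLESPACE ts_data NOLOGGING;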
The default logging attribute is LOGGING. However, if you have put the database in NOARCHIVELOG mode by issuing ALTER DATABASE NOARCHIVELOG, then all operations that can be done without logging do not generate logs, regardless of the specified logging attribute.
Multiple processes can work simultaneously to create an index. By dividing the work necessary to create an index among multiple server processes, Oracle Database can create the index more quickly than if a single server process created the index serially.
Parallel index creation works in much the same way as a table scan with an ORDER BY clause. The table is randomly sampled and a set of index keys is found that equally divides the index into the same number of pieces as the DOP. A first set of query processes scans the table, extracts key-rowid pairs, and sends each pair to a process in a second set of query processes based on a key. Each process in the second set sorts the keys and builds an index in the usual fashion. After all index pieces are built, the parallel execution coordinator simply concatenates the pieces (which are ordered) to form the final index.
Parallel local index creation uses a single server set. Each server process in the set is assigned a table partition to scan and for which to build an index partition. Because half as many server processes are used for a given DOP, parallel local index creation can be run with a higher DOP. However, the DOP is restricted to be less than or equal to the number of index partitions you want to create. To avoid this limitation, you can use the DBMS_PCLXUTIL package.
You can optionally specify that no redo and undo logging should occur during index creation. This can significantly improve performance but temporarily renders the index unrecoverable. Recoverability is restored after the new index is backed up. If your application can tolerate a window where recovery of the index requires it to be re-created, then you should consider using the NOLOGGING clause.
The PARALLEL clause in the CREATE INDEX statement is the only way in which you can specify the DOP for creating the index. If the DOP is not specified in the parallel clause of the CREATE INDEX statement, then the number of CPUs is used as the DOP. If there is no PARALLEL clause, index creation is done serially.
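For instance (index and table names assumed), the following creates an index with a DOP of 8 and no redo logging:

CREATE INDEX sales_cust_ix ON sales (cust_id)
PARALLEL 8 NOLOGGING;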
When creating an index in parallel, the STORAGE clause refers to the storage of each of the subindexes created by the query server processes. Therefore, an index created with an INITIAL value of 5 MB and a DOP of 12 consumes at least 60 MB of storage during index creation because each process starts with an extent of 5 MB. When the query coordinator process combines the sorted subindexes, some extents might be trimmed, and the resulting index might be smaller than the requested 60 MB.
When you add or enable a UNIQUE or PRIMARY KEY constraint on a table, you cannot automatically create the required index in parallel. Instead, manually create an index on the desired columns, using the CREATE INDEX statement and an appropriate PARALLEL clause, and then add or enable the constraint. Oracle Database then uses the existing index when enabling or adding the constraint.
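A sketch of this sequence (table, column, and index names assumed):

CREATE UNIQUE INDEX emp_pk_ix ON employees (employee_id) PARALLEL 8;
ALTER TABLE employees ADD CONSTRAINT emp_pk
  PRIMARY KEY (employee_id) USING INDEX emp_pk_ix;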
Multiple constraints on the same table can be enabled concurrently and in parallel if all the constraints are in the ENABLE NOVALIDATE state. In the following example, the ALTER TABLE ... ENABLE CONSTRAINT statement performs the table scan that checks the constraint in parallel:
CREATE TABLE a (a1 NUMBER CONSTRAINT ach CHECK (a1 > 0) ENABLE NOVALIDATE)
PARALLEL;
INSERT INTO a VALUES (1);
COMMIT;
ALTER TABLE a ENABLE CONSTRAINT ach;
This section provides an overview of parallel DML functionality.
The functionality available using an INSERT statement can be summarized as shown in Table 8-5:

Table 8-5 Summary of INSERT Features

Insert Type | Parallel | Serial | NOLOGGING |
---|---|---|---|
Conventional | No. See the text in this section about using the NOAPPEND hint with parallel DML enabled to perform a parallel conventional insert. | Yes | No |
Direct-path INSERT (APPEND) | Yes, but requires ALTER SESSION ENABLE PARALLEL DML and either a PARALLEL hint in the statement or a PARALLEL attribute set for the table in the data dictionary. | Yes, but requires the APPEND hint. | Yes, but requires the NOLOGGING attribute set for the table or partition. |
If parallel DML is enabled and there is a PARALLEL hint or PARALLEL attribute set for the table in the data dictionary, then insert operations are parallel and appended, unless a restriction applies. If either the PARALLEL hint or PARALLEL attribute is missing, the insert operation is performed serially. Note that automatic DOP parallelizes the DML part of a SQL statement only if parallel DML is enabled or forced.
If parallel DML is enabled, then you can use the NOAPPEND hint to perform a parallel conventional insert operation. For example, you can use /*+ NOAPPEND PARALLEL */ with the SQL INSERT statement to perform a parallel conventional insert.
SQL> INSERT /*+ NOAPPEND PARALLEL */ INTO sales_hist SELECT * FROM sales;
The advantage of the parallel conventional insert operation is the ability to perform online operations with none of the restrictions of direct-path INSERT. The disadvantage is that this process may be slower than direct-path INSERT.
The append mode is the default during a parallel insert operation: data is always inserted into a new block, which is allocated to the table. Therefore, the APPEND hint is optional. You should use append mode to increase the speed of INSERT operations, but not when space utilization must be optimized. You can use NOAPPEND to override append mode.
The APPEND hint applies to both serial and parallel insert operations: even serial insertions are faster if you use this hint. The APPEND hint, however, does require more space and locking overhead.
You can use NOLOGGING with APPEND to make the process even faster. NOLOGGING means that no redo log is generated for the operation. NOLOGGING is never the default; use it when you want to optimize performance. It should not typically be used when recovery is needed for the table or partition. If recovery is needed, be sure to perform a backup immediately after the operation. Use the ALTER TABLE [NO]LOGGING statement to set the appropriate value.
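For instance (table names assumed):

ALTER TABLE sales_hist NOLOGGING;
-- Parallel DML must be enabled for the insert portion to run in parallel
ALTER SESSION ENABLE PARALLEL DML;
INSERT /*+ APPEND PARALLEL(sales_hist) */ INTO sales_hist
SELECT * FROM sales;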
When the table or partition has the PARALLEL attribute in the data dictionary, that attribute setting is used to determine parallelism of INSERT, UPDATE, and DELETE statements and queries. An explicit PARALLEL hint for a table in a statement overrides the effect of the PARALLEL attribute in the data dictionary.
You can use the NO_PARALLEL hint to override a PARALLEL attribute for the table in the data dictionary. In general, hints take precedence over attributes.
DML operations are considered for parallelization only if the session is in a PARALLEL DML enabled mode. (Use ALTER SESSION ENABLE PARALLEL DML to enter this mode.) The mode does not affect parallelization of queries or of the query portions of a DML statement.
In the INSERT ... SELECT statement, you can specify a PARALLEL hint after the INSERT keyword, in addition to the hint after the SELECT keyword. The PARALLEL hint after the INSERT keyword applies to the INSERT operation only, and the PARALLEL hint after the SELECT keyword applies to the SELECT operation only. Thus, parallelism of the INSERT and SELECT operations is independent of each other. If one operation cannot be performed in parallel, it has no effect on whether the other operation can be performed in parallel.
The ability to parallelize insert operations causes a change in existing behavior if the user has explicitly enabled the session for parallel DML and if the table in question has a PARALLEL attribute set in the data dictionary entry. In that case, existing INSERT ... SELECT statements that have the select operation parallelized can also have their insert operation parallelized.
If you query multiple tables, you can specify multiple SELECT PARALLEL hints and multiple PARALLEL attributes.
Example 8-10 shows the addition of the new employees who were hired after the acquisition of ACME.
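A sketch of such a statement, assuming the acquired employees are staged in a table named acme_emp:

ALTER SESSION ENABLE PARALLEL DML;
INSERT /*+ PARALLEL(employees) */ INTO employees
SELECT /*+ PARALLEL(acme_emp) */ * FROM acme_emp;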
The PARALLEL hint (placed immediately after the UPDATE or DELETE keyword) applies not only to the underlying scan operation, but also to the UPDATE or DELETE operation. Alternatively, you can specify UPDATE or DELETE parallelism in the PARALLEL clause specified in the definition of the table to be modified.
If you have explicitly enabled parallel DML for the session or transaction, UPDATE or DELETE statements that have their query operation parallelized can also have their UPDATE or DELETE operation parallelized. Any subqueries or updatable views in the statement can have their own separate PARALLEL hints or clauses, but these parallel directives do not affect the decision to parallelize the update or delete. If these operations cannot be performed in parallel, it has no effect on whether the UPDATE or DELETE portion can be performed in parallel.
Example 8-11 shows the update operation to give a 10 percent salary raise to all clerks in Dallas.
Example 8-11 Parallelizing UPDATE and DELETE
UPDATE /*+ PARALLEL(employees) */ employees
SET SAL = SAL * 1.1
WHERE JOB = 'CLERK'
  AND DEPTNO IN (SELECT DEPTNO FROM DEPT WHERE LOCATION = 'DALLAS');
The PARALLEL hint is applied to the UPDATE operation and to the scan.
Example 8-12 shows the removal of all products in the grocery category because the grocery business line was recently spun off into a separate company.
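A sketch of such a statement, assuming a products table with a prod_category column:

ALTER SESSION ENABLE PARALLEL DML;
DELETE /*+ PARALLEL(products) */ FROM products
WHERE prod_category = 'GROCERY';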
Parallel DML combined with the updatable join views facility provides an efficient solution for refreshing the tables of a data warehouse system. To refresh tables is to update them with the differential data generated from the OLTP production system.
In the following example, assume a refresh of a table named customers that has columns c_key, c_name, and c_addr. The differential data contains either new rows or rows that have been updated since the last refresh of the data warehouse. In this example, the updated data is shipped from the production system to the data warehouse system by means of ASCII files. These files must be loaded into a temporary table, named diff_customer, before starting the refresh process. You can use SQL*Loader with both the parallel and direct options to efficiently perform this task. You can use the APPEND hint when loading in parallel as well.
After diff_customer is loaded, the refresh process can be started. It can be performed in two phases or by merging in parallel, as demonstrated in the following:
The following statement is a straightforward SQL implementation of the update using subqueries:
UPDATE customers
SET (c_name, c_addr) = (SELECT c_name, c_addr
                        FROM diff_customer
                        WHERE diff_customer.c_key = customers.c_key)
WHERE c_key IN (SELECT c_key FROM diff_customer);
Unfortunately, the two subqueries in this statement affect performance.
An alternative is to rewrite this query using updatable join views. To rewrite the query, you must first add a primary key constraint to the diff_customer table to ensure that the modified columns map to a key-preserved table:

CREATE UNIQUE INDEX diff_pkey_ind ON diff_customer(c_key) PARALLEL NOLOGGING;
ALTER TABLE diff_customer ADD PRIMARY KEY (c_key);
You can then update the customers table with the following SQL statement:

UPDATE /*+ PARALLEL(cust_joinview) */
  (SELECT /*+ PARALLEL(customers) PARALLEL(diff_customer) */
          customers.c_name AS c_name, customers.c_addr AS c_addr,
          diff_customer.c_name AS c_newname, diff_customer.c_addr AS c_newaddr
   FROM customers, diff_customer
   WHERE customers.c_key = diff_customer.c_key) cust_joinview
SET c_name = c_newname, c_addr = c_newaddr;
The underlying scans feeding the join view cust_joinview are done in parallel. You can then parallelize the update to further improve performance, but only if the customers table is partitioned.
The last phase of the refresh process consists of inserting the new rows from the diff_customer temporary table to the customers table. Unlike the update case, you cannot avoid having a subquery in the INSERT statement:

INSERT /*+ PARALLEL(customers) */ INTO customers
SELECT * FROM diff_customer s
WHERE s.c_key NOT IN (SELECT c_key FROM customers);
However, you can guarantee that the subquery is transformed into an anti-hash join by using the HASH_AJ hint. Doing so enables you to use parallel INSERT to execute the preceding statement efficiently. Parallel INSERT is applicable even if the table is not partitioned.
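The HASH_AJ hint goes inside the NOT IN subquery; a sketch based on the preceding statement:

INSERT /*+ PARALLEL(customers) */ INTO customers
SELECT * FROM diff_customer s
WHERE s.c_key NOT IN (SELECT /*+ HASH_AJ */ c_key FROM customers);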
You can combine update and insert operations into one statement, commonly known as a merge. The following statement achieves the same result as all of the statements in "Updating the Table in Parallel" and "Inserting the New Rows into the Table in Parallel":
MERGE INTO customers
USING diff_customer
ON (diff_customer.c_key = customers.c_key)
WHEN MATCHED THEN
  UPDATE SET c_name = diff_customer.c_name, c_addr = diff_customer.c_addr
WHEN NOT MATCHED THEN
  INSERT VALUES (diff_customer.c_key, diff_customer.c_name, diff_customer.c_addr);
This section describes how to perform partition and subpartition maintenance operations for both tables and indexes.
Note: The following sections discuss maintenance operations on partitioned tables. Where the usability of indexes or index partitions affected by the maintenance operation is discussed, consider the following: only indexes and index partitions that are not empty are candidates for being marked UNUSABLE; if they are empty, the USABLE/UNUSABLE status is left unchanged.
Table 4-1 lists partition maintenance operations that can be performed on partitioned tables and composite partitioned tables, and Table 4-2 lists subpartition maintenance operations that can be performed on composite partitioned tables. For each type of partitioning and subpartitioning, the specific clause of the ALTER TABLE statement that is used to perform that maintenance operation is listed.
Table 4-1 ALTER TABLE Maintenance Operations for Table Partitions
Maintenance Operation | Range, Composite Range-* | Interval, Composite Interval-* | Hash | List, Composite List-* | Reference |
---|---|---|---|---|---|
Adding Partitions | ADD PARTITION | ADD PARTITION | ADD PARTITION | ADD PARTITION | N/A (Footnote 1) |
Coalescing Partitions | N/A | N/A | COALESCE PARTITION | N/A | N/A (Footnote 1) |
Dropping Partitions | DROP PARTITION | DROP PARTITION | N/A | DROP PARTITION | N/A (Footnote 1) |
Exchanging Partitions | EXCHANGE PARTITION | EXCHANGE PARTITION | EXCHANGE PARTITION | EXCHANGE PARTITION | EXCHANGE PARTITION |
Merging Partitions | MERGE PARTITIONS | MERGE PARTITIONS | N/A | MERGE PARTITIONS | N/A (Footnote 1) |
Modifying Default Attributes | MODIFY DEFAULT ATTRIBUTES | MODIFY DEFAULT ATTRIBUTES | MODIFY DEFAULT ATTRIBUTES | MODIFY DEFAULT ATTRIBUTES | MODIFY DEFAULT ATTRIBUTES |
Modifying Real Attributes of Partitions | MODIFY PARTITION | MODIFY PARTITION | MODIFY PARTITION | MODIFY PARTITION | MODIFY PARTITION |
Modifying List Partitions: Adding Values | N/A | N/A | N/A | MODIFY PARTITION ... ADD VALUES | N/A |
Modifying List Partitions: Dropping Values | N/A | N/A | N/A | MODIFY PARTITION ... DROP VALUES | N/A |
Moving Partitions | MOVE PARTITION | MOVE PARTITION | MOVE PARTITION | MOVE PARTITION | MOVE PARTITION |
Renaming Partitions | RENAME PARTITION | RENAME PARTITION | RENAME PARTITION | RENAME PARTITION | RENAME PARTITION |
Splitting Partitions | SPLIT PARTITION | SPLIT PARTITION | N/A | SPLIT PARTITION | N/A (Footnote 1) |
Truncating Partitions | TRUNCATE PARTITION | TRUNCATE PARTITION | TRUNCATE PARTITION | TRUNCATE PARTITION | TRUNCATE PARTITION |
Footnote 1 These operations cannot be performed on reference-partitioned tables. If performed on a parent table, then these operations cascade to all descendant tables.
Table 4-2 ALTER TABLE Maintenance Operations for Table Subpartitions
Maintenance Operation | Composite *-Range | Composite *-Hash | Composite *-List |
---|---|---|---|
Adding Partitions | MODIFY PARTITION ... ADD SUBPARTITION | MODIFY PARTITION ... ADD SUBPARTITION | MODIFY PARTITION ... ADD SUBPARTITION |
Coalescing Partitions | N/A | MODIFY PARTITION ... COALESCE SUBPARTITION | N/A |
Dropping Partitions | DROP SUBPARTITION | N/A | DROP SUBPARTITION |
Exchanging Partitions | EXCHANGE SUBPARTITION | N/A | EXCHANGE SUBPARTITION |
Merging Partitions | MERGE SUBPARTITIONS | N/A | MERGE SUBPARTITIONS |
Modifying Default Attributes | MODIFY DEFAULT ATTRIBUTES FOR PARTITION | MODIFY DEFAULT ATTRIBUTES FOR PARTITION | MODIFY DEFAULT ATTRIBUTES FOR PARTITION |
Modifying Real Attributes of Partitions | MODIFY SUBPARTITION | MODIFY SUBPARTITION | MODIFY SUBPARTITION |
Modifying List Partitions: Adding Values | N/A | N/A | MODIFY SUBPARTITION ... ADD VALUES |
Modifying List Partitions: Dropping Values | N/A | N/A | MODIFY SUBPARTITION ... DROP VALUES |
Modifying a Subpartition Template | SET SUBPARTITION TEMPLATE | SET SUBPARTITION TEMPLATE | SET SUBPARTITION TEMPLATE |
Moving Partitions | MOVE SUBPARTITION | MOVE SUBPARTITION | MOVE SUBPARTITION |
Renaming Partitions | RENAME SUBPARTITION | RENAME SUBPARTITION | RENAME SUBPARTITION |
Splitting Partitions | SPLIT SUBPARTITION | N/A | SPLIT SUBPARTITION |
Truncating Partitions | TRUNCATE SUBPARTITION | TRUNCATE SUBPARTITION | TRUNCATE SUBPARTITION |
Note: The first time you use table compression to introduce a compressed partition into a partitioned table that has bitmap indexes and that currently contains only uncompressed partitions, you must do the following:

Either drop the existing bitmap indexes or mark them UNUSABLE before adding the compressed partition.
Set the table compression attribute.
Rebuild the indexes.

These actions are independent of whether any partitions contain data and of the operation that introduces the compressed partition. This does not apply to partitioned tables with B-tree indexes or to partitioned index-organized tables.
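A sketch of this sequence (table, index, and partition names assumed):

ALTER INDEX sales_cust_bix UNUSABLE;                   -- mark the existing bitmap index unusable
ALTER TABLE sales MODIFY DEFAULT ATTRIBUTES COMPRESS;  -- set the table compression attribute
ALTER TABLE sales MOVE PARTITION p2010 COMPRESS;       -- introduce the compressed partition
ALTER INDEX sales_cust_bix REBUILD PARTITION p2010;    -- rebuild each index partition in turn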
Table 4-3 lists maintenance operations that can be performed on index partitions, and indicates on which type of index (global or local) they can be performed. The ALTER INDEX clause used for the maintenance operation is shown.
Global indexes do not reflect the structure of the underlying table. If partitioned, they can be partitioned by range or hash. Partitioned global indexes share some, but not all, of the partition maintenance operations that can be performed on partitioned tables.
Because local indexes reflect the underlying structure of the table, partitioning is maintained automatically when table partitions and subpartitions are affected by maintenance activity. Therefore, partition maintenance on local indexes is less necessary and there are fewer options.
Table 4-3 ALTER INDEX Maintenance Operations for Index Partitions
Maintenance Operation | Type of Index | Range | Hash and List | Composite |
---|---|---|---|---|
Adding Index Partitions | Global | - | ADD PARTITION (hash only) | - |
 | Local | N/A | N/A | N/A |
Dropping Index Partitions | Global | DROP PARTITION | - | - |
 | Local | N/A | N/A | N/A |
Modifying Default Attributes of Index Partitions | Global | MODIFY DEFAULT ATTRIBUTES | - | - |
 | Local | MODIFY DEFAULT ATTRIBUTES | MODIFY DEFAULT ATTRIBUTES | MODIFY DEFAULT ATTRIBUTES |
Modifying Real Attributes of Index Partitions | Global | MODIFY PARTITION | - | - |
 | Local | MODIFY PARTITION | MODIFY PARTITION | MODIFY PARTITION |