Loading Tables

This section describes techniques for loading data into tables. In contains the following topics:

Methods for Loading Tables
Improving INSERT Performance with Direct-Path INSERT
Using Conventional Inserts to Load Tables
Avoiding Bulk INSERT Failures with DML Error Logging

Note:

Beginning with release 11.2.0.2 of Oracle Database, the default size of the first extent of any new segment for a partitioned table is 8 MB instead of 64 KB. This helps improve performance of inserts and queries on partitioned tables. Although partitioned tables will start with a larger initial size, once sufficient data is inserted, the space consumption will be the same as in previous releases. You can override this default by setting the INITIAL size in the storage clause for the table. This new default only applies to table partitions and LOB partitions.

Methods for Loading Tables

There are several means of inserting or initially loading data into your tables. Most commonly used are the following:

Method	Description
SQL*Loader	This Oracle utility program loads data from external files into tables of an Oracle Database. For information about SQL*Loader, see Oracle Database Utilities.
`CREATE TABLE` ... `AS SELECT` statement (CTAS)	Using this SQL statement you can create a table and populate it with data selected from another existing table, including an external table.
`INSERT` statement	The `INSERT` statement enables you to add rows to a table, either by specifying the column values or by specifying a subquery that selects data from another existing table, including an external table. One form of the `INSERT` statement enables direct-path `INSERT`, which can improve performance, and is useful for bulk loading. See "Improving INSERT Performance with Direct-Path INSERT". If you are inserting a lot of data and want to avoid statement termination and rollback if an error is encountered, you can insert with DML error logging. See "Avoiding Bulk INSERT Failures with DML Error Logging".
`MERGE` statement	The `MERGE` statement enables you to insert rows into or update rows of a table, by selecting rows from another existing table. If a row in the new data corresponds to an item that already exists in the table, then an `UPDATE` is performed, else an `INSERT` is performed.

See Oracle Database SQL Language Reference for details on the CREATE TABLE ... AS SELECT, INSERT, and MERGE statements.

Note:

Only a few details and examples of inserting data into tables are included in this book. Oracle documentation specific to data warehousing and application development provide more extensive information about inserting and manipulating data in tables. See:

See Also:

"Managing External Tables"

Improving INSERT Performance with Direct-Path INSERT

When loading large amounts of data, you can improve load performance by using direct-path INSERT.

This section contains:

About Direct-Path INSERT
How Direct-Path INSERT Works
Loading Data with Direct-Path INSERT
Specifying the Logging Mode for Direct-Path INSERT
Additional Considerations for Direct-Path INSERT

About Direct-Path INSERT

Oracle Database inserts data into a table in one of two ways:

During conventional INSERT operations, the database reuses free space in the table, interleaving newly inserted data with existing data. During such operations, the database also maintains referential integrity constraints.
During direct-path INSERT operations, the database appends the inserted data after existing data in the table. Data is written directly into datafiles, bypassing the buffer cache. Free space in the table is not reused, and referential integrity constraints are ignored. Direct-path INSERT can perform significantly better than conventional insert.

The database can insert data either in serial mode, where one process executes the statement, or in parallel mode, where multiple processes work together simultaneously to run a single SQL statement. The latter is referred to as parallel execution.

The following are benefits of direct-path INSERT:

During direct-path INSERT, you can disable the logging of redo and undo entries to reduce load time. Conventional insert operations, in contrast, must always log such entries, because those operations reuse free space and maintain referential integrity.
Direct-path INSERT operations ensure atomicity of the transaction, even when run in parallel mode. Atomicity cannot be guaranteed during parallel direct-path loads (using SQL*Loader).

When performing parallel direct-path loads, one notable difference between SQL*Loader and INSERT statements is the following: If errors occur during parallel direct-path loads with SQL*Loader, the load completes, but some indexes could be marked UNUSABLE at the end of the load. Parallel direct-path INSERT, in contrast, rolls back the statement if errors occur during index update.

Note:

A conventional INSERT operation checks for violations of NOT NULL constraints during the insert. Therefore, if a NOT NULL constraint is violated for a conventional INSERT operation, then the error is returned during the insert. A direct-path INSERT operation checks for violations of NOT NULL constraints before the insert. Therefore, if a NOT NULL constraint is violated for a direct-path INSERT operation, then the error is returned before the insert.

How Direct-Path INSERT Works

You can use direct-path INSERT on both partitioned and nonpartitioned tables.

Serial Direct-Path INSERT into Partitioned or Nonpartitioned Tables

The single process inserts data beyond the current high water mark of the table segment or of each partition segment. (The high-water mark is the level at which blocks have never been formatted to receive data.) When a COMMIT runs, the high-water mark is updated to the new value, making the data visible to users.

Parallel Direct-Path INSERT into Partitioned Tables

This situation is analogous to serial direct-path INSERT. Each parallel execution server is assigned one or more partitions, with no more than one process working on a single partition. Each parallel execution server inserts data beyond the current high-water mark of its assigned partition segment(s). When a COMMIT runs, the high-water mark of each partition segment is updated to its new value, making the data visible to users.

Parallel Direct-Path INSERT into Nonpartitioned Tables

Each parallel execution server allocates a new temporary segment and inserts data into that temporary segment. When a COMMIT runs, the parallel execution coordinator merges the new temporary segments into the primary table segment, where it is visible to users.

Loading Data with Direct-Path INSERT

You can load data with direct-path INSERT by using direct-path INSERT SQL statements, inserting data in parallel mode, or by using the Oracle SQL*Loader utility in direct-path mode. A direct-path INSERT can be done in either serial or parallel mode.

Serial Mode Inserts with SQL Statements

You can activate direct-path INSERT in serial mode with SQL in the following ways:

If you are performing an INSERT with a subquery, specify the APPEND hint in each INSERT statement, either immediately after the INSERT keyword, or immediately after the SELECT keyword in the subquery of the INSERT statement.
If you are performing an INSERT with the VALUES clause, specify the APPEND_VALUES hint in each INSERT statement immediately after the INSERT keyword. Direct-path INSERT with the VALUES clause is best used when there are hundreds of thousands or millions of rows to load. The typical usage scenario is for array inserts using OCI. Another usage scenario might be inserts in a FORALL loop in PL/SQL.

If you specify the APPEND hint (as opposed to the APPEND_VALUES hint) in an INSERT statement with a VALUES clause, the APPEND hint is ignored and a conventional insert is performed.

The following is an example of using the APPEND hint to perform a direct-path INSERT:

INSERT /*+ APPEND */ INTO sales_hist SELECT * FROM sales WHERE cust_id=8890;

The following PL/SQL code fragment is an example of using the APPEND_VALUES hint:

FORALL i IN 1..numrecords
  INSERT /*+ APPEND_VALUES */ INTO orderdata 
  VALUES(ordernum(i), custid(i), orderdate(i),shipmode(i), paymentid(i));
COMMIT;

Parallel Mode Inserts with SQL Statements

When you are inserting in parallel mode, direct-path INSERT is the default. However, you can insert in parallel mode using conventional INSERT by using the NOAPPEND PARALLEL hint.

To run in parallel DML mode, the following requirements must be met:

You must have Oracle Enterprise Edition installed.
You must enable parallel DML in your session. To do this, submit the following statement:
```
ALTER SESSION { ENABLE | FORCE } PARALLEL DML;
```
You must meet at least one of the following requirements:
- Specify the parallel attribute for the target table, either at create time or subsequently
- Specify the PARALLEL hint for each insert operation
- Set the database initialization parameter PARALLEL_DEGREE_POLICY to AUTO

To disable direct-path INSERT, specify the NOAPPEND hint in each INSERT statement. Doing so overrides parallel DML mode.

Note:

You cannot query or modify data inserted using direct-path INSERT immediately after the insert is complete. If you attempt to do so, an ORA-12838 error is generated. You must first issue a COMMIT statement before attempting to read or modify the newly-inserted data.

See Also:

"Using Conventional Inserts to Load Tables"
Oracle Database Performance Tuning Guide for more information on using hints
Oracle Database SQL Language Reference for more information on the subquery syntax of INSERT statements and for additional restrictions on using direct-path INSERT

Specifying the Logging Mode for Direct-Path INSERT

Direct-path INSERT lets you choose whether to log redo and undo information during the insert operation.

You can specify logging mode for a table, partition, index, or LOB storage at create time (in a CREATE statement) or subsequently (in an ALTER statement).
If you do not specify either LOGGING or NOLOGGING at these times:
- The logging attribute of a partition defaults to the logging attribute of its table.
- The logging attribute of a table or index defaults to the logging attribute of the tablespace in which it resides.
- The logging attribute of LOB storage defaults to LOGGING if you specify CACHE for LOB storage. If you do not specify CACHE, then the logging attributes defaults to that of the tablespace in which the LOB values resides.
You set the logging attribute of a tablespace in a CREATE TABLESPACE or ALTER TABLESPACE statements.

Note:
If the database or tablespace is in FORCE LOGGING mode, then direct path INSERT always logs, regardless of the logging setting.

Direct-Path INSERT with Logging

In this mode, Oracle Database performs full redo logging for instance and media recovery. If the database is in ARCHIVELOG mode, then you can archive redo logs to tape. If the database is in NOARCHIVELOG mode, then you can recover instance crashes but not disk failures.

Direct-Path INSERT without Logging

In this mode, Oracle Database inserts data without redo or undo logging. Instead, the database logs a small number of block range invalidation redo records and periodically updates the control file with information about the most recent direct write.

Direct-path INSERT without logging improves performance. However, if you subsequently must perform media recovery, the invalidation redo records mark a range of blocks as logically corrupt, because no redo data was logged for them. Therefore, it is important that you back up the data after such an insert operation.

Beginning with release 11.2.0.2 of Oracle Database, you can significantly improve the performance of unrecoverable direct path inserts by disabling the periodic update of the control files. You do so by setting the initialization parameter DB_UNRECOVERABLE_SCN_TRACKING to FALSE. However, if you perform an unrecoverable direct path insert with these control file updates disabled, you will no longer be able to accurately query the database to determine if any datafiles are currently unrecoverable.

See Also:

Oracle Database Backup and Recovery User's Guide for more information about unrecoverable datafiles
The section "Determining If a Backup Is Required After Unrecoverable Operations" in Oracle Data Guard Concepts and Administration

Additional Considerations for Direct-Path INSERT

The following are some additional considerations when using direct-path INSERT.

Compressed Tables

If a table is created with the basic compression, then you must use direct-path INSERT to compress table data as it is loaded. If a table is created with OLTP, warehouse, or online archival compression, then best compression ratios are achieved with direct-path INSERT.

See "Consider Using Table Compression" for more information.

Index Maintenance with Direct-Path INSERT

Oracle Database performs index maintenance at the end of direct-path INSERT operations on tables (partitioned or nonpartitioned) that have indexes. This index maintenance is performed by the parallel execution servers for parallel direct-path INSERT or by the single process for serial direct-path INSERT. You can avoid the performance impact of index maintenance by making the index unusable before the INSERT operation and then rebuilding it afterward.

See Also:

"Making an Index Unusable"

Space Considerations with Direct-Path INSERT

Direct-path INSERT requires more space than conventional-path INSERT.

All serial direct-path INSERT operations, as well as parallel direct-path INSERT into partitioned tables, insert data above the high-water mark of the affected segment. This requires some additional space.

Parallel direct-path INSERT into nonpartitioned tables requires even more space, because it creates a temporary segment for each degree of parallelism. If the nonpartitioned table is not in a locally managed tablespace in automatic segment-space management mode, you can modify the values of the NEXT and PCTINCREASE storage parameter and MINIMUM EXTENT tablespace parameter to provide sufficient (but not excess) storage for the temporary segments. Choose values for these parameters so that:

The size of each extent is not too small (no less than 1 MB). This setting affects the total number of extents in the object.
The size of each extent is not so large that the parallel INSERT results in wasted space on segments that are larger than necessary.

After the direct-path INSERT operation is complete, you can reset these parameters to settings more appropriate for serial operations.

Locking Considerations with Direct-Path INSERT

During direct-path INSERT, the database obtains exclusive locks on the table (or on all partitions of a partitioned table). As a result, users cannot perform any concurrent insert, update, or delete operations on the table, and concurrent index creation and build operations are not permitted. Concurrent queries, however, are supported, but the query will return only the information before the insert operation.

Using Conventional Inserts to Load Tables

During conventional INSERT operations, the database reuses free space in the table, interleaving newly inserted data with existing data. During such operations, the database also maintains referential integrity constraints. Unlike direct-path INSERT operations, conventional INSERT operations do not require an exclusive lock on the table.

Several other restrictions apply to direct-path INSERT operations that do not apply to conventional INSERT operations. See Oracle Database SQL Language Reference for information about these restrictions.

You can perform a conventional INSERT operation in serial mode or in parallel mode using the NOAPPEND hint.

The following is an example of using the NOAPPEND hint to perform a conventional INSERT in serial mode:

INSERT /*+ NOAPPEND */ INTO sales_hist SELECT * FROM sales WHERE cust_id=8890;

The following is an example of using the NOAPPEND hint to perform a conventional INSERT in parallel mode:

INSERT /*+ NOAPPEND PARALLEL */ INTO sales_hist
   SELECT * FROM sales;

To run in parallel DML mode, the following requirements must be met:

You must have Oracle Enterprise Edition installed.
You must enable parallel DML in your session. To do this, submit the following statement:
```
ALTER SESSION { ENABLE | FORCE } PARALLEL DML;
```
You must meet at least one of the following requirements:
- Specify the parallel attribute for the target table, either at create time or subsequently
- Specify the PARALLEL hint for each insert operation
- Set the database initialization parameter PARALLEL_DEGREE_POLICY to AUTO

Avoiding Bulk INSERT Failures wit h DML Error Logging

When you load a table using an INSERT statement with subquery, if an error occurs, the statement is terminated and rolled back in its entirety. This can be wasteful of time and system resources. For such INSERT statements, you can avoid this situation by using the DML error logging feature.

To use DML error logging, you add a statement clause that specifies the name of an error logging table into which the database records errors encountered during DML operations. When you add this error logging clause to the INSERT statement, certain types of errors no longer terminate and roll back the statement. Instead, each error is logged and the statement continues. You then take corrective action on the erroneous rows at a later time.

DML error logging works with INSERT, UPDATE, MERGE, and DELETE statements. This section focuses on INSERT statements.

To insert data with DML error logging:

Create an error logging table. (Optional)

You can create the table manually or use the DBMS_ERRLOG package to automatically create it for you. See "Creating an Error Logging Table" for details.
Execute an INSERT statement and include an error logging clause. This clause:
- Optionally references the error logging table that you created. If you do not provide an error logging table name, the database logs to an error logging table with a default name. The default error logging table name is ERR$_ followed by the first 25 characters of the name of the table that is being inserted into.
- Optionally includes a tag (a numeric or string literal in parentheses) that gets added to the error log to help identify the statement that caused the errors. If the tag is omitted, a NULL value is used.
- Optionally includes a REJECT LIMIT subclause.
  
  This subclause indicates the maximum number of errors that can be encountered before the INSERT statement terminates and rolls back. You can also specify UNLIMITED. The default reject limit is zero, which means that upon encountering the first error, the error is logged and the statement rolls back. For parallel DML operations, the reject limit is applied to each parallel server.
Note:
If the statement exceeds the reject limit and rolls back, the error logging table retains the log entries recorded so far.

See Oracle Database SQL Language Reference for error logging clause syntax information.
Query the error logging table and take corrective action for the rows that generated errors.

See "Error Logging Table Format", later in this section, for details on the error logging table structure.

Example The following statement inserts rows into the DW_EMPL table and logs errors to the ERR_EMPL table. The tag 'daily_load' is copied to each log entry. The statement terminates and rolls back if the number of errors exceeds 25.

INSERT INTO dw_empl
  SELECT employee_id, first_name, last_name, hire_date, salary, department_id 
  FROM employees
  WHERE hire_date > sysdate - 7
  LOG ERRORS INTO err_empl ('daily_load') REJECT LIMIT 25

For more examples, see Oracle Database SQL Language Reference and Oracle Database Data Warehousing Guide.

Error Logging Table Format

The error logging table consists of two parts:

A mandatory set of columns that describe the error. For example, one column contains the Oracle error number.

Table 20-3 lists these error description columns.
An optional set of columns that contain data from the row that caused the error. The column names match the column names from the table being inserted into (the "DML table").

The number of columns in this part of the error logging table can be zero, one, or more, up to the number of columns in the DML table. If a column exists in the error logging table that has the same name as a column in the DML table, the corresponding data from the offending row being inserted is written to this error logging table column. If a DML table column does not have a corresponding column in the error logging table, the column is not logged. If the error logging table contains a column with a name that does not match a DML table column, the column is ignored.

Because type conversion errors are one type of error that might occur, the data types of the optional columns in the error logging table must be types that can capture any value without data loss or conversion errors. (If the optional log columns were of the same types as the DML table columns, capturing the problematic data into the log could suffer the same data conversion problem that caused the error.) The database makes a best effort to log a meaningful value for data that causes conversion errors. If a value cannot be derived, NULL is logged for the column. An error on insertion into the error logging table causes the statement to terminate.

Table 20-4 lists the recommended error logging table column data types to use for each data type from the DML table. These recommended data types are used when you create the error logging table automatically with the DBMS_ERRLOG package.

Table 20-3 Mandatory Error Description Columns

Column Name	Data Type	Description
`ORA_ERR_NUMBER$`	`NUMBER`	Oracle error number
`ORA_ERR_MESG$`	`VARCHAR2(2000)`	Oracle error message text
`ORA_ERR_ROWID$`	`ROWID`	Rowid of the row in error (for update and delete)
`ORA_ERR_OPTYP$`	`VARCHAR2(2)`	Type of operation: insert (`I`), update (`U`), delete (`D`) Note: Errors from the update clause and insert clause of a `MERGE` operation are distinguished by the `U` and `I` values.
`ORA_ERR_TAG$`	`VARCHAR2(2000)`	Value of the tag supplied by the user in the error logging clause

Table 20-4 Error Logging Table Column Data Types

DML Table Column Type	Error Logging Table Column Type	Notes
`NUMBER`	`VARCHAR2(4000)`	Able to log conversion errors
`CHAR/VARCHAR2(n)`	`VARCHAR2(4000)`	Logs any value without information loss
`NCHAR/NVARCHAR2(n)`	`NVARCHAR2(4000)`	Logs any value without information loss
`DATE/TIMESTAMP`	`VARCHAR2(4000)`	Logs any value without information loss. Converts to character format with the default date/time format mask
`RAW`	`RAW(2000)`	Logs any value without information loss
`ROWID`	`UROWID`	Logs any rowid type
`LONG/LOB`		Not supported
User-defined types		Not supported

Creating an Error Logging Table

You can create an error logging table manually, or you can use a PL/SQL package to automatically create one for you.

Creating an Error Logging Table Automatically

You use the DBMS_ERRLOG package to automatically create an error logging table. The CREATE_ERROR_LOG procedure creates an error logging table with all of the mandatory error description columns plus all of the columns from the named DML table, and performs the data type mappings shown in Table 20-4.

The following statement creates the error logging table used in the previous example.

EXECUTE DBMS_ERRLOG.CREATE_ERROR_LOG('DW_EMPL', 'ERR_EMPL');

See Oracle Database PL/SQL Packages and Types Reference for details on DBMS_ERRLOG.

Creating an Error Logging Table Manually

You use standard DDL to manually create the error logging table. See "Error Logging Table Format" for table structure requirements. You must include all mandatory error description columns. They can be in any order, but must be the first columns in the table.

Error Logging Restrictions and Caveats

Oracle Database logs the following errors during DML operations:

Column values that are too large
Constraint violations (NOT NULL, unique, referential, and check constraints)
Errors raised during trigger execution
Errors resulting from type conversion between a column in a subquery and the corresponding column of the table
Partition mapping errors
Certain MERGE operation errors (ORA-30926: Unable to get a stable set of rows for MERGE operation.)

Some errors are not logged, and cause the DML operation to terminate and roll back. For a list of these errors and for other DML logging restrictions, see the discussion of the error_logging_clause in the INSERT section of Oracle Database SQL Language Reference.

Space Considerations

Ensure that you consider space requirements before using DML error logging. You require available space not only for the table being inserted into, but also for the error logging table.

Security

The user who issues the INSERT statement with DML error logging must have INSERT privileges on the error logging table.

See Also:

Oracle Database SQL Language Reference and Oracle Database Data Warehousing Guide for DML error logging examples.

Loading Tables

Methods for Loading Tables

Improving INSERT Performance with Direct-Path INSERT

About Direct-Path INSERT

How Direct-Path INSERT Works

Serial Direct-Path INSERT into Partitioned or Nonpartitioned Tables

Parallel Direct-Path INSERT into Partitioned Tables

Parallel Direct-Path INSERT into Nonpartitioned Tables

Loading Data with Direct-Path INSERT

Serial Mode Inserts with SQL Statements

Parallel Mode Inserts with SQL Statements

Specifying the Logging Mode for Direct-Path INSERT

Direct-Path INSERT with Logging

Direct-Path INSERT without Logging

Additional Considerations for Direct-Path INSERT

Compressed Tables

Index Maintenance with Direct-Path INSERT

Space Considerations with Direct-Path INSERT

Locking Considerations with Direct-Path INSERT

Using Conventional Inserts to Load Tables

Avoiding Bulk INSERT Failures with DML Error Logging

Error Logging Table Format

Creating an Error Logging Table

Creating an Error Logging Table Automatically

Creating an Error Logging Table Manually

Error Logging Restrictions and Caveats

Space Considerations

Security

Avoiding Bulk INSERT Failures wit h DML Error Logging