Oracle® Database Administrator's Guide 11g Release 2 (11.2) Part Number E25494-02
Oracle Database allows you read-only access to data in external tables. External tables are defined as tables that do not reside in the database, and can be in any format for which an access driver is provided. By providing the database with metadata describing an external table, the database is able to expose the data in the external table as if it were data residing in a regular database table. The external data can be queried directly and in parallel using SQL.
You can, for example, select, join, or sort external table data. You can also create views and synonyms for external tables. However, no DML operations (UPDATE, INSERT, or DELETE) are possible, and no indexes can be created, on external tables.
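As a minimal sketch of these operations, the statements below query an external table like any other table. They assume the admin_ext_employees external table created later in this section and the departments table of the hr sample schema; the view and synonym names are hypothetical and chosen only for illustration.

```sql
-- Join external data with a regular table and sort the result.
SELECT e.last_name, e.job_id, d.department_name
FROM   admin_ext_employees e
JOIN   departments d ON d.department_id = e.department_id
ORDER  BY e.last_name;

-- Views and synonyms can be defined over an external table.
-- (admin_ext_emp_names_v and ext_emps are hypothetical names.)
CREATE OR REPLACE VIEW admin_ext_emp_names_v AS
  SELECT employee_id, first_name, last_name
  FROM   admin_ext_employees;

CREATE SYNONYM ext_emps FOR admin_ext_employees;

-- DML and index creation are expected to be rejected, for example:
-- DELETE FROM admin_ext_employees WHERE employee_id = 360;
-- CREATE INDEX admin_ext_emp_ix ON admin_ext_employees (employee_id);
```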
External tables provide a framework to unload the result of an arbitrary SELECT statement into a platform-independent Oracle-proprietary format that can be used by Oracle Data Pump. External tables provide a valuable means for performing basic extraction, transformation, and loading (ETL) tasks that are common for data warehousing.
The means of defining the metadata for external tables is through the CREATE TABLE...ORGANIZATION EXTERNAL statement. This external table definition can be thought of as a view that allows running any SQL query against external data without requiring that the external data first be loaded into the database. An access driver is the actual mechanism used to read the external data in the table. When you use external tables to unload data, the metadata is automatically created based on the data types in the SELECT statement.
Oracle Database provides two access drivers for external tables. The default access driver is ORACLE_LOADER, which allows the reading of data from external files using the Oracle loader technology. The ORACLE_LOADER access driver provides data mapping capabilities which are a subset of the control file syntax of the SQL*Loader utility. The second access driver, ORACLE_DATAPUMP, lets you unload data (that is, read data from the database and insert it into an external table, represented by one or more external files) and then reload it into an Oracle Database.
The following are restrictions on external tables:
The ANALYZE statement is not supported for gathering statistics for external tables. Use the DBMS_STATS package instead.
Virtual columns are not supported.
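The first restriction can be illustrated with a short PL/SQL sketch that gathers statistics through DBMS_STATS rather than ANALYZE. It assumes the hr.admin_ext_employees external table created later in this section. Setting estimate_percent to NULL (compute rather than sample) is an assumption based on the understanding that sampling is not supported for external tables; check the DBMS_STATS documentation for your release.

```sql
BEGIN
  -- Gather optimizer statistics for the external table.
  -- estimate_percent => NULL requests a full computation
  -- instead of sampling (assumed necessary for external tables).
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => 'HR',
    tabname          => 'ADMIN_EXT_EMPLOYEES',
    estimate_percent => NULL);
END;
/
```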
See Also:
Oracle Database Utilities for information about access drivers
Oracle Database Data Warehousing Guide for information about using external tables for ETL in a data warehousing environment
Oracle Database Performance Tuning Guide for information about using the DBMS_STATS package
You create external tables using the CREATE TABLE statement with an ORGANIZATION EXTERNAL clause. This statement creates only metadata in the data dictionary.
Note:
External tables cannot have virtual columns.
The following example creates an external table and then loads the data into a database table. Alternatively, you can unload data through the external table framework by specifying the AS subquery clause of the CREATE TABLE statement. External table data pump unload can use only the ORACLE_DATAPUMP access driver.
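The unload alternative can be sketched as follows. This is an illustration under stated assumptions, not part of the original example: the directory object admin_unload_dir, the table name admin_ext_emp_unload, and the file name emp_unload.dmp are hypothetical, and the hr user is assumed to hold READ and WRITE privileges on that directory.

```sql
-- Unload selected columns of hr.employees into a Data Pump format file.
-- The column definitions of the external table are derived from the
-- data types in the SELECT list.
CREATE TABLE admin_ext_emp_unload
  ORGANIZATION EXTERNAL
  ( TYPE ORACLE_DATAPUMP
    DEFAULT DIRECTORY admin_unload_dir   -- hypothetical directory object
    LOCATION ('emp_unload.dmp')          -- hypothetical file name
  )
  AS SELECT employee_id, first_name, last_name, hire_date, salary
     FROM   employees;
```

Once created, the file can be queried through admin_ext_emp_unload, or copied to another system and exposed there through an ORACLE_DATAPUMP external table with a matching column list.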
EXAMPLE: Creating an External Table and Loading Data
In this example, the data for the external table resides in the two text files empxt1.dat and empxt2.dat.
The file empxt1.dat contains the following sample data:
360,Jane,Janus,ST_CLERK,121,17-MAY-2001,3000,0,50,jjanus
361,Mark,Jasper,SA_REP,145,17-MAY-2001,8000,.1,80,mjasper
362,Brenda,Starr,AD_ASST,200,17-MAY-2001,5500,0,10,bstarr
363,Alex,Alda,AC_MGR,145,17-MAY-2001,9000,.15,80,aalda
The file empxt2.dat contains the following sample data:
401,Jesse,Cromwell,HR_REP,203,17-MAY-2001,7000,0,40,jcromwel
402,Abby,Applegate,IT_PROG,103,17-MAY-2001,9000,.2,60,aapplega
403,Carol,Cousins,AD_VP,100,17-MAY-2001,27000,.3,90,ccousins
404,John,Richardson,AC_ACCOUNT,205,17-MAY-2001,5000,0,110,jrichard
The following SQL statements create an external table named admin_ext_employees in the hr schema and load data from the external table into the hr.employees table.
CONNECT / AS SYSDBA;
-- Set up directories and grant access to hr
CREATE OR REPLACE DIRECTORY admin_dat_dir AS '/flatfiles/data';
CREATE OR REPLACE DIRECTORY admin_log_dir AS '/flatfiles/log';
CREATE OR REPLACE DIRECTORY admin_bad_dir AS '/flatfiles/bad';
GRANT READ ON DIRECTORY admin_dat_dir TO hr;
GRANT WRITE ON DIRECTORY admin_log_dir TO hr;
GRANT WRITE ON DIRECTORY admin_bad_dir TO hr;
-- hr connects. Provide the user password (hr) when prompted.
CONNECT hr
-- create the external table
CREATE TABLE admin_ext_employees
                   (employee_id       NUMBER(4),
                    first_name        VARCHAR2(20),
                    last_name         VARCHAR2(25),
                    job_id            VARCHAR2(10),
                    manager_id        NUMBER(4),
                    hire_date         DATE,
                    salary            NUMBER(8,2),
                    commission_pct    NUMBER(2,2),
                    department_id     NUMBER(4),
                    email             VARCHAR2(25)
                   )
     ORGANIZATION EXTERNAL
     (
       TYPE ORACLE_LOADER
       DEFAULT DIRECTORY admin_dat_dir
       ACCESS PARAMETERS
       (
         records delimited by newline
         badfile admin_bad_dir:'empxt%a_%p.bad'
         logfile admin_log_dir:'empxt%a_%p.log'
         fields terminated by ','
         missing field values are null
         ( employee_id, first_name, last_name, job_id, manager_id,
           hire_date char date_format date mask "dd-mon-yyyy",
           salary, commission_pct, department_id, email
         )
       )
       LOCATION ('empxt1.dat', 'empxt2.dat')
     )
     PARALLEL
     REJECT LIMIT UNLIMITED;
-- enable parallel for loading (good if lots of data to load)
ALTER SESSION ENABLE PARALLEL DML;
-- load the data in hr employees table
INSERT INTO employees (employee_id, first_name, last_name, job_id, manager_id,
                       hire_date, salary, commission_pct, department_id, email)
            SELECT * FROM admin_ext_employees;
The following paragraphs contain descriptive information about this example.
The first few statements in this example create the directory objects for the operating system directories that contain the data sources, and for the bad record and log files specified in the access parameters. You must also grant READ or WRITE directory object privileges, as appropriate.
Note:
When creating a directory object or BFILEs, ensure that the following conditions are met:
The operating system file must not be a symbolic or hard link.
The operating system directory path named in the Oracle Database directory object must be an existing OS directory path.
The operating system directory path named in the directory object should not contain any symbolic links in its components.
The TYPE specification indicates the access driver of the external table. The access driver is the API that interprets the external data for the database. If you omit the TYPE specification, ORACLE_LOADER is the default access driver. You must specify the ORACLE_DATAPUMP access driver if you specify the AS subquery clause to unload data from one Oracle Database and reload it into the same or a different Oracle Database.
The access parameters, specified in the ACCESS PARAMETERS clause, are opaque to the database. These access parameters are defined by the access driver, and are provided to the access driver by the database when the external table is accessed. See Oracle Database Utilities for a description of the ORACLE_LOADER access parameters.
The PARALLEL clause enables parallel query on the data sources. The granule of parallelism is by default a data source, but parallel access within a data source is implemented whenever possible. For example, if PARALLEL=3 were specified, then multiple parallel execution servers could be working on a data source. However, parallel access within a data source is provided by the access driver only if all of the following conditions are met:
The media allows random positioning within a data source
It is possible to find a record boundary from a random position
The datafiles are large enough to make it worthwhile to break up into multiple chunks
Note:
Specifying a PARALLEL clause is of value only when dealing with large amounts of data. Otherwise, it is not advisable to specify a PARALLEL clause, and doing so can be detrimental.
The REJECT LIMIT clause in this example is set to UNLIMITED, which specifies that there is no limit on the number of errors that can occur during a query of the external data. For parallel access, this limit applies to each parallel execution server independently. For example, if REJECT LIMIT 10 is specified, each parallel query process is allowed 10 rejections. Hence, the only precisely enforced values for REJECT LIMIT on parallel query are 0 and UNLIMITED.
In this example, the INSERT INTO ... SELECT statement generates a dataflow from the external data source to the Oracle Database SQL engine where data is processed. As data is parsed by the access driver from the external table sources and provided to the external table interface, the external data is converted from its external representation to its Oracle Database internal data type.
See Also:
Oracle Database SQL Language Reference for details of the syntax of the CREATE TABLE statement for creating external tables and for restrictions on the use of clauses
You can use any of the ALTER TABLE clauses shown in Table 20-5 to change the characteristics of an external table. No other clauses are permitted.
Table 20-5 ALTER TABLE Clauses for External Tables
ALTER TABLE Clause | Description | Example |
---|---|---|
REJECT LIMIT | Changes the reject limit | ALTER TABLE admin_ext_employees REJECT LIMIT 100; |
PROJECT COLUMN | Determines how the access driver validates rows in subsequent queries | ALTER TABLE admin_ext_employees PROJECT COLUMN REFERENCED; ALTER TABLE admin_ext_employees PROJECT COLUMN ALL; |
DEFAULT DIRECTORY | Changes the default directory specification | ALTER TABLE admin_ext_employees DEFAULT DIRECTORY admin_dat2_dir; |
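After altering an external table, one way to confirm the change is to query the USER_EXTERNAL_TABLES data dictionary view. The query below is a minimal sketch that assumes you are connected as the owner of admin_ext_employees; the PROPERTY column is expected to show whether the table projects ALL or only REFERENCED columns.

```sql
-- Inspect the current settings of the external table.
SELECT table_name,
       type_name,               -- access driver, for example ORACLE_LOADER
       default_directory_name,
       reject_limit,
       property                 -- ALL or REFERENCED
FROM   user_external_tables
WHERE  table_name = 'ADMIN_EXT_EMPLOYEES';
```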
Caution:
There are security implications to consider when using the PREPROCESSOR clause. See Oracle Database Security Guide for more information.
External tables can be preprocessed by user-supplied preprocessor programs. By using a preprocessing program, users can use data from a file that is not in a format supported by the driver. For example, a user may want to access data stored in a compressed format. Specifying a decompression program for the ORACLE_LOADER access driver allows the data to be decompressed as the access driver processes the data.
To use the preprocessing feature, you must specify the PREPROCESSOR clause in the access parameters of the ORACLE_LOADER access driver. The preprocessor program must reside in a directory that is identified by a directory object, and the user accessing the external table must have EXECUTE privileges for that directory object. The following example includes the PREPROCESSOR clause and specifies the directory and preprocessor program.
CREATE TABLE sales_transactions_ext
(PROD_ID NUMBER,
CUST_ID NUMBER,
TIME_ID DATE,
CHANNEL_ID CHAR,
PROMO_ID NUMBER,
QUANTITY_SOLD NUMBER,
AMOUNT_SOLD NUMBER(10,2),
UNIT_COST NUMBER(10,2),
UNIT_PRICE NUMBER(10,2))
ORGANIZATION external
(TYPE oracle_loader
DEFAULT DIRECTORY data_file_dir
ACCESS PARAMETERS
(RECORDS DELIMITED BY NEWLINE
CHARACTERSET US7ASCII
PREPROCESSOR exec_file_dir:'zcat'
BADFILE log_file_dir:'sh_sales.bad_xt'
LOGFILE log_file_dir:'sh_sales.log_xt'
FIELDS TERMINATED BY "|" LDRTRIM
( PROD_ID,
CUST_ID,
TIME_ID,
CHANNEL_ID,
PROMO_ID,
QUANTITY_SOLD,
AMOUNT_SOLD,
UNIT_COST,
UNIT_PRICE))
location ('sh_sales.dat.gz')
)REJECT LIMIT UNLIMITED;
The PREPROCESSOR clause is not available for databases that use Oracle Database Vault.
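The directory setup that the preceding example depends on might look like the following sketch. The operating system paths and the grantee sh are assumptions made for illustration; only the directory object names come from the example.

```sql
CONNECT / AS SYSDBA
-- Directory that holds the preprocessor program (path is hypothetical).
CREATE OR REPLACE DIRECTORY exec_file_dir AS '/usr/local/bin';
-- Directories for the data file and for the log and bad files
-- (paths are hypothetical).
CREATE OR REPLACE DIRECTORY data_file_dir AS '/flatfiles/sales/data';
CREATE OR REPLACE DIRECTORY log_file_dir  AS '/flatfiles/sales/log';

-- EXECUTE is required to run the preprocessor; READ and WRITE cover
-- the data source and the bad and log files, respectively.
GRANT EXECUTE ON DIRECTORY exec_file_dir TO sh;
GRANT READ    ON DIRECTORY data_file_dir TO sh;
GRANT WRITE   ON DIRECTORY log_file_dir  TO sh;
```

Keeping the preprocessor in its own directory object, without WRITE granted on it, is one way to limit the risk that a database user replaces the program; see Oracle Database Security Guide for the authoritative guidance.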
See Also:
Oracle Database Utilities for more information about the PREPROCESSOR clause
Oracle Database Security Guide for more information about the security implications of the PREPROCESSOR clause
For an external table, the DROP TABLE statement removes only the table metadata in the database. It has no effect on the actual data, which resides outside of the database.
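As a brief sketch, dropping the external table from the earlier example looks like this. The external files empxt1.dat and empxt2.dat remain in the operating system directory, and the directory objects are separate schema-independent objects that are not removed with the table; the optional DROP DIRECTORY statement is shown only as an assumption about cleanup you might want.

```sql
-- Removes only the metadata; the external files are left in place.
DROP TABLE admin_ext_employees;

-- Optional cleanup, run by a suitably privileged user, if the
-- directory object is no longer needed.
-- DROP DIRECTORY admin_dat_dir;
```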
System and object privileges for external tables are a subset of those for regular tables. Only the following system privileges are applicable to external tables:
CREATE ANY TABLE
ALTER ANY TABLE
DROP ANY TABLE
SELECT ANY TABLE
Only the following object privileges are applicable to external tables:
ALTER
SELECT
However, object privileges associated with a directory are:
READ
WRITE
For external tables, READ privileges are required on directory objects that contain data sources, while WRITE privileges are required for directory objects containing bad, log, or discard files.
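The privileges above can be combined as in the following sketch, which lets a second user query the external table from the earlier example. The user name report_user is hypothetical, and the statements assume they are run by a user who is allowed to grant these privileges.

```sql
-- Object privilege on the external table itself.
GRANT SELECT ON hr.admin_ext_employees TO report_user;

-- Directory object privileges: READ for the data sources,
-- WRITE for the directories that receive bad and log files.
GRANT READ  ON DIRECTORY admin_dat_dir TO report_user;
GRANT WRITE ON DIRECTORY admin_log_dir TO report_user;
GRANT WRITE ON DIRECTORY admin_bad_dir TO report_user;
```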