Oracle® R Enterprise User's Guide Release 1.3 for Linux and Windows Part Number E36761-04 |
|
|
PDF · Mobi · ePub |
Oracle R Enterprise Transparency Layer performs these functions:
Traps all R commands and scripts prior to execution and looks for opportunities to ship them to the Oracle Database for execution in the database.
Enables transparent grandparent SQL generation for R expressions that use mapped data types.
Converts R commands and scripts to SQL equivalents to leverage the Oracle Database as a high-performance compute engine, taking advantage of query optimization, tables indexes, deferred evaluation, and parallel execution.
The Oracle R Enterprise transparency layer allows R users to use R syntax to work directly with database-resident objects without having to pull data from Oracle into R's memory on the user's desktop. It thus enables R users to work with data larger than desktop memory allows.
R language constructs and syntax are supported for objects mapped to Oracle Database objects.
This chapter summarizes the functionality provided by the Transparency Layer. These topics are discussed:
The following R data types have been overloaded so that they are mapped to database objects and hence enabled for in-database execution:
Character, Integer, Numeric, and Logical vectors
Factors
Data Frame
Matrix is overloaded in two situations:
Linear algebra cross-products
Creating input matrices for advanced analytics
class(object)
reports the data type of such mapped objects. For example, if the table NARROW contains the column AGE and AGE is numeric,
R> class(NARROW$AGE) [1] "ore.numeric" attr(,"package") [1] "OREbase"
This section describes how Oracle database supports Date and Time Data Types and illustrates how to use these data types in Oracle R Enterprise.
Oracle Database supports these data and time data types:
The DATE
data type stores date and time information. For each DATE
value, Oracle stores the following fields: YEAR
, MONTH
, DAY
, HOUR
, MINUTE
, and SECOND.
The valid date range is January 1, 4712 BC, to December 31, 9999 AD.
The TIMESTAMP
data type is an extension of the DATE
data type. It stores the year, month, and day of the DATE
data type, plus hour, minute, and second values. Supports an optional fractional_seconds_precision
, the number of digits in the fractional part of the SECOND
field in DATE
. You can specify 0 to 9 digits; the default is 6 digits.
There are two extensions of TIMESTAMP
:
TIMESTAMP WITH TIME ZONE
is TIMESTAMP
as well as time zone displacement value TIMEZONE_HOUR
and TIMEZONE_MINUTE
.
TIMESTAMP WITH LOCAL TIME ZONE
is TIMESTAMP WITH TIME ZONE
with data normalized to the database time zone when it is stored in the database. When the data is retrieved, users see the data in the session time zone.
INTERVAL YEAR TO MONTH
stores a period of time using the YEAR
and MONTH
fields. This data type is useful for representing the difference between two data time values when only the year and month values are significant.
INTERVAL DAY TO SECOND
stores a period of time in terms of days, hours, minutes, and seconds. This data type is useful for representing the precise difference between two date time values.
INTERVAL YEAR TO MONTH
stores a period of time in years and months, where optional year_precision, which is the number of digits in the YEAR
date time field. Accepted values are 0 to 9.
INTERVAL DAY TO SECOND
stores a period of time in days, hours, minutes, and seconds. Supports an optional day_precision
, the maximum number of digits in the DAY
date time field (value is 0 to 9 with a default of 2.) Also supports optional fractional_seconds_precision
, the number of digits in the fractional part of the SECOND
field. (value 0 to 9 with a default of 6).
For detailed information about Oracle Data Types, see “Data Types” in Oracle Database SQL Language Reference.
You can perform all expected operations on dates.
Oracle R Enterprise provides these classes to support date and time calculations:
ore.date
(Oracle DATE
)
ore.datetime
(TIMESTAMP
, TIMESTAMP WITH TIME ZONE
, TIMESTAMP WITH LOCAL TIME ZONE
)
ore.difftime
(INTERVAL DAY TO SECOND
)
Note that ore.datetime
objects do not support a time zone setting, instead they use the system time zone Sys.timezone()
if it is available or GMT if Sys.timezone()
is not available.
Oracle R Enterprise supports data pre-processing functionality extensively so all data preparation and analysis can take place directly in the database.
You are not restricted to using this list of functions. If a specific function that you need is not supported by Oracle R Enterprise, you can pull data from the database into the R engine memory using ore.pull()
to create an in-memory R object first, and use any R function.
The following operators and functions are supported. See R documentation for syntax and semantics of these operators and functions. Syntax and semantics for these items are unchanged when used on a corresponding database-mapped data type (also known as an Oracle R Enterprise data type).
Mathematical transformations: abs, sign, sqrt, ceiling, floor, trunc, cummax, cummin, cumprod, cumsum, log, loglo, log10, log2, log1p, acos, acosh, asin, asinh, atan, atanh, exp, expm1, cos, cosh, sin, sinh, tan, atan2, tanh, gamma, lgamma, digamma, trigamma, factorial, lfactorial, round, signif, pmin, pmax, zapsmall, rank, diff, besselI, besselJ, besselK, besselY
Basic statistics: mean, summary, min, max, sum, any, all, median, range, IQR, fivenum, mad, quantile, sd, var, table, tabulate, rowSums, colSums, rowMeans, colMeans, cor, cov
Arithmetic operators: +, -, *, /, ^, %%, %/%
Comparison operators: ==, >, <, !=, <=, >=
Logical operators: &, |, xor
Set operations: unique, %in%, subset
String operations: tolower, toupper, casefold, toString, chartr, sub, gsub, substr, substring, paste, nchar, grepl
Combine Data Frame: cbind, rbind, merge
Combine vectors: append
Vector creation: ifelse
Subset selection: [, [[, $, head, tail, window, subset, Filter, na.omit, na.exclude, complete.cases
Subset replacement: [<-, [[<-, $<-
Data reshaping: split, unlist
Data processing: eval, with, within, transform
Apply variants: tapply, aggregate, by
Special value checks: is.na, is.finite, is.infinite, is.nan
Metadata functions: nrow, NROW, ncol, NCOL, nlevels, names, names<-, row, col, dimnames, dimnames<-, dim, length, row.names, row.names<-, rownames, rownames<-, colnames, levels, reorder
Graphics::arrows, boxplot, cdplot, co.intervals, coplot, hist, identify, lines, matlines, matplot, matpoints, pairs, plot, points, polygon, polypath, rug, segments, smoothScatter, sunflowerplot, symbols, text, xspline, xy.coords
Conversion functions: as.logical, as.integer, as.numeric, as.character, as.vector, as.factor, as.data.frame
Type check functions: is.logical, is.integer, is.numeric, is.character, is.vector, is.factor, is.data.frame
Character manipulation: nchar, tolower, toupper, casefold, chartr, sub, gsub, substr.
Other ore.frame functions: data.frame, max.col, scale
Hypothesis testing: binom.test, chisq.test, ks.test, prop.test, t.test, var.test, wilcox.test
Various Distributions: Density, cumulative distribution, and quantile functions for standard distributions
ore.matrix function: show, is.matrix. as.matrix%*% (matrix multiplication), t, crossprod (matrix cross-product), tcrossprod (matrix cross-product A times transpose of B), solve (invert), backsolve, forwardsolve, all appropriate mathematical functions (abs, sign, etc.), summary (max, min, all, etc.) mean
The Oracle R Enterprise sample programs described in Oracle R Enterprise Examples include several examples using each category of these functions with Oracle R Enterprise data types.