Contents
Title and Copyright Information
Preface
Audience
Documentation Accessibility
Conventions
Part I Introductions
1 Introduction to Oracle Data Mining
About Oracle Data Mining
Data Mining in the Database Kernel
Oracle Data Mining with R Extensibility
Data Mining in Oracle Exadata
About Partitioned Model
Interfaces to Oracle Data Mining
PL/SQL API
DBMS_DATA_MINING with R and Supported Subprograms
SQL Functions
Oracle Data Miner
Predictive Analytics
Overview of Database Analytics
2 Oracle Data Mining Basics
Mining Functions
Supervised Data Mining
Supervised Learning: Testing
Supervised Learning: Scoring
Unsupervised Data Mining
Unsupervised Learning: Scoring
Algorithms
Oracle Data Mining Supervised Algorithms
Oracle Data Mining Unsupervised Algorithms
Data Preparation
Oracle Data Mining Simplifies Data Preparation
Case Data
Nested Data
Text Data
In-Database Scoring
Parallel Execution and Ease of Administration
SQL Functions for Model Apply and Dynamic Scoring
Part II Mining Functions
3 Regression
About Regression
How Does Regression Work?
Linear Regression
Multivariate Linear Regression
Regression Coefficients
Nonlinear Regression
Multivariate Nonlinear Regression
Confidence Bounds
Testing a Regression Model
Regression Statistics
Root Mean Squared Error
Mean Absolute Error
Regression Algorithms
4 Classification
About Classification
Testing a Classification Model
Confusion Matrix
Lift
Lift Statistics
Receiver Operating Characteristic (ROC)
The ROC Curve
Area Under the Curve
ROC and Model Bias
ROC Statistics
Biasing a Classification Model
Costs
Costs Versus Accuracy
Positive and Negative Classes
Assigning Costs and Benefits
Priors and Class Weights
Classification Algorithms
5 Anomaly Detection
About Anomaly Detection
One-Class Classification
Anomaly Detection for Single-Class Data
Anomaly Detection for Finding Outliers
Anomaly Detection Algorithm
6 Clustering
About Clustering
How are Clusters Computed?
Scoring New Data
Hierarchical Clustering
Rules
Support and Confidence
Evaluating a Clustering Model
Clustering Algorithms
7 Association
About Association
Association Rules
Market-Basket Analysis
Association Rules and eCommerce
Transactional Data
Association Algorithm
8 Feature Selection and Extraction
Finding the Best Attributes
About Feature Selection and Attribute Importance
Attribute Importance and Scoring
About Feature Extraction
Feature Extraction and Scoring
Algorithms for Attribute Importance and Feature Extraction
Part III Algorithms
9 Apriori
About Apriori
Association Rules and Frequent Itemsets
Antecedent and Consequent
Confidence
Data Preparation for Apriori
Native Transactional Data and Star Schemas
Items and Collections
Sparse Data
Calculating Association Rules
Itemsets
Frequent Itemsets
Example: Calculating Rules from Frequent Itemsets
Aggregates
Reverse Confidence
Minimum Support Count
Transaction Count
Including and Excluding Rules
Excluding Rules
Example: Calculating Aggregates
Performance Impact for Aggregates
Evaluating Association Rules
Support
Confidence
Lift
10 Decision Tree
About Decision Tree
Decision Tree Rules
Confidence and Support
Advantages of Decision Trees
XML for Decision Tree Models
Growing a Decision Tree
Splitting
Cost Matrix
Preventing Over-Fitting
Tuning the Decision Tree Algorithm
Data Preparation for Decision Tree
11 Expectation Maximization
About Expectation Maximization
Expectation Step and Maximization Step
Probability Density Estimation
Algorithm Enhancements
Scalability
High Dimensionality
Number of Components
Parameter Initialization
From Components to Clusters
Configuring the Algorithm
Data Preparation for Expectation Maximization
12 Explicit Semantic Analysis
About Explicit Semantic Analysis
Scoring with ESA
Scoring Large ESA Models
ESA for Text Mining
Data Preparation for ESA
13 Generalized Linear Models
About Generalized Linear Models
GLM in Oracle Data Mining
Interpretability and Transparency
Wide Data
Confidence Bounds
Ridge Regression
Configuring Ridge Regression
Ridge and Confidence Bounds
Ridge and Data Preparation
Scalable Feature Selection
Feature Selection
Configuring Feature Selection
Feature Selection and Ridge Regression
Feature Generation
Configuring Feature Generation
Tuning and Diagnostics for GLM
Build Settings
Diagnostics
Coefficient Statistics
Global Model Statistics
Row Diagnostics
Data Preparation for GLM
Data Preparation for Linear Regression
Data Preparation for Logistic Regression
Missing Values
Linear Regression
Coefficient Statistics for Linear Regression
Global Model Statistics for Linear Regression
Row Diagnostics for Linear Regression
Logistic Regression
Reference Class
Class Weights
Coefficient Statistics for Logistic Regression
Global Model Statistics for Logistic Regression
Row Diagnostics for Logistic Regression
14 k-Means
About k-Means
Oracle Data Mining Enhanced k-Means
Centroid
k-Means Algorithm Configuration
Data Preparation for k-Means
15 Minimum Description Length
About MDL
Compression and Entropy
Values of a Random Variable: Statistical Distribution
Values of a Random Variable: Significant Predictors
Total Entropy
Model Size
Model Selection
The MDL Metric
Data Preparation for MDL
16 Naive Bayes
About Naive Bayes
Advantages of Naive Bayes
Tuning a Naive Bayes Model
Data Preparation for Naive Bayes
17 Non-Negative Matrix Factorization
About NMF
Matrix Factorization
Scoring with NMF
Text Mining with NMF
Tuning the NMF Algorithm
Data Preparation for NMF
18 O-Cluster
About O-Cluster
Partitioning Strategy
Partitioning Numerical Attributes
Partitioning Categorical Attributes
Active Sampling
Process Flow
Scoring
Tuning the O-Cluster Algorithm
Data Preparation for O-Cluster
User-Specified Data Preparation for O-Cluster
19 Singular Value Decomposition
About Singular Value Decomposition
Matrix Manipulation
Low Rank Decomposition
Scalability
Configuring the Algorithm
Model Size
Performance
PCA Scoring
Data Preparation for SVD
20 Support Vector Machines
About Support Vector Machines
Advantages of SVM
Advantages of SVM in Oracle Data Mining
Usability
Scalability
Kernel-Based Learning
Tuning an SVM Model
Data Preparation for SVM
Normalization
SVM and Automatic Data Preparation
SVM Classification
Class Weights
One-Class SVM
SVM Regression
Part IV Using the Data Mining API
21 Data Mining With SQL
Highlights of the Data Mining API
Example: Targeting Likely Candidates for a Sales Promotion
Example: Analyzing Preferred Customers
Example: Segmenting Customer Data
Example: Building an ESA Model with a Wiki Dataset
22 About the Data Mining API
About Mining Models
Data Mining Data Dictionary Views
ALL_MINING_MODELS
ALL_MINING_MODEL_ATTRIBUTES
ALL_MINING_MODEL_PARTITIONS
ALL_MINING_MODEL_SETTINGS
ALL_MINING_MODEL_VIEWS
ALL_MINING_MODEL_XFORMS
Data Mining PL/SQL Packages
DBMS_DATA_MINING
DBMS_DATA_MINING_TRANSFORM
Transformation Methods in DBMS_DATA_MINING_TRANSFORM
DBMS_PREDICTIVE_ANALYTICS
Data Mining SQL Scoring Functions
23 Preparing the Data
Data Requirements
Column Data Types
Data Sets for Classification and Regression
Scoring Requirements
About Attributes
Data Attributes and Model Attributes
Target Attribute
Numericals, Categoricals, and Unstructured Text
Model Signature
Scoping of Model Attribute Name
Model Details
Using Nested Data
Nested Object Types
Example: Transforming Transactional Data for Mining
Using Market Basket Data
Example: Creating a Nested Column for Market Basket Analysis
Using Retail Analysis Data
Handling Missing Values
Examples: Missing Values or Sparse Data?
Sparsity in a Sales Table
Missing Values in a Table of Customer Data
Missing Value Treatment in Oracle Data Mining
Changing the Missing Value Treatment
24 Transforming the Data
About Transformations
Preparing the Case Table
Creating Nested Columns
Converting Column Data Types
Text Transformation
About Business and Domain-Sensitive Transformations
Understanding Automatic Data Preparation
Binning
Normalization
Outlier Treatment
How ADP Transforms the Data
Embedding Transformations in a Model
Specifying Transformation Instructions for an Attribute
Expression Records
Attribute Specifications
Building a Transformation List
SET_TRANSFORM
The STACK Interface
GET_MODEL_TRANSFORMATIONS and GET_TRANSFORM_LIST
Transformation Lists and Automatic Data Preparation
Oracle Data Mining Transformation Routines
Binning Routines
Normalization Routines
Routines for Outlier Treatment
Understanding Reverse Transformations
25 Creating a Model
Before Creating a Model
The CREATE_MODEL Procedure
Choosing the Mining Function
Choosing the Algorithm
Supplying Transformations
Creating a Transformation List
Transformation List and Automatic Data Preparation
About Partitioned Model
Partitioned Model Build Process
DDL in Partitioned Model
Drop Model or Drop Partition
Add Partition
Partitioned Model Scoring
Specifying Model Settings
Specifying Costs
Specifying Prior Probabilities
Specifying Class Weights
Model Settings in the Data Dictionary
Specifying Mining Model Settings for R Model
ALGO_EXTENSIBLE_LANG
RALG_BUILD_FUNCTION
RALG_BUILD_PARAMETER
RALG_DETAILS_FUNCTION
RALG_DETAILS_FORMAT
RALG_SCORE_FUNCTION
RALG_WEIGHT_FUNCTION
Registered R Scripts
R Model Demonstration Scripts
Model Detail Views
Model Detail Views for Association Rules
Model Detail View for Frequent Itemsets
Model Detail View for Transactional Itemsets
Model Detail View for Transactional Rule
Model Detail Views for Classification Algorithms
Model Detail Views for Decision Tree
Model Detail Views for Generalized Linear Model
Model Detail Views for Naive Bayes
Model Detail View for Support Vector Machine
Model Detail Views for Clustering Algorithms
Model Detail Views for Expectation Maximization
Model Detail Views for k-Means
Model Detail Views for O-Cluster
Model Detail Views for Explicit Semantic Analysis
Model Detail Views for Non-Negative Matrix Factorization
Model Detail Views for Singular Value Decomposition
Model Detail View for Minimum Description Length
Model Detail View for Binning
Model Detail Views for Global Information
Model Detail View for Normalization and Missing Value Handling
26 Scoring and Deployment
About Scoring and Deployment
Using the Data Mining SQL Functions
Choosing the Predictors
Single-Record Scoring
Prediction Details
Cluster Details
Feature Details
Prediction Details
GROUPING Hint
Real-Time Scoring
Dynamic Scoring
Cost-Sensitive Decision Making
DBMS_DATA_MINING.Apply
27 Mining Unstructured Text
About Unstructured Text
About Text Mining and Oracle Text
Data Preparation for Text Features
Creating a Model that Includes Text Mining
Creating a Text Policy
Configuring a Text Attribute
28 Administrative Tasks for Oracle Data Mining
Installing and Configuring a Database for Data Mining
About Installation
Enabling or Disabling a Database Option
Database Tuning Considerations for Data Mining
Upgrading or Downgrading Oracle Data Mining
Pre-Upgrade Steps
Dropping Models Created in Java
Dropping Mining Activities Created in Oracle Data Miner Classic
Upgrading Oracle Data Mining
Using Database Upgrade Assistant to Upgrade Oracle Data Mining
Upgrading from Release 10g
Upgrading from Release 11g
Using Export/Import to Upgrade Data Mining Models
Export/Import Release 10g Data Mining Models
Export/Import Release 11g Data Mining Models
Post Upgrade Steps
Downgrading Oracle Data Mining
Exporting and Importing Mining Models
About Oracle Data Pump
Options for Exporting and Importing Mining Models
Directory Objects for EXPORT_MODEL and IMPORT_MODEL
Using EXPORT_MODEL and IMPORT_MODEL
Importing From PMML
Controlling Access to Mining Models and Data
Creating a Data Mining User
Granting Privileges for Data Mining
System Privileges for Data Mining
Object Privileges for Mining Models
Auditing and Adding Comments to Mining Models
Adding a Comment to a Mining Model
Auditing Mining Models
29 The Data Mining Sample Programs
About the Data Mining Sample Programs
Installing the Data Mining Sample Programs
The Data Mining Sample Data
Part V Oracle Data Mining API Reference
30 PL/SQL Packages
DBMS_DATA_MINING
Using DBMS_DATA_MINING
DBMS_DATA_MINING Overview
DBMS_DATA_MINING Security Model
DBMS_DATA_MINING — Mining Functions
DBMS_DATA_MINING Datatypes
DBMS_DATA_MINING — Model Settings
DBMS_DATA_MINING — Algorithm Names
DBMS_DATA_MINING — Automatic Data Preparation
DBMS_DATA_MINING — Mining Function Settings
DBMS_DATA_MINING — Global Settings
DBMS_DATA_MINING — Algorithm Settings: ALGO_EXTENSIBLE_LANG
DBMS_DATA_MINING — Algorithm Settings: Decision Tree
DBMS_DATA_MINING — Algorithm Settings: Expectation Maximization
DBMS_DATA_MINING — Algorithm Settings: Explicit Semantic Analysis
DBMS_DATA_MINING — Algorithm Settings: Generalized Linear Models
DBMS_DATA_MINING — Algorithm Settings: k-Means
DBMS_DATA_MINING — Algorithm Settings: Naive Bayes
DBMS_DATA_MINING — Algorithm Settings: Non-Negative Matrix Factorization
DBMS_DATA_MINING — Algorithm Settings: O-Cluster
DBMS_DATA_MINING — Algorithm Constants and Settings: Singular Value Decomposition
DBMS_DATA_MINING — Algorithm Settings: Support Vector Machine
Summary of DBMS_DATA_MINING Subprograms
ADD_COST_MATRIX Procedure
ADD_PARTITION Procedure
ALTER_REVERSE_EXPRESSION Procedure
APPLY Procedure
COMPUTE_CONFUSION_MATRIX Procedure
COMPUTE_CONFUSION_MATRIX_PART Procedure
COMPUTE_LIFT Procedure
COMPUTE_LIFT_PART Procedure
COMPUTE_ROC Procedure
COMPUTE_ROC_PART Procedure
CREATE_MODEL Procedure
CREATE_MODEL2 Procedure
DROP_PARTITION Procedure
DROP_MODEL Procedure
EXPORT_MODEL Procedure
GET_ASSOCIATION_RULES Function
GET_FREQUENT_ITEMSETS Function
GET_MODEL_COST_MATRIX Function
GET_MODEL_DETAILS_AI Function
GET_MODEL_DETAILS_EM Function
GET_MODEL_DETAILS_EM_COMP Function
GET_MODEL_DETAILS_EM_PROJ Function
GET_MODEL_DETAILS_GLM Function
GET_MODEL_DETAILS_GLOBAL Function
GET_MODEL_DETAILS_KM Function
GET_MODEL_DETAILS_NB Function
GET_MODEL_DETAILS_NMF Function
GET_MODEL_DETAILS_OC Function
GET_MODEL_SETTINGS Function
GET_MODEL_SIGNATURE Function
GET_MODEL_DETAILS_SVD Function
GET_MODEL_DETAILS_SVM Function
GET_MODEL_DETAILS_XML Function
GET_MODEL_TRANSFORMATIONS Function
GET_TRANSFORM_LIST Procedure
IMPORT_MODEL Procedure
RANK_APPLY Procedure
REMOVE_COST_MATRIX Procedure
RENAME_MODEL Procedure
DBMS_DATA_MINING_TRANSFORM
Using DBMS_DATA_MINING_TRANSFORM
DBMS_DATA_MINING_TRANSFORM Overview
DBMS_DATA_MINING_TRANSFORM Security Model
DBMS_DATA_MINING_TRANSFORM Datatypes
DBMS_DATA_MINING_TRANSFORM Constants
DBMS_DATA_MINING_TRANSFORM Operational Notes
DBMS_DATA_MINING_TRANSFORM — About Transformation Lists
DBMS_DATA_MINING_TRANSFORM — About Stacking and Stack Procedures
DBMS_DATA_MINING_TRANSFORM — Nested Data Transformations
Summary of DBMS_DATA_MINING_TRANSFORM Subprograms
CREATE_BIN_CAT Procedure
CREATE_BIN_NUM Procedure
CREATE_CLIP Procedure
CREATE_COL_REM Procedure
CREATE_MISS_CAT Procedure
CREATE_MISS_NUM Procedure
CREATE_NORM_LIN Procedure
DESCRIBE_STACK Procedure
GET_EXPRESSION Function
INSERT_AUTOBIN_NUM_EQWIDTH Procedure
INSERT_BIN_CAT_FREQ Procedure
INSERT_BIN_NUM_EQWIDTH Procedure
INSERT_BIN_NUM_QTILE Procedure
INSERT_BIN_SUPER Procedure
INSERT_CLIP_TRIM_TAIL Procedure
INSERT_CLIP_WINSOR_TAIL Procedure
INSERT_MISS_CAT_MODE Procedure
INSERT_MISS_NUM_MEAN Procedure
INSERT_NORM_LIN_MINMAX Procedure
INSERT_NORM_LIN_SCALE Procedure
INSERT_NORM_LIN_ZSCORE Procedure
SET_EXPRESSION Procedure
SET_TRANSFORM Procedure
STACK_BIN_CAT Procedure
STACK_BIN_NUM Procedure
STACK_CLIP Procedure
STACK_COL_REM Procedure
STACK_MISS_CAT Procedure
STACK_MISS_NUM Procedure
STACK_NORM_LIN Procedure
XFORM_BIN_CAT Procedure
XFORM_BIN_NUM Procedure
XFORM_CLIP Procedure
XFORM_COL_REM Procedure
XFORM_EXPR_NUM Procedure
XFORM_EXPR_STR Procedure
XFORM_MISS_CAT Procedure
XFORM_MISS_NUM Procedure
XFORM_NORM_LIN Procedure
XFORM_STACK Procedure
DBMS_PREDICTIVE_ANALYTICS
Using DBMS_PREDICTIVE_ANALYTICS
DBMS_PREDICTIVE_ANALYTICS Overview
DBMS_PREDICTIVE_ANALYTICS Security Model
Summary of DBMS_PREDICTIVE_ANALYTICS Subprograms
EXPLAIN Procedure
PREDICT Procedure
PROFILE Procedure
31 Data Dictionary Views
ALL_MINING_MODELS
ALL_MINING_MODEL_ATTRIBUTES
ALL_MINING_MODEL_PARTITIONS
ALL_MINING_MODEL_SETTINGS
ALL_MINING_MODEL_VIEWS
ALL_MINING_MODEL_XFORMS
32 SQL Scoring Functions
CLUSTER_DETAILS
CLUSTER_DISTANCE
CLUSTER_ID
CLUSTER_PROBABILITY
CLUSTER_SET
FEATURE_COMPARE
FEATURE_DETAILS
FEATURE_ID
FEATURE_SET
FEATURE_VALUE
ORA_DM_PARTITION_NAME
PREDICTION
PREDICTION_BOUNDS
PREDICTION_COST
PREDICTION_DETAILS
PREDICTION_PROBABILITY
PREDICTION_SET