Contents
List of Examples
List of Figures
List of Tables
Title and Copyright Information
Preface
Audience
Documentation Accessibility
Related Documents
Conventions
Changes in This Release for Oracle Text Application Developer's Guide
Changes in Oracle Text 12c Release 2 (12.2.0.1)
1
Understanding Oracle Text Application Development
1.1
Introduction to Oracle Text
1.2
Document Collection Applications
1.2.1
About Document Collection Applications
1.2.2
Flowchart of Text Query Application
1.3
Catalog Information Applications
1.3.1
About Catalog Information Applications
1.3.2
Flowchart for Catalog Query Application
1.4
Document Classification Applications
1.5
XML Search Applications
1.5.1
The CONTAINS Operator with XML Search Applications
1.5.2
Combining Oracle Text Features with Oracle XML DB (XML Search Index)
1.5.2.1
Using the xml_enable Method for an XML Search Index
1.5.2.2
Using the Text-on-XML Method
1.5.2.3
Indexing JSON Data
2
Getting Started with Oracle Text
2.1
Overview of Getting Started with Oracle Text
2.2
Creating an Oracle Text User
2.3
Query Application Quick Tour
2.3.1
Creating the Text Table
2.3.2
Using SQL*Loader to Load the Table
2.3.2.1
Step 1 Create the CONTEXT index
2.3.2.2
Step 2 Querying Your Table with CONTAINS
2.3.2.3
Step 3 Present the Document
2.3.2.4
Step 4 Synchronize the Index After Data Manipulation
2.4
Catalog Application Quick Tour
2.4.1
Creating the Table
2.4.2
Using SQL*Loader to Load the Table
2.4.2.1
Step 1 Determine your Queries
2.4.2.2
Step 2 Create the Sub-Index to Order by Price
2.4.2.3
Step 3 Create the CTXCAT Index
2.4.2.4
Step 4 Querying Your Table with CATSEARCH
2.4.2.5
Step 5 Update Your Table
2.5
Classification Application Quick Tour
2.5.1
About Classification of a Document
2.5.2
Steps for Creating a Classification Application
3
Indexing with Oracle Text
3.1
About Oracle Text Indexes
3.1.1
Types of Oracle Text Indexes
3.1.2
Structure of the Oracle Text CONTEXT Index
3.1.3
The Oracle Text Indexing Process
3.1.3.1
Datastore Object
3.1.3.2
Filter Object
3.1.3.3
Sectioner Object
3.1.3.4
Lexer Object
3.1.3.5
Indexing Engine
3.1.4
About Updates to Indexed Columns
3.1.5
Partitioned Tables and Indexes
3.1.6
Creating an Index Online
3.1.7
Parallel Indexing
3.1.8
Indexing and Views
3.2
Considerations for Oracle Text Indexing
3.2.1
Location of Text
3.2.1.1
Supported Column Types
3.2.1.2
Storing Text in the Text Table
3.2.1.2.1
CONTEXT Data Storage
3.2.1.2.2
CTXCAT Data Storage
3.2.1.3
Storing File Path Names
3.2.1.4
Storing URLs
3.2.1.5
Storing Associated Document Information
3.2.1.6
Format and Character Set Columns
3.2.1.7
Supported Document Formats
3.2.1.8
Summary of DATASTORE Types
3.2.2
Document Formats and Filtering
3.2.2.1
No Filtering for HTML
3.2.2.2
Filtering Mixed-Format Columns
3.2.2.3
Custom Filtering
3.2.3
Bypassing Rows for Indexing
3.2.4
Document Character Set
3.2.4.1
Character Set Detection
3.2.4.2
Mixed Character Set Columns
3.2.5
Document Language
3.2.5.1
Language Features Outside BASIC_LEXER
3.2.5.2
Indexing Multi-language Columns
3.2.6
Indexing Special Characters
3.2.6.1
Printjoin Characters
3.2.6.2
Skipjoin Characters
3.2.6.3
Other Characters
3.2.7
Case-Sensitive Indexing and Querying
3.2.8
Document Services Procedures Performance and Forward Index
3.2.9
Language-Specific Features
3.2.9.1
Indexing Themes
3.2.9.2
Base-Letter Conversion for Characters with Diacritical Marks
3.2.9.3
Alternate Spelling
3.2.9.4
Composite Words
3.2.9.5
Korean, Japanese, and Chinese Indexing
3.2.10
About Entity Extraction and CTX_ENTITY
3.2.10.1
Basic Example of Using Entity Extraction
3.2.10.2
Example of Creating a New Entity Type Using a User-defined Rule
3.2.11
Fuzzy Matching and Stemming
3.2.11.1
Values For Language Attribute for index_stems of BASIC_LEXER
3.2.11.2
Values For Language Attribute for index_stems of AUTO_LEXER
3.2.12
Better Wildcard Query Performance
3.2.13
Document Section Searching
3.2.14
Stopwords and Stopthemes
3.2.14.1
Language Detection and Stoplists
3.2.14.2
Multi-Language Stoplists
3.2.15
Index Performance
3.2.16
Query Performance and Storage of Large Object (LOB) Columns
3.2.17
Mixed Query Performance
3.3
Creating Oracle Text Indexes
3.3.1
Summary of Procedure for Creating a Text Index
3.3.2
Creating Preferences
3.3.2.1
Datastore Examples
3.3.2.1.1
Specifying DIRECT_DATASTORE
3.3.2.1.2
Specifying MULTI_COLUMN_DATASTORE
3.3.2.1.3
Specifying URL Data Storage
3.3.2.1.4
Specifying File Data Storage
3.3.2.2
NULL_FILTER Example: Indexing HTML Documents
3.3.2.3
PROCEDURE_FILTER Example
3.3.2.4
BASIC_LEXER Example: Setting Printjoin Characters
3.3.2.5
MULTI_LEXER Example: Indexing a Multi-Language Table
3.3.2.6
BASIC_WORDLIST Example: Enabling Substring and Prefix Indexing
3.3.3
Creating Section Groups for Section Searching
3.3.4
Using Stopwords and Stoplists
3.3.4.1
Multi-Language Stoplists
3.3.4.2
Stopthemes and Stopclasses
3.3.4.3
PL/SQL Procedures for Managing Stoplists
3.3.5
Creating a CONTEXT Index
3.3.5.1
CONTEXT Index and DML
3.3.5.2
Default CONTEXT Index Example
3.3.5.3
Incrementally Creating an Index with ALTER INDEX and CREATE INDEX
3.3.5.4
Incrementally Creating a CONTEXT Index with POPULATE_PENDING
3.3.5.5
Custom CONTEXT Index Example: Indexing HTML Documents
3.3.5.6
CONTEXT Index Example: Query Processing with FILTER BY and ORDER BY
3.3.5.7
DATASTORE Triggers in Release 12c
3.3.6
Creating a CTXCAT Index
3.3.6.1
CTXCAT Index and DML
3.3.6.2
About CTXCAT Sub-Indexes and Their Costs
3.3.6.3
Creating CTXCAT Sub-indexes
3.3.6.4
Creating CTXCAT Index
3.3.7
Creating a CTXRULE Index
3.3.7.1
Step One: Create a Table of Queries
3.3.7.2
Step Two: Create the CTXRULE Index
3.3.7.3
Step Three: Classify a Document
3.3.8
Create Search Index for JSON
3.4
Maintaining Oracle Text Indexes
3.4.1
Viewing Index Errors
3.4.2
Dropping an Index
3.4.3
Resuming Failed Index
3.4.4
Re-creating an Index
3.4.4.1
Re-creating a Global Index
3.4.4.2
Re-creating a Local Partitioned Index
3.4.4.3
Re-creating a Global Index with Time Limit for Synch
3.4.4.4
Re-creating a Global Index with Scheduled Swap
3.4.4.5
Re-creating a Local Index with All-at-Once Swap
3.4.4.6
Scheduling Local Index Re-creation with All-at-Once Swap
3.4.4.7
Re-creating a Local Index with Per-Partition Swap
3.4.5
Rebuilding an Index
3.4.6
Dropping a Preference
3.5
Managing DML Operations for a CONTEXT Index
3.5.1
Viewing Pending DML
3.5.2
Synchronizing the Index
3.5.2.1
Synchronizing the Index With SYNC_INDEX
3.5.2.2
Maxtime Parameter for SYNC_INDEX
3.5.2.3
Locking Parameter for SYNC_INDEX
3.5.3
Optimizing the Index
3.5.3.1
CONTEXT Index Structure
3.5.3.2
Index Fragmentation
3.5.3.3
Document Invalidation and Garbage Collection
3.5.3.4
Single Token Optimization
3.5.3.5
Viewing Index Fragmentation and Garbage Data
3.5.3.6
Example: Optimizing the Index
4
Querying with Oracle Text
4.1
Overview of Queries
4.1.1
Querying with CONTAINS
4.1.1.1
CONTAINS SQL Example
4.1.1.2
CONTAINS PL/SQL Example
4.1.1.3
Structured Query with CONTAINS
4.1.2
Querying with CATSEARCH
4.1.2.1
CATSEARCH SQL Query
4.1.2.2
CATSEARCH Example
4.1.3
Querying with MATCHES
4.1.3.1
MATCHES SQL Query
4.1.3.2
MATCHES PL/SQL Example
4.1.4
Word and Phrase Queries
4.1.4.1
CONTAINS Phrase Queries
4.1.4.2
CATSEARCH Phrase Queries
4.1.5
Querying Stopwords
4.1.6
ABOUT Queries and Themes
4.1.7
Query Expressions
4.1.7.1
CONTAINS Operators
4.1.7.2
CATSEARCH Operator
4.1.7.3
MATCHES Operator
4.1.8
Case-Sensitive Searching
4.1.8.1
Word Queries
4.1.8.2
ABOUT Queries
4.1.9
Query Feedback
4.1.10
Query Explain Plan
4.1.11
Using a Thesaurus in Queries
4.1.12
About Document Section Searching
4.1.13
Using Query Templates
4.1.14
Query Rewrite
4.1.15
Query Relaxation
4.1.16
Query Language
4.1.17
Ordering By SDATA Sections
4.1.18
Alternative and User-defined Scoring
4.1.19
Alternative Grammar
4.1.20
Query Analysis
4.1.21
Other Query Features
4.2
The CONTEXT Grammar
4.2.1
ABOUT Query
4.2.2
Logical Operators
4.2.3
Section Searching and HTML and XML
4.2.4
Proximity Queries with NEAR, NEAR_ACCUM, and NEAR2 Operators
4.2.5
Fuzzy, Stem, Soundex, Wildcard and Thesaurus Expansion Operators
4.2.6
Using CTXCAT Grammar
4.2.7
Stored Query Expressions
4.2.7.1
Defining a Stored Query Expression
4.2.7.2
SQE Example
4.2.8
Calling PL/SQL Functions in CONTAINS
4.2.9
Optimizing for Response Time
4.2.10
Counting Hits
4.2.10.1
SQL Count Hits Example
4.2.10.2
Counting Hits with a Structured Predicate
4.2.10.3
PL/SQL Count Hits Example
4.2.11
Using DEFINESCORE and DEFINEMERGE for User-defined Scoring
4.3
The CTXCAT Grammar
5
Presenting Documents in Oracle Text
5.1
Highlighting Query Terms
5.1.1
Text Highlighting
5.1.2
Theme Highlighting
5.1.3
CTX_DOC Highlighting Procedures
5.1.3.1
Markup Procedure
5.1.3.2
Highlight Procedure
5.1.3.3
Concordance
5.2
Obtaining Part-of-Speech Information for a Document
5.3
Obtaining Lists of Themes, Gists, and Theme Summaries
5.3.1
Lists of Themes
5.3.2
Gist and Theme Summary
5.3.2.1
In-Memory Gist
5.3.2.2
Result Table Gists
5.3.2.3
Theme Summary
5.4
Document Presentation and Highlighting
5.4.1
Highlighting Example
5.4.2
Document List of Themes Example
5.4.3
Gist Example
6
Classifying Documents in Oracle Text
6.1
Overview of Document Classification
6.2
Classification Applications
6.3
Classification Solutions
6.4
Rule-Based Classification
6.4.1
Rule-based Classification Example
6.4.1.1
Step 1 Create schema
6.4.1.2
Step 2 Load Documents with SQLLDR
6.4.1.3
Step 3 Create Categories
6.4.1.4
Step 4 Create the CTXRULE index
6.4.1.5
Step 5 Classify Documents
6.4.2
CTXRULE Parameters and Limitations
6.5
Supervised Classification
6.5.1
Decision Tree Supervised Classification
6.5.2
Decision Tree Supervised Classification Example
6.5.2.1
Create the Category Rules
6.5.2.2
Index Rules to Categorize New Documents
6.5.3
SVM-Based Supervised Classification
6.5.4
SVM-Based Supervised Classification Example
6.6
Unsupervised Classification (Clustering)
6.7
Unsupervised Classification (Clustering) Example
7
Tuning Oracle Text
7.1
Optimizing Queries with Statistics
7.1.1
Collecting Statistics
7.1.2
Query Optimization with Statistics Example
7.1.3
Re-Collecting Statistics
7.1.4
Deleting Statistics
7.2
Optimizing Queries for Response Time
7.2.1
Other Factors that Influence Query Response Time
7.2.2
Improved Response Time with FIRST_ROWS(n) Hint for ORDER BY Queries
7.2.3
Improved Response Time Using the DOMAIN_INDEX_SORT Hint
7.2.4
Improved Response Time using Local Partitioned CONTEXT Index
7.2.4.1
Range Search on Partition Key Column
7.2.4.2
ORDER BY Partition Key Column
7.2.5
Improved Response Time with Local Partitioned Index for Order by Score
7.2.6
Improved Response Time with Query Filter Cache
7.2.7
Improved Response Time using BIG_IO Option of CONTEXT Index
7.2.8
Improved Response Time using SEPARATE_OFFSETS Option of CONTEXT Index
7.2.9
Improved Response Time Using STAGE_ITAB, STAGE_ITAB_MAX_ROWS, STAGE_ITAB_PARALLEL Options of CONTEXT Index
7.3
Optimizing Queries for Throughput
7.3.1
CHOOSE and ALL ROWS Modes
7.3.2
FIRST_ROWS(n) Mode
7.4
Composite Domain Index (CDI) in Oracle Text
7.5
Performance Tuning with CDI
7.6
Solving Index and Query Bottlenecks Using Tracing
7.7
Using Parallel Queries
7.7.1
Parallel Queries on a Local Context Index
7.7.2
Parallelizing Queries Across Oracle RAC Nodes
7.8
Tuning Queries with Blocking Operations
7.9
Frequently Asked Questions About Query Performance
7.9.1
What is Query Performance?
7.9.2
What is the fastest type of text query?
7.9.3
Should I collect statistics on my tables?
7.9.4
How does the size of my data affect queries?
7.9.5
How does the format of my data affect queries?
7.9.6
What is a functional versus an indexed lookup?
7.9.7
What tables are involved in queries?
7.9.8
How is $R table contention reduced?
7.9.9
Does sorting the results slow a text-only query?
7.9.10
How do I make an ORDER BY score query faster?
7.9.11
Which memory settings affect querying?
7.9.12
Does out-of-line LOB storage of wide base table columns improve performance?
7.9.13
How can I make a CONTAINS query on more than one column faster?
7.9.14
Is it OK to have many expansions in a query?
7.9.15
How can local partition indexes help?
7.9.16
Should I query in parallel?
7.9.17
Should I index themes?
7.9.18
When should I use a CTXCAT index?
7.9.19
When is a CTXCAT index NOT suitable?
7.9.20
What optimizer hints are available, and what do they do?
7.10
Frequently Asked Questions About Indexing Performance
7.10.1
How long should indexing take?
7.10.2
Which index memory settings should I use?
7.10.3
How much disk overhead will indexing require?
7.10.4
How does the format of my data affect indexing?
7.10.5
Can parallel indexing improve performance?
7.10.6
How can I improve index performance for creating local partitioned index?
7.10.7
How can I tell how much indexing has completed?
7.11
Frequently Asked Questions About Updating the Index
7.11.1
How often should I index new or updated records?
7.11.2
How can I tell when my indexes are getting fragmented?
7.11.3
Does memory allocation affect index synchronization?
8
Searching Document Sections in Oracle Text
8.1
About Oracle Text Document Section Searching
8.1.1
Enabling Oracle Text Section Searching
8.1.1.1
Create a Section Group
8.1.1.2
Define Your Sections
8.1.1.3
Index Your Documents
8.1.1.4
Section Searching with the WITHIN Operator
8.1.1.5
Path Searching with INPATH and HASPATH Operators
8.1.1.6
Marking an SDATA Section to be Searchable
8.1.2
Oracle Text Section Types
8.1.2.1
Zone Section
8.1.2.2
Field Section
8.1.2.2.1
Visible and Invisible Field Sections
8.1.2.2.2
Nested Field Sections
8.1.2.2.3
Repeated Field Sections
8.1.2.3
Stop Section
8.1.2.4
MDATA Section
8.1.2.5
NDATA Section
8.1.2.6
SDATA Section
8.1.2.7
Attribute Section
8.1.2.8
Special Sections
8.1.3
Oracle Text Section Attributes
8.2
HTML Section Searching with Oracle Text
8.2.1
Creating HTML Sections
8.2.2
Searching HTML Meta Tags
8.3
XML Section Searching with Oracle Text
8.3.1
Automatic Sectioning
8.3.2
Attribute Searching
8.3.2.1
Creating Attribute Sections
8.3.2.2
Searching Attributes with the INPATH Operator
8.3.3
Creating Document Type Sensitive Sections
8.3.4
Path Section Searching
8.3.4.1
Creating an Index with PATH_SECTION_GROUP
8.3.4.2
Top-Level Tag Searching
8.3.4.3
Any-Level Tag Searching
8.3.4.4
Direct Parentage Searching
8.3.4.5
Tag Value Testing
8.3.4.6
Attribute Searching
8.3.4.7
Attribute Value Testing
8.3.4.8
Path Testing
8.3.4.9
Section Equality Testing with HASPATH
9
Using Oracle Text Name Search
9.1
Overview of Name Search
9.2
Examples of Using Name Search
10
Working With a Thesaurus in Oracle Text
10.1
Overview of Oracle Text Thesaurus Features
10.1.1
Oracle Text Thesaurus Creation and Maintenance
10.1.1.1
CTX_THES Package
10.1.1.2
Thesaurus Operators
10.1.1.3
ctxload Utility
10.1.2
Using a Case-sensitive Thesaurus
10.1.3
Using a Case-insensitive Thesaurus
10.1.4
Default Thesaurus
10.1.5
Supplied Thesaurus
10.1.5.1
Supplied Thesaurus Structure and Content
10.1.5.2
Supplied Thesaurus Location
10.2
Defining Terms in a Thesaurus
10.2.1
Defining Synonyms
10.2.2
Defining Hierarchical Relations
10.3
Using a Thesaurus in a Query Application
10.3.1
Loading a Custom Thesaurus and Issuing Thesaurus-based Queries
10.3.2
Augmenting Knowledge Base with Custom Thesaurus
10.3.2.1
Advantage
10.3.2.2
Limitations
10.3.2.3
Linking New Terms to Existing Terms
10.3.2.3.1
Example: Linking New Terms to Existing Terms
10.3.2.4
Loading a Thesaurus with ctxload
10.3.2.5
Loading a Thesaurus with PL/SQL procedure CTX_THES.IMPORT_THESAURUS
10.3.2.6
Compiling a Loaded Thesaurus
10.4
About the Supplied Knowledge Base
10.4.1
Adding a Language-Specific Knowledge Base
10.4.2
Limitations for Adding Knowledge Bases
11
Using XML Query Result Set Interface
11.1
Overview of the XML Query Result Set Interface
11.2
Using the XML Query Result Set Interface
11.3
Creating XML-Only Applications with Oracle Text
11.4
Example of a Result Set Descriptor
11.5
Identifying Collocates Using Oracle Text
12
Performing Sentiment Analysis Using Oracle Text
12.1
Overview of Sentiment Analysis
12.1.1
About Sentiment Analysis
12.1.2
About Sentiment Classifiers
12.1.3
About Performing Sentiment Analysis with Oracle Text
12.1.4
Interfaces for Performing Sentiment Analysis
12.2
Creating a Sentiment Classifier Preference
12.3
Training Sentiment Classifiers
12.4
Performing Sentiment Analysis Using the CTX_DOC Package
12.5
Performing Sentiment Analysis Using Result Set Interface
13
Administering Oracle Text
13.1
Oracle Text Users and Roles
13.1.1
CTXSYS User
13.1.2
CTXAPP Role
13.1.3
Granting Roles and Privileges to Users
13.2
DML Queue
13.3
The CTX_OUTPUT Package
13.4
The CTX_REPORT Package
13.5
Text Manager in Oracle Enterprise Manager
13.5.1
Using Text Manager
13.5.2
Viewing General Information for a Text Index
13.5.3
Checking Text Index Health
13.6
Servers and Indexing
13.7
Database Feature Usage Tracking in Oracle Enterprise Manager
13.8
Oracle Text on Oracle Real Application Clusters
14
Migrating Oracle Text Applications
14.1
Oracle Text and Rolling Upgrade with Logical Standby
14.1.1
CTX_DDL PL/SQL Procedures
14.1.2
CTX_OUTPUT PL/SQL Procedures
14.1.3
CTX_DOC PL/SQL Procedures
14.2
Identifying and Copying Oracle Text Files To a New Oracle Home
A
CONTEXT Query Application
A.1
Web Query Application Overview
A.2
The PL/SQL Server Pages (PSP) Web Application
A.2.1
PSP Web Application Prerequisites
A.2.2
Building the PSP Web Application
A.2.3
PSP Web Application Sample Code
A.2.3.1
loader.ctl
A.2.3.2
loader.dat
A.2.3.3
HTML Files for loader.dat Example
A.2.3.4
search_htmlservices.sql
A.2.3.5
search_html.psp
A.3
The Java Server Pages (JSP) Web Application
A.3.1
JSP Web Application Prerequisites
A.3.2
JSP Web Application Sample Code
B
CATSEARCH Query Application
B.1
CATSEARCH Web Query Application Overview
B.2
The JSP Web Application
B.2.1
Building the JSP Web Application
B.2.2
JSP Web Application Sample Code
B.2.2.1
loader.ctl
B.2.2.2
loader.dat
B.2.2.3
catalogSearch.jsp
Glossary
Index