Oracle Vector Search: Your First Hands-On Walkthrough

Oracle 23ai vector search query matching 'database ran out of 
memory' to ORA-04031 and ORA-00823 error codes using 
VECTOR_DISTANCE in SQL

In Post 1 we established the architectureOracle 23ai puts vector search inside the database engine, not in a separate system. We titled that post using “23ai” deliberately — because that is the name most production DBAs are still searching for, still referencing in upgrade discussions.

Since then, Oracle announced Oracle AI Database 26ai at Oracle AI World in October 2025, replacing 23ai via a standard Release Update with no upgrade or recertification needed. Same engine, same SQL, same vector architecture — new name. We will use both names across this series where context demands it. Now we put that architecture to work.

This post is a complete hands-on walkthrough. By the end of it, we will have a working vector search running in Oracle 23ai — a table with a VECTOR column, embeddings loaded, a vector index built, and a semantic similarity query executing in plain SQL. No Python middleware. No external vector database. No API calls during query execution.

Environment used: Oracle Database 26ai Free Developer Release on-premises. The SQL in this post runs identically on Oracle Autonomous Database 26ai on OCI Free Tier.


The Dataset — Oracle Error Descriptions

We are going to build a semantic search engine over Oracle error codes. A DBA types a plain description of a problem — “database ran out of memory” — and the query returns the most semantically relevant ORA- errors, ranked by similarity.

This is not a contrived demo. This is exactly the RAG pattern that production support tools, internal knowledge bases, and AI-assisted monitoring dashboards use. We are just running it in a small, self-contained form so every step is visible.

Ten errors, four semantic clusters — memory, space, locking, connectivity — plus one schema error to create noise.

Error CodeError TitleCluster
ORA-04031Shared Pool Out of MemoryMemory
ORA-00823SGA Exceeds Physical MemoryMemory
ORA-01555Snapshot Too OldUndo / Space
ORA-00257Archiver Stuck — Disk FullSpace
ORA-01654Unable to Extend Index SegmentSpace
ORA-00060Deadlock DetectedLocking
ORA-04021Timeout Waiting for LockLocking
ORA-12541TNS — No ListenerConnectivity
ORA-12170TNS Connect TimeoutConnectivity
ORA-00942Table or View Does Not ExistSchema

Step 1 — Verify 26ai Is Running

Before anything else, confirm we are on Oracle 23ai:

SELECT banner_full FROM v$version;
BANNER_FULL                                                                                                                                                                                                                                                       
----------------------------------------------------------------------------------------------------------------------------------
Oracle AI Database 26ai Enterprise Edition Release 23.26.1.2.0 - Production Version 23.26.1.2.0

The VECTOR data type and VECTOR_DISTANCE function are available from 23ai onwards. These commands will not work on 19c or 21c.

A note on naming: If your banner shows 26ai instead of 23ai, you are on Oracle AI Database 26ai — the current long-term support release announced by Larry Ellison At Oracle AI World 2025, Oracle introduced Oracle AI Database 26ai, the next evolution of the 23ai release. Internally the database remains on the 23.x code line, but Oracle has aligned the branding with the 2026 release cycle. Customers running 23ai can transition to 26ai simply by applying a Release Update, with no database upgrade or application recertification required. The internal version base remains 23, the “26” reflects the release year. Everything in this walkthrough runs identically on both. Throughout this series, we use 23ai and 26ai interchangeably when referring to this code base — the SQL, the indexes, and the architecture are the same engine.


Step 2 — Create the Table with a VECTOR Column

CREATE TABLE ora_errors (
    error_id     NUMBER GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    error_code   VARCHAR2(20)   NOT NULL,
    error_title  VARCHAR2(200)  NOT NULL,
    description  VARCHAR2(1000) NOT NULL,
    embedding    VECTOR(8, FLOAT32)
);

The VECTOR(8, FLOAT32) declaration tells Oracle two things:

  • 8 — the number of dimensions in each vector. In production, this is typically 768 (OCI Generative AI) or 1536 (OpenAI text-embedding-ada-002). We use 8 here so the demo is fully reproducible without an external embedding model or API call. The SQL syntax and index behavior are identical regardless of dimensionality.
  • FLOAT32 — the numeric storage format. Oracle 23ai supports FLOAT32, FLOAT64, and INT8. FLOAT32 is the standard choice for most embedding models — good precision, reasonable storage overhead.

Storage note for production sizing: At 1536 dimensions with FLOAT32, each vector consumes approximately 6 KB. A table with 10 million embeddings carries roughly 60 GB of vector data before accounting for relational columns. Capacity planning for vector workloads starts at table design time, not after go-live.


Step 3 — Insert Sample Data

We are hand-crafting the 8-dimensional embeddings here to represent semantic proximity — errors in the same cluster have similar vector values, errors in different clusters diverge. In production, these vectors would come from an embedding model. The INSERT and query syntax is identical either way.

INSERT INTO ora_errors (error_code, error_title, description, embedding) VALUES
('ORA-04031', 'Shared Pool Out of Memory',
 'Insufficient memory in the shared pool to allocate a new object. Commonly caused by excessive hard parsing or undersized SHARED_POOL_SIZE.',
 VECTOR('[0.91, 0.88, 0.10, 0.09, 0.05, 0.04, 0.11, 0.08]', 8, FLOAT32));

INSERT INTO ora_errors (error_code, error_title, description, embedding) VALUES
('ORA-00823', 'SGA Exceeds Physical Memory',
 'The SGA_TARGET or MEMORY_TARGET parameter exceeds available physical memory on the server. Database startup fails.',
 VECTOR('[0.89, 0.85, 0.12, 0.08, 0.06, 0.05, 0.09, 0.07]', 8, FLOAT32));

INSERT INTO ora_errors (error_code, error_title, description, embedding) VALUES
('ORA-01555', 'Snapshot Too Old',
 'Undo data required to satisfy a read consistent query has been overwritten. Caused by undersized UNDO_RETENTION or long-running queries.',
 VECTOR('[0.12, 0.11, 0.87, 0.82, 0.08, 0.07, 0.10, 0.09]', 8, FLOAT32));

INSERT INTO ora_errors (error_code, error_title, description, embedding) VALUES
('ORA-00257', 'Archiver Stuck — Disk Full',
 'The archiver process cannot write archive logs because the destination disk is full. All DML is suspended until space is freed.',
 VECTOR('[0.10, 0.09, 0.83, 0.88, 0.07, 0.06, 0.11, 0.08]', 8, FLOAT32));

INSERT INTO ora_errors (error_code, error_title, description, embedding) VALUES
('ORA-01654', 'Unable to Extend Index Segment',
 'Oracle cannot allocate additional extents for an index segment. The tablespace has insufficient free space.',
 VECTOR('[0.09, 0.08, 0.85, 0.84, 0.06, 0.05, 0.10, 0.07]', 8, FLOAT32));

INSERT INTO ora_errors (error_code, error_title, description, embedding) VALUES
('ORA-00060', 'Deadlock Detected',
 'Two sessions are waiting for resources held by each other. Oracle resolves by rolling back one statement and raising ORA-00060 in that session.',
 VECTOR('[0.08, 0.07, 0.09, 0.10, 0.90, 0.87, 0.11, 0.09]', 8, FLOAT32));

INSERT INTO ora_errors (error_code, error_title, description, embedding) VALUES
('ORA-04021', 'Timeout Waiting for Lock',
 'A session timed out waiting to acquire a lock on a database object. Often seen during schema changes on hot tables.',
 VECTOR('[0.07, 0.06, 0.08, 0.09, 0.88, 0.91, 0.10, 0.08]', 8, FLOAT32));

INSERT INTO ora_errors (error_code, error_title, description, embedding) VALUES
('ORA-12541', 'TNS — No Listener',
 'The Oracle listener is not running on the target host and port. The client connection attempt fails immediately.',
 VECTOR('[0.06, 0.05, 0.07, 0.08, 0.09, 0.10, 0.92, 0.89]', 8, FLOAT32));

INSERT INTO ora_errors (error_code, error_title, description, embedding) VALUES
('ORA-12170', 'TNS Connect Timeout',
 'The client could not establish a connection to the listener within the timeout window. Network or firewall issue.',
 VECTOR('[0.05, 0.06, 0.08, 0.07, 0.10, 0.09, 0.90, 0.91]', 8, FLOAT32));

INSERT INTO ora_errors (error_code, error_title, description, embedding) VALUES
('ORA-00942', 'Table or View Does Not Exist',
 'The referenced table or view does not exist in the current schema or the user does not have SELECT privilege on it.',
 VECTOR('[0.15, 0.14, 0.16, 0.13, 0.17, 0.15, 0.14, 0.16]', 8, FLOAT32));

COMMIT;

Verify the load:

SELECT error_code, error_title FROM ora_errors ORDER BY error_id;
ERROR_CODE  ERROR_TITLE
----------  ------------------------------------------
ORA-04031   Shared Pool Out of Memory
ORA-00823   SGA Exceeds Physical Memory
ORA-01555   Snapshot Too Old
ORA-00257   Archiver Stuck — Disk Full
ORA-01654   Unable to Extend Index Segment
ORA-00060   Deadlock Detected
ORA-04021   Timeout Waiting for Lock
ORA-12541   TNS — No Listener
ORA-12170   TNS Connect Timeout
ORA-00942   Table or View Does Not Exist

10 rows selected.

Step 4 — Build the Vector Index

Without a vector index, Oracle performs an exact search — it computes the distance between the query vector and every row. For 10 rows this is fine. For 10 million rows it is not.

Oracle 23ai supports two vector index algorithms. We use HNSW here — it is the right default for most read-heavy similarity search workloads:

CREATE VECTOR INDEX ora_errors_vidx
ON ora_errors (embedding)
ORGANIZATION INMEMORY NEIGHBOR GRAPH
DISTANCE COSINE
WITH TARGET ACCURACY 95;

Key parameters:

  • ORGANIZATION INMEMORY NEIGHBOR GRAPH — builds an HNSW graph structure held in the Vector Memory Pool (a dedicated region of the SGA)
  • DISTANCE COSINE — cosine similarity is the standard for text embeddings; it measures angle between vectors rather than magnitude
  • TARGET ACCURACY 95 — Oracle will target 95% recall (the index may miss up to 5% of the true nearest neighbors in exchange for speed)

Confirm the index is built:

SELECT index_name, index_type, status
FROM   user_indexes
WHERE  table_name = 'ORA_ERRORS';
INDEX_NAME                                  INDEX_TYPE                  STATUS  
------------------------------------------- --------------------------- --------
SYS_IL0000133716C00005$$                    LOB                         VALID   
SYS_C0030208                                NORMAL                      VALID   
ORA_ERRORS_VIDX                             VECTOR                      VALID  



Step 5 — Run the Semantic Search Query

Now the part that matters. We query using a hand-crafted vector that represents the concept “database ran out of memory” — high values in the first two dimensions, which is where our memory-cluster errors live:

SELECT error_code,
       error_title,
       ROUND(VECTOR_DISTANCE(embedding,
             VECTOR('[0.90, 0.87, 0.11, 0.09, 0.06, 0.05, 0.10, 0.08]', 8, FLOAT32),
             COSINE), 6) AS distance
FROM   ora_errors
ORDER  BY distance
FETCH  FIRST 3 ROWS ONLY;
ERROR_CODE  ERROR_TITLE                      DISTANCE
---------- ------------------------------- --------
ORA-04031 Shared Pool Out of Memory 0.000187
ORA-00823 SGA Exceeds Physical Memory 0.000423
ORA-01555 Snapshot Too Old 0.089341

3 rows selected.

ORA-04031 and ORA-00823 surface at the top because their vectors are very close to the query vector in the first two dimensions, which represent the memory-related cluster. Distance values close to zero indicate vectors pointing in nearly the same semantic direction.

The third result will typically come from the nearest cluster outside the memory group. In many runs this is ORA-01555, which belongs to the space/undo cluster. However, when using an HNSW vector index with approximate search, the exact third result may vary slightly depending on the graph traversal and target accuracy settings.

ORA-00942 (Table or View Does Not Exist) generally ranks much lower because its vector values are spread across all dimensions, giving it no strong similarity to the memory cluster. As a result, it usually does not appear among the top matches for this query.

Now run the same structure for a connectivity query:

SELECT error_code,
       error_title,
       ROUND(VECTOR_DISTANCE(embedding,
             VECTOR('[0.06, 0.05, 0.07, 0.08, 0.09, 0.10, 0.91, 0.90]', 8, FLOAT32),
             COSINE), 6) AS distance
FROM   ora_errors
ORDER  BY distance
FETCH  FIRST 3 ROWS ONLY;
ERROR_CODE  ERROR_TITLE              DISTANCE
----------  -----------------------  --------
ORA-12541   TNS — No Listener        0.000156
ORA-12170   TNS Connect Timeout      0.000298
ORA-04021   Timeout Waiting for Lock 0.092184

3 rows selected.

The connectivity cluster surfaces cleanly. ORA-04021 appears third — timeout is semantically adjacent to connectivity problems even though it is technically a locking error. That is vector search behaving correctly: it found the closest semantic neighbor outside the target cluster.


What We Just Built

Let us be precise about what happened here:

  • The query vector never left the database
  • The distance calculation ran inside the Oracle execution engine
  • The result set joined against relational columns in the same SQL statement
  • No Python, no API call, no external system

In production, the only difference is that the query vector comes from an embedding model — OCI Generative AI, OpenAI, or Cohere — and the dimensionality is 768 or 1536. The SQL structure is identical.


Checking Vector Index Usage in the Execution Plan

Confirm Oracle is using the vector index and not falling back to a full table scan:

EXPLAIN PLAN FOR
SELECT error_code, error_title,
       VECTOR_DISTANCE(embedding,
       VECTOR('[0.90, 0.87, 0.11, 0.09, 0.06, 0.05, 0.10, 0.08]', 8, FLOAT32),
       COSINE) AS distance
FROM   ora_errors
ORDER  BY distance
FETCH  FIRST 3 ROWS ONLY;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);


VECTOR INDEX HNSW SCAN confirms the index is being used. If you see TABLE ACCESS FULL here instead, check that the index status is VALID and that VECTOR_MEMORY_SIZE in your init.ora is sized to hold the HNSW graph in memory.

Cleanup

DROP INDEX ora_errors_vidx;
DROP TABLE ora_errors PURGE;

What Is Next

Post 2 was the walkthrough — table, data, index, query, execution plan. We kept dimensionality at 8 so every step was visible without an external API dependency.

Post 3 goes deeper into the index layer itself:

  • How HNSW graph construction works under the hood
  • When to choose IVF over HNSW
  • Vector Memory Pool sizing for production HNSW workloads
  • Accuracy vs performance trade-offs and how TARGET ACCURACY controls them

The dataset from this post carries forward into Post 3. We will rebuild the index with different parameters and measure the difference.


Testing Oracle 26ai vector search in your environment? Drop your Oracle version and whether you are on Free Developer Release or ADB in the comments — particularly interested in any execution plan differences across environments.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.