We have written several blogs that help DBAs with their day-to-day work. This blog explains how to speed up inserts while loading data from huge files.
Problem Statement:
This blog answers the following questions about speeding up inserts:
- How to speed up loading millions of rows from flat files (csv) to database
- How to load a table from another huge table.
- How to improve the speed of Insert statements
Approach:
If we want to speed up inserts in our environment, we first have to understand the ways in which records can be inserted.
There are two methods for creating records in the database with an insert: conventional path and direct path. From a performance point of view, the latter is much faster than the former.
Conventional Vs Direct Path:
- Direct path does not write to partially filled blocks, so there is no need to search for them. It always writes above the high water mark (HWM). This removes the overhead of finding blocks suitable for writing, and also the overhead of reading partially filled blocks into memory so that new records can be created in them.
- In contrast to conventional path, where the load calls Oracle to lock and unlock the table and indexes for each set (array) of records it processes, direct path calls Oracle to lock the table and indexes once at the beginning and unlock them at the end. This reduces the locking and unlocking overhead during the load.
- Instead of going through Oracle's buffer cache, direct path performs its own write I/O straight to the data files. This not only minimizes contention with other users but also improves insert performance.
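- Direct path generates almost no undo for the table data, because a rollback only needs to discard the blocks written above the high water mark. Index maintenance still generates undo.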
- For indexes on the table being loaded, direct path builds mini indexes separately and merges them into the actual index in bulk. This is much faster than maintaining the index row by row.
Direct Path Insert:
The Syntax:
A direct path insert statement can be written as:
- CREATE TABLE .. AS SELECT
- INSERT /*+ APPEND */ INTO … SELECT ..
- INSERT /*+ APPEND_VALUES */ INTO .. VALUES ..
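The hint in the statement is how we tell Oracle to run the insert as a direct path insert. CREATE TABLE .. AS SELECT uses direct path without any hint. The hint does not make the statement parallel by itself; parallel DML has to be enabled separately. Which hint to use depends on the type of statement.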
The Hint:
The APPEND hint should be used with statements of the form “INSERT INTO .. SELECT”, and the APPEND_VALUES hint (available from Oracle 11g Release 2) should be used with statements of the form “INSERT INTO .. VALUES ..”.
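The three forms above can be sketched as follows. This is only an illustration, not taken from the original post: the table names dp_demo and dp_demo_copy, the row counts, and the sample values are all hypothetical. APPEND_VALUES is shown inside a FORALL bulk bind, since that is where a direct path insert of individual VALUES rows is most likely to pay off.

```sql
-- Hypothetical demo table (name and columns are assumptions).
CREATE TABLE dp_demo (id NUMBER, name VARCHAR2(100));

-- Form 1: INSERT ... SELECT with the APPEND hint.
INSERT /*+ APPEND */ INTO dp_demo (id, name)
SELECT LEVEL, 'name_' || LEVEL
FROM dual
CONNECT BY LEVEL <= 100;

-- A direct path insert must be committed (or rolled back) before the
-- same session can read or modify the table again.
COMMIT;

-- Form 2: INSERT ... VALUES with the APPEND_VALUES hint, bulk-bound.
DECLARE
  TYPE t_ids   IS TABLE OF NUMBER;
  TYPE t_names IS TABLE OF VARCHAR2(100);
  l_ids   t_ids   := t_ids(101, 102, 103);
  l_names t_names := t_names('a', 'b', 'c');
BEGIN
  FORALL i IN 1 .. l_ids.COUNT
    INSERT /*+ APPEND_VALUES */ INTO dp_demo (id, name)
    VALUES (l_ids(i), l_names(i));
  COMMIT;
END;
/

-- Form 3: CREATE TABLE ... AS SELECT needs no hint.
CREATE TABLE dp_demo_copy AS
SELECT id, name FROM dp_demo;
```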
The Example:
Let’s understand this with the help of an example:
Prepare for Performance Measurement:
- Create a simple table DPATHINSERT [with only id as number and name as varchar2(100)] and gather its stats. Before starting, check the HWM for this table and the physical direct path write statistics.
- Let’s see how many blocks are allocated to the table below the HWM.
- Let’s see how many physical direct write operations have been performed so far:
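The queries used in the original post are not shown, so the following is a minimal sketch of the preparation steps. It assumes that USER_TABLES.BLOCKS (refreshed by gathering stats) is an acceptable measure of blocks below the HWM, and that the session can read V$MYSTAT and V$STATNAME; the post may have used different views.

```sql
-- Table from the post: only id and name.
CREATE TABLE dpathinsert (
  id   NUMBER,
  name VARCHAR2(100)
);

-- Gather stats so that USER_TABLES reflects the current segment.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'DPATHINSERT');
END;
/

-- Blocks below the HWM, as of the last stats gathering.
SELECT blocks, empty_blocks
FROM user_tables
WHERE table_name = 'DPATHINSERT';

-- Direct writes done by the current session so far.
SELECT sn.name, ms.value
FROM v$mystat ms
JOIN v$statname sn ON sn.statistic# = ms.statistic#
WHERE sn.name = 'physical writes direct';
```

Note the two values before each insert so they can be compared afterwards.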
Conventional Insert:
- Insert records into the table with a conventional insert statement, then check the HWM and the direct path writes. No direct path writes are found after a conventional insert.
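A conventional insert for this step might look like the sketch below. The row count of 1000 and the generated values are assumptions, since the post does not state what was inserted. It assumes a DPATHINSERT table with id and name columns already exists.

```sql
-- Conventional insert: no hint, rows go through the buffer cache.
INSERT INTO dpathinsert (id, name)
SELECT LEVEL, 'name_' || LEVEL
FROM dual
CONNECT BY LEVEL <= 1000;

COMMIT;

-- Refresh stats, then re-check the block count below the HWM.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'DPATHINSERT');
END;
/

SELECT blocks
FROM user_tables
WHERE table_name = 'DPATHINSERT';

-- The session's direct write count should be unchanged by this insert.
SELECT sn.name, ms.value
FROM v$mystat ms
JOIN v$statname sn ON sn.statistic# = ms.statistic#
WHERE sn.name = 'physical writes direct';
```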
Direct Path Insert:
- Insert records into the table with a direct path insert statement, then check the HWM and the direct path writes. You will find one direct path write, and the block count goes up to 9 from the previous count of 5.
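The direct path version of the same step differs only in the hint. As before, this is a sketch with an assumed row count and assumed values, run against an existing DPATHINSERT table; the exact block counts you see will depend on your tablespace and extent settings.

```sql
-- Direct path insert: the APPEND hint makes Oracle write above the HWM.
INSERT /*+ APPEND */ INTO dpathinsert (id, name)
SELECT LEVEL, 'name_' || LEVEL
FROM dual
CONNECT BY LEVEL <= 1000;

-- Required before this session can query the table again.
COMMIT;

-- Refresh stats, then re-check the block count below the HWM.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'DPATHINSERT');
END;
/

SELECT blocks
FROM user_tables
WHERE table_name = 'DPATHINSERT';

-- The session's direct write count should now have increased.
SELECT sn.name, ms.value
FROM v$mystat ms
JOIN v$statname sn ON sn.statistic# = ms.statistic#
WHERE sn.name = 'physical writes direct';
```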
Performance Comparison:
There is a substantial improvement in insert performance with direct path compared to conventional path. We compared the timings of the two types of insert for the same set of records.
We inserted 3 million (3,000,000) records into the same table several times with a conventional insert, and the time taken varied from run to run: the maximum was 60 seconds and the minimum was 21 seconds.
With the direct path insert, the time was nearly constant at about 9 seconds, a substantial improvement even over the best conventional time of 21 seconds.
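A comparison of this kind can be reproduced in SQL*Plus with a sketch like the one below. The post does not say how the rows were generated or whether the table was emptied between runs, so the row generator and the TRUNCATE statements are assumptions. The 3,000,000 rows are built from a 3000 x 1000 cross join, which avoids one very deep CONNECT BY.

```sql
-- SQL*Plus command: print the elapsed time after each statement.
SET TIMING ON

-- Run 1: conventional insert of 3,000,000 rows.
TRUNCATE TABLE dpathinsert;

INSERT INTO dpathinsert (id, name)
SELECT ROWNUM, 'name_' || ROWNUM
FROM (SELECT 1 FROM dual CONNECT BY LEVEL <= 3000),
     (SELECT 1 FROM dual CONNECT BY LEVEL <= 1000);

COMMIT;

-- Run 2: direct path insert of the same rows.
TRUNCATE TABLE dpathinsert;

INSERT /*+ APPEND */ INTO dpathinsert (id, name)
SELECT ROWNUM, 'name_' || ROWNUM
FROM (SELECT 1 FROM dual CONNECT BY LEVEL <= 3000),
     (SELECT 1 FROM dual CONNECT BY LEVEL <= 1000);

COMMIT;
```

Timings will differ from the figures above depending on hardware, redo logging mode, and any indexes on the table, so repeat each run a few times before comparing.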
This is how we can Speed up Inserts in our database.