Please see my other blog for Oracle EBusiness Suite Posts - EBMentors

Search This Blog

Note: All the posts are based on practical approach avoiding lengthy theory. All have been tested on some development servers. Please don’t test any post on production servers until you are sure.

Tuesday, September 27, 2011

Application Tuning using Explain

The EXPLAIN facility can be used quickly and easily to
determine how the data is accessed (known as the access path) for
any given SQL statement, namely queries.  The ability to modify
this access path can yield an incredible performance benefit.  We
will discuss the various kinds of access paths and performance
gains and loss by providing scenarios.
Using the EXPLAIN facility requires that you build the
EXPLAIN_PLAN table in the Oracle account which owns the
application tables, views and indexes.  Your Oracle distribution
media contains the SQL script for doing this.
ORACLE_HOME/rdbms/admin/xplainpl.sql
 
Here is a brief description of the columns we will use in our
discussion.  You may find a full description of each column in
the file named EXPLAIN.DOC on your distribution media.

STATEMENT_ID:  Identifies the name of a specific execution plan
               for a given SQL statement.  It is set to NULL if
               the SET STATEMENT_ID is not used in the EXPLAIN
               PLAN ... syntax.

OPERATION:     The name of the operation being performed in this
               step of the plan.  Various operations performed
               might include Filters, Index, Table, Merge Joins
               and Table, to name a few.

OPTION:        An option that will modify the operation.  For
               example, on a table access, the operation may be
               TABLE ACCESS but the option may say BY ROWID or
               FULL.

OBJECT_OWNER:  The name of the schema or Oracle account that owns
               the database object.

OBJECT_NAME:   The name of the database object.

OBJECT_TYPE:   The object type i.e., TABLE, VIEW, INDEX, etc.

ID:            A number assigned to this step of the execution
               plan tree.

PARENT_ID:     The number assigned to the previous operation that
               received information from this operation.  Using
               the CONNECT BY with ID and PARENT_ID, a tree walk
               through the execution plan can be queried.

     Once this PLAN table has been built, the user is ready to
use EXPLAIN for their application. 
 
For example:

     EXPLAIN PLAN
     SET STATEMENT_ID = 'QUERY1'
     FOR
     SELECT *
     FROM EMP
     WHERE DEPTNO = 10;
 
Note that if the INTO clause is omitted in the EXPLAIN syntax,
EXPLAIN will try to use a table called PLAN_TABLE.  By querying
the plan table, we can see what the execution plan looks like:
 
will return:

OPERATION     OPTIONS  OBJECT_NAME  OBJECT_TYPE ID PARENT_ID
------------- -------- ------------ ----------- -- ---------
TABLE ACCESS  FULL     EMP          TABLE        1

1 row selected.
 
This indicates that a full table scan will occur for this
particular query.  Because no indexes have yet to be created on
the table EMP, all queries against EMP will result in full table
scans.  But suppose a non-unique index on the DEPTNO column of
EMP is created:
 
CREATE INDEX EMP_IDX ON EMP (DEPTNO); 
 
Now, if we re-explain the query:

     EXPLAIN PLAN
     SET STATEMENT_ID = 'QUERY2'
     FOR
     SELECT *
     FROM EMP
     WHERE DEPTNO = 10;

and examine the plan table:

     SELECT OPERATION, OPTIONS, OBJECT_NAME, OBJECT_TYPE, ID,
     PARENT_ID
     FROM PLAN_TABLE
     WHERE STATEMENT_ID = 'QUERY2'
     ORDER BY ID;

will return:

OPERATION     OPTIONS  OBJECT_NAME  OBJECT_TYPE ID PARENT_ID
------------- -------- ------------ ----------- -- ---------
TABLE ACCESS  BY ROWID EMP          TABLE        1
INDEX         RANGE-
              SCAN     EMP_IDX      NON-UNIQUE   2         1

2 rows selected.         
 
Here, we see that the index EMP_IDX is used to retrieve all rows
whose DEPTNO is 10 then access the table BY ROWID.  The index
stores each row's ROWID of the table.  Once a row is found in
the index, a fetch BY ROWID for the remainder of the row is done.
On large tables, this access path may be preferred over the
previous one as this access path will result in fewer disk I/O
operations.  If however, the index has "low selectivity", a full
table scan may be more efficient.
Consider this query and its execution plan:

     EXPLAIN PLAN
     SET STATEMENT_ID = 'QUERY3'
     FOR
     SELECT DEPTNO
     FROM EMP
     WHERE DEPTNO = 10; 

The plan for the execution path is:

OPERATION     OPTIONS  OBJECT_NAME  OBJECT_TYPE ID PARENT_ID
------------- -------- ------------ ----------- -- ---------
INDEX         RANGE-
              SCAN     EMP_IDX      NON-UNIQUE   1

The above execution plan shows that no data was needed from the
table, the query was resolved in the index only.
 
he optimizer will build an execution plan tree once it has
evaluated the query and determined the access path for the
query.  This execution tree (or plan tree) can be shown visually
from the plan table itself with the following query used in
SQL*Plus:

COLUMN plan FORMAT a70
SELECT lpad(' ', 3*level)||operation||'('||options||')
     '||object_name||
        ' ' ||object_type
     FROM plan_table
     CONNECT BY PRIOR id = parent_id and statement_id='&stmt_id'
     START WITH id = 1 and statement_id = '&stmt_id';

For example, consider this query:

     SELECT ENAME
     FROM EMP
     WHERE DEPTNO = 10
     ORDER BY ENAME;

Using the above alternative query to examine the plan table:

   SORT(ORDER BY)
      TABLE ACCESS(BY ROWID) EMP
         INDEX(RANGE SCAN) EMP_IDX NON-UNIQUE
 
The tree is read from the most indented part in, or in this case,
from the bottom up.  This plan tree illustrates that an index
range scan was performed on the EMP_IDX index and the ENAME data
fetched from the table by rowid.  These rows were then sorted
using the ORDER BY option.  This last part of the plan tree will
take the longest time to complete if this were a large table.
Suppose we explained this query:

     SELECT DEPTNO,ENANE
     FROM EMP
     WHERE DEPTNO BETWEEN 10 AND 30
     ORDER BY DEPTNO;

The plan tree looks like:
   TABLE ACCESS(BY ROWID) EMP
      INDEX(RANGE SCAN) EMP_IDX NON-UNIQUE
Note that the SORT(ORDER BY) is not in this plan tree yet the
query has an ORDER BY DEPTNO.  Why?  The sort will not be
necessary for 2 reasons:  1) the DEPTNO column is indexed and
hence, already sorted and 2) the DEPTNO is defined as NOT NULL
(i.e., DEPTNO   NOT NULL NUMBER).  Note that these must be in
place before this feature can be utilized.

Consider the following common join query:

     SELECT *
     FROM EMP,DEPT
     WHERE EMP.DEPTNO = DEPT.DEPTNO
     AND SAL > 5000;

The plan tree is:

   NESTED LOOPS()
      TABLE ACCESS(FULL) DEPT
      TABLE ACCESS(BY ROWID) EMP
         INDEX(RANGE SCAN) EMP_IDX NON-UNIQUE
 
NESTED LOOPS means that a sequential scan is done on one
table (DEPT) and each DEPTNO retrieved is looked up in the index
EMP_IDX on EMP.  This query has what is called a  driving table.
In this case, the driving table is DEPT.  In these type of joins,
the driving table is determined by which table is listed last.
This is because both tables have equal rank access paths (both
have non-unique indexes on DEPTNO) and since all rows must be
searched in at least one table, it is more efficient to perform a
full table scan on one table and look up the join column in the
index of the other to meet the join predicate.  In such cases,
ensure that the driving table with the fewest number of rows is
the last one listed in the FROM clause.  Note that the order of
tables in the FROM clause currently influences how the ORACLE
RDBMS chooses the access path for this type of join.

     The driving table in the above query can be overridden by
using a join-query where one of the WHERE predicates returns a
single row.  Suppose a unique index is constructed on the EMPNO
column on the EMP table:

     SELECT *
     FROM EMP,DEPT
     WHERE EMP.DEPTNO = DEPT.DEPTNO
     AND EMPNO = 7735;

The plan tree is:

   NESTED LOOPS()
      TABLE ACCESS(BY ROWID) EMP
         INDEX(UNIQUE SCAN) EMP_UIDX UNIQUE
      TABLE ACCESS(BY ROWID) DEPT
         INDEX(RANGE SCAN) DEPT_IDX NON-UNIQUE
This execution plan illustrates that the NESTED LOOPS will still
be done but only once because an INDEX UNIQUE SCAN operation is
performed on EMP (the unique index EMP_UIDX) which is the driving
table.  This value is looked up via an INDEX RANGE SCAN on the
DEPT_IDX and if the row is found, the table is access BY ROWID.

     Other operations and their modifying options may also show
up in an EXPLAIN output.  CONNECT BY for example, is an operation
used when the query contains a CONNECT BY PRIOR ... clause
causing the RDBMS to perform a tree walk through the table based
on the connect by condition.  The query which gives us the
execution plan tree is one such example.  Use the EXPLAIN on the
query and see what you get.
 
Still, other operations such as FILTERS, PROJECT, UNION may
also be encountered.  For instance, the following query will
cause a FILTER operation to be performed:

     SELECT DEPTNO,AVG(SAL),SUM(COM)
     FROM EMP
     GROUP BY DEPTNO
     HAVING SUM(COM) > 500;

Will have the following execution plan:

   FILTER()
      SORT(GROUP BY)
         TABLE ACCESS(FULL) EMP

Once the table has been sorted, the filter operation will collect
the rows having COM greater than 500.

     PROJECT is not as frequently encountered.  It indicates that
a subset of columns were selected of all columns when MINUS and
UNIONS are used.

     UNION and MINUS will be seen in the execution plan tree if
the user explicitly uses them in their SQL statements.
 
IEWS

     Virtual tables or VIEWS may also be explained.  It is of
good practice to do so before applications are to make use of
them.  This is done by just taking the query that the view is
based on and use EXPLAIN on the query.  The access path should be
transparent to the query utilizing the view.  Consider this view
for example:

     CREATE VIEW EMP_VIEW AS
     SELECT *
     FROM EMP
     WHERE DEPTNO > 20;

The execution plan tree is:

   TABLE ACCESS(BY ROWID) EMP
      INDEX(RANGE SCAN) EMP_IDX NON-UNIQUE

if you were to just select from the view.  If we were to join
this view with DEPT sa in:

     SELECT ENAME,LOC
     FROM EMP_VIEW,DEPT;

would yield this execution plan:

   NESTED LOOPS()
      TABLE ACCESS(FULL) DEPT
      TABLE ACCESS(BY ROWID) EMP
         INDEX(RANGE SCAN) EMP_IDX NON-UNIQUE

which is identical to an earlier example we gave.  If the
operation in a given step of the execution plan is "VIEW", this
indicates that this is a separate operation.  This step would be
performed to complete its parent operation.  You may find the
VIEW operation when views are defined based on queries joining
two or more tables.  There may be other situations where the VIEW
operation is performed.

     Ensure that if you are going to utilize views in your
application, that they are written for maximum performance yield.
ORACLE Support often finds many customer applications where views
are used with no regards to the performance they can cost if not
written properly.
 
WHAT CAN AND CAN'T BE EXPLAINED

     Just about any SQL statement can be explained but this tool
is not very useful for INSERT, UPDATE and DELETE statements
without subqueries.  INSERT statements without a subquery will
not create an execution plan tree.  UPDATE and DELETE will create
an execution plan as they must fetch rows even without a WHERE
clause or subqueries.
     In addition, EXPLAIN will reveal the use of other object
types such as SEQUENCE number generators if used in your SQL
statements.

     The only real restriction on EXPLAIN is that the user cannot
explain tables, views, indexes or other objects which are owned
by another user.  The user must own all of the objects to be
explained.  If there are objects for which the user has SELECT
privileges on but does not own the objects, EXPLAIN will return
an error indicating that objects owned by another user cannot be
explained.
 
         

No comments: