Sunday, 20 May 2012

Manually add a host target to Grid Control / OEM

Recently I ran into a situation where I accidentally dropped the host target from an agent while moving the agent
to another grid.

The following solution worked for me:

Solution: Create a temporary file /tmp/hosttarget with one line:

<Target TYPE="host" NAME="hostname.abc.com"/>

Where hostname.abc.com is the host name. Then add the target with:

emctl config agent addtarget -f /tmp/hosttarget

Similarly, we can do this for other target types as well.

Saturday, 5 May 2012

Oracle Join Nested/Hash/Sort Merge- Performance Tuning

Oracle normally uses three join methods:

1. Nested Loop Join
2. Hash Join
3. Sort Merge Join

Today I will discuss the characteristics of these three joins:

1. Nested Loop Joins

Nested loop joins are useful when small subsets of data are being joined and if the join condition is an efficient way of accessing the second table.
It is very important to ensure that the inner table is driven from (dependent on) the outer table. If the inner table's access path is independent of the outer table, then the same rows are retrieved for every iteration of the outer loop, degrading performance considerably. In such cases, hash joins joining the two independent row sources perform better.

A nested loop join involves the following steps:
  1. The optimizer determines the driving table and designates it as the outer table.
  2. The other table is designated as the inner table.
  3. For every row in the outer table, Oracle accesses all the rows in the inner table. The outer loop is for every row in the outer table and the inner loop is for every row in the inner table. The outer loop appears before the inner loop in the execution plan, as follows:
    NESTED LOOPS 
      outer_loop 
      inner_loop 
    

1.1 When the Optimizer Uses Nested Loop Joins

The optimizer uses nested loop joins when joining a small number of rows, with a good driving condition between the two tables. You drive from the outer loop to the inner loop, so the order of tables in the execution plan is important.
The outer loop is the driving row source. It produces a set of rows for driving the join condition. The row source can be a table accessed using an index scan or a full table scan. Also, the rows can be produced from any other operation. For example, the output from a nested loop join can be used as a row source for another nested loop join.
The inner loop is iterated for every row returned from the outer loop, ideally by an index scan. If the access path for the inner loop is not dependent on the outer loop, then you can end up with a Cartesian product; for every iteration of the outer loop, the inner loop produces the same set of rows. Therefore, you should use other join methods when two independent row sources are joined together.
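
To make the outer/inner relationship concrete, here is a minimal Python sketch of the idea, not Oracle's implementation. The dictionary built by build_index is only a stand-in for an index on the inner table's join column, and the sample rows and function names are hypothetical.

```python
from collections import defaultdict


def build_index(rows, key):
    """Stand-in for an existing index on the inner table's join column."""
    index = defaultdict(list)
    for row in rows:
        index[row[key]].append(row)
    return index


def nested_loop_join(outer_rows, inner_index, key):
    """For each outer (driving) row, probe the inner table through its index."""
    joined = []
    for outer in outer_rows:                           # outer loop: driving row source
        for inner in inner_index.get(outer[key], []):  # inner loop: depends on the outer row
            joined.append({**outer, **inner})
    return joined


# Hypothetical sample rows; d drives the join, e is the inner table.
d = [
    {"deptno": 10, "dname": "ACCOUNTING"},
    {"deptno": 20, "dname": "RESEARCH"},
    {"deptno": 40, "dname": "OPERATIONS"},
]
e = [
    {"deptno": 10, "ename": "CLARK"},
    {"deptno": 20, "ename": "SMITH"},
    {"deptno": 20, "ename": "JONES"},
]

e_deptno = build_index(e, "deptno")
nl_result = nested_loop_join(d, e_deptno, "deptno")
for row in nl_result:
    print(row["dname"], row["ename"])
```

Because the inner lookup uses the join value of the current outer row, each iteration touches only matching rows. If the inner access ignored the outer row, every iteration would return the same full set of rows.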

1.2 Nested Loop Join Hints

If the optimizer is choosing to use some other join method, you can use the USE_NL(table1 table2) hint, where table1 and table2 are the aliases of the tables being joined.


2. Hash Joins

Hash joins are used for joining large data sets. The optimizer uses the smaller of two tables or data sources to build a hash table on the join key in memory. It then scans the larger table, probing the hash table to find the joined rows.
This method is best used when the smaller table fits in available memory. The cost is then limited to a single read pass over the data for the two tables.
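
The build and probe phases described above can be sketched in a few lines of Python. This is an illustration under simplifying assumptions (everything fits in memory, and the smaller input is picked by row count), with hypothetical function names and sample rows. It does not show how Oracle manages the hash area.

```python
def hash_join(build_rows, probe_rows, key):
    """Build a hash table on one input, then probe it with the other."""
    # Build phase: hash the (smaller) input on the join key, in memory.
    hash_table = {}
    for row in build_rows:
        hash_table.setdefault(row[key], []).append(row)

    # Probe phase: a single pass over the (larger) input.
    joined = []
    for probe in probe_rows:
        for build in hash_table.get(probe[key], []):
            joined.append({**build, **probe})
    return joined


def hash_join_smaller_first(rows_a, rows_b, key):
    """Pick the smaller input as the build side (here simply by row count)."""
    if len(rows_a) <= len(rows_b):
        return hash_join(rows_a, rows_b, key)
    return hash_join(rows_b, rows_a, key)


# Hypothetical sample rows.
departments = [
    {"deptno": 10, "dname": "ACCOUNTING"},
    {"deptno": 20, "dname": "RESEARCH"},
]
employees = [
    {"deptno": 20, "ename": "SMITH"},
    {"deptno": 10, "ename": "CLARK"},
    {"deptno": 20, "ename": "JONES"},
    {"deptno": 30, "ename": "ALLEN"},
]

hj_result = hash_join_smaller_first(employees, departments, "deptno")
for row in hj_result:
    print(row["ename"], row["dname"])
```

Each input is read once, which matches the single read pass mentioned above. The lookup only works for equality on the join key, which is why hash joins need an equijoin.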

2.1 When the Optimizer Uses Hash Joins

The optimizer uses a hash join to join two tables if they are joined using an equijoin and if either of the following conditions is true:
  • A large amount of data needs to be joined.
  • A large fraction of a small table needs to be joined.

2.2 Hash Join Hints

Apply the USE_HASH hint to instruct the optimizer to use a hash join when joining two tables together.

3. Sort Merge Joins

Sort merge joins can be used to join rows from two independent sources. Hash joins generally perform better than sort merge joins. On the other hand, sort merge joins can perform better than hash joins if both of the following conditions exist:
  • The row sources are sorted already.
  • A sort operation does not have to be done.
However, if a sort merge join involves choosing a slower access method (an index scan as opposed to a full table scan), then the benefit of using a sort merge might be lost.
Sort merge joins are useful when the join condition between two tables is an inequality condition (but not a nonequality) like <, <=, >, or >=. Sort merge joins perform better than nested loop joins for large data sets. You cannot use hash joins unless there is an equality condition.
In a merge join, there is no concept of a driving table. The join consists of two steps:
  1. Sort join operation: Both the inputs are sorted on the join key.
  2. Merge join operation: The sorted lists are merged together.
If the input is already sorted by the join column, then a sort join operation is not performed for that row source. However, a sort merge join always creates a positionable sort buffer for the right side of the join so that it can seek back to the last match in the case where duplicate join key values come out of the left side of the join.
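
The two steps, and the need to revisit the last matching group when the left side repeats a key, can be sketched in Python as follows. This is a simplified in-memory illustration with hypothetical names and data. It handles the equality case only and does not model Oracle's sort buffer.

```python
def sort_merge_join(left_rows, right_rows, key):
    """Sort both inputs on the join key, then merge them in one forward pass."""
    # Sort join step: both inputs are sorted on the join key.
    left = sorted(left_rows, key=lambda r: r[key])
    right = sorted(right_rows, key=lambda r: r[key])

    # Merge join step.
    joined = []
    i = j = 0
    while i < len(left) and j < len(right):
        left_key, right_key = left[i][key], right[j][key]
        if left_key < right_key:
            i += 1
        elif left_key > right_key:
            j += 1
        else:
            # Emit every right row with this key. j stays at the start of the
            # group, so a duplicate key on the left re-reads the same rows.
            k = j
            while k < len(right) and right[k][key] == left_key:
                joined.append({**left[i], **right[k]})
                k += 1
            i += 1
    return joined


# Hypothetical sample rows; deptno 20 appears twice on the left side.
emp_rows = [
    {"deptno": 20, "ename": "SMITH"},
    {"deptno": 10, "ename": "CLARK"},
    {"deptno": 20, "ename": "JONES"},
]
dept_rows = [
    {"deptno": 30, "dname": "SALES"},
    {"deptno": 10, "dname": "ACCOUNTING"},
    {"deptno": 20, "dname": "RESEARCH"},
]

smj_result = sort_merge_join(emp_rows, dept_rows, "deptno")
for row in smj_result:
    print(row["deptno"], row["ename"], row["dname"])
```

The output comes back ordered by the join key, which is why an ORDER BY on that key can make this method attractive to the optimizer.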

3.1 When the Optimizer Uses Sort Merge Joins

The optimizer can choose a sort merge join over a hash join for joining large amounts of data if any of the following conditions are true:
  • The join condition between two tables is not an equi-join.
  • Because of sorts already required by other operations, the optimizer finds it is cheaper to use a sort merge than a hash join.

3.2 Sort Merge Join Hints

To instruct the optimizer to use a sort merge join, apply the USE_MERGE hint. You might also need to give hints to force an access path.
There are situations where it is better to override the optimizer with the USE_MERGE hint. For example, the optimizer can choose a full scan on a table and avoid a sort operation in a query. However, there is an increased cost because a large table is accessed through an index and single block reads, as opposed to faster access through a full table scan.


Note: Broadly speaking, a nested loop join suits small row sets, and an index on the join column of the inner (driven) table gives it an edge. A hash join, on the other hand, suits large tables without useful indexes and uses PGA memory to build the hash table. A sort merge join tends to be used for medium-sized tables.

The note above is drawn from Oracle internals and manuals.


Example:

SQL> conn hr/*****
Connected.
SQL> create table e as select * from emp;
Table created.
SQL> create table d as select * from dept;
Table created.
SQL> create index e_deptno on e(deptno);
Index created.
Gather stats for D as they are:
SQL> exec dbms_stats.gather_table_stats('hr','D')
PL/SQL procedure successfully completed.

Set artificial stats for E:
SQL> exec dbms_stats.set_table_stats(ownname => 'hr', tabname => 'E', numrows => 100, numblks => 100, avgrlen => 124);
PL/SQL procedure successfully completed.

Set artificial stats for E_DEPTNO index
SQL> exec dbms_stats.set_index_stats(ownname => 'hr', indname => 'E_DEPTNO', numrows => 100, numlblks => 10);
PL/SQL procedure successfully completed.

Check out the plan:
A) With a small number of rows (100 in E), the optimizer uses a nested loop join.

SQL> select e.ename,d.dname from e, d where e.deptno=d.deptno;
Execution Plan
----------------------------------------------------------
Plan hash value: 3204653704
----------------------------------------------------------------------------------------
| Id  | Operation                   | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |          |   100 |  2200 |     6   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| E        |    25 |   225 |     1   (0)| 00:00:01 |
|   2 |   NESTED LOOPS              |          |   100 |  2200 |     6   (0)| 00:00:01 |
|   3 |    TABLE ACCESS FULL        | D        |     4 |    52 |     3   (0)| 00:00:01 |
|*  4 |    INDEX RANGE SCAN         | E_DEPTNO |    33 |       |     0   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

B) Let us set some more artificial stats to see which plan gets used:

SQL> exec dbms_stats.set_table_stats(ownname => 'hr', tabname => 'E', numrows => 1000000, numblks => 10000, avgrlen => 124);
PL/SQL procedure successfully completed.
SQL> exec dbms_stats.set_index_stats(ownname => 'hr', indname => 'E_DEPTNO', numrows => 1000000, numlblks => 1000);
PL/SQL procedure successfully completed.
SQL> exec dbms_stats.set_table_stats(ownname => 'hr', tabname => 'D', numrows => 1000000,numblks => 10000 , avgrlen => 124);
PL/SQL procedure successfully completed.

Now both E and D show 1,000,000 rows, and the index on E(DEPTNO) reflects the same.
The plan changes!
SQL> select e.ename,d.dname from e, d where e.deptno=d.deptno;
Execution Plan
----------------------------------------------------------
Plan hash value: 51064926
-----------------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |   250G|  5122G|       |  3968K(100)| 13:13:45 |
|*  1 |  HASH JOIN         |      |   250G|  5122G|    20M|  3968K(100)| 13:13:45 |
|   2 |   TABLE ACCESS FULL| E    |  1000K|  8789K|       |  2246   (3)| 00:00:27 |
|   3 |   TABLE ACCESS FULL| D    |  1000K|    12M|       |  2227   (2)| 00:00:27 |
-----------------------------------------------------------------------------------

C) Now, to test MERGE JOIN, we set a moderate number of rows and add an ORDER BY on the join column.
SQL> exec dbms_stats.set_table_stats(ownname => 'hr', tabname => 'E', numrows => 10000, numblks => 1000, avgrlen => 124);
PL/SQL procedure successfully completed.
SQL> exec dbms_stats.set_index_stats(ownname => 'hr', indname => 'E_DEPTNO', numrows => 10000, numlblks => 100);
PL/SQL procedure successfully completed.
SQL> exec dbms_stats.set_table_stats(ownname => 'hr', tabname => 'D', numrows => 1000, numblks => 100, avgrlen => 124);
PL/SQL procedure successfully completed.
SQL> select e.ename,d.dname from e, d where e.deptno=d.deptno order by e.deptno;
Execution Plan
----------------------------------------------------------
Plan hash value: 915894881
-----------------------------------------------------------------------------------------
| Id  | Operation                    | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |          |  2500K|    52M|   167  (26)| 00:00:02 |
|   1 |  MERGE JOIN                  |          |  2500K|    52M|   167  (26)| 00:00:02 |
|   2 |   TABLE ACCESS BY INDEX ROWID| E        | 10000 | 90000 |   102   (1)| 00:00:02 |
|   3 |    INDEX FULL SCAN           | E_DEPTNO | 10000 |       |   100   (0)| 00:00:02 |
|*  4 |   SORT JOIN                  |          |  1000 | 13000 |    25   (4)| 00:00:01 |
|   5 |    TABLE ACCESS FULL         | D        |  1000 | 13000 |    24   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

 

Friday, 4 May 2012

Invisible Indexes - 11g New Feature

Invisible Indexes in Oracle Database 11g Release 1 New Feature

Oracle 11g allows indexes to be marked as invisible. Invisible indexes are maintained like any other index, but they are ignored by the optimizer unless the OPTIMIZER_USE_INVISIBLE_INDEXES parameter is set to TRUE at the instance or session level. Indexes can be created as invisible by using the INVISIBLE keyword, and their visibility can be toggled using the ALTER INDEX command.

 1. Create a table t1 with 2 columns n1 and n2

Hint : Create table t1(n1 number,n2 number);

SQL> show parameter visible

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
optimizer_use_invisible_indexes      boolean     FALSE
SQL> Create table t1(n1 number,n2 number);

Table created.



2. Populate Records

    Begin
          For i in 1..1000 loop
           Insert into t1 values(i,i);
          end loop;
   end;
/

SQL> Begin
  2    For i in 1..1000 loop
  3      Insert into t1 values(i,i);
  4    end loop;
  5  end;
  6  /

PL/SQL procedure successfully completed.

SQL>



3. Create an invisible index on column n1

Hint :-

SQL> create index t1_n1 on t1(n1) invisible;


4. Check the execution plan
SQL> explain plan for select count(*) from t1 where n1=:b1;



SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3724264953

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |    13 |     2   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |    13 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   |    10 |   130 |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("N1"=TO_NUMBER(:B1))

Note
-----
   - dynamic sampling used for this statement (level=2)

18 rows selected.
As shown above, the optimizer does not use the index because it was created as invisible.

5. SQL> alter index t1_n1 visible;


6. Define a bind variable b1

SQL> variable b1 number

SQL> begin
        :b1:=5;
     end;
     /


7. SQL> explain plan for select count(*) from t1 where n1=:b1;



8. SQL> select * from table(dbms_xplan.display);
---------------------------------------------------------------------------
| Id  | Operation         | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |       |     1 |     5 |     1   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE   |       |     1 |     5 |            |          |
|*  2 |   INDEX RANGE SCAN| T1_N1 |     1 |     5 |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
   2 - access("N1"=TO_NUMBER(:B1))



9. SQL> alter index t1_n1 visible;


10. Check the plan again:


SQL> explain plan for select count(*) from t1 where n1=:b1;

Explained.

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 73337487

---------------------------------------------------------------------------
| Id  | Operation         | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |       |     1 |    13 |     1   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE   |       |     1 |    13 |            |          |
|*  2 |   INDEX RANGE SCAN| T1_N1 |    10 |   130 |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("N1"=TO_NUMBER(:B1))

Note
-----
   - dynamic sampling used for this statement (level=2)

18 rows selected.

SQL>

Benefits: Invisible indexes can be useful for processes with specific indexing needs, where the presence of the indexes may adversely affect other functional areas. They are also useful for testing the impact of dropping an index.

View: The current visibility status of an index is indicated by the VISIBILITY column of the [DBA|ALL|USER]_INDEXES views.

Bottleneck: An invisible index is hidden only from the optimizer when it plans queries. It is still maintained normally by DML, so it continues to add overhead to inserts, updates, and deletes.


Thursday, 3 May 2012

Row Chaining and Row Migration - Performance Tuning

Proof of Concept: There are two circumstances in which the data for a row in a table may not fit into a single data block: row chaining and row migration.

Row Chaining: Occurs when the row is too large to fit into one data block when it is first inserted. In this case, Oracle stores the data for the row in a chain of data blocks (one or more) reserved for that segment. Row chaining most often occurs with large rows, such as rows that contain a column of datatype LONG, LONG RAW, LOB, etc. Row chaining in these cases is unavoidable.

Row Migration: Occurs when a row that originally fitted into one data block is updated so that the overall row length increases, and the block’s free space is already completely filled. In this case, Oracle migrates the data for the entire row to a new data block, assuming the entire row can fit in a new block. Oracle preserves the original row piece of a migrated row to point to the new block containing the migrated row: the rowid of a migrated row does not change.
When a row is chained or migrated, performance associated with this row decreases because Oracle must scan more than one data block to retrieve the information for that row.

o INSERT and UPDATE statements that cause migration and chaining perform poorly, because they perform additional processing.

o SELECTs that use an index to select migrated or chained rows must perform additional I/Os.

Detection: Migrated and chained rows in a table or cluster can be identified by using the ANALYZE command with the LIST CHAINED ROWS option. This command collects information about each migrated or chained row and places this information into a specified output table. To create the table that holds the chained rows and populate it:

1. Execute the script UTLCHAIN.SQL, i.e. @?/rdbms/admin/utlchain.sql --> creates the table chained_rows, which will hold the data after the analyze.

2. ANALYZE TABLE scott.emp LIST CHAINED ROWS; --> analyzes the table and populates chained_rows

3. SELECT * FROM chained_rows; --> shows the row chaining and migration information collected in chained_rows

4. You can also detect migrated and chained rows by checking the 'table fetch continued row' statistic in the v$sysstat view.
SQL> SELECT name, value FROM v$sysstat WHERE name = 'table fetch continued row';
NAME                                                                  VALUE
---------------------------------------------------------------- ----------
table fetch continued row                                               308

Although migration and chaining are two different things, Oracle represents them internally as one. When detecting migration and chaining of rows, you should analyze carefully which of the two you are dealing with.

Resolving:
o In most cases chaining is unavoidable, especially when it involves tables with large columns such as LONGs, LOBs, etc. When you have a lot of chained rows in different tables and the average row length of these tables is not that large, you might consider rebuilding the database with a larger block size.

e.g.: You have a database with a 2K block size. Different tables have multiple large varchar columns with an average row length of more than 2K. This means that you will have a lot of chained rows because your block size is too small. Rebuilding the database with a larger block size can give you a significant performance benefit.
o Migration is caused by PCTFREE being set too low, so that there is not enough room in the block for updates. To avoid migration, all tables that are updated should have their PCTFREE set so that there is enough space within the block for updates.
You need to increase PCTFREE to avoid migrated rows. If you leave more free space available in the block for updates, then the row will have more room to grow.
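
As a rough back-of-the-envelope illustration of why a higher PCTFREE helps, the Python sketch below uses a deliberately simplified model of one block. It ignores block header and row directory overhead, assumes all rows have the same length, and assumes every row grows by the same amount. The function name and the numbers are hypothetical, and it is not a description of Oracle's space management.

```python
def min_rows_forced_out(pctfree, initial_row_len, grown_row_len, block_size=8192):
    """Lower bound on rows that cannot stay in the block once every row has grown."""
    # Inserts are accepted until the block is (100 - pctfree)% full.
    insert_limit = block_size * (100 - pctfree) // 100
    rows_inserted = insert_limit // initial_row_len
    # Updates may also use the PCTFREE reserve, i.e. up to the whole block.
    rows_that_fit_after_growth = block_size // grown_row_len
    return max(0, rows_inserted - rows_that_fit_after_growth)


# Rows of 100 bytes that grow to 130 bytes after an update.
forced_out = {p: min_rows_forced_out(p, 100, 130) for p in (10, 30)}
for pctfree, rows in forced_out.items():
    print(f"PCTFREE {pctfree}: at least {rows} rows must leave the block")
```

In this model, PCTFREE 10 packs 73 rows into the block but only 63 fit after growth, so at least 10 rows have to move. PCTFREE 30 packs 57 rows, all of which still fit.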
SQL Script to eliminate row migration :
-- Get the name of the table with migrated rows:
ACCEPT table_name PROMPT 'Enter the name of the table with migrated rows: '

-- Clean up from last execution
set echo off
DROP TABLE migrated_rows;
DROP TABLE chained_rows;
-- Create the CHAINED_ROWS table
@…/rdbms/admin/utlchain.sql
set echo on
spool fix_mig
-- List the chained and migrated rows
ANALYZE TABLE &table_name LIST CHAINED ROWS;

-- Copy the chained/migrated rows to another table
create table migrated_rows as
SELECT orig.*
FROM &table_name orig, chained_rows cr
WHERE orig.rowid = cr.head_rowid
AND cr.table_name = upper('&table_name');

-- Delete the chained/migrated rows from the original table
DELETE FROM &table_name WHERE rowid IN (SELECT head_rowid FROM chained_rows);

-- Copy the chained/migrated rows back into the original table
INSERT INTO &table_name SELECT * FROM migrated_rows;

spool off

Tips

1. Analyze the table and check the chained row count for that table
Chain count: 8671

analyze table tbl_tmp_transaction_details compute statistics;

select table_name,chain_cnt,pct_free,pct_used from dba_tables where table_name='TBL_TMP_TRANSACTION_DETAILS';

2. Increase PCTFREE to 30

alter table tbl_tmp_transaction_details pctfree 30;

3. Regenerate the report (chained rows appear only when rows get updated)

tbl_report_generation_status

begin dbms_job.run(190); end;

4. Analyze the table again and check the chained row count for that table
Chain count: 0

analyze table tbl_tmp_transaction_details compute statistics;

select table_name,chain_cnt,pct_free,pct_used from dba_tables where table_name='TBL_TMP_TRANSACTION_DETAILS';

Note:
If we want to delete the chained rows from the original table and insert them again, we need the chained_rows table.
To create the chained_rows table, run utlchain.sql from $ORACLE_HOME/rdbms/admin.

Find out the chained rows.

analyze table tbl_tmp_transaction_details list chained rows;

The above command records the rowids of the chained rows in the chained_rows table.
Based on the rowids in the chained_rows table, we can copy those records to a temp table, delete the chained rows from the original table, and then insert them again into the original table.

select * from tbl_tmp_transaction_details where rowid='AAAG8DAAGAAAGOKABD';


Example:

SQL> Create table frag_tab(code number,x1 char(2000),x2 char(2000),
  2                        x3 char(2000),x4 char(2000));

Table created.

SQL> Insert into frag_tab(code) values(1);

1 row created.

SQL> Insert into frag_tab(code) values(2);

1 row created.

SQL> Insert into frag_tab(code) values(3);

1 row created.

SQL> commit;

Commit complete.

SQL> update frag_tab set x1='x1',x2='x2',x3='x3',x4='x4' where code=2;

1 row updated.

SQL> update frag_tab set x1='x1',x2='x2',x3='x3',x4='x4' where code=1;

1 row updated.

SQL> update frag_tab set x1='x1',x2='x2',x3='x3',x4='x4' where code=3;

1 row updated.

SQL> commit;

Commit complete.

SQL> @?/rdbms/admin/utlchain.sql

Table created.

SQL> Analyze table frag_tab list chained rows;

Table analyzed.

SQL> select * from chained_rows;

OWNER_NAME                     TABLE_NAME
------------------------------ ------------------------------
CLUSTER_NAME                   PARTITION_NAME
------------------------------ ------------------------------
SUBPARTITION_NAME              HEAD_ROWID         ANALYZE_T
------------------------------ ------------------ ---------
SYS                            FRAG_TAB

N/A                            AAASOtAABAAAU9ZAAA 03-MAY-12

SYS                            FRAG_TAB

N/A                            AAASOtAABAAAU9ZAAB 03-MAY-12

OWNER_NAME                     TABLE_NAME
------------------------------ ------------------------------
CLUSTER_NAME                   PARTITION_NAME
------------------------------ ------------------------------
SUBPARTITION_NAME              HEAD_ROWID         ANALYZE_T
------------------------------ ------------------ ---------

SYS                            FRAG_TAB

N/A                            AAASOtAABAAAU9ZAAC 03-MAY-12





SQL> Create table duptab as select * from frag_tab where 1=0;

Table created.

SQL> Insert into duptab select * from frag_tab
  2  where rowid in(select head_rowid from chained_rows);

3 rows created.

SQL> delete frag_tab where rowid in(select head_rowid from chained_rows);

3 rows deleted.

SQL> Insert into frag_tab
  2  as select * from duptab
  3  /
as select * from duptab
*
ERROR at line 2:
ORA-00926: missing VALUES keyword


SQL> Insert into frag_tab
  2  select * from duptab
  3  /

3 rows created.

SQL>  commit;

Commit complete.

SQL> truncate table chained_rows;

Table truncated.

SQL> Create tablespace bigtbs
  2  datafile '/home/oracle/bigtbs.dbf' size 10m
  3  blocksize 16k;
Create tablespace bigtbs
*
ERROR at line 1:
ORA-29339: tablespace block size 16384 does not match configured block sizes


SQL> Alter system set db_16k_cache_size=10m;

System altered.

SQL> Create tablespace bigtbs
  2  datafile '/home/oracle/bigtbs.dbf' size 10m
  3  blocksize 16k;
Create tablespace bigtbs
*
ERROR at line 1:
ORA-01119: error in creating database file '/home/oracle/bigtbs.dbf'
ORA-27040: file create error, unable to create file
Linux Error: 2: No such file or directory


SQL> l
  1  Create tablespace bigtbs
  2           datafile '/home/oracle/bigtbs.dbf' size 10m
  3*          blocksize 16k
SQL> Create tablespace bigtbs
  2  datafile '/tmp/bigtbs.dbf' size 10m
  3  blocksize 16k;

Tablespace created.

SQL> Alter table frag_tab move tablespace bigtbs;

Table altered.

SQL> Analyze table frag_tab list chained rows;

Table analyzed.

SQL> select * from chained_rows
  2  /

no rows selected

SQL>

Wednesday, 2 May 2012

Incremental Statistics Gathering Feature - 11g

Expensive global statistics collection


In a data warehouse environment it is very common to do a bulk load directly into one or more empty partitions. This makes the partition statistics stale and may also make the global statistics stale. Re-gathering statistics for the affected partitions and for the entire table can be very time consuming. Traditionally, statistics collection is done in a two-pass approach:
  • In the first pass we will scan the table to gather the global statistics
  • In the second pass we will scan the partitions that have been changed to gather their partition level statistics.
The full scan of the table for global statistics collection can be very expensive depending on the size of the table. Note that the scan of the entire table is done even if we change a small subset of partitions.


In Oracle Database 11g, we avoid scanning the whole table when computing global statistics by deriving the global statistics from the partition statistics. Some of the statistics can be derived easily and accurately from partition statistics. For example, the number of rows at the global level is the sum of the number of rows of the partitions. Even the global histogram can be derived from partition histograms. But the number of distinct values (NDV) of a column cannot be derived from partition-level NDVs, because the same value may appear in more than one partition. So, Oracle maintains another structure called a synopsis for each column at the partition level. A synopsis can be considered a sample of distinct values. The NDV can be accurately derived from synopses. We can also merge multiple synopses into one. The global NDV is derived from the synopsis generated by merging all of the partition-level synopses. To summarize:

  1. Gather statistics and create synopses for the changed partitions only
  2. Oracle automatically merges partition level synopses into a global synopsis
  3. The global statistics are automatically derived from the partition level statistics and global synopses
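
The difference between summing partition NDVs and merging synopses can be illustrated with a small Python sketch. Here a synopsis is modelled, as a loud simplification, by the exact set of distinct values in a partition; the partition names and STATUS values are hypothetical, and this does not reflect the internal format Oracle uses.

```python
def build_synopsis(values):
    """Simplified synopsis: the set of distinct values seen in one partition."""
    return set(values)


def merge_synopses(synopses):
    """Merge partition-level synopses into a single global synopsis."""
    merged = set()
    for synopsis in synopses:
        merged |= synopsis
    return merged


# Hypothetical STATUS column values per partition.
partitions = {
    "P1": ["SHIPPED", "SHIPPED", "CANCELLED"],
    "P2": ["SHIPPED", "PENDING"],
    "P3": ["PENDING", "PENDING", "NEW"],
}

synopses = {name: build_synopsis(vals) for name, vals in partitions.items()}

global_rows = sum(len(vals) for vals in partitions.values())  # row counts simply add up
sum_of_ndvs = sum(len(s) for s in synopses.values())          # overcounts shared values
global_ndv = len(merge_synopses(synopses.values()))           # derived from merged synopsis

# After a load into one partition, only that partition's synopsis is rebuilt.
partitions["P3"].append("RETURNED")
synopses["P3"] = build_synopsis(partitions["P3"])
new_global_ndv = len(merge_synopses(synopses.values()))

print(global_rows, sum_of_ndvs, global_ndv, new_global_ndv)
```

Summing the per-partition NDVs gives 6 because SHIPPED and PENDING are each counted twice, while the merged synopsis gives the correct 4. After the load, only P3 is re-read and the global NDV becomes 5 without touching P1 or P2.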


The incremental maintenance feature is disabled by default. It can be enabled by changing the INCREMENTAL table preference to true. It can also be enabled for a particular schema or at the database level.

Assume we have table called SALES that is range partitioned by day on the SALES_DATE column. At the end of every day data is loaded into latest partition and partition statistics are gathered. Global statistics are only gathered at the end of every month because gathering them is very time and resource intensive. Use the following steps in order to maintain global statistics after every load.
1 - Turn on the incremental feature for the table.

EXEC DBMS_STATS.SET_TABLE_PREFS('SH','SALES','INCREMENTAL','TRUE');

2 - At the end of every load, gather table statistics using the GATHER_TABLE_STATS command. You don't need to specify the partition name. Also, do not specify the granularity parameter. The command will collect statistics for partitions where data has changed or statistics are missing, and update the global statistics based on the partition-level statistics and synopses.
EXEC DBMS_STATS.GATHER_TABLE_STATS('SH','SALES');

Note that the incremental maintenance feature was introduced in Oracle Database 11g Release 1. However, we also provide a solution in Oracle Database 10g Release 2 (10.2.0.4) that simulates the same behavior. The 10g solution is a new value, 'APPROX_GLOBAL AND PARTITION', for the GRANULARITY parameter of the GATHER_TABLE_STATS procedures. It behaves the same as the incremental maintenance feature except that we don't update the NDV for non-partitioning columns and the number of distinct keys of the index at the global level. For the partitioning column we update the NDV as the sum of the NDVs at the partition level. Also, we set the NDV of columns of unique indexes to the number of rows of the table. In general, the non-partitioning column NDV at the global level becomes stale less often. It may be possible to collect global statistics less frequently than the default (when the table changes by 10%), since the approx_global option maintains most of the global statistics accurately.

Let's take a look at an example to see how you would effectively use the Oracle Database 10g approach.
After the data load is complete, gather statistics using DBMS_STATS.GATHER_TABLE_STATS for the last partition (say SALES_11FEB2009), specify granularity => 'APPROX_GLOBAL AND PARTITION'. It will collect statistics for the specified partition and derive global statistics from partition statistics (except for NDV as described before).

EXEC DBMS_STATS.GATHER_TABLE_STATS ('SH', 'SALES', 'SALES_11FEB2009', GRANULARITY => 'APPROX_GLOBAL AND PARTITION');
It is necessary to install the one off patch for bug 8719831 if you are using the above features in 10.2.0.4 (patch 8877245) or in 11.1.0.7 (patch 8877251)


Let’s take the ORDERS2 table, which is partitioned by month on order_date.  We will begin by enabling incremental statistics for the table and gathering statistics on the table.


After the statistics gather the last_analyzed date for the table and all of the partitions now show 13-Mar-12.

And we now have the following column statistics for the ORDERS2 table.

We can also confirm that we really did use incremental statistics by querying the dictionary table sys.HIST_HEAD$, which should have an entry for each column in the ORDERS2 table.

So, now that we have established a good baseline, let's move on to the DML. Information is loaded into the latest partition of the ORDERS2 table once a month. Existing orders may also be updated to reflect changes in their status. Let's assume the following transactions take place on the ORDERS2 table this month.

After these transactions have occurred, we need to re-gather statistics, since the partition ORDERS_MAR_2012 now has rows in it and the number of distinct values and the maximum value for the STATUS column have also changed.

Now if we look at the last_analyzed date for the table and the partitions, we will see that the global statistics and the statistics on the partitions where rows have changed due to the update (ORDERS_FEB_2012) and the data load (ORDERS_MAR_2012) have been updated.

The column statistics also reflect the changes, with the number of distinct values in the STATUS column increased to reflect the update.



Note: The information above is drawn from Oracle blogs and Oracle manuals.