Oracle Database – Gathering Statistics

Oracle Database – Gathering Statistics

Statistics are very important in databases as it allows the Oracle Optimizer (CBO) to determine the most efficient way to parse an SQL statement. Stale statistics will lead to poor database performance because the SQL execution will not be optimal. Oracle knows this is important and have the statistics gathered for you automatically via Automatic Optimizer Statistics Collection.

The types of Optimizer Statistics are:

Table statistics

  1. Number of rows
  2. Number of blocks
  3. Average row length

Column statistics

  1. Number of distinct values (NDV) in column
  2. Number of nulls in column
  3. Data distribution (histogram)
  4. Extended statistics

Index statistics

  1. Number of leaf blocks
  2. Levels
  3. Clustering factor

System statistics

  1. I/O performance and utilization
  2. CPU performance and utilization

So how should we gather statistics in Oracle manually?

  • ANALYZE
  • DBMS_STATS

There are 2 ways to do it. Prior to Oracle 10gR1, ANALYZE was the recommended way to do it. It is a command that can be easily executed.

SQL> analyze table schema_name.table_name [estimate|compute] statistics;

The COMPUTE clause will allow Oracle to perform a full table scan and insert the statistics into the data dictionary for the CBO to use. However it is very resource intensive, especially for a Data Warehouse, it might take a long time.

The ESTIMATE clause will allow Oracle to update its statistics use a sample data from the tables. It might not be as accurate, however it utilise a lot less time/resources than COMPUTE clause.

However, if you are running Oracle 10G and onwards, you should not be using ANALYZE. It is there for backward compatibility and may be removed in future releases. You should be using DBMS_STATS instead.

SQL> exec DBMS_STATS.GATHER_TABLE_STATS ('schema_name','table_name');

There are a few advantages:

  • DBMS_STATS is a stored procedure which can be manipulated easily
  • DBMS_STATS can be execute in parallel
  • Support for external and partitioned tables
  • Collects system statistics as well
  • The Optimizer, which depends upon statistics, will eventually use only statistics that have been collected by DBMS_STATS

Hope this helps

Cheers,
Wei Shan

Advertisements
  1. Leave a comment

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: