Tuesday, 30 December 2025

Why Oracle Performs Full Table Scan for DISTINCT Queries

This question was recently asked by a colleague:

"I have an index on a column, but when I run a DISTINCT query on that column, Oracle still performs a full table scan. Why?”
At first glance, this feels counter-intuitive. If an index exists, Oracle should use it, right?
The answer is simple, but understanding why Oracle behaves this way is important for every DBA.
In this post, we’ll walk through a small test case and see exactly when and why Oracle uses (or ignores) an index for a DISTINCT query.

Test Setup

Let’s create a sample table with some data and an index on the column we’ll use in the DISTINCT query.
 

SQL>  create table some_data as select trunc(dbms_random.value(1,100)) val, a.* from dba_objects a;
Table created.
SQL> @desc some_data
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 VAL                                                NUMBER
 OWNER                                              VARCHAR2(128)
 OBJECT_NAME                                        VARCHAR2(128)
 SUBOBJECT_NAME                                     VARCHAR2(128)
 OBJECT_ID                                          NUMBER
 DATA_OBJECT_ID                                     NUMBER
 OBJECT_TYPE                                        VARCHAR2(23)
 CREATED                                            DATE
 LAST_DDL_TIME                                      DATE
 TIMESTAMP                                          VARCHAR2(19)
 STATUS                                             VARCHAR2(7)
 TEMPORARY                                          VARCHAR2(1)
 GENERATED                                          VARCHAR2(1)
 SECONDARY                                          VARCHAR2(1)
 NAMESPACE                                          NUMBER
 EDITION_NAME                                       VARCHAR2(128)
 SHARING                                            VARCHAR2(18)
 EDITIONABLE                                        VARCHAR2(1)
 ORACLE_MAINTAINED                                  VARCHAR2(1)
 APPLICATION                                        VARCHAR2(1)
 DEFAULT_COLLATION                                  VARCHAR2(100)
 DUPLICATED                                         VARCHAR2(1)
 SHARDED                                            VARCHAR2(1)
 CREATED_APPID                                      NUMBER
 CREATED_VSNID                                      NUMBER
 MODIFIED_APPID                                     NUMBER
 MODIFIED_VSNID                                     NUMBER

SQL> create index some_data_val_indx on some_data(val);
Index created.


SQL>
SQL>
SQL>  exec dbms_stats.gather_table_stats(null, 'SOME_DATA', cascade=>true);
PL/SQL procedure successfully completed.

SQL> set autot trace
SQL>  select distinct val from some_data;
99 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 443606636

-------------------------------------------------------------
| Id  | Operation          | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |           |    99 |   297 |   632   (1)| 00:00:01 |
|   1 |  HASH UNIQUE       |           |    99 |   297 |   632   (1)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| SOME_DATA |   116K|   341K|   628   (1)| 00:00:01 |
-------------------------------------------------------------
Statistics
----------------------------------------------------------
         36  recursive calls
          0  db block gets
       2340  consistent gets
          0  physical reads
          0  redo size
       2207  bytes sent via SQL*Net to client
        463  bytes received via SQL*Net from client
          8  SQL*Net roundtrips to/from client
          4  sorts (memory)
          0  sorts (disk)
         99  rows processed




Even though an index exists on VAL, Oracle chooses a full table scan.

What If We Force the Index Using a Hint?

SQL> select /*+ index(some_data) */ distinct val from some_data;
99 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 443606636
-------------------------------------------------------------
| Id  | Operation          | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |           |    99 |   297 |   632   (1)| 00:00:01 |
|   1 |  HASH UNIQUE       |           |    99 |   297 |   632   (1)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| SOME_DATA |   116K|   341K|   628   (1)| 00:00:01 |
-------------------------------------------------------------
Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 1 (U - Unused (1))
-------------------------------------------------------------

   2 -  SEL$1 / SOME_DATA@SEL$1
         U -  index(some_data)

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       2312  consistent gets
          0  physical reads
          0  redo size
       2207  bytes sent via SQL*Net to client
        486  bytes received via SQL*Net from client
          8  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         99  rows processed


Result
• Same execution plan
• Index hint is ignored
• Full table scan is still chosen

Oracle clearly doesn’t want to use the index here.

Why Is Oracle Ignoring the Index?

The reason is straightforward:
A B-Tree index does NOT store NULL values.
Since VAL is nullable, Oracle cannot guarantee that all distinct values can be derived by scanning only the index. To be 100% correct, it must scan the table to account for possible NULLs.
That’s why:
• Full table scan is chosen
• Index hints are ignored

What If the Column Is NOT NULL?

Let’s change the column definition.

SQL>  alter table some_data modify(val not null);
Table altered.



SQL>  select distinct val from some_data;

99 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 1329759908
--------------------------------------------------------------------------------------------
| Id  | Operation             | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------
|   0 | SELECT STATEMENT      |                    |    99 |   297 |    69   (8)| 00:00:01 |
|   1 |  HASH UNIQUE          |                    |    99 |   297 |    69   (8)| 00:00:01 |
|   2 |   INDEX FAST FULL SCAN| SOME_DATA_VAL_INDX |   116K|   341K|    65   (2)| 00:00:01 |
-------------------------------------------------------------
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        236  consistent gets
          0  physical reads
          0  redo size
       2207  bytes sent via SQL*Net to client
        463  bytes received via SQL*Net from client
          8  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         99  rows processed


Oracle now uses an INDEX FAST FULL SCAN, resulting in:
• Lower cost
• Fewer consistent gets
• Multiblock reads
• No table access

This is a significant improvement.

What If the Column Cannot Be Changed to NOT NULL?

In real systems, modifying column definitions is not always possible.
In that case, we can help Oracle by explicitly excluding NULLs.

SQL> alter table some_data modify (val null);
Table altered.

SQL> select /*+ index_ffs(some_data) */ distinct val from some_data where val is not null;

99 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1329759908
-------------------------------------------------------------
| Id  | Operation             | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |                    |    99 |   297 |    69   (8)| 00:00:01 |
|   1 |  HASH UNIQUE          |                    |    99 |   297 |    69   (8)| 00:00:01 |
|*  2 |   INDEX FAST FULL SCAN| SOME_DATA_VAL_INDX |   116K|   341K|    65   (2)| 00:00:01 |
-------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("VAL" IS NOT NULL)

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        236  consistent gets
          0  physical reads
          0  redo size
       2207  bytes sent via SQL*Net to client
        512  bytes received via SQL*Net from client
          8  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         99  rows processed



Oracle now safely uses the index because:
• NULL values are explicitly filtered
• Index contains all required rows

Key Takeaways
1. B-Tree indexes do not store NULL values
2. DISTINCT on a nullable column often results in a full table scan
3. Oracle may ignore index hints if correctness cannot be guaranteed
4. DISTINCT on a NOT NULL column can use an index efficiently
5. Adding WHERE column IS NOT NULL can enable index usage
6. INDEX FAST FULL SCAN is especially efficient when the index covers all required columns



I hope this article helped you. Your suggestions/feedback are most welcome.

Keep learning... Have a great day!!!


Thank you,
Amit Pawar
Email: amitpawar.dba@gmail.com
WhatsApp No: +91-8454841011




No comments:

Post a Comment