jump to navigation

Smart scan on Exadata and direct path reads. 10 July 2011

Posted by David Alejo Marcos in Uncategorized.
trackback

Couple of days ago I was called to investigate a performance problem on one of our developement databases. People complained of slowness without much indication of what was slow.

Production, DR and UAT are running on a full rack Exadata machine while development runs on a single server.

The problem:

Database is slow. This is what I was given to start with. After monitoring the system I did notice several sessions where performing direct path reads, db file scattered read and some other events.

I am aware of some people having experienced problems with Oracle doing direct path reads instead of db scattered reads so I thought we might be having the same problem.

The solution:

To cut a long story short, direct path reads was not our problem, at least not this time. I managed to find a long running session consuming many of the database resources. The session was running for 7 days, details below:

SQL Text
------------------------------
UPDATE ....
Global Information
------------------------------
 Status                                 :  EXECUTING
 Duration                               :  613427s
Global Stats
======================================================================================
| Elapsed |   Cpu   |    IO    | Concurrency | Buffer | Read | Read  | Write | Write |
| Time(s) | Time(s) | Waits(s) |  Waits(s)   |  Gets  | Reqs | Bytes | Reqs  | Bytes |
======================================================================================
|  617320 |   64631 |   552688 |        0.82 |   619M |  52M |   6TB |   47M |   2TB |
======================================================================================
SQL Plan Monitoring Details (Plan Hash Value=2108859552)
============================================================================================================================================================================================
| Id    |            Operation             |   Name    |  Rows   | Cost  |   Time    | Start  | Execs |   Rows   | Read  | Read  | Write | Write | Mem | Temp | Activity | Activity Detail |
|       |                                  |           | (Estim) |       | Active(s) | Active |       | (Actual) | Reqs  | Bytes | Reqs  | Bytes |     |      |   (%)    |   (# samples)   |
============================================================================================================================================================================================
|     0 | UPDATE STATEMENT                 |           |         |       |           |        |     1 |          |       |       |       |       |     |      |          |                 |
|     1 |   UPDATE                         | TABLE1    |         |       |           |        |     1 |          |       |       |       |       |     |      |          |                 |
|     2 |    FILTER                        |           |         |       |    613362 |    +63 |     1 |        0 |       |       |       |       |     |      |          |                 |
|     3 |     NESTED LOOPS                 |           |         |       |    613362 |    +63 |     1 |     6905 |       |       |       |       |     |      |          |                 |
|     4 |      NESTED LOOPS                |           |       9 | 27146 |    613362 |    +63 |     1 |     108K |       |       |       |       |     |      |          |                 |
|     5 |       SORT UNIQUE                |           |     839 | 24624 |    613405 |    +20 |     1 |     6905 |    38 |   7MB |  1243 | 257MB | 10M | 333M |          |                 |
|     6 |        TABLE ACCESS FULL         | TABLE1    |     839 | 24624 |        44 |    +20 |     1 |       5M |  1029 | 499MB |       |       |     |      |          |                 |
|  -> 7 |       INDEX RANGE SCAN           | INDEX01   |       6 |     2 |    613364 |    +63 |  6905 |     108K |   792 |   6MB |       |       |     |      |          |                 |
|  -> 8 |      TABLE ACCESS BY INDEX ROWID | TABLE1    |       1 |     6 |    613364 |    +63 |  108K |     6905 | 48839 | 382MB |       |       |     |      |          |                 |
|     9 |     FILTER                       |           |         |       |    613334 |    +91 |  6905 |     6904 |       |       |       |       |     |      |          |                 |
|    10 |      HASH GROUP BY               |           |     798 | 24625 |    252603 |    +65 |  6905 |       4G |   24M |   1TB |   47M | 488GB | 14M |   8M |          |                 |
| -> 11 |       TABLE ACCESS FULL          | TABLE1    |     839 | 24624 |    613364 |    +63 |  6905 |       4G |   29M | 416GB |       |       |     |      |          |                 |
============================================================================================================================================================================================

impressive, right?

The query was terminated and it is being reviewed on development but, why didn’t we have problems on prod?

It is my believe the reason is called smart scan on the Exadata storage :D.

Lets have a look how the query looks on Prod:

1.- is it enabled?

PROD> show parameter cell_offload_processing
NAME                     TYPE     VALUE
-----------------------  -------- ------------------------------
cell_offload_processing  boolean  TRUE

2.- Check the explain plan:

SQL>  SELECT * FROM table(DBMS_XPLAN.DISPLAY_CURSOR(sql_id =>'4xxxxxxxj',format =>'ALLSTATS LAST'));
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------
| Id  | Operation                      | Name      | E-Rows |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT               |           |        |       |       |          |
|   1 |  UPDATE                        | TABLE1    |        |       |       |          |
|*  2 |   HASH JOIN ANTI               |           |      9 |  3957K|  1689K| 4524K (0)|
|   3 |    NESTED LOOPS                |           |        |       |       |          |
|   4 |     NESTED LOOPS               |           |      9 |       |       |          |
|   5 |      SORT UNIQUE               |           |    839 |  2037K|   607K| 1810K (0)|
|*  6 |       TABLE ACCESS STORAGE FULL| TABLE1    |    839 |       |       |          |
|*  7 |      INDEX RANGE SCAN          | INDEX01   |      6 |       |       |          |
|*  8 |     TABLE ACCESS BY INDEX ROWID| TABLE1    |      1 |       |       |          |
|   9 |    VIEW                        | VW_NSO_1  |    839 |       |       |          |
|  10 |     SORT GROUP BY              |           |    839 |  2250K|   629K| 1999K (0)|
|* 11 |      TABLE ACCESS STORAGE FULL | TABLE1    |    839 |       |       |          |
----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("COL_ID"=TO_NUMBER("COL_ID"))
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
   6 - storage(("COL01"=:B2 AND NVL("COL02",'N')='N' AND "COL04"=:B3
              AND "COL03"=TO_NUMBER(:B1)))
       filter(("COL01"=:B2 AND NVL("COL02",'N')='N' AND "COL04"=:B3
              AND "COL03"=TO_NUMBER(:B1)))
   7 - access("COL05"="COL05" AND "COL06"="COL06" AND
              "COL07"="COL07")
   8 - filter((NVL("COL02",'N')='N' AND "COL08"="COL08"))
  11 - storage(("COL01"=:B2 AND NVL("COL02",'N')='N' AND "COL04"=:B3
              AND "COL03"=TO_NUMBER(:B1)))
       filter(("COL01"=:B2 AND NVL("COL02",'N')='N' AND "COL04"=:B3
              AND "COL03"=TO_NUMBER(:B1)))

On the plan_table_output, we see storage, indicating smart scan was being used, being the runtime 73 seconds:

PROD> select sql_id, executions, elapsed_time/1000000 from gv$sql
  2  where sql_id = '4xxxxxxxj';
SQL_ID        EXECUTIONS ELAPSED_TIME/1000000
------------- ---------- --------------------
'4xxxxxxxj'          1            73.166979

I must say, I am quite impressed with Exadata but, would it make it more difficult to spot bad queries?

As always, comments are welcome.

Comments

1. Greg Rahn - 28 July 2011

The reason your plan took so long in dev is the plan is different than prod. Also it’s a very bad plan and likely the table stats are out of date. Look at line 4 – the CBO estimates 9 rows but it has already run that NLJ 108,000 times – that is pain by definition. This is because line 5 thinks there are 839 rows but there are really 5 million!

David Alejo Marcos - 1 September 2011

Thank you Greg,

the point of the post is to see the difference between Exadata and a Non-Exadata environment as the database is similar in size an code being executed is the same.

I am lucky to have 4 Exadata environments, but DEV does not run on Exadata, making tuning harder.

David Alejo-Marcos.

2. Joaquin - 2 August 2011

Hi David,

“On the plan_table_output, we see storage, indicating smart scan was being used”

I think just seeing “storage” on the plan does not meen smart scan is used.
Check these url:
http://uhesse.wordpress.com/2011/01/19/exadata-part-i-smart-scan/
http://kerryosborne.oracle-guy.com/2011/03/how-to-tune-an-exadata/

Joaquin Gonzalez

David Alejo Marcos - 1 September 2011

Hi Joaquin,

it is true that only by having storage on the explain plan does not mean Oracle Exadata is performing smart scan.

On my post I did mention it did after verifying on v$sql and other system views.

You might also have a look to “snapper”, a really cool tool by Tanel Poder which helps to diagnose wait events (even smart scans) on real time.

Thank you,

David Alejo-Marcos.


Sorry comments are closed for this entry