Friday, July 14, 2017

Why index is so fast ?


        is my recommendation for sql tuning. 

Index is a storing unit which can store column values . It is ordered,  low height and root-leaf structure. It retrives the column values orderly with rowid and put them together into oracle blocks called index block. Index structure is a physical structure. Index will take effect in columns in where statement eg : ( select * from person where name ='peter').



1. Index tree's height is low : because index blocks won't be full in reality , layer is often few (3 layers are 3 system IO firing), In big data query, if query 1 or 2 records, it will be fast. however it won't just querying 1 or 2 records , so system IO takes more query cost (slowly). If order by , group by and sum() all these functions are involved, it will be very slow.Thus distibuted file system like hadoop comes into the picture.

Query the first 100 records : use "as content from dual connect by level <= 100";


Select index_name, blevel,, leaf_blocks,num_rows, distinct_keys,clustering_factor from user_ind_statistics where table_name in ('T1','T2'); //check the table's index height,
blevel is the index layer number/index height,0 is less than 1 layer。


2.Index stores column values : index is consist of column value and rowid.

Index won't store null value:  if indexed columns have null value, even though it has index, but it will still go full table scan. So , in case of this occasion, try to use : 

select count(*) from t where object_id is not null;

or before table creation do :

alter table modify object_id not null;

Tuning sum() avg() etc : columns in all these functions must have indexes added so as to reduce logic read.

3.Index is ordered itself: 

optimise order by :"order by"'s cost is very high so try to add index in columns with order by .  because index is ordered itself , so it won't re-order which saves the cost and memory but cost CPU a lot.

optimise max() : add index in the column with max function. because index is ordered and store column values with tree structure, so if index is added, it will go to the leaf and find the last child directly which can greatly reduce the logic read.

optimise distinct() : distinct() doesn't go through index but by default go by hash unique algorithsm causing lots of logic read and PGA memory, so my way to query is : 


Select /* index(t) */ distinct object_id from it;  //add index to distinct columns

Don't optimise union


select object id from t1 union select object_id from t2;

Look at the above query using union, it will auto-filter the duplicated records , so It definately produces sorting,. Thus union + index is useless ,however, you can add index to union all.

composite index:  if multiple columns should be selected, you can create a composite index, if there exist where statement with "= , < or >", you must put the equaled columns before select.

if composite index is introduced(id, name), you don't need to create index seperately like (name). try to use "in" rather than > or < :


Select /* + index_desc(a,idx_t) */ * from t a order by owner asc ,object_type desc;

index in partition tables : your query will be very very fast !!

query min() and max() : you can't write select max(num), min(num) from ..., that will be very slow. because indexes in this occasion can't go to leaves to find the max and min at the same time, it can't look at 2 directions at the same time. thus optimised query will be :

Select max ,min from (select max(num_id) max from t) a,(select min(num_id) min from t) b;

Index scan methods 

In OCP 11G exam, when i studied it , there are 5 ways to scan the index,  I am not sure in 12C , these 5 still exist or have their name changed. So i just share some of my experience in 11G :

index range scan : range scan, it will refer to other indexes to cross-validate whether find the index specified.

index unique scan : unique scan, create unique index...; it's slightly faster than the range scan, because it won't refer to other indexes, as it is unique as defined.

table access by index rowid : no nedd to create index, this is the fastest way, directly locate the row!

index full scan & fast full scan : fast full scan is slightly faster, if order by is added , it will go index full scan , read one block once, ordered. without order by, it will go fast full scan , read multiple blocks once, not ordered.

index skip scan : if data is small , and limited in one column , it will jump, shuffle and fetch. eg : there is one column called "status", first 100 rows are "view" last 100 rows are "unread". in this situation, it will go skip scan.

Hope my share can help you improve your application . leave msgs if anything unclear
 

No comments:

Post a Comment

Add Loading Spinner for web request.

when web page is busily loading. normally we need to add a spinner for the user to kill their waiting impatience. Here, 2 steps we need to d...