Oracle Live

13/04/2016

Histogramas II v.12c

Filed under: Database,Estadisticas,Laboratorios,New 12c — mogukiller @ 5:52 pm

Con Oracle 12c se introducen 2 tipos nuevos de histogramas. TOP Frequency y Hybrid Histogram. En esta entrada se hacen ejemplos prácticos de cuando Oracle opta por unos o por otros. Además de como calcula la cardinalidad en cada uno de los casos.


-- Teoria
-- Queries importantes
-- FREQUENCY HISTOGRAM
-- TOP FREQUENCY HISTOGRAM
-- HYBRID HISTOGRAM
----------------------------------------
-- Teoria
----------------------------------------
Referencias:
White Paper: Understanding Optimizer Statistics with Oracle 12c


----------------------------------------
-- Queries importantes
----------------------------------------
@col_usage

SET ECHO off
COL OWNER FOR A10
COL OBJECT_NAME FOR A30
COL COLUMN_NAME FOR A20
SELECT B.OWNER,B.OBJECT_NAME,C.COLUMN_NAME, A.EQUALITY_PREDS, A.EQUIJOIN_PREDS, A.NONEQUIJOIN_PREDS,A.RANGE_PREDS,
A.LIKE_PREDS, A.NULL_PREDS FROM COL_USAGE$ A, DBA_OBJECTS B, DBA_TAB_COLUMNS C WHERE A.OBJ#=B.OBJECT_ID
AND A.INTCOL# = C.COLUMN_ID
AND B.OWNER = C.OWNER
AND B.OBJECT_NAME = C.TABLE_NAME
AND B.OWNER NOT IN ('SYS','SYSTEM')
AND B.OBJECT_NAME LIKE '%&table_name%'
ORDER BY 1,2;

@hist_type

COL TABLE_NAME FOR A30
SET ECHO off
SELECT OWNER, TABLE_NAME, COLUMN_NAME, NUM_DISTINCT,NUM_NULLS,HISTOGRAM,NUM_BUCKETS,SAMPLE_SIZE, LAST_ANALYZED FROM DBA_TAB_COL_STATISTICS
WHERE TABLE_NAME LIKE '%&table_name%' ORDER BY 1,2,3;

@hist_distribution

COL ENDPOINT_ACTUAL_VALUE FOR A30
SET ECHO off
SELECT OWNER, TABLE_NAME, COLUMN_NAME, ENDPOINT_NUMBER, ENDPOINT_ACTUAL_VALUE FROM DBA_HISTOGRAMS WHERE OWNER NOT IN ('SYSTEM','SYS')
AND TABLE_NAME LIKE '%&table_name%' ORDER BY 1,2,3,4;

@hist_distribution_2

COL ENDPOINT_ACTUAL_VALUE FOR A30
SET ECHO off
SELECT COLUMN_NAME, ENDPOINT_NUMBER, ENDPOINT_ACTUAL_VALUE,ENDPOINT_REPEAT_COUNT FROM DBA_HISTOGRAMS WHERE OWNER NOT IN ('SYSTEM','SYS')
AND TABLE_NAME LIKE '%&table_name%' ORDER BY 1,2,3,4;

-- FREQUENCY HISTOGRAM
-- Creamos una tabla con 10K con menos de 254 valores diferentes.

CREATE TABLE TAB_FREC AS
SELECT level AS id,
TRUNC(DBMS_RANDOM.value(1,254)) AS record_type,
'Description for ' || level AS description
FROM   dual
CONNECT BY level <= 10000;
COMMIT;

-- Hacemos diferentes tipos de queries sobre la columna record_type

SELECT COUNT(*) FROM TAB_FREC WHERE record_type = 1;
SELECT COUNT(*) FROM TAB_FREC WHERE record_type > 10;

-- Hacemos el flush para que se refleje en la tabla
EXEC dbms_stats.flush_database_monitoring_info();

-- Revisamos la col_usage$
@col_usage
/*
OWNER      OBJECT_NAME                    COLUMN_NAME          EQUALITY_PREDS EQUIJOIN_PREDS NONEQUIJOIN_PREDS RANGE_PREDS LIKE_PREDS NULL_PREDS
---------- ------------------------------ -------------------- -------------- -------------- ----------------- ----------- ---------- ----------
MOGU       TAB_FREC                       RECORD_TYPE                       1              0                 0           1          0          0
*/

-- Pasamos estadisticas.
EXEC DBMS_STATS.gather_table_stats('MOGU', 'TAB_FREC');

-- Miramos el tipo de histograma que ha creado
@hist_type
/*
OWNER      TABLE_NAME                     COLUMN_NAME          NUM_DISTINCT  NUM_NULLS HISTOGRAM       NUM_BUCKETS SAMPLE_SIZE LAST_ANAL
---------- ------------------------------ -------------------- ------------ ---------- --------------- ----------- ----------- ---------
MOGU       TAB_FREC                       DESCRIPTION                 10000          0 NONE                      1       10000 02-FEB-16
MOGU       TAB_FREC                       ID                          10000          0 NONE                      1       10000 02-FEB-16
MOGU       TAB_FREC                       RECORD_TYPE                   253          0 FREQUENCY               253       10000 02-FEB-16 <<<<
*/

@hist_distribution
/*
OWNER      TABLE_NAME                     COLUMN_NAME          ENDPOINT_NUMBER ENDPOINT_ACTUAL_VALUE
---------- ------------------------------ -------------------- --------------- ------------------------------
MOGU       TAB_FREC                       DESCRIPTION                        0
MOGU       TAB_FREC                       DESCRIPTION                        1
MOGU       TAB_FREC                       ID                                 0
MOGU       TAB_FREC                       ID                                 1
MOGU       TAB_FREC                       RECORD_TYPE                       37 1
MOGU       TAB_FREC                       RECORD_TYPE                       75 2
MOGU       TAB_FREC                       RECORD_TYPE                      113 3
MOGU       TAB_FREC                       RECORD_TYPE                      151 4
MOGU       TAB_FREC                       RECORD_TYPE                      188 5
MOGU       TAB_FREC                       RECORD_TYPE                      226 6
MOGU       TAB_FREC                       RECORD_TYPE                      262 7
.
.
.
MOGU       TAB_FREC                       RECORD_TYPE                     9954 252
MOGU       TAB_FREC                       RECORD_TYPE                    10000 253
*/

-- Coprobamos el plan de ejecucion de una query que vaya por ese predicado
SET AUTOTRACE ON
SELECT COUNT(*) FROM TAB_FREC WHERE RECORD_TYPE = 70;
/*
-------------------------------------------------------------------------------
| Id  | Operation          | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |          |     1 |     4 |    14   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |          |     1 |     4 |            |          |
|*  2 |   TABLE ACCESS FULL| TAB_FREC |    34 |   136 |    14   (0)| 00:00:01 |
-------------------------------------------------------------------------------
*/
----------------------------------
-- TOP FREQUENCY HISTOGRAM
----------------------------------

-- Creamos una tabla con el 99.9% entre 1 y 254 y el 0,1% mayor que 254.

-- El 99,9% son valores comprendidos entre 1 y 254

CREATE TABLE TAB_TOPFREC AS
SELECT level AS id,
TRUNC(DBMS_RANDOM.value(1,254)) AS record_type,
'Description for ' || level AS description
FROM   dual
CONNECT BY level <= 100000;

-- El 0,1% valores mayores de 254

INSERT INTO TAB_TOPFREC
SELECT level AS id,
TRUNC(DBMS_RANDOM.value(255,1000)) AS record_type,
'Description for ' || level AS description
FROM   dual
CONNECT BY level <= 100;

commit;

'nota: hacemos una query para que rellene COL_USAGE$'
SELECT COUNT(*) FROM TAB_TOPFREC WHERE record_type = 1;
SELECT COUNT(*) FROM TAB_TOPFREC WHERE record_type > 10;

EXEC dbms_stats.flush_database_monitoring_info();
/*
OWNER      OBJECT_NAME                    COLUMN_NAME          EQUALITY_PREDS EQUIJOIN_PREDS NONEQUIJOIN_PREDS RANGE_PREDS LIKE_PREDS NULL_PREDS
---------- ------------------------------ -------------------- -------------- -------------- ----------------- ----------- ---------- ----------
MOGU       TAB_TOPFREC                    RECORD_TYPE                       1              0                 0           1          0          0
*/

-- Pasamos estadisticas
EXEC DBMS_STATS.gather_table_stats('MOGU', 'TAB_TOPFREC');

-- Vemos que ha generado histogramas de tipo TOP_FREQUENCY
@hist_type
/*
OWNER      TABLE_NAME                     COLUMN_NAME          NUM_DISTINCT  NUM_NULLS HISTOGRAM       NUM_BUCKETS SAMPLE_SIZE LAST_ANAL
---------- ------------------------------ -------------------- ------------ ---------- --------------- ----------- ----------- ---------
MOGU       TAB_TOPFREC                    DESCRIPTION                100100          0 NONE                      1      100100 13-APR-16
MOGU       TAB_TOPFREC                    ID                         100100          0 NONE                      1      100100 13-APR-16
MOGU       TAB_TOPFREC                    RECORD_TYPE                   347          0 TOP-FREQUENCY           254      100100 13-APR-16
*/

@hist_distribution_2
/*
OWNER      TABLE_NAME                     COLUMN_NAME          ENDPOINT_NUMBER ENDPOINT_ACTUAL_VALUE          ENDPOINT_VALUE
---------- ------------------------------ -------------------- --------------- ------------------------------ --------------
MOGU       TAB_TOPFREC                    RECORD_TYPE                      380 1                                           1
MOGU       TAB_TOPFREC                    RECORD_TYPE                      799 2                                           2
MOGU       TAB_TOPFREC                    RECORD_TYPE                     1196 3                                           3
MOGU       TAB_TOPFREC                    RECORD_TYPE                     1592 4                                           4
MOGU       TAB_TOPFREC                    RECORD_TYPE                     1961 5                                           5
MOGU       TAB_TOPFREC                    RECORD_TYPE                     2301 6                                           6
MOGU       TAB_TOPFREC                    RECORD_TYPE                     2681 7                                           7
MOGU       TAB_TOPFREC                    RECORD_TYPE                     3095 8                                           8
MOGU       TAB_TOPFREC                    RECORD_TYPE                     3455 9                                           9
.
.
.
MOGU       TAB_TOPFREC                    RECORD_TYPE                    99186 251                                       251
MOGU       TAB_TOPFREC                    RECORD_TYPE                    99560 252                                       252
MOGU       TAB_TOPFREC                    RECORD_TYPE                   100000 253                                       253
MOGU       TAB_TOPFREC                    RECORD_TYPE                   100001 998                                       998
*/
'nota: Comprobamos que ha creado 254 buckets y en el ultimo ha metido todos los valores superiores a 254'

-- Comprobamos como se calcula la cardinalidad.
SELECT COUNT(*) FROM TAB_TOPFREC WHERE RECORD_TYPE=6;
/*
----------------------------------------------------------------------------------
| Id  | Operation          | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |             |     1 |     4 |   141   (1)| 00:00:01 |
|   1 |  SORT AGGREGATE    |             |     1 |     4 |            |          |
|*  2 |   TABLE ACCESS FULL| TAB_TOPFREC |   340 |  1360 |   141   (1)| 00:00:01 |    <<< Lo calcula perfectamente ya que es el endpoint de un bucket.
----------------------------------------------------------------------------------
*/
SELECT COUNT(*) FROM TAB_TOPFREC WHERE RECORD_TYPE=405;
/*
----------------------------------------------------------------------------------
| Id  | Operation          | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |             |     1 |     4 |   141   (1)| 00:00:01 |
|   1 |  SORT AGGREGATE    |             |     1 |     4 |            |          |
|*  2 |   TABLE ACCESS FULL| TAB_TOPFREC |     1 |     4 |   141   (1)| 00:00:01 |
----------------------------------------------------------------------------------
*/
'nota: a los valores por encima de 254 les da a tods la misma cardinalidad 1'

-------------------------------
-- HYBRID HISTOGRAM
-------------------------------

'nota: el tamaño de los bucket en HYBRID pueden cambiar ligeramente su tamaño para albergar a todos los valores'

-- Creamos una tabla con mas de 254 vamores diferentes

CREATE TABLE TAB_HYBRID AS
SELECT level AS id,
TRUNC(DBMS_RANDOM.value(1,600)) AS record_type,
'Description for ' || level AS description
FROM   dual
CONNECT BY level <= 10000;
COMMIT;

-- Pasamos estadisticas.
EXEC DBMS_STATS.gather_table_stats('MOGU', 'TAB_HYBRID');

@col_usage
/*
no rows selected
*/
'nota: tampoco ha creado histogramas'
/*
OWNER                          TABLE_NAME                     COLUMN_NAME                    NUM_DISTINCT  NUM_NULLS HISTOGRAM       NUM_BUCKETS SAMPLE_SIZE LAST_ANAL
------------------------------ ------------------------------ ------------------------------ ------------ ---------- --------------- ----------- ----------- ---------
MOGU                           TAB_HYBRID                     DESCRIPTION                           10000          0 NONE                      1       10000 02-FEB-16
MOGU                           TAB_HYBRID                     ID                                    10000          0 NONE                      1       10000 02-FEB-16
MOGU                           TAB_HYBRID                     RECORD_TYPE                             599          0 NONE                      1       10000 02-FEB-16
*/

'nota: hacemos una query para que rellene COL_USAGE$'
SELECT COUNT(*) FROM TAB_HYBRID WHERE RECORD_TYPE =200;
SELECT COUNT(*) FROM TAB_HYBRID WHERE RECORD_TYPE <200;
SELECT COUNT(*) FROM TAB_HYBRID WHERE RECORD_TYPE <>1000;
SELECT COUNT(*) FROM TAB_HYBRID WHERE RECORD_TYPE IS NULL;

EXEC dbms_stats.flush_database_monitoring_info();
@col_usage
/*
OWNER      OBJECT_NAME                    COLUMN_NAME          EQUALITY_PREDS EQUIJOIN_PREDS NONEQUIJOIN_PREDS RANGE_PREDS LIKE_PREDS NULL_PREDS
---------- ------------------------------ -------------------- -------------- -------------- ----------------- ----------- ---------- ----------
MOGU       TAB_HYBRID                     RECORD_TYPE                       1              0                 0           1          0          0
*/

EXEC DBMS_STATS.gather_table_stats('MOGU', 'TAB_HYBRID');

@hist_type
/*
OWNER      TABLE_NAME                     COLUMN_NAME          NUM_DISTINCT  NUM_NULLS HISTOGRAM       NUM_BUCKETS SAMPLE_SIZE LAST_ANAL
---------- ------------------------------ -------------------- ------------ ---------- --------------- ----------- ----------- ---------
MOGU       TAB_HYBRID                     DESCRIPTION                 10000          0 NONE                      1       10000 02-FEB-16
MOGU       TAB_HYBRID                     ID                          10000          0 NONE                      1       10000 02-FEB-16
MOGU       TAB_HYBRID                     RECORD_TYPE                   599          0 NONE                      1       10000 02-FEB-16
*/

-- No me ha creado histogramas, vuelvo a cargar otros 10000 con 1000 valores diferentes
INSERT INTO TAB_HYBRID
SELECT level AS id,
TRUNC(DBMS_RANDOM.value(1,1000)) AS record_type,
'Description for ' || level AS description
FROM   dual
CONNECT BY level <= 10000;
COMMIT;
'nota: Insertamos un valor muy popular'

INSERT INTO TAB_HYBRID
SELECT level AS id,
100 AS record_type,
'Description for ' || level AS description
FROM   dual
CONNECT BY level <= 10000;
COMMIT;

INSERT INTO TAB_HYBRID
SELECT level AS id,
101 AS record_type,
'Description for ' || level AS description
FROM   dual
CONNECT BY level <= 100;
COMMIT;

SELECT COUNT(*) FROM TAB_HYBRID WHERE RECORD_TYPE =100;
SELECT COUNT(*) FROM TAB_HYBRID WHERE RECORD_TYPE =101;

EXEC DBMS_STATS.gather_table_stats('MOGU', 'TAB_HYBRID');

@hist_type
/*
OWNER      TABLE_NAME                     COLUMN_NAME          NUM_DISTINCT  NUM_NULLS HISTOGRAM       NUM_BUCKETS SAMPLE_SIZE LAST_ANAL
---------- ------------------------------ -------------------- ------------ ---------- --------------- ----------- ----------- ---------
MOGU       TAB_HYBRID                     DESCRIPTION                 10000          0 NONE                      1       30000 02-FEB-16
MOGU       TAB_HYBRID                     ID                          10000          0 NONE                      1       30000 02-FEB-16
MOGU       TAB_HYBRID                     RECORD_TYPE                   998          0 HYBRID                  254        5429 02-FEB-16
*/
@hist_distribution_2
/*
COLUMN_NAME                    ENDPOINT_NUMBER ENDPOINT_ACTUAL_VALUE          ENDPOINT_REPEAT_COUNT
------------------------------ --------------- ------------------------------ ---------------------
RECORD_TYPE                               2920 98                                                 3
RECORD_TYPE                               4358 100                                             1436    <<<< Los considera valores populares.
RECORD_TYPE                               4371 101                                               13 <<<<
RECORD_TYPE                               4377 103                                                2
RECORD_TYPE                               4384 105                                                5
*/

SELECT RECORD_TYPE,COUNT(*) FROM TAB_HYBRID WHERE RECORD_TYPE IN (99,100,101,102,103,104,105) GROUP BY RECORD_TYPE ORDER BY 1;
/*
RECORD_TYPE   COUNT(*)
----------- ----------
99         16
100      10016
101        122
102         17
103         20
104         15
105         20
*/

SELECT COUNT(*) FROM TAB_HYBRID WHERE RECORD_TYPE=100;
/*
COUNT(*)
----------
10016

---------------------------------------------------------------------------------
| Id  | Operation          | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |            |     1 |     4 |    69   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |            |     1 |     4 |            |          |
|*  2 |   TABLE ACCESS FULL| TAB_HYBRID | 10586 | 42344 |    69   (0)| 00:00:01 |    <<< Cardinalidad muy aproximada
---------------------------------------------------------------------------------
*/

SELECT COUNT(*) FROM TAB_HYBRID WHERE RECORD_TYPE=101;
/*
COUNT(*)
----------
122
---------------------------------------------------------------------------------
| Id  | Operation          | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |            |     1 |     4 |    69   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |            |     1 |     4 |            |          |
|*  2 |   TABLE ACCESS FULL| TAB_HYBRID |    96 |   384 |    69   (0)| 00:00:01 |    <<< Buena resolucion para los valores populares
---------------------------------------------------------------------------------
*/

1 comentario »

  1. […] Top Frequency Histogram: Se optará por este histograma cuando tengamos más de 254 valores diferentes, pero mas del 99% de los valores tengan menos del 254 valores diferentes. Por lo que se hara un Frequency Histogram donde en los 253 buckets incluiran los valores mas comunes y en el ultimo bucket iria ese 1% del conjunto de valores con pocos registros. Para mas informacion consultar el laboratorio de histogramas aqui. […]

    Pingback por CAP 5.- Performance Tuning | Oracle Live — 13/04/2016 @ 7:43 pm


RSS feed for comments on this post. TrackBack URI

Responder

Introduce tus datos o haz clic en un icono para iniciar sesión:

Logo de WordPress.com

Estás comentando usando tu cuenta de WordPress.com. Cerrar sesión / Cambiar )

Imagen de Twitter

Estás comentando usando tu cuenta de Twitter. Cerrar sesión / Cambiar )

Foto de Facebook

Estás comentando usando tu cuenta de Facebook. Cerrar sesión / Cambiar )

Google+ photo

Estás comentando usando tu cuenta de Google+. Cerrar sesión / Cambiar )

Conectando a %s

Blog de WordPress.com.

A %d blogueros les gusta esto: