Performance Tuning Guide

DB2 UDB v9.1

Author: Shelton Reese

Date: Tuesday, 22 January 2019

Version: 1.0

Table of Contents

1. Testing methodology

2. Database performance tuning

2.1 Runstats

Updating catalog statistics

2.2 Database monitoring

Monitoring and tuning database configuration parameters

Buffer pool size

Log buffer size (LOGBUFSZ)

Application heap size (APPLHEAPSZ)

Sort heap size (SORTHEAP) and sort heap threshold (SHEAPTHRES)

Number of agents (MAXAGENTS, NUM_POOLAGENTS and NUM_INITAGENTS)

Locks (LOCKLIST, MAXLOCKS and LOCKTIMEOUT)

Maximum number of active applications (MAXAPPLS)

Number of asynchronous page cleaners (NUM_IOCLEANERS)

Number of I/O servers (NUM_IOSERVERS)

Number of commits to group (MINCOMMIT)

2.3 Event monitoring

2.4 Access plans

2.4.1 Selects

2.4.2 Inserts

2.4.3 Deletes

2.4.4 Updates

2.4.5 Commits

2.4.6 Unions

2.4.7 Joins

2.4.8 Indexing

2.5 Memory

2.5.1 Bufferpool

2.5.2 Sortheap

2.5.3 Private memory

2.6 CPU

2.7 Disk

2.8 Logging

2.9 Locking

2.10 Application

2.11 Sorting

3. Server performance monitoring

3.1 Unix performance monitoring

AIX Disk Layout Considerations

3.2 Windows performance tuning

4. Important database manager (dbm) parameters

4.1 Monitoring parameters

4.2 Authority parameters

4.3 Stored procedure parameters

4.4 Parallel parameters

4.5 Shared memory parameters

4.6 Communication parameters

5. Important database (db) parameters

5.1 Lock parameters

5.2 Parallel parameters

5.3 Shared memory parameters

5.4 Private memory parameters

5.5 Bufferpool parameters

5.6 Log parameters

6. DB2 v9.1 Performance enhancements

6.1 Multidimensional clustering

6.2 Prefetching enhancements

6.3 Page cleaner I/O improvements

6.4 Catalog and authorization caching on databases with multiple partitions

6.5 Threading of Java UDFs and stored procedures

6.6 Connection Concentrator

6.7 Materialized query tables

Conclusion

1. Testing methodology

1) Create a benchmark environment

2) Run test and capture results

3) Investigate

4) Make small changes

5) Retest

As part of the effort to resolve the performance problems, I installed and executed seven pieces of software on the AIX servers, as follows.

  • db2_status.ksh -d <dbname>: this script waits for the sampling time and then gathers all the DB2 snapshot information into the IBM snapshot tables. The script provides the following data for review:
  • The hit ratio for the data and index pages of every bufferpool.
  • The hit ratio for the catalog cache.
  • The hit ratio for the package cache.
  • The number of sorts serviced and the number of sorts rejected.
  • The average elapsed time for each sort.
  • The UPDATE commands needed to bring the configuration to proper settings.
  • db2_aix_check.ksh: this script reads the AIX configuration, reports any problems found along with suggested fixes, and dumps all raw DB2 data for all instances (run as root).

Some AIX housekeeping:

export LIBPATH=db2instance_home_directory/sqllib/lib:/usr/lib:/lib

export LANG=en_US

2. Database performance tuning

2.1 Runstats

Updating catalog statistics

Background

The RUNSTATS utility updates statistics in the system catalog tables to help with the query optimization process. Without these statistics, the database manager could make a decision that would adversely affect the performance of an SQL statement. The RUNSTATS utility allows you to collect statistics on the data contained in the tables, indexes, or both tables and indexes. Use the RUNSTATS utility to collect statistics based on both the table and the index data to provide accurate information to the access plan selection process in the following situations:

  • When a table has been loaded with data, and the appropriate indexes have been created.
  • When a table has been reorganized with the REORG utility.
  • When there have been extensive updates, deletions, and insertions that affect a table and its indexes. ("Extensive" in this case may mean that 10 to 20 percent of the table and index data has been affected.)
  • Before binding application programs whose performance is critical.
  • When you want to compare new statistics with previous statistics. Running statistics on a periodic basis enables you to discover performance problems at an early stage.
  • When the prefetch quantity is changed.
  • When you have used the REDISTRIBUTE NODEGROUP utility.

When optimizing SQL queries, the decisions made by the SQL compiler are heavily influenced by the optimizer's model of the database contents. This data model is used by the optimizer to estimate the costs of alternative access paths that can be used to resolve a particular query. A key element in the data model is the set of statistics gathered about the data contained in the database and stored in the system catalog tables. This includes statistics for tables, nicknames, indexes, columns, and user-defined functions (UDFs). A change in the data statistics can result in a change in the choice of access plan selected as the most efficient method of accessing the desired data.

Examples of the statistics available which help define the data model to the optimizer include:

  • The number of pages in a table and the number of pages that are not empty.
  • The degree to which rows have been moved from their original page to other (overflow) pages.
  • The number of rows in a table.
  • Statistics about individual columns such as the number of distinct values in a column.
  • The degree of clustering of an index; that is, the extent to which the physical sequence of rows in a table follows an index.
  • Statistics about the index such as the number of index levels and the number of leaf pages in each index.
  • The number of occurrences of frequently used column values.
  • The distribution of column values across the range of values present in the column.
  • Cost estimates for user-defined functions (UDFs); in this case, none are defined.

RUNSTATS can help you determine how performance is related to changes in your database. The statistics show the data distribution within a table. When used routinely, RUNSTATS provides data about tables and indexes over a period of time, thereby allowing performance trends to be identified for your data model as it evolves over time. Rebind applications that use static SQL after using RUNSTATS so that the query optimizer can choose the best access plan given the new statistics. However, for applications using dynamic SQL (e.g. most vendor applications) rebinding is not necessary since the statement will be optimized based on the statistics at run time. When statistical information about tables is not accurate, it may cause performance problems. In a worst-case scenario, a particular SQL statement may cause DB2 to use a table scan instead of an index scan. The statistics will be updated in the db2service program included with the weekend maintenance scripts.

How to update the statistics

Statistics for objects are updated in the system catalog tables only when explicitly requested. There are several ways to update some or all of the statistics:

  • Using the RUNSTATS (run statistics) utility.
  • Using LOAD, with statistics collection options specified.
  • Coding SQL UPDATE statements that operate against a set of predefined catalog views.
  • Using the "reorgchk update statistics" command.

When you do not exactly know all the table names, or there are too many, the easiest way to do RUNSTATS is to use the "db2 reorgchk update statistics" command. The exact script looks like this:

#!/bin/ksh
for i in `db2 list db directory | grep -B 4 -i Indirect | grep name | grep Database | awk '{print $4}'`
do
    echo 'DBname = ' $i
    DBname=$i
    db2 connect to $i
    db2 -x "select rtrim(TABSCHEMA) || '.' || TABNAME from syscat.tables where stats_time < (current timestamp - 7 days) and type='T'" > runstats.temp
    if [ "$?" -eq 1 ]; then
        echo "No tables for database $DBname"
        continue
    fi
    while read line_by_line
    do
        db2 "reorgchk update statistics on table ${line_by_line}"
        db2 commit
    done < runstats.temp
    rm -f runstats.temp
    db2rbind $DBname -l bind.log
done
echo Runstats ended at `date`

The script above does not require table names. This one command performs RUNSTATS on all tables.

Note: If you know the table names, it is preferable to run RUNSTATS on each table individually rather than in one long-running pass over a large number of tables. The command looks like the following:
db2 -v runstats on table TAB_NAME with distribution and detailed indexes all
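If the table names are kept in a list file, the per-table command above can be looped. A minimal sketch, assuming a plain-text file with one SCHEMA.TABLE name per line (the function name and file format are illustrative, not DB2 conventions):

```shell
# Run RUNSTATS table by table from a list file ($1), one SCHEMA.TABLE
# per line. Assumes an existing db2 connection in this shell session.
runstats_from_list() {
    while read tab; do
        db2 -v "runstats on table $tab with distribution and detailed indexes all"
    done < "$1"
}
```

Usage: runstats_from_list tables.txt, after connecting with db2 connect to <dbname>.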

Checking to see if RUNSTATS has been run

The first thing to check is whether RUNSTATS has been run on the database.

If it has not, perform a RUNSTATS when users are not on the system, as it takes out locks on the tables.

One quick way to see whether RUNSTATS has been performed on your database is to query some system catalog tables. For example, as shown in the script above, you can run this command:

db2 -v "select tbname, nleaf, nlevels, stats_time from sysibm.sysindexes"

If RUNSTATS has not yet been run, you will see "-1" for the nleaf and nlevels columns, and a "-" for the stats_time column. These columns contain real numbers if RUNSTATS has been run, and the stats_time column will contain the timestamp when RUNSTATS ran. If you think the time shown in stats_time is too old, it's time to do runstats again. Based on the analysis for runstats attached to this document, I recommend a full runstats be done using the script provided above.
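As a quick scripted version of this check, the following sketch counts indexes whose statistics have never been collected (NLEAF = -1). It assumes an existing db2 connection; the function name is ours:

```shell
# Count indexes with no collected statistics (NLEAF = -1 means RUNSTATS
# has never been run for the underlying table). Requires a db2 connection.
unstatted_index_count() {
    db2 -x "select count(*) from sysibm.sysindexes where nleaf = -1"
}
```

A non-zero count suggests RUNSTATS is overdue for at least some tables.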

DB2 commands such as RUNSTATS have a C API that allows programs to invoke them. If you are writing your program in Java, the two approaches are:

1. Write a JNI wrapper interface so you can call the C API from Java.

2. Create a Java Process object and associate an operating system command with it. You can read any output using a BufferedReader.

2.2 Database monitoring

First, turn on all the monitoring switches:

db2 update dbm cfg using dft_mon_bufpool on dft_mon_lock on dft_mon_uow on dft_mon_stmt on dft_mon_table on dft_mon_sort on

To get a database snapshot:

db2 get snapshot for db on <database>

Since v9.1, you can also select snapshot statistics using SQL table functions, for example:

select total_sort_time from table(snapshot_database('<dbname>', -1)) as dbsnap
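The table-function form is convenient in scripts because it returns a plain value. A hedged sketch wrapping the example above (the wrapper name is ours; verify the column and function names against your system):

```shell
# Fetch the total sort time for a database via the snapshot_database
# table function shown above. Requires an existing db2 connection.
total_sort_time() {
    db2 -x "select total_sort_time from table(snapshot_database('$1', -1)) as dbsnap"
}
```

Usage: total_sort_time <dbname>.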

Monitoring and tuning database configuration parameters

The following database configuration tuning steps were used in the environment to achieve reasonably good performance and at the same time enable us to avoid obvious pitfalls. Among the configuration parameters, database manager configuration parameters require a restart of the database manager, and most database configuration parameters require the application to reconnect to the database in order to have the changes take effect. The configuration parameters I recommend changing are described here:

  • Buffer pool size
  • Log buffer size
  • Application heap size
  • Sort heap size and sort heap threshold
  • Number of agents
  • Locks
  • Maximum number of active applications
  • Number of asynchronous page cleaners
  • Number of I/O servers
  • Number of commits to group

Buffer pool size

Background

A buffer pool is an area of storage in memory into which database pages (containing table rows or index entries) are temporarily read and changed. The purpose of the buffer pool is to improve database system performance. Data can be accessed much faster from memory than from a disk. Therefore, the fewer times the database manager needs to read from or write to a disk, the better the performance. The configuration of one or more buffer pools is the single most important tuning area, since it is here that most of the data manipulation takes place for applications connected to the database (excluding large objects and long field data).

By default, applications use the buffer pool called IBMDEFAULTBP, which is created when the database is created. The DB2 database configuration parameter BUFFPAGE controls the size of a buffer pool when the value of NPAGES is -1 for that buffer pool in the SYSCAT.BUFFERPOOLS catalog table. Otherwise the BUFFPAGE parameter is ignored, and the buffer pool is created with the number of pages specified by the NPAGES parameter.

Recommendations

For applications that use only one buffer pool, change NPAGES to -1 so that BUFFPAGE controls the size of the buffer pool. This makes it easier to update and report the buffer pool size along with other DB2 database configuration parameters.

After making sure that you can use the BUFFPAGE parameter in the database configuration to control the buffer pool size, set it to a suitable value. Given the size of your database and the nature of your application, setting it to a reasonably large value is safe. The default value is usually very small and may not be satisfactory. Consider the following:

  • As a starting point, if you have enough memory on your machine, set BUFFPAGE to 40,000 pages (160 MB), or 10% of the total memory on your machine.
  • For databases, set aside as much memory as possible for the buffer pool while keeping the system stable. As a starting point, try 1.6 GB and then experiment with more.
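The "10% of total memory" starting point above can be turned into a BUFFPAGE value (in 4 KB pages) with a little arithmetic. A rough sizing helper, not an official DB2 formula; the function name is ours:

```shell
# Convert "10% of physical RAM" into a BUFFPAGE value in 4 KB pages:
# ram_mb * 1024 KB / 4 KB-per-page / 10 == ram_mb * 256 / 10.
buffpage_for_ram() {
    ram_mb=$1                       # total physical memory in MB
    echo $(( ram_mb * 256 / 10 ))
}
```

For example, a machine with 1600 MB of RAM gives 40960 pages (160 MB), in line with the starting point above.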

How to change the parameter

Run the following script to:

  1. Verify the catalog value
  2. Enable the use of the database configuration parameter BUFFPAGE
  3. Update the value of BUFFPAGE for all databases.

db2 -v connect to DB_NAME

db2 -v "select * from syscat.bufferpools"

db2 -v alter bufferpool IBMDEFAULTBP size -1

db2 -v connect reset

db2 -v update db cfg for dbname using BUFFPAGE bigger_value

db2 -v terminate

Investigative steps

To determine whether the BUFFPAGE parameter is controlling the buffer pool size of a database, run:

db2 -v connect to DB_NAME

db2 -v "SELECT * from SYSCAT.BUFFERPOOLS"

db2 -v connect reset

db2 -v terminate

Examine the results. If each buffer pool has an NPAGES value of -1, then the buffer pool size is being controlled through the BUFFPAGE parameter in the database configuration.

To determine whether the database buffer pool size is big enough, collect snapshots for the database and/or buffer pool while running the application. A script similar to the following will give you the needed information:

db2 -v update monitor switches using bufferpool on

db2 -v get monitor switches

db2 -v reset monitor all

-- run the Java application --

./db2_status.ksh -d <DB_NAME>

db2 -v terminate

Make sure that you collect the snapshot before you lose your database connection. When the last application disconnects from the database, the database terminates and all snapshot statistics are lost. To ensure there is always a connection that keeps the database up, use one of the following methods:

  • Maintain one separate connection in the window where you are collecting snapshots.
  • Use the DB2 ACTIVATE DATABASE command.
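The ACTIVATE DATABASE option is easy to script. A minimal sketch (the wrapper names are ours; activate/deactivate database are the actual DB2 commands):

```shell
# Keep the database (and its snapshot counters) alive even when no
# applications are connected; deactivate when monitoring is finished.
activate_db()   { db2 activate database "$1"; }
deactivate_db() { db2 deactivate database "$1"; }
```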

In the snapshot output, either from the database snapshot or buffer pool snapshot, look for the following "logical reads" and "physical reads" so that you can calculate the buffer pool hit ratio, which can help you tune your buffer pools:

MONITORING THE BUFFER POOLS

The report lists, for each buffer pool: the buffer pool name, the buffer pool hit rate (higher is better), the buffer pool data logical reads, and the buffer pool data physical reads.

The buffer pool hit ratio indicates the percentage of time that the database manager did not need to load a page from disk in order to service a page request; that is, the page was already in the buffer pool. The greater the buffer pool hit ratio, the lower the frequency of disk I/O. The 32K buffer pool will need to be updated after a full table reorganization is done. To calculate the buffer pool hit ratio, the formula is as follows:

(1 - ((buffer pool data physical reads + buffer pool index physical reads) /

(buffer pool data logical reads + buffer pool index logical reads))) * 100%

This calculation takes into account all of the pages (index and data) that are cached by the buffer pool. Ideally this ratio should be over 95%, and as close to 100% as possible. To increase the buffer pool hit ratio, try the following:

  • Increase the buffer pool size.
  • Consider allocating multiple buffer pools: possibly one for each frequently accessed large table with its own table space, and one for a group of small tables, as is done here. The next step is then to adjust the buffer pool sizes to see which combination provides the best performance.

Avoid over-allocating memory to buffer pools when the extra memory does not improve performance. The buffer pool sizes should be determined from snapshot information taken in the test environment.
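The hit-ratio formula above can be computed directly from snapshot output. A minimal sketch that parses the four read counters from stdin (the function name is ours):

```shell
# Compute the overall buffer pool hit ratio from "db2 get snapshot"
# output piped on stdin, using the formula given above.
bp_hit_ratio() {
    awk -F= '
        /Buffer pool data logical reads/   { dlr += $2 }
        /Buffer pool data physical reads/  { dpr += $2 }
        /Buffer pool index logical reads/  { ilr += $2 }
        /Buffer pool index physical reads/ { ipr += $2 }
        END {
            logical  = dlr + ilr
            physical = dpr + ipr
            if (logical > 0)
                printf "Hit ratio: %.2f%%\n", (1 - physical / logical) * 100
            else
                print "No buffer pool activity recorded"
        }'
}
```

Usage: db2 get snapshot for all bufferpools | bp_hit_ratio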

Log buffer size (LOGBUFSZ)

Background

LOGBUFSZ is a database configuration parameter for the log buffer. It specifies the amount of database shared memory to use as a buffer for log records before writing those records to disk. The log records are written to disk when one of the following events occurs:

  • A transaction commits.
  • The log buffer is full.
  • As a result of some other internal database manager event.

Buffering the log records results in more efficient log file I/O, because the log records are written to disk less frequently and more log records are written each time. Increase the size of this buffer area if there is considerable read activity on a dedicated log disk, or if there is high disk utilization. When increasing the value of this parameter, consider the DBHEAP parameter, too, because the log buffer area uses space controlled by the DBHEAP parameter.

How to change the parameter

The default value for this parameter is 8 (4 KB pages), which is not big enough for a large database. The optimal value for LOGBUFSZ is 128 or 256 4 KB pages. For example, you can use the command below to change it:

db2 -v update database cfg for DB_NAME using LOGBUFSZ <value>

db2 -v terminate

Investigative steps

Use the database snapshot to determine whether the LOGBUFSZ parameter is optimal or not by looking at the lines shown in the following example from the production server:

Log pages read = 484961

Log pages written = 21963890

Maximum total log space used (Bytes) = 8,024,082,185 (8 GB)

In general, the ratio between "log pages read" and "log pages written" should be as small as possible. An ideal value would be zero log pages read while seeing a good number of log pages written. When there are too many log pages read, it means a bigger LOGBUFSZ is needed. The current LOGBUFSZ is 8.
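The read-to-written comparison can be scripted the same way. A small sketch that extracts both counters from database snapshot output on stdin (the function name is ours):

```shell
# Compare "Log pages read" to "Log pages written" from a database
# snapshot; a ratio well above zero suggests LOGBUFSZ is too small.
log_read_write_ratio() {
    awk -F= '
        /Log pages read/    { r = $2 }
        /Log pages written/ { w = $2 }
        END {
            if (w > 0)
                printf "Log read/write ratio: %.4f\n", r / w
            else
                print "No log pages written"
        }'
}
```

Usage: db2 get snapshot for db on <dbname> | log_read_write_ratio. With the production figures above, the ratio is roughly 0.022.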