9.1.3 Determine the I/O for Persistent Storage

9.2 Using the iostat Command Tool

The iostat command reports three types of statistics: terminal (tty) statistics, CPU statistics, and disk statistics.

This section discusses how to use the iostat command to identify I/O-subsystem and CPU bottlenecks from its CPU statistics and disk statistics. The following is a sample iostat report.

# iostat

tty:      tin         tout   avg-cpu:  % user    % sys     % idle    % iowait
          0.1          9.0               7.7      6.8       85.0       0.5

Disks:        % tm_act     Kbps      tps    Kb_read   Kb_wrtn
hdisk1           0.1       0.7       0.0      39191    242113
hdisk0           0.8      11.1       0.8    3926601    822579
cd0              0.0       0.0       0.0        780         0
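
For scripted analysis, the disk portion of a report like the one above can be turned into records. The following is a minimal Python sketch, assuming the six-column layout shown in the sample. The DiskStats type and the parse_disk_section function are hypothetical names introduced for illustration; they are not part of iostat.

```python
from typing import List, NamedTuple

# Disk portion of the sample report above, copied verbatim.
SAMPLE_REPORT = """\
Disks:        % tm_act     Kbps      tps    Kb_read   Kb_wrtn
hdisk1           0.1       0.7       0.0      39191    242113
hdisk0           0.8      11.1       0.8    3926601    822579
cd0              0.0       0.0       0.0        780         0
"""


class DiskStats(NamedTuple):
    """One disk line of the report (hypothetical record type)."""
    name: str
    tm_act: float   # % tm_act
    kbps: float     # Kbps
    tps: float      # tps
    kb_read: int    # Kb_read
    kb_wrtn: int    # Kb_wrtn


def parse_disk_section(report: str) -> List[DiskStats]:
    """Return one record per disk line that follows the 'Disks:' header."""
    records = []
    in_disk_section = False
    for line in report.splitlines():
        fields = line.split()
        if not fields:
            in_disk_section = False      # a blank line ends the section
        elif fields[0] == "Disks:":
            in_disk_section = True       # header line, nothing to record
        elif in_disk_section and len(fields) == 6:
            name, tm_act, kbps, tps, kb_read, kb_wrtn = fields
            records.append(DiskStats(name, float(tm_act), float(kbps),
                                     float(tps), int(kb_read), int(kb_wrtn)))
    return records


disks = parse_disk_section(SAMPLE_REPORT)
for disk in disks:
    print(disk.name, disk.tm_act, disk.kbps)
```

Lines that do not have exactly six whitespace-separated fields are skipped, so a report with a different column layout would need a different parser.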

The following sections discuss the most frequently referenced fields of the preceding report.

9.2.1 CPU Statistics

The following CPU statistics fields in the iostat report show the CPU usage and the I/O status.

% user
The % user column shows the percentage of CPU resource spent in user mode. A UNIX process can execute in user or system mode. When in user mode, a process executes within its own code and does not require kernel resources.
% sys
The % sys column shows the percentage of CPU resource spent in system mode. This includes CPU resource consumed by kernel processes and others that need access to kernel resources. For example, the reading or writing of a file requires kernel resources to open the file, seek a specific location, and read or write data. A UNIX process accesses kernel resources by issuing system calls.
% idle
The % idle column shows the percentage of CPU time spent idle, or waiting, without pending local disk I/O. If there are no processes on the run queue, the system dispatches a special kernel process called wait. On most AIX systems, the wait process ID (PID) is 516.
% iowait
The % iowait column shows the percentage of time the CPU was idle with pending local disk I/O. The iowait state is different from the idle state in that at least one process is waiting for local disk I/O requests to complete. Unless the process is using asynchronous I/O, an I/O request to disk causes the calling process to block (or sleep) until the request is completed. Once a process's I/O request completes, it is placed on the run queue.
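
Because the four columns divide CPU time among them, they should add up to roughly 100 percent, and % idle plus % iowait together give the share of time the CPU was not executing any process. The following Python sketch checks this with the values from the sample report; the function name summarize_cpu is hypothetical.

```python
def summarize_cpu(user: float, system: float, idle: float, iowait: float) -> dict:
    """Group the four CPU columns into busy and not-executing time."""
    busy = user + system              # time spent running user or kernel code
    not_executing = idle + iowait     # idle, with or without pending disk I/O
    return {
        "busy": busy,
        "not_executing": not_executing,
        "total": busy + not_executing,
    }


# Values taken from the avg-cpu line of the sample report.
summary = summarize_cpu(user=7.7, system=6.8, idle=85.0, iowait=0.5)
print(summary)
```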

9.2.1.1 Data Analysis

The following conclusions can be drawn from the iostat reports.

To understand the I/O subsystem thoroughly, you need to examine the disk statistics of the iostat report, which are described in the following section.

9.2.2 Disk Statistics

The disk statistics portion of the iostat output shows the I/O usage. This information is useful in determining whether a physical disk is the performance bottleneck. The system maintains a history of disk activity by default. If the history is disabled, iostat displays the following message:

Disk history since boot not available.

Disk I/O history should be enabled, because the CPU resource used to maintain it is insignificant. History-keeping can be disabled or enabled by running smitty chgsys, the SMIT fast path command that displays the screen shown in Figure 104. To enable history-keeping, change the option Continuously maintain DISK I/O history to TRUE.



Figure 104: Enabling the Disk I/O History

The following fields of the iostat report describe the physical disk I/O.

Disks
The Disks column shows the names of the physical volumes. Each name is either hdisk or cd followed by a number. (hdisk0 and cd0 refer to the first physical disk drive and the first CD-ROM drive, respectively.)
% tm_act
The % tm_act column shows the percentage of time the volume was active. This is the primary indicator of a bottleneck.
Kbps
Kbps shows the amount of data read from and written to the drive, in KB per second. This is the sum of Kb_read and Kb_wrtn, divided by the number of seconds in the reporting interval.
tps
tps reports the number of transfers per second. A transfer is an I/O request at the device-driver level.
Kb_read
Kb_read reports the total data (in KB) read from the physical volume during the measured interval.
Kb_wrtn
Kb_wrtn shows the amount of data (in KB) written to the physical volume during the measured interval.
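
The relationship between Kbps, Kb_read, and Kb_wrtn described above can be written out directly. The following Python sketch uses made-up interval values, not figures from the sample report; the function name kbps is hypothetical.

```python
def kbps(kb_read: float, kb_wrtn: float, interval_seconds: float) -> float:
    """Kbps = (Kb_read + Kb_wrtn) / seconds in the reporting interval."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return (kb_read + kb_wrtn) / interval_seconds


# Hypothetical interval: 1200 KB read and 800 KB written in 2 seconds.
rate = kbps(kb_read=1200, kb_wrtn=800, interval_seconds=2)
print(rate)  # 1000.0
```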

9.2.2.1 Data Analysis

There is no unacceptable value for any of the fields in the preceding section because statistics are too closely related to application characteristics, system configuration, and types of physical disk drives and adapters. Therefore, when evaluating data, you must look for patterns and relationships. The most common relationship is between disk utilization and data transfer rate.

To draw any valid conclusions from this data, you must understand the application's disk data access patterns (sequential, random, or a combination) and the type of physical disk drives and adapters on the system. For example, if an application reads and writes sequentially, you should expect a high disk-transfer rate when you have a high disk-busy rate. (Note: Kb_read and Kb_wrtn can confirm your understanding of an application's read and write behavior, but they provide no information on the data access patterns.)

Generally, you do not need to be concerned about a high disk-busy rate as long as the disk-transfer rate is also high. However, a high disk-busy rate combined with a low data-transfer rate may indicate a fragmented logical volume, file system, or individual file.
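
The rule of thumb above can be sketched as a small check. This is an illustration only: the text gives no numeric cut-offs at this point, so both thresholds below are assumptions that you would replace with values suited to your drives, and the function name assess_disk is hypothetical.

```python
def assess_disk(tm_act: float, kbps: float,
                busy_threshold: float = 70.0,      # assumed "high busy" cut-off
                low_rate_kbps: float = 500.0) -> str:  # assumed "low rate" cut-off
    """Relate disk-busy rate (% tm_act) to data-transfer rate (Kbps)."""
    if tm_act < busy_threshold:
        return "not busy"
    if kbps >= low_rate_kbps:
        return "busy, but transferring data at a high rate"
    return "busy with a low transfer rate: check for fragmentation"


print(assess_disk(tm_act=85.0, kbps=1500.0))
print(assess_disk(tm_act=85.0, kbps=120.0))
print(assess_disk(tm_act=10.0, kbps=120.0))
```

What counts as a low transfer rate depends on the drive, so a fixed default such as the one used here is only a starting point.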

What is a high data-transfer rate? That depends on the disk drive and the effective data-transfer rate for that drive. You should expect numbers between the effective sequential and effective random disk-transfer rates. The effective transfer rates for a few of the common SCSI-1 and SCSI-2 disk drives are shown in Table 15:


Table 15: Effective Transfer Rates (KB/sec)

You can also use the data captured by the iostat command to assess whether an additional SCSI adapter is needed (if the one installed in your system is the I/O bottleneck) by tracking transfer rates and finding the maximum data-transfer rate for each disk.
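
One way to track the maximum transfer rate per disk is to keep the largest Kbps value seen for each disk across a series of iostat intervals. The following Python sketch assumes the samples have already been collected as (disk name, Kbps) pairs. The sample values, the adapter limit, and the function name peak_kbps_per_disk are all hypothetical.

```python
from typing import Dict, Iterable, Tuple


def peak_kbps_per_disk(samples: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Return the highest Kbps value observed for each disk."""
    peaks: Dict[str, float] = {}
    for name, kbps in samples:
        peaks[name] = max(kbps, peaks.get(name, 0.0))
    return peaks


# Hypothetical Kbps readings from successive intervals.
samples = [
    ("hdisk0", 850.0), ("hdisk1", 300.0),
    ("hdisk0", 1200.0), ("hdisk1", 450.0),
]
peaks = peak_kbps_per_disk(samples)

# Summing per-disk peaks gives an upper bound on adapter load,
# because the peaks may not have occurred in the same interval.
combined_peak = sum(peaks.values())

ADAPTER_LIMIT_KBPS = 4000.0  # hypothetical rating; use your adapter's figure
print(peaks, combined_peak, combined_peak > ADAPTER_LIMIT_KBPS)
```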

The disk usage percentage (% tm_act) is directly proportional to resource contention and inversely proportional to performance. As disk use increases, performance decreases and response time increases. In general, when a disk's use exceeds 70 percent, processes are waiting longer than necessary for I/O to complete because most UNIX processes block (or sleep) while waiting for their I/O requests to complete.
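
The 70 percent guideline above lends itself to a simple filter over % tm_act readings. The following Python sketch uses made-up readings; the function name busy_disks is hypothetical.

```python
from typing import Iterable, List, Tuple

BUSY_THRESHOLD_PERCENT = 70.0  # guideline quoted in the text above


def busy_disks(readings: Iterable[Tuple[str, float]],
               threshold: float = BUSY_THRESHOLD_PERCENT) -> List[str]:
    """Return the names of disks whose % tm_act exceeds the threshold."""
    return [name for name, tm_act in readings if tm_act > threshold]


# Hypothetical (disk name, % tm_act) readings from one interval.
readings = [("hdisk0", 82.5), ("hdisk1", 35.0), ("hdisk2", 70.0)]
flagged = busy_disks(readings)
print(flagged)  # ['hdisk0']
```

A disk at exactly 70 percent is not flagged, because the guideline refers to use that exceeds 70 percent.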

To overcome I/O bottlenecks, you can perform the following actions:

9.3 Paging Space Management