Intercepting file I/O

This SystemTap script allows to intercept all the disk I/O operations in the system, including both high-level system calls and low-level block I/O (BIO) operations.

For each operation, the script reports its beginnig (call enter) and end (call return). Script intercepts
I/O operations at two layers: system calls and low-level block I/O (BIO) calls.

For system calls, such as sys_open, sys_read and sys_stat the script reports timestamp, PID, TID, the name of the calling process and the name of the call. The script separately reports call begins and returns. This allows easily determining bounds of the opeartion and measuring how much time each operation took.

Tracking low-level BIO is harder. The script intercepts generic_make_request() function, which is an entry point for all the BIO operations, and also all possible completion calls (functions that are called when the operation is complete): end_bio_bh_io_sync, mpage_end_io_read, mpage_end_io_write. The BIO call is initiated int the context of the calling process and then blocks until a completion method of the corresponding bio structure is called. Unfortunately, this method is called in the context of the kernel and does not directly refer to the process and original call. Thus even if we can easily track beginnings of low-level BIO calls, it is not easy to determine when the call has ended.

One way to match completion calls to initial calls is tracking the address of the corresponding bio structure. Normally this address does not change for the low-level operations. Thus we can sort operations by their timestamp and then find a next completion call with the same address as the initial generic_make_request() call. This approach works in the vast majority of cases.

We can learn many interesting things by looking at the information retrieved. One important observation is that each BIO call corresponds to a single disk access. Thus we can reliably determine what percentage of I/O calls are gettng data from cache by simply looking how many system calls initiated BIO calls. Second, we can see which system calls result in disk operations and in which circumstances. Third, we can determine how much time each operation has taken to complete.

Finally, you can run the script in the autopilot mode and then retrieve the log. By parsing the log you can get a complete picture of what is going on in the system - which processes are initiating I/O, how they access disk etc. You can take a look at the SystemTap site for more information and for more script examples.