Monitoring with vmstat - The Virtual Memory Statistic Tool
Vmstat is an excellent tool for monitoring your system's virtual memory and paging. It lets you easily watch pages being swapped in and out. It is best to run vmstat with a delay, which is the number of seconds between updates. Without a delay, vmstat prints a single report containing averages since the last boot. For example, the command vmstat 5 prints a new report every five seconds. It is also possible to set a count that limits the number of reports vmstat prints before quitting. For example, vmstat 5 10 prints 10 reports with an interval of 5 seconds between them.
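As a rough illustration, the delay and count arguments can also be passed from a script. The sketch below uses Python's subprocess module. The helper names vmstat_command and run_vmstat are made up for this example, and it assumes vmstat is installed and on the PATH.

```python
import subprocess


def vmstat_command(delay=None, count=None):
    """Build the argument list for `vmstat [delay [count]]`."""
    if count is not None and delay is None:
        raise ValueError("a count only makes sense together with a delay")
    args = ["vmstat"]
    if delay is not None:
        args.append(str(delay))
        if count is not None:
            args.append(str(count))
    return args


def run_vmstat(delay, count):
    """Run vmstat and return its text output.

    Blocks until vmstat has printed all of its reports.
    """
    result = subprocess.run(
        vmstat_command(delay, count),
        capture_output=True,
        text=True,
        check=True,  # raise if vmstat exits with an error
    )
    return result.stdout
```

Calling run_vmstat(5, 10) would be the scripted equivalent of typing vmstat 5 10 at the shell.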
The following is output from a vmstat command running on an Ubuntu desktop system:
The output from vmstat is organised roughly into six sections: "procs", "memory", "swap", "io", "system" and "cpu".
The first section covers "processes". "r" is the number of processes waiting for run time. "b" is the number of processes in uninterruptible sleep.
The next section covers "memory". "swpd" is the amount of virtual memory used.
"free" is the amount of idle memory.
"buff" is the amount of memory that is used as buffers.
"cache" is the amount of memory that is used as cache.
The next section covers "swap". "si" is the amount of memory swapped in from disk per second. "so" is the amount of memory swapped out to disk per second.
The next section covers "io". "bi" is the number of blocks received from a block device (usually disk) per second. "bo" is the number of blocks sent to a block device per second.
The next section covers "system". "in" is the number of interrupts per second, including the clock.
"cs" is the number of context switches per second.
The final section covers "cpu". The numbers are percentages of total CPU time. "us" is the time spent running non-kernel code (user time, including nice time). "sy" is the time spent running kernel code (system time). "id" is the time spent idle. "wa" is the time spent waiting for IO.
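To work with these columns in a script, the text output can be split into one dictionary per report, keyed by the column names described above. This is only a minimal sketch: parse_vmstat is a hypothetical helper, and it assumes the default layout in which the first line is the section banner, the second line holds the column names, and every value is an integer.

```python
def parse_vmstat(text):
    """Turn vmstat's default output into a list of {column: value} dicts."""
    lines = [line for line in text.splitlines() if line.strip()]
    # Line 0 is the section banner (procs, memory, ...); line 1 names the columns.
    columns = lines[1].split()
    rows = []
    for line in lines[2:]:
        fields = line.split()
        # On long runs vmstat reprints its headers; skip anything non-numeric.
        if not fields[0].isdigit():
            continue
        rows.append(dict(zip(columns, map(int, fields))))
    return rows
```

Because the keys come from the header line itself, the sketch does not depend on exactly which columns a particular vmstat version prints.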
What does this information mean?
To understand the output from the vmstat command, we need to understand some of the key areas that are being reported.
CPU - CPU Utilisation
CPU utilisation is defined as the percentage of time a CPU is in use. How the system utilises the CPU is a very important metric. Generally CPU time is broken down into the following categories:
User time: the percentage of time a CPU spends executing threads in user space, or "userland" as it is sometimes called.
System time: the percentage of time a CPU spends executing kernel threads and interrupts.
Waiting on IO: the percentage of time a CPU spends idle because all process threads are blocked waiting for IO requests to complete.
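On Linux the raw counters behind these categories are exposed on the "cpu" line of /proc/stat. The sketch below shows one way the percentages could be derived from two snapshots of that line. The function names are invented for this example, and the grouping (nice counted as user time, interrupt time counted as system time) is an assumption chosen to match the column descriptions above, not a statement of how vmstat itself is implemented.

```python
def cpu_fields(stat_text):
    """Return the first eight counters from the aggregate "cpu" line.

    Order: user, nice, system, idle, iowait, irq, softirq, steal.
    """
    for line in stat_text.splitlines():
        if line.startswith("cpu "):  # the aggregate line, not cpu0, cpu1, ...
            return [int(value) for value in line.split()[1:9]]
    raise ValueError("no aggregate cpu line found")


def cpu_percentages(before, after):
    """Compute us/sy/id/wa percentages between two cpu_fields() snapshots."""
    delta = [b - a for a, b in zip(before, after)]
    total = sum(delta)
    if total <= 0:
        raise ValueError("snapshots must be taken at different times")
    user, nice, system, idle, iowait, irq, softirq = delta[:7]

    def pct(ticks):
        return 100.0 * ticks / total

    return {
        "us": pct(user + nice),             # assumption: nice counts as user time
        "sy": pct(system + irq + softirq),  # assumption: interrupts count as system time
        "id": pct(idle),
        "wa": pct(iowait),
    }
```

In practice the two snapshots would be read from /proc/stat a few seconds apart, much like the delay passed to vmstat.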
Each CPU maintains a runqueue of threads. Ideally these should be executed by the scheduler in a consistent fashion. Process threads are either in a blocked state (waiting on IO) or in a runnable state. If the CPU subsystem becomes heavily loaded, the scheduler may not be able to keep up with demand. As a direct result, the number of runnable processes starts to increase and the runqueue grows. As a general rule of thumb, the number of processes waiting to run should not be constantly greater than the number of CPUs. However, the final judge of how the system is running is the system administrator. It is not uncommon to see a high runqueue for short periods of time.
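The rule of thumb above can be expressed as a small check on the "r" column. This is a sketch only: runqueue_overloaded is a hypothetical helper, and "constantly greater" is interpreted here as every sample in the window exceeding the CPU count, which is just one possible reading.

```python
import os


def runqueue_overloaded(r_samples, ncpus=None):
    """Return True if every "r" sample exceeds the number of CPUs."""
    if ncpus is None:
        ncpus = os.cpu_count() or 1  # cpu_count() can return None
    # An empty window is never treated as overloaded.
    return bool(r_samples) and all(r > ncpus for r in r_samples)
```

A single low sample is enough to return False, so short spikes in the runqueue are not flagged.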
Generally, CPU load refers to the runqueue averaged out over a period of time. Load is often quoted because it gives an overall performance metric for your system.
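Load figures are easier to compare between machines when divided by the number of CPUs. The sketch below assumes the load averages come from Python's os.getloadavg(), which returns the 1, 5 and 15 minute averages on Unix systems. The helper name load_per_cpu is made up for this example.

```python
import os


def load_per_cpu(load, ncpus):
    """Divide each load average by the CPU count, rounded to 2 places."""
    return tuple(round(value / ncpus, 2) for value in load)


def current_load_per_cpu():
    """Per-CPU load for this machine (Unix only)."""
    return load_per_cpu(os.getloadavg(), os.cpu_count() or 1)
```

A per-CPU value that stays above 1.0 suggests more work is queued than the CPUs can handle, in line with the runqueue rule of thumb.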
Context switching is the process the kernel goes through when switching between processes. The kernel suspends one process and stores its current state (context) in memory. Next, the kernel retrieves another process's state from memory and restores it to the CPU registers, and the process then resumes from the point where it was interrupted. Context switching is quite normal on a Linux system. However, if the rate of context switching is too high, it can impact performance.
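On Linux the kernel's running total of context switches is available on the "ctxt" line of /proc/stat, so a per-second rate can be estimated by sampling it twice. The following is a minimal sketch with invented function names, not a description of how vmstat computes its "cs" column.

```python
import time


def read_ctxt(stat_text):
    """Return the cumulative context-switch count from /proc/stat content."""
    for line in stat_text.splitlines():
        if line.startswith("ctxt "):
            return int(line.split()[1])
    raise ValueError("no ctxt line found")


def context_switch_rate(interval=1.0, path="/proc/stat"):
    """Estimate context switches per second over `interval` seconds."""
    with open(path) as stat_file:
        first = read_ctxt(stat_file.read())
    time.sleep(interval)
    with open(path) as stat_file:
        second = read_ctxt(stat_file.read())
    return (second - first) / interval
```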
Whenever you look at monitoring statistics, it is important to have a baseline of how your system normally behaves. This allows you to compare its current state with a known healthy state.
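One simple way to use a baseline is to record a metric while the system is healthy and flag later readings that fall far outside the usual range. The sketch below uses the mean and standard deviation from Python's statistics module. The helper name and the default threshold of three standard deviations are arbitrary choices for illustration, not recommended values.

```python
from statistics import mean, stdev


def deviates_from_baseline(baseline, current, threshold=3.0):
    """Return True if `current` is more than `threshold` standard
    deviations away from the mean of the baseline samples."""
    if len(baseline) < 2:
        raise ValueError("need at least two baseline samples")
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: any change at all is a deviation.
        return current != mu
    return abs(current - mu) > threshold * sigma
```

The baseline list could hold, for example, "cs" or "r" values collected with vmstat during a normal working day.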
For a full list of the other parameters that can be passed to the vmstat command, issue the man vmstat command.