The following USE Method example checklists are automatically generated from the individual pages for Linux, Solaris, Mac OS X, and FreeBSD. They analyze the performance of the physical host.
Some additional USE Method example checklists are not included in this table: the SmartOS checklist, which is for use within an OS-virtualized guest, and the Unix 7th Edition checklist, which is of historical interest.
For general-purpose operating system differences, see the Rosetta Stone for Unix, which was the inspiration for this page.
Resource | Metric | Linux | Solaris | FreeBSD | Mac OS X |
---|---|---|---|---|---|
CPU | errors | perf (LPE) if processor-specific error events (CPC) are available; eg, AMD64's "04Ah Single-bit ECC Errors Recorded by Scrubber" [4] | fmadm faulty; cpustat (CPC) for whatever error counters are supported (eg, thermal throttling) | dmesg; /var/log/messages; pmcstat for PMC and whatever error counters are supported (eg, thermal throttling) | dmesg; /var/log/system.log; Instruments → Counters, for PMC and whatever error counters are supported (eg, thermal throttling) |
CPU | saturation | system-wide: vmstat 1, "r" > CPU count [2]; sar -q, "runq-sz" > CPU count; dstat -p, "run" > CPU count; per-process: /proc/PID/schedstat 2nd field (sched_info.run_delay); perf sched latency (shows "Average" and "Maximum" delay per-schedule); dynamic tracing, eg, SystemTap schedtimes.stp "queued(us)" [3] | system-wide: uptime, load averages; vmstat 1, "r"; DTrace dispqlen.d (DTT) for a better "vmstat r"; per-process: prstat -mLc 1, "LAT" | system-wide: uptime, "load averages" > CPU count; vmstat 1, "procs:r" > CPU count; per-cpu: DTrace to profile CPU run queue lengths [1]; per-process: DTrace of scheduler events [2] | system-wide: uptime, "load averages" > CPU count; latency, "SCHEDULER" and "INTERRUPTS"; per-cpu: dispqlen.d (DTT), non-zero "value"; runocc.d (DTT), non-zero "%runocc"; per-process: Instruments → Thread States, "On run queue"; DTrace [2] |
CPU | utilization | system-wide: vmstat 1, "us" + "sy" + "st"; sar -u, sum fields except "%idle" and "%iowait"; dstat -c, sum fields except "idl" and "wai"; per-cpu: mpstat -P ALL 1, sum fields except "%idle" and "%iowait"; sar -P ALL, same as mpstat; per-process: top, "%CPU"; htop, "CPU%"; ps -o pcpu; pidstat 1, "%CPU"; per-kernel-thread: top/htop ("K" to toggle), where VIRT == 0 (heuristic). [1] | per-cpu: mpstat 1, "usr" + "sys"; system-wide: vmstat 1, "us" + "sy"; per-process: prstat -c 1 ("CPU" == recent), prstat -mLc 1 ("USR" + "SYS"); per-kernel-thread: lockstat -Ii rate, DTrace profile stack() | system-wide: vmstat 1, "us" + "sy"; per-cpu: vmstat -P; per-process: top, "WCPU" for weighted and recent usage; per-kernel-process: top -S, "WCPU" | system-wide: iostat 1, "us" + "sy"; per-cpu: DTrace [1]; Activity Monitor → CPU Usage or Floating CPU Window; per-process: top -o cpu, "%CPU"; Activity Monitor → Activity Monitor, "%CPU"; per-kernel-thread: DTrace profile stack() |
CPU interconnect | errors | LPE (CPC) for whatever is available | cpustat (CPC) for whatever is available | pmcstat and relevant PMCs for whatever is available | Instruments → Counters, and relevant PMCs for whatever is available |
CPU interconnect | saturation | LPE (CPC) for stall cycles | cpustat (CPC) for stall cycles | pmcstat and relevant PMCs for CPU interconnect stall cycles | Instruments → Counters, and relevant PMCs for stall cycles |
CPU interconnect | utilization | LPE (CPC) for CPU interconnect ports, tput / max | cpustat (CPC) for CPU interconnect ports, tput / max (eg, see the amd64htcpu script) | pmcstat (PMC) for CPU interconnect ports, tput / max | for multi-processor systems, try Instruments → Counters, and relevant PMCs for CPU interconnect port I/O, and measure throughput / max |
GPU | errors | - | - | - | DTrace [7] |
GPU | saturation | - | - | - | DTrace [7]; Instruments → OpenGL Driver, "Client GLWait Time" (maybe) |
GPU | utilization | - | - | - | directly: DTrace [7]; atMonitor, "gpu"; indirect: Temperature Monitor; atMonitor, "gput" |
I/O interconnect | errors | LPE (CPC) for whatever is available | cpustat (CPC) for whatever is available | pmcstat and relevant PMCs for whatever is available | Instruments → Counters, and relevant PMCs for whatever is available |
I/O interconnect | saturation | LPE (CPC) for stall cycles | cpustat (CPC) for stall cycles | pmcstat and relevant PMCs for I/O bus stall cycles | Instruments → Counters, and relevant PMCs for stall cycles |
I/O interconnect | utilization | LPE (CPC) for tput / max if available; inference via known tput from iostat/ip/... | busstat (SPARC only); cpustat for tput / max if available; inference via known tput from iostat/nicstat/... | pmcstat and relevant PMCs for tput / max if available; inference via known tput from iostat/netstat/... | Instruments → Counters, and relevant PMCs for tput / max if available; inference via known tput from iostat/... |
Memory capacity | errors | dmesg for physical failures; dynamic tracing, eg, SystemTap uprobes for failed malloc()s | fmadm faulty and prtdiag for physical failures; fmstat -s -m cpumem-retire (ECC events); DTrace failed malloc()s | physical: dmesg?; /var/log/messages?; virtual: DTrace failed malloc()s | System Information → Hardware → Memory, "Status" for physical failures; DTrace failed malloc()s |
Memory capacity | saturation | system-wide: vmstat 1, "si"/"so" (swapping); sar -B, "pgscank" + "pgscand" (scanning); sar -W; per-process: 10th field (min_flt) from /proc/PID/stat for minor-fault rate, or dynamic tracing [5]; OOM killer: dmesg | grep killed | system-wide: vmstat 1, "sr" (bad now), "w" (was very bad); vmstat -p 1, "api" (anon page ins == pain), "apo"; per-process: prstat -mLc 1, "DFL"; DTrace anonpgpid.d (DTT), vminfo:::anonpgin on execname | system-wide: vmstat 1, "sr" for scan rate, "w" for swapped threads (was saturated, may not be now); swapinfo, "Capacity" also for evidence of swapping/paging; per-process: DTrace [3] | system-wide: vm_stat 1, "pageout"; per-process: anonpgpid.d (DTT), DTrace vminfo:::anonpgin [3] (frequent anonpgin == pain); Instruments → Memory Monitor, high rate of "Page Ins" and "Page Outs"; sysctl vm.memory_pressure [4] |
Memory capacity | utilization | system-wide: free -m, "Mem:" (main memory), "Swap:" (virtual memory); vmstat 1, "free" (main memory), "swap" (virtual memory); sar -r, "%memused"; dstat -m, "free"; slabtop -s c for kmem slab usage; per-process: top/htop, "RES" (resident main memory), "VIRT" (virtual memory), "Mem" for system-wide summary | system-wide: vmstat 1, "free" (main memory), "swap" (virtual memory); per-process: prstat -c, "RSS" (main memory), "SIZE" (virtual memory) | system-wide: vmstat 1, "fre" is main memory free; top, "Mem:"; per-process: top -o res, "RES" is resident main memory size, "SIZE" is virtual memory size; ps -auxw, "RSS" is resident set size (Kbytes), "VSZ" is virtual memory size (Kbytes) | system-wide: vm_stat 1, main memory free = "free" + "inactive", in units of pages; Activity Monitor → Activity Monitor → System Memory, "Free" for main memory; per-process: top -o rsize, "RSIZE" is resident main memory size, "VSIZE" is virtual memory size; ps -alx, "RSS" is resident set size, "SZ" is virtual memory size; ps aux similar (legacy format) |
Memory interconnect | errors | LPE (CPC) for whatever is available | cpustat (CPC) for whatever is available | pmcstat and relevant PMCs for whatever is available | Instruments → Counters, and relevant PMCs for whatever is available |
Memory interconnect | saturation | LPE (CPC) for stall cycles | cpustat (CPC) for stall cycles | pmcstat and relevant PMCs for memory stall cycles | Instruments → Counters, and relevant PMCs for stall cycles |
Memory interconnect | utilization | LPE (CPC) for memory busses, tput / max; or CPI greater than, say, 5 (see the sketch below this table); CPC may also have local vs remote counters | cpustat (CPC) for memory busses, tput / max; or CPI greater than, say, 5; CPC may also have local vs remote counters | pmcstat and relevant PMCs for memory bus throughput / max, or, measure CPI and treat, say, 5+ as high utilization | Instruments → Counters, and relevant PMCs for memory bus throughput / max, or, measure CPI and treat, say, 5+ as high utilization; Shark had "Processor bandwidth analysis" as a feature, which either was or included memory bus throughput, but I never used it |
Network Interfaces | errors | ifconfig, "errors", "dropped"; netstat -i, "RX-ERR"/"TX-ERR"; ip -s link, "errors"; sar -n EDEV, "rxerr/s" "txerr/s"; /proc/net/dev, "errs", "drop" (see the sketch below this table); extra counters may be under /sys/class/net/...; dynamic tracing of driver function returns | netstat -i, error counters; dladm show-phys; kstat for extended errors, look in the interface and "link" statistics (there are often custom counters for the card); DTrace for driver internals | system-wide: netstat -s | egrep 'bad|checksum', for various metrics; per-interface: netstat -i, "Ierrs", "Oerrs" (eg, late collisions), "Colls" [5] | system-wide: netstat -s | grep bad, for various metrics; per-interface: netstat -i, "Ierrs", "Oerrs" (eg, late collisions), "Colls" [5] |
Network Interfaces | saturation | ifconfig, "overruns", "dropped"; netstat -s, "segments retransmited"; sar -n EDEV, *drop and *fifo metrics; /proc/net/dev, RX/TX "drop"; nicstat "Sat" [6]; dynamic tracing for other TCP/IP stack queueing [7] | nicstat; kstat for whatever custom statistics are available (eg, "nocanputs", "defer", "norcvbuf", "noxmtbuf"); netstat -s, retransmits | system-wide: netstat -s, for saturation related metrics, eg netstat -s | egrep 'retrans|drop|out-of-order|memory problems|overflow'; per-interface: DTrace | system-wide: netstat -s, for saturation related metrics, eg netstat -s | egrep 'retrans|overflow|full|out of space|no bufs'; per-interface: DTrace |
Network Interfaces | utilization | sar -n DEV 1, "rxKB/s"/max "txKB/s"/max; ip -s link, RX/TX tput / max bandwidth; /proc/net/dev, "bytes" RX/TX tput/max; nicstat "%Util" [6] | nicstat (latest version); kstat; dladm show-link -s -i 1 interface | system-wide: netstat -i 1, assume one very busy interface and use input/output "bytes" / known max (note: includes localhost traffic); per-interface: netstat -I interface 1, input/output "bytes" / known max | system-wide: netstat -i 1, assume one very busy interface and use input/output "bytes" / known max (note: includes localhost traffic); per-interface: netstat -I interface 1, input/output "bytes" / known max; Activity Monitor → Activity Monitor → Network, "Data received/sec" "Data sent/sec" / known max (note: includes localhost traffic); atMonitor, interface percent |
Network controller | errors | see network interface errors, ... | kstat for whatever is there / DTrace | see network interface errors | see network interface errors |
Network controller | saturation | see network interface saturation, ... | see network interface saturation | see network interface saturation | see network interface saturation |
Network controller | utilization | infer from ip -s link (or /proc/net/dev) and known controller max tput for its interfaces | infer from nicstat and known controller max tput | system-wide: netstat -i 1, assume one busy controller and examine input/output "bytes" / known max (note: includes localhost traffic) | system-wide: netstat -i 1, assume one busy controller and examine input/output "bytes" / known max (note: includes localhost traffic) |
Storage capacity | errors | - | DTrace; /var/adm/messages file system full messages | DTrace; /var/log/messages file system full messages | DTrace; /var/log/system.log file system full messages |
Storage capacity | file systems: errors | strace for ENOSPC; dynamic tracing for ENOSPC; /var/log/messages errs, depending on FS | - | - | - |
Storage capacity | saturation | not sure this one makes sense - once it's full, ENOSPC | not sure this one makes sense - once it's full, ENOSPC | not sure this one makes sense - once it's full, ENOSPC | not sure this one makes sense - once it's full, ENOSPC |
Storage capacity | utilization | swap: swapon -s; free; /proc/meminfo "SwapFree"/"SwapTotal"; file systems: "df -h" | swap: swap -s; file systems: df -h; plus other commands depending on FS type | file systems: df -h, "Capacity"; swap: swapinfo, "Capacity"; pstat -T, also shows swap space; | file systems: df -h; swap: sysctl vm.swapusage, for swap file usage; Activity Monitor → Activity Monitor → System Memory, "Swap used" |
Storage controller | errors | see storage device errors, ... | DTrace the driver, eg, mptevents.d (DTB); /var/adm/messages | DTrace the driver | DTrace the driver |
Storage controller | saturation | see storage device saturation, ... | look for kernel queueing: sd (iostat "wait" again), ZFS zio pipeline | check utilization and DTrace and look for kernel queueing | DTrace and look for kernel queueing |
Storage controller | utilization | iostat -xz 1, sum devices and compare to known IOPS/tput limits per-card | iostat -Cxnz 1, compare to known IOPS/tput limits per-card | iostat -xz 1, sum IOPS & tput metrics for devices on the same controller, and compare to known limits [5] | iostat 1, compare to known IOPS/tput limits per-card |
Storage device I/O | errors | /sys/devices/.../ioerr_cnt; smartctl; dynamic/static tracing of I/O subsystem response codes [8] | iostat -En; DTrace I/O subsystem, eg, ideerr.d (DTB), satareasons.d (DTB), scsireasons.d (DTB), sdretry.d (DTB) | DTrace io:::done probe when /args[0]->b_error != 0/ (see the sketch below this table) | DTrace io:::done probe when /args[0]->b_error != 0/ |
Storage device I/O | saturation | iostat -xnz 1, "avgqu-sz" > 1, or high "await"; sar -d same; LPE block probes for queue length/latency; dynamic/static tracing of I/O subsystem (incl. LPE block probes) | iostat -xnz 1, "wait"; DTrace iopending (DTT), sdqueue.d (DTB) | system-wide: iostat -xz 1, "qlen"; DTrace for queue duration or length [4] | system-wide: iopending (DTT) |
Storage device I/O | utilization | system-wide: iostat -xz 1, "%util"; sar -d, "%util"; per-process: iotop; pidstat -d; /proc/PID/sched "se.statistics.iowait_sum" | system-wide: iostat -xnz 1, "%b"; per-process: DTrace iotop | system-wide: iostat -xz 1, "%b"; per-process: DTrace io provider, eg, iosnoop or iotop (DTT, needs porting) | system-wide: iostat 1, "KB/t" and "tps" are rough usage stats [6]; DTrace could be used to calculate a percent busy, using io provider probes; atMonitor, "disk0" is percent busy; per-process: iosnoop (DTT), shows usage; iotop (DTT), has -P for percent I/O |
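A rough way to get the CPI figure used in the memory interconnect utilization row above, on Linux with perf (LPE): a minimal sketch, assuming the generic cycles and instructions events are available on your processor.

```
# System-wide cycle and instruction counts over 10 seconds.
# CPI = cycles / instructions; recent perf versions also print "insn per cycle",
# which is the reciprocal. A sustained CPI of ~5+ hints at memory stalls.
perf stat -e cycles,instructions -a sleep 10
```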
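For the network interface errors row, the Linux column's /proc/net/dev counters can be pulled out with a one-liner; a sketch, assuming the standard two header lines and column layout:

```
# Per-interface RX/TX error and drop counters from /proc/net/dev.
# gsub() splits the "iface:" prefix off; field positions assume the usual layout.
awk 'NR > 2 { gsub(":", " "); print $1, "rx_errs=" $4, "rx_drop=" $5, "tx_errs=" $12, "tx_drop=" $13 }' /proc/net/dev
```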
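For the storage device I/O errors row on the DTrace systems (Solaris, FreeBSD, Mac OS X), the io:::done check can be run as a one-liner; a sketch, assuming the io provider and its devinfo_t dev_statname member are available on your platform:

```
# Count block I/O completions that returned an error, by device name and error code.
dtrace -n 'io:::done /args[0]->b_error != 0/ { @[args[1]->dev_statname, args[0]->b_error] = count(); }'
```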
Resource | Metric | Linux | Solaris | FreeBSD | Mac OS X |
---|---|---|---|---|---|
File descriptors | errors | strace to look for errno == EMFILE on syscalls returning fds (eg, open(), accept(), ...); see the sketch below this table. | truss or DTrace (better) to look for errno == EMFILE on syscalls returning fds (eg, open(), accept(), ...). | truss, dtruss, or custom DTrace to look for errno == EMFILE on syscalls returning fds (eg, open(), accept(), ...) | dtruss or custom DTrace to look for errno == EMFILE on syscalls returning fds (eg, open(), accept(), ...) |
File descriptors | saturation | does this make sense? I don't think there is any queueing or blocking, other than on memory allocation. | does this make sense? I don't think there is any queueing or blocking, other than on memory allocation. | I don't think this one makes sense, as if it can't allocate or expand the array, it errors; see fdalloc() | I don't think this one makes sense, as if it can't allocate or expand the array, it errors; see fdalloc() |
File descriptors | utilization | system-wide: sar -v, "file-nr" vs /proc/sys/fs/file-max; dstat --fs, "files"; or just /proc/sys/fs/file-nr; per-process: ls /proc/PID/fd | wc -l vs ulimit -n | system-wide (no limit other than RAM); per-process: pfiles vs ulimit or prctl -t basic -n process.max-file-descriptor PID; a quicker check than pfiles is ls /proc/PID/fd | wc -l | system-wide: pstat -T, "files"; sysctl kern.openfiles / sysctl kern.maxfiles; per-process: can figure out using fstat -p PID and ulimit -n | system-wide: sysctl kern.num_files / sysctl kern.maxfiles; per-process: can figure out using lsof and ulimit -n |
Kernel mutex | errors | dynamic tracing (eg, recursive mutex enter); other errors can cause kernel lockup/panic, debug with kdump/crash | lockstat -E, eg recursive mutex enter (other errors can cause kernel lockup/panic, debug with mdb -k) | lockstat -E (errors); DTrace and fbt provider for return probes and error status | DTrace and fbt provider for return probes and error status |
Kernel mutex | saturation | With CONFIG_LOCK_STAT=y, /proc/lock_stat "waittime-total" / "contentions" (also see "waittime-min", "waittime-max"); dynamic tracing of lock functions or instructions (maybe); spinning shows up with profiling (perf record -a -g -F 997 ..., oprofile, dynamic tracing) | lockstat -C (contention); DTrace lockstat provider (see the sketch below this table); spinning shows up with dtrace -n 'profile-997 { @[stack()] = count(); }' | lockstat -C (contention); DTrace lockstat provider [6]; spinning shows up with dtrace -n 'profile-997 { @[stack()] = count(); }' | DTrace and lockstat provider for contention times [8] |
Kernel mutex | utilization | With CONFIG_LOCK_STAT=y, /proc/lock_stat "holdtime-total" / "acquisitions" (also see "holdtime-min", "holdtime-max") [8]; dynamic tracing of lock functions or instructions (maybe) | lockstat -H (held time); DTrace lockstat provider | lockstat -H (held time); DTrace lockstat provider | DTrace and lockstat provider for held times |
Process capacity | errors | - | "can't fork()" messages | "can't fork()" messages | "can't fork()" messages |
Process capacity | saturation | - | not sure this makes sense; you might get queueing on pidlinklock in pid_allocate(), as it scans for available slots once the table gets full | not sure this makes sense | not sure this makes sense |
Process capacity | utilization | - | sar -v, "proc-sz"; kstat, "unix:0:var:v_proc" for max, "unix:0:system_misc:nproc" for current; DTrace (`nproc vs `max_nprocs) | current/max using: ps -a | wc -l / sysctl kern.maxproc; top, "Processes:" also shows current | current/max using: ps -e | wc -l / sysctl kern.maxproc; top, "Processes:" also shows current |
Task capacity | errors | "can't fork()" errors; user-level threads: pthread_create() failures with EAGAIN, EINVAL, ...; kernel: dynamic tracing of kernel_thread() ENOMEM | - | - | - |
Task capacity | saturation | threads blocking on memory allocation; at this point the page scanner should be running (sar -B "pgscan*"), else examine using dynamic tracing | - | - | - |
Task capacity | utilization | top/htop, "Tasks" (current); sysctl kernel.threads-max, /proc/sys/kernel/threads-max (max) | - | - | - |
Thread capacity | errors | - | user-level: pthread_create() failures with EAGAIN, EINVAL, ...; kernel: thread_create() blocks for memory but won't fail. | - | - |
Thread capacity | saturation | - | threads blocking on memory allocation; at this point the page scanner should be running (vmstat "sr"), else examine using DTrace/mdb. | - | - |
Thread capacity | utilization | - | user-level: kstat, "unix:0:lwp_cache:buf_inuse" for current, prctl -n zone.max-lwps -i zone ZONE for max; kernel: mdb -k or DTrace, "nthread" for current, limited by memory | - | - |
User mutex | errors | valgrind --tool=drd various errors; dynamic tracing of pthread_mutex_lock() for EAGAIN, EINVAL, EPERM, EDEADLK, ENOMEM, EOWNERDEAD, ... | DTrace plockstat and pid providers, for EAGAIN, EINVAL, EPERM, EDEADLK, ENOMEM, EOWNERDEAD, ... see pthread_mutex_lock(3C) | DTrace pid provider for EINVAL, EDEADLK, ... see pthread_mutex_lock(3C) etc. | DTrace plockstat and pid providers, for EDEADLK, EINVAL, ... see pthread_mutex_lock(3C) |
User mutex | saturation | valgrind --tool=drd to infer contention from held time; dynamic tracing of synchronization functions for wait time; profiling (oprofile, PEL, ...) user stacks for spins | plockstat -C (contention); prstat -mLc 1, "LCK"; DTrace plockstat provider | DTrace pid provider for contention; eg, pthread_mutex_*lock() entry to return times | plockstat -C (contention); DTrace plockstat provider |
User mutex | utilization | valgrind --tool=drd --exclusive-threshold=... (held time); dynamic tracing of lock to unlock function time | plockstat -H (held time); DTrace plockstat provider | DTrace pid provider for hold times; eg, pthread_mutex_*lock() return to pthread_mutex_unlock() entry | plockstat -H (held time); DTrace plockstat provider |
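For the file descriptor errors row, a minimal Linux sketch that attaches strace to a process (the PID here is a hypothetical example) and watches the common fd-returning syscalls for EMFILE:

```
# Attach to PID 1234 (hypothetical) and grep fd-returning syscalls for EMFILE.
# strace writes to stderr, hence the redirect.
strace -f -p 1234 -e trace=open,openat,socket,accept 2>&1 | grep EMFILE
```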
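For the kernel mutex saturation row on the DTrace systems, a sketch using the lockstat provider to sum time spent blocked on adaptive mutexes by kernel stack (arg1 of adaptive-block is the sleep time in nanoseconds):

```
# Sum adaptive-mutex block time (ns) by kernel stack; Ctrl-C to print.
dtrace -n 'lockstat:::adaptive-block { @[stack()] = sum(arg1); }'
```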
See the individual checklist pages listed at the top. This is a summary of their content.