BPF Performance Tools (book)
This is the official site for the book BPF Performance Tools: Linux System and Application Observability, published by Addison Wesley (2019). This book can help you get the most out of your systems and applications, helping you improve performance, reduce costs, and solve software issues. Here I'll describe the book, link to related content, and list errata and updates.
The book is available on Amazon.com (paperback, kindle), InformIT (paperback, PDF, etc), and Safari (here and here). The paper book was released in December 2019 but sold out immediately. ISBN-13: 9780136554820. (If you purchase through the Amazon or InformIT links, the book's technical editor earns a commission.)
The Amazon Kindle preview shows the first 100 pages out of this 880 page book.
There is also a companion book, Systems Performance: 2nd Edition (2020), that provides balanced coverage of performance analysis and methodologies using all tool types.
On this page: BPF, Screenshots, OSes, Audience, Tools, TOC, Related, Errata, Updates.
What is BPF?
Berkeley Packet Filter (BPF) is an in-kernel execution engine that processes a virtual instruction set. It has more recently been extended (as eBPF) to provide a safe way to extend kernel functionality. In some ways, eBPF does to the kernel what JavaScript does to websites: it allows all sorts of new applications to be created. BPF is now used for software-defined networking, observability (this book), security enforcement, and more. The main front-ends for BPF performance tools are BCC and bpftrace. BPF itself is also becoming a technology name in its own right, and no longer just an acronym.
Screenshots
As an example of a new tool from the book, readahead.bt provides a new view of file system read-ahead performance: the age of read-ahead pages when they are finally referenced, and the number of unused read-ahead pages while tracing:
    # readahead.bt
    Attaching 5 probes...
    ^C
    Readahead unused pages: 128

    Readahead used page age (ms):
    @age_ms:
    [1]                 2455 |@@@@@@@@@@@@@@@                                     |
    [2, 4)              8424 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
    [4, 8)              4417 |@@@@@@@@@@@@@@@@@@@@@@@@@@@                         |
    [8, 16)             7680 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@     |
    [16, 32)            4352 |@@@@@@@@@@@@@@@@@@@@@@@@@@                          |
    [32, 64)               0 |                                                    |
    [64, 128)              0 |                                                    |
    [128, 256)           384 |@@                                                  |
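The bucket labels in this output are power-of-two ranges: `[1]` holds the value 1, `[2, 4)` holds 2 and 3, and so on. As a rough illustration of how values map to such labels, here is a minimal Python sketch. It is not bpftrace's implementation; the function names and sample ages are made up for the example.

```python
from collections import Counter


def bucket_label(value: int) -> str:
    """Return a power-of-two bucket label in the style of the output above."""
    if value < 0:
        return "(..., 0)"
    if value == 0:
        return "[0]"
    if value == 1:
        return "[1]"
    low = 1 << (value.bit_length() - 1)  # largest power of two <= value
    return f"[{low}, {low * 2})"


def histogram(values):
    """Count how many values fall into each power-of-two bucket."""
    return Counter(bucket_label(v) for v in values)


if __name__ == "__main__":
    ages_ms = [1, 2, 3, 5, 9, 15, 130]  # made-up page ages, not data from the book
    for label, count in histogram(ages_ms).items():
        print(f"{label:<12} {count}")
```

Read this way, the `[128, 256)` row in the output means 384 read-ahead pages were first referenced between 128 and 255 ms after being read in.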
The book covers many of the existing tools as well, for example, tcplife for efficiently logging TCP session details:
    # tcplife
    PID   COMM LADDR         LPORT RADDR         RPORT TX_KB RX_KB MS
    4169  java 100.1.111.231 40158 100.2.116.192 6001  7     33    3590.91
    4169  java 100.1.111.231 56940 100.5.177.31  6101  0     0     2.48
    4169  java 100.1.111.231 6001  100.2.176.45  49482 0     0     17.94
    4169  java 100.1.111.231 18926 100.5.102.250 6101  0     0     0.90
    4169  java 100.1.111.231 44530 100.2.31.140  6001  0     0     2.64
    4169  java 100.1.111.231 44406 100.2.8.109   6001  11    28    3982.11
    34781 sshd 100.1.111.231 22    100.2.17.121  41566 5     7     2317.30
    [...]
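Because tcplife prints one line per session in whitespace-separated columns, its output is easy to post-process. The following Python sketch totals sessions and kilobytes per process name, using three rows from the output above as sample input. The helper names are hypothetical, and the sketch assumes process names contain no spaces.

```python
from collections import defaultdict

# Three rows copied from the tcplife output above.
SAMPLE = """\
PID   COMM LADDR         LPORT RADDR         RPORT TX_KB RX_KB MS
4169  java 100.1.111.231 40158 100.2.116.192 6001  7     33    3590.91
4169  java 100.1.111.231 56940 100.5.177.31  6101  0     0     2.48
34781 sshd 100.1.111.231 22    100.2.17.121  41566 5     7     2317.30
"""


def parse_tcplife(text):
    """Turn tcplife text output into a list of dicts keyed by column name."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    rows = []
    for ln in lines[1:]:
        fields = ln.split()
        if len(fields) != len(header):
            continue  # skip lines that are not data rows, such as "[...]"
        rows.append(dict(zip(header, fields)))
    return rows


def summarize(rows):
    """Total sessions, TX/RX kilobytes, and duration per process name."""
    totals = defaultdict(lambda: {"sessions": 0, "tx_kb": 0, "rx_kb": 0, "ms": 0.0})
    for r in rows:
        t = totals[r["COMM"]]
        t["sessions"] += 1
        t["tx_kb"] += int(r["TX_KB"])
        t["rx_kb"] += int(r["RX_KB"])
        t["ms"] += float(r["MS"])
    return dict(totals)


if __name__ == "__main__":
    for comm, t in summarize(parse_tcplife(SAMPLE)).items():
        print(comm, t)
```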
Apart from kernel resources, applications are also analyzed. The following book tool counts Java JNI usage by stack trace:
    # bpftrace --unsafe jnistacks.bt
    Tracing jni_NewObject* calls... Ctrl-C to end.
    ^C
    Running /usr/local/bin/jmaps to create Java symbol files in /tmp...
    Fetching maps for all java processes...
    Mapping PID 25522 (user bgregg):
    wc(1): 8350 26012 518729 /tmp/perf-25522.map
    [...]
    @[
        jni_NewObject+0
        Lsun/awt/X11GraphicsConfig;::pGetBounds+171
        Ljava/awt/MouseInfo;::getPointerInfo+2048
        Lnet/sf/freecol/client/gui/plaf/FreeColButtonUI;::paint+1648
        Ljavax/swing/plaf/metal/MetalButtonUI;::update+232
        Ljavax/swing/JComponent;::paintComponent+672
        Ljavax/swing/JComponent;::paint+2208
        [...]
        Ljavax/swing/RepaintManager;::prePaintDirtyRegions+1556
        Ljavax/swing/RepaintManager$ProcessingRunnable;::run+572
        Ljava/awt/EventQueue$4;::run+1100
        call_stub+138
        JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Th...
    ]: 232
This book doesn't just show the tools, it also explains caveats and gotchas. In this case jnistacks.bt is a simple tool, but getting it to work in production can mean fixing stack traces and symbols. These real-world gotchas are explained with recommended fixes and workarounds.
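To give a feel for the symbol side of this, the /tmp/perf-PID.map file seen in the output is a plain-text symbol table in the perf map convention: each line holds a hexadecimal start address, a hexadecimal size, and a symbol name. The Python sketch below shows how an address could be resolved against such a file. It only illustrates the lookup and is not how bpftrace does it; the function names and test addresses are hypothetical.

```python
import bisect


def load_perf_map(lines):
    """Parse 'START SIZE NAME' lines (hex start and size) into sorted entries."""
    entries = []
    for line in lines:
        parts = line.strip().split(None, 2)  # the name may itself contain spaces
        if len(parts) != 3:
            continue
        entries.append((int(parts[0], 16), int(parts[1], 16), parts[2]))
    entries.sort()
    return entries


def resolve(entries, addr):
    """Return 'symbol+offset' for addr, or None if no symbol covers it."""
    starts = [e[0] for e in entries]
    i = bisect.bisect_right(starts, addr) - 1  # last entry starting at or before addr
    if i < 0:
        return None
    start, size, name = entries[i]
    if addr >= start + size:
        return None
    return f"{name}+{addr - start}"


if __name__ == "__main__":
    # Made-up addresses; a real file would be read with open("/tmp/perf-<PID>.map").
    demo = load_perf_map(["1000 100 foo", "2000 80 Lbar;::baz"])
    print(resolve(demo, 0x1010))
```

If the map file is stale or missing, a lookup like this finds nothing, which is one reason the symbol files need to be regenerated close to the time of tracing.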
The book explains these and over 150 other BPF tools, as well as summarizing over 30 traditional performance analysis tools (top, vmstat, iostat, perf, Ftrace, etc) so that you can use the right tool for the job.
Operating Systems
Extended BPF is a built-in Linux kernel technology, added in parts since 3.18. At least Linux 4.9 is necessary to use the tools in this book. All Linux distributions can use the BPF tools (Ubuntu, CentOS, Fedora, Red Hat, etc.), although the status of BCC and bpftrace varies for each distribution: some have packages, while others still require a build from source. See the install instructions for BCC and bpftrace.
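As a quick first check against the 4.9 minimum, the kernel release string can be compared numerically. This is a minimal Python sketch with hypothetical helper names. It only compares version numbers, so it says nothing about whether a given kernel was built with the required BPF options, or whether a distribution has backported features to an older version.

```python
import platform
import re

MIN_KERNEL = (4, 9)  # minimum stated above for the tools in this book


def parse_release(release):
    """Extract (major, minor) from a release string such as '5.4.0-42-generic'."""
    m = re.match(r"(\d+)\.(\d+)", release)
    if not m:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(m.group(1)), int(m.group(2))


def meets_minimum(release, minimum=MIN_KERNEL):
    """True if the release is at least the minimum (major, minor) version."""
    return parse_release(release) >= minimum


def running_kernel_ok():
    """Check the kernel of the machine this script runs on."""
    return meets_minimum(platform.release())
```

On a Linux host, `running_kernel_ok()` applies the same comparison to the output of `uname -r`.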
Other operating systems including BSD (where BPF originated) are not covered in this book. As extended BPF is being ported elsewhere, a future edition of this book may cover more than Linux.
Audience
This book is primarily for engineers, developers, and support staff in enterprise and cloud environments. No programming is required unless you want to write your own tools, as you can use this book as either:
- A reference of ready-to-run performance analysis and debugging tools.
- A guide for learning how to develop new tools.
This book is also useful for students as a way to learn system internals in an interactive way: you can run and develop tools to examine the workings of the system.
Tools
Over 150 BPF tools are covered in the book, for performance analysis, troubleshooting, and other uses (e.g., security forensics). These tools provide observability for CPUs, memory, disks, file systems, networking, languages, applications, containers, hypervisors, security, and the Linux kernel. To explain how to analyze different languages, three types of execution are studied: compiled, JIT-compiled, and interpreted, using C, Java, and the bash shell as examples. The same approaches can be applied to other languages, and summaries for Node.js, C++, and Golang are included.
To cover all these targets, many new tools needed to be developed for this book. The diagram on the top right shows these new tools colored red. The source to these is included in the book, and can also be found here:
The /originals directory contains an as-is snapshot of the published tools, and /updated contains those tools plus updated versions.
Table of Contents
Achievement unlocked, finished chapter 1 of BPF performance tools, found and disabled several services that were spamming the system with several open/close/exec loops
— bsingharora (@bsingharora) November 26, 2019
Found a misconfiguration in nginx with gzip_static being enabled via `opensnoop`, the new BPF Performance Tools book by @brendangregg is great reading so far. We saw a 10% latency drop immediately 😮
— Kyle Scott Mcgill (@kylescottmcgill) November 28, 2019
Preface
Part I: Technologies
1. Intro
2. Technology Background
3. Performance Analysis
4. bcc
5. bpftrace
6. CPUs
7. Memory
8. File Systems
9. Disk I/O
10. Networking
11. Security
12. Languages
13. Applications
14. Kernel
15. Containers
16. Hypervisors
17. Other BPF Tools
18. Tips and Tricks
Apx B. bpftrace Cheat Sheet
Apx C. bcc Tool Development
Apx D. C BPF
Apx E. BPF Instructions
Glossary
Bibliography
PDF Download eBook EPUB
The Safari online book store features early drafts of books for feedback, called "rough cuts." I'd never published one before, but did this time to see if it helped. It did not. This happened:
- I received next to no feedback from the rough cut.
- A badly-formatted EPUB version immediately appeared on pirate sites, months before the book was finished.
This pirate version is missing bug fixes and content I later added. It is really frustrating as I've worked hard to give readers the best possible experience, but some of you may be studying this draft instead, thinking that it's the final book. There is also (obviously) no way for the publisher to ask the pirates to update their version. Please only read the finished book, preferably "second printing" or later (as the second printing should include the errata fixes, listed below). One tell-tale sign: the cover of the final book includes the text "Foreword by Alexei Starovoitov...," and the early draft versions did not.
Related Content
- bpftrace: The BPF front-end used for code examples in the book.
- BCC: The BPF front-end used for complex tools in the book.
- Linux eBPF Tracing Tools: My page about BPF tracing tools for performance analysis.
- BPF Performance Tools (blog post): My blog post to launch this book.
Errata
1st Printing
- pxxvii, Preface: the footnote 1 text is somehow from chapter 6 by mistake; it should be: "The exercises include some advanced and "unsolved" problems, for which I have yet to see a working solution. It is possible that some of these problems are impossible to solve without kernel or application changes."
- pxxxiv, Preface: the Kindle version has a conversion error where two early page numbers are inserted into the text, appearing as "xxxivtracepoints" and "xxxvmost of whom".
- 2.6, p45: "This figure also shows the Linux kernel versions ...": that was the old figure, but not this new one.
- 2.10.2, p60: "The location of the probe from the previous readelf(1) output was 0x6a2."; that previous output was deleted.
- 2.13, p64: "Linux 2.6.21" -> "Linux 2.6.31".
- 5.9.6, p154: "A a rate of 99" extra "a".
- 6.2.3, p192: "perf list" should be "perf script" (twice).
- 6.2.5, p196: "perf script to show the rate" should be "perf stat ..." (matching the screenshot).
- 9.1.3, p346: the Safari version misnumbers step 2a as another step 1.
- 9.3.2, p359: "biostoop(8)" -> "biosnoop(8)".
- 9.3.7, p370: "kprobe:blk_start_request,kprobe:blk_mq_start_request" -> "kprobe:blk_account_io_done", to trace the full I/O latency (and not just OS queued time).
- 12.5.1, p583: JavaScript (Node.js): "v8 can run Java functions": "Java" should be "JavaScript".
- ApxC, p749: "line 4 imports the BPF library" should be line 2.
- ApxC, p749: "predate his capability" typo his->this.
- ApxC, p767: "make $(getconf" -> "make -j $(getconf" (missing -j).
- ApxC, p767: "thesamples" -> "the samples".
1st & 2nd Printing
- ApxE, p786: Dest and Source Register are both 4-bit (not 8-bit).
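To make the 4-bit register fields concrete, here is a minimal Python sketch that decodes one 8-byte BPF instruction: an 8-bit opcode, two 4-bit register fields sharing one byte, a 16-bit signed offset, and a 32-bit signed immediate. It assumes the little-endian layout, where the destination register occupies the low four bits of the second byte. The function name is hypothetical.

```python
import struct


def decode_insn(raw: bytes) -> dict:
    """Decode one 8-byte BPF instruction (little-endian layout assumed)."""
    if len(raw) != 8:
        raise ValueError("a BPF instruction is 8 bytes")
    # opcode (u8), registers byte (u8), offset (s16), immediate (s32)
    opcode, regs, off, imm = struct.unpack("<BBhi", raw)
    return {
        "opcode": opcode,
        "dst_reg": regs & 0x0F,         # low 4 bits: destination register
        "src_reg": (regs >> 4) & 0x0F,  # high 4 bits: source register
        "off": off,
        "imm": imm,
    }


if __name__ == "__main__":
    # bf 16 00 00 00 00 00 00 is commonly disassembled as "r6 = r1"
    print(decode_insn(bytes.fromhex("bf16000000000000")))
```

With only four bits per field, each register number must fit in the range 0 to 15.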
Updates
These are updates to BPF and its front-ends, many of which were mentioned in the book as "planned" and have since been implemented:
- 5.15.1 p173: bpf_probe_read_kernel() and bpf_probe_read_user() have now been implemented and may show up in Linux 5.5.
- 5.15.2 p174: bpftrace added signal() (thanks Bas Smit).
- 5.15.2 p175: bpftrace added override_return() (thanks Bas Smit).
- bpftrace added strncmp() (thanks Jay Kamat, Bas Smit).
- 5.10.3, p155: bpftrace added if else support (thanks Daniel Xu).
- 5.10.4, p155: bpftrace added while() loops (thanks Bas Smit).
- bpftrace curtask is now a task_struct if type info is available (headers or BTF).
- Appendix C: covered the BCC Python interface, but that is now considered deprecated as we switch to BCC libbpf C.
Thanks to all the reviewers, and to Deirdré Straughan for editing another one of my books!