Systems Performance 2nd Ed.



BPF Performance Tools book

Recent posts:
Blog index
About
RSS

AWS re:Invent 2017: How Netflix Tunes EC2

31 Dec 2017

My last talk for 2017 was at AWS re:Invent, on "How Netflix Tunes EC2 Instances for Performance," an updated version of my 2014 talk. There was so much demand for it this year that I had three overflow rooms streaming it, and people still couldn't get in. (I shouldn't let this go to my head, as there were 42,000 attendees at re:Invent looking for something to see!) Fortunately, it was videoed for those who missed it.

A video of the talk is on youtube:

Here are the slides or as a PDF:

/ permalink/zoom

I love this talk as I get to share more about what the Performance and Operating Systems team at Netflix does, rather than just my work. Our team looks after the BaseAMI, kernel tuning, OS performance tools and profilers, and self-service tools like Vector. We're not the only people doing performance and performance tuning at Netflix either: all the development teams do performance work. We help where we can.

My talk included a section on Linux kernel tunables, as follows. WARNING: These tunables were developed in late 2017, for Ubuntu Xenial instances on EC2.

CPU

schedtool –B PID

Virtual Memory

vm.swappiness = 0       # from 60

Huge Pages

# echo madvise > /sys/kernel/mm/transparent_hugepage/enabled

NUMA

kernel.numa_balancing = 0

File System

vm.dirty_ratio = 80                     # from 40
vm.dirty_background_ratio = 5           # from 10
vm.dirty_expire_centisecs = 12000       # from 3000
mount -o defaults,noatime,discard,nobarrier …

Storage I/O

/sys/block/*/queue/rq_affinity  2
/sys/block/*/queue/scheduler        noop
/sys/block/*/queue/nr_requests  256
/sys/block/*/queue/read_ahead_kb    256
mdadm –chunk=64 ...

Networking

net.core.somaxconn = 1000
net.core.netdev_max_backlog = 5000
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_wmem = 4096 12582912 16777216
net.ipv4.tcp_rmem = 4096 12582912 16777216
net.ipv4.tcp_max_syn_backlog = 8096
net.ipv4.tcp_slow_start_after_idle = 0
net.ipv4.tcp_tw_reuse = 1
net.ipv4.ip_local_port_range = 10240 65535
net.ipv4.tcp_abort_on_overflow = 1    # maybe

Hypervisor (Xen)

echo tsc > /sys/devices/system/clocksource/clocksource0/current_clocksource

Not a lot has changed with these tunables since my 2014 talk.

What I was most excited for was the launch of a new EC2 hypervisor, which I referred to in the video as the "c5 hypervisor". Later that night the real name was released: the "Nitro" hypervisor, as well as the bare metal instance type. My last post on Introducing Nitro explained it and the hypervisor development journey.

Many other Netflix staff spoke at re:Invent (list here). Here are the talks from my immediate colleagues in building F, level 2 at Netflix:

Check them out. It's awesome to see my coworkers on the big stage doing great!



Click here for Disqus comments (ad supported).