[Z Blog] Tuning Linux for benchmarking
Let's assume we want to tune Linux to handle a lot of incoming network connections.
You want to make sure you can open enough file handles.
Check system limit:
cat /proc/sys/fs/file-max
For this test we want 2 million open connections, and it is likely that we will be running more than one process that opens connections (simulators, load testers for smoke testing, HAProxy, nginx, plus Java services).
If fs.file-max is below 2 million, edit /etc/sysctl.conf as follows.
$ sudo nano /etc/sysctl.conf
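As a minimal sketch, assuming we size fs.file-max at roughly 2 million handles (the exact value here is an assumption for this test), the entry looks like this:

```
# /etc/sysctl.conf
# Raise the system-wide file handle limit (2,097,152 is an assumed value for this test)
fs.file-max = 2097152
```

Reload and verify that the new limit took effect:

```
$ sudo sysctl -p
$ cat /proc/sys/fs/file-max
```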
Now check the per-user soft and hard limits for open files.
# ulimit -Hn
# ulimit -Sn
Let's go ahead and check the various limits and see if we can open enough files from a user perspective.
$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 1031032
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 30000
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 1031032
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
Most of the parameters look good for how we are going to use the OS, except for open files (-n).
Add the following to /etc/security/limits.conf.
$ sudo nano /etc/security/limits.conf
* hard nofile 1000000
* soft nofile 1000000
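PAM applies limits.conf at login, so start a new session (or log out and back in) and confirm the limits stuck:

```
$ ulimit -Hn
$ ulimit -Sn
```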
Follow this guide for tuning network parameters.
Ideas adapted from: TCP Tuning, Linux TCP/IP Tuning, TIME_WAIT and port reuse, and Linux Networking Tuning.
Edit the /etc/sysctl.conf file and reload it with sysctl -p.
We are going to crank up the minimum and maximum buffer sizes: 16MB for 1GE, and 32MB or 54MB for 10GE.
Next we will increase the NIC (network card) interface queue. If the RTT is more than 50 ms, a value of 5,000-10,000 is recommended.
To increase txqueuelen, do the following:
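A minimal sketch, assuming the interface is eth0 (substitute your own NIC name) and picking 10,000 from the recommended range above:

```
# Set the transmit queue length (eth0 is an assumption)
$ sudo ifconfig eth0 txqueuelen 10000
# Or with iproute2:
$ sudo ip link set dev eth0 txqueuelen 10000
# Verify:
$ ip link show dev eth0
```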
- TCP_FIN_TIMEOUT - how long TCP/IP waits before releasing a closed connection and freeing its resources.
- A smaller TCP_FIN_TIMEOUT means TCP/IP can release closed-connection resources (like the port) faster.
- TCP_KEEPALIVE_INTERVAL - wait time between keepalive probes
- TCP_KEEPALIVE_PROBES - number of probes before timing out
- TCP_TW_RECYCLE - turn on fast recycling of TIME_WAIT sockets (closed, waiting to be reused)
- TCP_TW_REUSE - safer version of TCP_TW_RECYCLE, good for short connections (server behind a load balancer); the matching sysctl keys are sketched below
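As a sketch, these map to the following sysctl keys. The values are assumptions for a benchmarking box, not general recommendations, and tcp_tw_recycle is unsafe behind NAT (it was removed from newer kernels), so prefer tcp_tw_reuse:

```
# /etc/sysctl.conf -- example values only
net.ipv4.tcp_fin_timeout = 10
net.ipv4.tcp_keepalive_intvl = 30
net.ipv4.tcp_keepalive_probes = 5
net.ipv4.tcp_tw_reuse = 1
# net.ipv4.tcp_tw_recycle = 1   # risky behind NAT; removed in newer kernels
```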
Check 'sysctl net.ipv4.tcp_available_congestion_control' to see which congestion control algorithms are available; use cubic, which is the default, or htcp.
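A quick sketch of checking and switching the algorithm (output varies by kernel; htcp may need its module loaded first):

```
$ sysctl net.ipv4.tcp_available_congestion_control
$ sysctl net.ipv4.tcp_congestion_control
# htcp is often built as a module
$ sudo modprobe tcp_htcp
$ sudo sysctl -w net.ipv4.tcp_congestion_control=htcp
```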
Find more tuning ideas in ip-sysctl.txt, which is part of the Linux kernel documentation.
Especially when you are running wrk on a client, make more ephemeral ports available to your application. The default ephemeral port range is 32768 to 61000, and it can be widened. With the range below, and since the ports will be reused, we will have 47,536 of them.
net.ipv4.ip_local_port_range = 18000 65535
A connection stays in the TIME_WAIT state for twice the MSL. The default MSL is 60 seconds, which puts the TIME_WAIT timeout at 2 minutes. Since we are going to be benchmarking with wrk, we want to spare our ports.
net.ipv4.netfilter.ip_conntrack_tcp_timeout_time_wait = 1
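While a benchmark runs, it is handy to watch how many sockets are sitting in TIME_WAIT (a sanity check, not part of the tuning itself):

```
$ ss -tan state time-wait | wc -l   # subtract one for the header line
# or a summary of all socket states
$ ss -s
```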
The established-connection timeout governs the ESTABLISHED state, and a connection should leave that state when a FIN packet goes through in either direction. nf_conntrack_tcp_timeout_established is 432000 seconds (five days) by default. We don't want to tie up connections and ports on sessions that never managed to complete the three-way handshake or tear down cleanly. We are benchmarking, and losing enough ports and resources could be bad.
net.netfilter.nf_conntrack_tcp_timeout_established=60
Related to the above is the sysctl setting net.ipv4.tcp_slow_start_after_idle. Setting it to 0 keeps the kernel from collapsing the congestion window back to slow start after a connection has been idle.
net.ipv4.tcp_slow_start_after_idle=0
# /etc/sysctl.conf
# Increase system file descriptor limit
fs.file-max = 100000
# Discourage Linux from swapping idle processes to disk (default = 60)
vm.swappiness = 10
# Increase ephemeral IP ports
net.ipv4.ip_local_port_range = 10000 65000
# Increase Linux autotuning TCP buffer limits
# Set max to 16MB for 1GE and 32M (33554432) or 54M (56623104) for 10GE
# Don't set tcp_mem itself! Let the kernel scale it based on RAM.
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.core.rmem_default = 16777216
net.core.wmem_default = 16777216
net.core.optmem_max = 40960
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
# Make room for more TIME_WAIT sockets due to more clients,
# and allow them to be reused if we run out of sockets
# Also increase the max packet backlog
net.core.netdev_max_backlog = 50000
net.ipv4.tcp_max_syn_backlog = 30000
net.ipv4.tcp_max_tw_buckets = 2000000
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_fin_timeout = 10
# Disable TCP slow start on idle connections
net.ipv4.tcp_slow_start_after_idle = 0
# If your servers talk UDP, also up these limits
net.ipv4.udp_rmem_min = 8192
net.ipv4.udp_wmem_min = 8192
# Disable source routing and redirects
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.all.accept_source_route = 0
# Log packets with impossible addresses for security
net.ipv4.conf.all.log_martians = 1
# /etc/security/limits.conf
# allow all users to open 100000 files
# alternatively, replace * with an explicit username
* soft nofile 100000
* hard nofile 100000
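After editing, a quick sketch for applying and spot-checking the sysctl side (limits.conf, as noted above, takes effect at the next login):

```
$ sudo sysctl -p
$ sysctl net.ipv4.ip_local_port_range
$ sysctl net.ipv4.tcp_fin_timeout
```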
Source: http://www.nateware.com/linux-network-tuning-for-2013.html
Sometimes you get a server instance without knowing what you got. It happens.
more /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 45
model name : Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz
stepping : 7
microcode : 0x70d
cpu MHz : 1999.832
cache size : 20480 KB
cpu cores : 8
So this is a 2012 processor. Must have been a sale on eBay.
https://cpubenchmark.net/cpu.php?cpu=Intel+Xeon+E5-2650+%40+2.00GHz&id=1218&cpuCount=2
http://ark.intel.com/products/64590/Intel-Xeon-Processor-E5-2650-20M-Cache-2_00-GHz-8_00-GTs-Intel-QPI
OK. At least if I decide to do some benchmarking in EC2, I can sort of pick what size EC2 instance I am going to use. This box has 32 logical CPUs. CPU should not be an issue with this app, but if it is, we can split it up.
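If you want to confirm the logical CPU count and topology yourself, a quick check:

```
$ nproc
$ grep -c ^processor /proc/cpuinfo
$ lscpu
```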
I like to make sure that I have things set up properly. I have custom HTTP code on both the client and the server, but I like to make sure that I have a decent OS setup before I waste too much time tweaking my code.
I do this with wrk and nginx.
I install Nginx on the box that is going to be the server. I install wrk on the client box.
$ wrk -c 20000 -d 10s http://10.5.99.62/index.html --timeout 1000s -t 12
Running 10s test @ http://10.5.99.62/index.html
12 threads and 20000 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 85.74ms 330.97ms 6.40s 98.01%
Req/Sec 11.26k 4.50k 50.62k 77.00%
1345589 requests in 10.05s, 1.04GB read
Socket errors: connect 0, read 74, write 0, timeout 0
Requests/sec: 133925.85
Transfer/sec: 105.62MB
I changed the nginx worker pool to use 16 worker processes since we have 32 CPUs to use.
/etc/nginx$ cat nginx.conf
user www-data;
worker_processes 16;
pid /run/nginx.pid;
events {
        worker_connections 768;
        # multi_accept on;
}
Install Java on Ubuntu.
https://www.digitalocean.com/community/tutorials/how-to-install-java-on-ubuntu-with-apt-get
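The linked tutorial covers the options in detail; as a minimal sketch, assuming Ubuntu's packaged OpenJDK is good enough for this test:

```
$ sudo apt-get update
$ sudo apt-get install -y default-jdk
$ java -version
```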
Install Vertx
$ cat server.js
var vertx = require('vertx');
vertx.createHttpServer().requestHandler(function(req) {
    req.response.end("Hello World!");
}).listen(9090);
$ /opt/vertx/vert.x-2.1.5/bin/vertx run server.js -instances 16
Client
$ wrk -c 20000 -d 10s http://10.5.99.62:9090/ --timeout 1000s -t 20
Running 10s test @ http://10.5.99.62:9090/
20 threads and 20000 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 104.42ms 617.70ms 6.06s 97.13%
Req/Sec 12.57k 11.13k 79.82k 89.77%
2507087 requests in 10.10s, 121.94MB read
Socket errors: connect 0, read 542, write 0, timeout 0
Requests/sec: 248333.64
Transfer/sec: 12.08MB