Friday, October 8, 2010

TCP Tuning Guide


Network / TCP / UDP Tuning


High Performance Networking Options:-

The options below are presented in the order in which they should be checked and adjusted; a quick way to inspect the current settings on Linux is shown after the list.
  1. Maximum TCP Buffer (Memory) space: All operating systems have some global mechanism to limit the amount of system memory that can be used by any one TCP connection.
  2. Socket Buffer Sizes: Most operating systems also support separate per connection send and receive buffer limits that can be adjusted by the user, application or other mechanism as long as they stay within the maximum memory limits above. These buffer sizes correspond to the SO_SNDBUF and SO_RCVBUF options of the BSD setsockopt() call.
  3. TCP Large Window Extensions (RFC1323): These enable optional TCP protocol features (window scale and timestamps) that are required to support large bandwidth-delay product (BDP) paths.
  4. TCP Selective Acknowledgments Option (SACK, RFC2018): allows a TCP receiver to inform the sender exactly which data is missing and needs to be retransmitted.
  5. Path MTU: The host system must use the largest possible MTU for the path. This may require enabling Path MTU Discovery.
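As a quick sanity check on Linux before tuning, the sysctl keys behind the window scaling, timestamp and SACK options, together with the usable path MTU towards a given destination, can be inspected as follows (remote.example.com is only a placeholder):

# sysctl net.ipv4.tcp_window_scaling net.ipv4.tcp_timestamps net.ipv4.tcp_sack
# tracepath remote.example.com

A value of 1 for each sysctl means the feature is enabled; tracepath reports the largest MTU that survives the whole path.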



This is a basic, step-by-step description of how to improve network (TCP & UDP) performance on Linux 2.4+ for high-bandwidth applications. These settings are especially important for GigE links.
Assumptions
This howto assumes that the machine being tuned is involved in supporting high-bandwidth applications. Making these modifications on a machine that supports multiple users and/or multiple connections is not recommended - it may cause the machine to deny connections because of a lack of memory allocation.
The Steps
  1. Make sure that you have root privileges.
  2. Type: sysctl -a | grep mem
    This will display your current buffer settings. Save these! You may want to roll back these changes later (a sketch of how to save and restore them follows this list).
  3. Type: sysctl -w net.core.rmem_max=8388608
    This sets the max OS receive buffer size for all types of connections.
  4. Type: sysctl -w net.core.wmem_max=8388608
    This sets the max OS send buffer size for all types of connections.
  5. Type: sysctl -w net.core.rmem_default=65536
    This sets the default OS receive buffer size for all types of connections.
  6. Type: sysctl -w net.core.wmem_default=65536
    This sets the default OS send buffer size for all types of connections.
  7. Type: sysctl -w net.ipv4.tcp_mem='8388608 8388608 8388608'
    TCP Autotuning setting. "The tcp_mem variable defines how the TCP stack should behave when it comes to memory usage. ... The first value specified in the tcp_mem variable tells the kernel the low threshold. Below this point, the TCP stack do not bother at all about putting any pressure on the memory usage by different TCP sockets. ... The second value tells the kernel at which point to start pressuring memory usage down. ... The final value tells the kernel how many memory pages it may use maximally. If this value is reached, TCP streams and packets start getting dropped until we reach a lower memory usage again. This value includes all TCP sockets currently in use."
  8. Type: sysctl -w net.ipv4.tcp_rmem='4096 87380 8388608'
    TCP Autotuning setting. "The first value tells the kernel the minimum receive buffer for each TCP connection, and this buffer is always allocated to a TCP socket, even under high pressure on the system. ... The second value specified tells the kernel the default receive buffer allocated for each TCP socket. This value overrides the /proc/sys/net/core/rmem_default value used by other protocols. ... The third and last value specified in this variable specifies the maximum receive buffer that can be allocated for a TCP socket."
  9. Type: sysctl -w net.ipv4.tcp_wmem='4096 65536 8388608'
    TCP Autotuning setting. "This variable takes 3 different values which holds information on how much TCP sendbuffer memory space each TCP socket has to use. Every TCP socket has this much buffer space to use before the buffer is filled up. Each of the three values are used under different conditions. ... The first value in this variable tells the minimum TCP send buffer space available for a single TCP socket. ... The second value in the variable tells us the default buffer space allowed for a single TCP socket to use. ... The third value tells the kernel the maximum TCP send buffer space."
  10. Type: sysctl -w net.ipv4.route.flush=1
    This will ensure that immediately subsequent connections use these values.
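For the roll-back mentioned in step 2, one minimal approach (the backup file name is only an example) is to dump the current values to a file before tuning and later feed that file back to sysctl:

# sysctl net.core.rmem_max net.core.wmem_max net.core.rmem_default net.core.wmem_default \
    net.ipv4.tcp_rmem net.ipv4.tcp_wmem net.ipv4.tcp_mem > /root/tcp-tuning-before.conf
# sysctl -p /root/tcp-tuning-before.conf

The first command writes the settings in the same 'key = value' form that sysctl -p accepts, so restoring them is a single command.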
Quick Step
Cut and paste the following into a Linux shell with root privileges:

sysctl -w net.core.rmem_max=8388608
sysctl -w net.core.wmem_max=8388608
sysctl -w net.core.rmem_default=65536
sysctl -w net.core.wmem_default=65536
sysctl -w net.ipv4.tcp_rmem='4096 87380 8388608'
sysctl -w net.ipv4.tcp_wmem='4096 65536 8388608'
sysctl -w net.ipv4.tcp_mem='8388608 8388608 8388608'
sysctl -w net.ipv4.route.flush=1

Tune values:-
Set the max OS send buffer size (wmem) and receive buffer size (rmem) to 12 MB for queues on all protocols. In other words, set the amount of memory that is allocated for each TCP socket when it is opened or created while transferring files:

WARNING! The default value of rmem_max and wmem_max is about 128 KB in most Linux distributions, which may be enough for a low-latency, general-purpose network environment or for apps such as a DNS or Web server. However, if the latency is large, the default size might be too small. Please note that the following settings are going to increase memory usage on your server.
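To see what your distribution is currently using before overriding it, query the two keys directly (values are reported in bytes):

# sysctl net.core.rmem_max net.core.wmem_max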

# echo 'net.core.wmem_max=12582912' >> /etc/sysctl.conf
# echo 'net.core.rmem_max=12582912' >> /etc/sysctl.conf


You also need to set the minimum, default (initial), and maximum sizes in bytes:

# echo 'net.ipv4.tcp_rmem= 10240 87380 12582912' >> /etc/sysctl.conf
# echo 'net.ipv4.tcp_wmem= 10240 87380 12582912' >> /etc/sysctl.conf

Turn on TCP window scaling, which allows the receive window to grow beyond 64 KB and is needed to make use of the larger buffers above:
# echo 'net.ipv4.tcp_window_scaling = 1' >> /etc/sysctl.conf

Enable timestamps as defined in RFC1323:
# echo 'net.ipv4.tcp_timestamps = 1' >> /etc/sysctl.conf

Enable selective acknowledgments (SACK):
# echo 'net.ipv4.tcp_sack = 1' >> /etc/sysctl.conf
By default, TCP saves various connection metrics in the route cache when the connection closes, so that connections established in the near future can use these to set initial conditions. Usually this increases overall performance, but it may sometimes cause performance degradation. Setting tcp_no_metrics_save to 1 stops TCP from caching metrics on closing connections:
# echo 'net.ipv4.tcp_no_metrics_save = 1' >> /etc/sysctl.conf

Set the maximum number of packets queued on the INPUT side when the interface receives packets faster than the kernel can process them:
# echo 'net.core.netdev_max_backlog = 5000' >> /etc/sysctl.conf
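To judge whether the input backlog is actually overflowing, /proc/net/softnet_stat can be consulted (one row per CPU; on typical kernels the second hexadecimal column counts packets dropped because the backlog was full):

# cat /proc/net/softnet_stat

If that column stays at zero under load, raising netdev_max_backlog further is unlikely to help.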

Now reload the changes:
# sysctl -p
Use tcpdump to watch traffic on eth0 and confirm how connections behave with the new settings:
# tcpdump -ni eth0
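Because the window scale, SACK-permitted and timestamp options are only exchanged during the three-way handshake, one way to confirm them is to capture just the SYN packets (a sketch; the filter matches any segment with the SYN flag set):

# tcpdump -ni eth0 'tcp[tcpflags] & tcp-syn != 0'

The options field printed for each SYN (for example sackOK, TS val ..., wscale N) shows which of these features the two ends agreed on.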

MTU:-
The Maximum Transmission Unit (MTU) can be changed in real time on Red Hat Enterprise Linux, or the value can be forced at boot time.
The MTU, in simple terms, is the maximum size of a packet that can be sent on a network interface card. The default MTU size for Ethernet is 1500 bytes.
To dynamically change the MTU in real time while the server is in use:
# ip link set dev eth0 mtu 1350
where eth0 is the Ethernet interface and 1350 is the MTU size in bytes
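To verify that packets of that size actually fit the path, one quick test (remote-host is a placeholder; 1322 = 1350 minus 20 bytes of IP header and 8 bytes of ICMP header) is to send non-fragmentable pings of exactly that size:

# ping -c 3 -M do -s 1322 remote-host

If the path cannot carry 1350-byte packets, ping reports an error such as 'Frag needed and DF set' or 'Message too long' instead of normal replies.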
However, this change is lost when the server or the network interface is restarted. To make the change permanent, edit the interface configuration file (for instance, for eth0)
/etc/sysconfig/network-scripts/ifcfg-eth0 and add the line MTU=1350.
Once done, simply restart the interface or reboot the server at the next available maintenance window for the changes to take effect.
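A minimal ifcfg-eth0 showing where the line goes (the other keys are just illustrative values for a DHCP-configured interface; only the MTU line is the addition described above):

DEVICE=eth0
BOOTPROTO=dhcp
ONBOOT=yes
MTU=1350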

redhatlinux# ip link list
1: lo: mtu 16436 qdisc noqueue
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: mtu 1350 qdisc pfifo_fast qlen 1000
    link/ether 00:01:11:12:13:14 brd ff:ff:ff:ff:ff:ff
3: eth1: mtu 1500 qdisc noop qlen 1000
    link/ether 00:40:f4:98:8e:43 brd ff:ff:ff:ff:ff:ff