Enabling High Performance Data Transfers
Specific notes for Historical Operating Systems
The information on this page covers three groups of systems: obsolete operating systems that are no longer being supported by the original vendor, and systems for which our information is too far out of date and we have no access to current systems for testing. For pedantic consistency reasons it also includes current systems that appear on the main performance tuning page. We will continue to keep archival information on historical systems as long as it appears to be useful to the community.
For operating systems that are still supported but do not appear on the main performance tuning page, please accept our apologies and drop us a note. We will be happy to update the main tuning page with fresh information.
For all other information (the TCP tuning tutorial, etc.), please see the main performance tuning page.
- High Performance Networking Options
- Using Web Based Network Diagnostic Servers
- Support for Various Operating Systems
- Detailed Procedures for tuning and Raising network limits on historical systems
- under BSD/OS 2.1 & 3.0
- on CRI systems under Unicos 8.0
- on (Compaq) DEC Alpha systems under Digital Unix 3.2c
- under HP-UX 9.X, 10.x & 11.X
- on IBM RS/6000 systems under AIX 3.2 or 4.1
- on IBM MVS systems under the Interlink TCP Stack
- on Linux
- on Macintosh OS X
- under NetBSD
- under FTP Software (NetManage) OnNet 4.0 for Windows 95/98
- on SGI systems under IRIX 6.5
- on Sun Solaris Systems
- Miscellaneous information about Windows NT
- on Microsoft Windows 98 systems
- on Microsoft Windows 2000
|Operating System (Alphabetical)||RFC1191 Path MTU Discovery||RFC1323 Support||Default maximum socket buffer size||Default TCP socket buffer size||Default UDP socket buffer size||Applications (if any) which are user tunable||RFC2018 SACK Support|
|BSD/OS 2.0||No||Yes||256kB||8kB||9216 snd 41600 rcv||None||Hari Balakrishnan’s BSD/OS 2.1 implementation|
|BSD/OS 3.0||Yes||Yes||256kB||8kB||9216 snd 41600 rcv||None|
|CRI Unicos 8.0||Yes||Yes||FTP|
|(Compaq) Digital Unix 3.2||Yes Winscale, No Timestamps||128kB||32kB||None|
|(Compaq) Digital Unix 4.0||Yes||Yes Winscale, No Timestamps||128kB||32kB||9216 snd 41600 rcv||None||PSC Research version|
|HPUX 9.X||No||9.05 and 9.07 provide patches for RFC1323||1 MB (?)||8kB||9216||FTP (with patches)|
|IBM AIX 3.2 & 4.1||No||Yes||64kB||16kB||41600 Bytes receive/9216 Bytes send||None|
|IBM MVS TCP stack by Interlink, v2.0 or greater||No||Yes||1MB|
|Linux 2.4 and 2.6||Yes||Yes||64kB||32kB (see notes)||32kB(?)||None||Yes|
|Mac OS X||Yes||Yes||256kB||32kB||42kB (receive)||ftp (for a terminal shell)||Yes, as of 10.4.6|
|NetBSD 1.1/1.2||No||Yes||256kB||16kB||None||PSC Research version|
|FTP Software (NetManage) OnNet Kernel 4.0 for Win95/98||Yes||Yes||963.75 MB||8K [146K for Satellite tuning]||8K send 48K recv||FTP server||Yes|
|SGI IRIX 6.5||Yes||Yes||Unlimited||60kB||60kB||None||Yes, as of 6.5.7. It is on by default.|
|Sun Solaris 10||Yes||Yes||1MB TCP, 256kB UDP||48kB||8kB||Unknown||Yes|
|Microsoft Windows NT 3.5/4.0||Yes||No||64kB||max(~8kB, min(4*MSS, 64kB))||No|
|Microsoft Windows NT 5.0 Beta||Yes||Yes|
|Microsoft Win98||Yes||1GB(?!)||8kB||Yes (on by default)|
|Microsoft Windows 2000||Yes||1GB(?!)||8kB||Yes (on by default)|
Procedure for raising network limits under BSD/OS 2.1 and 3.0 (BSDi)
MTU discovery is now supported in BSD/OS 3.0. RFC1323 is also supported, and the procedure for setting the relevant kernel variable uses the “sysctl” interface described for FreeBSD. See sysctl(1) and sysctl(3) for more information.
Procedure for raising network limits under CRI systems under Unicos 8.0
System configuration parameters are tunable via the command “/etc/netvar”. Running “/etc/netvar” with no arguments shows all configurable variables:
% /etc/netvar
Network configuration variables
tcp send space is 32678
tcp recv space is 32678
tcp time to live is 60
tcp keepalive delay is 14400
udp send space is 65536
udp recv space is 68096
udp time to live is 60
ipforwarding is on
ipsendredirects is on
subnetsarelocal is on
dynamic MTU discovery is on
adminstrator mtu override is on
maximum number of allocated sockets is 3750
maximum socket buffer space is 409600
operator message delay interval is 5
per-session sockbuf space limit is 0
The following variables can be set:
- dynamic MTU discovery: This is “off” by default and should be changed to “on”.
- maximum socket buffer space: This should be set to the desired maximum socket buffer size (in bytes).
- tcp send space, tcp recv space: These are the default buffer sizes used by applications. These should be changed with caution.
Once variables have been changed by /etc/netvar, they take effect immediately for new processes. Processes that are already running with open sockets are not modified.
Procedure for raising network limits on (Compaq) DEC Alpha systems under Digital Unix 3.2c
- By default, the maximum allowable socket buffer size on this operating system is 128kB.
- In order to raise this maximum, you must increase the kernel variable sb_max. In order to do this, run the following commands as root:
# dbx -k /vmunix
(dbx) assign sb_max = (u_long) 524288
(dbx) patch sb_max = (u_long) 524288
In this example, sb_max is increased to 512kB. The first command changes the variable for the running system, and the second command patches the kernel so it will continue to use the new value, even after rebooting the system. Note, however, that reinstalling (overwriting) the kernel will undo this change.
- The Digital Unix manuals also recommend increasing mbclusters to at least 832.
- Standard applications do not have a mechanism for setting the socket buffer size to anything but the default. However, you can change the kernel default by modifying the kernel variables tcp_sendspace and tcp_recvspace.
Procedure for raising network limits on (Compaq) DEC Alpha systems under Digital Unix 4.0
- Under version 4.0 of Digital Unix, many variables can now be tuned with the sysconfig command. Some (but not all!) of the relevant variables from sysconfig are shown here:
% /sbin/sysconfig -q inet
inet:
tcp_sendspace = 32768
tcp_recvspace = 32768
tcp_keepidle = 14400
tcp_keepintvl = 150
tcp_keepinit = 150
tcp_keepcnt = 8
tcp_ttl = 60
tcp_mssdflt = 536
tcp_rttdflt = 3
tcp_dont_winscale = 0
tcpnodelack = 0
tcptwreorder = 1
udp_sendspace = 9216
udp_recvspace = 41600
udpcksum = 1
udp_ttl = 30
pmtu_enabled = 1
pmtu_rt_check_intvl = 20
pmtu_decrease_intvl = 1200
pmtu_increase_intvl = 240
...
% /sbin/sysconfig -q socket
socket:
sominconn = 0
somaxconn = 1024
sb_max = 131072
To make a change (for example):
# /sbin/sysconfig -r inet tcp_sendspace 65536
# /sbin/sysconfig -r inet tcp_recvspace 65536
- Specific advice for tuning (Compaq) Digital UNIX systems (for both V4.0 releases and many of the V3.2x releases) may be found at http://www.unix.digital.com/internet/tuning.htm
This document contains information on other important parameters (not just the ones directly associated with the socket, IP, and TCP layers) and gives instructions on how to modify things. It also includes important patch information, and is updated every few months.
Procedure for raising network limits under FreeBSD
All system parameters can be read or set with ‘sysctl’. E.g.:
sysctl [parameter]
sysctl -w [parameter]=[value]
You can raise the maximum socket buffer size by, for example:
sysctl -w kern.ipc.maxsockbuf=4000000
FreeBSD 7.0 implements automatic receive and send buffer tuning, which is enabled by default. The default maximum value is 256KB, which is likely too small and should probably be increased.
You can also set the TCP and UDP default buffer sizes using the variables
net.inet.tcp.sendspace
net.inet.tcp.recvspace
net.inet.udp.recvspace
When using larger socket buffers, you probably need to make sure that the TCP window scaling option is enabled. (The default is not enabled!) Check ‘tcp_extensions="YES"’ in /etc/rc.conf and ensure it’s enabled via the sysctl variable:
FreeBSD’s TCP has a feature called “inflight limiting” turned on by default, which can be detrimental to TCP throughput in some situations. If you want “normal” TCP behavior, you should run:
sysctl -w net.inet.tcp.inflight_enable=0
You may also want to confirm that SACK is enabled (working since FreeBSD 5.3):
MTU discovery is on by default in FreeBSD. If you wish to disable MTU discovery, you can toggle it with the sysctl variable:
Contributors: Pekka Savola and David Malone.
Checked for FreeBSD 7.0, Sept 2008
Procedure for raising network limits under HPUX 9.X
HP-UX 9.X does not support Path MTU discovery.
There are patches for 9.05 and 9.07 that provide 1323 support. To enable it, one must poke the kernel variables tcp_dont_tsecho and tcp_dont_winscale to 0 with adb (the patch includes a script, but I don’t recall the patch number).
Without the 9.05/9.07 patch, the maximum socket buffer size is somewhere around 58254 bytes. With the patch it is somewhere around 1MB (there is a small chance it is as much as 4MB).
The FTP provided with the up to date patches should offer an option to change the socket buffer size. The default socket buffer size for this could be 32KB or 56KB.
There is no support for SACK in 9.X.
Procedure for raising network limits under HPUX 10.X
HP-UX 10.00, 10.01, 10.10, 10.20, and 10.30 support Path MTU discovery. It is on by default for TCP, and off by default for UDP. On/Off can be toggled with nettune.
Up through 10.20, RFC 1323 support is like the 9.05 patch, except the maximum socket buffer size is somewhere between 240 and 256KB. In other words, you need to do the same adb “pokes” as described above.
10.30 does not require adb “pokes” to enable RFC1323. 10.30 also replaces nettune with ndd. The 10.X default TCP socket buffer size is 32768; the default UDP size remains unchanged from 9.X. Both can be tweaked with nettune.
FTP should be as it is in patched 9.X.
There is no support for SACK in 10.X up through 10.20.
Procedure for raising network limits under HPUX 11
HP-UX 11 supports PMTU discovery and enables it by default. This is controlled through the ndd setting ip_pmtu_strategy.
Note: Additional (extensive) information is available at ftp://ftp.cup.hp.com/dist/networking/briefs/annotated_ndd.txt
RFC 1323 support is enabled automagically in HP-UX 11. If an application requests a window/socket buffer size greater than 64 KB, window scaling and timestamps will be used automatically.
The default TCP window size in HP-UX 11 remains 32768 bytes and can be altered through ndd and the settings:
tcp_recv_hiwater_def
tcp_recv_hiwater_lfp
tcp_recv_hiwater_lnp
tcp_xmit_hiwater_def
tcp_xmit_hiwater_lfp
tcp_xmit_hiwater_lnp
FTP in HP-UX 11 uses the new sendfile() system call. This allows data to be sent directly from the filesystem buffer cache through the network without intervening data copies.
HP-UX 11 (patches) and 11i (patches or base depending on the revision) have commercial support for SACK (based on feedback from HP – Thanks!)
Here is some ndd -h parm output for a few of the settings mentioned above. For those not mentioned, use ndd -h on an HP-UX 11 system, or consult the online manuals at http://docs.hp.com/
# ndd -h ip_pmtu_strategy
ip_pmtu_strategy:
Set the Path MTU Discovery strategy: 0 disables Path MTU Discovery; 1 enables Strategy 1; 2 enables Strategy 2.

Because of problems encountered with some firewalls, hosts, and low-end routers, IP provides for selection of either of two discovery strategies, or for completely disabling the algorithm. The tunable parameter ip_pmtu_strategy controls the selection.

Strategy 1: All outbound datagrams have the "Don't Fragment" bit set. This should result in notification from any intervening gateway that needs to forward a datagram down a path that would require additional fragmentation. When the ICMP "Fragmentation Needed" message is received, IP updates its MTU for the remote host. If the responding gateway implements the recommendations for gateways in RFC 1191, then the next hop MTU will be included in the "Fragmentation Needed" message, and IP will use it. If the gateway does not provide next hop information, then IP will reduce the MTU to the next lower value taken from a table of "popular" media MTUs.

Strategy 2: When a new routing table entry is created for a destination on a locally connected subnet, the "Don't Fragment" bit is never turned on. When a new routing table entry for a non-local destination is created, the "Don't Fragment" bit is not immediately turned on. Instead,

o An ICMP "Echo Request" of full MTU size is generated and sent out with the "Don't Fragment" bit on.

o The datagram that initiated creation of the routing table entry is sent out immediately, without the "Don't Fragment" bit. Traffic is not held up waiting for a response to the "Echo Request".

o If no response to the "Echo Request" is received, the "Don't Fragment" bit is never turned on for that route; IP won't time-out or retry the ping. If an ICMP "Fragmentation Needed" message is received in response to the "Echo Request", the Path MTU is reduced accordingly, and a new "Echo Request" is sent out using the updated Path MTU. This step repeats as needed.

o If a response to the "Echo Request" is received, the "Don't Fragment" bit is turned on for all further packets for the destination, and Path MTU discovery proceeds as for Strategy 1.

Assuming that all routers properly implement Path MTU Discovery, Strategy 1 is generally better - there is no extra overhead for the ICMP "Echo Request" and response. Strategy 2 is available only because some routers, or firewalls, or end hosts have been observed simply to drop packets that have the DF bit on without issuing the "Fragmentation Needed" message. Strategy 2 is more conservative in that IP will never fail to communicate when using it. [0,2] Default: Strategy 2

# ndd -h tcp_recv_hiwater_def | more
tcp_recv_hiwater_def:
The maximum size for the receive window. [4096,-] Default: 32768 bytes

# ndd -h tcp_xmit_hiwater_def
tcp_xmit_hiwater_def:
The amount of unsent data that triggers write-side flow control. [4096,-] Default: 32768 bytes
HP has detailed networking performance information online, including information about the “netperf” tool and a large database of system performance results obtained with netperf:
Procedure for raising network limits on IBM RS/6000 systems under AIX 3.2 or AIX 4.1
RFC1323 options and defaults are tunable via the “no” command.
See the “no” man page for options; additional information is available in the IBM manual AIX Versions 3.2 and 4.1 Performance Tuning Guide, which is available on AIX machines through the InfoExplorer hypertext interface.
Procedure for raising network limits on IBM MVS systems under the Interlink TCP stack
The default send and receive buffer sizes are specified at startup, through a configuration file. The range is from 4K to 1MByte. The syntax is as follows:
- TCP SCALE(4) – specifies to support window scaling of 4 bits. Range is 0 (suppress both window scaling and timestamps) to 14 bits.
If SCALE is not zero, and the user bufferspace is > 65535, negotiating window scaling and timestamps will be attempted.
If SCALE is not zero, and the remote user negotiates window scaling or timestamps, we will accept those requests.
- FTP IBUF(4 20480) – would specify a receive bufferspace of 81920 bytes, and thus eligible for window scaling and timestamps.
FTP and user programs can be configured to use Window Scaling and Timestamps. This is done through the use of SITE commands:
- QUOTE SITE IBUF(num size) – specifies the input bufferspace for file transfers. When the product is larger than 65535, negotiating window scaling and timestamps will be attempted (if SCALE is not zero).
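The eligibility rule described above (the product of the two IBUF arguments must exceed 65535, and SCALE must be non-zero) can be illustrated with a few lines of Python. This is only a sketch of the arithmetic, not Interlink code; the function and argument names are hypothetical.

```python
# Illustrative sketch only; not part of the Interlink TCP stack.
WINDOW_LIMIT = 65535  # largest window that fits in TCP's 16-bit window field


def attempts_window_scaling(num, size, scale):
    """Return True if IBUF(num size) combined with TCP SCALE(scale) would
    attempt to negotiate window scaling and timestamps, per the rule above."""
    bufferspace = num * size
    return scale != 0 and bufferspace > WINDOW_LIMIT


if __name__ == "__main__":
    # The FTP IBUF(4 20480) example from the text: 4 * 20480 = 81920 bytes.
    print(4 * 20480, attempts_window_scaling(4, 20480, 4))
```

With SCALE(0) the same bufferspace would not trigger negotiation, since a SCALE of zero suppresses both options.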
Tuning TCP for Linux 2.4 and 2.6
NB: Recent versions of Linux (version 2.6.17 and later) have full autotuning with 4 MB maximum buffer sizes. Except in some rare cases, manual tuning is unlikely to substantially improve the performance of these kernels over most network paths, and is not generally recommended.
Since autotuning and large default buffer sizes were released progressively over a succession of different kernel versions, it is best to inspect and only adjust the tuning as needed. When you upgrade kernels, you may want to consider removing any local tuning.
All system parameters can be read or set by accessing special files in the /proc file system. E.g.:
If the parameter tcp_moderate_rcvbuf is present and has value 1 then autotuning is in effect. With autotuning, the receiver buffer size (and TCP window size) is dynamically updated (autotuned) for each connection. (Sender side autotuning has been present and unconditionally enabled for many years now).
The per connection memory space defaults are set with two 3 element arrays:
/proc/sys/net/ipv4/tcp_rmem - memory reserved for TCP rcv buffers
/proc/sys/net/ipv4/tcp_wmem - memory reserved for TCP snd buffers
These are arrays of three values: minimum, initial and maximum buffer size. They are used to set the bounds on autotuning and balance memory usage while under memory stress. Note that these are controls on the actual memory usage (not just TCP window size) and include memory used by the socket data structures as well as memory wasted by short packets in large buffers. The maximum values have to be larger than the BDP of the path by some suitable overhead.
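To judge whether a maximum value is large enough for a given path, it can help to compute the bandwidth-delay product (BDP) explicitly. The Python sketch below does that arithmetic; the function names are hypothetical, and the overhead factor is an arbitrary assumption standing in for the "suitable overhead" mentioned above, not a kernel-defined constant.

```python
def bdp_bytes(bandwidth_bits_per_sec, rtt_ms):
    """Bandwidth-delay product in bytes (integer arithmetic, no rounding error)."""
    return bandwidth_bits_per_sec * rtt_ms // 8000


def max_covers_path(buffer_max_bytes, bandwidth_bits_per_sec, rtt_ms, overhead=1.25):
    """True if buffer_max_bytes is at least BDP * overhead.

    The default overhead of 1.25 is an assumed placeholder; pick a value
    that suits your own traffic.
    """
    needed = bdp_bytes(bandwidth_bits_per_sec, rtt_ms) * overhead
    return buffer_max_bytes >= needed


if __name__ == "__main__":
    # Example path: 1 Gbit/s with a 30 ms round-trip time.
    print(bdp_bytes(1_000_000_000, 30))
    print(max_covers_path(4194304, 1_000_000_000, 30))
```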
With autotuning, the middle value just determines the initial buffer size. It is best to set it to some optimal value for typical small flows. With autotuning, excessively large initial buffers waste memory and can even hurt performance.
If autotuning is not present (Linux 2.4 before 2.4.27 or Linux 2.6 before 2.6.7), you may want to get a newer kernel. Alternately, you can adjust the default socket buffer size for all TCP connections by setting the middle tcp_rmem value to the calculated BDP. This is NOT recommended for kernels with autotuning. Since the sending side is autotuned, this is never recommended for tcp_wmem.
The maximum buffer size that applications can request (the maximum acceptable values for SO_SNDBUF and SO_RCVBUF arguments to the setsockopt() system call) can be limited with /proc variables:
/proc/sys/net/core/rmem_max - maximum receive window
/proc/sys/net/core/wmem_max - maximum send window
The kernel sets the actual memory limit to twice the requested value (effectively doubling rmem_max and wmem_max) to provide for sufficient memory overhead. You do not need to adjust these unless you are planning to use some form of application tuning.
NB: Manually adjusting socket buffer sizes with setsockopt() disables autotuning. Applications that are optimized for other operating systems may implicitly defeat Linux autotuning.
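For reference, this is roughly what "application tuning" looks like from the program's side. The sketch below uses Python's standard socket module to request explicit buffer sizes and then reads back what the kernel actually granted; the helper name is hypothetical. As noted above, on Linux this kind of explicit request turns off autotuning for that socket, so it is shown as an illustration rather than a recommendation.

```python
import socket


def request_buffers(sock, size):
    """Ask for send/receive buffers of `size` bytes and return what the
    kernel reports afterwards as (rcvbuf, sndbuf). The reported values may
    differ from the request (capped by system limits, or adjusted for
    kernel overhead)."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, size)
    rcvbuf = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    sndbuf = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    return rcvbuf, sndbuf


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        print(request_buffers(s, 65536))
    finally:
        s.close()
```

Comparing the returned values with the requested size is a quick way to see whether a system limit such as rmem_max or wmem_max is in the way.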
The following values (which are the defaults for 2.6.17 with more than 1 GByte of memory) would be reasonable for all paths with a 4MB BDP or smaller (you must be root):
echo 1 > /proc/sys/net/ipv4/tcp_moderate_rcvbuf
echo 108544 > /proc/sys/net/core/wmem_max
echo 108544 > /proc/sys/net/core/rmem_max
echo "4096 87380 4194304" > /proc/sys/net/ipv4/tcp_rmem
echo "4096 16384 4194304" > /proc/sys/net/ipv4/tcp_wmem
Do not adjust tcp_mem unless you know exactly what you are doing. This array (in units of pages) determines how the system balances the total network buffer space against all other LOWMEM memory usage. The three elements are initialized at boot time to appropriate fractions of the available system memory.
You do not need to adjust rmem_default or wmem_default (at least not for TCP tuning). These are the default buffer sizes for non-TCP sockets (e.g. unix domain and UDP sockets).
All standard advanced TCP features are on by default. You can check them by:
cat /proc/sys/net/ipv4/tcp_timestamps
cat /proc/sys/net/ipv4/tcp_window_scaling
cat /proc/sys/net/ipv4/tcp_sack
Linux supports both /proc and sysctl (using alternate forms of the variable names – e.g. net.core.rmem_max) for inspecting and adjusting network tuning parameters. The following is a useful shortcut for inspecting all tcp parameters:
sysctl -a | fgrep tcp
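The relationship between the two naming forms is mechanical for the variables discussed here: dots in the sysctl name correspond to directory separators under /proc/sys. A small Python sketch of that mapping, plus a parser for the three-value arrays, is shown below. The helper names are hypothetical, and the simple dot-to-slash rule is only assumed to hold for the plain names used on this page.

```python
import os


def sysctl_to_proc_path(name):
    """Map a sysctl name such as net.core.rmem_max to its /proc/sys path."""
    return "/proc/sys/" + name.replace(".", "/")


def parse_triple(text):
    """Parse a 'min initial max' array as found in tcp_rmem and tcp_wmem."""
    minimum, initial, maximum = (int(field) for field in text.split())
    return {"min": minimum, "initial": initial, "max": maximum}


def read_sysctl(name):
    """Return the raw value of a sysctl via /proc, or None if it is absent
    (for example on a non-Linux system)."""
    path = sysctl_to_proc_path(name)
    if not os.path.exists(path):
        return None
    with open(path) as handle:
        return handle.read().strip()


if __name__ == "__main__":
    raw = read_sysctl("net.ipv4.tcp_rmem")
    print(parse_triple(raw) if raw is not None else "tcp_rmem not available")
```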
For additional information on kernel variables, look at the documentation included with your kernel source, typically in some location such as /usr/src/linux-<version>/Documentation/networking/ip-sysctl.txt. There is a very good (but slightly out of date) tutorial on network sysctl’s at http://ipsysctl-tutorial.frozentux.net/ipsysctl-tutorial.html.
If you would like these changes to be preserved across reboots, you can add the tuning commands to the file /etc/rc.d/rc.local.
Autotuning was prototyped under the Web100 project. Web100 also provides complete TCP instrumentation and some additional features to improve performance on paths with very large BDP.
Contributors: John Heffner and Matt Mathis
Checked for Linux 2.6.18, 12/5/2006
Procedure for raising network limits on Macintosh OS X
Mac OS X has a single sysctl parameter, kern.ipc.maxsockbuf, to set the maximum combined buffer size for both sides of a TCP (or other) socket. In general, it can be set to at least twice the BDP. E.g.:
sysctl -w kern.ipc.maxsockbuf=8000000
The default send and receive buffer sizes can be set using the following sysctl variables:
sysctl -w net.inet.tcp.sendspace=4000000
sysctl -w net.inet.tcp.recvspace=4000000
If you would like these changes to be preserved across reboots you can edit /etc/sysctl.conf.
RFC1323 features are supported and on by default. SACK is present and enabled by default in OS X version 10.4.6.
Although we have never tested it, there is a commercial product to tune TCP on Macintoshes. The URL is http://www.sustworks.com/products/prod_ottuner.html. I don’t endorse the product they are selling (since I’ve never tried it). However, it is available for a free trial, and they appear to do an excellent job of describing perf-tune issues for Macs.
Tested for 10.3, MBM 5/15/05
Procedure for raising network limits under NetBSD
RFC1323 is on by default in NetBSD 1.1 and above. Under NetBSD 1.2, it can be verified to be on by typing:
The maximum socket buffer size can be modified by changing SB_MAX in /usr/src/sys/sys/socketvar.h.
The default socket buffer sizes can be modified by changing TCP_SENDSPACE and TCP_RECVSPACE in /usr/src/sys/netinet/tcp_usrreq.c.
It may also be necessary to increase the number of mbufs, NMBCLUSTERS in /usr/src/sys/arch/*/include/param.h.
Update: It is also possible to set these parameters in the kernel configuration file.
options SB_MAX=1048576 # maximum socket buffer size
options TCP_SENDSPACE=65536 # default send socket buffer size
options TCP_RECVSPACE=65536 # default recv socket buffer size
options NMBCLUSTERS=1024 # maximum number of mbuf clusters
Procedure for raising network limits under FTP Software (NetManage) OnNet 4.0 for Win95/98
OnNet Kernel has a check box, “Enable Satellite tuning”, which was intended and tested for a 2Mb satellite link with 600ms delay. This sets the TCP window to 146K.
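The 146K figure is consistent with the bandwidth-delay product of such a link. The Python sketch below reproduces the arithmetic under three assumptions that the text does not spell out: that "2Mb" means 2,000,000 bits per second, that 600ms is the round-trip delay, and that "K" means 1024 bytes. The function name is hypothetical.

```python
def satellite_window_bytes(link_bits_per_sec, delay_ms):
    """Bytes in flight needed to keep the link full for one delay period."""
    return link_bits_per_sec * delay_ms // 8000


if __name__ == "__main__":
    window = satellite_window_bytes(2_000_000, 600)
    # Print the window in bytes and in units of 1024 bytes.
    print(window, window / 1024)
```

Under those assumptions the result is 150000 bytes, which is a little over 146 units of 1024 bytes.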
Many default settings, all of the above and more, may be overridden with registry entries. We plan to make available tuning guidelines at “some future time”. Also, the default TCP window may be set with the Statistics app, which is installed with OnNet Kernel.
The product “readme” discusses changing TCP window size and Initial slow start threshold with the Windows registry.
Statistics also has interesting graphs of TCP/UDP/IP/ICMP traffic. Also IPtrace app is shipped with OnNet Kernel to view unicast / multicast / broadcast traffic (no unicast traffic for other hosts – it does not run in promiscuous mode).
Procedure for raising network limits on SGI systems under IRIX 6.5
Under this version, there are two locations where configuration is done. Although I list the BSD information first, SGI recommends using systune which is described below.
The BSD values are now stored in /var/sysgen/mtune/bsd.
For instance from the file:
* name default minimum maximum
*
* TCP window sizes/socket space reservation; limited to 1Gbyte by RFC 1323
*
tcp_sendspace 61440 2048 1073741824
tcp_recvspace 61440 2048 1073741824
These variables are used similarly to earlier IRIX 5 and 6 versions.
There is now a systune command. This command allows you to configure other networking variables. systune keeps track of the changes you make in a file called stune so that you can see them all in one place. Also note that changes made using systune are permanent. Here is a sample of things which can be tuned using systune:
/usr/sbin/systune (which is like sysctl for BSD) is what you use for tuneable values.
group: net_stp (statically changeable)
stp_ttl = 60 (0x3c)
stp_ipsupport = 0 (0x0)
stp_oldapi = 0 (0x0)
group: net_udp (dynamically changeable)
soreceive_alt = 1 (0x1)
arpreq_alias = 0 (0x0)
udp_recvgrams = 2 (0x2)
udp_sendspace = 61440 (0xf000)
udp_ttl = 60 (0x3c)
group: net_tcp (dynamically changeable)
tcp_gofast = 0 (0x0)
tcp_recvspace = 61440 (0xf000)
tcp_sendspace = 61440 (0xf000)
tcprexmtthresh = 3 (0x3)
tcp_2msl = 60 (0x3c)
tcp_mtudisc = 1 (0x1)
tcp_maxpersistidle = 7200 (0x1c20)
tcp_keepintvl = 75 (0x4b)
tcp_keepidle = 7200 (0x1c20)
tcp_ttl = 60 (0x3c)
group: net_rsvp (statically changeable)
ps_num_batch_pkts = 0 (0x0)
ps_rsvp_bandwidth = 50 (0x32)
ps_enabled = 1 (0x1)
group: net_mbuf (statically changeable)
mbretain = 20 (0x14)
mbmaxpages = 16383 (0x3fff)
group: net_ip (dynamically changeable)
tcpiss_md5 = 0 (0x0)
subnetsarelocal = 1 (0x1)
allow_brdaddr_srcaddr = 0 (0x0)
ipdirected_broadcast = 0 (0x0)
ipsendredirects = 1 (0x1)
ipforwarding = 1 (0x1)
ipfilterd_inactive_behavior = 1 (0x1)
icmp_dropredirects = 0 (0x0)
group: network (statically changeable)
netthread_float = 0 (0x0)
group: inpcb (statically changeable)
udp_hashtablesz = 2048 (0x800)
tcp_hashtablesz = 8184 (0x1ff8)
Changes made using systune may or may not require a reboot. This can be easily determined by looking at the ‘group’ heading for each section of tunables. If the group heading says dynamic, changes can be made on the fly. Group headings labelled static require a reboot.
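If you have saved systune output to a file, the static/dynamic distinction can be pulled out mechanically. The Python sketch below scans for the group headings in the form shown in the sample output and reports which groups need a reboot. It is a hypothetical helper, not an SGI tool, and it assumes the headings always look like "group: NAME (statically changeable)" or "group: NAME (dynamically changeable)".

```python
import re

# Matches headings such as "group: net_tcp (dynamically changeable)".
GROUP_RE = re.compile(r"group:\s+(\w+)\s+\((statically|dynamically) changeable\)")


def reboot_required(systune_output):
    """Map each tunable group name to True if changes need a reboot
    (statically changeable) or False if they apply on the fly."""
    return {
        name: kind == "statically"
        for name, kind in GROUP_RE.findall(systune_output)
    }


if __name__ == "__main__":
    sample = (
        "group: net_stp (statically changeable)\n"
        "stp_ttl = 60 (0x3c)\n"
        "group: net_tcp (dynamically changeable)\n"
        "tcp_gofast = 0 (0x0)\n"
    )
    print(reboot_required(sample))
```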
Finally, the tcp_sendspace and tcp_recvspace can be tuned on a per-interface basis using the rspace and sspace options to ifconfig.
SACK: As of 6.5.7, SACK is included in the IRIX operating system and is on by default.
Procedure for raising network limits on Sun Solaris systems
All system TCP parameters are set with the ‘ndd’ tool (man 1 ndd). Parameter values can be read with:
ndd /dev/tcp [parameter]
and set with:
ndd -set /dev/tcp [parameter] [value]
RFC1323 timestamps, window scaling and RFC2018 SACK should be enabled by default. You can double check that these are correct:
ndd /dev/tcp tcp_wscale_always #(should be 1)
ndd /dev/tcp tcp_tstamp_if_wscale #(should be 1)
ndd /dev/tcp tcp_sack_permitted #(should be 2)
Set the maximum (send or receive) TCP buffer size an application can request:
ndd -set /dev/tcp tcp_max_buf 4000000
Set the maximum congestion window:
ndd -set /dev/tcp tcp_cwnd_max 4000000
Set the default send and receive buffer sizes:
ndd -set /dev/tcp tcp_xmit_hiwat 4000000
ndd -set /dev/tcp tcp_recv_hiwat 4000000
Contributors: John Heffner (PSC), Nicolas Williams (Sun Microsystems, Inc)
Misc Info about Windows NT
Editor’s note: See Windows 98 below for a detailed description of how this all works. In NT land, the Registry Editor is called regedt32.
Any Registry Values listed appear in:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters

Receive Window
maximum value = 64kB, since window scaling is not supported
default value = min( max( 4 x MSS, 8kB rounded up to nearest multiple of MSS), 64kB)
Registry Value: TcpWindowSize

Path MTU Discovery Variables:
EnablePMTUDiscovery (default = enabled) turn on/off path MTU discovery
EnablePMTUBHDetect (default = disabled) turn on/off Black Hole detection

Using Path MTU Discovery:
EnablePMTUDiscovery REG_DWORD
Range: 0 (false) or 1 (true)
Default: 1
Determines whether TCP uses a fixed, default maximum transmission unit (MTU) or attempts to find the actual MTU. If the value of this entry is 0, TCP uses an MTU of 576 bytes for all connections to computers outside of the local subnet. If the value of this entry is 1, TCP attempts to discover the MTU (largest packet size) over the path to a remote host.

Using Path MTU Discovery's "Blackhole Detection" algorithm:
EnablePMTUBHDetect REG_DWORD
Range: 0 (false) or 1 (true)
Default: 0
If the value of this entry is 1, TCP tries to detect black hole routers while doing Path MTU Discovery. TCP will try to send segments without the Don't Fragment bit set if several retransmissions of a segment go unacknowledged. If the segment is acknowledged as a result, the MSS will be decreased and the Don't Fragment bit will be set in future packets on the connection.
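The default receive window formula above is easier to follow with numbers plugged in. The Python sketch below evaluates it for a given MSS; the function name is hypothetical, and it assumes "8kB" means 8192 bytes and "64kB" means 65536 bytes, which the text does not state precisely.

```python
EIGHT_KB = 8 * 1024        # assumed meaning of "8kB"
SIXTY_FOUR_KB = 64 * 1024  # assumed meaning of "64kB"


def nt_default_window(mss):
    """min(max(4 x MSS, 8kB rounded up to a multiple of MSS), 64kB)."""
    # Round 8kB up to the nearest multiple of the MSS (ceiling division).
    eight_kb_rounded = -(-EIGHT_KB // mss) * mss
    return min(max(4 * mss, eight_kb_rounded), SIXTY_FOUR_KB)


if __name__ == "__main__":
    # A common Ethernet MSS of 1460 bytes: 4 x MSS is 5840, and 8192
    # rounded up to a multiple of 1460 is 8760, so the larger value wins.
    print(nt_default_window(1460))
```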
I received the following additional notes about the Windows TCP implementation.
PMTU Discovery. If PMTU is turned on, NT 3.1 cannot cope with routers that have the BSD 4.2 bug (see RFC 1191, section 5). It loops resending the same packet. Only confirmed on NT 3.1.
Procedure for raising network limits under Microsoft Windows 98
New: Some folks at NLANR/MOAT in SDSC have written a tool to guide you through some of this stuff. It can be found at http://moat.nlanr.net/Software/TCPtune/.
Even newer: I’ve updated some sending window information that was inaccurate. See below.
Several folks have recently helped me to figure out how to accomplish the necessary tuning under Windows98, and the features do appear to exist and work. Thanks to everyone for the assistance! The new description below should be useful to even the complete Windows novice (such as me :-).
Windows98 includes implementation of RFC1323 and RFC2018. Both are on by default. (However, with a default buffer size of only about 8kB, window scaling doesn’t do much).
Windows stores the tuning parameters in the Windows Registry. In the registry are settings to toggle on/off Large Windows, Timestamps, and SACK. In addition, default socket buffer sizes can be specified in the registry.
In order to modify registry variables, do the following steps:
- Click on Start -> Run and then type in “regedit”. This will fire up the Registry Editor.
- In the Registry Editor, double click on the appropriate folders to walk the tree to the parameter you wish to modify. For the parameters below, this means clicking on HKEY_LOCAL_MACHINE -> System -> CurrentControlSet -> Services -> VxD -> MSTCP.
- Once there, you should see a list of parameters in the right half of your screen, and MSTCP should be highlighted in the left half. The parameters you wish to modify will probably not appear in the right half of your screen; this is OK.
- In the menu bar, Click on “Edit -> New -> String Value”. It is important to create the parameter with the correct type. All of the parameters listed below are strings.
- A box will appear with “New Value #1”; change the name to the name listed below, exactly as shown. Hit return.
- On the menu, click on “Edit -> Modify” (your new entry should still be selected). Then type in the value you wish to assign to the parameter.
- Exit the registry editor, and reboot windows. (The rebooting is important, *sigh*.)
- When your system comes back up, you should have access to the features you have just turned on. The only real way to verify this is through packet traces (or by noticing a significant performance improvement).
TCP/IP Stack Variables
Support for TCP Large Windows (TCPLW)
Win98 TCP/IP supports TCP large windows as documented in RFC 1323. TCP large windows can be used for networks that have large bandwidth delay products such as high-speed trans-continental connections or satellite links. Large windows support is controlled by a registry key value in:
The registry key Tcp1323Opts is a string value type. The values for Tcp1323Opts are
|0||No Windowscaling and Timestamp Options|
|1||Window scaling but no Timestamp options|
|3||Window scaling and Time stamp options|
The default value for Tcp1323Opts is 3: Window Scaling and Time stamp options. Large window support is enabled if an application requests a Winsock socket to use buffer sizes greater than 64K. The current default value for TCP receive window size in Memphis TCP is 8196 bytes. In previous implementations the TCP window size was limited to 64K; this limit is raised to 2**30 through the use of TCP large window support.
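To see where the 64K and 2**30 figures come from: the TCP header carries a 16-bit window field, and RFC 1323 window scaling shifts that field left by up to 14 bits. The Python sketch below computes the smallest shift needed for a requested buffer size. It illustrates the RFC 1323 arithmetic in general and is not a description of the Windows implementation; the function name is hypothetical.

```python
MAX_UNSCALED_WINDOW = 65535  # largest value of the 16-bit window field
MAX_SHIFT = 14               # largest scale factor allowed by RFC 1323


def window_scale_shift(buffer_bytes):
    """Smallest shift count whose scaled window can cover buffer_bytes."""
    for shift in range(MAX_SHIFT + 1):
        if (MAX_UNSCALED_WINDOW << shift) >= buffer_bytes:
            return shift
    raise ValueError("buffer is larger than the maximum scaled window")


if __name__ == "__main__":
    for size in (8196, 65536, 1_000_000):
        print(size, window_scale_shift(size))
    # Largest window expressible with the maximum shift (just under 2**30).
    print(MAX_UNSCALED_WINDOW << MAX_SHIFT)
```

A shift of zero means the buffer fits in the unscaled field, which is why a default window of about 8kB gains nothing from window scaling.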
Support for Selective Acknowledgements (SACK)
Win98 TCP supports Selective Acknowledgements as documented in RFC 2018. Selective acknowledgements allow TCP to recover from IP packet loss without resending packets that were already received by the receiver. Selective Acknowledgements is most useful when employed with TCP large windows. SACK support is controlled by a registry key value in:
The registry key SackOpts is a string value type. The values for SackOpts are
|0||No Sack options|
|1||Sack Option enabled|
Support for Fast Retransmission and Fast Recovery
Win98 TCP/IP supports Fast Retransmission and Fast Recovery of TCP connections that are encountering IP packet loss in the network. These mechanisms allow a TCP sender to quickly infer a single packet loss by reception of duplicate acknowledgements for a previously sent and acknowledged TCP/IP packet. This mechanism is useful when the network is intermittently congested. The reception of 3 (default value) successive duplicate acknowledgements indicates to the TCP sender that it can resend the last unacknowledged TCP/IP packet (fast retransmit) and not go into TCP slow start due to a single packet loss (fast recovery). Fast Retransmission and Recovery support is controlled by a registry key value in:
The registry key MaxDupAcks is a DWORD taking integer values from 2 to N. If MaxDupAcks is not defined, the default value is 3.
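The duplicate-acknowledgement counting behind fast retransmit can be sketched in a few lines of Python. This is a deliberately simplified model for illustration, not the Win98 implementation: it looks only at acknowledgement numbers and ignores the other conditions a real TCP checks before treating an ACK as a duplicate. The function name and the max_dup_acks parameter (named after the registry key) are hypothetical.

```python
def fast_retransmit_triggers(acks, max_dup_acks=3):
    """Return the ACK numbers at which a fast retransmit would fire.

    `acks` is the sequence of acknowledgement numbers seen by the sender.
    A retransmit fires when the same ACK has been repeated max_dup_acks
    times after its first arrival.
    """
    triggers = []
    last_ack = None
    dup_count = 0
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            if dup_count == max_dup_acks:
                triggers.append(ack)  # resend the segment starting at `ack`
        else:
            last_ack = ack
            dup_count = 0
    return triggers


if __name__ == "__main__":
    # ACK 2000 arrives once, then three more times as duplicates.
    print(fast_retransmit_triggers([1000, 2000, 2000, 2000, 2000, 3000]))
```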
Update: If you wish to set the default receiver window for applications, you should set the following key:
DefaultRcvWindow is a string type and the value describes the default receive windowsize for the TCP stack. Otherwise the windowsize has to be programmed in apps with setsockopt.
For a long time, I had the following sentence on this page:
- I presume that there is also a DefaultSndWindow what you would want to use on servers sending data to get higher performance. I have not yet verified this, however.
It turns out that there is not in fact such a variable. My limited experience has shown that, in some cases, it is possible to see very large send windows from Microsoft boxes. However, recent reports on the tcpsat mailing list have also stated that a number of applications under Windows severely limit the sending window. These applications appear to include FTP and possibly also the CIFS protocol which is used for file sharing. With these applications, it appears to be impossible to exceed the performance limit dictated by this sending window.
If anyone has any further information on these specific applications under Windows, I would be happy to include it here.
Procedure for raising network limits under Microsoft Windows 2000
New: The following URL: http://rdweb.cns.vt.edu/public/notes/win2k-tcpip.htm appears to be a pretty good summary of the procedure for TCP tuning under Windows 2000. It also has the URL for the Windows 2000 TCP tuning document from Microsoft.
We are not sure if it is still necessary to set DefaultReceiveWindow even after setting the parameters indicated in the URL above.
If your machine does a lot of large outbound transfers, it will be necessary to set DefaultSendWindow in addition to the suggestions mentioned above.
Matt Mathis <firstname.lastname@example.org> and Raghu Reddy <email@example.com>
(with help from many others, especially Jamshid Mahdavi)