TCP is one of those protocols that we usually don’t think about too much. As network engineers we are busy working with network devices like routers or switches. TCP mostly runs between hosts and servers, and it works without us giving it much thought. It establishes connections, transmits data, sends acknowledgments, and when something goes wrong, it retransmits.
TCP uses a sliding window size that indicates how much the receiver is willing to receive from the sender. Depending on the receive buffer and network conditions, this window size will increase or decrease as needed. Up to the point where the link is fully used, a larger window size means higher throughput. With a window size of 1, the receiver would send an acknowledgment for each segment that it receives, which results in a lot of overhead.
This “stop and go” mechanism of TCP works very well “out of the box” but on certain links, TCP might require some tuning. This is especially true on so-called long fat networks (LFNs).
An LFN is a network that offers a high bandwidth but also a very high delay. An example could be a satellite connection. These connections offer a high bandwidth but the delay is also quite high since you have to send your signal about 22000 miles up to the satellite and another 22000 miles down to reach the receiver. You can expect a round trip time anywhere between about 500 and 1000 ms.
The problem here is that when the sender sends some data, it has to wait a very long time for an acknowledgment from the receiver before it can send the next data. During the time we are waiting, nothing happens so we don’t utilize the full bandwidth of our link.
The throughput of TCP is limited by the round trip time of the link and the window size. We can’t change the round trip time but we can play with the window size. Take a look at the image below:
Imagine we send some data from the host to the server. While this piece of data is on its way, we have to wait a long time before it reaches the server and for the acknowledgment to come back. A lot of bandwidth is wasted. This is what happens with a large window size:
With a large window size, we can fill the entire “pipeline” with data. We don’t waste anything.
When you are using a 5 Mbit satellite link and you have a transmission rate of 1 or 2 Mbit of TCP traffic, you probably have some TCP tuning to do.
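To see where a number like that can come from, here is a minimal Python sketch of the throughput ceiling. It assumes the sender can have at most one window of data in flight per round trip. The 65535-byte window is an assumed value for illustration (the largest window TCP can advertise without window scaling), not something measured on a real link, and the function name is made up for this example.

```python
def max_tcp_throughput_bps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on throughput when one window is sent per round trip."""
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    # One full window (converted to bits) is delivered every round trip.
    return window_bytes * 8 / rtt_seconds


# Assumed values: 65535-byte window, 500 ms satellite round trip time.
ceiling = max_tcp_throughput_bps(65535, 0.5)
print(f"{ceiling / 1_000_000:.2f} Mbit/s")  # prints 1.05 Mbit/s
```

With these assumed values the ceiling is roughly 1 Mbit per second, no matter how much bandwidth the link offers.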
The optimal window size depends on the bandwidth and delay of the link. We call this the bandwidth delay product. We can calculate it with the following formula:
Bandwidth Delay Product = bandwidth (bits per sec) * round trip time (in seconds)
So for example, let’s calculate the bandwidth delay product of a 5 Mbit satellite link that has a round trip time of 500 ms:
5000000 bits per second * 0.5 seconds = bandwidth delay product 2500000 bits
So our bandwidth delay product is 2500000 bits. The window size is typically configured in bytes so 2500000 / 8 would be 312500 bytes.
Here are some other examples:
ADSL 2 Mbit with 50 ms round trip time:
2000000 bits per second * 0.05 seconds = bandwidth delay product 100000 bits (or 12500 bytes)
ADSL2 20 Mbit with 50 ms round trip time:
20000000 bits per second * 0.05 seconds = bandwidth delay product 1000000 bits (or 125000 bytes)
FastEthernet LAN Interface with 1 ms round trip time:
100000000 bits per second * 0.001 seconds = bandwidth delay product 100000 bits (or 12500 bytes)
Gigabit LAN Interface with 1 ms round trip time:
1000000000 bits per second * 0.001 seconds = bandwidth delay product 1000000 bits (or 125000 bytes)
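The calculations above are easy to script. Here is a small Python sketch that reproduces them. The function names and the list of links are only for illustration, and it takes the round trip time in milliseconds so that it can stick to integer math and avoid floating point rounding noise.

```python
def bdp_bits(bandwidth_bps: int, rtt_ms: int) -> int:
    """Bandwidth delay product in bits (bandwidth in bits/sec, RTT in ms)."""
    return bandwidth_bps * rtt_ms // 1000


def bdp_bytes(bandwidth_bps: int, rtt_ms: int) -> int:
    """Bandwidth delay product in bytes, rounded up to a whole byte."""
    return (bdp_bits(bandwidth_bps, rtt_ms) + 7) // 8


# (description, bandwidth in bits per second, round trip time in ms)
links = [
    ("Satellite 5 Mbit, 500 ms", 5_000_000, 500),
    ("ADSL 2 Mbit, 50 ms", 2_000_000, 50),
    ("ADSL2 20 Mbit, 50 ms", 20_000_000, 50),
    ("FastEthernet LAN, 1 ms", 100_000_000, 1),
    ("Gigabit LAN, 1 ms", 1_000_000_000, 1),
]

for name, bandwidth_bps, rtt_ms in links:
    bits = bdp_bits(bandwidth_bps, rtt_ms)
    size = bdp_bytes(bandwidth_bps, rtt_ms)
    print(f"{name}: {bits} bits ({size} bytes)")
```

Note how a slow link with a long delay can need the same window as a fast link with a short delay: the 2 Mbit ADSL line and the FastEthernet LAN end up with the same bandwidth delay product.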
Are there any downsides to increasing the TCP window size? One thing to consider is that by increasing the window size, you also need a large receive buffer, but this shouldn’t be much of a problem on any modern hardware. Also, with a larger window size you will have a lot of data “in transit” so if you have any errors on the link, there’s a lot of data to retransmit.
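To give an idea of what the receive buffer looks like from an application's point of view, here is a short Python sketch that asks the operating system for a buffer the size of our satellite bandwidth delay product. The helper name is hypothetical, and the result depends on the system: the operating system is free to adjust or cap the value you request, so treat the number you get back as what was actually granted.

```python
import socket

BDP_BYTES = 312_500  # bandwidth delay product of the satellite example


def request_receive_buffer(size_bytes: int) -> tuple[int, int]:
    """Return (default, granted) receive buffer sizes for a new TCP socket."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # Buffer size the operating system assigns by default.
        default_size = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
        # Ask for a larger buffer; the OS may adjust or cap this request.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size_bytes)
        granted_size = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return default_size, granted_size


default_size, granted_size = request_receive_buffer(BDP_BYTES)
print(f"default: {default_size} bytes, granted: {granted_size} bytes")
```

If the granted size is well below the bandwidth delay product, the system-wide buffer limits are probably the next thing to look at.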
Once you have calculated the bandwidth delay product, you should test if it works. A nice way to test this is by using iPerf. This application allows you to generate TCP traffic with different window sizes. To demonstrate this, I’ll use two hosts:
These two hosts are connected through a gigabit link, so this is a high bandwidth, low delay link. Even though the round trip time is low, we still have to use a reasonably large window size to get good performance.
A quick ping tells us the round trip time: