I was working on a service that transmits very large files, sometimes hundreds of gigabytes, over a high-speed network capable of 3–4 GB/s (≈ 32 Gbps). At first I assumed that was plenty, but when I tested, the download speed was much slower than expected. After digging in, I suspected the TCP buffer size was the bottleneck. In this post, I’ll share how I tuned it and what I learned.
1. TCP Buffers and Window Size
When you use TCP to send or receive data, the kernel stores that data in memory buffers:
- Send buffer (SO_SNDBUF): For data waiting to be sent.
- Receive buffer (SO_RCVBUF): For data received but not yet read by the app.
TCP also advertises a window size, which is how much unacknowledged data can be “in flight.”
If your buffer/window is too small, TCP will not fully use the link speed, no matter how fast your NIC is.
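As a concrete illustration, here is a minimal Python sketch that requests larger per-socket buffers with `setsockopt`. The 4 MB figure is just an example value; the kernel may clamp the request to `net.core.rmem_max` / `wmem_max`, and on Linux it typically reports roughly double what you asked for (bookkeeping overhead).

```python
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

requested = 4 * 1024 * 1024  # 4 MB, an example value (not a recommendation)

# Caution: on Linux, explicitly setting SO_RCVBUF disables receive-buffer
# autotuning for this socket, so only do this when you have a reason to.
sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, requested)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested)

# Read back what the kernel actually granted (may be clamped or doubled).
sndbuf = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
rcvbuf = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
print(f"send buffer: {sndbuf} bytes, receive buffer: {rcvbuf} bytes")
sock.close()
```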
2. Bandwidth-Delay Product (BDP)
The buffer size you need depends on both network bandwidth and round-trip time (RTT):
BDP = Bandwidth (bytes/sec) × RTT (seconds)
This is the minimum TCP window size needed to fully use the connection.
Example:
| Bandwidth          | RTT   | Required TCP Window |
|--------------------|-------|---------------------|
| 4 GB/s (≈ 32 Gbps) | 1 ms  | 4 MB                |
| 4 GB/s (≈ 32 Gbps) | 10 ms | 40 MB               |
| 4 GB/s (≈ 32 Gbps) | 20 ms | 80 MB               |
So, even on a 1 ms RTT link, you already need 4 MB of TCP window per stream to reach full speed. Many OS defaults are far smaller (256 KB or 1 MB).
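The table values follow directly from the formula. A small Python check (using decimal units, 1 GB = 10⁹ bytes, to match the round numbers above):

```python
def bdp_bytes(bandwidth_bytes_per_sec: float, rtt_sec: float) -> float:
    """Bandwidth-delay product: bytes that must be in flight to keep the pipe full."""
    return bandwidth_bytes_per_sec * rtt_sec

bw = 4e9  # 4 GB/s, ≈ 32 Gbps
for rtt_ms in (1, 10, 20):
    window_mb = bdp_bytes(bw, rtt_ms / 1000) / 1e6
    print(f"RTT {rtt_ms:2d} ms -> required window {window_mb:.0f} MB")
# RTT  1 ms -> required window 4 MB
# RTT 10 ms -> required window 40 MB
# RTT 20 ms -> required window 80 MB
```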
3. Why Jumbo Frames Help
MTU (Maximum Transmission Unit) is the maximum packet size on the network.
- Standard Ethernet MTU: 1500 bytes
- Jumbo frame MTU: 9000 bytes
With a jumbo MTU, each packet carries 9 KB of data instead of 1.5 KB.
Benefits:
- Fewer packets per second → Less CPU load on sender and receiver.
- Lower interrupt load on NICs → Better efficiency.
- Less protocol overhead → Higher effective throughput.
Jumbo frames do not change the TCP window calculation. You still need large buffers to keep the pipe full. Jumbo MTU just reduces per-packet cost so your CPU can keep up with multi-GB/s speeds.
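To put a number on the per-packet savings, here is a rough back-of-the-envelope calculation (assuming 40 bytes of IPv4 + TCP headers per packet and ignoring Ethernet framing and TCP options):

```python
IP_TCP_HEADERS = 40  # 20-byte IPv4 header + 20-byte TCP header, no options

def packets_per_second(throughput_bytes: float, mtu: int) -> float:
    payload = mtu - IP_TCP_HEADERS  # TCP payload per packet (the MSS)
    return throughput_bytes / payload

rate = 4e9  # 4 GB/s
std = packets_per_second(rate, 1500)    # standard Ethernet MTU
jumbo = packets_per_second(rate, 9000)  # jumbo frame MTU
print(f"standard MTU: {std / 1e6:.2f} Mpps, jumbo MTU: {jumbo / 1e6:.2f} Mpps")
print(f"about {std / jumbo:.1f}x fewer packets with jumbo frames")
```

At 4 GB/s this works out to roughly 2.7 million packets per second with a 1500-byte MTU versus about 0.45 million with jumbo frames, which is why the CPU and interrupt savings are so noticeable at these speeds.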
4. Linux Tuning
On Linux, buffer size is automatically scaled, but only up to net.core.rmem_max and net.core.wmem_max. To achieve 3–4 GB/s, you often need to raise these limits.
sysctl -w net.core.rmem_max=134217728
sysctl -w net.core.wmem_max=134217728
sysctl -w net.ipv4.tcp_rmem='4096 87380 134217728'
sysctl -w net.ipv4.tcp_wmem='4096 65536 134217728'
Explanation:
- rmem_max, wmem_max: Max size of receive/send buffers (128 MB here).
- tcp_rmem, tcp_wmem: Min, default, and max sizes for TCP buffers.
- TCP window scaling (RFC 1323) must be enabled (Linux enables it by default).
You can check buffer sizes like this:
cat /proc/sys/net/core/rmem_max
cat /proc/sys/net/core/wmem_max
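If you want to inspect these programmatically, the three-value `tcp_rmem` / `tcp_wmem` format is easy to parse. A small helper (the string below is the value set by the sysctl commands above; on a live system you would read it from `/proc/sys/net/ipv4/tcp_rmem` instead):

```python
def parse_tcp_mem(raw: str) -> dict:
    """Parse the 'min default max' triple used by net.ipv4.tcp_rmem / tcp_wmem."""
    mn, default, mx = (int(v) for v in raw.split())
    return {"min": mn, "default": default, "max": mx}

settings = parse_tcp_mem("4096 87380 134217728")
print(settings)  # {'min': 4096, 'default': 87380, 'max': 134217728}
```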
5. Testing with iperf3
To verify improvements, I use iperf3:
iperf3 -c <server_ip> -P 4
-P 4 runs 4 parallel streams. If one stream can’t saturate the link because of buffer limits, multiple streams help. After tuning, one stream should be enough.
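iperf3 is the right tool for testing across a real network, but the mechanics are easy to see in a self-contained sketch: one thread receives over loopback while the main thread sends, with explicit buffer sizes on both sockets. This is only an illustration of the measurement loop (loopback won’t show buffer-limited behavior the way a real high-RTT link does), and the 4 MB / 64 MB sizes are arbitrary example values:

```python
import socket
import threading
import time

def run_receiver(server: socket.socket, total: int, result: dict) -> None:
    # Accept one connection and drain it, counting bytes received.
    conn, _ = server.accept()
    received = 0
    while received < total:
        chunk = conn.recv(1 << 20)
        if not chunk:
            break
        received += len(chunk)
    conn.close()
    result["received"] = received

BUF = 4 * 1024 * 1024      # example socket buffer size (4 MB)
TOTAL = 64 * 1024 * 1024   # example payload size (64 MB)

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, BUF)  # before listen()
server.bind(("127.0.0.1", 0))
server.listen(1)

result: dict = {}
t = threading.Thread(target=run_receiver, args=(server, TOTAL, result))
t.start()

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, BUF)
client.connect(server.getsockname())

payload = bytes(1 << 20)  # 1 MB chunks
start = time.perf_counter()
for _ in range(TOTAL // len(payload)):
    client.sendall(payload)
client.close()
t.join()
elapsed = time.perf_counter() - start
server.close()

print(f"received {result['received']} bytes in {elapsed:.2f}s "
      f"({result['received'] / elapsed / 1e9:.2f} GB/s)")
```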
Lessons Learned
- TCP window scaling and large buffers are required to fully use a high-speed link.
- BDP is the key formula: buffer = bandwidth × RTT.
- Jumbo frames reduce CPU and packet overhead, but you still need to tune TCP.
- Linux autotuning works well if you raise system max limits.
- Always measure with iperf3 and adjust step by step.
Quick Checklist for Large File Download Performance
- Measure RTT between your client and server.
- Calculate BDP: bandwidth × RTT.
- Increase rmem_max and wmem_max to at least 2× BDP.
- Enable jumbo MTU if your whole network path supports it.
- Re-test with iperf3.
