How Windows Is Killing Internet Download Speeds, and How to Dramatically Improve Yours


A. Microsoft is intentionally restricting Internet download speeds

Background: I have always noticed that download speeds from my server were not as fast as they 'should' have been on my laptop (given my Internet speed and the fact that other PCs were fast), but I never spent the time to find out why. That all changed when Comcast upgraded my Internet download speed to around 120Mbps. I wanted that speed on my laptop! It was time to find the root cause of the slow download speeds. After days of sleuthing and research, I finally found out why, and the result was nothing short of stunning: Microsoft is intentionally restricting Internet download speeds!

What determines/limits download speed? There are many factors that determine actual download speed, but if we assume that a client and server are both on fast links, the critical factor that will absolutely determine (and possibly limit) download speed is something called the 'TCP receive window' (or RWIN), which is set per socket connection. Additionally, there are ways to 'scale' this receive window size (how does not matter, but it explains some terminology seen below). Windows (Vista/7/8/etc) will automatically set -- and more importantly, increase -- the size of the TCP receive window for you, as needed, to maximize throughput.

Receive Window Auto-Tuning: Microsoft calls this automatic management of the receive window size 'auto-tuning'. To see the settings associated with this, open a Command Prompt and run this command:
C:\>netsh interface tcp show global
Querying active state...

TCP Global Parameters
----------------------------------------------
Receive-Side Scaling State : enabled
Chimney Offload State : automatic
NetDMA State : enabled
Direct Cache Acess (DCA) : disabled
Receive Window Auto-Tuning Level : normal
Add-On Congestion Control Provider : none
ECN Capability : disabled
RFC 1323 Timestamps : disabled
** The above autotuninglevel setting is the result of Windows Scaling heuristics
overriding any local/policy configuration on at least one profile.


So, Windows has "Auto-Tuning" set to "normal" (the "Receive Window Auto-Tuning Level" line above; that is good), but the warning at the bottom of the output (the lines starting with "**") is saying that the "Windows Scaling heuristics" have a role to play as well.

TCP Window Scaling heuristics: To query and see what the Windows heuristics settings are, run this command:
C:\>netsh interface tcp show heuristics
TCP Window Scaling heuristics Parameters
----------------------------------------------
Window Scaling heuristics : enabled
Qualifying Destination Threshold : 3
Profile type unknown : normal
Profile type public : restricted
Profile type private : normal
Profile type domain : normal
Here the profile types 'unknown/public/private/domain' refer to how your Internet connection is classified (the network location setting in Windows). All of my connections are 'public' by default, which results in the Internet connection being 'locked down' for increased security -- but because of that, heuristics will then cause TCP window scaling to be "restricted"!
What do "normal" and "restricted" even mean? To find out, ask Windows:
C:\>netsh interface tcp set global ?
...
autotuninglevel - One of the following values:
  disabled: Fix the receive window at its default
      value.
  highlyrestricted: Allow the receive window to
      grow beyond its default value, but do so
      very conservatively.
  restricted: Allow the receive window to grow
      beyond its default value, but limit such
      growth in some scenarios.

  normal: Allow the receive window to grow to
      accomodate almost all scenarios.
  experimental: Allow the receive window to grow
      to accomodate extreme scenarios.
...
And after seeing "but limit such growth in some scenarios", it all became very clear -- Windows itself, via its "TCP Window Scaling heuristics", was restricting my Internet download speeds!

B. The problem, a solution, and a BUG in Windows

The problem: For a "public" Internet connection, Windows "TCP Window Scaling" was running in "restricted" mode due to Windows "heuristics" being enabled, and overriding the "normal" scaling mode.

A quick test to see if you are running into the problem: Open one browser window and download a large file (test file) from a known fast location. Note this result as 'speed1'. Now open two browser windows and start the same download at the same time in each browser window. Note the combined speed of the two downloads as 'speed2'. If 'speed2' is faster than 'speed1', then your computer may have this problem.
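Why this test works: the receive window is set per socket, so two downloads get two windows. Here is a rough sketch in Python of what to expect. It is only an illustration, not a measurement: the function name, the 90ms RTT and the 120Mbps link are assumptions, and it uses the RWIN/RTT limit explained in section F.

```python
def expected_mbps(connections, rwin_bytes, rtt_ms, link_mbps):
    """Rough aggregate speed for N parallel downloads, capped by the link speed."""
    # Each connection is limited to RWIN / RTT (converted from bytes/s to Mbps).
    per_connection = rwin_bytes * 8 / (rtt_ms / 1000) / 1_000_000
    return min(connections * per_connection, link_mbps)


# Hypothetical numbers: 256KB window, 90 ms RTT, 120 Mbps link.
speed1 = expected_mbps(1, 262_144, 90, 120)
speed2 = expected_mbps(2, 262_144, 90, 120)
print(f"speed1 = {speed1:.1f} Mbps, speed2 = {speed2:.1f} Mbps")
```

If the window were not the bottleneck, one download would already fill the link and the second would add nothing.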

The solution: Disable "Window Scaling heuristics" to eliminate the 'restricted' mode override -- and always let "TCP window scaling" operate in "normal" mode, as it should for all home broadband connections.

How to turn Heuristics OFF: Issue the following command from a Command Prompt (run as Administrator):
C:\>netsh interface tcp set heuristics disabled
Ok.
Verification: There is no need to reboot. My download speeds improved instantly, and dramatically! The change does not seem to affect sockets that have already been opened, but (on my PC) does affect any new socket connections that are made (so exit all browser windows and open a new browser to test the change).
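If you would rather check the setting from a script, here is a minimal Python sketch. It assumes the English 'Name : value' output format shown above. The function names are my own, and heuristics_enabled() only works on a Windows machine where netsh is available.

```python
import subprocess


def parse_netsh_settings(text: str) -> dict:
    """Turn the 'Name : value' lines of netsh output into a dict."""
    settings = {}
    for line in text.splitlines():
        name, sep, value = line.partition(" : ")
        if sep and name.strip():
            settings[name.strip()] = value.strip()
    return settings


def heuristics_enabled() -> bool:
    """Run the same query as above and report whether heuristics are on (Windows only)."""
    out = subprocess.run(
        ["netsh", "interface", "tcp", "show", "heuristics"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_netsh_settings(out).get("Window Scaling heuristics") == "enabled"
```

After running the 'set heuristics disabled' command, heuristics_enabled() should return False.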

And a bug found in Windows: It is very clear that when Windows was asked to display the current "Receive Window Auto-Tuning Level" and Windows said "normal" and claimed that "is the result of Windows Scaling heuristics overriding any local/policy configuration on at least one profile" -- that Windows displayed the wrong mode, and should have instead displayed something like:
C:\>netsh interface tcp show global
Querying active state...

TCP Global Parameters
----------------------------------------------
Receive-Side Scaling State : enabled
Chimney Offload State : automatic
NetDMA State : enabled
Direct Cache Acess (DCA) : disabled
Receive Window Auto-Tuning Level : restricted (heuristic)
Add-On Congestion Control Provider : none
ECN Capability : disabled
RFC 1323 Timestamps : disabled
** The above autotuninglevel setting is the result of Windows Scaling heuristics
overriding any local/policy configuration on at least one profile.


C. The reason why most people will NEVER notice this problem

The issue only happens on fast Internet connections: In Windows "restricted" Auto-Tune mode, Windows really does not like the RWIN to go above 256KB (262,144 bytes). And if you know the RTT for the server you want to download from, that RWIN defines a maximum Mbps (RWIN divided by RTT). As long as that maximum is above your rated Internet speed, you will not even notice this problem!

The average broadband user can NOT run into this problem: In mid 2014, the 'average' U.S. broadband speed was around 30Mbps (source: http://www.netindex.com). At that speed, you would have to download from a server around 70ms RTT away to even notice the problem. Most people in the US, connecting to a server in the US, will experience RTT less than that. The problem will never be seen for most people.

Fast (low RTT) servers also hide the problem: Virtually all 'big' companies (Apple, Microsoft, etc) use cloud based servers, meaning that servers are usually only around 15ms away from you. At that RTT, to even notice this problem, you would need an Internet connection faster than 133Mbps.
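To work out the break-even point for your own connection, here is a small Python sketch. It assumes the window is capped at exactly 262,144 bytes, and the function name is my own. The figures in the text look rounded, so these results come out about 5% higher.

```python
def rtt_threshold_ms(link_mbps, rwin_bytes=262_144):
    """RTT (in ms) above which a single connection can no longer fill the link."""
    if link_mbps <= 0:
        raise ValueError("link_mbps must be positive")
    # Time needed to drain one full window at link speed, in milliseconds.
    return rwin_bytes * 8 / (link_mbps * 1_000_000) * 1000


for speed in (30, 120):
    print(f"{speed} Mbps link: limited once RTT exceeds {rtt_threshold_ms(speed):.0f} ms")
```

Any server with an RTT below that threshold will not show the problem on that link.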

A Helpful Chart: Here is a chart of RWIN/(RTT/1000), when RWIN is 256K, and millisecond RTT:
How to use this chart: Take your (rated) broadband Internet connection speed, and take the RTT for a server that you are downloading from. Is that intersection above the line, in the red (you will see this problem), or is it below the line, in the green (you will not see this problem)?

The chart in table form: Here is the same chart, but in table (raw data) form:
RTT and Mbps combinations that yield a RWIN of 256K
RTT (ms)   10   20   30   40   50   60   70   80   90  100  110  120  130  140  150  160  170  180  190  200
Mbps      200  100   67   50   40   33   29   25   22   20   18   17   15   14   13   12   12   11   11   10

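To rebuild this table for a different window size, here is a short Python sketch (the names are my own). It uses exactly 262,144 bytes. The numbers above appear to treat 256K as roughly 2,000,000 bits (for example, 2,000,000 bits / 0.010 s = 200Mbps), which is my reading of the table, so the script's values come out slightly higher.

```python
RWIN_BYTES = 262_144  # 256KB, the restricted-mode ceiling described above


def mbps_limit(rtt_ms, rwin_bytes=RWIN_BYTES):
    """Maximum single-connection throughput in Mbps: RWIN / RTT."""
    return rwin_bytes * 8 / (rtt_ms / 1000) / 1e6


def build_table(rtts_ms, rwin_bytes=RWIN_BYTES):
    """Return (RTT in ms, rounded Mbps limit) pairs for each RTT."""
    return [(rtt, round(mbps_limit(rtt, rwin_bytes))) for rtt in rtts_ms]


for rtt, mbps in build_table(range(10, 201, 10)):
    print(f"{rtt:>4} ms  {mbps:>4} Mbps")
```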
How to find the RTT for a server: Open a Command Prompt window and run "ping server-name" or "ping ip-address". The reported time is the RTT in milliseconds.
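Some servers do not answer ping. As a rough alternative (not the method used in this article), you can time a TCP connection handshake, which takes about one round trip. This is a minimal Python sketch: the function name and the default port 443 are assumptions, and the result includes DNS lookup time if you pass a host name instead of an IP address.

```python
import socket
import time


def tcp_connect_rtt_ms(host: str, port: int = 443, timeout: float = 3.0) -> float:
    """Approximate the RTT (ms) as the time needed to open a TCP connection."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        elapsed = time.perf_counter() - start
    return elapsed * 1000.0
```

For example, tcp_connect_rtt_ms("server-name") returns a float in milliseconds. Run it a few times and take the lowest value, since the first attempt is often slower.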

Only people who speed test their connection will notice: Unless they are speed testing their connection, most people will never notice a slowdown in a download.

Blame the server: Even for people that notice, it is far easier to just blame the server, assuming it is running 'a little slow'.


D. RESULTS: 5x speed improvement - East Coast to West Coast

BEFORE: TCP slow start, then around 23Mbps:
AFTER: TCP slow start, then to 120Mbps:


E. RESULTS: 9x speed improvement - East Coast to Japan

BEFORE: Around 11Mbps:
AFTER: TCP slow start, then to 100Mbps:


F. The "TCP Receive Window" explained -- and how it limits throughput

Assume that you want to download a large file from a server to your PC. One way would be for the server to send one packet to your PC, wait for acknowledgement of the packet, and then repeat this 'send/ack' until the entire large file has been sent to your PC.
If the server were right next to you on a LAN, this actually might work quite well (because the server receives your 'ack' so fast). However, when the server is potentially half way around the world, the download would slow to a crawl -- due to the TIME that it takes to send one packet and receive the acknowledgement. In TCP, the RTT (round trip time) is the time that it takes for a signal to reach the other party and come back. It is often displayed as a whole number of milliseconds (1/1000 second).

In the above scenario, there can only be one packet outstanding and unacknowledged. Clearly, the RTT from the PC to the server determines how much data can be sent. For example, if a packet can hold 1000 bytes, and the RTT is 1 second, then, at best, you can only transmit 1000 bytes per second from the server to the PC. But if the server is in the next room and the RTT is 0.001 seconds, then we can transmit 1000 bytes every 0.001 seconds, or 1,000,000 bytes per second. And so on.

How can we increase the speed of the download for a server half way around the world? By allowing multiple packets to be transmitted and unacknowledged at the same time. TCP calls the total number of bytes in the packets that can be 'in the pipe' and unacknowledged the RWIN (receive window).
So the formula for determining the maximum throughput is very simple and is just RWIN/RTT. And now it becomes obvious why the size of RWIN is so critical to throughput. The size of RWIN puts an upper limit on throughput.
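The examples above can be sketched in a few lines of Python. This is a simplified model that ignores link speed, packet loss and TCP slow start. The function name is my own, and 256 packets in flight is just an illustrative number.

```python
def throughput_bytes_per_sec(packet_bytes, packets_in_flight, rtt_seconds):
    """Upper bound on throughput: window size (RWIN) divided by RTT."""
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    window_bytes = packet_bytes * packets_in_flight  # this product plays the role of RWIN
    return window_bytes / rtt_seconds


# One 1000-byte packet outstanding at a time, server 1 second away
print(throughput_bytes_per_sec(1000, 1, 1.0))
# Same, but the server is in the next room (RTT 0.001 seconds)
print(throughput_bytes_per_sec(1000, 1, 0.001))
# 256 packets allowed 'in the pipe' to the far-away server
print(throughput_bytes_per_sec(1000, 256, 1.0))
```

The only way to speed up the far-away case without shortening the RTT is to let more bytes be outstanding, which is exactly what a larger RWIN does.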
