Monday, March 28, 2011

HTTPS and Keep-Alive Connections

As we explore network performance on the “real-world web”, one bad pattern in particular keeps recurring, and it’s not something that our many IE9 Networking Performance Improvements alone will resolve.

The bad pattern is the use of Connection: close semantics for HTTPS connections. In this bad pattern, a website allows only a single request and response on every HTTPS connection before closing the connection.

Defeating HTTP/1.1’s default Keep-Alive behavior is a bad practice for regular HTTP connections but it’s far worse for HTTPS connections because the initial setup costs of a HTTPS connection are far higher than a regular HTTP connection. Not only does the browser pay the performance penalty of setting up a new TCP/IP connection, including the handshake and initial congestion window sizing, but the request’s progress is also penalized by the time required to complete the HTTPS handshake used to secure the connection.

To do all of that work and then only use it for one HTTP request and response is a terrible waste of resources-- it's like paving a highway, allowing a single car to drive down it, and then dynamiting the road after it passes. You then need to pave a new highway for the next car, only to subsequently blow it up, and so on. This bad pattern can dramatically slow down the loading of the page and increase the load on your server.

Browsers have no choice but to close the connection when directed by the server; it would be a violation of the standard (and almost certainly wouldn’t work) to try to ignore the server’s directive to close the connection.

A while ago, I saw one site that was particularly bad in this regard—it was a shopping site that showed many product thumbnails on every page; the page included 200 thumbnail images, each delivered over HTTPS, and each from a server in Asia that closed the connection after every single response. While browsers work their hardest to load this page, performing multiple connections in parallel, each page on the site took several minutes to load. Using Fiddler to simulate exactly the same site, but allowing connection reuse, the site’s pages would load in about 15 seconds. I haven’t been back to that site since (they may not be in business any longer) but the problem can even be seen on “big” sites used by millions of people every day.