HTTP Performance Optimizations
Performance optimizations center on reducing latency. Here are several ways to do that.
Connection Reuse
One way to reduce latency is to reduce the number of connections needed to fetch a page's content. Opening a connection requires multiple round trips across the internet, and older versions of HTTP required a new connection for every request. As the number of requests needed to load an average web page grew (the average is currently more than 100), it made sense to implement some way of persisting and reusing a connection.
Newer versions of HTTP (1.1 and above) accomplish this by supporting persistent connections on both ends, which significantly reduces the number of required round trips and improves performance accordingly.
Some servers disable persistent connections by default but allow them to be enabled, typically through server configuration or the Keep-Alive header. Enabling them can significantly improve performance.
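As a hedged illustration (assuming a Node.js environment and its built-in http module; the host name and paths are placeholders), the sketch below enables a keep-alive agent so that sequential requests reuse one TCP connection instead of opening a new one each time:

```ts
import * as http from "node:http";

// A keep-alive agent holds sockets open so sequential requests
// can reuse the same TCP connection instead of opening new ones.
const agent = new http.Agent({ keepAlive: true, maxSockets: 1 });

function get(path: string): Promise<number> {
  return new Promise((resolve, reject) => {
    const req = http.get(
      { host: "example.com", path, agent }, // example.com is a placeholder
      (res) => {
        res.resume(); // drain the body so the socket is released for reuse
        res.on("end", () => resolve(res.statusCode ?? 0));
      }
    );
    req.on("error", reject);
  });
}

// Both requests should travel over the same persistent connection.
async function main(): Promise<void> {
  console.log(await get("/"));
  console.log(await get("/about"));
  agent.destroy(); // close the pooled socket when finished
}

main();
```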
Server Redundancy
Servers can reduce latency by replicating content across servers in different parts of the world. This reduces the number of hops a request must make: a user in New York might be served from a server in the eastern United States, while a user in Australia navigating to the same web page might get it from a server in Australia.
Browser Optimizations
Browsers can reduce actual and perceived latency in several ways. These optimizations fall roughly into two types: document-aware optimizations and speculative optimizations.
An example of a document-aware optimization is progressive rendering: a user might see a page load with large images greyed out or at low resolution, and can scroll around the page and read text while the images finish loading. The browser prioritizes certain types of content over others, allowing a page to render in steps. This reduces the perceived latency of the page, since users are generally more willing to read text while images load than to wait for everything to arrive before seeing anything.
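Developers sometimes mimic this behavior in their own pages with a low-quality-image-placeholder technique. The following is a minimal browser-side sketch, not a description of any browser's internals; the function name and file names are hypothetical:

```ts
// Show a tiny, blurry placeholder immediately, then swap in the
// full-resolution image once it has finished downloading.
function progressiveImage(placeholderSrc: string, fullSrc: string): HTMLImageElement {
  const img = document.createElement("img");
  img.src = placeholderSrc;           // placeholder loads quickly
  const full = new Image();
  full.src = fullSrc;                 // full image downloads in the background
  full.addEventListener("load", () => {
    img.src = full.src;               // replace once the big file is ready
  });
  return img;
}

// Usage: file names here are placeholders.
document.body.appendChild(progressiveImage("hero-32px.jpg", "hero-2048px.jpg"));
```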
Speculative optimizations involve evaluating user behavior over time and attempting to anticipate what a user will want to do next. For example, when a user visits a page, the browser might perform DNS lookups ahead of time for pages that the user typically visits after the current one. Or, the browser might open a TCP connection to a page when a user hovers over a link to it, dropping the connection if the user moves off the link.
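Developers can trigger similar speculation themselves. The browser-side sketch below (an illustrative example, not how any particular browser implements this internally) inserts a preconnect hint when the user hovers over a link, prompting the browser to set up the connection early:

```ts
// Insert a <link rel="preconnect"> hint when the user hovers over a link,
// asking the browser to open the TCP (and TLS) connection ahead of time.
document.querySelectorAll<HTMLAnchorElement>("a[href^='http']").forEach((anchor) => {
  anchor.addEventListener(
    "mouseenter",
    () => {
      const hint = document.createElement("link");
      hint.rel = "preconnect";
      hint.href = new URL(anchor.href).origin; // only the origin matters here
      document.head.appendChild(hint);
    },
    { once: true } // one hint per link is enough
  );
});
```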
Developer Techniques
Developers can minimize the number of resources required by keeping optimal use of resources in mind while designing a website. (If the page doesn’t need it, don’t request it!) For example, developers often copy the same <head> element from one page to another as they are developing a website. This element contains, among other things, requests for supporting files such as CSS and JavaScript files. If a page doesn’t need a particular JavaScript file, removing the request from its head section reduces overhead. It’s important to keep track of exactly what resources each web page in a site needs, and to ensure that unnecessary and redundant requests are eliminated.
It’s also possible to reduce the size of some resources. Minification removes extra whitespace and line feeds, shortens variable names, and so on. For example, jQuery, a popular JavaScript library, ships in two versions: the developer (human-readable) version at 312 KB and the minified version at 89 KB. Minified files and other text files can be compressed further by using a compression utility such as gzip to compress files before sending and decompress them after receiving.
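As a hedged sketch (assuming Node.js and its built-in zlib module; the sample text is invented), the snippet below gzip-compresses a text resource in memory and reports the size reduction:

```ts
import * as zlib from "node:zlib";

// Simulate a text resource; real savings come from repetitive
// content such as JavaScript, CSS, or HTML.
const source = Buffer.from("function add(a, b) { return a + b; }\n".repeat(1000));

const compressed = zlib.gzipSync(source);     // what the server would send
const restored = zlib.gunzipSync(compressed); // what the client recovers

console.log(`original:   ${source.length} bytes`);
console.log(`compressed: ${compressed.length} bytes`);
console.log(`round-trip intact: ${restored.equals(source)}`);
```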
There are different ways to compress images. Some produce smaller files at the cost of some detail (lossy compression), while others retain all detail (lossless compression) but produce larger files. Judicious choice of which image formats to use is important for optimizing page size.
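As one hedged example of that choice in code (this assumes the third-party sharp image library for Node.js, installed with npm install sharp; the file names and quality settings are placeholders):

```ts
import sharp from "sharp"; // third-party image library: npm install sharp

async function compressBoth(): Promise<void> {
  // Lossy: JPEG discards some detail in exchange for a much smaller file.
  await sharp("photo-original.png").jpeg({ quality: 75 }).toFile("photo-lossy.jpg");

  // Lossless: PNG keeps every pixel intact, usually at a larger size.
  await sharp("photo-original.png").png({ compressionLevel: 9 }).toFile("photo-lossless.png");
}

compressBoth();
```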
Peer-to-Peer Networking Architecture
The client-server model is the predominant model on the internet. However, networks can have a peer-to-peer (P2P) configuration as well. In a P2P configuration, every node on the network functions both as a client and a server.
This can be a good choice for networks that focus on content delivery. In a file-sharing application, users can search for a file and download it from the nearest node that has it stored on disk. As more users join the network, more copies of files become available in more places, so as the network grows it also becomes both more responsive and more reliable.
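To make the dual client/server role concrete, here is a minimal, hedged sketch (assuming Node.js; the ports, peer addresses, and file names are all hypothetical) of a node that serves the files it holds and fetches missing ones from peers:

```ts
import * as http from "node:http";

// Each peer is both a server (sharing its files) and a client
// (fetching files it lacks from other peers).
const files = new Map<string, string>([["greeting.txt", "hello from this node"]]);
const peers = ["http://localhost:9001", "http://localhost:9002"]; // hypothetical peers

// Server role: answer requests for files this node holds.
http
  .createServer((req, res) => {
    const name = (req.url ?? "/").slice(1);
    const body = files.get(name);
    if (body === undefined) {
      res.writeHead(404).end();
    } else {
      res.writeHead(200, { "Content-Type": "text/plain" }).end(body);
    }
  })
  .listen(9000);

// Client role: ask each known peer for a file until one has it.
async function fetchFromPeers(name: string): Promise<string | null> {
  for (const peer of peers) {
    try {
      const res = await fetch(`${peer}/${name}`);
      if (res.ok) {
        const body = await res.text();
        files.set(name, body); // cache it, so this node can now serve it too
        return body;
      }
    } catch {
      // peer unreachable; try the next one
    }
  }
  return null;
}

fetchFromPeers("shared-file.txt").then((body) =>
  console.log(body ?? "no peer had the file")
);
```

Note how a successful download is cached locally: every file a node fetches becomes a file it can serve, which is why the network grows more capable as it grows larger.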
Microsoft’s Windows Update Delivery Optimization uses a P2P model to deliver updates. If a user needs an update, the application will find a node that has it and download it from there, rather than relying on Microsoft’s servers. BitTorrent is a well-known P2P-based technology that lets users share large files such as movies, games, and music. Bitcoin and other cryptocurrencies use a P2P network to handle transactions.
The use of P2P architecture in streaming-video applications is increasing, as new technologies continue to be invented to reduce latency. Netflix and Hulu have been researching ways of combining P2P solutions with their content-delivery networks for several years, to reduce workloads at peak times. Zoom uses a P2P platform to provide its videoconferencing and online chat services.
In Closing …
My aim in these articles has been to give a fairly comprehensive introduction to the various interconnected logical and physical parts of the internet. I hope that I have accomplished this, and that my fellow web developers, and aspiring web developers, will find something useful in them.