How Websites Load

The internet seems simple enough for end users. You type a URL into your browser, hit enter, and the page loads within a few seconds. If not, you’ll probably give up rather quickly and move on to a different site. However, a lot has to happen for that page to be displayed. So, what exactly goes on behind the scenes to make websites load?

Routers

The internet uses IP addresses to know where to send packets. Packets are what they sound like: bits of data put in a virtual envelope to be shipped off to some address. Both your computer and web servers need an IP address for the same reason. Your computer needs to know which server to ask for the website, and the server needs to know where to send the website back to. If servers didn’t have IP addresses, your computer wouldn’t know where to send its requests. On the other hand, if your computer didn’t have an IP address, servers would have no way of sending a reply back to you.

You may have heard that the world is currently running out of IP addresses. There are roughly 4.3 billion possible IP addresses (2^32), many of which can’t be used because they’re reserved for special purposes. While four billion sounds like a lot, it’s nothing compared to the number of internet-connected devices. Think of how many devices you have that are connected to the internet. Chances are you have at least one computer and a smartphone, and maybe also a laptop and/or tablet. Each of these needs an IP address for the internet’s routing magic to work. There are currently several times more internet-connected devices than available IP addresses. So, how does this all work?
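To make the numbers concrete, here is a small sketch that counts the address space with Python's standard ipaddress module. The list of reserved blocks is only a handful of well-known examples chosen for illustration, not the complete set of reservations.

```python
import ipaddress

# Every address is a 32-bit number, so the whole space holds 2**32 addresses.
TOTAL_ADDRESSES = ipaddress.ip_network("0.0.0.0/0").num_addresses

# A few well-known reserved blocks (illustrative, not exhaustive).
RESERVED_EXAMPLES = [
    "10.0.0.0/8",      # internal (private) networks
    "172.16.0.0/12",   # internal (private) networks
    "192.168.0.0/16",  # internal (private) networks
    "127.0.0.0/8",     # loopback ("this computer")
    "224.0.0.0/4",     # multicast
]


def count_addresses(blocks):
    """Add up how many addresses the given blocks contain."""
    return sum(ipaddress.ip_network(block).num_addresses for block in blocks)


reserved = count_addresses(RESERVED_EXAMPLES)
print(f"Total addresses:          {TOTAL_ADDRESSES:,}")
print(f"In these reserved blocks: {reserved:,}")
```

Even this short list takes roughly 300 million addresses out of circulation.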

Routers, Routers, Routers

Remember those reserved IP addresses? Some of those are reserved to be used as internal IP addresses. Internal IP addresses exist only within your network, so they can be reused on other networks. The router in your house typically gets assigned an external IP address, which does deplete the available IP addresses. However, each device connected to that router gets assigned an internal IP address. Whenever you make a request for a website, your request goes through your router, which keeps track of which device within your network made that request and where it should go. It then routes the packets between your device and the rest of the internet. This way, regardless of how many devices you have in your house, you’re only using up a single external IP address.
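The bookkeeping the router does can be sketched as a small lookup table. This is a toy model, not how any particular router is implemented: the ToyRouter class, its method names, the starting port of 40000, and the example addresses are all made up for illustration. Real routers usually tell connections apart by port number, which is the assumption used here.

```python
import ipaddress


class ToyRouter:
    """Toy model of a home router sharing one external IP address."""

    def __init__(self, external_ip):
        self.external_ip = external_ip
        self.next_port = 40000  # arbitrary starting port for this sketch
        self.table = {}         # external port -> (internal ip, internal port)

    def outbound(self, internal_ip, internal_port):
        """Remember who sent the request; return the address the internet sees."""
        if not ipaddress.ip_address(internal_ip).is_private:
            raise ValueError("expected an internal (private) address")
        external_port = self.next_port
        self.next_port += 1
        self.table[external_port] = (internal_ip, internal_port)
        return self.external_ip, external_port

    def inbound(self, external_port):
        """Look up which device a reply arriving on this port belongs to."""
        return self.table[external_port]


router = ToyRouter("203.0.113.7")                # documentation-only address
laptop = router.outbound("192.168.1.10", 5000)
phone = router.outbound("192.168.1.11", 5000)
print(laptop, phone)                             # same external IP, different ports
print(router.inbound(laptop[1]))                 # reply goes back to the laptop
```

Both devices appear to the outside world under the same external address, and the table lets the router hand each reply to the right device.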

IPv6

We’re not actually running out of all IP addresses, only IPv4 addresses. IPv4 uses a relatively small 32-bit address space, which is inadequate for the future of the internet. To fix this problem, IPv6 uses a 128-bit address space, which allows for 2^128 addresses, or 2^96 times as many as IPv4. However, most of the internet still uses IPv4, and will likely continue to mainly use that for the foreseeable future.
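The difference between 32 and 128 bits is easier to appreciate with the arithmetic written out. The two sample addresses below come from ranges set aside for documentation and are only there to show that Python's ipaddress module handles both versions.

```python
import ipaddress

IPV4_BITS = 32
IPV6_BITS = 128

ipv4_total = 2 ** IPV4_BITS
ipv6_total = 2 ** IPV6_BITS
growth_factor = ipv6_total // ipv4_total  # equals 2 ** 96

# The standard library parses both kinds of address.
v4 = ipaddress.ip_address("192.0.2.1")    # documentation-only IPv4 address
v6 = ipaddress.ip_address("2001:db8::1")  # documentation-only IPv6 address

print(f"IPv4 addresses: {ipv4_total:,}")
print(f"IPv6 addresses: {ipv6_total:.3e}")
print(f"IPv6 has 2**{growth_factor.bit_length() - 1} times as many addresses")
print(f"Versions parsed: {v4.version} and {v6.version}")
```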

DNS

In order to communicate with any server, your computer has to know which IP address belongs to that particular server. However, remembering a bunch of IP addresses really isn’t convenient, and especially with IPv6, IP addresses can be longer than a good domain name. So, how does your computer know which server belongs to “google.com”? The answer is DNS, or Domain Name System.

A common analogy for how DNS works is a phone book. Both accomplish very similar tasks. In the case of a phone book, you look up an identifier (e.g. a business name), and resolve it to a phone number. For DNS, your computer gives it a domain name, and it gets resolved to an IP address. Your computer can then contact the IP address and request the website.
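In code, the phone book lookup is a single call to the operating system's resolver. The sketch below uses Python's socket.getaddrinfo; the resolve helper is a name invented for this example.

```python
import socket


def resolve(hostname):
    """Return the sorted, de-duplicated IP addresses for a hostname."""
    results = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    # Each result is (family, type, proto, canonname, sockaddr);
    # the IP address is the first element of sockaddr.
    return sorted({sockaddr[0] for *_, sockaddr in results})
```

Calling resolve("example.com") on a machine with internet access returns whatever addresses your configured DNS server hands back, which may include both IPv4 and IPv6 entries.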

Virtual Hosts

Another cool thing that using domain names allows for is hosting multiple websites on the same server. After your computer resolves the domain name into an IP address, the request to the web server includes the domain name, so the server can tell which website is being asked for. There is no fixed limit to how many websites can live on the same server, as long as each has a unique domain name, even though the server only needs a single IP address.

This not only allows for more websites to exist than available IP addresses, but also makes running a website more affordable. Since any number of websites can share a single IP address, hosting companies can put hundreds or thousands of websites onto one server with one IP address. Since a lot of websites share the same server, the cost of running that server can be split across all of them.
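Here is a minimal sketch of how a server might pick a website based on the domain name in the request. With plain HTTP, that name travels in the Host header. The site names, page contents, and function names are hypothetical, and the parsing is simplified: it assumes a well-formed request and a Host value without a port number.

```python
# Hypothetical websites that all live on one server with one IP address.
SITES = {
    "blog.example": "Welcome to the blog",
    "shop.example": "Welcome to the shop",
}


def host_from_request(raw_request):
    """Extract the Host header value from a raw HTTP/1.1 request."""
    header_lines = raw_request.split("\r\n")[1:]  # skip the request line
    for line in header_lines:
        if line == "":  # a blank line marks the end of the headers
            break
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            return value.strip().lower()
    return None


def serve(raw_request):
    """Return the page for whichever website the request names."""
    return SITES.get(host_from_request(raw_request), "Unknown site")


print(serve("GET / HTTP/1.1\r\nHost: shop.example\r\n\r\n"))
```

Two requests sent to the same IP address get different pages purely because of the domain name they carry.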

Load Balancing

Another important thing that DNS allows for is load balancing. When websites outgrow a single web server, the next step is to split the load across multiple servers. One way to do this is by using a load balancer, which is a server that sits in front of the web servers. Each incoming request then gets passed on to one of the web servers, ensuring that the website stays up even in the event of a single web server failing. This also allows for easily scaling the site by adding more web servers during peak hours, for example. The question then becomes how to balance the load across the load balancers themselves.
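One simple way a load balancer can spread requests is round robin: hand each request to the next server in the list and skip any that are known to be down. The sketch below assumes that strategy. Real load balancers offer other strategies too, and the class, method, and server names here are invented for illustration.

```python
class RoundRobinBalancer:
    """Toy load balancer that rotates through web servers, skipping dead ones."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.down = set()
        self._next = 0

    def add_server(self, server):
        """Scale up, for example during peak hours."""
        self.servers.append(server)

    def mark_down(self, server):
        """Record that a web server has failed."""
        self.down.add(server)

    def pick(self):
        """Choose the web server that should handle the next request."""
        for _ in range(len(self.servers)):
            server = self.servers[self._next % len(self.servers)]
            self._next += 1
            if server not in self.down:
                return server
        raise RuntimeError("no healthy web servers available")


balancer = RoundRobinBalancer(["web1", "web2", "web3"])
print([balancer.pick() for _ in range(4)])
balancer.mark_down("web2")
print([balancer.pick() for _ in range(4)])  # web2 no longer appears
```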

This is also where DNS can help. A website with multiple load balancers can simply set up the DNS server to respond with multiple IP addresses. Your browser will then try them one by one until a server responds. Additionally, geo-DNS can be used, which returns different IP addresses depending on where the request for the IP address comes from. This allows users in different parts of the world to access the server closest to them, improving latency and speed.
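The "try them one by one" behavior can be sketched as a loop over the addresses DNS returned. This is a simplification of what browsers actually do, and the function name is made up. The connect parameter is only there so the logic can be exercised without a network; by default it is Python's real socket.create_connection.

```python
import socket


def connect_to_first_available(addresses, port, timeout=3.0,
                               connect=socket.create_connection):
    """Try each address in order and return the first successful connection."""
    last_error = None
    for address in addresses:
        try:
            return connect((address, port), timeout=timeout)
        except OSError as error:
            last_error = error  # remember the failure and try the next address
    raise ConnectionError(f"none of {list(addresses)} responded") from last_error
```

If the first load balancer is unreachable, the loop simply moves on to the next address, so a single failure does not take the site down for the visitor.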

Speaking Of Speed

If you haven’t changed your DNS settings, you’re most likely using the default DNS server provided by your ISP. These are often slower than the alternatives, and give your ISP an easy way to track which websites you visit. If you’re concerned about your privacy, or simply want a faster internet, you may want to change your DNS servers. One of the more popular ones for both speed and privacy is Cloudflare’s 1.1.1.1.

You can learn more about 1.1.1.1 by reading our post about 1.1.1.1 and WARP.

HTTPS

Once your browser figures out which IP address to connect to, HTTPS kicks in. That is, if the website supports HTTPS; otherwise, everything is just sent through an unencrypted HTTP connection. HTTPS provides more than just encryption, however. It also verifies the identity of the web server and provides data integrity. The former protects you from man-in-the-middle attacks, while the latter prevents the data from being modified while in transit.

If the website is loaded over HTTPS, then this is when the HTTPS handshake starts. The handshake process is when your browser verifies the server’s identity and when the browser and server decide on which TLS version and cipher suite to use. This is easier to understand graphically, so here is a good graphic from Cloudflare:

Source: https://www.cloudflare.com/learning/ssl/what-happens-in-a-tls-handshake/

Put into words, the first step of the process is the client sending its hello message to the server. The server then responds with its own hello message, which includes its SSL certificate. Your browser will then verify the certificate to ensure it’s communicating with the actual server, and not an attacker or another server impersonating the website. Lastly, your browser and server agree on a key to use for encryption, which is then used to encrypt the rest of the session.
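For a hands-on view, Python's standard ssl module performs the same handshake a browser would. In the sketch below, the function names are invented for this example; create_default_context turns on certificate and hostname verification, and wrap_socket runs the handshake. It is an illustration of the steps, not a description of how any browser implements them.

```python
import socket
import ssl


def make_client_context():
    """Build a TLS context that verifies the server's certificate and hostname."""
    return ssl.create_default_context()


def describe_tls_session(hostname, port=443, timeout=5.0):
    """Connect, run the TLS handshake, and report what was agreed on."""
    context = make_client_context()
    with socket.create_connection((hostname, port), timeout=timeout) as raw_sock:
        # wrap_socket exchanges the hello messages, checks the certificate,
        # and sets up the encryption keys. It raises an error if the
        # certificate does not match the hostname.
        with context.wrap_socket(raw_sock, server_hostname=hostname) as tls_sock:
            return {
                "tls_version": tls_sock.version(),
                "cipher_suite": tls_sock.cipher()[0],
                "certificate_subject": tls_sock.getpeercert().get("subject"),
            }
```

Calling describe_tls_session("example.com") on a machine with internet access returns the negotiated TLS version, the cipher suite, and the subject of the certificate the server presented.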

VPNs and Proxies

If you use a VPN or web proxy, then another step is added to the list. Before anything gets sent from your computer, it gets encrypted and sent to the VPN or proxy server. The VPN or proxy server then decrypts your request, and sends it on to its destination. Once the destination responds, the VPN or proxy server encrypts the information and sends it back to you. Your computer then decrypts this data and passes it to whichever application made the request. If a website is already encrypted with HTTPS, then the VPN or proxy adds another layer of encryption on top. Otherwise, the VPN or proxy is the only layer of encryption, and it only covers the trip between your computer and the VPN or proxy server.
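The layering can be sketched with a deliberately fake cipher. The toy_cipher function below just XORs bytes with a key; it is NOT real encryption and only stands in for it so the wrapping and unwrapping order is visible. The keys and function names are made up for this example.

```python
def toy_cipher(data, key):
    """XOR data with a repeating key. A stand-in for encryption, NOT secure.

    XOR undoes itself, so the same function both "encrypts" and "decrypts".
    """
    return bytes(byte ^ key[i % len(key)] for i, byte in enumerate(data))


HTTPS_KEY = b"shared-with-website"  # hypothetical key agreed with the website
VPN_KEY = b"shared-with-vpn"        # hypothetical key agreed with the VPN server


def trace_request(request):
    """Show what each party sees as an HTTPS request travels through a VPN."""
    https_layer = toy_cipher(request, HTTPS_KEY)          # browser encrypts for the website
    on_the_wire = toy_cipher(https_layer, VPN_KEY)        # VPN client adds the outer layer
    seen_by_vpn = toy_cipher(on_the_wire, VPN_KEY)        # VPN server removes the outer layer
    seen_by_website = toy_cipher(seen_by_vpn, HTTPS_KEY)  # website removes the inner layer
    return {
        "on_the_wire": on_the_wire,
        "seen_by_vpn_server": seen_by_vpn,
        "seen_by_website": seen_by_website,
    }


trace = trace_request(b"GET /account HTTP/1.1")
print(trace["seen_by_vpn_server"])  # still scrambled by the HTTPS layer
print(trace["seen_by_website"])     # the original request
```

With HTTPS in place, the VPN server removes its own layer but still only sees scrambled data. Without the HTTPS layer, the VPN server would see the request in the clear.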

Putting It All Together

  1. You enter a URL into the address bar of your browser and hit enter
  2. Your browser looks up the IP address for the domain name by sending a request to your DNS server
  3. Your browser opens a connection to the IP address returned by the previous step
  4. Your browser and the web server establish an HTTPS connection, and your browser sends its request for the page over it
  5. The page gets rendered by your browser

Five steps don’t seem like much, but this was a very high-level overview of how the internet works. In reality, there are many lower-level operations that make up each of the higher-level steps.
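The steps above, minus the rendering, can be sketched with Python's standard library. The function names are invented for this example, and the URL is assumed to include its scheme (http:// or https://). The http.client module performs the DNS lookup, the connection, and the HTTPS handshake internally when the request is sent.

```python
import http.client
from urllib.parse import urlsplit


def plan_request(url):
    """Step 1: break the URL into the pieces the later steps need."""
    parts = urlsplit(url)
    scheme = parts.scheme or "https"
    port = parts.port or (443 if scheme == "https" else 80)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return {"scheme": scheme, "host": parts.hostname, "port": port, "path": path}


def load_page(url, timeout=5.0):
    """Steps 2 to 4: resolve the name, connect, secure the connection, fetch."""
    plan = plan_request(url)
    if plan["scheme"] == "https":
        connection = http.client.HTTPSConnection(plan["host"], plan["port"], timeout=timeout)
    else:
        connection = http.client.HTTPConnection(plan["host"], plan["port"], timeout=timeout)
    try:
        # Sending the request triggers the DNS lookup, the connection,
        # and (for HTTPS) the handshake.
        connection.request("GET", plan["path"])
        response = connection.getresponse()
        return response.status, response.read()
    finally:
        connection.close()
```

Calling load_page("https://example.com/") on a machine with internet access returns the status code and the raw HTML. Turning that HTML into a visible page is step 5, which is the browser's job.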