What happens when you type a URL in your browser and press Enter?

These are just some generalities of how the web works, for those of you wondering: “What happens when I type https://www.google.com in the browser and hit enter?”

Diana Henao
11 min read · Jan 28, 2022
A general summary of steps that occur when you type a URL in the browser and then press enter.

Today it is difficult to find a person who has never used a service on the web. Yet despite how widespread the use of the internet is, few people know the process behind something as ordinary as a web search. This blog will try to explain in a general way what happens when you type https://www.google.com in your browser and press Enter.

Enter the address in the browser

We usually enter the URL (https://www.google.com), a unique identifier that allows us to locate the resource that we want to access. A URL, or Uniform Resource Locator, consists of several parts: the protocol used, the domain name (which corresponds to the address of the resources), the port number on which the server is listening, the path to the specific file that we want to access, and finally other optional parameters.

The protocol is a set of rules that the browser must follow to communicate effectively on the network. Next comes something important, the domain name. This domain name corresponds to the address of the computer responsible for serving the content of the website that we are trying to access, in this case, the address where the Google computers/servers are located.
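To make the parts of a URL concrete, here is a minimal sketch that splits one with Python's standard `urllib.parse` module. The helper name `describe_url` and the `/search?q=dns` path and parameter are only illustrative assumptions, not something Google requires.

```python
from urllib.parse import urlsplit, parse_qs

# Ports that browsers assume when the URL does not name one explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}

def describe_url(url):
    """Split a URL into the parts described above (hypothetical helper)."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "domain": parts.hostname,
        "port": parts.port or DEFAULT_PORTS.get(parts.scheme),
        "path": parts.path or "/",
        "parameters": parse_qs(parts.query),
    }

print(describe_url("https://www.google.com/search?q=dns"))
```

When the URL has no explicit port, as in https://www.google.com, the sketch falls back to the default port of the protocol (443 for HTTPS).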

Get the IP address of that URL

But what actually happens is more complicated. The domain names we use were created for our benefit. It's easy for us to remember an address like www.google.com, but to start the connection between the browser (client) and the server we need the IP address. An IP (Internet Protocol) address is a numerical identifier assigned to every machine/computer connected to the Internet.

This identifier tells the network exactly where another machine can be reached. The Internet Protocol (IP) is the addressing system of the Internet. At the time of writing, one of the IP addresses of www.google.com is 142.251.40.164. That is, each domain name corresponds to one or more IP addresses. This information is available thanks to the Domain Name System (DNS) servers, which store these records and are responsible for translating the web address into an IP address.
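As a rough illustration of this translation, the sketch below asks the operating system's resolver for the addresses behind a name using `socket.getaddrinfo` from the standard library. The helper name `resolve` is an assumption made for this example.

```python
import socket

def resolve(hostname, port=443):
    """Return the sorted set of IP addresses the system resolver gives for a name."""
    infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})

# An IP literal needs no DNS lookup, so this line works even offline.
print(resolve("127.0.0.1"))
```

Calling `resolve("www.google.com")` needs network access, and the addresses it returns vary by location and over time, so they will probably differ from the one quoted above.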

So, if we need to access a website, how do we get its IP address? The browser takes over that task. When the browser receives the domain name, it first looks for the IP address in its own cache, since we may have visited that same address recently. If the browser does not find it there, the cache of the operating system is checked, which also keeps a record of recently resolved names. If the IP has still not been found, the lookup continues in the router's cache and, as a last attempt, in the cache of the Internet Service Provider (ISP). Although this may seem like many steps, checking the caches in this order reduces network traffic and improves response times.
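The order of these lookups can be sketched as a loop over a list of caches. Everything here is a simplified model: the caches are plain dictionaries, and the names and addresses (example.com, 192.0.2.x and 203.0.113.x, which are reserved for documentation) are made up for the example.

```python
def lookup_in_caches(hostname, caches):
    """Check each cache in order and report where the name was found."""
    for cache_name, cache in caches:
        if hostname in cache:
            return cache_name, cache[hostname]
    # A miss everywhere means a full DNS query is needed.
    return None, None

# Hypothetical cache contents, ordered from closest to farthest from the browser.
caches = [
    ("browser", {}),
    ("operating system", {"intranet.example.com": "192.0.2.7"}),
    ("router", {}),
    ("isp", {"www.example.com": "203.0.113.5"}),
]

print(lookup_in_caches("www.example.com", caches))
print(lookup_in_caches("unknown.example.com", caches))
```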

If the IP address is not found in any of the four caches, the ISP's pre-configured DNS server initiates a DNS query. This server, known as a recursive resolver (or DNS recursor), performs a recursive lookup across multiple DNS servers. The search has several stages. First, the resolver contacts a Root Name Server. This server does not return the IP address itself. Its role is to point the resolver to other DNS servers that can help answer the query.

Next, the resolver queries the Top Level Domain (TLD) server, which handles extensions such as .com, .gov, .edu, .org… The TLD server in turn points to the name servers responsible for the second-level domain, that is, the domain names themselves: google, wikipedia, youtube… Because we are looking for www.google.com, the address also has the subdomain www, which is the third level (other examples are sales, web-01…). The name server responsible for the domain (its authoritative name server) holds the records for these subdomains, and it is this server that returns the IP address.

The IP address is returned to the resolver, which sends it back to the browser. Remember that it was the browser that initiated the lookup. Once the IP is obtained, it is stored in the browser's cache.

Finally, we can start the connection between the client and the server.
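The chain of referrals from the root server down to the server that actually knows the address can be modeled with a few dictionaries. This is only a toy simulation under stated assumptions: the server names, the domain www.example.com, and the address 192.0.2.10 are invented, and no real DNS traffic is sent.

```python
# Hypothetical data: each "server" is a dictionary of what it knows.
ROOT_SERVER = {"com": "tld-server-for-com"}
TLD_SERVERS = {"tld-server-for-com": {"example.com": "ns-for-example"}}
AUTHORITATIVE_SERVERS = {"ns-for-example": {"www.example.com": "192.0.2.10"}}

def resolve_step_by_step(hostname):
    """Follow the referrals root -> TLD -> authoritative and record each step."""
    labels = hostname.split(".")
    tld = labels[-1]                 # "com"
    domain = ".".join(labels[-2:])   # "example.com"
    steps = []

    tld_server = ROOT_SERVER[tld]    # the root only says who handles .com
    steps.append(("root", tld_server))

    name_server = TLD_SERVERS[tld_server][domain]  # the TLD says who handles the domain
    steps.append(("tld", name_server))

    ip = AUTHORITATIVE_SERVERS[name_server][hostname]  # this server knows the IP
    steps.append(("authoritative", ip))
    return ip, steps

ip, steps = resolve_step_by_step("www.example.com")
print(ip)
for server, answer in steps:
    print(server, "->", answer)
```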

Start the TCP connection with the webserver

What we are looking for is to establish a connection with the server or servers that store the page information. To do this, the browser starts a TCP connection with the server. This protocol, which belongs to the transport layer of the TCP/IP model, establishes how data is transferred from one computer to another. Without this connection no data exchange is possible.

Establishing the connection requires a process called the TCP three-way handshake. It has three main steps:

  1. First, the client sends a SYN (SYNchronize) packet to the server to ask whether it is available for a new connection. Availability depends on whether the server is listening on the requested port and can accept new connections.
  2. In the second step, the server responds to the client. If the server is available, it replies with a SYN/ACK packet, which acknowledges (ACKnowledgment) the client's SYN and carries the server's own SYN.
  3. Finally, in the third step, the client receives the SYN/ACK from the server and responds by sending an ACK packet.

Once this TCP handshake is finished, the data transmission can start!
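The handshake itself is carried out by the operating system, so application code never sends the SYN and ACK packets by hand. As a small sketch, the snippet below opens a listening socket on the local machine and connects to it; when `create_connection` returns, the three steps have already been completed by the kernel. The function name `handshake_on_loopback` is an assumption made for this example.

```python
import socket

def handshake_on_loopback():
    """Open a TCP connection to a local listener and return both views of it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))   # port 0 lets the OS pick a free port
        server.listen(1)                # the server is now available for connections
        address = server.getsockname()

        # connect() returns only after SYN, SYN/ACK and ACK have been exchanged.
        with socket.create_connection(address, timeout=2) as client:
            connection, peer = server.accept()
            with connection:
                return client.getsockname(), peer

client_address, peer_seen_by_server = handshake_on_loopback()
print(client_address, peer_seen_by_server)
```

Both values printed are the same address: the one the client used and the one the server saw, which confirms that the two ends are part of the same connection.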

Get the web page in the browser

Once we have established the connection between the client and the server, the browser sends a request, called an HTTP request, to the webserver.

HTTP (HyperText Transfer Protocol) is another well-known protocol that allows the exchange of information. It is an application layer protocol of the TCP/IP model and describes the meaning and format of the transferred messages.

It is the browser that starts the communication using HTTP requests. A request allows the client to express its intent to the server. The HTTP request consists of three main parts: the URI, the HTTP method, and the headers (some requests also carry a body with data).
The URI uniquely identifies something on the network. That is, URIs are like labels that identify the information.
In addition to that, the HTTP method (GET, POST…) tells the server what we need from it.
And finally, the HTTP headers allow negotiation between the client and the server. The server can then adjust the response according to the information contained in the headers.
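To show how the method, the URI, and the headers fit together, here is a minimal sketch that builds the text of an HTTP/1.1 request. The helper `build_http_request` and the `Accept-Language` value are assumptions for illustration; real browsers send many more headers.

```python
def build_http_request(method, host, path="/", headers=None):
    """Assemble the request line and headers of an HTTP/1.1 request."""
    lines = [
        f"{method} {path} HTTP/1.1",  # method + URI + protocol version
        f"Host: {host}",              # required header in HTTP/1.1
    ]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # Lines end with CRLF, and an empty line closes the header section.
    return "\r\n".join(lines) + "\r\n\r\n"

request = build_http_request("GET", "www.google.com", "/", {"Accept-Language": "en"})
print(request)
```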

There is another form of connection known as HTTPS, which is the secure version of the HTTP protocol. The S (for Secure) means that all communications between the client and the server are encrypted. Two protocols have been used to encrypt these communications: SSL (Secure Sockets Layer) and its successor TLS (Transport Layer Security). SSL is now deprecated, so current browsers use TLS. Both protocols rely on asymmetric cryptography and a Public Key Infrastructure (PKI).

This system uses a pair of keys, a public key and a private key. What is encrypted with one key of the pair can only be decrypted with the other. The private key is kept secret on the webserver, while the public key is shared with anyone who needs to communicate with it. To establish an HTTPS connection the server must prove that it is the one we are looking for (Google's server). It does this through the TLS or SSL certificate issued by a Certificate Authority (CA).

When the browser connects to www.google.com, it validates the server certificate against a list of trusted certificates or "Root Certificate Authorities". The browser verifies through the certificate that the server is the authorized one. Then the browser and the server use the public and private keys to agree on a shared session key, which is used to encrypt and decrypt the rest of the communication. The private key itself is never sent over the network.
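A client-side view of this validation can be sketched with Python's standard `ssl` module. `ssl.create_default_context()` loads the system's trusted root certificates and turns on certificate and hostname checks. The helper names below are assumptions for this example, and the minimum TLS version chosen is just one reasonable setting.

```python
import socket
import ssl

def make_client_context():
    """Create a TLS context that verifies the server certificate and hostname."""
    context = ssl.create_default_context()  # uses the system's trusted root CAs
    context.minimum_version = ssl.TLSVersion.TLSv1_2  # refuse older protocol versions
    return context

def negotiated_tls_version(hostname, port=443):
    """Connect over TLS and return the protocol version agreed with the server."""
    context = make_client_context()
    with socket.create_connection((hostname, port), timeout=5) as raw_socket:
        # wrap_socket performs the TLS handshake and raises an error
        # if the certificate cannot be validated for this hostname.
        with context.wrap_socket(raw_socket, server_hostname=hostname) as tls_socket:
            return tls_socket.version()

context = make_client_context()
print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
```

Calling `negotiated_tls_version("www.google.com")` requires network access, which is why the sketch only prints the verification settings by default.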

In this case, the browser starts HTTPS communication with the server by sending a GET request. The server that hosts the website receives the request and passes it to a request handler. This request handler is in charge of reading the request, reviewing it, and generating the response. The response starts with the status code, followed by the headers and, if the request was successful, the web page itself.

The status code is a numerical code that is contained in the first line of the response given by the server. This first line is known as the status line.
These codes indicate what happened to the request.

For example, if the request succeeds, the code will be 200. If instead the resource we are looking for was temporarily moved to some other address, we will receive a status code of 307. If the resource was intentionally removed, the answer will be code 410.

Status codes can be grouped according to their first digit: 1xx codes are informational, 2xx codes indicate success, 3xx codes indicate redirection, 4xx codes indicate client errors, and 5xx codes indicate server errors.
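The grouping by first digit can be sketched in a few lines. The standard `http.HTTPStatus` enum supplies the official phrase for each code; the `STATUS_CLASSES` mapping and the helper `describe_status` are assumptions introduced for this example.

```python
from http import HTTPStatus

# The five standard classes of status codes, keyed by their first digit.
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}

def describe_status(code):
    """Return the code, its standard phrase, and the class it belongs to."""
    phrase = HTTPStatus(code).phrase
    return f"{code} {phrase} ({STATUS_CLASSES[code // 100]})"

for code in (200, 307, 410):
    print(describe_status(code))
```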

The connection may also be denied by the server's firewall. A firewall is a system that provides security to a network and is capable of filtering incoming and outgoing traffic using previously established rules. This security layer prevents attackers from accessing the Google server for malicious purposes. On servers, the rules usually allow most outgoing traffic and restrict incoming traffic.

For efficiency reasons, the information is transferred in small packets. These packets are containers with control information (such as origin and destination), packet sequence information, and data (or payload). The control information makes it possible to ensure that the data is delivered properly. It also contains the elements the firewall needs to allow or deny the connection according to its set of rules. For example, a specific server might accept incoming traffic on ports 80 and 443 (HTTP and HTTPS) but deny all outgoing FTP traffic. If we do not have bad intentions, everything should work correctly: Google should allow us to access the web page, and the connection to the Google server should be successful.
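The example rule set above can be written as a small rule table. This is a simplified model, not how any particular firewall is configured: the `Rule` class and `decide` function are invented for this sketch, rules match only on direction and port, and port 21 is used for FTP because it is the standard FTP control port.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    direction: str  # "in" or "out"
    port: int
    action: str     # "allow" or "deny"

# Hypothetical rules: accept HTTP and HTTPS coming in, block FTP going out.
RULES = [
    Rule("in", 80, "allow"),
    Rule("in", 443, "allow"),
    Rule("out", 21, "deny"),
]

def decide(direction, port, rules=RULES):
    """Return the action of the first matching rule, else the default policy."""
    for rule in rules:
        if rule.direction == direction and rule.port == port:
            return rule.action
    # Default policy: restrict incoming traffic, allow outgoing traffic.
    return "deny" if direction == "in" else "allow"

print(decide("in", 443))   # HTTPS request reaching the server
print(decide("in", 22))    # no rule for this port
print(decide("out", 21))   # outgoing FTP
```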

If everything is successful and we have obtained a status code of 200, the browser receives the web page in smaller packets. These packets are reassembled in order and the page is displayed in phases. First, the HTML, the skeleton of the web page, is processed. Subsequently, the semantic HTML tags and those that require additional resources (code and other resource files) are processed. Finally, the web page is rendered in the browser and shown to the user.
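The role of the sequence information can be illustrated with a toy model: a page is cut into numbered pieces, the pieces arrive in a different order, and the sequence numbers allow the original to be rebuilt. In real connections this reordering is handled by TCP rather than by page code; the function names, the packet size, and the HTML snippet are assumptions for this sketch.

```python
import random

def split_into_packets(data, size):
    """Cut the data into (sequence_number, payload) pairs of at most `size` bytes."""
    return [
        (sequence, data[start:start + size])
        for sequence, start in enumerate(range(0, len(data), size))
    ]

def reassemble(packets):
    """Sort the packets by sequence number and join their payloads."""
    return b"".join(payload for _, payload in sorted(packets))

page = b"<html><body><h1>Hello</h1></body></html>"
packets = split_into_packets(page, size=8)

# Simulate packets arriving out of order (fixed seed so the run is repeatable).
random.Random(0).shuffle(packets)

print(reassemble(packets) == page)
```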

But the amount of information that Google manages is immense, so it needs a storage system that can scale to enormous sizes such as petabytes. To store all the needed resources, Google has implemented a solution known as Bigtable. This non-relational (NoSQL) database is designed to store huge amounts of data with low latency, which is important because data sometimes has to be displayed in real time. It is built on top of the Google File System (GFS), a distributed file system, and stores its data in files known as SSTables. This is how we, as users, can access information and data quickly and very efficiently.

As we can see, Google handles much more than static files and simple HTTP/HTTPS requests, so a web server like Nginx or Apache alone would hardly be enough. Google therefore also relies on application servers. An application server is a platform that runs various applications and lets you manage numerous services such as a database management system (DBMS), communication and Internet services, and other applications. All of this has to take into account that we are not the only ones in the world trying to reach Google's home page.

Google receives a huge number of requests per second from around the world, which puts a strain on the server infrastructure. And yet, we receive a response from Google almost immediately. How does Google manage that level of demand and maintain low latency?
Well, the Google service runs not just on one server (as it might seem from the way this article is written), but on multiple servers in multiple clusters around the world. In this way, the traffic that Google constantly receives can be distributed evenly across all servers. This allows resources to be used more efficiently and reduces the possibility of overload.

Taken from: https://k21academy.com/google-cloud/google-cloud-load-balancing/

This equitable distribution is done through what is known as a load balancer. Google uses a load balancer known as Maglev. Load balancing allows Google to receive a lot of traffic or connections simultaneously.
Additionally, this distribution avoids creating a Single Point of Failure (SPOF). If there is an attack or a problem on a server, the requests can be redirected to another server in the cluster. So, using an algorithm, the load balancer can redirect clients to other available servers. This is how we can receive an almost immediate response from Google.
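A minimal sketch of the idea, assuming a simple round-robin algorithm: requests are handed to servers in turn, and servers marked as down are skipped. This is not Maglev's actual algorithm, only one common way to distribute load; the class name and the server names are invented for the example.

```python
from itertools import cycle

class RoundRobinBalancer:
    """Hand out servers in turn, skipping the ones marked as down."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._ring = cycle(self.servers)

    def mark_down(self, server):
        # Called when a server fails or is under attack.
        self.healthy.discard(server)

    def pick(self):
        if not self.healthy:
            raise RuntimeError("no healthy servers available")
        for server in self._ring:
            if server in self.healthy:
                return server

balancer = RoundRobinBalancer(["server-a", "server-b", "server-c"])
print([balancer.pick() for _ in range(4)])

balancer.mark_down("server-b")  # requests are now redirected to the others
print([balancer.pick() for _ in range(3)])
```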

These are just some generalities of how the web works, but I hope this has been helpful to those wondering: “What happens when I type https://www.google.com in the browser and hit enter?” Thank you very much for coming here and until next time!

References:

Nigam A — FreeCodeCamp. What happens when you click on a URL in your browser: https://www.freecodecamp.org/news/what-happens-when-you-hit-url-in-your-browser/

Dubost K. HTTP — an Application-Level Protocol: https://dev.opera.com/articles/http-basic-introduction/

Dubost K. HTTP: Let’s GET It On!: https://dev.opera.com/articles/http-lets-get-it-on/

Dubost K. HTTP: Response Codes: https://dev.opera.com/articles/http-response-codes/

Evans J. Wizard zines — programming zines by Julia Evans: https://wizardzines.com/

MDN Web Docs. HTTP response status codes: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status

DigitalOcean. What is a Firewall and How Does It Work?: https://www.digitalocean.com/community/tutorials/what-is-a-firewall-and-how-does-it-work

Navarro E. What happens when you type an URL in the browser and press Enter?: https://www.linkedin.com/pulse/what-happens-when-you-type-url-browser-press-enter-navarro-mill%C3%A1n

Wijesinghe M. What happens when you type a URL in the browser and press enter?: https://medium.com/@maneesha.wijesinghe1/what-happens-when-you-type-an-url-in-the-browser-and-press-enter-bb0aa2449c1a
