A Pinch of Internet Protocol and a Dash of Routing
After typing your favorite search engine, duckduckgo.com, into your web browser’s address bar, a complex series of events occurs once you hit the Enter key.
It’s too much to cover in one blog post, but I’d like to cover part of the process of communication between your computer and DuckDuckGo’s server.
Every computer [1] on the internet is assigned an Internet Protocol address, or IP address.
IP addresses take the form 127.52.43.21: four numbers from 0 to 255, separated by periods.
The range [0, 255] includes 256 numbers. 256 is equal to 2 to the power of 8, or 2^8.
This means it takes 8 bits to represent all numbers from 0 to 255. Since we have 4 of these numbers, IP addresses are expressed in 32 bits. This means we can express 2^32 or 4,294,967,296 unique IP addresses using this scheme.
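To make the arithmetic concrete, here is a minimal Python sketch that packs the four numbers of a dotted address into a single 32-bit integer and unpacks it again. The function names `ip_to_int` and `int_to_ip` are made up for this post; Python's standard `ipaddress` module does the same job in real code.

```python
def ip_to_int(address: str) -> int:
    """Pack a dotted-quad string such as '127.52.43.21' into one 32-bit integer."""
    octets = [int(part) for part in address.split(".")]
    if len(octets) != 4 or any(not 0 <= octet <= 255 for octet in octets):
        raise ValueError(f"not a valid IPv4 address: {address!r}")
    value = 0
    for octet in octets:
        # Make room for 8 more bits, then drop the next octet into them.
        value = (value << 8) | octet
    return value


def int_to_ip(value: int) -> str:
    """Unpack a 32-bit integer back into dotted-quad form."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


if __name__ == "__main__":
    print(ip_to_int("127.52.43.21"))         # 2134125333
    print(int_to_ip(2134125333))             # 127.52.43.21
    print(ip_to_int("255.255.255.255") + 1)  # 4294967296, i.e. 2**32
```

The largest address, 255.255.255.255, maps to 2^32 - 1, which is where the 4,294,967,296 figure comes from once you count zero.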
For now, let’s just say your computer is assigned one of these numbers when it decides it wants to have conversations with other computers over the internet. Let’s also assume that you know
duckduckgo.com’s IP address is
18.104.22.168, since the translation from domain name to IP address is done via DNS, and thus not discussed here.
So, your computer (let’s say IP Address
22.214.171.124) wants to talk to DuckDuckGo at
We know where we want our message to go, but we don’t know how to get there. Imagine if the internet was a myriad of belligerent devices wired together by frustrated and overworked engineers. It is. How do we get our little packet of bits to DuckDuckGo when we don’t even know where DuckDuckGo’s server is?
The world has to agree on some sort of standard… we’ve seen that we already agree on the IP address standard: the world has a way to assign identifiers to computers on the network.
But IP addresses are (at this point) totally arbitrary. There’s no organization. If I’m a computer and I join the network - I can only find out the IP addresses of the nodes I’m directly connected to. If a packet comes along addressed to one of them, I can send it right along.
What if I receive a packet addressed to an address unknown to me? I can try to send it to everyone… or I could ask all of my neighbors which nodes they’re connected to. Hopefully you can see that either option will become inefficient - the former leads to intense network flooding: everyone sending “FIND THIS COMPUTER FOR ME AND GIVE THIS TO IT” every time one receives a packet destined for an unknown location. The latter leads to substantial memory requirements: every computer must essentially keep a mental model of the entire network topology. That’s not even considering the fact that computers join and leave the internet by the second!
We require a more intelligent structure. We need to group devices or nodes into a logical hierarchy that’s easier to manage.
Routers (more than a reset button)
Your laptop doesn’t need to know where every computer on the internet is. Instead, let’s spin up a new computer whose sole job is to talk to the internet. If you are sitting on a college campus with your laptop and you want to talk to
duckduckgo.com, first you must talk to this new computer. Let’s call it a router, just for fun.
If everyone else follows suit, routers should only need to talk to other routers, and we’ve split the internet up into manageable chunks. Instead of having every single device directly accessible via the internet, routers are directly accessible and they delegate messages to their internal, managed nodes.
Our router can receive messages from internal nodes directed outbound, or messages from the internet directed inbound (or messages it should relay to other routers, but we’ll ignore that). This means it must keep track of 2 tables of state: a table from internal host IP to physical network address (your laptop’s unique hardware identifier, called a MAC address), and a table from network number to interface. Bear with me for that last part.
We haven’t actually assigned a way to numerically divide the network into a hierarchical group-system. But we have IP addresses. Why not designate a portion of the IP address as the Network portion, and another portion of the IP address as the Host portion? This way, routers can talk to each other about the specific Network the message they’re handling is looking for. Once the message reaches the appropriate router, that router can delegate the message to the correct internal node.
As stated, a given router must keep track of the IP addresses of the physical devices under its hegemony. If it receives a message destined for a node on its network, it will forward it appropriately, so it must know how to identify the actual device behind the IP address.
How do we split up the IP address? Well, IP addresses are 32 binary digits long. Let’s say the world agreed that the first 18 bits identified the Network portion and the last 14 bits identified the Host portion [2].
Now, when we send our packet to
126.96.36.199 the first 18 bits represent the network the router will look for. The last 14 bits represent the host DuckDuckGo’s router will use to find the server I want to talk to.
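Here is a rough Python sketch of that 18/14 split, using the 127.52.43.21 example address from earlier in the post. The constant and function names are invented for illustration, and the fixed 18-bit boundary is this post's simplification, not how real networks are carved up.

```python
from typing import Tuple

NETWORK_BITS = 18                # assumption from this post: first 18 bits name the network
HOST_BITS = 32 - NETWORK_BITS    # the remaining 14 bits name the host


def split_address(address: str) -> Tuple[int, int]:
    """Return (network number, host number) for a dotted-quad address."""
    value = 0
    for part in address.split("."):
        value = (value << 8) | int(part)
    network = value >> HOST_BITS             # keep only the top 18 bits
    host = value & ((1 << HOST_BITS) - 1)    # mask off everything but the low 14 bits
    return network, host


if __name__ == "__main__":
    network, host = split_address("127.52.43.21")
    print(f"network={network:018b} host={host:014b}")
```

Two addresses sit on the same network exactly when their network numbers match, so routers only ever need to compare the first value.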
Let’s say my school’s router is called Router2. Suppose Router2’s IP is 188.8.131.52 and my laptop’s IP is 184.108.40.206. Router2’s internal routing table might look something like:
And its external table might look something like:
As you can see, DuckDuckGo’s Network number isn’t in Router2’s external table. So it doesn’t know exactly who to contact to find the pathway to DuckDuckGo. But we have a
default entry, so we will forward the message to the router on our
3rd interface (which is just a network cable; this router has 3 cables).
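The forwarding decision described here can be sketched in a few lines of Python. Everything in this block is hypothetical: the addresses come from ranges reserved for documentation, the MAC address is made up, the table contents are invented stand-ins for Router2's real tables, and dropping a packet for an unknown internal host is my own assumption. It reuses the 18-bit network split from above.

```python
from typing import Optional, Tuple

NETWORK_BITS = 18
HOST_BITS = 32 - NETWORK_BITS


def network_of(address: str) -> int:
    """Network number: the top 18 bits of the address (this post's simplification)."""
    value = 0
    for part in address.split("."):
        value = (value << 8) | int(part)
    return value >> HOST_BITS


# Hypothetical state for a router like Router2.
ROUTER_IP = "192.0.2.1"
INTERNAL_TABLE = {                  # internal host IP -> MAC address
    "192.0.2.10": "00:00:5e:00:53:0a",
}
EXTERNAL_TABLE = {                  # network number -> outgoing interface
    network_of("198.51.100.0"): 1,
    network_of("203.0.113.0"): 2,
}
DEFAULT_INTERFACE = 3               # the "default" entry: where unknown networks go


def route(dest_ip: str) -> Tuple[str, Optional[object]]:
    """Decide what to do with a packet addressed to dest_ip."""
    if network_of(dest_ip) == network_of(ROUTER_IP):
        mac = INTERNAL_TABLE.get(dest_ip)
        # Assumption: a packet for an unknown host on our own network is dropped.
        return ("deliver", mac) if mac is not None else ("drop", None)
    # Known network: use its interface. Unknown network: fall back to the default.
    return ("forward", EXTERNAL_TABLE.get(network_of(dest_ip), DEFAULT_INTERFACE))


if __name__ == "__main__":
    for destination in ("192.0.2.10", "198.51.100.7", "127.52.43.21"):
        print(destination, route(destination))
```

Note that `route` never learns whether the packet arrived. It picks an interface and moves on, which is the whole trick that keeps each router's tables small.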
The Internet Protocol is a “best effort” protocol - Router2 doesn’t know if the router on interface 3 will ever find DuckDuckGo - it just sends it off and hopes that it will.
The High-Speed Rabbit Hole
There was a lot here, and it only skims the surface. Unfortunately I’ve glossed over and even distorted many a detail here. I attempted to make the basic concepts digestible - IP and routing are nuanced and deserving of many posts.
But in conclusion: my laptop sends a message to the router (a “vertical” message, going up to the second level of the hierarchy) managing my school or Internet Service Provider’s network. That router communicates with other routers (a “horizontal” message, between the second level) to find the appropriate network to send this message to. Once it arrives at the appropriate router, that router delegates to the appropriate host (a “vertical” message, down to the physical device). DuckDuckGo’s network receives it, then delegates it off to the server I was planning on talking to (
[1] This isn’t really true due to Network Address Translation (NAT) devices.
[2] This also isn’t really true. But it gets the point across here.