A Pinch of Internet Protocol and a Dash of Routing

After you type the address of your favorite search engine, duckduckgo.com, into your web browser’s address bar and hit the Return key, a complex series of events occurs.

It’s too much to cover in one blog post, but I’d like to cover part of the process of communication between your computer and DuckDuckGo’s server.

 IP Addresses

Every computer[1] on the internet is assigned an Internet Protocol address, or IP address.

IP addresses take the form 127.52.43.21: four numbers from 0 to 255, separated by periods.

The range [0, 255] includes 256 numbers. 256 is equal to 2 to the power of 8, or 2^8.

This means it takes 8 bits to represent all numbers from 0 to 255. Since we have 4 of these numbers, IP addresses are expressed in 32 bits. This means we can express 2^32 or 4,294,967,296 unique IP addresses using this scheme.
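
To make the arithmetic concrete, here is a minimal Python sketch that packs a dotted address into a single 32-bit number and back again. The function names `ip_to_int` and `int_to_ip` are my own, made up for illustration; they are not part of any standard.

```python
def ip_to_int(dotted):
    """Pack a dotted address like '184.72.106.52' into one 32-bit integer."""
    octets = [int(part) for part in dotted.split(".")]
    if len(octets) != 4 or any(not 0 <= octet <= 255 for octet in octets):
        raise ValueError("expected four numbers from 0 to 255: " + dotted)
    value = 0
    for octet in octets:
        # Shift what we have so far left by 8 bits, then add the next octet.
        value = (value << 8) | octet
    return value


def int_to_ip(value):
    """Unpack a 32-bit integer back into dotted form."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


print(2 ** 8)    # how many values fit in one 8-bit number
print(2 ** 32)   # how many addresses fit in four of them
print(ip_to_int("184.72.106.52"))
print(int_to_ip(ip_to_int("184.72.106.52")))
```

The smallest address, 0.0.0.0, maps to 0 and the largest, 255.255.255.255, maps to 2^32 - 1, which is where the 4,294,967,296 figure comes from.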

For now, let’s just say your computer is assigned one of these numbers when it decides it wants to have conversations with other computers over the internet. Let’s also assume that you already know duckduckgo.com’s IP address is 184.72.106.52. The translation from domain name to IP address is done via DNS, which is not discussed here.

So, your computer (let’s say IP address 192.86.74.212) wants to talk to DuckDuckGo at 184.72.106.52.

 The Problem

We know where we want our message to go, but we don’t know how to get there. Imagine if the internet was a myriad of belligerent devices wired together by frustrated and overworked engineers. It is. How do we get our little packet of bits to DuckDuckGo when we don’t even know where DuckDuckGo’s server is?

The world has to agree on some sort of standard… we’ve seen that we already agree on the IP address standard: the world has a way to assign identifiers to computers on the network.

But IP addresses are (at this point) totally arbitrary. There’s no organization. If I’m a computer and I join the network - I can only find out the IP addresses of the nodes I’m directly connected to. If a packet comes along addressed to one of them, I can send it right along.

What if I receive a packet addressed to an address unknown to me? I can try to send it to everyone… or I could ask all of my neighbors which nodes they’re connected to. Hopefully you can see that either option will become inefficient. The former leads to intense network flooding: everyone sending “FIND THIS COMPUTER FOR ME AND GIVE THIS TO IT” every time they receive a packet destined for an unknown location. The latter leads to substantial memory requirements: every computer must essentially keep a mental model of the entire network topology. That’s not even considering the fact that computers join and leave the internet by the second!
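
To get a feel for how wasteful the “send it to everyone” option is, here is a rough Python sketch that counts transmissions on a toy network. Everything in it is an assumption for illustration: the five-node graph, the node names, and the `flood` function are made up, and I’ve been generous by letting each node relay a packet only the first time it sees it.

```python
from collections import deque

# Hypothetical toy network: each node lists the neighbors it is wired to.
toy_network = {
    "A": ["B", "C"],
    "B": ["A", "C", "D"],
    "C": ["A", "B", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}


def flood(graph, source, target):
    """Count link transmissions when every node relays an unknown packet
    to all of its neighbors (except the one it heard the packet from)."""
    seen = {source}
    queue = deque([(source, None)])  # (node, neighbor it got the packet from)
    transmissions = 0
    while queue:
        node, came_from = queue.popleft()
        if node == target:
            continue  # the destination keeps the packet instead of relaying it
        for neighbor in graph[node]:
            if neighbor == came_from:
                continue
            transmissions += 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, node))
    return transmissions


print(flood(toy_network, "A", "E"))
```

On this toy graph the packet could have reached E in 3 hops (A to B to D to E), but flooding costs 8 transmissions. The gap only widens as the network grows.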

We require a more intelligent structure. We need to group devices or nodes into a logical hierarchy that’s easier to manage.

 Routers (more than a reset button)

Your laptop doesn’t need to know where every computer on the internet is. Instead, let’s spin up a new computer whose sole job is to talk to the internet. If you are sitting on a college campus with your laptop and you want to talk to duckduckgo.com, first you must talk to this new computer. Let’s call it a router, just for fun.

If everyone else follows suit, routers should only need to talk to other routers, and we’ve split the internet up into manageable chunks. Instead of having every single device directly accessible via the internet, routers are directly accessible and they delegate messages to their internal, managed nodes.

Our router can receive messages from internal nodes directed outbound, or messages from the internet directed inbound (or messages it should relay to other routers, but we’ll ignore that). This means it must keep track of 2 tables of state: a table from internal host IP to physical network address (your laptop’s unique hardware identifier, called a MAC address), and a table from network number to interface. Bear with me for that last part.

We haven’t actually assigned a way to numerically divide the network into a hierarchical group-system. But we have IP addresses. Why not designate a portion of the IP address as the Network portion, and another portion of the IP address as the Host portion? This way, routers can talk to each other about the specific Network the message they’re handling is looking for. Once the message reaches the appropriate router, that router can delegate the message to the correct internal node.

As stated, a given router must keep track of the IP addresses of the physical devices under its hegemony. If it receives a message destined for a node on its network, it will forward it appropriately, so it must know how to identify the actual device behind the IP address.
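
As a sketch of that first table, the internal one can be thought of as a plain lookup from host IP to MAC address. The MAC addresses, the second host, and the `deliver_locally` function below are all invented for illustration; only the laptop’s IP comes from this post.

```python
# Hypothetical internal table: host IP -> MAC address.
internal_table = {
    "192.86.74.212": "a4:5e:60:c1:2b:9f",  # the laptop from this post; MAC is made up
    "192.86.70.15": "3c:07:54:1a:88:02",   # a made-up second host on the same network
}


def deliver_locally(dest_ip, table):
    """Return the MAC address to hand the packet to, or None if the
    router does not know any internal device with that IP."""
    return table.get(dest_ip)


print(deliver_locally("192.86.74.212", internal_table))
print(deliver_locally("192.86.99.1", internal_table))  # unknown host
```

A real router has to learn and refresh these entries as devices come and go; this sketch just assumes the table is already filled in.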

How do we split up the IP address? Well, IP addresses are 32 binary digits long. Let’s say the world agreed that the first 18 bits identified the Network portion and the last 14 bits identified the Host portion[2].

Now, when we send our packet to duckduckgo.com at 184.72.106.52 the first 18 bits represent the network the router will look for. The last 14 bits represent the host DuckDuckGo’s router will use to find the server I want to talk to.
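
Here is a small Python sketch of that split, assuming the 18/14 division used in this post. The names `split_address` and `network_address` are my own, chosen for illustration.

```python
NETWORK_BITS = 18  # the split assumed in this post


def split_address(dotted, network_bits=NETWORK_BITS):
    """Return (network number, host number) for a dotted IP address."""
    value = 0
    for part in dotted.split("."):
        value = (value << 8) | int(part)
    host_bits = 32 - network_bits
    network = value >> host_bits            # keep only the leading bits
    host = value & ((1 << host_bits) - 1)   # keep only the trailing bits
    return network, host


def network_address(dotted, network_bits=NETWORK_BITS):
    """Zero out the host bits and print what remains in dotted form."""
    network, _ = split_address(dotted, network_bits)
    base = network << (32 - network_bits)
    return ".".join(str((base >> shift) & 0xFF) for shift in (24, 16, 8, 0))


print(split_address("184.72.106.52"))    # DuckDuckGo's server
print(network_address("184.72.106.52"))
print(network_address("192.86.74.212"))  # my laptop
```

With an 18-bit network portion, DuckDuckGo’s address falls in network 184.72.64.0 and my laptop’s address falls in network 192.86.64.0. The two share no network, so the packet has to leave my router and travel between routers.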

Let’s say my school’s router is called Router2, that Router2’s IP is 192.86.64.0, and that my laptop’s IP is 192.86.74.212. Router2’s internal routing table might look something like:

[Image: Router2’s internal routing table]

And its external table might look something like:

[Image: Router2’s external routing table]

As you can see, DuckDuckGo’s Network number isn’t in Router2’s external table. So it doesn’t know exactly who to contact to find the pathway to DuckDuckGo. But we have a default entry, so we will forward the message to the router on our 3rd interface (which is just a network cable; this router has 3 cables).
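
The lookup with a default entry can be sketched in a few lines of Python. The two network entries in `external_table` are invented for illustration and are not the real contents of Router2’s table; the only details taken from this post are the 18-bit network portion and the default pointing at interface 3.

```python
NETWORK_BITS = 18        # the split assumed in this post
DEFAULT_INTERFACE = 3    # the default entry described above

# Hypothetical external table: network address -> outgoing interface.
external_table = {
    "10.20.64.0": 1,
    "172.31.128.0": 2,
}


def network_of(dotted, network_bits=NETWORK_BITS):
    """Zero out the host bits of a dotted IP address."""
    value = 0
    for part in dotted.split("."):
        value = (value << 8) | int(part)
    host_bits = 32 - network_bits
    base = (value >> host_bits) << host_bits
    return ".".join(str((base >> shift) & 0xFF) for shift in (24, 16, 8, 0))


def pick_interface(dest_ip):
    """Use the matching table entry, or fall back to the default interface."""
    return external_table.get(network_of(dest_ip), DEFAULT_INTERFACE)


print(pick_interface("10.20.70.9"))      # known network
print(pick_interface("184.72.106.52"))   # DuckDuckGo: not in the table
```

DuckDuckGo’s network (184.72.64.0) has no entry, so the packet goes out interface 3 and becomes the next router’s problem.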

The Internet Protocol is a “best effort” protocol - Router2 doesn’t know if the router on interface 3 will ever find DuckDuckGo - it just sends it off and hopes that it will.

 The High-Speed Rabbit Hole

There was a lot here, and it only skims the surface. Unfortunately I’ve glossed over and even distorted many a detail here. I attempted to make the basic concepts digestible - IP and routing are nuanced and deserving of many posts.

But in conclusion: my laptop sends a message to the router (a “vertical” message, going up to the second level of the hierarchy) managing my school or Internet Service Provider’s network. That router communicates with other routers (a “horizontal” message, between the second level) to find the appropriate network to send this message to. Once it arrives at the appropriate router, that router delegates to the appropriate host (a “vertical” message, down to the physical device). DuckDuckGo’s network receives it, then delegates it off to the server I was planning on talking to (184.72.106.52).

[1]: This isn’t really true due to Network Address Translation (NAT) devices.
[2]: This also isn’t really true. But it gets the point across here.

 