Summary
The Internet is the backbone of the Web, the technical infrastructure that makes the Web possible. At its most basic, the Internet is a large network of computers that communicate with one another.
The history of the Internet is somewhat obscure. It began in the 1960s as a US-army-funded research project, then evolved into a public infrastructure in the 1980s with the support of many public universities and private companies. The various technologies that support the Internet have evolved over time, but the way it works hasn't changed that much: the Internet is a way to connect computers together and ensure that, whatever happens, they find a way to stay connected.
A simple network
When two computers need to communicate, you have to link them, either physically (usually with an Ethernet cable) or wirelessly (for example with Wi-Fi or Bluetooth systems). All modern computers support both kinds of connection.
Note: For the rest of this article, we will only talk about physical cables, but wireless networks work the same.
Such a network is not limited to two computers. You can connect as many computers as you wish. But it gets complicated quickly. If you're trying to connect, say, ten computers, you need 45 cables, with nine plugs per computer!
To solve this problem, each computer on a network is connected to a special tiny computer called a router. This router has only one job: like a signaler at a railway station, it makes sure that a message sent from a given computer arrives at the right destination computer. To send a message to computer B, computer A must send the message to the router, which in turn forwards the message to computer B and makes sure the message is not delivered to computer C.
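The forwarding job described above can be sketched in a few lines of Python. This is only a toy model: the `Router` class, its `connect` and `forward` methods, and the computer names are all hypothetical, and a real router works on network addresses rather than on names.

```python
class Router:
    """Toy router: remembers which computers are plugged in."""

    def __init__(self):
        # Maps a computer name to its inbox (a list of received messages).
        self.ports = {}

    def connect(self, name):
        """Plug a computer into the router and return its inbox."""
        self.ports[name] = []
        return self.ports[name]

    def forward(self, sender, destination, message):
        """Deliver the message to the destination computer only."""
        if destination not in self.ports:
            raise KeyError(f"unknown destination: {destination}")
        self.ports[destination].append((sender, message))


router = Router()
inbox_a = router.connect("A")
inbox_b = router.connect("B")
inbox_c = router.connect("C")

# Computer A sends to B through the router; C receives nothing.
router.forward("A", "B", "hello")
print(inbox_b)  # [('A', 'hello')]
print(inbox_c)  # []
```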
Once we add a router to the system, our network of 10 computers only requires 10 cables: a single plug for each computer and a router with 10 plugs.
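The cable counts above follow from simple arithmetic: without a router, every pair of computers needs its own cable, while with a router each computer needs just one. A minimal sketch (the function names are made up for this illustration):

```python
def cables_full_mesh(computers):
    """One cable per pair of computers: n * (n - 1) / 2."""
    return computers * (computers - 1) // 2


def cables_with_router(computers):
    """One cable from each computer to the router."""
    return computers


print(cables_full_mesh(10))    # 45
print(cables_with_router(10))  # 10
```

Trying larger numbers shows how quickly the first approach gets out of hand: 100 computers would need 4950 direct cables, but only 100 with a router.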
A network of networks
So far so good. But what about connecting hundreds, thousands, billions of computers? Of course a single router can't scale that far, but, if you read carefully, we said that a router is a computer like any other, so what keeps us from connecting two routers together? Nothing, so let's do that.
By connecting computers to routers, then routers to routers, we are able to scale infinitely.
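To get a feel for how a message can cross several routers, here is a rough sketch that treats the network as a graph and searches for a chain of hops between two computers. The network layout and the `find_route` helper are invented for this example, and breadth-first search is only a stand-in: real routers rely on routing tables and dedicated protocols rather than searching the whole network.

```python
from collections import deque


def find_route(links, start, goal):
    """Breadth-first search; returns the list of hops, or None if unreachable."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in links.get(node, ()):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None


# Hypothetical network: two computers and three routers chained together.
links = {
    "computer_a": ["router_1"],
    "router_1": ["computer_a", "router_2"],
    "router_2": ["router_1", "router_3", "computer_b"],
    "router_3": ["router_2"],
    "computer_b": ["router_2"],
}

route = find_route(links, "computer_a", "computer_b")
print(route)  # ['computer_a', 'router_1', 'router_2', 'computer_b']
```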
Such a network comes very close to what we call the Internet, but we're missing something. We built that network for our own purposes. There are other networks out there: your friends, your neighbors, anyone can have their own network of computers. But it's not really possible to set cables up between your house and the rest of the world, so how can you handle this? Well, there are already cables linked to your house, for example, electric power and telephone. The telephone infrastructure already connects your house with anyone in the world so it is the perfect wire we need. To connect our network to the telephone infrastructure, we need a special piece of equipment called a modem. This modem turns the information from our network into information manageable by the telephone infrastructure and vice versa.
So we are connected to the telephone infrastructure. The next step is to send the messages from our network to the network we want to reach. To do that, we will connect our network to an Internet Service Provider (ISP). An ISP is a company that manages some special routers that are all linked together and can also access other ISPs' routers. So the message from our network is carried through the network of ISP networks to the destination network. The Internet consists of this whole infrastructure of networks.
Finding computers
If you want to send a message to a computer, you have to specify which one. Thus any computer linked to a network has a unique address that identifies it, called an "IP address" (where IP stands for Internet Protocol). It's an address made of a series of four numbers separated by dots, for example: 192.168.2.10.
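If you are curious, Python's standard `ipaddress` module can take such an address apart. The sketch below covers the four-number (IPv4) form described here; the `is_valid_ipv4` helper is a name chosen for this example.

```python
import ipaddress


def is_valid_ipv4(text):
    """Return True if text is a well-formed IPv4 address."""
    try:
        return ipaddress.ip_address(text).version == 4
    except ValueError:
        return False


address = ipaddress.ip_address("192.168.2.10")
octets = [int(part) for part in str(address).split(".")]

print(octets)                           # [192, 168, 2, 10]
print(int(address))                     # 3232236042 (the same address as one number)
print(is_valid_ipv4("192.168.2.300"))   # False, each number must be between 0 and 255
```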
That's perfectly fine for computers, but we human beings have a hard time remembering that sort of address. To make things easier, we can alias an IP address with a human readable name called a domain name. For example (at the time of writing; IP addresses can change) google.com is the domain name used on top of the IP address 173.194.121.32. So using the domain name is the easiest way for us to reach a computer over the Internet.
Internet and the web
As you might notice, when we browse the Web with a Web browser, we usually use the domain name to reach a website. Does that mean the Internet and the Web are the same thing? It's not that simple. As we saw, the Internet is a technical infrastructure which allows billions of computers to be connected all together. Among those computers, some computers (called Web servers) can send messages intelligible to web browsers. The Internet is an infrastructure, whereas the Web is a service built on top of the infrastructure. It is worth noting there are several other services built on top of the Internet, such as email and IRC.
How the web works provides a simplified view of what happens when you view a webpage in a web browser on your computer or phone.
This theory is not essential to writing web code in the short term, but before long you'll really start to benefit from understanding what's happening in the background.
Clients and servers
Computers connected to the web are called clients and servers. A simplified diagram of how they interact might look like this:
- Clients are the typical web user's internet-connected devices (for example, your computer connected to your Wi-Fi, or your phone connected to your mobile network) and web-accessing software available on those devices (usually a web browser like Firefox or Chrome).
- Servers are computers that store webpages, sites, or apps. When a client device wants to access a webpage, a copy of the webpage is downloaded from the server onto the client machine to be displayed in the user's web browser.
The other parts of the toolbox
The client and server we've described above don't tell the whole story. There are many other parts involved, and we'll describe them below.
For now, let's imagine that the web is a road. On one end of the road is the client, which is like your house. On the other end of the road is the server, which is a shop you want to buy something from.
In addition to the client and the server, we also need to say hello to:
- Your internet connection: Allows you to send and receive data on the web. It's basically like the street between your house and the shop.
- TCP/IP: Transmission Control Protocol and Internet Protocol are communication protocols that define how data should travel across the internet. This is like the transport mechanisms that let you place an order, go to the shop, and buy your goods. In our example, this is like a car or a bike (or however else you might get around).
- DNS: The Domain Name System is like an address book for websites. When you type a web address in your browser, the browser looks it up in the DNS to find the website's real address before it can retrieve the website. The browser needs to find out which server the website lives on, so it can send HTTP messages to the right place (see below). This is like looking up the address of the shop so you can access it.
- HTTP: Hypertext Transfer Protocol is an application protocol that defines a language for clients and servers to speak to each other. This is like the language you use to order your goods.
- Component files: A website is made up of many different files, which are like the different parts of the goods you buy from the shop. These files come in two main types:
- Code files: Websites are built primarily from HTML, CSS, and JavaScript, though you'll meet other technologies a bit later.
- Assets: This is a collective name for all the other stuff that makes up a website, such as images, music, video, Word documents, and PDFs.
So what happens, exactly?
When you type a web address into your browser (for our analogy that's like walking to the shop):
- The browser goes to the DNS server, and finds the real address of the server that the website lives on (you find the address of the shop).
- The browser sends an HTTP request message to the server, asking it to send a copy of the website to the client (you go to the shop and order your goods). This message, and all other data sent between the client and the server, is sent across your internet connection using TCP/IP.
- If the server approves the client's request, the server sends the client a "200 OK" message, which means "Of course you can look at that website! Here it is", and then starts sending the website's files to the browser as a series of small chunks called data packets (the shop gives you your goods, and you bring them back to your house).
- The browser assembles the small chunks into a complete website and displays it to you (the goods arrive at your door — new shiny stuff, awesome!).
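To make the request and "200 OK" steps above a little more concrete, the sketch below builds the text of a minimal HTTP request and reads the status out of a response. It never touches the network: the `build_request` and `parse_status_line` helpers, the `example.org` host, and the sample response are all made up for illustration, and real browsers send many more headers.

```python
def build_request(host, path="/"):
    """Return the text of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",  # an empty line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines)


def parse_status_line(response_text):
    """Extract (status code, reason phrase) from the first response line."""
    status_line = response_text.split("\r\n", 1)[0]
    _version, code, reason = status_line.split(" ", 2)
    return int(code), reason


request = build_request("example.org")
print(request)

# A hand-written response of the kind a server might send back.
response = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<h1>Hello</h1>"
print(parse_status_line(response))  # (200, 'OK')
```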
DNS explained
Real web addresses aren't the nice, memorable strings you type into your address bar to find your favorite websites. They are special numbers that look like this: 63.245.215.20.
This is called an IP address, and it represents a unique location on the web. However, it's not very easy to remember, is it? That's why the Domain Name System (DNS) was invented. It relies on special servers that match up a web address you type into your browser (like "mozilla.org") to the website's real (IP) address.
Websites can be reached directly via their IP addresses. You can find the IP address of a website by typing its domain into a tool like IP Checker.
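You can also perform this lookup from a program. Here is a minimal sketch using Python's standard `socket` module, which asks your operating system's resolver (and, through it, DNS) for an address. The `lookup_ip` wrapper is a name chosen for this example, and the answer you get for a real domain depends on when and where you ask.

```python
import socket


def lookup_ip(domain):
    """Return an IPv4 address for the domain, or None if the lookup fails."""
    try:
        return socket.gethostbyname(domain)
    except OSError:
        # Raised when the name is unknown or there is no network connection.
        return None


# An IP address passed in directly comes back unchanged, with no lookup needed.
print(lookup_ip("127.0.0.1"))  # 127.0.0.1
```

On a machine connected to the internet, calling `lookup_ip("mozilla.org")` should return one of the IP addresses currently serving that site.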
Packets explained
Earlier we used the term "packets" to describe the format in which the data is sent from server to client. What do we mean here? Basically, when data is sent across the web, it is sent as thousands of small chunks. There are several reasons for this. Packets are sometimes dropped or corrupted, and it's easier to resend a small chunk than a whole website. Packets can also travel along different paths, which lets many different web users download the same website at the same time. If websites were sent as single big chunks, only one user could download one at a time, which obviously would make the web very inefficient and not much fun to use.
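The idea of cutting data into chunks and putting it back together can be sketched as follows. This is a simplified illustration only: the function names and the 8-byte packet size are arbitrary choices, and real packets carry more information than a sequence number.

```python
import random


def split_into_packets(data, size):
    """Cut bytes into numbered chunks of the form (sequence number, payload)."""
    return [
        (seq, data[start:start + size])
        for seq, start in enumerate(range(0, len(data), size))
    ]


def reassemble(packets):
    """Put the chunks back in order and join them into the original data."""
    return b"".join(payload for _seq, payload in sorted(packets))


page = b"<html><body>Hello, web!</body></html>"
packets = split_into_packets(page, size=8)

# Packets may arrive in any order, so shuffle them to imitate that.
random.shuffle(packets)

print(len(packets))                # 5
print(reassemble(packets) == page) # True
```

Because every chunk carries its sequence number, the receiver can rebuild the page no matter what order the chunks arrive in.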
Summary
As with any area of knowledge, the web comes with a lot of jargon. Don't worry, we won't overwhelm you with all of it (we have a glossary if you're curious). However, there are a few basic terms you need to understand at the outset, since you'll hear these expressions all the time as you read on. It's easy to confuse these terms sometimes since they refer to related but different functionalities. In fact, you'll sometimes see these terms misused in news reports and elsewhere, so getting them mixed up is understandable!
We'll cover these terms and technologies in more detail as we explore further, but these quick definitions will be a great start for you:
- web page
- A document which can be displayed in a web browser such as Firefox, Google Chrome, Opera, Microsoft Internet Explorer or Edge, or Apple's Safari. These are also often called just "pages."
- website
- A collection of web pages which are grouped together and usually connected together in various ways. Often called a "web site" or simply a "site."
- web server
- A computer that hosts a website on the Internet.
- search engine
- A web service that helps you find other web pages, such as Google, Bing, Yahoo, or DuckDuckGo. Search engines are normally accessed through a web browser (e.g. you can perform search engine searches directly in the address bar of Firefox, Chrome, etc.) or through a web page (e.g. bing.com or duckduckgo.com).
Let's look at a simple analogy — a public library. This is what you would generally do when visiting a library:
- Find a search index and look for the title of the book you want.
- Make a note of the catalog number of the book.
- Go to the particular section containing the book, find the right catalog number, and get the book.
Let's compare the library with a web server:
- The library is like a web server. It has several sections, which is similar to a web server hosting multiple websites.
- The different sections (science, math, history, etc.) in the library are like websites. Each section is like a unique website (no two sections contain the same books).
- The books in each section are like webpages. One website may have several webpages, e.g., the Science section (the website) will have books on heat, sound, thermodynamics, statics, etc. (the webpages). Webpages can each be found at a unique location (URL).
- The search index is like the search engine. Each book has its own unique location in the library (two books cannot be kept at the same place) which is specified by the catalog number.
Deeper dive
So, let's dig deeper into how those four terms are related and why they are sometimes confused with each other.
Web page
A web page is a simple document displayable by a browser. Such documents are written in the HTML language (which we look into in more detail in other articles). A web page can embed a variety of different types of resources such as:
- style information — controlling a page's look-and-feel
- scripts — which add interactivity to the page
- media — images, sounds, and videos.
Note: Browsers can also display other documents such as PDF files or images, but the term web page specifically refers to HTML documents. Otherwise, we only use the term document.
All web pages available on the web are reachable through a unique address. To access a page, just type its address in your browser address bar.
Web site
A website is a collection of linked web pages (plus their associated resources) that share a unique domain name. Each web page of a given website provides explicit links (most of the time in the form of a clickable portion of text) that allow the user to move from one page of the website to another.
To access a website, type its domain name in your browser address bar, and the browser will display the website's main web page, or homepage (casually referred to as "the home").
The ideas of a web page and a website are especially easy to confuse for a website that contains only one web page. Such a website is sometimes called a single-page website.
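One way to see the difference between a website and a web page is to split an address into its parts. The sketch below uses Python's standard `urllib.parse` module; the `describe` helper and the `example.org` addresses are made up for this illustration.

```python
from urllib.parse import urlsplit


def describe(url):
    """Separate the website (domain name) from the page (path) in an address."""
    parts = urlsplit(url)
    return {
        "website": parts.hostname,
        # An address with no path points at the site's homepage.
        "page": parts.path or "/",
    }


print(describe("https://example.org/docs/intro.html"))
# {'website': 'example.org', 'page': '/docs/intro.html'}

print(describe("https://example.org"))
# {'website': 'example.org', 'page': '/'}
```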
Web server
A web server is a computer hosting one or more websites. "Hosting" means that all the web pages and their supporting files are available on that computer. The web server will send any web page from the website it is hosting to any user's browser, per user request.
Don't confuse websites and web servers. For example, if you hear someone say, "My website is not responding", it actually means that the web server is not responding and therefore the website is not available. More importantly, since a web server can host multiple websites, the term web server is never used to designate a website, as it could cause great confusion. In our previous example, if we said, "My web server is not responding", it means that multiple websites on that web server are not available.
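The relationship between one web server and the several websites it hosts can be modeled roughly like this. Everything here is hypothetical (the `WebServer` class, its methods, and the two example domains), and real web servers are far more involved, but it shows why a server that stops responding takes all of its websites down with it.

```python
class WebServer:
    """Toy web server that can host several websites at once."""

    def __init__(self):
        self.sites = {}      # domain name -> {page path: page content}
        self.running = True

    def host(self, domain, pages):
        """Start hosting a website under the given domain name."""
        self.sites[domain] = pages

    def handle(self, domain, path):
        """Return (status code, content) for a requested page."""
        if not self.running:
            raise ConnectionError("web server is not responding")
        pages = self.sites.get(domain)
        if pages is None or path not in pages:
            return 404, ""
        return 200, pages[path]


server = WebServer()
server.host("science.example", {"/": "Science home", "/heat": "About heat"})
server.host("history.example", {"/": "History home"})

print(server.handle("science.example", "/heat"))  # (200, 'About heat')
print(server.handle("history.example", "/"))      # (200, 'History home')

# If the server goes down, every website it hosts becomes unavailable.
server.running = False
```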
Search engine
Search engines are a common source of confusion on the web. A search engine is a special kind of website that helps users find web pages from other websites.
There are plenty out there: Google, Bing, Yandex, DuckDuckGo, and many more. Some are generic, some are specialized in certain topics. Use whichever you prefer.
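At its core, a search engine keeps an index that maps words to the pages containing them, much like the search index in a library. The sketch below is a heavily simplified illustration of that idea: the page addresses and text are invented, and real search engines also crawl the web, rank results, and do a great deal more.

```python
def build_index(pages):
    """Map each word to the set of page addresses whose text contains it."""
    index = {}
    for url, text in pages.items():
        for word in set(text.lower().split()):
            index.setdefault(word, set()).add(url)
    return index


def search(index, query):
    """Return the addresses of pages that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return []
    matches = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(matches)


# Hypothetical pages living on other websites.
pages = {
    "https://science.example/heat": "Heat and thermodynamics basics",
    "https://science.example/sound": "Sound waves basics",
    "https://history.example/rome": "A short history of Rome",
}

index = build_index(pages)
print(search(index, "basics"))
# ['https://science.example/heat', 'https://science.example/sound']
print(search(index, "sound basics"))
# ['https://science.example/sound']
```

Note that the search engine only stores where pages are; the pages themselves still live on their own websites.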
Many beginners on the web confuse search engines and browsers. Let's make it clear: A browser is a piece of software that retrieves and displays web pages; a search engine is a website that helps people find web pages from other websites. The confusion arises because, the first time someone launches a browser, the browser displays a search engine's homepage. This makes sense, because, obviously, the first thing you want to do with a browser is to find a web page to display. Don't confuse the infrastructure (e.g., the browser) with the service (e.g., the search engine). The distinction will help you quite a bit, but even some professionals speak loosely, so don't feel anxious about it.
Here is an instance of Firefox showing a Google search box as its default startup page: