TCP/IP Ports

From Wiki99

↑ Computers ↑
← prev: TCP/IP DNS next: TCP/IP Server Setup →


Ports: The Last Concept In Internet Addressing

Everything we have discussed so far, namely IP addresses and names, gives a way for one IP address (in other words one interface, which usually means one computer) to send a single packet of data to another interface (i.e. another computer). But this is much less than we need for networking. In particular we need two additional capabilities.

  1. We need the ability to specify what to do with a packet when it reaches the remote computer. For example our server may be handling web, mail, printer sharing and file sharing for us. Any packet that we send to it has to be tagged as saying why it is being sent to our server.
  2. We need the ability to send long streams of data to another computer, streams of data that may be broken up into multiple packets, with guarantees that all the data arrived at the other end and in the correct order. This means we need a protocol that labels each packet with a sequence number, and in which the receiving computer tells us about missed packets. This is easy enough to describe, but tricky to do in a way that is not excruciatingly slow.
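
To make the second capability concrete, here is a minimal sketch in Python of what labelling packets with sequence numbers buys us. It is only an illustration, not how TCP is actually implemented (real TCP numbers bytes rather than packets and retransmits on its own), and the names split_into_packets and reassemble are made up for this example.

```python
def split_into_packets(data, size):
    """Chop data into (sequence_number, chunk) pairs."""
    return [(seq, data[start:start + size])
            for seq, start in enumerate(range(0, len(data), size))]


def reassemble(packets, total):
    """Rebuild the stream from packets that may arrive in any order.

    Returns (data, missing). If any sequence number never arrived,
    data is None and missing lists what the receiver must ask for again.
    """
    received = dict(packets)  # arrival order no longer matters
    missing = [seq for seq in range(total) if seq not in received]
    if missing:
        return None, missing
    return b"".join(received[seq] for seq in range(total)), []


packets = split_into_packets(b"hello world!", 5)
out_of_order = [packets[2], packets[0], packets[1]]
print(reassemble(out_of_order, len(packets)))
print(reassemble([packets[0], packets[2]], len(packets)))
```

The first call puts the pieces back in order; the second reports which sequence number is missing, which is the information the receiving computer has to send back to us.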

The first capability is handled through ports, which is what we will discuss below.

In networking we talk repeatedly about protocols.
A protocol is nothing more than an agreement about how to do things.
So far we have discussed IP (the protocol involving IP addresses) and DHCP.

The two capabilities we discussed above are combined into a protocol called TCP. All you need to know about TCP is how ports work. The technicalities of how TCP maintains packet sequence numbers, handles lost or out-of-order packets and so on are very interesting, but not relevant to the current discussion.

In addition to TCP a protocol called UDP uses the same idea of ports as TCP but does not provide the second capability of TCP, the ability to handle long streams of data that are broken into multiple packets. UDP is useful only in some specialized situations and we won't discuss it much.
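
If you are curious what UDP looks like from a program's point of view, the following Python sketch sends one UDP packet from this computer to itself. It assumes nothing beyond the standard socket module; the message and the use of the loopback address 127.0.0.1 are just choices made for the example.

```python
import socket

# SOCK_DGRAM asks for UDP; SOCK_STREAM would ask for TCP.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free UDP port
receiver.settimeout(5)
port = receiver.getsockname()[1]

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"hello", ("127.0.0.1", port))   # no connection is set up first

data, (from_ip, from_port) = receiver.recvfrom(1024)
print(data, "arrived on UDP port", port, "from port", from_port)

sender.close()
receiver.close()
```

Note that the packet is still addressed to a port, exactly as with TCP, but there is no stream: each sendto is a single, independent packet.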

There are in addition other protocols that use IP, but you'll deal with them even less than you deal with UDP; the most common of these is ICMP. As always you can use Google to learn about ICMP if you're interested.

Overview of Ports

So what is a port? If you think of an IP address as something like the address of an apartment block, then the port number is the number of a specific apartment in that apartment block. The IP address is used to get a packet all the way to some computer. Once it gets to that computer, the packet has to be handed to some program to deal with. But which program? There are literally hundreds of programs running on your Mac at any one time. This is where the port number comes in. A program that provides network services, when it starts up, registers itself with the OS as being interested in a specific port number, say TCP port 80. From that point on, all TCP packets addressed to port 80 will be handed to that particular program. Meanwhile a different program registered itself as being interested in TCP port 25, and will receive TCP packets addressed to port 25. If no program registers an interest in TCP port 97 and a TCP packet addressed to port 97 arrives, the OS has nobody to hand it to: it refuses the connection attempt (or, depending on how the system is configured, silently throws the packet away).
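
Here is a small Python sketch of a program registering an interest in a port. To stay harmless it only listens on the loopback address 127.0.0.1 and lets the OS choose a free port instead of using 80; the helper names start_listening and someone_is_listening are made up for this example.

```python
import socket


def start_listening(port=0):
    """Register interest in a TCP port on this machine (0 = let the OS choose)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", port))
    server.listen()
    return server


def someone_is_listening(port):
    """Try to open a TCP connection to the given port on this machine."""
    try:
        with socket.create_connection(("127.0.0.1", port), timeout=2):
            return True
    except OSError:  # includes "connection refused"
        return False


server = start_listening()
port = server.getsockname()[1]
print(port, someone_is_listening(port))  # a program is registered for this port
server.close()
print(port, someone_is_listening(port))  # nobody is registered any more
```

While the program holds the port, connections to it succeed; as soon as it lets go, the OS has no one to hand the packets to and the connection attempt fails.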

All the standard internet services use well known port numbers. Examples include

  • web servers use TCP port 80.
  • SMTP servers (which are what accept your outgoing email and send it on to the destination email address) use TCP port 25.
  • ssh servers use TCP port 22.

If you are interested, you can find a list of the standard port numbers in the file /etc/services. Along with maybe five or ten internet services you recognize in this list, there will be hundreds you've probably never heard of.
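
The format of /etc/services is simple enough to read with a few lines of Python. The sketch below parses text in that format; the three sample lines are typical entries written out by hand for the example (your own file may word them differently), and parse_services is a made-up helper name.

```python
def parse_services(text):
    """Map (service name, protocol) -> port number from /etc/services-style text."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        fields = line.split()
        if len(fields) < 2:
            continue
        port, protocol = fields[1].split("/")
        for name in [fields[0]] + fields[2:]:  # official name plus any aliases
            table[(name, protocol)] = int(port)
    return table


SAMPLE = """\
ssh     22/tcp          # Secure Shell
smtp    25/tcp    mail  # Simple Mail Transfer
http    80/tcp    www   # World Wide Web
"""

services = parse_services(SAMPLE)
print(services[("http", "tcp")], services[("mail", "tcp")])
```

To look at your real list, pass the function the contents of the file, for example parse_services(open("/etc/services").read()). Python's own socket.getservbyname("http", "tcp") consults the same system database.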

Note that nothing forces the use of a particular port number for a particular service. If you want to be contrary, you can run your web server using TCP port 88, or even TCP port 25, but most software is set up to use the standard port numbers by default, and if you do things differently you make your life difficult, usually for no good reason.


Ports in More Detail

As we've said, when a program provides a network service, it registers itself with the OS as interested in a particular port. Standard services use port numbers between 0 and 1023.
Now consider when your web browser connects to a web server, say www.google.com. What happens?
Step one is (using DNS) conversion of www.google.com to an IP address. Step three is to create a TCP packet that is addressed to port 80 of that IP address.
What about step two? Well, think about what's going to happen when that TCP packet is sent to the Google server and the Google server wants to send a reply. It has the IP address of your computer to send the reply to, but it also needs a port number. So step two is that the web browser asks the OS for a random unused TCP port number and registers itself as interested in that port number. Then in step three, along with filling out the IP address and the port number for the Google server in the outgoing TCP packet, it also fills out a return address consisting of the local IP address and the random port number it was given.
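
You can watch step two happen with a few lines of Python. In this sketch both ends of the conversation run on your own computer (address 127.0.0.1), and a throwaway listening socket stands in for the remote web server; that stand-in and the variable names are assumptions made for the example.

```python
import socket

# A stand-in for the remote server: listen on some free port of this machine.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen()
server.settimeout(5)
server_port = server.getsockname()[1]

# The client never chooses its own port; the OS assigns one when it connects.
client = socket.create_connection(("127.0.0.1", server_port), timeout=5)
client_ip, client_port = client.getsockname()

# The server is told the same address and port: that is the return address.
conn, return_address = server.accept()
print("client was given port", client_port)
print("server will reply to", return_address)

for s in (conn, client, server):
    s.close()
```

Run it a few times and the client's port number will usually change, while the return address seen by the server always matches it.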

If most programs use sensible defaults, why does all this matter to you? There is the general reason that it's a lot easier to debug things if you know what's going on. But there is a more specific reason having to do with NAT and/or firewalls.

Ports and NAT

Remember our friend NAT that we talked about a few paragraphs back. Recall that NAT stands for Network Address Translation. It works like this. When any of the computers behind my Airport base station sends a packet out to the internet, the NAT software running in the base station reads some information from the outgoing packet: which local address (e.g. 10.0.1.3) it is coming from, and which port on that computer it is coming from. It stores that information, then overwrites it so that the packet claims to come from the IP address of the base station and from a different, random port number that the base station makes up. Then the base station sends the packet out. When a return packet comes in, the NAT software looks at the port number the packet is addressed to, looks in its memory to see which original IP address and port it mapped onto that made-up port number, overwrites the destination with the original local address (i.e. 10.0.1.3) and port number, and sends the packet out over the local network.
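
The bookkeeping involved is easier to see in code. The following Python sketch is a toy model of the NAT table, not the software that actually runs in a base station: the class name Nat is made up, the public address 200.20.85.65 and the local port 51234 are example values, and made-up ports are handed out by counting up from 50000 rather than at random so that the output is predictable.

```python
import itertools


class Nat:
    """Toy NAT box: rewrites (address, port) pairs and remembers the mapping."""

    def __init__(self, public_ip, first_port=50000):
        self.public_ip = public_ip
        self._next_port = itertools.count(first_port)  # stand-in for "random"
        self._by_public_port = {}  # made-up port -> (local ip, local port)
        self._by_local = {}        # (local ip, local port) -> made-up port

    def outgoing(self, local_ip, local_port):
        """Return the source address the outside world will see."""
        key = (local_ip, local_port)
        if key not in self._by_local:
            port = next(self._next_port)
            self._by_local[key] = port
            self._by_public_port[port] = key
        return self.public_ip, self._by_local[key]

    def incoming(self, public_port):
        """Return the local destination for a packet, or None to ignore it."""
        return self._by_public_port.get(public_port)


nat = Nat("200.20.85.65")
print(nat.outgoing("10.0.1.3", 51234))  # what the remote server sees
print(nat.incoming(50000))              # where the reply is delivered
print(nat.incoming(80))                 # a port the box never made up
```

The last line prints None: the table has an entry only for ports the box itself handed out in response to an outgoing packet.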

This is all very clever and works very well. It works automatically for client software, that is, software where your computer starts the conversation. But it does not work for servers without some additional setup.

Suppose you have a web server. You have the domain name bluecloud.com, and you have your dynamic DNS software set up and working, so the outside world knows that www.bluecloud.com corresponds to IP address 200.20.85.65 (which is the IP address your ISP has currently assigned to your base station), and port number 80 (because by default web requests go to port 80). Now suppose someone somewhere types www.bluecloud.com into their web browser. A packet addressed to IP address 200.20.85.65, port 80, goes out onto the internet and eventually arrives at your base station. The base station NAT software looks in that table it creates of made-up port numbers, and sees that it never made up a port number of 80 for any outgoing packet, so it has no idea where to deliver the packet and it just ignores it. Oh dear.

NAT and security

This has a good side and a bad side. The good side (very convenient) is that baddies out on the internet for the most part can't harm your computers even if you misconfigure said computers in some way, because if they try to send packets attacking any of your computers those packets will be ignored as we described above.

Let's discuss this in more detail. The usual way baddies attack computers on the internet is that they learn of some bug in software that listens to some port and exploit that bug. For example they might learn of a bug in the code that listens for printer sharing on TCP port 631, and then try to connect to any computer they can find on the internet that is running that particular buggy software on port 631. If you don't want the baddies to hurt you what can you do?

  • You can avoid running the buggy version of the software. This is what's usually going on when Apple provide their security updates every month or so. They are replacing programs that have been discovered to have some sort of exploitable bug with a version that does not have the bug.
  • You can (even better) avoid running any software at all that listens on port 631 (or any other particular port). This is why security people tell you not to turn on any servers on your computer that you aren't certain you need.
    MacOS X ships out of the box with all servers turned off, and that's one reason why Macs have done pretty well in terms of not having spyware and virus problems. In contrast, until recently, Windows shipped with a variety of servers built into the OS that were turned on by default, that most people did not know about (so did not know to turn off) and that were buggy (so that baddies could connect to those services, exploit the bugs, and gain control of the Windows computer).

  • But maybe you want the service provided by port 631, for example you want to be able to share your printer.
    Even so, there are probably limits to the sharing you want. For example the usual case for printer sharing is that you want to share the printer with all the other computers on your home network, but not with the entire internet.

For this third sort of situation, either a firewall or NAT can help you. What a real firewall does is allow you to set up rules saying that packets trying to connect to a particular port number can only do so if they have particular characteristics; for example to connect to port 631 the packet must come from the home network, not from some random computer on the internet.

Different firewalls differ in how complicated these rules can be. Fancy firewalls allow for all sorts of complicated rules. For example you can have one set of rules for packets coming in over ethernet, and another set for packets coming in over wireless. Or you could allow packets for a certain port to come in if they come from a certain IP address (maybe the address of your computer at work) but not if they come in from any other IP address.
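
To give a feeling for what such rules amount to, here is a minimal Python sketch of a rule table. It is not the syntax of any real firewall; the rule list, the function name allowed, and the "work computer" address 203.0.113.7 are all invented for the illustration, and the home network is assumed to be 10.0.1.0/24.

```python
import ipaddress

# Each rule: (destination port, network the packet must come from).
RULES = [
    (631, ipaddress.ip_network("10.0.1.0/24")),     # printer sharing: home network only
    (22, ipaddress.ip_network("203.0.113.7/32")),   # ssh: one work computer (made-up address)
    (80, ipaddress.ip_network("0.0.0.0/0")),        # web: anyone at all
]


def allowed(source_ip, dest_port):
    """True if some rule lets a packet from source_ip reach dest_port."""
    source = ipaddress.ip_address(source_ip)
    return any(port == dest_port and source in network
               for port, network in RULES)


print(allowed("10.0.1.7", 631))      # a computer on the home network
print(allowed("198.51.100.9", 631))  # some random computer on the internet
print(allowed("198.51.100.9", 80))   # the same computer asking for a web page
```

Anything that matches no rule is refused, which is the safe default: a port stays closed until you deliberately open it.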

Firewalls used in corporate settings bring up a further level of complexity, talking about things like stateful firewalls, firewalls that can see into protocols, DMZs and other arcana. For now you should ignore these details. It's better to understand how to use a simple firewall well than to get so confused by everything you read that you decide it's just too much effort to have anything to do with the subject.

The firewall software built into MacOS X can do all of these things, but the user interface Apple provides is deliberately simplified: it covers what most users need without being so complicated that they are scared to use it or likely to make mistakes. If you want to configure the firewall in more detail, you can use the GUI program Brickhouse, or you can read up about ipfw configuration in UNIX books or on the web.

The relevance of this to NAT is that although NAT was not designed for this purpose, it acts like an easy-to-configure (but not very flexible) firewall. This is because, as we've already described, any packets that come from the internet addressed to say port 631 and go through your NAT box will simply be ignored because the NAT box is not mapping port 631 to any listening computer.

OK, so that's good. NAT acts to protect your home network.

Even though there is no need to have the firewall switched on when you are protected by NAT, you might want to configure it and switch it on anyway. The reason for this is that a situation may arise where you connect your server to the internet directly without going through an Airport base station or some other NAT box.

Consider, for example, that your NAT box may break and you make a direct internet connection for a few days until you buy a new one. Or your internet connection may fail, so you move your server to your friend's house and use his (non-NAT-box-based) internet connection for a few days. When situations like this occur, it's difficult to remember to set up the firewall correctly. You have your hands full dealing with your hardware problems, and you really don't want to add more confusion to the situation by now screwing around with firewall settings. It's a better idea to think about this before it's necessary, and configure your server's built-in firewall as best you can. The main downside to switching the firewall on is that while you are setting up your server and debugging it, the firewall can get in the way, preventing some debugging techniques from working and sometimes preventing you from clearly seeing what is happening.

So I'd recommend that you leave the firewall switched off for now. Read through all the articles, get everything you want working, and then switch it on. By the time you've read all the articles, you'll have a much better feeling for how all this stuff works and the idea of setting up a firewall won't seem so scary.

NAT and servers

But we have to deal with the fact that, even though we don't want outsiders to have access to port 631 on our server, we do want them to have access to port 80. How can we fix this?

We need to do two things.

  • We have to give our server machine a fixed (local network) IP address. Recall that right now all our home computers are getting variable IP addresses from the DHCP server built into the Airport base station. However for the second step of this process to work, our server has to have a fixed IP address.
  • We have to tell the base station that, for certain ports, we don't want NAT to be active, rather we want any packets addressed to those ports simply to be routed to our server.

NAT Settings for a Server

Initial Airport configuration

Open Airport Admin Utility and click on the "Configure" button. Given everything you've read so far, you should understand most of what this app now shows you. For example you should understand that the information presented when you click the button Internet Connection describes how the Airport base station is configured for its interface to the DSL or cable modem, called the WAN (wide area network) interface.
What we want to do below will modify the wireless interface of the base station, but will not affect the WAN interface.

On the pane of data that came up when you clicked on the Internet Connection tab, you should see one or two internet addresses written in grey next to the text DNS Servers (but not inside the text fields next to that text).
Write down these numbers because we'll need them soon.

Now click on the Network tab. You should have the Distribute IP addresses checkbox ticked, and the Share a single IP address (using DHCP and NAT) radio button selected, and Use 10.0.1.1 addressing popup menu item selected. (These are the defaults and there is no need to change them unless you know what you are doing.) This means that the base station is going to provide addresses to all the computers on the wireless network using numbers that look like 10.0.1.x, and that it gives itself (the router) the address 10.0.1.1.
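
If you want to check whether a given address belongs to this local range, Python's ipaddress module can do the arithmetic. The sketch assumes that "10.0.1.x" means the network 10.0.1.0/24 (that is, x runs from 0 to 255); the function name on_home_network is made up for the example.

```python
import ipaddress

# Assumption: "10.0.1.x" corresponds to the /24 network below.
network = ipaddress.ip_network("10.0.1.0/24")
router = ipaddress.ip_address("10.0.1.1")


def on_home_network(address):
    """True if the address is one the base station could hand out locally."""
    return ipaddress.ip_address(address) in network


print(router in network)
print(on_home_network("10.0.1.201"))    # a local address
print(on_home_network("200.20.85.65"))  # a public internet address
```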

Now choose the Port Mapping tab. We want (for now) to add three entries to this table. (By the time you have read through this tutorial, you should know enough to know what additional entries you may want to add for your system.)
The three entries we are going to add correspond to ssh, smtp, and web.
What this means is that by making these three entries, people anywhere on the internet can access your server using these three services. (Of course you then have to set up your server to respond to requests for these types of services, something we'll discuss further along in the tutorial.)

Allowing a server to be connected to via ssh means that if you are away from your computer, you can still log into it via the internet (using terminal and a command-line interface) if you need to.
Even if you're not much interested in using a command line, we will also be using ssh (behind the scenes) to read mail from our server and for various other utility purposes, as explained later, so you will want an ssh entry.

The smtp entry means that other computers on the internet can send mail addressed to this computer (more exactly they can send mail to this domain name). We'll discuss this in great detail later on, both how to set this up and why you want to do this in the first place.
Note that when dealing with mail you need to be very careful about security. If you screw things up, spammers can send out junk mail using your computer. At this point your ISP may think you are a spammer and terminate your internet connection. We will, of course, discuss how you set things up to prevent that.

Finally the web entry is, of course, so that people can see your web pages.

Airport port mappings

Let's do ssh:

  • Hit the Add button and in the dialog box that comes up,
    • set the public port to 22
    • set the private address to 10.0.1.201, and
    • set the private port to 22.

What does this all mean? Suppose I, somewhere on the internet and far from home, type ssh name99.org.
First the process of dns lookup will tell my computer that the IP address for name99.org is say 200.20.85.65.
Next the ssh client program running on my computer will send packets to the ssh port (which is 22) of IP address 200.20.85.65.
Finally the base station (which is what is sitting at IP address 200.20.85.65) will look in its port mapping table, and see that it should forward this packet to port 22 (the private port) of machine 10.0.1.201.

Now do the same for the two other ports we care about, 25 (smtp), and 80 (www). Note that we are running all our services on the same server, because it's easy and cheap and we don't expect to be doing anything very demanding. If we had some reason to, we could say, "OK I want to run web access on a different server" and then when filling in the port mapping for port 80 we'd set the private address to something different, say 10.0.1.202, which would be the IP address of some other computer which we'd chosen to handle our web serving.
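
Conceptually, the port mapping table is nothing more than a lookup from public port to private address and port. The Python sketch below models the three entries described here; it is an illustration of the idea rather than anything the base station actually runs, and the names PORT_MAPPINGS and forward are made up.

```python
# public port -> (private address, private port), as entered in the Port Mapping tab
PORT_MAPPINGS = {
    22: ("10.0.1.201", 22),  # ssh
    25: ("10.0.1.201", 25),  # smtp
    80: ("10.0.1.201", 80),  # web
}


def forward(public_port, mappings=PORT_MAPPINGS):
    """Where the base station sends an incoming packet, or None to ignore it."""
    return mappings.get(public_port)


print(forward(22))   # ssh requests go to the server
print(forward(631))  # printer sharing is not mapped, so it stays private

# Moving the web service to a second machine changes a single entry.
two_servers = {**PORT_MAPPINGS, 80: ("10.0.1.202", 80)}
print(forward(80, two_servers))
```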

When you've filled in all three entries for the base station, hit the Update button and let the base station update itself.

