
Specific Protocols

Ethernet

Ethernet is a protocol used by the network cards commonly found in desktop computers. Ethernet has no routing protocol - it simply broadcasts its message over the wire and the intended recipient pays attention. Ethernet solves such problems as actually hooking two (or more) computers up at all and dealing with what happens if two of them try to communicate at once.

Two or more network cards trying to communicate at once is called a collision. When the network cards detect a collision, they each wait a fixed amount of time plus a random amount of time and try again. The random part is there because otherwise, once the fixed delay was over, they would simply collide again. As you can surmise, collisions are bad because they take up bandwidth uselessly. Similarly, the more computers that one has on a given network communicating with ethernet, the greater the chances of collisions and the worse the network will perform. This is one reason that SMB (the Windows networking protocol) is so bad - since every machine using it constantly sends out network traffic, performance is degraded even when nothing is going on.
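To make the role of the random delay concrete, here is a minimal sketch in Python. The delay values and function names are made up for illustration, and this is not the exact algorithm real network cards use; it only shows why a fixed delay alone would lead straight to another collision.

```python
import random

FIXED_DELAY = 10.0        # hypothetical fixed wait, in arbitrary time units
MAX_RANDOM_DELAY = 50.0   # hypothetical upper bound on the random extra wait


def retry_delay(rng):
    """Fixed wait plus a random extra wait before trying again."""
    return FIXED_DELAY + rng.uniform(0, MAX_RANDOM_DELAY)


# Two cards that wait only the fixed delay retry at the same moment
# and collide again.
fixed_only = [FIXED_DELAY, FIXED_DELAY]

# With a random component added, the two retry times almost surely differ.
rng = random.Random(2000)  # seeded so the run is repeatable
with_random = [retry_delay(rng) for _ in range(2)]

print("fixed only:", fixed_only)
print("fixed plus random:", with_random)
```

Whichever card draws the shorter random wait gets the wire first, and the other one sees that the wire is busy when its own wait is over.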

Because there is no routing within ethernet, each network card (you can have more than one in a computer) has a simple name, called a MAC address (MAC = Media Access Control). Humans usually write it as a string of six pairs of hex digits. Each pair corresponds to a byte, so the address is really a six-byte number. Theoretically (and I think practically, too) every network card in the world has a unique MAC address. For example, the MAC address of one network card that I work with is 00:90:27:75:E1:4F.
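As a small illustration of the "six pairs of hex digits are really six bytes" point, the following Python sketch converts between the two forms. The function names are my own and only serve the example.

```python
def parse_mac(text):
    """Turn a colon-separated MAC address string into six raw bytes."""
    parts = text.split(":")
    if len(parts) != 6:
        raise ValueError("a MAC address has exactly six pairs of hex digits")
    return bytes(int(part, 16) for part in parts)


def format_mac(raw):
    """Turn six raw bytes back into the human-readable form."""
    return ":".join(f"{byte:02X}" for byte in raw)


mac = parse_mac("00:90:27:75:E1:4F")
print(list(mac))        # the six byte values as decimal numbers
print(format_mac(mac))  # back to the colon-separated form
```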

Ethernet is commonly used over two types of physical connection. One is coaxial cable and the other is twisted pair (similar to a telephone line). Coax cable is slower and has other disadvantages, so it has been replaced almost entirely now by twisted pair wires (the connector commonly used with them is called RJ45). In the days of coax cable, all computers on a local network were connected to each other by the same wire, so that sending a packet out on the wire automatically made it reach all of the other computers. Since a twisted pair wire can only connect two machines, devices called hubs are used to spread the signal around. Each machine on the local twisted pair network plugs into the hub. When any of them sends a signal out, the hub will replicate that signal to all of the other machines plugged into it.
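The behaviour of a hub is simple enough to state in a couple of lines of Python. This is only a model of the idea (the machine names are invented), not of any real device:

```python
def hub_replicate(machines, sender):
    """Return the machines that receive a signal sent by `sender`.

    A hub does not look at the signal at all: everything plugged into
    it, except the sender itself, gets a copy.
    """
    return [machine for machine in machines if machine != sender]


plugged_in = ["alpha", "beta", "gamma", "delta"]  # hypothetical machine names
print(hub_replicate(plugged_in, "beta"))
```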

IP

The next layer that is most commonly used is IP (=Internet Protocol). IP is a routable protocol, so machines are given hierarchical addresses. In IPv4 (IP version 4), an address is four bytes long. IPv6 will introduce 16 byte addresses, but it will be a while before that happens.

While computers deal with IP addresses as four bytes, it is more convenient for people to deal with them as four numbers, with each number ranging from 0 to 255. Thus a machine might have the IP address 149.94.148.201 or 12.88.162.162.
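The correspondence between the four numbers and the four bytes can be sketched in Python as follows. The function names are mine, chosen for the example.

```python
def dotted_to_bytes(address):
    """Convert the human-friendly form into the four raw bytes."""
    numbers = [int(part) for part in address.split(".")]
    if len(numbers) != 4 or any(not 0 <= n <= 255 for n in numbers):
        raise ValueError("expected four numbers, each from 0 to 255")
    return bytes(numbers)


def bytes_to_dotted(raw):
    """Convert four raw bytes back into the human-friendly form."""
    return ".".join(str(byte) for byte in raw)


raw = dotted_to_bytes("149.94.148.201")
print(raw.hex())             # the same address as eight hex digits
print(bytes_to_dotted(raw))  # and back again
```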

IP addresses are assigned in a hierarchical fashion which mirrors their layout. Getting a class A allocation means that the first of the four numbers is fixed for you and you can allocate any of the other three numbers to whomever you wish however you wish. A class B allocation has the first two numbers fixed, and a class C allocation has the first three numbers fixed.

For example, Alfred University has a class B allocation - all of its addresses start with 149.84, but it can assign the rest of those numbers as it wishes.
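Here is a short Python sketch of the class A/B/C idea as described above: how many numbers are fixed, whether a given address falls inside an allocation, and how many addresses are left over to hand out. The function names are invented for illustration, and the count is the raw number of combinations, ignoring any addresses set aside for special purposes.

```python
# How many of the four numbers are fixed for each class of allocation.
FIXED_NUMBERS = {"A": 1, "B": 2, "C": 3}


def in_allocation(address, fixed_part, allocation_class):
    """Check whether `address` starts with the allocation's fixed numbers."""
    count = FIXED_NUMBERS[allocation_class]
    prefix = fixed_part.split(".")
    if len(prefix) != count:
        raise ValueError(f"class {allocation_class} fixes {count} number(s)")
    return address.split(".")[:count] == prefix


def addresses_available(allocation_class):
    """Raw count of addresses the holder is free to assign."""
    free_numbers = 4 - FIXED_NUMBERS[allocation_class]
    return 256 ** free_numbers


print(in_allocation("149.84.10.7", "149.84", "B"))
print(in_allocation("12.88.162.162", "149.84", "B"))
print(addresses_available("B"))
```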

At this point it should be relatively plain how routing is accomplished. When a computer wishes to send a packet, it sends the packet out to its local gateway, which keeps passing the packet upstream until it gets to the junction between all of the class A holders. Then the appropriate class A holder takes it and starts pushing it downstream on their end until it reaches the individual computer that it was meant for.

The other important piece of addressing, alongside the IP address, is a port number. The port number is specified as a number between 1 and 65,535. By using multiple ports, one can run different network connections on the same machine and have an easy way to figure out which packet goes to whom. This also allows one to easily set up multiple services being offered - each one simply goes on a different port. For example, mail is normally transferred on port 25, and web traffic is usually done on port 80.
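The "which packet goes to whom" idea can be sketched as a simple lookup in Python. The table below contains only the two ports mentioned above; the service names and the function are hypothetical and stand in for whatever programs are actually listening.

```python
# Port number -> service listening there (only the two examples from the text).
SERVICES = {
    25: "mail",
    80: "web",
}


def deliver(port):
    """Decide which service on this machine should get a packet."""
    if not 1 <= port <= 65535:
        raise ValueError("port numbers run from 1 to 65,535")
    return SERVICES.get(port, "no service listening")


for port in (25, 80, 8080):
    print(port, "->", deliver(port))
```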

IP is an inherently unreliable protocol. The specification makes no guarantees that a particular packet will ever reach its destination, or that two packets will arrive in the order in which they were sent out. It is perfectly valid for anyone along the routing chain to drop packets for whatever reason they want. They of course should do their best to send them on, but it's not a violation of the protocol not to.

As was previously indicated, it is common to layer IP over ethernet. This means that ethernet is used to carry the IP packets to the other computers on the network, including the gateway. The gateway is a computer which is responsible for carrying IP packets between the local network and the outside world.

There is a more intelligent (and therefore more expensive) version of a hub called a switch. A switch performs the same functions as a hub, except that instead of replicating all signals to every computer, it understands the low-level network protocols involved and routes the packets only to their intended recipient. This can significantly reduce ethernet collisions, which in turn can significantly improve performance and bandwidth.
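One common way for a switch to find the intended recipient is to remember which MAC address it has seen on which port. The text above does not say how a switch does it, so treat the following Python sketch as an assumption made for illustration; the class and its names are invented.

```python
class Switch:
    """Toy switch that learns which MAC address lives on which port."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.table = {}  # MAC address -> port it was last seen on

    def forward(self, in_port, src_mac, dst_mac):
        """Return the list of ports the packet is sent out on."""
        # Remember where the sender is plugged in.
        self.table[src_mac] = in_port
        if dst_mac in self.table:
            # Known recipient: send to that one port only.
            return [self.table[dst_mac]]
        # Unknown recipient: fall back to behaving like a hub.
        return [port for port in self.ports if port != in_port]


switch = Switch([1, 2, 3])
first = switch.forward(1, "00:90:27:75:E1:4F", "00:90:27:00:00:02")
reply = switch.forward(2, "00:90:27:00:00:02", "00:90:27:75:E1:4F")
again = switch.forward(1, "00:90:27:75:E1:4F", "00:90:27:00:00:02")
print(first, reply, again)
```

The first packet goes everywhere because the switch has not yet seen the recipient; after the reply, traffic between the two machines no longer touches the third port.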

TCP

TCP is a two-party streaming protocol that is usually layered on top of IP. Unlike IP, TCP guarantees that packets will be received and that they will be received in order. This is done through the use of acknowledgements for every packet received and sequence numbers in the packet headers.
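To show how sequence numbers and acknowledgements turn an unreliable, unordered delivery into an ordered stream, here is a deliberately simplified Python sketch of the receiving side. It numbers whole packets starting at 0 and acknowledges with the next number it expects; those details, and all the names, are assumptions of the sketch rather than a description of real TCP.

```python
class Receiver:
    """Toy receiver that hands packets on strictly in sequence order."""

    def __init__(self):
        self.next_seq = 0    # sequence number we are waiting for
        self.buffer = {}     # packets that arrived early, keyed by number
        self.delivered = []  # data passed on to the application, in order

    def receive(self, seq, data):
        """Accept one packet and return the acknowledgement to send back."""
        if seq >= self.next_seq:
            self.buffer[seq] = data  # duplicates of old packets are ignored
        # Deliver everything that is now in order.
        while self.next_seq in self.buffer:
            self.delivered.append(self.buffer.pop(self.next_seq))
            self.next_seq += 1
        return self.next_seq  # "I have everything before this number"


receiver = Receiver()
# Packets arrive out of order, and packet 1 arrives twice.
acks = [receiver.receive(seq, data)
        for seq, data in [(2, "c"), (0, "a"), (1, "b"), (1, "b")]]
print(acks)
print(receiver.delivered)
```

A sender that keeps resending any packet whose acknowledgement never arrives completes the picture: lost packets get sent again, and the receiver sorts out the order.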

Through the combined use of TCP and IP, or TCP/IP as it is commonly referred to, one can easily open up a reliable, sequenced network connection to pretty much anyone.

SMTP

The Simple Mail Transfer Protocol has been an extraordinarily successful protocol. SMTP governs the transfer of RFC 822 messages. SMTP is very simple yet robust. Once sent, a message must either be delivered to its recipient or bounce (bouncing is when you get a response to your message from some mail server saying that delivery is impossible). Moreover, if the recipient's mail server is temporarily down, the standards call for the sending mail server to keep retrying for about five days before giving up. Thus short outages only delay mail; they don't cause messages to be lost. In point of fact, it is nearly (though not completely) impossible for messages to get lost, making the excuse of "you didn't get my letter? Drat that post office!" not very believable.
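The deliver-or-bounce rule with a retry window can be sketched in Python like this. The five-day figure is taken from the paragraph above; the function name and the dates are invented for the example, and a real mail server has a good deal more going on.

```python
from datetime import datetime, timedelta

GIVE_UP_AFTER = timedelta(days=5)  # retry window, as described above


def next_action(first_attempt, now, delivered):
    """Decide what the sending mail server does with a queued message."""
    if delivered:
        return "done"
    if now - first_attempt >= GIVE_UP_AFTER:
        return "bounce"  # tell the sender that delivery is impossible
    return "retry"       # recipient's server may just be down for a while


queued = datetime(2000, 12, 25, 9, 0)  # hypothetical time of first attempt
print(next_action(queued, queued + timedelta(hours=6), delivered=False))
print(next_action(queued, queued + timedelta(days=5), delivered=False))
print(next_action(queued, queued + timedelta(days=1), delivered=True))
```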

(RFC 822 is the standard format for email messages. RFC stands for Request For Comments; RFCs both solicit comments and define standards for all sorts of things. As you might guess, RFCs start with 1 and are numbered sequentially.)
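For a feel of what an RFC 822 style message looks like (header lines, a blank line, then the body), here is a small Python sketch that parses one with the standard library's email module. The addresses and the text of the message are made up.

```python
from email import message_from_string

# A made-up message: headers, a blank line, then the body.
raw_message = (
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Subject: Hello\n"
    "\n"
    "Did you get my letter?\n"
)

message = message_from_string(raw_message)
print(message["From"])
print(message["Subject"])
print(message.get_payload())
```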


Christopher T. Lansdown
Last modified: Sun Dec 31 11:20:05 EST 2000