The Command Shell

The command shell is a text-based interface to the computer. It is normally connected to what is called a terminal. It typically consists of a prompt (e.g. towerhs:/home/homepage$ ) and then an input field (usually just the space after the prompt). Once the user provides input, it is the shell's job to figure out what to do with it.

There are many different shells. Some examples are bash, sh, tcsh, csh, ash, bsh, ksh, zsh. There are many others. We will focus on just one of these shells, though: bash. Bash is the Bourne Again SHell (basically the Bourne shell plus a whole bunch of features). It is the default shell on most Linux systems and arguably the best, or at least the most featureful.

This document will try to give you a conceptual understanding of what the shell and some of its components do, as well as some of the more interesting things that you can do in them. If you really want to learn more about the shell, use the command man bash. It has loads of information on the shell. While it is a bit dense, it is perfectly understandable and I encourage you to read up further on any of the things that I touch on here.

For further information on a command, the syntax is generally man command or info command (depending on whether the command has a man page or an info page). Try both to find the documentation for a command; you can't hurt anything by it. This will be described more below.

What a shell does generally consists of two things: shell builtins and programs.

Shell Builtins

Command Line Arguments

When you type a command at the command prompt, you are not limited strictly to the name of the command. You can type any number of words after it as well. (A word here is any grouping of characters separated by spaces; it doesn't have to be meaningful outside of the context of the command that you're typing.) These words that follow the command on the command line are called arguments. They are passed to the program as a list, and normally the program uses them to modify its behavior. What arguments a program uses and what they mean varies significantly from program to program, so you have to look up the arguments for the specific program that you want.

It's not terribly relevant until you start programming, but the first argument that the program can access is the name that it was executed as.

As a usage note, the name of the program together with all its arguments is often referred to as the command line. This doesn't include the prompt or any conjunctive characters (such as |) that you will learn about later, just what you typed in.
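
To make the idea of an argument list concrete, here is a minimal sketch. The function name show_args is made up for this illustration; a shell function receives its arguments the same way a program does, with $# holding the number of arguments and "$@" holding the list itself.

```shell
# show_args is a hypothetical helper, not a standard command.
show_args() {
    echo "count: $#"
    for arg in "$@"; do
        echo "arg: $arg"
    done
}

# Three arguments: the quotes keep "two words" together as one argument.
show_args -l /tmp "two words"
```

Without the quotes, two and words would be passed as two separate arguments and the count would be four.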

Pipes

A pipe is a method of making the output of one program into the input of another program. This is achieved by writing the first command to be executed, then a vertical bar ('|'), then the second command to be executed, which takes the output of the first as its input (e.g. cat /etc/passwd | grep root).

This means that the output of the first command goes not to your terminal window but rather to the input of the second program. Consequently, the input of the second program comes not from your keyboard but from the first program.

This can be done as many times as you want. For example: cat /etc/passwd | grep /bin/bash | cut -f 1 -d : | sort | uniq (this prints, in alphabetical order, the unique accounts that use bash as their default shell).
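
Here is a small sketch of the same kind of pipeline that is safe to experiment with. It assumes nothing about your real /etc/passwd: the account lines are invented and written to a scratch file first, and the variable names are only for illustration.

```shell
# Build a scratch file in the same format as /etc/passwd (accounts are made up).
sample=$(mktemp)
printf '%s\n' \
    'root:x:0:0:root:/root:/bin/bash' \
    'daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin' \
    'zoe:x:1001:1001::/home/zoe:/bin/bash' \
    'adam:x:1000:1000::/home/adam:/bin/bash' > "$sample"

# Each stage reads the previous stage's output as its input:
# keep the bash lines, cut out the first colon-separated field, sort, drop duplicates.
bash_users=$(grep /bin/bash "$sample" | cut -f 1 -d : | sort | uniq)
echo "$bash_users"

rm -f "$sample"
```

Running any prefix of the pipeline on its own (for instance just the grep, or the grep and the cut) is a handy way to see what each stage contributes.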

Redirection

Redirection is similar to pipes, except that instead of sending the output of one program to another program, you send it to a file instead. This is achieved by using a greater than character. For example, cat /etc/passwd > /tmp/password.txt (this prints the output of cat /etc/passwd, whatever that is, to the file /tmp/password.txt).

There are two complications. The first is that a single greater than character will erase the contents of the destination file, if they exist, and replace them with the output of the program. If you want to keep the contents of the destination file and simply add the output of the program to the end of the file, then you need to use two greater than symbols. E.g. cat /etc/passwd >> /tmp/password.txt (this will append the output of cat /etc/passwd to the file /tmp/password.txt).

The second complication is that any given program has more than one output "channel" available to it. The first one, or stream 1, is called standard output. This is the normal place to put output for a program. However, if a program encounters some sort of error condition or in general wants to separate its output, it also has a "channel" called standard error, or stream 2. Dealing with this stream isn't really any different than dealing with the first; you just have to be more explicit about it by putting the stream number in front of the greater than character. E.g. cat /etc/passwd > /tmp/password.txt 2> /tmp/password.err (standard output goes to /tmp/password.txt and standard error goes to /tmp/password.err).

It is also possible to merge these output streams. For example: cat /etc/passwd > /tmp/password.txt 2>&1. This redirects standard error to wherever standard output is going at that point, which is why the 2>&1 has to come after the redirection of standard output. This will mix the two streams as they come so there will be no way to tell what text came from what stream in the destination file.
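
The different forms of redirection can be tried out side by side. This is only a sketch under a few assumptions: everything happens in a scratch directory, and emit is a made-up helper that writes one line to each output stream.

```shell
workdir=$(mktemp -d)

echo "first"  > "$workdir/log.txt"     # creates (or empties) the file
echo "second" >> "$workdir/log.txt"    # appends to it
log_lines=$(wc -l < "$workdir/log.txt" | tr -d ' ')

echo "old" > "$workdir/over.txt"
echo "new" > "$workdir/over.txt"       # a single > replaces the old contents
over_contents=$(cat "$workdir/over.txt")

# emit is a hypothetical helper that writes one line to each stream.
emit() {
    echo "to stdout"
    echo "to stderr" >&2
}

emit > "$workdir/out.txt" 2> "$workdir/err.txt"    # streams kept apart
err_contents=$(cat "$workdir/err.txt")

emit > "$workdir/both.txt" 2>&1                    # streams merged into one file
both_lines=$(wc -l < "$workdir/both.txt" | tr -d ' ')

echo "log.txt has $log_lines lines, over.txt contains: $over_contents"
echo "err.txt contains: $err_contents, both.txt has $both_lines lines"

rm -rf "$workdir"
```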

Command Substitution

Command substitution is the replacing of a command with the output of that command on the command line. It is used when you want the output of one program to be not the input of the next program, but its command line arguments. Just how useful this is will become apparent soon enough.

There are two forms of command substitution. The first uses what is called a back-tick. On a US keyboard it is the character located above the Tab key, on the same key as the tilde character. It's the one that looks like ` (not ', which shares a key with "). With this form, you place the command whose output you want to substitute between backticks on the command line. For example, grep Fred `cat /tmp/phone.book` will first run the command cat /tmp/phone.book, then it will run the command grep Fred CONTENTS_OF_FILE_/TMP/PHONE.BOOK.

The other way to do this is to put the command to be substituted inside of the expression $(). Thus, the previous command would be grep Fred $(cat /tmp/phone.book). This construction will probably make your life easier if you want to make use of nested command substitutions.

Combining Commands

There are several ways to combine commands together. The three usual ways to do it are with the semicolon ';', the double-ampersand '&&', and the double-pipe '||'. They each have different meanings.

The semicolon unconditionally combines commands together. Once the first command finishes, the second command will be executed. Once the second is finished, the third will be executed and so on. This can be useful to do things that you know will work. For example: rm -f /tmp/*.gif; xmessage "All of the gifs in /tmp are gone.".

The double ampersand executes the second command that it conjoins only if the first one succeeds (that is, returns true). For example, ./configure && make && make install.

The double pipe executes the second command only if the first command doesn't succeed. For example, ls /alsmd 2> /dev/null || echo Hello will produce the output Hello since /alsmd does not exist and thus ls will fail because it cannot give you a listing for it. By contrast, ls /tmp 2> /dev/null || echo Hello will produce output such as File_Upload-6.00.tar.gz nbench.tar.gz FlightGear-0.7.1/ netscape.ps since that is the listing for /tmp; ls succeeds, so echo is never run.
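
The three connectives can be compared in one place. This is a minimal sketch using the standard commands true (always succeeds) and false (always fails); the function name demo is made up.

```shell
# demo is a hypothetical wrapper so that the output can be captured in one go.
demo() {
    echo "first"; echo "second"      # semicolon: run one after the other
    true  && echo "and: ran"         # && runs echo because true succeeded
    false && echo "and: skipped"     # && skips echo because false failed
    false || echo "or: ran"          # || runs echo because false failed
    true  || echo "or: skipped"      # || skips echo because true succeeded
    echo "done"
}

output=$(demo)
echo "$output"
```

Neither of the lines containing the word skipped appears in the output.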

Spanning multiple lines

If a command is longer than the current terminal width and you simply keep typing, bash will normally just wrap the line to the next line and allow you to continue typing. If, however, you wish to manually break the lines yourself without executing the command, you can add a \ character to the end of your line. For example:

ls /tmp; \
cd ~/stuff; \
for i in fred.html joe.html mary.html joann.html \
paco.html maribel.html; \
do \
cat $i | grep Fred; \
done;

Echo

Echo is usually both a shell builtin and a program (/bin/echo). It is a simple program which prints out whatever is on its command line. Thus if you enter the command echo Hello There, echo will print out "Hello There" (minus the quotes).

While this might seem useless, it is actually quite useful. Say for example you want to create a file and write one line to it. You can open it up in an editor, but you can also simply say something like echo "The line that I want to write" > file_that_i_want_to_create instead and save yourself some time.

Environment Variables

While not exactly part of the shell itself (the environment is really handed to each program by the operating system when the program is started), the shell does give various ways to use environment variables which are pretty useful.

The concept of an environment variable is that there is something called the environment. It's just a collection of names with associated data. The names are called environment variables. They are meant as a way to set data for programs, especially when multiple programs will use the same data.

On the command line, any text of the form $text is treated as an environment variable and the shell will substitute the value of that environment variable in for the name (and the dollar sign). Thus, if you type in echo $PS1, the value of the environment variable PS1 will be substituted on the command line, and the command that actually gets executed is echo value_of_PS1. This makes it easy to see what the current value of an environment variable is. The other way is the command env, which when executed without any arguments will print out all environment variables and their associated values.

Setting environment variables is pretty straightforward. It actually varies from shell to shell, but we'll only be dealing with bash. If you ever have a need to use a different shell, you'll probably be doing stuff that's advanced enough that learning how to do it in whatever shell you're using will be easy. Anyhow, the way to do it is with the export command. The export command will set an environment variable to the value that you specify. It takes the form export NAME=value, for example export PS1="\u@\h:\w\$ ".

Export will set an environment variable for as long as you are using that particular shell. If you were to log in on a different terminal (such as starting up a new gnome-terminal, a new telnet session, etc.) then that shell would not have the value of the environment variable that you specified.

There is no way to set environment variables permanently. There is, however, a way to get the same effect. This is by setting them in a login script, such as .bash_profile in your home directory. This will be described later.

If you want an environment variable to be set for just one execution of a program, then there is a shortcut to accomplish this. Simply specify the environment variables and their values before the command. For example, CFLAGS="-O3 -mcpu=ev56 -Wa,-m21164a" LDFLAGS=-lffm MAKE=-j2 ./configure. This would set the variables CFLAGS, LDFLAGS, and MAKE for the program configure for that execution of the program. This is generally useful for programs that you want to execute only once, or for when the environment variables in question need values specific to that program. Why this might be the case will be apparent below when we discuss special environment variables.
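
The difference between export and the one-shot form can be observed by asking a child program what it sees. This is a sketch only: GREETING and TEMP_SETTING are made-up names, and sh -c is used simply as a convenient child program. It also shows a detail that the text above does not go into, namely that a variable assigned without export stays private to the shell.

```shell
unset GREETING TEMP_SETTING      # start from a known state

GREETING="hello"                 # set in this shell only, not yet exported
child_before=$(sh -c 'echo "${GREETING:-unset}"')

export GREETING                  # now part of the environment of child programs
child_after=$(sh -c 'echo "${GREETING:-unset}"')

# An assignment in front of a command applies to that one execution only.
one_shot=$(TEMP_SETTING=42 sh -c 'echo "$TEMP_SETTING"')
afterwards=${TEMP_SETTING:-unset}

echo "before export: $child_before, after export: $child_after"
echo "one-shot value: $one_shot, in this shell afterwards: $afterwards"
```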

Special Environment Variables

There are a number of environment variables which are special and not particular to a program. Here are some of them:

PS1

This variable controls what your prompt looks like. The full syntax is documented in the man page for Bash, but some useful escape sequences are:

\u - the user name
\h - the host name
\w - the full path of the current directory
\W - just the last component of the current directory
\t - the current time
\$ - this prints out '$' if you are a normal user and '#' if you are root.

There are plenty of neat things that you can use. Look them up if you're interested in customizing your prompt.

PATH

The PATH environment variable is a list of directories to search for executables when you type in commands without specifying the full path to the program to be executed. The directory names are separated by colons.

As an example, part of my path is: "/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin".

When you type in a command, such as cat, the shell will look through every directory in your path, in order, to find an executable file whose name matches the command that you typed. Once it finds a match, it will execute it with whatever environment and arguments that you've specified.
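
The search can be imitated in a few lines. This is a rough sketch of the idea rather than what bash literally does internally (bash also remembers locations it has already found, for one thing); find_in_path is a made-up helper name.

```shell
# find_in_path is a hypothetical helper; the shell does this search internally.
find_in_path() {
    wanted=$1
    saved_ifs=$IFS
    IFS=:                              # split $PATH on colons
    for dir in $PATH; do
        [ -n "$dir" ] || dir=.         # an empty entry means the current directory
        if [ -f "$dir/$wanted" ] && [ -x "$dir/$wanted" ]; then
            IFS=$saved_ifs
            echo "$dir/$wanted"        # the first match wins
            return 0
        fi
    done
    IFS=$saved_ifs
    return 1                           # not found in any directory
}

find_in_path sh
```

Because the first match wins, the order of the directories in PATH decides which program runs when two directories contain executables with the same name.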

DISPLAY

The DISPLAY environment variable is used by programs that use the X window system. This is detailed more in the section about the X window system, but basically this environment variable is used to tell X programs what X server they should connect to.

The format is "host:display.screen". The host is either the DNS name of the host or its IP address. If this part is omitted, the local machine is assumed. The display is the display number of the X server (normally 0). The screen part selects a screen on that display; it is rarely needed and defaults to 0 when omitted, so just set it to 0 to be on the safe side. As an example, in the Eterm next to the Xemacs window that I'm writing this in, my DISPLAY is ":0.0". On another machine that I'm logged into, my DISPLAY is set to "towerhs.alfred.edu:0.0".

LD_LIBRARY_PATH

To understand this fully you will need to know what dynamic libraries are. For the moment, think of them as libraries of code that a program can use. The process of using them is handled by the operating system. Anyhow, they are stored in files which group similar operations together (they are normally named with the format libwhatever.so). For example, code to deal with jpeg images is stored in libjpeg.so.

There are standard directories that the operating system will search for these libraries (an executable contains the names of what libraries that it needs inside itself). If the libraries that a particular program needs are not in one of these directories, you can still use the program. You simply add those directories to the environment variable LD_LIBRARY_PATH. The operating system will search for libraries in all of the directories specified in LD_LIBRARY_PATH. Its format is the same as the PATH environment variable.

The LD_LIBRARY_PATH has several uses. One of them is for users on a multi-user system who don't have rights to modify a system library path but still need to install their own libraries. They could easily install their libraries to a directory called lib in their home directory and then add that directory to their LD_LIBRARY_PATH. Other uses are for programs which need specific libraries that must be modified from the original libraries. Mozilla (the new netscape) also uses this strategy for some reason, I don't really know why.
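
When adding a directory of your own, it helps to avoid leaving a stray colon behind if the variable was empty to begin with, since an empty entry is commonly treated as the current directory. Here is a minimal sketch of one way to do that; build_lib_path and the directory names are made up for the illustration.

```shell
# build_lib_path is a hypothetical helper; the directory names are made up.
build_lib_path() {
    new_dir=$1
    current=$2
    # ${current:+:$current} expands to ":$current" only when current is
    # non-empty, so no stray colon (an empty entry) is produced.
    echo "${new_dir}${current:+:$current}"
}

from_empty=$(build_lib_path /home/me/lib "")
from_existing=$(build_lib_path /home/me/lib /opt/foo/lib)

echo "$from_empty"
echo "$from_existing"
```

In practice the result would be exported, along the lines of export LD_LIBRARY_PATH=$(build_lib_path "$HOME/lib" "$LD_LIBRARY_PATH").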

LD_PRELOAD

LD_PRELOAD is similar to the LD_LIBRARY_PATH, only you specify a list of files here rather than a list of directories and these files are automatically linked in to the program (i.e. their code is automatically made available) and code from these files is used in preference to code from anywhere else.

HOME

HOME is simply the full path to your home directory. Its primary use is in setting other environment variables. For example, if you want to generically set your path to include the directory bin/ off of your home directory, you could do it like this: export PATH="$PATH:$HOME/bin".

Aliases

An alias is basically a method of redefining commands. When you set up an alias, it is of the form alias original="whatever". This can be used to create shortcuts to commands. For example, you could try alias madonna="cd ~/mp3s/madonna/ && mpg123 *.mp3". This can also be used to provide default arguments to programs. For example, alias ls="ls --color -F". Aliases only apply to commands, so don't worry that ls will be replaced anywhere else in the command line. For example, you could use the command echo Visit my Girlfriend\'s page on alcohols with the previous alias and you will get the correct output.

Wildcards and Special File Characters

Wildcards and other special file characters are used to make your life easier when dealing with files. The standard for Unix systems is to support filenames of around 255 characters, which would be quite annoying to have to type. It would also be quite inconvenient to type out every one of 17 different file names. For this reason, the shell supports special characters that can be used to get the same information across with less typing.

The most important of these characters is the wildcard, or the '*' character. It stands for any sequence of characters, including none at all. Its use is to indicate places where you don't care about what characters are there. For example, if you want to do something with both Bob_smith.txt and Fred_smith.txt, you could simply call them "*_smith.txt" on the command line. For example, rm *_smith.txt would delete both of them. You can stick wildcards in the middle of words too. For example, cat Bob_*.txt is perfectly valid. So is gimp *.jpg, rm *, and cat title*.

Another important character is the tilde '~' character. It represents your home directory. For example, "~/.bash_profile" is the file ".bash_profile" in your home directory.

Please note that since these wildcards are expanded before the program that you are going to run ever sees them, if you want to find out the behavior of a given wildcard, you can always simply issue the command echo ~/T*frie*st.gif to see what that will produce, then use it in a command until you get the hang of how this works. Another benefit of these wildcards being expanded before the program is executed is that they work for all programs since the program itself has nothing to do with them.
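
The echo trick is easy to try in a scratch directory where nothing can be harmed. In this sketch the file names are borrowed from the examples above and the variable names are made up; the last case relies on the default behavior of bash and sh, which leave a pattern untouched when nothing matches it.

```shell
workdir=$(mktemp -d)
touch "$workdir/Bob_smith.txt" "$workdir/Fred_smith.txt" "$workdir/Bob_jones.txt"

# Each echo runs in a subshell that has changed into the scratch directory.
# The shell expands the pattern; echo only ever sees the resulting names.
smiths=$(cd "$workdir" && echo *_smith.txt)
bobs=$(cd "$workdir" && echo Bob_*.txt)
no_match=$(cd "$workdir" && echo *.jpg)

echo "smiths:   $smiths"
echo "bobs:     $bobs"
echo "no match: $no_match"

rm -rf "$workdir"
```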

Tab Completion

Tab completion is a function of bash that can be thought of as a realtime interactive wildcard expansion. When you hit the Tab key, the shell will expand the current word (delimited by whitespace) to its full unique filename, hence the name.

For example, let's say that you want a long listing of /etc/passwd. So you start the command ls -l /e. Now you hit the Tab key. What's on your command line now (unless you have a weird root directory) is ls -l /etc/. You now type in pas so that the line reads ls -l /etc/pas. Another pressing of the Tab key will result in the line now reading ls -l /etc/passwd.

This is how tab completion works. The shell will complete your filename up until the first non-unique character. If the filename that you have partially typed in is unique, then it will complete the name with a space afterwards. Please note that absolute and relative paths still apply here.

This behavior is not unique to file names. If you are typing the first word in the command, tab completion will also work on executables that are in your path.

Tab completion is an extremely useful tool that makes the command line much quicker to deal with. I recommend using it whenever possible as time is valuable. Also, if you press tab twice, the shell will print out all possible completions of the file that you are specifying. If you have typed the file name in preceded by some strange garbage, this may mean that nothing will be printed out as there are no possible completions. Try this out for a bit to get used to it. It is so useful that it gets addictive after a while.

Terminal Association

You may notice that when you run a program, what you type at the keyboard goes to that program and not to the shell. In essence, when you run a program, you have to stop using that shell until the program that you ran is finished. Sometimes you don't want the program that you are running to terminate soon, especially in the case of a program that is actually an X program. There's no reason for it to still keep control of your terminal.

To dissociate a program from your terminal, you simply append an ampersand to the command that you used to start the program. For example, xemacs shell.html & will dissociate the program xemacs from my terminal window so that xemacs continues to execute but it doesn't hog my terminal window, which it isn't using.

This is generally useful when it's quicker to start up an X application from a terminal than a run dialog box or menu (typically when you want to do something with a file that you currently have in the directory that your terminal is in). It can also be used as a quick way to start up a daemon.
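
The effect of the ampersand can be seen even in a script. This sketch uses sleep 30 as a stand-in for a long-running program; the variable names are made up, and kill -0 is used only to ask whether the process still exists.

```shell
sleep 30 &                 # the & starts sleep without waiting for it to finish
bg_pid=$!                  # $! holds the process ID of that command

# The shell is immediately free to run other commands.
if kill -0 "$bg_pid" 2>/dev/null; then
    state_while_running="running"
else
    state_while_running="gone"
fi

kill "$bg_pid"                         # ask the program to terminate
wait "$bg_pid" 2>/dev/null || true     # collect it; the status is non-zero because it was killed

if kill -0 "$bg_pid" 2>/dev/null; then
    state_after_kill="running"
else
    state_after_kill="gone"
fi

echo "while in the background: $state_while_running"
echo "after kill: $state_after_kill"
```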

Foreground, Background, & Suspending

The concepts of foreground and background are particular to terminal input. A program which is in the foreground has control of the terminal input. A program which is in the background is not in control of the terminal input and thus does not receive any input from the terminal.

Before getting into how to move a program from the foreground to the background and vice versa, it is necessary to learn how to suspend a program. This is accomplished by the key combination Ctrl-z. This will prevent the program from getting any further CPU time and will dissociate it from the terminal as well. When this happens, the program is assigned a job number, starting from one, based on the number of jobs that you currently have.

If you want to put a program that you have suspended into the background, so that it resumes operation but does not have a controlling terminal, you use the 'bg' command. The syntax is very simple: bg jobnumber. For example, bg 3. If you omit the job number, it will default to what it thinks is the current job (usually the one that you suspended most recently).

To get a process into the foreground, you use the command fg. It uses the same syntax, fg 1 and it defaults to the current job if you omit the job number.

If you need a list of all the jobs that you have on a given terminal, run the command jobs on that terminal and it will give you a list of them plus their current status.

You can also kill a suspended process pretty easily. The syntax is kill %jobnumber. (see below for what kill does.)

Killing a program with the keyboard

There is normally a way to kill a program with the keyboard. It is the key combination Ctrl-c. This key combination causes SIGINT to be sent to the program in the foreground. Most programs do not handle this signal and thus they simply die from it. However, some programs do trap and ignore this signal, so you have to be a little more crafty. You can find out the program's PID and use another terminal to kill it. My preferred method is to use Ctrl-z to suspend it, then kill %1 (replace %1 with the appropriate number if you have other programs suspended).
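
What trapping a signal looks like can be sketched with a tiny shell script standing in for such a program. Everything here is an illustration under assumptions: the inner sh installs a handler for SIGINT with the trap builtin, sends that signal to itself with kill (which has the same effect on it as Ctrl-c would), and the handler is expected to print a message instead of letting the script die.

```shell
# The inner shell installs a handler for SIGINT, sends that signal to itself,
# and then carries on because the handler replaced the default action.
trapped_output=$(sh -c 'trap "echo caught SIGINT" INT; kill -INT $$; echo "still running"')
echo "$trapped_output"
```

The line still running is printed after the signal was delivered, which is exactly the behavior that makes such programs harder to get rid of with Ctrl-c alone.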

Special Files

/etc/profile

This script is executed on login for every user in the system whose default shell is bash (i.e. everyone on a normal system). It is used to set up a default environment: the path, important environment variables, things of that nature. As well, it's quite common to append the command /usr/games/fortune to it so that every user gets a "fortune" when logging in. It's a very nice touch that I strongly recommend. That logging in is different in an interesting way every time tends to make people feel better and enjoy using the system more. Trifles make perfection and perfection is no trifle (--Michelangelo), after all.

/etc/bashrc

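This file performs the same function as /etc/profile except that this file gets executed more often. /etc/profile is executed for all logins. /etc/bashrc is executed for all sessions, login or not. (On many systems this happens by convention rather than automatically: ~/.bash_profile reads ~/.bashrc, which in turn reads /etc/bashrc.)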

The difference between a login session and a non-login session is a bit unclear, the main difference seeming to be the execution of /etc/profile and ~/.bash_profile or not. Actually, non-login sessions may not get added to the list of currently logged in users so that if one does the command w or finger one will not see those sessions listed.

In any event, /etc/bashrc will be executed every time a new session is created (new logins do this as well). Thus if you want an environment set up all the time you would put it here. In general things such as setting up what the prompt looks like should be done here. Things like printing a login fortune should be left for /etc/profile.

~/.bash_profile

This file serves the same purpose as /etc/profile except that it is user specific (which should be obvious since it resides in a user's home directory; remember what ~/ means). This allows users to customise their environment.

~/.bashrc

This is simply a user-specific analog to /etc/bashrc. It exists for the same purpose as ~/.bash_profile does.

~/.bash_logout

This is a user-specific file which gets executed when a user terminates a session. It's rarely useful but does have its occasional raison d'être. For example, you could add the command clear to this file to clear the screen when you log out so that the next login screen looks nicer if you often use the regular terminal login.

Other Command Shell Information

Some useful command-line programs