Some Useful command-line programs

Please note, virtually all Unix commands are written entirely in lower case. Except in command examples, they will generally be capitalized here for aesthetic purposes. Please assume that they are lower case unless the description explicitly states otherwise.

Man

Man is a program which will print out the manual pages for whatever topic you request (if they exist, of course). Generally, a program will come with man pages which describe its functionality, command-line options, configuration files, and other miscellaneous information.

Man itself has a fairly simple interface. You can run it with man program, in which case it will give you the man page of the program if it has one. If you aren't sure of the name of the man page that you are looking for, you can use the command man -k keyword. Man will then print out all of the manual page names and short descriptions which match. For example, man -k directory | grep create will produce the output:

install-info (8) - create or update entry in Info directory
mkdir (2) - create a directory
mklost+found (8) - create a lost+found directory on a mounted Linux second extended file system
mknod (2) - create a directory or special or ordinary file
lndir (1x) - create a shadow directory of symbolic links to another directory tree
mkfontdir (1x) - create an index of X font files in a directory

As you may notice, you can use this method to search for commands as well as to find the correct name of the page about a package whose name you already know. (See below for what exactly grep does.)

There's one more detail to explain about man pages: the number after the name on the left. This refers to the man page's section. There are nine sections that a man page can be in (this is borrowed from the man page for man):

  1. Executable programs or shell commands
  2. System calls
  3. Library calls (functions within system libraries)
  4. Special files (usually found in /dev)
  5. File formats and conventions e.g. /etc/passwd
  6. Games
  7. Macro packages and conventions eg man(7), groff(7)
  8. System administration commands (usually only for root)
  9. Kernel routines [Non standard]

If there are man pages by the same name in different sections, you select one by putting the desired section number in front of the page name. If no number is specified, the entry from the first section (in numerical order) is taken. Contrast man crontab and man 1 crontab with man 5 crontab (try it out and see what happens; it is what you would expect).

Info

Info is basically man plus hypertext links. Info pages are arranged into sections, subsections, and so on. Some people prefer it; I find it more cumbersome than Man. Either way, there are some things that are documented only in man and others which are documented only in Info.

If you're interested in the history of it, this is the history as far as I know it: Man was around very early on Unix systems. It used a text formatting system known as troff. The FSF people decided that the man format was lacking: they wanted the ability to link documents so that they could break the content up into more manageable sections. They also felt that TeX was a nicer formatting system than troff, so they wanted to switch to that too. Thus info, or more technically, Texinfo, was born.

LS

List files. Ls is the equivalent of dir on an MS-DOS system. By default, it prints out all of the names in roughly alphabetical order, in as many columns as the filename lengths will allow.

If you want more information about a file, try the -l option (-l is for long). It will print out, in this order: the permissions, the number of hard links, the owner of the file, the group that the file is in, the size of the file in bytes, the modification time of the file, and the name of the file.

You will notice that if you use the command ls -l directory/ the result will not be information about the directory itself but instead a long listing of the contents of that directory. The easy way to get around this is to add the -d option. This prints directories as directories rather than printing their contents. Thus you would type ls -ld directory/

If you want more colorful output, try ls --color -F. It prints out files in different colors based on their permissions and types. If you always want this behavior, you can set an alias to this in your profile.

If for some reason you want the file names printed exactly one to a line (perhaps you intend to pass the output of ls through grep) and nothing but the file names, you can use the option -1. For example, ls -1 /tmp.
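
Here is a small sketch of the ls options described above. The scratch directory and the names in it (alpha.txt, beta.txt, subdir) are made up for the example, and it assumes that mktemp is available to create the scratch directory.

```shell
# Work in a scratch directory so the listing is predictable.
demo=$(mktemp -d)
mkdir "$demo/subdir"
touch "$demo/alpha.txt" "$demo/beta.txt"

# Long listing of the directory's contents.
ls -l "$demo"

# Long listing of the directory entry itself, thanks to -d.
ls -ld "$demo"

# One name per line, which is convenient for piping into other tools.
names=$(ls -1 "$demo")
echo "$names"

# Clean up the scratch directory.
rm -r "$demo"
```

The dates, sizes, and owners in the long listings will of course differ from system to system.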

CD

Cd is used to change directory. The semantics are pretty straightforward. It's cd /usr/local/bin or whatever directory that you want to go to. cd ~/checkbook is another example. Cd takes only one argument. If the argument does not begin with a / it is assumed to be relative to the current directory. A leading / implies that it is a full path from the root directory.
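
A quick sketch of relative versus absolute arguments to cd. The directory names are invented for the example (borrowing checkbook from above), and mktemp is assumed to be available.

```shell
# Build a small, made-up directory tree to move around in.
demo=$(mktemp -d)
mkdir -p "$demo/projects/checkbook"

# Absolute path: the leading / means "start from the root directory".
cd "$demo"

# Relative path: no leading /, so it is resolved from the current directory.
cd projects/checkbook
relative_result=$(basename "$(pwd)")
echo "$relative_result"

# An absolute path works no matter where you currently are.
cd "$demo/projects"
absolute_result=$(basename "$(pwd)")
echo "$absolute_result"

# Leave the scratch tree before deleting it.
cd /
rm -r "$demo"
```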

CP

Cp copies files. It handles relative versus absolute paths the same way as all other programs do, since that is how the open() system call operates and not a special property of the programs in question. Anyhow, the basic behavior is to copy every argument but the last into the last argument. If there are only two arguments, the second can be a filename (with an optional path in front of it). If there are more than two arguments, the final argument must be a directory, and all of the files specified will be copied into that directory. Please note that the paths leading to the files to be copied are irrelevant; only the files themselves are copied.

Cp can take many options, but I generally only use a few in the rare instances that I use any at all. By default cp will overwrite an already existing destination file. The -f flag (yes, it means force in this instance too) goes a step further: if the existing destination cannot be opened for writing, cp removes it and tries again. For example, cp -f .emacs.safe .emacs. The other useful flag is the -r flag. As in most other GNU software, this means recursive. In this case, the path structure of any source directories (but not the path leading up to them) is preserved and any subdirectories of a source directory become subdirectories of the target directory (that is, the final argument on the command line).
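
The three forms of cp described above can be sketched like this. All of the file and directory names (src, docs, backup, and so on) are invented for the example, and mktemp is assumed to be available.

```shell
# Set up a made-up source tree in a scratch directory.
demo=$(mktemp -d)
cd "$demo"
mkdir -p src/docs/notes backup
echo "one" > src/a.txt
echo "two" > src/docs/b.txt
echo "three" > src/docs/notes/c.txt

# Two arguments: copy one file to a new name.
cp src/a.txt a-copy.txt

# More than two arguments: the last one must be a directory.
# The paths leading to the sources are dropped; only the files are copied.
cp src/a.txt src/docs/b.txt backup/

# -r copies a directory and everything below it, keeping its structure.
cp -r src/docs backup/

# Show what ended up in backup/.
copied=$(cd backup && find . -type f | LC_ALL=C sort)
echo "$copied"

cd /
rm -r "$demo"
```

Note that b.txt shows up twice in the result: once directly in backup/ from the plain copy, and once under backup/docs/ from the recursive copy.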

RM

Rm removes files specified on the command line. Like all filenames, if the name begins with a / character, it is assumed to contain the full path to the file. If it doesn't start with a / character, its location is assumed to be relative to the current directory.

Rm takes quite a number of flags, but the really interesting ones are -r and -f. The -r flag causes rm to delete recursively. That is, if any one of its arguments is a directory, it removes all of the files in that directory and then gets rid of the directory. If there are any directories in that directory, they will be removed as well, and so on. The -f flag means force. If rm is set up to ask whether you mean what you typed, -f will override that and simply do what you asked without any confirmation. This flag is also useful for situations where the permissions on a file are strange, basically when a file that you own does not have its write permission set for the owner. Otherwise -f is just a convenience so that you will never get prompted for anything, whether or not you actually would have been.
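
A cautious sketch of -r and -f, run entirely inside a scratch directory so that nothing important can be deleted. The file names are invented, and mktemp is assumed to be available.

```shell
# Everything below happens inside a throwaway directory.
demo=$(mktemp -d)
mkdir -p "$demo/tree/deep/deeper"
touch "$demo/tree/file1" "$demo/tree/deep/file2" "$demo/readonly.txt"

# Take away write permission, the "strange permissions" case.
chmod a-w "$demo/readonly.txt"

# -f removes the write-protected file without asking.
rm -f "$demo/readonly.txt"

# -f also keeps rm quiet (and successful) when the file does not exist.
rm -f "$demo/does-not-exist"
status=$?

# -r removes the directory, its files, and all of its subdirectories.
rm -r "$demo/tree"

# Nothing should be left in the scratch directory now.
remaining=$(ls -A "$demo")
rmdir "$demo"
```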

MV

Mv moves files. It's basically just cp followed by rm, though that isn't the actual implementation. It takes roughly the same options that cp and rm take. You are rarely going to need any options with it, so look them up in the man page if you want them.
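
The two common uses of mv, renaming a file and moving it into a directory, look roughly like this. The names report.txt, final.txt, and archive are made up for the example.

```shell
demo=$(mktemp -d)
cd "$demo"
mkdir archive
echo "draft" > report.txt

# Two file names: this is a rename.
mv report.txt final.txt

# Last argument is a directory: the file is moved into it.
mv final.txt archive/

# The content travels with the file; the old names are gone.
moved=$(cat archive/final.txt)
top_level=$(ls -1)
echo "$moved"
echo "$top_level"

cd /
rm -r "$demo"
```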

MKDIR

This program creates directories. Once again a leading / implies an absolute path, and no leading / indicates a relative path. Mkdir will create an empty directory of the name that you specify. The only option of mkdir that I have ever used is the -p option. It acts a bit like the -f option of cp and rm. If the directory of which the new directory will be a subdirectory does not yet exist, mkdir will create it with the -p option. If the target directory already exists, mkdir will return success and not print an error message (normally it is an error to try to create an already existing directory). In this case the directory contents will remain unchanged.
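
To make the behavior of -p concrete, here is a sketch using an invented path a/b/c inside a scratch directory (mktemp is assumed to be available).

```shell
demo=$(mktemp -d)

# Without -p this fails, because the parent directories a/ and a/b/ do not exist.
if mkdir "$demo/a/b/c" 2>/dev/null; then
    plain="created"
else
    plain="failed"
fi

# With -p the missing parents are created along the way.
mkdir -p "$demo/a/b/c"
touch "$demo/a/b/c/keep.txt"

# Running it again on an existing directory succeeds and leaves the contents alone.
mkdir -p "$demo/a/b/c"
again=$?
kept=$(ls -1 "$demo/a/b/c")

echo "$plain $again $kept"
rm -r "$demo"
```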

PS

PS prints a process list. There are all sorts of arguments that you can give it to have it output different pieces of data. There are two ways that you will probably use it: without arguments and with the arguments: "ax".

With no arguments, ps prints out a list of all programs which are related to one's current terminal (this list is usually very short).

The a argument prints out a list of all processes attached to a controlling terminal, yours or otherwise.

The x argument prints out a list of all processes without controlling terminals (most daemons fall into this category).

If you want a list of all processes, ps ax should do it.

Kill

Kill is a program used to send signals to a process. Without a signal argument, it sends SIGTERM. Kill always requires a PID as an argument. For example, let's say that some program that I wrote is in an infinite loop. I use ps to find its PID, which is in this case 12546, then I run the command kill 12546. Let's say that this program traps SIGTERM and ignores it. I'm rather annoyed at this program now, and I really need it to die, so it's time to bring out the big guns. I type kill -9 12546. The -9 argument is the signal number to send to the process. If you look at the signal table on the Linux terms page again, you will notice that signal nine is SIGKILL. It is quite possible to send processes signals by name as well. The previous command is equivalent to kill -KILL 12546. With kill, signal names and numbers are interchangeable.
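
The scenario above can be tried out safely with a stand-in for the stubborn program. This is only a sketch: the background shell that ignores SIGTERM is invented for the demonstration, and it uses kill -0 (which sends no signal and only checks whether the process still exists) to see what happened.

```shell
# A background shell that ignores SIGTERM, standing in for the stubborn program.
sh -c 'trap "" TERM; while :; do sleep 1; done' &
pid=$!

# Give it a moment to install its trap.
sleep 1

# Plain kill sends SIGTERM, which this process ignores.
kill "$pid"
sleep 1
if kill -0 "$pid" 2>/dev/null; then after_term="alive"; else after_term="dead"; fi

# SIGKILL cannot be trapped or ignored. kill -9 would do the same thing.
kill -KILL "$pid"
wait "$pid" 2>/dev/null || true
if kill -0 "$pid" 2>/dev/null; then after_kill="alive"; else after_kill="dead"; fi

echo "after SIGTERM: $after_term, after SIGKILL: $after_kill"
```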

You may remember from the section on signals that the HUP signal can cause inetd to reread its configuration files. The easiest way to send it this signal is to first use ps to find its PID. Then use the command kill -HUP 138, replacing 138 with the actual PID of inetd.

More

More performs the same function as the file-viewing use of cat, except that for files which have more lines than the terminal (programs can find out the dimensions of the terminal that they are in if they want to), it prints out as many lines as will fit, then prompts you to press a key to continue (the prompt is usually something like --More--, hence the name). The standard interface is that the enter key will scroll by a line and the space bar will scroll by a page. Once more prints the last line of the file, it quits.

You will notice some deficiencies - that you can't scroll up and you can't do text searching are the two that I notice most often. This brings us to our next entry:

Less

Less is a program similar to more, except that it fixes the deficiencies of more. Less allows you to use the arrow keys to scroll up and down (or Ctrl-N and Ctrl-P if your terminal emulator is broken). It also allows text searching with reasonably powerful regular expressions (see the page on regular expressions).

Since Less is the successor to more and has many more features than More, it is often joked that in this context, Less is more and More is less. ;-)

Grep

Grep takes its name from the ed editor command g/re/p, which globally searches for a regular expression and prints the matching lines. True to the name, its purpose is to print out lines which match a specified regular expression. It's out of the scope of this document to explain regular expressions, but there are some simple regular expressions which are extremely useful.

The syntax of grep is grep expression file1 file2 file3 ... If no files are listed on the command line, standard input is assumed. It is very useful to know the -i option, which makes grep do its regular expression matching without being case sensitive. Also, if you want all lines except the lines matching the specified regular expression, use the -v option.
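
A small sketch of these options, using a made-up three-line sample file rather than a real configuration file (mktemp is assumed to be available).

```shell
# A made-up sample file to search through.
sample=$(mktemp)
printf '%s\n' 'Talk is cheap' 'ntalk dgram udp' 'ftp stream tcp' > "$sample"

# Plain match is case sensitive: only the "ntalk" line is printed.
grep talk "$sample"

# -i ignores case, so the "Talk" line matches as well.
insensitive=$(grep -i talk "$sample")

# -v inverts the match: every line that does NOT match.
inverted=$(grep -v -i talk "$sample")

# With no file arguments, grep reads standard input.
piped=$(printf 'one\ntwo\n' | grep two)

echo "$insensitive"
echo "$inverted"
echo "$piped"
rm "$sample"
```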

The most basic regular expression, and often the most useful one in grep, is simply a bare word. If I want all lines with my username from the file /etc/passwd, I would use the command grep raistlin /etc/passwd. This would produce the output

raistlin:x:500:500:Christopher T. Lansdown,,,:/home/raistlin:/bin/bash
	  

If for example I wanted all of the configuration lines for talk protocols in /etc/inetd.conf, I would use the command grep talk /etc/inetd.conf which would give me the output:

talk dgram udp wait nobody.tty /usr/sbin/tcpd /usr/sbin/in.talkd
ntalk dgram udp wait nobody.tty /usr/sbin/tcpd /usr/sbin/in.ntalkd

Sed

Sed stands for Stream EDitor. It is a program which can edit an input stream according to specified rules. One of its most common uses is to iterate over every line from standard input and print it to standard output after applying a regular expression substitution to it. Sed is a fairly powerful program, but Perl can be made to duplicate its behavior fairly easily, and if you're not doing the very simple stuff, the more powerful Perl regular expressions will probably make Perl the right choice.

The syntax for doing a regexp replace with sed is -e 's/foo/bar/' where foo is the thing to replace and bar is the thing to replace it with. Regular expressions will be covered in the regular expression page as they are out of the scope of this document, but to do a simple search and replace, you could try this: cat /etc/passwd | sed -e 's/501/1001/' which would print out /etc/passwd with the UID 501 replaced by 1001. The actual file would not of course be affected. Be aware that the match is purely textual: on each line the first appearance of the string "501" is replaced, whether or not it is a UID, and later appearances on the same line are left alone unless you add the g flag, as in 's/501/1001/g'. While this application isn't terribly useful, once you learn regular expressions it will become more interesting.
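
To see the difference the g flag makes, here is a sketch that runs sed on a single made-up passwd-style line instead of the real /etc/passwd.

```shell
# A hypothetical passwd entry in which 501 appears twice (UID and GID).
line='user:x:501:501:Example User:/home/user:/bin/sh'

# Without g, only the first match on the line is replaced.
first=$(printf '%s\n' "$line" | sed -e 's/501/1001/')

# With g, every match on the line is replaced.
all=$(printf '%s\n' "$line" | sed -e 's/501/1001/g')

echo "$first"
echo "$all"
```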

Cut

Cut is a program designed to grab only certain sections out of every line. It does this by breaking each line into sections based upon a delimiter (the default is a tab). It will then print out only those sections which you specify. The syntax for specifying the fields is pretty flexible. It is -f followed by either a number, a comma-separated list of numbers, or a range specified by lower-upper.

While the default behavior is nice, very often we want to use some other delimiter. The passwd file /etc/passwd uses colons to separate fields, for example. To specify a different delimiter, the syntax is -d 'delimiting character'. Please note that only a single character can be used as a delimiter. So if we want a list of the users in /etc/passwd (if you look at the file you will notice that the user is the first field of each entry), we would use the command cat /etc/passwd | cut -f 1 -d ':'. On my system this prints the following names, one per line: root daemon bin sys sync games man lp mail news uucp proxy majordom postgres www-data backup msql operator list irc gnats nobody raistlin bethnewt ftp identd, which is in fact a list of the users on my system.
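
The three ways of specifying fields can be sketched as follows. The two passwd-style entries are made up so that the output is the same on any system.

```shell
# Two hypothetical passwd-style entries, with colons between the fields.
entries='root:x:0:0:root:/root:/bin/sh
alice:x:1000:1000:Alice:/home/alice:/bin/sh'

# A single field: the user names.
users=$(printf '%s\n' "$entries" | cut -d ':' -f 1)

# A comma-separated list of fields: user name and UID.
pairs=$(printf '%s\n' "$entries" | cut -d ':' -f 1,3)

# A range of fields: the first through the third.
range=$(printf '%s\n' "$entries" | cut -d ':' -f 1-3)

echo "$users"
echo "$pairs"
echo "$range"
```

When more than one field is selected, cut puts the delimiter back between them in its output.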

Find

Find is a very nice program for finding files that have certain characteristics. For example if you want a list of all files larger than 8k which were modified after yesterday and end in html that reside in the current directory and all subdirectories of it, find is the program to do it for you.

The syntax of find is relatively simple. The simple construction is find directory/ test1 test2 .... If we want to find the files which matched our example above, we would use the command find . -type f -name "*.html" -mtime 0 -size +8k. Here -mtime 0 matches files modified within the last 24 hours. The full list of tests that find can perform is in its man page.

While find is not so useful by itself, it becomes extremely powerful when used in scripts and programs. Very often one wants to do things only to certain files. For example, one might want to process only those html files which have changed since last time. Well, the first time we process all of the files. Afterwards we create a file that will function like a timestamp. Subsequent executions would get the file list by invoking find like this: find . -type f -name "*.html" -newer timestamp_file. Please note that the * is being handled by find rather than the command shell because we have the * inside of quotation marks.
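
The timestamp trick can be sketched like this. The site directory and its files are invented for the example, and the old dates are set artificially with touch -t so that the result is predictable.

```shell
demo=$(mktemp -d)
cd "$demo"
mkdir site

# An html file that was last changed long ago.
echo "old" > site/old.html
touch -t 200001010000 site/old.html

# The timestamp file records when we last processed everything.
touch -t 200101010000 timestamp_file

# Files changed since then, one html and one that is not html.
echo "new" > site/new.html
echo "notes" > site/notes.txt

# Only html files modified more recently than the timestamp file are listed.
changed=$(find . -type f -name "*.html" -newer timestamp_file)
echo "$changed"

# After processing, update the timestamp so the next run starts from here.
touch timestamp_file
second_run=$(find . -type f -name "*.html" -newer timestamp_file)

cd /
rm -r "$demo"
```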

Locate

Locate is a program which finds files by name. Periodically find is used to build a list of all files on the system. This list is then put into a database which locate uses to print a list of files matching a requested pattern. For example, if we want a list of all jpg files on our system, we would use the command locate '*.jpg'. Please note that in this case the * is being handled by locate and not by the shell because we are putting it inside of the quotes.

Locate can be useful both for shell scripts and for finding files.

Ln

Ln is used to create links. There are two types of links, symbolic links (symlinks) and hard links. Hard links are rarely used because they have no real advantages over symlinks. The difference is that a hard link is a regular file which has two directory entries in a filesystem, while a symlink is a file which points to another file. In the first case, if one link is deleted with rm the file still exists and the other link is still perfectly valid. If the file that a symlink points to is deleted, the symlink still exists but programs will get an error if they try to access it.

The syntax of ln is quite simple. It is ln [-s] original_file link_name. If for example we want to create a symlink named ~/users which points to /etc/passwd, we would use the command ln -s /etc/passwd ~/users.

Please note that if you specify a directory for original_file, then ln -s will create a symlink to that directory. If link_name is an already existing directory, the symlink will be created inside it with the same name as the original. If you cd into the symlink, you will appear to be in a directory of that name, but all of the files in the original directory will be there (as will all subdirectories of the original). If link_name ends in a name which doesn't exist yet, then the symlink will have that name but behave the same way.
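
The difference between the two kinds of link shows up as soon as the original name is removed. A sketch, with invented file names, run in a scratch directory:

```shell
demo=$(mktemp -d)
cd "$demo"
echo "hello" > original.txt

# A hard link: a second directory entry for the same file.
ln original.txt hard.txt

# A symlink: a small file that points at the name original.txt.
ln -s original.txt soft.txt

# Remove the original name.
rm original.txt

# The hard link still reaches the data.
hard_content=$(cat hard.txt)

# The symlink itself still exists, but it now points at nothing.
if [ -L soft.txt ]; then soft_exists="yes"; else soft_exists="no"; fi
if cat soft.txt >/dev/null 2>&1; then soft_state="readable"; else soft_state="dangling"; fi

echo "$hard_content $soft_exists $soft_state"
cd /
rm -r "$demo"
```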

Top

Top is a program that can be thought of as an interactive ps that updates itself periodically. It prints out all processes on the system, sorted by CPU usage by default. The best way to figure it out is to try it; it is self-explanatory. To get help while running top, use the ? key. To quit, simply hit the q key.

DF

Df is a simple but useful utility which prints out how full each mounted partition on the system is.

Free

Free prints out how much RAM and swap are used and free on the system, as well as how much total RAM and swap are available. Note: Linux uses almost all otherwise unused RAM for disk buffering. This does not reduce the amount of RAM available to the system; when a program needs more memory, it is simply deallocated from the disk buffer and allocated to the program. This is what the line with +/- buffers/cache is about.