Recovering from a Crash

Though it is uncommon, there are several reasons why a Linux machine might crash:

All of these will leave your filesystems needing to be checked with fsck, since they were not shut down properly. This can be avoided, however, in the swap thrashing case and possibly the X case. The method is relatively simple. You need the magic SysRq key, which is the key combination Alt-SysRq. SysRq normally shares a key with Print Screen. Hold these two keys down while typing the following keys in order, with about one second between each: s (three times), u, b. The s flushes all unwritten data to disk (the three repetitions are just to make sure), the u remounts the disks read-only, and the b reboots your computer. This leaves the disks in a consistent state, so they will not need to be checked at bootup. Note that this only works if magic SysRq support is enabled in your kernel.
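Whether the key combination is honoured depends on a kernel setting that, on kernels that provide it, can be read from /proc/sys/kernel/sysrq. The following is a minimal sketch for checking that setting before you need it. The helper name sysrq_allows is made up for illustration. The bit values used (16 for sync, 32 for remount read-only, 128 for reboot) and the meaning of 0 (everything off) and 1 (everything on) are taken from the kernel's sysrq documentation, so check them against your own kernel version.

```shell
# Decide whether a given sysrq setting permits a given key.
# Usage: sysrq_allows VALUE KEY   (KEY is s, u, or b)
# sysrq_allows is a hypothetical helper name, not a standard tool.
sysrq_allows() {
    value=$1
    case $2 in
        s) bit=16 ;;    # sync
        u) bit=32 ;;    # remount read-only
        b) bit=128 ;;   # reboot
        *) return 2 ;;  # a key this sketch does not know about
    esac
    # A value of 1 means every SysRq function is enabled.
    [ "$value" -eq 1 ] && return 0
    # Otherwise the value is a bitmask; test the bit for this key.
    [ $(( $value & $bit )) -ne 0 ]
}

# Report the state of the three keys used in the recovery sequence.
if [ -r /proc/sys/kernel/sysrq ]; then
    current=$(cat /proc/sys/kernel/sysrq)
    for key in s u b; do
        if sysrq_allows "$current" "$key"; then
            echo "Alt-SysRq-$key: enabled"
        else
            echo "Alt-SysRq-$key: disabled"
        fi
    done
else
    echo "no /proc/sys/kernel/sysrq here; magic SysRq may not be compiled in"
fi
```

The script only reads the setting and changes nothing, so it is safe to run at any time.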

If the above doesn't work, just hit the reset switch. When the magic SysRq key doesn't work, there's nothing more that can be done short of a hard reboot.

If you were not so fortunate as to be able to use the magic SysRq key to sync the disks and reboot the system, there are three possible outcomes (listed in order of likelihood):

  1. The system comes back up and takes a long time on fsck (which prints out lots of stuff). Eventually you come back to the login screen and life is normal.
  2. The system comes back up and takes a long time on fsck. Eventually it says that it encountered unexpected inconsistencies and that you must run fsck manually. It then prompts you for the root password.
  3. The system doesn't come back up. Either your BIOS gives you errors or the machine just doesn't boot.

In the first case you were just inconvenienced and everything should be fine. You will have lost any unsaved data that was on your system, but everything else should be OK. Don't worry about it and keep on going. Buy a UPS, scold your roommate, restart Netscape more often, fix the bug, fix X or get a better-supported video card, or stop using binary-only drivers, as appropriate.

In the second case, you will need to run fsck manually. First note which partition fsck said it encountered problems on. This will be something of the form /dev/hda3, /dev/hdb2, or /dev/sdc7. The /dev/ is simply the directory where device files are stored. The hd or sd stands for IDE hard drives or SCSI hard drives respectively. The a, b, c, etc. says which drive it is: the first drive is a, the second is b, and so on. Finally, the number says which partition on that drive is being dealt with. What this all means doesn't really matter; just be able to recognize the name so that you can type it below. Next, enter the root password. You will then have a shell prompt. Run the command fsck -y /dev/hda3, replacing /dev/hda3 with whatever partition fsck reported above. This will print a whole bunch of output as it works, and eventually it will finish. Now type the command reboot. This will restart the system, and it should come up normally.
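The naming scheme described above can be illustrated with a small sketch that takes a device name apart. It is only an illustration of the hd/sd convention as described here. The function name describe_device is made up, and other naming schemes (newer kernels, for example, may name IDE and SATA drives sd as well) are not handled.

```shell
# Split a name such as /dev/hda3 into bus type, drive letter, and partition.
# describe_device is a hypothetical helper written for this illustration.
describe_device() {
    name=${1#/dev/}            # strip the /dev/ directory prefix
    case $name in
        hd[a-z][0-9]*) bus=IDE ;;
        sd[a-z][0-9]*) bus=SCSI ;;
        *) echo "unrecognized device name: $1" >&2; return 1 ;;
    esac
    rest=${name#??}            # drop the two-letter hd or sd prefix
    part=${rest#?}             # everything after the drive letter
    letter=${rest%"$part"}     # the drive letter itself
    echo "$bus drive $letter, partition $part"
}

describe_device /dev/hda3
describe_device /dev/sdc7
```

For /dev/hda3 this reports an IDE drive a, partition 3, which is the first IDE drive's third partition.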

What happened is that fsck encountered problems that it isn't configured to fix automatically. The -y option tells it to answer yes to every question, that is, to fix everything no matter what, whether or not fsck knows that the fix will be successful. The default behavior is a protection for people who can afford to send their hard drive off to a specialist to recover the data that was on it (as far as I can tell, at least). It's a minor inconvenience, but it can be nice to have this safety net: in its default configuration, fsck won't do anything that could potentially be harmful.
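The manual repair described above can be sketched as a short script. This is a cautious illustration rather than a recommended tool: the DRY_RUN switch and the run, fsck_outcome, and manual_repair helpers are all made up for the example, and with DRY_RUN=1 the script only prints the commands it would run. One detail it shows is that fsck reports what happened through its exit status, which according to the fsck(8) manual page is a sum of flags: 0 means no errors, 1 means errors were corrected, 2 means the system should be rebooted, and 4 or more means something went wrong or was left unfixed.

```shell
DRY_RUN=1   # hypothetical switch: 1 prints commands, 0 really runs them (as root)

# Run a command, or just print it when DRY_RUN is 1.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Turn an fsck exit status into a one-word summary.
fsck_outcome() {
    code=$1
    if [ "$code" -eq 0 ]; then
        echo "clean"        # no errors found
    elif [ "$code" -ge 4 ]; then
        echo "failed"       # errors left uncorrected, or fsck itself failed
    else
        echo "repaired"     # 1, 2, or 3: errors corrected and/or reboot advised
    fi
}

# Check one partition, answering yes to every question, then reboot
# unless fsck reported that it could not finish the job.
manual_repair() {
    device=$1
    rc=0
    run fsck -y "$device" || rc=$?
    outcome=$(fsck_outcome "$rc")
    echo "fsck outcome: $outcome"
    if [ "$outcome" != failed ]; then
        run reboot
    fi
}

manual_repair /dev/hda3   # replace with the partition fsck complained about
```

If the outcome is "failed", rebooting is unlikely to help, and the hardware failure advice that follows is the more relevant part.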

If you experienced a hardware failure, replace the appropriate part. If it was your hard drive, you're basically out of luck and need to get a new one and re-install. There are supposedly hard drive specialists around who can read the data off the drives in order to put it on a new drive, but this is extremely expensive and not for normal people. I recommend having a backup policy for all your important data. You'll really be in trouble if you don't. It's one of the few certain things in computers: those without backups will eventually regret it.

Anyhow, this should be all you need to know about how to recover from a crash. I've never experienced filesystem corruption except for one time when my hard drive died on me, so don't worry about crashing. Try to avoid it, but don't get overly paranoid about it. It's normally quite recoverable.