Tuesday, 25 September 2012

Data Recovery in Linux, Part-1


In the first part of this series, I will cover how to perform data recovery in Linux, that is, getting your data back after you have formatted a drive or lost data for some other reason.
About a month back, a colleague mistakenly formatted his external hard disk, which contained precious data. So I spent the next few hours searching for a solution to his problem. I came across a lot of proprietary software that claimed to fix things, but as a FOSS advocate, I was bent on finding an open source application. After hours of searching, I finally succeeded! CGSecurity offers two utilities, TestDisk and PhotoRec, which are free to use. You can get them here.
The following table lists the dos and don’ts that must be followed in case you have lost data (on external media, on the internal hard disk of your laptop, etc).








File deletion — what happens?

Usually, when we delete a file, it goes to the Trash/Recycle Bin, provided the file is not larger than the bin’s capacity. Files in the bin can be restored easily. If you have emptied the Recycle Bin, or permanently deleted the file using Shift+Delete or an rm command at the command line, you may think the file is irretrievable, but it isn’t. At this point only a pointer to the file has been deleted, not the data, which remains on the disk.
Data stored on a medium as files has a “table of contents” indicating the storage location for each file (name) on the drive. When a file is “deleted”, its entry is removed from the table of contents. This indicates that the space previously used by the file can be used to store other data — it is now “available” space.
When a new file is written to the disk, and uses this “available” space, it replaces the previous data with new data. Now, recovery of the old file becomes very difficult (though not impossible for experts). Thus, when you find that you have accidentally messed up and lost data that you want, you should avoid writing new data to the medium. This means you shouldn’t continue running the OS that is using that partition or medium in read-write mode; even applications like a browser caching files, the OS downloading and applying updates in the background, or downloading new files and installing new packages, could cause the space of the deleted file to be overwritten.
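The behaviour described above can be sketched with a toy model. This is only an illustration, not how any real filesystem is implemented: the ToyDisk class, its block layout and the file names are all made up for the example.

```python
class ToyDisk:
    """A toy disk: a list of data blocks plus a 'table of contents'."""

    def __init__(self, num_blocks):
        self.blocks = [None] * num_blocks  # raw data areas
        self.table = {}                    # file name -> list of block indices

    def free_blocks(self):
        # A block is "available" if no table entry points to it.
        used = {i for indices in self.table.values() for i in indices}
        return [i for i in range(len(self.blocks)) if i not in used]

    def write(self, name, chunks):
        free = self.free_blocks()
        if len(chunks) > len(free):
            raise OSError("disk full")
        indices = free[:len(chunks)]
        for i, chunk in zip(indices, chunks):
            self.blocks[i] = chunk         # replaces whatever was stored there
        self.table[name] = indices

    def delete(self, name):
        # Only the table entry is removed; the blocks are left untouched.
        del self.table[name]


disk = ToyDisk(num_blocks=4)
disk.write("photo.jpg", ["JPEG-1", "JPEG-2"])
disk.delete("photo.jpg")
after_delete = list(disk.blocks)   # the old data is still on the "disk"

disk.write("update.bin", ["NEW-1"])
after_write = list(disk.blocks)    # one old block has now been overwritten

print(after_delete)
print(after_write)
```

In this model, the second write lands on a block that the deleted file used to occupy, which is exactly the risk on a disk that is still in use.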
Therefore, shut down that OS immediately, and boot with a live CD or a live USB that does not write to that medium.
Tip: CGSecurity has a list of live CD images that include their utilities, here. You could consider keeping a CD of one of these in your toolkit to guard against such emergencies.
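Once you have booted the live system, it is worth confirming that nothing on the affected disk is mounted read-write. Here is a minimal sketch of such a check, assuming the usual Linux /proc/mounts layout (device, mount point, filesystem type, options). The function name writable_mounts and the sample text are my own, for illustration only.

```python
def writable_mounts(mounts_text, device_prefix):
    """Return (device, mount_point) pairs under device_prefix mounted read-write."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mount_point, _fstype, options = fields[:4]
        if device.startswith(device_prefix) and "rw" in options.split(","):
            found.append((device, mount_point))
    return found


# Made-up sample in the /proc/mounts format.
sample = (
    "/dev/sda1 / ext4 rw,relatime 0 0\n"
    "/dev/sdb1 /media/usb vfat ro,nosuid 0 0\n"
    "/dev/sdb2 /media/data ext4 rw,nosuid 0 0\n"
)
print(writable_mounts(sample, "/dev/sdb"))
```

On a real system you would pass in the contents of /proc/mounts instead of the sample, for example open("/proc/mounts").read(), with the device name of the disk you want to recover.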

Overview — TestDisk

This software should be used to recover lost partitions, or to make non-bootable disks bootable again. TestDisk claims to perform the following operations:
  • Fix partition tables and recover deleted partitions
  • Recover the FAT32 boot sector from its backup
  • Rebuild the FAT12/FAT16/FAT32 boot sector
  • Fix FAT tables
  • Rebuild the NTFS boot sector
  • Recover the NTFS boot sector from its backup
  • Locate the ext2/ext3/ext4 backup superblock
  • Undelete files from FAT, exFAT, NTFS and ext2 filesystems
  • Copy files from deleted FAT, exFAT, NTFS and ext2/ext3/ext4 partitions

PhotoRec

This software claims to recover individual files in formats such as ZIP, Office, PDF, HTML and JPEG, plus 390 other file extensions. In case TestDisk fails to do the job, use PhotoRec. Personally, I prefer PhotoRec.
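To give a rough idea of how a tool can find files by type without relying on the filesystem, here is a greatly simplified sketch of signature-based searching. It is not PhotoRec's actual code: it only scans a byte buffer for the three bytes that JPEG files start with, and the function name and sample data are invented for the example.

```python
JPEG_HEADER = b"\xff\xd8\xff"  # JPEG files begin with these bytes


def find_signature_offsets(data, signature=JPEG_HEADER):
    """Return every offset in data at which signature occurs."""
    offsets = []
    pos = data.find(signature)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(signature, pos + 1)
    return offsets


# A fake "raw disk": zero padding with two JPEG-like fragments in it.
raw = (
    b"\x00" * 16 + JPEG_HEADER + b"image-one"
    + b"\x00" * 7 + JPEG_HEADER + b"image-two"
)
print(find_signature_offsets(raw))
```

A real recovery tool has to do much more than this, such as working out where each file ends and checking that the contents are valid.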

Using PhotoRec

PhotoRec recovers files by finding deleted files and copying them to another disk or medium. Do not recover files to the partition or medium that holds the deleted files, because that could overwrite the very data you are trying to recover, as explained earlier. Also, PhotoRec will most likely recover a lot of files, so the partition or medium on which you store the recovered data should ideally be at least as large as the one being recovered.
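Before starting, you can check whether the destination has enough free space. The following is a small sketch using Python's standard shutil.disk_usage. The helper name has_room is my own, and the size of the source has to be supplied by you (here, a nominal 8 GB).

```python
import shutil


def has_room(destination_dir, source_size_bytes):
    """Return True if the filesystem holding destination_dir has
    at least source_size_bytes of free space."""
    free = shutil.disk_usage(destination_dir).free
    return free >= source_size_bytes


EIGHT_GB = 8 * 1000**3  # nominal capacity of an 8 GB pen drive, in bytes
print(has_room(".", EIGHT_GB))
```

Replace "." with the directory where you plan to store the recovered files.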
The possible modes for recovery are:
  • Recover the files to a separate hard drive.
  • Recover the files to a networked storage drive.
  • Recover the files to a separate partition on the same hard drive.
I personally prefer the first option. For this walkthrough, I have formatted an 8 GB pen drive that contained some data. I will show you how to recover data from it, using my laptop that runs Ubuntu 11.10.
First, download PhotoRec from the link mentioned earlier, and unpack it. Next, fire up the terminal, and navigate to the directory where you extracted it. Launch PhotoRec with: sudo ./photorec_static. Select the disk from which to recover data. In my case, it is my HP pen drive, shown as /dev/sdb (Figure 1).
Figure 1: Disk selection in PhotoRec
Next, select a partition, as shown in Figure 2. You can also go to Options to select the specific types of files to recover (e.g., Figure 3).
Figure 2: Partition selection
Figure 3: File type information
As you can see in Figure 4, I have ‘unticked’ the last four file types, such as bmp, bkf, etc. After you are done, quit and return to the main menu.
Figure 4: File type selection
Select the file system of the partition on which your data was stored, as shown in Figure 5.
Figure 5: Filesystem type selection
Proceed with the recovery (similar to Figure 6). This will take a long time, so be patient.
Figure 6: Recovery in progress
After the recovery is done, you will get a lot of folders, as shown in Figure 7.
Figure 7: Recovered files and folders
It’s up to you to search through them for the data you want to keep. You can now unmount or eject the partition/drive on which the recovered files were stored, shut down, and reboot into the distribution you normally use. You can then dig through the recovered files from there, if you prefer.
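Digging through the results by hand can be tedious, so a small script can give you an overview first. The sketch below counts the recovered files by extension. It assumes the output folders follow PhotoRec's usual recup_dir.N naming inside the directory you chose. The function name count_by_extension is my own.

```python
from collections import Counter
from pathlib import Path


def count_by_extension(recovery_root):
    """Count files by extension across all recup_dir.* folders in recovery_root."""
    counts = Counter()
    for directory in Path(recovery_root).glob("recup_dir.*"):
        if not directory.is_dir():
            continue
        for path in directory.iterdir():
            if path.is_file():
                # Treat .JPG and .jpg as the same type.
                counts[path.suffix.lower() or "(none)"] += 1
    return counts


# Prints nothing if the current directory has no recup_dir.* folders.
for extension, number in count_by_extension(".").most_common():
    print(extension, number)
```

Replace "." with the directory that holds the recovered folders. The output lists the most common file types first, which helps you decide where to start looking.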
In the second part of this article, I will discuss data recovery using more advanced tools.
