HowToMakeErrorCorrectedCDBackups

4 CD, Raid 5. (Score:5, Interesting)
by evilviper (135110) on Sunday July 19, @09:43PM (#28751857) Journal

I used to make 2 CDs of every ISO, until I figured out how best to utilize PAR2.

PAR2 calculates parity information for a set of files, and writes out recovery data which can be used to repair the set in the event that any of the files is damaged. This is quite similar to RAID-5, but PAR2 is more robust, and works on any files, not just equally sized hard drives.
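
To make the RAID-5 comparison concrete, here is a toy sketch of single-parity recovery using XOR over three one-byte "blocks". It only illustrates the RAID-5 side of the analogy: PAR2 itself uses Reed-Solomon coding rather than plain XOR, which is what lets it survive more than one missing block. The byte values are arbitrary.

```shell
#!/bin/sh
# Toy RAID-5-style parity over three one-byte "data blocks".
# The values are arbitrary examples, not taken from any real disc.
a=165   # 0xA5
b=60    # 0x3C
c=15    # 0x0F

# Parity is the XOR of all data blocks.
parity=$(( a ^ b ^ c ))

# Pretend block b is lost: XOR the survivors with the parity to rebuild it.
rebuilt_b=$(( a ^ c ^ parity ))

echo "parity=$parity rebuilt_b=$rebuilt_b"
```

Any one of the three blocks can be rebuilt the same way, which is the property the 4-disc scheme relies on.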

Though it's no help on DVDs, CDs work GREAT with PAR2, because of their two different methods of recording. Mode 1, in which all regular files are stored, reduces the amount of space available by about 12.5%, using that space for additional error correction data. Video CDs, where a single bit error isn't nearly as critical, are recorded in Mode 2, with substantially reduced error correction, but about 100 MB more usable space available. (Audio CDs likewise do without that extra layer of error correction.)
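
The capacity difference can be checked with a little arithmetic. This sketch assumes an 80-minute CD-R read at 75 sectors per second, with 2048 user bytes per sector in Mode 1 and 2336 in Mode 2; those figures are standard sector sizes rather than numbers from the post above.

```shell
#!/bin/sh
# Assumed figures: an 80-minute CD-R at 75 sectors per second.
sectors=$(( 80 * 60 * 75 ))

mode1_bytes=$(( sectors * 2048 ))   # Mode 1: 2048 user bytes per sector
mode2_bytes=$(( sectors * 2336 ))   # Mode 2: 2336 user bytes per sector

mib=$(( 1024 * 1024 ))
echo "Mode 1: $(( mode1_bytes / mib )) MiB"
echo "Mode 2: $(( mode2_bytes / mib )) MiB"
echo "Extra:  $(( (mode2_bytes - mode1_bytes) / mib )) MiB"
```

The difference comes out just under 100 MiB, in line with the rough figure quoted above.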

PAR2 is itself resilient to errors, so its recovery data can safely be recorded in Mode 2. This leaves much more space for the parity information, and so the opportunity to be safe against, and correct, correspondingly more damage to a disc.

Specifically, I recommend a 4-disc parity set. You fill 3 CDs full of data, and tell PAR2 to calculate 37% recovery data on those files. The first 33.334% allows you to RECOVER THE DATA FROM ONE COMPLETELY LOST CD, no matter which of the 3 it is. That still leaves you with a margin of about 3.67%, so the two data CDs you DO have can have a few bad sectors as well, and all the data from the lost CD, as well as undamaged versions of the files on the two lightly damaged CDs, can be recovered. Alternatively, if you DON'T lose an entire CD, all three (4 actually) CDs can have numerous bad sectors, in any distribution, up to a total amounting to 37% of the original data, and pristine data can still be recovered.
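
As a rough check of that budget, the sketch below works through the numbers for three 700 MB data discs. The sizes are the example figures used in this post; the integer arithmetic is approximate.

```shell
#!/bin/sh
# Assumed sizes: three 700 MB data discs and 37% recovery data.
disc_mb=700
data_discs=3
redundancy_pct=37

data_mb=$(( disc_mb * data_discs ))                 # total source data
recovery_mb=$(( data_mb * redundancy_pct / 100 ))   # size of the recovery data
spare_mb=$(( recovery_mb - disc_mb ))               # left after rebuilding one lost disc

echo "source data:               ${data_mb} MB"
echo "recovery data:             ${recovery_mb} MB"
echo "spare after one lost disc: ${spare_mb} MB"
```

The recovery data comes to 777 MB, which is too big for a Mode 1 disc but fits in the roughly 800 MB a Mode 2 disc offers.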

The method to do all this is quite simple. Just run the par2create command, telling it to create 37% recovery information. Then take the resulting BASENAME.Par2+??????? file, and create a CUE file describing a CD whose contents are the PAR2 file, presented as the supposed audio data. (The example below puts a short first track ahead of the main one, VCD-style, which is why track 2 is the one read back later.) E.g.:

    FILE "par2.bin" BINARY
        TRACK 01 MODE2/2352
            FLAGS DCP
            INDEX 01 00:00:00
        TRACK 02 MODE2/2352
            FLAGS DCP
            INDEX 00 00:04:00
            INDEX 01 00:06:00
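
The parity-creation step mentioned above might look like the following with par2cmdline. This is a sketch under assumptions: `backup.par2` and the three archive names are hypothetical, and the script only prints the command unless you set `DRY_RUN=0`.

```shell
#!/bin/sh
# Sketch only. Hypothetical names: backup.par2 and the three archives.
# DRY_RUN=1 (the default here) prints each command instead of running it.
run() {
    if [ "${DRY_RUN:-1}" -eq 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# -r37 requests 37% recovery data; -n1 asks for a single recovery volume,
# so there is only one large file to put on the fourth disc.
run par2 create -r37 -n1 backup.par2 archive-a.tar archive-b.tar archive-c.tar
```

Whatever recovery file par2 produces then has to be given the name the CUE sheet refers to (`par2.bin` in the example above).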

Now, any CD recording software that understands CUE files will happily record this to disc. On Unix systems, you can choose cdrecord or cdrdao.
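
With cdrdao, the burn could be started roughly as shown here. The device path `/dev/sr0` and the file name `par2.cue` are assumptions, and your drive may need extra options such as a driver or speed setting. The command is only printed unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch only. Assumptions: the burner is /dev/sr0 and the CUE sheet was
# saved as par2.cue next to par2.bin.
DEVICE=${DEVICE:-/dev/sr0}

run() {
    if [ "${DRY_RUN:-1}" -eq 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# cdrdao's "write" command accepts a CUE sheet in place of its own TOC format.
run cdrdao write --device "$DEVICE" par2.cue
```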

As with regular audio CDs and Video CDs, you can't just use or copy this data off the disc like a normal file on a CD. There are programs for converting VCDs into regular files, something like dat2mpeg, but I prefer a more generalized tool that can do the job:

    mplayer vcd://2 -dumpstream -dumpfile par2.bin

You'll note that checksums of the original file and the data dumped from the disc don't quite match... This is because, in Mode 2, data MUST be padded to the block size. PAR2 is fine with that, and silently discards the padding.
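
The effect of the padding is easy to reproduce with made-up data. In this sketch the 2352-byte block size is taken from the MODE2/2352 CUE sheet above, and the padding is assumed to be zero bytes; the size and content your dump tool actually produces may differ.

```shell
#!/bin/sh
# Sketch with made-up data: a padded dump has a different checksum from the
# original, but its first <original size> bytes are identical.
SECTOR=2352   # assumption, taken from the MODE2/2352 CUE sheet
tmp=$(mktemp -d)

printf 'example recovery data' > "$tmp/orig.bin"
size=$(( $(wc -c < "$tmp/orig.bin") ))

# Round the size up to the next multiple of the sector size.
padded_size=$(( (size + SECTOR - 1) / SECTOR * SECTOR ))

# Simulate the dump: original bytes followed by zero padding (assumed).
cp "$tmp/orig.bin" "$tmp/dump.bin"
dd if=/dev/zero bs=1 count=$(( padded_size - size )) >> "$tmp/dump.bin" 2>/dev/null

orig_sum=$(cksum < "$tmp/orig.bin")
full_sum=$(cksum < "$tmp/dump.bin")
trimmed_sum=$(head -c "$size" "$tmp/dump.bin" | cksum)

echo "padded size: $padded_size bytes"
[ "$full_sum" != "$orig_sum" ] && echo "whole-file checksums differ"
[ "$trimmed_sum" = "$orig_sum" ] && echo "first $size bytes match the original"

rm -rf "$tmp"
```

So if you want to verify a dump by checksum, compare only the first `size` bytes rather than the whole file.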

Something like dd_rescue to copy the (normal) files off the other CDs, in the event of damage, is probably necessary as well. Then, once you've got 3 CDs' worth of data (e.g. 700 MB CDs x 3 = 2100 MB), you can run par2repair and all will be repaired, like magic.
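
Put together, the recovery pass might look something like this. It is a sketch under several assumptions: each surviving data disc is mounted at `/mnt/cdrom` in turn, the file dumped from the parity disc has been saved as `restore/backup.par2`, and both dd_rescue and par2cmdline are installed. Commands are only printed unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch only. /mnt/cdrom, restore/ and backup.par2 are hypothetical names.
run() {
    if [ "${DRY_RUN:-1}" -eq 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

run mkdir -p restore

# Copy each file off the mounted disc; dd_rescue carries on past unreadable
# sectors instead of giving up at the first read error.
for f in /mnt/cdrom/*; do
    run dd_rescue "$f" "restore/$(basename "$f")"
done

# Rebuild damaged or missing files from the recovery data.
run par2 repair restore/backup.par2
```

Repeat the copy loop for every data disc you still have before running the repair step.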

The only footnote is that calculating the parity information isn't fast, so this method is probably slower than just recording 2 copies of every CD. Also, if you lose more than 37% of the data across all the discs, the error-free originals can't be recovered. However, I consider it more reliable than duplicate discs, if only because the odds of an error on the same sector of two discs (or one disc lost, and the backup with a few errors) seem higher than the odds of 37% of the discs being damaged beyond hope. And as an added bonus, you save a third on your CD-R purchases.
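
The saving is simple to verify for the 3-disc example: duplicating every disc takes six blanks, while the parity set takes four.

```shell
#!/bin/sh
# Disc counts for three discs' worth of data.
data_discs=3
duplicate_total=$(( data_discs * 2 ))   # every disc burned twice
parity_total=$(( data_discs + 1 ))      # three data discs plus one PAR2 disc
saved=$(( duplicate_total - parity_total ))

echo "duplicates: $duplicate_total discs, parity set: $parity_total discs"
echo "saved: $saved of $duplicate_total discs"
```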

Page last modified on July 20, 2009, at 08:41 PM