[TriLUG] Copying Files from a Dying HDD...with a twist

David Burton via TriLUG trilug at trilug.org
Tue Jul 3 17:13:41 EDT 2018


I've learned the hard way that hard disk drives with bad blocks often get
progressively worse as recovery proceeds. So the best approach is to *first*
mount the file system read-only and copy the most important files from it.
*Then* do the ddrescue recovery.
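
Roughly, it might look like this (the device name, paths, and the
ext3/ext4-specific "noload" option are just placeholders to adapt to your
setup):

    mkdir -p /mnt/dying
    mount -o ro,noload /dev/sdX1 /mnt/dying   # read-only; don't replay the journal
    cp -a /mnt/dying/most-important-stuff /safe/place/
    umount /mnt/dying

    # Then image the whole partition with ddrescue, keeping its logfile:
    ddrescue -d /dev/sdX1 /safe/place/disk.img /safe/place/disk.log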

Then, when ddrescue gives up and can no longer recover any additional data,
you have a couple of options.

With ddrescue you can fill the unrecovered sectors with a recognizable
repeating byte sequence, like "*BAD_DISK_BLOCK*".  Then you can search the
rescued files for that string to find the damaged files.
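
The fill-mode invocation looks roughly like this (file names and the mount
point are placeholders; the "-" type tells ddrescue to fill only the blocks
still marked bad in the logfile):

    ddrescue --fill-mode=- <(printf '*BAD_DISK_BLOCK*') disk.img disk.log

    # With the image mounted somewhere, list the files containing the marker:
    grep -rlF '*BAD_DISK_BLOCK*' /mnt/rescued > damaged-files.txt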

Better yet, make a couple of copies of the recovered disk image (being sure to
also save the ddrescue logfile), and then use ddrescue's fill mode to fill the
bad blocks two different ways, e.g., with 0x00 bytes in one copy and with
0xFF bytes in the other.
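
Something along these lines (again, file names are placeholders, and a
separate copy of the logfile is kept alongside each image):

    cp disk.img disk-00.img
    cp disk.log disk-00.log
    cp disk.img disk-ff.img
    cp disk.log disk-ff.log
    ddrescue --fill-mode=- /dev/zero disk-00.img disk-00.log          # fill bad blocks with 0x00
    ddrescue --fill-mode=- <(printf '\377') disk-ff.img disk-ff.log   # fill bad blocks with 0xFF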

Then mount the two image files (read-only, using the loop device) and copy the
files from both of them into two directory trees.

The files which are identical in the two trees are probably good. The files
which differ in the two trees are definitely damaged.
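
A rough sketch of that comparison (mount points and tree paths are
placeholders, and this assumes each image holds a single file system rather
than a whole disk with a partition table):

    mkdir -p /mnt/img00 /mnt/imgff /tree00 /treeff
    mount -o loop,ro disk-00.img /mnt/img00
    mount -o loop,ro disk-ff.img /mnt/imgff
    cp -a /mnt/img00/. /tree00/
    cp -a /mnt/imgff/. /treeff/
    umount /mnt/img00 /mnt/imgff

    # Files listed here differ between the trees, so they contain bad blocks:
    diff -rq /tree00 /treeff > differing-files.txt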

If it were a Windows file system, I'd have some additional suggestions,
working from the ddrescue logfile to identify the damaged files. But with
Linux file systems I don't know if there's a way to identify which file
contains a particular cluster or sector number.

Dave



On Tue, Jul 3, 2018 at 11:52 AM, Sean Korb via TriLUG <trilug at trilug.org>
wrote:

> This is almost completely non sequitur but a decade ago I worked with a
> boutique filesystem (no longer available so you should be safe) that
> attempted to recover from a double disk failure on its array of RAID
> arrays.  It was not successful and while file size and metadata were fully
> retained, I discovered some characters were replaced with nulls.  It was
> very difficult to detect. Though we had a cache of md5sums, the individual
> filesizes were over 6GB in many cases and I determined it would take years
> to fully scrub for coherent recovery.
>
> My solution for not really the issue you are investigating: if you want to
> find specifically nulls in your bits that have rotted away, you can use
>
> for file in `ls -1 /allmyprecious`; do echo "Grep for nulls in '$file'   "
> `find /allmyprecious/$file -type f -name "*.txt"  -exec grep -Pan '\x00'
> /dev/null {} \; >> /allmylogs/logs/grepfornulls-2000-02-04`;done
>
> That gave me a list of files to restore from a legacy mass storage
> system.  Once you have a list, it's not so tough to sort.
>
> sean
>
>
> On Tue, Jul 3, 2018 at 10:01 AM, Brian via TriLUG <trilug at trilug.org>
> wrote:
>
> > Hi Gang,
> >
> > I know there're several choices available to me when it comes to copying
> > bunches of files from one place to another (under Linux, if that didn't
> go
> > without saying.. :-) ).  I have a bit of a twist I want to put on it.
> >
> > The source drive was dropped while it was running, and although it still
> > functions, there are now collections of errors in certain areas (my guess
> > being that they correspond to where the heads were when the drive was
> > dropped).
> >
> > What I'd like to do is something that will mirror a directory structure,
> > but if a read error is encountered in any given file, that file winds up
> in
> > a different location.  In other words, I want to replicate the directory
> > structure in two places; one that gets filled up with error-free files,
> the
> > other that gets filled up with error-containing files.  That way I'll
> have
> > an easy, organized way of seeing what files and directories were actually
> > lost/corrupted.  e.g.
> >
> > /good_files/<original_directory_structure>
> > /bad_files/<original_directory_structure>
> >
> > So the question is, is there a command that'll already do this? Something
> > like rsync, with an option to give a different target path for files when
> > errors occur.  I could always script something, but I feel like
> > file-by-file invocations of cp would be much less efficient than some
> > command that would handle it internally.
> >
> > Thanks!
> > -Brian
>
>
>
>
> --
> Sean Korb spkorb at spkorb.org http://www.spkorb.org
> '65 Suprang,'68 Cougar,'78 R100/7,'60 Metro,'59 A35,'71 Pantera #1382
> "The more you drive, the less intelligent you get" --Miller
> "Computers are useless.  They can only give you answers." -P. Picasso
>

