[TriLUG] Help with a "broken" LVM drive

Brian McCullough via TriLUG trilug at trilug.org
Wed Oct 20 15:40:06 EDT 2021


Folks,

The more that I work on this, the more that I feel that paranoia is the
better part of valour.


I recently ( a week ago or so ) had a drive go bad on me, bad blocks,
unreadable sectors, the works.

Because of that, the machine will not boot, and stops during that
process and asks for the root password.

I removed the drive, bought a "work" drive and started with ddrescue.

According to fdisk, there are three partitions on this drive: a Windows
partition of more than half the size ( which is a bit strange, since it
would have had no use in one of my machines ), a 300GB LVM partition,
and a Linux partition of about 30G.  OK, it has been so long since I
first set up this drive that I have to accept this, I guess.  It is a
3TB portable Seagate, so I can accept that it was initially
pre-formatted as Windows.


Using ddrescue, I was apparently able to recover both the LVM and Linux
partitions ( separately ).

When I ran ddrescue on the LVM partition, it ran for some hours and then
completed, recording zero errors.


I found some data at the beginning of the LVM partition that indicated
that it had been an LVM partition.  I found the string " LVM2 " at just
after 0x01000, and something that looks a lot like a ( partial )
vgconfig file at 0x01200, even if that seems to start part-way into that
file.

I don't find any evidence of LVM on either of the other two partitions.

I did not find the "LABELONE" tag which should be at the beginning of a
PV, just zeros.  ( possibly part of the ddrescue process??? )
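
A check like the one described above could be repeated on the rescued
image with a short script.  This is only a sketch: it assumes the
"LABELONE" tag sits at the start of one of the first four 512-byte
sectors, and it simply lists every place the " LVM2 " string occurs in
the first 0x2000 bytes.  The function name and the image path are made
up for illustration.

```python
SECTOR = 512
LABEL_ID = b"LABELONE"   # PV label tag, assumed to start one of the first four sectors
LVM2_TAG = b" LVM2 "     # the string reported just after 0x01000 in the post


def find_signatures(image_path, scan_bytes=0x2000):
    """Return offsets of the two markers within the first scan_bytes of the image."""
    with open(image_path, "rb") as f:
        head = f.read(scan_bytes)

    hits = {"label": [], "lvm2": []}

    # The label is checked only at sector boundaries (0x000, 0x200, 0x400, 0x600).
    for sector in range(4):
        off = sector * SECTOR
        if head[off:off + len(LABEL_ID)] == LABEL_ID:
            hits["label"].append(off)

    # The " LVM2 " string may occur more than once, so collect every offset.
    pos = head.find(LVM2_TAG)
    while pos != -1:
        hits["lvm2"].append(pos)
        pos = head.find(LVM2_TAG, pos + 1)

    return hits
```

Calling find_signatures("lvm_partition.img") ( a hypothetical file name )
and printing the offsets in hex would show whether the label is really
missing from the copy or only from the place that was inspected.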

OK, that's the preamble.  ( Sorry )

Since I am able to start to boot the machine, I was able to get to
/etc/lvm/backup and retrieve the appropriate vgconfig file that this
partition should be part of.  In the data on the partition itself, I
found what should probably be the UUID for this PV.  What bothers me is
that I do not find that UUID in the current vgconfig backup file. (
Well, one of the things that bothers me. )  That UUID does show up in
archive versions of that file.  However, if that UUID is no longer used,
why does the machine not boot?  Oh, well.
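
To see exactly which backup and archive versions still mention that
UUID, a small search along these lines might help.  It is a sketch
only: it assumes the files are plain text and readable ( on most
systems that means running as root ), and the function name is
invented for illustration.

```python
from pathlib import Path


def files_mentioning_uuid(uuid, directories=("/etc/lvm/backup", "/etc/lvm/archive")):
    """Return the paths of metadata files whose text contains the given UUID."""
    matches = []
    for directory in directories:
        root = Path(directory)
        if not root.is_dir():
            continue  # skip directories that do not exist on this machine
        for path in sorted(root.iterdir()):
            # errors="replace" keeps the search going if a file has odd bytes in it
            if path.is_file() and uuid in path.read_text(errors="replace"):
                matches.append(path)
    return matches
```

Sorting the matching archive files by name or modification time would
then show the last version in which the UUID was still in use.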

I also purchased a drive to replace that partition, and used ddrescue to
copy the contents to it.   However, at the moment, the partition is not
recognized as a PV.

If I use the UUID that I found on that partition and run "pvcreate
--uuid" should I use "--restorefile" or "--norestorefile"?  I tried this
with --restorefile, using the data that I dd'd from the partition, which
turned out to be an incomplete vgconfig file, and it overwrote what was
on the partition at 0x01200, so I stopped at that point.  However, now
that I have been examining things, including the vgconfig backup file
from the machine, I think that it would probably have been successful if
I had been using the proper vgconfig file.



Now that I have talked through all of that, I am wondering whether I am
possibly heading down the correct path, and have some hope of restoring
this machine to operation?

Do you have any further suggestions for tests or things that I should be
doing, or should I just go ahead with pvcreate?  If I run
pvcreate on a different machine ( where I am doing all of my recovery
work ), I suspect that I should not run vgcfgrestore until I re-attach
the new drive to the original machine, correct?

( The instructions that I have been reading say to run pvcreate followed
by vgcfgrestore. )
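
For reference, the sequence those instructions describe can be written
out as a dry run.  The sketch below only builds and prints the two
command lines and executes nothing.  The UUID, file path, device and
VG name are placeholders and not values from this machine, and the
optional --test flag is assumed to be the usual LVM test mode that
does not update metadata.

```python
import shlex


def recovery_commands(pv_uuid, restore_file, device, vg_name, test_mode=True):
    """Build (but do not run) the pvcreate and vgcfgrestore command lines."""
    test_flag = ["--test"] if test_mode else []
    pvcreate = ["pvcreate", *test_flag, "--uuid", pv_uuid,
                "--restorefile", restore_file, device]
    vgcfgrestore = ["vgcfgrestore", *test_flag, "--file", restore_file, vg_name]
    # Order matters: the PV is recreated first, then the VG metadata is restored.
    return [pvcreate, vgcfgrestore]


# All four values below are placeholders for illustration only.
for argv in recovery_commands("PV-UUID-FROM-ARCHIVE-FILE",
                              "/etc/lvm/archive/EXAMPLE.vg",
                              "/dev/sdX2", "examplevg"):
    print(shlex.join(argv))
```

Printing the commands first makes it possible to check the UUID and
the restore file by eye before anything is written to the new drive.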




If there is anything more that I can say, any questions that I can
answer, or any more experiments that I can perform, just say so.


Thank you,
Brian


