@matigo … Hopefully, the fsck -c run will mark off the bad blocks on the failing logical volume and mdadm will know to ignore them when restoring to the new drive. That way, I can just remove the bad drive, add another at my leisure and be done with it.
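For the record, the fsck pass would look roughly like this. It's a dry-run sketch that only prints the commands, and the LV path /dev/vg0/data is a stand-in, not my real layout:

```shell
#!/bin/sh
# Dry-run sketch: run() collects commands instead of executing them.
# /dev/vg0/data and /mnt/data are assumed names for illustration.
run() { cmds="$cmds$*\n"; }

run umount /mnt/data              # the LV has to be unmounted (or mounted read-only) first
run e2fsck -f -c /dev/vg0/data    # -c runs a read-only badblocks scan and records hits in the bad block inode
run e2fsck -fy /dev/vg0/data      # follow-up pass to clean up anything the scan turned up

printf "%b" "$cmds"
```

Note `-c` is the read-only scan; `-cc` would do the (much slower) non-destructive read-write test.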
@matigo It's the straightforward way but so time consuming. Make appropriate partitions on the new drive, mount each logical volume on the failing RAID volume and cpio them over, install the bootloader, generate an initial RAM disk, hopefully boot on the new drive, insert another, duplicate the partition layout, convert it to a new RAID volume, cpio everything over to THAT drive, do the bootloader and ramdisk dance again, boot from THAT drive, duplicate the layout on the prior drive, add it to the RAID array and let Linux stripe it.
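Spelled out, the first leg of that dance goes something like the sketch below. Again a dry-run that just prints the command sequence; device names, the mount point and the initramfs tool are assumptions (update-initramfs is Debian-flavored, other distros use dracut or mkinitcpio):

```shell
#!/bin/sh
# Dry-run sketch of the brute-force migration: new drive -> degraded RAID1 -> cpio copy -> bootable.
run() { cmds="$cmds$*\n"; }

# 1. Partition the new drive and build a one-legged RAID1 on it
run sgdisk -n 1:0:+512M -n 2:0:0 /dev/sdc
run mdadm --create /dev/md10 --level=1 --raid-devices=2 /dev/sdc2 missing
run mkfs.ext4 /dev/md10

# 2. Copy the filesystem over in cpio pass-through mode (-p), repeating per LV
run mount /dev/md10 /mnt/new
run 'cd / && find . -xdev -print0 | cpio -0pdmuv /mnt/new'

# 3. Bootloader and initial RAM disk on the new drive
run grub-install --boot-directory=/mnt/new/boot /dev/sdc
run chroot /mnt/new update-initramfs -u

printf "%b" "$cmds"
```

The `missing` keyword is what lets mdadm build a degraded mirror now and accept the second disk later.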
2:45am. At office. Round two. Round three will take place if a knockout is not scored this round. Round three is the brute force round where I will certainly win via knockout but I'll have to start at midnight as it will take seven hours for the brute force approach.
We'll have to be down for upwards of three hours in order to have fsck mark the bad blocks on the logical volume causing the headaches. THEN maybe I can get a reimaging to the second drive to take. THEN pull the original out and reimage another new one.
Bad blocks appear to be at the very end of the filesystem where no data is written. This is good.
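That can be eyeballed by mapping a failing LBA from the kernel log into a filesystem block number and comparing it against the block count `dumpe2fs -h` reports. The numbers below are made up for illustration, not from my drive:

```shell
#!/bin/sh
# Rough check that a failing sector lands past where the filesystem has data.
# Real inputs: the LBA from the kernel I/O error, the partition start from
# fdisk -l, and the block count / free blocks from dumpe2fs -h.
bad_sector=1951000000     # hypothetical failing LBA
part_start=2048           # hypothetical partition start sector
fs_block=$(( (bad_sector - part_start) / 8 ))   # 512B sectors -> 4KiB fs blocks
echo "failing fs block: $fs_block"
```

If that block number sits above the last allocated block, nothing written is actually on the bad spots.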
@matigo sdb failed utterly. Dead. Kaput. Replaced with a new drive. Now, it turns out, sda has errors. SMART reported nothing. Never once got a degraded RAID event until sdb failed. I should have gotten one the moment sda started developing errors. Unless, of course, it picked JUST NOW to develop them.
I'm forcing a sync as best I can onto the new drive, booting from it and replacing sda too.
I may wind up having to put a bare drive in, cpio the filesystem over, go through the gyrations of making the new drive bootable and a raid device and going from there.
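The forced-sync route, before falling back to the bare-drive gyrations, looks roughly like this. Dry-run sketch again; /dev/md1 matches the kernel log below, but /dev/sda as the survivor and /dev/sdb1 as the member partition are assumptions:

```shell
#!/bin/sh
# Dry-run sketch: add the replacement disk to the existing mirror and force a sync.
run() { cmds="$cmds$*\n"; }

run sgdisk --replicate=/dev/sdb /dev/sda            # copy the surviving disk's partition table onto the new one
run sgdisk -G /dev/sdb                              # randomize GUIDs after the copy
run mdadm /dev/md1 --add /dev/sdb1                  # kicks off the rebuild onto the new disk
run 'echo repair > /sys/block/md1/md/sync_action'   # force a full sync/repair pass
run 'cat /proc/mdstat'                              # watch the recovery percentage

printf "%b" "$cmds"
```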
Aug 20 06:03:27 server kernel: md/raid1:md1: sda: unrecoverable I/O read error for block 4799744
Well, this isn't going very well at all.
[>....................]  recovery =  1.3% (13498240/975735676) finish=164.4min speed=97542K/sec
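That finish estimate checks out by hand, since mdstat counts in 1K blocks:

```shell
#!/bin/sh
# Sanity-check mdstat's finish estimate: remaining 1K blocks / speed (K/sec) -> minutes.
done_kb=13498240
total_kb=975735676
speed=97542                                    # K/sec from the status line
mins=$(( (total_kb - done_kb) / speed / 60 ))
echo "estimated finish: ${mins}min"            # lines up with the ~164.4min mdadm reports
```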
And while the RAID volume recovers, I will busy myself with other matters, I guess. Thank God that system is only 1TB RAID 1.
Got my obscure dynamic dhcp/bind issue at the house resolved. My options are to either play Fallout 4 or drive to the office NOW and get the server situation rectified and attack a few remaining annoyances while no-one is there to bother me.
Opting for the latter.