The rig is a 7-drive mdadm RAID5 made up of 2TB drives of mismatched brands. Five of those drives are attached to SATA ports on the motherboard, while the other two are in a 5-disk Rosewill SATA enclosure. The enclosure is attached via a SiI 3132 PCIe eSATA card that supports port multiplication.
03:00.0 RAID bus controller: Silicon Image, Inc. SiI 3132 Serial ATA Raid II Controller (rev 01)
The card and the enclosure came as part of Rosewill's RSV-5 system.
I had previously been using the RSV-4 system, which worked fairly well. That system eventually gave me errors in dmesg, which I originally attributed to a bad eSATA cable. Eventually that enclosure died because of an errant power surge.
Replacing it with the RSV-5 yielded the same dmesg errors. No amount of replacement alleviated them. Eventually I figured out that the errors went away after I upgraded from Ubuntu 14.04 to 16.04. It has performed well since.
Recently I ran into similar (but not identical) errors.
I have now tried my usual trick (--assemble --force, with the drives listed in the correct order). But now the error comes back as:
sudo mdadm --verbose --assemble --force /dev/md127 /dev/sdf1 /dev/sdc1 /dev/sdb1 /dev/sdd1 /dev/sdg1 /dev/sde1 /dev/sdh1
mdadm: looking for devices for /dev/md127
mdadm: /dev/sdf1 is identified as a member of /dev/md127, slot 0.
mdadm: /dev/sdc1 is identified as a member of /dev/md127, slot 1.
mdadm: /dev/sdb1 is identified as a member of /dev/md127, slot 2.
mdadm: /dev/sdd1 is identified as a member of /dev/md127, slot 3.
mdadm: /dev/sdg1 is identified as a member of /dev/md127, slot 4.
mdadm: /dev/sde1 is identified as a member of /dev/md127, slot 5.
mdadm: /dev/sdh1 is identified as a member of /dev/md127, slot 6.
mdadm: added /dev/sdc1 to /dev/md127 as 1
mdadm: added /dev/sdb1 to /dev/md127 as 2
mdadm: added /dev/sdd1 to /dev/md127 as 3
mdadm: added /dev/sdg1 to /dev/md127 as 4 (possibly out of date)
mdadm: added /dev/sde1 to /dev/md127 as 5
mdadm: added /dev/sdh1 to /dev/md127 as 6 (possibly out of date)
mdadm: added /dev/sdf1 to /dev/md127 as 0
mdadm: /dev/md127 assembled from 5 drives - not enough to start the array.
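RAID5 can run with only one member missing, so this 7-drive array needs at least 6 up-to-date drives, and mdadm counted only 5. I thought the other two were being rejected because their Event counters were too far behind: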
Events : 120796
Events : 120796
Events : 120796
Events : 120796
Events : 120796
Events : 120788
Events : 120788
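A quick way to line those counters up against device names is to filter the `mdadm --examine` output. This is only a minimal sketch: `events_summary` is a name made up for illustration, and it assumes the output has the same shape as the dumps in this post (a `/dev/...:` header line followed by an `Events :` line).

```shell
#!/bin/sh
# events_summary (hypothetical helper): read `mdadm --examine` output on stdin
# and print one "device events" pair per member.
events_summary() {
    awk '
        # Header lines such as "/dev/sdb1:" name the member being described.
        /^\/dev\// { dev = $1; sub(/:$/, "", dev) }
        # "Events : 120796" carries the event counter for that member.
        $1 == "Events" && $2 == ":" { print dev, $3 }
    '
}
```

On this box it would be fed with something like `sudo mdadm --examine /dev/sd[b-h]1 | events_summary`, which should make it obvious which members lag behind the rest.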
For posterity, here is the --examine output of a good drive:
/dev/sdb1:
Magic : a92b4efc
Version : 0.90.00
UUID : f21f5306:c8a07e60:fad3a920:52a40d5b
Creation Time : Tue Dec 21 20:21:48 2010
Raid Level : raid5
Used Dev Size : 1953511936 (1863.01 GiB 2000.40 GB)
Array Size : 11721071616 (11178.09 GiB 12002.38 GB)
Raid Devices : 7
Total Devices : 5
Preferred Minor : 127
Update Time : Thu Aug 9 23:39:20 2018
State : clean
Active Devices : 5
Working Devices : 5
Failed Devices : 2
Spare Devices : 0
Checksum : ce567320 - correct
Events : 120796
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 2 8 17 2 active sync /dev/sdb1
0 0 8 81 0 active sync /dev/sdf1
1 1 8 33 1 active sync /dev/sdc1
2 2 8 17 2 active sync /dev/sdb1
3 3 8 49 3 active sync /dev/sdd1
4 4 0 0 4 faulty removed
5 5 8 65 5 active sync /dev/sde1
6 6 0 0 6 faulty removed
A drive that was kicked:
/dev/sdh1:
Magic : a92b4efc
Version : 0.90.00
UUID : f21f5306:c8a07e60:fad3a920:52a40d5b
Creation Time : Tue Dec 21 20:21:48 2010
Raid Level : raid5
Used Dev Size : 1953511936 (1863.01 GiB 2000.40 GB)
Array Size : 11721071616 (11178.09 GiB 12002.38 GB)
Raid Devices : 7
Total Devices : 7
Preferred Minor : 127
Update Time : Thu Aug 9 23:31:29 2018
State : clean
Active Devices : 7
Working Devices : 7
Failed Devices : 0
Spare Devices : 0
Checksum : ce567281 - correct
Events : 120788
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 6 8 113 6 active sync /dev/sdh1
0 0 8 81 0 active sync /dev/sdf1
1 1 8 33 1 active sync /dev/sdc1
2 2 8 17 2 active sync /dev/sdb1
3 3 8 49 3 active sync /dev/sdd1
4 4 8 97 4 active sync /dev/sdg1
5 5 8 65 5 active sync /dev/sde1
6 6 8 113 6 active sync /dev/sdh1
I opened this thread on the Ubuntu Forums:
https://ubuntuforums.org/showthread.php?t=2399971&p=13796868#post13796868
They suggested I try --run.
This yielded:
sudo mdadm --verbose --assemble --force --run /dev/md127 /dev/sdf1 /dev/sdc1 /dev/sdb1 /dev/sdd1 /dev/sdg1 /dev/sde1 /dev/sdh1
mdadm: looking for devices for /dev/md127
mdadm: /dev/sdf1 is identified as a member of /dev/md127, slot 0.
mdadm: /dev/sdc1 is identified as a member of /dev/md127, slot 1.
mdadm: /dev/sdb1 is identified as a member of /dev/md127, slot 2.
mdadm: /dev/sdd1 is identified as a member of /dev/md127, slot 3.
mdadm: /dev/sdg1 is identified as a member of /dev/md127, slot 4.
mdadm: /dev/sde1 is identified as a member of /dev/md127, slot 5.
mdadm: /dev/sdh1 is identified as a member of /dev/md127, slot 6.
mdadm: added /dev/sdc1 to /dev/md127 as 1
mdadm: added /dev/sdb1 to /dev/md127 as 2
mdadm: added /dev/sdd1 to /dev/md127 as 3
mdadm: added /dev/sdg1 to /dev/md127 as 4 (possibly out of date)
mdadm: added /dev/sde1 to /dev/md127 as 5
mdadm: added /dev/sdh1 to /dev/md127 as 6 (possibly out of date)
mdadm: added /dev/sdf1 to /dev/md127 as 0
mdadm: failed to RUN_ARRAY /dev/md127: Input/output error
mdadm: Not enough devices to start the array.
It was suggested that the Input/output error might be coming from one of the drives, rather than from md127 itself failing to start.
Googling has not turned up anything conclusive.
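One cheap way to test the "it is a drive" theory is a read-only probe of each member. The sketch below is an assumption on my part, not something from the forum thread: `probe_read` is a made-up helper that reads the first 16 MiB of a device with dd and reports whether the read succeeded. It writes nothing to the drive, and it only touches the start of the partition, so a clean result does not prove the whole drive is healthy.

```shell
#!/bin/sh
# probe_read PATH (hypothetical helper): read the first 16 MiB of PATH into
# /dev/null and report whether dd finished without an error. Read-only.
probe_read() {
    if dd if="$1" of=/dev/null bs=1048576 count=16 2>/dev/null; then
        echo "$1: read OK"
    else
        echo "$1: read FAILED"
    fi
}
```

From a root shell, the two kicked members could be checked with `for d in /dev/sdg1 /dev/sdh1; do probe_read "$d"; done`. A FAILED line there would point at the drive or the enclosure path rather than at md itself.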
I did find this possible solution:
https://ubuntuforums.org/showthread.php?t=2276699&page=2&highlight=mdadm+event+counter
sudo mdadm --stop /dev/md2
sudo mdadm --zero-superblock /dev/sd[abcdhijk]
sudo mdadm --create --assume-clean /dev/md2 /dev/sd[abcdhijk]
It is cautioned as the "nuclear option", though, so I'm saving it for when all other alternatives have been exhausted.
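The commands quoted above are for someone else's array (md2, whole disks), so they would have to be adapted before they could apply here. As a note to self, this sketch only prints what an equivalent --create line might look like, and never runs it. `print_create_cmd` is a made-up helper, and the metadata version, level, chunk size and layout are hard-coded from the --examine output earlier in this post. It also assumes the device names have not shifted since then, which is not guaranteed across reboots.

```shell
#!/bin/sh
# print_create_cmd MD DEV... (hypothetical helper): print, but do NOT execute,
# an mdadm --create line. Parameters are copied from this array's superblock
# (0.90 metadata, raid5, 64K chunks, left-symmetric); devices go in slot order.
print_create_cmd() {
    md=$1; shift
    echo "mdadm --create $md --assume-clean --metadata=0.90 --level=5" \
         "--raid-devices=$# --chunk=64 --layout=left-symmetric $*"
}
```

Called as `print_create_cmd /dev/md127 /dev/sdf1 /dev/sdc1 /dev/sdb1 /dev/sdd1 /dev/sdg1 /dev/sde1 /dev/sdh1`, it echoes the full command for review. Getting the order or any parameter wrong in a real --create would scramble the data, which is exactly why this stays a printout for now.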
I will keep this page updated as I try more things.