Talk:Raid:Manual Rebuild

From SME Server
Revision as of 22:29, 6 February 2013 by Stephdl (talk | contribs)

Please see my remarks at User_talk:Davidbray -- Cactus (talk | contribs) 16:48, 19 March 2010 (UTC)

Thanks Cactus - I've made some changes here so look forward to your feedback

HowTo: Write the GRUB boot sector

Trex (talk) 00:26, 5 February 2013 (MST) We should add a note, as per comment 24 in this Bug, that grub will not install on an unpartitioned disk

Stephdl (talk) 12:24, 6 February 2013 (MST) OK, I am working on the HowTo... work in progress, don't disturb :p



HowTo: Remove a disk from the RAID1 array from the command line

Look at the mdstat

First we must determine which drive has failed.


[root@sme8-64-dev ~]# cat /proc/mdstat
Personalities : [raid1] 
md1 : active raid1 sdb1[1] sda1[0]
      104320 blocks [2/2] [UU]
      
md2 : active raid1 sdb2[2](F) sda2[0]
      52323584 blocks [2/1] [U_]
      
unused devices: <none>

(S) = spare, (F) = failed, [0] = number of the disk in the array


Note:
The flag in sdb2[2](F) shows that the partition sdb2 has failed. We need to resynchronize the disk sdb with the existing array md2.
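Reading the flags by eye works for two arrays, but the same check can be scripted. The following is only a minimal sketch: the function name failed_members is hypothetical, and it assumes the mdstat layout shown above, where each array line starts with "mdN :" and failed members end in (F).

```shell
# List "array member" pairs for every member flagged (F).
# $1 = optional mdstat-style file (defaults to /proc/mdstat)
failed_members() {
    awk '/^md[0-9]+ :/ {
        # Array lines look like: md2 : active raid1 sdb2[2](F) sda2[0]
        for (i = 3; i <= NF; i++)
            if ($i ~ /\(F\)$/) {
                dev = $i
                sub(/\[.*/, "", dev)   # strip "[2](F)", leaving "sdb2"
                print $1, dev
            }
    }' "${1:-/proc/mdstat}"
}
```

Called without an argument it reads /proc/mdstat; passing a saved copy of the file makes it easy to try out without touching a live system.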


Fail and remove the disk, sdb in this case

[root@ ~]# mdadm --manage /dev/md2 --fail /dev/sdb2
mdadm: set /dev/sdb2 faulty in /dev/md2
[root@ ~]# mdadm --manage /dev/md2 --remove /dev/sdb2
mdadm: hot removed /dev/sdb2
[root@ ~]# mdadm --manage /dev/md1 --fail /dev/sdb1
mdadm: set /dev/sdb1 faulty in /dev/md1
[root@ ~]# mdadm --manage /dev/md1 --remove /dev/sdb1
mdadm: hot removed /dev/sdb1
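The four commands above follow one pattern: fail, then remove, for each array and partition pair. A possible wrapper is sketched below. The function name detach_members and the DRY_RUN variable are hypothetical, and the only mdadm options used are the ones shown above. By default it prints the commands instead of running them.

```shell
# Fail and hot-remove each "array:partition" pair given as an argument.
# DRY_RUN=1 (the default in this sketch) only prints the commands.
detach_members() {
    for pair in "$@"; do
        md="/dev/${pair%%:*}"      # text before the colon, e.g. md2
        part="/dev/${pair##*:}"    # text after the colon, e.g. sdb2
        for action in --fail --remove; do
            if [ "${DRY_RUN:-1}" = "1" ]; then
                echo "mdadm --manage $md $action $part"
            else
                mdadm --manage "$md" "$action" "$part" || return 1
            fi
        done
    done
}
```

For the example in this HowTo, detach_members md2:sdb2 md1:sdb1 prints the four commands. Once the output looks right, DRY_RUN=0 detach_members md2:sdb2 md1:sdb1 runs them as root.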

Do your Disk Maintenance here

At this point the disk is idle.

[root@sme8-64-dev ~]# cat /proc/mdstat
Personalities : [raid1] 
md1 : active raid1 sda1[0]
      104320 blocks [2/1] [U_]
      
md2 : active raid1 sda2[0]
      52323584 blocks [2/1] [U_]
      
unused devices: <none>
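Before starting maintenance it may be worth confirming that no array still references the disk. This is a small sketch under the same assumptions about the mdstat layout; the function name disk_is_idle is hypothetical.

```shell
# Succeed (exit status 0) when no md array lists a partition of the disk.
# $1 = disk name such as sdb, $2 = optional mdstat-style file
disk_is_idle() {
    ! grep -Eq "^md[0-9]+ :.* ${1}[0-9]+\[" "${2:-/proc/mdstat}"
}
```

With the output above, disk_is_idle sdb succeeds and disk_is_idle sda fails, because sda1 and sda2 are still active members.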


Note:
You have to determine whether the disk can be put back into the array. A RAID can get out of sync after a power failure, but also because of intermittent faults in the physical disk itself. If this happens repeatedly, the hard drive must be tested. For this we will use smartctl.


smartctl -a /dev/sdb  # all the details SMART reports for the disk

At least two types of test are possible: short (about 1 minute) and long (about 10 to 90 minutes, or more on large disks).

smartctl -t short /dev/sdb #short test
smartctl -t long  /dev/sdb #long test

To access the results and statistics of these tests:

smartctl -l selftest /dev/sdb
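The self-test log can be long, so a filter that keeps only the entries that did not finish cleanly may help. The sketch below assumes the usual ATA log layout, in which each entry starts with "#" and a number and a clean run is reported as "Completed without error". Other disk types may format the log differently, and the function name selftest_problems is hypothetical.

```shell
# Read a self-test log on stdin and print the entries that are not
# reported as "Completed without error". Exit status is 1 when every
# entry is clean (nothing printed).
selftest_problems() {
    grep -E '^# *[0-9]+ ' | grep -v 'Completed without error'
}
```

Usage: smartctl -l selftest /dev/sdb | selftest_problems. Entries for tests that are still running or were aborted also show up, so the output needs to be read rather than treated as a pass or fail verdict.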


Add the partitions back

[root@ ~]# mdadm --manage /dev/md1 --add /dev/sdb1
mdadm: hot added /dev/sdb1
[root@ ~]# mdadm --manage /dev/md2 --add /dev/sdb2
mdadm: hot added /dev/sdb2

Another look at the mdstat

[root@ ~]# cat /proc/mdstat
Personalities : [raid1]
md2 : active raid1 sdb2[2] sda2[0]
      488279488 blocks [2/1] [U_]
      [=>...................]  recovery =  6.3% (31179264/488279488) finish=91.3min speed=83358K/sec
md1 : active raid1 sdb1[1] sda1[0]
      104320 blocks [2/2] [UU]

unused devices: <none>
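To follow the rebuild without reading the whole file each time, the progress line can be reduced to the array name, the percentage and the estimated time remaining. This is a sketch that assumes the progress line format shown above; the function name rebuild_progress is hypothetical.

```shell
# Print "array percent eta" for every array that is rebuilding.
# $1 = optional mdstat-style file (defaults to /proc/mdstat)
rebuild_progress() {
    awk '/^md[0-9]+ :/ { md = $1 }          # remember the current array
         /(recovery|resync) *=/ {
             for (i = 1; i <= NF; i++) {
                 if ($i ~ /%$/) pct = $i
                 if ($i ~ /^finish=/) eta = substr($i, 8)   # drop "finish="
             }
             print md, pct, eta
         }' "${1:-/proc/mdstat}"
}
```

It prints nothing once all arrays are back in sync. For a continuously refreshed view of the raw file, watch -n 10 cat /proc/mdstat is a common alternative.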