Talk:Raid:Manual Rebuild

From SME Server

Please see my remarks at User_talk:Davidbray — Cactus (talk | contribs) 16:48, 19 March 2010 (UTC)

Thanks Cactus - I've made some changes here, so I look forward to your feedback

HowTo: Write the GRUB boot sector

Trex (talk) 00:26, 5 February 2013 (MST) Should add a note, as per comment 24 in this Bug, that grub will not install on an unpartitioned disk

Stephdl (talk) 12:24, 6 February 2013 (MST) OK, I'm working on the HowTo... work in progress, don't disturb :p



HowTo: Remove a disk from the RAID1 array from the command line

Look at the mdstat

First we must determine which drive has failed.


[root@sme8-64-dev ~]# cat /proc/mdstat
Personalities : [raid1] 
md1 : active raid1 sdb1[1] sda1[0]
      104320 blocks [2/2] [UU]
      
md2 : active raid1 sdb2[2](F) sda2[0]
      52323584 blocks [2/1] [U_]
      
unused devices: <none>

(S) = spare, (F) = failed, [0] = number of the disk in the array


Note:
As we can see, the partition sdb2 has failed, as shown by the flag sdb2[2](F). We need to resynchronize the disk sdb with the existing array md2.
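
As an optional cross-check (not part of the original procedure), mdadm can report the state of each array member directly; the array name /dev/md2 matches the example above:

mdadm --detail /dev/md2   # shows the array state (e.g. degraded) and which member is marked faulty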


Fail and remove the disk, sdb in this case

[root@ ~]# mdadm --manage /dev/md2 --fail /dev/sdb2
mdadm: set /dev/sdb2 faulty in /dev/md2
[root@ ~]# mdadm --manage /dev/md2 --remove /dev/sdb2
mdadm: hot removed /dev/sdb2
[root@ ~]# mdadm --manage /dev/md1 --fail /dev/sdb1
mdadm: set /dev/sdb1 faulty in /dev/md1
[root@ ~]# mdadm --manage /dev/md1 --remove /dev/sdb1
mdadm: hot removed /dev/sdb1
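
As an optional aside (not in the original HowTo), mdadm also accepts both operations in a single invocation, for example for md2:

mdadm /dev/md2 --fail /dev/sdb2 --remove /dev/sdb2   # mark the member faulty and hot-remove it in one command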

Do your Disk Maintenance here

At this point the disk has been removed from both arrays and is idle.

[root@sme8-64-dev ~]# cat /proc/mdstat
Personalities : [raid1] 
md1 : active raid1 sda1[0]
      104320 blocks [2/1] [U_]
      
md2 : active raid1 sda2[0]
      52323584 blocks [2/1] [U_]
      
unused devices: <none>


Note:
You'll have to determine whether your disk can be reinstated into the array. A RAID array can get out of sync after a power failure, but it can also drop a member because the physical disk itself is failing. If this happens repeatedly, test the hard drive. For this we will use smartctl.


To see all the details available via SMART for the disk:

smartctl -a /dev/sdb
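
For a quick pass/fail summary first (an optional extra, assuming the same device /dev/sdb), you can ask for the drive's overall health self-assessment:

smartctl -H /dev/sdb   # prints the overall SMART health result (PASSED or FAILED)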

At least two types of tests are possible: short (~1 min) and long (~10 min to 90 min).

smartctl -t short /dev/sdb #short test
smartctl -t long  /dev/sdb #long test
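
The actual durations vary by drive; as an optional check, the drive reports its own estimates:

smartctl -c /dev/sdb   # capabilities, including recommended polling times for the short and extended self-tests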

To view the results / statistics of these tests:

smartctl -l selftest /dev/sdb
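
Beyond the self-test log (another optional check), the vendor attribute table often shows whether the drive is reallocating sectors:

smartctl -A /dev/sdb   # SMART attributes; on most drives watch Reallocated_Sector_Ct and Current_Pending_Sector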


Note:
If you need to replace the disk because of a physical failure found with the smartctl command, install a new disk of the same capacity (or larger) and enter the following commands to recreate the partitions by copying the partition table from the healthy disk sda.


sfdisk -d /dev/sda > sfdisk_sda.output
sfdisk /dev/sdb < sfdisk_sda.output
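
To confirm the copy worked (an optional check, using the same device names as above), list the new partition table on sdb; it should mirror sda's layout:

sfdisk -l /dev/sdb   # list the partitions on sdb for comparison with sda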

If you want to reintegrate the same disk without replacing it, go to the next step.

Add the partitions back

[root@ ~]# mdadm --manage /dev/md1 --add /dev/sdb1
mdadm: hot added /dev/sdb1
[root@ ~]# mdadm --manage /dev/md2 --add /dev/sdb2
mdadm: hot added /dev/sdb2
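
The resynchronisation starts automatically; a convenient way (optional, not part of the original text) to follow its progress is:

watch -n 5 cat /proc/mdstat   # refresh the RAID status every 5 seconds; Ctrl+C to stop watching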

Another Look at the mdstat

[root@sme8-64-dev ~]# cat /proc/mdstat
Personalities : [raid1] 
md1 : active raid1 sdb1[1] sda1[0]
      104320 blocks [2/2] [UU]
      
md2 : active raid1 sdb2[2] sda2[0]
      52323584 blocks [2/1] [U_]
      [>....................]  recovery =  1.9% (1041600/52323584) finish=14.7min speed=57866K/sec

unused devices: <none>
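
Once the recovery shown above reaches 100%, an optional final check (same array name as in the example) is:

mdadm --detail /dev/md2 | grep -i state   # the array state should no longer report "degraded"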