

Hard Drives – Raid

SME Server 7 introduces a new feature - Automatic configuration of Software RAID 1, 5 or 6. RAID is a way of storing data on more than one hard drive at once, so that if one drive fails, the system will still function.

Your server will be automatically configured as follows:

  • 1 Drive - Software RAID 1 (ready to accept a second drive).
  • 2 Drives - Software RAID 1
  • 3 Drives - Software RAID 1 + 1 Hot-spare
  • 4-6 Drives - Software RAID 5 + 1 Hot-spare
  • 7+ Drives - Software RAID 6 + 1 Hot-spare

Hard Drive Layout

Mirroring drives on the same IDE channel (e.g. hda and hdb) is not desirable. If that channel goes out, you may lose both drives. Also, performance will suffer slightly.

The preferred method is to use the master location on each IDE channel (e.g. hda and hdc). This ensures that if you lose one channel, the other will still operate. It will also give you the best performance.

In a 2-drive setup, put each drive on a different IDE channel:

IDE 1 Master - Drive 1
IDE 1 Slave - CDROM
IDE 2 Master - Drive 2

Identifying Hard Drives

It may not always be obvious which physical hard drive maps to which logical device. The simplest way to verify this, if the drive has S.M.A.R.T. capability, is to match the serial number printed on the physical drive with the one reported by smartctl. Assuming the device of interest is sda, issue the following command as root:

smartctl -i /dev/sda
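
If the server has several drives, a short loop can print each drive's reported serial number in one pass. This is only a sketch: it assumes SCSI/SATA-style device names (sda, sdb, ...); on older IDE systems substitute hda, hdb, and so on.

  # Print the serial number reported by each disk so it can be matched
  # to the label on the physical drive (device names are an assumption).
  for disk in /dev/sd[a-z]; do
      echo "== $disk =="
      smartctl -i "$disk" | grep -i 'serial number'
  done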


Adding another Hard Drive Later

ENSURE THAT THE NEW DRIVE IS THE SAME SIZE AS OR LARGER THAN THE CURRENT DRIVE(S)

  • Shut down the machine
  • Install drive as master on the second IDE channel (hdc)
  • Boot up
  • Log on as admin to get to the admin console
  • Go to #5 Manage disk redundancy

The admin console will show whether the drives are syncing. Do not turn off the server until the sync is complete, or it will start again from the beginning. When the sync is done it will show a good working RAID 1.
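
If you prefer to follow the sync from a root shell rather than the admin console, the kernel's own status file can be watched directly; this is just a convenience sketch using the same /proc/mdstat file described later on this page.

  # Refresh the software RAID status every few seconds; the sync is
  # finished when every md device shows [UU] and no recovery line remains.
  watch -n 5 cat /proc/mdstat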

Reusing Hard Drives

If the drive was ever used in a Windows machine (or, in some cases, in an older system) then you will need to clear the MBR before installing it.

From the Linux command prompt, type the following:

# dd if=/dev/zero of=/dev/hdx bs=512 count=1

You MUST reboot so that the empty partition table gets read correctly. For more information, check: http://bugs.contribs.org/show_bug.cgi?id=2154
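
As an illustration, the same procedure on a SATA/SCSI-named drive might look like the following; /dev/sdb here is only a placeholder, so double-check the device name before running it, as the first command destroys the partition table.

  # WARNING: /dev/sdb is an example device - verify it is the reused drive.
  dd if=/dev/zero of=/dev/sdb bs=512 count=1
  # After rebooting, confirm the drive now shows an empty partition table.
  sfdisk -l /dev/sdb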

Upgrading the Hard Drive Size

Note: these instructions are only applicable if you have a RAID system with more than one drive. They are not applicable to a single-drive RAID 1 system, and increasing the usable space on such a system by cloning the existing single drive to a larger drive is not supported. See http://bugs.contribs.org/show_bug.cgi?id=5311

  • CAUTION MAKE A FULL BACKUP!
  • Ensure you have e-smith-base-4.16.0-33 or newer installed. [or Update to at least 7.1.3]
  1. Shut down and install the larger drive in the system.
  2. Boot up and use Manage disk redundancy in the admin console to add the new (larger) drive to the array.
  3. Wait for the raid to fully sync.
  4. Repeat steps 1-3 until all drives in the system are upgraded to the larger capacity.
  5. Ensure all drives have been replaced with larger drives and the array is in sync and redundant!
  6. Issue the following commands:
mdadm --grow /dev/md2 --size=max
pvresize /dev/md2
lvresize -l +100%FREE main/root
ext2online -C0 /dev/main/root   

In the last command above, -C0 is dash, capital C, zero.

Notes:

  • All of this can be done while the server is up and running, with the exception of step 1.
  • These instructions should work for any RAID level, as long as you have two or more drives.
  • If you have disabled lvm
  1. you don't need the pvresize or lvresize commands
  2. the final line becomes ext2online -C0 /dev/md2 (or whatever / is mounted to)
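
Assuming the standard LVM layout used above (the main/root logical volume), a rough way to confirm that each layer picked up the new space is sketched below; if you have disabled lvm, skip the pvdisplay line.

  # Check the array, the LVM physical volume, and the root filesystem size.
  cat /proc/mdstat
  pvdisplay /dev/md2
  df -h /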

Raid Notes

Many on-board hardware RAID cards are in fact software RAID. Turn it off, as cheap "fakeraid" cards aren't good. You will get better performance and reliability with Linux software RAID (http://linux-ata.org/faq-sata-raid.html). Linux software RAID is fast and robust.

If you insist on hardware RAID, buy a well-supported RAID card that has a proper RAID BIOS. This hides the disks and presents a single disk to Linux (http://linuxmafia.com/faq/Hardware/sata.html). Check that it is supported by the kernel and has some form of management, and avoid anything that requires a driver. Try googling for the exact model of RAID controller before buying it. Please note that you won't get a real hardware RAID controller cheap.

It rarely happens, but sometimes when a device has finished rebuilding, its state doesn't change from "dirty" to "clean" until a reboot occurs. This is cosmetic.

nospare

If you use the command-line parameter nospare during installation ("sme nospare"), the system will still count the missing spare towards the number of drives. A system with 6 physically present hard drives will thus be formatted as RAID 6, _not_ RAID 5. The resulting capacity will of course be n-2.

Resynchronising a Failed RAID

You can refer to 'man mdadm' or http://www.linuxmanpages.com/man8/mdadm.8.php

Sometimes a partition will be taken offline automatically. The admin will receive an email reporting a DegradedArray event on /dev/md2.

This will happen if, for example, a read or write error is detected in a disk in the RAID set, or a disk does not respond fast enough, causing a timeout. As a precaution, verify the health of your disks as documented in: http://wiki.contribs.org/Monitor_Disk_Health and specifically with the command:

smartctl -a /dev/hda

Where hda is the device to be checked; check all of them.
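
A quick health pass over every member drive can be scripted as below; this is only a sketch, and the device names follow the hda/hdb example above, so substitute your own.

  # "-H" prints each drive's overall SMART self-assessment (PASSED/FAILED).
  for disk in /dev/hda /dev/hdb; do
      echo "== $disk =="
      smartctl -H "$disk"
  done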

When this happens, the details of the raid can be seen by inspecting the mdstat file.

[root@sme]# cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 hda3[0] hdb3[1]
     38837056 blocks [2/2] [UU]

md2 : active raid1 hdb2[1]        <--    Shows current active partition - note there is one missing
     1048704 blocks [2/1] [_U]    <--    '_' = partition missing from array

md0 : active raid1 hda1[0] hdb1[1]
     255936 blocks [2/2] [UU]

Make a note of the RAID partition that has failed, shown by [_U].
In this case it is md2, the device being /dev/md2.

Determine the missing physical partition. Look carefully and fill in the gap:
in this example it is hda2, the device being /dev/hda2.

md1 : active raid1 hda3[0] hdb3[1]
md2 : active raid1 hda2[0] hdb2[1]
md0 : active raid1 hda1[0] hdb1[1]

If the raid has a failed disk that has not yet been kicked out of the array then mdstat will show something like the following:

md2 : active raid1 hda2[0](F) hdb2[1]   <--    Shows current active partition - with one FAILED (F)
     1048704 blocks [2/1] [_U]          <--    '_' = partition missing from array

In this case, before you add the disk back in, you will first need to remove the failed partition:

[root@sme]# mdadm --remove /dev/md2 /dev/hda2

Then add the physical partition back into that RAID device:

[root@sme]# mdadm --add /dev/md2 /dev/hda2
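
Optionally, mdadm can confirm that the partition was accepted and the rebuild has begun; the exact wording of the State line varies between mdadm versions.

  # The State line should indicate a degraded array that is recovering.
  mdadm --detail /dev/md2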

Your devices are likely to be different, and you may have more than two disks (including a hot spare), but the correct devices can always be determined from the mdstat file. Once the RAID resync has started, the progress will be noted in mdstat. You can watch it in real time with:

[root@sme]# watch -n .1 cat /proc/mdstat

or you can see this in a snapshot by:

[root@sme]# cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 hda3[0] hdb3[1]
      38837056 blocks [2/2] [UU]

md2 : active raid1 hda2[2] hdb2[1]
      1048704 blocks [2/1] [_U]
      [=>...................]  recovery =  6.4% (67712/1048704) finish=1.2min speed=13542K/sec
md0 : active raid1 hda1[0] hdb1[1]
      255936 blocks [2/2] [UU]

When recovery is complete, the partitions will all be up:

[root@sme]# cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 hda3[0] hdb3[1]
      38837056 blocks [2/2] [UU]

md2 : active raid1 hda2[0] hdb2[1]
     1048704 blocks [2/2] [UU]

md0 : active raid1 hda1[0] hdb1[1]
      255936 blocks [2/2] [UU]

If this action is required regularly, you should test your disks for SMART errors and physical errors, check your disk cables, and make sure no two hard drives share the same IDE port. See also: http://wiki.contribs.org/Monitor_Disk_Health

Also check your disk controller cards, since a faulty controller can destroy the data on a full RAID set as easily as it can a single disk.

Convert Software RAID1 to RAID5

Note: these instructions are only applicable if you have SME8 and a RAID 1 system with two hard drives in sync; the new drive(s) must be the same size as or larger than the current drive(s).

Warning: please make a full backup before proceeding.


  1. Login as root.
  2. Move to /boot (we must create a new initrd image so that the raid5 driver is loaded).
       cd /boot
  3. Make a backup copy of the current image.
       mv initrd-`uname -r`.img initrd-`uname -r`.img.old
  4. Create the new image.
       mkinitrd --preload raid5 initrd-`uname -r`.img `uname -r`
  5. Shut down and install the new drive(s) in the system.
  6. Boot up with the SME CD and enter rescue mode.
       sme rescue
  7. Skip network setup.
  8. Skip mounting the current SME installation.
  9. Now create the correct partition table on the new drive(s) by copying it from an existing drive.
       sfdisk -d /dev/sda > tmp.out
       sfdisk /dev/sdc < tmp.out
  10. Repeat the last step for each new hard drive (sdd, sde, etc.).
  11. Create the new array.
       mdadm --create /dev/md2 -c 256 --level=5 --raid-devices=2 /dev/sda2 /dev/sdb2
       mdadm: /dev/sda2 appears to be part of a raid array:
           level=raid1 devices=2 ctime=Fri Dec 18 13:17:49 2009
       mdadm: /dev/sdb2 appears to be part of a raid array:
           level=raid1 devices=2 ctime=Fri Dec 18 13:17:49 2009
       Continue creating array? y
       mdadm: array /dev/md2 started.
  12. Wait for the resync; monitor the status with
       cat /proc/mdstat
      For example:
       root# cat /proc/mdstat
       Personalities : [raid0] [raid1] [raid5]
       md2 : active raid5 sdb1[2] sda1[0]
             1048512 blocks level 5, 256k chunk, algorithm 2 [2/1] [U_]
             [==>..................]  recovery = 12.5% (132096/1048512) finish=0.8min speed=18870K/sec
  13. Reboot.
       exit
  14. Login as root.
  15. Add the new drive(s) to the array.
       mdadm --add /dev/md2 /dev/sdc2
  16. Repeat the last step for each new hard drive (sdd2, sde2, etc.).
  17. Grow the array.
       mdadm --grow /dev/md2 --raid-devices=N
      N is the total number of drives; the minimum is 3.
  18. Wait for the array reshaping. This part can take a substantial amount of time; monitor it with
       cat /proc/mdstat
      For example:
       root# cat /proc/mdstat
       Personalities : [raid0] [raid1] [raid5]
       md2 : active raid5 sdc1[2] sdb1[1] sda1[0]
             1048512 blocks super 0.91 level 5, 256k chunk, algorithm 2 [3/3] [UUU]
             [==>..................]  reshape = 12.5% (131520/1048512) finish=2.5min speed=5978K/sec
  19. Issue the following commands (a rough verification sketch follows this list):
       pvresize /dev/md2
       lvresize -l +100%FREE main/root
       resize2fs /dev/main/root
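
As a rough post-conversion check (assuming the default LVM layout with the main/root volume used in the steps above), the following commands confirm that the array is now RAID 5 with all members active and that the root filesystem has picked up the extra space.

  # md2 should report raid5 with all members up ([UUU] for three drives),
  # and / should show the grown filesystem.
  cat /proc/mdstat
  mdadm --detail /dev/md2
  df -h /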

Notes:

  • If you have disabled lvm
  1. you don't need the pvresize or lvresize commands
  2. the final line becomes resize2fs /dev/md2 (or whatever / is mounted to)
  • More info: http://www.arkf.net/blog/?p=47