Raid
Hard Drives – Raid
From SME Server 8 onwards, software RAID 1, 5 or 6 is configured automatically at install time. RAID is a way of storing data on more than one hard drive at once, so that if one drive fails the system will still function.
Your server will be automatically configured as follows:
- 1 Drive - Software RAID 1 (degraded RAID1 mirror ready to accept a second drive).
- 2 Drives - Software RAID 1
- 3 Drives - Software RAID 1 + 1 Hot-spare
- 4-6 Drives - Software RAID 5 + 1 Hot-spare
- 7+ Drives - Software RAID 6 + 1 Hot-spare
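To confirm which layout the installer chose on a particular system, you can inspect the arrays from a root shell (a quick sketch; the md device names such as /dev/md2 are the SME defaults used throughout this page):

cat /proc/mdstat          # shows the RAID level and member disks of each md device
mdadm --detail /dev/md2   # detailed view of the main array, including any hot-spare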
Hard Drive Layout
Mirroring drives on the same IDE channel (e.g. hda and hdb) is not desirable. If that channel fails, you may lose both drives, and performance will also suffer slightly.
The preferred method is to use the master location on each IDE channel (e.g. hda and hdc). This ensures that if you lose one channel, the other will still operate, and it also gives you the best performance.
In a 2-drive setup, put each drive on a different IDE channel:
IDE 1 Master - Drive 1
IDE 1 Slave - CDROM
IDE 2 Master - Drive 2
Obviously this section is completely obsolete with SATA hard drives because each disk has its own channel.
Identifying Hard Drives
It may not always be obvious which physical hard drive maps to which logical device. The simplest way to verify this, if you have a drive with S.M.A.R.T. capability, is to match the serial number printed on the drive label with the one reported by smartctl. Assuming the device of interest is sda (a SCSI drive), issue the following command as root:
smartctl -i /dev/sda
Or, for an IDE drive:
smartctl -i /dev/hda
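If the machine has several drives, a small loop saves some typing. This is only a convenience sketch and assumes SCSI/SATA-style device names matching /dev/sd?:

# Print the device name and serial number of every sd? drive
for d in /dev/sd?; do
    echo "== $d =="
    smartctl -i "$d" | grep -i serial
done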
Adding another Hard Drive Later (Raid1 array only)
ENSURE THAT THE NEW DRIVE IS THE SAME SIZE AS OR LARGER THAN THE CURRENT DRIVE(S)
- Shut down the machine
- Install the drive as master on the second IDE channel (hdc) or on the second SATA channel (sdb)
- Boot up
- At the login prompt log on as admin with the root password to get to the admin console
- Go to #5 Manage disk redundancy
It will show the status and progress as the drives sync up. Don't turn off the server until the sync is complete, or it will start syncing again from the beginning. When the sync is done, it will show a good working RAID1.
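You can also follow the resync from a root shell rather than the admin console; the same /proc/mdstat file used later on this page shows the rebuild progress:

watch -n 5 cat /proc/mdstat   # refreshes every 5 seconds; press Ctrl-C to exit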
If the Manage disk redundancy page displays the message "The free disk count must equal one" and "Manual intervention may be required", then you probably have additional hard drives that need to be disconnected while the RAID is set up. An external USB drive will have this effect, and should be unplugged.
First we need to write the partition table from sda (or sdb) to the new disk, sdc:
sfdisk -d /dev/sda > sfdisk_sda.output
sfdisk /dev/sdc < sfdisk_sda.output
Then we need to add the new partitions to the existing arrays:
mdadm --add /dev/md1 /dev/sdc1
mdadm --add /dev/md2 /dev/sdc2
Verify this with:

mdadm --detail /dev/md1
mdadm --detail /dev/md2
/dev/md1:
        Version : 0.90
  Creation Time : Sat Feb  2 22:24:38 2013
     Raid Level : raid1
     Array Size : 104320 (101.89 MiB 106.82 MB)
  Used Dev Size : 104320 (101.89 MiB 106.82 MB)
   Raid Devices : 2
  Total Devices : 3
Preferred Minor : 1
    Persistence : Superblock is persistent

    Update Time : Mon Feb  4 13:28:43 2013
          State : clean
 Active Devices : 2
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 1

           UUID : f97a86c5:8bb46daa:6854855e:558a3e16
         Events : 0.6

    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       8       17        1      active sync   /dev/sdb1
       2       8       33        -      spare        /dev/sdc1
Alternatively you can try this.
cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 sdc1[2](S) sdb1[1] sda1[0]
      104320 blocks [2/2] [UU]

md2 : active raid1 sdc2[2](S) sdb2[1] sda2[0]
      52323584 blocks [2/2] [UU]
(S) = spare   (F) = failed   [0] = number of the disk in the array
You should ensure that grub has been written correctly to the spare disk so that the system will boot from it correctly.

To copy the boot partition (sda = a disk already in the array, sdc = the spare), run the following from within a terminal with administrator privileges:
Step 1
dd if=/dev/sda1 of=/dev/sdc1
Step 2

Issue the following from within a terminal with administrator privileges (the first command opens the grub shell; the rest are entered at its prompt):
grub
device (hd2) /dev/sdc
root (hd2,0)
setup (hd2)
Last of all, try forcing a failure of one of the original two drives and ensure that the server boots and the RAID rebuilds correctly. You may then have to repeat this exercise to get the drives in the correct order (i.e. sda/sdb in the array with sdc as the spare).
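One way to simulate such a failure is with mdadm's --fail and --remove options. The commands below are only a sketch: the device and partition names match the example above and must be adjusted to your system, and this should only be done on a fully synced array with a good backup.

# Mark one member of md2 as faulty, then remove it from the array
mdadm --fail /dev/md2 /dev/sdb2
mdadm --remove /dev/md2 /dev/sdb2

# After confirming the server still boots, add the partition back and let the array rebuild
mdadm --add /dev/md2 /dev/sdb2
cat /proc/mdstat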
Reusing Hard Drives
If a drive was ever used in a Windows machine (or, in some cases, in an older system) then you will need to clear its MBR before installing it.
From the Linux command prompt, type the following:
#dd if=/dev/zero of=/dev/hdx bs=512 count=1
or
#dd if=/dev/zero of=/dev/sdx bs=512 count=1
You MUST reboot so that the empty partition table gets read correctly. For more information, check: http://bugs.contribs.org/show_bug.cgi?id=2154
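If the drive previously belonged to a Linux software RAID array, its old md superblocks can also confuse a new installation. Clearing them is a reasonable extra step; this is a sketch that assumes the old RAID partitions were sdx1 and sdx2, so adjust to the partitions actually present on your disk:

# Remove stale RAID metadata from the old partitions before wiping the MBR
mdadm --zero-superblock /dev/sdx1
mdadm --zero-superblock /dev/sdx2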
Upgrading the Hard Drive Size
Note: these instructions are only applicable if you have a RAID system with more than one drive. They are not applicable to a single-drive RAID 1 system, and increasing the useable space on such a system by cloning the existing single drive to a larger drive is not supported. See http://bugs.contribs.org/show_bug.cgi?id=5311
- CAUTION MAKE A FULL BACKUP!
- Ensure you have e-smith-base-4.16.0-33 or newer installed. [or Update to at least 7.1.3]
HD scenario - current 250 GB drives, new larger 500 GB drives
- Shut down and replace one of the old drives with one of the larger drives. Unplug any USB-connected drives.
- Boot up and use the Manage disk redundancy console option to add the new (larger) drive to the array.
- Wait for raid to fully sync.
- Repeat steps 1-3 until all drives in system are upgraded to larger capacity.
- Ensure all drives have been replaced with larger drives and the array is in sync and redundant!
- Issue the following commands:
mdadm --grow /dev/md2 --size=max
pvresize /dev/md2
lvresize -l +100%FREE main/root
ext2online -C0 /dev/main/root
In the last command above, the -C0 is: dash C zero
If you receive an "command not found" error, try this:
resize2fs /dev/mapper/main-root &
TIP: I put an "&" at end to allow it to run in background even if I close ssh session.
Notes :
- All of this can be done while the server is up and running with the exception of #1.
- These instructions should work for any raid level you have as long as you have >= 2 drives
- If you have disabled lvm
- you don't need the pvresize or lvresize command
- the final line becomes
ext2online -C0 /dev/md2 #(or whatever / is mounted to)
or, if you receive a "command not found" error, try this:
resize2fs /dev/md2 &
Replacing and Upgrading Hard Drive after HD fail
Note: See Bugzilla:6632 and Bugzilla:6630; a suggested sequence for upgrading a hard drive size is detailed there, following an issue encountered when attempting to sync a new drive that was added first as sda.
Note: These instructions are applicable if you have a faulty HD on a RAID system with more than one drive and intend to upgrade the sizes as well as replacing the failed HD. They are not applicable to a single-drive RAID 1 system, and increasing the useable space on such a system by cloning the existing single drive to a larger drive is not supported. See http://bugs.contribs.org/show_bug.cgi?id=5311
- CAUTION MAKE A FULL BACKUP!
- Ensure you have e-smith-base-4.16.0-33 or newer installed. [or Update to at least 7.1.3]
HD scenario - current 250 GB drives, new larger 500 GB drives
- Remove the failed drive from the system, ensure the remaining drive is connected as sda on its own, and boot up.
- Shut down, connect one new 500 GB drive as sdb, and boot up.
- Log in to the admin console and use Manage disk redundancy to add the new (larger) drive to the array.
- Wait for raid to fully sync
- Do full reboot with those 2 drives in place (1 original, 1 new)
- Shut down again, disconnect the original drive, and connect the newly synced drive as sda (in place of the original)
- Boot up again with just the one new drive in place, and confirm it boots OK.
- Shut down and connect the other 500 GB drive as sdb
- Boot up, log in to the admin console, add sdb to the array, and wait for the RAID to fully sync.
- Reboot and ensure all drives have been replaced with larger drives and the array is in sync and redundant!
- Issue the following commands:
mdadm --grow /dev/md2 --size=max
pvresize /dev/md2
lvresize -l +100%FREE main/root
ext2online -C0 /dev/main/root
In the last command above, the -C0 is: dash C zero
If you receive an "command not found" error, try this:
resize2fs /dev/mapper/main-root &
TIP: I put an "&" at end to allow it to run in background even if I close ssh session.
Notes :
- These instructions should work for any raid level you have as long as you have >= 2 drives
- If you have disabled lvm
- you don't need the pvresize or lvresize command
- the final line becomes ext2online -C0 /dev/md2 (or whatever / is mounted to)
Raid Notes
Many on-board hardware RAID cards are in fact software RAID ("fakeraid"). Turn this feature off, as cheap fakeraid cards aren't good; you will get better performance and reliability with Linux software RAID (http://linux-ata.org/faq-sata-raid.html), which is fast and robust.
If you are set on getting hardware RAID, buy a well-supported RAID card which has a proper RAID BIOS. This hides the disks and presents a single disk to Linux (http://linuxmafia.com/faq/Hardware/sata.html). Check that it is supported by the kernel and has some form of management, and avoid anything which requires a driver. Try googling for the exact model of RAID controller before buying it. Note that you won't get a real hardware RAID controller cheap.
It rarely happens, but sometimes when a device has finished rebuilding, its state doesn't change from "dirty" to "clean" until a reboot occurs. This is cosmetic.
Periodic scrub of RAID arrays
A periodic scrub of the RAID arrays (a weekly RAID check) is performed every week on Sunday at 04:22 local time; refer to Bugzilla:3535 and Bugzilla:6160 for more information.
These operations are logged; however, no emails will be sent to admin as of the release of the packages associated with Bug #6160 or the release of the 8.1 ISO.
nospare
If you use the commandline parameter nospare during installation ("sme nospare"), the system will still count the missing spare towards the number of drives. A system with 6 physically present hard drives will thus be formatted as RAID6, _not_ RAID5. The resulting capacity will of course be n-2. Note: with the release of versions 7.6 and 8.0, the commandline parameter "sme nospare" has been changed to "sme spares=0". In addition, you may also select the number of spares implemented [0, 1 or 2].
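For reference, these parameters are typed at the installer boot prompt, for example (a sketch; the boot: prompt shown here is the usual installer prompt, and the spares value is whichever of 0, 1 or 2 you choose):

boot: sme spares=0    # install with no hot-spare reserved
boot: sme spares=2    # install reserving two hot-spares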
Resynchronising a Failed RAID
Sometimes a partition will be taken offline automatically, and admin will receive an email reporting a DegradedArray event on /dev/md2. When that happens, check the health of each drive:
smartctl -a /dev/hda
Where hda is the device to be checked; check all of them.
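A short loop can run the overall SMART health check on every drive at once. This is only a convenience sketch; adjust the glob to hd? or sd? depending on whether your drives are IDE or SCSI/SATA:

for d in /dev/hd[a-d]; do
    echo "== $d =="
    smartctl -H "$d"    # prints the overall-health self-assessment (PASSED or FAILED)
done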
You may check the health of your array using the admin console. Log in as root and type "console", then select item 5, "Manage disk redundancy".
--------Disk Redundancy status as of Thursday Dec 22-------
Current RAID status:

Personalities : [raid1]
md2 : active raid1 hda2[0]            <-- NOTICE hdb2[#] is missing. Means hdb2[#] failed.
      38973568 blocks [2/1] [U_]

md1 : active raid1 hda1[0] hdb1[1]
      104320 blocks [2/2] [UU]

unused devices: <none>

Only Some of the RAID devices are unclean.   <-- NOTICE this message
Manual intervention may be required.         <-- and this message.
Notice the last 2 sentences of the window above. You have some problems.
If your system is healthy, however, the message you will see at the bottom of the RAID console window is:
All RAID devices are in clean state
If you have no software RAID devices you will see the message at the bottom of the Console window:
Your system only has a single disk drive installed or is using hardware
mirroring. If you would like to enable software mirroring, please shut
down, install a second disk drive (of the same capacity) and then return
to this screen.
Additionally, the details of the raid can be seen by inspecting the mdstat file from the shell prompt.
[root@sme]# cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 hda3[0] hdb3[1]
      38837056 blocks [2/2] [UU]

md2 : active raid1 hdb2[1]            <-- Shows current active partition - note there is one missing
      1048704 blocks [2/1] [_U]       <-- '_' = partition missing from array

md0 : active raid1 hda1[0] hdb1[1]
      255936 blocks [2/2] [UU]
Make a note of the raid partition that has failed, shown by [_U]
In this case it is md2, the device being /dev/md2.
The failed drive partition is indicated by the '_' underscore character. In the above example, [_U] indicates that the first drive partition on md2 (multi-device 2) has failed; the second drive partition on md2, represented by the character 'U', is still part of md2. If the second drive partition (that is, hdb2) had failed, the details would be reversed, e.g. [U_], with the underscore in the second position.
Determine the missing physical partition: look carefully at the sample above and work out which drive partition is missing.
In this example, it's hda2, the device being /dev/hda2
md1 : active raid1 hda3[0] hdb3[1]
md2 : active raid1 hda2[0] hdb2[1]    <--- In the above sample, hda2[0] is the missing partition
md0 : active raid1 hda1[0] hdb1[1]
If the raid has a failed disk that has not yet been kicked out of the array then mdstat will show something like the following:
md2 : active raid1 hda2[0](F) hdb2[1]   <-- Shows current active partition - with one FAILED (F)
      1048704 blocks [2/1] [_U]         <-- '_' = partition missing from array
In this case, before you add the disk back in, you will need to remove the failed partition from the array:
[root@sme]# mdadm --remove /dev/md2 /dev/hda2
However, if the drive has already been removed by the operating system, then removing it manually is unnecessary. To determine this, use the command:
mdadm --query --detail /dev/md2
Of course, use the proper md device based on your configuration. This command will give you several lines of output, including the size of the array. Near the end of the output you will see the following if the drive has already been removed, in which case there is no need to remove it again.
    Number   Major   Minor   RaidDevice State
       0       3        2        0      active sync   /dev/hda2
       1       0        0        -      removed       <-- NOTE THIS
To add the physical partition back and rebuild the RAID partition:
[root@sme]# mdadm --add /dev/md2 /dev/hda2
Your devices are likely to be different, and you may have more than two disks, including a hot standby, but the failed partition can always be determined from the mdstat file. Once the RAID resync has started, its progress is recorded in mdstat. You can watch it in real time with:
[root@sme]# watch -n .1 cat /proc/mdstat
or you can see this in a snapshot by:
[root@sme]# cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 hda3[0] hdb3[1]
      38837056 blocks [2/2] [UU]

md2 : active raid1 hda2[2] hdb2[1]
      1048704 blocks [2/1] [_U]
      [=>...................]  recovery =  6.4% (67712/1048704) finish=1.2min speed=13542K/sec

md0 : active raid1 hda1[0] hdb1[1]
      255936 blocks [2/2] [UU]
When recovery is complete, the partitions will all be up:
[root@sme]# cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 hda3[0] hdb3[1]
      38837056 blocks [2/2] [UU]

md2 : active raid1 hda2[0] hdb2[1]
      1048704 blocks [2/2] [UU]

md0 : active raid1 hda1[0] hdb1[1]
      255936 blocks [2/2] [UU]
If this action is required regularly, you should test your disks for SMART errors and physical errors, check your disk cables, and make sure no two hard drives share the same IDE port. See also: http://wiki.contribs.org/Monitor_Disk_Health
Also check your disk controller cards, since a faulty card can destroy the data on a full RAID set as easily as it can on a single disk.
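smartctl can also run the drive's built-in self-tests, which is a reasonable check before trusting a rebuilt array (a sketch; substitute your own device name, and note that the long test can take an hour or more):

smartctl -t long /dev/hda      # start the extended offline self-test
smartctl -l selftest /dev/hda  # later, view the self-test log and results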
Convert Software RAID1 to RAID5
- Login as root
- Move to /boot (we must create a new initrd image to load raid5 driver). cd /boot
- Make a backup copy mv initrd-`uname -r`.img initrd-`uname -r`.img.old
- Create the new image mkinitrd --preload raid5 initrd-`uname -r`.img `uname -r`
- Shut down and install new drive(s) in system.
- Boot up with the SME CD and enter rescue mode. sme rescue
- Skip network setup.
- Skip mounting the current SME installation.
- Now, create the correct partition table on the new drive(s):
sfdisk -d /dev/sda > tmp.out
sfdisk /dev/sdc < tmp.out
- Repeat the last step for each new drive (sdd, sde, etc.).
- Create the new array:
mdadm --create /dev/md2 -c 256 --level=5 --raid-devices=2 /dev/sda2 /dev/sdb2
mdadm: /dev/sda2 appears to be part of a raid array:
    level=raid1 devices=2 ctime=Fri Dec 18 13:17:49 2009
mdadm: /dev/sdb2 appears to be part of a raid array:
    level=raid1 devices=2 ctime=Fri Dec 18 13:17:49 2009
Continue creating array? y
mdadm: array /dev/md2 started.
- Wait for resync; monitor the status with cat /proc/mdstat
root# cat /proc/mdstat
Personalities : [raid0] [raid1] [raid5]
md2 : active raid5 sdb1[2] sda1[0]
      1048512 blocks level 5, 256k chunk, algorithm 2 [2/1] [U_]
      [==>..................]  recovery = 12.5% (132096/1048512) finish=0.8min speed=18870K/sec
- Reboot (type exit to leave rescue mode)
- Login as root
- Add the new drives to the array mdadm --add /dev/md2 /dev/sdc2
- Repeat the last step for each new drive (sdd2, sde2, etc.).
- Grow the array mdadm --grow /dev/md2 --raid-devices=N
- N is the total number of drives: minimum is 3
- Wait for array reshaping. This part can take a substantial amount of time; monitor it with cat /proc/mdstat
root# cat /proc/mdstat
Personalities : [raid0] [raid1] [raid5]
md2 : active raid5 sdc1[2] sdb1[1] sda1[0]
      1048512 blocks super 0.91 level 5, 256k chunk, algorithm 2 [3/3] [UUU]
      [==>..................]  reshape = 12.5% (131520/1048512) finish=2.5min speed=5978K/sec
- Issue the following commands:
pvresize /dev/md2
lvresize -l +100%FREE main/root
resize2fs /dev/main/root
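Once the reshape and filesystem resize have finished, it is worth confirming the new layout before relying on it (a sketch using the device names from this example):

cat /proc/mdstat          # should show raid5 with all members present, e.g. [UUU]
mdadm --detail /dev/md2   # confirms the level, number of devices and clean state
df -h /                   # confirms the enlarged root filesystem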
Notes :
- If you have disabled lvm
- you don't need the pvresize or lvresize command
- the final line becomes resize2fs /dev/md2 (or whatever / is mounted to)
- More info: http://www.arkf.net/blog/?p=47