{{Note box| SME Server's RAID options are largely automated, but even with the best laid plans things don't always go according to plan. See also: [[Raid:Manual Rebuild]], [[Raid:Growing]] and [[Hard Disk Partitioning]]. There is also a wiki on Linux software RAID, where you will find many [https://raid.wiki.kernel.org/index.php/Linux_Raid cool tips] }}

===Hard Drives – Raid===
From SME Server 8 a new feature was introduced: automatic configuration of Software RAID 1, 5 or 6. RAID is a way of storing data on more than one hard drive at once, so that if one drive fails, the system will still function.

{{Note box| As per the [http://lists.contribs.org/pipermail/updatesannounce/2014-June/000366.html '''release notes'''], a default SME Server 9 install will only configure a RAID 1 array regardless of the number of hard drives; selectable install options for other RAID configurations are available from the install menu}}

Your server will be automatically configured as follows:
* 1 Drive - Software RAID 1 (degraded RAID1 mirror ready to accept a second drive).
* 2 Drives - Software RAID 1
* 3 Drives - Software RAID 1 + 1 Hot-spare
* 4-6 Drives - Software RAID 5 + 1 Hot-spare
* 7+ Drives - Software RAID 6 + 1 Hot-spare

As per the above note, on SME Server 9.0 the RAID 1 configuration will add the third drive as a full member of the array and not as a spare:
* 1 Drive - Software RAID 1 (degraded RAID1 mirror ready to accept a second drive).
* 2 Drives - Software RAID 1
* 3 Drives - Software RAID 1

If you use a true hardware RAID controller to manage your hard drives and choose noraid during install, your system will still be configured with software RAID 1.
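
Whichever layout the installer chooses, you can confirm it afterwards from a root shell. This is only a quick check; the array names below (md1, md2) follow the default SME 8 layout, and as noted later on this page SME 9 uses /dev/md1 for the main array:

 # list every md array, its RAID level and its member partitions
 cat /proc/mdstat
 # show full detail for a single array
 mdadm --detail /dev/md2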
    
====Hard Drive Layout====
IDE 1 Master - Drive 1  <br />
IDE 1 Slave - CDROM  <br />
IDE 2 Master - Drive 2  <br />

'''This section is obsolete with SATA hard drives, because each disk has its own channel.'''

====Identifying Hard Drives====
You can identify a drive (its model and serial number) with smartctl, for example:
 smartctl -i /dev/hda
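
On a SATA or SCSI system the drives appear as /dev/sda, /dev/sdb and so on rather than /dev/hda. A quick way to see which drives the kernel knows about (the device name below is only an example):

 # list all block devices and their partitions
 cat /proc/partitions
 # then query a specific drive
 smartctl -i /dev/sda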

====Adding another Hard Drive Later (Raid1 array only)====
ENSURE THAT THE NEW DRIVE IS THE SAME SIZE AS OR LARGER THAN THE CURRENT DRIVE(S)
 
* Shut down the machine  
 
* Shut down the machine  
* Install drive as master on the second IDE channel (hdc)  
+
* Install drive as master on the second IDE channel (hdc) or the second SATA channel (sda)
 
* Boot up  
 
* Boot up  
* Log on as admin to get to the admin console  
+
* At the login prompt log on as admin with the root password to get to the admin console  
 
* Go to #5 Manage disk redundancy  
 
* Go to #5 Manage disk redundancy  

It will show the status and progress while the drives are syncing up. Don't turn off the server until the sync is complete, or it will start syncing again from the beginning. When it is done syncing, it will show a good working Raid1.

If the Manage disk redundancy page displays the message "The free disk count must equal one" and "Manual intervention may be required", then you probably have additional hard drives that need to be disconnected while the RAID is set up. An external USB drive will have this effect, and should be unplugged.

{{Note box| The addition of another drive via the admin console is restricted to a Raid1 that is degraded, i.e. when the system has been installed with a single drive (/dev/hda with /dev/hdc added later, or their SATA equivalents). The addition of a third drive to a Raid1 '''(i.e. a spare)''' is not recognized by the system. To add a spare you need to use the management tool '''mdadm''' at the command line}}

{{Note box|The following assumes the system is installed with a Raid1 array functioning with two disks, sda and sdb, and that you want to add another disk, sdc, as a spare (to be added to the array automatically if one disk of the array fails). This HowTo can be adapted to other types of RAID as long as you want to add a spare disk.}}

First we need to copy the partition table from sda (or sdb) to sdc:

 sfdisk -d /dev/sda > sfdisk_sda.output
 sfdisk /dev/sdc < sfdisk_sda.output

Then we need to add the new partitions to the existing arrays:

 mdadm --add /dev/md1 /dev/sdc1
 mdadm --add /dev/md2 /dev/sdc2

Verify this with:

 mdadm --detail /dev/md1
 mdadm --detail /dev/md2

 /dev/md1:
         Version : 0.90
   Creation Time : Sat Feb  2 22:24:38 2013
      Raid Level : raid1
      Array Size : 104320 (101.89 MiB 106.82 MB)
   Used Dev Size : 104320 (101.89 MiB 106.82 MB)
    Raid Devices : 2
   Total Devices : 3
 Preferred Minor : 1
     Persistence : Superblock is persistent
 
     Update Time : Mon Feb  4 13:28:43 2013
           State : clean
  Active Devices : 2
 Working Devices : 3
  Failed Devices : 0
   Spare Devices : 1
 
            UUID : f97a86c5:8bb46daa:6854855e:558a3e16
          Events : 0.6
 
     Number   Major   Minor   RaidDevice State
        0       8        1        0      active sync   /dev/sda1
        1       8       17        1      active sync   /dev/sdb1
 
        2       8       33        -      spare   /dev/sdc1

Alternatively you can check /proc/mdstat:

 cat /proc/mdstat
 Personalities : [raid1]
 md1 : active raid1 sdc1[2](S) sdb1[1] sda1[0]
       104320 blocks [2/2] [UU]
      
 md2 : active raid1 sdc2[2](S) sdb2[1] sda2[0]
       52323584 blocks [2/2] [UU]

 (S) = Spare
 (F) = Failed
 [n] = the device's number within the array

You should ensure that grub has been written correctly to the spare disk so that it will boot correctly.

{{Warning box|Grub is unable to install itself on an empty disk or empty partitions; to have the spare fully working and booting after a sync, the boot partition on the spare drive needs to be duplicated:}}

{{Warning box|As the dd command is nicknamed "data destroyer", you need to be extremely careful and sure of the names of the source and destination partitions. At first you should skip the dd command (Step 1 below) and attempt to install grub without it (Step 2 below). If grub can be installed without using dd, then Step 1 can be discarded.}}

To copy the boot partition (sda = a disk already in the array, sdc = the spare), from within a terminal with administrator privileges:

Step 1
 dd if=/dev/sda1 of=/dev/sdc1

Then issue the following from within a terminal with administrator privileges to install grub onto the spare disk:

Step 2
 grub
 device (hd2) /dev/sdc
 root (hd2,0)
 setup (hd2)

Last of all, try forcing a failure of one of the original two drives and ensure that the server boots and the RAID rebuilds correctly. You may then have to repeat this exercise to get the drives in the correct order (i.e. sda/sdb in the array with sdc as the spare).
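
If you prefer not to pull a drive physically, a failure can also be simulated from the command line. This is only a sketch; the device names are examples and must be adjusted to your own layout, and it only exercises the rebuild — to verify that the spare actually boots the system you still need to test with a drive disconnected:

 # mark one member of the array as faulty, remove it, then add it back to trigger a rebuild
 mdadm --manage /dev/md2 --fail /dev/sdb2
 mdadm --manage /dev/md2 --remove /dev/sdb2
 mdadm --manage /dev/md2 --add /dev/sdb2
 # watch the rebuild progress
 cat /proc/mdstat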
    
====Reusing Hard Drives====
From the Linux command prompt, type the following to zero the first sector (the MBR and partition table) of the drive — make sure hdx/sdx really is the disk you intend to wipe:

  #dd if=/dev/zero of=/dev/hdx bs=512 count=1
or, if the drive is SATA/SCSI:

  #dd if=/dev/zero of=/dev/sdx bs=512 count=1

You MUST reboot so that the empty partition table gets read correctly.
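
After the reboot you can confirm that the drive now has an empty partition table (sdx is a placeholder for your device):

 # should report that the drive contains no valid partitions
 fdisk -l /dev/sdx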
====Upgrading the Hard Drive Size====

Note: these instructions are only applicable if you have a RAID system with more than one drive. They are not applicable to a single-drive RAID 1 system, and increasing the useable space on such a system by cloning the existing single drive to a larger drive is not supported. See http://bugs.contribs.org/show_bug.cgi?id=5311

HD Scenario - current 250GB drives, new larger 500GB drives

# Shut down and swap one of the old drives for one of the larger drives. Unplug any USB-connected drives.
# Boot up, log in to the admin console and use option 5 (Manage disk redundancy) to add the new (larger) drive to the array.
# Wait for the raid to fully sync.
# Repeat steps 1-3 until all drives in the system are upgraded to the larger capacity.
# Ensure all drives have been replaced with larger drives and the array is in sync and redundant!
# Issue the following commands:

{{Note box|SME9 uses /dev/md1, not /dev/md2.}}

 mdadm --grow /dev/md2 --size=max
* All of this can be done while the server is up and running, with the exception of step 1.
* These instructions should work for any raid level you have, as long as you have >= 2 drives.
* If you have disabled lvm you don't need the pvresize or lvresize commands, and the final line becomes
ext2online -C0 /dev/md2 <nowiki>#</nowiki>(or whatever / is mounted to)
If you receive a "command not found" error, try this instead:
resize2fs /dev/md2 &
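
For reference, with the default LVM layout the complete sequence after the last drive has been swapped looks roughly like the sketch below. This is an outline only: the volume group and logical volume names used here (main/root) and the md device are assumptions that must be checked against your own system (pvdisplay, lvdisplay and cat /proc/mdstat will show them), and on SME 9 the array is /dev/md1:

 # grow the RAID device to use all of the new, larger partitions
 mdadm --grow /dev/md2 --size=max
 # grow the LVM physical volume, then the root logical volume, into the new space
 pvresize /dev/md2
 lvresize -l +100%FREE main/root
 # finally grow the filesystem online (ext2online on SME 7, resize2fs on later releases)
 ext2online -C0 /dev/main/root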
    
====Replacing and Upgrading Hard Drive after HD fail====

Note: See [[Bugzilla:6632]] and [[Bugzilla:6630]]; a suggested sequence for upgrading the hard drive size is detailed there, following an issue when attempting to sync a new drive that was added first as sda.

Note: These instructions are applicable if you have a faulty HD on a RAID system with more than one drive and intend to upgrade the sizes as well as replacing the failed HD. They are not applicable to a single-drive RAID 1 system, and increasing the useable space on such a system by cloning the existing single drive to a larger drive is not supported. See http://bugs.contribs.org/show_bug.cgi?id=5311
    
* CAUTION MAKE A FULL BACKUP!
# Reboot and ensure all drives have been replaced with larger drives and the array is in sync and redundant!
# Issue the following commands:

{{Note box|SME9 uses /dev/md1, not /dev/md2.}}

 mdadm --grow /dev/md2 --size=max
Notes:
* These instructions should work for any raid level you have, as long as you have >= 2 drives.
* If you have disabled lvm you don't need the pvresize or lvresize commands, and the final line becomes
ext2online -C0 /dev/md2 <nowiki>#</nowiki>(or whatever / is mounted to)
If you receive a "command not found" error, try this instead:
resize2fs /dev/md2 &
    
====Raid Notes====

its state doesn't change from "dirty" to "clean" until a reboot occurs.

This is cosmetic

====Periodic scrub of RAID arrays====
A periodic scrub of the RAID arrays (weekly raid-check) is performed every week on Sunday at 04:22 local time; refer to [[Bugzilla:3535]] and [[Bugzilla:6160]] for more information.

These operations are logged; however, no emails will be sent to admin as of the release of the packages associated with Bug #6160 or the release of the 8.1 ISO.
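
If you do not want to wait for the weekly cron job, the same kind of consistency check can be started by hand through the md sysfs interface. This is a sketch only, assuming the array you want to check is md2 (adjust for your layout):

 # start a check of the array (read all members and compare them)
 echo check > /sys/block/md2/md/sync_action
 # follow its progress
 cat /proc/mdstat
 # afterwards, show the mismatch count reported by the last check
 cat /sys/block/md2/md/mismatch_cnt
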
==== Receive periodic check of Raid by email ====

There are routines in SME Server that check the RAID and send mail to the admin user when the RAID is degraded or resynchronizing. But the admin user receives a lot of email, and from time to time messages can be overlooked.
The purpose of the routine below is to send a status email to the user of your choice each week.

 nano /etc/cron.weekly/raid-status.sh

You have to change the variable '''DEST="stephane@your-domaine-name.org"''' to the email address you want to use.

 #!/bin/sh
 # cron.weekly/mdadm-status -- weekly status of the RAID
 # 2013 Pierre-Alain Bandinelli
 # distributed under the terms of the Artistic Licence 2.0
 
 # Get status from the RAID arrays and send the details by email.
 # Email will go to the address set in the DEST variable below.
 set -eu
 
 MDADM=/sbin/mdadm
 [ -x $MDADM ] || exit 0 # package may be removed but not purged
 
 '''DEST="stephane@your-domaine-name.org"'''
 exec $MDADM --detail  $(ls /dev/md*) | mail -s "RAID status SME Server" $DEST

Save with Ctrl+X, then make the script executable:

 chmod +x /etc/cron.weekly/raid-status.sh

Each Sunday at around 4:00 AM you will receive a mail which looks like this:

 /dev/md1:
         Version : 0.90
   Creation Time : Sun Jan  6 20:50:41 2013
      Raid Level : raid1
      Array Size : 104320 (101.89 MiB 106.82 MB)
   Used Dev Size : 104320 (101.89 MiB 106.82 MB)
    Raid Devices : 2
   Total Devices : 2
 Preferred Minor : 1
     Persistence : Superblock is persistent
 
     Update Time : Sun Dec 22 04:22:42 2013
           State : clean
  Active Devices : 2
 Working Devices : 2
  Failed Devices : 0
   Spare Devices : 0
 
            UUID : 28745adb:d9cff1f4:fcb31dd8:ff24cb0c
          Events : 0.208
 
     Number   Major   Minor   RaidDevice State
        0       8        1        0      active sync   /dev/sda1
        1       8       17        1      active sync   /dev/sdb1
 
 /dev/md2:
         Version : 0.90
   Creation Time : Sun Jan  6 20:50:42 2013
      Raid Level : raid1
      Array Size : 262036096 (249.90 GiB 268.32 GB)
   Used Dev Size : 262036096 (249.90 GiB 268.32 GB)
    Raid Devices : 2
   Total Devices : 2
 Preferred Minor : 2
     Persistence : Superblock is persistent
 
     Update Time : Sun Dec 22 05:30:36 2013
           State : clean
  Active Devices : 2
 Working Devices : 2
  Failed Devices : 0
   Spare Devices : 0
 
            UUID : c343c79e:91c01009:fcde78b4:bad0b497
          Events : 0.224
 
     Number   Major   Minor   RaidDevice State
        0       8        2        0      active sync   /dev/sda2
        1       8       18        1      active sync   /dev/sdb2

If you want to test the message without waiting for next Sunday, you can run the script by hand:

 /etc/cron.weekly/raid-status.sh
    
====nospare====

A system with 6 physically present hard drives will thus be formatted as Raid6, ''not'' Raid5. The resulting capacity of course will be "n-2".

'''Note:''' with the release of versions 7.6 and 8.0, the commandline parameter "sme nospare" has been changed to "sme spares=0". In addition, you may also select the number of spares implemented [0, 1 or 2].

==== Remove the degraded raid ====
When you install SME Server with a single drive, the RAID is degraded and you will see a 'U_' state in /proc/mdstat, but without warnings. If you want to leave just one 'U' in /proc/mdstat and stop all future questions about your degraded raid state, then run:
 mdadm --grow /dev/md0 --force --raid-devices=1
 mdadm --grow /dev/md1 --force --raid-devices=1

After that you will see this:

 # cat /proc/mdstat
 Personalities : [raid1]
 md0 : active raid1 sda1[0]
       255936 blocks super 1.0 [1/1] [U]
      
 md1 : active raid1 sda2[0]
       268047168 blocks super 1.1 [1/1] [U]
       bitmap: 2/2 pages [8KB], 65536KB chunk
 
 unused devices: <none>
    
====Resynchronising a Failed RAID====

{{tip box|You can refer to 'man mdadm', the [http://www.linuxmanpages.com/man8/mdadm.8.php mdadm man page], or [[Raid:Manual_Rebuild]]}}
    
Sometimes a partition will be taken offline automatically. Admin will receive an email '''DegradedArray event on /dev/md2'''.

{{note box|This will happen if, for example, a read or write error is detected on a disk in the RAID set, or a disk does not respond fast enough, causing a timeout. As a precaution, verify the health of your disks as documented in [[Monitor_Disk_Health]], and specifically with the command:}}

 smartctl -a /dev/hda
 
Where hda is the device to be checked; check all of them.
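
To run a quick health check over several drives in one go (the device list below is only an example; adjust it to match your system):

 # print the overall SMART health verdict for each drive
 for d in /dev/hda /dev/hdc; do
     echo "== $d =="
     smartctl -H $d
 done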
 [root@sme]# mdadm --add /dev/md2 /dev/hda2

Once you type the command, the following message will appear (with the device appropriate to your system):

 [root@sme]  mdadm: hot added /dev/hda2

It is important to know that your devices are likely to be different; e.g. your device could be /dev/sda2, or you may have more than two disks, including a hot standby. These details can always be determined from the mdstat file. Once the raid resync has been started, the progress will be noted in mdstat. You can watch this in real time with:

 [root@sme]# watch -n .1 cat /proc/mdstat
Also check your disk controller cards, since a faulty card can destroy the data on a full RAID set as easily as it can a single disk.

{{Tip box|You can use a shortcut for the raid rebuild (fail, remove and re-add the partition in one command):
 mdadm -f /dev/md2 /dev/hda2 -r /dev/hda2 -a /dev/hda2}}
    
====Convert Software RAID1 to RAID5====
{{Note box|msg=these instructions are only applicable if you have SME8 or greater and a RAID1 system with 2 hard drives in sync; the new drive(s) must be of the same size as or larger than the current drive(s)}}
{{Warning box|msg=Please make a full backup before proceeding}}
{{Warning box|msg=Newer versions of mdadm use the v1.x superblocks stored at the beginning of the block device, which could overwrite the filesystem metadata. You'll need to be starting with a v0.9 metadata device for these instructions to work (which was the default for years). First, check the existing superblock version with:
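For example (a sketch, assuming the data array is /dev/md2; adjust to your system):
 # the "Version" field of the output shows the superblock metadata format (e.g. 0.90)
 mdadm --detail /dev/md2 | grep -i version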
 sfdisk /dev/sdc < tmp.out

</li><li>Repeat the last step for each new hd (sdd, sde etc.).
</li><li>Create the new array

 mdadm --create /dev/md2 -c 256 --level=5 --raid-devices=2 /dev/sda2 /dev/sdb2
 mdadm --add /dev/md2 /dev/sdc2

</li><li>Repeat the last step for each new hd (sdd2, sde2 etc.)

</li><li>Grow the array