
5,332 bytes removed ,  00:33, 15 April 2021
==Raid: Manual Rebuild==
{{Warning box|Get it right or you will lose data. '''Take a backup!''' Let the RAID sync; this can take quite a while.}}
Under Re-Write
SME Server's RAID options are largely automated. If you built your system with a single hard disk, simply log on as admin and select Disk Redundancy to add a new drive to your RAID1 array. The same procedure applies if a disk in a RAID array has failed and you have replaced it.
SME Server's RAID options are largely automated. If you built your system with a single hard disk, or have had a hard disk failure, simply log on as ''admin'' and select ''Disk Redundancy'' to add a new drive to your RAID1 array.
Even the best-laid plans don't always work out, however, so these are the steps required to do it manually.
== HowTo Manage/Check a RAID1 Array from the command Line ==
See also: [[Hard Disk Partitioning]] and [[Raid#Resynchronising_a_Failed_RAID]]
=== Are the Disks Partitioned Correctly? ===
==HowTo: Manage/Check a RAID1 Array from the command Line==
===What is the Status of the Array===
Here two disks are partitioned identically
[root@ ~]# '''cat /proc/mdstat'''
Personalities : [raid1]
md2 : active raid1 sdb2[2] sda2[0]
      488279488 blocks [2/1] [U_]
      [=>...................]  recovery =  6.3% (31179264/488279488) finish=91.3min speed=83358K/sec
md1 : active raid1 sdb1[1] sda1[0]
      104320 blocks [2/2] [UU]
unused devices: <none>
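A degraded array can also be picked out of this output without reading it by eye. The following is only a sketch: the function name <code>degraded_arrays</code> is made up for illustration, and it assumes the status field (for example <code>[U_]</code>) is the last field on the "blocks" line, as it is in the listing above.

```shell
#!/bin/sh
# degraded_arrays (hypothetical helper): read /proc/mdstat-style text on stdin
# and print the name of every md array whose status field contains "_",
# i.e. an array with at least one missing or rebuilding member.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }    # remember which array we are in
        /blocks/ && $NF ~ /^\[[U_]+\]$/ && $NF ~ /_/ { print name }
    '
}

# Usage on a live system (assumes /proc/mdstat exists):
#   degraded_arrays < /proc/mdstat
```

With the listing above as input it would print md2, since md2 shows [U_] while md1 shows [UU].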
==HowTo: Reinstate a disk from the RAID1 Array with the command Line==
[root@ ~]# '''fdisk -lu /dev/sda; fdisk -lu /dev/sdb'''
===Look at the mdstat===
Disk /dev/sda: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders, total 1953525168 sectors
Units = sectors of 1 * 512 = 512 bytes
    Device Boot      Start        End      Blocks  Id  System
/dev/sda1  *          63      208844      104391  fd  Linux raid autodetect
/dev/sda2          208845  1953520064  976655610  fd  Linux raid autodetect
Disk /dev/sdb: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders, total 1953525168 sectors
Units = sectors of 1 * 512 = 512 bytes
    Device Boot      Start        End      Blocks  Id  System
/dev/sdb1  *          63      208844      104391  fd  Linux raid autodetect
/dev/sdb2          208845  1953520064  976655610  fd  Linux raid autodetect
==== Example : Incorrectly Partitioned 2nd Disk ====
First we must determine which drive is faulty.
In this example the partitions start too close to the start of the disk, leaving no room for GRUB to be written, so the disk will not boot.
  [root@ ~]# '''fdisk -l /dev/sdb; fdisk -lu /dev/sdb'''
  [root@ ~]#'''cat /proc/mdstat'''
  Personalities : [raid1]
  Disk /dev/sdb: 1000.2 GB, 1000204886016 bytes
  md1 : active raid1 sdb1[1] sda1[0]
255 heads, 63 sectors/track, 121601 cylinders
      104320 blocks [2/2] [UU]
Units = cylinders of 16065 * 512 = 8225280 bytes
  md2 : active raid1 sdb2[2](F) sda2[0]
    Device Boot      Start        End      Blocks  Id  System
      52323584 blocks [2/1] [U_]
/dev/sdb1   *          1         13      104384+  fd  Linux raid autodetect
'''Partition 1 does not end on cylinder boundary.'''
  unused devices: <none>
/dev/sdb2              13      121601  976655647  fd  Linux raid autodetect
  Disk /dev/sdb: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders, total 1953525168 sectors
Units = sectors of 1 * 512 = 512 bytes
    Device Boot      Start        End      Blocks  Id System
/dev/sdb1  *          1      208769      104384+  fd  Linux raid autodetect
'''Partition 1 does not end on cylinder boundary.'''
/dev/sdb2          208770  1953520063  976655647  fd  Linux raid autodetect
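The start sector of the first partition can be checked with a few lines of shell. This is a sketch under stated assumptions: the function names are hypothetical, the input is the output of <code>fdisk -lu</code> in the format shown above, and the threshold of 63 sectors is simply the start sector of the correctly partitioned disk in this example, not a universal rule.

```shell
#!/bin/sh
# first_partition_start (hypothetical helper): read `fdisk -lu <disk>` output
# on stdin and print the start sector of the first partition listed.
first_partition_start() {
    awk '$1 ~ /^\/dev\// {
        # The boot flag "*" takes a column of its own when it is present.
        if ($2 == "*") print $3; else print $2
        exit
    }'
}

# gap_ok (hypothetical helper): succeed when the given start sector leaves at
# least the 63-sector gap seen on the correctly partitioned disk.
gap_ok() {
    [ "$1" -ge 63 ]
}

# Usage on a live system:
#   start=$(fdisk -lu /dev/sdb | first_partition_start)
#   gap_ok "$start" || echo "partition 1 starts at sector $start: too close to the MBR"
```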
===The Leadup===
(S)= Spare
I'm not sure if I'm reporting a bug or just some manual maintenance
(F)= Fail
[0]= number of the disk
My disk didn't respond correctly to the menu option "Manage Disk Redundancy". I was upgrading the hard disks from the 500 GB disks that came with the Dell server to 1 TB disks; the new disks were Seagate 1 TB ST1000340NS, a server-edition model. The same thing happened on both disks.
{{note box|As we can see, the partition sdb2 is faulty: it carries the flag sdb2[2](F). We need to resynchronize the disk sdb to the existing array md2.}}
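The (F) flag can also be extracted mechanically. A minimal sketch, assuming the mdstat layout shown above; <code>failed_members</code> is a made-up name, not an existing tool.

```shell
#!/bin/sh
# failed_members (hypothetical helper): read /proc/mdstat-style text on stdin
# and print "<array> <partition>" for every member carrying the (F) flag.
failed_members() {
    awk '/^md[0-9]+ :/ {
        for (i = 3; i <= NF; i++)
            if ($i ~ /\(F\)$/) {
                dev = $i
                sub(/\[.*/, "", dev)    # "sdb2[2](F)" becomes "sdb2"
                print $1, dev
            }
    }'
}

# Usage on a live system (assumes /proc/mdstat exists):
#   failed_members < /proc/mdstat
```

For the md2 line in the example it would print "md2 sdb2", which are the two names needed for the mdadm --fail and --remove commands.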
The Disk was installed as the 2nd Hard Disk during an Upgrade process
===Fail and remove the disk, '''sdb''' in this case===
''It's not fatal'', but it did stop the machine from booting from that disk. Perhaps that is just ''not living, therefore not fatal''; either way, it's not terribly useful.
mdadm: set /dev/sdb2 faulty in /dev/md2
[root@ ~]# '''mdadm --manage /dev/md2 --fail /dev/sdb2'''
My messages log shows GRUB as follows
mdadm: hot removed /dev/sdb2
[root@ ~]# '''mdadm --manage /dev/md2 --remove /dev/sdb2'''
add_drive_to_raid: Waiting for boot partition to sync before installing grub...
mdadm: set /dev/sdb1 faulty in /dev/md1
add_drive_to_raid: Probing devices to guess BIOS drives. This may take a long time.
  [root@ ~]# '''mdadm --manage /dev/md1 --fail /dev/sdb1'''
add_drive_to_raid:     GNU GRUB  version 0.95  (640K lower / 3072K upper memory)
add_drive_to_raid:  [ Minimal BASH-like line editing is supported.  For the first word, TAB
add_drive_to_raid:    lists possible command completions.  Anywhere else TAB lists the possible
add_drive_to_raid:    completions of a device/filename.]
add_drive_to_raid: grub> device (hd0) /dev/sdb
  add_drive_to_raid: grub> root (hd0,0)
add_drive_to_raid:  Filesystem type is ext2fs, partition type 0xfd
add_drive_to_raid: grub> setup (hd0)
add_drive_to_raid:  Checking if "/boot/grub/stage1" exists... no
add_drive_to_raid:  Checking if "/grub/stage1" exists... yes
add_drive_to_raid:  Checking if "/grub/stage2" exists... yes
add_drive_to_raid:  Checking if "/grub/e2fs_stage1_5" exists... yes
add_drive_to_raid:  Running "embed /grub/e2fs_stage1_5 (hd0)"... failed (this is not fatal)
add_drive_to_raid:  Running "embed /grub/e2fs_stage1_5 (hd0,0)"... failed (this is not fatal)
add_drive_to_raid:  Running "install /grub/stage1 (hd0) /grub/stage2 p /grub/grub.conf "... succeeded
add_drive_to_raid: Done.
add_drive_to_raid: grub> quit
and a look from fdisk's point of view shows
mdadm: hot removed /dev/sdb1
[root@ ~]# '''mdadm --manage /dev/md1 --remove /dev/sdb1'''
Note the correct partitioning on sda
===Do your Disk Maintenance here===
[root@ ~]# fdisk -lu /dev/sda
At this point the disk is idle.
  Disk /dev/sda: 1000.2 GB, 1000204886016 bytes
  [root@ ~]# '''cat /proc/mdstat'''
  255 heads, 63 sectors/track, 121601 cylinders, total 1953525168 sectors
Personalities : [raid1]
  Units = sectors of 1 * 512 = 512 bytes
md1 : active raid1 sda1[0]
      104320 blocks [2/1] [U_]
    Device Boot      Start        End      Blocks  Id  System
  /dev/sda1  *          63      208844      104391  fd  Linux raid autodetect
  md2 : active raid1 sda2[0]
  /dev/sda2          208845 1953520064  976655610  fd Linux raid autodetect
      52323584 blocks [2/1] [U_]
  unused devices: <none>
{{note box|You will have to determine whether your disk can be reinstated in the array. A RAID can get out of sync after a power failure, but it can also happen because of a physical fault in the hard disk. If it occurs repeatedly, the hard disk must be tested. For this we will use '''smartctl'''.}}
To see all the details SMART reports for the disk:
  [root@ ~]# '''smartctl -a /dev/sdb'''
At least two types of test are possible: short (~ 1 min) and long (~ 10 min to 90 min).
  [root@ ~]# '''smartctl -t short /dev/sdb''' #short test
  [root@ ~]# '''smartctl -t long /dev/sdb''' #long test
To access the results / statistics for these tests:
What has happened here is that the disk partition has been written too close to the start of the drive, so the boot record does not have enough room for its GRUB staging (if that's the right term).
[root@ ~]# '''smartctl -l selftest /dev/sdb'''
To correct this, remove the disk from the array: fail it, remove it, repartition it and then add it back to the array.
For more information on how to activate or interpret Self-Monitoring, Analysis and Reporting Technology (SMART), see [[Monitor_Disk_Health]].
{{Warning box|Get it right or you will lose data. Take a backup. I let the RAID sync anyway; I probably didn't need to, but things get confusing here. This was my initial screen, and I thought it looked odd because sdb was the disk that had been added.}}
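For a quick yes/no answer before re-adding a disk, the overall health verdict from <code>smartctl -H</code> can be tested in a script. This is a sketch only: <code>smart_passed</code> is a hypothetical name, and it assumes an ATA/SATA disk, where the verdict line reads "SMART overall-health self-assessment test result: PASSED" (SCSI/SAS disks word this line differently).

```shell
#!/bin/sh
# smart_passed (hypothetical helper): read `smartctl -H <disk>` output on stdin
# and succeed only if the ATA overall-health line reports PASSED.
smart_passed() {
    grep -q 'overall-health self-assessment test result: PASSED'
}

# Usage on a live system (assumes smartctl is installed and /dev/sdb is the
# disk under suspicion):
#   smartctl -H /dev/sdb | smart_passed || echo "do not reuse this disk"
```

A PASSED verdict only means the disk has not predicted its own failure, so the short and long self-tests above are still worth running.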
{{Note box|If you need to replace the disk because the smartctl command found a physical failure, install a new disk of the same capacity (or larger) and enter the following commands to recreate the partitions by copying them from the healthy disk sda.}}<!-- Do NOT try to use sfdisk on disks larger than 2 TiB, use gdisk or similar, see below. -->
{{Note box|I'm using sdb, which was right for me; it might not be for you (with RAID 1, there is a 50% chance it's not!)}}
[root@ ~]# '''sfdisk -d /dev/sda > sfdisk_sda.output'''
[root@ ~]# '''sfdisk /dev/sdb < sfdisk_sda.output'''
[root@ ~]# cat /proc/mdstat
GPT Disks
Personalities : [raid1]
md2 : active raid1 sdb2[2] sda2[0]
      488279488 blocks [2/1] [U_]
      [=>...................]  recovery =  6.3% (31179264/488279488) finish=91.3min speed=83358K/sec
md1 : active raid1 sdb1[1] sda1[0]
      104320 blocks [2/2] [UU]
unused devices: <none>
===Here we go lets fix this===
Larger disks will be GPT disks, on which sfdisk will not work; you will need to use gdisk and partx instead.
[root@ ~]# '''yum install gdisk'''
====First another look at the mdstat====
Then copy the partition table from a good disk to the new disk. The first line copies the partition table from disk sda to sdd; the second randomizes the GUIDs.
[root@ ~]# '''sgdisk /dev/sda -R /dev/sdd'''
[root@ ~]# '''sgdisk -G /dev/sdd'''
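Which of the two copy methods applies can be decided from the disk size. The sketch below encodes only the 2 TiB rule mentioned on this page; <code>partition_tool</code> is a made-up helper, and it assumes that disks at or below the limit carry an MBR partition table, which is not guaranteed (a small disk can be GPT too).

```shell
#!/bin/sh
# partition_tool (hypothetical helper): given a disk size in bytes, print the
# tool suggested on this page for copying its partition table.
partition_tool() {
    limit=2199023255552    # 2 TiB = 2 * 1024^4 bytes
    if [ "$1" -gt "$limit" ]; then
        echo sgdisk        # larger than 2 TiB: GPT, so use sgdisk
    else
        echo sfdisk        # assumed to be MBR, so sfdisk -d can be used
    fi
}

# Usage on a live system (blockdev --getsize64 prints the size in bytes):
#   partition_tool "$(blockdev --getsize64 /dev/sda)"
```

For the 1000204886016-byte disks in this example it prints sfdisk.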
  [root@ ~]# cat /proc/mdstat
To view the partitions use partx
Personalities : [raid1]
  [root@ ~]# '''partx -l /dev/sdd'''
md2 : active raid1 sdb2[1] sda2[0]
      488279488 blocks [2/2] [UU]
md1 : active raid1 sdb1[1] sda1[0]
      104320 blocks [2/2] [UU]
unused devices: <none>
====Then fail and remove the disk, sdb in my case====
[root@ ~]# mdadm --manage /dev/md2 --fail /dev/sdb2
If you want to reinstate the same disk without replacing it, go to the next step.
mdadm: set /dev/sdb2 faulty in /dev/md2
[root@ ~]# mdadm --manage /dev/md2 --remove /dev/sdb2
mdadm: hot removed /dev/sdb2
[root@ ~]# mdadm --manage /dev/md1 --fail /dev/sdb1
mdadm: set /dev/sdb1 faulty in /dev/md1
[root@ ~]# mdadm --manage /dev/md1 --remove /dev/sdb1
mdadm: hot removed /dev/sdb1
====Re-Partition, first clean the old partitions====
===Add the partitions back===
  [root@ ~]# fdisk /dev/sdb
mdadm: hot added /dev/sdb1
  [root@ ~]# '''mdadm --manage /dev/md1 --add /dev/sdb1'''
The number of cylinders for this disk is set to 121601.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
    (e.g., DOS FDISK, OS/2 FDISK)
Command (m for help): p
Disk /dev/sdb: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
    Device Boot      Start        End      Blocks  Id  System
mdadm: hot added /dev/sdb2
/dev/sdb1  *          1          13      104384+  fd  Linux raid autodetect
  [root@ ~]# '''mdadm --manage /dev/md2 --add /dev/sdb2'''
Partition 1 does not end on cylinder boundary.
  /dev/sdb2              13      121601  976655647  fd  Linux raid autodetect
Command (m for help): d
Partition number (1-4): 1
Command (m for help): d
Selected partition 2
Command (m for help): w
The partition table has been altered!
Calling ioctl() to re-read partition table.
Syncing disks.
====Then Create the new partitions====
===Another Look at the mdstat===
Note: change the partitions' system id to Linux raid autodetect
  [root@ ~]# fdisk /dev/sdb
  [root@sme8-64-dev ~]# cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 sdb1[1] sda1[0]
      104320 blocks [2/2] [UU]
md2 : active raid1 sdb2[2] sda2[0]
      52323584 blocks [2/1] [U_]
      [>....................]  recovery =  1.9% (1041600/52323584) finish=14.7min speed=57866K/sec
  The number of cylinders for this disk is set to 121601.
  unused devices: <none>
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
{{note box|With a new disk it may be worthwhile to reinstall GRUB to avoid errors at startup. GRUB is the boot loader, the program that launches the operating system. Enter the following commands.}}
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
    (e.g., DOS FDISK, OS/2 FDISK)
Command (m for help): n
Command action
    e  extended
    p  primary partition (1-4)
Partition number (1-4): 1
First cylinder (1-121601, default 1):
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-121601, default 121601): 13
Command (m for help): n
Command action
    e  extended
    p  primary partition (1-4)
Partition number (1-4): 2
First cylinder (14-121601, default 14):
Using default value 14
Last cylinder or +size or +sizeM or +sizeK (14-121601, default 121601):
Using default value 121601
Command (m for help): m
Command action
    a  toggle a bootable flag
    b  edit bsd disklabel
    c  toggle the dos compatibility flag
    d  delete a partition
    l  list known partition types
    m  print this menu
    n  add a new partition
    o  create a new empty DOS partition table
    p  print the partition table
    q  quit without saving changes
    s  create a new empty Sun disklabel
    t  change a partition's system id
    u  change display/entry units
    v  verify the partition table
    w  write table to disk and exit
    x  extra functionality (experts only)
Command (m for help): a
Partition number (1-4): 1
Command (m for help): p
Disk /dev/sdb: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
    Device Boot      Start        End      Blocks  Id  System
/dev/sdb1  *          1          13      104391  83  Linux
/dev/sdb2              14      121601  976655610  83  Linux
Command (m for help): t
Partition number (1-4): 1
Hex code (type L to list codes): fd
Changed system type of partition 1 to fd (Linux raid autodetect)
Command (m for help): t
Partition number (1-4): 2
Hex code (type L to list codes): fd
Changed system type of partition 2 to fd (Linux raid autodetect)
Command (m for help): p
Disk /dev/sdb: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
    Device Boot      Start        End      Blocks  Id  System
/dev/sdb1  *          1          13      104391  fd  Linux raid autodetect
/dev/sdb2              14      121601  976655610  fd  Linux raid autodetect
Command (m for help): w
The partition table has been altered!
Calling ioctl() to re-read partition table.
Syncing disks.
==HowTo: Write the GRUB boot sector==
====Add the partitions back====
{{Warning box|The dd command is nicknamed "data destroyer", so be extremely careful and be certain of the names of the source and destination partitions. First skip the dd command (Step 1 below) and attempt to install grub without it (Step 2 below). If grub can be installed without using dd, Step 1 can be discarded.}}
[root@ ~]# mdadm --manage /dev/md1 --add /dev/sdb1
mdadm: hot added /dev/sdb1
[root@ ~]# mdadm --manage /dev/md2 --add /dev/sdb2
mdadm: hot added /dev/sdb2
[root@ ~]# '''dd if=/dev/sda1 of=/dev/sdb1'''
====and lastly, write the boot sector====
  [root@ ~]# grub
  [root@ ~]# '''grub'''
     GNU GRUB  version 0.95  (640K lower / 3072K upper memory)
Line 289: Line 150:  
     completions of a device/filename.]
  grub> device (hd0) /dev/sdb
  grub> '''device (hd0) /dev/sdb'''
  grub> root (hd0,0)
  grub> '''root (hd0,0)'''
   Filesystem type is ext2fs, partition type 0xfd
  grub> setup (hd0)
  grub> '''setup (hd0)'''
   Checking if "/boot/grub/stage1" exists... no
   Checking if "/grub/stage1" exists... yes
Line 301: Line 162:  
   Running "embed /grub/e2fs_stage1_5 (hd0)"...  16 sectors are embedded.
   Running "install /grub/stage1 (hd0) (hd0)1+16 p (hd0,0)/grub/stage2 /grub/grub.conf"... succeeded
   Running "install /grub/stage1 (hd0) (hd1)1+16 p (hd0,0)/grub/stage2 /grub/grub.conf"... succeeded
  grub> quit
  grub> '''quit'''
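The same four GRUB commands can be generated for any disk and fed to the GRUB shell non-interactively. Treat this as a sketch: <code>grub_commands</code> is a hypothetical helper, and piping into <code>grub --batch</code> assumes the legacy GRUB shell (version 0.9x, as shown above), not GRUB 2.

```shell
#!/bin/sh
# grub_commands (hypothetical helper): print the GRUB legacy shell commands
# used above, with the given disk mapped to (hd0).
grub_commands() {
    printf 'device (hd0) %s\nroot (hd0,0)\nsetup (hd0)\nquit\n' "$1"
}

# Review the commands first:
#   grub_commands /dev/sdb
# Then, on a system with GRUB legacy, run them in one go:
#   grub_commands /dev/sdb | grub --batch
```

Printing the commands before running them gives a chance to confirm that the disk name is the one you mean.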
and then I can use the wiki's procedure to grow the disk, which is why I am here
David Bray
17 March, 2010
<!-- noinclude>[[Category:Howto]]</noinclude -->
