
Revision as of 12:40, 22 March 2010

Raid: Manual Rebuild

Skill level: Advanced
The instructions on this page may require deviations from standard procedures. A good understanding of Linux and Koozali SME Server is recommended.



Warning:
Get this right or you will lose data. Take a backup first, and let the RAID sync complete before proceeding.



Warning:
This page is under re-write.


SME Server's RAID options are largely automated: if you built your system with a single hard disk, or have had a hard disk failure, simply log on as admin and select "Manage Disk Redundancy" to add a new drive to your RAID1 array.

HowTo: Manage/Check a RAID1 Array from the Command Line

What is the Status of the Array?

[root@ ~]# cat /proc/mdstat
Personalities : [raid1]
md2 : active raid1 sdb2[2] sda2[0]
      488279488 blocks [2/1] [U_]
      [=>...................]  recovery =  6.3% (31179264/488279488) finish=91.3min speed=83358K/sec
md1 : active raid1 sdb1[1] sda1[0]
      104320 blocks [2/2] [UU]

unused devices: <none>
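While an array is rebuilding, you will want to keep an eye on the resync percentage. A small helper along these lines can pull the progress figure out of /proc/mdstat (the sync_pct function is hypothetical, not part of SME Server); it is shown here against the sample recovery line from the transcript above, but on a live system you would feed it "cat /proc/mdstat" instead:

```shell
# Hypothetical helper, not part of SME Server: extract the resync
# percentage from /proc/mdstat output. On a live system:
#   cat /proc/mdstat | sync_pct
sync_pct() { grep -o 'recovery = *[0-9.]*%' | grep -o '[0-9.]*%'; }

# Demonstrated on the sample recovery line from the transcript above:
sample='[=>...................]  recovery =  6.3% (31179264/488279488) finish=91.3min speed=83358K/sec'
echo "$sample" | sync_pct   # prints 6.3%
```

Running "watch cat /proc/mdstat" gives the same information interactively, refreshing every couple of seconds.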

Are the Disks Partitioned Correctly?

Here the two disks are partitioned identically:

[root@ ~]# fdisk -lu /dev/sda; fdisk -lu /dev/sdb

Disk /dev/sda: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders, total 1953525168 sectors
Units = sectors of 1 * 512 = 512 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *          63      208844      104391   fd  Linux raid autodetect
/dev/sda2          208845  1953520064   976655610   fd  Linux raid autodetect

Disk /dev/sdb: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders, total 1953525168 sectors
Units = sectors of 1 * 512 = 512 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *          63      208844      104391   fd  Linux raid autodetect
/dev/sdb2          208845  1953520064   976655610   fd  Linux raid autodetect
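Eyeballing two fdisk listings for identical layout is error-prone. One sketch of a mechanical comparison is to mask the device letter and compare the rows; the norm helper below is hypothetical, demonstrated on single sample rows from the listing above rather than live fdisk output:

```shell
# Hypothetical helper: mask the device letter so partition-table rows
# from two disks can be compared mechanically.
norm() { sed 's|/dev/sd[ab]|/dev/sdX|g'; }

a='/dev/sda1   *          63      208844      104391   fd  Linux raid autodetect'
b='/dev/sdb1   *          63      208844      104391   fd  Linux raid autodetect'
[ "$(echo "$a" | norm)" = "$(echo "$b" | norm)" ] && echo "layouts match"
```

On a live system the full tables could be compared the same way, e.g. piping each "fdisk -lu" listing through norm into temporary files and running diff on them.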

Example: Incorrectly Partitioned 2nd Disk

In this example the partitions start too close to the beginning of the disk, leaving no room for GRUB to be written. There is not enough space for the GRUB staging files, so the disk will not boot.

[root@ ~]# fdisk -l /dev/sdb; fdisk -lu /dev/sdb

Disk /dev/sdb: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *           1          13      104384+  fd  Linux raid autodetect
Partition 1 does not end on cylinder boundary.
/dev/sdb2              13      121601   976655647   fd  Linux raid autodetect

Disk /dev/sdb: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders, total 1953525168 sectors
Units = sectors of 1 * 512 = 512 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *           1      208769      104384+  fd  Linux raid autodetect
Partition 1 does not end on cylinder boundary.
/dev/sdb2          208770  1953520063   976655647   fd  Linux raid autodetect
Message log showing the GRUB errors:
add_drive_to_raid: Waiting for boot partition to sync before installing grub...
add_drive_to_raid: Probing devices to guess BIOS drives. This may take a long time.
add_drive_to_raid:
add_drive_to_raid:
add_drive_to_raid:     GNU GRUB  version 0.95  (640K lower / 3072K upper memory)
add_drive_to_raid:
add_drive_to_raid:  [ Minimal BASH-like line editing is supported.  For the first word, TAB
add_drive_to_raid:    lists possible command completions.  Anywhere else TAB lists the possible
add_drive_to_raid:    completions of a device/filename.]
add_drive_to_raid: grub> device (hd0) /dev/sdb
add_drive_to_raid: grub> root (hd0,0)
add_drive_to_raid:  Filesystem type is ext2fs, partition type 0xfd
add_drive_to_raid: grub> setup (hd0)
add_drive_to_raid:  Checking if "/boot/grub/stage1" exists... no
add_drive_to_raid:  Checking if "/grub/stage1" exists... yes
add_drive_to_raid:  Checking if "/grub/stage2" exists... yes
add_drive_to_raid:  Checking if "/grub/e2fs_stage1_5" exists... yes
add_drive_to_raid:  Running "embed /grub/e2fs_stage1_5 (hd0)"... failed (this is not fatal)
add_drive_to_raid:  Running "embed /grub/e2fs_stage1_5 (hd0,0)"... failed (this is not fatal)
add_drive_to_raid:  Running "install /grub/stage1 (hd0) /grub/stage2 p /grub/grub.conf "... succeeded
add_drive_to_raid: Done.
add_drive_to_raid: grub> quit
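The "embed ... failed (this is not fatal)" lines are the telltale: GRUB 0.9x embeds its e2fs_stage1_5 file in the gap between the MBR and the first partition, and the later successful run on this page reports "16 sectors are embedded", so roughly 16 spare 512-byte sectors are needed there. A back-of-envelope check (the room helper is hypothetical, for illustration only):

```shell
# Sketch: does the gap before the first partition leave room for GRUB
# stage1_5? Assumption based on the successful run on this page, which
# embeds 16 sectors; "room" is a hypothetical helper, not a real tool.
room() {
  gap=$(( $1 - 1 ))   # 512-byte sectors between the MBR and partition 1
  if [ "$gap" -ge 16 ]; then
    echo "ok: $gap spare sectors"
  else
    echo "too tight: $gap spare sectors"
  fi
}
room 63   # good layout, sda1 starts at sector 63 -> ok: 62 spare sectors
room 1    # bad layout, sdb1 starts at sector 1  -> too tight: 0 spare sectors
```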

HowTo: Remove a Disk from the RAID1 Array from the Command Line

Look at /proc/mdstat:

[root@ ~]# cat /proc/mdstat
Personalities : [raid1]
md2 : active raid1 sdb2[1] sda2[0]
      488279488 blocks [2/2] [UU]

md1 : active raid1 sdb1[1] sda1[0]
      104320 blocks [2/2] [UU]

unused devices: <none>

Fail and remove the disk, sdb in this case

[root@ ~]# mdadm --manage /dev/md2 --fail /dev/sdb2
mdadm: set /dev/sdb2 faulty in /dev/md2
[root@ ~]# mdadm --manage /dev/md2 --remove /dev/sdb2
mdadm: hot removed /dev/sdb2
[root@ ~]# mdadm --manage /dev/md1 --fail /dev/sdb1
mdadm: set /dev/sdb1 faulty in /dev/md1
[root@ ~]# mdadm --manage /dev/md1 --remove /dev/sdb1
mdadm: hot removed /dev/sdb1
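The four commands above follow one pattern per partition: fail the member, then remove it. As a sketch, the sequence can be scripted; RUN=echo below makes this a dry run that only prints the mdadm commands, and you would clear RUN only after double-checking the device names:

```shell
# Dry run: prints the mdadm commands instead of executing them.
# Set RUN= (empty) only after confirming sdb is really the disk to pull.
RUN=echo
for pair in "md2 sdb2" "md1 sdb1"; do
  set -- $pair                                 # $1 = array, $2 = member
  $RUN mdadm --manage /dev/$1 --fail /dev/$2
  $RUN mdadm --manage /dev/$1 --remove /dev/$2
done
```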

Add the partitions back

[root@ ~]# mdadm --manage /dev/md1 --add /dev/sdb1
mdadm: hot added /dev/sdb1
[root@ ~]# mdadm --manage /dev/md2 --add /dev/sdb2
mdadm: hot added /dev/sdb2
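Re-adding follows the same pattern in reverse. A dry-run sketch, relying on the md1/sdb1 and md2/sdb2 pairing used throughout this page:

```shell
# Dry run: prints the re-add commands; set RUN= (empty) to execute.
RUN=echo
for p in 1 2; do
  $RUN mdadm --manage /dev/md$p --add /dev/sdb$p
done
```

Once the partitions are added, cat /proc/mdstat should show the recovery progress, as in the first section of this page.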

Partition / Re-Partition the Disk

Delete Existing Partitions

[root@ ~]# fdisk /dev/sdb

The number of cylinders for this disk is set to 121601.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
   (e.g., DOS FDISK, OS/2 FDISK)

Command (m for help): p

Disk /dev/sdb: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *           1          13      104384+  fd  Linux raid autodetect
Partition 1 does not end on cylinder boundary.
/dev/sdb2              13      121601   976655647   fd  Linux raid autodetect

Command (m for help): d
Partition number (1-4): 1

Command (m for help): d
Selected partition 2

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.

Create new partitions

Note: change each partition's system id to fd (Linux raid autodetect).

[root@ ~]# fdisk /dev/sdb

The number of cylinders for this disk is set to 121601.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
   (e.g., DOS FDISK, OS/2 FDISK)

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-121601, default 1):
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-121601, default 121601): 13

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 2
First cylinder (14-121601, default 14):
Using default value 14
Last cylinder or +size or +sizeM or +sizeK (14-121601, default 121601):
Using default value 121601

Command (m for help): a
Partition number (1-4): 1

Command (m for help): t
Partition number (1-4): 1
Hex code (type L to list codes): fd
Changed system type of partition 1 to fd (Linux raid autodetect)

Command (m for help): t
Partition number (1-4): 2
Hex code (type L to list codes): fd
Changed system type of partition 2 to fd (Linux raid autodetect)

Command (m for help): p

Disk /dev/sdb: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *           1          13      104391   fd  Linux raid autodetect
/dev/sdb2              14      121601   976655610   fd  Linux raid autodetect
 
Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.
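Retyping the layout in interactive fdisk is fiddly. Since sda already carries the correct table, an alternative sketch is to clone it with sfdisk (sfdisk -d dumps a partition table, including the fd type ids and the bootable flag, in a form sfdisk can re-apply). Shown as a dry run (RUN=echo) because writing the wrong disk here is destructive:

```shell
# Dry run: prints the clone command instead of running it.
# sfdisk -d dumps sda's partition table; piping it into sfdisk /dev/sdb
# writes the identical layout to sdb. Set RUN= (empty) only when sure
# that sdb is the disk to overwrite.
RUN=echo
$RUN sh -c 'sfdisk -d /dev/sda | sfdisk /dev/sdb'
```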

Write the GRUB boot sector

[root@ ~]# grub

    GNU GRUB  version 0.95  (640K lower / 3072K upper memory)

 [ Minimal BASH-like line editing is supported.  For the first word, TAB
   lists possible command completions.  Anywhere else TAB lists the possible
   completions of a device/filename.]

grub> device (hd0) /dev/sdb

grub> root (hd0,0)
 Filesystem type is ext2fs, partition type 0xfd

grub> setup (hd0)
 Checking if "/boot/grub/stage1" exists... no
 Checking if "/grub/stage1" exists... yes
 Checking if "/grub/stage2" exists... yes
 Checking if "/grub/e2fs_stage1_5" exists... yes
 Running "embed /grub/e2fs_stage1_5 (hd0)"...  16 sectors are embedded.
succeeded
 Running "install /grub/stage1 (hd0) (hd0)1+16 p (hd0,0)/grub/stage2 /grub/grub.conf"... succeeded
Done.

grub> quit






The Leadup

I'm not sure if I'm reporting a bug or just documenting some manual maintenance.

My disk didn't respond correctly to the menu option "Manage Disk Redundancy". I was upgrading the hard disks from the 500GB drives that came with the Dell server to 1TB disks; the new disks were Seagate 1TB ST1000340NS, a Server Edition disk. It did this on both disks.

The disk was installed as the second hard disk during an upgrade process.

It's not fatal, but it did stop the machine from booting from the disk; perhaps that's just not living, therefore not fatal. Whatever, it's not terribly useful.


A look from fdisk's view shows:

Note the correct partitioning on sda:

[root@ ~]# fdisk -lu /dev/sda

Disk /dev/sda: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders, total 1953525168 sectors
Units = sectors of 1 * 512 = 512 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *          63      208844      104391   fd  Linux raid autodetect
/dev/sda2          208845  1953520064   976655610   fd  Linux raid autodetect

What has happened here is that the partition has been written too close to the start of the drive, so the boot record hasn't got enough room for its GRUB staging files, if that's the right term.

To correct this, remove the disk from the array: you will need to fail it, then remove it, then repartition it and add it back to the array.


Note:
I'm using sdb, which was right for me; it might not be right for you (if it's RAID 1, there's a 50% chance it's not!)


Here we go, let's fix this

And then I can use the wiki's procedure to grow the disk, which is why I am here.

David Bray

17 March, 2010