In almost all cases, RAID provides redundancy that allows the loss of one or more disks to be tolerated. However, you must first know when a failure occurs so you can act, and then know what to do. It is not enough to remove the failing disk and put another one in its place for everything to fall back into order. Here is a small survival manual for administering a RAID array.


As explained in the introduction, all RAID levels other than RAID 0 are fault tolerant. The tolerance level depends on the number of drives, the RAID level, and the RAID configuration. Our article on RAID levels allows you to calculate fault tolerance for your configuration.

The diagnosis

It all begins with awareness: you cannot fix a problem you do not know about. By default, mdadm sends an email to root when the RAID enters a degraded state. Depending on your configuration, you may not receive emails sent to root; you can, however, change the address the alert is sent to in /etc/mdadm/mdadm.conf by modifying the line:

MAILADDR my-email@mail.com
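To check that alerts actually reach that address, you can ask mdadm to send a test message for each array it knows about; a quick check, assuming the mdadm monitor and a working local mail setup:

# Send a TestMessage alert for every array in the config, then exit
mdadm --monitor --scan --test --oneshot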

That said, by the time the RAID goes degraded, you have usually already lost a disk entirely. The earlier problems are detected, the easier they are to fix. For my part, I regularly take a look at the system logs to check for traces of errors or failing sectors on the disks.

I advise you to install Logwatch. This software sends you, every day (or every week), a report on the activity of your servers. The linked article also explains how to set up a mail server for sending email (useful if you do not already have one to receive mdadm alerts).
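On a Debian or Ubuntu server, installing it is typically a one-liner (the package name may differ on other distributions):

apt-get install logwatch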

Whether you have configured Logwatch or you inspect the logs manually in /var/log/syslog, look for errors of this type:

end_request: I/O error, dev sdb, sector 3848483864
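If you would rather scan for these than read the whole log, a simple grep over the syslog and the kernel ring buffer does the job (log paths can differ between distributions):

# Look for I/O errors in the system log and the kernel ring buffer
grep -i "I/O error" /var/log/syslog
dmesg | grep -i "I/O error"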

If you see such disk errors, it is advisable to look more closely at the SMART status of your disks. Start by listing your disks and partitions with fdisk -l or parted -l (if you have GPT partitions).

root@st1:~# parted -l
Model: ATA ST2000DM001-1CH1 (scsi)
Disk /dev/sda: 2000GB
Sector size (logical/physical): 512B/4096B
Partition Table: gpt

Number  Start   End     Size    File system     Name     Flags
 1      20,5kB  1049kB  1029kB                  primary  bios_grub
 2      2097kB  21,0GB  21,0GB  ext4            primary  raid
 3      21,0GB  2000GB  1979GB  ext4            primary  raid
 4      2000GB  2000GB  536MB   linux-swap(v1)  primary


Model: ATA HGST HUS724020AL (scsi)
Disk /dev/sdb: 2000GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt

Number  Start   End     Size    File system     Name     Flags
 1      20,5kB  1049kB  1029kB                  primary  bios_grub
 2      2097kB  21,0GB  21,0GB  ext4            primary  raid
 3      21,0GB  2000GB  1979GB                  primary  raid
 4      2000GB  2000GB  536MB                   primary


Model: ATA HGST HUS724020AL (scsi)
Disk /dev/sdc: 2000GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt

Number  Start   End     Size    File system     Name     Flags
 1      20,5kB  1049kB  1029kB                  primary  bios_grub
 2      2097kB  21,0GB  21,0GB  ext4            primary  raid
 3      21,0GB  2000GB  1979GB                  primary  raid
 4      2000GB  2000GB  536MB   linux-swap(v1)  primary


Model: ATA HGST HUS724020AL (scsi)
Disk /dev/sdd: 2000GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt

Number  Start   End     Size    File system     Name     Flags
 1      20,5kB  1049kB  1029kB                  primary  bios_grub
 2      2097kB  21,0GB  21,0GB  ext4            primary  raid
 3      21,0GB  2000GB  1979GB                  primary  raid
 4      2000GB  2000GB  536MB   linux-swap(v1)  primary


Model: Linux Software RAID Array (md)
Disk /dev/md2: 21,0GB
Sector size (logical/physical): 512B/4096B
Partition Table: loop

Number  Start  End     Size    File system  Flags
 1      0,00B  21,0GB  21,0GB  ext4


Model: Linux Software RAID Array (md)
Disk /dev/md3: 3958GB
Sector size (logical/physical): 512B/4096B
Partition Table: loop

Number  Start  End     Size    File system  Flags
 1      0,00B  3958GB  3958GB  ext4

This shows that I have four physical disks of 2TB each, on which two software RAID arrays are built (/dev/md2 and /dev/md3). As we saw above, we have read errors caused by failing sectors on sdb. We therefore inspect the SMART status of this disk more closely:

root@st1:~# smartctl -a -d ata /dev/sdb
smartctl 5.41 2017-06-09 r3365 [x86_64-linux-3.10.23-xxxx-std-ipv6-64-rescue] (local build)
Copyright (C) 2002-17 by Bruce Allen, http://smartmontools.sourceforge.net

[...]

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   111   099   006    Pre-fail  Always       -       120551345
  3 Spin_Up_Time            0x0003   095   095   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       32
  5 Reallocated_Sector_Ct   0x0033   097   097   036    Pre-fail  Always       -       4008
  7 Seek_Error_Rate         0x000f   077   060   030    Pre-fail  Always       -       4351310992
  9 Power_On_Hours          0x0032   079   079   000    Old_age   Always       -       18725
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       32
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   089   089   000    Old_age   Always       -       11
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0 0 0
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   068   056   045    Old_age   Always       -       32 (Min/Max 26/35)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       31
193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       46
194 Temperature_Celsius     0x0022   032   044   000    Old_age   Always       -       32 (0 16 0 0)
197 Current_Pending_Sector  0x0012   082   082   000    Old_age   Always       -       3056
198 Offline_Uncorrectable   0x0010   082   082   000    Old_age   Offline      -       3056
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       18708h+109m+27.415s
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       24242600021
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       112149279603

[...]

Attribute 1 shows that a very large number of read errors have occurred (a raw value of 120551345!). The disk's age, recorded in attribute 9, confirms that it is old: 18725 hours / 24 ≈ 780 days, i.e. more than two years of continuous operation. Given the number of errors, this disk should have gone in the trash long ago, and given its age, it has served its time. It must now be replaced with a new one.
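If you want further confirmation before condemning a disk, you can also run a SMART self-test and read back its result (a quick sketch; the short test finishes in a couple of minutes, the long one takes hours):

# Launch a short self-test, then read the self-test log once it has finished
smartctl -t short /dev/sdb
smartctl -l selftest /dev/sdb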

Even when a disk is worn out, your RAID is not necessarily in a degraded state. A degraded RAID means that the array is still operational but at least one disk has been dropped from it. Still, when a disk shows signs of advanced wear, it is more prudent to replace it without waiting.

You can check the RAID status by displaying the contents of /proc/mdstat:

root@st1:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md3 : active raid6 sdc3[2] sdd3[3] sda3[0]
      3865011200 blocks level 6, 512k chunk, algorithm 2 [4/3] [U_UU]

md2 : active raid1 sdc2[2] sdd2[3] sda2[0]
      20478912 blocks [4/3] [U_UU]

We can see here that a disk has been dropped from the array, since both md2 and md3 show [4/3]. This means that only three of the four disks in each array are in use, and [U_UU] tells us that the second one is out of the RAID. md3 is a RAID made up of the partitions sdc3, sdd3 and sda3; we can therefore deduce that sdb is the failing disk.
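You can confirm this with mdadm itself, which names the removed or faulty member explicitly (a quick check on one of the arrays):

# Show the full state of the array, including removed/faulty members
mdadm --detail /dev/md3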

The operation

The operation consists of changing the disk. In general, the easiest way is to power the machine off to swap the disk, but it is also possible to do it hot. If you want to replace it without shutting the machine down, you will need to remove the defective disk from the RAID first.

For obvious reasons, it is impossible to remove a non-failed disk from a running array, so we first mark the drive as faulty if it has not already been kicked out of the RAID. In our case, if sdb had not already been dropped, we would have had to mark the partitions sdb2 and sdb3 as faulty in md2 and md3:

mdadm --manage /dev/md2 --set-faulty /dev/sdb2
mdadm --manage /dev/md3 --set-faulty /dev/sdb3
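In that same hypothetical case, once marked faulty, the partitions would also have to be removed from the arrays before pulling the drive; a sketch of that step:

# Remove the failed partitions from their respective arrays
mdadm --manage /dev/md2 --remove /dev/sdb2
mdadm --manage /dev/md3 --remove /dev/sdb3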

Once the faulty drive is out of the RAID array(s), it can safely be physically disconnected and replaced with a new one.
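Before pulling anything, it is worth double-checking which physical drive sdb actually is; the serial number reported by SMART can be matched against the label on the drive (a small precaution, assuming smartmontools is installed):

# Print the model and serial number of the disk to be replaced
smartctl -i /dev/sdb | grep -i -E "model|serial"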

Once the new disk is installed, it is not yet partitioned; parted or fdisk will show it like this:

Error: Unable to open /dev/sdb - unrecognized disk label.

However, if you refer back to the parted or fdisk output from the beginning, you can see that all our disks share the same partition layout. Each disk has a 1979GB partition dedicated to the RAID 6:

 3      21,0GB  2000GB  1979GB  ext4  primary  raid

We will simply clone the partition table of one of the healthy disks, sda for example, onto sdb.

# If the partitions are GPT
sgdisk -R=/dev/sdb /dev/sda

# If the partition table is MBR
sfdisk -d /dev/sda | sfdisk --force /dev/sdb
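Note that sgdisk -R copies the disk and partition GUIDs along with the layout, so on a GPT disk it is wise to randomize the GUIDs of the new disk afterwards:

# Give the cloned disk and its partitions fresh, unique GUIDs
sgdisk -G /dev/sdb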

We can verify that the disk is now partitioned by displaying its partition table again. All that remains is to add the partitions back into the RAID arrays to start rebuilding them:

# First add the partition sdb3 to md3 (RAID 6)
mdadm --manage /dev/md3 --add /dev/sdb3

# Then do the same with sdb2 and md2
mdadm --manage /dev/md2 --add /dev/sdb2

And now we have only to watch the rebuilding progress!

# Note that md3 is rebuilding and md2 is waiting its turn ;)
root@st1:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md3 : active raid6 sdb3[4] sdc3[2] sdd3[3] sda3[0]
      3865011200 blocks level 6, 512k chunk, algorithm 2 [4/3] [U_UU]
      [>....................]  recovery =  4.9% (96507008/1932505600) finish=281.9min speed=108540K/sec

md2 : active raid1 sdb2[4] sdc2[2] sdd2[3] sda2[0]
      20478912 blocks [4/3] [U_UU]
      	resync=DELAYED

The rebuild may take more or less time depending on the size of your drives and the RAID level, so be patient!
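To follow the progress live, watch works nicely; and if the resync is crawling, you can raise the kernel's minimum rebuild speed (the value below is only an example, in KB/s per device):

# Refresh the RAID status every 5 seconds
watch -n 5 cat /proc/mdstat

# Raise the minimum reconstruction speed (KB/s per device)
echo 50000 > /proc/sys/dev/raid/speed_limit_min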

So, did you manage to salvage the situation?
