It’s the moment we all dread: the notification telling us that there is a failed hard drive in our storage array. The thoughts start racing: “I hope the disk array will rebuild properly”, “I hope nothing else fails during the replacement” and so on. It was just before Christmas that my dad forwarded me an email from his Synology asking “what do I do next?”, and this is what we did!
Drive Replacement Instructions
In this scenario the Synology had been set up with SHR-2, which is effectively a software RAID 6. As far as I can tell it is Linux md RAID plus LVM under the covers (this blog has further details). RAID 6 keeps two disks’ worth of parity, spread across all the drives rather than held on two dedicated disks, so the array can withstand two drive failures at once.
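To make the RAID 6 trade-off concrete, here is a minimal Python sketch of the capacity arithmetic. It assumes every disk is the same size. SHR-2 with mixed disk sizes allocates space differently, so treat this as a rough guide and not as Synology's own calculation. The function name and the example figures (five 4 TB disks) are mine, purely for illustration.

```python
def raid6_summary(disk_count: int, disk_size_tb: float) -> dict:
    """Usable capacity and fault tolerance for RAID 6 with equal-size disks.

    Assumption: all disks are identical in size. Mixed-size SHR-2 pools
    are laid out differently and are not modelled here.
    """
    if disk_count < 4:
        raise ValueError("RAID 6 needs at least 4 disks")
    parity_disks_worth = 2  # two disks' worth of space goes to parity
    return {
        "raw_tb": disk_count * disk_size_tb,
        "usable_tb": (disk_count - parity_disks_worth) * disk_size_tb,
        "tolerated_failures": 2,
    }


if __name__ == "__main__":
    # Hypothetical example: five 4 TB disks
    print(raid6_summary(5, 4.0))
```

With five 4 TB disks you give up 8 TB of the raw 20 TB in exchange for surviving any two simultaneous drive failures.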
- You will first see a notification within DSM (the Synology administration page) or via email, telling you that a drive has failed and which one it is.


- Open Storage Manager and click on Overview in the left-hand menu.

- Within the main panel of this window, you should see Drive Information, showing you visually which disk has failed. This should correlate with the message you saw in Step 1.


- From the left hand menu, now select the HDD/SSD option located at the bottom of the list. From the main panel, you will then need to select the drive that has failed, and from the Action menu at the top of the window, click “Deactivate Drive”. This then removes the disk from operation.

- I thought it wise to also run an extended disk check (available through the ‘Health Info’ option), to check and validate the rest of the disks.

- I then went ahead and physically pulled the drive out of the system (ensuring it was the correct one, shown in step 3!), and replaced it with a new drive. Depending on the model of system you may have to power the Synology down to do this.
- The new disk then showed up in the GUI as ‘Not Initialized’.

- You should now select the ‘Not Initialized’ drive, and click on ‘Manage Available Drives’. This will present you with options to “Assign as hot spare”, “create a new storage pool”, or the option we want, “Repair Storage Pool”.

- You will then be guided through a wizard to repair the storage pool. First select the pool you’d like to repair (in my scenario, it’s Storage Pool 1, which matches what we saw in step 1). Click Next.

- Now confirm the drive you wish to use for the repair (in my scenario, it’s disk 2, which matches what we saw in step 1), then click Next.

- You will then see confirmation of the action you are going to take. When you are happy, click ‘Apply’.

- You will then receive a second confirmation. If you are sure, click ‘OK’.

- The final phase will take many hours, depending on the amount of data and how large your disks are. The system will go through an “Initialization” phase, and will then automatically proceed to a copying phase. In my case it took about 8 hours, although the estimated time remaining fluctuated a little along the way.
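If you would rather keep an eye on the rebuild from a terminal than from the GUI, the Linux md driver reports progress in /proc/mdstat. The Python sketch below pulls the percentage, estimated time left and speed out of that text. It assumes you have SSH enabled on the Synology and that the pool shows up in /proc/mdstat. The sample text is made up to show the usual layout and is not output from my dad's system. The names SAMPLE_MDSTAT and rebuild_progress are my own.

```python
import re
from typing import Optional

# Illustrative sample only (not captured from a real system); it follows
# the usual layout of a rebuilding array in /proc/mdstat.
SAMPLE_MDSTAT = """\
md2 : active raid6 sdb5[4] sda5[0] sdd5[3] sdc5[2]
      7804393088 blocks super 1.2 level 6, 64k chunk, algorithm 2 [4/3] [U_UU]
      [==>..................]  recovery = 12.6% (491676764/3902196544) finish=412.3min speed=137890K/sec
"""

# Matches the progress line printed while an array is rebuilding or resyncing.
PROGRESS = re.compile(
    r"(?P<action>recovery|resync)\s*=\s*(?P<percent>[\d.]+)%.*?"
    r"finish=(?P<minutes>[\d.]+)min\s+speed=(?P<speed>\d+)K/sec"
)


def rebuild_progress(mdstat_text: str) -> Optional[dict]:
    """Return progress details, or None if no rebuild line is present."""
    match = PROGRESS.search(mdstat_text)
    if match is None:
        return None
    return {
        "action": match.group("action"),
        "percent": float(match.group("percent")),
        "hours_left": round(float(match.group("minutes")) / 60, 1),
        "speed_mb_s": round(int(match.group("speed")) / 1024, 1),
    }


if __name__ == "__main__":
    # On the NAS itself you could read the live file instead:
    #     text = open("/proc/mdstat").read()
    print(rebuild_progress(SAMPLE_MDSTAT))
```

The "finish" figure is only the kernel's running estimate, so expect it to drift while the rebuild is in progress.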


Key Takeaways
- Set up notifications for important events on your Synology. This will enable you to react to issues promptly, before they get worse.
- Ensure you have a plan for a failure scenario. A failure will happen; it’s just a case of when!
- Make sure you follow a good backup strategy. RAID is no substitute for a proper backup methodology. You should adopt one that meets your level of risk, fault tolerance and investment. A tried and tested plan is the 3-2-1 backup rule.
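The 3-2-1 rule is simple enough to write down as a check: at least three copies of your data, on at least two different types of media, with at least one copy offsite. Here is a small Python sketch of that check. The DataCopy class, the function name and the example copies are all hypothetical, just to illustrate the rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCopy:
    name: str      # where the copy lives, e.g. "Synology NAS"
    media: str     # type of media, e.g. "nas", "usb-disk", "cloud"
    offsite: bool  # True if stored away from the primary location


def meets_3_2_1(copies: list[DataCopy]) -> bool:
    """Check the 3-2-1 rule: 3+ copies, 2+ media types, 1+ offsite."""
    media_types = {c.media for c in copies}
    has_offsite = any(c.offsite for c in copies)
    return len(copies) >= 3 and len(media_types) >= 2 and has_offsite


if __name__ == "__main__":
    # Hypothetical setup: the NAS, a USB disk at home, and a cloud backup
    plan = [
        DataCopy("Synology NAS", "nas", offsite=False),
        DataCopy("USB disk in a drawer", "usb-disk", offsite=False),
        DataCopy("Cloud backup", "cloud", offsite=True),
    ]
    print(meets_3_2_1(plan))
```

Note that the disks inside a single RAID array count as one copy here, which is exactly why RAID on its own does not satisfy the rule.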