raid - What will happen if encountered a URE?

Here is what I know about HDD UREs (unrecoverable read errors):

When a hard disk reads a sector whose errors the FEC (forward error correction) data cannot correct, we encounter a URE.

The rate at which UREs occur is very low, but it is not zero.

When reconstructing a RAID 5 array, a URE sometimes occurs and the reconstruction stops.
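To put a rough number on "very low, but not zero", here is a minimal sketch of the usual back-of-the-envelope estimate. It assumes bit errors are independent and uses an illustrative rate of 1 URE per 10^14 bits read; that figure is an assumption, not taken from this post, so check the drive's data sheet. The function name `p_at_least_one_ure` is made up for this example.

```python
import math


def p_at_least_one_ure(bytes_read, ure_rate_per_bit=1e-14):
    """Probability of hitting at least one URE while reading bytes_read bytes.

    Assumes every bit fails independently with probability ure_rate_per_bit
    (an assumed, illustrative value). Real drives do not behave this neatly.
    """
    bits = bytes_read * 8
    # 1 - (1 - rate)**bits, computed with log1p/expm1 to stay accurate
    # when the rate is tiny.
    return -math.expm1(bits * math.log1p(-ure_rate_per_bit))


# Example: reading 12 TB in full, e.g. the surviving disks of an array.
print(round(p_at_least_one_ure(12e12), 3))
```

Under these assumptions, reading more data in one pass raises the chance of hitting a URE, which is why large rebuilds get the attention.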

But I still have some questions:

1. With a single disk, what happens? Does the hardware/file system report an error and we lose the file? Or do we get the file with wrong data?

2. Will rewriting data to the sector that returned the URE make it normal again? Or must we use a utility provided by the HDD manufacturer to remap it to a reserve sector?

3. If it happens while mirroring/re-mirroring a RAID 1/10 array, what will the RAID controller do? Stop the mirroring? Or just copy the incorrect data to the other disk?

Thanks for the answer; questions 1 and 2 are solved.

As for the 3rd question, I mean this: if a URE is encountered while converting a single HDD to a RAID 1 array by adding a new disk, or while replacing a failed disk in a RAID 1/10 array, there is no redundancy to correct the error. Will the mirror/re-mirror complete with bad data, or will it stop like a RAID 5 reconstruction?

Answer

With a single disk, an unrecoverable error is just that - the read cannot be completed, so the error is reported to the filesystem and from there to the application trying to read the file. Generally, it is preferred to get an explicit error instead of unreliable data.
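From the application's side this can be sketched as follows. The sketch assumes a Unix-like system where a failed sector read surfaces as an `OSError` with `errno.EIO`; the function name `scan_file` and the injectable `read_at` parameter are made up for this example (the latter only exists so the error path can be exercised without a failing disk).

```python
import errno
import os


def scan_file(path, chunk_size=4096, read_at=os.pread):
    """Read a file chunk by chunk and return the offsets that failed with EIO."""
    bad_offsets = []
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        for offset in range(0, size, chunk_size):
            try:
                read_at(fd, chunk_size, offset)
            except OSError as exc:
                if exc.errno != errno.EIO:
                    raise  # some other problem, not a media error
                # The read failed outright: no data is returned for this chunk.
                bad_offsets.append(offset)
    finally:
        os.close(fd)
    return bad_offsets
```

The point is that the application gets an explicit failure for the affected range rather than silently wrong bytes; the rest of the file may still be readable.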

Writing to an unreadable sector will either fix the physical sector (for a soft error, e.g. when a write was interrupted by power loss) or cause the drive to remap the logical sector to one from its spare pool. This happens at the discretion of the drive and usually isn't selectable by the user or the driver.
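The two outcomes can be illustrated with a toy model. This is not how any real firmware is implemented; `ToyDrive`, its fields, and the "soft"/"hard" labels are all invented for the example, and the only behaviour it encodes is the one described above.

```python
import errno


class ToyDrive:
    """Toy model of a drive with a small pool of spare sectors (not real firmware)."""

    def __init__(self, sectors, spares):
        self.data = {lba: b"" for lba in range(sectors)}
        self.unreadable = {}   # lba -> "soft" or "hard"
        self.remapped = set()  # logical sectors now backed by a spare
        self.spares_left = spares

    def read(self, lba):
        if lba in self.unreadable:
            raise OSError(errno.EIO, f"unrecoverable read error at LBA {lba}")
        return self.data[lba]

    def write(self, lba, payload):
        if self.unreadable.get(lba) == "hard":
            # Physical sector is damaged: the drive picks a spare on its own.
            if self.spares_left == 0:
                raise OSError(errno.EIO, f"no spare sectors left for LBA {lba}")
            self.spares_left -= 1
            self.remapped.add(lba)
        # A soft error needs no spare: fresh data simply replaces the bad contents.
        self.unreadable.pop(lba, None)
        self.data[lba] = payload
```

In both cases the logical sector reads fine after the write; the caller never chooses which path is taken.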

The RAID controller will most likely repair the sector - either from the mirror or by rebuilding the data from the redundancy set. If a(nother) read error during a mirror or rebuild prevents this repair, the error sticks and the array is bad. Some RAID sets can repair multiple errors (RAID 6 or some nested RAIDs), but once errors pile up you're out of luck.

It's important to make sure errors don't pile up on rarely used sectors - they can become uncorrectable errors when sectors are left unread for months or even years. So, make sure you enable data scrubbing, media patrol, patrol read or whatever it's called on your hardware to check all data on a regular basis. That way, you ensure a rebuild works when you need it.
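How scrubbing is enabled depends entirely on the controller. As one concrete illustration, Linux software RAID (md), which is an assumption here since the answer talks about hardware controllers, starts a scrub when `check` is written to the array's `sync_action` file in sysfs. A minimal sketch, with the array name `md0` and the helper name `request_md_check` chosen for the example:

```python
from pathlib import Path


def request_md_check(array="md0", sysfs_root="/sys/block"):
    """Ask a Linux md array to start a read-and-compare scrub.

    Returns False without doing anything if the array is already busy
    (resyncing, recovering, or checking). Needs root on a real system.
    """
    action = Path(sysfs_root) / array / "md" / "sync_action"
    if action.read_text().strip() != "idle":
        return False
    action.write_text("check\n")
    return True
```

Running something like this from a scheduled job (many distributions already ship one) means every sector gets read regularly, so latent errors are found while the redundancy to repair them still exists.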

Some people report that during a rebuild, additional drives start to fail because of the stress but I've found this to be a myth. The drives just stumble on stale, accumulated errors. You can stress even very old drives for days without any problems.
