Skip to main content

ext4 - How does SSD meta-data corruption on power-loss happen? And can I minimize it?



Note: This is a follow-up question to Is there a way to protect SSD from corruption due to power loss?. I got good info there but it basically centered in three area, "get a UPS", "get better drives", or how to deal with Postgres reliability.



But what I really want to know is whether there is anything I can do to protect the SSD against meta-data corruption especially in old writes. To recap the problem. It's an ext4 filesystem on Kingston consumer-grade SSDs with write-cache enabled and we're seeing these kinds of problems:




  • files with the wrong permissions


  • files that have become directories (for example, toggle.wav is now a directory with files in it)

  • directories that have become files (not sure of content..)

  • files with scrambled data



The problem is less with these things happening on data that's being written while the drive goes down, or shortly before. It's a problem but it's expected and I can handle that in other ways.



The bigger surprise and problem is that there is meta-data corruption happening on the disk in areas that were not recently written to (ie, a week or more before).



I'm trying to understand how such a thing can happen at the disk/controller level. What's going on? Does the SSD periodically "rebalance" and move blocks around so even though I'm writing somewhere else? Like this:




enter image description here



And then there is a power loss when D is being rewritten. There may be pieces left on block 1 and some on block 2. But I don't know if it works this way. Or maybe there is something else happening..?



In summary - I'd like to understand how this can happen and if there anything I can do to mitigate the problem at the OS level.



Note: "get better SSDs" or "use a UPS" are not valid answers here - we are trying to move in that direction but I have to live with the reality on the ground and find the best outcome with what we have now. If there is no solution with these disks and without a UPS, then I guess that's the answer.



References:




Is post-sudden-power-loss filesystem corruption on an SSD drive's ext3 partition "expected behavior"?
This is similar but it's not clear if he was experiencing the kinds of problems we are.



EDIT:
I've also been reading issues with ext4 that might have problems with power-loss. Ours are journaled, but I don't know about anything else.



Prevent data corruption on ext4/Linux drive on power loss



http://www.pointsoftware.ch/en/4-ext4-vs-ext3-filesystem-and-why-delayed-allocation-is-bad/



Answer



For how metadata corruption can happen after an unexpected power failure, give a look at my other answer here.



Disabling cache can significantly reduce the likehood of in-flight data loss; however, based on your SSDs, data-at-rest remain at risk of being corrupted. Moreover, it commands a massive performance loss (I saw 500+ MB/s SSDs to write at a mere 5 MB/s after disabling the private DRAM cache).



If you can't trust your SSDs, the only "solution" (or, rather, workaround) is to use an end-to-end checksumming filesystem as ZFS or BTRFS and a RAID1/mirror setup: in this manner, any eventual single-device (meta)data corruption can be recovered from the other mirror side by running a check/scrub.


Comments

Popular posts from this blog

linux - iDRAC6 Virtual Media native library cannot be loaded

When attempting to mount Virtual Media on a iDRAC6 IP KVM session I get the following error: I'm using Ubuntu 9.04 and: $ javaws -version Java(TM) Web Start 1.6.0_16 $ uname -a Linux aud22419-linux 2.6.28-15-generic #51-Ubuntu SMP Mon Aug 31 13:39:06 UTC 2009 x86_64 GNU/Linux $ firefox -version Mozilla Firefox 3.0.14, Copyright (c) 1998 - 2009 mozilla.org On Windows + IE it (unsurprisingly) works. I've just gotten off the phone with the Dell tech support and I was told it is known to work on Linux + Firefox, albeit Ubuntu is not supported (by Dell, that is). Has anyone out there managed to mount virtual media in the same scenario?

hp proliant - Smart Array P822 with HBA Mode?

We get an HP DL360 G8 with an Smart Array P822 controller. On that controller will come a HP StorageWorks D2700 . Does anybody know, that it is possible to run the Smart Array P822 in HBA mode? I found only information about the P410i, who can run HBA. If this is not supported, what you think about the LSI 9207-8e controller? Will this fit good in that setup? The Hardware we get is used but all original from HP. The StorageWorks has 25 x 900 GB SAS 10K disks. Because the disks are not new I would like to use only 22 for raid6, and the rest for spare (I need to see if the disk count is optimal or not for zfs). It would be nice if I'm not stick to SAS in future. As OS I would like to install debian stretch with zfs 0.71 as file system and software raid. I have see that hp has an page for debian to. I would like to use hba mode because it is recommend, that zfs know at most as possible about the disk, and I'm independent from the raid controller. For us zfs have many benefits,

apache 2.2 - Server Potentially Compromised -- c99madshell

So, low and behold, a legacy site we've been hosting for a client had a version of FCKEditor that allowed someone to upload the dreaded c99madshell exploit onto our web host. I'm not a big security buff -- frankly I'm just a dev currently responsible for S/A duties due to a loss of personnel. Accordingly, I'd love any help you server-faulters could provide in assessing the damage from the exploit. To give you a bit of information: The file was uploaded into a directory within the webroot, "/_img/fck_uploads/File/". The Apache user and group are restricted such that they can't log in and don't have permissions outside of the directory from which we serve sites. All the files had 770 permissions (user rwx, group rwx, other none) -- something I wanted to fix but was told to hold off on as it wasn't "high priority" (hopefully this changes that). So it seems the hackers could've easily executed the script. Now I wasn't able