
Software vs hardware RAID performance and cache usage




I've been reading a lot on RAID controllers/setups and one thing that comes up a lot is how hardware controllers without cache offer the same performance as software RAID. Is this really the case?



I always thought that hardware RAID cards would offer better performance even without cache. I mean, you have dedicated hardware to perform the tasks. If they really are no faster than software RAID, what is the benefit of getting a RAID card that has no cache, something like an LSI 9341-4i, which isn't exactly cheap?



Also, if a performance gain is only possible with cache, is there a cache configuration that writes to disk right away but keeps data in cache for read operations, making a BBU not a priority?


Answer



In short: if using a low-end RAID card (without cache), do yourself a favor and switch to software RAID. If using a mid-to-high-end card (with BBU or NVRAM), then hardware is often (but not always! see below) a good choice.



Long answer: when computing power was limited, hardware RAID cards had the significant advantage of offloading parity/syndrome calculation for the RAID schemes that involve it (RAID 3/4/5, RAID6, etc.).




However, with ever-increasing CPU performance, this advantage has basically disappeared: even my laptop's ancient CPU (Core i5 M 520, Westmere generation) has XOR performance of over 4 GB/s and RAID-6 syndrome performance of over 3 GB/s on a single execution core.
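
To make the XOR parity mentioned above concrete, here is a minimal Python sketch of single-parity (RAID 5 style) protection. The function names `xor_parity` and `rebuild_missing` are made up for illustration. It is not how Linux MD or any controller implements parity (those use optimized SIMD routines), and it does not cover the RAID-6 syndrome, which needs Galois-field arithmetic.

```python
def xor_parity(blocks):
    """Return the byte-wise XOR of equally sized blocks (hypothetical helper)."""
    if not blocks:
        raise ValueError("need at least one block")
    size = len(blocks[0])
    if any(len(b) != size for b in blocks):
        raise ValueError("all blocks must have the same length")
    acc = 0
    for block in blocks:
        # Treat each block as one big integer so the XOR covers every byte.
        acc ^= int.from_bytes(block, "big")
    return acc.to_bytes(size, "big")


def rebuild_missing(surviving_blocks, parity):
    """Recover the single missing data block from the survivors plus parity."""
    return xor_parity(list(surviving_blocks) + [parity])


# Three data blocks, as if striped over three disks, plus one parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_parity(data)

# Pretend the disk holding data[1] failed and rebuild its contents.
recovered = rebuild_missing([data[0], data[2]], parity)
```

Because XOR is its own inverse, the same routine both generates the parity and rebuilds a lost block, which is part of why this work is so cheap on a modern CPU.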



The advantage that hardware RAID maintains today is the presence of a power-loss-protected DRAM cache, in the form of BBU or NVRAM. This protected cache gives very low latency for random writes (and for reads that hit the cache) and basically transforms random writes into sequential writes. A RAID controller without such a cache is nearly useless. Moreover, some low-end RAID controllers not only come without a cache but also forcibly disable the disk's private DRAM cache, leading to slower performance than with no RAID card at all. Examples are DELL's PERC H200 and H300 cards: unless newer firmware has changed that, they totally disable the disk's private cache (and it cannot be re-enabled while the disks are connected to the RAID controller). Do yourself a favor and never, ever buy such controllers. While even higher-end controllers often disable the disk's private cache, they at least have their own protected cache - making the private cache of HDDs (but not of SSDs!) somewhat redundant.
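
As a rough illustration of how a write-back cache can turn scattered writes into fewer, larger sequential ones, here is a toy Python model. Controller firmware is proprietary, so this is only a sketch of the general idea: `coalesce_writes` and the `(start_lba, sector_count)` tuples are assumptions made up for the example, not any controller's real algorithm.

```python
def coalesce_writes(pending):
    """Merge buffered (start_lba, sector_count) writes into sorted extents.

    Toy model: writes acknowledged from cache are later flushed in LBA
    order, and writes that touch or overlap are merged into one extent.
    """
    extents = []
    for start, count in sorted(pending):
        end = start + count
        if extents and start <= extents[-1][1]:
            # Touches or overlaps the previous extent: grow it.
            extents[-1][1] = max(extents[-1][1], end)
        else:
            extents.append([start, end])
    return [(start, end - start) for start, end in extents]


# Five small writes arriving in random order...
pending = [(100, 8), (8, 8), (108, 8), (0, 8), (116, 4)]

# ...leave the cache as two sequential extents.
flushed = coalesce_writes(pending)  # [(0, 16), (100, 20)]
```

The host sees each small write acknowledged as soon as it reaches the protected cache, while the disks later receive two larger sequential writes instead of five scattered ones.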



This is not the end, though. Even capable controllers (the ones with BBU or NVRAM cache) can give inconsistent results when used with SSDs, basically because SSDs really need a fast private cache for efficient flash page programming/erasing. And while some (most?) controllers let you re-enable the disk's private cache (eg: PERC H700/710/710P let the user re-enable it), if that private cache is not power-loss protected you risk losing data in case of power loss. The exact behavior really is controller and firmware dependent (eg: on a DELL S6/i with 256 MB WB cache and the disk's cache enabled, I had no losses during multiple planned power-loss tests), which leaves uncertainty and much concern.



Open source software RAIDs, on the other hand, are much more controllable beasts - their software is not enclosed inside proprietary firmware, and they have well-defined metadata patterns and behaviors. Software RAID makes the (right) assumption that the disk's private DRAM cache is not protected but is, at the same time, critical for acceptable performance - so it typically does not disable the cache; rather, it uses ATA FLUSH / FUA commands to be certain that critical data lands on stable storage. As software arrays often run from the SATA ports attached to the chipset southbridge, their bandwidth is very good and driver support is excellent.
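
From the application side, the counterpart of those flush commands is an explicit sync request. The following Python sketch shows a synchronous ("durable") write using `os.fsync`. The helper name `durable_write` and the file name are made up for the example. Whether the request actually reaches the disk as a cache flush depends on the OS, filesystem, mount options and drive, so treat this as an illustration rather than a guarantee.

```python
import os
import tempfile


def durable_write(path, payload):
    """Write payload, then ask the OS to push it to stable storage."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        view = memoryview(payload)
        while view:
            # os.write may write fewer bytes than requested; loop until done.
            written = os.write(fd, view)
            view = view[written:]
        # Block until the kernel reports the data as flushed.
        os.fsync(fd)
    finally:
        os.close(fd)


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "journal.bin")
    durable_write(target, b"commit record")
    with open(target, "rb") as fh:
        stored = fh.read()
```

Databases and filesystem journals issue this kind of call on every commit, which is why their performance is dominated by how quickly the storage stack can acknowledge a flush.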



However, if used with mechanical HDDs, synchronized random write access patterns (eg: databases, virtual machines) will greatly suffer compared to a hardware RAID controller with WB cache. On the other hand, when used with enterprise SSDs (ie: with a power-loss-protected write cache), software RAID often excels and gives results even better than what is achievable with hardware RAID cards. That said, remember that consumer SSDs (read: with a non-protected writeback cache), while very good at reading and async writing, deliver very low IOPS in synchronized write workloads.




Also consider that software RAIDs are not all created equal. Windows software RAID has a bad reputation, performance-wise, and even Storage Spaces seems not too different. Linux MD Raid is exceptionally fast and versatile, but the Linux I/O stack is composed of multiple independent pieces that you need to understand carefully to extract maximum performance. ZFS parity RAID (RAIDZ) is extremely advanced but, if not correctly configured, can give you very poor IOPS; mirroring+striping, on the other hand, performs quite well. In any case, ZFS needs a fast SLOG device for synchronous write handling (ZIL).



Bottom line:




  1. if your workloads are not sensitive to synchronized random writes, you don't need a RAID card

  2. if you need a RAID card, do not buy a RAID controller without WB cache

  3. if you plan to use SSDs, software RAID is preferred, but keep in mind that for heavy synchronized random writes you need a power-loss-protected SSD (ie: Intel S4600, Samsung PM/SM863, etc). For pure performance the best choice is probably Linux MD Raid, but nowadays I generally use striped ZFS mirrors. If you cannot afford to lose half the space to mirrors and you need ZFS's advanced features, go with RAIDZ but think carefully about your VDEV setup.

  4. if, even with SSDs, you really need a hardware RAID card, use SSDs with power-loss-protected caches (Micron M500/550/600 have partial protection - not really sufficient but better than nothing - while Intel DC and S series have full power loss protection, and the same can be said for enterprise Samsung SSDs)


  5. if you need RAID6 and you will use normal, mechanical HDDs, consider buying a fast RAID card with 512 MB (or more) of WB cache. RAID6 has a high write performance penalty, and a properly sized WB cache can at least provide fast intermediate storage for small synchronous writes (eg: filesystem journal).

  6. if you need RAID6 with HDDs but you can't / don't want to buy a hardware RAID card, think carefully about your software RAID setup. For example, a possible solution with Linux MD Raid is to use two arrays: a small RAID10 array for journal writes / DB logs, and a RAID6 array for raw storage (as a fileserver). On the other hand, software RAID5/6 with SSDs is very fast, so you probably don't need a RAID card for an all-SSD setup.
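
To put the RAID6 write penalty from points 5 and 6 into rough numbers, here is a back-of-the-envelope Python sketch. It uses the common rule of thumb that one small random write costs 2 disk I/Os on RAID1/10, 4 on RAID5 (read data and parity, write data and parity) and 6 on RAID6 (the same with two parity blocks). The figure of 150 IOPS per disk is an assumption for illustration only, and the model ignores caches, full-stripe writes and queueing.

```python
# Disk I/Os needed per small random write (rule-of-thumb values).
WRITE_PENALTY = {"raid0": 1, "raid1": 2, "raid10": 2, "raid5": 4, "raid6": 6}


def random_write_iops(level, disks, iops_per_disk):
    """Estimate small random write IOPS for an array, with no cache."""
    return disks * iops_per_disk / WRITE_PENALTY[level]


# Eight HDDs at an assumed 150 random IOPS each.
raid6 = random_write_iops("raid6", disks=8, iops_per_disk=150)    # 200.0
raid10 = random_write_iops("raid10", disks=8, iops_per_disk=150)  # 600.0
```

Under these assumptions the RAID10 layout sustains about three times the small random writes of RAID6 on the same disks, which is the reasoning behind putting journals and DB logs on a small RAID10 array or in front of a WB cache.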

