
Extendable RAID storage system



I am currently building a storage unit for our office. It is rather low budget at the moment, but it needs to be extendable.




Basically, we have a huge database that will grow quite heavily over the next few months. Ideally, we would just like to throw hard disks at our new server.



We have not purchased the server yet, but we are going through some details. However, I would like to get an answer to one question first.



How easy is it to expand existing RAID systems?



We will start with two 4 TB WD Black HDDs. After about a month we will need to add another two 4 TB disks. The server we are going to get has 12 bays.



Mirroring is important. However, RAID 1 only works with 2 disks. RAID 10 would already allow us to mirror a RAID 0, and from what I have seen, even RAID 10 can be set up with two disks. But what happens after that? Is there any recommendation for achieving a flexible RAID system?




On the OS layer I would just like to build an LVM volume that recognises when space has been added to the "disk", so that it can be expanded. In reality, that "disk" sits on several physical disks managed by the RAID controller.


Answer



There are quite a few options with varying degrees of resilience, disk efficiency and ease of operation. Here are a few:



RAID 0 & 1
RAID 0 and 1 are immediately out of the question. RAID 0 offers no redundancy (in fact it increases risk), and RAID 1 is limited, as you mentioned, to the capacity of a single disk.



RAID 5
Is an option, and you only lose 1 disk to parity. This is bittersweet, though: the more disks you have, the higher the chance of hitting an error on 2 (or more) disks at once, and you're screwed if that happens. Write speeds are often lacking. Minimum 3 disks to start with. Expanding is time consuming and carries a high risk of total failure.
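To put a rough number on how that risk grows with disk count, here is a back-of-the-envelope sketch in Python. It assumes an unrecoverable read error (URE) rate of 1 in 10^14 bits, a figure often quoted on consumer drive datasheets, plus the 4 TB disks from the question and fully independent errors. All of these are assumptions rather than measurements, and the function name is made up for illustration.

```python
import math


def rebuild_ure_probability(total_disks, disk_tb=4.0, ure_per_bit=1e-14):
    """Estimate the chance of at least one unrecoverable read error while
    rebuilding a RAID 5 array after a single disk failure.

    Assumes every surviving disk is read in full and that read errors are
    independent of each other, which is a simplification.
    """
    if total_disks < 3:
        raise ValueError("RAID 5 needs at least 3 disks")
    surviving_disks = total_disks - 1
    bits_read = surviving_disks * disk_tb * 1e12 * 8  # TB -> bytes -> bits
    # P(at least one error) = 1 - (1 - p)^n, computed stably for tiny p
    return -math.expm1(bits_read * math.log1p(-ure_per_bit))


if __name__ == "__main__":
    for disks in (3, 4, 6, 12):
        chance = rebuild_ure_probability(disks)
        print(f"{disks} disks: {chance:.0%} chance of a read error during rebuild")
```

Real drives frequently do better than their datasheet rate, so treat the result as a pessimistic illustration of the trend, not a prediction.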




RAID 6
Is a more resilient option. It is the same as RAID 5 except that it uses 2 disks for parity, so you can lose 2 disks to failure and still be able to rebuild. Write speeds are often lacking. Minimum 4 disks to start with. Expanding is very time consuming and carries a lower risk of total failure than RAID 5, though not a negligible one.



RAID 10
Is the most resilient option of all the RAID levels and also the least efficient, as it uses half of all disks present for mirroring. One major benefit over RAID 5 and 6 is that write speeds often improve significantly with every disk you add (as opposed to declining). This can be essential depending on what type of database you're implementing. Minimum 4 disks to start with, adding 2 at a time after that. Expanding is probably the fastest of all the options and carries the least risk.



RAID 50/60
Is the middle ground between RAID 5/6 and RAID 10. It has better disk usage efficiency than RAID 10 but requires lots of disks to start with (minimum 6), and it also performs better than the basic RAID levels. Expansion is very time consuming. Risk depends on how many disks are in the array but sits somewhere between RAID 10 and RAID 5/6 (weighted towards 5/6).
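The disk-efficiency trade-offs described above can be compared with a small Python sketch. It assumes equal-sized disks (4 TB, as in the question) and uses the minimum disk counts given in this answer; the function name is hypothetical, and RAID 50/60 is left out because its capacity depends on how the disks are split into sub-arrays.

```python
def usable_tb(level, disks, disk_tb=4.0):
    """Return usable capacity in TB for equal-sized disks at a RAID level."""
    minimums = {"raid1": 2, "raid5": 3, "raid6": 4, "raid10": 4}
    if level not in minimums:
        raise ValueError(f"unsupported level: {level}")
    if disks < minimums[level]:
        raise ValueError(f"{level} needs at least {minimums[level]} disks")
    if level == "raid1":
        return disk_tb  # every disk holds a copy of the same data
    if level == "raid5":
        return (disks - 1) * disk_tb  # one disk's worth of parity
    if level == "raid6":
        return (disks - 2) * disk_tb  # two disks' worth of parity
    # raid10: half of the disks are mirrors of the other half
    if disks % 2:
        raise ValueError("raid10 needs an even number of disks")
    return disks / 2 * disk_tb


if __name__ == "__main__":
    for disks in (4, 6, 12):
        row = ", ".join(
            f"{level}: {usable_tb(level, disks):g} TB"
            for level in ("raid5", "raid6", "raid10")
        )
        print(f"{disks} disks -> {row}")
```

With the 12 bays mentioned in the question, the gap between parity RAID and RAID 10 becomes quite noticeable, which is the cost of RAID 10's resilience and write performance.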



LVM

I don't use this much, so I'll leave that avenue for someone else to comment on.



Filesystem-based RAID
BTRFS and ZFS can both perform RAID 0, 1 and 5 transparently across disks without the need for Linux RAID management. Adding, removing and altering array sets is easy (though time consuming, as RAID is). ZFS has the benefit of being tried and tested for many years, whereas BTRFS is still an emerging filesystem.



Conclusion:



Linux RAID is somewhat more forgiving than hardware RAID. Where RAID 5/6/50/60 are involved, Linux RAID can make your life a bit easier if things go pear-shaped (for example, after losing 2 disks on a RAID 5 array you can still assemble the array and try to recover, whereas most hardware controllers will outright refuse). RAID 10 with hardware RAID is usually the safest bet in terms of resiliency, I/O throughput and expansion time. So, to narrow it down to my top 2:



If I/O throughput is not a high priority:

* Linux RAID 5, but routinely back your data up elsewhere to offset the risk. Expansion is as simple as a one-line command, though it'll take a while to complete.



If I/O throughput is a priority:
* Hardware RAID 10. The schedule for backing your data up can be relaxed somewhat. Expansion will depend on the hardware RAID controller, but won't take as long to initialise.
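For the Linux RAID 5 route with LVM on top, as the question describes, the expansion could look roughly like the sequence produced by the Python sketch below. It only builds and prints the commands as a dry run and never executes them. The device names, volume group and logical volume (`/dev/md0`, `/dev/sdd`, `vg0`, `data`) are made-up examples, and the last step assumes an ext4 filesystem.

```python
import shlex


def raid5_growth_plan(md_device, new_disks, current_disk_count,
                      volume_group, logical_volume):
    """Build (but do not run) the commands to grow an md RAID 5 array
    and the LVM volume sitting on top of it."""
    if not new_disks:
        raise ValueError("at least one new disk is required")
    lv_path = f"/dev/{volume_group}/{logical_volume}"
    new_count = current_disk_count + len(new_disks)
    steps = [
        # Add the new disks to the array (they start out as spares)
        ["mdadm", "--add", md_device, *new_disks],
        # Reshape the array across all disks; wait for this to finish
        # (progress is visible in /proc/mdstat) before the next steps
        ["mdadm", "--grow", md_device, f"--raid-devices={new_count}"],
        # Tell LVM that the physical volume has grown
        ["pvresize", md_device],
        # Hand all free extents to the logical volume
        ["lvextend", "-l", "+100%FREE", lv_path],
        # Grow the filesystem (assumes ext4)
        ["resize2fs", lv_path],
    ]
    return [shlex.join(step) for step in steps]


if __name__ == "__main__":
    for command in raid5_growth_plan("/dev/md0", ["/dev/sdd"], 3, "vg0", "data"):
        print(command)
```

Printing the plan first makes it easy to review each step against your own device names before running anything as root, and a current backup is still advisable before any reshape.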

