
amazon ec2 - AWS EC2 Mailserver Failover Strategies done right

I've been researching this topic intensively over the last few days and want to discuss it with a few specific questions. I couldn't find a suitable thread here that covers my needs and is reasonably current — most posts on this topic date from around 2010, which I believe was the last time AWS had a major outage (a whole US region went down, if I remember correctly).



The current state:



We're running a mail server on Ubuntu with Postfix/Dovecot/Horde, reading all mail-related configuration from a MySQL database. It runs as a single EC2 instance with EBS storage, which currently holds both the OS and the mails. So far so good — but we're a startup, not a private person who just needs a personal server: this is a mail service for our customers, super critical and very important to us. After a few failures and downtimes in the first year, I want to dramatically improve the setup, so I've been thinking about "redundancy", basically.
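For context, a MySQL-backed Postfix setup like the one described usually resolves mailboxes through a lookup map. This is only a sketch — the database, table, and column names here are assumptions, not taken from the post:

```shell
# /etc/postfix/mysql-virtual-mailboxes.cf -- hypothetical table/column names.
# Postfix queries MySQL on each delivery to find the mailbox location:
#
#   user     = mailuser
#   password = secret
#   hosts    = 127.0.0.1
#   dbname   = mailconfig
#   query    = SELECT maildir FROM mailboxes WHERE email = '%s'
#
# Wired up in /etc/postfix/main.cf:
#   virtual_mailbox_maps = mysql:/etc/postfix/mysql-virtual-mailboxes.cf
#
# Verify a lookup without sending any mail:
postmap -q "user@example.com" mysql:/etc/postfix/mysql-virtual-mailboxes.cf
```

The relevant point for redundancy: only the `hosts` line ties the instance to a particular database, which is what makes swapping MySQL for RDS attractive later.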



The requirement:



The server must be "redundant" in some way; the failure of a single EC2 instance should no longer break the whole service.




My research so far and the options I see:




  • Copy the instance into another region, for example, and build "real" redundancy — a bit old-fashioned, but that's what I learned back in school: use the new server as a backup MX, configured through a second MX entry in DNS with lower priority. The problem here is data redundancy: I'd need rsync and database replication to keep both servers in sync. Not the option I want to implement, because it can get very tricky...


  • A service-driven solution, i.e. using the AWS offerings properly: RDS for the database and S3 for storage. If all the mails live in the storage cloud (S3) and all the configuration data lives in the database cloud (RDS), the instance itself becomes very flexible. That would let me run several instances of the same type at once, use ELB to distribute the load, start new instances, and detect failover when one instance dies. At the same time my critical data — database and mail storage — would be managed services, so I'd no longer have to think about failover, downtime, and most importantly scalability. By far the best solution I can imagine, but I see some serious problems.
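For the first option, the DNS side is the easy part: a backup MX is just a second MX record with a higher preference number. A rough sketch (example.com and the host names are placeholders):

```shell
# Hypothetical BIND zone fragment for a backup MX setup.
# The LOWER the preference number, the more preferred the server;
# sending MTAs fall back to mx2 only when mx1 is unreachable.
#
#   example.com.   IN  MX  10 mx1.example.com.   ; primary  (region A)
#   example.com.   IN  MX  20 mx2.example.com.   ; backup   (region B)
#
# Check what the outside world actually sees:
dig +short MX example.com
```

Worth noting: a backup MX by itself only queues inbound mail while the primary is down — it does nothing for mailbox (IMAP/webmail) availability, which is exactly why the rsync and database replication mentioned above would still be needed.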




Final questions:





  • I've never seen a good integration of S3 directly into the Ubuntu filesystem. In my experience, after a few days of continuous operation the mount can disappear suddenly and for no apparent reason, and on top of that, data written through multiple mounted S3 "drives" replicates very slowly between them — understandable for a global cloud service, but still: how is this supposed to work? Imagine multiple running mail server instances, each using the same S3 "drive" — the mail data would have to replicate instantly. So how can we implement a service-driven mail storage that really works on AWS? Has anyone ever built something like this? Everywhere I read "yeah, you have to use AWS services to solve that", but I can't find any real implementations of it for mail.


  • Would an EBS-based solution be better? Each running instance would have its own dedicated volume — highly available and fast — and again I'd set up rsync to keep them in sync. The big downside here is cost: each instance would need a huge EBS volume, because every one of them has to store ALL the mails — nonsense ^^
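To make the S3 question concrete: the usual way to mount a bucket is s3fs-fuse, and a minimal mount looks something like the sketch below (bucket name and credentials are placeholders). The behaviour described above is expected, because S3 is an object store, not a POSIX filesystem — s3fs provides neither file locking nor instant visibility of writes made through another mount, which is poison for Dovecot's Maildir index files:

```shell
# Hypothetical s3fs-fuse mount of a mail bucket (names are placeholders).
# s3fs translates POSIX calls into S3 API requests; there is no locking
# and no guarantee that a write through one mount is immediately visible
# through another, so shared Maildir storage can corrupt or lag.
echo "ACCESS_KEY_ID:SECRET_ACCESS_KEY" > /etc/passwd-s3fs
chmod 600 /etc/passwd-s3fs
mkdir -p /var/vmail
s3fs my-mail-bucket /var/vmail \
    -o passwd_file=/etc/passwd-s3fs,allow_other,use_cache=/tmp/s3fs
```

Because of those semantics, the more common pattern (as far as I can tell) is per-instance EBS storage combined with Dovecot's own dsync-based replication between instances, rather than a shared S3 mount — but that is a design suggestion, not something the AWS services solve for you.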




Is there any other failover scenario on AWS that I'm not aware of yet? Sorry for the long text, but I wanted to share all my thoughts so far... Thanks for reading, if anyone does! :)
