I'm looking for a way to set up Apache for high availability. The idea is to have a cluster of 2+ Apache servers serving the same websites. I can put the IP address of each server in round-robin DNS so that each request is sent to a random server in the cluster (I'm not too concerned with load balancing just yet, though that may come into play later on).
I already have it set up and working with multiple Apache VM servers (spread across multiple physical servers) serving websites, and round-robin DNS, and this works fine. The SQL database is set up using MariaDB in a high-availability cluster, the web data (HTML, JS, PHP scripts, images, other assets) are stored within LizardFS, and the sessions are stored in a shared location as well. This all works well until one of the servers in the cluster becomes inaccessible for whatever reason. Then a percentage of the requests (roughly the number of downed servers divided by the number of total servers in the cluster) are unanswered. Here are the options I've considered:
Automatic DNS Updates
Have some process that monitors the functionality of the web servers, and removes any downed servers from DNS. This has two issues:
First, even though we can set our TTL to some very low number (like 5 seconds), I've heard that a handful of DNS servers will enforce a minimum TTL higher than ours. And, some browsers (namely Chrome) will cache DNS for no less than 60 seconds regardless of TTL settings. So even though we're good on our end, some clients may not be able to reach sites for some time in the event of a DNS update.
Second, the program that monitors the functionality of the cluster and updates DNS records becomes a new single point of failure. We may be able to get around this by having more than one monitor spread across multiple systems, because if they both detect a problem and they both make the same DNS changes, then that shouldn't cause any issues.
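The point about redundant monitors can be made concrete with a minimal sketch. It assumes each monitor computes the full desired record set from its own probes instead of issuing incremental add/remove commands; under that assumption, two monitors that see the same outage publish the same set, so running both is harmless. The function name `desired_records` and the `probe` callable are hypothetical, the step that pushes the records to a DNS provider is deliberately left out, and the "keep everything if all probes fail" rule is an added safety assumption, not something from the setup described above.

```python
from typing import Callable, Iterable


def desired_records(all_ips: Iterable[str], probe: Callable[[str], bool]) -> list[str]:
    """Return the sorted list of IPs that should be in round-robin DNS.

    `probe` is any callable that returns True when a server is serving
    pages correctly (for example, an HTTP request against that IP).
    """
    ips = sorted(all_ips)
    healthy = [ip for ip in ips if probe(ip)]
    # Assumption: if every probe fails, the monitor itself is probably cut
    # off from the network, so leave all records in place rather than
    # publishing an empty set.
    return healthy or ips


# Demo with a stub probe that treats server1's address as down.
def stub_probe(ip: str) -> bool:
    return ip != "192.168.0.101"


print(desired_records(["192.168.0.101", "192.168.0.102"], stub_probe))
# prints ['192.168.0.102']
```

Because the output depends only on the probe results, it does not matter which monitor publishes first or whether both do.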
uCarp/Heartbeat
Make the IP addresses that are accessed and in round-robin DNS virtual, and have them reassigned from down servers to up servers when a server goes down. For instance, server1's VIP is 192.168.0.101 and server2's VIP is 192.168.0.102. If server1 goes down, then 192.168.0.101 becomes an additional IP on server2. This has two issues:
First, to my knowledge, uCarp/Heartbeat monitor their peers specifically for inaccessibility, for instance, if the peer can't be pinged. When that happens, the surviving peer takes over the IP of the downed peer. This is an issue because there are more reasons a web server may not be able to serve requests than just being inaccessible on the network. Apache may have crashed, a config error may exist, or there may be some other reason. I would want the criterion to be "the server isn't serving pages as required" rather than "the server isn't pingable". I don't think I can define that in uCarp/Heartbeat.
Second, this doesn't work across data centers, because each set of servers across data centers has different blocks of IP addresses. I can't have a virtual IP float between data centers. Functioning across data centers isn't a requirement (yes, my distributed file system and database cluster are available across data centers), but it would be a nice plus.
Question
So, any thoughts on how to deal with this? Basically, the holy grail of high availability: no single point of failure (in the server, the load balancer, or the data center), and virtually no downtime in the event of a switchover.
Answer
When I want HA and load sharing, I use keepalived and configure it with two VIPs. By default, VIP1 is assigned to server1 and VIP2 is assigned to server2. When any server is down, the other server takes both VIPs.
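The two-VIP arrangement can be illustrated with a small model. This is only a sketch of the resulting assignment, not of how keepalived decides it (keepalived elects owners through VRRP priorities); the function name `assign_vips` and the server labels are hypothetical, and the addresses are the ones used in the question.

```python
def assign_vips(preferred: dict[str, str], up: set[str]) -> dict[str, str]:
    """Map each VIP to the server that should currently hold it.

    preferred: VIP -> the server that owns it by default.
    up:        names of the servers that are currently healthy.
    """
    survivors = sorted(up)
    assignment = {}
    for vip, owner in preferred.items():
        if owner in up:
            # The default owner is healthy, so it keeps its VIP.
            assignment[vip] = owner
        elif survivors:
            # Simplification: hand the VIP to the first surviving server.
            assignment[vip] = survivors[0]
    return assignment


preferred = {"192.168.0.101": "server1", "192.168.0.102": "server2"}
print(assign_vips(preferred, {"server2"}))
# prints {'192.168.0.101': 'server2', '192.168.0.102': 'server2'}
```

With both servers up, each VIP stays on its default owner, so round-robin DNS over the two VIPs still spreads requests across both machines.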
Keepalived takes care of HA by watching the other server. If a server is not reachable or any of its interfaces is down, it changes to the FAULT state and its VIP is taken over by the other server. To monitor the service itself, rather than just reachability, you can use the track_script option.
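A tracked script reports health through its exit status: zero means healthy, non-zero means failed. A minimal sketch of a check that tests "is this server actually serving pages" could look like the following. The function name `check_site`, the default URL, and the timeout are assumptions for illustration; a real check would probably request a page that also exercises PHP, the database, and the shared file system.

```python
import http.client
import urllib.error
import urllib.request


def check_site(url: str = "http://127.0.0.1/", timeout: float = 2.0) -> int:
    """Return 0 if `url` answers with an HTTP 2xx/3xx status, otherwise 1."""
    # Bypass any proxy configured in the environment so the request goes
    # straight to the local web server.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as resp:
            return 0 if 200 <= resp.status < 400 else 1
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, 4xx/5xx response, or a malformed reply.
        return 1
```

To use it as a standalone script, end the file with `raise SystemExit(check_site())` so the result becomes the process exit status. This catches the cases raised in the question, such as Apache having crashed or returning errors while the host is still pingable.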
If you want to add another cluster in another data center, you can add two more servers and do the same configuration. Now, you can load-share traffic between data centers using DNS round-robin. No DNS update is required in this case.