high availability - Understanding the nameserver aspect of a DNS based failover system

As part of a project I'm involved in, a system is required with as close to 99.999% uptime as possible (the system involves healthcare). The solution I am investigating involves multiple sites, each with its own load balancers, multiple internal servers, and a replicated database that is synchronised with every other site. In front of all of this sits a DNS-based failover system that redirects traffic if a site goes down (or is manually taken down for maintenance).
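To put the 99.999% target in concrete terms, here is a minimal arithmetic sketch. The function name and the 365-day year are my own assumptions for illustration; the point is only that the yearly downtime budget is small compared with typical DNS cache lifetimes.

```python
def downtime_budget_seconds(availability_percent: float, period_days: float = 365.0) -> float:
    """Return the downtime allowed, in seconds, for a given availability target."""
    period_seconds = period_days * 24 * 60 * 60
    # The budget is the fraction of the period that may be spent unavailable.
    return period_seconds * (1.0 - availability_percent / 100.0)


if __name__ == "__main__":
    budget = downtime_budget_seconds(99.999)
    print(f"99.999% over one year allows about {budget / 60:.2f} minutes of downtime")
```

That works out to roughly five minutes a year, so even one failover during which clients keep using a stale record for a minute or two would use up a large share of the budget.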

What I'm struggling with, however, is how the DNS aspect functions without itself becoming a single point of failure. I've seen talk of floating IPs (which present that point of failure) and various managed services such as DNSMadeEasy (which don't let you fully test their failover process during the free trial, so I can't verify whether it's right for the project), among much else. I have also been playing around with simple solutions such as assigning multiple A records to a domain name (which I understand falls far short, given the discrepancies in how different browsers interact with such a setup).
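For reference, this is the client behaviour that the multiple-A-record approach relies on, as a minimal sketch: try each resolved address in turn and use the first one that accepts a connection. Whether a given browser actually does this is exactly the discrepancy I mean, so it should not be read as a description of any particular client. The function names and the injectable `connect` parameter are hypothetical, and the addresses in the usage note are placeholders.

```python
import socket
from typing import Callable, Iterable, Optional, Tuple


def default_connect(address: str, port: int, timeout: float) -> socket.socket:
    """Open a TCP connection, raising OSError if the address is unreachable."""
    return socket.create_connection((address, port), timeout=timeout)


def connect_first_reachable(
    addresses: Iterable[str],
    port: int = 443,
    timeout: float = 2.0,
    connect: Callable[[str, int, float], object] = default_connect,
) -> Optional[Tuple[str, object]]:
    """Try each address in order; return (address, connection) for the first success."""
    for address in addresses:
        try:
            return address, connect(address, port, timeout)
        except OSError:
            # This address is down or unreachable; fall through to the next record.
            continue
    return None  # Every address failed.
```

Calling `connect_first_reachable(["192.0.2.10", "198.51.100.10"])` would skip a dead first site at the cost of one timeout, and a client that only ever tries the first record gets no failover at all.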

For a more robust DNS-based approach, do you simply list a nameserver for each location in the domain's NS records, run a nameserver at each location, and have each nameserver update its own independent records when a failure is detected at another site (using scripts run on each nameserver to check all other sites)? If so, aren't there still the same issues that are found with regularly changed A records (browsers not picking up the new records, or ignoring very low TTLs)?
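To make the idea concrete, here is a minimal sketch of the kind of script I imagine running on each nameserver: probe every site, then build the A records to publish for the healthy ones. Everything in it is an assumption for illustration. The site names, the addresses (taken from the documentation ranges), the 60-second TTL, the TCP check on port 443, and the choice to keep publishing every site when all probes fail are mine, and how the resulting records get loaded into the nameserver is left out entirely.

```python
import socket
from typing import Callable, Dict, List

# Hypothetical site table; names and addresses are placeholders.
SITES: Dict[str, str] = {
    "site-a": "192.0.2.10",
    "site-b": "198.51.100.10",
    "site-c": "203.0.113.10",
}


def tcp_probe(address: str, port: int = 443, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to address:port succeeds within the timeout."""
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        return False


def healthy_addresses(sites: Dict[str, str], probe: Callable[[str], bool] = tcp_probe) -> List[str]:
    """Return the addresses of the sites that pass the probe."""
    healthy = [address for address in sites.values() if probe(address)]
    # Assumption: if every probe fails, the checker's own connectivity is the
    # more likely culprit, so keep publishing all sites rather than none.
    return healthy if healthy else list(sites.values())


def render_a_records(name: str, addresses: List[str], ttl: int = 60) -> List[str]:
    """Format one zone-file style A record line per address."""
    return [f"{name} {ttl} IN A {address}" for address in addresses]
```

Each nameserver would run `render_a_records("www.example.com.", healthy_addresses(SITES))` on a timer and reload its zone when the output changes. Because every site makes that decision independently, no single checker is a point of failure, but two nameservers can briefly serve different answers.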

Here's a visual representation of how I understand the system would work.

I have been reading around this subject for several days now (including plenty of Q&As on here), but feel like I'm missing a fundamental piece of the puzzle.

Thanks in advance!
