
domain name system - Multiple data centers and HTTP traffic: DNS Round Robin is the ONLY way to assure instant fail-over?



Multiple A records for the same domain name seem to be used almost exclusively to implement DNS Round Robin (DNS RR) as a cheap load-balancing technique.



The usual warning against DNS RR is that it is not good for high availability: when one IP goes down, clients will continue to use it for minutes.



A load balancer is often suggested as a better choice.



Neither claim is completely true:





  1. When the traffic is HTTP, most web browsers are able to automatically try the next A record if the previous one is down, without a new DNS look-up. Read here chapter 3.1 and here.


  2. When multiple data centers are involved, DNS RR is the only option to distribute traffic across them.
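The browser behaviour described in point 1 can be sketched roughly as follows. This is only an illustration, not how any particular browser is implemented: the names `resolve_all`, `tcp_connect` and `connect_first_reachable`, the injectable `connect` parameter and the 2-second timeout are all assumptions made for the example. The point is that the name is resolved once, and the client then walks the returned addresses until one accepts a connection, without a second DNS look-up.

```python
import socket
from typing import Callable, List, Sequence, Tuple

Address = Tuple[str, int]


def resolve_all(host: str, port: int = 80) -> List[Address]:
    """Return every IPv4 address (A record) the resolver gives for host."""
    infos = socket.getaddrinfo(host, port, socket.AF_INET, socket.SOCK_STREAM)
    addresses: List[Address] = []
    for info in infos:
        sockaddr = info[4]  # (ip, port) for AF_INET
        if sockaddr not in addresses:
            addresses.append(sockaddr)
    return addresses


def tcp_connect(addr: Address, timeout: float = 2.0) -> socket.socket:
    """Open a TCP connection; raises OSError if the address is unreachable."""
    return socket.create_connection(addr, timeout=timeout)


def connect_first_reachable(addresses: Sequence[Address],
                            connect: Callable = tcp_connect):
    """Try each cached address in order and return (address, connection)
    for the first one that works. Raises OSError if all of them fail."""
    last_error = None
    for addr in addresses:
        try:
            return addr, connect(addr)
        except OSError as exc:
            last_error = exc  # remember the failure, move on to the next record
    raise OSError(f"no address reachable: {last_error}")
```

A caller would use it as `connect_first_reachable(resolve_all("www.example.com"))`. Note that each dead address still costs one connect timeout, so "instant" here really means "no wait for the DNS cache to expire".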




So, is it true that, with multiple data centers and HTTP traffic, the use of DNS RR is the ONLY way to assure instant fail-over when one data center goes down?



Thanks,




Valentino



Edit:




  • Of course, each data center has a local load balancer with a hot spare.

  • It's OK to sacrifice session affinity for an instant fail-over.

  • AFAIK the only way for a DNS server to steer clients to one data center rather than another is to reply with just the IP (or IPs) associated with that data center. If the data center becomes unreachable, then all of those IPs are unreachable too. This means that, even if smart web browsers are able to instantly try another A record, all the attempts will fail until the local cache entry expires and a new DNS lookup is done, fetching the new working IPs (I assume the DNS automatically points to another data center when one fails). So "smart DNS" cannot assure instant fail-over.

  • Conversely, DNS round-robin permits it. When one data center fails, smart web browsers (most of them) instantly try the other cached A records, jumping to another (working) data center. So DNS round-robin doesn't assure session affinity or the lowest RTT, but it seems to be the only way to assure instant fail-over when the clients are "smart" web browsers.
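The difference between the last two bullets can be put into a very rough model. This is a sketch under simplifying assumptions: the function `failover_delay`, its parameters, the documentation-range IPs and the timeout and TTL values are all hypothetical, and real clients and caches behave less tidily.

```python
def failover_delay(cached_ips, down_ips, connect_timeout, ttl_remaining):
    """Rough number of seconds until a client reaches a live server.

    The client tries its cached addresses in order and pays connect_timeout
    for every dead one. If every cached address is dead, it cannot recover
    until the cache entry expires (ttl_remaining) and a new lookup is done.
    """
    waited = 0.0
    for ip in cached_ips:
        if ip not in down_ips:
            return waited
        waited += connect_timeout
    return max(waited, ttl_remaining)


# Hypothetical addresses: data center A is down, data center B is up.
dc_a = ["192.0.2.1", "192.0.2.2"]
dc_b = ["198.51.100.1"]
down = set(dc_a)

# "Smart DNS" handed out only data center A's addresses.
geo_delay = failover_delay(dc_a, down, connect_timeout=1, ttl_remaining=300)

# Round robin handed out one address from each data center.
rr_delay = failover_delay(["192.0.2.1", "198.51.100.1"], down,
                          connect_timeout=1, ttl_remaining=300)
```

With these made-up numbers the geo-aware answer leaves the client stuck for the remaining TTL (300 seconds), while the mixed answer costs a single connect timeout (1 second).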




Edit 2:




  • Some people suggest TCP Anycast as a definitive solution. This paper (chapter 6) explains that Anycast fail-over depends on BGP convergence. For this reason, Anycast fail-over can take anywhere from 20 seconds to 15 minutes to complete.
    20 seconds is possible on networks whose topology was optimized for this.
    Probably only CDN operators can guarantee such fast fail-overs.



Edit 3:





  • I did some DNS look-ups and traceroutes (maybe some expert can double check) and:


    • The only CDN using TCP Anycast seems to be CacheFly; other operators like CDN networks and BitGravity use CacheFly. It seems that their edges cannot be used as reverse proxies. Therefore, they cannot be used to provide instant failover.

    • Akamai and LimeLight seem to use geo-aware DNS. But! They return multiple A records.
      From the traceroutes, it seems that the returned IPs are in the same data center. So I'm puzzled as to how they can offer a 100% SLA when one data center goes down.




Answer



When I use the term "DNS Round Robin" I generally mean it in the sense of the "cheap load balancing technique" as OP describes it.



But that's not the only way DNS can be used for global high availability. Most of the time, it's just hard for people with different (technology) backgrounds to communicate well.



The best load balancing technique (if money is not a problem) is generally considered to be:




  1. An anycasted global network of 'intelligent' DNS servers,

  2. and a set of globally spread out datacenters,


  3. where each DNS node implements Split Horizon DNS,

  4. and monitoring of availability and traffic flows is available to the 'intelligent' DNS nodes in some fashion,

  5. so that the user DNS request flows to the nearest DNS server via IP Anycast,

  6. and this DNS server hands out a low-TTL A Record / set of A Records for the nearest / best datacenter for this end user via 'intelligent' split horizon DNS.
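The decision made by one 'intelligent' DNS node in steps 3, 4 and 6 might look roughly like the sketch below. It is not taken from any real DNS server: the `Datacenter` class, the `answer_query` function, the `distance` table and the 30-second TTL are assumptions chosen for illustration, and the anycast routing of step 5 happens in the network, outside this code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Datacenter:
    name: str
    region: str
    ips: tuple
    healthy: bool = True  # fed by the monitoring mentioned in step 4


def answer_query(client_region, datacenters, distance, ttl=30):
    """Return (ttl, ips) for the nearest healthy datacenter.

    distance maps (client_region, datacenter_region) to a cost; a lower
    cost means "nearer" or "better" for that client.
    """
    candidates = [dc for dc in datacenters if dc.healthy]
    if not candidates:
        raise LookupError("no healthy datacenter")
    best = min(candidates, key=lambda dc: distance[(client_region, dc.region)])
    return ttl, list(best.ips)


# Hypothetical topology with two datacenters.
distance = {("eu", "eu"): 0, ("eu", "us"): 1, ("us", "us"): 0, ("us", "eu"): 1}
datacenters = [
    Datacenter("ams", "eu", ("192.0.2.1", "192.0.2.2")),
    Datacenter("nyc", "us", ("198.51.100.1",)),
]
eu_answer = answer_query("eu", datacenters, distance)
```

When monitoring marks a datacenter as unhealthy, the same query falls through to the next nearest one, but clients only notice once the low TTL on their cached answer has expired.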



Using anycast for DNS is generally fine, because DNS responses are stateless and extremely short. So if the BGP routes change, it's highly unlikely to interrupt a DNS query.



Anycast is less suited for the longer and stateful HTTP conversations, thus this system uses split horizon DNS. An HTTP session between a client and server is kept to one datacenter; it generally cannot fail over to another datacenter without breaking the session.




As I indicated with "set of A Records" what I would call 'DNS Round Robin' can be used together with the setup above. It is typically used to spread the traffic load over multiple highly available load balancers in each datacenter (so that you can get better redundancy, use smaller/cheaper load balancers, not overwhelm the Unix network buffers of a single host server, etc).
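As a small illustration of that per-datacenter round robin, the sketch below rotates the order of a set of A records on every answer, so successive clients start with a different load balancer. The class name `RoundRobinRecordSet` and the addresses are made up for the example; real DNS servers implement the rotation themselves.

```python
from collections import deque


class RoundRobinRecordSet:
    """Hands out the same A records in a rotated order on each query."""

    def __init__(self, ips):
        self._ips = deque(ips)

    def next_answer(self):
        answer = list(self._ips)
        self._ips.rotate(-1)  # the next query starts one address further on
        return answer


# Hypothetical load balancer addresses inside one datacenter.
records = RoundRobinRecordSet(["192.0.2.1", "192.0.2.2", "192.0.2.3"])
first = records.next_answer()
second = records.next_answer()
```

Clients that simply use the first address in the answer end up spread over the load balancers, and clients that retry the following addresses get the redundancy as well.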




So, is it true that, with multiple data centers
and HTTP traffic, the use of DNS RR is the ONLY
way to assure high availability?




No, it's not true, not if by 'DNS Round Robin' we simply mean handing out multiple A records for a domain. But it's true that clever use of DNS is a critical component in any global high availability system. The above illustrates one common (often best) way to go.




Edit: The Google paper "Moving Beyond End-to-End Path Information to Optimize CDN Performance" seems to me to be state-of-the-art in global load distribution for best end-user performance.



Edit 2: I read the article "Why DNS Based .. GSLB .. Doesn't Work" that OP linked to, and it is a good overview -- I recommend looking at it. Read it from the top.



In the section "The solution to the browser caching issue" it advocates DNS responses with multiple A Records pointing to multiple datacenters as the only possible solution for instantaneous fail over.



In the section "Watering it down" near the bottom, it expands on the obvious, that sending multiple A Records is uncool if they point to datacenters on multiple continents, because the client will connect at random and thus quite often get a 'slow' DC on another continent. Thus for this to work really well, multiple datacenters on each continent are needed.
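The cost of connecting at random is easy to put into numbers. The round-trip times below are invented purely for illustration, and `expected_rtt` is a hypothetical helper that assumes the client picks one of the returned addresses uniformly at random.

```python
def expected_rtt(rtts_ms):
    """Mean RTT when one of the returned addresses is chosen at random."""
    return sum(rtts_ms) / len(rtts_ms)


# Hypothetical RTTs in milliseconds as seen by one European client.
same_continent = expected_rtt([20, 30])      # two nearby datacenters
cross_continent = expected_rtt([20, 180])    # one nearby, one far away
```

With these made-up figures the client averages 25 ms when both datacenters are on its own continent, but 100 ms when one of them is overseas, which is why the article wants several datacenters per continent.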



This is a different solution than my steps 1 - 6. I can't provide a perfect answer on this, I think a DNS specialist from the likes of Akamai or Google is needed, because much of this boils down to practical know-how on the limitations of deployed DNS caches and browsers today. AFAIK, my steps 1-6 are what Akamai does with their DNS (can anyone confirm this?).




My feeling -- coming from having worked as a PM on mobile browser portals (cell phones) -- is that the diversity and level of total brokenness of the browsers out there is incredible. I personally would not trust an HA solution that requires the end user terminal to 'do the right thing'; thus I believe that global instantaneous fail over without breaking a session isn't feasible today.



I think my steps 1-6 above are the best that are available with commodity technology. This solution does not have instantaneous fail over.



I'd love for one of those DNS specialists from Akamai, Google etc to come around and prove me wrong. :-)

