I thought DNS primary/secondary for redundancy purposes was straightforward. My understanding is that you should have a primary and at least one secondary, and that the secondary should sit in a geographically different location and behind a different router (see for example https://serverfault.com/questions/48087/why-are-there-several-nameservers-for-my-domain).
Currently, we have two name servers both in our main data center. Recently, we've suffered some outages for various reasons that took out both name servers, and left us and our customers without working DNS for a few hours. I've asked my sysadmin team to finish setting up a DNS server in another data center and configure it as the secondary name server.
However, our sysadmins claim that this doesn't help much unless the other data center is at least as dependable as the primary data center. They claim that when the primary data center is down, most clients will still fail to resolve our names, or will spend too long waiting on timeouts before they get an answer.
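To make that claim concrete, here is a rough model in Python of the delay a client might see. It is only a sketch under my own assumptions, not documented resolver behavior: the resolver tries the listed name servers in a uniformly random order, waits a fixed timeout on each dead server before moving on, and remembers nothing between lookups. The function name `expected_lookup_delay` and the 2-second timeout are made up for illustration.

```python
from itertools import permutations


def expected_lookup_delay(servers_up, timeout_s=2.0):
    """Average time spent waiting on dead name servers per lookup.

    servers_up: list of booleans, True if that name server answers.
    timeout_s:  assumed wait on each dead server (hypothetical value).

    Assumes every try order is equally likely and that the resolver
    stops at the first server that answers. If no server is up, the
    result is the total time wasted before the lookup fails.
    """
    orders = list(permutations(servers_up))
    total = 0.0
    for order in orders:
        delay = 0.0
        for up in order:
            if up:
                break  # got an answer, no more waiting
            delay += timeout_s  # dead server: wait out the timeout
        total += delay
    return total / len(orders)


if __name__ == "__main__":
    # Primary down, secondary in the other data center up.
    print(expected_lookup_delay([False, True]))
    # Both servers in the same data center, and it is down.
    print(expected_lookup_delay([False, False]))
```

Under these assumptions, a working secondary turns a failed lookup into a slower one. Real resolvers may also remember which servers respond and prefer them, in which case this model would overstate the delay, but I have not been able to confirm how the common ones behave.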
Personally, I'm convinced we're not the only company with this kind of problem and that it most likely is already a solved problem. I can't imagine all those internet companies being affected by our kind of problem. However, I can't find good online docs that explain what happens in failure cases (for example, client timeouts) and how to work around them.
What arguments can I use to poke holes in our sysadmins' reasoning? Are there any online resources I can consult to better understand the problems they claim exist?
Some additional notes after reading the replies:
- we're on Linux
- we have additional, complicated DNS needs: our DNS entries are managed by some custom software, with BIND currently slaving from a Twisted DNS implementation, and some views in the mix as well. However, we're completely capable of setting up our own DNS servers at another data center.
- I'm talking about authoritative DNS for outsiders to find our servers, not recursive DNS servers for our local clients.