
networking - DNS resolution failing over to secondary DNS - why?


We have a large number of branch offices connected via VPN, but without any kind of server infrastructure. The client machines in each office get their network configuration from an ASA 5505, which is also used for the VPN connection.



The Windows XP client machines are configured to use one of our corporate DNS servers as the primary, with the DNS server of the ISP as the secondary. The idea is that if the VPN connection fails for any reason, staff in the office will still be able to access the internet, and access our webmail and home access portal. In the majority of cases this works fine.



However, for offices based in South America we are seeing DNS resolution on the client machines regularly being done against the ISP DNS server. This results in our corporate resources being effectively unavailable to staff in the offices.




The client machines are able to ping the corporate DNS server without problems. When I do an nslookup of a corporate hostname, I get a reply.



I'm thinking one of the following (or a combination) is happening:




  • our corporate DNS server is not always replying to requests in a timely fashion (although why this would only affect clients in one geographic region I don't know)

  • DNS queries from Latin America are somehow delayed, causing the client to treat them as failed (although we have offices at the end of much slower VSAT connections which do not have this issue)

  • a single failure is resulting in a DNS cache entry in Windows that somehow results in the lookups not happening on subsequent tries




Has anyone else come across this issue? Any ideas for resolutions?



Answer




Windows queries DNS in this order (http://technet.microsoft.com/en-us/library/cc775637%28WS.10%29.aspx):




  • hosts file

  • local DNS cache

  • Preferred DNS servers

  • Other DNS servers




MS also has an article describing how the DNS server list is obtained (http://technet.microsoft.com/en-us/library/cc779517%28WS.10%29.aspx):





The DNS Client service uses a server search list, ordered by preference. This list includes all preferred and alternate DNS servers configured for each of the active network connections on the system.

The list is arranged based on the following criteria:

  • Preferred DNS servers are given first priority.

  • If no preferred DNS servers are available, then alternate DNS servers are used.

  • Unresponsive servers are removed temporarily from these lists.





Windows has an escalating timeout for DNS requests:



Value      Default value  Attempt
1st limit  1 second       Query the preferred DNS server on a preferred connection.
2nd limit  2 seconds      Query the preferred DNS server on all connections.
3rd limit  2 seconds      Query all DNS servers on all connections (1st attempt).
4th limit  4 seconds      Query all DNS servers on all connections (2nd attempt).
5th limit  8 seconds      Query all DNS servers on all connections (3rd attempt).
6th value  (Must be 0.)
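To make the schedule in the table above concrete, here is a small Python sketch that turns the default limits into time windows. It assumes the attempts run back to back and that each one waits out its full limit before the next begins; the documentation quoted here does not spell that out, so treat the numbers as an approximation. The function name is made up for illustration.

```python
from itertools import accumulate

# Default limits from the table above, in seconds.
# The trailing 0 terminator (the "6th value") is left out.
DEFAULT_TIMEOUTS = [1, 2, 2, 4, 8]


def attempt_windows(timeouts):
    """Return (start, end) offsets in seconds for each attempt.

    Assumption: attempts are sequential and each waits its full limit.
    """
    ends = list(accumulate(timeouts))
    starts = [0] + ends[:-1]
    return list(zip(starts, ends))


windows = attempt_windows(DEFAULT_TIMEOUTS)
for number, (start, end) in enumerate(windows, start=1):
    print(f"attempt {number}: from {start}s to {end}s")
```

Under those assumptions the alternate servers are first queried about 3 seconds after the lookup starts (the start of the 3rd window), and the whole lookup gives up after 17 seconds.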


I could not find a clear answer on this exact point, but it sounds like if Windows doesn't get a response from your primary DNS server within 1 or 2 seconds (1st or 2nd attempt, respectively), that server is removed from the DNS server search list for 15 minutes, and the secondary DNS servers are used instead. Since those servers get up to an 8-second timeout, they are much more likely to respond in time. (It's unclear to me whether Windows continues to query the preferred DNS server during the 3rd and later attempts if it has already failed.)



I also suspect that you do indeed have a WAN latency issue for this geographical area, as it would explain why the timeouts are being hit.
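One way to test the latency theory would be to time raw DNS queries against the corporate server from an affected client and see how often a reply takes longer than 1 second. The Python sketch below does that using only the standard library. The server address and hostname in the usage comment are placeholders, and the function names are made up for illustration. It does not validate the reply beyond receiving a UDP datagram, so it is a rough probe rather than a full resolver.

```python
import socket
import struct
import time


def build_query(hostname, query_id=0x1234):
    """Build a minimal DNS query packet for an A record (RFC 1035 layout)."""
    # Header: ID, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: length-prefixed labels, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A record), QCLASS=1 (IN).
    return header + qname + struct.pack("!HH", 1, 1)


def time_query(server, hostname, timeout=1.0):
    """Return the round-trip time in seconds, or None if no reply in time."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.monotonic()
        sock.sendto(build_query(hostname), (server, 53))
        try:
            sock.recvfrom(512)
        except socket.timeout:
            return None
        return time.monotonic() - start


# Example use (placeholder address and name, replace with your own):
# for _ in range(20):
#     print(time_query("10.0.0.53", "mail.corp.example.com"))
```

A noticeable share of `None` results with the 1-second timeout, from the South American offices only, would support the idea that the first attempt is timing out.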





One solution is to change the DNS query timeouts, using the DNSQueryTimeouts registry parameter. See also http://drewthaler.blogspot.com/2005/09/changing-dns-query-timeout-in-windows.html
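As a rough illustration of the registry change, the Python sketch below builds a `reg add` command line for a longer timeout schedule. It rests on assumptions you should verify against the articles linked above: that the value lives under `HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters`, and that it is a REG_MULTI_SZ list ending in 0. The example schedule is arbitrary, not a recommendation, and the function name is made up for illustration.

```python
def reg_add_command(timeouts):
    """Build a reg.exe command line that sets DNSQueryTimeouts.

    Assumptions (check against the linked documentation):
    the key path, the REG_MULTI_SZ type, and the trailing 0 terminator.
    """
    if not timeouts or any(t <= 0 for t in timeouts):
        raise ValueError("timeouts must be positive whole seconds")
    # The list must end with 0, matching the "6th value" in the table.
    values = [str(int(t)) for t in timeouts] + ["0"]
    # reg.exe separates REG_MULTI_SZ entries with a literal backslash-zero.
    data = r"\0".join(values)
    return (
        r'reg add "HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters" '
        f'/v DNSQueryTimeouts /t REG_MULTI_SZ /d "{data}" /f'
    )


# Arbitrary example schedule: 3, 5, 5, 8 and 16 seconds.
print(reg_add_command([3, 5, 5, 8, 16]))
```

Generating the command rather than writing to the registry directly lets you review it before running it on a test machine.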




Another solution is to put a local caching DNS server on the network, and have the clients use that. You can use a DNS server that may be built in to a router, or install something like dnsmasq (http://thekelleys.org.uk/dnsmasq/doc.html).

