Load Balancing DNS Servers: UDP / TCP

Don't load balance your DNS.

It's an incredibly lightweight protocol: you'd need an enormous amount of traffic before a single box stops being enough (at which point you'd just be bottlenecking on your load balancer anyway). Resilience is also built in, because you can list multiple NS records in your delegation, and the other servers will be used if one is down.
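As a rough illustration of the failover that multiple NS records give you, here is a minimal Python sketch. The server names, the address, and the `fake_send_query` helper are all hypothetical stand-ins so the example runs offline. Real resolvers typically choose among nameservers by measured response time rather than strictly in list order, so treat this as a simplification of the idea, not a description of any particular resolver.

```python
def query_with_failover(nameservers, send_query):
    """Return (server, answer) from the first nameserver that responds."""
    for server in nameservers:
        try:
            return server, send_query(server)
        except TimeoutError:
            continue  # this server did not answer; try the next NS record
    raise TimeoutError("no authoritative server answered")


# Hypothetical delegation with three NS records; ns1 is simulated as down.
NS_RECORDS = ["ns1.example.net", "ns2.example.net", "ns3.example.net"]
DOWN = {"ns1.example.net"}


def fake_send_query(server):
    """Stand-in for a real network query (an assumption for this sketch)."""
    if server in DOWN:
        raise TimeoutError(server)
    return "192.0.2.10"  # documentation-range address, purely illustrative


if __name__ == "__main__":
    print(query_with_failover(NS_RECORDS, fake_send_query))
```

The query only fails outright when every listed server is unreachable, which is why geo-redundant NS records go a long way for authoritative zones.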


I'm uncomfortable with this Q&A because it hasn't really been established what type of DNS server you're talking about. There are some significant misconceptions when it comes to the resiliency of recursive DNS and it's important that people cruising in via search engines don't walk away from this discussion with a false sense of security.

  • Authoritative DNS: For authoritative DNS servers, the common knowledge regarding the resilience of DNS is pretty spot on. So long as you have multiple authoritative DNS servers that are geo-redundant, you're fine. The main reason for adding high availability for individual IPs is if you're hosting many authoritative zones. This allows you to grow your number of servers without having to change the registrar settings for every domain that is hosted.

  • Recursive DNS: Always use some form of high availability solution (BGP, appliance, etc.). This is where you can get into some serious trouble. Not all resolver libraries are created equal: Windows DNS clients will round-robin the initial server used between queries, but the majority of Unix-based systems will always walk through the list sequentially, starting from the first server. What is even less well known is that these Unix libraries have to time out on every search domain combination before moving on to the next server. If you have multiple search domains configured and the first server in the resolver lookup order is dead, this can add significant delay to every single DNS request: more than enough to cause problems within your critical applications.

When it comes to recursive DNS, remember that your server infrastructure is only as resilient as the most braindead client configuration. As your company grows, client configuration is something you will never fully control. Do not make any design assumptions based on a homogeneous server OS environment, as things rarely stay the same in a growing company. This will definitely bite someone if you don't plan ahead for it.
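To put a number on the search-domain delay described above, here is a back-of-the-envelope sketch in Python. It assumes a simplified model: each candidate name (one per search domain, plus the name as typed) waits out one full timeout on every dead server listed ahead of the first working one. The 5-second timeout matches the documented glibc default for the resolv.conf `timeout` option; the search domains and the function name are hypothetical, and real libraries differ in retry behaviour and in how many candidates they try.

```python
def worst_case_lookup_delay(timeout_s, search_domains, dead_leading_servers=1):
    """Estimate seconds lost to timeouts for one lookup (simplified model).

    Assumes every candidate name has to be tried, and that each one waits
    out a full timeout per dead server before reaching a live server.
    """
    candidates = len(search_domains) + 1  # search list plus the name as given
    return timeout_s * dead_leading_servers * candidates


if __name__ == "__main__":
    # Hypothetical client: 5 s timeout, three search domains, first server dead.
    domains = ["corp.example.com", "dc1.example.com", "example.com"]
    print(worst_case_lookup_delay(5, domains))  # -> 20
```

Under these assumptions a single dead resolver can cost up to 20 seconds per lookup, which is far beyond what most application connection timeouts will tolerate.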


These days you can use dnsdist by PowerDNS.

From the README:

dnsdist is a highly DNS-, DoS- and abuse-aware loadbalancer. Its goal in life is to route traffic to the best server, delivering top performance to legitimate users while shunting or blocking abusive traffic.

dnsdist is dynamic, in the sense that its configuration can be changed at runtime, and that its statistics can be queried from a console-like interface.

https://github.com/PowerDNS/pdns/tree/master/pdns/dnsdistdist

They provide repositories for common OSes: https://repo.powerdns.com/
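To give a feel for what "route traffic to the best server" can mean, here is a small Python sketch of one possible selection rule: skip backends that are marked down and pick the one with the fewest queries still awaiting an answer. dnsdist's documentation lists a `leastOutstanding` policy along these lines, but the code below is only an illustration of the idea. It is not dnsdist's implementation or its configuration syntax, and `Backend`, `pick_backend`, and the addresses are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    """Hypothetical view of one downstream DNS server."""
    address: str
    up: bool = True
    outstanding: int = 0  # queries sent but not yet answered


def pick_backend(backends):
    """Choose the live backend with the fewest outstanding queries."""
    live = [b for b in backends if b.up]
    if not live:
        raise RuntimeError("no backend is up")
    return min(live, key=lambda b: b.outstanding)


if __name__ == "__main__":
    pool = [
        Backend("192.0.2.1", up=False),       # failed its health check
        Backend("192.0.2.2", outstanding=7),
        Backend("192.0.2.3", outstanding=2),
    ]
    print(pick_backend(pool).address)  # -> 192.0.2.3
```

The point for the recursive case is that clients only ever see the balancer's address, so a dead backend is skipped centrally instead of being timed out by every client's resolver library.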