Post by Steve Gibbard
Post by Mike Lewinski
Post by David Coulson
Depends - It doesn't help if the DNS server is dead, but the front-end
is still advertising the routes.
Possibly a good argument for allowing the DNS servers to originate the
routes for them...? I've seen configuration where the routes were
Running Quagga or something similar on the anycasted server to announce
the routes is the standard way of setting up anycast. That way, if the
server fails completely, the route goes away.
that's what joe said to do in <http://www.isc.org/pubs/tn/isc-tn-2004-1.txt>.
Post by Steve Gibbard
A common improvement on that is to run a script on the server that checks
to make sure the name server process is running and responding correctly,
and kills BGP if it isn't. That covers cases where named has problems
that don't take down the whole server.
in ISC-TN-2004-1 [ibid], appendix D, joe suggests bringing up and down the
interface BIND listens on (which presumes that it's a dedicated loopback
like lo1 whose address is covered by a quagga route advertisement rule).
note that joe's example brings up the interface before starting the name
server program, and brings it down if the name server program exits.
this presumes that the name server will start very quickly, and that while
running, it is healthy. since i've seen name server programs be unhealthy
while running, and/or take a long time to start, i'm now considering an
outboard shell script that runs some kind of DNS query and decides, based
on the result, whether to bring the dedicated loopback interface up or down.
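
a minimal sketch of such an outboard check, written in python rather than
shell and not taken from ISC-TN-2004-1. everything named here is an assumption
for illustration: the probe address (presumably a non-anycast address the name
server also listens on), the zone queried, the choice of an SOA query, the
interface name lo1, and the use of ifconfig. the sketch only builds the
command; actually running it, looping, and damping flaps are left out.

```python
import socket
import struct

QUERY_ID = 0x1234  # arbitrary fixed ID for the probe (assumption)


def build_query(name, qid=QUERY_ID, qtype=6):
    """Build a minimal DNS query packet (QTYPE 6 = SOA, class IN, RD=0)."""
    header = struct.pack("!HHHHHH", qid, 0x0000, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
        if label
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def is_healthy(response, qid=QUERY_ID):
    """True if the reply has our ID, the QR bit set, RCODE 0, and an answer."""
    if response is None or len(response) < 12:
        return False
    rid, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", response[:12])
    return (
        rid == qid
        and bool(flags & 0x8000)      # QR: this is a response
        and (flags & 0x000F) == 0     # RCODE 0 (NOERROR)
        and ancount > 0               # at least one answer record
    )


def probe(server, name, timeout=2.0):
    """Send one UDP query; return the raw reply, or None on timeout/error."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        try:
            s.sendto(build_query(name), (server, 53))
            data, _addr = s.recvfrom(4096)
            return data
        except OSError:  # includes socket.timeout
            return None


def interface_command(healthy, ifname="lo1"):
    """Command that would bring the dedicated loopback up or down."""
    return ["ifconfig", ifname, "up" if healthy else "down"]


if __name__ == "__main__":
    # 192.0.2.53 and example.com are placeholders, not real deployment values.
    reply = probe("192.0.2.53", "example.com")
    print(" ".join(interface_command(is_healthy(reply))))
```

run from cron or a loop, the printed command would be handed to the shell (or
to subprocess) so that the covering route is announced only while the probe
succeeds.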
Post by Steve Gibbard
...
The right solution is to design the anycast servers to be as sure as
possible that the route will go away when you want it gone, but to have
multiple non-interdependent anycast clouds in the NS records for each
zone. If the local node in one cloud does fail improperly, something will
still be responding on the other cloud's IP address.
the need for multiple independent anycast clouds is an RFC 2182 topic, but
joe's innovation both in ISC-TN-2004-1 and in his earlier ISC-TN-2003-1 (see
<http://www.isc.org/pubs/tn/isc-tn-2003-1.txt>) is that if each anycast cluster
is really several servers, each using OSPF ECMP, then you can lose a server
and still have that cluster advertising the route upstream, and only when you
lose all servers in a cluster will that route be withdrawn.
Post by Steve Gibbard
Note that any of these failure scenarios is still preferable to what you
get with unicast servers. With unicast, if the server has trouble, the
route always stays up, and the traffic always ends up in a black hole.
here, the real problem is the route staying up, which also blackholes anycast.
the only things DNS anycast universally buys you are DDoS resilience and
hot swap. anything else anycast can do (high availability, low avg. RTT, etc)
can also be engineered using a unicast design, though probably at higher TCO.
--
Paul Vixie