Hal Burgiss
2010-09-11 19:44:16 UTC
Hello,
I am trying to understand a predicament I found myself in today. As background,
my environment is that I work for a small web hosting company. We handle the
authoritative DNS for most of our clients, using djbdns/tinydns. So we have an
ns1, ns2, and ns3 type setup. data.cdb is copied to all 3 when the
Makefile is executed so that everything stays in sync. This is a
non-clustered setup, with one IP address per server.
This has seemed to work flawlessly for years now. Last night though someone
inadvertently disconnected the wrong server, and unplugged the ns1 system.
The eventual impact of that one mistake was that the dns for the hosted
domains all went down totally. The ns2 and ns3 systems were never queried.
Direct querying during testing showed they were responding normally (e.g. dig
blah.com @ns2). Yet, for all practical purposes they might as well have been
unplugged too, since they were totally quiet. I had been under the false
assumption that should ns1 go down, the others would automatically come
into play. What am I missing?
Secondly, when I realized what happened and that the two secondary systems
were totally useless, I moved the IP address from ns1 to ns3, and
changed the tinydns configs, restarted the service, verified that tinydns
was listening on the correct ip and port, and direct test queries worked
fine. I was doing all this remotely, and did not have the ability to
reconnect the original system. I was assuming the ip move would be a
reasonable hotfix. But this did not work. Some 2 hours later the original
system was reconnected, and within minutes all started working normally
again. Help me understand this so I can avoid this kind of headache in the
future!
Thank you.
--
Hal