Pete's Log: ModuleNotFoundError: No module named 'dns.resolver'; 'dns' is not a package

Entry #1943, (Coding, Hacking, & CS stuff)
(posted when I was 42 years old.)

Sigh. I'm getting too fancy for my pants.

First, a note on the error that is the subject of this entry. Apparently if you're going to try to import dns.resolver in a Python script, don't name that script dns.py. You will get the error above, and search engines will make you think there is something wrong with how you installed dnspython. I stayed up way past my bedtime last night trying to figure that out. (almost midnight, yikes!)

After trying apt packages and pip installs and mucking with environment variables and different incantations of things, the thought finally occurred to me that maybe the "dns" in 'dns' is not a package was referring to the name of my script and not the "dns" in "dns.resolver". A quick mv dns.py foo.py and suddenly it worked. Perhaps this is obvious to Python people. I'm not a Python person.
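For anyone who hits the same wall, here is a minimal sketch of how one might check what a module name actually resolves to before blaming the install. The helper names (module_origin, is_package) are made up for illustration. It only uses the standard library's importlib, and it is demonstrated on standard-library names so it runs without dnspython installed.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level import of `name` would load, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


def is_package(name):
    """True only if `name` resolves to a package (something with submodules)."""
    spec = importlib.util.find_spec(name)
    # A plain module such as a stray dns.py has no submodule search path,
    # which is why "import dns.resolver" fails with "'dns' is not a package".
    return spec is not None and spec.submodule_search_locations is not None


if __name__ == "__main__":
    for name in ("json", "os"):
        print(name, module_origin(name), is_package(name))
```

Run from the directory containing the misnamed script, module_origin("dns") would be expected to point at the local dns.py rather than at the installed library, which gives the game away much faster than reinstalling things.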

What I learned from my misadventures with Python DNS is that certbot is not happy when it can't communicate with the DNS server you want it to make an rfc2136 update to. Which obviously seems obvious, but again—I was up way past my bedtime.

How did I get myself into this mess? Maybe I've let our home network take over my life too much. The ingredients for trouble:

  • I wanted a dedicated domain name for our home network (using the .house TLD)
  • I wanted Dynamic DNS so that certain hostnames can resolve externally
  • I wanted a wildcard Let's Encrypt certificate for my webserver pi
  • I don't know what I'm doing

So I actually have two rfc2136 clients on our network. One is the firewall, which keeps the A and AAAA records for the external .house domain updated. The other is associated with certbot on my webserver so I can have a wildcard certificate.

Everything worked fine when I first set it up. But then two things happened.

  1. Branden informed me that named was writing errors into syslog about failed updates to the house domain
  2. Certbot stopped being able to renew my wildcard certificate

These things turned out to be not related to each other, other than coincidentally happening at close to the same time.

For the first issue, since it had cropped up after setting up my rfc2136 clients, I was fixated on thinking one of those clients was responsible. So I disabled them but the errors kept coming. In my mind I was convinced one of those clients had some cached configuration and was being subversive. But even while thinking that, I was at least smart enough to enable some logging on the firewall. Since my firewall logs overflow pretty quickly, it took a few days before I finally caught one of these update failures in the act. And as it turned out, it was my work laptop!

Apparently Windows likes to try to send DNS updates to the authoritative DNS server for a domain when its lease is renewed. Or something like that. Well, inside my network, the firewall should be considered the authority for DNS. But the quick fix I'm trying is just to redirect external DNS traffic on the guest network to the firewall. Time to wait and see if that helps.

But speaking of redirecting external DNS traffic to the firewall. Remember that time I added a firewall rule to redirect all DNS traffic on the HA network? Turns out I did have one legitimate need for DNS traffic on that network to go out. Certbot. There was even already a firewall rule that allowed that particular DNS traffic, since traffic on that network is for the most part not allowed access to the internet. But redirecting DNS traffic to the firewall broke certbot.

Encountered exception during recovery: certbot.errors.PluginError: Unable to determine base domain for _acme-challenge.my.house using names: '_acme-challenge.my.house', 'my.house', 'house'.

And it took me way longer to figure out than it should have. The systemd stub resolver was doing some weird things that led me down the wrong path. The Let's Encrypt logs led me to believe certbot was having trouble with the SOA record for the house domain. Which it was. And initially dig/host were having trouble too, but only when resolving against 127.0.0.53, not when resolving against my firewall. I got that resolved (I don't remember how; again, past bedtime) and then host/dig happily resolved the SOA record.

That's when I got into the weeds and started seeing if Python was resolving DNS differently. And then I copied the SOA lookup code out of the Certbot rfc2136 plugin and, after banging my head against Python, I finally realized Certbot was performing the lookup against the DNS server it's supposed to update, not against local DNS. And then it all clicked into place.
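A rough sketch of that kind of lookup, for the curious. This is not Certbot's actual code, just an approximation of the idea using dnspython: ask one specific server for the SOA of each candidate name, working from the full name down to the TLD, which matches the list of names in the error message above. The function names and the use of UDP are assumptions for illustration.

```python
def server_has_soa(name, server_ip, timeout=5.0):
    """Ask server_ip directly (bypassing the local resolver) for name's SOA."""
    # Imported here so the rest of the sketch loads without dnspython.
    import dns.message
    import dns.query
    import dns.rcode
    import dns.rdatatype

    query = dns.message.make_query(name, dns.rdatatype.SOA)
    # Raises a timeout exception if the server is unreachable, for example
    # when a firewall rule is redirecting or dropping the traffic.
    response = dns.query.udp(query, server_ip, timeout=timeout)
    return response.rcode() == dns.rcode.NOERROR and any(
        rrset.rdtype == dns.rdatatype.SOA for rrset in response.answer
    )


def find_zone(name, has_soa):
    """Return the longest suffix of name for which has_soa(suffix) is true."""
    labels = name.rstrip(".").split(".")
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if has_soa(candidate):
            return candidate
    return None
```

Something like find_zone("_acme-challenge.my.house", lambda n: server_has_soa(n, "192.0.2.1")) would then show whether the update server itself can be reached and answers for the zone. The address there is a documentation placeholder, not a real server. And, of course, don't save it as dns.py.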

Not even sure why I wrote up this many details. Maybe it'll help future me. Maybe because search engines failed me in fixing this. Maybe as a warning to others. At least I know my firewall rules are working.