Quite a while back, RS wrote a comprehensive ansible role for handling
Let's Encrypt certificate issuance and renewal.
We both use this role extensively, which is why it was a significant issue when it
suddenly started throwing type errors deep inside the dnspython library during an
nsupdate call in a critical part of the script.
A cursory examination of the component parts indicated that the most likely cause was
a change to the dnspython library, which had recently been upgraded from 1.16 to 2.0.
We couldn't find anything online indicating other people had suffered this breakage
(which should have been a clue), but the evidence seemed to fit: the new release hadn't
been out very long, it crashed in a module that indicated it was checking something with
IPv6, we use a lot of IPv6 on our systems while many people use none, and, well, we
hadn't changed anything...
This was an annoyance, but relatively easy to avoid in one of the following ways:
- Pin the dnspython libraries to <2.0 in
- On the Mac, use ansible and manually roll back the dnspython libraries in the installed version
I used both, as we ran ansible on both SmartOS and macOS.
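A pin like this is easy to lose during an upgrade, so a small guard can help confirm it still holds. The following is a minimal sketch, not what we actually ran: `major_version`, `dnspython_is_pinned`, and `installed_dnspython_version` are hypothetical helper names, and the check assumes the package is installed under the distribution name dnspython.

```python
from importlib import metadata
from typing import Optional


def major_version(version: str) -> int:
    """Return the leading numeric component of a version string like '1.16.0'."""
    return int(version.split(".", 1)[0])


def dnspython_is_pinned(installed: str) -> bool:
    """True when the given dnspython version is below 2.0 (the pin described above)."""
    return major_version(installed) < 2


def installed_dnspython_version() -> Optional[str]:
    """Look up the installed dnspython version, or None if it is not installed."""
    try:
        return metadata.version("dnspython")
    except metadata.PackageNotFoundError:
        return None
```

Calling `dnspython_is_pinned(installed_dnspython_version())` in the environment that runs ansible (after checking for `None`) would flag an accidental upgrade before a playbook hits it.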
Taking brew hackery up a notch
After maintaining this for a while, I needed to upgrade some modules and keep my CI
environment (running on Macs under Jenkins) in sync with what we were running on my
desktop, laptop, and servers. That led me to create my own tap in homebrew
by cloning the standard ansible formula and using my own repository.
The addition of this tap meant that I could configure and test this once, then deploy it on all of my Macs (and anyone else who had access to the tap on my private git server could do the same).
One thing leads to another
After a few more months of using this tap on my Macs (and slowly moving the
version ahead on the SmartOS machines, but keeping dnspython pinned), I needed to
upgrade the version of ansible at home (due to a project that I'll likely write about
later, using ansible to configure my Jenkins agents). The driver here was the need to
execute homebrew commands on an M1 Mac, something that didn't work out of the box with
2.9, which is what I was pinned to.
Ever-hopeful, I first decided to see if my aforementioned problem was "fixed" by unlinking
my private tap's version of
ansible, and installing homebrew's version.
Sadly, running the
ansible playbook just resulted in the familiar crash. I looked at it
for a few minutes, decided the bug that was introduced in summer 2020 was still there and
set about building a new tap for version 3.2.0 of
ansible. This went smoothly, but after
updating my formula, installing took a long time, on the order of a few minutes. Why was
the standard homebrew install so much faster?
A bottle for monsieur?
A quick investigation revealed that most brew formulae are installed these days using bottles, which are pre-built versions of the entire subdirectory that ends up in the Cellar. That seemed like a significant win, especially since I was going to install this at least 5 times each update, so I decided to figure out how to create my own custom bottles for my custom tap.
Thanks to a good article on Custom Tap and Bottles with Homebrew by Yehowshua Immanuel, I was on my way quickly after rebuilding from my tap formula once for each platform of Mac that I run (Intel Catalina, Intel Big Sur, and ARM Big Sur at this time).
The final verdict
After all this work, and getting a great solution in place for working around the
perceived bug in dnspython, I took another quick look at the bug that was popping up in
our role. I'd contributed to random Python projects in the past and also contributed to
ansible directly, so I was familiar with the process and figured I could track the
problem down. I fired up PyCharm to get a little
better perspective on the particular bug and settled in to reproduce a minimal version of the
problem with the
nsupdate command in
A few minutes (literally) into the investigation, I found myself looking at what
seemed like completely reasonable arguments to the dns.query.tcp method, which were
raising exceptions due to not being able to determine whether my hostname was an IPv4
or IPv6 address. I immediately checked the current docs and found that the
server argument is now designated an IP address (v4 or v6). Checking whether
we'd just been lucky and ignoring this all along, I went back to the earlier
documentation and verified that it was silent on the issue of what was in the string argument.
At some point between 2.9 of ansible and 3.0, they documented the change caused by
the underlying library and I missed that change.
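Since the root cause was a hostname reaching a call that expects an IP address, a caller-side guard is one way to avoid the crash. The sketch below uses only the Python standard library, and `is_ip_literal` and `resolve_server` are hypothetical helper names for illustration, not part of ansible or dnspython.

```python
import ipaddress
import socket


def is_ip_literal(server: str) -> bool:
    """Return True if `server` parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(server)
        return True
    except ValueError:
        return False


def resolve_server(server: str, port: int = 53) -> str:
    """Return an IP address string for `server`, resolving hostnames if needed."""
    if is_ip_literal(server):
        return server
    # getaddrinfo returns (family, type, proto, canonname, sockaddr) tuples;
    # sockaddr[0] is the address string for both IPv4 and IPv6 results.
    infos = socket.getaddrinfo(server, port, type=socket.SOCK_STREAM)
    return infos[0][4][0]
```

The returned string could then be used wherever an address is required, such as the `where` argument of `dns.query.tcp`. Taking the first `getaddrinfo` result is a simplification; a fuller version might prefer one address family or try each result in turn.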
A few take-aways:
- Once again, a reminder that checking your arguments against current documentation is often time well spent.
- Assuming that behavior which goes against your expectations is a bug, when nobody else is complaining about it, is often a recipe for a lot of work.
- Homebrew is a really well thought out package, and if you need to maintain your own tools, it may be well worth using private taps and bottles. They're easy to create and super-easy to use.
Every once in a while, it's good to have your own assumptions challenged. I made a point
of commenting on the ansible bug report regarding this, which was filed by someone else.
Hopefully they'll find my information useful.
This experience led me to a nifty thing about brew, which is that many installations have every dependency installed in the Cellar directory for that specific package, including (for most Python tools) its own copy of site-packages. This makes it very easy to pin specific versions of dependencies and to run a number of Python tools with different libraries and even interpreters. ↩︎
Everyone who uses ansible should be familiar with the unlink commands, which allow you to keep a version or command installed while switching to another one. In my case, I was using a tap that had named versions (the best example of this I can think of is Postgresql, which has separate versions for current, 12, 11, 10, 9.6 and even some of the deprecated versions; use at your own peril), so I could
brew unlink firstname.lastname@example.org
brew install ansible and get my private copy to move out of the way and use the brew-standard version for testing. ↩︎