Proj Floating Point Error Investigation


TL;DR

macOS 13.3 or 13.3.1 incorporated a change that affects calculations in proj for applications running on those versions of the OS. The change appears to be relatively subtle: it affects only a single test in a single projection, and only on x86_64, not arm, but it nonetheless causes a failure on macOS in the standard gie tests for proj.

Since the tests pass on other platforms and on versions of macOS prior to 13.3, I'm going to assume for now that this is a variation in floating point behavior peculiar to macOS, and only on Intel CPUs. I did file an issue with the project, just in case this turns out not to be a bug in macOS.
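To illustrate how subtle this kind of difference can be (this is a generic illustration, not the specific failure in proj): floating point addition isn't associative, so a change in how a compiler or math library groups or fuses operations can shift a result by a single ulp, which is enough to trip a strict tolerance in a regression test. A minimal Python sketch:

```python
import math

# Floating point addition is not associative: regrouping the same
# three terms changes the last bit of the result.
a = (0.1 + 0.2) + 0.3   # 0.6000000000000001
b = 0.1 + (0.2 + 0.3)   # 0.6

print(a == b)                        # False
print(abs(a - b) <= math.ulp(0.6))   # True: they differ by a single ulp
```

A difference this small is invisible in most output, but a test comparing against an exact expected value will flag it immediately.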

Investigating

Last weekend, I decided to finally upgrade my macOS build farm, which I'd been putting off because it would require five macOS upgrades and five Xcode upgrades. At this point, my automation for the former is inadequate, so I usually do that process by hand. The latter I had similarly done by hand (until moving that work to xcodes this week).

Unfortunately, I hadn't upgraded my desktop MacPro to 13.3.1 prior to doing the upgrades in the build farm, so I didn't know what kind of fun I had in store.

Once I'd finished the upgrades, I ran a test build across the build farm, and the Intel-based Ventura (13.3.1) machine was the only one that failed. I went back to my MacPro (still running 13.2.1) and had no problems. A few days later, I upgraded the MacPro to 13.3.1, and suddenly it exhibited the same problem. At this point, I'd isolated the problem (somewhat accidentally) to Intel-based machines running macOS 13.3.1.

As a side note, testing in this case was made much easier by the fact that I'd shipped the proj and gdal CLI tools with the last major release of Cartographica, which meant I could run the functional portion of the tests without having to go through XCTest.

I was having some other problems with Xcode 14.3, so I decided to back down the Xcode version, thinking it might be the cause of my problem. (More on that is detailed in Xcodes for Xcode Switching.)

After downgrading to Xcode 14.2, the aforementioned inaccuracy was still happening, which seemed to rule out Xcode as the cause of this problem.

Unfortunately, I'd already upgraded all of my macOS 13.2 machines to 13.3, so I decided to create a VM to run the regression tests against macOS 13.2. Thanks to the list of Apple macOS Downloads for Ventura, I was able to download the installer and create a VM under Parallels for my tests. It took a bit of time, but once the VM was running, I could confirm that the problems were only happening on macOS 13.3.1 on Intel-based CPUs.

I'll post the end of this story when it arrives, but it's a bit frightening to see floating point behavior change, without significant notification, in a minor macOS update.
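One technique that helps when chasing this kind of drift (a general suggestion, not necessarily what the proj maintainers used) is to print suspect values as their exact IEEE 754 bit patterns rather than rounded decimals, so results from two machines can be diffed bit for bit. A small Python sketch:

```python
import math
import struct

def double_bits(x: float) -> str:
    """Return the 64-bit IEEE 754 pattern of x as 16 hex digits."""
    return struct.pack('>d', x).hex()

# Transcendental functions are a common source of last-bit differences
# between libm versions; dump the bits so two machines can be compared.
for v in (math.sin(1.0), math.atan2(1.0, 3.0), 0.1 + 0.2):
    print(double_bits(v))
```

Running this on a 13.2 machine and a 13.3.1 machine and diffing the output would show exactly which operations changed and by how many bits.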

I did find two potential workarounds, which I detailed in my issue with the proj owners above. I'll post a follow-up article when the issue is finally dispositioned.