fp-contract Accuracy

My previous post, about a proj floating point investigation, discussed an issue that I'd tracked down to the OS level. However, it's now clear that the issue stems from an underlying change in the code generated by Xcode (and/or the LLVM toolchain that it is built upon).

Based on a post about Xcode 14.3 in Michael Tsai's blog, I started looking at a change in how the compiler handles fp-contract, the setting that controls whether the compiler may contract (fuse) separate floating point operations into a single operation when optimizing.

There's certainly been some debate about the change to the handling of the fp-contract flag that was introduced in Clang 14, and I'm not going to take a stand on which behavior is "better". I am going to note that the change was unexpected (in a minor release) and had notable, if not significant, effects on floating point handling in Cartographica.

The change in behavior results in Xcode 14.3 (Clang 14) choosing by default to contract floating point multiply and add instructions into a single Fused Multiply Add instruction, which rounds once at the end rather than once after the multiply and again after the add. Although this likely makes the calculations more accurate, it runs the risk of diverging from existing results and can create complexities in testing.

In practice, I haven't found a large number of differences, but in some cases, there are variances that are causing some difficulties in test management.

The change itself was in how the fp-contract (floating-point contraction) flag was being handled by Clang to bring it more into alignment with the standards for C/C++. The details on the flag handling are a bit esoteric so I'll leave that for the reader, but there are a variety of options for the setting and the change was in the default handling between Clang 13 and Clang 14, effectively causing it to move from off to on by default.
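Besides the command-line flag, standard C also has a pragma for controlling contraction in a specific block of code. The sketch below assumes a compiler that honors `#pragma STDC FP_CONTRACT` (Clang does; some other compilers ignore it), and the function name is hypothetical.

```c
/* Hypothetical helper: a multiply-add that asks the compiler not to fuse
   the two operations, whatever the file-wide fp-contract default is. */
double mul_add_uncontracted(double a, double b, double c) {
    /* Standard C pragma. Inside a block it has to come before any
       declarations or statements. Compilers that do not implement it
       may silently ignore it, so treat this as a request, not a guarantee. */
    #pragma STDC FP_CONTRACT OFF
    return a * b + c;
}
```

This is narrower than building everything with -ffp-contract=off: only the marked block keeps the older two-rounding behavior, and the rest of the code still gets whatever the compiler default is.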

If you want to see an illustration of what this does from a code generation perspective, there's a good comparison using godbolt between default Clang 13 and Clang 14, as well as Clang 14 with -ffp-contract=off, showing the behavior change. (There's also a longer godbolt link here, in case the short one ever goes stale.)

I'm still on the fence over whether this is actually a code problem or a test problem, but at the moment, it's really feeling more like the latter. The IEEE floating point standard (IEEE 754) defines the fused multiply-add operation, and it's clearly intended to remove some of the error that accumulates when floating point operations are combined, while also improving speed.
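If it does turn out to be a test problem, one common approach is to compare results within a tolerance rather than bit for bit. This is only a sketch of that idea, not what the Cartographica tests actually do; the helper names and the tolerance parameters are hypothetical, and suitable values depend entirely on the computation being tested.

```c
#include <float.h>
#include <stdbool.h>

/* Absolute value without pulling in the math library. */
static double abs_value(double x) {
    return x < 0.0 ? -x : x;
}

/* Hypothetical test helper: true when a and b are equal, or differ by no
   more than abs_tol, or by no more than rel_tol relative to the larger
   of the two magnitudes. */
bool nearly_equal(double a, double b, double rel_tol, double abs_tol) {
    if (a == b) {
        return true;                 /* exact match, including equal infinities */
    }
    double diff = abs_value(a - b);
    if (!(diff <= DBL_MAX)) {
        return false;                /* NaN or infinite difference never matches */
    }
    double mag_a = abs_value(a);
    double mag_b = abs_value(b);
    double scale = mag_a > mag_b ? mag_a : mag_b;
    return diff <= abs_tol || diff <= rel_tol * scale;
}
```

The absolute tolerance matters for results near zero, where a contracted and an uncontracted calculation can differ by a tiny amount that is nonetheless large in relative terms.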