Understanding Decimals

Background

Decimals, or floating-point numbers, have a long and storied history in computer science and have caused all sorts of misunderstandings and unexpected outcomes. This all stems from how floating-point numbers are represented as binary values in computer programs.

In modern programming languages and computers, floating-point numbers are represented using the https://en.wikipedia.org/wiki/IEEE_754 standard, which defines floating-point arithmetic and addresses various problems faced in the early days of computer science. It specifies how floating-point numbers are represented as binary values in the underlying layers of programming languages. Unfortunately, many decimal values, and more importantly the precision of decimal values, cannot be represented exactly in binary.

What does this look like?

Helium’s decimal data type is backed by Java’s double primitive type, which falls victim to the general “Double Precision Issue” present in most (if not all) computer programming languages. Take the following example:

public class Main {
    public static void main(String[] args) {
        for (double i = 0.0; i < 10; i = i + 0.1) {
            System.out.println(i);
        }
    }
}

Mathematically this is a simple operation: it is just the addition of decimal numbers, where the value increments by a tenth on every iteration and the result is printed. The mathematical result would be:

0.0
0.1
0.2
0.3
0.4
0.5
...

The actual result however, shows a different story:

0.0
0.1
0.2
0.30000000000000004
0.4
0.5
0.6
0.7
0.7999999999999999
0.8999999999999999
0.9999999999999999
1.0999999999999999
...

So what’s going on here?

As mentioned, the issue here is with the binary representation of these floating-point numbers under the IEEE 754 standard. Because there are a fixed number of bits (64 bits for a Java double) available for each floating-point number, the binary representation of 0.1 looks like this:
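0 01111111011 1001100110011001100110011001100110011001100110011010
(sign | exponent | mantissa)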

There is a repeating 0011 sequence in the third part of the value (the mantissa) which is cut short at the very end, indicating that 0.1 is an infinitely repeating fraction in binary. This infinite value has to be rounded to fit into the available bits, however, and it is here that we find the very minute precision differences that appear when we do calculations with these values: the arithmetic operations cause rounding errors to accumulate.
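You can reproduce this bit pattern yourself with a small plain-Java sketch (the class name and formatting here are purely illustrative; the splitting uses the standard Double.doubleToLongBits method):

public class DoubleBits {
    public static void main(String[] args) {
        // Raw IEEE 754 bit pattern of 0.1, as a 64-bit long
        long bits = Double.doubleToLongBits(0.1);
        // Pad to 64 binary digits and split into sign (1 bit),
        // exponent (11 bits) and mantissa (52 bits)
        String s = String.format("%64s", Long.toBinaryString(bits)).replace(' ', '0');
        System.out.println(s.substring(0, 1) + " " + s.substring(1, 12) + " " + s.substring(12));
    }
}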

What effect does this have on Helium Rapid Apps?

It is important to note that Helium Rapid (like any other programming language or software) will round these values correctly, even after performing mathematical operations with them, because the issue exists only at a minute scale of precision; we are talking about discrepancies at the 10⁻¹⁷ level. Where it will affect Helium Rapid Apps, however, is when these decimal values are compared. When rounded and displayed, two decimal values can look exactly the same, yet their binary representations can be minutely different because of how those values were calculated.
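As a plain-Java illustration of the comparison problem (Helium Rapid's own syntax differs, but its decimal type is backed by the same double representation):

public class CompareExample {
    public static void main(String[] args) {
        double a = 0.1 + 0.2;
        double b = 0.3;
        System.out.println(a);      // 0.30000000000000004
        System.out.println(b);      // 0.3
        System.out.println(a == b); // false, even though both display as 0.3 when rounded
    }
}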

How to compare decimal values in Helium Rapid?

To compare two decimal values in Helium Rapid, you can use a technique used in any programming language to compare two floating-point numbers: the Epsilon Comparison. In short, an Epsilon Comparison checks that the difference between the two values you want to test for equality is incredibly small (smaller than the epsilon value). If it is, you can be reasonably sure that the two values are the same, or at least similar enough that you don't care about a difference smaller than your epsilon. In simple terms, a Helium Rapid solution would look like this:
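A rough sketch of this idea, written in plain Java rather than Helium Rapid's own syntax (the nearlyEqual name and the hard-coded epsilon are illustrative assumptions):

public class NaiveEpsilon {
    // Treat a and b as equal if their absolute difference is smaller
    // than a small, hard-coded epsilon value.
    static boolean nearlyEqual(double a, double b) {
        final double EPSILON = 0.00000001;
        return Math.abs(a - b) < EPSILON;
    }

    public static void main(String[] args) {
        System.out.println(nearlyEqual(0.1 + 0.2, 0.3)); // true
    }
}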

PLEASE DON’T JUST DO THIS!

While this could be an adequate solution in certain situations, it is not an effective way to compare two decimal values that are already victims of the Double Precision Issue mentioned above, and it will produce unpredictable results. As an illustration of some edge cases with this comparison, look at the following examples:
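These examples are again a plain-Java sketch, using the same illustrative nearlyEqual method and hard-coded epsilon as above:

public class EpsilonEdgeCases {
    static boolean nearlyEqual(double a, double b) {
        return Math.abs(a - b) < 0.00000001;
    }

    public static void main(String[] args) {
        // Two tiny values that differ by a factor of ten still compare as
        // "equal", because their absolute difference is below the epsilon.
        System.out.println(nearlyEqual(0.000000001, 0.0000000001)); // true

        // Two large values that are as close as doubles can represent compare
        // as "not equal", because their absolute difference exceeds the epsilon.
        System.out.println(nearlyEqual(1000000000.0, 1000000000.0000001)); // false
    }
}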

Depending on what your expectation is, “nearly equals” can mean different things. A more apt solution is to find the relative difference between the two values, while making sure to catch some edge cases:
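Here is a sketch of such a relative comparison in plain Java, loosely following the approach described at https://floating-point-gui.de/ (the nearlyEqual name and the 0.00000001 epsilon are illustrative assumptions):

public class RelativeEpsilon {
    static boolean nearlyEqual(double a, double b) {
        final double EPSILON = 0.00000001;
        if (a == b) {
            return true; // shortcut; also handles the case where both values are infinite
        }
        double diff = Math.abs(a - b);
        // Scale the allowed difference to the magnitude of the operands.
        return diff / (Math.abs(a) + Math.abs(b)) < EPSILON;
    }

    public static void main(String[] args) {
        System.out.println(nearlyEqual(0.1 + 0.2, 0.3));                   // true
        System.out.println(nearlyEqual(1000000000.0, 1000000000.0000001)); // true
        System.out.println(nearlyEqual(0.000000001, 0.0000000001));        // false
    }
}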

This would solve the exceptions mentioned above; however, even this solution doesn't cater for further edge cases. What if either a or b is infinitesimally small, or almost zero? What if either a or b is infinitely large? More importantly, what if the values themselves are smaller than the epsilon value you've chosen to hard-code in your function (the 0.00000001 value above)?
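To make these questions concrete, this is how the relative-comparison sketch above behaves in a couple of those cases (the values are chosen purely for illustration, and this assumes the RelativeEpsilon class from the previous sketch is compiled alongside it):

public class MoreEdgeCases {
    public static void main(String[] args) {
        // Near zero: 1.0e-19 is effectively zero for most purposes, yet its
        // relative difference to 0.0 is 100%, so the values compare as "not equal".
        System.out.println(RelativeEpsilon.nearlyEqual(0.0, 0.0000000000000000001)); // false

        // Both values are already smaller than the hard-coded epsilon, but they
        // still compare as "not equal" because they differ by a factor of two.
        System.out.println(RelativeEpsilon.nearlyEqual(0.000000000001, 0.000000000002)); // false
    }
}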

This may seem like a convoluted mathematical exercise designed to confound and confuse, but when working with blood or water samples that contain trace amounts of a substance, these minuscule differences matter. This is one of the (honestly, there are many other) reasons why banks still use COBOL.

Conclusion

There are two reasonable solutions here to an unreasonably obscure issue in programming, but neither of them covers all edge cases, and both could cause issues if you're not expecting them. Therefore, when developing these decimal comparisons, the actual use and implementation should be thoroughly designed and tested to cover every possibility that the use case could encounter.

Further Reading

To show that this isn't just a convoluted math exercise, here is some further reading on the topic:

https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html

https://floating-point-gui.de/