
Re: [Mingw-users] msvcrt printf bug




One of the limitations of floating-point arithmetic is that it carries only a limited number of significant digits.  This affects not only the representation of irrational numbers like pi and e, but also rational numbers like 1/3 or 22/7.

In particular, to represent 1/3 exactly in either binary or decimal would require an infinite number of digits.  But computers are finite machines with a limited number of bits.  So, for example, using 10 decimal digits of precision, the best representation you could achieve for 1/3 would be 0.3333333333.

Then, if you multiply this value by 3, you will get 0.9999999999 rather than 1.0000000000, which is what you would want 3 * 1/3 to produce.

By rounding to a smaller number of digits than the internal precision, you can often get the result you want.  For example, rounding 0.9999999999 to nine digits results in 1.000000000, which is the correct answer.

If rounding is not done, then an expression like ((0.1 + 0.2) == 0.3) will return false (with IEEE 754 doubles), even though with infinite precision, it would return true.
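
To make this concrete, here is a minimal C sketch (the digits in the comments assume IEEE 754 doubles and a correctly rounding printf, such as glibc's):

    #include <stdio.h>

    int main(void)
    {
        double sum = 0.1 + 0.2;   /* each operation rounds to the nearest double */

        printf("%.17g\n", sum);       /* 0.30000000000000004 */
        printf("%d\n", sum == 0.3);   /* 0 -- exact equality fails */
        printf("%.15g\n", sum);       /* 0.3 -- rounding to 15 digits recovers it */
        return 0;
    }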

The problem is exacerbated when we are storing the numbers as pure binary (rather than binary-coded decimal), and then convert the pure binary to a decimal number.  In that case, the binary approximation of 1/3, when converted to decimal, may wind up being 0.33333333333333331.  If you multiply this by 3 (as a decimal number), you get 0.99999999999999993, which, when rounded to 16 digits, is still 0.9999999999999999, rather than the desired 1.000000000000000.

Rounding to 15 digits gives the desired answer, of course.
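
For the curious, here is a minimal C sketch that exposes those digits (the output shown assumes IEEE 754 doubles and a correctly rounding printf, such as glibc's; msvcrt reportedly pads the tail with zeros instead):

    #include <stdio.h>

    int main(void)
    {
        double third = 1.0 / 3.0;   /* the double nearest to 1/3 */

        printf("%.17g\n", third);   /* 0.33333333333333331 */

        /* all 54 fractional digits of the exact decimal value of that double:
         * 0.333333333333333314829616256247390992939472198486328125 */
        printf("%.54f\n", third);
        return 0;
    }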

So, yes, there is a certain "squishiness" when trying to represent real numbers with finite-precision hardware.  Not all real numbers can be represented exactly with finite-precision numbers.  Even if we represent numbers internally as fractions (numerator, denominator), we are limited by machine precision to only a certain range of values for numerator and denominator, and certainly cannot represent pi or e or sqrt(2) exactly.



-----Original Message-----
From: K. Frank [mailto:kfrank29.c@xxxxxxxxx] 
Sent: Thursday, February 02, 2017 9:38 PM
To: MinGW Users List <mingw-users@xxxxxxxxxxxxxxxxxxxxx>
Subject: Re: [Mingw-users] msvcrt printf bug

Hello List (and Alexandru)!

I see that I'm coming late to the party, but I would like to chime in.

On Sat, Jan 14, 2017 at 9:02 AM,  <tei@xxxxxxxxx> wrote:
>
> Hello,
> I encountered a loss of precision printing a float with C printf.
> ...

There are a couple of common misconceptions running through this thread that I would like to correct.

Short story:  Floating-point numbers are well-defined, precise, accurate, exact mathematical objects.  Floating-point arithmetic (when held to the standard of the "laws of arithmetic") is not always exact.

Unfortunately, when you conflate these two issues and conclude that floating-point numbers are these mysterious, fuzzy, squishy things that can never under any circumstances be exact, you create the kind of floating-point FUD that runs through this thread, this email list, and the internet in general.

(This kind of FUD shows up in a number of forms.  You'll see, for example:  "Never test for equality between two floating-point numbers."  "Never test a floating-point number for equality with zero."  "Floating-point numbers are fuzzy and inexact, so equality of floating-point numbers is meaningless."  When you see this on the internet (and you will), don't believe it!  Now, it's true that when you have two floating-point numbers that are the results of two independent chains of floating-point calculations it generally won't make sense to test for exact equality, but there are many cases where it does.)
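
(To sketch one such case in C -- the function name here is mine, purely for illustration -- a test for exact equality with zero is perfectly well defined and perfectly sensible:

    /* Guard against division by exactly zero -- a legitimate
     * floating-point equality test. */
    double safe_reciprocal(double x)
    {
        return (x == 0.0) ? 0.0 : 1.0 / x;
    }

There is nothing fuzzy about this test: x either is the floating-point number zero, or it isn't.)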

Worse yet (as is typical with FUD), when people call out this FUD (on the internet, not just this list), they get attacked with scorn and ad hominem arguments.  So, to Alexandru Tei (the original poster) and Emanuel Falkenauer, stand your ground, you are right!  (And don't let the personal attacks get you down.)

Let me start with an analogy:

Consider the number 157/50.  It is a well-defined, precise, accurate, exact mathematical object.  It's a rational number, which makes it also a real number (but it's not, for example, an integer).  There is nothing fuzzy or imprecise or inaccurate about it.

It has as its decimal representation 3.14 (and is precisely equal to 3.14).  Now if we use it as an approximation to the real number pi, we find that it is an inexact value for pi -- it is only an approximation.

But the fact that 3.14 is not exactly equal to pi doesn't make 3.14 somehow squishy or inaccurate.  3.14 is exactly equal to 314 divided by 100 and is exactly equal to the average of 3.13 and 3.15.  It's just not exactly equal to pi (among many other things).

Now it is true that the vast majority of floating-point numbers running around in our computers are the results of performing floating-point arithmetic, and the large majority of these numbers are inexact approximations to the "correct" values (where by correct I mean the real-number results that would be obtained by performing real arithmetic on the floating-point operands).  And anybody performing substantive numerical calculations on a computer needs to understand this, and should be tutored in it if they don't.

(By the way, Alexandru asked nothing about floating-point calculations in his original post, and everything he has said in this thread indicates that he does understand how floating-point calculations work, so I have no reason to think that he needs to be tutored in the fact that floating-point arithmetic can be inexact.)

Alexandru asked about printing out floating-point numbers.  People have called this the "decimal" or "ASCII" representation of a floating-point number.  I will stick to calling it the "decimal representation," by which I will mean a decimal fraction, of potentially arbitrary precision, that approximates a given floating-point number.

In keeping with my point that floating-point numbers are well-defined, precise mathematical values, it is the case that every floating-point number is exactly equal to a single, specific decimal fraction.
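
Here is a minimal C sketch of that fact (assuming IEEE 754 binary64 doubles; the exact rational form is recovered with the standard frexp()/ldexp() functions):

    #include <stdio.h>
    #include <math.h>

    int main(void)
    {
        /* Every finite double is exactly mantissa * 2^exponent, so it is
         * exactly equal to a specific decimal fraction.  Recover that
         * exact rational form for the double nearest to 0.1. */
        int e;
        double frac = frexp(0.1, &e);               /* frac is in [0.5, 1) */
        long long num = (long long)ldexp(frac, 53); /* 53-bit integer numerator */

        printf("0.1 is stored as exactly %lld / 2^%d\n", num, 53 - e);
        return 0;
    }

(On an IEEE system this prints 7205759403792794 / 2^56 -- a perfectly specific rational number, which in turn is exactly equal to a specific, if long-winded, decimal fraction.)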

Alexandru complains that msvcrt doesn't use as its decimal representation of a floating-point number the decimal fraction that is exactly equal to it.
This is a perfectly legitimate complaint.

Now an implementation can use as its decimal representation of floating-point numbers whatever it wants -- it's a quality-of-implementation issue.
The implementation could always print out floating-point numbers with two significant decimal digits.  Or it could use ten significant digits, and add three to the last significant digit just for fun.  But there is a preferred, canonically distinguished decimal representation for floating-point numbers -- use the unique decimal fraction (or ASCII string or whatever you want to call it) that is exactly equal to the floating-point number.  "Exactly equal" -- what could be more canonical than that?

In fairness, I don't consider this to be a particularly important quality-of-implementation issue.  I do prefer that my implementations use the canonically distinguished decimal representation, but I don't care enough to have retrofitted mingw to do so (or to set the _XOPEN_SOURCE flag).  But it isn't hard to get this right (i.e., to use the canonically distinguished representation), and apparently both glibc and Embarcadero/Borland have done so.

I would argue that the glibc/Borland implementation is clearly better in this regard than that of msvcrt, and that there is no basis on which one could argue that the msvcrt implementation is better.  (Again, in fairness, Microsoft probably felt that it was a better use of a crack engineer's time to more smoothly animate a tool-tip fade-out than to implement the better decimal representation, and from a business perspective, they were probably right.  But it's not that hard, so they could have done both.)

On Mon, Jan 16, 2017 at 3:17 PM, Keith Marshall <km@xxxxxxxxxxxxxxxxx> wrote:
> On 16/01/17 16:51, Earnie wrote:
>> ...
> Regardless, it is a bug to emit more significant digits than the 
> underlying data format is capable of representing ... a bug by which 
> both glibc and our implementation are, sadly, afflicted; that the OP 
> attempts to attribute any significance whatsoever to those superfluous 
> digits is indicative of an all too common gap in knowledge ... garbage 
> is garbage, whatever form it may take.

Here Keith claims that the glibc/Borland implementation is actually a bug.  This is the kind of FUD we need to defend against.

I create a floating-point number however I choose, perhaps by twiddling bits.  (And perhaps not by performing a floating-point
operation.)  The number that I have created is perfectly well-defined and precise, and it is not a bug to be able to print out the decimal representation to which it is exactly equal.  An implementation that lets me print out the exactly correct decimal representation is better than an implementation that does not.
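
In the spirit of that bit twiddling, here is a minimal C sketch (it assumes the platform's double is IEEE 754 binary64, as on x86):

    #include <stdio.h>
    #include <stdint.h>
    #include <string.h>

    int main(void)
    {
        /* Build a double directly from its bit pattern -- no floating-point
         * operation is ever performed.  This pattern is the smallest double
         * greater than 1.0, i.e. 1 + 2^-52. */
        uint64_t bits = 0x3FF0000000000001ULL;
        double d;
        memcpy(&d, &bits, sizeof d);

        printf("%.17g\n", d);   /* 1.0000000000000002 */
        return 0;
    }

This number was never touched by floating-point arithmetic, and it is exactly equal to a specific decimal fraction; printing that fraction is accuracy, not garbage.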

The floating-point number that I created may well have a well-defined, precise -- and even useful -- mathematical meaning, Keith's assertion that "garbage is garbage" notwithstanding.

To reiterate this point ...

On Wed, Jan 18, 2017 at 4:39 PM, Keith Marshall <khall@xxxxxxxxxxx> wrote:
> ...
> On 18/01/17 10:00, tei@xxxxxxxxx wrote:
>> Emanuel, thank you very much for stepping in. I am extremely happy 
>> that you found my code useful.
>
> Great that he finds it useful; depressing that neither of you cares in 
> the slightest about accuracy; rather, you are both chasing the grail 
> of "consistent inaccuracy".

Representing a floating-point number with the decimal fraction (or ASCII string or whatever you want to call it) that is exactly, mathematically equal to that floating-point number is, quite simply, accuracy, rather than inaccuracy.

Granted, there are times when it may not be important or useful to do so, but there are times when it is.

> ...
>> I will use cygwin when I need a more accurate printf.
> ...
> Yes, I deliberately said "consistently inaccurate"; see, cygwin's
> printf() is ABSOLUTELY NOT more accurate than MinGW's, (or even 
> Microsoft's, probably, for that matter).  You keep stating these 
> (sadly all too widely accepted) myths:

Alexandru is right here, and is stating truths.  The printf() that emits the decimal fraction that is exactly, mathematically equal to the floating-point number being printed is in a very legitimate and substantive sense more accurate than the one that does not.

>> Every valid floating point representation that is not NaN or inf 
>> corresponds to an exact, non recurring fraction representation in 
>> decimal.
>
> In the general case, this is utter and absolute nonsense!

On the contrary, Alexandru is completely correct here.

(Note:  "utter and absolute nonsense"  <--  FUD alert!)

Alexandru's statement is sensible, relevant, and mathematically completely correct.

>> There is no reason why printf shouldn't print that exact 
>> representation when needed, as the glibc printf does.

Absolutely correct, Alexandru.  If I or Alexandru or Emanuel wants the exact representation it's a plus that the implementation provides it for us.

>
> Pragmatically, there is every reason.  For a binary representation 
> with N binary digits of precision, the equivalent REPRESENTABLE 
> decimal precision is limited to a MAXIMUM of N * log10(2) decimal 
> digits;

Again, you conflate the inaccuracy of some floating-point calculations with the individual floating-point numbers themselves.  Individual floating-point numbers have mathematically well-defined, precise, accurate values (and some of us want to print those values out).

Let me repeat this point in the context of a comment of Peter's:

On Thu, Jan 19, 2017 at 6:19 AM, Peter Rockett <sprocket@xxxxxxxxxxxxx> wrote:
> On 19/01/17 08:21, tei@xxxxxxxxx wrote:
> ...
> I suspect the OP's conceptual problem lies in viewing every float in 
> splendid isolation rather than as part of a computational system.

On the contrary, the conceptual problem underlying the FUD in this thread is conflation of the properties of the overall computational system with the individual floating-point numbers, and attributing to the individual floating-point numbers, which are well-defined and exact, the inexactness of some floating-point operations.

Floating-point numbers make perfect sense and are perfectly well-defined "in splendid isolation," and to assume that all floating-point numbers of legitimate interest are the results of inexact floating-point computations is simply wrong.

> ...
> Or another take: If you plot possible floating point representations 
> on a real number line, you will have gaps between the points. The OP 
> is trying print out numbers that fall in the gaps!

When I read Alexandru's original post, it appears to me that he is trying to print out individual, specific floating-point numbers.  That's his use case.  I see nothing to suggest that he is trying to print out values in the gaps.  (Alexandru clearly knows that floating-point numbers are discrete points on the real number line with gaps between them.)

On Sun, Jan 15, 2017 at 10:08 PM, KHMan <cainherb@xxxxxxxx> wrote:
> On 1/16/2017 8:56 AM, John Brown wrote:
>> ...
> I do not think there are canonical conversion algorithms that must 
> always be upheld, so I did not have an expectation that glibc must be 
> canonical.

There is a canonically distinguished conversion algorithm -- it's the one that produces the decimal representation that is mathematically equal to the floating-point number.  To repeat myself: Mathematical equality, what's more canonical than that?

But, of course, this algorithm does not need to be upheld.  I am quite sure that it is not required by either the C or C++ standard, and I am pretty sure that IEEE 754 is silent on this matter.  (I also don't think that this issue is that important.  But it is legitimate, and an implementation that does uphold this conversion algorithm is a better implementation.)

> The glibc result is one data point, msvcrt is also one data point.
> He claims to have his own float to string, but knowing digits of 
> precision limitations and the platform difference, why is he so 
> strident in knocking msvcrt? Curious. I won't score that, so we are 
> left with two data points running what are probably non-identical 
> algorithms.

But it's more than just data points.  We have a canonical representation.
From this thread, we have three data points -- msvcrt, glibc, and Borland (plus Alexandru's roll-your-own) -- and (apparently, as I haven't tested them myself) glibc and Borland (and Alexandru's) produce the canonical representation, while msvcrt doesn't, so msvcrt is not canonical and is also the odd man out.

> ...
> For that expectation we pretty much need everyone to be using the same 
> conversion algorithm.

Yes, and we probably won't have everyone using the same conversion algorithm (msvcrt, for example, goes its own way).  Well, that's what standards are for, and some things don't get standardized.  But if everyone were to use the same algorithm (for example, the canonical decimal representation), then this whole thread would be much simpler, and life would be easier for Emanuel.

There are some side comments I would like to make:

Two distinct, but related issues have been discussed.  The first is whether printf() should print out the exact decimal representation of a floating-point number (Alexandru), and the second is whether different implementations should print out the same representation (Emanuel).  Both are desirable goals, and if you get the first (for all implementations), you get the second.

My preference, of course, would be to have all implementations print out the exact representation (when asked to).  But you could, say, have printf() print out the (specific-floating-point-number-dependent)
minimum number of digits for which you get "round-trip consistency"
(i.e., floating-point number --> printf() --> ASCII --> scanf() --> back to the same floating-point number).  That would be reasonable, and would solve Emanuel's problem.  (Or you could print out the minimum number of "consistency" digits, and swap the last two digits just for fun.
That would be less reasonable, but would also solve Emanuel's problem.)
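
Here is a minimal C sketch of such a round-trip check (it assumes the runtime's conversions are correctly rounded -- which is, of course, exactly the quality-of-implementation issue under discussion):

    #include <stdio.h>

    int main(void)
    {
        double x = 0.1, y;
        char buf[32];

        /* 17 significant digits always suffice to round-trip an IEEE 754
         * binary64 value through text (9 suffice for binary32). */
        snprintf(buf, sizeof buf, "%.17g", x);
        sscanf(buf, "%lf", &y);

        printf("%s round-trips: %s\n", buf, (x == y) ? "yes" : "no");
        return 0;
    }

(And notice: (x == y) is another perfectly legitimate floating-point equality test.)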

My point is that a canonically distinguished representation exists, so, even if you only care about the second issue, it's easier to get various implementations to hew to that canonical representation, rather than to some well-defined, but semi-arbitrary representation that I (or someone else) might make up.

In response to Keith's claim that such a standard across implementations, even my "swap the last two digits" standard, would be chasing "consistent inaccuracy" (the canonical representation would, of course, be "consistent accuracy"), Emanuel has presented a perfectly logical and valid use case for it.  Sure, there are other ways Emanuel could achieve his goal of cross-checking the output of different builds, but this is a good one.

More importantly, all of us (including Emanuel) agree that he has no right to expect the floating-point results of his different builds to be exactly the same.  (Nothing in the C or C++ standard, nor in IEEE 754, requires this.)  However, he enjoys the happy accident that the floating-point results do agree, so it's unfortunate that printf() from mingw/msvcrt and Borland print out different values, and that he therefore has to go through additional work to cross-check his results.

That he can use Emanuel's code or set the _XOPEN_SOURCE flag to resolve this issue is a good thing, but the fact that he has to take this extra step is a minor negative.

Last, and quite tangential, Kein-Hong took a dig at Intel and the 8087:

On Fri, Jan 20, 2017 at 7:54 PM, KHMan <herbnine@xxxxxxxx> wrote:
> On 1/21/2017 6:18 AM, mingw@xxxxxxxxxxxxx wrote:
> ...
> AFAIK it is only with 8087 registers -- just about the only company 
> who did this was Intel. Didn't really worked out,

Actually, it worked out quite well for some important use cases, and kudos to Intel for doing this.

Often in numerical analysis you perform linear algebra on large systems.  Often with large systems round-off error accumulates excessively, and when the systems are ill-conditioned, the round-off error is further amplified.

Often, as part of the linear-algebra algorithm, you compute a sum of products, that is, the inner product of two vectors.  It turns out (and, if you're a numerical analyst, you can prove, given conditions on the linear systems involved), that you do not need to perform the entire calculation with higher precision to get dramatically more accurate results -- you can get most of the benefit just using higher precision to compute the inner products.  The inner-product computation (a chain of multiply-accumulates) fits trivially in the 8087 floating-point registers (without register spill), and the use of the 8087's 80-bit extended-precision just on these inner-product computations yields dramatically more accurate results in many cases.
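
Here is a minimal C sketch of the technique (the function name is mine; note that long double is the 80-bit extended format on x87 hardware, but on other targets it may be no wider than double, in which case this buys you nothing):

    #include <stddef.h>

    /* Inner product: keep the data in double, but accumulate in
     * long double (80-bit extended precision on x87). */
    double dot(const double *a, const double *b, size_t n)
    {
        long double acc = 0.0L;
        for (size_t i = 0; i < n; ++i)
            acc += (long double)a[i] * (long double)b[i];
        return (double)acc;
    }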

There are various engineering considerations for not using extended-precision registers (some legitimate, some less so), but Intel, for whatever reason, decided that they wanted to do floating-point right, they hired Kahan to help them, and we're all the better for it.

Look, I understand the frustration felt by many commenters here, particularly that expressed by Keith and Kein-Hong.  Stack Overflow and forums and bulletin boards and chat rooms (and even classrooms, where they use things like, you know, "blackboards" and "chalk") are filled with naive confusion about floating-point arithmetic, with questions of the sort "I did x = y / z, and w = x * z, and y and w don't test equal.  How can this be?  Mercy!  It must be a compiler bug!"  And now you have to tutor another generation of novice programmers in how floating-point arithmetic works.  It gets old.

But it's counter-productive to tutor them with misinformation.  Floating-point numbers are what they are and are mathematically perfectly well defined.
Floating-point arithmetic is inexact when understood as an approximation to real-number arithmetic.  Let's tutor the next generation in what actually
happens: Two perfectly well-defined floating-point operands go into a floating-point operation, and out comes a perfectly well-defined (if you're using a well-defined standard such as IEEE 754) floating-point result that, in general, is not equal to the real-number result you would have obtained if you had used real-number arithmetic on the floating-point operands.

But, to repeat Emanuel's comment, "it's not as if a FPU had a Schrödinger cat embedded!"  The floating-point operation doesn't sometimes give you one (inherently fuzzy) floating-point result and sometimes another (inherently
fuzzy) result.  It gives you a single, consistent, perfectly well-defined (if you're using a well-defined standard) result that makes perfectly good mathematical sense.  Furthermore, it's also not as if the data bus connecting memory to the FPU has an embedded Schrödinger cat, and that these (squishy, fuzzy, inexact) floating-point numbers get fuzzed up somehow by cat hair as they travel around inside our computers.

The way floating-point numbers -- and floating-point arithmetic -- really work is completely precise and mathematically well defined, even if it's subtle and complex -- and different from real-number arithmetic.  And how it really works -- not FUD -- is what we need to help the next generation of people doing substantive numerical calculations learn.


Happy Floating-Point Hacking!


K. Frank
