Re: [Mingw-users] msvcrt printf bug




Hi K.Frank,

Wow, you went to some pretty great lengths defending Alexandru's (and my 
own) arguments! Well done (and thanks): it's exactly what I (and, I'm 
sure, Alexandru) meant.

> So, to Alexandru Tei (the
> original poster) and Emanuel Falkenauer, stand your ground, you
> are right!  (And don't let the personal attacks get you down.)

Indeed: I already told Alexandru to take it easy and keep up the good 
work he's doing, the unfortunate (and unfair) personal attacks 
notwithstanding.  ;-)  As for me... well, I've had so many of those in 
the past already that I guess I've seen it all... and I've clearly 
developed a serious rhino skin.  :-)

The only thing I'm still puzzled about is the 80-bit Intel FPU registers: 
are they (as you seem to imply) or are they not (as KHMan asserts) 
actually used these days -- in, say, the Xeons I'm on? I.e., could our 
programs actually run differently on, say, AMDs?


Thanks & "Happy Floating-Point Hacking" as you say,

Emanuel


On 03-Feb-17 04:38, K. Frank wrote:
> Hello List (and Alexandru)!
>
> I see that I'm coming late to the party, but I would like to chime in.
>
> On Sat, Jan 14, 2017 at 9:02 AM,  <tei@xxxxxxxxx> wrote:
>> Hello,
>> I encountered a loss of precision printing a float with C printf.
>> ...
> There are a couple of common misconceptions running through
> this thread that I would like to correct.
>
> Short story:  Floating-point numbers are well-defined, precise,
> accurate, exact mathematical objects.  Floating-point arithmetic
> (when held to the standard of the "laws of arithmetic") is not
> always exact.
>
> Unfortunately, when you conflate these two issues and conclude
> that floating-point numbers are these mysterious, fuzzy, squishy
> things that can never under any circumstances be exact, you create
> the kind of floating-point FUD that runs through this thread, this
> email list, and the internet in general.
>
> (This kind of FUD shows up in a number of forms.  You'll see,
> for example:  "Never test for equality between two floating-point
> numbers."  "Never test a floating-point number for equality with
> zero."  "Floating-point numbers are fuzzy and inexact, so equality
> of floating-point numbers is meaningless."  When you see this on
> the internet (and you will), don't believe it!  Now, it's true that
> when you have two floating-point numbers that are the results
> of two independent chains of floating-point calculations it
> generally won't make sense to test for exact equality, but
> there are many cases where it does.)
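>
> (To make this concrete -- a little sketch of my own, not anything from
> Alexandru's post -- here is a case where testing floating-point equality
> is perfectly sound, because every value involved is exactly representable
> in binary:
>
>     #include <stdio.h>
>
>     int main(void)
>     {
>         double a = 0.5;    /* exactly representable: 2^-1 */
>         double b = 0.25;   /* exactly representable: 2^-2 */
>
>         /* 0.5 + 0.25 is computed exactly, so this test is reliable */
>         if (a + b == 0.75)
>             printf("0.5 + 0.25 == 0.75, exactly\n");
>
>         /* testing against an explicitly assigned zero is fine, too */
>         double sum = 0.0;
>         if (sum == 0.0)
>             printf("an explicitly zeroed double equals 0.0\n");
>
>         return 0;
>     }
>
> Both tests succeed, every time, on every IEEE 754 system.)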
>
> Worse yet (as is typical with FUD), when people call out this
> FUD (on the internet, not just this list), they get attacked
> with scorn and ad hominem arguments.  So, to Alexandru Tei (the
> original poster) and Emanuel Falkenauer, stand your ground, you
> are right!  (And don't let the personal attacks get you down.)
>
> Let me start with an analogy:
>
> Consider the number 157/50.  It is a well-defined, precise, accurate,
> exact mathematical object.  It's a rational number, which makes it
> also a real number (but it's not, for example, an integer).  There
> is nothing fuzzy or imprecise or inaccurate about it.
>
> It has as its decimal representation 3.14 (and is precisely equal to
> 3.14).  Now if we use it as an approximation to the real number pi,
> we find that it is an inexact value for pi -- it is only an approximation.
>
> But the fact that 3.14 is not exactly equal to pi doesn't make 3.14
> somehow squishy or inaccurate.  3.14 is exactly equal to 314 divided
> by 100 and is exactly equal to the average of 3.13 and 3.15.  It's
> just not exactly equal to pi (among many other things).
>
> Now it is true that the vast majority of floating-point numbers
> running around in our computers are the results of performing
> floating-point arithmetic, and the large majority of these numbers
> are inexact approximations to the "correct" values (where by correct
> I mean the real-number results that would be obtained by performing
> real arithmetic on the floating-point operands).  And anybody
> performing substantive numerical calculations on a computer needs
> to understand this, and should be tutored in it if they don't.
>
> (By the way, Alexandru asked nothing about floating-point calculations
> in his original post, and everything he has said in this thread indicates
> that he does understand how floating-point calculations work, so I have no
> reason to think that he needs to be tutored in the fact that floating-point
> arithmetic can be inexact.)
>
> Alexandru asked about printing out floating-point numbers.  People
> have called this the "decimal" or "ASCII" representation of a
> floating-point number.  I will stick to calling it the "decimal
> representation," by which I will mean a decimal fraction, of
> potentially arbitrary precision, that approximates a given
> floating-point number.
>
> In keeping with my point that floating-point numbers are well-defined,
> precise mathematical values, it is the case that every floating-point
> number is exactly equal to a single, specific decimal fraction.
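>
> (A small sketch of my own to make this concrete: the double closest to
> 0.1 is exactly equal to a specific 55-digit decimal fraction, and an
> implementation like glibc that emits the exact representation will show
> it when asked for enough digits:
>
>     #include <stdio.h>
>
>     int main(void)
>     {
>         /* the IEEE 754 double nearest to 0.1 is exactly
>            3602879701896397 / 2^55                        */
>         printf("%.55f\n", 0.1);
>         /* glibc prints:
>            0.1000000000000000055511151231257827021181583404541015625 */
>         return 0;
>     }
>
> Nothing fuzzy about it: that 55-digit decimal fraction is exactly,
> mathematically equal to the double in question.)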
>
> Alexandru complains that msvcrt doesn't use as its decimal representation
> of a floating-point number the decimal fraction that is exactly equal to it.
> This is a perfectly legitimate complaint.
>
> Now an implementation can use as its decimal representation of floating-point
> numbers whatever it wants -- it's a quality-of-implementation issue.
> The implementation could always print out floating-point numbers with
> two significant decimal digits.  Or it could use ten significant digits,
> and add three to the last significant digit just for fun.  But there is
> a preferred, canonically distinguished decimal representation for floating-point
> numbers -- use the unique decimal fraction (or ASCII string or whatever you
> want to call it) that is exactly equal to the floating-point number.  "Exactly
> equal" -- what could get more canonical than that?
>
> In fairness, I don't consider this to be a particularly important
> quality-of-implementation issue.  I do prefer that my implementations
> use the canonically distinguished decimal representation, but I don't care
> enough to have retrofitted mingw to do so (or to set the _XOPEN_SOURCE
> flag).  But it isn't hard to get this right (i.e., to use the canonically
> distinguished representation), and apparently both glibc and Embarcadero/Borland
> have done so.
>
> I would argue that the glibc/Borland implementation is clearly better in
> this regard than that of msvcrt, and that there is no basis on which one
> could argue that the msvcrt implementation is better.  (Again, in fairness,
> Microsoft probably felt that it was a better use of a crack engineer's
> time to more smoothly animate a tool-tip fade-out than to implement the
> better decimal representation, and from a business perspective, they
> were probably right.  But it's not that hard, so they could have done
> both.)
>
> On Mon, Jan 16, 2017 at 3:17 PM, Keith Marshall <km@xxxxxxxxxxxxxxxxx> wrote:
>> On 16/01/17 16:51, Earnie wrote:
>>> ...
>> Regardless, it is a bug to emit more significant digits than the
>> underlying data format is capable of representing ... a bug by
>> which both glibc and our implementation are, sadly, afflicted;
>> that the OP attempts to attribute any significance whatsoever to
>> those superfluous digits is indicative of an all too common gap
>> in knowledge ... garbage is garbage, whatever form it may take.
> Here Keith claims that the glibc/Borland implementation is actually
> a bug.  This is the kind of FUD we need to defend against.
>
> I create a floating-point number however I choose, perhaps by
> twiddling bits.  (And perhaps not by performing a floating-point
> operation.)  The number that I have created is perfectly well-defined
> and precise, and it is not a bug to be able to print out the decimal
> representation to which it is exactly equal.  An implementation that
> lets me print out the exactly correct decimal representation is better
> than an implementation that does not.
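>
> (If you want to see what I mean by "twiddling bits", here is a sketch --
> the bit pattern is just an arbitrary illustration of mine:
>
>     #include <stdio.h>
>     #include <stdint.h>
>     #include <string.h>
>
>     int main(void)
>     {
>         /* build a double directly from a bit pattern, without
>            performing any floating-point operation at all; this
>            particular pattern happens to be the double nearest 0.1 */
>         uint64_t bits = UINT64_C(0x3FB999999999999A);
>         double d;
>         memcpy(&d, &bits, sizeof d);
>
>         /* d is a perfectly well-defined number, and printing the
>            decimal fraction exactly equal to it is meaningful      */
>         printf("%.55f\n", d);
>         return 0;
>     }
>
> No arithmetic was performed, so no rounding error is even possible here.)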
>
> The floating-point number that I created may well have a well-defined,
> precise -- and even useful -- mathematical meaning, Keith's assertion
> that "garbage is garbage" notwithstanding.
>
> To reiterate this point ...
>
> On Wed, Jan 18, 2017 at 4:39 PM, Keith Marshall <khall@xxxxxxxxxxx> wrote:
>> ...
>> On 18/01/17 10:00, tei@xxxxxxxxx wrote:
>>> Emanuel, thank you very much for stepping in. I am extremely happy
>>> that you found my code useful.
>> Great that he finds it useful; depressing that neither of you cares
>> in the slightest about accuracy; rather, you are both chasing the
>> grail of "consistent inaccuracy".
> Representing a floating-point number with the decimal fraction
> (or ASCII string or whatever you want to call it) that is exactly,
> mathematically equal to that floating-point number is, quite
> simply, accuracy, rather than inaccuracy.
>
> Granted, there are times when it may not be important or useful
> to do so, but there are times when it is.
>
>> ...
>>> I will use cygwin when I need a more accurate printf.
>> ...
>> Yes, I deliberately said "consistently inaccurate"; see, cygwin's
>> printf() is ABSOLUTELY NOT more accurate than MinGW's, (or even
>> Microsoft's, probably, for that matter).  You keep stating these
>> (sadly all too widely accepted) myths:
> Alexandru is right here, and is stating truths.  The printf() that
> emits the decimal fraction that is exactly, mathematically equal
> to the floating-point number being printed is in a very legitimate
> and substantive sense more accurate than the one that does not.
>
>>> Every valid floating point representation that is not NaN or inf
>>> corresponds to an exact, non recurring fraction representation in
>>> decimal.
>> In the general case, this is utter and absolute nonsense!
> On the contrary, Alexandru is completely correct here.
>
> (Note:  "utter and absolute nonsense"  <--  FUD alert!)
>
> Alexandru's statement is sensible, relevant, and mathematically
> completely correct: every finite binary fraction m / 2^k is exactly
> equal to (m * 5^k) / 10^k, a finite -- and hence non-recurring --
> decimal fraction.
>
>>> There is no reason why printf shouldn't print that exact
>>> representation when needed, as the glibc printf does.
> Absolutely correct, Alexandru.  If I or Alexandru or Emanuel wants
> the exact representation it's a plus that the implementation provides
> it for us.
>
>> Pragmatically, there is every reason.  For a binary representation
>> with N binary digits of precision, the equivalent REPRESENTABLE
>> decimal precision is limited to a MAXIMUM of N * log10(2) decimal
>> digits;
> Again, you conflate the inaccuracy of some floating-point calculations
> with individual floating point numbers themselves.  Individual
> floating-point numbers have mathematically well-defined, precise,
> accurate values (and some of us want to print those values out).
>
> Let me repeat this point in the context of a comment of Peter's:
>
> On Thu, Jan 19, 2017 at 6:19 AM, Peter Rockett <sprocket@xxxxxxxxxxxxx> wrote:
>> On 19/01/17 08:21, tei@xxxxxxxxx wrote:
>> ...
>> I suspect the OP's conceptual problem lies in viewing every float in
>> splendid isolation rather than as part of a computational system.
> On the contrary, the conceptual problem underlying the FUD in this
> thread is conflation of the properties of the overall computational
> system with the individual floating-point numbers, and attributing
> to the individual floating-point numbers, which are well-defined and
> exact, the inexactness of some floating-point operations.
>
> Floating-point numbers make perfect sense and are perfectly well-defined
> "in splendid isolation" and to assume that all floating-point numbers of
> legitimate interest are the results of inexact floating-point computations
> is simply wrong.
>
>> ...
>> Or another take: If you plot possible floating point representations on a
>> real number line, you will have gaps between the points. The OP is trying
>> print out numbers that fall in the gaps!
> When I read Alexandru's original post, it appears to me that he is trying
> to print out individual, specific floating-point numbers.  That's his use
> case.  I see nothing to suggest that he is trying to print out values in
> the gaps.  (Alexandru clearly knows that floating-point numbers are discrete
> points on the real number line with gaps between them.)
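>
> (For any reader who would like to see those gaps, here is a quick sketch
> using nextafter() from <math.h>:
>
>     #include <stdio.h>
>     #include <math.h>
>
>     int main(void)
>     {
>         double x    = 1.0;
>         double next = nextafter(x, 2.0);  /* adjacent double above 1.0 */
>
>         /* no double exists strictly between x and next; the gap
>            at 1.0 is 2^-52, i.e. DBL_EPSILON                      */
>         printf("gap at 1.0 = %.17g\n", next - x);
>         return 0;
>     }
>
> It prints 2.2204460492503131e-16.)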
>
> On Sun, Jan 15, 2017 at 10:08 PM, KHMan <cainherb@xxxxxxxx> wrote:
>> On 1/16/2017 8:56 AM, John Brown wrote:
>>> ...
>> I do not think there are canonical conversion algorithms that must
>> always be upheld, so I did not have an expectation that glibc must
>> be canonical.
> There is a canonically distinguished conversion algorithm -- it's the
> one that produces the decimal representation that is mathematically
> equal to the floating-point number.  To repeat myself: Mathematical
> equality, what's more canonical than that?
>
> But, of course, this algorithm does not need to be upheld.  I am quite
> sure that it is not required by either the C or C++ standard, and I
> am pretty sure that IEEE 754 is silent on this matter.  (I also don't
> think that this issue is that important.  But it is legitimate, and
> an implementation that does uphold this conversion algorithm is a
> better implementation.)
>
>> The glibc result is one data point, msvcrt is also one data point.
>> He claims to have his own float to string, but knowing digits of
>> precision limitations and the platform difference, why is he so
>> strident in knocking msvcrt? Curious. I won't score that, so we
>> are left with two data points running what are probably
>> non-identical algorithms.
> But it's more than just data points.  We have a canonical representation.
> From this thread, we have three data points -- msvcrt, glibc, and
> Borland (plus Alexandru's roll-your-own) -- and (apparently, as I
> haven't tested them myself) glibc and Borland (and Alexandru's)
> produce the canonical representation, while msvcrt doesn't, so
> msvcrt is not canonical and is also the odd man out.
>
>> ...
>> For that expectation we pretty much need everyone to be using the same
>> conversion algorithm.
> Yes, and we probably won't have everyone using the same conversion
> algorithm (msvcrt being the holdout here).  Well, that's what standards
> are for, and some things don't get standardized.  But if everyone were to use
> the same algorithm (for example, the canonical decimal representation),
> then this whole thread would be much simpler, and life would be easier
> for Emanuel.
>
> There are some side comments I would like to make:
>
> Two distinct but related issues have been discussed.  The first is
> whether printf() should print out the exact decimal representation
> of a floating-point number (Alexandru), and the second is whether
> different implementations should print out the same representation
> (Emanuel).  Both are desirable goals, and if you get the first (for
> all implementations), you get the second.
>
> My preference, of course, would be to have all implementations print
> out the exact representation (when asked to).  But you could, say,
> have printf() print out the (specific-floating-point-number-dependent)
> minimum number of digits for which you get "round-trip consistency"
> (i.e., floating-point number --> printf() --> ASCII --> scanf() -->
> back to the same floating-point number).  That would be reasonable,
> and would solve Emanuel's problem.  (Or you could print out the minimum
> number of "consistency" digits, and swap the last two digits just for fun.
> That would be less reasonable, but would also solve Emanuel's problem.)
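>
> (A sketch of that round-trip idea: for IEEE 754 doubles, 17 significant
> digits always suffice to recover the original bits, so "%.17g" gives
> round-trip consistency -- and note that the exact-equality test at the
> end is entirely legitimate here:
>
>     #include <stdio.h>
>     #include <stdlib.h>
>
>     int main(void)
>     {
>         double x = 1.0 / 3.0;
>         char   buf[64];
>
>         /* double --> text, with 17 significant digits */
>         snprintf(buf, sizeof buf, "%.17g", x);
>
>         /* text --> double */
>         double y = strtod(buf, NULL);
>
>         printf("%s round-trips %s\n", buf, (x == y) ? "exactly" : "NOT");
>         return 0;
>     }
>
> Every finite double survives this round trip bit-for-bit.)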
>
> My point is that a canonically distinguished representation exists,
> so, even if you only care about the second issue, it's easier to
> get various implementations to hew to that canonical representation,
> rather than to some well-defined, but semi-arbitrary representation
> that I (or someone else) might make up.
>
> In retort to Keith's claim that such a standard across implementations,
> such as my "swap the last two digits" standard, would be chasing "consistent
> inaccuracy" (of course, the canonical representation would be "consistent
> accuracy"), Emanuel has presented a perfectly logical and valid use case
> for this.  Sure, there are other ways Emanuel could achieve his goal of
> cross-checking the output of different builds, but this is a good one.
>
> More importantly, all of us (including Emanuel) agree that he has no
> right to expect the floating-point results of his different builds to
> be exactly the same.  (Nothing in the C or C++ standard, nor in
> IEEE 754 requires this.)  However, he enjoys the happy accident that
> the floating-point results do agree, so it's unfortunate that printf()
> from mingw/msvcrt and Borland print out different values, and that he
> therefore has to go through additional work to cross-check his results.
>
> That he can use Alexandru's code or set the _XOPEN_SOURCE flag to
> resolve this issue is a good thing, but the fact that he has to take this
> extra step is a minor negative.
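>
> (For anyone following along at home, the _XOPEN_SOURCE route mentioned
> above would, as I understand it, look something like this -- the define
> has to appear before any system header, so that MinGW substitutes its
> own ANSI-conforming printf family for msvcrt's:
>
>     /* must precede ANY system header */
>     #define _XOPEN_SOURCE 700
>
>     #include <stdio.h>
>
>     int main(void)
>     {
>         /* with MinGW's own printf selected, this should print the
>            exact decimal representation, as glibc does              */
>         printf("%.55f\n", 0.1);
>         return 0;
>     }
>
> I haven't tested this myself, so treat it as a sketch.)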
>
> Last, and quite tangentially, Kein-Hong took a dig at Intel and the 8087:
>
> On Fri, Jan 20, 2017 at 7:54 PM, KHMan <herbnine@xxxxxxxx> wrote:
>> On 1/21/2017 6:18 AM, mingw@xxxxxxxxxxxxx wrote:
>> ...
>> AFAIK it is only with 8087 registers -- just about the only
>> company who did this was Intel. Didn't really worked out,
> Actually, it worked out quite well for some important use cases,
> and kudos to Intel for doing this.
>
> Often in numerical analysis you perform linear algebra on large
> systems.  Often with large systems round-off error accumulates
> excessively, and when the systems are ill-conditioned, the round-off
> error is further amplified.
>
> Often, as part of the linear-algebra algorithm, you compute a sum
> of products, that is, the inner product of two vectors.  It turns
> out (and, if you're a numerical analyst, you can prove, given conditions
> on the linear systems involved), that you do not need to perform the
> entire calculation with higher precision to get dramatically more
> accurate results -- you can get most of the benefit just using higher
> precision to compute the inner products.  The inner-product computation
> (a chain of multiply-accumulates) fits trivially in the 8087 floating-point
> registers (without register spill), and using the 8087's 80-bit
> extended precision just for these inner-product computations yields
> dramatically more accurate results in many cases.
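>
> (A sketch of the technique: keep the data in ordinary doubles, but
> accumulate the inner product in long double, which with gcc on x86
> maps to the 8087's 80-bit extended format:
>
>     #include <stdio.h>
>
>     /* inner product of two double vectors, accumulated in extended
>        precision; only the running sum is held at higher precision,
>        which is where most of the accuracy benefit comes from       */
>     static double dot(const double *a, const double *b, int n)
>     {
>         long double acc = 0.0L;
>         for (int i = 0; i < n; ++i)
>             acc += (long double)a[i] * b[i];
>         return (double)acc;   /* round once, at the very end */
>     }
>
>     int main(void)
>     {
>         double a[] = { 1e16, 1.0, -1e16 };
>         double b[] = { 1.0,  1.0,  1.0  };
>
>         /* double-only accumulation would lose the 1.0 entirely;
>            the extended-precision accumulator preserves it        */
>         printf("%g\n", dot(a, b, 3));   /* prints 1 */
>         return 0;
>     }
>
> The example is contrived, of course, but the effect on real
> ill-conditioned systems is exactly the one described above.)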
>
> There are various engineering considerations for not using extended-precision
> registers (some legitimate, some less so), but Intel, for whatever reason,
> decided that they wanted to do floating-point right, they hired Kahan to
> help them, and we're all the better for it.
>
> Look, I understand the frustration felt by many commenters here, particularly
> that expressed by Keith and Kein-Hong.  Stack Overflow and forums and bulletin
> boards and chat rooms (and even classrooms, where they use things like, you
> know, "blackboards" and "chalk") are filled with naive confusion about
> floating-point arithmetic, with questions of the sort "I did x = y / z, and
> w = x * z, and y and w don't test equal.  How can this be?  Mercy!  It must
> be a compiler bug!"  And now you have to tutor another generation of novice
> programmers in how floating-point arithmetic works.  It gets old.
>
> But it's counter-productive to tutor them with misinformation.  Floating-point
> numbers are what they are and are mathematically perfectly well defined.
> Floating-point arithmetic is inexact when understood as an approximation
> to real-number arithmetic.  Let's tutor the next generation in what actually
> happens: Two perfectly well-defined floating-point operands go into a
> floating-point operation, and out comes a perfectly well-defined (if you're
> using a well-defined standard such as IEEE 754) floating-point result that
> in general, is not equal to the real-number result you would have obtained
> if you had used real-number arithmetic on the floating-point operands.
>
> But, to repeat Emanuel's comment, "it's not as if a FPU had a Schrödinger cat
> embedded!"  The floating-point operation doesn't sometimes give you one
> (inherently fuzzy) floating-point result and sometimes another (inherently
> fuzzy) result.  It gives you a single, consistent, perfectly well-defined
> (if you're using a well-defined standard) result that makes perfectly good
> mathematical sense.  Furthermore, it's also not as if the data bus connecting
> memory to the FPU has an embedded Schrödinger cat, and that these (squishy,
> fuzzy, inexact) floating-point numbers get fuzzed up somehow by cat hair as
> they travel around inside our computers.
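>
> (To make that point concrete with the classic example -- a sketch of my
> own:
>
>     #include <stdio.h>
>
>     int main(void)
>     {
>         double w = 0.1 + 0.2;
>
>         /* inexact relative to real arithmetic, but the same
>            well-defined IEEE 754 result every single time      */
>         printf("0.1 + 0.2 == 0.3 ?  %s\n", (w == 0.3) ? "yes" : "no");
>         printf("%.17g\n", w);
>         return 0;
>     }
>
> This prints "no" and 0.30000000000000004 -- surprising to a novice, but
> deterministic, reproducible, and fully specified by IEEE 754.  No cat.)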
>
> The way floating-point numbers -- and floating-point arithmetic -- really work
> is completely precise and mathematically well defined, even if it's subtle
> and complex -- and different from real-number arithmetic.  And how it really
> works -- not FUD -- is what we need to help the next generation of people
> doing substantive numerical calculations learn.
>
>
> Happy Floating-Point Hacking!
>
>
> K. Frank
>

