Re: textmode for stdout, what is "correct" now?

On Feb 18 12:47, Michael Haubenwallner wrote:
> On 2/18/19 11:26 AM, Corinna Vinschen wrote:
> > On Feb 18 10:40, Michael Haubenwallner wrote:
> >> On 2/16/19 6:43 PM, Corinna Vinschen wrote:
> >>> I really don't see the problem you're trying to solve here.  Why should
> >>> an application setting O_BINARY explicitly revert this decision on the
> >>> same file descriptor?  That doesn't make sense.
> >>
> >> Well, it's not necessarily about really switching binary mode on and off,
> >> it's more about avoiding breakage when applications try to intuitively
> >> follow the original API, even if that actually causes the call to
> >> setmode(fd, O_TEXT) to be redundant.
> >>
> >> OTOH, this question also would apply to native Win32 applications, so why
> >> do people call setmode(fd, O_TEXT) with any DOS based platform at all?
> >>
> >> IMO, unfortunately we're not in a position to modify the intention of the
> >> original API.  And finally, I do want to stop discussions like this one
> >> with application developers like openssl, as soon as we can argue like:
> >> "Cygwin does not use \r internally, but does support text mode mounts,
> >> so we had to invent the Cygwin text mode, which may or may not use \r.
> >> Hence you get the Cygwin text mode with O_TEXT, and if you really are
> >> in some 'unix2dos' position, please use the new O_DOSTEXT mode instead."
> >>
> >> However, agreed this does not seem to be trivial to implement.  Yet I
> >> will look into it when there is a chance for patches to be accepted.
> > 
> > Bottom line:
> > 
> > - Make O_TEXT equivalent to O_BINARY on the API level so Cygwin
> >   actually uses binary mode on open(O_TEXT) and setmode(O_TEXT).
> No, O_TEXT is neither equal to O_BINARY nor to O_DOSTEXT - it's something
> in between.  My first ideas are either (O_BINARY|O_DOSTEXT) or another bit.
> > - Make O_DOSTEXT equivalent to the former O_TEXT.
> Yes.
> > Result: we use binary mode even with tools explicitly specifying O_TEXT.
> No, not binary mode. It's text mode with \r being allowed rather than forced.

You lost me here.  Reading in O_TEXT mode already does not require \r\n,
it just allows \r\n as well as \n as line ending.  Given that, I don't
see a reason to add O_DOSTEXT.  What would it do differently?  Enforcing
\r\n line endings in input?
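
To make the point about input concrete, here is a minimal sketch of what
"allows \r\n as well as \n" means for a reader.  text_read_filter is a
hypothetical helper written for illustration, not a Cygwin function, and
it assumes that a \r which is not followed by \n is passed through
unchanged.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of Cygwin: emulate a text-mode read by
   collapsing every "\r\n" in src[0..len) to "\n".  All other bytes,
   including a lone '\r', are copied unchanged (an assumption of this
   sketch).  dst must have room for len bytes.  Returns bytes written. */
size_t text_read_filter(const char *src, size_t len, char *dst)
{
    size_t out = 0;

    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < len; i++) {
        if (src[i] == '\r' && i + 1 < len && src[i + 1] == '\n')
            continue;   /* drop the '\r'; the '\n' is copied next round */
        dst[out++] = src[i];
    }
    return out;
}
```

Input with mixed line endings comes out with \n only, which is why a
separate input mode that insists on \r\n would add nothing for readers.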

> Just stumbled over the distinction between readmode and writemode:
> What's up with that?

automode.o and textreadmode.o are just conveniences.  If you link an
application with one of them, descriptors the app opens O_RDONLY are
automatically in O_TEXT mode, and descriptors opened O_WRONLY are
automatically O_BINARY (automode.o) or follow the mount mode
(textreadmode.o).  You can't mix O_TEXT and O_BINARY on O_RDWR descriptors.
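
As a rough model of the defaults just described, the selection could be
written as below.  The enum names and default_mode are made up for this
sketch and are not the code in the *mode.c files; anything other than
O_RDONLY or O_WRONLY is reported as not covered, since the description
above only covers those two cases.

```c
#include <fcntl.h>

/* Hypothetical names for illustration only. */
enum link_variant { LINK_AUTOMODE, LINK_TEXTREADMODE };
enum fd_mode { FD_TEXT, FD_BINARY, FD_MOUNT_DEFAULT, FD_NOT_COVERED };

/* Pick the default translation mode from the access mode in open_flags. */
enum fd_mode default_mode(int open_flags, enum link_variant variant)
{
    switch (open_flags & O_ACCMODE) {
    case O_RDONLY:
        return FD_TEXT;                 /* both variants read in text mode */
    case O_WRONLY:
        return variant == LINK_AUTOMODE ? FD_BINARY : FD_MOUNT_DEFAULT;
    default:
        return FD_NOT_COVERED;          /* e.g. O_RDWR */
    }
}
```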

> Unless binary mode, reading always could be done in dostext mode.
> Here the default is to link without them, and the opposite of binmode.c
> is to not use anything, hence the text*mode should be O_DOSTEXT.

I fail to parse this paragraph, sorry.  Are you still talking about
the *mode.c files?

> > - How do you avoid breakage of existing tools which have been written to
> >   work explicitly with certain DOS formatted text files and use O_TEXT
> >   for that?
> > 
> > The answer to the last one could be using a new version check like the
> > ones already in include/cygwin/version.h.  Existing tools and libs keep
> > the current behaviour.  Only newly built binaries get the new behaviour.
> Exactly. And for the check:
> For dostext mode: ifdef O_DOSTEXT: use O_DOSTEXT, otherwise use O_TEXT.
> For cygtext mode: ifdef O_DOSTEXT: use O_TEXT, otherwise avoid setmode.
> > However, this still may result in breakage if the developer isn't aware
> > of this subtle change.  As much as I hate O_TEXT mode, there's a
> > pretty basic expectation how this is supposed to work.
> Yes, but I do expect this in corner cases only, with unix2dos/dos2unix as
> the specific example.
> OTOH, with setmode(fd, 0) coming to my mind: If that would denote the default
> (=cygwin text) mode, I can imagine we may convince (openssl) developers to use
> zero instead of O_TEXT, and everything could be fine without any Cygwin change.
> Heck, this would feel like the most obvious choice - even API wise, no?
> Then we may want to add O_NOBINARY defined to zero as the only Cygwin change.
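
For reference, the compile-time check proposed in the quoted text would
look roughly like this in application code.  O_DOSTEXT exists only as a
proposal in this thread and no header defines it; APP_NO_SETMODE and the
two function names are invented for the sketch, and the fallback for
platforms without O_TEXT is an addition of the sketch.

```c
#include <fcntl.h>

/* Hypothetical sentinel: "do not call setmode() at all". */
#define APP_NO_SETMODE (-1)

/* Mode an application would request when it really wants \r\n output. */
int app_dostext_flag(void)
{
#if defined(O_DOSTEXT)          /* proposed flag, not defined anywhere today */
    return O_DOSTEXT;
#elif defined(O_TEXT)
    return O_TEXT;
#else
    return APP_NO_SETMODE;      /* platform has no text mode */
#endif
}

/* Mode an application would request for "Cygwin text mode". */
int app_cygtext_flag(void)
{
#if defined(O_DOSTEXT)
    return O_TEXT;              /* O_TEXT would then mean Cygwin text mode */
#else
    return APP_NO_SETMODE;      /* current headers: leave the mode alone */
#endif
}
```

With current headers the second function always takes the "avoid
setmode" branch, so every upstream project would have to carry a
conditional like this for the scheme to work.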

No, wait.  This is getting a bit out of hand.  The fact that we have to
handle two different read modes in Cygwin is already bad enough.  I'm
not really looking forward to adding another read mode for which I don't
see an obvious need.  You don't really expect lots of upstream devs to
happily pick up on such a change with two new O_ open flags *just* for
Cygwin, do you?

You have two modes for input and three for output:

- input O_BINARY  -> only \n
- input O_TEXT    -> \n and \r\n

- output 0        -> generates \n or \r\n depending on mount mode
- output O_BINARY -> generates \n
- output O_TEXT   -> generates \r\n

I don't see how O_DOSTEXT comes into this picture.  There's no reason
for an input mode that enforces \r\n, and in output mode it wouldn't
differ from O_TEXT, unless you define O_TEXT to be the same as O_BINARY
in future.  We can also already control per-app default settings by
linking apps against one of the *mode.o files.
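
The three output cases from the list above can be modelled in a few
lines.  This is only a sketch: write_filter and enum io_mode are
hypothetical names, mount_is_text stands in for the mount mode, and the
sketch simply puts a \r in front of every \n without checking whether
one is already there.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for illustration; MODE_DEFAULT corresponds to 0. */
enum io_mode { MODE_DEFAULT, MODE_BINARY, MODE_TEXT };

/* Emulate output translation.  dst must have room for 2 * len bytes.
   Returns the number of bytes written to dst. */
size_t write_filter(const char *src, size_t len, char *dst,
                    enum io_mode mode, int mount_is_text)
{
    /* \r\n is generated for O_TEXT, or for mode 0 on a text mode mount. */
    int add_cr = (mode == MODE_TEXT) ||
                 (mode == MODE_DEFAULT && mount_is_text);
    size_t out = 0;

    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < len; i++) {
        if (add_cr && src[i] == '\n')
            dst[out++] = '\r';
        dst[out++] = src[i];
    }
    return out;
}
```

An O_DOSTEXT output mode would take the same branch as MODE_TEXT here,
so it only becomes distinguishable if O_TEXT itself is redefined.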

What do we really gain by inventing two new Cygwin-only open flags,
other than restarting the old O_TEXT problems with upstream devs?


Corinna Vinschen
Cygwin Maintainer
