
Re: CR-LF handling behavior of SED changed recently - this breaks a lot of MinGW cross build scripts




On 06/08/2017 01:51 PM, L A Walsh wrote:

>> 1.) I build many (~ 50) unix libs and tools MinGW cross on cygwin from
>> sources and this breaks many of the configure and other scripts. 
> ---
>    But didn't one have to use 'sed -b' before, in order to
> strip out CR's?

No, the exact opposite.  It used to be that you HAD to use 'sed -b' to
preserve CRs on a binary mount; now binary mounts preserve CRs
automatically, making 'sed -b' a no-op on binary mounts.  (This is
closer to Linux behavior, where 'sed' preserves CRs automatically
because everything is binary mount, and 'sed -b' is a no-op).  On text
mounts, 'sed -b' allows you to preserve CRs where they would otherwise
be stripped automatically.
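
A quick way to see which behavior a given sed has is to count the CR bytes before and after a trivial substitution. This is only a sketch: the helper name count_cr is made up for illustration, and it uses nothing beyond POSIX tr and wc.

```shell
# Count the CR bytes arriving on stdin (POSIX tr and wc only).
# count_cr is a hypothetical helper name, not part of sed or Cygwin.
count_cr() {
    # Delete everything except CR, count what is left, and drop the
    # padding that some wc implementations put in front of the number.
    tr -cd '\r' | wc -c | tr -d ' '
}

# Two CRLF-terminated lines go through a sed command that never mentions CR.
printf 'one\r\ntwo\r\n' | sed 's/o/0/' | count_cr
```

If this prints 2, sed passed the CRs through untouched, which is what is described above for binary mounts and for Linux. A result of 0 would mean the CRs were stripped somewhere along the way, as described for text mounts without -b.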

>  I.e. wouldn't all the individual lib/tool maintainers have
> been required to add '-b' to their sed scripts?

Sort of. The problem was that it used to be difficult to write portable
scripts that worked on both Cygwin and non-Cygwin systems and still
dealt with CRs.  That's because you could not rely on 'sed -b' existing
(not all the world uses GNU sed, and POSIX doesn't require -b to exist).
But if you omitted the -b on Cygwin, your data was silently corrupted.

Since the change back in February, Cygwin sed defaults to POSIX
behavior on binary mounts, and the ONLY people who still have to use
'sed -b' are those who use text mounts (and remember that text mounts
are not the default).

>  Seems either way,
> you have the undesirability of forcing external products to change to
> support cygwin.

External products were being lazy by relying on Cygwin to strip CR when
they should have stripped it themselves.  But 'sed -b' does NOT strip CR
(it does the exact opposite: it keeps CRs intact).
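
For scripts that do want the CRs gone, stripping them explicitly can be done with tools POSIX already requires, so it does not depend on Cygwin or on GNU sed extensions. The following is a minimal sketch, not something taken from any particular project; the function names strip_cr and strip_trailing_cr are made up for illustration.

```shell
# Remove every CR byte from stdin; tr is specified by POSIX.
# strip_cr is a hypothetical helper name.
strip_cr() {
    tr -d '\r'
}

# Remove only a CR that ends a line, leaving embedded CRs alone.
# The CR is produced with printf because not every sed understands '\r'.
# strip_trailing_cr is a hypothetical helper name.
strip_trailing_cr() {
    cr=$(printf '\r')
    sed "s/${cr}\$//"
}
```

Either filter can sit between a Windows program and the rest of a pipeline, for example 'some_windows_tool.exe | strip_cr | sed ...' (the tool name is a placeholder). The second form is the safer choice when a lone CR in the middle of a line might be meaningful data.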

> 
>    Whereas, what I'd wonder is, how you are supplying input to sed
> in the first place?  I.e. how did CR's get into the stream to begin with.
> If you used cygwin and some tool on cygwin generated CR's into the output
> stream, I'd think that'd be a problem (or bug).  But if you are mixing
> DOS/Win-progs w/cygwin, then you need to adapt the DOS/Win progs'
> outputs to not have CR in them.

Exactly - it used to be that you could be lazy and feed the DOS/Win prog
output (with CRs) to Cygwin, and Cygwin would ignore the CR - but that
laziness came at a price: it would silently corrupt data for anyone
who was not aware that they needed the non-portable 'sed -b' to
preserve CR when operating on known-binary data.  Yes, the change is
forcing clients of external data to be more explicit about the CR in
their data, but in my mind, that's a GOOD thing - it's always better to
be explicit about intentions, and the new behavior is something YOU
control by whether you pre-filter the data, and not something that sed
FORCED on you by using text mode against your wishes.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org
