Re: gawk 4.1.4: CR separate char for CRLF files

On 08/09/2017 03:37 AM, Jannick wrote:

> Which is pretty much a pain when there is no easy fallback solution
> provided in case a major change is applied. E.g. for sed - if I understand
> the reference to sed in https://cygwin.com/ml/cygwin/2017-08/msg00033.html
> correctly - a separate switch '-b' is added.

Incorrect. 'sed -b' has always existed, but it did NOT do what you wanted:
it forced CR to be treated as a separate character, whereas what you want
is to ignore CR when it appears before LF.  In fact, the coordinated
change made back in February to grep, sed, and awk was that all three
programs now default to what used to be possible only through 'sed -b'.
Silently stripping CR can corrupt data when you are not expecting it,
while requiring the user to explicitly strip CR when they know they are
working with CRLF line endings is less magic (fewer downstream patches,
and it is more obvious from reading a script that the script knows what
it is doing).

If your data lives on a text mount (instead of a binary mount), then you
still get CR stripping for free.  If your data comes from a pipeline
rather than the file system, then you can add a d2u or other
CR-stripping tool in the pipeline.
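
As a rough sketch of the pipeline approach, assuming d2u is not at hand,
a small awk filter can do the same job. The name strip_cr is a
hypothetical helper, not part of any package, and the sample input is
made up for illustration. It removes a single CR only when it is the
last character of a line, and leaves any other CR alone:

```shell
# Hypothetical helper: drop one trailing CR per line, keep all other bytes.
strip_cr() {
    awk '{ sub(/\r$/, ""); print }'
}

# Made-up CRLF sample, generated with printf so the example is self-contained.
# Without strip_cr, the last field of each line would still end in CR.
printf 'alpha,1\r\nbeta,2\r\n' | strip_cr | awk -F, '{ print $2 }'
```

Putting the filter in the script itself keeps the CR handling explicit,
which is the point of the change: a reader of the script can see that
CRLF input is expected.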

> This is - to say the least - unpleasant in the light of what Cygwin claims
> to be, namely 'a large collection of GNU and Open Source tools which provide
> functionality similar to a Linux distribution on Windows' (from the top of
> the start website www.cygwin.com).

On Linux, nothing strips CR automatically.  So on Cygwin, we behave the
same - nothing strips CR automatically on binary mounted data.

And the fact that the change was made AND ANNOUNCED back in February,
but you are only now, 6 months later, complaining about it, is telling.

Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org
