RE: CR-LF handling behavior of SED changed recently - this breaks a lot of MinGW cross build scripts
- Date: Thu, 8 Jun 2017 08:50:23 +0000
- From: "Soegtrop, Michael" <michael.soegtrop@xxxxxxxxx>
- Subject: RE: CR-LF handling behavior of SED changed recently - this breaks a lot of MinGW cross build scripts
> No, the documented behavior is that CR-LF is converted to LF only for text-
> mounted files; but pipelines are default binary-mounted. If you want to strip
> CR from a pipeline, then make it explicit.
> > var=$( prog | sed .)
> Rewrite that to var=$( prog | tr -d '\r' | sed .)
I have two problems with this:
1.) I cross-build many (~50) Unix libraries and tools for MinGW on Cygwin from source, and this change breaks many of their configure and other scripts. Feeding the fixes back to the individual library and tool maintainers will take quite some time, and it also leads to lengthy discussions about why they should care about crappy DOS artefacts at all. A compatibility option via an environment variable would have been nice.
2.) It is very hard to interpret the documentation in this way. I am citing the description of the -b (--binary) option from https://www.gnu.org/software/sed/manual/sed.html:
This option is available on every platform, but is only effective where the operating system makes a distinction between text files and binary files. When such a distinction is made—as is the case for MS-DOS, Windows, Cygwin—text files are composed of lines separated by a carriage return and a line feed character, and sed does not see the ending CR. When this option is specified, sed will open input files in binary mode, thus not requesting this special processing and considering lines to end at a line feed.
This doesn't say what is treated as a text file and what as a binary file. Given that a documented option like -b exists, one can reasonably assume that a text tool like sed opens everything not explicitly declared binary as text.
This Cygwin sed behavior is documented in https://cygwin.com/cygwin-ug-net/using-textbinary.html, but I wouldn't expect people using sed on Cygwin to find it there.
In summary, the behavior of sed on Cygwin is documented in the Cygwin documentation, but it contradicts the documentation of sed itself, and possibly the intended function of sed as a text processing tool.
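For readers who want to reproduce the behavior under discussion, here is a minimal sketch. The function `prog` is a hypothetical stand-in for a native Windows tool that writes CR-LF line endings, and the sed expression is an arbitrary placeholder for whatever a real script would run. It assumes a sed that reads pipes in binary mode, as described in the quoted reply.

```shell
# Hypothetical stand-in for a native Windows program emitting CR-LF.
prog() { printf 'hello\r\n'; }

# Command substitution drops the trailing LF but not the CR, so the
# CR ends up in the variable when sed passes it through unchanged.
raw=$(prog | sed 's/h/H/')

# The explicit strip suggested in the quoted reply.
var=$(prog | tr -d '\r' | sed 's/h/H/')

# Compare the lengths to see whether a stray CR was captured.
echo "raw: ${#raw} characters, var: ${#var} characters"
```

A leftover CR of this kind is what typically breaks string comparisons and path handling in configure scripts.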
I must admit that cross-building Linux software for MinGW on Cygwin works substantially better than doing this on MSys/MSys2. The number of patches I need is small, so the decisions the Cygwin team took seem to be the right ones. But this change increases my "number of patches required" statistic by at least one order of magnitude.
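As a rough sketch of what the opt-in compatibility switch from point 1 could look like at the script level, the wrapper below shadows sed with a shell function. The variable name `SED_STRIP_CR` is invented for this illustration; it is not an existing sed or Cygwin setting.

```shell
# Hypothetical opt-in wrapper. SED_STRIP_CR is an invented name,
# not a real sed or Cygwin option.
sed() {
    if [ "${SED_STRIP_CR:-0}" = 1 ]; then
        # Filter the output so that both piped input and file
        # arguments are covered; "command" bypasses this function.
        command sed "$@" | tr -d '\r'
    else
        command sed "$@"
    fi
}
```

This is a stopgap with clear limits: it removes every CR in the output, not only those at line ends, it has no effect on `sed -i`, and the pipeline reports the exit status of `tr` rather than that of sed. It would also have to be sourced into each build environment, which does not help scripts that start a fresh shell.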