[PATCH 1/2] Documentation: document UTF-16-related behavior




There are a number of broken Windows programs which want to process
files in a UTF-16 variant that is always little-endian and always
starts with a BOM. Git cannot produce or accept such an encoding for the
working-tree-encoding attribute because no such encoding has been
registered with IANA or implemented in iconv(3).

Document this behavior since it is a frequent source of confusion for
users. Additionally, document that specifying "UTF-16" may produce bytes
of either endianness, but always includes a BOM so that readers can tell
which one was used.

Signed-off-by: brian m. carlson <sandals@xxxxxxxxxxxxxxxxxxxx>
---
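
For reviewers, a rough illustration of the distinction described above.
This is only a sketch using Python's built-in codecs, not Git's iconv(3)
code path; the assumption is that the two behave analogously here. The
codec names and the sample string are illustrative and not part of this
patch.

```python
import codecs

# Sample text; any string would do (illustrative, not from the patch).
text = "hi"

# Python's "utf-16" codec (assumed analogous to iconv's "UTF-16"):
# a BOM is always written, but the byte order is the platform's choice.
with_bom = text.encode("utf-16")
assert with_bom[:2] in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)

# The endian-specific codecs fix the byte order and write no BOM.
le = text.encode("utf-16-le")
be = text.encode("utf-16-be")
assert le == b"h\x00i\x00"
assert be == b"\x00h\x00i"

# The variant some Windows programs expect (little-endian plus BOM) has
# no codec name of its own, so it has to be assembled by hand.
le_with_bom = codecs.BOM_UTF16_LE + le
assert le_with_bom == b"\xff\xfeh\x00i\x00"

# A BOM-aware decoder consumes the BOM; an endian-specific decoder keeps
# it as a U+FEFF character, which is why the two are not interchangeable.
assert le_with_bom.decode("utf-16") == text
assert le_with_bom.decode("utf-16-le") == "\ufeff" + text
```

The last two assertions show why "UTF-16LE" cannot simply stand in for
"little-endian with a BOM": the leading BOM would be treated as content.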
 Documentation/gitattributes.txt | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index b8392fc330..2b2c93afd1 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -330,6 +330,11 @@ That operation will fail and cause an error.
 - Reencoding content requires resources that might slow down certain
   Git operations (e.g 'git checkout' or 'git add').
 
+- It is not possible to specify a variant of UTF-16 that has both a BOM
+  and a fixed endianness, because no such variant has been standardized.
+  Using "UTF-16" will produce a BOM followed by data of unspecified
+  endianness, and using "UTF-16LE" or "UTF-16BE" will prohibit a BOM.
+
 Use the `working-tree-encoding` attribute only if you cannot store a file
 in UTF-8 encoding and if you want Git to be able to process the content
 as text.