Re: Need help with multibyte UTF-8 characters

On 12/13/2017 2:50 AM, Thomas Wolff wrote:
> Hi Brian,
> On 13.12.2017 at 06:21, Brian Inglis wrote:
>> On 2017-12-04 18:23, Thomas Taylor wrote:
>>> I want to use multibyte UTF-8 characters in 64-bit Cygwin under
>>> Windows 7.  The "vim" editor running in mintty displays the two-byte
>>> characters correctly, but not the three- (and I assume four-) byte
>>> characters, which instead display as rectangular filled-in blocks.
>>> The "less" program doesn't even display two-byte characters correctly,
>>> but instead displays them as <A1> to <FF>, depending on the character
>>> in question, in reverse color in the terminal window.  The "cat"
>>> program is even worse, replacing every two-byte character with a
>>> character that looks like three horizontal bars stacked one above the
>>> other.  I've read the "Internationalization" page in the Cygwin online
>>> manual, but am still baffled.  My LANG environment variable is set to
>>> "en_US.UTF-8".  Can anyone help?
>> Your Windows Regional settings and your mintty/Options/Text/Language
>> and Character Set should be set to match.
>> The profile commands below set Cygwin locale to your Windows Regional
>> settings and charset to UTF-8, or Unix locale to your system locale.
>> Otherwise your system or mintty is going to be doing conversions on
>> each character.
> I am not aware that mintty character display and Windows regional
> settings would interfere in any way you indicated.
> Can you elaborate on this please?
> Thomas
>> # Set user-defined locale
>> locale -fU > /dev/null 2>&1     \
>>          && LC_ALL=$(locale -fU) \
>>          || LC_ALL=$(locale |    \
>>                  sed '/^LANG=\|^LC_CTYPE=\|^LC_ALL=/{s///;h};$!d;x;s/"//g')

I was having an issue with git changing the encoding of files from
ISO-8859-1 to UTF-8 because of this.  I modified my $HOME/.profile and
changed

# Set user-defined locale
export LANG=$(locale -uU)

to

# Set user-defined locale
export LANG=$(locale -u).ISO-8859-1

which sets every locale category within Cygwin except LC_ALL.
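
Since the change above sets LANG but leaves LC_ALL alone, it may help
to see which variable actually wins for character handling.  Below is a
rough sketch of the POSIX precedence (LC_ALL, then LC_CTYPE, then
LANG).  The function name effective_ctype is made up for illustration,
and the final "C" fallback is only an assumption, since the default
when nothing is set is implementation-defined.

```shell
# Hypothetical helper: print the locale name that governs character
# handling, following POSIX precedence: LC_ALL, then LC_CTYPE, then LANG.
effective_ctype() {
    if [ -n "${LC_ALL:-}" ]; then
        # LC_ALL, when set and non-empty, overrides every category.
        printf '%s\n' "$LC_ALL"
    elif [ -n "${LC_CTYPE:-}" ]; then
        # Otherwise the category-specific variable applies.
        printf '%s\n' "$LC_CTYPE"
    elif [ -n "${LANG:-}" ]; then
        # LANG is the fallback for categories not set individually.
        printf '%s\n' "$LANG"
    else
        # Assumption: the real default is implementation-defined.
        printf '%s\n' "C"
    fi
}
```

Running effective_ctype in a new login shell after editing .profile
should show whether the LANG setting is in effect or whether an LC_ALL
or LC_CTYPE left over from elsewhere is still overriding it.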

$ locale
