Re: Need help with multibyte UTF-8 characters
- Date: Fri, 15 Dec 2017 16:19:58 +0100
- From: Thomas Wolff <towo@xxxxxxxx>
- Subject: Re: Need help with multibyte UTF-8 characters
On 15.12.2017 at 01:32, Brian Inglis wrote:
On 2017-12-12 12:42, Thomas Taylor wrote:
I believe that Cygwin displays certain UTF-8 characters incorrectly. To see the
problem, first save the attached "utf-8_test.sed" text file to your desktop.
Then run "mintty," and set its options by right clicking in its title bar,
selecting "Options" and then "Text." On the Text page set "Locale" to "en_US"
and "Character set" to "UTF-8," and then "Save." Now exit and restart mintty.
Change directory to your desktop and run the editor "vim" on the utf-8_test.sed
file. Once inside vim do a ":set fileencoding=utf-8". You should now see that
vim correctly displays a sample of one-, two-, and three-byte UTF-8 character
encodings in the test file. Vim fails, however, on the three-byte encodings for
the "en" dash, the "em" dash, and the ellipsis, each of which displays
incorrectly as a filled-in rectangle. Now exit vim and do a "less" or "cat" on
the utf-8_test.sed file. You should see most of the sample UTF-8 encoded
characters displayed correctly, except once again for the en dash, em dash, and
ellipsis. So it looks like a problem in the underlying Cygwin run-time
libraries rather than in vim, less, or cat. I haven't tested this on four-byte
UTF-8 character encodings, but assume Cygwin will have similar problems.
Like many others, I see no problems: all UTF-8 characters display correctly
in gvim/X, vim, less, and cat from mintty.
It seems nobody has asked you yet which font you use, so please report that.
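
To rule out the file itself before looking at the font, it may help to confirm
what the three problem characters look like as bytes. The following is only a
minimal sketch: it assumes the test file uses the usual code points U+2013 (en
dash), U+2014 (em dash) and U+2026 (ellipsis), which the thread does not state,
and the helper name describe() is made up for illustration.

```python
import unicodedata

# Assumed code points for the characters reported as filled-in rectangles:
# en dash, em dash, horizontal ellipsis. The attachment is not shown in the
# thread, so these are a guess at what it contains.
SAMPLES = ["\u2013", "\u2014", "\u2026"]


def describe(ch):
    """Return (code point, Unicode name, UTF-8 bytes as hex) for one character.

    Hypothetical helper, not part of Cygwin, mintty or vim.
    """
    encoded = ch.encode("utf-8")
    hex_bytes = " ".join(f"{b:02x}" for b in encoded)
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), hex_bytes)


if __name__ == "__main__":
    for ch in SAMPLES:
        print("  ".join(describe(ch)))
```

If a hex dump of the test file shows the same three-byte sequences, the
encoding is well-formed, and a missing glyph in the selected font becomes the
more likely explanation for the rectangles.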