
utf8 options under MySQL




MySQL has a bewildering variety of Unicode collation choices. Most of them are language-specific, but what is the difference between "utf8_general_ci", "utf8_unicode_ci", and "utf8_unicode_520_ci"? Do they differ in the range of characters they can handle, or is it just a matter of the sort order? I understand that utf8_bin is different because it is case-sensitive, but the other differences elude me.

Under what circumstances does it make a difference to use one or the other? I work with a lot of Early Modern print data and the weird symbols of various kinds it uses. I've had trouble at times with the "utf8_general_ci" setting, but that may have been more a matter of the settings on my front-end tool than of choosing this collation rather than one of the unicode ones.

Under character sets, there is just one utf8 setting. The simplest way to make sense of the choices would be to say that, given a character set (utf8), the collation only makes a difference to the sort order and makes no difference to what can be displayed. Is that correct?
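
One way to probe this empirically is to compare the same pair of strings under each collation and see where the answers differ. The following is only a minimal sketch, assuming a MySQL server on which these utf8 collations are available (on recent versions they may be listed under a utf8mb3_ prefix instead). The test strings are arbitrary examples, not taken from any real data, and the results are deliberately not stated here because they depend on the server.

```sql
-- List the utf8-family collations the server actually offers.
SHOW COLLATION LIKE 'utf8%';

-- Compare the same strings under each collation.
-- CONVERT(... USING utf8) pins the character set, so the COLLATE clause
-- is valid whatever the connection character set happens to be.
-- Each column is 1 if the two strings compare equal, 0 otherwise.
SELECT
  CONVERT('ß' USING utf8) = CONVERT('ss' USING utf8) COLLATE utf8_general_ci     AS eszett_general,
  CONVERT('ß' USING utf8) = CONVERT('ss' USING utf8) COLLATE utf8_unicode_ci     AS eszett_unicode,
  CONVERT('ß' USING utf8) = CONVERT('ss' USING utf8) COLLATE utf8_unicode_520_ci AS eszett_unicode_520,
  CONVERT('a' USING utf8) = CONVERT('A'  USING utf8) COLLATE utf8_general_ci     AS case_general,
  CONVERT('a' USING utf8) = CONVERT('A'  USING utf8) COLLATE utf8_bin            AS case_bin;
```

If two collations give different answers for the same pair, the difference would also show up in WHERE comparisons, ORDER BY, and unique indexes on a column using that collation, since all of those go through the same comparison rules.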