Information technology and marketing INFORBIRO

Why use utf8_unicode_ci instead of the utf8_general_ci database collation

utf8_unicode_ci is based on the Unicode standard for sorting. utf8_general_ci comes very close, but it is NOT Unicode compliant, because compromises were made to make it faster. For example, utf8_unicode_ci treats the German 'ß' as equal to 'ss', while utf8_general_ci treats it as equal to 's'. Although utf8_unicode_ci is a little slower, we recommend using it if your site needs to handle non-English characters.

The "ci" abbreviation stands for Case Insensitive, i.e. it does not matter whether you use uppercase or lowercase letters, e.g. "UPPERletters" is treated as equal to "upperLETTERS".
Note: Some languages are not sorted in their proper order by the database, e.g. Serbian: the letter 'Š' (pronounced 'sh') should be the last letter in the Azbuka (the Serbian alphabet) and should not come right after the letter 'S'.

But what if you already have a working database with some other collation? Remember that changing the database or table collation does NOT change existing data. The new collation only becomes the default for tables and columns created afterwards, and existing columns keep their old collation. This means that you also need to convert all existing data in the database.
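
The conversion described above can be sketched as a small Python helper that only builds the MySQL statements, so you can review them before running anything. This is a minimal sketch, not a complete migration tool. The function name `build_conversion_statements`, the database name `mysite` and the table names are hypothetical, and it assumes the tables already store their text as utf8. Back up the database before executing the generated statements.

```python
def build_conversion_statements(database, tables,
                                charset="utf8",
                                collation="utf8_unicode_ci"):
    """Build MySQL statements that switch a database and its tables
    to the given character set and collation.

    Hypothetical helper for illustration: it returns SQL strings and
    does not connect to any server.
    """
    def quote(name):
        # Backtick-quote an identifier, doubling embedded backticks.
        return "`" + name.replace("`", "``") + "`"

    statements = [
        # Sets the default only for tables created later;
        # existing tables are not touched by this statement.
        "ALTER DATABASE {} CHARACTER SET {} COLLATE {};".format(
            quote(database), charset, collation)
    ]
    for table in tables:
        # CONVERT TO changes the table default and converts the
        # existing character columns of the table.
        statements.append(
            "ALTER TABLE {} CONVERT TO CHARACTER SET {} COLLATE {};".format(
                quote(table), charset, collation))
    return statements


if __name__ == "__main__":
    # Example table names are assumptions, not taken from a real schema.
    for statement in build_conversion_statements("mysite", ["users", "articles"]):
        print(statement)
```

Generating the statements as text first keeps the step reviewable: you can check the list of tables, run the statements on a copy of the database, and only then apply them to the live site.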
