The collation utf8mb4_0900_ai_ci is a character set collation for MySQL databases, introduced in MySQL 8.0.1. It is based on the Unicode Collation Algorithm (UCA) 9.0.0, and the character set is utf8mb4, which supports a wide range of Unicode characters.

Advertisement

The “ai” in the collation name stands for “accent insensitive” and the “ci” stands for “case insensitive.” This means that comparisons between characters are done without considering differences in case or accents.

If you are receiving an “Unknown collation” error, it may be because your MySQL server version is older than 8.0.1 and doesn’t support this collation. To fix this issue, you can:

Problem

During the migration of a web application, I got the below error while restoring a database on another server. The collation id may differ based on the MySQL version.

Error message:

Error 1273 (HY000) at line 25 Unknown collation: ‘utf8mb4_0900_ai_ci’

See the error screenshot during database restoration.

Error 1273 (HY000) Unknown collation: 'utf8mb4_0900_ai_ci'

Here you go with a solution.

Solution

After a little investigation, I found that the MySQL server running on the destination is an older version than the source. So we got that the destination server doesn’t contain the required database collation.

Then we do a little tweak in the backup file to resolve this. Edit the database backup file in text editor and replace “utf8mb4_0900_ai_ci” with “utf8mb4_general_ci” and “CHARSET=utf8mb4” with “CHARSET=utf8“.

  • Replace the below string:
    ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
    
  • With:
    ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_general_ci;
    

Here we are changing the CHARSET to utf8, that is the older version and have limitation, Read the implications at the end of this article before making the changes in database.

Save your file and restore the database.

The Linux system users can use the sed command to replace text in files directly.

sed -i 's/utf8mb4_0900_ai_ci/utf8_general_ci/g' backup.sql  
sed -i 's/CHARSET=utf8mb4/CHARSET=utf8/g' backup.sql  

That it. after the above changes, the database was successfully restored!

Hope this is solution helped you to resolve “Unknown collation: ‘utf8mb4_0900_ai_ci’” issue.

Limitations of UTF8 Character Set:

Changing the character set from utf8mb4 to utf8 in MySQL is not inherently bad, but it may have some implications that you should consider before making the change:

  • Limited Unicode support: The utf8 character set in MySQL only supports a limited range of Unicode characters, specifically the Basic Multilingual Plane (BMP), which includes characters from the Unicode code points U+0000 to U+FFFF. In contrast, utf8mb4 supports the full range of Unicode characters, including supplementary characters (code points U+10000 to U+10FFFF), such as emojis and certain rare symbols or scripts. If you need to store these supplementary characters in your database, you should use utf8mb4.
  • Data loss or corruption: If your existing data contains characters outside of the BMP, converting the character set from utf8mb4 to utf8 may result in data loss or corruption. These characters will be replaced with the Unicode replacement character (U+FFFD) during the conversion process.
  • Index length limitations: The utf8 character set uses less storage space (up to 3 bytes per character) compared to utf8mb4 (up to 4 bytes per character). This may help you work around index length limitations, especially with older versions of MySQL. However, you should be aware of the trade-offs in terms of Unicode support.
Share.

28 Comments

  1. This post should be removed, this has a high risk of causing data loss. Moving from utf8 to utf8mb4 doesn’t cause data loss, but moving from utf8mb4 to utf8 removes a byte of data, which is VERY dangerous. Please take this down.

  2. And you will loose everything requiring the fourth byte. Emojis for example. But who cares? Ugly hacks made by people that don’t what they are talking about obviously make the world go round. Because using a current version of mySQL would be too much of burden, I guess…

  3. Thank you this was helpful. Just a minor correction on the intro paragraph utf8mb4_general_ci should be utf8_general_ci

Leave A Reply

Exit mobile version