utf8mb4_0900_ai_ci is a collation for MySQL’s utf8mb4 character set, introduced in MySQL 8.0.1. It is based on the Unicode Collation Algorithm (UCA) 9.0.0, and utf8mb4 itself supports the full range of Unicode characters.
The “ai” in the collation name stands for “accent insensitive” and the “ci” stands for “case insensitive.” This means that comparisons between characters are done without considering differences in case or accents.
If you are receiving an “Unknown collation” error, it is most likely because your MySQL server version is older than 8.0.1 and doesn’t support this collation. The rest of this article walks through one way to work around it.
Problem
During the migration of a web application, I got the error below while restoring a database on another server. The collation id may differ based on the MySQL version.
Error message:
Unknown collation: ‘utf8mb4_0900_ai_ci’
Here you go with a solution.
Solution
After a little investigation, I found that the MySQL server running on the destination is an older version than the one on the source, so the destination server doesn’t have the collation the backup asks for.
We can work around this with a small tweak to the backup file. Open the database backup file in a text editor and replace “utf8mb4_0900_ai_ci” with “utf8_general_ci” and “CHARSET=utf8mb4” with “CHARSET=utf8”.
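Before editing anything, it can help to confirm that the backup file really references the collation and how often. A minimal sketch, assuming a plain-text SQL dump; the function name count_collation_lines is made up for this example:

```shell
# Count the lines in a dump that mention a given collation.
# Usage: count_collation_lines FILE [COLLATION]
count_collation_lines() {
    file=$1
    collation=${2:-utf8mb4_0900_ai_ci}
    # -F matches a fixed string, -c prints the number of matching lines.
    # grep exits with status 1 when the count is 0, so "|| true" keeps
    # scripts running under "set -e" from aborting.
    grep -cF -- "$collation" "$file" || true
}

# Example: count_collation_lines backup.sql
```

A count of 0 suggests the error comes from somewhere other than this file.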
- Replace the below string:
ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
- With:
ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_general_ci;
Here we are changing the CHARSET to utf8, an older character set with limitations. Read the implications at the end of this article before making this change to your database.
Save your file and restore the database.
Linux users can use the sed command to replace the text in the file directly.
sed -i 's/utf8mb4_0900_ai_ci/utf8_general_ci/g' backup.sql
sed -i 's/CHARSET=utf8mb4/CHARSET=utf8/g' backup.sql
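Some readers report that their import only worked with CHARSET=utf8mb4 left in place. A variant along those lines changes only the collation. This is a minimal sketch, assuming the destination server supports utf8mb4 and utf8mb4_general_ci (MySQL 5.5.3 or later); the function name swap_collation and the output file name are made up for this example. It writes to a new file instead of editing in place, so the original backup stays untouched:

```shell
# Rewrite only the collation and keep CHARSET=utf8mb4,
# so 4-byte characters such as emojis can still be stored.
# Usage: swap_collation SOURCE.sql DESTINATION.sql
swap_collation() {
    src=$1
    dst=$2
    # No -i flag: redirecting to a new file works the same way
    # with GNU sed and with the BSD sed shipped on macOS.
    sed 's/utf8mb4_0900_ai_ci/utf8mb4_general_ci/g' "$src" > "$dst"
}

# Example: swap_collation backup.sql backup_fixed.sql
```

Note that utf8mb4_general_ci is not identical to utf8mb4_0900_ai_ci, so sorting and comparison results may differ slightly after the restore.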
That’s it. After the above changes, the database was successfully restored!
Hope this solution helped you to resolve the “Unknown collation: ‘utf8mb4_0900_ai_ci’” issue.
Limitations of UTF8 Character Set:
Changing the character set from utf8mb4 to utf8 in MySQL is not inherently bad, but it may have some implications that you should consider before making the change:
- Limited Unicode support: The utf8 character set in MySQL only supports a limited range of Unicode characters, specifically the Basic Multilingual Plane (BMP), which includes characters from the Unicode code points U+0000 to U+FFFF. In contrast, utf8mb4 supports the full range of Unicode characters, including supplementary characters (code points U+10000 to U+10FFFF), such as emojis and certain rare symbols or scripts. If you need to store these supplementary characters in your database, you should use utf8mb4.
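- Data loss or corruption: If your existing data contains characters outside of the BMP, converting the character set from utf8mb4 to utf8 may result in data loss or corruption. Depending on the server’s SQL mode, such values are either rejected with an “Incorrect string value” error or stored in damaged form (for example with the unsupported characters replaced by “?” or the text cut off), so the original characters cannot be recovered from the restored database.
- Index length limitations: The utf8 character set uses less storage space (up to 3 bytes per character) compared to utf8mb4 (up to 4 bytes per character). This may help you work around index length limitations, especially with older versions of MySQL. For example, under the 767-byte index prefix limit of older InnoDB row formats, a utf8 column can index up to 255 characters, while a utf8mb4 column can index only 191. However, you should be aware of the trade-offs in terms of Unicode support.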
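To judge whether the data loss risk applies to your backup, you can check the file for 4-byte characters before switching to utf8. A minimal sketch, assuming the dump is UTF-8 encoded; the function name has_4byte_chars is made up for this example:

```shell
# Succeeds (exit status 0) if FILE contains at least one 4-byte UTF-8 sequence.
# Usage: has_4byte_chars FILE
has_4byte_chars() {
    # In valid UTF-8, the lead bytes 0xF0-0xF4 (octal 360-364) only ever
    # start 4-byte sequences, i.e. characters outside the BMP.
    pattern=$(printf '[\360-\364]')
    # LC_ALL=C makes grep compare raw bytes; -a treats the file as text
    # even if it contains binary-looking data; -q prints nothing.
    LC_ALL=C grep -aq -- "$pattern" "$1"
}

# Example:
# if has_4byte_chars backup.sql; then
#     echo "4-byte characters found: converting to utf8 would lose data"
# fi
```

If the check finds nothing, the existing data fits into utf8, although new data written later may still contain such characters.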
29 Comments
Thank you, you saved my day!!
Worked for me, thanks!
Thanks a lot..great solution
thanks for info
Excellent! It worked for me…Thanks !
Thanks mate. It solved my issue. I tried solutions from StackOverflow but none of them helped. Only this one helped.
I installed `brew reinstall gnu-sed` and used `gsed` on mac with the same commands. Worked.
its really very helpful thank you
This helped me greatly, thank you very much!
It worked. Thank you.
yes it really help me alot
Thanx! It helped me..
This post should be removed, this has a high risk of causing data loss. Moving from utf8 to utf8mb4 doesn’t cause data loss, but moving from utf8mb4 to utf8 removes a byte of data, which is VERY dangerous. Please take this down.
THANK YOU! I was going to point this out. Please DO NOT use utf8 as charset to replace utf8mb4.
Thanks a lot. It worked for me
thanks, it is work for me
thanks work!
And you will lose everything requiring the fourth byte. Emojis for example. But who cares? Ugly hacks made by people that don’t know what they are talking about obviously make the world go round. Because using a current version of MySQL would be too much of a burden, I guess…
This is great! It works for me!
Thank you this was helpful. Just a minor correction on the intro paragraph utf8mb4_general_ci should be utf8_general_ci
In my case I also had to add this command:
```
sed -i 's/utf8mb4/utf8/g' backup.sql
```
It worked perfectly!
Thanks!
Thank you!!
On Mac OSX this is correct:
sed -e 's/utf8mb4_0900_ai_ci/utf8_general_ci/g' oldFile.sql > newFile.sql
Thank you!
Thanks for this, I had to leave the CHARSET=utf8mb4 to get my import to work
me too
Otherwise you would be losing data, like emojis.
You’re a life-saver!