Bug 503523: UTF8 character set isn't used when connecting to MySQL
Opened 15 years ago
Closed 15 years ago
Categories
(addons.mozilla.org Graveyard :: Public Pages, defect)
Tracking
(Not tracked)
Status: VERIFIED FIXED
People
(Reporter: clouserw, Assigned: davedash)
Attachments
(1 file, 1 obsolete file)
patch, 4.05 KB, clouserw: review+
Bug 503502 found that we aren't setting the correct character set when we connect to MySQL. MySQL defaults to latin1 (which is probably what we're using), and we should be running "SET NAMES 'utf8'" when our connection fires up. An example:

mysql> show variables like 'character_set_%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | latin1                     |
| character_set_connection | latin1                     |
| character_set_database   | latin1                     |
| character_set_filesystem | binary                     |
| character_set_results    | latin1                     |
| character_set_server     | latin1                     |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+

mysql> select tags from text_search_summary where match(tags) against('海');
+----------+
| tags     |
+----------+
| á,v,海  |
+----------+
1 row in set (0.00 sec)

mysql> SET NAMES 'UTF8';
Query OK, 0 rows affected (0.00 sec)

mysql> select tags from text_search_summary where match(tags) against('海');
Empty set (0.00 sec)

The second query returns nothing because our ft_min_word_len is 2. Without the right encoding, 海 is treated as 3 characters long, so it passes the minimum; under utf8 it is a single character and falls below it.

Before this bug gets fixed we need some serious testing, and we need to consider how the change affects both the current data and future data. Completely untested, but Sergey's last comment on http://bugs.mysql.com/bug.php?id=28581 could help migrate existing data.
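The length mismatch in the example can be reproduced outside MySQL. This is a minimal Python sketch, not MySQL's full-text tokenizer: it assumes the client sends the UTF-8 bytes of 海 and that the server decodes them using the connection character set. `FT_MIN_WORD_LEN` and `apparent_length` are hypothetical names; the value 2 comes from the report.

```python
# Illustration only: how many characters the same bytes appear to contain,
# depending on the character set used to interpret them.
FT_MIN_WORD_LEN = 2  # value quoted in the report


def apparent_length(text: str, connection_charset: str) -> int:
    """Encode as UTF-8 (what the client sends), then decode the bytes
    with the connection character set (how the server reads them)."""
    raw = text.encode("utf-8")
    return len(raw.decode(connection_charset))


for charset in ("latin-1", "utf-8"):
    n = apparent_length("海", charset)
    verdict = "long enough" if n >= FT_MIN_WORD_LEN else "too short"
    print(charset, n, verdict)
```

Under these assumptions the same three bytes count as three characters when read as latin1 and as one character when read as utf8, which matches the behaviour described in the report.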
Comment 2 (Assignee) • 15 years ago
mysql> SELECT count(localized_string) FROM translations WHERE char_length(localized_string) <> length(localized_string)
    -> ;
+-------------------------+
| count(localized_string) |
+-------------------------+
|                   36208 |
+-------------------------+
1 row in set (3.31 sec)

SELECT tag_text FROM tags WHERE char_length(tag_text) <> length(tag_text)

can get us the affected tags.
Assignee: nobody → dd
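In MySQL, LENGTH() counts bytes and CHAR_LENGTH() counts characters, so the queries above select values that contain at least one multi-byte character. A rough Python equivalent of that filter, assuming the values are UTF-8 text; `is_multibyte` and the sample tags are hypothetical and not taken from the database:

```python
def is_multibyte(value: str) -> bool:
    """True when the UTF-8 byte length differs from the character count,
    mirroring char_length(col) <> length(col)."""
    return len(value) != len(value.encode("utf-8"))


sample_tags = ["firefox", "海", "café"]  # made-up data for illustration
affected = [tag for tag in sample_tags if is_multibyte(tag)]
print(affected)
```

Pure ASCII values pass through untouched, so a filter like this narrows the migration to the rows that can actually be affected by a character set change.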
Comment 3 (Assignee) • 15 years ago
This patch does the following:
* Forces all cake connections to use utf8 charset
* Migration script (utf8.sql, will be renamed on commit to have a commit number)
Attachment #406345 - Flags: review?(clouserw)
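The thread does not show what utf8.sql contains. One common situation when moving from a latin1 connection to utf8 is text whose UTF-8 bytes were read as latin1 characters, which then displays as garbage. The sketch below shows that round trip and its reversal in Python, purely as an assumption about the kind of damage a migration might need to undo. `repair_mojibake` is a hypothetical helper, and Python's latin-1 codec stands in for MySQL's latin1.

```python
def repair_mojibake(value: str) -> str:
    """Reverse UTF-8 bytes that were mis-decoded as latin-1.
    Returns the value unchanged if it does not fit that pattern."""
    try:
        return value.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return value


# Simulate the damage: UTF-8 bytes interpreted as latin-1 characters.
damaged = "海".encode("utf-8").decode("latin-1")
print(repr(damaged), "->", repair_mojibake(damaged))
```

The fallback matters for mixed data: values that are already correct, or that are genuine latin1 text, fail one of the two conversions and are returned as they were.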
Comment 4 (Assignee) • 15 years ago
This patch covers the rest of the tables. Anything missed can be done at a later time. We should make sure the data is backed up, and that we log when we run this script, just to cover our bases. Whether this lands in 5.2 depends on QA. QA will have to do fairly thorough coverage of the site, checking every text string that comes from the DB for weird entities.
Attachment #406345 - Attachment is obsolete: true
Attachment #406381 - Flags: review?(clouserw)
Attachment #406345 - Flags: review?(clouserw)
Updated (Reporter) • 15 years ago
Attachment #406381 - Flags: review?(clouserw) → review+
Comment 5 • 15 years ago
QA will take this patch for 5.2, and we'll run our Selenium testcases (search.html, search2.html, searchapi.html), plus our Litmus test suite and ad-hoc testing.
Updated (Assignee) • 15 years ago
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Summary: UTF character set isn't used when connecting to MySQL → UTF8 character set isn't used when connecting to MySQL
Comment 6 • 15 years ago
Verified FIXED; we ran the above, and didn't notice anything amiss.
Status: RESOLVED → VERIFIED
Updated • 8 years ago
Product: addons.mozilla.org → addons.mozilla.org Graveyard