Closed
Bug 1253535
(bmo-emoji)
Opened 8 years ago
Closed 5 years ago
Fix emoji truncation by changing rowformat to dynamic, and convert all columns to utf8mb4
Categories
(bugzilla.mozilla.org :: General, defect, P1)
Tracking
()
RESOLVED
FIXED
People
(Reporter: rfkelly, Assigned: dylan)
References
(Blocks 1 open bug)
Details
Attachments
(2 files, 2 obsolete files)
Over in security-sensitive Bug 1253495 Comment 4, I tried to submit a comment containing the unicode "PILE OF POO" character. It and all following characters were truncated from the comment.
Ironically, this seems very similar to the bug I was trying to comment on, where MySQL's "utf8" character set will silently truncate strings that contain non-BMP unicode characters:
https://mathiasbynens.be/notes/mysql-utf8mb4
This can be a potential security issue if it happens on e.g. username or email fields, like in Bug 1253201.
STR:
1. Type a non-BMP character like \U0001f4a9 ("PILE OF POO") or \U0001F6B2 ("BICYCLE") into the bugzilla comment form.
2. Submit the comment.
Results:
* The character and all text following it are truncated.
Expected results:
* A nice unicode glyph.
I'm going to try to replicate it by pasting the output of `python -c 'print u"hello \U0001F6B2 world"'` at the end of this comment, and seeing whether I get a bicycle...
hello
Reporter
Comment 1•8 years ago
Nope, the bicycle was truncated :-(
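The truncation reported above can be modelled in a few lines. This is only a sketch of the observed behaviour (everything from the first non-BMP character onward is dropped), not MySQL's actual implementation; the function name `mysql_utf8_truncate` is made up for illustration.

```python
def mysql_utf8_truncate(text: str) -> str:
    """Model storage in a 3-byte 'utf8' column as described in this bug:
    keep the text up to (not including) the first non-BMP character."""
    for index, char in enumerate(text):
        if ord(char) > 0xFFFF:  # outside the BMP, needs 4 bytes in UTF-8
            return text[:index]
    return text


comment = "hello \U0001F6B2 world"
stored = mysql_utf8_truncate(comment)
print(repr(stored))  # the bicycle and " world" are gone
```

Text that stays inside the BMP (accented letters, most CJK, symbols such as U+2615) passes through this model unchanged, which matches the fact that only emoji-style characters triggered the problem.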
Reporter
Comment 2•8 years ago
FYI, over in Bug 1253495 I was able to forge a valid BrowserID assertion for this email: 'rfkelly@mozilla.com\U0001f4a9\n@mocker.dev.lcip.org' I don't know much about how bugzilla handles the emails returned by persona, but if I were able to login to bugzilla with that assertion and it got truncated in storage, it might let me login as "rfkelly@mozilla.com"...
Reporter
Comment 3•8 years ago
> I don't know much about how bugzilla handles the emails returned by persona
I'm pleased to confirm that when I tried logging in with such an assertion, bugzilla did its own validation of the returned email and rejected it as invalid.
Comment 4•8 years ago
Dupe of bug 405011.
Comment 5•8 years ago
FYI we are currently merging in all the upstream work from the master branch of Bugzilla (the general product) into BMO, including the above bug fix (which has been requested by many people lately). We've just been cherrypicking fixes and features until now. The merge has raised some interesting problems with our infrastructure (availability of new Perl packages and such), but we're finally getting close. We'll have a test host with the merged code set up within a week or two.
bug 405011 is a kludge at best; it only "fixes" comments (not summaries, or any other text field) and requires weird search shenanigans. the real fix is to use utf8mb4 everywhere. upstream are blocked by the long tail of older mysql versions but bmo has no such limitation.
Summary: Comments containing non-BMP unicode characters are truncated → Convert to utf8mb4 everywhere (Comments containing non-BMP unicode characters are truncated)
Comment 7•8 years ago
(In reply to Frédéric Buclin from comment #4)
> Dupe of bug 405011.
That bug is in upstream Bugzilla. This is filed against bugzilla.mozilla.org so it can't be a dupe. It could morph into "backport upstream's fix" but that would be "depends on", not "dupe". As glob points out bug 405011 wouldn't fix the fields which were Ryan's real concern.
Comment 8•8 years ago
We're not ready to talk about bug 1253495 in public yet. re-hiding.
Group: mozilla-employee-confidential
Assignee
Updated•7 years ago
Assignee: nobody → dylan
Assignee
Comment 13•7 years ago
Assignee
Comment 14•7 years ago
it appears this works.
Assignee
Comment 15•7 years ago
so this patch would do all the ALTER TABLE / ALTER DATABASE stuff from utf8 to utf8mb4. It takes a while (a few hours on a dev db). What other considerations would be required for production rollout? For one, I bet the mysqlclient libraries on centos 6 (RHEL 6) are too old (libmysqlclient.so.16). Are we running 5.5.something on the server side? Also this will change the behavior of indexes, but I'm not sure I understand how. In the case when the data is still the ASCII subset of utf8, does it really change anything going from utf8 to utf8mb4?
Flags: needinfo?(scabral)
Comment 16•7 years ago
Test, test, test. create a table, put stuff in it, convert, see what happens to the existing data, see what happens to new data. Because it's only the character set that is changing, we can do the slaves first, then failover. It means that there will be some overlap time where data may be different, but in the span of a day or 2 I wouldn't think that would matter and wouldn't affect much data. (to do one machine at a time, put "set sql_log_bin=0;" at the beginning of the statements).
Flags: needinfo?(scabral)
Comment 17•7 years ago
The indexes will need to be rebuilt, because a change in character set (and collation) changes how things are ordered (e.g. the new characters have an ordering to them). It shouldn't change anything lexically that already exists: a will still be before b, but it will put the newly allowed characters into the right place lexically.
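On the question in comment 15 of whether ASCII data changes at all: a rough way to see it is to look at how many bytes each character needs in UTF-8. The sketch below uses Python's own UTF-8 encoder as a stand-in; it assumes (as the name suggests) that MySQL's "utf8" stores at most 3 bytes per character and "utf8mb4" at most 4. The helper names are illustrative only.

```python
def utf8_width(char: str) -> int:
    """Number of bytes a single character occupies when encoded as UTF-8."""
    return len(char.encode("utf-8"))


def fits_legacy_utf8(char: str) -> bool:
    """True if the character fits in a 3-byte-per-character 'utf8' column."""
    return utf8_width(char) <= 3


for sample in ["a", "\u00e9", "\u2615", "\U0001F6B2"]:
    print(f"U+{ord(sample):04X}: {utf8_width(sample)} byte(s), "
          f"fits legacy utf8: {fits_legacy_utf8(sample)}")
```

Characters that already fit in 3 bytes encode to the same bytes either way, so existing ASCII and BMP text is byte-for-byte unchanged; what changes is the per-character worst case the server has to budget for, which is where the index-size concern later in this bug comes from.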
Assignee
Comment 18•7 years ago
one further q, is libmysqlclient.so.16 too old to support utf8mb4?
Flags: needinfo?(scabral)
Comment 19•7 years ago
I have no idea. but if it is, we can upgrade that.....doing tests on dev/stage will illuminate what we need to do. Or, let me know if you want me to google that to see if I can research the answer?
Flags: needinfo?(scabral)
Assignee
Comment 20•7 years ago
ah, I will research it. You just almost always know everything about mysql so I get lazy. :-D
Assignee
Comment 21•7 years ago
This is the bug to follow for getting emojis everywhere in BMO. :-)
Flags: needinfo?(cincodenada)
Comment 22•7 years ago
(In reply to Dylan Hardison [:dylan] from comment #21) Aha, thanks for the redirection! Glad to see it's under way, much appreciated.
Flags: needinfo?(cincodenada)
Comment 23•7 years ago
It may be a good idea to set the collation to `utf8mb4_unicode_ci`, if the server defaults to `utf8mb4_general_ci` for the `utf8mb4` character set. Can be checked with `SHOW CHARACTER SET LIKE 'utf8mb4'`.
> https://stackoverflow.com/questions/766809
> https://dev.mysql.com/doc/refman/5.5/en/show-character-set.html
Assignee
Comment 24•7 years ago
Alright, so if we can use large prefixes we don't have to worry about the index size or the column size being larger than 191... Sheeri: Would there be anything preventing BMO's DBs using this? http://dev.mysql.com/doc/refman/5.6/en/innodb-parameters.html#sysvar_innodb_large_prefix
Flags: needinfo?(scabral)
Assignee
Comment 25•7 years ago
Revised questions: 1) what would the time frame be on changing all our tables to use the newer barracuda format? 2) are there downsides or blockers to using barracuda or large prefixes with replication?
Comment 26•7 years ago
1) We can do stage whenever you are ready. I'd just want to know if you also want to compress the tables while we're at it (makes the table access faster)
2) We'd want to make sure all the slaves are set up properly, including AWS, but there shouldn't be an issue there.
It's theoretically possible to do this without downtime in production. Here's a sample plan:
- Take a slave out of the load balancer
- Convert one table on the slave
- Watch replication (for, say, 24h) to make sure the table doesn't break replication
- Convert the rest of the tables on the slave
- Put the slave back in the load balancer
- Take another slave out of the load balancer, and use innobackupex to do an online sync
- Repeat previous step until all slaves (including backups) are converted
- Failover so a converted slave is now the master, and resync the old master.
We can test this plan in stage.
Flags: needinfo?(scabral)
Comment 27•7 years ago
FWIW AWS does allow the innodb_large_prefix option.
Assignee
Comment 28•7 years ago
You should be made aware of this bug. I'll be working on scheduling this change on staging sometime next week to test out what Sheeri had proposed. hopefully that works for you.
Flags: needinfo?(mpressman)
Comment 29•7 years ago
thanks :dylan - I reviewed what's going on here. I will be in SF next week for planning meetings, but I should be available for any Q's you may have while getting stage setup.
Flags: needinfo?(mpressman)
Summary: Convert to utf8mb4 everywhere (Comments containing non-BMP unicode characters are truncated) → Convert to utf8mb4 everywhere - text containing non-BMP unicode characters (eg. emoji) is truncated
Assignee
Updated•7 years ago
Alias: bmo-emoji
Assignee
Updated•7 years ago
Depends on: bmo-emoji-utf8mb4-option
Assignee
Comment 31•7 years ago
dkl: proposal for this patch: check if the database has all the right options turned on (3 config options + table format + row format) and then use utf8mb4, falling back to utf8 otherwise. if we do that this patch can go out ahead of the work planned, yes?
Flags: needinfo?(dkl)
Comment 32•7 years ago
(In reply to Dylan Hardison [:dylan] from comment #31)
> dkl: proposal for this patch:
>
> check if the database has all the right options turned on (3 config options
> + table format + row format)
> and then use utf8mb4, falling back to utf8 otherwise. if we do that this
> patch can go out ahead of the work planned, yes?
Sounds good to me. Did we ever figure out how to handle indexes that are maxed out with utf8 (1-3 bytes)? Once they are converted to utf8mb4, will we not lose that last bit of data and hurt our index performance? Is this something we may just have to live with?
dkl
Flags: needinfo?(dkl)
Comment 33•7 years ago
From discussion on #sumo about this, there is concern in two areas:
1. The difficulty of entering emoji on MacOS/iOS platforms
2. Accessibility of emoji in screen readers
I want a11y to weigh in on this so asking :marcoz for their feedback. Depending on their thoughts on this, I may want us to restrict emoji from being used in the short-description/title field of bugs so that we don't impede people searching on bugs. Though, a :poop: emoji leaderboard by product/component would have been a glorious thing to make a dashboard of.
Flags: needinfo?(mzehe)
Comment 34•7 years ago
(In reply to Emma Humphries ☕️ [:emceeaich] (UTC-8) +needinfo me from comment #33)
> From discussion on #sumo about this, there is concern in two areas:
>
> 1. The difficulty of entering emoji on MacOS/iOS platforms
On MacOS, you press Ctrl+Cmd+Space to bring up either a small or big window where you can search for, and select, Emojis. On the other hand, I have no idea how you enter those on Windows, for example. ;) On iOS, it's a matter of adding another keyboard and switching to it. I use Emojis on iOS very frequently.
> 2. Accessibility of emoji in screen readers
It varies. On Windows, support is just getting started. For NVDA, there is an add-on; for JAWS (the market-leading commercial screen reader), I don't know the current state, and there are many older versions out there due to high upgrade costs that definitely don't support them. On MacOS and iOS, support for Unicode Emojis is fully implemented in VoiceOver. Apple is updating the supported set every year, and they're also accessible and described in localized form right from the start. On Android, support varies by Android version, but more recent versions of TalkBack do have support.
> Depending on their thoughts on this, I may want us to restrict emoji from
> being used in the short-description/title field of bugs so that we don't
> impede people searching on bugs.
I believe, to keep it simple and accessible to the widest variety of our users, that's probably a good idea in any case. Not all international users have the same understanding of emojis, and you never know which of the various smiling or grinning faces the original author might have used.
> Though, a :poop: emoji leaderboard by product/component would have been a
> glorious thing to make a dashboard of.
I definitely agree on that! :-D
Flags: needinfo?(mzehe)
Comment 35•7 years ago
:MarcoZ, I typoed myself on my first question. I meant difficulty of entering emoji on NON-Mac OS and iOS platforms. But you agree that short-descriptions should be non-emoji only.
Flags: needinfo?(mzehe)
Comment 36•7 years ago
(In reply to Emma Humphries ☕️ [:emceeaich] (UTC-8) +needinfo me from comment #35)
> But you agree that short-descriptions should be non-emoji only.
Yes.
Flags: needinfo?(mzehe)
Comment 37•7 years ago
I'd like us not to allow emoji in the short-description/title. :dkl says that we can just filter those out at the application level. Alternatively, we can allow emoji, but not search for them, which would require work in search, and I'd like to minimize the work to get this feature out, so unless there's a strong argument for this, let's filter emoji at the application layer.
Assignee
Comment 38•7 years ago
(In reply to Emma Humphries ☕️ [:emceeaich] (UTC-8) +needinfo me from comment #37)
> I'd like us not to allow emoji in the short-description/title. :dkl says
> that we can just filter those out at the application level.
>
> Alternatively, we can allow emoji, but not search for them, which would
> require work in search, and I'd like to minimize the work to get this
> feature out, so unless there's a strong argument for this, let's filter
> emoji at the application layer.
This strikes me as bad for a number of reasons.
1) we've never restricted what goes into the summary
2) engineers may well need to report bugs *about* emoji, so having one in the summary is even useful for searching.
3) including these in the summary was one of the original complaints that is driving this work.
Additionally, in the worst case (someone creates an unreadable summary, on a bug that is otherwise "good") the summaries are editable. I change summaries to better reflect bugs all the time.
A totally useless bug with :poop: :poop: :poop: in the summary should be RESOLVED INVALID and moved to invalid bugs.
I think these concerns make sense for keywords (possibly) and already apply to aliases. Bug aliases must match \w+ currently. In perl, that matches any letter or number or _ in any language. Perhaps we should restrict *aliases* to [A-Za-z0-9_]+ :)
(Note that currently aliases are effectively restricted to that, but only because of incorrect unicode handling in parts of bugzilla. This protects us from the fact that we treat \d as [0-9] but it actually means "a digit in any language"...)
Comment 39•7 years ago
(In reply to Dylan Hardison [:dylan] from comment #38)
> Additionally, in the worst case (someone creates an unreadable summary, on a
> bug that is otherwise "good")
> the summaries are editable. I change summaries to better reflect bugs all
> the time.
>
> A totally useless bug with :poop: :poop: :poop: in the summary should be
> RESOLVED INVALID and moved to invalid bugs.
Good points, I'm removing this as a blocker and let's continue. The requirement will be to monitor what happens and update our 1st level triage guidelines.
Assignee
Comment 40•7 years ago
(In reply to David Lawrence [:dkl] from comment #32)
> Sounds good to me. Did we ever figure out how to handle indexes that are
> maxed out with utf8 (1-3 bytes)? Once they are converted to utf8mb4, will we
> not lose that last bit of data and hurt our index performance? Is this
> something we may just have to live with?
The magical setting is innodb_large_prefix: https://dev.mysql.com/doc/refman/5.5/en/innodb-parameters.html#sysvar_innodb_large_prefix
This increases the index size limit and lets us index a full 255 unicode characters, so no downsides. In mysql 5.7 this is even the default.
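The 191 and 255 figures mentioned in comments 24 and 40 fall out of simple arithmetic. The sketch below assumes the InnoDB index key prefix limits given in the MySQL documentation: 767 bytes by default, and 3072 bytes with innodb_large_prefix on DYNAMIC or COMPRESSED tables. The constant and function names are illustrative.

```python
LEGACY_PREFIX_BYTES = 767   # index key prefix limit without large prefixes
LARGE_PREFIX_BYTES = 3072   # limit with innodb_large_prefix + DYNAMIC/COMPRESSED


def max_indexable_chars(limit_bytes: int, bytes_per_char: int) -> int:
    """Worst-case number of characters that fit in an index key prefix."""
    return limit_bytes // bytes_per_char


print(max_indexable_chars(LEGACY_PREFIX_BYTES, 3))  # utf8
print(max_indexable_chars(LEGACY_PREFIX_BYTES, 4))  # utf8mb4, no large prefix
print(max_indexable_chars(LARGE_PREFIX_BYTES, 4))   # utf8mb4, large prefix
```

So a fully indexed VARCHAR(255) fits under the old limit with 3-byte utf8, would be cut to 191 characters with utf8mb4, and fits comfortably again once large prefixes are enabled.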
Assignee
Updated•7 years ago
Priority: -- → P1
Comment 43•7 years ago
Reminder to see if we can deploy a patch ahead of time with minimal pain.
Flags: needinfo?(dylan)
Assignee
Updated•7 years ago
Flags: needinfo?(dylan)
Summary: Convert to utf8mb4 everywhere - text containing non-BMP unicode characters (eg. emoji) is truncated → Detect if mysql can support utf8mb4, and if using utf8, convert to utf8mb4
Assignee
Comment 46•6 years ago
I think we need to allocate the time to perform the conversion of the mysql tables.
Let me review my understanding of what needs to be true in the DB for full utf8 to be supported:
1. we need to use DYNAMIC or COMPRESSED row formats: https://dev.mysql.com/doc/refman/5.7/en/innodb-row-format-dynamic.html
2. the option innodb_large_prefix needs to be turned on: https://dev.mysql.com/doc/refman/5.6/en/innodb-parameters.html#sysvar_innodb_large_prefix. This large prefix option is directly related to the concern in comment 32.
3. every table needs to switch from utf8 to utf8mb4 (https://dev.mysql.com/doc/refman/5.5/en/charset-unicode-utf8mb4.html)
If 1 and 2 are true, then my draft patch will just work.
Currently this bug is targeted around code that queries the row format and the innodb_large_prefix option but I haven't had time to actually get it working.
Rather than continue to support utf8 *and* utf8mb4, I propose BMO only support utf8mb4 and bug 1328659 really needs to proceed (to at least do 1 and 2, the utf8mb4 conversion can be handled by application-space code).
It's 2017, do we *require* support for unicode characters outside the BMP in our data store?
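The three prerequisites listed above map onto a handful of standard MySQL statements. The following is a minimal sketch that only builds the SQL as strings; it is not the draft patch from this bug. The table names in the usage line are placeholders, the default collation follows the suggestion in comment 23, and the table name is interpolated directly, so it must come from a trusted list.

```python
from typing import List

# Queries a script could run first to check prerequisites 1 and 2.
READINESS_QUERIES = [
    "SHOW VARIABLES LIKE 'innodb_large_prefix'",
    "SHOW VARIABLES LIKE 'innodb_file_format'",
    "SELECT TABLE_NAME, ROW_FORMAT FROM information_schema.TABLES "
    "WHERE TABLE_SCHEMA = DATABASE()",
]


def conversion_statements(table: str,
                          collation: str = "utf8mb4_unicode_ci") -> List[str]:
    """Statements to move one table to DYNAMIC rows, then to utf8mb4."""
    return [
        f"ALTER TABLE `{table}` ROW_FORMAT=DYNAMIC",
        f"ALTER TABLE `{table}` CONVERT TO CHARACTER SET utf8mb4 "
        f"COLLATE {collation}",
    ]


# Placeholder table names, for illustration only.
for name in ["example_table_a", "example_table_b"]:
    for statement in conversion_statements(name):
        print(statement + ";")
```

The row format change is emitted before the character set change because, as discussed in this bug, the wider utf8mb4 index keys only fit once large prefixes and a DYNAMIC or COMPRESSED row format are in place.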
Flags: needinfo?(dkl)
Comment 47•6 years ago
(In reply to Dylan Hardison [:dylan] (he/him) from comment #46)
> I think we need to allocate the time to perform the conversion of the mysql
> tables.
>
> Let me review my understanding of what needs to be true in the DB for full
> utf8 to be supported:
>
> 1. we need to use DYNAMIC or COMPRESSED row formats:
> https://dev.mysql.com/doc/refman/5.7/en/innodb-row-format-dynamic.html
> 2. the option innodb_large_prefix needs to be turned on
> https://dev.mysql.com/doc/refman/5.6/en/innodb-parameters.html#sysvar_innodb_large_prefix.
> This large prefix option is directly related to the concern in comment 32.
> 3. every table needs to switch from utf8 to utf8mb4
> (https://dev.mysql.com/doc/refman/5.5/en/charset-unicode-utf8mb4.html)
>
> If 1 and 2 are true, then my draft patch will just work.
> Currently this bug is targeted around code that queries the row format and
> the innodb_large_prefix option but I haven't had time to actually get it
> working.
>
> Rather than continue to support utf8 *and* utf8mb4, I propose BMO only
> support utf8mb4 and bug 1328659 really needs to proceed (to at least do 1
> and 2, the utf8mb4 conversion can be handled by application-space code).
>
> It's 2017, do we *require* support for unicode characters outside the BMP in
> our data store?
Sorry for the delay on this and also the fact that I needed to refresh my understanding of the issue since I have been detached from this for quite some time. I do not see the need to support characters outside of the BMP at this time as I feel it is better to make the improvements incrementally and going from utf8 to utf8mb4 seems to be the least resistance moving forward. Let me know how I can help in the matter.
dkl
Flags: needinfo?(dkl)
Comment 49•6 years ago
So Millennials want to use emojis! If the MySQL change is difficult, how about storing emoji codes like :+1: or :smile: in the database instead of emoji themselves? I guess GitHub, Slack and others do it, though the implementations vary.
* On GitHub, available emojis are limited to https://www.webpagefx.com/tools/emoji-cheat-sheet/
* On Slack, all Unicode emojis plus custom emojis are supported; each character has a code
Assignee
Comment 50•6 years ago
If someone finds a method of DBD::mysql that lets you apply a transform to every string that goes in or comes out, I'll consider it. We'd use a Private-Use range unicode character and encode using charnames::viacode().
perl -Mcharnames=:full -E 'use utf8; say charnames::viacode(0x1f44d)'
THUMBS UP SIGN
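For comparison, the transform floated in this comment could look roughly like the following. This is a Python stand-in for the Perl idea, not anything that exists in Bugzilla: `unicodedata.name()` plays the role of `charnames::viacode()`, U+E000 is an arbitrarily chosen Private-Use marker, and the function names are invented for the sketch.

```python
import re
import unicodedata

MARKER = "\uE000"  # a Private-Use character inside the BMP (arbitrary choice)
NON_BMP = re.compile("[\U00010000-\U0010FFFF]")
ENCODED = re.compile(MARKER + "([^" + MARKER + "]+)" + MARKER)


def encode_non_bmp(text: str) -> str:
    """Replace each non-BMP character with MARKER + its Unicode name + MARKER."""
    return NON_BMP.sub(
        lambda match: MARKER + unicodedata.name(match.group()) + MARKER, text)


def decode_non_bmp(text: str) -> str:
    """Reverse encode_non_bmp by looking the character up by name."""
    return ENCODED.sub(lambda match: unicodedata.lookup(match.group(1)), text)


stored = encode_non_bmp("ship it \U0001F44D")
print(stored.replace(MARKER, "|"))  # marker shown as | for readability
print(decode_non_bmp(stored) == "ship it \U0001F44D")
```

Note the gaps such a scheme would still have: `unicodedata.name()` raises ValueError for characters the running Python's Unicode database has no name for, stored lengths no longer match what the user typed, and every search or comparison would have to be aware of the encoding.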
Assignee
Comment 51•6 years ago
But otherwise this is a clear case of a misuse of resources.
Comment 52•6 years ago
The discussion of codes like :smile: is interesting. Still, it shouldn't become a distraction from the real issue here, which is the failure to support Unicode. This bug seems to be a duplicate of other bugs including 857438 (https://bugzilla.mozilla.org/show_bug.cgi?id=857438), which is five years old. It should be fixed without further delay. The solution proposed above looks good: change utf8 to utf8mb4, and use innodb_large_prefix and DYNAMIC/COMPRESSED row formats.
Comment 53•6 years ago
> But otherwise this is a clear case of a misuse of resources.
Maybe I don't understand, but I hope you're not advocating against fixing this. I included an emoji in a rather long comment, ended up losing most of the comment, only noticed a day later, and managed to confuse various colleagues in the process. As it's now fairly normal to include emojis in text conversation having Bugzilla behave in this way seems rather troublesome. (And if you search on Google you'll find other Bugzilla instances also encountered this problem, indicating more lost time.)
Assignee
Updated•6 years ago
Summary: Detect if mysql can support utf8mb4, and if using utf8, convert to utf8mb4 → Convert all columns from utf8 to utf8mb4
Assignee
Comment 54•6 years ago
(In reply to Anne (:annevk) from comment #53)
> > But otherwise this is a clear case of a misuse of resources.
>
> Maybe I don't understand, but I hope you're not advocating against fixing
> this. I included an emoji in a rather long comment, ended up losing most of
> the comment, only noticed a day later, and managed to confuse various
> colleagues in the process. As it's now fairly normal to include emojis in
> text conversation having Bugzilla behave in this way seems rather
> troublesome. (And if you search on Google you'll find other Bugzilla
> instances also encountered this problem, indicating more lost time.)
I meant finding an alternate encoding -- an application level encoding! -- is a waste of resources when the database can support real utf8 just fine.
https://twitter.com/dylan_hardison/status/936679922730524672
There is now a bug for switching the two settings -- bug 1422427.
The rough outline of what needs to happen is that those two settings must be changed, then the innodb row format can be converted to dynamic (or compressed), and after that the charset can be set to utf8mb4.
For the latter two events, I've made https://github.com/mozilla-bteam/bmo/pull/282. If that were to land now, bmo would cease to function until bug 1422427 happened, so we're not any closer to fixing this at the moment.
Summary: Convert all columns from utf8 to utf8mb4 → Change rowformat to dynamic, and convert all columns to utf8mb4
Assignee
Comment 55•6 years ago
Attachment #8786833 -
Attachment is obsolete: true
Assignee
Comment 56•6 years ago
On a lark, I checked if I could even do this for bugzilla-dev: answer is no.
mysql> SET GLOBAL innodb_file_format=Barracuda;
ERROR 1227 (42000): Access denied; you need (at least one of) the SUPER privilege(s) for this operation
Assignee
Comment 57•6 years ago
An example of the turn-key support for emoji that we already have, once the db supports utf8mb4. I had no idea bug aliases could be emoji, but they can.
Assignee
Comment 58•6 years ago
Note I have been pushing this issue forward. Basically we need db space. We will archive approximately 1 million attachments, convert the db, and restore the attachments. Doing this safely required coordination and writing a secure way of archiving attachments.
Comment 59•6 years ago
That's great news! Good luck with the conversion!
Comment 61•6 years ago
Good to hear that progress is being made towards the real solution! Given that the issue (including its implications, e.g. truncated security-relevant information) is so grave: perhaps you want to add a warning to the top of each page, and keep it there until the real solution has been implemented. Nothing dynamic or interactive, nothing complex or time-intensive to implement, no UI to test in browsers - just some text, e.g. "Currently, UTF-8 characters (including emoji) are not supported". It could be like (or inside) the existing box at the top which contains "Looking for saved searches [...]", but e.g. in red.
Assignee
Comment 62•6 years ago
UTF-8 is supported. Only characters outside of the BMP are not supported. If you can write a javascript regex that matches all characters outside of the BMP, we can see about making that change. Meanwhile I have been tireless in trying to push out the real fix for this. :(
Comment 63•6 years ago
Looking forward to the real fix! Until then you could perhaps add a line to the box which currently has 'Looking for saved searches? click on "Search Bugs" above.' It could read eg "Currently, emoji (and other non-BMP unicode characters) are not supported."
Reporter
Comment 64•6 years ago
> If you can write a javascript regex that matches all characters outside of the BMP, we can see about making that change.
Javascript represents non-BMP characters as surrogate pairs [1], so it should suffice to regex match on /[\uD800-\uDFFF]/. FWIW we use this range (along with a couple of others) to avoid similar unicode-related problems in Firefox Accounts, see e.g: https://github.com/mozilla/fxa-auth-server/blob/master/lib/routes/validators.js#L22
[1] https://mathiasbynens.be/notes/javascript-encoding
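To show why matching the surrogate range works, here is a small sketch in Python rather than JavaScript. Python strings are sequences of code points, so the equivalent check matches U+10000 through U+10FFFF directly; the helper `utf16_code_units` is an illustrative name that exposes the UTF-16 code units a JavaScript string would hold.

```python
import re
from typing import List

NON_BMP = re.compile("[\U00010000-\U0010FFFF]")


def contains_non_bmp(text: str) -> bool:
    """True if any character lies outside the Basic Multilingual Plane."""
    return NON_BMP.search(text) is not None


def utf16_code_units(text: str) -> List[int]:
    """The 16-bit code units of the text, as a JavaScript string stores them."""
    data = text.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


poo = "\U0001F4A9"
print(contains_non_bmp("caf\u00e9"), contains_non_bmp("hi " + poo))
print([hex(unit) for unit in utf16_code_units(poo)])
```

Every non-BMP character becomes two code units in the D800 to DFFF range, and no valid BMP character uses that range, which is why the single character class in the comment above is sufficient on the JavaScript side.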
Assignee
Comment 65•6 years ago
So we need to make the bugzilla side of this change toggle-able. glob: How about in https://github.com/mozilla-bteam/bmo/pull/282, rather than removing the utf8 option, I make it take one of two (true) values (say, ['legacy', 'modern']), so that Bugzilla->params->{utf8} is always true, but when it's "modern" we check the row format, mysql config options, and use the character encoding 'utf8mb4' instead of the (current) default of 'utf8'?
Flags: needinfo?(glob)
Assignee
Comment 66•6 years ago
Of course, after bmo's database is updated I would remove the option totally. But I need to allow DBAs to change the character encoding without having to be around, as I am not available during the TCW.
Comment 67•6 years ago
(In reply to Dylan Hardison [:dylan] (he/him) from comment #65)
> glob: How about in https://github.com/mozilla-bteam/bmo/pull/282, rather
> than removing the utf8 option,
> I make it take one of two (true) values (say, ['legacy', 'modern']
so during the TCW a bmo admin will change that param, then run checksetup. in the event of failure it would be a case of setting the param back to legacy and, if required, restore the db? that sounds reasonable to me.
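The 'legacy'/'modern' toggle discussed in comments 65 to 67 could be sketched as below. This is a Python illustration of the decision logic only, not the Perl in the pull request; the function name, its arguments, and the choice to raise an error (rather than silently fall back) when the database is not ready are all assumptions.

```python
from typing import Dict

SUPPORTED_ROW_FORMATS = {"DYNAMIC", "COMPRESSED"}


def connection_charset(utf8_param: str,
                       row_formats: Dict[str, str],
                       large_prefix_on: bool) -> str:
    """Pick the MySQL connection charset for a given value of the utf8 param."""
    if utf8_param == "legacy":
        return "utf8"
    if utf8_param != "modern":
        raise ValueError(f"unknown utf8 setting: {utf8_param!r}")
    # 'modern': refuse to proceed unless the prerequisites are in place.
    unconverted = sorted(table for table, fmt in row_formats.items()
                         if fmt.upper() not in SUPPORTED_ROW_FORMATS)
    if unconverted or not large_prefix_on:
        raise RuntimeError(
            "database not ready for utf8mb4; unconverted tables: "
            f"{unconverted}, innodb_large_prefix on: {large_prefix_on}")
    return "utf8mb4"


print(connection_charset("legacy", {"t1": "Compact"}, False))
print(connection_charset("modern", {"t1": "Dynamic"}, True))
```

Because both values are non-empty strings, existing code that only tests the parameter for truthiness keeps working, and rolling back is a matter of setting the value to 'legacy' again.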
Flags: needinfo?(glob)
Assignee
Comment 70•6 years ago
An update for those watching this: We now have complete timing information on doing the DB migration (it's about six hours), so now we enter a negotiation process with the various stake-holders to find a time when bugzilla.mozilla.org can be down to migrate the DB to the new format. I'll update this bug once we have figured out when this will be.
Comment 72•5 years ago
Thanks … strange, I _did_ seek bugs with 1F44D in comments before reporting the duplicate …
Assignee
Comment 73•5 years ago
(In reply to Graham Perrin from comment #72)
> Thanks … strange, I _did_ seek bugs with 1F44D in comments before reporting
> the duplicate …
No worries. There are a lot of duplicates of this, and I would much prefer that it was already fixed. :-)
Assignee
Comment 74•5 years ago
Comment on attachment 8933860 (PR)
Moving the patch that implements utf8=utf8mb4 to a separate bug. We're going to be conducting a test migration of the database on staging soon, as preparation for migrating the production database during the tree closure window in early July.
Attachment #8933860 -
Attachment is obsolete: true
Assignee
Comment 75•5 years ago
We're clear for TCW on July 14th. July 14th, 2018 will be BMO Emoji Day \o/
Assignee
Comment 76•5 years ago
Announcement is now shown on BMO. The window will be July 14th, from 13:00 UTC to 21:00 UTC.
Updated•5 years ago
Comment 77•5 years ago
Assignee
Updated•5 years ago
Alias: bmo-emoji → bmo-emoji-💩
Assignee
Updated•5 years ago
Alias: bmo-emoji-💩 → 💩
Assignee
Updated•5 years ago
Blocks: bmo-emoji-restore-attachments
Assignee
Updated•5 years ago
Alias: 💩 → bmo-emoji
Assignee
Comment 78•5 years ago
Aside from a font issue with one heart in a text input on macOS (E.g. ❤️💛💚💙💜), emojis should work everywhere.
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Comment 79•5 years ago
For people who, like me, read this full bug report with all its comments, wondering all the way what "utf8mb4" was supposed to be, searched the Wikipedia, and found nothing, there is an informative (and IMHO very clear) blog article about it at https://mathiasbynens.be/notes/mysql-utf8mb4 In a nutshell: What MySQL calls "utf8" is not the real UTF-8 but only a subset of it, supporting only the BMP, i.e. the most used 5.88% of all possible codepoints, i.e. those from U+0000 to U+FFFF inclusive. It excludes all the rest, from U+10000 to U+10FFFF. The real UTF-8 is supported by MySQL 5.5.3 and later under the name "utf8mb4".
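The 5.88% figure quoted above is easy to check: the BMP is the first of Unicode's 17 planes of 65,536 code points each. A quick verification (counting code point values, including unassigned and surrogate ones):

```python
BMP_CODEPOINTS = 0xFFFF + 1      # U+0000 .. U+FFFF
ALL_CODEPOINTS = 0x10FFFF + 1    # U+0000 .. U+10FFFF

share = BMP_CODEPOINTS / ALL_CODEPOINTS
print(f"{BMP_CODEPOINTS} of {ALL_CODEPOINTS} code points = {share:.2%}")
```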
Assignee
Updated•5 years ago
Summary: Change rowformat to dynamic, and convert all columns to utf8mb4 → Fix emoji truncation by changing rowformat to dynamic, and convert all columns to utf8mb4
Keywords: bmo-big
Comment 80•2 years ago
good idea to set the collation to utf8mb4_unicode_ci, if the server defaults to utf8mb4_general_ci for the utf8_mb4