Bug 1253535 (bmo-emoji)

Fix emoji truncation by changing rowformat to dynamic, and convert all columns to utf8mb4

RESOLVED FIXED

Status

P1
normal
RESOLVED FIXED
3 years ago
7 months ago

People

(Reporter: rfkelly, Assigned: dylan)

Tracking

(Blocks: 1 bug, {bmo-big})

Details

Attachments

(2 attachments, 2 obsolete attachments)

(Reporter)

Description

3 years ago
Over in security-sensitive Bug 1253495 Comment 4, I tried to submit a comment containing the unicode "PILE OF POO" character.  It and all following characters were truncated from the comment.

Ironically, this seems very similar to the bug I was trying to comment on, where MySQL's "utf8" character set will silently truncate strings that contain non-BMP unicode characters:

  https://mathiasbynens.be/notes/mysql-utf8mb4

This can be a potential security issue if it happens on e.g. username or email fields, like in Bug 1253201.
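
To make the failure mode concrete, here is a minimal Python sketch of the truncation described above. It is only a simulation under stated assumptions: `truncate_like_mysql_utf8` is a hypothetical helper, not anything in Bugzilla or MySQL, and it models a non-strict MySQL session that cuts the value at the first character its 3-byte "utf8" cannot store.

```python
def truncate_like_mysql_utf8(text: str) -> str:
    """Simulate the reported behaviour: drop the first non-BMP character
    (code point above U+FFFF) and everything after it."""
    for index, char in enumerate(text):
        if ord(char) > 0xFFFF:
            return text[:index]
    return text


# A non-BMP character needs 4 bytes in real UTF-8; MySQL's "utf8" stores at most 3.
print(len("\U0001F6B2".encode("utf-8")))  # 4

# The comment pasted in the STR loses the bicycle and everything after it.
print(repr(truncate_like_mysql_utf8("hello \U0001F6B2 world")))  # 'hello '
```

The same cut applied to an email-like value shows why truncation in identity fields is worrying: everything after the emoji, including a different domain, silently disappears.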

STR:

1.  Type a non-BMP character like  \U0001f4a9 ("PILE OF POO") or  \U0001F6B2 ("BICYCLE") into the bugzilla comment form.

2.  Submit the comment.

Results:

* The character and all text following it are truncated.

Expected results:

* A nice unicode glyph.

I'm going to try to replicate it by pasting the output of `python -c 'print u"hello \U0001F6B2 world"'` at the end of this comment, and seeing whether I get a bicycle...

hello 
(Reporter)

Comment 1

3 years ago
Nope, the bicycle was truncated :-(
(Reporter)

Comment 2

3 years ago
FYI, over in Bug 1253495 I was able to forge a valid BrowserID assertion for this email:

   'rfkelly@mozilla.com\U0001f4a9\n@mocker.dev.lcip.org'

I don't know much about how bugzilla handles the emails returned by persona, but if I were able to login to bugzilla with that assertion and it got truncated in storage, it might let me login as "rfkelly@mozilla.com"...
(Reporter)

Comment 3

3 years ago
> I don't know much about how bugzilla handles the emails returned by persona

I'm pleased to confirm that when I tried logging in with such an assertion, bugzilla did its own validation of the returned email and rejected it as invalid.

Comment 4

3 years ago
Dupe of bug 405011.
FYI we are currently merging in all the upstream work from the master branch of Bugzilla (the general product) into BMO, including the above bug fix (which has been requested by many people lately).  We've just been cherrypicking fixes and features until now.  The merge has raised some interesting problems with our infrastructure (availability of new Perl packages and such), but we're finally getting close.  We'll have a test host with the merged code set up within a week or two.
bug 405011 is a kludge at best; it only "fixes" comments (not summaries, or any other text field) and requires weird search shenanigans.

the real fix is to use utf8mb4 everywhere.  upstream are blocked by the long tail of older mysql versions but bmo has no such limitation.
Summary: Comments containing non-BMP unicode characters are truncated → Convert to utf8mb4 everywhere (Comments containing non-BMP unicode characters are truncated)
Group: bugzilla-security
See Also: → bug 1193278
(In reply to Frédéric Buclin from comment #4)
> Dupe of bug 405011.

That bug is in upstream Bugzilla. This is filed against bugzilla.mozilla.org so it can't be a dupe. It could morph into "backport upstream's fix" but that would be "depends on", not "dupe". As glob points out bug 405011 wouldn't fix the fields which were Ryan's real concern.
We're not ready to talk about bug 1253495 in public yet. re-hiding.
Group: mozilla-employee-confidential
Duplicate of this bug: 1265948
bug 1253495 is public, opening.
Group: mozilla-employee-confidential
Duplicate of this bug: 1280003
Duplicate of this bug: 1299422
(Assignee)

Updated

3 years ago
Assignee: nobody → dylan
(Assignee)

Comment 14

3 years ago
Created attachment 8786872 [details]
Screen Shot 2016-08-31 at 14.19.00.png

it appears this works.
(Assignee)

Comment 15

3 years ago
so this patch would do all the ALTER TABLE / ALTER DATABASE stuff from utf8 to utf8mb4. It takes a while (a few hours on a dev db). What other considerations would be required for production rollout?

For one, I bet the mysqlclient libraries on centos 6 (RHEL 6) are too old (libmysqlclient.so.16).
Are we running 5.5.something on the server side?

Also this will change the behavior of indexes, but I'm not sure I understand how. In the case when the data is still the ASCII subset of utf8, does it really change anything going from utf8 to utf8mb4?
Flags: needinfo?(scabral)
Test, test, test. create a table, put stuff in it, convert, see what happens to the existing data, see what happens to new data.

Because it's only the character set that is changing, we can do the slaves first, then failover. It means that there will be some overlap time where data may be different, but in the span of a day or 2 I wouldn't think that would matter and wouldn't affect much data.

(to do one machine at a time, put "set sql_log_bin=0;" at the beginning of the statements).
Flags: needinfo?(scabral)
The indexes will need to be rebuilt, because a change in character set (and collation) changes how things are ordered (e.g. the new characters have an ordering to them).

It shouldn't change anything lexically that already exists, a will still be before b, but it will put the new allowed characters into the right place lexically.
(Assignee)

Comment 18

3 years ago
one further q, is libmysqlclient.so.16 too old to support utf8mb4?
Flags: needinfo?(scabral)
I have no idea, but if it is, we can upgrade that. Doing tests on dev/stage will illuminate what we need to do. Or, let me know if you want me to google that to see if I can research the answer?
Flags: needinfo?(scabral)
(Assignee)

Comment 20

3 years ago
ah, I will research it. You just almost always know everything about mysql so I get lazy. :-D
(Assignee)

Updated

2 years ago
See Also: → bug 405011
(Assignee)

Comment 21

2 years ago
This is the bug to follow for getting emojis everywhere in BMO. :-)
Flags: needinfo?(cincodenada)
(Assignee)

Updated

2 years ago
Blocks: 1212310

Comment 22

2 years ago
(In reply to Dylan Hardison [:dylan] from comment #21)

Aha, thanks for the redirection! Glad to see it's under way, much appreciated.
Flags: needinfo?(cincodenada)
It may be a good idea to set the collation to `utf8mb4_unicode_ci`, if the server defaults to `utf8mb4_general_ci` for the `utf8mb4` character set. Can be checked with `SHOW CHARACTER SET LIKE 'utf8mb4'`.

> https://stackoverflow.com/questions/766809
> https://dev.mysql.com/doc/refman/5.5/en/show-character-set.html
(Assignee)

Comment 24

2 years ago
Alright, so if we can use large prefixes we don't have to worry about the index size or the column size being larger than 191...

Sheeri: Would there be anything preventing BMO's DBs using this?
http://dev.mysql.com/doc/refman/5.6/en/innodb-parameters.html#sysvar_innodb_large_prefix
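
For reference, the 191 figure falls out of simple arithmetic. The sketch below assumes the documented InnoDB index key prefix limits (767 bytes by default, 3072 bytes with large prefixes on DYNAMIC or COMPRESSED rows); `max_indexable_chars` is a hypothetical helper name used only for illustration.

```python
def max_indexable_chars(prefix_limit_bytes: int, max_bytes_per_char: int) -> int:
    """Longest column prefix (in characters) that fits in an index key."""
    return prefix_limit_bytes // max_bytes_per_char


DEFAULT_LIMIT = 767        # bytes, without innodb_large_prefix
LARGE_PREFIX_LIMIT = 3072  # bytes, with innodb_large_prefix

print(max_indexable_chars(DEFAULT_LIMIT, 3))       # 255 (utf8, up to 3 bytes per char)
print(max_indexable_chars(DEFAULT_LIMIT, 4))       # 191 (utf8mb4, up to 4 bytes per char)
print(max_indexable_chars(LARGE_PREFIX_LIMIT, 4))  # 768 (utf8mb4 with large prefixes)
```

So a VARCHAR(255) index that just fits under utf8 no longer fits under utf8mb4 unless the larger limit is available.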
Flags: needinfo?(scabral)
(Assignee)

Comment 25

2 years ago
Revised questions:

1) what would the time frame be on changing all our tables to use the newer barracuda format?
2) are there downsides or blockers to using barracuda or large prefixes with replication?
1) We can do stage whenever you are ready. I'd just want to know if you also want to compress the tables while we're at it (makes the table access faster)
2) We'd want to make sure all the slaves are set up properly, including AWS, but there shouldn't be an issue there.

It's theoretically possible to do this without downtime in production.

Here's a sample plan:
- Take a slave out of the load balancer
- Convert one table on the slave
- Watch replication (for, say, 24h) make sure the table doesn't break replication
- Convert the rest of the tables on the slave
- Put the slave back in the load balancer
- Take another slave out of the load balancer, and use innobackupex to do an online sync
- Repeat previous step until all slaves (including backups) are converted
- Failover so a converted slave is now the master, and resync the old master.

We can test this plan in stage.
Flags: needinfo?(scabral)
FWIW AWS does allow the innodb_large_prefix option.
(Assignee)

Updated

2 years ago
Keywords: bmo-big
(Assignee)

Comment 28

2 years ago
You should be made aware of this bug. I'll be working on scheduling this change on staging sometime next week to test out what Sheeri had proposed. hopefully that works for you.
Flags: needinfo?(mpressman)
thanks :dylan - I reviewed what's going on here. I will be in SF next week for planning meetings, but I should be available for any Q's you may have while getting stage setup.
Flags: needinfo?(mpressman)
Duplicate of this bug: 1319754
Summary: Convert to utf8mb4 everywhere (Comments containing non-BMP unicode characters are truncated) → Convert to utf8mb4 everywhere - text containing non-BMP unicode characters (eg. emoji) is truncated
(Assignee)

Updated

2 years ago
Alias: bmo-emoji
(Assignee)

Updated

2 years ago
Depends on: 1328659
(Assignee)

Comment 31

2 years ago
dkl: proposal for this patch:

check if the database has all the right options turned on (3 config options + table format + row format)
and then use utf8mb4, falling back to utf8 otherwise. if we do that this patch can go out ahead of the work planned, yes?
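
A minimal sketch of that fallback logic, in Python rather than Bugzilla's Perl. Everything here is hypothetical: `choose_charset` and the two dictionaries are made up for illustration, and only the options actually named in this bug (innodb_large_prefix, innodb_file_format, row format) are checked, since the third config option is not spelled out here.

```python
def choose_charset(server_vars: dict, table_row_formats: dict) -> str:
    """Return 'utf8mb4' only when every precondition holds, else fall back to 'utf8'.

    server_vars:       e.g. the result of SHOW VARIABLES as {name: value}
    table_row_formats: {table_name: row_format} for every table in the database
    """
    large_prefix = str(server_vars.get("innodb_large_prefix", "OFF")).upper() in ("ON", "1")
    barracuda = str(server_vars.get("innodb_file_format", "")).lower() == "barracuda"
    rows_ok = all(
        fmt.upper() in ("DYNAMIC", "COMPRESSED") for fmt in table_row_formats.values()
    )
    return "utf8mb4" if (large_prefix and barracuda and rows_ok) else "utf8"


ready = {"innodb_large_prefix": "ON", "innodb_file_format": "Barracuda"}
print(choose_charset(ready, {"bugs": "Dynamic", "longdescs": "Compressed"}))  # utf8mb4
print(choose_charset(ready, {"bugs": "Compact"}))                             # utf8
```

The point of the design is that the patch stays inert on an unconverted database and switches behaviour on its own once the DB work is done.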
Flags: needinfo?(dkl)
(In reply to Dylan Hardison [:dylan] from comment #31)
> dkl: proposal for this patch:
> 
> check if the database has all the right options turned on (3 config options
> + table format + row format)
> and then use utf8mb4, falling back to utf8 otherwise. if we do that this
> patch can go out ahead of the work planned, yes?

Sounds good to me. Did we ever figure out how to handle indexes that are maxed out with utf8 (1-3 bytes)? Once they are converted to utf8mb4, will we not lose that last bit of data and hurt our index performance? Is this something we may just have to live with?

dkl
Flags: needinfo?(dkl)
From discussion on #sumo about this, there is concern in two areas:

1. The difficulty of entering emoji on MacOS/iOS platforms
2. Accessibility of emoji in screen readers

I want a11y to weigh in on this so asking :marcoz for their feedback. Depending on their thoughts on this, I may want us to restrict emoji from being used in the short-description/title field of bugs so that we don't impede people searching on bugs. 

Though, a :poop: emoji leaderboard by product/component would have been a glorious thing to make a dashboard of.
Flags: needinfo?(mzehe)
(In reply to Emma Humphries ☕️ [:emceeaich] (UTC-8) +needinfo me from comment #33)
> From discussion on #sumo about this, there is concern in two areas:
> 
> 1. The difficulty of entering emoji on MacOS/iOS platforms

On MacOS, you press Ctrl+Cmd+Space to bring up either a small or big window where you can search for, and select, Emojis. On the other hand, I have no idea how you enter those on Windows, for example, ;)

On iOS, it's a matter of adding another keyboard and switching to it. I use Emojis on iOS very frequently.

> 2. Accessibility of emoji in screen readers

It varies. On Windows, support is just getting started. For NVDA, there is an add-on, for JAWS (the market-leading commercial screen reader), I don't know the current state, and there are many older versions out there due to high upgrade costs that definitely don't support them.

On MacOS and iOS, support for Unicode Emojis is fully implemented in VoiceOver. Apple is updating the supported set every year, and they're also accessible and described in localized form right from the start.

On Android, support varies by Android version, but more recent versions of TalkBack do have support.

> Depending on their thoughts on this, I may want us to restrict emoji from
> being used in the short-description/title field of bugs so that we don't
> impede people searching on bugs. 

I believe, to keep it simple and accessible to the widest variety of our users, that's probably a good idea in any case. Not all international users have the same understanding of emojis, and you never know which of the various smiling or grinning faces the original author might have used.

> Though, a :poop: emoji leaderboard by product/component would have been a
> glorious thing to make a dashboard of.

I definitely agree on that! :-D
Flags: needinfo?(mzehe)
:MarcoZ, I typoed myself on my first question. I meant difficulty of entering emoji on NON-Mac OS and iOS platforms. 

But you agree that short-descriptions should be non-emoji only.
Flags: needinfo?(mzehe)
(In reply to Emma Humphries ☕️ [:emceeaich] (UTC-8) +needinfo me from comment #35)
> But you agree that short-descriptions should be non-emoji only.

Yes.
Flags: needinfo?(mzehe)
I'd like us not to allow emoji in the short-description/title. :dkl says that we can just filter those out at the application level. 

Alternatively, we can allow emoji, but not search for them, which would require work in search, and I'd like to minimize the work to get this feature out, so unless there's a strong argument for this, let's filter emoji at the application layer.
(Assignee)

Comment 38

2 years ago
(In reply to Emma Humphries ☕️ [:emceeaich] (UTC-8) +needinfo me from comment #37)
> I'd like us not to allow emoji in the short-description/title. :dkl says
> that we can just filter those out at the application level. 
> 
> Alternatively, we can allow emoji, but not search for them, which would
> require work in search, and I'd like to minimize the work to get this
> feature out, so unless there's a strong argument for this, let's filter
> emoji at the application layer.

This strikes me as bad for a number of reasons. 

1) we've never restricted what goes into the summary
2) engineers may well need to report bugs *about* emoji, so having one in the summary is even useful for searching.
3) including these in the summary was one of the original complaints that is driving this work.

Additionally, in the worst case (someone creates an unreadable summary, on a bug that is otherwise "good")
the summaries are editable. I change summaries to better reflect bugs all the time. 

A totally useless bug with :poop: :poop: :poop: in the summary should be RESOLVED INVALID and moved to invalid bugs.

I think these concerns make sense for keywords (possibly) and already apply to aliases.
Bug aliases must match \w+ currently. In perl, that matches any letter or number or _ in any language.
Perhaps we should restrict *aliases* to [A-Za-z0-9_]+ :)

(Note that currently aliases are effectively restricted to that, but only because of incorrect unicode handling in parts of bugzilla. This protects us from the fact that we treat \d as [0-9] but it actually means "a digit in any language"...)
(In reply to Dylan Hardison [:dylan] from comment #38)

> Additionally, in the worst case (someone creates an unreadable summary, on a
> bug that is otherwise "good")
> the summaries are editable. I change summaries to better reflect bugs all
> the time. 
> 
> A totally useless bug with :poop: :poop: :poop: in the summary should be
> RESOLVED INVALID and moved to invalid bugs.

Good points, I'm removing this as a blocker and let's continue. The requirement will be to monitor what happens and update our 1st level triage guidelines.
(Assignee)

Comment 40

2 years ago
(In reply to David Lawrence [:dkl] from comment #32)
> Sounds good to me. Did we ever figure out how to handle indexes that are
> maxed out with utf8 (1-3 bytes)? Once they are converted to utf8mb4, will we
> not lose that last bit of data and hurt our index performance? Is this
> something we may just have to live with?

The magical setting is innodb_large_prefix: https://dev.mysql.com/doc/refman/5.5/en/innodb-parameters.html#sysvar_innodb_large_prefix

This raises the index key prefix limit, so a full 255-character utf8mb4 column can still be indexed; no downsides.

In mysql 5.7 this is even the default.
(Assignee)

Updated

2 years ago
Priority: -- → P1
(Assignee)

Updated

2 years ago
Duplicate of this bug: 1336993
(Assignee)

Updated

2 years ago
Duplicate of this bug: 1341122
Reminder to see if we can deploy a patch ahead of time with minimal pain.
Flags: needinfo?(dylan)
(Assignee)

Updated

2 years ago
Flags: needinfo?(dylan)
Summary: Convert to utf8mb4 everywhere - text containing non-BMP unicode characters (eg. emoji) is truncated → Detect if mysql can support utf8mb4, and if using utf8, convert to utf8mb4
(Assignee)

Updated

2 years ago
Duplicate of this bug: 1344012
(Assignee)

Updated

2 years ago
Duplicate of this bug: 1387335
(Assignee)

Comment 46

2 years ago
I think we need to allocate the time to perform the conversion of the mysql tables.

Let me review my understanding of what needs to be true in the DB for full utf8 to be supported:

1. we need to use DYNAMIC or COMPRESSED row formats: https://dev.mysql.com/doc/refman/5.7/en/innodb-row-format-dynamic.html
2. the option innodb_large_prefix needs to be turned on https://dev.mysql.com/doc/refman/5.6/en/innodb-parameters.html#sysvar_innodb_large_prefix. This large prefix option is directly related to the concern in comment 32.
3. every table needs to switch from utf8 to utf8mb4 (https://dev.mysql.com/doc/refman/5.5/en/charset-unicode-utf8mb4.html)

If 1 and 2 are true, then my draft patch will just work. 
Currently this bug is targeted around code that queries the row format and the innodb_large_prefix option, but I haven't had time to actually get it working. 

Rather than continue to support utf8 *and* utf8mb4, I propose BMO only support utf8mb4 and bug 1328659 really needs to proceed (to at least do 1 and 2, the utf8mb4 conversion can be handled by application-space code).


It's 2017, do we *require* support for unicode characters outside the BMP in our data store?
Flags: needinfo?(dkl)
(In reply to Dylan Hardison [:dylan] (he/him) from comment #46)
> I think we need to allocate the time to perform the conversion of the mysql
> tables.
> 
> Let me review my understanding of what needs to be true in the DB for full
> utf8 to be supported:
> 
> 1. we need to use DYNAMIC or COMPRESSED row formats:
> https://dev.mysql.com/doc/refman/5.7/en/innodb-row-format-dynamic.html
> 2. the option innodb_large_prefix needs to be turned on
> https://dev.mysql.com/doc/refman/5.6/en/innodb-parameters.html#sysvar_innodb_large_prefix.
> This large prefix option is directly related to the concern in comment 32.
> 3. every table needs to switch from utf8 to utf8mb4
> (https://dev.mysql.com/doc/refman/5.5/en/charset-unicode-utf8mb4.html)
> 
> If 1 and 2 are true, then my draft patch will just work. 
> Currently this bug is targeted around code that queries the row format and
> the innodb_large_prefix option, but I haven't had time to actually get it
> working. 
> 
> Rather than continue to support utf8 *and* utf8mb4, I propose BMO only
> support utf8mb4 and bug 1328659 really needs to proceed (to at least do 1
> and 2, the utf8mb4 conversion can be handled by application-space code).
> 
> 
> It's 2017, do we *require* support for unicode characters outside the BMP in
> our data store?

Sorry for the delay on this and also the fact that I needed to refresh my
understanding of the issue since I have been detached from this for quite some
time.

I do not see the need to support characters outside of the BMP at this time
as I feel it is better to make the improvements incrementally and going from
utf8 to utf8mb4 seems to be the least resistance moving forward.

Let me know how I can help in the matter.

dkl
Flags: needinfo?(dkl)
Duplicate of this bug: 1379423
See Also: → bug 868867
So Millennials want to use emojis! If the MySQL change is difficult, how about storing emoji codes like :+1: or :smile: in the database instead of emoji themselves? I guess GitHub, Slack and others do it, though the implementations vary.

* On GitHub, available emojis are limited to https://www.webpagefx.com/tools/emoji-cheat-sheet/
* On Slack, all Unicode emojis plus custom emojis are supported; each character has a code
(Assignee)

Comment 50

2 years ago
If someone finds a method of DBD::Mysql that lets you apply a transform to every string that goes in or comes out, I'll consider it. We'd use a Private-Use range unicode character and encode using charnames::viacode().

perl -Mcharnames=:full -E 'use utf8; say charnames::viacode(0x1f44d)'
THUMBS UP SIGN
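
For anyone who wants to see what such a transform might look like, here is a rough Python analogue of the one-liner above, using the standard `unicodedata` module. It is a sketch only: `encode_non_bmp` and `decode_non_bmp` are hypothetical names, and the `\N{NAME}` marker is an arbitrary choice for illustration, not the Private-Use scheme mentioned above.

```python
import re
import unicodedata


def encode_non_bmp(text: str) -> str:
    """Replace each named non-BMP character with a \\N{NAME} marker."""
    out = []
    for char in text:
        name = unicodedata.name(char, None) if ord(char) > 0xFFFF else None
        out.append("\\N{%s}" % name if name else char)
    return "".join(out)


def decode_non_bmp(text: str) -> str:
    """Turn \\N{NAME} markers back into the characters they name."""
    return re.sub(
        r"\\N\{([A-Z0-9 -]+)\}",
        lambda match: unicodedata.lookup(match.group(1)),
        text,
    )


print(unicodedata.name("\U0001F44D"))      # THUMBS UP SIGN
print(encode_non_bmp("nice \U0001F44D"))   # nice \N{THUMBS UP SIGN}
```

Note the weaknesses: text that already contains a literal `\N{...}` sequence would be changed on the way back out, and every read and write path has to apply the transform, which is why a real utf8mb4 column is the simpler fix.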
(Assignee)

Comment 51

2 years ago
But otherwise this is a clear case of a misuse of resources.

Comment 52

a year ago
The discussion of codes like :smile: is interesting. Still, it shouldn't become a distraction from the real issue here, which is the failure to support Unicode.

This bug seems to be a duplicate of other bugs including 857438 (https://bugzilla.mozilla.org/show_bug.cgi?id=857438), which is five years old. It should be fixed without further delay.

The solution proposed above looks good: change utf8 to utf8mb4, and use innodb_large_prefix and DYNAMIC/COMPRESSED row formats.

Comment 53

a year ago
> But otherwise this is a clear case of a misuse of resources.

Maybe I don't understand, but I hope you're not advocating against fixing this. I included an emoji in a rather long comment, ended up losing most of the comment, only noticed a day later, and managed to confuse various colleagues in the process. As it's now fairly normal to include emojis in text conversation having Bugzilla behave in this way seems rather troublesome. (And if you search on Google you'll find other Bugzilla instances also encountered this problem, indicating more lost time.)
(Assignee)

Updated

a year ago
Summary: Detect if mysql can support utf8mb4, and if using utf8, convert to utf8mb4 → Convert all columns from utf8 to utf8mb4
(In reply to Anne (:annevk) from comment #53)
> > But otherwise this is a clear case of a misuse of resources.
> 
> Maybe I don't understand, but I hope you're not advocating against fixing
> this. I included an emoji in a rather long comment, ended up losing most of
> the comment, only noticed a day later, and managed to confuse various
> colleagues in the process. As it's now fairly normal to include emojis in
> text conversation having Bugzilla behave in this way seems rather
> troublesome. (And if you search on Google you'll find other Bugzilla
> instances also encountered this problem, indicating more lost time.)

I meant finding an alternate encoding -- an application level encoding! --
is a waste of resources when the database can support real utf8 just fine.

https://twitter.com/dylan_hardison/status/936679922730524672

There is now a bug for switching the two settings -- bug 1422427.

The rough outline of what needs to happen is that those two settings must be changed,
then the innodb row format can be converted to dynamic (or compressed),
and after that the charset can be set to utf8mb4.
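
As an illustration of the last two steps in that outline, the following Python sketch prints the kind of per-table statements involved. It is not what the pull request does; `conversion_statements` is a hypothetical helper, the table names are only examples, the collation follows the `utf8mb4_unicode_ci` suggestion made earlier in this bug, and it assumes the server settings have already been changed.

```python
def quote_identifier(name: str) -> str:
    """Quote a MySQL identifier with backticks, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


def conversion_statements(tables, collation="utf8mb4_unicode_ci"):
    """For each table: switch the row format first, then convert the charset."""
    statements = []
    for table in tables:
        quoted = quote_identifier(table)
        statements.append(f"ALTER TABLE {quoted} ROW_FORMAT=DYNAMIC;")
        statements.append(
            f"ALTER TABLE {quoted} CONVERT TO CHARACTER SET utf8mb4 COLLATE {collation};"
        )
    return statements


for statement in conversion_statements(["bugs", "longdescs"]):
    print(statement)
```

The ordering matters: converting the charset before the row format allows large index prefixes would fail on any indexed column longer than 191 characters.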

For the latter two events, I've made https://github.com/mozilla-bteam/bmo/pull/282.
If that were to land now, bmo would cease to function until bug 1422427 happened, so we're not any closer to fixing this at the moment.
Summary: Convert all columns from utf8 to utf8mb4 → Change rowformat to dynamic, and convert all columns to utf8mb4
Created attachment 8933860 [details] [review]
PR
Attachment #8786833 - Attachment is obsolete: true
On a lark, I checked if I could even do this for bugzilla-dev: answer is no.

mysql> SET GLOBAL innodb_file_format=Barracuda;
ERROR 1227 (42000): Access denied; you need (at least one of) the SUPER privilege(s) for this operation
Created attachment 8933861 [details]
Screen Shot 2017-12-01 at 22.08.07.png

An example of the turn-key support for emoji that we already have, once the db supports utf8mb4.

I had no idea bug aliases could be emoji, but they can.
Note I have been pushing this issue forward. Basically we need db space. We will archive approximately 1 million attachments, convert the db, and restore the attachments. Doing this safely required coordination and writing a secure way of archiving attachments.

Comment 59

a year ago
That's great news! Good luck with the conversion!
Duplicate of this bug: 1426628

Comment 61

a year ago
Good to hear that progress is made towards the real solution!

Given that the issue (including its implications, eg truncated security-relevant information) is so grave:

Perhaps you want to add a warning to the top of each page, and keep it there until the real solution has been implemented. Nothing dynamic or interactive, nothing complex or time-intensive to implement, no UI to test in browsers - just some text, eg: 
"Currently, UTF-8 characters (including emoji) are not supported". It could be like (or inside) the existing box at the top which contains "Looking for saved searches [...]", but eg in red.
UTF-8 is supported. Only characters outside of the BMP are not supported.

If you can write a javascript regex that matches all characters outside of the BMP, we can see about making that change. Meanwhile I have been tireless in trying to push out the real fix for this. :(
(Assignee)

Updated

a year ago
Blocks: 1428085

Comment 63

a year ago
Looking forward to the real fix!

Until then you could perhaps add a line to the box which currently has
'Looking for saved searches? click on "Search Bugs" above.'

It could read eg
"Currently, emoji (and other non-BMP unicode characters) are not supported."
(Reporter)

Comment 64

a year ago
> If you can write a javascript regex that matches all characters outside of the BMP, we can see about making that change.

Javascript represents non-BMP characters as surrogate pairs [1], so it should suffice to regex match on  /[\uD800-\uDFFF]/.  FWIW we use this range (along with a couple of others) to avoid similar unicode-related problems in Firefox Accounts, see e.g:

  https://github.com/mozilla/fxa-auth-server/blob/master/lib/routes/validators.js#L22

[1] https://mathiasbynens.be/notes/javascript-encoding
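
For comparison outside the browser, a small Python sketch of the same check. Python strings are sequences of code points rather than UTF-16 code units, so the direct test is a code point range; the second function mirrors the JavaScript approach by looking at UTF-16 code units. Both function names are hypothetical.

```python
import re

# Any code point above the Basic Multilingual Plane.
NON_BMP = re.compile("[\U00010000-\U0010FFFF]")


def first_non_bmp(text: str):
    """Index of the first non-BMP character, or None if there is none."""
    match = NON_BMP.search(text)
    return match.start() if match else None


def has_surrogate_code_units(text: str) -> bool:
    """Mirror of the JS regex: true if the UTF-16 form contains any surrogate."""
    data = text.encode("utf-16-be", "surrogatepass")
    units = (int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2))
    return any(0xD800 <= unit <= 0xDFFF for unit in units)


print(first_non_bmp("hello \U0001F6B2 world"))   # 6
print(has_surrogate_code_units("\U0001F6B2"))    # True
print(has_surrogate_code_units("h\u00e9llo"))    # False
```

Either check could back a client-side warning before submit; it would not replace the database fix.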
So we need to make the bugzilla side of this change toggle-able.

glob: How about in https://github.com/mozilla-bteam/bmo/pull/282, rather than removing the utf8 option, 
I make it take one of two (true) values (say, ['legacy', 'modern'])
-- so that Bugzilla->params->{utf8} is always true...
but when it's "modern" we check the row format, mysql config options, and use the character encoding 'utf8mb4' instead of the (current) default of 'utf8'.
Flags: needinfo?(glob)
Of course, after bmo's database is updated I would remove the option totally. But I need to allow DBAs to change the character encoding without having to be around, as I am not available during the TCW.
(In reply to Dylan Hardison [:dylan] (he/him) from comment #65)
> glob: How about in https://github.com/mozilla-bteam/bmo/pull/282, rather
> than removing the utf8 option, 
> I make it take one of two (true) values (say, ['legacy', 'modern'])

so during the TCW a bmo admin will change that param, then run checksetup.  in the event of failure it would be a case of setting the param back to legacy and, if required, restore the db?

that sounds reasonable to me.
Flags: needinfo?(glob)
(Assignee)

Updated

11 months ago
Duplicate of this bug: 1451259
Duplicate of this bug: 1460216
(Assignee)

Comment 70

9 months ago
An update for those watching this:


We now have complete timing information on doing the DB migration (it's about six hours),
so now we enter a negotiation process with the various stake-holders to find a time when bugzilla.mozilla.org can be down to migrate the DB to the new format.

I'll update this bug once we have figured out when this will be.
(Assignee)

Updated

8 months ago
Duplicate of this bug: 1470500

Comment 72

8 months ago
Thanks … strange, I _did_ seek bugs with 1F44D in comments before reporting the duplicate …
(Assignee)

Comment 73

8 months ago
(In reply to Graham Perrin from comment #72)
> Thanks … strange, I _did_ seek bugs with 1F44D in comments before reporting
> the duplicate …

No worries. There are a lot of duplicates of this, and I would much prefer that it was already fixed. :-)
(Assignee)

Comment 74

8 months ago
Comment on attachment 8933860 [details] [review]
PR

Moving the patch that implements utf8=utf8mb4 to a separate bug.

We're going to be conducting a test migration of the database on staging to this soon, as preparation for migrating the production database during the tree closure window, in early July.
Attachment #8933860 - Attachment is obsolete: true
(Assignee)

Updated

8 months ago
Depends on: 1471612
(Assignee)

Comment 75

8 months ago
We're clear for TCW on July 14th. July 14th, 2018 will be BMO Emoji Day \o/
(Assignee)

Comment 76

8 months ago
Announcement is now shown on BMO. The window will be July 14th, from 13:00 UTC to 21:00 UTC.
No longer depends on: 1421401
See Also: → bug 1421401
(Assignee)

Updated

7 months ago
Alias: bmo-emoji → bmo-emoji-💩
(Assignee)

Updated

7 months ago
Alias: bmo-emoji-💩 → 💩
(Assignee)

Updated

7 months ago
Blocks: 1475801
(Assignee)

Updated

7 months ago
Alias: 💩 → bmo-emoji
(Assignee)

Comment 78

7 months ago
Aside from a font issue with one heart in a text input on macOS (E.g. ❤️💛💚💙💜), emojis should work everywhere.
Status: NEW → RESOLVED
Last Resolved: 7 months ago
Resolution: --- → FIXED
For people who, like me, read this full bug report with all its comments, wondering all the way what "utf8mb4" was supposed to be, searched the Wikipedia, and found nothing, there is an informative (and IMHO very clear) blog article about it at https://mathiasbynens.be/notes/mysql-utf8mb4

In a nutshell: What MySQL calls "utf8" is not the real UTF-8 but only a subset of it, supporting only the BMP, i.e. the most used 5.88% of all possible codepoints, i.e. those from U+0000 to U+FFFF inclusive. It excludes all the rest, from U+10000 to U+10FFFF. The real UTF-8 is supported by MySQL 5.5.3 and later under the name "utf8mb4".
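
The numbers in that summary are easy to check. A short Python sketch, assuming nothing beyond the Unicode code point range U+0000 to U+10FFFF:

```python
BMP_CODE_POINTS = 0x10000    # U+0000 through U+FFFF
ALL_CODE_POINTS = 0x110000   # U+0000 through U+10FFFF

share = BMP_CODE_POINTS / ALL_CODE_POINTS
print(f"{share:.2%}")  # 5.88%

# The BMP boundary is also the 3-byte / 4-byte boundary in UTF-8,
# which is why a 3-byte-per-character column stops exactly there.
print(len("\uffff".encode("utf-8")))      # 3
print(len("\U00010000".encode("utf-8")))  # 4
```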
(Assignee)

Updated

7 months ago
Summary: Change rowformat to dynamic, and convert all columns to utf8mb4 → Fix emoji truncation by changing rowformat to dynamic, and convert all columns to utf8mb4