CakePHP's new auto-escaping leads to double-encoded HTML entities

RESOLVED FIXED in 3.4.2

Status

addons.mozilla.org Graveyard
Public Pages
RESOLVED FIXED
11 years ago
3 years ago

People

(Reporter: clouserw, Assigned: fligtar)

Tracking

Details

Attachments

(3 attachments)

(Reporter)

Description

11 years ago
Created attachment 311623 [details]
ntilde

To reproduce:

1) load https://preview.addons.mozilla.org/en-US/firefox/browse/type:3
2) Click "Install Dictionary" next to "Spanish (Spain)"
3) See ñ in the name (see attachment)
Assignee: nobody → fwenzel
Target Milestone: --- → 3.2
Summary: Dictionary names are over escaped in install dialogs → Dictionary names are over-escaped in install dialogs

Comment 1

11 years ago
I just checked and the entities you see are actually in these addons' names, so it's not a code bug in the dictionaries page.

I'll go in and fix them via the devcp on production.

I hope it's not an escape problem in the devcp (or it may have been one that was fixed a while ago).
Summary: Dictionary names are over-escaped in install dialogs → Spanish dictionary add-on names contain HTML entities
(Reporter)

Comment 2

11 years ago
(In reply to comment #1)
> I just checked and the entities you see are actually in these addons' names, so
> it's not a code bug in the dictionaries page.
> 
> I'll go in and fix them via the devcp on production.
> 
> I hope it's not an escape problem in the devcp (or it may have been one that
> was fixed a while ago).
> 

It's not just Spanish, look at the others.  It wasn't on the live site, so I assumed it was just an enscaping problem.  If it's changed in the db, it's not the first case of encoded-entities-in-the-database-that-haven't-been-changed I've seen today...

Comment 3

11 years ago
I just tried to change the Spanish dictionary's name in production (ID 3554) and it seems like the DevCP over-escapes localized strings when stored via /developers/edit/3554.

CCing fligtar who may be of help considering how the escaping works in the DevCP.

Fact is, on my dev copy of 3.2 as well as in production, when I fix the entities in the devCP and save them, double-encoded HTML entities show up in my database, instead of the expected UTF-8 entities.

Comment 4

11 years ago
Okay what I wrote was not entirely true: Saving a "ñ" entity results in a two-letter combination in the database (sounds like a multi-byte incompatibility issue, maybe something PHP5 is more picky about than PHP4?), which in turn gets over-escaped when being displayed in the dev CP's edit page, resulting in completely broken entities when clicking the "Update" button more than once because a layer of escaping is added each time.
Assignee: fwenzel → fligtar
(Assignee)

Comment 5

11 years ago
Created attachment 311825 [details] [diff] [review]
patch, v1

Discovered that this was being caused by the new version of Cake's eagerness to escape all html fields, which we also dealt with in submit buttons last week. This modifies our HtmlHelper override to not escape unless explicitly told to.
Attachment #311825 - Flags: review?(fwenzel)

Comment 6

11 years ago
Comment on attachment 311825 [details] [diff] [review]
patch, v1

Ah, yes that seems to magically fix the problem. Good job.
Attachment #311825 - Flags: review?(fwenzel) → review+
(Assignee)

Comment 7

11 years ago
Checked in, r11610. Woo!
Status: NEW → RESOLVED
Last Resolved: 11 years ago
Resolution: --- → FIXED
(Assignee)

Comment 8

11 years ago
Hmm, actually, re-opening as I guess we need to deal with the add-ons that were affected from a dev cp save.

We can either come up with SQL or wait until this lands in production and go through and edit them using the dev cp. I prefer option 2.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---

Comment 9

11 years ago
I concur. We'll also be able to see the affected add-ons better than. I hope there are not too many.
Summary: Spanish dictionary add-on names contain HTML entities → CakePHP's new auto-escaping leads to double-encoded HTML entities
(Assignee)

Comment 10

11 years ago
Hmm, select * from translations where localized_string like '%&%'; returned 358 rows. Including stuff like: über
Oh, that looks like it may become a lot of fun. However, maybe we need to make a little script to fix this then: remove multiple &s and then entity-decode the last remaining one.
(Assignee)

Comment 12

11 years ago
Well, we also need to convert the entity to the original character - it shouldn't be encoded in the db at all or we'll still have the problem.
Target Milestone: 3.2 → 3.2.1

Updated

10 years ago
Target Milestone: 3.2.1 → 3.4

Updated

10 years ago
Target Milestone: 3.4 → 3.4.2
(Assignee)

Comment 13

10 years ago
Created attachment 319909 [details] [diff] [review]
script to fix entities, v1

This is a script that will fix the corrupted translations in the database.

You can test it from the command prompt like this:
php -f fix_entities.php dryrun

As of right now, this query [select * from translations where localized_string like '%&%';] will return 352 rows. After the script is run it will return 0.
Attachment #319909 - Flags: review?(clouserw)
(Reporter)

Comment 14

10 years ago
Comment on attachment 319909 [details] [diff] [review]
script to fix entities, v1

There are still a lot of \" and \r\n in the fields, but I guess that's a different bug.  Nice work.
Attachment #319909 - Flags: review?(clouserw) → review+
(Assignee)

Comment 15

10 years ago
Script has been checked in, r12982.

After /bin is updated in production we can run it and call this fixed.
Status: REOPENED → ASSIGNED
Keywords: push-needed
(Assignee)

Comment 16

10 years ago
Script was run with 3.4.2 update.
Status: ASSIGNED → RESOLVED
Last Resolved: 11 years ago10 years ago
Keywords: push-needed
Resolution: --- → FIXED
Product: addons.mozilla.org → addons.mozilla.org Graveyard
You need to log in before you can comment on or make changes to this bug.