Open Bug 944668 Opened 10 years ago Updated 3 years ago

Whine to console when a document is decoded as "replacement" or "x-user-defined"

Categories

(Core :: DOM: HTML Parser, enhancement, P5)

Tracking

()

People

(Reporter: hsivonen, Unassigned)

Details

(Keywords: good-first-bug)

We should give feedback in the developer tools about "replacement" and "x-user-defined".

Bulk-downgrade of unassigned, >=5 years untouched DOM/Storage bugs' priority and severity.

If you have reason to believe this is wrong, please write a comment and ni :jstutte.

Severity: normal → S4
Priority: -- → P5
Keywords: good-first-bug

Hi,

I am new to contributing to Mozilla. I think I can work on this bug. Could you please elaborate on it so that I can understand what changes are required and where?

Thank You,
Kartik Gautam

Flags: needinfo?(hsivonen)

See https://searchfox.org/mozilla-central/rev/8d722de75886d6bffc116772a1db8854e34ee6a7/parser/html/nsHtml5StreamParser.cpp#1722-1738 for existing messages. Around that code, regardless of mCharsetSource, you could check for mEncoding == REPLACEMENT_ENCODING and mEncoding == X_USER_DEFINED_ENCODING and add messages accordingly.

(Unfortunately, I won't be reading bugmail after today until January 11th.)

Flags: needinfo?(hsivonen)

The message for replacement should probably be something along the lines of "A legacy encoding that would have been a cross-site scripting hazard was declared as the character encoding. The replacement encoding was used instead. The page should be migrated to UTF-8." and the message for x-user-defined should probably be along the lines of "x-user-defined was declared on the HTTP layer (where it has a different meaning compared to HTML meta). The page should be migrated to UTF-8."

Can anyone tell me how I can test this for "replacement" or "x-user-defined"?

(In reply to Kartik Gautam from comment #5)

Can anyone tell me how I can test this for "replacement" or "x-user-defined"?

mEncoding == REPLACEMENT_ENCODING and mEncoding == X_USER_DEFINED_ENCODING are how you check for them in code. If you meant writing a test case, you can use charset=ISO-2022-KR for the former and, on the HTTP layer only, charset=x-user-defined for the latter. (Note that <meta charset=x-user-defined> has a different meaning.)
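To make the test inputs concrete, the sketch below models which encoding each declaration is expected to produce. `LabelSource` and `ExpectedEncodingForTest` are hypothetical helpers, not Gecko APIs, and the function covers only the two labels discussed here; every other label is passed through unchanged as a simplification. It assumes the behaviour in the Encoding and HTML standards, where ISO-2022-KR maps to the replacement encoding and `x-user-defined` in a `<meta>` declaration is treated as windows-1252.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Where the charset label was declared (hypothetical helper type).
enum class LabelSource { HttpHeader, MetaTag };

// Returns the name of the encoding a test should expect for the label.
// Only the two cases from this bug are modelled; other labels are
// returned lowercased and otherwise unchanged (a simplification).
std::string ExpectedEncodingForTest(std::string aLabel, LabelSource aSource) {
  // Labels are matched ASCII case-insensitively.
  std::transform(aLabel.begin(), aLabel.end(), aLabel.begin(),
                 [](unsigned char c) {
                   return static_cast<char>(std::tolower(c));
                 });
  if (aLabel == "iso-2022-kr") {
    return "replacement";  // Triggers the "replacement" warning.
  }
  if (aLabel == "x-user-defined") {
    // Only the HTTP-layer declaration yields x-user-defined itself.
    return aSource == LabelSource::HttpHeader ? "x-user-defined"
                                              : "windows-1252";
  }
  return aLabel;
}
```

Under these assumptions, a test page served with `charset=x-user-defined` in its Content-Type header should trigger the second warning, while the same label in `<meta charset>` should not.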
