Closed Bug 1369020 Opened 8 years ago Closed 8 years ago

Remove nsContentUtils::ConvertStringFromEncoding and nsContentUtils::CheckForBOM

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla56

Tracking Flags:

Tracking

Status

firefox56

---

fixed

People

(Reporter: hsivonen, Assigned: emk)

References

Details

Attachments

(1 file)

Bug 1369020 - Remove nsContentUtils::ConvertStringFromEncoding and nsContentUtils::CheckForBOM. 8 years ago Masatoshi Kimura [:emk] 59 bytes, text/x-review-board-request	hsivonen : review+	Details

Henri Sivonen (:hsivonen)

Reporter

Description

•

8 years ago

Make callers of nsContentUtils::ConvertStringFromEncoding use mozilla::Encoding directly.

Makoto Kato [:m_kato]

Updated

•

8 years ago

Priority: -- → P3

Comment hidden (mozreview-request)

Masatoshi Kimura [:emk]

Assignee

Updated

•

8 years ago

Summary: Remove nsContentUtils::ConvertStringFromEncoding → Remove nsContentUtils::ConvertStringFromEncoding and nsContentUtils::CheckForBOM

Masatoshi Kimura [:emk]

Assignee

Updated

•

8 years ago

Assignee: nobody → VYV03354

Status: NEW → ASSIGNED

Comment hidden (mozreview-request)

Masatoshi Kimura [:emk]

Assignee

Comment 4

•

8 years ago

I replaced some CheckForCOM + ConvertStringFromEncoding combos with Decode rather than ForBOM + DecodeWithBOMRemoval. Decode already implements the BOM handling (a.k.a. the "decode" algorithm of the WHATWG spec). We don't have to re-invent the wheel in the C++ side.

Comment hidden (mozreview-request)

Henri Sivonen (:hsivonen)

Reporter

Comment 6

•

8 years ago

mozreview-review

Comment on attachment 8878689 [details] Bug 1369020 - Remove nsContentUtils::ConvertStringFromEncoding and nsContentUtils::CheckForBOM. https://reviewboard.mozilla.org/r/149996/#review154758 r+ with the nits below addressed. ::: dom/base/nsDocument.cpp:9936 (Diff revision 3) > if (NS_FAILED(rv)) { > const nsACString &docCharset = GetDocumentCharacterSet(); > + const Encoding* encoding = docCharset.IsEmpty() ? > + UTF_8_ENCODING : Encoding::ForName(docCharset); > > - rv = nsContentUtils::ConvertStringFromEncoding(docCharset, > + rv = encoding->DecodeWithBOMRemoval(unescapedRef, ref); The old code removed the BOM here, but it seems to me that `DecodeWithoutBOMHandling` would be more correct. Also, can `docCharset` be empty? We have code elsewhere that assumes it can't. ::: dom/base/nsReferencedElement.cpp:39 (Diff revision 3) > nsAutoCString charset; > aURI->GetOriginCharset(charset); > + const Encoding* encoding = charset.IsEmpty() ? > + UTF_8_ENCODING : Encoding::ForName(charset); > nsAutoString ref; > - nsresult rv = nsContentUtils::ConvertStringFromEncoding(charset, > + nsresult rv = encoding->DecodeWithBOMRemoval(refPart, ref); The old code removed the BOM here, but it seems to me that `DecodeWithoutBOMHandling` would be more correct. ::: dom/html/nsHTMLDocument.cpp:1362 (Diff revision 3) > > nsXPIDLCString cookie; > service->GetCookieString(codebaseURI, channel, getter_Copies(cookie)); > // CopyUTF8toUTF16 doesn't handle error > // because it assumes that the input is valid. > - nsContentUtils::ConvertStringFromEncoding(NS_LITERAL_CSTRING("UTF-8"), > + UTF_8_ENCODING->DecodeWithBOMRemoval(cookie, aCookie); The old code removed the BOM here, but are we really supposed to remove the BOM here? ::: parser/htmlparser/nsParser.cpp:1335 (Diff revision 3) > > if (pws->mNeedCharsetCheck) { > pws->mNeedCharsetCheck = false; > int32_t source; > nsAutoCString preferred; > nsAutoCString maybePrefer; It seems that the `maybePrefer` string is now useless. Please remove it and call `encoding->Name(preferred)` below (twice) instead of `encoding->Name(maybePrefer)`.

Attachment #8878689 - Flags: review?(hsivonen) → review+

Masatoshi Kimura [:emk]

Assignee

Comment 7

•

8 years ago

mozreview-review-reply

Comment on attachment 8878689 [details] Bug 1369020 - Remove nsContentUtils::ConvertStringFromEncoding and nsContentUtils::CheckForBOM. https://reviewboard.mozilla.org/r/149996/#review154758 > The old code removed the BOM here, but it seems to me that `DecodeWithoutBOMHandling` would be more correct. > > Also, can `docCharset` be empty? We have code elsewhere that assumes it can't. Aparently it can't. Removed the empty check. (I did not add an assertion because ForName will immediately panic even in release builds anyway.) > The old code removed the BOM here, but it seems to me that `DecodeWithoutBOMHandling` would be more correct. I also thought that :) Fixed. > The old code removed the BOM here, but are we really supposed to remove the BOM here? No. the spec says "using UTF-8 decode without BOM". https://html.spec.whatwg.org/multipage/dom.html#dom-document-cookie Changed the function to DecodeWithoutBOMHandling.

Comment hidden (mozreview-request)

Pulsebot

Comment 9

•

8 years ago

Pushed by VYV03354@nifty.ne.jp: https://hg.mozilla.org/integration/autoland/rev/8587375461d5 Remove nsContentUtils::ConvertStringFromEncoding and nsContentUtils::CheckForBOM. r=hsivonen

Sebastian Hengst [:aryx] (needinfo me if it's about an intermittent or backout)

Comment 10

•

8 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/8587375461d5

Status: ASSIGNED → RESOLVED

Closed: 8 years ago

status-firefox56: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla56

You need to log in before you can comment on or make changes to this bug.

Bugzilla

Remove nsContentUtils::ConvertStringFromEncoding and nsContentUtils::CheckForBOM

Categories

(Core :: Internationalization, enhancement, P3)

Tracking

()

People

(Reporter: hsivonen, Assigned: emk)

References

Details

Crash Data

Security

(public)

User Story

Attachments

(1 file)

Description

Updated

Comment 1

Updated

Updated

Comment 3

Comment 4

Comment 5

Comment 6

Comment 7

Comment 8

Comment 9

Comment 10

Attachment

General

Description

File Name

Content Type