Closed Bug 1369020 Opened 2 years ago Closed 2 years ago

Remove nsContentUtils::ConvertStringFromEncoding and nsContentUtils::CheckForBOM

Categories

(Core :: Internationalization, enhancement, P3)

enhancement

Tracking

()

RESOLVED FIXED
mozilla56
Tracking Status
firefox56 --- fixed

People

(Reporter: hsivonen, Assigned: emk)

References

Details

Attachments

(1 file)

Make callers of nsContentUtils::ConvertStringFromEncoding use mozilla::Encoding directly.
Priority: -- → P3
Duplicate of this bug: 1369022
Summary: Remove nsContentUtils::ConvertStringFromEncoding → Remove nsContentUtils::ConvertStringFromEncoding and nsContentUtils::CheckForBOM
Assignee: nobody → VYV03354
Status: NEW → ASSIGNED
I replaced some CheckForCOM + ConvertStringFromEncoding combos with Decode rather than ForBOM + DecodeWithBOMRemoval. Decode already implements the BOM handling (a.k.a. the "decode" algorithm of the WHATWG spec). We don't have to re-invent the wheel in the C++ side.
Comment on attachment 8878689 [details]
Bug 1369020 - Remove nsContentUtils::ConvertStringFromEncoding and nsContentUtils::CheckForBOM.

https://reviewboard.mozilla.org/r/149996/#review154758

r+ with the nits below addressed.

::: dom/base/nsDocument.cpp:9936
(Diff revision 3)
>      if (NS_FAILED(rv)) {
>        const nsACString &docCharset = GetDocumentCharacterSet();
> +      const Encoding* encoding = docCharset.IsEmpty() ?
> +        UTF_8_ENCODING : Encoding::ForName(docCharset);
>  
> -      rv = nsContentUtils::ConvertStringFromEncoding(docCharset,
> +      rv = encoding->DecodeWithBOMRemoval(unescapedRef, ref);

The old code removed the BOM here, but it seems to me that `DecodeWithoutBOMHandling` would be more correct.

Also, can `docCharset` be empty? We have code elsewhere that assumes it can't.

::: dom/base/nsReferencedElement.cpp:39
(Diff revision 3)
>    nsAutoCString charset;
>    aURI->GetOriginCharset(charset);
> +  const Encoding* encoding = charset.IsEmpty() ?
> +    UTF_8_ENCODING : Encoding::ForName(charset);
>    nsAutoString ref;
> -  nsresult rv = nsContentUtils::ConvertStringFromEncoding(charset,
> +  nsresult rv = encoding->DecodeWithBOMRemoval(refPart, ref);

The old code removed the BOM here, but it seems to me that `DecodeWithoutBOMHandling` would be more correct.

::: dom/html/nsHTMLDocument.cpp:1362
(Diff revision 3)
>  
>      nsXPIDLCString cookie;
>      service->GetCookieString(codebaseURI, channel, getter_Copies(cookie));
>      // CopyUTF8toUTF16 doesn't handle error
>      // because it assumes that the input is valid.
> -    nsContentUtils::ConvertStringFromEncoding(NS_LITERAL_CSTRING("UTF-8"),
> +    UTF_8_ENCODING->DecodeWithBOMRemoval(cookie, aCookie);

The old code removed the BOM here, but are we really supposed to remove the BOM here?

::: parser/htmlparser/nsParser.cpp:1335
(Diff revision 3)
>  
>    if (pws->mNeedCharsetCheck) {
>      pws->mNeedCharsetCheck = false;
>      int32_t source;
>      nsAutoCString preferred;
>      nsAutoCString maybePrefer;

It seems that the `maybePrefer` string is now useless. Please remove it and call `encoding->Name(preferred)` below (twice) instead of `encoding->Name(maybePrefer)`.
Attachment #8878689 - Flags: review?(hsivonen) → review+
Comment on attachment 8878689 [details]
Bug 1369020 - Remove nsContentUtils::ConvertStringFromEncoding and nsContentUtils::CheckForBOM.

https://reviewboard.mozilla.org/r/149996/#review154758

> The old code removed the BOM here, but it seems to me that `DecodeWithoutBOMHandling` would be more correct.
> 
> Also, can `docCharset` be empty? We have code elsewhere that assumes it can't.

Aparently it can't. Removed the empty check. (I did not add an assertion because ForName will immediately panic even in release builds anyway.)

> The old code removed the BOM here, but it seems to me that `DecodeWithoutBOMHandling` would be more correct.

I also thought that :) Fixed.

> The old code removed the BOM here, but are we really supposed to remove the BOM here?

No. the spec says "using UTF-8 decode without BOM".
https://html.spec.whatwg.org/multipage/dom.html#dom-document-cookie

Changed the function to DecodeWithoutBOMHandling.
Pushed by VYV03354@nifty.ne.jp:
https://hg.mozilla.org/integration/autoland/rev/8587375461d5
Remove nsContentUtils::ConvertStringFromEncoding and nsContentUtils::CheckForBOM. r=hsivonen
https://hg.mozilla.org/mozilla-central/rev/8587375461d5
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla56
You need to log in before you can comment on or make changes to this bug.