Closed Bug 1032511 Opened 10 years ago Closed 10 years ago

"ASSERTION: Not a UTF-8 string" with URLSearchParams

Categories

(Core :: DOM: Core & HTML, defect)

x86_64
macOS
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla34

People

(Reporter: jruderman, Assigned: baku)

References

Details

(Keywords: assertion, testcase)

Attachments

(3 files, 2 obsolete files)

Attached file testcase
###!!! ASSERTION: Not a UTF-8 string. This code should only be used for converting from known UTF-8 strings.: 'Error', file xpcom/string/nsUTF8Utils.h, line 427

URLSearchParams was added in 887836. I didn't notice this problem right away because of another bug that triggers the same assertions (bug 436006).
Attached file stack
Attached patch utf8.patch (obsolete) — Splinter Review
annevk, can you check the test file and tell me if it makes sense? Thanks.
Attachment #8466987 - Flags: review?(hsivonen)
Attachment #8466987 - Flags: feedback?(annevk)
It should stringify to "\ufffd=" I think. However, our decoders have bug 562590 which is why you only get "=" I guess. However, that will be fixed at some point and then we'll need to update this test. Maybe add a pointer somehow?
Attachment #8466987 - Flags: feedback?(annevk)
Depends on: 562590
Comment on attachment 8466987 [details] [diff] [review]
utf8.patch

Review of attachment 8466987 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/base/URLSearchParams.cpp
@@ +9,5 @@
>  
>  namespace mozilla {
>  namespace dom {
>  
> +NS_IMPL_CYCLE_COLLECTION_WRAPPERCACHE(URLSearchParams, mObservers, mDecoder)

Since decoders are not cycle collection participants, is it actually useful to traverse the decoder here?

@@ +117,5 @@
>    nsACString::const_iterator start, end;
>    aInput.BeginReading(start);
>    aInput.EndReading(end);
>  
> +  nsCString result;

Since "result" and "rv" tend to imply nsresult, I suggest another variable name here. For example "unescaped".

@@ +169,5 @@
> +}
> +
> +void
> +URLSearchParams::ConvertString(const char* aInput, int32_t aInputLength,
> +                               nsAString& aOutput)

Please make this take const nsACString& aInput and extract the buffer and the length inside this method instead of doing it in the caller to keep the buffer and its length together for as long as possible..

@@ +174,5 @@
> +{
> +  if (!mDecoder) {
> +    mDecoder = EncodingUtils::DecoderForEncoding("UTF-8");
> +    if (!mDecoder) {
> +      MOZ_ASSERT(mDecoder, "Failed to create a decoder.");

FWIW, DecoderForEncoding already asserts this, but I suppose overasserting doesn't hurt.

@@ +196,5 @@
> +                         &outputLength);
> +  if (NS_WARN_IF(NS_FAILED(rv))) {
> +    aOutput.Truncate();
> +    return;
> +  }

Shouldn't you explicitly let buf know about outputLength after the conversion call?

::: dom/base/test/test_urlSearchParams_utf8.html
@@ +25,5 @@
> +
> +  /** Test for Bug 1032511 **/
> +  var a = new URLSearchParams("%e2");
> +  ok(a, "a exists");
> +  is(a.toString(), '=', "The value should be here.");

Please add another case where a bad percentage-encoded byte occurs in the middle and there's an ASCII byte after.
Attachment #8466987 - Flags: review?(hsivonen) → review-
Attached patch utf8.patch (obsolete) — Splinter Review
I'm not sure about the output of this:

  a = new URLSearchParams("a%e2b");
  is(a.toString(), 'a%EF%BF%BDb=', "The value should be here.");
Attachment #8466987 - Attachment is obsolete: true
Attachment #8467786 - Flags: review?(hsivonen)
That looks correct. It's U+FFFD percent-encoded.
Blocks: 1037715
Comment on attachment 8467786 [details] [diff] [review]
utf8.patch

Review of attachment 8467786 [details] [diff] [review]:
-----------------------------------------------------------------

r+ with these comments addressed.

::: dom/base/URLSearchParams.cpp
@@ +192,5 @@
> +  if (NS_WARN_IF(NS_FAILED(rv))) {
> +    return;
> +  }
> +
> +  char16_t* buf = (char16_t*) moz_malloc((outputLength + 1) * sizeof(char16_t));

Instead of the malloc and sizeof obfuscation and explicit memory management, please use 
const mozilla::fallible_t fallible = mozilla::fallible_t();
mozilla::UniquePtr<char16_t[]> buf(new (fallible) char16_t[outputLength + 1]);

@@ +205,5 @@
> +      aOutput.Truncate();
> +    }
> +  }
> +
> +  moz_free(buf);

And then you don't need this, of course.

::: dom/base/test/test_urlSearchParams_utf8.html
@@ +28,5 @@
> +  ok(a, "a exists");
> +  is(a.toString(), '=', "The value should be here.");
> +
> +  a = new URLSearchParams("a%e2");
> +  is(a.toString(), 'a=', "The value should be here.");

Please add a comment about the decoder bug that makes the decoder fail to emit a REPLACEMENT CHARACTER.
Attachment #8467786 - Flags: review?(hsivonen) → review+
Pushed to try:  https://tbpl.mozilla.org/?tree=Try&rev=46d3e5691896

I used:
  nsAutoArrayPtr<char16_t> buf(new (fallible) char16_t[outputLength + 1]);

instead:
  mozilla::UniquePtr<char16_t[]> buf(new (fallible) char16_t[outputLength + 1]);
Attached patch utf8.patchSplinter Review
Attachment #8467786 - Attachment is obsolete: true
Keywords: checkin-needed
Comment on attachment 8469413 [details] [diff] [review]
utf8.patch

Some string usage nits:

>+  nsACString::const_iterator iter;
>+  aInput.BeginReading(iter);
String iterators are obsolete, use const char * and BeginReading(). See https://developer.mozilla.org/en-US/docs/Mozilla/Tech/XPCOM/Guide/Internal_strings#Iterators

>+  const mozilla::fallible_t fallible = mozilla::fallible_t();
>+  nsAutoArrayPtr<char16_t> buf(new (fallible) char16_t[outputLength + 1]);
Prefer to allocate directly into the output string and use BeginWriting():
if (!aOutput.SetLength(outputLength, fallible_t()))
  return;

>+    buf[outputLength] = 0;
>+    if (!aOutput.Assign(buf, outputLength, mozilla::fallible_t())) {
Assign(buf, outputLength) doesn't read buf[outputLength].
Thanks. I'll fix this in a follow up.
Blocks: 1050702
https://hg.mozilla.org/mozilla-central/rev/d5a749da75bd
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla34
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: