Closed
Bug 1026680
Opened 10 years ago
Closed 10 years ago
Latin1 strings: convert escape/unescape, URI encoding/decoding functions
Categories
(Core :: JavaScript Engine, defect)
Core
JavaScript Engine
Tracking
()
RESOLVED
FIXED
mozilla33
People
(Reporter: jandem, Assigned: jandem)
References
Details
Attachments
(5 files)
6.96 KB,
patch
|
n.nethercote
:
review+
|
Details | Diff | Splinter Review |
5.50 KB,
patch
|
n.nethercote
:
review+
|
Details | Diff | Splinter Review |
5.68 KB,
patch
|
n.nethercote
:
review+
|
Details | Diff | Splinter Review |
3.98 KB,
patch
|
n.nethercote
:
review+
|
Details | Diff | Splinter Review |
5.32 KB,
patch
|
n.nethercote
:
review+
|
Details | Diff | Splinter Review |
No description provided.
Assignee | ||
Comment 1•10 years ago
|
||
I moved the algorithm into a separate function, changed the return type to an enum and eliminated the gotos. After that it was pretty straight-forward.
Attachment #8441584 -
Flags: review?(n.nethercote)
Assignee | ||
Comment 2•10 years ago
|
||
Attachment #8441587 -
Flags: review?(n.nethercote)
Assignee | ||
Comment 3•10 years ago
|
||
A bit simpler than the others. I changed infallibleAppend(c) to append(c) to avoid an assert in infallibleAppend(c). infallibleAppend(jschar) asserts when the buffer is latin1, because inflation is fallible.
Attachment #8441591 -
Flags: review?(n.nethercote)
Assignee | ||
Comment 4•10 years ago
|
||
Attachment #8441592 -
Flags: review?(n.nethercote)
Assignee | ||
Comment 5•10 years ago
|
||
Currently it always returns a TwoByte string, but the characters are really Latin1. Unfortunately there's no easy way to check EnableLatin1Strings here and do the right thing in that case. I'll keep this bug open and get back to this once everything is enabled by default.
Attachment #8441602 -
Flags: review?(n.nethercote)
Comment 6•10 years ago
|
||
Comment on attachment 8441591 [details] [diff] [review] Part 3 - unescape Review of attachment 8441591 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/jsstr.cpp @@ +252,5 @@ > int k = 0; > bool building = false; > > + /* Step 5. */ > + while (k != length) { k < length, just in case something screwy happens and we overflow, so possibly we end up in code paths we can easily comprehend? Just a general principle to always use < rather than != for this sort of thing. (And I'm told clang, at least, can optimize better in that case, because it can sometimes know wraparound isn't possible.) @@ -310,5 @@ > if (!sb.reserve(length)) \ > return false; \ > sb.infallibleAppend(chars, k); \ > - } \ > - JS_END_MACRO While I don't much like the JS_{BEGIN,END}_MACRO nonsense, I don't think you should get rid of the equivalent thing. Without this, this code doesn't do what it looks like: if (...) ENSURE_BUILDING; else ...; Having explicit do { and while (false) seems much better to me. @@ +279,3 @@ > > /* Step 10-13. */ > if (Unhex4(&chars[k + 2], &c)) { Range::subrange or Range::slice seems valuable for this, to preserve in-bounds assertions. I can r+ a patch for that. :-) @@ -332,5 @@ > } > > step_18: > - if (building) > - sb.infallibleAppend(c); This was wrong even before this patch, right? How did we not fail any assertions? @@ +313,5 @@ > +{ > + CallArgs args = CallArgsFromVp(argc, vp); > + > + /* Step 1. */ > + JSLinearString *str = ArgToRootedString(cx, args, 0); Surprising this doesn't require Rooted. Or does it? @@ +327,5 @@ > + * NB: use signed integers for length/index to allow simple length > + * comparisons without unsigned-underflow hazards. > + */ > + JS_STATIC_ASSERT(JSString::MAX_LENGTH <= INT_MAX); > + int length = str->length(); SafeCast<int>(str->length()), please. @@ +331,5 @@ > + int length = str->length(); > + > + if (str->hasLatin1Chars()) { > + AutoCheckCannotGC nogc; > + if (!Unescape(sb, str->latin1Chars(nogc), length)) Range(str->latin1Chars(nogc), length)?
Updated•10 years ago
|
Attachment #8441584 -
Flags: review?(n.nethercote) → review+
Updated•10 years ago
|
Attachment #8441587 -
Flags: review?(n.nethercote) → review+
Comment 7•10 years ago
|
||
Comment on attachment 8441591 [details] [diff] [review] Part 3 - unescape Review of attachment 8441591 [details] [diff] [review]: ----------------------------------------------------------------- I'll cancel this r?, and let Waldo do the review of the updated patch.
Attachment #8441591 -
Flags: review?(n.nethercote)
Comment 8•10 years ago
|
||
Comment on attachment 8441592 [details] [diff] [review] Part 4 - Cleanup escape Review of attachment 8441592 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/jsstr.cpp @@ +151,3 @@ > */ > + static_assert(JSString::MAX_LENGTH < UINT32_MAX / 6, > + "newlength must not overflow"); What happens if you have a string of length MAX_LENGTH, and then some of its chars need to be escaped? Will the js_NewString() below fail?
Attachment #8441592 -
Flags: review?(n.nethercote) → review+
Updated•10 years ago
|
Attachment #8441602 -
Flags: review?(n.nethercote) → review+
Comment 9•10 years ago
|
||
Comment on attachment 8441591 [details] [diff] [review] Part 3 - unescape Review of attachment 8441591 [details] [diff] [review]: ----------------------------------------------------------------- Actually, I'm happy to say r=me with Waldo's nits addressed.
Attachment #8441591 -
Flags: review+
Assignee | ||
Comment 10•10 years ago
|
||
(In reply to Nicholas Nethercote [:njn] from comment #8) > What happens if you have a string of length MAX_LENGTH, and then some of its > chars need to be escaped? Will the js_NewString() below fail? Yes, js_NewString -> JSFlatString::new_ -> validateLength, and there we do: if (MOZ_UNLIKELY(length > JSString::MAX_LENGTH)) { js_ReportAllocationOverflow(maybecx); return false; }
Assignee | ||
Comment 11•10 years ago
|
||
(In reply to Jeff Walden [:Waldo] (remove +bmo to email) from comment #6) > k < length, just in case something screwy happens and we overflow, so > possibly we end up in code paths we can easily comprehend? Just a general > principle to always use < rather than != for this sort of thing. (And I'm > told clang, at least, can optimize better in that case, because it can > sometimes know wraparound isn't possible.) OK, k < length is what I'd write in my own code or tell people to write, but with the signed-length and index thing I was being overly cautious. But yeah, k better be <= length or we'll have a buffer overflow in this code. Changed. > @@ -310,5 @@ > Having explicit do { and while (false) seems much better to me. Good point, done. > Range::subrange or Range::slice seems valuable for this, to preserve > in-bounds assertions. I can r+ a patch for that. :-) These methods don't exist so if you don't mind I'll use Range/RangedPtr in this code and file a followup bug to add slice/subrange and use them. There's a *ton* of code I (still) have to change, and I'm really trying to leave the code in a better state than it was before (just look at the other patches in this bug), but I also don't want to get too distracted fixing pre-existing issues :) > This was wrong even before this patch, right? How did we not fail any > assertions? It wasn't wrong before this patch, see also comment 3. Basically it's okay to infallibleAppend(jschar) if you call ensureTwoByteChars first, and we do that (only) for TwoByte strings. Now Latin1 strings can also be used here and we don't call ensureTwoByteChars in this case, so infallibleAppend(jschar) will assert. There are different ways to fix this, but as this is not perf-sensitive it seemed best to just use append(jschar), which will always do the right thing: try to append to a Latin1 buffer and if out of range it will do (fallible) Latin1 -> TwoByte inflation. (The code also does sb.reserve(input-length) and output-length is always <= input-length.) > Surprising this doesn't require Rooted. Or does it? I think it's fine, the string still isn't used after potential GC-calls. But I'll change it to a RootedLinearString; this is not the code where an unnecessary root matters :) > SafeCast<int>(str->length()), please. Done. > Range(str->latin1Chars(nogc), length)? OK, will change it to use str->latin1Range(nogc). Maybe we can have some good first bugs to convert existing code to Range/RangedPtr?
Assignee | ||
Comment 12•10 years ago
|
||
Pushed with comments addressed: https://hg.mozilla.org/integration/mozilla-inbound/rev/68cec11659b9 https://hg.mozilla.org/integration/mozilla-inbound/rev/aea4ea200313 https://hg.mozilla.org/integration/mozilla-inbound/rev/aff142aa038d https://hg.mozilla.org/integration/mozilla-inbound/rev/f08470bb573a https://hg.mozilla.org/integration/mozilla-inbound/rev/45dd001f865a Leaving this bug open to make str_escape return Latin1 strings once Latin1 strings are enabled (as mentioned in comment 5).
Keywords: leave-open
Comment 13•10 years ago
|
||
sorry had to back this out for test failures like https://tbpl.mozilla.org/php/getParsedLog.php?id=41959773&tree=Mozilla-Inbound on windows debug builds
Assignee | ||
Comment 14•10 years ago
|
||
Relanded, Windows debug bustage was unrelated (bug 1027149): https://hg.mozilla.org/integration/mozilla-inbound/rev/3343738708a5 https://hg.mozilla.org/integration/mozilla-inbound/rev/e794548b0169 https://hg.mozilla.org/integration/mozilla-inbound/rev/f65a13922049 https://hg.mozilla.org/integration/mozilla-inbound/rev/fd9bc8054cdd https://hg.mozilla.org/integration/mozilla-inbound/rev/0784505c202a
Comment 15•10 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/3343738708a5 https://hg.mozilla.org/mozilla-central/rev/e794548b0169 https://hg.mozilla.org/mozilla-central/rev/f65a13922049 https://hg.mozilla.org/mozilla-central/rev/fd9bc8054cdd https://hg.mozilla.org/mozilla-central/rev/0784505c202a
Assignee | ||
Comment 16•10 years ago
|
||
(In reply to Jan de Mooij [:jandem] from comment #5) > Created attachment 8441602 [details] [diff] [review] > Part 5 - Convert escape > > Currently it always returns a TwoByte string, but the characters are really > Latin1. Unfortunately there's no easy way to check EnableLatin1Strings here > and do the right thing in that case. > > I'll keep this bug open and get back to this once everything is enabled by > default. See bug 1041469 part 5.
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Keywords: leave-open
Resolution: --- → FIXED
Target Milestone: --- → mozilla33
You need to log in
before you can comment on or make changes to this bug.
Description
•