Closed
Bug 1013917
Opened 11 years ago
Closed 11 years ago
Add Latin1 string flag
Categories
(Core :: JavaScript Engine, defect)
Core
JavaScript Engine
Tracking
()
RESOLVED
FIXED
mozilla32
People
(Reporter: jandem, Assigned: jandem)
References
Details
Attachments
(2 files, 1 obsolete file)
|
5.20 KB,
patch
|
luke
:
review+
|
Details | Diff | Splinter Review |
|
42.62 KB,
patch
|
luke
:
review+
|
Details | Diff | Splinter Review |
This patch adds a new string flag for latin1 strings and makes a number of other JSString changes.
Attachment #8426192 -
Flags: review?(luke)
| Assignee | ||
Comment 1•11 years ago
|
||
Adds two new shell functions: isLatin1(s) and toLatin1(s). The latter is not available with --fuzzing-safe for obvious reasons.
This conversion function is a bit lame, but it allows me to convert most functions working on strings one at a time and write tests for them.
Attachment #8426195 -
Flags: review?(luke)
Comment 2•11 years ago
|
||
Comment on attachment 8426192 [details] [diff] [review]
Part 1 - JSString changes
Review of attachment 8426192 [details] [diff] [review]:
-----------------------------------------------------------------
::: js/src/vm/String.h
@@ +220,5 @@
> + *
> + * The LATIN1_CHARS_BIT specifies the character encoding. If set, the string's
> + * characters are stored as Latin1 instead of TwoByte. This saves memory because
> + * Latin1 strings use one byte per character instead of two. Note that Latin1
> + * strings are not yet enabled by default, see bug 998392.
Could you integrate this comment a bit more with the preceding comment blocks? In particular, can you:
- mention it in the bulleted list of optimizations in the JSString comment
- mention latin1 vs. twoByte in the operations+fields / invariants+properties columns
- update the encoding bits table (e.g., both the subtype predicate and instance encoding of Rope need an x in the encoding bit
- add Bit 6 to your bulleted bit list below that
@@ +348,5 @@
> /* Type query and debug-checked casts */
>
> MOZ_ALWAYS_INLINE
> bool isRope() const {
> + return (d.u1.flags & TYPE_FLAGS_MASK) == ROPE_FLAGS;
What's your intended scheme for ropes? I was thinking we'd allow building ropes out of mixed latin1 and 2byte and the encoding bit would only indicate whether the rope was *exclusively* composed of latin1 (in which case flattening produced latin1; otherwise flattening would inflate latin1).
@@ +573,5 @@
> public:
> MOZ_ALWAYS_INLINE
> const jschar *nonInlineChars() const {
> JS_ASSERT(!isInline());
> + JS_ASSERT(hasTwoByteChars());
Is the eventual goal that you'll have manually converted all use sites of strings and replaced all uses of the un-prefixed accessors (like 'chars' and 'range') with prefixed accessors (like 'twoByteChars' and 'twoByteRange')? Or are we going to permanently bias towards twoByte by making unprefixed 'chars' mean 'twoByteChars'? I think the former is preferable in avoiding people writing paths that break when latin1 flows through.
Attachment #8426192 -
Flags: review?(luke) → review+
Updated•11 years ago
|
Attachment #8426195 -
Flags: review?(luke) → review+
Comment 3•11 years ago
|
||
I was thinking that it'd be nice to get some fuzzing before the entire conversion is complete. It's hard, though, since you can't let the fuzzer arbitrarily create latin1 strings.
| Assignee | ||
Comment 4•11 years ago
|
||
(In reply to Luke Wagner [:luke] from comment #2)
> - update the encoding bits table (e.g., both the subtype predicate and
> instance encoding of Rope need an x in the encoding bit
> - add Bit 6 to your bulleted bit list below that
My idea there was that the encoding table etc would only describe the "type flags", the lowest 6 bits. The next section describes the other flag, latin1, which is kind of unrelated to the type. Do you think it's clearer to combine them as you suggested?
> What's your intended scheme for ropes? I was thinking we'd allow building
> ropes out of mixed latin1 and 2byte and the encoding bit would only indicate
> whether the rope was *exclusively* composed of latin1 (in which case
> flattening produced latin1; otherwise flattening would inflate latin1).
Yes, that was also my plan. A rope will have the latin1 flag if both |left| and |right| are latin1.
> Is the eventual goal that you'll have manually converted all use sites of
> strings and replaced all uses of the un-prefixed accessors (like 'chars' and
> 'range') with prefixed accessors (like 'twoByteChars' and 'twoByteRange')?
> Or are we going to permanently bias towards twoByte by making unprefixed
> 'chars' mean 'twoByteChars'? I think the former is preferable in avoiding
> people writing paths that break when latin1 flows through.
Definitely the former. I'll add twoByteChars() and friends soon, when there are no more calls to chars/getChars we can remove those.
| Assignee | ||
Comment 5•11 years ago
|
||
needinfo for the question in comment 4 :)
Thanks for the quick reviews btw.
Flags: needinfo?(luke)
Comment 6•11 years ago
|
||
Ah, I see your point. In that case, perhaps we can make this a bit more clear by, at the top of the flags comment:
- add a title "The Flags Word\n"
- explain that, at a high level, the flags word is broken down into type and encoding
- explain encoding, then continue into the existing comment which is all about type.
Flags: needinfo?(luke)
| Assignee | ||
Comment 7•11 years ago
|
||
I updated the comments and also fixed the GDB pretty printers. This applies on top of the patch in bug 1014131, I verified they work fine with this patch.
Attachment #8426933 -
Flags: review?(luke)
| Assignee | ||
Updated•11 years ago
|
Attachment #8426192 -
Attachment is obsolete: true
Comment 8•11 years ago
|
||
Comment on attachment 8426933 [details] [diff] [review]
Part 1 - JSString changes (v2)
Review of attachment 8426933 [details] [diff] [review]:
-----------------------------------------------------------------
Perfect, thanks.
::: js/src/vm/String.h
@@ +158,4 @@
> struct {
> union {
> + const char *nonInlineCharsLatin1; /* JSLinearString, except JS(Fat)InlineString */
> + const jschar *nonInlineCharsTwoByte;
Can you comment the nonInlineCharsTwoByte line as well?
Attachment #8426933 -
Flags: review?(luke) → review+
| Assignee | ||
Comment 9•11 years ago
|
||
(In reply to Jan de Mooij [:jandem] from comment #9)
> https://hg.mozilla.org/integration/mozilla-inbound/rev/1264ddc72695
> https://hg.mozilla.org/integration/mozilla-inbound/rev/eb24106356d9
Backed out in https://hg.mozilla.org/integration/mozilla-inbound/rev/38be38679248 for breaking rootanalysis: https://tbpl.mozilla.org/php/getParsedLog.php?id=40212897&tree=Mozilla-Inbound
Flags: needinfo?(jdemooij)
Comment 11•11 years ago
|
||
I reproduced the crash locally running ion/bug837312.js with --ion-eager --ion-parallel-compile=off.
Catchpoint 1 (signal SIGSEGV), js::jit::IonCommonFrameLayout::prevType (this=0x0)
at /home/jon/work/rooting/js/src/jit/IonFrames.h:321
321 return FrameType(descriptor_ & FrameTypeMask);
Catchpoint 1 (signal SIGSEGV), js::jit::IonCommonFrameLayout::prevType (this=0x0)
at /home/jon/work/rooting/js/src/jit/IonFrames.h:321
321 return FrameType(descriptor_ & FrameTypeMask);
(gdb) bt 10
#0 js::jit::IonCommonFrameLayout::prevType (this=0x0)
at /home/jon/work/rooting/js/src/jit/IonFrames.h:321
#1 0x00000000006ff2c0 in js::jit::JitFrameIterator::prevType (this=0x7fffffffbd40)
at /home/jon/work/rooting/js/src/jit/IonFrames-inl.h:51
#2 0x000000000087b549 in js::jit::JitFrameIterator::isFakeExitFrame (
this=0x7fffffffbd40) at /home/jon/work/rooting/js/src/jit/IonFrames-inl.h:57
#3 0x000000000082c50c in js::jit::MarkJitExitFrame (trc=0x7fffffffc130, frame=...)
at /home/jon/work/rooting/js/src/jit/IonFrames.cpp:1012
#4 0x0000000000822491 in js::jit::MarkJitActivation (trc=0x7fffffffc130,
activations=...) at /home/jon/work/rooting/js/src/jit/IonFrames.cpp:1186
#5 0x00000000008223c8 in js::jit::MarkJitActivations (rt=0x1d63480,
trc=0x7fffffffc130) at /home/jon/work/rooting/js/src/jit/IonFrames.cpp:1214
#6 0x00000000005da0b1 in js::gc::GCRuntime::markRuntime (this=0x1d637e8,
trc=0x7fffffffc130, useSavedRoots=false)
at /home/jon/work/rooting/js/src/gc/RootMarking.cpp:764
#7 0x00000000005f1296 in js::Nursery::collect (this=0x1d64208, rt=0x1d63480,
reason=JS::gcreason::DEBUG_GC, pretenureTypes=0x0)
at /home/jon/work/rooting/js/src/gc/Nursery.cpp:784
#8 0x0000000000ab928c in js::gc::GCRuntime::minorGC (this=0x1d637e8,
reason=JS::gcreason::DEBUG_GC) at /home/jon/work/rooting/js/src/jsgc.cpp:4983
#9 0x0000000000a99872 in js::MinorGC (rt=0x1d63480, reason=JS::gcreason::DEBUG_GC)
at /home/jon/work/rooting/js/src/jsgc.cpp:4974
The JitFrameIterator has current_ pointing to zero:
(gdb) p *this
$6 = {current_ = 0x0, type_ = js::jit::JitFrame_Exit, returnAddressToFp_ = 0x0,
frameSize_ = 0, cachedSafepointIndex_ = 0x0, activation_ = 0x7fffffffd550,
mode_ = js::SequentialExecution}
| Assignee | ||
Comment 12•11 years ago
|
||
(In reply to Jon Coppeard (:jonco) from comment #11)
> I reproduced the crash locally running ion/bug837312.js with --ion-eager
> --ion-parallel-compile=off.
I couldn't reproduce it locally. Thanks again, this saves me some frustrating Try-debugging! It's an unrelated issue, bug 1009957. I pushed a patch for that and confirmed the rootanalysis shell builds are green now:
https://tbpl.mozilla.org/?tree=Try&rev=71da2c1cbac1
(This is on top of the patches here.)
https://hg.mozilla.org/integration/mozilla-inbound/rev/1895f37082ec
https://hg.mozilla.org/integration/mozilla-inbound/rev/7493b5574762
Flags: needinfo?(jdemooij)
https://hg.mozilla.org/mozilla-central/rev/1895f37082ec
https://hg.mozilla.org/mozilla-central/rev/7493b5574762
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla32
You need to log in
before you can comment on or make changes to this bug.
Description
•