Closed Bug 1013917 Opened 11 years ago Closed 11 years ago

Add Latin1 string flag

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla32

People

(Reporter: jandem, Assigned: jandem)

References

Details

Attachments

(2 files, 1 obsolete file)

Attached patch Part 1 - JSString changes (obsolete) — Splinter Review
This patch adds a new string flag for latin1 strings and makes a number of other JSString changes.
Attachment #8426192 - Flags: review?(luke)
Adds two new shell functions: isLatin1(s) and toLatin1(s). The latter is not available with --fuzzing-safe for obvious reasons. This conversion function is a bit lame, but it allows me to convert most functions working on strings one at a time and write tests for them.
Attachment #8426195 - Flags: review?(luke)
Comment on attachment 8426192 [details] [diff] [review] Part 1 - JSString changes Review of attachment 8426192 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/vm/String.h @@ +220,5 @@ > + * > + * The LATIN1_CHARS_BIT specifies the character encoding. If set, the string's > + * characters are stored as Latin1 instead of TwoByte. This saves memory because > + * Latin1 strings use one byte per character instead of two. Note that Latin1 > + * strings are not yet enabled by default, see bug 998392. Could you integrate this comment a bit more with the preceding comment blocks? In particular, can you: - mention it in the bulleted list of optimizations in the JSString comment - mention latin1 vs. twoByte in the operations+fields / invariants+properties columns - update the encoding bits table (e.g., both the subtype predicate and instance encoding of Rope need an x in the encoding bit - add Bit 6 to your bulleted bit list below that @@ +348,5 @@ > /* Type query and debug-checked casts */ > > MOZ_ALWAYS_INLINE > bool isRope() const { > + return (d.u1.flags & TYPE_FLAGS_MASK) == ROPE_FLAGS; What's your intended scheme for ropes? I was thinking we'd allow building ropes out of mixed latin1 and 2byte and the encoding bit would only indicate whether the rope was *exclusively* composed of latin1 (in which case flattening produced latin1; otherwise flattening would inflate latin1). @@ +573,5 @@ > public: > MOZ_ALWAYS_INLINE > const jschar *nonInlineChars() const { > JS_ASSERT(!isInline()); > + JS_ASSERT(hasTwoByteChars()); Is the eventual goal that you'll have manually converted all use sites of strings and replaced all uses of the un-prefixed accessors (like 'chars' and 'range') with prefixed accessors (like 'twoByteChars' and 'twoByteRange')? Or are we going to permanently bias towards twoByte by making unprefixed 'chars' mean 'twoByteChars'? I think the former is preferable in avoiding people writing paths that break when latin1 flows through.
Attachment #8426192 - Flags: review?(luke) → review+
Attachment #8426195 - Flags: review?(luke) → review+
I was thinking that it'd be nice to get some fuzzing before the entire conversion is complete. It's hard, though, since you can't let the fuzzer arbitrarily create latin1 strings.
(In reply to Luke Wagner [:luke] from comment #2) > - update the encoding bits table (e.g., both the subtype predicate and > instance encoding of Rope need an x in the encoding bit > - add Bit 6 to your bulleted bit list below that My idea there was that the encoding table etc would only describe the "type flags", the lowest 6 bits. The next section describes the other flag, latin1, which is kind of unrelated to the type. Do you think it's clearer to combine them as you suggested? > What's your intended scheme for ropes? I was thinking we'd allow building > ropes out of mixed latin1 and 2byte and the encoding bit would only indicate > whether the rope was *exclusively* composed of latin1 (in which case > flattening produced latin1; otherwise flattening would inflate latin1). Yes, that was also my plan. A rope will have the latin1 flag if both |left| and |right| are latin1. > Is the eventual goal that you'll have manually converted all use sites of > strings and replaced all uses of the un-prefixed accessors (like 'chars' and > 'range') with prefixed accessors (like 'twoByteChars' and 'twoByteRange')? > Or are we going to permanently bias towards twoByte by making unprefixed > 'chars' mean 'twoByteChars'? I think the former is preferable in avoiding > people writing paths that break when latin1 flows through. Definitely the former. I'll add twoByteChars() and friends soon, when there are no more calls to chars/getChars we can remove those.
needinfo for the question in comment 4 :) Thanks for the quick reviews btw.
Flags: needinfo?(luke)
Ah, I see your point. In that case, perhaps we can make this a bit more clear by, at the top of the flags comment: - add a title "The Flags Word\n" - explain that, at a high level, the flags word is broken down into type and encoding - explain encoding, then continue into the existing comment which is all about type.
Flags: needinfo?(luke)
I updated the comments and also fixed the GDB pretty printers. This applies on top of the patch in bug 1014131, I verified they work fine with this patch.
Attachment #8426933 - Flags: review?(luke)
Attachment #8426192 - Attachment is obsolete: true
Comment on attachment 8426933 [details] [diff] [review] Part 1 - JSString changes (v2) Review of attachment 8426933 [details] [diff] [review]: ----------------------------------------------------------------- Perfect, thanks. ::: js/src/vm/String.h @@ +158,4 @@ > struct { > union { > + const char *nonInlineCharsLatin1; /* JSLinearString, except JS(Fat)InlineString */ > + const jschar *nonInlineCharsTwoByte; Can you comment the nonInlineCharsTwoByte line as well?
Attachment #8426933 - Flags: review?(luke) → review+
I reproduced the crash locally running ion/bug837312.js with --ion-eager --ion-parallel-compile=off. Catchpoint 1 (signal SIGSEGV), js::jit::IonCommonFrameLayout::prevType (this=0x0) at /home/jon/work/rooting/js/src/jit/IonFrames.h:321 321 return FrameType(descriptor_ & FrameTypeMask); Catchpoint 1 (signal SIGSEGV), js::jit::IonCommonFrameLayout::prevType (this=0x0) at /home/jon/work/rooting/js/src/jit/IonFrames.h:321 321 return FrameType(descriptor_ & FrameTypeMask); (gdb) bt 10 #0 js::jit::IonCommonFrameLayout::prevType (this=0x0) at /home/jon/work/rooting/js/src/jit/IonFrames.h:321 #1 0x00000000006ff2c0 in js::jit::JitFrameIterator::prevType (this=0x7fffffffbd40) at /home/jon/work/rooting/js/src/jit/IonFrames-inl.h:51 #2 0x000000000087b549 in js::jit::JitFrameIterator::isFakeExitFrame ( this=0x7fffffffbd40) at /home/jon/work/rooting/js/src/jit/IonFrames-inl.h:57 #3 0x000000000082c50c in js::jit::MarkJitExitFrame (trc=0x7fffffffc130, frame=...) at /home/jon/work/rooting/js/src/jit/IonFrames.cpp:1012 #4 0x0000000000822491 in js::jit::MarkJitActivation (trc=0x7fffffffc130, activations=...) at /home/jon/work/rooting/js/src/jit/IonFrames.cpp:1186 #5 0x00000000008223c8 in js::jit::MarkJitActivations (rt=0x1d63480, trc=0x7fffffffc130) at /home/jon/work/rooting/js/src/jit/IonFrames.cpp:1214 #6 0x00000000005da0b1 in js::gc::GCRuntime::markRuntime (this=0x1d637e8, trc=0x7fffffffc130, useSavedRoots=false) at /home/jon/work/rooting/js/src/gc/RootMarking.cpp:764 #7 0x00000000005f1296 in js::Nursery::collect (this=0x1d64208, rt=0x1d63480, reason=JS::gcreason::DEBUG_GC, pretenureTypes=0x0) at /home/jon/work/rooting/js/src/gc/Nursery.cpp:784 #8 0x0000000000ab928c in js::gc::GCRuntime::minorGC (this=0x1d637e8, reason=JS::gcreason::DEBUG_GC) at /home/jon/work/rooting/js/src/jsgc.cpp:4983 #9 0x0000000000a99872 in js::MinorGC (rt=0x1d63480, reason=JS::gcreason::DEBUG_GC) at /home/jon/work/rooting/js/src/jsgc.cpp:4974 The JitFrameIterator has current_ pointing to zero: (gdb) p *this $6 = {current_ = 0x0, type_ = js::jit::JitFrame_Exit, returnAddressToFp_ = 0x0, frameSize_ = 0, cachedSafepointIndex_ = 0x0, activation_ = 0x7fffffffd550, mode_ = js::SequentialExecution}
(In reply to Jon Coppeard (:jonco) from comment #11) > I reproduced the crash locally running ion/bug837312.js with --ion-eager > --ion-parallel-compile=off. I couldn't reproduce it locally. Thanks again, this saves me some frustrating Try-debugging! It's an unrelated issue, bug 1009957. I pushed a patch for that and confirmed the rootanalysis shell builds are green now: https://tbpl.mozilla.org/?tree=Try&rev=71da2c1cbac1 (This is on top of the patches here.) https://hg.mozilla.org/integration/mozilla-inbound/rev/1895f37082ec https://hg.mozilla.org/integration/mozilla-inbound/rev/7493b5574762
Flags: needinfo?(jdemooij)
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla32
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: