Closed
Bug 1008590
Opened 10 years ago
Closed 10 years ago
Remove chars pointer for inline strings, store JSString length and flags separately
Categories
(Core :: JavaScript Engine, defect)
Core
JavaScript Engine
Tracking
()
RESOLVED
FIXED
mozilla32
People
(Reporter: jandem, Assigned: jandem)
References
Details
Attachments
(1 file, 2 obsolete files)
71.43 KB,
patch
|
luke
:
review+
|
Details | Diff | Splinter Review |
With the patch in bug 1008421, JSString's 4th word is only used for inline chars. My initial plan was to store the string length there, and stuff the length for inline strings in the flags word (it must be very small if we use inline chars). However, this is a bit annoying because we'd have two different length fields and it complicates a lot of code, not just JSString::length(). There's a much nicer alternative: stop storing the chars pointer for inline strings, as it can be derived from the string pointer. This will add a branch to JSLinearString::chars(), but this seems negligible. Then we can store the length and flags in two separate 32-bit fields and eliminate the right-shift to get the string length. It also means we can shrink JSString on 64-bit from 4 to 3 words. The attached patch does this and passes jit-tests on x86 and x64. I also refactored the flags a bit: because we have 32 available flags now, we can make some queries (isFlat, isInline, isFatInline etc) more efficient. Benchmarks don't seem affected (on x64 Octane-pdfjs is about 500 points faster, likely because our strings are smaller now on 64-bit).
Assignee | ||
Updated•10 years ago
|
Blocks: latin1strings
Assignee | ||
Comment 1•10 years ago
|
||
This is also nice for bug 903519 (allocating strings and chars in the nursery), because it eliminates the getAllocKind() calls in JSString::isFatInline and JSString::isExternal.
Blocks: 903519
Comment 2•10 years ago
|
||
Comment on attachment 8420545 [details] [diff] [review] WIP Review of attachment 8420545 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/vm/String.h @@ +197,5 @@ > + 000101 Inline > + 010101 FatInline > + 100001 External > + 001001 Atom > + 101001 PermAtom I'd love to have every possible combination documented, even if some of them are invalid.
Assignee | ||
Comment 3•10 years ago
|
||
Green on Try.
Assignee | ||
Updated•10 years ago
|
Attachment #8420545 -
Attachment is obsolete: true
Assignee | ||
Comment 4•10 years ago
|
||
(In reply to Nicholas Nethercote [:njn] from comment #2) > I'd love to have every possible combination documented, even if some of them > are invalid. I tweaked the comment a bit, is this fine or should I add more info?
Flags: needinfo?(n.nethercote)
Comment 5•10 years ago
|
||
Comment on attachment 8420839 [details] [diff] [review] Patch Review of attachment 8420839 [details] [diff] [review]: ----------------------------------------------------------------- I'm hoping to be able to easily answer questions like "what would flags 111111 mean... is that even valid?"
Assignee | ||
Comment 6•10 years ago
|
||
(In reply to Nicholas Nethercote [:njn] from comment #5) > I'm hoping to be able to easily answer questions like "what would flags > 111111 mean... is that even valid?" The comment explains that's a string with the IsFlat, HasBase, IsInline, IsAtom flags (the lowest 4 bits). It also explains HasBase is only used for Dependent and Undepended and these have fixed flags: never used as atoms and don't have inline chars, so this type is not valid :) FWIW, if we remove the HasBase flag, we get 111101: a FatInlinePermanentAtom. I'm not sure we use Inline/FatInline atm for permanent atoms, but the type itself is valid.
Comment 7•10 years ago
|
||
> > I'm hoping to be able to easily answer questions like "what would flags
> > 111111 mean... is that even valid?"
>
> The comment explains that's a string with the IsFlat, HasBase, IsInline,
> IsAtom flags (the lowest 4 bits). It also explains HasBase is only used for
> Dependent and Undepended and these have fixed flags: never used as atoms and
> don't have inline chars, so this type is not valid :)
Well, yes. It'd be nice if I didn't have to deduce all that for myself. My motivation is that in the past I tried to enumerate all the possibilities exactly like this and I was unable to work out if certain combinations were valid. The new scheme is simpler, so maybe such a comment is not necessary; it's just a suggestion.
Flags: needinfo?(n.nethercote)
Assignee | ||
Comment 8•10 years ago
|
||
Rebased to tip, with some minor tweaks.
Attachment #8420839 -
Attachment is obsolete: true
Attachment #8420839 -
Flags: review?(luke)
Attachment #8423000 -
Flags: review?(luke)
Comment 9•10 years ago
|
||
Comment on attachment 8423000 [details] [diff] [review] Patch Review of attachment 8423000 [details] [diff] [review]: ----------------------------------------------------------------- Overall, looks great. A couple nits and possibly one bug below. ::: js/src/vm/String.h @@ +175,5 @@ > * have no such entry. The "subtype predicate" entry for a type specifies > * the predicate used to query whether a JSString instance is subtype > * (reflexively) of that type. > * > + * Rope 000000 000000 Pre-existing, but we seem to have lost the column headings for "instance encoding" and "subtype encoding"; can you add them back in? @@ +201,5 @@ > + * > + * Some types also use the other 2 bits: > + * > + * Bit 4: Extensible, FatInline > + * Bit 5: External, PermanentAtom This was a bit confusing. I think what you're saying is that bits 4 and 5 don't imply a particular string, they are part of a denser encoding scheme. In that case, listing out the cases just restates the above table. @@ +205,5 @@ > + * Bit 5: External, PermanentAtom > + * > + * The Rope, Dependent, Undepended, Extensible and External types have fixed flags: > + * none of the other flags will be set for these types (they are not used as > + * atoms and don't have inline chars). I think this para is mostly implied by the table which specifies the instance encoding. @@ +223,2 @@ > > + static const uint32_t PERMANENT_BIT = JS_BIT(5); Assuming External strings aren't permanent, I don't think you should have a PERMANENT_BIT, just a PERMANENT_ATOM_FLAGS (in the style of EXTERNAL_FLAGS). Also, there is an isPermanent() below that would return true for an External string. @@ +383,5 @@ > } > > + MOZ_ALWAYS_INLINE > + bool isFatInline() const { > + return (d.u1.flags & FAT_INLINE_FLAGS) == FAT_INLINE_FLAGS; I wonder if it would be nicer to distinguish *_FLAGS (which imply that you test with d.u1.flags == *_FLAGS) and cases where you you need to do this mask + equality. In generality, the mask could be different then the compared value, so you could have a (*_MASK, *_MATCH) pair for each one. What do you think? @@ +421,5 @@ > > /* Only called by the GC for dependent or undepended strings. */ > > inline bool hasBase() const { > + JS_STATIC_ASSERT((DEPENDENT_FLAGS | FLAT_BIT) == UNDEPENDED_FLAGS); Seems like you don't need the static assert, given that this is all by design now. @@ +671,5 @@ > template <js::AllowGC allowGC> > static inline JSInlineString *new_(js::ThreadSafeContext *cx); > > inline jschar *init(size_t length); > + inline jschar *initFat(size_t length); Could you move this to JSFatInlineString? I was thinking it would be nice to use the static type to prevent initializing an inline string with initFat or vice versa.
Attachment #8423000 -
Flags: review?(luke) → review+
Assignee | ||
Comment 10•10 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/85e85da78909 (Green on Try a few days ago.) (In reply to Luke Wagner [:luke] (on partial PTO) from comment #9) > Overall, looks great. A couple nits and possibly one bug below. Thanks for the careful review! > ::: js/src/vm/String.h > Pre-existing, but we seem to have lost the column headings for "instance > encoding" and "subtype encoding"; can you add them back in? Done. > This was a bit confusing. I think what you're saying is that bits 4 and 5 > don't imply a particular string, they are part of a denser encoding scheme. > In that case, listing out the cases just restates the above table. I can see how it could be confusing. Removed that part. > I think this para is mostly implied by the table which specifies the > instance encoding. Same. > Assuming External strings aren't permanent, I don't think you should have a > PERMANENT_BIT, just a PERMANENT_ATOM_FLAGS (in the style of EXTERNAL_FLAGS). > Also, there is an isPermanent() below that would return true for an External > string. isPermanent is a method of JSAtom and External strings are not atoms, so there's no bug. I agree PERMANENT_BIT is confusing, removed it. I also changed JSAtom::isPermanent() to just call JSString::isPermanentAtom(). > I wonder if it would be nicer to distinguish *_FLAGS (which imply that you > test with d.u1.flags == *_FLAGS) and cases where you you need to do this > mask + equality. In generality, the mask could be different then the > compared value, so you could have a (*_MASK, *_MATCH) pair for each one. > What do you think? Nice idea. I think the separate MASK/MATCH constants could make it more confusing though. I changed it to: static const uint32_t FAT_INLINE_MASK = INLINE_CHARS_BIT | JS_BIT(4); static const uint32_t PERMANENT_ATOM_MASK = ATOM_BIT | JS_BIT(5); /* Initial flags for inline and fat inline strings. */ static const uint32_t INIT_INLINE_FLAGS = FLAT_BIT | INLINE_CHARS_BIT; static const uint32_t INIT_FAT_INLINE_FLAGS = FLAT_BIT | FAT_INLINE_MASK; The *_MASK ones are used for type tests. > Could you move this to JSFatInlineString? I was thinking it would be nice > to use the static type to prevent initializing an inline string with initFat > or vice versa. I wasn't sure about overloading init() with the same signature, but it makes sense here; done.
Comment 11•10 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/85e85da78909
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla32
You need to log in
before you can comment on or make changes to this bug.
Description
•