Closed Bug 1009466 Opened 11 years ago Closed 10 years ago

Add a wrapper class for JSString char access

Tracking

()

Status:

RESOLVED INVALID

People

(Reporter: jandem, Assigned: jandem)

References

Details

Jan de Mooij [:jandem]

Assignee

Description

•

11 years ago

For bug 903519 (allocating strings in the nursery) we need to ensure there are no raw jschar* pointers on the stack (for chars owned by JSStrings at least). After discussing this with Terrence we decided to add some RAII class that holds a jschar*, and this pointer will be updated when a GC moves the string. Bug 998392 will be touching a lot of the same code and this new char wrapper could make it easier to support multiple encodings. I want to prototype something and see how it works out.

Jan de Mooij [:jandem]

Assignee

Comment 1

•

11 years ago

After looking at some code in jsstr.cpp, here are a few thoughts: (1) In many/most cases, no GC can happen while we are using the chars. (2) There are a few functions, like UnrolledMatch, that do pointer arithmetic etc and need high performance. To support these use cases, we could add "raw" types like this: class TwoByteCharPtr { jschar *chars_; } class Latin1CharPtr { jschar8 *chars_; } // Or UTF8 etc. class CharPtr { union of char* types + enum value jschar operator[](size_t i) const; } And have these implement operator+ etc for pointer arithmetic. Then the analysis could treat these as GC hazards if they are used across GC calls. Terrence is this possible? I think this would take care of most string->chars() calls, with pretty much no overhead. In debug builds these types would also give us bounds checks. If we *can* GC while holding the chars (or don't care about the overhead), we use another variant that does the GC magic: RootedTwoByteCharPtr ptr = string->twoByteChars(); RootedCharPtr ptr = string->chars(); TwoByteCharPtr can be assigned to RootedTwoByteCharPtr; RootedTwoByteCharPtr can be passed to helper functions that take TwoByteCharPtr etc. Thoughts?

Flags: needinfo?(terrence)

Terrence Cole [:terrence]

Comment 2

•

11 years ago

(In reply to Jan de Mooij [:jandem] from comment #1) > After looking at some code in jsstr.cpp, here are a few thoughts: > > (1) In many/most cases, no GC can happen while we are using the chars. > (2) There are a few functions, like UnrolledMatch, that do pointer > arithmetic etc and need high performance. > > To support these use cases, we could add "raw" types like this: > > class TwoByteCharPtr { jschar *chars_; } > class Latin1CharPtr { jschar8 *chars_; } // Or UTF8 etc. > class CharPtr { > union of char* types + enum value > jschar operator[](size_t i) const; > } The type=encoding classes already exist in js/public/CharacterEncoding.h. There are variants for Latin1, UTF8, and TwoByteChars in both null-terminated and pointer-range varieties. They are based on mozilla::Range/RangePtr, so we get efficient access and range assertions automatically. > And have these implement operator+ etc for pointer arithmetic. Then the > analysis could treat these as GC hazards if they are used across GC calls. > Terrence is this possible? Absolutely. That's an extremely elegant solution, too! With that, we would get protection against unwanted GC's and at the same time let C++ types ensure we have the right string encoding at all times. You just need to add the new names to |isRootedPointerTypeName| in js/src/devtools/rootAnalysis/annotations.js; should be trivial. > I think this would take care of most string->chars() calls, with pretty much > no overhead. In debug builds these types would also give us bounds checks. > > If we *can* GC while holding the chars (or don't care about the overhead), > we use another variant that does the GC magic: > > RootedTwoByteCharPtr ptr = string->twoByteChars(); > RootedCharPtr ptr = string->chars(); > > TwoByteCharPtr can be assigned to RootedTwoByteCharPtr; RootedTwoByteCharPtr > can be passed to helper functions that take TwoByteCharPtr etc. In theory, we should be able to implement GCMethods<TwoByteChars> and get these classes for free. At the moment, no extra marking would be needed for these, since the chars can't move. When we do move them, the owning string would simply traverse the Rooted<TwoByteChars> list and forward the chars manually. > Thoughts? As always, we're thinking in the same direction. Your idea of making the underlying chars types into GC pointers for the analysis is brilliant. It should make this conversion straightforward.

Flags: needinfo?(terrence)

Jan de Mooij [:jandem]

Assignee

Comment 3

•

11 years ago

(In reply to Terrence Cole [:terrence] from comment #2) > The type=encoding classes already exist in js/public/CharacterEncoding.h. > There are variants for Latin1, UTF8, and TwoByteChars in both > null-terminated and pointer-range varieties. They are based on > mozilla::Range/RangePtr, so we get efficient access and range assertions > automatically. That's true, but those are probably also used for chars that are not owned by strings and we don't want the static analysis to complain about that. Maybe I could implement the other classes in terms of these though... Thanks for the feedback!

Terrence Cole [:terrence]

Comment 4

•

11 years ago

Good point! The analysis should be seeing the declared names, so even a typedef would work there, although it would be nice to have the compiler catch accidental casts.

Jan de Mooij [:jandem]

Assignee

Comment 5

•

11 years ago

I'm now beginning to doubt we really need a "rooted char" class at all, if we have an "unrooted char" type that the static analysis can check. In many cases you could just store an index or something, for instance ReplaceData::dollar/dollarEnd. I'll see how far we get with that.

Jan de Mooij [:jandem]

Assignee

Updated

•

11 years ago

Depends on: 1011693

Terrence Cole [:terrence]

Comment 6

•

10 years ago

I'm guessing this really isn't necessary, given the state of the latin1 patches.

Status: ASSIGNED → RESOLVED

Closed: 10 years ago

Resolution: --- → INVALID

You need to log in before you can comment on or make changes to this bug.

Bugzilla

Quick Search

Add a wrapper class for JSString char access

Categories

(Core :: JavaScript Engine, defect)

Tracking

()

People

(Reporter: jandem, Assigned: jandem)

References

Details

Crash Data

Security

(public)

User Story

Description

Comment 1

Comment 2

Comment 3

Comment 4

Comment 5

Updated

Comment 6