Closed Bug 1052579 Opened 10 years ago Closed 6 years ago

Store JS string buffers in the data heap

Categories

(Core :: JavaScript Engine, defect, P1)

defect

Tracking

()

RESOLVED FIXED
mozilla68
Tracking Status
firefox68 --- fixed

People

(Reporter: benjamin, Assigned: cmartin)

References

(Blocks 1 open bug)

Details

Attachments

(9 files, 6 obsolete files)

47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
See bug 1052575 for overview: this bug tracks storing JS engine string buffers in the segregated data heap.
Depends on: 1364359
Priority: -- → P3
Assignee: nobody → cmartin

active work

Priority: P3 → P1
Depends on: 1527767
Attached patch string_arena_assert.patch (obsolete) — Splinter Review

This is the patch I've been using for finding places where string memory is allocated from outside the new string heap.

Any time the JSString class receives a new value for its char buffers, I added a check that the buffer is located in the new StringBufferArena. If not, it will print its PID and infinite-loop so I can attach the VS Debugger and inspect the call stack.

It also checks reallocs, since there were a couple instances where code was trying to "cross arenas" during a realloc. Same behavior - Prints the PID and freezes.

This patch is not intended to be checked in as-is, although something like it may appear in the final product as a MOZ_ASSERT() to ensure that nobody accidently adds new code that allocates string buffers in the default heap.

I've managed to get all XPCShell tests passing on my machine with the 'new arena + asserts' combo. I'm working on getting all the Mochitests to pass as well. I'm a little uncomfortable with relying on only testing to find all the right codepaths though, so I'm also trying a different approach that may help more.

I'm using the compiler itself to help me find the sites where allocation happens by renaming everything in the datapath for the JSString heap storage and letting the compile errors tell me where to look. Then recursively I go to each new spot and rename all the datapath items there.

If all goes well, then when the code eventually compiles again, it will mean that I've found every allocation site that eventually ends up as the heap storage for a JSString.

Example

(Notice the 'x_' prefix on everything I've found)

-JS_PUBLIC_API JSString* JS_NewUCString(JSContext* cx, char16_t* chars,
+JS_PUBLIC_API JSString* x_JS_NewUCString(JSContext* cx, char16_t* chars,
                                        size_t length) {
   AssertHeapIsIdle();
   CHECK_THREAD(cx);
-  return NewString(cx, chars, length);
-}
-
-JS_PUBLIC_API JSString* JS_NewUCStringDontDeflate(JSContext* cx,
+  return x_NewString(cx, chars, length);
+}
+
+JS_PUBLIC_API JSString* x_JS_NewUCStringDontDeflate(JSContext* cx,
                                                   char16_t* chars,
                                                   size_t length) {
   AssertHeapIsIdle();
   CHECK_THREAD(cx);
-  return NewStringDontDeflate(cx, chars, length);
+  return x_NewStringDontDeflate(cx, chars, length);
 }

(In reply to Chris Martin [:cmartin] from comment #3)

I'm using the compiler itself to help me find the sites where allocation happens by renaming everything in the datapath for the JSString heap storage and letting the compile errors tell me where to look. Then recursively I go to each new spot and rename all the datapath items there.

If all goes well, then when the code eventually compiles again, it will mean that I've found every allocation site that eventually ends up as the heap storage for a JSString.

It's a good tactic. :-) I used it in bug 991981 five years ago. It can be a bit of a sprawling tactic, but that's how the cookie crumbles. :-| 'least you aren't doing it for a security bug.

I'm almost at the point where I've hit most of the major allocation sites and solved whatever problems existed, but I am a
little hung up on the StringBuffer class, which is a generic JS utility class that is used for building up a heap buffer from a
bunch of other string-like objects. I'm having trouble deciding how to best handle this class, as it is used for both
JSStrings and non-JSStrings.

Currently, it covers multiple use cases...

Usecases

Usecase 1 - finishString() (api) (usage)

In some uses, it is used to build up a heap buffer that will eventually be handed off to a JSFlatString. In this case,
it will now need to be allocated on the new String Heap

Usecase 2 - stealChars() (api) (usage)

In some uses, it is stolen by directly taking over the underlying char16_t*. There seems to be a bit of a 'leaky'
abstraction with this usecase, since the ptr is assumed to be freeable by calling js_free()

Usecase 3 - "Plain old buffer" (api) (usage)

In some uses, it gives away a mutable reference to its private members and is used directly as a sort-of "vector of
chars with some convenient append() functions". Some code even reallocates the underlying buffers without telling
the containing class about it
, which seems like the code does not view the StringBuffer class as an abstraction
at all.

Suggestions

I see a couple of paths I could take to change it. I'm curious to know which seem like a good idea:

  1. Add an 'arena_id' argument to the ctor, and add MOZ_ASSERT(m_arena_id == StringArena); to finishString() to
    ensure it is only used with buffers allocated in the new arena.

  2. Same as #1, but also cleanup the 2 leaky abstractions from above: Have Usecase 2 return a smart pointer that
    frees the buffer using the correct policy, and add APIs to the class to support Usecase 3 without directly
    exposing the private member.

  3. Change StringBuffer to a struct and make its members public. This essentially turns it into a "char vector
    with associated append() functions", which is how Usecase 3 above already treats it. Then implement
    Usecases 1 and 2 as seperate classes that use StringBuffer as their underlying implementation. This has several
    advantages over the first two options:

    • It separates the 3 usecases into their own classes with clear responsibilities
    • There is no need to add extra storage for an arena_id member because it will be known at compile time
    • If a developer makes a mistake and uses the wrong arena, they will get a clear compile-time type error
      instead of an assertion at runtime

Suggestion #1 represents the minimum change I could make for this bug. That gets the job done, but since I'm
already elbows-deep in this code I wouldn't mind leaving it better than I found it :)

Attached patch alloc_policy.patch (obsolete) — Splinter Review

Gian-Carlo mentioned that he might be able to also help with testing on Linux to find allocation sites that I may have missed.

I am uploading the patches I've been using for testing. They are NOT meant to be the final patches that will be checked in. In many cases, I just solved a problem in the fastest way possible to help me get past the assertion to the next allocation site.

This is patch 1/3

Attached patch stringbuffer_arena.patch (obsolete) — Splinter Review

This is patch 2/3

Attached patch allocation_sites.patch (obsolete) — Splinter Review

This is patch 3/3

I forgot to mention it above, but these patches are against changeset 198cd4a81bf2 in mozilla-central. I haven't tried yet to see how cleanly they would apply against tip because I've switched away to the different strategy I've talked about in a previous comment.

Thanks for offering to help, Gian-Carlo :)

Several areas of JS code abstract their memory allocation logic behind an
AllocPolicy. Since these allocations may now need to be in different arenas,
each AllocPolicy will need to be assigned a Mozjemalloc arena that it will
use for all its allocations.

Refactored a couple of minor issues with JS StringBuffer, partly to prepare
it to handle the new JSString buffer arena.

  1. There were public functions that were returning mutable references to
    private members, making them not-really "private" at all. They are now
    hidden and new API calls were added to provide needed functionality.
  2. ExtractWellSized() now uses the buffer's policy to resize/free it instead
    of directly calling JS alloc API
  3. Changed FinishStringFlat to no longer take a reference to a StringBuffer
    and a reference to a StringBuffer member

Depends on D25705

Any code that calls StringBuffer::finishString() is creating a
JSFlatString from a StringBuffer's backing buffer. That backing buffer should
always be allocated on the new String Arena heap.

This new class will ensure that this behavior is statically enforced by the
compiler.

Depends on D25706

Several of these JS char-copying functions allocate memory, and currently
they implicitly do it in the MallocArena. They now take an explicit argument
about which arena they perform their allocation in.

Depends on D25707

Currently, JSAPI malloc calls can only allocate in MallocArena. Now there
are calls for when the user intends to allocate a buffer that will be
"stolen" by one of the NewString calls.

Depends on D25708

Now that all JSString buffers have been moved to the new arena, add assertions
to prevent future developers from accidently allocating new buffers in the
wrong arena.

Depends on D25710

Attachment #9045651 - Attachment is obsolete: true
Attachment #9050865 - Attachment is obsolete: true
Attachment #9050866 - Attachment is obsolete: true
Attachment #9050868 - Attachment is obsolete: true
Attachment #9053456 - Attachment is obsolete: true

Please check-in D25705 for now :)

Keywords: checkin-needed

Pushed by apavel@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/cf6d9523f2cb
Allow JS AllocPolicy to be assigned to an arena r=sfink

Keywords: checkin-needed
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla68
Status: RESOLVED → REOPENED
Resolution: FIXED → ---

Please check-in D25706 (and don't close the ticket if possible)

Keywords: checkin-needed

I'm confused - D25705 already landed in comment 18, but the revision was never closed in Phabricator. Which in turn is confusing Lando into thinking that D25706 can't land without also landing dependent revision D25705. Chris, can you please close D25705?

Flags: needinfo?(cmartin)

Pushed by rvandermeulen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/18a310712e50
Refactor and cleanup for JS StringBuffer class r=sfink

Keywords: checkin-needed

I closed it. It had to be re-opened for me to rebase my commit stack against mozilla-central for whatever reason. Otherwise moz-phab was complaining that it couldn't edit the revision. Weird stuff...

Flags: needinfo?(cmartin)

Please check-in D25707

Keywords: checkin-needed

Pushed by rvandermeulen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/9ad896485f89
Move StringBuffer::finishString() and update all usage sites r=sfink

Keywords: checkin-needed

Please check-in D25708

Keywords: checkin-needed

Pushed by csabou@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/1b3804497a8e
Add explicit arena to several char-copying JS functions r=sfink

Keywords: checkin-needed

To ensure that any new JSString has its char buffer allocated in the new arena,
it is useful to be able to query a pointer and assert that it is in the
correct arena (at-least in Debug Build).

This adds the required functionality to mozjemalloc, and JSString can use it
for its new assertion in a later change.

Attachment #9055027 - Attachment description: Bug 1052579 - Add assertions to JSString to catch allocation regressions → Bug 1052579 - Add ability to query ArenaID to mozjemalloc_ptr_info
Attachment #9057882 - Attachment description: Bug 1052579 - Add ability to query ArenaID to mozjemalloc_ptr_info → Bug 1052579 - Add assertions to JSString to catch allocation regressions

Please check-in D25709

Keywords: checkin-needed

Pushed by ccoroiu@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/ec340bb2dd46
Add new JSAPI calls for allocating string buffers r=sfink

Keywords: checkin-needed

Please check-in D25711. Thanks!

Keywords: checkin-needed

We're sorry, Autoland could not rebase your commits for you automatically. Please manually rebase your commits and try again. (255, 'applying /tmp/tmpT5CDEf\npatching file js/src/vm/Compartment.cpp\nHunk #2 FAILED at 120\n1 out of 2 hunks FAILED -- saving rejects to file js/src/vm/Compartment.cpp.rej\npatching file js/src/util/StringBuffer.h\nHunk #1 succeeded at 41 with fuzz 1 (offset 3 lines).\nHunk #2 FAILED at 83\nHunk #3 FAILED at 263\n2 out of 3 hunks FAILED -- saving rejects to file js/src/util/StringBuffer.h.rej\npatching file js/src/ctypes/CTypes.cpp\nHunk #1 FAILED at 7783\n1 out of 1 hunks FAILED -- saving rejects to file js/src/ctypes/CTypes.cpp.rej\npatching file js/src/builtin/TestingFunctions.cpp\nHunk #2 FAILED at 2459\nHunk #3 FAILED at 3510\n2 out of 3 hunks FAILED -- saving rejects to file js/src/builtin/TestingFunctions.cpp.rej\nabort: patch failed to apply', '')

Flags: needinfo?(cmartin)

From that Autoland error, it looks like merging D25710 was attempted, but it's D25711 that I need the check-in for, please.

Sorry if there was confusion about the order they appear in the list. I later decided to change the order on a couple of these patches.

Flags: needinfo?(cmartin) → needinfo?(csabou)
Keywords: checkin-needed

Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/497561b76737
Add ability to query ArenaID to mozjemalloc_ptr_info. r=glandium

Keywords: checkin-needed

Lando's confused because D25711 depends on D25710 in Phabricator. You'll need to fix up the dependencies similar to last time (and probably manually close D25711).

Flags: needinfo?(csabou) → needinfo?(cmartin)

Sorry about that everyone - Fixed the build breakage and also changed the Phab review to get rid of the dependency. It should merge properly now. Please check-in D25711.

Flags: needinfo?(cmartin)
Keywords: checkin-needed

Pushed by rvandermeulen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/401f2be1ced1
Add ability to query ArenaID to mozjemalloc_ptr_info r=glandium

Keywords: checkin-needed

(In reply to Chris Martin [:cmartin] from comment #10)

Created attachment 9055021 [details]
Bug 1052579 - Allow JS AllocPolicy to be assigned to an arena

Several areas of JS code abstract their memory allocation logic behind an
AllocPolicy. Since these allocations may now need to be in different arenas,
each AllocPolicy will need to be assigned a Mozjemalloc arena that it will
use for all its allocations.

Hey, I just noticed this adds a reference to each JS AllocPolicy. There can be many (live) Vectors in the JS engine so this might have non-trivial memory overhead. Was this an intentional change? It would be cool if we could restrict it to just those Vectors that need non-default arenas :)

Flags: needinfo?(cmartin)

This does seem to manifest as a (slight) memshrink regression.

https://treeherder.mozilla.org/perf.html#/graphs?series=autoland,1959145,1,4&zoom=1555916882771.978,1555983682515.1099,3990000,4030000&selected=autoland,1959145,461838,788278614,4

That link is for Base Content JS, but more complex tests may see this more.

(In reply to Jan de Mooij [:jandem] from comment #43)

There can be many (live) Vectors in the JS engine so this might have non-trivial memory overhead. Was this an intentional change? It would be cool if we could restrict it to just those Vectors that need non-default arenas :)

Darn - I didn't realize that it would be a significant overhead, and so I more-or-less just chose the simplest change.

I actually had originally wrote this change with the AllocPolicy types templated with a "const arena_id_t&" parameter, but when I tried that a bunch of other code sites broke, so I gave up and went path-of-least-resistance :)

But alas, if it is too much overhead then I will likely have to take another shot at doing it with the templates.

Alrighty - I'll probably try and tackle that again after I finish with this last big patch that changes over all the allocation sites. Thanks for catching that memory regression! :)

Flags: needinfo?(cmartin)

Please check-in D25710.

Keywords: checkin-needed

Pushed by rvandermeulen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/e51a022e039f
Change all found JSString allocation sites to new arena r=sfink

Keywords: checkin-needed
Attachment #9057882 - Attachment is obsolete: true
Regressions: 1547052

In D25705, I added a new arenaId member to the js::BaseAllocPolicy. This
increased the size of everything that uses a JS AllocPolicy, which is a
lot.

This change follows suit from earlier work, which is to make everything
allocation-related have an "arena" version and a "default" version that
uses the arena version with the implied default arena.

StringBuffer is then changed to use this new functionity to define its
own alloc policy that uses the new StringBufferArena.

Please check-in D29685. Thanks!

Keywords: checkin-needed

Pushed by rvandermeulen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/596926f7e812
Remove arenaId member that was introduced in D25705 r=sfink

Keywords: checkin-needed

In D25711, I added an arenaId member to jemalloc_ptr_info_t when MOZ_DEBUG
is defined. This adds a GTest to ensure that the new member returns the correct
value.

Attachment #9063002 - Attachment description: Bug 1052579 - Add GTest for arenaId debugging added to jemalloc_ptr_info() → Bug 1052579 - Modify GTest for jemalloc_ptr_info() to check arenaId
Pushed by cmartin@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/5268e93ef348 Modify GTest for jemalloc_ptr_info() to check arenaId r=glandium
Status: REOPENED → RESOLVED
Closed: 6 years ago6 years ago
No longer depends on: 1364359
Keywords: leave-open
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: