Closed Bug 1053362 Opened 10 years ago Closed 5 years ago

Add a separate generational heap explicitly for string data

Categories

(Core :: JavaScript: GC, defect)

defect
Not set
normal

Tracking

()

RESOLVED WONTFIX

People

(Reporter: terrence, Unassigned, NeedInfo)

References

(Depends on 1 open bug, Blocks 2 open bugs)

Details

The Basic Idea:
* Bump alloc all string data into a separate (non-GC, non-malloc) heap.
* On GC (possibly GC_SHRINKING): when sweeping, copy any remaining live strings into malloc memory. I think we'll probably want to keep a utilization number while sweeping and do the evacuation on the next GC when it gets low enough.

The rationale here is three-fold:

1) Performance: In bug 903519, Jan found that allocating strings directly in the nursery is slower than not, because -- unlike with objects -- there is no way for us to tell if any particular string is going to be long-lived. This would let us bump-allocate all string data regardless.

2) Security: Having string data initially in a separate heap would make certain exploits harder.

3) Fragmentation: By only allocating long-lived strings in malloc memory (and doing a large number of long-lived allocations at once) we reduce the malloc heap's fragmentation.
There's not much in the jit right now that creates new strings, and certainly none that create malloc data. I guess we should first inline a bunch of string stuff, possibly using the new malloc stub so that we have an idea of how much data we're talking about.

I am going through old security bugs, and this one came up. AIUI, this work encompasses Bug 1052579 (which would move JS strings into another heap) and then adds the additional step of moving long-lived strings somewhere else.

Is that understanding correct? Is moving long-lived string somewhere else still an intriguing idea? Jonco suggested that unless you disagree, we could just close this bug as WONTFIX.

Flags: needinfo?(sphink)

Note also bug 1052579, which covers using a different jemalloc arena for string data. But that feels mostly independent of this.

This is related to allocating strings in the nursery, bug 1442481, though I find the idea here to be an interesting approach that I hadn't considered. It might be worth a try to see if it helps get the hoped-for speedup of nursery strings that doesn't seem to be materializing -- it would avoid the "string tenuring tax" at the cost of some temporary fragmentation.

If both bug 1442481 and bug 1052579 land (and I update bug 1442481 to use the string data arena), then this bug would provide no further security benefit. But we could leave this as blocking bug 1052575 for now? Your choice.

Flags: needinfo?(sphink)
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → WONTFIX

Hey Steve, I wanted to ask about bug 1442481 and your comment here - it doesn't look like that patch was updated to use the StringBuffer arena from bug 1052579. Is that correct? Could we do that as a followup?

Flags: needinfo?(sphink)
Flags: needinfo?(cmartin)

Hey Tom,

If I'm understanding the patches from bug 1442481 correctly, it looks like they are changing where they allocate the JSString object (IE the pointer/length pair for the character buffer), but the actual buffer that contains the characters is still properly being allocated in the new mozjemalloc StringBuffer arena.

Here is an example of a pretty typical JSString allocation. If you follow this call, it eventually leads directly to moz_arena_malloc with very little interaction with the GC, aside from updating its byte counter and reporting any OOM to it. At the end of the function, JSLinearString::new_() is called, which leads to js::AllocateString. This eventually leads us to the allocation of the JSLinearString object here. This looks like the spot that Steve's change applies to - Which is just the string metadata.

Bug 1052579 and Bug 1052582 were only meant for moving JSString and ArrayBufferObject "byte buffers" into a new mozjemalloc arena.

Do you think that we should open a new bug to also investigate moving the JSString and ArrayBufferObject metadata into their own heaps as well? I could imagine that these two types are particularly useful targets for Type Confusion/UAF Attacks since they contain pointers to user-controlled buffers.

Flags: needinfo?(cmartin) → needinfo?(tom)

Bug 1474659 ought to have moved the arraybuffer metadata. I agree we would like to isolate the string metadata also.

Flags: needinfo?(tom)
You need to log in before you can comment on or make changes to this bug.