Open Bug 1810953 Opened 1 year ago Updated 10 months ago

Replace arena lookup Mutex with a RWLock

Categories

(Core :: Memory Allocator, enhancement)

enhancement

Tracking

()

ASSIGNED

People

(Reporter: pbone, Assigned: pbone)

References

(Blocks 6 open bugs)

Details

(Whiteboard: [sp3])

Attachments

(1 obsolete file)

jemalloc provides an arena allocation API. Arenas are specified using arena IDs and these are looked up in a tree for every operation. Also the tree is locked for every lookup forcing all operations on alternative threads to go through a single lock.

I'd like to remove the arena IDs and use pointers in this API, removing this tree, its lookups and the lock.

Well, the reason they are IDs and not pointers is so that they can't be abused by attacker code.

I'm trying to understand under what circumstances having the pointer there is going to really help an attacker. I couldn't really come up with one myself, but that could just mean I'm not creative enough.

This is for DOMArena. If I'm thinking about it correctly, an attacker would need to be able to read the pointer member of DOMArena to find where the data is; but that's only helpful if they can't get the address to something allocated in that arena, and they have a fairly constrained memory read primitive that lets them read that member, but not much else (to find it via alternate means).

If they could overwrite the pointer member, I think they could direct execution via the vtable but that doesn't seem like a significantly more powerful capability than they already have, unless the write primitive is very limited.

Yannis, do you have some thoughts on this? Feel free to reach out to tjr or pbone if you need more context.

Flags: needinfo?(yjuglaret)
Blocks: 1811985
Blocks: 1811987

Hello,

I believe attackers typically target objects:

  1. which live close to the object for which they found a memory corruption vulnerability;
  2. for which they know (or find) an easy path towards a better primitive;
  3. ideally, for which they have easy control over the lifetime, which makes things easier.

So the important questions to me are:

  • Where do DOMArena objects end up when they get allocated? What other objects live there? How likely is it that the code that manages some of those objects has memory corruption vulnerabilities?
  • How easy is it to get a better primitive from corrupting a DOMArena object? Expanding on Tom's comment, as far as I could see neither DOMArena nor arena_t should have a vtable because they have no virtual methods; and the fields of arena_t don't seem to use types with virtual methods either, so no vtable through dereference either. The absence of vtable implies that DOMArena objects are unlikely to be on top of the list of easy targets an attacker would try, as vtables are a known easy way to get to an arbitrary code execution primitive -- depending on what mitigations are in place.
  • What control does an attacker have over the lifetime of DOMArena objects and/or other objects that live there?
  • Are there easier (maybe even well-known) targets than DOMArena objects, that also get allocated in there anyway?

Usually, developers go the other way around: if they notice that some kind of object has become a well-known target in public writeups or in-the-wild exploits, then they would isolate them in their own dedicated part of the heap. We could follow this kind of approach as well if we fear -- or discover later -- that DOMArena objects are now a juicy target for attackers. Personally, the resulting DOMArena from the change doesn't strike me as an obviously reachable and interesting target for an attacker, but I may be proven wrong as my knowledge around it is rather superficial.

No longer blocks: 1811987
Flags: needinfo?(yjuglaret)

Thank you Yannis and Tom for your input. I'll write the patch and give you another chance to take a look.

Note that DOMArena is not the only user of mozjemalloc arenas. Moreover, if I recall correctly, there have been exploits relying on digging in the mozjemalloc metadata in the past. Giving attackers an easier access to the metadata does not sound great.

Taking a step back, and a closer look at the implementation for ArenaCollection, I think it should be possible to keep the tree and have the best of both worlds: ids rather than pointers, without the lock contention.

We currently use a mutually exclusive lock, which is taken by three methods:

  • GetById: iterates over the tree structure and reads a field in the nodes;
  • CreateArena: adds a node to the tree;
  • DisposeArena: removes a node from the tree.

GetById is by far the most common call; the numbers of calls to CreateArena and DisposeArena should be negligible in comparison. So the vast majority of the lock contention should be the consequence of simultaneous calls to GetById. Yet, simultaneous calls to GetById do not need mutual exclusion, because this method does not alter the tree. Hence, I believe that we could get rid of the lock contention by moving from a mutually exclusive lock to a "multiple readers, single writer" synchronization primitive like std::shared_mutex.

I also believe that the additional performance gain we could make by removing the tree entirely, thus not resolving ids to pointers, is likely to be small compared to the part resulting from solving the lock contention. So I would suggest that we first change the synchronization primitive then reevaluate the performance gain we would get from removing the tree, if that makes sense to you?

(In reply to Yannis Juglaret from comment #7)

I also believe that the additional performance gain we could make by removing the tree entirely, thus not resolving ids to pointers, is likely to be small compared to the part resulting from solving the lock contention.

In theory yes, but let's validate it. Switching to an RWLock should be a simple patch, and I expect it might also be pretty easy to hack up a prototype of the direct-pointer approach. Assuming that's the case, let's profile both of them on windows and mac.

Okay, I like that idea and I have a workspace with the hacked up version removing the tree. I want to profile it after Bug 1809610 since that's the major source of contention in jemalloc now (lower hanging/juicer fruit).

We could also provide an API that works from a pointer rather than an ID and provide an "id to pointer" function so that some code that needs to do a lot of work can do "id to pointer" once, do their work, and then forget the pointer.

Also WRT to security. Any object in an arena can be easily turned into a pointer to that arena (by calculating its chunk address and finding the arena field within the chunk header). I don't know if this is still "easy" when it's in an exploit, or what the goal might be. It just "feels" like these arena addresses aren't very secret but we're still protecting them as if they are.

Although I suppose if you know the arena ID and can call the search functions you can find the address in that tree anyway.

Assignee: nobody → pbone
Status: NEW → ASSIGNED

(In reply to Paul Bone [:pbone] from comment #9)

Also WRT to security. Any object in an arena can be easily turned into a pointer to that arena (by calculating its chunk address and finding the arena field within the chunk header). I don't know if this is still "easy" when it's in an exploit, or what the goal might be. It just "feels" like these arena addresses aren't very secret but we're still protecting them as if they are.

One nice thing about arena ids is that it allows to reason about the security of mozjemalloc as an API that lives in isolation from its callers. You don't need to care where the callers of mozjemalloc put the ids you return, and whether those ids might get easily corrupted. Because even if they do get corrupted, then at worse mozjemalloc might be working with an ID that will result in a correct allocation but in another arena than the initial one. This may already be interesting for an attacker -- as it could essentially allow to bypass the isolation between two arenas -- which is why it can be nice to have hard-to-guess ids.

By having mozjemalloc return arbitrary arena_t pointers and take them as input, it suddenly becomes a much bigger deal if callers of the API put these pointers in locations where they may be easily corrupted. mozjemalloc may now receive pointers to raw data that is controlled by an attacker and carefully crafted to look like a legit arena_t, but isn't. An attacker could for example use this to control where in memory the next objects allocated by the caller of the API would end up. This could in turn make it easier to corrupt one of those freshly allocated objects, e.g. if the location chosen by the attacker is a part of memory over which they somehow have good control.

In our case though, we do not have to reason about mozjemalloc as an API that lives in isolation from its callers. We know where those pointers to arena would end up, and if that looks bad then we can either put them somewhere else, or protect them somehow (e.g., XOR them with a global randomly generated cookie). Also, I would consider these attacks rather theoretical, in the sense that I think a desperate attacker may consider them if for example they found a vulnerability that clearly only allows them to corrupt nothing else than this id or pointer. For the vast majority of heap corruptions, I think it would be very likely that an attacker can find an easier target that leads directly to a good exploitation primitive, typically by corrupting some pointer to code or some length.

(In reply to Paul Bone [:pbone] from comment #10)

Although I suppose if you know the arena ID and can call the search functions you can find the address in that tree anyway.

If an attacker can already call arbitrary functions then they wouldn't be at a stage where they would bother with arenas. Arenas are part of the potential targets an attacker may consider when trying to grow their capabilities from small constrained primitives to generic powerful ones. That being said, it is true that an attacker currently only needs an arbitrary read primitive and the address of a pointer to code to be able to iterate over the tree of arenas and figure out pointers to arenas and ids. But they wouldn't be able to corrupt the pointers to arenas unless they also have an arbitrary write primitive -- and if they do then they wouldn't care about corrupting it.

Thanks Yannis, That's really useful info.

Replacing the Mutex with an RWLock gives us almost all the benefit with none of the risk - even if the risk is low we could revisit that decision later.

Blocks: 1809610
Summary: Remove arena lookup lock → Replace arena lookup Mutex with a RWLock
Whiteboard: [sp3]

Depends on D173828

Performance results, these are made by first applying Bug 1824655 (the base) then testing this patch.

https://treeherder.mozilla.org/perfherder/compare?originalProject=try&originalRevision=8369603144df912ca38ad34c6d6d79f03df8114b&newProject=try&newRevision=b25b8693529b34c7525fde8dcc6d6b0c73a9819e&framework=13&page=1&filter=speedometer

Not a big difference but maybe there's still potential here, I wonder if a lock-free structure for the arena tree would be better...

Thanks. Basically no change.

Flags: needinfo?(smaug)
Attachment #9331117 - Attachment is obsolete: true

We decided to abandon patch D176904 since it didn't result in any performance improvment on any platform and may have reduced performance on Windows.

I'm leaving the bug open because I'd like to try a lock-free data structure and also compare with removing this lookup.

Blocks: 1843932
See Also: → 1844359
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: