Closed Bug 1585921 Opened 10 months ago Closed 10 months ago

Make it harder to misuse GC containers

Categories

(Core :: JavaScript: GC, enhancement, P3)

enhancement

Tracking

()

RESOLVED FIXED
mozilla71
Tracking Status
firefox71 --- fixed

People

(Reporter: jonco, Assigned: jonco)

Details

Attachments

(3 files)

Every so often a change adds a container of GC things that doesn't have the correct barriers which then results in follow on crashes and possible security vulnerabilities. This is not surprising as these things are hard to use and the correct approach is not always obvious.

We should try and improve this situation. Some possibilities:

  • add static asserts when things are mis-used (e.g. when creating a GC container of unbarriered pointers) but there are probably exceptions
  • have some kind of linter that warns when new instances of questionable usage are added
  • improve the documentation

Maybe a bit orthogonal to containers.

Depending on the type of barriers we want to check for. In a new/existing zeal mode we could scan the tenured heap during sweeping and assert for any pointer into the un-allocationed portion of the nursery. Or In that zeal mode insure that the nursery is empty during sweeping and assert for any pointer into a chunk previously used by the nursery.

Something similar, but probably weaker, could be done for incremental GC barriers. Check that all pointers point to a currently allocated cell.

(In reply to Paul Bone [:pbone] from comment #1)
There's a zeal mode that does something like this: CheckHeapAfterGC.

I'll confess that I don't understand our setup well enough to answer questions about it right now. If I had to hazard a guess:

When using GC pointers in GC containers, you need them to be movable (so that you don't orphan a store buffer edge target), and you may need pre-write barriers, post-write barriers, and/or read barriers.

You only need post-write barriers for types that are nursery-allocatable, but I think they're harmless to use if you don't need them?

You only need pre-write barriers if an old value might be overwritten, so if you're using eg a GCVector as a stack, then all of your writes are essentially initializations.

Then there's the issue of inside or outside Spidermonkey. Outside Spidermonkey, you can pretty much only use JS::Heap<T>, which will give you pre-, post-, and read barriers.

Inside Spidermonkey, you can use:

  • HeapPtr<T>, which gives pre- and post-barriers.
  • GCPtr<T>, which does not allow moving, so I lied you can't use it in containers (since they may need to reallocate their vector or key storage).
  • PreBarriered<T>, which gives a pre-barrier only so should not be used with nursery-allocatable pointers.
  • raw pointers, which obviously doesn't provide any barriers at all.

This doesn't cover the TenuredHeap or weak variants.

Now, the weirdness: there are hashers declared for GCPtr, PreBarriered, and WeakHeapPtr -- and as of a few days ago, HeapPtr. (tcampbell/iain needed it.) But I don't understand why GCPtrHasher exists. Isn't it invalid to use a GCPtr as a hashtable key type anyway, because it can't be moved? Its addition is tricky to track back, because we renamed an old HeapPtr to GCPtr at one point.

For this bug, we could at least check whether PreBarriered<T> is used with a T that is nursery-allocatable. That should error out.

We could do the same for raw pointers. Unless we decided raw pointers are the "I know what I'm doing" escape hatch for uses where barriering is done manually. Though maybe it would be better to ban raw pointers, and create a RawPtr<T> or ManuallyBarriered<T> escape hatch instead?

When are raw pointers valid to use? Something that is allocated directly in the tenured heap, doesn't need a read barrier or is manually read barriered, and either doesn't need to handle incremental GC or handles it manually, I guess? (So eg something that is only used with Rooted, never explicitly traced?)

I'm asking to clarify what the rules are, before speculating too much on how to enforce those rules.

Or should we go the other way, and say you should always templatize on bare pointers, but make that inside Spidermonkey actually store a HeapPtr<T>, and outside a Heap<T>? Then we could allow using PreBarriered / RawPtr / ManuallyBarriered as explicit escape hatches to avoid unwanted barriering? It would be a nicer story. Right now, we provide these very explicitly named container types like GCHashMap and GCVector, but it's unclear what "GC" stuff they handle for you. It's natural to assume that as long as you use those, you don't need to think about GC -- or at the very least, you don't need to think about it any more than you do with Heap<T> (as in, you need to trace them.)

Now, the weirdness: there are hashers declared for GCPtr, PreBarriered, and WeakHeapPtr -- and as of a few days ago, HeapPtr. (tcampbell/iain needed it.) But I don't understand why GCPtrHasher exists. Isn't it invalid to use a GCPtr as a hashtable key type anyway, because it can't be moved? Its addition is tricky to track back, because we renamed an old HeapPtr to GCPtr at one point.

For the record, the shell builds if I remove the GCPtrHasher stuff. Should I just rip it out?

(In reply to Steve Fink [:sfink] [:s:] from comment #3)

When using GC pointers in GC containers, you need them to be movable (so that you don't orphan a store buffer edge target), and you may need pre-write barriers, post-write barriers, and/or read barriers.

Moveability is required if you need post barriers and your container can move its contents. A side point is whether you consider things like tuples to be containers (these don't move their contents).

You only need post-write barriers for types that are nursery-allocatable, but I think they're harmless to use if you don't need them?

Yes, we check the type and no barrier is compiled for non nursery allocated things.

You only need pre-write barriers if an old value might be overwritten, so if you're using eg a GCVector as a stack, then all of your writes are essentially initializations.

Destruction counts as a write too, so you still need a pre-write barrier in this case.

Then there's the issue of inside or outside Spidermonkey. Outside Spidermonkey, you can pretty much only use JS::Heap<T>, which will give you pre-, post-, and read barriers.

Heap<T> doesn't have a pre write barrier since bug 1581574, as we can rely on the read barrier to make incremental marking work.

Inside Spidermonkey, you can use:

  • HeapPtr<T>, which gives pre- and post-barriers.
  • GCPtr<T>, which does not allow moving, so I lied you can't use it in containers (since they may need to reallocate their vector or key storage).
  • PreBarriered<T>, which gives a pre-barrier only so should not be used with nursery-allocatable pointers.
  • raw pointers, which obviously doesn't provide any barriers at all.

Now, the weirdness: there are hashers declared for GCPtr, PreBarriered, and WeakHeapPtr -- and as of a few days ago, HeapPtr. (tcampbell/iain needed it.) But I don't understand why GCPtrHasher exists. Isn't it invalid to use a GCPtr as a hashtable key type anyway, because it can't be moved? Its addition is tricky to track back, because we renamed an old HeapPtr to GCPtr at one point.

Yes, that is strange. We should remove GCPtrHasher.

For this bug, we could at least check whether PreBarriered<T> is used with a T that is nursery-allocatable. That should error out.

I'll try that. It may be that this is used somewhere where the barriers are implemented manually though.

We could do the same for raw pointers. Unless we decided raw pointers are the "I know what I'm doing" escape hatch for uses where barriering is done manually. Though maybe it would be better to ban raw pointers, and create a RawPtr<T> or ManuallyBarriered<T> escape hatch instead?

This makes things more complicated for roots though. I like that you can have Rooted<GCVector<JSObject*>> and I don't really want to force another template in there. (For heap allocated vectors, you don't have the Rooted so it's GCVector<HeapPtr<JSObject*>>).

When are raw pointers valid to use? Something that is allocated directly in the tenured heap, doesn't need a read barrier or is manually read barriered, and either doesn't need to handle incremental GC or handles it manually, I guess? (So eg something that is only used with Rooted, never explicitly traced?)

The valid uses for raw pointers in the heap are limited so I think there is a good case for banning raw pointers there.

Or should we go the other way, and say you should always templatize on bare pointers, but make that inside Spidermonkey actually store a HeapPtr<T>, and outside a Heap<T>?

This sounds nice but I think there are too many exceptions for us to decide effectively...

Then we could allow using PreBarriered / RawPtr / ManuallyBarriered as explicit escape hatches to avoid unwanted barriering?

Oh, you mean add a default barrier to unbarriered pointers? Actually that might work. It doesn't cover the case of non-container use of GC pointers in the heap though, which would still require a barrier.

It would be a nicer story. Right now, we provide these very explicitly named container types like GCHashMap and GCVector, but it's unclear what "GC" stuff they handle for you. It's natural to assume that as long as you use those, you don't need to think about GC -- or at the very least, you don't need to think about it any more than you do with Heap<T> (as in, you need to trace them.)

Yeah, this isn't great either. We should also ban GC pointers from standard containers, or at least the ones we import into js.

GCPolicy<T> calls the instance method for these types so these static methods aren't required.

The root marking functions have assertions that will catch this being used outside of heap marking.

Depends on D48533

Pushed by jcoppeard@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/71e0aaf161bf
Remove unnecessary static trace methods which are never called r=sfink
https://hg.mozilla.org/integration/autoland/rev/bfbd9f2b907c
Use root marking functions to trace unbarriered pointers in GCPolicy traits since this is only safe when we're marking roots r=sfink
https://hg.mozilla.org/integration/autoland/rev/754da41ac0f7
Remove GCPtrHasher which is invalid, and replace with HeapPtrHasher r=sfink
Assignee: nobody → jcoppeard
You need to log in before you can comment on or make changes to this bug.