Closed Bug 1399944 Opened 2 years ago Closed 2 years ago

Add assertions to help track down crashes while tracing the heap

Categories

(Core :: JavaScript: GC, enhancement)

55 Branch
enhancement
Not set

Tracking

()

RESOLVED FIXED
mozilla57
Tracking Status
firefox57 --- fixed

People

(Reporter: jonco, Assigned: jonco)

References

Details

Attachments

(1 file, 1 obsolete file)

We have quite a few bugs where we crash while tracing the heap (for example bug 1358073 and bug 1288952) that look like we traced some bad pointer.

The crash addresses for the above two bugs show a similar pattern: ~20% are 0xffff0 and the rest are mostly random things ending in 0xffff0.

This indicates that we're trying to get the location from the chunk trailer for a pointer that is bad in some way.

There are several possible causes (and probably more than one apply):
 - faulty RAM
 - heap corruption
 - bugs in the GC
 - use after free

Unfortunately if we hit one of these while tracing we have no context to work out what happened.  This bug is about seeing if we can add some assertions to catch bad pointers earlier so we have the necessary information.
Attached patch bug1399944-check-cell-pointers-1 (obsolete) — Splinter Review
Here's a patch check for bad pointers passed through our rooting API (in Rooted, Heap and TenuredHeap).  It checks the following:
 - pointer not in first 1MiB
 - pointer is correctly aligned
 - chunk trailer is accessible and as valid location value
 - tenured cells have a non-null zone (this is zeroed on free)

I considered adding this in the barrier wrappers but put then here because I think this will provide more coverage and less performance overhead, but I'm not sure about that.  We could put them in the barrier wrappers instead, or in both.

Also, I'm wondering if it would be possible to make these into MOZ_DIAGNOSTIC_ASSERTS but that probably would have too much impact.
Attachment #8908239 - Flags: feedback?(sphink)
Comment on attachment 8908239 [details] [diff] [review]
bug1399944-check-cell-pointers-1

Review of attachment 8908239 [details] [diff] [review]:
-----------------------------------------------------------------

\o/ Definitely a feedback+ from me!

You're right, it would be nice to have these on heap barriers, but it sounds pretty expensive. And in some cases, you're already examining the data somewhat.

We could try this first.

::: js/public/HeapAPI.h
@@ +405,5 @@
> +    auto location = detail::GetCellLocation(cell);
> +    if (location != ChunkLocation::Nursery && location != ChunkLocation::TenuredHeap)
> +        return false;
> +    if (location == ChunkLocation::TenuredHeap && !detail::GetGCThingZone(addr))
> +        return false;

It doesn't add much, but if location is Nursery, then it should have a non-null storeBuffer as well.

Also for nursery pointers, it should be within one of the nursery chunks, but that'd probably require a TLS lookup and walking a chunk list? Doesn't seem worth it.

For TenuredHeap, the overkill version would be to check !InFreeList(arena, cell). But that *really* gets expensive.
Attachment #8908239 - Flags: feedback?(sphink) → feedback+
Good idea.
Attachment #8908239 - Attachment is obsolete: true
Attachment #8909357 - Flags: review?(sphink)
Comment on attachment 8909357 [details] [diff] [review]
bug1399944-check-cell-pointers v2

Review of attachment 8909357 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/public/GCPolicyAPI.h
@@ +94,5 @@
>      static bool needsSweep(T* tp) {
>          return tp->needsSweep();
>      }
> +
> +#ifdef DEBUG

How would you feel about making these unconditional instead of DEBUG-only? It doesn't matter now, but it seems like it might be nice to be able add a MOZ_RELEASE_ASSERT(p->isValid()) for specific situations, without having to add all this in. It shouldn't add much to the code size, I wouldn't think?

::: js/public/GCVariant.h
@@ +120,5 @@
>      }
> +
> +#ifdef DEBUG
> +    static bool isValid(const mozilla::Variant<Ts...>& v) {
> +        return true; // TODO

File a bug for this.

I think it's something like

struct Matcher {
  template<typename T>
  bool match(T& val) {
    return val.isValid();
  }
}

and then

static bool isValid(const mozilla::Variant<Ts...>& v) {
  return v.match(Matcher());
}
 
...but I'm never too sure about these Variant things.
Attachment #8909357 - Flags: review?(sphink) → review+
(In reply to Steve Fink [:sfink] [:s:] from comment #4)
> How would you feel about making these unconditional instead of DEBUG-only?

Sounds good.  I was thinking about suggesting that.

> > +#ifdef DEBUG
> > +    static bool isValid(const mozilla::Variant<Ts...>& v) {
> > +        return true; // TODO

Oops.  I didn't mean to leave that in, I'll fix it.
Pushed by jcoppeard@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/74faaba5ecd2
Check for valid GC cell pointers in various places r=sfink
https://hg.mozilla.org/mozilla-central/rev/74faaba5ecd2
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla57
Depends on: 1402649
You need to log in before you can comment on or make changes to this bug.