spike in crashes at @0x0 | js::Proxy::get in aurora 48

RESOLVED WORKSFORME

Status

defect
--
critical
RESOLVED WORKSFORME
3 years ago
3 years ago

People

(Reporter: dbaron, Unassigned)

Tracking

({crash, topcrash})

Trunk
x86
Windows NT
Points:
---

Firefox Tracking Flags

(firefox48+ fixed)

Details

(crash signature)

This bug was filed from the Socorro interface and is 
report bp-1298766b-7ddc-41c4-adb4-b1f442160429.
=============================================================

Since updates to Firefox 48 were enabled on Aurora yesterday, we've had a massive spike in crashes with the signature [@0x0 | js::Proxy::get]. They're spread across a decent number of users.

There have been a small number of crashes with this signature in the past (including 2 on aurora on 2016-03-02 builds, but otherwise all on beta, release, and esr in the past year), but nothing like this volume, which appears to be consistent across both yesterday's and today's Aurora builds.

I'm really not sure where to start looking for this given that it's never shown up on nightly.

[Tracking Requested - why for this release]:
This is the #1 topcrash on Aurora 48, by a pretty large margin.

So far all of the crashes have been on Windows, with this (reasonable) distribution:
Rank 	Platform pretty version 	Count 	%
1 	Windows 7 	106 	68.83 %
2 	Windows 10 	31 	20.13 %
3 	Windows 8.1 	16 	10.39 %
4 	Windows 8 	1 	0.65 %
Flags: needinfo?(jorendorff)
Flags: needinfo?(hv1989)
Is it possible there was something recently that we missed backporting?
Flags: needinfo?(arai.unmht)
Backporting of some fix-for-null-dereference patches is still ongoing, but I'm not sure whether this signature is related to them.
Flags: needinfo?(arai.unmht)
(In reply to David Baron [:dbaron] ⌚️UTC-7 (review requests must explain patch) from comment #2)
> Is it possible there was something recently that we missed backporting?

Like Arai said there are a few requests for aurora backports open which still have to land:
- bug 1265307
- bug 1263525
- bug 1268138
- bug 1268056
- bug 1268740

We are hoping to get approval as soon as possible.
Flags: needinfo?(hv1989)
Bug 1265307 got approval yesterday.
Flags: needinfo?(jorendorff)
(In reply to Hannes Verschore [:h4writer] from comment #4)
> (In reply to David Baron [:dbaron] ⌚️UTC-7 (review requests must explain
> patch) from comment #2)
> > Is it possible there was something recently that we missed backporting?
> 
> Like Arai said there are a few requests for aurora backports open which
> still have to land:
> - bug 1265307
> - bug 1263525
> - bug 1268138
> - bug 1268056
> - bug 1268740
> 
> We are hoping to get approval as soon as possible.

I looked over all the bugs listed above and all have landed on 48 except bug 1265307. I requested Sheriff (Wes) to land the last one today as well.
(In reply to Ritu Kothari (:ritu) from comment #6)
> I looked over all the bugs listed above and all have landed on 48 except bug
> 1265307. I requested Sheriff (Wes) to land the last one today as well.

Does bug 1265307 explain a crash that's not (and was not previously) present on mozilla-central but is present on aurora?  It doesn't appear to me like it would, since it's backing out code that is on mozilla-central.

All the other bugs landed prior to the May 6 aurora nightly build, and this crash is still present in that build.
Bug 1265307 landed on Friday. When can we expect a new aurora build that includes it, so we can see whether it helped?
Not sure how to move further with this.

The only two patches that landed after the merge and first failing build are:
https://hg.mozilla.org/releases/mozilla-aurora/rev/cbeae0c4410b
https://hg.mozilla.org/releases/mozilla-aurora/rev/88c0b4446910

1) Both patches are identical to what landed on mozilla-central.
2) We backed out the regexp changes, which weren't stable yet. That didn't reduce the number of crashes.
3) The crashes are Windows only and low volume.
4) We haven't seen this crash on nightly.
5) Given these changes, it would be atypical to hit only one platform.

- Could these crashes also be present on nightly, but we don't trigger them there because of lower usage? How many nightly Windows users do we have?
- Could this be PGO bustage? That would explain why we only hit it on Windows.
- Are there other patches that could have caused this?
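
To put rough numbers on the first question, here is a minimal sketch of a back-of-the-envelope check. It assumes crashes arrive as a Poisson process with the same per-user rate on both channels, which may well not hold here. The crash count is the sum of the platform table above; the user counts are purely hypothetical placeholders, since the real Aurora and nightly population sizes aren't given in this bug.

```python
import math


def prob_zero_crashes(crashes_seen: int, users_seen: float, users_other: float) -> float:
    """Probability of seeing zero crashes in another channel.

    Assumes a Poisson model in which the other channel has the same
    per-user crash rate as the channel where crashes were observed.
    """
    if users_seen <= 0 or users_other < 0:
        raise ValueError("user counts must be positive")
    rate_per_user = crashes_seen / users_seen
    expected_crashes = rate_per_user * users_other
    return math.exp(-expected_crashes)


if __name__ == "__main__":
    # Sum of the per-platform counts reported in this bug.
    aurora_crashes = 106 + 31 + 16 + 1

    # HYPOTHETICAL population sizes, for illustration only.
    aurora_users = 100_000
    for nightly_users in (50_000, 5_000, 500):
        p = prob_zero_crashes(aurora_crashes, aurora_users, nightly_users)
        print(f"nightly users={nightly_users}: P(no crashes) = {p:.3g}")
```

If the probability comes out tiny for a realistic nightly population, lower usage alone is unlikely to explain the absence on nightly, and the PGO or other-patch theories look more plausible.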
NIGHTLY_BUILD not being set is another difference between Trunk and Aurora that could lead to behavior differences.
Possibly bug 1186060 is to blame; we will know better in a few days, after bug 1270664 takes hold.
No crashes on the 13th; bug 1271493 landed then.