Closed Bug 1182197 Opened 9 years ago Closed 9 years ago

crash in mozilla::dom::Promise::MaybeSettle or mozilla::dom::Promise::Settle

Categories

Product: Core
Component: DOM: Core & HTML
Version: 39 Branch
Hardware: All
OS: Windows
Type: defect
Priority: Not set
Severity: critical

Tracking

Status: RESOLVED FIXED
Target Milestone: mozilla45
Tracking Status
firefox39 --- unaffected
firefox40 + wontfix
firefox41 + fixed
firefox42 + fixed
firefox43 + fixed
firefox44 + fixed
firefox45 --- fixed

People

(Reporter: kairo, Assigned: bzbarsky)

References

Details

(Keywords: crash, topcrash-win)

Crash Data

Attachments

(7 files)

[Tracking Requested - why for this release]:

This bug was filed from the Socorro interface and is report bp-7cf1e3e2-1448-46e0-99ce-e2b282150709.
=============================================================

Top Frames:
0 	xul.dll 	mozilla::dom::Promise::MaybeSettle(JS::Handle<JS::Value>, mozilla::dom::Promise::PromiseState) 	dom/promise/Promise.cpp
1 	xul.dll 	mozilla::dom::Promise::ResolveInternal(JSContext*, JS::Handle<JS::Value>) 	dom/promise/Promise.cpp
2 	xul.dll 	mozilla::dom::Promise::JSCallback(JSContext*, unsigned int, JS::Value*) 	dom/promise/Promise.cpp
3 	xul.dll 	js::Invoke(JSContext*, JS::CallArgs, js::MaybeConstruct) 	js/src/vm/Interpreter.cpp
4 	xul.dll 	js::DirectProxyHandler::call(JSContext*, JS::Handle<JSObject*>, JS::CallArgs const&) 	js/src/proxy/DirectProxyHandler.cpp
5 	xul.dll 	js::CrossCompartmentWrapper::call(JSContext*, JS::Handle<JSObject*>, JS::CallArgs const&) 	js/src/proxy/CrossCompartmentWrapper.cpp
6 	xul.dll 	js::Invoke(JSContext*, JS::CallArgs, js::MaybeConstruct) 	js/src/vm/Interpreter.cpp
7 	xul.dll 	Interpret 	js/src/vm/Interpreter.cpp
[...]

It looks like all of those have an address of 0x4, so this seems to be a near-null deref.

Those crashes are new to 40, and are 0.9% of overall 40.0b2 data.
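
As a side note, a crash address like 0x4 is exactly what a read through a null pointer looks like when the code touches a member a few bytes into the object. A minimal illustration (the struct here is made up, not the real Promise or its global):

struct Example {
  int mFirst;    // offset 0x0
  int mSecond;   // offset 0x4
};

Example* p = nullptr;
int v = p->mSecond;  // access violation at address 0x4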
Dup of bug 1178271?
Flags: needinfo?(kairo)
Crash Signature: [@ mozilla::dom::Promise::MaybeSettle(JS::Handle<T>, mozilla::dom::Promise::PromiseState)] → [@ mozilla::dom::Promise::MaybeSettle(JS::Handle<T>, mozilla::dom::Promise::PromiseState)] [@ mozilla::dom::Promise::MaybeSettle]
Flags: needinfo?(kairo)
(In reply to Boris Zbarsky [:bz] from comment #1)
> Dup of bug 1178271?

If MaybeSettle and Settle are the same, then it might very well be - but the signature here has more volume in the wild.
Flags: needinfo?(kairo)
Here's the full implementation of MaybeSettle:

void
Promise::MaybeSettle(JS::Handle<JS::Value> aValue,
                     PromiseState aState)
{
  // Promise.all() or Promise.race() implementations will repeatedly call
  // Resolve/RejectInternal rather than using the Maybe... forms. Stop SetState
  // from asserting.
  if (mState != Pending) {
    return;
  }

  Settle(aValue, aState);
}

so your signature just depends on what inlining happened.

Since bug 1178271 is the one with more information, marking dependent.
Depends on: 1178271
This was mentioned in the FF40 Beta stability issues in today's channel meeting.
Boris, KaiRo mentioned that we are hitting this crash on the Beta channel at 1%, which is still pretty high. Can you please help find an owner for this bug? I don't see much activity on bug 1178271 either and am therefore requesting your help. Thanks!
Flags: needinfo?(bzbarsky)
It's really not clear what's going on here, unfortunately.  I don't know who can own this because it's not clear where things are going wrong.

It's possibly our memory is just corrupt.  It's possible there are bugs in the GC or CC...

Do we have any useful data on when this started?  On whether it's correlated with anything?
Flags: needinfo?(bzbarsky)
On aurora, this was first seen in 20150515091916. The crash volume was pretty low at the time, so the issue may have been introduced a build or two before then. Here's a possible regression range: https://hg.mozilla.org/releases/mozilla-aurora/pushloghtml?fromchange=d14cf14709f4&tochange=b03c5f1a77a1 It's mostly Pocket changes.

On nightly this was first seen in 20150430030201 but the daily volume was even lower, so it's really hard to say.

If there's no obvious fix in sight, can you think of any RELEASE_ASSERTs or other diagnostics that might help point to the problem?
One thing that might be handy is the JS stack when we're resolving this cced promise.  That would give us some idea of what code is involved that's keeping it around after it's been CCed, at least....

This stack would presumably contain URIs in it, though ideally only chrome:// ones, so I'm not sure what our policies are around collecting such a thing.
This bug is now wontfix for 40.

I see crash reports for every beta in the 40 cycle except beta8 and beta9. (beta9 was only released today.) Did we perhaps fix this in beta8?

https://crash-stats.mozilla.com/report/list?product=Firefox&signature=mozilla%3A%3Adom%3A%3APromise%3A%3AMaybeSettle%28JS%3A%3AHandle%3CT%3E%2C+mozilla%3A%3Adom%3A%3APromise%3A%3APromiseState%29
I experienced this crash (cfce7fcf-f66e-470f-9515-dc0dd2150802).

I was seeing black drawing over the UI & Page. Scrolling and such would resolve it, but eventually it got too frustrating and I went to shut down the browser at which point it crashed.
David, Boris, I am trying to follow up on 41 tracked bugs. This crash is showing up predominantly on 40.0.2, followed by 41.0b2. Do we have additional crash dumps that could be investigated to help root-cause it further? I know you mentioned a regression range in comment 7 and I am wondering whether we may want to try out a speculative fix?
Flags: needinfo?(dmajor)
Flags: needinfo?(bzbarsky)
I defer to bz; I can't really speak to the next steps here without knowing the code.
Flags: needinfo?(dmajor)
We have no idea for a speculative fix so far.  It _looks_ like someone is holding on to a Promise after unlink and then exposing it to JS, but I have no idea what would be doing that, exactly...

We could try to modify the Promise code to gather the information I suggest in comment 8, as long as we're OK with shipping that information out in crash dumps...
Flags: needinfo?(bzbarsky)
I personally would be very happy if we put something in there that makes the crashes easier to investigate. It sounds like this could happen more often with Promises getting used more and more.
(In reply to Boris Zbarsky [:bz] from comment #8)
> One thing that might be handy is the JS stack when we're resolving this cced
> promise.  That would give us some idea of what code is involved that's
> keeping it around after it's been CCed, at least....
> 
> This stack would presumably contain URIs in it, though ideally only
> chrome:// ones, so I'm not sure what our policies are around collecting such
> a thing.

Flagging Benjamin to confirm, but AFAIK this is fine since all fields are restricted-access by default.
Flags: needinfo?(benjamin)
If somebody can guarantee that the stacks involved only contain chrome frames and never content frames, then this is absolutely fine (and we could probably expose this publicly).

If it's possible that this contains content frames, we need to talk more, since then it would probably need to be controlled by the submit-URLs submission checkbox.
Flags: needinfo?(benjamin)
We can't guarantee that they only contain chrome frames, sadly.  At least not with a simple patch.
Can we redact non-chrome frames?
Yes, with a not-simple patch.  We'd have to figure out how to define "chrome frame" for our purposes and then find them in there...
So, for purposes of deciding between "redact non-chrome frames" and "make this annotation respect the submit-URL checkbox":

How often do you expect to see non-chrome frames? How much would your debugging be hindered by not seeing them? Would you be willing to go without them in exchange for a higher rate of user submissions? How much work is the not-simple patch?
Are there any common URLs for the pages these are crashing on? Maybe we can reproduce something locally.
(In reply to Andrew McCreight [:mccr8] from comment #21)
> Are there any common URLs for the pages these are crashing on? Maybe we can
> reproduce something locally.

It's basically a typical distribution of web usage in general (facebook, gmail, youtube, twitter).
> How often do you expect to see non-chrome frames?

I don't know.  We _think_ this is correlated to Promise usage in Pocket, but maybe something else can trigger it too...  We just don't have much info.

> How much would your debugging be hindered by not seeing them?

Probably not much.

> Would you be willing to go without them in exchange for a higher rate of user
> submissions?

I think that as long as we get _one_ user submission here we'll likely have as much data as we can get from multiple such submissions.

> How much work is the not-simple patch?

My estimate is that doing it right is 1-2 days of work.  The simple patch is probably closer to 1-2 hours.
David, can we just get this into crash reports and see how things look?  I'm not sure what's involved in getting the XPCOM stackString into a crash report, exactly...
Flags: needinfo?(dmajor)
Crash Signature: [@ mozilla::dom::Promise::MaybeSettle(JS::Handle<T>, mozilla::dom::Promise::PromiseState)] [@ mozilla::dom::Promise::MaybeSettle] → [@ mozilla::dom::Promise::MaybeSettle(JS::Handle<T>, mozilla::dom::Promise::PromiseState)] [@ mozilla::dom::Promise::MaybeSettle] [@ mozilla::dom::Promise::Settle(JS::Handle<T>, mozilla::dom::Promise::PromiseState)] [@ mozilla::dom::Promise::Settle]
Flags: needinfo?(continuation)
No longer depends on: 1178271
#ifdef MOZ_CRASHREPORTER
#include "nsExceptionHandler.h"
#endif

#ifdef MOZ_CRASHREPORTER
CrashReporter::AnnotateCrashReport(NS_LITERAL_CSTRING("xpcom_stack"), ...)
#endif
Flags: needinfo?(dmajor)
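
For concreteness, a hedged sketch of how those two pieces could fit together (this is not the actual investigation patch in attachment 8658358; the helper name is made up, though "cced_promise_stack" is the annotation key that later shows up in the reports):

#ifdef MOZ_CRASHREPORTER
#include "nsExceptionHandler.h"
#endif

// Hypothetical helper: record the JS stack string captured for the CCed
// promise as a crash annotation.  AnnotateCrashReport takes narrow strings,
// hence the conversion (the "double conversion" mentioned in the review
// below).
static void
AnnotateCcedPromiseStack(const nsAString& aStackString)
{
#ifdef MOZ_CRASHREPORTER
  CrashReporter::AnnotateCrashReport(
      NS_LITERAL_CSTRING("cced_promise_stack"),
      NS_ConvertUTF16toUTF8(aStackString));
#endif
}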
Comment on attachment 8658358 [details] [diff] [review]
investigation patch.  Log the stack at promise fulfillment on a CCed promise into the crash reporter data

Review of attachment 8658358 [details] [diff] [review]:
-----------------------------------------------------------------

The double conversion of the string is kind of unfortunate but since we're about to crash anyway, it seems fine.

I don't really know how those various JS objects on the stack work (I'll leave that to nsm). Is there any chance of the callees doing a null pointer dereference or some such, given that we're already in a weird state?
Attachment #8658358 - Flags: review?(nsm.nikhil)
Attachment #8658358 - Flags: review?(dmajor)
Attachment #8658358 - Flags: review+
That was meant to be an r+ but apparently I mid-aired?
> The double conversion of the string is kind of unfortunate

Yeah...  One of those conversions is just a straight copy in the common case, though.  And yes, this is not exactly hot code!

> Is there any chance of the callees doing a null pointer dereference or some such, given
> that we're already in a weird state?

Chance, maybe.  I mean, the whole JS heap could be bonkers.  But none of those callees should use the promise or any of its members other than the stack member, at least...
Would it be worthwhile to change the null check of mGlobal in the ctor to be a RELEASE_ASSERT?

Also I noticed the regression range sort of lines up with bug 1058695, but I'm not sure how that would lead to this, so that's probably a red herring.

When we do these various PromiseCallback Calls, is mGlobal being unmark grayed? I forget how automatically that happens nowadays.
I assume we'll get some data from this on just nightly, right?  So no need to uplift anywhere?

> Would it be worthwhile to change the null check of mGlobal in the ctor to be a
> RELEASE_ASSERT?

You mean the MOZ_ASSERT?  The Promise ctor is protected; the only caller should be Promise::Create.  Promise::Create then immediately calls CreateWrapper, which tries to Init an AutoJSAPI with the global.  If it were null, the Init() call would return false, CreateWrapper would throw on the ErrorResult, and Create() would return null and let the Promise just go away.  So I'm not sure it's all that worthwhile.

> Also I noticed the regression range sort of lines up with bug 1058695

Well...  That's interesting.  That bug is what added the dereference of mGlobal to MaybeSettle, no?  Before that, MaybeSettle didn't touch mGlobal.  So the bug with mGlobal being null was preexisting, I bet, but the crash was in fact introduced by bug 1058695.

> When we do these various PromiseCallback Calls, is mGlobal being unmark grayed?

mGlobal is an nsIGlobalObject pointer, not a JS thing.
Blocks: 1058695
Whiteboard: [leave open]
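
To make that concrete, a hedged sketch of the shape of the problem (this is not the real Promise.cpp; Settle does more than this, and the exact check bug 1058695 added may differ in detail):

void
Promise::Settle(JS::Handle<JS::Value> aValue, PromiseState aState)
{
  // Added by bug 1058695: don't run resolution handlers for a dying global.
  // If something kept the promise alive past unlink, mGlobal can be null
  // here, and dereferencing it matches the near-null crash addresses above.
  if (mGlobal->IsDying()) {
    return;
  }

  SetResult(aValue);
  SetState(aState);
  // ... schedule the reaction callbacks (elided) ...
}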
I just noticed: should we get the stack prior to doing the reject, for symmetry?
We could, but we're not seeing crashes with the reject on the stack, right?  Since I assume the testing patch will just get backed out once we have the data we want, I didn't worry too much about it.
Comment on attachment 8658358 [details] [diff] [review]
investigation patch.  Log the stack at promise fulfillment on a CCed promise into the crash reporter data

Review of attachment 8658358 [details] [diff] [review]:
-----------------------------------------------------------------

I already r+ed
Attachment #8658358 - Flags: review?(nsm.nikhil)
(In reply to Boris Zbarsky [:bz] from comment #33)
> > When we do these various PromiseCallback Calls, is mGlobal being unmark grayed?
> 
> mGlobal is an nsIGlobalObject pointer, not a JS thing.

Sorry, I meant the mGlobal fields of the PromiseCallback subclasses (WrapperPromiseCallback, ResolvePromiseCallback, RejectPromiseCallback) which have type JS::Heap<JSObject*>. I was wondering because I see WrapperPromiseCallback::Call() in many (but not all) of these crash stacks, but maybe that's just the way we often invoke these promises.

Some promises are triggered from the observer service. Like this one:
https://crash-stats.mozilla.com/report/index/c72a5c7d-ec67-4046-9c48-bdf702150908

Note that this is spinning the event loop while we're notifying observers. It would be nice to know what is triggering those.

Some are being triggered from the storage CompletionNotifier:
https://crash-stats.mozilla.com/report/index/45c8ec4d-5a2c-4c8f-b8cc-0047c2150907

It doesn't seem too likely that there's something particular about the promise that is triggering this, but who knows. It does seem worth noting that this all appears to be chrome JS.
Flags: needinfo?(continuation)
Keywords: leave-open
Whiteboard: [leave open]
dmajor, could you look into the crash dumps for a few of these stacks? It would be nice to know what is triggering the observer service in these stacks:
https://crash-stats.mozilla.com/report/index/dce65f8b-18f7-4cba-abd5-1f0bc2150908
https://crash-stats.mozilla.com/report/index/c72a5c7d-ec67-4046-9c48-bdf702150908
https://crash-stats.mozilla.com/report/index/a48c08d2-9f0b-42ea-b584-2a2042150908

Also, it would be nice to get some more frames for these crashes that have CompletionNotifier near the end to see what is spinning the event loop:
https://crash-stats.mozilla.com/report/index/41a4f21f-9b63-4fbf-b81a-037c92150908
https://crash-stats.mozilla.com/report/index/81040734-b10d-448f-aa4c-f480b2150908
https://crash-stats.mozilla.com/report/index/36b1d90b-8f73-4765-bf4d-048b12150908

Thanks.
Flags: needinfo?(dmajor)
> Sorry, I meant the mGlobal fields of the PromiseCallback subclasses

Ah.  Those might want to unmarkgray mGlobal in Call, indeed...

> but maybe that's just always the way we often invoke these promises.

Right.  A promise chain will end up with those PromiseCallback things in the callstack.
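
A hedged sketch of what unmark-graying mGlobal in Call could look like (the signature and body here are simplified, not the real WrapperPromiseCallback):

nsresult
WrapperPromiseCallback::Call(JSContext* aCx, JS::Handle<JS::Value> aValue)
{
  // mGlobal is a JS::Heap<JSObject*>.  Exposing it to active JS unmarks it
  // gray, so a global that was only reachable via the cycle collector isn't
  // handed to running JS in a gray state.
  JS::ExposeObjectToActiveJS(mGlobal);
  JSAutoCompartment ac(aCx, mGlobal);

  // ... invoke the stored JS callback with aValue (elided) ...
  return NS_OK;
}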
dmajor looked at one of these observer crashes and the thing that was triggering the observer service was like this:
33 0022f908 515dca1e xul!nsObserverList::NotifyObservers+0x3e
34 0022f928 51715991 xul!nsObserverService::NotifyObservers+0xc6
35 0022f958 5146c4d2 xul!nsXREDirProvider::DoShutdown+0x72
36 0022f99c 5146837a xul!ScopedXPCOMStartup::~ScopedXPCOMStartup+0x51

So we're in shutdown, but not that late in shutdown.
(In reply to Andrew McCreight [:mccr8] from comment #39)
> dmajor, could you look into the crash dumps for a few of these stacks? It
> would be nice to know what is triggering the observer service in these
> stacks:
> https://crash-stats.mozilla.com/report/index/dce65f8b-18f7-4cba-abd5-1f0bc2150908
> https://crash-stats.mozilla.com/report/index/c72a5c7d-ec67-4046-9c48-bdf702150908
> https://crash-stats.mozilla.com/report/index/a48c08d2-9f0b-42ea-b584-2a2042150908

Those all have the same stack as the one you linked me to on IRC (in other words, comment 41).

> 
> Also, it would be nice to get some more frames for these crashes that have
> CompletionNotifier near the end to see what is spinning the event loop:
> https://crash-stats.mozilla.com/report/index/41a4f21f-9b63-4fbf-b81a-037c92150908
> https://crash-stats.mozilla.com/report/index/81040734-b10d-448f-aa4c-f480b2150908
> https://crash-stats.mozilla.com/report/index/36b1d90b-8f73-4765-bf4d-048b12150908
> 
> Thanks.

Below CompletionNotifier is:
10 0019f7bc 55429daf xul!mozilla::storage::`anonymous namespace'::CompletionNotifier::Run+0x17
11 0019f8cc 553cdc36 xul!nsThread::ProcessNextEvent+0x6e8
12 0019f8fc 553cf0ba xul!mozilla::ipc::MessagePump::Run+0x5f
13 0019f934 553ceac3 xul!MessageLoop::RunHandler+0x20
14 0019f954 554291ad xul!MessageLoop::Run+0x19
15 0019f960 55428cf4 xul!nsBaseAppShell::Run+0x32
16 0019f96c 55498db9 xul!nsAppShell::Run+0x1b
17 0019f97c 552e8a43 xul!nsAppStartup::Run+0x20
18 0019fb60 552e8c6c xul!XREMain::XRE_mainRun+0x4b9
Flags: needinfo?(dmajor)
(In reply to David Major [:dmajor] from comment #42)
> Those all have the same stack as the one you linked me to on IRC (in other
> words, comment 41).
Thanks, I'll try looking through things that listen for that shutdown stuff and see what might be triggering JS.

> Below CompletionNotifier is:

Thanks. It is good to know this is not nested event loop spinning.
Depends on: 1203289
Summary: crash in mozilla::dom::Promise::MaybeSettle(JS::Handle<T>, mozilla::dom::Promise::PromiseState) → crash in mozilla::dom::Promise::MaybeSettle or mozilla::dom::Promise::Settle
Hoping that the investigative patch in Nightly will help us find a possible cause and fix for 42. It is too late for 41.
So I was thinking about this a bit more... given that this is fallout from bug 1058695, we could in fact consider adding a null-check for mGlobal and treating null as IsDying().  That might be ok for 41 if there's still time for that; it's certainly better than crashing...
Flags: needinfo?(rkothari)
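
A hedged sketch of the proposed guard (the real patch is the attachment reviewed below; only the shape is shown):

  // At the top of Promise::Settle: treat a missing global like a dying one,
  // so we bail out instead of crashing on the null dereference.
  if (!mGlobal || mGlobal->IsDying()) {
    return;
  }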
Boris, given that this crash is in the top 10 for 41 at 1.17%, we might be able to consider taking a fix in 41RC, depending on the complexity and risk associated. Would you be able to nominate a patch for uplift today? We gtb 41 RC Monday am PST.
Flags: needinfo?(rkothari) → needinfo?(bzbarsky)
I _think_ this is pretty safe, but it could use a second set of eyes.

Also, I'm out on Monday so if someone is going to uplift it then it won't be me...
Flags: needinfo?(bzbarsky)
Attachment #8660010 - Flags: review?(nsm.nikhil)
Resetting the status because there is a good chance we might be taking a likely fix for this in 41 RC.
Comment on attachment 8660010 [details] [diff] [review]
Patch would look something like this

Review of attachment 8660010 [details] [diff] [review]:
-----------------------------------------------------------------

This looks fine to me; I only read the bug report, but it definitely doesn't do any more harm. Is the patch supposed to do anything else eventually, considering the attachment description?
Attachment #8660010 - Flags: review?(nsm.nikhil) → review+
No, that's it.  I guess I'll try landing that on inbound and if it doesn't blow up the world people should consider uplifting it on Monday...
Nikhil, would you be able to nominate this patch for uplift to Beta? We will gtb 41 RC on Monday 9/14 by noon. Thanks!
Flags: needinfo?(nsm.nikhil)
Comment on attachment 8660010 [details] [diff] [review]
Patch would look something like this

Approval Request Comment
[Feature/regressing bug #]: bug 1058695
[User impact if declined]: crashes
[Describe test coverage new/current, TreeHerder]: at least some of this code runs regularly
[Risks and why]: this should mostly restore existing behavior, but we don't entirely understand what is happening here
[String/UUID change made/needed]: none
Flags: needinfo?(nsm.nikhil)
Attachment #8660010 - Flags: approval-mozilla-beta?
Comment on attachment 8660010 [details] [diff] [review]
Patch would look something like this

[Triage Comment] This is another fix that we are taking in 41 RC. This fix is safe and low risk.
Attachment #8660010 - Flags: approval-mozilla-beta? → approval-mozilla-release+
While investigating something else, I happened to notice that a large number of these crashes are within the "OOM danger zone" for AvailableVirtualMemory.

Could it be related? Maybe high memory usage means more CC, which leads to the unlink problem?
Could be...

Do we by any chance have any reports with the stack instrumentation we added at this point?
No, and it's going to be difficult, because:
- This crash is pretty rare on nightly to begin with, and
- With the null check workaround, we'll no longer get a crash report for this issue. We'd have to look for cced_promise_stack annotations on machines that go on to crash somewhere else later.
Hmm.  I didn't realize this was rare on nightly.  Should have pushed the investigation patch to all channels before landing the workaround...

Also, the null check hasn't landed on nightly and aurora, right?  So there we would still get crashes (though as you say they might be rare).
The null-check landed on m-c in comment 52.

Perhaps you could undo the null check before merge day and let the logging go to aurora. We get some single-digit crashes per day there.
> The null-check landed on m-c in comment 52.

So it did.  I clearly can't remember what I did.

Undoing the null check on m-c makes sense.  I'll do that next time the tree is open.
Got some data from aurora 43. Most of the stacks are:

cced_promise_stack 	PromiseSet.prototype.add/<@resource://gre/modules/AsyncShutdown.jsm:145:9 Handler.prototype.process@resource://gre/modules/Promise.jsm -> resource://gre/modules/Promise-backend.js:934:23 this.PromiseWalker.walkerLoop@resource://gre/modules/Promise.jsm -> resource://gre/modules/Promise-backend.js:813:7 Promise*this.PromiseWalker.scheduleWalkerLoop@resource://gre/modules/Promise.jsm -> resource://gre/modules/Promise-backend.js:744:11 this.PromiseWalker.schedulePromise@resource://gre/modules/Promise.jsm -> resource://gre/modules/Promise-backend.js:776:7 this.PromiseWalker.completePromise@resource://gre/modules/Promise.jsm -> resource://gre/modules/Promise-backend.js:711:7 Spinner.prototype.observe@resource://gre/modules/AsyncShutdown.jsm:523:9 

But there are a couple others:

cced_promise_stack 	ConnectionData.prototype<._executeStatement/pending<.handleCompletion@resource://gre/modules/Sqlite.jsm:784:13 

cced_promise_stack 	i@http://platform.twitter.com/widgets.js:9:3662 o@http://platform.twitter.com/widgets.js:9:3692 promise callback*o/</<@http://platform.twitter.com/widgets.js:9:3736 o/<@http://platform.twitter.com/widgets.js:9:3709 o@http://platform.twitter.com/widgets.js:9:3599 o@http://platform.twitter.com/widgets.js:9:2844 a@http://platform.twitter.com/widgets.js:9:3032 @http://platform.twitter.com/widgets.js:1:1785 a<@http://platform.twitter.com/widgets.js:8:9002 @http://platform.twitter.com/widgets.js:1:1713 e@http://platform.twitter.com/widgets.js:1:406 @http://platform.twitter.com/widgets.js:1:1379 @http://platform.twitter.com/widgets.js:1:1 @http://platform.twitter.com/widgets.js:1:1 

By the way, it looks like we didn't land any changes on 42? If these stacks don't point to an obvious fix, could you land the bandaid on beta for now?
Flags: needinfo?(bzbarsky)
Bug 1203289 landed on 42. But I guess that did not fix it.
Hmm.  The Promise-backend stuff is under a nested event loop spin from an observer notification.... but clearly twitter is not likely to be managing that (modulo sync XHR perhaps).

I'll think a bit about this over the weekend, then probably land the null-check on Monday. :(
Boris, any news on this?
It seems that attachment 8660010 [details] [diff] [review], which landed only in 41, mitigated the crash. Most of the crashes are on beta (but release is still affected).
So Andrew and I walked through some of this code just now, and we think we see a possible hole in the GC/CC stuff in Promise code.  I filed bug 1213391 on it.

I will post a patch there and aim to get that onto Aurora and see whether that nixes the crashes, but in the meantime I have landed the bandaid on beta: https://hg.mozilla.org/releases/mozilla-beta/rev/e96fd3320ab4
Flags: needinfo?(bzbarsky)
Depends on: 1213391
The fix for bug 1213391 is now on m-c and aurora; we should see what crashstats says about this crash there....
Flags: needinfo?(dmajor)
bz, dmajor is on PTO. If you are looking for stats, e.g. how crash volume might change, it's better to ni? me, but this is so rare on Dev Edition and Nightly that those channels won't tell us much; I think only Beta will tell us whether this has really been fixed.
Flags: needinfo?(bzbarsky)
OK.  I guess we wait for 1213391 to ride the trains to beta, then....
Flags: needinfo?(bzbarsky)
OS: Windows NT → Windows
Hardware: x86 → All
Version: Trunk → 39 Branch
As we are going to build the 42 RC today, it is too late for this bug. Wontfix.
42 is "fixed" by the change mentioned in comment 69.

The question is whether this crash signature is visible in 43.
[Tracking Requested - why for this release]:

(In reply to Boris Zbarsky [:bz] from comment #77)
> The question is whether this crash signature is visible in 43.

It is. This is 1.8% of 43.0b1 crashes.
OK, so bug 1213391 didn't fix this.  :(

Andrew, should we just permanently add this null-check and give up? :(
Flags: needinfo?(continuation)
(In reply to Boris Zbarsky [:bz] from comment #79)
> Andrew, should we just permanently add this null-check and give up? :(

Yeah, it is going to be hard to figure out what is actually going on here. Maybe have an assertion in there with the null check.
Flags: needinfo?(continuation)
Tracking for 43 and 44 as this is still fairly high volume.
OK, let's just do this and then I guess backport to aurora/beta
Attachment #8686282 - Flags: review?(continuation)
Assignee: nobody → bzbarsky
Status: NEW → ASSIGNED
Comment on attachment 8686282 [details] [diff] [review]
workaround.  Add a null-check for mGlobal in Promise::Settle

Review of attachment 8686282 [details] [diff] [review]:
-----------------------------------------------------------------

Can we maybe assert if mGlobal is null? Though maybe any such failures will be intermittent and thus not of use to anybody. It might be nice to throw in a comment about how we don't really expect mGlobal to be null, but it happens in the wild.
Attachment #8686282 - Flags: review?(continuation) → review+
Comment on attachment 8686311 [details] [diff] [review]
workaround.  Add a null-check for mGlobal in Promise::Settle

Approval Request Comment
[Feature/regressing bug #]: Unknown.
[User impact if declined]: The crashes continue until morale improves.
[Describe test coverage new/current, TreeHerder]: We have yet to reproduce this.
[Risks and why]: Low risk.  We've been shipping this for several releases now in
   practice.
[String/UUID change made/needed]: None.
Attachment #8686311 - Flags: approval-mozilla-beta?
Attachment #8686311 - Flags: approval-mozilla-aurora?
Comment on attachment 8686311 [details] [diff] [review]
workaround.  Add a null-check for mGlobal in Promise::Settle

Possible crash fix, high volume crash, let's uplift it.
Attachment #8686311 - Flags: approval-mozilla-beta?
Attachment #8686311 - Flags: approval-mozilla-beta+
Attachment #8686311 - Flags: approval-mozilla-aurora?
Attachment #8686311 - Flags: approval-mozilla-aurora+
Fixed, see comment 87.
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Keywords: leave-open
Resolution: --- → FIXED
Target Milestone: --- → mozilla45
Flags: needinfo?(dmajor)
Component: DOM → DOM: Core & HTML