1058695 - Promises can keep a page alive and even prevent Firefox from shutting down

Reporter

Description

•

10 years ago

STR: 1) Open a new tab 2) Navigate to: data:text/html,<script>(function f() Promise.resolve().then(f))()</script> 3) Close the tab The window is now kept alive in the background, you could a dump() stmt or something similar to f() to see it. Trying to quit Firefox won't work - it hangs at shutdown.

Andrew McCreight [:mccr8]

Comment 1

•

10 years ago

Is this something you could look into, Nikhil?

Flags: needinfo?(nsm.nikhil)

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Comment 2

•

10 years ago

Sounds like we're missing a CheckInnerWindowCorrectness somewhere.

Nikhil Marathe [:nsm] (No longer reading bugmail, please needinfo?)

Assignee

Comment 3

•

10 years ago

Would calling CIWC() before dispatching a runnable to resolve/reject the promise be the right spot? That is what are the ideal situations to call CIWC() in? Most other callers do so before dispatching an event.

Flags: needinfo?(nsm.nikhil)

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Comment 4

•

10 years ago

You should CIWC when the runnable runs, and if CIWC fails not call into JS.

Boris Zbarsky [:bzbarsky]

Comment 5

•

10 years ago

There needs to be a spec for this, no? In general, if I have hold of a Promise from a now-navigated-away-from window and I call then() on it, what should happen? What happens in other UAs?

Nikhil Marathe [:nsm] (No longer reading bugmail, please needinfo?)

Assignee

Updated

•

10 years ago

Assignee: nobody → nsm.nikhil

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Updated

•

10 years ago

Whiteboard: [MemShrink]

Nikhil Marathe [:nsm] (No longer reading bugmail, please needinfo?)

Assignee

Comment 6

•

10 years ago

Promise is not a DETH so I can't directly use this function. Just attempting to call IsCurrentInnerWindow() on the Promise's window always results in true and the bug still remains.

Flags: needinfo?(khuey)

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Comment 7

•

10 years ago

DETH handles this via DisconnectFromOwner nulling out mOwnerWindow, I bet :/

Flags: needinfo?(khuey)

Boris Zbarsky [:bzbarsky]

Comment 8

•

10 years ago

> Just attempting to call IsCurrentInnerWindow() In the closed case as opposed to the navigated-away-from case, the inner does not stop being current. Again, can we please figure out the behavior we actually want before we start trying to figure out how to implement it? What do other UAs do for windows inside now-removed iframes, windows that have been navigated away from, and windows inside now-closed tabs or popups?

Tim Taubert [:ttaubert] (inactive)

Reporter

Comment 9

•

10 years ago

Depending on how intrusive the final patch will be I think we might want to uplift this as far as possible. All releases are affected since we introduced Promises.

status-firefox31: --- → affected

status-firefox32: --- → affected

status-firefox33: --- → affected

status-firefox34: --- → affected

Nikhil Marathe [:nsm] (No longer reading bugmail, please needinfo?)

Assignee

Comment 10

•

10 years ago

If it's a fairly investigative bug, I won't be able to get to this till next week. Unassigning.

Assignee: nsm.nikhil → nobody

Joel Maher ( :jmaher ) (UTC -8)

Updated

•

10 years ago

Blocks: 1057512

Joel Maher ( :jmaher ) (UTC -8)

Comment 11

•

10 years ago

:nsm, is this something you can take up this week. Right now this appears to be our last blocker for bug 1057512.

Flags: needinfo?(nsm.nikhil)

Johnny Stenback (:jst)

Comment 12

•

10 years ago

Talked to Nikhil about this. There's no clear immediate path forward here, but we will dig in more here and pick a path and do that, and Nikhil is the guy to do that work. Unfortunately he's out for a Service Worker hackathon next week and won't be able to dig in here until after that.

Joel Maher ( :jmaher ) (UTC -8)

Comment 13

•

10 years ago

:jst/:nsm, thanks for looking back into this. I am excited about seeing this fixed. As this might be a couple more weeks, would there be any concerns with disabling the 2 pdf tests (on debug only) and then we can reduce the urgency on this bug. Of course I understand if we want to keep this coverage and can commit to working on this in the week after the hackathon.

Flags: needinfo?(jst)

Johnny Stenback (:jst)

Comment 14

•

10 years ago

I think from where I sit that should be fine, but we should probably check with the pdf folks before doing that.

Flags: needinfo?(jst)

Yury Delendik (:yury)

Comment 15

•

10 years ago

(In reply to Johnny Stenback (:jst, jst@mozilla.com) from comment #14) > I think from where I sit that should be fine, but we should probably check > with the pdf folks before doing that. The tests were disabled and re-enabled several times in the past. There are only those two tests that check PDF.js integration.

Nikhil Marathe [:nsm] (No longer reading bugmail, please needinfo?)

Assignee

Updated

•

10 years ago

Flags: needinfo?(nsm.nikhil)

Joel Maher ( :jmaher ) (UTC -8)

Comment 16

•

10 years ago

Yury, I know where you are coming from- having minimal coverage and disabling half of it isn't a good thing. Maybe there is a way to hack something at the end of each test case to work around this promise issue if we need to have these on for debug tests? Yury, right now we see this leaking on debug tests only- do we gain extra coverage (other than leak checking) with these specific tests on debug vs opt?

Flags: needinfo?(ydelendik)

Yury Delendik (:yury)

Comment 17

•

10 years ago

(In reply to Joel Maher (:jmaher) from comment #16) > Maybe there is a way to hack something at the end of each test case to work > around this promise issue if we need to have these on for debug tests? Not sure it will be a simple hack since we have to track each Promise on main and web worker thread and ensure they are resolved or rejected. We can certainly try (I filed https://github.com/mozilla/pdf.js/issues/5278). But there is a risk that if we miss at least one Promise (and we use them across entire code base), the problem will not go away. > Yury, right now we see this leaking on debug tests only- do we gain extra > coverage (other than leak checking) with these specific tests on debug vs > opt? The tests mostly for integration with Firefox. (We perform PDF parsing/rendering testing on other computers via github/botio) It is hard to tell which debug assertion can be triggered in Firefox by running PDF.js, or even due to the different timing. If I remember correctly, the bug 1008450 was specific only to Mac OSX debug builds.

Flags: needinfo?(ydelendik)

Joel Maher ( :jmaher ) (UTC -8)

Comment 18

•

10 years ago

sorry to keep bugging you Yury. My goal is to get this resolved by either fixing it or disabling until it can get fixed. I know it is hard to quantify the pros/cons of a specific platform or build type. Right now I am not seeing specific data to show that we will see a big loss by not running these 2 pdf tests in debug mode for a few weeks until this is fixed? Please speak up if we shouldn't disable this and if we are not going to disable it I want to get an ETA of when this bug will be fixed.

Flags: needinfo?(ydelendik)

Yury Delendik (:yury)

Comment 19

•

10 years ago

(In reply to Joel Maher (:jmaher) from comment #18) > sorry to keep bugging you Yury. no problem > My goal is to get this resolved by either fixing it or disabling until it > can get fixed. I know it is hard to quantify the pros/cons of a specific > platform or build type. Right now I am not seeing specific data to show > that we will see a big loss by not running these 2 pdf tests in debug mode > for a few weeks until this is fixed? That's okay to temporary disable those test -- that was done several times in the past. (I hope that the comment 13 does not talk about permanently disabling them)

Flags: needinfo?(ydelendik)

Tim Taubert [:ttaubert] (inactive)

Reporter

Comment 20

•

10 years ago

Hey Nikhil, any chance you can take a look at this again?

Flags: needinfo?(nsm.nikhil)

Nikhil Marathe [:nsm] (No longer reading bugmail, please needinfo?)

Assignee

Comment 21

•

10 years ago

on email, :bz said --- We can (1) make attempts to create a Promise with a non-current-inner global fail, for example. Or we can (2) make such promises not call their callbacks. Or we can (3) declare that such non-current-inners don't spin their event loop, which is much like (2) I suspect. This might happen anyway depending on what people do with their event loops. Relying on GC/CC here is not a good idea, because of the non-determinism.... Have we tried asking other UAs, in particular Chrome and Domenic which of those options, or whatever other ones we're considering, they like? If you're just not able to get an answer from them in time, I think doing #2 for now is fine. #1 has the problem that we might have code that doesn't actually handle Promise creation failure. --- Unfortunately the thread he started[1] seems to have diverged from the original intention :/ [1]: https://mail.mozilla.org/pipermail/es-discuss/2014-September/039608.html Boris, the problem with (1) or (2) is that CheckInnerWindowCorrectness() is right now on DETH and Promises aren't DETHs. Can I just copy the logic?

Flags: needinfo?(nsm.nikhil) → needinfo?(bzbarsky)

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 22

•

10 years ago

Is currentinnerwindow check even enough here, if the Promise is created in some iframe and the parent document goes to bfcache. We probably want something like CheckInnerWindowCorrectness() && !IsFrozen(); nsPIDOMWindow has IsCurrentInnerWindow() and IsFrozen(); In case Promise's callback would be called in bfcache doc, we probably should evict the doc from bfcache.

Boris Zbarsky [:bzbarsky]

Comment 23

•

10 years ago

> if the Promise is created in some iframe and the parent document goes to bfcache. Ouch. That would also affect the active-document API that johns is adding on nsIDocument, and nsPIDOMWindow::HasActiveDocument, right? We should fix those... But for purpose of this bug, CheckInnerWindowCorrectness won't help as far as I can tell. I mentioned that in comment 8 already.

Flags: needinfo?(bzbarsky)

Boris Zbarsky [:bzbarsky]

Comment 24

•

10 years ago

Right, and there is no DisconnectFromOwner here. So maybe we should just stash the Promise in a doubly linked list on the window (we want this anyway; actually) and then let it know when the window is going away like we do for DETH?

Johnny Stenback (:jst)

Updated

•

10 years ago

Flags: needinfo?(bzbarsky)

Boris Zbarsky [:bzbarsky]

Comment 25

•

10 years ago

OK, so how about this proposal: 1) We add a boolean member on nsIGlobalObject. Call it mCleanedUp or something similar (if using that name, we should remove the corresponding member on nsGlobalWindow). Have that member start 2) Have nsGlobalWindow set this member in nsGlobalWindow::CleanUp. 3) If the member is set on its mGlobal, a promise simply never notifies its handlers (and ignores any attempts to add new handlers). That might not handle the bfcache case, which is why we might want to name the member something other than mCleanedUp and set it in the bfcache case as well. Bobby, objections to adding data members to nsIGlobalObject?

Flags: needinfo?(bzbarsky) → needinfo?(bobbyholley)

Bobby Holley (:bholley)

Comment 26

•

10 years ago

(In reply to Boris Zbarsky [:bz] from comment #25) > Bobby, objections to adding data members to nsIGlobalObject? My objection is more with polluting nsIGlobalObject with stuff that doesn't have a clear meaning for all of its consumers. I want to be very careful that we don't end up with another nsIScriptGlobalObject. Is there one not mentioned in comment 25? If this is window-specific, then it seems like Promises should just maintain a QI-ed ref to nsPIDOMWindow, and if it's non-null, check that bit directly.

Flags: needinfo?(bobbyholley)

Boris Zbarsky [:bzbarsky]

Comment 27

•

10 years ago

I think that such a thing would have a clear meaning for all globals: you should not be running script anymore, for whatever reason. Not terribly window-specific...

Bobby Holley (:bholley)

Comment 28

•

10 years ago

(In reply to Boris Zbarsky [:bz] from comment #27) > I think that such a thing would have a clear meaning for all globals: you > should not be running script anymore, for whatever reason. Not terribly > window-specific... But do we actually have a clear sense of when that happens for other globals? That's my question here. Certainly not for sandboxes. For ContentFrameMessageManger, maybe? And what about XPCWrappedNative BackstagePass instances?

Boris Zbarsky [:bzbarsky]

Comment 29

•

10 years ago

> But do we actually have a clear sense of when that happens for other globals? For workers, yes. It never happens for sandboxes and backstagepass. In any case, we could do the Window thing for now, but I'm trying to think about how we want to implement this once Promise is part of SpiderMonkey.

Bobby Holley (:bholley)

Comment 30

•

10 years ago

Hm. In the current world this might be able to be approximated by the cases where GetGlobalJSObject returns null. But long term I'd really like GetGlobalJSObject() to become GlobalJSObject(), which would mean that anyone holding an nsIGlobalObject would have to cycle collect that edge, and holding onto an nsIGlobalObject would more or less guarantee that the global would remain alive. So, for Window, Workers, and ContentFrameMessageManager, I guess we need some way to indicate that the global is dying but not yet dead. So I'd be ok with StartDying()/IsDying()/mIsDying(false) on nsIGlobalObject.

Boris Zbarsky [:bzbarsky]

Comment 31

•

10 years ago

Nikhil, are you up for dealing with this?

Flags: needinfo?(nsm.nikhil)

Nikhil Marathe [:nsm] (No longer reading bugmail, please needinfo?)

Assignee

Comment 32

•

10 years ago

If you mean comment 25, then yes I could do that.

Flags: needinfo?(nsm.nikhil)

Boris Zbarsky [:bzbarsky]

Comment 33

•

10 years ago

Comment 25, yes.

Andrew McCreight [:mccr8]

Comment 34

•

10 years ago

nsm, please post your work in progress when you have something ready. I'm seeing a leak when running dom/network/tests/ with e10s where promises are keeping alive the window for test_udpsocket.html and I'd like to see if this will fix that.

Andrew McCreight [:mccr8]

Comment 35

•

10 years ago

(In reply to Andrew McCreight [:mccr8] from comment #34) > nsm, please post your work in progress when you have something ready. I'm > seeing a leak when running dom/network/tests/ with e10s where promises are > keeping alive the window for test_udpsocket.html and I'd like to see if this > will fix that. Never mind, I'm pretty sure it is about some networking IPDL stuff instead, which would make more sense.

Nikhil Marathe [:nsm] (No longer reading bugmail, please needinfo?)

Assignee

Updated

•

10 years ago

Assignee: nobody → nsm.nikhil

Nikhil Marathe [:nsm] (No longer reading bugmail, please needinfo?)

Assignee

Comment 36

•

10 years ago

Boris, In the microtask based implementation, if a runnable run from Promise::PerformMicroTaskCheckpoint leads to a new task being appended to the microtask queue, PerformMicroTaskCheckpoint will never break out of it's loop. This prevents nsGlobalWindow::CleanUp from running. In fact, it freezes the entire browser.

Nikhil Marathe [:nsm] (No longer reading bugmail, please needinfo?)

Assignee

Updated

•

10 years ago

Flags: needinfo?(bzbarsky)

Boris Zbarsky [:bzbarsky]

Comment 37

•

10 years ago

Per spec, that's more or less what should happen. What we _should_ be doing is performing some equivalent of interrupt checks in Promise::PerformMicroTaskCheckpoint, because this is equivalent to an infinite loop in JS. I'm not actually sure what the best way of doing that is, since this queue interleaves promises from different windows, so we can't just pick a particular jscontext and do interrupt checks on it... that's assuming the watchdog were even running when we're not really executing JS. Bobby, thoughts?

Flags: needinfo?(bzbarsky)

Bobby Holley (:bholley)

Comment 38

•

10 years ago

It should be enough just to use a JSAutoRequest for the scope of Promise::PerformMicroTaskCheckpoint. That will put the JS engine in an active state for the duration of the function, and activate the watchdog. nsm, can you try that and see if you get the slow script dialog?

Boris Zbarsky [:bzbarsky]

Comment 39

•

10 years ago

JSAutoRequest on which JSContext? Or does that not matter?

Bobby Holley (:bholley)

Comment 40

•

10 years ago

(In reply to Boris Zbarsky [:bz] from comment #39) > JSAutoRequest on which JSContext? Or does that not matter? Doesn't matter. The state lives on the JSRuntime.

Nikhil Marathe [:nsm] (No longer reading bugmail, please needinfo?)

Assignee

Comment 41

•

10 years ago

Attached patch Add member to nsIGlobalObject to detect it is going away. Make promises use it (obsolete) — Details — Splinter Review

This is what I have right now. The JSAutoRequest does cause the slow script dialog to popup and also keeps the rest of the UI responsive, but attempting to quit while the promise loop is running, does nothing, until the browser eventually MOZ_CRASHes |Hit MOZ_CRASH(Shutdown too long, probably frozen, causing a crash.) at /home/nikhil/mozilla-central/toolkit/components/terminator/nsTerminator.cpp:101| I added the request interrupt call after every runnable. I don't know if this is efficient, but at least the browser can shut down. I haven't tried the other scenarios yet like bfcache and just closing tabs instead of the browser.

Attachment #8519277 - Flags: feedback?(bobbyholley)

Bobby Holley (:bholley)

Comment 42

•

10 years ago

Comment on attachment 8519277 [details] [diff] [review] Add member to nsIGlobalObject to detect it is going away. Make promises use it Review of attachment 8519277 [details] [diff] [review]: ----------------------------------------------------------------- When the slow script dialog kills a script, the JS engine throws an uncatchable exception (i.e. returning false without setting a pending exception). So it seems like what we really want to do here is to propagate that failure to the nsresult returning value of ::Run() on the runnables, and then bail out of the while loop if one of the runnables fails.

Attachment #8519277 - Flags: feedback?(bobbyholley) → feedback-

Joel Maher ( :jmaher ) (UTC -8)

Updated

•

10 years ago

No longer blocks: 1057512

Nikhil Marathe [:nsm] (No longer reading bugmail, please needinfo?)

Assignee

Comment 43

•

10 years ago

I made the changes in comment 42 and the loop stops, and the browser tries to shutdown. But it almost always gets stuck here: #0 0x00007ffff7bcb8bf in pthread_cond_wait@@GLIBC_2.3.2 () from /usr/lib/libpthread.so.0 #1 0x00007ffff67d51ba in PR_Wait (mon=0x7fffe2415f80, timeout=4294967295) at /home/nikhil/mozilla-central/nsprpub/pr/src/pthreads/ptsynch.c:691 #2 0x00007fffed1580db in mozilla::ReentrantMonitor::Wait (this=0x7ffff6c62fc8, aInterval=4294967295) at /home/nikhil/mozilla-central/xpcom/glue/BlockingResourceBase.cpp:468 #3 0x00007fffed0e435e in mozilla::ReentrantMonitorAutoEnter::Wait (this=0x7fffffffc6a8, aInterval=4294967295) at ../../dist/include/mozilla/ReentrantMonitor.h:190 #4 0x00007fffed119eb9 in nsEventQueue::GetEvent (this=0x7ffff6c62fc8, aMayWait=true, aResult=0x7fffffffc810) at /home/nikhil/mozilla-central/xpcom/threads/nsEventQueue.cpp:67 #5 0x00007fffed12791f in nsThread::nsChainedEventQueue::GetEvent (this=0x7ffff6c62fb8, aMayWait=true, aEvent=0x7fffffffc810) at /home/nikhil/mozilla-central/xpcom/threads/nsThread.h:116 #6 0x00007fffed11e1ea in nsThread::ProcessNextEvent (this=0x7ffff6c62f40, aMayWait=true, aResult=0x7fffffffc90e) at /home/nikhil/mozilla-central/xpcom/threads/nsThread.cpp:821 #7 0x00007fffed16a1e7 in NS_ProcessNextEvent (aThread=0x7ffff6c62f40, aMayWait=true) at /home/nikhil/mozilla-central/xpcom/glue/nsThreadUtils.cpp:265 #8 0x00007fffee442c7f in mozilla::layers::CompositorParent::ShutDown () at /home/nikhil/mozilla-central/gfx/layers/ipc/CompositorParent.cpp:341 #9 0x00007fffee4c3217 in gfxPlatform::ShutdownLayersIPC () at /home/nikhil/mozilla-central/gfx/thebes/gfxPlatform.cpp:568 #10 0x00007fffed179586 in mozilla::ShutdownXPCOM (aServMgr=0x7ffff6cbe338) at /home/nikhil/mozilla-central/xpcom/build/XPCOMInit.cpp:841 #11 0x00007fffed179355 in NS_ShutdownXPCOM (aServMgr=0x7ffff6cbe338) at /home/nikhil/mozilla-central/xpcom/build/XPCOMInit.cpp:780 #12 0x00007ffff1081d0e in ScopedXPCOMStartup::~ScopedXPCOMStartup (this=0x7fffe244fb60) at /home/nikhil/mozilla-central/toolkit/xre/nsAppRunner.cpp:1334 #13 0x00007ffff108bcf5 in XREMain::XRE_main (this=0x7fffffffcc40, argc=4, argv=0x7fffffffe1f8, aAppData=0x7fffffffcec0) at /home/nikhil/mozilla-central/toolkit/xre/nsAppRunner.cpp:4253 #14 0x00007ffff108c44f in XRE_main (argc=4, argv=0x7fffffffe1f8, aAppData=0x7fffffffcec0, aFlags=0) at /home/nikhil/mozilla-central/toolkit/xre/nsAppRunner.cpp:4446 #15 0x0000000000404f5f in do_main (argc=4, argv=0x7fffffffe1f8, xreDirectory=0x7ffff6c39840) at /home/nikhil/mozilla-central/browser/app/nsBrowserApp.cpp:287 #16 0x0000000000404670 in main (argc=4, argv=0x7fffffffe1f8) at /home/nikhil/mozilla-central/browser/app/nsBrowserApp.cpp:656 waiting for the compositor to shutdown, which itself is waiting for its thread to shut down. A few runs do end successfully, usually if I attempt quitting immediately after executing the test case. I can't think of a predictable way to probe what causes this to happen.

Nikhil Marathe [:nsm] (No longer reading bugmail, please needinfo?)

Assignee

Comment 44

•

10 years ago

At first I thought comment 43 might be related to Bug 1095443, but the same thing happens even with the fix for Bug 1095443. Boris, any thoughts?

Flags: needinfo?(bzbarsky)

Boris Zbarsky [:bzbarsky]

Comment 45

•

10 years ago

Nothing offhand, sorry. :(

Flags: needinfo?(bzbarsky)

Nikhil Marathe [:nsm] (No longer reading bugmail, please needinfo?)

Assignee

Comment 46

•

10 years ago

Gavin, maybe you know someone who can dig into this a bit. I'm attaching the patch I have.

Flags: needinfo?(gavin.sharp)

Nikhil Marathe [:nsm] (No longer reading bugmail, please needinfo?)

Assignee

Comment 47

•

10 years ago

Attached patch Add member to nsIGlobalObject to detect it is going away. Make promises use it — Details — Splinter Review

Nikhil Marathe [:nsm] (No longer reading bugmail, please needinfo?)

Assignee

Updated

•

10 years ago

Attachment #8519277 - Attachment is obsolete: true

:Gavin Sharp [email: gavin@gavinsharp.com]

Comment 48

•

10 years ago

No idea, but maybe Paolo or Yoric can help.

Flags: needinfo?(paolo.mozmail)

Flags: needinfo?(gavin.sharp)

Flags: needinfo?(dteller)

David Teller [:Yoric] - still alive but not very active

Comment 49

•

10 years ago

I'm not sure what's requested of us. Gavin, Nikhil, do you want me (or Paolo) to take a look at the implementation of DOM Promise and find out how this patch conflicts with the compositor? That's something we can certainly try to do, but given that I have zero knowledge of the compositor, it might take me some time to get into it.

Flags: needinfo?(nsm.nikhil)

Flags: needinfo?(gavin.sharp)

Flags: needinfo?(dteller)

Nikhil Marathe [:nsm] (No longer reading bugmail, please needinfo?)

Assignee

Comment 50

•

10 years ago

(In reply to David Rajchenbach-Teller [:Yoric] (away until November 17th - use "needinfo") from comment #49) > I'm not sure what's requested of us. Gavin, Nikhil, do you want me (or > Paolo) to take a look at the implementation of DOM Promise and find out how > this patch conflicts with the compositor? That's something we can certainly > try to do, but given that I have zero knowledge of the compositor, it might > take me some time to get into it. So somehow the Promise conversion to microtask is interfering with the compositor. I'm not sure how, which is why I was hoping someone who knew about the compositor could :) My hunch is that the compositor is waiting on some event but promises (when using this bug's test case at least) are preventing that event from executing. If you could investigate that, awesome!

Flags: needinfo?(nsm.nikhil)

Bobby Holley (:bholley)

Comment 51

•

10 years ago

I'm totally confused. Nikhil, why are you asking frontend people to investigate layout/gfx issues? Yoric's willingness to help in comment 49 is noble, but he's almost certainly not the right person to ask about this. If you want help with gfx/layers/ipc/CompositorParent.cpp, you should probably try BenWa.

Flags: needinfo?(paolo.mozmail)

Flags: needinfo?(nsm.nikhil)

Flags: needinfo?(gavin.sharp)

Nikhil Marathe [:nsm] (No longer reading bugmail, please needinfo?)

Assignee

Comment 52

•

10 years ago

Sorry, Yoric has been involved with the Promises effort for a while, so I thought he might be able to help from that end of things. I think the best people on this would be Paolo and BenWa (or another gfx person) together. ni?ing them, perhaps they can work out some communication. BenWa, comment 43 has context. Thanks!

Flags: needinfo?(paolo.mozmail)

Flags: needinfo?(nsm.nikhil)

Flags: needinfo?(bgirard)

David Teller [:Yoric] - still alive but not very active

Comment 53

•

10 years ago

Indeed, both Paolo and myself have been patching Promise.cpp recently, so we might be good interlocutors. We just have no clue about the compositor.

Benoit Girard (:BenWa)

Comment 54

•

10 years ago

Can you post the state of all the threads? Knowing what we're doing in the compositor would help. The compositor should be able to complete without calling into the main thread.

Flags: needinfo?(bgirard)

:Paolo Amadini

Comment 55

•

10 years ago

Since bug 1095443 was ruled out, I don't have any other suggestion for something directly related to the Promise change. We post less events to the main thread now, so that change might have an effect if something in the compositor shutdown is accidentally time-dependent.

Flags: needinfo?(paolo.mozmail)

David Teller [:Yoric] - still alive but not very active

Comment 56

•

10 years ago

Not that compositor shutdown is somewhat fragile. We have many reports of compositor shutdown hangs (it's a subset of topcrash #2, I don't have a clear idea of how large this subset is).

Alex Vincent [:WeirdAl]

Comment 57

•

10 years ago

Note bug 1095267 comment 4. Over there I discovered that converting from Promise.jsm to native Promises started a similar freeze in SQLite testing, and that a call to Components.utils.forceCC() ended the freeze. There might be a need to cycle-collect here as well, or at least a way to identify and break the promise-related cycle so things can proceed smoothly.

Boris Zbarsky [:bzbarsky]

Updated

•

10 years ago

Blocks: 1124322

Johnny Stenback (:jst)

Comment 58

•

10 years ago

David, Nikhil, what's the next steps here?

Nikhil Marathe [:nsm] (No longer reading bugmail, please needinfo?)

Assignee

Comment 59

•

10 years ago

I tried the attached patch again and didn't run into any compositor hangs. This is on Linux, intel graphics. If a Promise is stuck in a loop and the window is closed, a slow script dialog pops up. Killing the script will result in the browser shutting down after a few seconds (once the existing event queue is dealt with). Could someone try this on other platforms?

Nikhil Marathe [:nsm] (No longer reading bugmail, please needinfo?)

Assignee

Comment 60

•

10 years ago

Comment on attachment 8526288 [details] [diff] [review] Add member to nsIGlobalObject to detect it is going away. Make promises use it Review of attachment 8526288 [details] [diff] [review]: ----------------------------------------------------------------- I'm thinking we should land this and see if the problem persists. Could you review? I'll take out the __PRETTY_FUNCTION__ lines before landing.

Attachment #8526288 - Flags: review?(bobbyholley)

Bobby Holley (:bholley)

Comment 61

•

10 years ago

Comment on attachment 8526288 [details] [diff] [review] Add member to nsIGlobalObject to detect it is going away. Make promises use it Review of attachment 8526288 [details] [diff] [review]: ----------------------------------------------------------------- Cancelling review on account of the interrupt callback thing. ::: dom/base/nsGlobalWindow.cpp @@ +1422,5 @@ > > void > nsGlobalWindow::CleanUp() > { > + NS_WARNING(__PRETTY_FUNCTION__); Per above, this goes away. ::: dom/base/nsIGlobalObject.h @@ +27,5 @@ > public: > NS_DECLARE_STATIC_IID_ACCESSOR(NS_IGLOBALOBJECT_IID) > > + bool > + IsDying() const This needs copious comments explaining what's going on. @@ +33,5 @@ > + return mIsDying; > + } > + > + void > + StartDying() This should almost certainly be marked protected. ::: dom/promise/Promise.cpp @@ +369,5 @@ > return false; > } > > + ThreadsafeAutoSafeJSContext cx; > + JSAutoRequest req(cx); What is this doing here? Looks to me like it should be removed. @@ +374,2 @@ > do { > + fprintf(stderr, "NON EMPTY\n"); Same here. @@ +381,5 @@ > + nsresult rv = runnable->Run(); > + if (NS_WARN_IF(NS_FAILED(rv))) { > + return; > + } > + JS_RequestInterruptCallback(runtime->Runtime()); Requesting an interrupt callback is very expensive IIRC. Why are we doing it here, especially given that we don't do it when executing setTimeout etc? ::: js/xpconnect/src/XPCJSRuntime.cpp @@ +1366,5 @@ > // static > bool > XPCJSRuntime::InterruptCallback(JSContext *cx) > { > + NS_WARNING(__PRETTY_FUNCTION__); this too. @@ +1417,5 @@ > } > } > + if (win && win->IsDying()) { > + MOZ_ASSERT(false); > + } Make this MOZ_ASSERT_IF(win, !win->IsDying());

Attachment #8526288 - Flags: review?(bobbyholley)

Nikhil Marathe [:nsm] (No longer reading bugmail, please needinfo?)

Assignee

Comment 62

•

10 years ago

Sorry about getting back to this after ages. (In reply to Bobby Holley (:bholley) from comment #61) > > @@ +381,5 @@ > > + nsresult rv = runnable->Run(); > > + if (NS_WARN_IF(NS_FAILED(rv))) { > > + return; > > + } > > + JS_RequestInterruptCallback(runtime->Runtime()); > > Requesting an interrupt callback is very expensive IIRC. Why are we doing it > here, especially given that we don't do it when executing setTimeout etc? If I don't have this here, the slow script dialog does not popup, presumably since the runnables just keep running. I'm not sure how to proceed without this.

Flags: needinfo?(bobbyholley)

Bobby Holley (:bholley)

Comment 63

•

10 years ago

Brian, is this call to JS_RequestInterruptCallback something we should be concerned about? It's going to be occurring on the runtime's owner thread, and happen at various places where we return from JS to C++.

Flags: needinfo?(bhackett1024)

Bobby Holley (:bholley)

Comment 64

•

10 years ago

Hm, let's try luke.

Flags: needinfo?(luke)

Luke Wagner [:luke]

Comment 65

•

10 years ago

JS_RequestInterruptCallback() can be rather expensive since involves patching all Ion backedges. (It's not the super-expensive signal path, though, which is only used when JS_RequestInterruptCallback() is called *off* the owner thread.) I'm actually quite confused why JS_RequestInterruptCallback() is being called as it is in the patch: JS_RequestInterruptCallback() simply requests that a callback to be called later but does not call it itself. With the slow-script dialog, we already have the watchdog thread calling JS_RequestInterruptCallback(); what we need on the main thread is regular calls to js::CheckForInterrupt(), which normally the JS engine does for you. But, if you have something outside the JS engine spending a long time in a loop, it would need to call js::CheckForInterrupt() too. (It appears we don't actually have a public JSAPI for this that I can find (weird), so you can add it.) Given all that, does changing that call to JS_RequestInterruptCallback() to JS::CheckForInterrupt() fix your problem?

Flags: needinfo?(luke)

Bobby Holley (:bholley)

Comment 66

•

10 years ago

My initial assumption was that we couldn't rely on the watchdog thread because that's main-thread-only. But thinking about it more, it's not actually clear to me whether requesting an interrupt callback on workers would help very much. On workers, we request an interrupt callback when we send a control runnable, and the interrupt callback processes control runnables. So we should actually be set in that case.

Flags: needinfo?(bobbyholley)

Flags: needinfo?(bhackett1024)

Nikhil Marathe [:nsm] (No longer reading bugmail, please needinfo?)

Assignee

Comment 67

•

10 years ago

So I looked at adding a JS::CheckForInterrupt(), but js::CheckForInterrupt() requires a JSContext and I don't have one in PerformMicroTaskCheckpoint(). Bobby, will the AutoSafeJSContext be acceptable?

Flags: needinfo?(bobbyholley)

Boris Zbarsky [:bzbarsky]

Comment 68

•

10 years ago

Or add an overload of CheckForInterrupt that takes JSRuntime?

Bobby Holley (:bholley)

Comment 69

•

10 years ago

(In reply to Nikhil Marathe [:nsm] (needinfo? please) from comment #67) > So I looked at adding a JS::CheckForInterrupt(), but js::CheckForInterrupt() > requires a JSContext and I don't have one in PerformMicroTaskCheckpoint(). > Bobby, will the AutoSafeJSContext be acceptable? ThreadsafeAutoSafeJSContext should work, yes - this stuff is called on workers too, right? (In reply to Not doing reviews right now from comment #68) > Or add an overload of CheckForInterrupt that takes JSRuntime? Almost, but to invoke the interrupt callback we need a cx.

Flags: needinfo?(bobbyholley)

Nikhil Marathe [:nsm] (No longer reading bugmail, please needinfo?)

Assignee

Comment 70

•

10 years ago

Fixed patch coming up. Couple of things: 1) I've made the interrupt check main thread only. a) Because on the worker it crashes with this stack https://nsm.pastebin.mozilla.org/8831084 b) On workers, during shutdown the runnables are forcibly canceled, so we don't get stuck. 2) I am calling the function CheckForJSInterrupt instead of CheckForInterrupt because I get linker errors due to clashing with js::CheckForInterrupt() even though they are in separate namespaces and I have no easy fix.

Bobby Holley (:bholley)

Comment 71

•

10 years ago

(In reply to Nikhil Marathe [:nsm] (needinfo? please) from comment #70) > 2) I am calling the function CheckForJSInterrupt instead of > CheckForInterrupt because I get linker errors due to clashing with > js::CheckForInterrupt() even though they are in separate namespaces and I > have no easy fix. Hm, that's probably worth getting to the bottom of rather than letting it dictate the naming, no?

Luke Wagner [:luke]

Comment 72

•

10 years ago

(In reply to Nikhil Marathe [:nsm] (needinfo? please) from comment #70) > a) Because on the worker it crashes with this stack > https://nsm.pastebin.mozilla.org/8831084 That can be fixed by just adding a null check to AttachFinishedCompilations where we're already branching for a null IonCompartment.

Nikhil Marathe [:nsm] (No longer reading bugmail, please needinfo?)

Assignee

Comment 73

•

10 years ago

Attached patch interdiff — Details — Splinter Review

Attachment #8596250 - Flags: review?(bobbyholley)

Bobby Holley (:bholley)

Comment 74

•

10 years ago

Comment on attachment 8596250 [details] [diff] [review] interdiff Review of attachment 8596250 [details] [diff] [review]: ----------------------------------------------------------------- r=me with that. ::: dom/base/nsIGlobalObject.h @@ +38,5 @@ > + * from being dispatched. > + * > + * We pair this with checks during processing the promise microtask queue > + * that pops up the slow script dialog when the Promise queue is preventing > + * a window from going away. Wonderful comment, thank you. ::: js/src/jsapi.cpp @@ +3355,5 @@ > > +JS_PUBLIC_API(bool) > +CheckForJSInterrupt(JSContext* cx) > +{ > + return js::CheckForInterrupt(cx); please just expose this in jsfriendapi.h and get rid of the duplicate method. ::: js/src/vm/ForOfIterator.cpp @@ +102,5 @@ > ForOfIterator::nextFromOptimizedArray(MutableHandleValue vp, bool* done) > { > MOZ_ASSERT(index != NOT_ARRAY); > > + if (!js::CheckForInterrupt(cx_)) This change should go.

Attachment #8596250 - Flags: review?(bobbyholley) → review+

Nikhil Marathe [:nsm] (No longer reading bugmail, please needinfo?)

Assignee

Comment 75

•

10 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/527cd70f241f

Nikhil Marathe [:nsm] (No longer reading bugmail, please needinfo?)

Assignee

Comment 76

•

10 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/7c9a39a083d1

Nikhil Marathe [:nsm] (No longer reading bugmail, please needinfo?)

Assignee

Comment 77

•

10 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/cb59db723d15

Nikhil Marathe [:nsm] (No longer reading bugmail, please needinfo?)

Assignee

Comment 78

•

10 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/66e5d77f947d

Nikhil Marathe [:nsm] (No longer reading bugmail, please needinfo?)

Assignee

Comment 79

•

10 years ago

Bobby, extern declaring CheckForInterrupt in jsfriendapi breaks mac builds [1], moving CheckForInterrupt from jscntxt to jsfriendapi fails due to it being inline but not having access to JSRuntime. Should I bring back JS::CheckForJSInterrupt()? [1]: https://treeherder.mozilla.org/logviewer.html#?job_id=9235188&repo=mozilla-inbound

Flags: needinfo?(bobbyholley)

Bobby Holley (:bholley)

Comment 80

•

10 years ago

(In reply to Nikhil Marathe [:nsm] (needinfo? please) from comment #79) > Bobby, extern declaring CheckForInterrupt in jsfriendapi breaks mac builds > [1], moving CheckForInterrupt from jscntxt to jsfriendapi fails due to it > being inline but not having access to JSRuntime. Should I bring back > JS::CheckForJSInterrupt()? > > [1]: > https://treeherder.mozilla.org/logviewer.html#?job_id=9235188&repo=mozilla- > inbound Oh, I didn't realize it was inline. In that case, the appropriate thing to do is to add JS::CheckForInterrupt to jsapi.h (and defined in jsapi.cpp), which just calls the inlined version. I know you mentioned above that you were having trouble making this build, but I'd like to get the the bottom of that rather than using non-standard naming for unknown reasons.

Flags: needinfo?(bobbyholley)

Bobby Holley (:bholley)

Comment 81

•

10 years ago

It occurred to me that having JS::CheckForInterrupt and js::CheckForInterrupt might be a problem if there's code that does: using namespace js; using namespace JS; Though if that happens, it should manifest as a compile error, not a link error. Anyway, if that's the case, I guess we could do ::JS_CheckForInterrupt for the external API - it uses the deprecated-yet-still-common-and-understandable naming convention, which is fine for now.

Nikhil Marathe [:nsm] (No longer reading bugmail, please needinfo?)

Assignee

Comment 82

•

10 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=6ae9d5eda86c

Nikhil Marathe [:nsm] (No longer reading bugmail, please needinfo?)

Assignee

Comment 83

•

10 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/c839f283bf28

Ryan VanderMeulen [:RyanVM]

Comment 84

•

10 years ago

Backed out for bustage. https://hg.mozilla.org/integration/mozilla-inbound/rev/d02374f93b37 https://treeherder.mozilla.org/logviewer.html#?job_id=9333851&repo=mozilla-inbound

status-firefox31: affected → ---

status-firefox32: affected → ---

status-firefox33: affected → ---

status-firefox34: affected → ---

Nikhil Marathe [:nsm] (No longer reading bugmail, please needinfo?)

Assignee

Comment 85

•

10 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/531631f34021

Carsten Book [:Tomcat]

Comment 86

•

10 years ago

https://hg.mozilla.org/mozilla-central/rev/531631f34021

Status: NEW → RESOLVED

Closed: 10 years ago

status-firefox40: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla40

Boris Zbarsky [:bzbarsky]

Updated

•

9 years ago

Depends on: 1182197

Boris Zbarsky [:bzbarsky]

Updated

•

9 years ago

Depends on: 1245674

Ben Hsu [:HoPang]

Updated

•

8 years ago

Updated

•

6 years ago

Component: DOM → DOM: Core & HTML

Karl Tomlinson (:karlt)

Comment 87

•

5 years ago

https://github.com/whatwg/html/issues/2621 is related, but seems to be focusing on whether Document object is fully active, which I assume is not the case if the document is in bfcache.

Add member to nsIGlobalObject to detect it is going away. Make promises use it 10 years ago Nikhil Marathe [:nsm] (No longer reading bugmail, please needinfo?) 3.74 KB, patch	bholley : feedback-	Details \| Diff \| Splinter Review
Add member to nsIGlobalObject to detect it is going away. Make promises use it 10 years ago Nikhil Marathe [:nsm] (No longer reading bugmail, please needinfo?) 15.88 KB, patch		Details \| Diff \| Splinter Review
interdiff 10 years ago Nikhil Marathe [:nsm] (No longer reading bugmail, please needinfo?) 4.31 KB, patch	bholley : review+	Details \| Diff \| Splinter Review