Closed
Bug 1003730
Opened 10 years ago
Closed 5 years ago
Close the tab will not kill the worker if there is a console.log() keep printing log to DevTools
Categories
(Core :: DOM: Workers, defect, P2)
Tracking
()
RESOLVED
FIXED
mozilla66
People
(Reporter: timdream, Assigned: baku)
References
Details
Attachments
(1 file, 3 obsolete files)
2.91 KB,
patch
|
mrbkap
:
review+
|
Details | Diff | Splinter Review |
STR: 1. Open test case, a worker will start running 2. Open devtool and verify numbers being printed on the console 3. Verify the CPU usage of Firefox is high with |top| 4. Open a new tab 5. Close the tab with test case 6. Wait for a few second and observe |top| Expected: 1. Firefox stop eating CPU since Worker should be removed. Actual: 1. Firefox keep eating CPU and your laptop start burning your hands.
Reporter | ||
Updated•10 years ago
|
Summary: Close the tab will not kill the worker if there is a console.log() in it → Close the tab will not kill the worker if there is a console.log() keep printing log to DevTools
Reporter | ||
Comment 1•10 years ago
|
||
ochameau ask me to try to replace |console.log()| with |dump()| and see if the issue is still there. It's not. So this looks more likely a DevTools issue according to him.
Component: DOM: Workers → Developer Tools: Console
Product: Core → Firefox
No, this belongs in the DOM. Andrea can you investigate here?
Component: Developer Tools: Console → DOM: Workers
Product: Firefox → Core
Actually adding Andreas ... see comment 2
Flags: needinfo?(amarchesini)
Assignee | ||
Comment 4•10 years ago
|
||
working on it.
Assignee: nobody → amarchesini
Flags: needinfo?(amarchesini)
Assignee | ||
Comment 5•10 years ago
|
||
Attachment #8422377 -
Flags: review?(khuey)
Why is this necessary? Why isn't JS interrupted when the worker wants to shut down?:
Flags: needinfo?(amarchesini)
Assignee | ||
Comment 7•10 years ago
|
||
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #6) > Why is this necessary? Why isn't JS interrupted when the worker wants to > shut down?: The reason why I wrote this is because a worker is not kill if there is a runnable running. When console.log() is called in workers we run this: nsRefPtr<ConsoleCallDataRunnable> runnable = new ConsoleCallDataRunnable(callData); runnable->Dispatch(); This keeps the worker and the JS alive. This patch fixes the problem for the Console, but maybe bent or you can come out with some better global idea.
Flags: needinfo?(amarchesini)
Comment on attachment 8422377 [details] [diff] [review] feature.patch This is just wallpaper over a much more serious issue.
Attachment #8422377 -
Flags: review?(khuey) → review-
Assignee: amarchesini → khuey
So I never got around to fixing this ... One of the tricky problems worker shutdown has to deal with is what to do with JS that is not yieldin. We assume that we're doing one of four things: 1. We're running some chrome JS ctypes call, such as an epoll. We don't attempt to handle this case, and will simply block shutdown until it completes. 2. We're running an infinite loop solely within JS. 3. We're running an infinite loop in JS that is making "short-lived" calls into the DOM, where "short-lived" means something that does not spin any event queues. 4. We're currently running a sync queue. When we want to shutdown a worker, we dispatch a NotifyRunnable, which is a control runnable. That pokes the operation callback, which interrupts running JS to run the control queue. We then return false from the operation callback which aborts the running script. This handles case 2, and it handles case 3 as soon as we return from the DOM call into JS (the operation callback will trigger immediately and we'll run the NotifyRunnable). It does not handle case 4. We used to deal with this by propagating the false return value from RunCurrentSyncLoop to the caller (of the JSNative entry point into the DOM) which would trigger an uncatchable exception. Some of that was lost when we moved to WebIDL. I believe the Console used to make sync calls to the main thread, and doesn't anymore, so this might just be fixed. But some other sites need some work. e.g. http://searchfox.org/mozilla-central/rev/d9a04f71630ce4203ff0a5e26722723045d035b5/dom/workers/WorkerRunnable.cpp#593 looks correct, but http://searchfox.org/mozilla-central/rev/d9a04f71630ce4203ff0a5e26722723045d035b5/dom/xhr/XMLHttpRequestWorker.cpp#1827 does not, and neither does http://searchfox.org/mozilla-central/rev/d9a04f71630ce4203ff0a5e26722723045d035b5/dom/workers/ScriptLoader.cpp#2112. I'll needinfo bz since he touched some of this last, and can probably pawn it off on baku or bkelly if needed.
Flags: needinfo?(bzbarsky)
Assignee: khuey → nobody
Reporter | ||
Comment 11•7 years ago
|
||
I've been trying to see if the bug reproduces in the latest Nightly and the problem is getting worse with e10s. I am unable to close the tab once it starts running. Neither I can Cmd+Q the Nighty app afterwards. Both the chrome and content process gets too busy, making switching tab hard.
Severity: normal → blocker
tracking-e10s:
--- → ?
Reporter | ||
Comment 12•7 years ago
|
||
But yeah, in non-e10s the bug is fixed, considering the repro step in comment 0.
Reporter | ||
Comment 13•7 years ago
|
||
(In reply to Tim Guan-tin Chien [:timdream] (please needinfo) from comment #11) > Neither I can Cmd+Q the Nighty app afterwards. Cmd+Q works after a very long wait, console output the following: $ mach run https://bugzilla.mozilla.org/attachment.cgi?id=8415079 0:00.25 /Users/timdream/Repositories/gecko/gecko-dev/objdir-artifact/dist/Nightly.app/Contents/MacOS/firefox https://bugzilla.mozilla.org/attachment.cgi?id=8415079 -no-remote -foreground -profile /Users/timdream/Repositories/gecko/gecko-dev/objdir-artifact/tmp/scratch_user WARNING: At least one completion condition is taking too long to complete. Conditions: [{"name":"SessionStore: flushing all windows","state":{"total":1,"current":0},"filename":"resource:///modules/sessionstore/SessionStore.jsm","lineNumber":1465,"stack":["resource:///modules/sessionstore/SessionStore.jsm:ssi_onQuitApplicationGranted:1465","resource:///modules/sessionstore/SessionStore.jsm:ssi_observe:648","chrome://global/content/bindings/remote-browser.xml:permitUnload:356","chrome://browser/content/browser.js:CanCloseWindow:6173","chrome://browser/content/browser.js:WindowIsClosing:6193","chrome://browser/content/browser.xul:onclose:1"]}] Barrier: quit-application-granted
Assignee | ||
Comment 14•7 years ago
|
||
We still don't support this particular case because we end up creating a loop of scheduling of GarbageCollectRunnable. We don't have onclose event anymore, so when we shutdown workers we can immediately stop scheduling GC.
Assignee: nobody → amarchesini
Attachment #8415079 -
Attachment is obsolete: true
Attachment #8422377 -
Attachment is obsolete: true
Flags: needinfo?(bzbarsky)
Attachment #8787452 -
Flags: review?(bkelly)
Comment 15•7 years ago
|
||
Comment on attachment 8787452 [details] [diff] [review] worker.patch Review of attachment 8787452 [details] [diff] [review]: ----------------------------------------------------------------- Can't we still be running js in the Closing state? We don't abort in that case. I don't think this test case exercises that path, though. Also, shouldn't we get this for free once we collapse Terminating/Canceling/Killing into a single state?
Attachment #8787452 -
Flags: review?(bkelly) → review-
Comment 16•7 years ago
|
||
Also, I think we are mostly handling this case today now as well. For example: 1) Open fresh dev-edition browser 2) open example.com tab 3) Open comment 0 attachment in tab 4) Observe CPU usage increases 5) Close comment 0 attachment tab (this will take a while) 6) Open about:memory tab 7) Minimize memory 8) Observe CPU usage drops The memory minimization causes us to enter the Terminating state which aborts the script. This seems to cleanup correctly. This is a bigger problem in e10s because we don't get GC's scheduled as often since a lot of chrome script lives in the parent process. The jankiness seems induced by console.log() flooding the main thread with runnables. I think perhaps we should instead write a bug on console to not do this. For example, it could allow at most one outstanding main thread Runnable per worker thread. Any queued up console reports get sent in the next main thread runnable. Andrea, what do you think?
Flags: needinfo?(amarchesini)
Reporter | ||
Updated•7 years ago
|
Attachment #8415079 -
Attachment is obsolete: false
Comment 17•7 years ago
|
||
Andrea, I think I might have seen the GC loop thing as well. I might not have seen previously since it will probably be timing dependent based on when the GC timers fire.
Comment 18•7 years ago
|
||
So I'm trying to understand comment 14. What is the loop here?
Comment 19•7 years ago
|
||
I don't remember all the details, but basically we were in state Closing or Terminating, but could never progress to Killing. GC control runnables kept getting dispatched before we could progress to Killing. Because we only cancel GC timers in Killing this could prevent shutdown forever.
Comment 20•7 years ago
|
||
FWIW I pushed this to try https://treeherder.mozilla.org/#/jobs?repo=try&revision=79e816ee4131fc8932ae976dfba1a8016a365442 with the patch for bug 1216175 and some other changes.
Reporter | ||
Comment 21•6 years ago
|
||
I can no longer reproduce comment 0 on Nightly (other than the fact console blocks the chrome process). If nothing else is actionable here we should close this as WFM.
Comment 22•6 years ago
|
||
:Baku, can you verify Tim's last comment and then close this bug accordingly?
Updated•6 years ago
|
Priority: -- → P2
Updated•5 years ago
|
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → WORKSFORME
Assignee | ||
Updated•5 years ago
|
Flags: needinfo?(amarchesini)
Assignee | ||
Comment 23•5 years ago
|
||
Yes, this bug is fixed. I confirm.
Assignee | ||
Comment 24•5 years ago
|
||
Blake, I don't know how familiar you are with the worker shutdown but this is what happens: 1. When a worker is terminated, we call: Notify(Canceling) on the parent thread. 2. This triggers a NotifyInternal(Canceling) call on the worker thread via a ControlRunnable. 3. Dispatching a ControlRunnable triggers a JS interruption. 4. If we are in a sync loop (xhr? url CTOR?), it can happen that we process the control runnable queue there (maybe we could refactoring this part, but it's not related to this bug). 5. The worker is set to 'canceling' state. At this point no JS execution should happen. 6. When the interruption callback is executed, there are no control runnables in the queue. This means that we don't return 'false' and JS is still executed. You can reproduce this issue with something like: |while (true) { fetch(something); }| My patch checks the status in the intercept callback. Plus it avoids processing normal runnables in sync loop when the worker is terminated. Follow ups will be: 1. Ignore the return value for Control runnables. We can simply check WorkerPrivate::mStatus 2. Check if we can avoid processing all the control runnables in multiple places and just process them in the intercept callback.
Attachment #8415079 -
Attachment is obsolete: true
Attachment #8787452 -
Attachment is obsolete: true
Attachment #9031104 -
Flags: review?(mrbkap)
Assignee | ||
Updated•5 years ago
|
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
Comment 25•5 years ago
|
||
Comment on attachment 9031104 [details] [diff] [review] worker_shutdown.patch Review of attachment 9031104 [details] [diff] [review]: ----------------------------------------------------------------- I think I understand this enough to stamp it.
Attachment #9031104 -
Flags: review?(mrbkap) → review+
Comment 26•5 years ago
|
||
Pushed by amarchesini@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/2d4a6ccb4811 Always terminate workers when in canceling state, r=mrbkap
Comment 27•5 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/2d4a6ccb4811
Status: REOPENED → RESOLVED
Closed: 5 years ago → 5 years ago
status-firefox66:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla66
You need to log in
before you can comment on or make changes to this bug.
Description
•