Closed
Bug 965527
Opened 9 years ago
Closed 9 years ago
Intermittent B2G desktop Shutdown "###!!! ABORT: file resource://gre/modules/AsyncShutdown.jsm"
Categories
(Toolkit :: OS.File, defect)
Tracking
()
RESOLVED
FIXED
mozilla31
Tracking | Status | |
---|---|---|
firefox29 | --- | unaffected |
firefox30 | --- | fixed |
firefox31 | --- | fixed |
firefox-esr24 | --- | unaffected |
b2g-v1.4 | --- | fixed |
b2g-v2.0 | --- | fixed |
People
(Reporter: RyanVM, Assigned: Yoric)
References
Details
(Keywords: intermittent-failure)
Attachments
(5 files, 1 obsolete file)
1.31 KB,
patch
|
Yoric
:
review+
|
Details | Diff | Splinter Review |
1.45 KB,
patch
|
Yoric
:
review+
|
Details | Diff | Splinter Review |
9.39 KB,
patch
|
Yoric
:
review+
|
Details | Diff | Splinter Review |
2.30 KB,
patch
|
froydnj
:
review+
|
Details | Diff | Splinter Review |
7.91 KB,
patch
|
Yoric
:
review+
|
Details | Diff | Splinter Review |
https://tbpl.mozilla.org/php/getParsedLog.php?id=33754620&tree=B2g-Inbound b2g_ubuntu64_vm b2g-inbound opt test reftest on 2014-01-29 11:03:47 PST for push fbd02ed19b1a slave: tst-linux64-ec2-352 11:09:08 INFO - REFTEST TEST-START | Shutdown 11:09:08 INFO - WARNING: A phase completion condition is taking too long to complete. Condition: OS.File: flush I/O queued before profile-before-change Phase: profile-before-change 11:09:08 INFO - ERROR: At least one completion condition failed to complete within a reasonable amount of time. Causing a crash to ensure that we do not leave the user with an unresponsive process draining resources. Conditions: OS.File: flush I/O queued before profile-before-change Phase: profile-before-change 11:09:08 INFO - [Parent 2336] ###!!! ABORT: file resource://gre/modules/AsyncShutdown.jsm, line 390 11:09:08 INFO - [Parent 2336] ###!!! ABORT: file resource://gre/modules/AsyncShutdown.jsm, line 390 11:09:08 WARNING - PROCESS-CRASH | Shutdown | application crashed [@ mozalloc_abort(char const*)]
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 5•9 years ago
|
||
I am in the process of instrumenting AsyncShutdown and OS.File to get better reporting on such things (bug 957123). In the meantime, there is nothing I can do.
Flags: needinfo?(dteller)
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 35•9 years ago
|
||
philor, comment 34 seems to be a different bug. Could you file it and assign it to rvitillo?
Assignee | ||
Updated•9 years ago
|
Keywords: checkin-needed
Whiteboard: [leave open]
Reporter | ||
Comment 37•9 years ago
|
||
https://hg.mozilla.org/integration/fx-team/rev/7cc797a9b5dc
Keywords: checkin-needed
Whiteboard: [leave open] → [leave open][fixed-in-fx-team]
https://hg.mozilla.org/mozilla-central/rev/7cc797a9b5dc
Whiteboard: [leave open][fixed-in-fx-team] → [leave open]
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 40•9 years ago
|
||
For some reason, we end up in a case in which the logging throws an error. Let's make it even more careful.
Attachment #8377050 -
Flags: review+
Assignee | ||
Updated•9 years ago
|
Keywords: checkin-needed
Reporter | ||
Comment 41•9 years ago
|
||
https://hg.mozilla.org/integration/b2g-inbound/rev/35dcbed473e7
Keywords: checkin-needed
Reporter | ||
Comment 42•9 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/35dcbed473e7
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 44•9 years ago
|
||
This one is a little clearer. 1. OS.File (and only OS.File) has trouble shutting down; 2. the reset timer is on; 3. we don't know whether there is any message pending, because it's not a DEBUG build This could be a bug in OS.File (most likely) or this could be some OS.File client spamming OS.File with requests. Working on further expanding the logging.
Assignee | ||
Comment 45•9 years ago
|
||
Yet more logging and as a bonus, a test on that logging.
Attachment #8378956 -
Flags: review+
Assignee | ||
Comment 46•9 years ago
|
||
Try: https://tbpl.mozilla.org/?tree=Try&rev=db20878addf8
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 51•9 years ago
|
||
Same try, after pull: https://tbpl.mozilla.org/?tree=Try&rev=4882894cfb96 So here we go for further logging.
Keywords: checkin-needed
Reporter | ||
Comment 52•9 years ago
|
||
https://hg.mozilla.org/integration/fx-team/rev/03545057aff2
Keywords: checkin-needed
Whiteboard: [leave open] → [leave open][fixed-in-fx-team]
Reporter | ||
Comment 53•9 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/03545057aff2
Whiteboard: [leave open][fixed-in-fx-team] → [leave open]
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 56•9 years ago
|
||
So, we now know that some call to OS.File.writeAtomic is placed a few seconds before the deadline and doesn't receive a reply within ~1 minute. On closer inspection of the code, we might have a race condition if the first call to OS.File is placed during a brief window during shutdown, which would explain the issue. Hardening the process, let's see if it works.
Assignee: nobody → dteller
Attachment #8382941 -
Flags: review?(nfroyd)
![]() |
||
Comment 57•9 years ago
|
||
Comment on attachment 8382941 [details] [diff] [review] Hardening OS.File shutdown flag Review of attachment 8382941 [details] [diff] [review]: ----------------------------------------------------------------- Sure, makes sense. ::: toolkit/components/osfile/modules/osfile_async_front.jsm @@ +205,5 @@ > + LOG("OS.File is not available anymore. Request has been rejected."); > + throw new Error("OS.File has been shut down."); > + } > + return code(); > + }.bind(this)); Fat arrow function instead?
Attachment #8382941 -
Flags: review?(nfroyd) → review+
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Updated•9 years ago
|
Keywords: checkin-needed
Reporter | ||
Comment 62•9 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/9a5b7ed8dae4
Keywords: checkin-needed
Comment hidden (Legacy TBPL/Treeherder Robot) |
Reporter | ||
Comment 64•9 years ago
|
||
(In reply to TBPL Robot from comment #63) This was on a push from after comment 62.
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Backed this out in https://hg.mozilla.org/integration/mozilla-inbound/rev/dfd483bbeea6 at Yoric's request.
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 82•9 years ago
|
||
I'm blocked by the lack of information on the file being written. Reworking the debugging information to ensure that we keep this data, without keeping data that is too expensive.
Attachment #8394126 -
Flags: review?(nfroyd)
![]() |
||
Comment 83•9 years ago
|
||
Comment on attachment 8394126 [details] [diff] [review] 4. Yet more detailed logging Review of attachment 8394126 [details] [diff] [review]: ----------------------------------------------------------------- Is there a reason you're turning this on for non-DEBUG builds too? I worry that we're going to be collecting all this information in non-debug (i.e. release) builds and that's going to slow things down. r=me if we can turn this back into mostly DEBUG-only logging before the next uplift. ::: toolkit/components/osfile/modules/osfile_async_front.jsm @@ +139,5 @@ > + * Extract a shortened version of an object, fit for logging. > + * > + * This function returns a copy of the original object in which all > + * long strings, Arrays, TypedArrays, ArrayBuffers are removed and > + * replaced with plceholders. Use this function to sanitize objects Nit: "placeholders" @@ +160,5 @@ > + if (Array.isArray(obj)) { > + if (obj.length > 32) { > + return {"Long array": obj.length}; > + } > + return [summarizeObject(k) for (k of obj)]; Perhaps it would be better to do something like (possibly not real JS): return [summarizeObject(k) for (k of obj.slice(0, Math.min(obj.length, 32)))] + [(obj.length - 32) + " trailing elements omitted"]; WDYT? @@ +203,5 @@ > + Debugging: { > + /** > + * The latest message sent and still waiting for a reply. In DEBUG > + * builds, the entire message is stored, which may be memory-consuming. > + * In non-DEBUG builds, only the method name is stored. This comment is no longer accurate. (Might have been more straightforward if you had made the Debugging-as-object changes, then the summary+message # changes.) @@ +211,5 @@ > + /** > + * The latest reply received, or null if we are waiting for a reply. > + * In DEBUG builds, the entire response is stored, which may be > + * memory-consuming. In non-DEBUG builds, only exceptions and > + * method names are stored. Likewise. @@ +326,3 @@ > reply = error; > isError = true; > + if (error instanceof PromiseWorker.WorkerError) { I guess this refactoring makes this clearer...
Attachment #8394126 -
Flags: review?(nfroyd) → review+
Assignee | ||
Comment 84•9 years ago
|
||
(In reply to Nathan Froyd (:froydnj) from comment #83) > Comment on attachment 8394126 [details] [diff] [review] > 4. Yet more detailed logging > > Review of attachment 8394126 [details] [diff] [review]: > ----------------------------------------------------------------- > > Is there a reason you're turning this on for non-DEBUG builds too? I worry > that we're going to be collecting all this information in non-debug (i.e. > release) builds and that's going to slow things down. For some reason, the error doesn't seem to happen on DEBUG builds, so I need info on non-DEBUG builds. > r=me if we can turn this back into mostly DEBUG-only logging before the next > uplift. > @@ +160,5 @@ > > + if (Array.isArray(obj)) { > > + if (obj.length > 32) { > > + return {"Long array": obj.length}; > > + } > > + return [summarizeObject(k) for (k of obj)]; > > Perhaps it would be better to do something like (possibly not real JS): > > return [summarizeObject(k) for (k of obj.slice(0, Math.min(obj.length, > 32)))] + [(obj.length - 32) + " trailing elements omitted"]; > > WDYT? Actually, I don't think we are ever going to get an array with size > 5 in the current setting, so I wanted to go for something simple. What do you think about keeping such a refinement for later, if needed? > @@ +326,3 @@ > > reply = error; > > isError = true; > > + if (error instanceof PromiseWorker.WorkerError) { > > I guess this refactoring makes this clearer... Logging was making this piece of code too complicated for its own good. I took the opportunity to clean it up a little.
![]() |
||
Comment 85•9 years ago
|
||
(In reply to David Rajchenbach Teller [:Yoric] (please use "needinfo?") from comment #84) > > Is there a reason you're turning this on for non-DEBUG builds too? I worry > > that we're going to be collecting all this information in non-debug (i.e. > > release) builds and that's going to slow things down. > > For some reason, the error doesn't seem to happen on DEBUG builds, so I need > info on non-DEBUG builds. Ah, I see I was misled by the mis-starred failures above. OK, then, yes, we do need this in opt builds. > > @@ +160,5 @@ > > > + if (Array.isArray(obj)) { > > > + if (obj.length > 32) { > > > + return {"Long array": obj.length}; > > > + } > > > + return [summarizeObject(k) for (k of obj)]; > > > > Perhaps it would be better to do something like (possibly not real JS): > > > > return [summarizeObject(k) for (k of obj.slice(0, Math.min(obj.length, > > 32)))] + [(obj.length - 32) + " trailing elements omitted"]; > > > > WDYT? > > Actually, I don't think we are ever going to get an array with size > 5 in > the current setting, so I wanted to go for something simple. What do you > think about keeping such a refinement for later, if needed? That's fine. If you wanted to bump down the max size summarized for this, that'd be OK too.
Assignee | ||
Updated•9 years ago
|
Keywords: checkin-needed
Reporter | ||
Comment 86•9 years ago
|
||
https://hg.mozilla.org/integration/fx-team/rev/18f579c4308e
Keywords: checkin-needed
Whiteboard: [leave open] → [leave open][fixed-in-fx-team]
Assignee | ||
Comment 87•9 years ago
|
||
Try (after quickfix ex->error): https://tbpl.mozilla.org/?tree=Try&rev=3fe41b0017df
Assignee | ||
Comment 88•9 years ago
|
||
Talos xperf Try: https://tbpl.mozilla.org/?tree=Try&rev=bc4d65c55845
Assignee | ||
Comment 89•9 years ago
|
||
Oops, typo.
Attachment #8394126 -
Attachment is obsolete: true
Attachment #8394801 -
Flags: review+
Reporter | ||
Comment 90•9 years ago
|
||
Something in this checkin-needed push caused Win7 talos xperf regressions. It wasn't clear to me which was at fault, so I backed out the entire push. Please verify that your patch wasn't at fault and re-request checkin when ready :) https://hg.mozilla.org/integration/fx-team/rev/6d0a341040a9 https://tbpl.mozilla.org/php/getParsedLog.php?id=36500476&tree=Fx-Team
Whiteboard: [leave open][fixed-in-fx-team] → [leave open]
Reporter | ||
Comment 91•9 years ago
|
||
Try runs look good, thanks :) https://hg.mozilla.org/integration/fx-team/rev/8f768ced9020
Whiteboard: [leave open] → [leave open][fixed-in-fx-team]
https://hg.mozilla.org/mozilla-central/rev/8f768ced9020
Whiteboard: [leave open][fixed-in-fx-team] → [leave open]
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 95•9 years ago
|
||
So, the latest DUP informs us that: - it's not the first message; - a reset is pending at the time of the crash; - one call to writeAtomic remains unanswered. I believe that it's the same issue as the OS.File part of bug 961317. There's a patch over there pending review.
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 101•9 years ago
|
||
Tomcat, could you file a separate bug for this one and Cc bsmedberg and gps? The crash is part of experiments.jsm
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 104•9 years ago
|
||
(In reply to David Rajchenbach Teller [:Yoric] (please use "needinfo?" - I'll be away on April 9th-16th) from comment #101) > Tomcat, could you file a separate bug for this one and Cc bsmedberg and gps? > The crash is part of experiments.jsm yeah sorry, filed bug 996601 for it
Flags: needinfo?(cbook)
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 106•9 years ago
|
||
Wrong bug, honzab, that should be bug 995027. As a side-note, I believe that the current bug was fixed by the patch of bug 961317. Closing. Please reopen if needed.
Status: NEW → RESOLVED
Closed: 9 years ago
Flags: needinfo?(honzab.moz)
Resolution: --- → FIXED
![]() |
||
Comment 107•9 years ago
|
||
I believe you. Not sure that bug popped on me tho. It's hard, too similar ones one by one :)
Flags: needinfo?(honzab.moz)
Reporter | ||
Comment 108•9 years ago
|
||
(In reply to Honza Bambas (:mayhemer) from comment #107) > I believe you. Not sure that bug popped on me tho. It's hard, too similar > ones one by one :) It shows up in the TBPL suggestion list and you can see the Experiments.jsm stuff in the parsed log immediately prior to the failure.
status-b2g-v1.4:
--- → fixed
status-b2g-v2.0:
--- → fixed
status-firefox29:
--- → unaffected
status-firefox30:
--- → fixed
status-firefox31:
--- → fixed
status-firefox-esr24:
--- → unaffected
Whiteboard: [leave open]
Target Milestone: --- → mozilla31
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 111•9 years ago
|
||
(In reply to David Rajchenbach Teller [:Yoric] (please use "needinfo?" - I'll be away on May 8th-12th) from comment #110) > Gijs: Nope, unrelated bug. Thanks! Sorry, meant to star bug 924622, not sure what went wrong there. :-\
Flags: needinfo?(gijskruitbosch+bugs)
Comment 112•9 years ago
|
||
Tweaking summary to avoid future mis-stars.
Summary: Intermittent B2G desktop Shutdown | application crashed [@ mozalloc_abort(char const*)] after "###!!! ABORT: file resource://gre/modules/AsyncShutdown.jsm" → Intermittent B2G desktop Shutdown "###!!! ABORT: file resource://gre/modules/AsyncShutdown.jsm"
You need to log in
before you can comment on or make changes to this bug.
Description
•