Closed
Bug 1259305
Opened 9 years ago
Closed 8 years ago
Intermittent browser_download.js | application crashed [@ mozilla::net::ChannelEventQueue::Resume] after Assertion failure: mSuspendCount > 0, at /home/worker/workspace/build/src/netwerk/ipc/ChannelEventQueue.cpp:64
Categories
(Core :: Networking, defect, P3)
Core
Networking
Tracking
()
RESOLVED
FIXED
mozilla52
People
(Reporter: philor, Assigned: dragana)
References
Details
(Keywords: assertion, intermittent-failure, Whiteboard: [necko-active])
Attachments
(1 file, 3 obsolete files)
4.18 KB,
patch
|
dragana
:
review+
|
Details | Diff | Splinter Review |
Assignee | ||
Comment 2•9 years ago
|
||
I am going to take a look at this, I was making changes in this code recently.
Assignee: nobody → dd.mozilla
Status: NEW → ASSIGNED
Comment 3•9 years ago
|
||
[Tracking Requested - why for this release]:
Updated•9 years ago
|
blocking-b2g: 2.2r? → ---
status-firefox48:
affected → ---
tracking-b2g:
backlog → ---
Flags: a11y-review?
Assignee | ||
Comment 4•9 years ago
|
||
Attachment #8734682 -
Flags: review?(jduell.mcbugs)
Assignee | ||
Comment 5•9 years ago
|
||
![]() |
||
Updated•9 years ago
|
Whiteboard: [necko-active]
Assignee | ||
Updated•9 years ago
|
Attachment #8734682 -
Flags: review?(jduell.mcbugs)
Assignee | ||
Comment 6•9 years ago
|
||
I am asking for a review, but I am open for suggestions. Suspending divert gets complicated :)
a different solution that I was thinking of is to inform nsHttpChannel that it has a HttpChannelParent from the beginning (not only during diversion) and to suspend parent as weel all the time :)
Attachment #8734682 -
Attachment is obsolete: true
Attachment #8736765 -
Flags: review?(jduell.mcbugs)
Assignee | ||
Comment 7•9 years ago
|
||
Assignee | ||
Updated•9 years ago
|
Flags: needinfo?(jduell.mcbugs)
Comment 8•8 years ago
|
||
Bulk assigning P3 to all open intermittent bugs without a priority set in Firefox components per bug 1298978.
Priority: -- → P3
![]() |
||
Comment 9•8 years ago
|
||
Comment on attachment 8736765 [details] [diff] [review]
bug_1259305_v1.patch
Stealing review from jason on his request.
Attachment #8736765 -
Flags: review?(jduell.mcbugs) → review?(honzab.moz)
![]() |
||
Comment 10•8 years ago
|
||
Dragana, I need some background on this patch. Can you please explain:
- cause of this bug
- what the patch exactly does and WHY
I'm missing good comments on rational in both bugzilla and the patch.
Thanks.
Flags: needinfo?(dd.mozilla)
![]() |
||
Comment 11•8 years ago
|
||
Comment on attachment 8736765 [details] [diff] [review]
bug_1259305_v1.patch
The conclusion here is that Dragana is going to find out why this patch has been introduced and if it's still needed. Until then I will drop the r? flag.
Attachment #8736765 -
Flags: review?(honzab.moz)
Assignee | ||
Comment 12•8 years ago
|
||
Before bug 1201170 it was not possible to suspend channel during diversion.
bug 1201170 made possible this, but it has a bug that this patch should fix.
So to suspend OnDatAv. during divert we need to suspend mEventQ at parent. nsHttpChannelParent::suspend is not call during diver but only the nsHttpChannel. Bug 1201170 added this MessageDiversionStart/MessageDiversionStop and mParentChannel->SuspendMessageDiversion();
The bug from the bug 1201170 is that suspend count in nsHttpChannel and HttpChannelParent::mEventQ was not the same. so when divert start we need to suspend mEventQ as many time as nsHttpChannel is suspended minus 1 (this one is "suspend nsHttpChannel during diversion" that is called by HttpChannelParent it elf)
Flags: needinfo?(dd.mozilla)
Assignee | ||
Comment 13•8 years ago
|
||
Attachment #8736765 -
Attachment is obsolete: true
Attachment #8801157 -
Flags: review?(honzab.moz)
Assignee | ||
Comment 14•8 years ago
|
||
![]() |
||
Comment 15•8 years ago
|
||
(In reply to Dragana Damjanovic [:dragana] from comment #14)
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=2d6ef4e2b94acdbefd27576c32efc485ecae334b
Is the test still failing or not with this patch? The try looks not more orange than usual, but please star the runs so that we are sure this 1) fixes the problem 2) doesn't break anything else.
![]() |
||
Comment 16•8 years ago
|
||
Comment on attachment 8801157 [details] [diff] [review]
bug_1259305_v1.patch
Review of attachment 8801157 [details] [diff] [review]:
-----------------------------------------------------------------
In general, I really strongly dislike flags of this type (msuspendedfor this or that..) however, the patch makes somewhat sense (spent an hour studying the code, I'm probably not the right reviewer for this..)
I think only the comments need tuning here.
disclaimer: didn't test locally, nor I'm super expert to the diversion and synthetization code.
::: netwerk/protocol/http/HttpChannelParent.cpp
@@ +577,5 @@
> {
> LOG(("HttpChannelParent::RecvSuspend [this=%p]\n", this));
>
> if (mChannel) {
> + mChannel->Suspend();
can you add a comment why we call Suspend() and not SuspendInternal() here?
@@ +588,5 @@
> {
> LOG(("HttpChannelParent::RecvResume [this=%p]\n", this));
>
> if (mChannel) {
> + mChannel->Resume();
similar
@@ +1361,5 @@
> MOZ_ASSERT(!mDivertingFromChild, "Already suspended for diversion!");
> return NS_ERROR_UNEXPECTED;
> }
>
> + // MessageDiversionStarted call will suspend mEventQ as meny time as the
as many times
@@ +1362,5 @@
> return NS_ERROR_UNEXPECTED;
> }
>
> + // MessageDiversionStarted call will suspend mEventQ as meny time as the
> + // channel has been suspended, so that channel and this queue are sync.
are in sync ?
@@ +1364,5 @@
>
> + // MessageDiversionStarted call will suspend mEventQ as meny time as the
> + // channel has been suspended, so that channel and this queue are sync.
> + // to suspend nsHttpChannel but not to suspend mEventQ wi will call
> + // mChannel->SuspendInternal() otherwise wi will call mChannel->Suspend()
s/wi/we/ ? also, this comment doesn't make much sense to me nor I see the code you are referring to, tho it seems to be the most important part.
@@ +1374,5 @@
> // been called and thus the channel may not be pending. If we've already
> // automatically suspended after synthesizing the response, then we don't
> // need to suspend again here.
> if (!mSuspendAfterSynthesizeResponse) {
> + // Suspend only nsHttpChannel but do not suspend mEventQ.
say why please
@@ +1384,5 @@
> // the response.
> mSuspendedForDiversion = true;
> +
> + // To be consistent when mSuspendAfterSynthesizeResponse suspends the channel
> + // we do not need to suspend mEventQ so wi will resume it.
Is it so that this code cannot execute BEFORE HttpChannelParent::ResponseSynthesized()? Could we assert that with mWillSynthesizeResponse == false if possible? (with some explanatory comment?)
Your comment should be a bit improved. As I understand, when mSuspendAfterSynthesizeResponse is true, the channel is suspended now (better said, was suspended after the synthetization has been done, based on the premise above).
The mSuspendedForDiversion flag is here because mChannel->SuspendInternal() may fail when after onstartrequest notification (mPending on nshttpchannel is false), so we must not call resume() on the httpchannel later in that case, would fail or cause some kind of inconsistancy? (Honestly, I think that adding |NS_ENSURE_TRUE(mIsPending, NS_ERROR_NOT_AVAILABLE);| also to resumeinternal() would do... but that's a differnt bug/task.)
So, you do mEventQ->Resume() here because the event queue has been suspended one more time by call to mChannel->MessageDiversionStarted(this) (which suspends the message queue the same number of times the channel is suspended). But since we do not actually suspend the channel here, we want to lower the number of mEventQ suspension times by one. OK.. makes sense :D
Oh yeah...
Attachment #8801157 -
Flags: review?(honzab.moz) → review+
Assignee | ||
Comment 17•8 years ago
|
||
comments updated.
Attachment #8801157 -
Attachment is obsolete: true
Attachment #8807097 -
Flags: review+
Assignee | ||
Updated•8 years ago
|
Keywords: checkin-needed
Comment 18•8 years ago
|
||
Pushed by cbook@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/3219bfc7b776
If nsHttpChannel has been suspended before divert to parent starts, suspend message diversion. r=mayhemer
Keywords: checkin-needed
Reporter | ||
Comment 19•8 years ago
|
||
bugherder |
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
status-firefox52:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla52
Comment 20•8 years ago
|
||
Is it worth taking this on Aurora for 51? If so, please nominate :)
status-firefox49:
--- → wontfix
status-firefox50:
--- → wontfix
status-firefox51:
--- → affected
Flags: needinfo?(dd.mozilla)
Assignee | ||
Comment 21•8 years ago
|
||
It was not causing any issues til now, because we do not suspend channels that early at the moment. So no need to uplift.
Flags: needinfo?(dd.mozilla)
Updated•8 years ago
|
Comment hidden (Intermittent Failures Robot) |
Comment 23•7 years ago
|
||
The latest error message doesn't matched with the original signature.
@philor can you help create a new one and move the OrangeFactor linkage to that?
INFO - TEST-START | dom/workers/test/serviceworkers/browser_download.js
INFO - GECKO(3190) | JavaScript error: jar:file:///builds/worker/workspace/build/application/firefox/omni.ja!/components/nsHelperAppDlg.js, line 538: TypeError: dialog is null
INFO - TEST-INFO | started process screentopng
INFO - TEST-INFO | screentopng: exit 0
INFO - Buffered messages logged at 17:40:55
INFO - Global property added while loading chrome://browser/content/browser-thumbnails.js: getTopSiteURLs
INFO - Console message: [JavaScript Warning: "Use of nsIFile in content process is deprecated." {file: "resource://gre/modules/FileUtils.jsm" line: 174}]
INFO - Buffered messages logged at 17:40:57
INFO - Console message: [JavaScript Warning: "Use of nsIFile in content process is deprecated." {file: "resource://gre/modules/FileUtils.jsm" line: 174}]
INFO - Buffered messages logged at 17:40:58
INFO - Console message: [JavaScript Warning: "Unknown pseudo-class or pseudo-element ‘-moz-tree-line’. Ruleset ignored due to bad selector." {file: "chrome://global/content/xul.css" line: 629}]
INFO - Console message: [JavaScript Error: "The character encoding of the HTML document was not declared. The document will render with garbled text in some browser configurations if the document contains characters from outside the US-ASCII range. The character encoding of the page must be declared in the document or in the transfer protocol." {file: "http://mochi.test:8888/browser/dom/workers/test/serviceworkers/download/window.html" line: 0}]
INFO - Buffered messages logged at 17:41:01
INFO - Console message: [JavaScript Error: "TypeError: dialog is null" {file: "jar:file:///builds/worker/workspace/build/application/firefox/omni.ja!/components/nsHelperAppDlg.js" line: 538}]
INFO - @jar:file:///builds/worker/workspace/build/application/firefox/omni.ja!/components/nsHelperAppDlg.js:538:1
INFO -
INFO - Buffered messages finished
INFO - TEST-UNEXPECTED-FAIL | dom/workers/test/serviceworkers/browser_download.js | Test timed out -
Flags: needinfo?(philringnalda)
Reporter | ||
Comment 24•7 years ago
|
||
One misstarred timeout on a try push, which might have been what he relanded or might have been a partial change, or might have been what was backed out? No thanks, I'll wait for it to happen somewhere other than try before filing that.
Flags: needinfo?(philringnalda)
You need to log in
before you can comment on or make changes to this bug.
Description
•