Closed
Bug 1158189
Opened 10 years ago
Closed 6 years ago
shutdownhang in mozilla::net::nsHttpConnectionMgr::Shutdown()
Categories
(Core :: Networking: HTTP, defect, P1)
RESOLVED
WONTFIX
Release | Tracking | Status |
---|---|---|
firefox37 | --- | wontfix |
firefox38 | + | wontfix |
firefox39 | + | wontfix |
firefox40 | + | wontfix |
firefox41 | - | wontfix |
firefox42 | - | wontfix |
firefox43 | --- | wontfix |
firefox44 | --- | wontfix |
firefox46 | + | wontfix |
firefox47 | --- | wontfix |
firefox48 | --- | wontfix |
firefox49 | --- | wontfix |
firefox50 | --- | affected |
firefox51 | --- | affected |
firefox52 | --- | wontfix |
firefox55 | --- | affected |
People
(Reporter: mayhemer, Assigned: dragana)
References
Details
(Keywords: crash, topcrash-win, Whiteboard: [necko-active])
Crash Data
Attachments
(3 files, 1 obsolete file)
[Tracking Requested - why for this release]:
+++ This bug was initially created as a clone of Bug #1124880 +++
This bug was filed from the Socorro interface and is
report bp-35103a28-d12f-4999-9066-4f6ed2150122.
=============================================================
This is probably what a good part of bug 1103833 is mutating to after the fix of bug 1104317 - at least from the looks of early 36.0b2 data where this constitutes >1% of all crashes.
Find stats and more of those reports at https://crash-stats.mozilla.com/report/list?signature=shutdownhang+%7C+WaitForSingleObjectEx+%7C+WaitForSingleObject+%7C+PR_Wait+%7C+nsThread%3A%3AProcessNextEvent%28bool%2C+bool%2A%29+%7C+NS_ProcessNextEvent%28nsIThread%2A%2C+bool%29+%7C+mozilla%3A%3Anet%3A%3AnsHttpConnectionMgr%3A%3AShutdown%28%29
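For anyone rebuilding such a link by hand: the query string above is just the crash signature, URL-encoded. A minimal sketch, assuming Python's standard `urllib.parse.quote_plus`; the helper name `report_list_url` is made up for illustration.

```python
from urllib.parse import quote_plus

# Signature text as it appears (decoded) in the crash-stats link of this report.
SIGNATURE = (
    "shutdownhang | WaitForSingleObjectEx | WaitForSingleObject | PR_Wait | "
    "nsThread::ProcessNextEvent(bool, bool*) | "
    "NS_ProcessNextEvent(nsIThread*, bool) | "
    "mozilla::net::nsHttpConnectionMgr::Shutdown()"
)


def report_list_url(signature: str) -> str:
    """Hypothetical helper: quote_plus turns spaces into '+' and escapes '|', ':', '(' and so on."""
    return "https://crash-stats.mozilla.com/report/list?signature=" + quote_plus(signature)


print(report_list_url(SIGNATURE))
```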
Comment 1•10 years ago
Tracking topcrash. Note that there are only 2 Betas left in the 38 cycle. If this is going to be fixed in 38, we need a fix really soon.
Comment 2•10 years ago
I think this is not purely a UDP thing now, so making summary more general.
Summary: shutdownhang in mozilla::net::nsHttpConnectionMgr::Shutdown() on pending UDP socket close → shutdownhang in mozilla::net::nsHttpConnectionMgr::Shutdown()
Comment 3•10 years ago
David, I know you love complex bugs. Do you have an idea for this one? Thanks
Flags: needinfo?(dmajor)
It is not clear to me what this bug is about. Comment 0 is just a copy of bug 1124880 comment 0. There is no text indicating why that bug was cloned. Are these crashes different somehow?
Flags: needinfo?(dmajor)
Comment 5•10 years ago
(In reply to David Major [:dmajor] from comment #4)
> It is not clear to me what this bug is about. Comment 0 is just a copy of
> bug 1124880 comment 0. There is no text indicating why that bug was cloned.
> Are these crashes different somehow?
In https://bugzilla.mozilla.org/show_bug.cgi?id=1124880#c72 it looks like Honza wanted to close out that bug to make it clear that his first batch of fixes landed and had some effect.
I think it is probably more useful to look at the dependent bugs from here, and address them individually, since you all have already done some work to isolate different causes for the shutdownhang signatures we're still seeing.
Comment 6•10 years ago
Too late for 38
Reporter
Comment 7•10 years ago
(In reply to Liz Henry (:lizzard) from comment #5)
> (In reply to David Major [:dmajor] from comment #4)
> > It is not clear to me what this bug is about. Comment 0 is just a copy of
> > bug 1124880 comment 0. There is no text indicating why that bug was cloned.
> > Are these crashes different somehow?
>
> In https://bugzilla.mozilla.org/show_bug.cgi?id=1124880#c72 it looks like
> Honza wanted to close out that bug to make it clear that his first batch of
> fixes landed and had some effect.
>
> I think it is probably more useful to look at the dependent bugs from here,
> and address them individually, since you all have already done some work to
> isolate different causes for the shutdownhang signatures we're still seeing.
Exactly. Thanks.
I have had this problem for a long time; it is likely related to HTTPS. When I can't navigate to an HTTPS site (like GitHub), I can still navigate to HTTP sites. If I then close Firefox, a firefox.exe process stays in the background, and after a long wait the crash happens.
Comment 9•9 years ago
We could still take a patch for this for 39 if it lands soon. Otherwise wontfix until we are able to disentangle more shutdownhangs from the pack.
status-firefox41:
--- → ?
tracking-firefox41:
--- → +
Comment 10•9 years ago
Wontfix for 39. Looks like we have made some progress on fixing a bunch of the dependent bugs!
Comment 12•9 years ago
This is now wontfix for 40. I have flipped tracking back for 41 as there are still a substantial number of crash reports. We really need to figure out how to unblock this bug.
status-firefox42:
--- → affected
tracking-firefox42:
--- → +
Comment 13•9 years ago
(In reply to Lawrence Mandel [:lmandel] (use needinfo) from comment #12)
> We really need to figure out
> how to unblock this bug.
This bug has 2 dependencies that are making (some) progress, and I hope they help the incident rate of this signature.
Similar to Sylvester and Liz's comments, this bug does not seem actionable by itself so untracking it. I have pinged folks to get the patch in bug 1152046 reviewed soon so it can be uplifted to Beta41.
Updated•9 years ago
Crash Signature: , bool) | mozilla::net::nsHttpConnectionMgr::Shutdown()]
[@ shutdownhang | ntdll.dll@0x3c6bc] → , bool) | mozilla::net::nsHttpConnectionMgr::Shutdown()]
[@ shutdownhang | WaitForSingleObjectEx | WaitForSingleObject | PR_Wait | nsThread::ProcessNextEvent | NS_ProcessNextEvent | mozilla::net::nsHttpConnectionMgr::Shutdown]
[@ shutdownhang | ntdll.dl…
Updated•9 years ago
status-firefox43:
--- → ?
status-firefox44:
--- → ?
Comment 16•9 years ago
I uninstalled Hide My IP without first disabling the proxy in Hide My IP. I had to disable the proxy in Internet Explorer.
Crash report
https://crash-stats.mozilla.com/report/index/ea00c1f4-f6b0-4ea7-8039-7fe5e2151008
Comment 17•9 years ago
1. Run Firefox and Hide My IP.
2. Change the IP in this program.
3. Without disabling the proxy, uninstall Hide My IP. Firefox is still running and cannot connect to the internet properly, because Hide My IP is uninstalled.
4. Disable the proxy globally, i.e. disable the proxy in Internet Explorer (Firefox is still running).
5. Click Shutdown in the Australis hamburger menu.
6. Firefox crashes with crash report:
https://crash-stats.mozilla.com/report/index/ea00c1f4-f6b0-4ea7-8039-7fe5e2151008
Comment 18•9 years ago
daniel - you're having fun with the proxy code right now and comment 17 looks like a str worth chasing..
Updated•9 years ago
Crash Signature: ntdll.dll@0x3c6bc] → ntdll.dll@0x3c6bc]
[@ shutdownhang | WaitForSingleObjectEx | WaitForSingleObject | PR_Wait | mozilla::ReentrantMonitor::Wait | nsThread::ProcessNextEvent | NS_ProcessNextEvent | mozilla::net::nsHttpConnectionMgr::Shutdown]
[@ shutdownhang | WaitForSing…
Updated•9 years ago
Crash Signature: WaitForSingleObjectEx | PR_Wait | mozilla::ReentrantMonitor::Wait | nsThread::ProcessNextEvent | NS_ProcessNextEvent | mozilla::net::nsHttpConnectionMgr::Shutdown] → WaitForSingleObjectEx | PR_Wait | mozilla::ReentrantMonitor::Wait | nsThread::ProcessNextEvent | NS_ProcessNextEvent | mozilla::net::nsHttpConnectionMgr::Shutdown]
[@ shutdownhang | WaitForSingleObjectEx | WaitForSingleObject | PR_WaitCondVar | nsThread…
Updated•9 years ago
Crash Signature: nsThread::ProcessNextEvent | mozilla::net::nsHttpConnectionMgr::Shutdown] → nsThread::ProcessNextEvent | mozilla::net::nsHttpConnectionMgr::Shutdown]
[@ shutdownhang | WaitForSingleObjectEx | WaitForSingleObject | PR_WaitCondVar | nsThread::ProcessNextEvent | NS_ProcessNextEvent | mozilla::net::nsHttpConnectionMgr::Shutdown ]
(In reply to Patrick McManus [:mcmanus] from comment #18)
> daniel - you're having fun with the proxy code right now and comment 17
> looks like a str worth chasing..
Patrick, not sure which Daniel this was directed to. Could you please ni? This crash signature is at the #7 spot on FF44 and is a long-standing issue, so any help would be awesome.
Flags: needinfo?(mcmanus)
Updated•9 years ago
Flags: needinfo?(mcmanus) → needinfo?(daniel)
Comment 20•9 years ago
So "Hide MyIP" creates a (socks?) proxy? And when you've removed it, Firefox can't connect to it anymore, and when you then shut down Firefox (after that unsuccessful proxy use) it crashes?
The step #4 in comment #17 is interesting though if it is required. I'll investigate this.
Flags: needinfo?(daniel)
Updated•9 years ago
Assignee: nobody → daniel
Whiteboard: [necko-active]
Doug, Jason, Daniel, Fx44 stability has been an ongoing issue. This crash is ranked #3 on crash-stats for ~2 weeks and I don't see any activity here. Could you please investigate this at a high priority? I will work with you to get a patch uplifted to Beta44 (if safe) asap. Appreciate your help in getting a high quality release out there!
Flags: needinfo?(jduell.mcbugs)
Flags: needinfo?(dougt)
Flags: needinfo?(daniel)
Comment 22•9 years ago
I've spent hours trying the STR from comment #17, and I've run various kinds of proxies and tried with them present and shut down and Firefox in various states and conditions related to that, but I simply cannot reproduce this on my Windows 7 machine.
I now suspect this bug is rather one of the more notorious and hard-to-catch shutdownhang problems of recent times, not necessarily strictly related to anything proxy.
Flags: needinfo?(daniel)
Comment 23•9 years ago
It stinks, but at the moment we have no idea how to fix this bug, assuming it's another instance of our socket-hangs-at-close-during-shutdown-on-windows problems. We're trying to reach out to any contacts we have at Microsoft to see if they have any suggestions.
Flags: needinfo?(jduell.mcbugs)
Flags: needinfo?(dougt)
Comment 24•9 years ago
Here's a quick breakdown of the signatures involved. Unfortunately there's not much here to help. Looks like a recent change altered the signature in 44, but the problem didn't improve with those changes.
[@ shutdownhang | WaitForSingleObjectEx | WaitForSingleObject | PR_WaitCondVar | nsThread::ProcessNextEvent | mozilla::net::nsHttpConnectionMgr::Shutdown ]
https://crash-stats.mozilla.com/report/list?signature=shutdownhang%20|%20WaitForSingleObjectEx%20|%20WaitForSingleObject%20|%20PR_WaitCondVar%20|%20nsThread%3A%3AProcessNextEvent%20|%20mozilla%3A%3Anet%3A%3AnsHttpConnectionMgr%3A%3AShutdown#tab-sigsummary
- affects 44 and higher, second worst offender (3527), bug 1152046.
[@ shutdownhang | WaitForSingleObjectEx | WaitForSingleObject | PR_Wait | nsThread::ProcessNextEvent | NS_ProcessNextEvent | mozilla::net::nsHttpConnectionMgr::Shutdown ]
https://crash-stats.mozilla.com/report/list?signature=shutdownhang%20|%20WaitForSingleObjectEx%20|%20WaitForSingleObject%20|%20PR_Wait%20|%20nsThread%3A%3AProcessNextEvent%20|%20NS_ProcessNextEvent%20|%20mozilla%3A%3Anet%3A%3AnsHttpConnectionMgr%3A%3AShutdown
- affects 43 and lower, worst offender by far (5932), bug 1152046.
[@ shutdownhang | WaitForSingleObjectEx | WaitForSingleObject | PR_WaitCondVar | nsThread::ProcessNextEvent | NS_ProcessNextEvent | mozilla::net::nsHttpConnectionMgr::Shutdown ]
https://crash-stats.mozilla.com/report/list?signature=shutdownhang%20|%20WaitForSingleObjectEx%20|%20WaitForSingleObject%20|%20PR_WaitCondVar%20|%20nsThread%3A%3AProcessNextEvent%20|%20NS_ProcessNextEvent%20|%20mozilla%3A%3Anet%3A%3AnsHttpConnectionMgr%3A%3AShutdown#tab-sigsummary
- small numbers (88)
[@ shutdownhang | WaitForSingleObjectEx | PR_Wait | mozilla::ReentrantMonitor::Wait | nsThread::ProcessNextEvent | NS_ProcessNextEvent | mozilla::net::nsHttpConnectionMgr::Shutdown ]
https://crash-stats.mozilla.com/report/list?signature=shutdownhang%20|%20WaitForSingleObjectEx%20|%20PR_Wait%20|%20mozilla%3A%3AReentrantMonitor%3A%3AWait%20|%20nsThread%3A%3AProcessNextEvent%20|%20NS_ProcessNextEvent%20|%20mozilla%3A%3Anet%3A%3AnsHttpConnectionMgr%3A%3AShutdown
- small numbers (7)
[@ shutdownhang | WaitForSingleObjectEx | WaitForSingleObject | PR_Wait | mozilla::ReentrantMonitor::Wait | nsThread::ProcessNextEvent | NS_ProcessNextEvent | mozilla::net::nsHttpConnectionMgr::Shutdown ]
https://crash-stats.mozilla.com/report/list?signature=shutdownhang%20|%20WaitForSingleObjectEx%20|%20WaitForSingleObject%20|%20PR_Wait%20|%20mozilla%3A%3AReentrantMonitor%3A%3AWait%20|%20nsThread%3A%3AProcessNextEvent%20|%20NS_ProcessNextEvent%20|%20mozilla%3A%3Anet%3A%3AnsHttpConnectionMgr%3A%3AShutdown
- affects 42 and lower, low numbers (465), mostly bug 1152046.
Comment 25•9 years ago
Here's a skim of recent hang stacks for the worst signature. The script looks at threads 4 through 8 and prints out each thread that has nss3.dll in it. Usually one of these threads will be the socket thread.
There's an even mixture of reports here, about half with 3rd party dlls (lsps?) and the other half with nothing but browser frames. I don't think it's fair to blame everything on the 3rd party code due to the mix here.
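The script itself is not attached in this comment, so the following is only a sketch of the filter it describes: keep threads 4 through 8 and print those with an nss3.dll frame. The report layout (a dict of thread index to "module!function" strings) and the sample frames are assumptions made for illustration, not real crash data.

```python
from typing import Dict, List


def threads_with_module(
    threads: Dict[int, List[str]],
    module: str = "nss3.dll",
    first: int = 4,
    last: int = 8,
) -> Dict[int, List[str]]:
    """Return threads in [first, last] whose stack has at least one frame from `module`."""
    return {
        idx: frames
        for idx, frames in sorted(threads.items())
        if first <= idx <= last and any(f.startswith(module + "!") for f in frames)
    }


# Hypothetical report: thread index -> frames written as "module!function".
example_report = {
    0: ["xul.dll!mozilla::net::nsHttpConnectionMgr::Shutdown"],
    5: ["nss3.dll!PR_Poll", "xul.dll!nsSocketTransportService::Run"],
    7: ["mozjs.dll!js::HelperThread::threadLoop"],
    9: ["nss3.dll!PR_Wait"],  # has nss3.dll but is outside the 4..8 window
}

for idx, frames in threads_with_module(example_report).items():
    print(idx, frames)
```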
Comment 26•9 years ago
One thing I noticed: the two main signatures generally don't occur with e10s enabled. I wonder if the e10s versions of these hangs show up under a different signature, or maybe e10s addresses the problem in some way.
https://crash-stats.mozilla.com/search/?product=Firefox&signature=%3Dshutdownhang+|+WaitForSingleObjectEx+|+WaitForSingleObject+|+PR_Wait+|+nsThread%3A%3AProcessNextEvent+|+NS_ProcessNextEvent+|+mozilla%3A%3Anet%3A%3AnsHttpConnectionMgr%3A%3AShutdown&dom_ipc_enabled=!__null__&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-signature
https://crash-stats.mozilla.com/search/?product=Firefox&signature=%3Dshutdownhang+|+WaitForSingleObjectEx+|+WaitForSingleObject+|+PR_WaitCondVar+|+nsThread%3A%3AProcessNextEvent+|+mozilla%3A%3Anet%3A%3AnsHttpConnectionMgr%3A%3AShutdown&dom_ipc_enabled=!__null__&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-signature
Assignee
Comment 27•9 years ago
I think we are looking for a bug in the wrong place.
Looking at report:
https://crash-stats.mozilla.com/report/index/930f640b-1743-4550-bd1a-e9c9d2160112#allthreads
ClosingService is waiting on a CondVar. That means Shutdown has not been called up to that point!
A call to ClosingService::Shutdown sets the variable mShutdown and notifies the CondVar, which should actually shut down the thread.
So Shutdown in IOService is not called:
http://mxr.mozilla.org/mozilla-central/source/netwerk/base/nsIOService.cpp#1079
and probably also nsSocketTransportService::Shutdown
I will look into this. (That it works better with e10s may just be because those computers are faster; it would also fit this theory, because the main thread is less busy.)
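The pattern described here (a worker parked on a condition variable until Shutdown sets a flag and notifies) can be sketched as below. This is a toy model in Python, not the actual ClosingService code; the class name and methods are made up, and `_shutdown` only plays the role of mShutdown.

```python
import threading


class ClosingServiceSketch:
    """Toy worker that sleeps on a condition variable until shutdown is requested."""

    def __init__(self) -> None:
        self._cond = threading.Condition()
        self._shutdown = False  # plays the role of mShutdown
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def _run(self) -> None:
        with self._cond:
            # If nobody ever calls shutdown(), the thread stays parked here,
            # which is the state the crash report shows.
            while not self._shutdown:
                self._cond.wait()

    def shutdown(self) -> None:
        with self._cond:
            self._shutdown = True
            self._cond.notify_all()
        self._thread.join()

    def is_running(self) -> bool:
        return self._thread.is_alive()


service = ClosingServiceSketch()
running_before = service.is_running()  # still alive: no shutdown() yet
service.shutdown()
running_after = service.is_running()
print(running_before, running_after)
```

The point of the sketch is only that a thread seen waiting on the condition variable implies the shutdown call has not happened yet.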
Assignee
Comment 28•9 years ago
A quick test that I made:
We try to check whether we are in shutdown, so that we do not make a new connection, like:
http://mxr.mozilla.org/mozilla-central/source/netwerk/base/nsSocketTransport2.cpp#1224
But at the point these shutdown crashes happen, IOService does not know that we are in shutdown.
Assignee
Comment 29•9 years ago
Another thing I noticed:
Before the NS_XPCOM_SHUTDOWN_OBSERVER notification is sent (this is when IOService is set to shutdown), the kProfileChangeNetTeardownTopic notification is sent, and this notification shuts down SocketTransportService and nsHttpConnectionMgr. So by the time IOService is set to be in shutdown, SocketTransportService and nsHttpConnectionMgr are already shut down, and gIOService->IsOffline() is not really useful.
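A small model of that ordering, assuming the two topics are delivered in the order described ("profile-change-net-teardown" first, "xpcom-shutdown" second). The class and attribute names are hypothetical; only the topic strings come from the real notifications.

```python
NET_TEARDOWN = "profile-change-net-teardown"  # kProfileChangeNetTeardownTopic
XPCOM_SHUTDOWN = "xpcom-shutdown"             # NS_XPCOM_SHUTDOWN_OBSERVER_ID


class IOServiceSketch:
    """Hypothetical stand-in that records what the shutdown flag looked like at teardown."""

    def __init__(self) -> None:
        self.in_shutdown = False
        self.flag_seen_at_teardown = None

    def observe(self, topic: str) -> None:
        if topic == NET_TEARDOWN:
            # The socket transport service and connection manager shut down here,
            # but the shutdown flag has not been set yet.
            self.flag_seen_at_teardown = self.in_shutdown
        elif topic == XPCOM_SHUTDOWN:
            self.in_shutdown = True


io = IOServiceSketch()
for topic in (NET_TEARDOWN, XPCOM_SHUTDOWN):
    io.observe(topic)
print(io.flag_seen_at_teardown, io.in_shutdown)
```

Any check of the shutdown flag made during net teardown therefore sees "not shutting down", which is why such a check does not help at that point.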
Comment 30•9 years ago
Removed all the JS helper stacks that are sitting in a CondVar and tend to land around threads 7 and 8.
Attachment #8706705 -
Attachment is obsolete: true
Comment 31•9 years ago
I'm really at a loss here, and since I can't find any proxy correlation, I feel I'd better unassign myself from this.
Assignee: daniel → nobody
Updated•9 years ago
Assignee: nobody → dd.mozilla
Assignee
Comment 32•9 years ago
Can you still reproduce this?
May I ask you to make an HTTP log and attach it here or send it to me via e-mail?
https://developer.mozilla.org/en-US/docs/Mozilla/Debugging/HTTP_logging
Please change:
set NSPR_LOG_MODULES=timestamp,nsHttp:5,nsSocketTransport:5,nsStreamPump:5,nsHostResolver:5,io:5,sNotifyAddr:5
Thanks!
Flags: needinfo?(mkdante381)
Assignee
Comment 34•9 years ago
So I had a look at some comments people left, and I noticed that some of them mention Facebook. So I opened 4-5 of them, and they all have Http2Stream::WriteSegments or mozilla::net::Http2Session::CleanupStream on the sockettransport thread stack.
So I searched for crashes with facebook in a comment:
https://crash-stats.mozilla.com/search/?user_comments=facebook&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-signature
and
https://crash-stats.mozilla.com/signature/?user_comments=facebook&signature=shutdownhang+|+WaitForSingleObjectEx+|+WaitForSingleObject+|+PR_WaitCondVar+|+nsThread%3A%3AProcessNextEvent+|+mozilla%3A%3Anet%3A%3AnsHttpConnectionMgr%3A%3AShutdown&_columns=date&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=reason&_columns=address&page=2
and I looked at a couple of them; all have the same socketTransport thread stack.
I think we have a problem there:
https://crash-stats.mozilla.com/report/index/e7cf3937-9aee-47c9-a280-8b8132160204#allthreads
https://crash-stats.mozilla.com/report/index/4742a98c-2315-41a4-9721-686092160205#allthreads
...
Maybe the sockettransport thread is hanging before the user decides to shut down Firefox....
Comment 35•9 years ago
this is interesting.. thanks! I'm going to offer a partial theory and patch in a few minutes.. but more crash stats first:
https://crash-stats.mozilla.com/report/index/206ddb38-8b93-4658-9ae7-6d5dd2160208#allthreads
https://crash-stats.mozilla.com/report/index/29dd158a-cf5a-45ea-8263-c18f92160208#allthreads
https://crash-stats.mozilla.com/report/index/e7152983-4ace-4d5d-b915-d58a82160208#allthreads
and this one is interesting because it's from gecko 47 and contains 1241906, so we know that won't fix it
https://crash-stats.mozilla.com/report/index/b6075322-659b-4d92-b28b-a30202160208#allthreads
https://crash-stats.mozilla.com/report/list?range_unit=days&range_value=28&signature=shutdownhang+|+WaitForSingleObjectEx+|+WaitForSingleObject+|+PR_WaitCondVar+|+nsThread%3A%3AProcessNextEvent+|+mozilla%3A%3Anet%3A%3AnsHttpConnectionMgr%3A%3AShutdown#tab-reports
Comment 36•9 years ago
All of these are on the stack of onsocketreadable() in http2 code. Now facebook has gone to h2 in the last couple of weeks (mid jan afaict), so it's not surprising if there was a latent bug in there and they've pushed the crash-stats related to it way up. That's a recurring 3 am dream for me.
Lots of things are possible, but I think the most plausible is that the onsocketreadable loop http://hg.mozilla.org/mozilla-central/annotate/76733110704b/netwerk/protocol/http/nsHttpConnection.cpp#l1793 isn't terminating during shutdown. Sure, it's possible that some call inside the loop is spinning or blocked, but all of the stacks in the last few comments are actually different beyond onsocketreadable(), which suggests to me that's the level of the loop.
That code relies on seeing a state variable set by the IO.. perhaps that's not going well because of shutdown.. perhaps an interaction with NSS and shutdown. I'm not really sure.
But it seems to me that we shouldn't be looping if the shutdown var is set, right? We do need to loop in the general case - there are some things in the state machine that need to be driven to completion and won't necessarily get new I/O events to make that happen.. but in the case of shutdown we can clearly bail out.
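The idea of bailing out of the loop once the shutdown variable is set can be sketched as follows. This is a Python illustration of the control flow only, not the nsHttpConnection code; `drain_readable`, `read_step`, and `is_shutting_down` are hypothetical names, and the iteration cap exists only so the sketch itself can never hang.

```python
def drain_readable(read_step, is_shutting_down, max_iterations=1_000_000):
    """Call read_step() until it reports no more work, or until shutdown is flagged."""
    iterations = 0
    while iterations < max_iterations:
        if is_shutting_down():
            return "shutdown", iterations  # bail out instead of driving the state machine
        if not read_step():
            return "done", iterations      # normal completion
        iterations += 1
    return "limit", iterations


# A state machine that never reports completion, as in the suspected hang.
calls = {"n": 0}


def stuck_step():
    calls["n"] += 1
    return True


# Shutdown is flagged once the step has run three times.
result = drain_readable(stuck_step, lambda: calls["n"] >= 3)
print(result)
```

In the general case the loop still runs to completion; the extra check only changes behavior once shutdown has started.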
we seem to usually get 1 to 3 on nightly, so it would seem pretty easy to confirm it helps within a few days of landing
https://crash-stats.mozilla.com/signature/?date=%3E%3D2016-01-11T21%3A42%3A28.392440&version=47.0a1&signature=shutdownhang+|+WaitForSingleObjectEx+|+WaitForSingleObject+|+PR_WaitCondVar+|+nsThread%3A%3AProcessNextEvent+|+mozilla%3A%3Anet%3A%3AnsHttpConnectionMgr%3A%3AShutdown&_columns=date&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=reason&_columns=address&page=1
Updated•9 years ago
Whiteboard: [necko-active] → [necko-active][spdy]
Assignee
Comment 37•9 years ago
Side note, can be a lead:
moving nsHttpConnectionMgr::Shutdown, bug 1238910, made this crash happen more often; that is one of the reasons I want to revert that change.
Assignee
Comment 38•9 years ago
I know we need to loop in onsocketreadable, but can we add some control so that it stops after some time? I do not care if it is a long time, like a second, just to be sure that we are not getting very long loops there.
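A time cap of the kind suggested here could look like the sketch below. It is an illustration in Python under stated assumptions: `loop_with_budget` and `FakeClock` are made-up names, the one-second default simply mirrors the figure mentioned in the comment, and the clock is injectable so the example runs deterministically.

```python
import time


def loop_with_budget(step, budget_seconds=1.0, clock=time.monotonic):
    """Run step() until it returns False or the time budget is used up.

    Returns True if the loop finished on its own, False if it was cut off.
    """
    deadline = clock() + budget_seconds
    while step():
        if clock() >= deadline:
            return False
    return True


class FakeClock:
    """Deterministic clock for the example: every call advances time by `tick` seconds."""

    def __init__(self, tick):
        self.now = 0.0
        self.tick = tick

    def __call__(self):
        self.now += self.tick
        return self.now


# A step that never finishes is cut off once the budget is exhausted.
gave_up = not loop_with_budget(lambda: True, budget_seconds=1.0, clock=FakeClock(0.25))

# A step that finishes quickly is unaffected by the budget.
steps = iter([True, True, False])
completed = loop_with_budget(lambda: next(steps))
print(gave_up, completed)
```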
Assignee
Comment 39•9 years ago
(In reply to Dragana Damjanovic [:dragana] from comment #37)
> side note, can be a lead:
>
> moving nsHttpConnectionMgr::Shutdown, bug 1238910, made this crash happening
> more often that is one of the reason I want to revert that change.
actually from Patrick's comment #36 this is probably because of facebook and maybe not bug 1238910, let's see
Comment 40•9 years ago
(In reply to Dragana Damjanovic [:dragana] from comment #39)
> (In reply to Dragana Damjanovic [:dragana] from comment #37)
> > side note, can be a lead:
> >
> > moving nsHttpConnectionMgr::Shutdown, bug 1238910, made this crash happening
> > more often that is one of the reason I want to revert that change.
>
> actually from Patrick's comment #36 this is probably because of facebook and
> maybe not bug 1238910, let's see
1238910 was only ever on firefox 46, so we ought to be able to figure it out by seeing which channels were affected (i.e. they all use facebook, but the patch was only on one of them..)
Comment 41•9 years ago
we'll give this a try over in 1246778. (ironically, try-server is down atm)
Assignee
Comment 42•9 years ago
Interesting comment:
https://crash-stats.mozilla.com/report/index/dcd2214c-0b33-4dbe-821c-3e7402160207#allthreads
Maybe we are hanging in the pipe.
Assignee
Comment 43•9 years ago
Assignee
Comment 44•9 years ago
Some comments mention that it is happening more often since the last update. Patrick, do you have an idea what has changed in 44? Or maybe it is just facebook that switched to h2 :)
Assignee
Comment 45•9 years ago
From some comments it looks like Firefox behaved as if it did not have network connectivity, probably because the sockettransport thread was blocked; the user then wanted to shut ff down, and this crash happened....
I can reproduce this on Windows 7 SP1, Nvidia Quadro 600. Simple STR - start Firefox, about:support, shutdown. This is on 44 release, so no E10S. Or, at least, I believe it's the same problem - I don't get a crash reporter, but firefox.exe lingers for 15-20 seconds after the window disappears.
Let me know if there is anything I can help with. I haven't collected the log, I'll do that tomorrow.
Assignee
Comment 47•9 years ago
Can you attach an HTTP log:
https://developer.mozilla.org/en-US/docs/Mozilla/Debugging/HTTP_logging
Thanks!
Flags: needinfo?(milan)
In the meantime, since :dougt mentioned a possible security related problem - on this same machine, on startup, I sometimes fail to start and get "failed to get DNS service" in nsIOService.cpp:208, followed by an out of memory error from nsScriptSecurityManager.cpp:1351 and a MOZ_CRASH.
Comment 49•9 years ago
I'm interested in a log of a shutdown hang.. but a couple things make me think you're talking about things unrelated to this bug
1] comment 48 sounds a lot more like the DLL problems of 1243098 than our classic necko problems.. and we know your machine is subject to those problems.. or as you say maybe it's a security problem. unfortunately I don't think those are the issue we're hunting in this bug.
2] the 15-20 second comment in comment 46.. the circumstance of interest here is definitely a deadlock.. so if it resolved itself (even after a long time) it's probably something else.
but like I said, the log would still be interesting :)
It's possible this is a different problem. My stack has WaitForMultipleObjects on the top, and seems to have std::locale::~locale() in it, with the first mention of our code in crtExitProcess() call.
OK, log coming up.
Flags: needinfo?(milan)
Assignee
Comment 52•9 years ago
(In reply to Milan Sreckovic [:milan] from comment #51)
> Created attachment 8717973 [details]
> About a 15 second delay on shutdown with this log
Thanks!
This is not a necko problem:
nsHttpHandler receives the notification at:
2016-02-10 17:45:13.815000 UTC - 0[811140]: nsHttpHandler::Observe [topic="profile-change-net-teardown"]
and necko is shut down and nsHttpConnectionMgr destroyed at:
2016-02-10 17:45:14.158000 UTC - 0[811140]: Destroying nsHttpConnectionMgr @dba25b0
There are no stalls on the socketTransport thread.
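To put a number on the gap between those two log lines, the timestamps can be subtracted directly. A small sketch using only the two lines quoted above; `parse_timestamp` is a helper written for this example and assumes the timestamp always precedes " UTC".

```python
from datetime import datetime

LOG_LINES = [
    '2016-02-10 17:45:13.815000 UTC - 0[811140]: nsHttpHandler::Observe [topic="profile-change-net-teardown"]',
    "2016-02-10 17:45:14.158000 UTC - 0[811140]: Destroying nsHttpConnectionMgr @dba25b0",
]


def parse_timestamp(line: str) -> datetime:
    """Parse the leading 'YYYY-MM-DD HH:MM:SS.ffffff' part of a log line."""
    stamp = line.split(" UTC", 1)[0]
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f")


elapsed = parse_timestamp(LOG_LINES[1]) - parse_timestamp(LOG_LINES[0])
print(elapsed.total_seconds())  # 0.343
```

About a third of a second passes between the teardown notification and the connection manager being destroyed, which is far short of the roughly 15 second delay reported for this log.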
Comment 53•9 years ago
this is a log i've received from a user from a machine with the same shutdown hang signature...
Flags: needinfo?(dd.mozilla)
Assignee
Comment 54•9 years ago
(In reply to philipp from comment #53)
> Created attachment 8719302 [details]
> Vista log.zip
>
> this is a log i've received from a user from a machine with the same
> shutdown hang signature...
Thanks a lot!
This one is a really interesting log. I had a short look, and it is not the same hang as the one we analyzed for bug 1244749.
I will analyze this one tomorrow.
Just to confirm: in bug 1244749 the hang always happened after a laptop woke up from a longer sleep. The user this log belongs to does not have the same experience, i.e. it is not happening after sleep? At least from the log I do not see it.
Flags: needinfo?(dd.mozilla)
Comment 55•9 years ago
could you specify under what circumstances you did that logging on your vista device in regards to comment #54 - was your device in sleep mode before?
Flags: needinfo?(thane)
Assignee
Comment 57•9 years ago
I have opened a new bug for the log from comment #53. It is easier to track.
The new bug is bug 1248358.
Assignee
Comment 58•9 years ago
Thane, it would be really helpful if you could make a log with some more logging turned on, or run a build in which we add some more logging. Thanks!
Comment 59•9 years ago
Dragana - sure, no problem. I'm just a technician, not a programmer, so you'll have to let me know what you need done, and I'll do it.
Flags: needinfo?(thane)
Assignee
Comment 60•9 years ago
Thanks a lot!
I will make a new build with more logging (you will not need to install anything, you can just run a binary).
I will make it tomorrow, and I will let you know what to do.
Comment 61•9 years ago
Sounds good. Thanks!
Comment 62•9 years ago
Not sure if this means anything, but making sure proxy is set to "no proxy" seems to improve things. However, there is no proxy set on the system, so I would think "use system proxy" would be the same as "no proxy".
Comment 63•9 years ago
Well, not entirely. On the Win7 computer, the problem appears to be resolved with "No proxy" set. On the Vista machine, I can now get to some pages (news.google.ca opens, but no links from that page work).
Here's the most recent crash report from the Vista computer.
https://crash-stats.mozilla.com/report/index/bp-383af48c-cfa6-4ad3-aa3f-8e0032160215
Comment 64•9 years ago
I have another Windows 7 computer exhibiting the same problems. I'll be looking at it today and will post more information.
Comment 65•9 years ago
Update: All the computers I am using are running Windows Firewall only.
Comment 66•9 years ago
I just disabled Windows Firewall on one of the affected machines. Still getting the same slow page load and crashes.
Assignee
Comment 67•9 years ago
A look at some stacks from the last week (all with signature: shutdownhang | WaitForSingleObjectEx | WaitForSingleObject | PR_WaitCondVar | nsThread::ProcessNextEvent | NS_ProcessNextEvent | mozilla::net::nsHttpConnectionMgr::Shutdown)
https://crash-stats.mozilla.com/report/list?product=Firefox&signature=shutdownhang+%7C+WaitForSingleObjectEx+%7C+WaitForSingleObject+%7C+PR_WaitCondVar+%7C+nsThread%3A%3AProcessNextEvent+%7C+NS_ProcessNextEvent+%7C+mozilla%3A%3Anet%3A%3AnsHttpConnectionMgr%3A%3AShutdown#tab-reports
I will start with easy ones:
1) caused by some strange LSPs:
https://crash-stats.mozilla.com/report/index/d5187fb5-6d43-4c9b-b64d-0eaae2160316#allthreads
https://crash-stats.mozilla.com/report/index/3a255270-f5b9-4511-bfdb-c077d2160315#allthreads
https://crash-stats.mozilla.com/report/index/e5f5f174-d8e8-43d1-bf4f-fa8892160315#allthreads
https://crash-stats.mozilla.com/report/index/c658a69b-1ea0-495b-bd30-2c9af2160314#allthreads
https://crash-stats.mozilla.com/report/index/d6efbfc0-126d-4440-bc28-80dd12160314#allthreads
https://crash-stats.mozilla.com/report/index/7fccb71a-8bfb-428c-a2c5-6d7042160313#allthreads
https://crash-stats.mozilla.com/report/index/4802791b-3030-4832-85f8-312e82160311#allthreads
2) AutoDialer - dialer is going to be disabled:
https://crash-stats.mozilla.com/report/index/0de537c7-a008-4eb5-9f2c-3ad6d2160309#allthreads
https://crash-stats.mozilla.com/report/index/b3fdf620-9735-409a-a8c1-1fb2b2160312#allthreads
https://crash-stats.mozilla.com/report/index/1d12f5cc-2cc3-43f9-b000-284092160314#allthreads
https://crash-stats.mozilla.com/report/index/f0d673aa-4909-48b9-88f9-246952160316#allthreads
2) PR_Connect hangs - all are winXP: 5.1.2600 Service Pack 2 or 3 - Maybe we should increase the shutdown time for XP and see if this helps, maybe they are just slow :(
https://crash-stats.mozilla.com/report/index/a2bf38d3-bd90-4339-a74e-1f0e72160311#allthreads
https://crash-stats.mozilla.com/report/index/83b1cc7a-d837-4bee-8dc4-f33752160311#allthreads
https://crash-stats.mozilla.com/report/index/9ce3c419-d529-4538-824d-1e8ef2160312#allthreads
https://crash-stats.mozilla.com/report/index/ad7deb85-6599-4460-9134-63f612160315#allthreads
https://crash-stats.mozilla.com/report/index/d6120cfb-61c1-4463-90d6-ea82e2160316#allthreads
3) this one I am not sure about, it can be an h2 bug:
https://crash-stats.mozilla.com/report/index/531451c6-387d-4214-ad08-3c4062160309#allthreads
4) security involved:
PSMSend security/manager/ssl/nsNSSIOLayer.cpp
nsSSLIOLayerWrite security/manager/ssl/nsNSSIOLayer.cpp
PR_Write nsprpub/pr/src/io/priometh.c
nsSocketOutputStream::Write(char const*, unsigned int, unsigned int*) netwerk/base/nsSocketTransport2.cpp
mozilla::net::nsHttpConnection::EnsureNPNComplete()
I will open another bug for this.
https://crash-stats.mozilla.com/report/index/8854d193-7d3d-40a9-9c23-af0e42160313#allthreads
https://crash-stats.mozilla.com/report/index/b6f1b596-c33e-494d-8280-874eb2160311#allthreads
https://crash-stats.mozilla.com/report/index/f2f987bf-8baa-46b8-914b-301862160311#allthreads
5) Stacks are missing on this one, I will ignore it for now:
https://crash-stats.mozilla.com/report/index/4a022504-1928-4557-8541-d559c2160313#allthreads
6) PR_Poll hangs :(
Some of them have a very short uptime, even 75s, probably just a coincidence. 3 are under 4min.
3 have almost the same uptime and last crash time.
https://crash-stats.mozilla.com/report/index/93bf6db7-6eb2-40b2-b00b-03c8b2160311#allthreads
https://crash-stats.mozilla.com/report/index/6043d6f5-a5a3-4ace-ae2b-61c2a2160311#allthreads
https://crash-stats.mozilla.com/report/index/1e867aa5-ecb2-40d0-b8a2-bf2502160312#allthreads
https://crash-stats.mozilla.com/report/index/5d4bb802-8879-4ea8-ba56-d735e2160312#allthreads
https://crash-stats.mozilla.com/report/index/868902e1-4474-4e16-95ad-72a932160314#allthreads
https://crash-stats.mozilla.com/report/index/4112ce8c-1820-4258-b09f-f57b62160315#allthreads
https://crash-stats.mozilla.com/report/index/f8ad7344-ccd9-4fee-a329-21f2e2160315#allthreads
https://crash-stats.mozilla.com/report/index/1d459064-233c-4c1b-abcd-f7d412160315#allthreads
https://crash-stats.mozilla.com/report/index/67adecf9-90b5-4e2e-9c2b-caa772160315#allthreads
https://crash-stats.mozilla.com/report/index/01c86943-aa8f-48f2-84f2-636652160316#allthreads
https://crash-stats.mozilla.com/report/index/5ada73fe-788d-481c-8a74-ff9422160316#allthreads
Assignee | ||
Comment 68•9 years ago
|
||
For crashes under 6) in comment 67:
Hope bugs 698882 and 1244749 will help.
Updated•9 years ago
|
Comment 69•9 years ago
|
||
(In reply to Dragana Damjanovic [:dragana] from comment #67)
> A look at some stack from the last week (all with signature
thanks for this update!
> 3) this one I am not sure can be h2 bug:
> https://crash-stats.mozilla.com/report/index/531451c6-387d-4214-ad08-
> 3c4062160309#allthreads
that's with 44.0b2, which doesn't have the 1247205 fix, so we can ignore it until we have a report that shows up with that fix
>
> 4) security involved:
a subset of those could also be 1247205 (and are 44.0b2)
Whiteboard: [necko-active][spdy] → [necko-active]
Assignee | ||
Comment 70•9 years ago
|
||
Firefox 46 has a different signature:
Crash Reports for shutdownhang | WaitForSingleObjectEx | WaitForSingleObject | PR_WaitCondVar | nsThread::ProcessNextEvent | mozilla::net::nsHttpConnectionMgr::Shutdown
So I had a look at some of these stacks:
47 belonged to 2) in comment #75. 2 of them were win7 SP1, not win xp
1 belonged to 4) in comment #75
3 were dialer
12 belonged to 1) in comment #75
So we have a problem with PR_Connect on winXP :)
According to some statistics on Firefox 46, 75% of the crashes with this signature are on xp.
We need a better way to analyze stacks!!!!
I will file a bug to increase the shutdown timer on xp, just to try it out, but I am afraid that we have a problem with PR_Connect on xp and this will not help.
Assignee | ||
Comment 71•9 years ago
|
||
I compared the first week of beta 43 and first week of beta 46 (44 and 45 have some other changes that could influence results)
OS | beta43 (11/3/2015-11/10/2015) | beta46 (3/8/2016-3/15/2016) |
---|---|---|
xp | 710 | 506 |
xp prof | 0 | 1 |
vista | 22 | 9 |
win7 | 595 | 156 |
win8 | 62 | 5 |
win8.1 | 201 | 32 |
win10 | 107 | 9 |
android | 1 | 0 |
total crashes | 1698 | 718 |
Crashes decreased on win10 and win8, partially on win7, and less on winXP.
(All 9 crashes on win 10 are due to viruses and 1 because of autodial)
On winXP there is a problem with PR_Connect.
Suggestion: We should improve telemetry for PR_Connect - Bug 1257809
Increase shutdown-crash-timeout: Bug 1257216. It would be good to make this change in beta for a week and look at the results (it is only a pref change).
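To make the per-OS comparison easier to read, here is a small sketch that turns the counts above into relative changes. The counts are copied from this comment; the helper names (`percent_change`, `share`) are only illustrative. With these numbers xp drops by roughly 29% while win10 drops by roughly 92%, and the xp share of this signature goes from about 42% to about 70%.

```python
# Crash counts per OS, copied from the beta43 / beta46 comparison above.
BETA43 = {"xp": 710, "xp prof": 0, "vista": 22, "win7": 595,
          "win8": 62, "win8.1": 201, "win10": 107, "android": 1}
BETA46 = {"xp": 506, "xp prof": 1, "vista": 9, "win7": 156,
          "win8": 5, "win8.1": 32, "win10": 9, "android": 0}


def percent_change(before, after):
    """Relative change in percent, or None when there is no baseline."""
    if before == 0:
        return None
    return round(100.0 * (after - before) / before, 1)


def share(counts, os_name):
    """Share of one OS in the total crash count, in percent."""
    return round(100.0 * counts[os_name] / sum(counts.values()), 1)


# Per-OS change between the first week of beta43 and of beta46.
changes = {name: percent_change(BETA43[name], BETA46[name]) for name in BETA43}

if __name__ == "__main__":
    for name, change in changes.items():
        print(name, change)
    print("xp share:", share(BETA43, "xp"), "->", share(BETA46, "xp"))
```

Note that the two weeks have different total volumes, so the shares are probably a better basis for comparison than the raw counts.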
Comment 72•9 years ago
|
||
This is the #3 topcrash in beta 2, with 2.25% of total crashes:
https://crash-stats.mozilla.com/signature/?product=Firefox&signature=shutdownhang+|+WaitForSingleObjectEx+|+WaitForSingleObject+|+PR_WaitCondVar+|+nsThread%3A%3AProcessNextEvent+|+mozilla%3A%3Anet%3A%3AnsHttpConnectionMgr%3A%3AShutdown&version=46.0b2&date=%3C2016-03-22T01%3A35%3A18&date=%3E%3D2016-03-15T01%3A35%3A18
I'll track the bugs suggested in comment 71 since they may help with this crash spike.
status-firefox46:
--- → affected
tracking-firefox46:
--- → +
Comment 73•9 years ago
|
||
Still the #3 crash in beta 5.
Comment 74•9 years ago
|
||
beta 5: 4534 crashes, 2.64% from 73634 crashes (we had several extra days of beta 5 data)
beta 8: 1174 crashes, 2.45% from 20514 crashes
We have a patch that may help for beta 9, https://bugzilla.mozilla.org/show_bug.cgi?id=1259089
Updated•9 years ago
|
Blocks: e10s-crashes
Comment 75•9 years ago
|
||
fyi, Dragana has requested aurora uplift for 698882 over in that bug. Something has significantly improved this bug's occurrence on nightly recently and our best guess is that it is the fix for 698882 (which was somewhat unanticipated). Unfortunately, that's not a trivial patch.
Comment 76•9 years ago
|
||
see https://bugzilla.mozilla.org/show_bug.cgi?id=698882#c110 for dragana's analysis of the crash rate on nightly.. basically it was roughly one per day through the 0324 build and it stopped abruptly after that.
Assuming that's not just a massive stroke of luck, after talking it over our best guess is 698882 plays a role. There are a couple other possibilities (1260218, 905460) but neither seems as likely.. and they were both committed several days after 698882.
Comment 78•9 years ago
|
||
this signature seems to skyrocket in early 47.0b2 data & makes up nearly 25% of crashes there. could you take a look?
https://crash-stats.mozilla.com/search/?product=Firefox&process_type=browser&version=47.0b2&signature=%3Dshutdownhang+|+WaitForSingleObjectEx+|+WaitForSingleObject+|+PR_WaitCondVar+|+nsThread%3A%3AProcessNextEvent+|+NS_ProcessNextEvent+|+mozilla%3A%3Anet%3A%3AnsHttpConnectionMgr%3A%3AShutdown&_facets=signature&_facets=user_comments&_facets=platform_pretty_version&_facets=install_time&_columns=date&_columns=signature&_columns=version#crash-reports
Flags: needinfo?(dd.mozilla)
Assignee | ||
Comment 79•9 years ago
|
||
(In reply to philipp from comment #78)
> this signature seems to skyrocket in early 47.0b2 data & makes up nearly 25%
> of crashes there. could you take a look?
>
> https://crash-stats.mozilla.com/search/
> ?product=Firefox&process_type=browser&version=47.
> 0b2&signature=%3Dshutdownhang+|+WaitForSingleObjectEx+|+WaitForSingleObject+|
> +PR_WaitCondVar+|+nsThread%3A%3AProcessNextEvent+|+NS_ProcessNextEvent+|+mozi
> lla%3A%3Anet%3A%3AnsHttpConnectionMgr%3A%3AShutdown&_facets=signature&_facets
> =user_comments&_facets=platform_pretty_version&_facets=install_time&_columns=
> date&_columns=signature&_columns=version#crash-reports
Thanks, I am looking... (If we do not figure it out soon we will back out 698882 :) )
Flags: needinfo?(dd.mozilla) → needinfo?(mcmanus)
Assignee | ||
Comment 80•9 years ago
|
||
Patrick, more info: 698882 landed on 47.0b2
Assignee | ||
Comment 81•9 years ago
|
||
I have double-checked: the only things that have landed in nsSocketTransportService2 and nsIOService between nightly and beta are an increase in the number of active sockets, one timer for better telemetry, and some nsILoadInfo changes for creating URIs... None of that should have any influence on PollableEvent :(
So I expect that this is going to hit ff48 too when it gets to beta.
Some thoughts:
Maybe PR_Write to the socket pair fails sometimes, and in the old version we used to try it for every event dispatch. In the current version we try it only for the first event (until Poll wakes up).
We could try to set mSignaled only if PR_Write succeeds?
See: https://hg.mozilla.org/releases/mozilla-beta/annotate/tip/netwerk/base/PollableEvent.cpp#l227
I find this a good idea so I will open a separate bug.
We could try to land this on beta and see if it helps.
Also maybe add a watchdog to retry it if it fails.
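The idea of remembering the signal only when the write to the socket pair succeeds can be sketched roughly like this. It is a Python illustration only, not the code in PollableEvent.cpp: the class, its method names and the `signaled` attribute (standing in for mSignaled) are hypothetical, and `socket.socketpair()` stands in for the NSPR socket pair.

```python
import socket


class PollableEventSketch:
    """Hypothetical wakeup event built on a socket pair."""

    def __init__(self):
        self._reader, self._writer = socket.socketpair()
        self._reader.setblocking(False)
        self._writer.setblocking(False)
        self.signaled = False  # stands in for mSignaled

    def signal(self):
        """Try to wake the poller; return True if a wakeup is pending."""
        if self.signaled:
            return True  # a wakeup byte is already in flight
        try:
            sent = self._writer.send(b"M")
        except OSError:
            sent = 0
        # Record the signal only when the write went through, so that
        # the next dispatch retries the write after a failure.
        self.signaled = sent == 1
        return self.signaled

    def clear(self):
        """Drain the wakeup byte once the poller has woken up."""
        self.signaled = False
        try:
            self._reader.recv(64)
        except OSError:
            pass  # nothing to read, or the socket is gone

    def close(self):
        self._reader.close()
        self._writer.close()
```

With this shape, a failed write leaves `signaled` at False, so every later `signal()` call tries again instead of assuming that Poll will wake up.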
Comment 82•9 years ago
|
||
definitely put 1270029 on 47.. but I'm not 100% its the root of the bug.
is this on 48 at all?
Flags: needinfo?(mcmanus)
Assignee | ||
Comment 83•9 years ago
|
||
(In reply to Patrick McManus [:mcmanus] from comment #82)
> definitely put 1270029 on 47.. but I'm not 100% its the root of the bug.
I still do not know what the root of the problem is.
>
> is this on 48 at all?
698882 landed on nightly48.
There really are no hangs in nsHttpConnectionMgr since 24 March.
Have I missed something? All 698882 blockers landed on 47 as well.
Assignee | ||
Comment 84•9 years ago
|
||
This is more of a startup hang. Uptimes are less than 210s.
Assignee | ||
Comment 85•9 years ago
|
||
(In reply to Dragana Damjanovic [:dragana] from comment #84)
> This is more of a startup hang. Uptimes are less than 210s.
My mistake, some are short but not all of them. (The crash-stats page could be made better.)
Assignee | ||
Comment 86•9 years ago
|
||
One more note: it seems that it is crashing constantly for some users (the last crash field is very low).
Comment 87•9 years ago
|
||
(In reply to Dragana Damjanovic [:dragana] from comment #80)
> Patrick, more info: 698882 landed on 47.0b2
Hrm, bad news - this is still pretty bad in 47.0b2, it's currently at the top of the list.
https://crash-stats.mozilla.com/search/?product=Firefox&version=47.0b2&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-signature
Assignee | ||
Comment 88•9 years ago
|
||
I will ask for a backout of 698882. I do not know what went wrong with the uplift.
Comment 89•9 years ago
Hello Dragana, Wes, Patrick: I noticed the back out in https://bugzilla.mozilla.org/show_bug.cgi?id=698882#c144, but IIUC, there were two commits in Fx47 for bug 698882, these are https://bugzilla.mozilla.org/show_bug.cgi?id=698882#c142 (which was backed out in comment 144) and also https://bugzilla.mozilla.org/show_bug.cgi?id=698882#c129. Do we also need to backout what landed in comment 129 in that bug?
Flags: needinfo?(wkocher)
Flags: needinfo?(mcmanus)
Flags: needinfo?(dd.mozilla)
Assignee | ||
Comment 90•9 years ago
|
||
(In reply to Ritu Kothari (:ritu) from comment #89)
> Hello Dragana, Wes, Patrick: I noticed the back out in
> https://bugzilla.mozilla.org/show_bug.cgi?id=698882#c144, but IIUC, there
> were two commits in Fx47 for bug 698882, these are
> https://bugzilla.mozilla.org/show_bug.cgi?id=698882#c142 (which was backed
> out in comment 144) and also
> https://bugzilla.mozilla.org/show_bug.cgi?id=698882#c129. Do we also need to
> backout what landed in comment 129 in that bug?
it was backed out in comment 130 (https://bugzilla.mozilla.org/show_bug.cgi?id=698882#c130)
So everything is out.
Flags: needinfo?(wkocher)
Flags: needinfo?(mcmanus)
Flags: needinfo?(dd.mozilla)
Comment 91•9 years ago
(In reply to Dragana Damjanovic [:dragana] from comment #90)
> (In reply to Ritu Kothari (:ritu) from comment #89)
> > Hello Dragana, Wes, Patrick: I noticed the back out in
> > https://bugzilla.mozilla.org/show_bug.cgi?id=698882#c144, but IIUC, there
> > were two commits in Fx47 for bug 698882, these are
> > https://bugzilla.mozilla.org/show_bug.cgi?id=698882#c142 (which was backed
> > out in comment 144) and also
> > https://bugzilla.mozilla.org/show_bug.cgi?id=698882#c129. Do we also need to
> > backout what landed in comment 129 in that bug?
>
> it was backed out in comment 130
> (https://bugzilla.mozilla.org/show_bug.cgi?id=698882#c130)
>
> So everything is out.
Oh my bad. But thanks for the confirmation.
Comment 92•9 years ago
|
||
This hang is definitely not happening in 48 or 49, I'm thinking we could resolve this bug out as having been fixed in 48 by bug 698882?
https://crash-stats.mozilla.com/signature/?product=Firefox&version=47.0b2&signature=shutdownhang+%7C+WaitForSingleObjectEx+%7C+WaitForSingleObject+%7C+PR_WaitCondVar+%7C+nsThread%3A%3AProcessNextEvent+%7C+NS_ProcessNextEvent+%7C+mozilla%3A%3Anet%3A%3AnsHttpConnectionMgr%3A%3AShutdown
Comment 93•9 years ago
Any reason this would have become *massively* more common in 47.0b2 than it was in 47.0b1? (In 47.0b2 so far, it's 16% of the total crashes in the e10s test cohort, compared to 1.5% in 47.0b1.)
Assignee | ||
Comment 94•9 years ago
|
||
(In reply to David Baron [:dbaron] ⌚️UTC-7 (review requests must explain patch) from comment #93)
> Any reason this would have become *massively* more common in 47.0b2 than it
> was in 47.0b1? (In 47.0b2 so far, it's 16% of the total crashes in the e10s
> test cohort, compared to 1.5% in 47.0b1.)
It is caused by the uplift of bug 698882, which has been backed out.
Updated•9 years ago
|
Status: NEW → RESOLVED
Closed: 9 years ago
status-firefox47:
--- → wontfix
status-firefox48:
--- → fixed
status-firefox49:
--- → fixed
Resolution: --- → FIXED
Comment 95•9 years ago
|
||
This appears resolved mostly by bug 698882 on gecko 48.
Comment 96•9 years ago
|
||
https://crash-stats.mozilla.com/report/list?signature=shutdownhang%20%7C%20WaitForSingleObjectEx%20%7C%20WaitForSingleObject%20%7C%20PR_WaitCondVar%20%7C%20nsThread%3A%3AProcessNextEvent%20%7C%20NS_ProcessNextEvent%20%7C%20mozilla%3A%3Anet%3A%3AnsHttpConnectionMgr%3A%3AShutdown#tab-reports
It's still happening on Firefox 48.
Comment 97•9 years ago
|
||
It's #37 in the top crashers list for Firefox 48.0a2: https://crash-stats.mozilla.com/topcrashers/?product=Firefox&version=48.0a2
Assignee | ||
Comment 98•9 years ago
|
||
I need to laugh about this:
There were no crashes from March 24th 2016 until May 10th 2016 (I mean build date, not crash date). We closed the bug on May 10th and it started again on May 13th (build date May 11th) :)
2 crashes I have had a look at had some strange dlls, but not one didn't :(
I will continue investigating...
Thanks
Assignee | ||
Comment 99•9 years ago
|
||
Some of them have Sophos_detoured.dll (it is an anti-virus for Windows), but not all of them.
Reporter | ||
Comment 100•9 years ago
|
||
Could crashes like [1][2] be caused by some issues with busy wait?
[1] https://crash-stats.mozilla.com/report/index/b6941d82-63a9-45e3-aa0b-eb2ef2160517#allthreads
[2] https://crash-stats.mozilla.com/report/index/d0735fc8-ba13-48db-92d2-5bebe2160517#allthreads
Comment 101•9 years ago
|
||
what report is the earliest (by build date) that you saw? The earliest I noticed was 20160514004011.. we could at least look at the 48 changelog.
Updated•9 years ago
|
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee | ||
Comment 102•9 years ago
|
||
(In reply to Patrick McManus [:mcmanus] from comment #101)
> what report is the earliest (by build date) that you saw? The earliest I
> noticed was 20160514004011.. we could at least look at the 48 changelog.
Sorry, my mistake, it is 20160514004011. I am looking at the changelog but still see nothing obvious.
There are 31 crashes in 3.5 days, and that includes a weekend, so this looks like it is going to rise :)
Assignee | ||
Comment 103•9 years ago
|
||
My current theory: Bug 1186060 landed on 25th March and all our crashes disappeared.
It was backed out from aurora on 13th May (bug 1270664) and the crashes are back again.
I will try to test the theory by backing out 1270664 on aurora and, if that fixes this crash again, see whether we can leave it like that.
Comment 104•9 years ago
|
||
Dragana, it is possible the compiler change modified the signature and so the crash was still present but with a different signature.
Assignee | ||
Comment 105•9 years ago
|
||
(In reply to Marco Castelluccio [:marco] from comment #104)
> Dragana, it is possible the compiler change modified the signature and so
> the crash was still present but with a different signature.
No, except it has changed into something completely anonymous.
We have been trying to solve this crash for a very long time and the signature changed as we landed changes :), so when I search for this crash I always search only for "nsHttpConnectionMgr".
In this shutdown hang we are really hanging in select() or connect(), which are Windows OS functions.
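Searching for a single frame instead of a full signature can be sketched as follows. The sample signatures are copied from this bug; the helper `has_frame` is hypothetical and is not part of the crash-stats interface.

```python
def has_frame(signature, frame):
    """True if any '|'-separated frame of the signature contains `frame`."""
    return any(frame in part.strip() for part in signature.split("|"))


# Sample signatures taken from this bug.
SIGNATURES = [
    "shutdownhang | WaitForSingleObjectEx | WaitForSingleObject | PR_Wait"
    " | nsThread::ProcessNextEvent(bool, bool*)"
    " | NS_ProcessNextEvent(nsIThread*, bool)"
    " | mozilla::net::nsHttpConnectionMgr::Shutdown()",
    "shutdownhang | WaitForSingleObjectEx | WaitForSingleObject"
    " | PR_WaitCondVar | nsThread::ProcessNextEvent"
    " | mozilla::net::nsHttpConnectionMgr::Shutdown",
    "shutdownhang | WaitForSingleObjectEx | PR_WaitCondVar"
    " | mozilla::CondVar::Wait | nsEventQueue::GetEvent",
]

# Keep every signature that mentions the connection manager at all,
# no matter how the frames above it changed between releases.
matching = [s for s in SIGNATURES if has_frame(s, "nsHttpConnectionMgr")]
```

The third signature is cut off before the Shutdown frame, so a frame search misses it; that kind of signature has to be found some other way.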
Updated•9 years ago
|
Crash Signature: nsThread::ProcessNextEvent | mozilla::net::nsHttpConnectionMgr::Shutdown]
[@ shutdownhang | WaitForSingleObjectEx | WaitForSingleObject | PR_WaitCondVar | nsThread::ProcessNextEvent | NS_ProcessNextEvent | mozilla::net::nsHttpConnectionMgr::Shutdown ] → nsThread::ProcessNextEvent | mozilla::net::nsHttpConnectionMgr::Shutdown]
[@ shutdownhang | WaitForSingleObjectEx | WaitForSingleObject | PR_WaitCondVar | nsThread::ProcessNextEvent | NS_ProcessNextEvent | mozilla::net::nsHttpConnectionMgr::Shutdown ]
…
Comment 106•9 years ago
|
||
I think the analysis that this is correlated with the MSVC 2013 toolchain is correct. So I'm going to mark it fixed target 49 (MSVC 2015) with 48 affected and <= 47 wontfix.
Target Milestone: --- → mozilla49
Updated•9 years ago
|
Status: REOPENED → RESOLVED
Closed: 9 years ago → 9 years ago
Resolution: --- → FIXED
Comment 107•9 years ago
|
||
marking this wontfix-48 based on https://bugzilla.mozilla.org/show_bug.cgi?id=1270664#c33
Comment 108•8 years ago
|
||
I saw one crash on Nightly 0520 build: https://crash-stats.mozilla.com/report/index/5685b5ec-23b2-4d2e-8816-6e3922160523
Flags: needinfo?(mcmanus)
Comment 109•8 years ago
|
||
(In reply to Ting-Yu Chou [:ting] from comment #108)
> I saw one crash on Nightly 0520 build:
> https://crash-stats.mozilla.com/report/index/5685b5ec-23b2-4d2e-8816-
> 6e3922160523
3 more from the same build:
https://crash-stats.mozilla.com/report/index/c91e62ee-1b96-492f-b64c-656172160523
https://crash-stats.mozilla.com/report/index/206fbc57-50e5-417b-8718-0232f2160523
https://crash-stats.mozilla.com/report/index/269d789a-8f32-4ead-8ed8-f68ec2160523
Comment 110•8 years ago
|
||
:ting - that's a different issue than the one fixed in this bug. It looks like you filed it under bug 1275167 - thanks
Flags: needinfo?(mcmanus)
Comment 111•8 years ago
|
||
:ting - oh I might be wrong on that. It's stuck in poll(). Let's look at it in the other bug.
Comment 112•8 years ago
|
||
regarding the crashes in 108 and 109 - they all have nsAppShell::ProcessNextNativeEvent on the stack of thread 0 (when nshttpconnectionmgr is spinning the event loop). This is very different than the classic signatures related to this bug (1158189) so it should be treated as a different bug.
ni dragana to make sure she agrees.
Flags: needinfo?(dd.mozilla)
Assignee | ||
Comment 113•8 years ago
|
||
(In reply to Patrick McManus [:mcmanus] from comment #112)
> regarding the crashes in 108 and 109 - they all have
> nsAppShell::ProcessNextNativeEvent on the stack of thread 0 (when
> nshttpconnectionmgr is spinning the event loop). This is very different than
> the classic signatures related to this bug (1158189) so it should be treated
> as a different bug.
>
> ni dragana to make sure she agrees.
I am wondering about ntdll.dll@0xa5164 on socketThread.
We should not be in Poll any more: gIOService->SetHttpHandlerAlreadyShutingDown(); is called and an event is dispatched to SocketThread, so we should be out of the Poll and not going back there again.
So we are hanging in Poll, in ntdll.dll@0xa5164
There is a similar bug with Cache - bug 1275162.
Maybe we should get a raw dump and look at the stack.
Flags: needinfo?(dd.mozilla)
Comment 114•8 years ago
|
||
(In reply to Dragana Damjanovic [:dragana] from comment #113)
> gIOService->SetHttpHandlerAlreadyShutingDown(); is called and an event is
> dispatched to SocketThread
I think that is not working due to bug 1275167
Updated•8 years ago
|
Crash Signature: ]
[@ shutdownhang | WaitForSingleObjectEx | PR_WaitCondVar | mozilla::CondVar::Wait | nsEventQueue::GetEvent ] → ]
[@ shutdownhang | WaitForSingleObjectEx | PR_WaitCondVar | mozilla::CondVar::Wait | nsEventQueue::GetEvent ]
[@ shutdownhang | WaitForSingleObjectEx | WaitForSingleObject | PR_WaitCondVar | mozilla::CondVar::Wait | nsEventQueue::GetEvent ]
Comment 116•8 years ago
|
||
This signature is showing as a top crash in early 49 beta: https://crash-stats.mozilla.com/topcrashers/?product=Firefox&version=49.0b&days=7.
Should I file a new bug for these crashes?
Assignee | ||
Comment 117•8 years ago
|
||
(In reply to Marcia Knous [:marcia - use ni] from comment #116)
> This signature is showing as a top crash in early 49 beta:
> https://crash-stats.mozilla.com/topcrashers/?product=Firefox&version=49.
> 0b&days=7.
>
> Should I file a new bug for these crashes?
This started with 49 turning into beta?
I will file another bug. Thanks!
Assignee | ||
Comment 118•8 years ago
|
||
(In reply to Dragana Damjanovic [:dragana] from comment #117)
> (In reply to Marcia Knous [:marcia - use ni] from comment #116)
> > This signature is showing as a top crash in early 49 beta:
> > https://crash-stats.mozilla.com/topcrashers/?product=Firefox&version=49.
> > 0b&days=7.
> >
> > Should I file a new bug for these crashes?
>
> This started with 49 turning into beta?
>
> I will file another bug. Thanks!
This can be dup of bug 1289145
Assignee | ||
Comment 119•8 years ago
|
||
(In reply to Dragana Damjanovic [:dragana] from comment #118)
> (In reply to Dragana Damjanovic [:dragana] from comment #117)
> > (In reply to Marcia Knous [:marcia - use ni] from comment #116)
> > > This signature is showing as a top crash in early 49 beta:
> > > https://crash-stats.mozilla.com/topcrashers/?product=Firefox&version=49.
> > > 0b&days=7.
> > >
> > > Should I file a new bug for these crashes?
> >
> > This started with 49 turning into beta?
> >
> > I will file another bug. Thanks!
>
> This can be dup of bug 1289145
probably not, I will file a new bug.
Comment 120•8 years ago
|
||
A signature is now back, MSVC2015 changed inlining of a function.
Old bugs can be found using proto_signature search.
bug 698882 regressed beta, bug 1292181 hopefully stopped the regression.
Beta 48 level looks to me (without any deep analysis) to be about same as 47. ~70% crashes from Windows XP.
Status: RESOLVED → REOPENED
Crash Signature: ]
[@ shutdownhang | WaitForSingleObjectEx | PR_WaitCondVar | mozilla::CondVar::Wait | nsEventQueue::GetEvent ]
[@ shutdownhang | WaitForSingleObjectEx | WaitForSingleObject | PR_WaitCondVar | mozilla::CondVar::Wait | nsEventQueue::GetEvent ] → ]
[@ shutdownhang | WaitForSingleObjectEx | PR_WaitCondVar | mozilla::CondVar::Wait | nsEventQueue::GetEvent ]
[@ shutdownhang | WaitForSingleObjectEx | WaitForSingleObject | PR_WaitCondVar | mozilla::CondVar::Wait | nsEventQueue::GetEvent ]
[@ shutdo…
Resolution: FIXED → ---
Target Milestone: mozilla49 → ---
Updated•8 years ago
|
Crash Signature: mozilla::net::nsHttpConnectionMgr::Shutdown ]
[@ shutdownhang | nsThread::ProcessNextEvent | NS_ProcessNextEvent | mozilla::net::nsHttpConnectionMgr::Shutdown ]
[@ shutdownhang | mozilla::CondVar::Wait | nsEventQueue::GetEvent | nsThread::ProcessNextEv… → mozilla::net::nsHttpConnectionMgr::Shutdown ]
Comment 121•8 years ago
|
||
Comment 120 has a typo; it should be:
Beta 49 level looks to me (without any deep analysis) to be about same as 48. ~70% crashes from Windows XP.
Updated•8 years ago
|
Crash Signature: mozilla::net::nsHttpConnectionMgr::Shutdown ] → mozilla::net::nsHttpConnectionMgr::Shutdown ]
[@ shutdownhang | mozilla::CondVar::Wait | nsEventQueue::GetEvent | nsThread::ProcessNextEvent | _MD_CURRENT_THREAD | PR_GetThreadPrivate | NS_ProcessNextEvent | mozilla::net::nsHttpConnectionMgr::Shutdown ]
status-firefox50:
--- → affected
status-firefox51:
--- → affected
status-firefox52:
--- → affected
Updated•8 years ago
|
Crash Signature: ] → ]
[@ shutdownhang | mozilla::CondVar::Wait | nsEventQueue::GetEvent | nsThread::ProcessNextEvent | PR_ExitMonitor | PR_GetThreadPrivate | NS_ProcessNextEvent | mozilla::net::nsHttpConnectionMgr::Shutdown]
Updated•8 years ago
|
Crash Signature: ]
[@ shutdownhang | mozilla::CondVar::Wait | nsEventQueue::GetEvent | nsThread::ProcessNextEvent | PR_ExitMonitor | PR_GetThreadPrivate | NS_ProcessNextEvent | mozilla::net::nsHttpConnectionMgr::Shutdown] → ]
[@ shutdownhang | mozilla::CondVar::Wait | nsEventQueue::GetEvent | nsThread::ProcessNextEvent | PR_ExitMonitor | PR_GetThreadPrivate | NS_ProcessNextEvent | mozilla::net::nsHttpConnectionMgr::Shutdown]
[@ shutdownhang | mozilla::CondVar::Wait | nsEv…
Comment 122•8 years ago
|
||
Too late for firefox 52, mass-wontfix.
Updated•8 years ago
|
Crash Signature: nsEventQueue::GetEvent | nsThread::nsChainedEventQueue::GetEvent | nsThread::GetEvent | nsThread::ProcessNextEvent | NS_ProcessNextEvent | mozilla::net::nsHttpConnectionMgr::Shutdown] → nsEventQueue::GetEvent | nsThread::nsChainedEventQueue::GetEvent | nsThread::GetEvent | nsThread::ProcessNextEvent | NS_ProcessNextEvent | mozilla::net::nsHttpConnectionMgr::Shutdown]
[@ shutdownhang | _PR_MD_WAIT_CV | _PR_WaitCondVar | mozilla::CondVar…
Updated•7 years ago
|
Crash Signature: [@ shutdownhang | WaitForSingleObjectEx | WaitForSingleObject | PR_Wait | nsThread::ProcessNextEvent(bool, bool*) | NS_ProcessNextEvent(nsIThread*, bool) | mozilla::net::nsHttpConnectionMgr::Shutdown()]
[@ shutdownhang | WaitForSingleObjectEx | WaitForSi… → [@ shutdownhang | NtWaitForAlertByThreadId | RtlSleepConditionVariableSRW | SleepConditionVariableSRW | mozilla::detail::ConditionVariableImpl::wait | mozilla::CondVar::Wait | nsEventQueue::GetEvent | nsThread::nsChainedEventQueue::GetEvent | nsThread::Ge…
Comment 124•7 years ago
|
||
I have encountered several times this crash after removing a webextension and restarting the browser using Firefox 55.0a1 under Windows 10 64-bit:
- bp-9e447097-2b84-49ad-9385-266950170608
- bp-7a41dc78-be4f-4037-a835-599f20170608
- bp-9c96d7e3-0427-4dcc-a078-951c30170607
- bp-f2a24256-57f0-4219-a95a-103a40170607
- bp-d6eac1d3-1e2b-4357-865b-718440170529
status-firefox55:
--- → affected
Assignee | ||
Comment 125•7 years ago
|
||
(In reply to Vasilica Mihasca, QA [:vasilica_mihasca] from comment #124)
> I have encountered several times this crash after removing a webextension
> and restarting the browser using Firefox 55.0a1 under Windows 10 64-bit:
> - bp-9e447097-2b84-49ad-9385-266950170608
> - bp-7a41dc78-be4f-4037-a835-599f20170608
> - bp-9c96d7e3-0427-4dcc-a078-951c30170607
> - bp-f2a24256-57f0-4219-a95a-103a40170607
> - bp-d6eac1d3-1e2b-4357-865b-718440170529
Can you please open a new bug for this? It seems you can reproduce it and it seems there is a problem with webextensions.
Flags: needinfo?(vasilica.mihasca)
Comment 126•7 years ago
|
||
(In reply to Dragana Damjanovic [:dragana] from comment #125)
> Can you please open a new bug for this? It seems you can reproduce it and it
> seems there is a problem with webextensions.
Filed Bug 1371248.
Flags: needinfo?(vasilica.mihasca)
Comment 128•7 years ago
|
||
Currently the main signatures are shutdownhang | NtWaitForKeyedEvent | RtlSleepConditionVariableSRW | SleepConditionVariableSRW and shutdownhang | NtWaitForKeyedEvent | RtlSleepConditionVariableSRW | mozilla::TimeStamp::Now. They will be broken down with bug 1375511.
Comment 129•7 years ago
|
||
Bulk priority update: https://bugzilla.mozilla.org/show_bug.cgi?id=1399258
Priority: -- → P1
Updated•7 years ago
|
Crash Signature: mozilla::CondVar::Wait | nsEventQueue::GetEvent | nsThread::nsChainedEventQueue::GetEvent | nsThread::GetEvent | nsThread::ProcessNextEvent | NS_ProcessNextEvent | mozilla::net::nsHttpConnectionMgr::Shu... ] → mozilla::CondVar::Wait | nsEventQueue::GetEvent | nsThread::nsChainedEventQueue::GetEvent | nsThread::GetEvent | nsThread::ProcessNextEvent | NS_ProcessNextEvent | mozilla::net::nsHttpConnectionMgr::Shu... ]
[@ shutdownhang | mozilla::net::nsHttpConnect…
Assignee | ||
Comment 130•6 years ago
|
||
The signature for these crashes has changed and there are separate bugs for the new signatures.
Status: REOPENED → RESOLVED
Closed: 9 years ago → 6 years ago
Resolution: --- → WONTFIX