Closed
Bug 1099209
Opened 10 years ago
Closed 9 years ago
Investigate intermittent Windows 7 and 8 [@ nsStandardURL::~nsStandardURL()] crashes hit in automation
Categories
(Core :: Networking, defect)
Tracking
()
RESOLVED
FIXED
mozilla40
People
(Reporter: RyanVM, Assigned: valentin)
References
Details
(Keywords: crash)
Attachments
(1 file, 1 obsolete file)
1.96 KB,
patch
|
Sylvestre
:
approval-mozilla-aurora+
Sylvestre
:
approval-mozilla-beta+
bajaj
:
approval-mozilla-b2g37+
|
Details | Diff | Splinter Review |
We seem to have a number of crashes in ~nsStandardURL that are hit in automation (see deps). The common theme seems to be Windows 7 as the underlying platform. Is there anyone on the networking team that can maybe investigate if there's an underlying cause that needs fixing here?
Flags: needinfo?(jduell.mcbugs)
Reporter | ||
Updated•10 years ago
|
Summary: Investigate intermittent Windows 7 [@ nsStandardURL::~nsStandardURL()] → Investigate intermittent Windows 7 [@ nsStandardURL::~nsStandardURL()] crashes hit in automation
Comment 1•10 years ago
|
||
Valentin: you've been looking at a lot of URI stuff. Can you take this? We can get you a Windows machine if you need one.
Flags: needinfo?(jduell.mcbugs) → needinfo?(valentin.gosu)
Assignee | ||
Comment 2•10 years ago
|
||
I've actually crossed paths with this issue while working on bug 1009648. It only occurs on windows, in debug builds, and the crash point is the DumpLeakedURLs mechanism. I'm not really sure if it's because of a compiler peculiarity on Windows, or if nsStandardURL::ShutdownGlobalObjects gets called at the same time with ::~nsStandardURL on different threads. My guess is that it's a combination of the two. I'll see what we can do...
Assignee: nobody → valentin.gosu
Flags: needinfo?(valentin.gosu)
Reporter | ||
Comment 4•10 years ago
|
||
We're seeing this crashes on Windows XP lately too. For example: https://treeherder.mozilla.org/logviewer.html#?job_id=4961339&repo=mozilla-inbound
Assignee | ||
Comment 5•10 years ago
|
||
Given that the race is fairly difficult to reproduce I think I might have to submit a patch to m-c and see if we get any more crashes.
Reporter | ||
Updated•9 years ago
|
Comment 6•9 years ago
|
||
(In reply to Valentin Gosu [:valentin] from comment #5) > Given that the race is fairly difficult to reproduce I think I might have to > submit a patch to m-c and see if we get any more crashes. Jason, is there somebody who could look at this, or failing that, disable this code on Windows? It is happening extremely frequently. Bug 1145481 happened like 14 times today alone. This is causing some severe tree stability issues.
Flags: needinfo?(jduell.mcbugs)
Updated•9 years ago
|
Summary: Investigate intermittent Windows 7 [@ nsStandardURL::~nsStandardURL()] crashes hit in automation → Investigate intermittent Windows 7 and 8 [@ nsStandardURL::~nsStandardURL()] crashes hit in automation
Comment 7•9 years ago
|
||
Nathan, could you try doing a TSan run with DEBUG_DUMP_URLS_AT_SHUTDOWN defined in netwerk/base/nsStandardURL.cpp? You'd want to run browser/devtools/webconsole/test/browser_webconsole_view_source.js as that seems to be hitting some shutdown race condition all over the place in bug 1145481. So you could try running that directory or however we run that. (It is Windows-only, but presumably the race isn't platform specific.) Thanks.
Flags: needinfo?(nfroyd)
Assignee | ||
Comment 8•9 years ago
|
||
I'm thinking that there's a URL being released off the main thread during shutdown. The TSan results should provide some insight.
Comment 9•9 years ago
|
||
(In reply to Valentin Gosu [:valentin] from comment #8) > I'm thinking that there's a URL being released off the main thread during > shutdown. > The TSan results should provide some insight. Presumably we'd catch that with owning thread assertions, wouldn't we? nsStandardURL isn't threadsafe. Unless it's UAF or something...
Comment 10•9 years ago
|
||
(In reply to Nathan Froyd [:froydnj] [:nfroyd] from comment #9) > Presumably we'd catch that with owning thread assertions, wouldn't we? > nsStandardURL isn't threadsafe. Unless it's UAF or something... Oh, I was confused. I guess it is some URI thing that is threadsafe, not URL. I can try running this test in an ASan build and see if turns up anything.
Comment 11•9 years ago
|
||
(In reply to Andrew McCreight [:mccr8] from comment #10) > I can try running this test in an ASan build and see if turns up anything. I ran an ASan build on this directory with NS_BUILD_REFCNT_LOGGING defined a few times, but it didn't turn up anything.
Assignee | ||
Comment 12•9 years ago
|
||
Honza, can I get your feedback on this issue? My take on it is that the static DumpLeakedURLs gets destroyed while at the same time URLs are released on other threads. I suspect the reason this is only a problem on Windows is because the compiler calls the destructors for static objects in a slightly different way. There are a few things we could do here: 1. Consider disabling DEBUG_DUMP_URLS_AT_SHUTDOWN for Windows until we figure out a better way of tracking leaked URLs. 2. Introduce some locking. 3. Assert that URLs are only destroyed on the main thread. What do you think?
Flags: needinfo?(valentin.gosu)
Flags: needinfo?(jduell.mcbugs)
Flags: needinfo?(honzab.moz)
Comment 13•9 years ago
|
||
nsStandardURL is not using threadsafe refcounting, so I don't think it should be getting destroyed on other threads.
Assignee | ||
Comment 14•9 years ago
|
||
I added a MOZ_ASSERT(NS_IsMainThread()) to the destructor, and it seems there are a bunch of them :)
Comment 15•9 years ago
|
||
(In reply to Valentin Gosu [:valentin] from comment #14) > I added a MOZ_ASSERT(NS_IsMainThread()) to the destructor, and it seems > there are a bunch of them :) Hmm, I guess it is okay from a refcounting perspective if it is always on the same thread? If they nsStandardURL are really being destroyed on multiple threads, then a quick fix would be to check if NS_IsMainThread() before touching the global variable. I think for debugging purposes we only care about held by main thread objects. Or just a lock, I guess.
Comment 16•9 years ago
|
||
(In reply to Andrew McCreight [:mccr8] from comment #7) > Nathan, could you try doing a TSan run with DEBUG_DUMP_URLS_AT_SHUTDOWN > defined in netwerk/base/nsStandardURL.cpp? You'd want to run > browser/devtools/webconsole/test/browser_webconsole_view_source.js as that > seems to be hitting some shutdown race condition all over the place in bug > 1145481. So you could try running that directory or however we run that. > (It is Windows-only, but presumably the race isn't platform specific.) > Thanks. I ran this through TSan a couple of times, but it didn't seem to want to run right, complaining about unimportable javascript modules. This sort of thing isn't unheard of when running with TSan, and it sounds like y'all might have a better handle on what's going wrong anyway.
Flags: needinfo?(nfroyd)
Comment 17•9 years ago
|
||
The first patch in bug 1129784 seems to trigger this fairly consistently (see https://bugzilla.mozilla.org/show_bug.cgi?id=1129784#c9 for a try push). This is blocking me from landing that patch, so is there an ETA for a fix here?
Blocks: 1129784
Flags: needinfo?(valentin.gosu)
Assignee | ||
Comment 18•9 years ago
|
||
Attachment #8584798 -
Flags: review?(honzab.moz)
Assignee | ||
Comment 19•9 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=12706798e4d7
Flags: needinfo?(valentin.gosu)
Comment 20•9 years ago
|
||
(In reply to Valentin Gosu [:valentin] from comment #19) > https://treeherder.mozilla.org/#/jobs?repo=try&revision=12706798e4d7 You canceled the needinfo without any further information. Will this patch resolve the problem when it lands?
Assignee | ||
Comment 21•9 years ago
|
||
(In reply to Eddy Bruel [:ejpbruel] from comment #20) > (In reply to Valentin Gosu [:valentin] from comment #19) > > https://treeherder.mozilla.org/#/jobs?repo=try&revision=12706798e4d7 > > You canceled the needinfo without any further information. Will this patch > resolve the problem when it lands? Yes. Sorry I didn't make that clear.
Comment 22•9 years ago
|
||
(In reply to Valentin Gosu [:valentin] from comment #21) > (In reply to Eddy Bruel [:ejpbruel] from comment #20) > > (In reply to Valentin Gosu [:valentin] from comment #19) > > > https://treeherder.mozilla.org/#/jobs?repo=try&revision=12706798e4d7 > > > > You canceled the needinfo without any further information. Will this patch > > resolve the problem when it lands? > > Yes. Sorry I didn't make that clear. Thanks. Honza, do you think you would be able to review this quickly? This would allow me to land bug 1129784.
Comment 23•9 years ago
|
||
Comment on attachment 8584798 [details] [diff] [review] Only track leaked URLs on the main thread Review of attachment 8584798 [details] [diff] [review]: ----------------------------------------------------------------- r=honzab with comments addressed ::: netwerk/base/nsStandardURL.cpp @@ +270,5 @@ > // default parser in case nsIStandardURL::Init is never called > mParser = net_GetStdURLParser(); > > #ifdef DEBUG_DUMP_URLS_AT_SHUTDOWN > + if (!NS_IsMainThread()) { I'd rather see this as: if (NS_IsMainThread()) { do the leak detect stuff... } Since once someone may add a code bellow it and may be nicely screwed... applies to all the nsStandardURL blocks. @@ +308,5 @@ > DumpLeakedURLs::~DumpLeakedURLs() > { > + if (!NS_IsMainThread()) { > + return; > + } I believe this will always happen on the main thread, but who knows on which thread win is calling this. Maybe add an assertion?
Attachment #8584798 -
Flags: review?(honzab.moz) → review+
Updated•9 years ago
|
Flags: needinfo?(honzab.moz)
Comment 24•9 years ago
|
||
Thanks Honza!
Assignee | ||
Comment 25•9 years ago
|
||
Assignee | ||
Updated•9 years ago
|
Attachment #8584798 -
Attachment is obsolete: true
Assignee | ||
Comment 26•9 years ago
|
||
Try push is green: https://treeherder.mozilla.org/#/jobs?repo=try&revision=fa66f4b786b2
Keywords: checkin-needed
Reporter | ||
Comment 27•9 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/c4c48aad5eb4
Keywords: checkin-needed
Reporter | ||
Comment 28•9 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/c4c48aad5eb4
Status: NEW → RESOLVED
Closed: 9 years ago
status-firefox40:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla40
Reporter | ||
Comment 29•9 years ago
|
||
Please nominate this for Aurora/Beta/b2g37 approval when you get a chance.
Flags: needinfo?(valentin.gosu)
Comment 30•9 years ago
|
||
Comment on attachment 8586027 [details] [diff] [review] Only track leaked URLs on the main thread Approval Request Comment [Feature/regressing bug #]: old [User impact if declined]: this causes intermittent failures on tree herder [Describe test coverage new/current, TreeHerder]: this code runs very frequently [Risks and why]: very low. This is debug-only code [String/UUID change made/needed]: none
Flags: needinfo?(valentin.gosu)
Attachment #8586027 -
Flags: approval-mozilla-aurora?
Comment 31•9 years ago
|
||
Comment on attachment 8586027 [details] [diff] [review] Only track leaked URLs on the main thread See previous comment.
Attachment #8586027 -
Flags: approval-mozilla-beta?
Attachment #8586027 -
Flags: approval-mozilla-b2g37?
Comment 32•9 years ago
|
||
Comment on attachment 8586027 [details] [diff] [review] Only track leaked URLs on the main thread Should be in 38 beta 3.
Attachment #8586027 -
Flags: approval-mozilla-beta?
Attachment #8586027 -
Flags: approval-mozilla-beta+
Attachment #8586027 -
Flags: approval-mozilla-aurora?
Attachment #8586027 -
Flags: approval-mozilla-aurora+
Reporter | ||
Updated•9 years ago
|
status-b2g-v2.2:
--- → affected
status-b2g-master:
--- → fixed
status-firefox37:
--- → wontfix
status-firefox38:
--- → affected
status-firefox39:
--- → affected
status-firefox-esr31:
--- → wontfix
Reporter | ||
Comment 33•9 years ago
|
||
https://hg.mozilla.org/releases/mozilla-beta/rev/58dca3f7560a
Reporter | ||
Comment 34•9 years ago
|
||
https://hg.mozilla.org/releases/mozilla-aurora/rev/1d66ed2ccb3c
Updated•9 years ago
|
Attachment #8586027 -
Flags: approval-mozilla-b2g37? → approval-mozilla-b2g37+
Reporter | ||
Comment 35•9 years ago
|
||
https://hg.mozilla.org/releases/mozilla-b2g37_v2_2/rev/ecf084a49ddd
You need to log in
before you can comment on or make changes to this bug.
Description
•