Closed
Bug 1035075
Opened 10 years ago
Closed 10 years ago
crash in mozilla::net::ProxyAutoConfig::MyIPAddress(JS::CallArgs const&)
Categories
(Core :: Networking: HTTP, defect)
Tracking
()
People
(Reporter: andrei, Assigned: dragana)
References
()
Details
(Keywords: crash)
Crash Data
Attachments
(1 file, 2 obsolete files)
3.42 KB,
patch
|
mcmanus
:
review+
Sylvestre
:
approval-mozilla-aurora+
Sylvestre
:
approval-mozilla-esr31+
|
Details | Diff | Splinter Review |
This bug was filed from the Socorro interface and is report bp-f761b777-af4a-45f5-afbc-77a7f2140707. ============================================================= We experienced this crash via Mozmill. Looks to be some sort of proxy issue. (As additional data, we do have custom proxies set up on the machine where the crash occurred). Not sure what more info I can get here. The test log doesn't have anything useful: > 03:29:03 TEST-END | testToolbar/testHomeButton.js | finished in 328ms > 03:29:06 *** WIFI GEO: shutdown called > 03:29:06 mozcrash INFO | Saved minidump as /Users/mozauto/Library/Application Support/Firefox/Crash Reports/pending/9FA0C864-F6F9-4F73-ADE6-35D856F445BE.dmp > 03:29:06 mozcrash INFO | Saved app info as /Users/mozauto/Library/Application Support/Firefox/Crash Reports/pending/9FA0C864-F6F9-4F73-ADE6-35D856F445BE.extra > 03:29:06 PROCESS-CRASH | /Users/mozauto/jenkins/workspace/mozilla-aurora_functional/data/mozmill-tests/firefox/tests/functional/testToolbar/testHomeButton.js | application crashed [Unknown top frame] > 03:29:06 Crash dump filename: /Users/mozauto/jenkins/workspace/mozilla-aurora_functional/data/profile/minidumps/9FA0C864-F6F9-4F73-ADE6-35D856F445BE.dmp > 03:29:06 No symbols path given, can't process dump. > 03:29:06 MINIDUMP_STACKWALK not set, can't process dump. > 03:29:06 RESULTS | Passed: 105
Reporter | ||
Comment 1•10 years ago
|
||
Crash-stats shows the function MyIPAddress crashing in multiple locations lately: https://crash-stats.mozilla.com/query/?query_type=simple&query=MyIPAddress
Updated•10 years ago
|
status-firefox31:
--- → affected
status-firefox33:
--- → affected
OS: Mac OS X → All
Version: unspecified → 32 Branch
Comment 3•10 years ago
|
||
Done.
Assignee: nobody → sworkman
Status: NEW → ASSIGNED
Flags: needinfo?(sworkman)
Comment 4•10 years ago
|
||
based on comment 0 and some comments in another bug I am going to dup with this, I think its a shutdown issue.
Comment 6•10 years ago
|
||
https://tbpl.mozilla.org/?tree=Try&rev=2992433d1fd1
Comment 7•10 years ago
|
||
Assignee: sworkman → mcmanus
Attachment #8458891 -
Flags: review?(sworkman)
Comment 8•10 years ago
|
||
Comment on attachment 8458891 [details] [diff] [review] 1035075.1 Review of attachment 8458891 [details] [diff] [review]: ----------------------------------------------------------------- Thanks for taking this Pat. Sorry I didn't get to it sooner. Changes look good r=me.
Attachment #8458891 -
Flags: review?(sworkman) → review+
Comment 9•10 years ago
|
||
remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/ece3f69eb1b1
Backed this out under suspicion of causing various gc crashes: https://hg.mozilla.org/integration/mozilla-inbound/rev/e2650b0b07d6 https://tbpl.mozilla.org/php/getParsedLog.php?id=44144136&tree=Mozilla-Inbound https://tbpl.mozilla.org/php/getParsedLog.php?id=44144216&tree=Mozilla-Inbound https://tbpl.mozilla.org/php/getParsedLog.php?id=44143224&tree=Mozilla-Inbound
Crashes were still happening with the backout, so I relanded this. https://hg.mozilla.org/integration/mozilla-inbound/rev/f698c8166457
Backed out again: https://hg.mozilla.org/integration/mozilla-inbound/rev/cabc7efd2c1f because it was responsible for the B2G mochitest-7 orange: https://tbpl.mozilla.org/?tree=Mozilla-Inbound&jobname=b2g_emulator_vm.*mochitest-7 e.g., in https://tbpl.mozilla.org/php/getParsedLog.php?id=44144765&tree=Mozilla-Inbound
Flags: needinfo?(mcmanus)
Two other notes: * this orange is opt mochitest-7, not debug. (They're split into different numbers of chunks.) * the orange is not 100% of the time, but probably somewhere around 80% of the time
Comment 14•10 years ago
|
||
wild guess about the orange: my patch makes the observer code synchronously wait for the pac thread to shutdown. Before it shuts down it actually cancels all the resolutions in the pac queue (it has always done this) - and now the main thread waits for those cancels to happen. its possible that in the case of the orange some extra requests are being canceled due to the synchronous wait. If this was the case the test was probably always technically racy. if that guess is right, two options might be: * figure out what aspect of the test is dependent on pac resolution after the shutdown observers run and fix it * remove the cancel semantics at gecko shutdown time - but probably keep them at reload time
Flags: needinfo?(mcmanus)
Comment 15•10 years ago
|
||
dragana offered to try and push this one over the finish line - Thanks!
Assignee: mcmanus → dd.mozilla
Assignee | ||
Comment 16•10 years ago
|
||
I was trying to figure out how ProxyAutoConfig works. isn't it the problem that mPAC gets destroyed but sRunning is not set to nullptr? maybe i not sure if i got it right, would it be enough just to set sRunning to nullptr in destructor?
Flags: needinfo?(mcmanus)
Comment 17•10 years ago
|
||
(In reply to Dragana Damjanovic from comment #16) > I was trying to figure out how ProxyAutoConfig works. > isn't it the problem that mPAC gets destroyed but sRunning is not set to > nullptr? maybe i not sure if i got it right, would it be enough just to set > sRunning to nullptr in destructor? hey Dragana, proxyautoconfig::sRunning is only non-null while the js engine is actually executing - its only there so that some C++ helper functions that are bound to pac specific js functions can find the object data. I agree with your inference that the stack we see is UAF of sRunning. however, I think its because mPAC is being destroyed from another thread while the pac thread is actually executing the JS. resetting sRunning when that happens isn't going to solve the problem - the whole js context is also destroyed. If the js engine calls a c++ helper funcion and sRunning is null, something is very wrong. I think this is happening because nspacman::shutdown() is not synchronous with joining the pac thread. the dtor that does try and join the pac thread (thread->shutdown()) only happens synchronously on the main thread, if the dtor is reached from another thread (I suspect it is the socket thread in this case - making 3 threads in question) then that thread join is proxied back to the main thread and perhaps not exectued before mPAC is destroyed.. that would be consistent with the stack trace. My patch tries to fix that by having the shutdown() (which happens on the main thread anyhow) also join the pac thread.. that means it will wait until the pac thread has run all of its events and isn't using a js context. does that help? of course that still leaves the orange it created to deal with.
Flags: needinfo?(mcmanus)
Assignee | ||
Comment 18•10 years ago
|
||
The problem was with thread shutdown() function. This function blocks current execution but continue serving the thread queue. So mPACMan->Shutdown() is called and blocked and the next event in the main thread queue is called which again calls mPACMan->Shutdown(). So now I dispatch a runnable to wait for the PAC thread to shutdown.
Attachment #8458891 -
Attachment is obsolete: true
Attachment #8472297 -
Flags: review?(mcmanus)
Comment 19•10 years ago
|
||
Comment on attachment 8472297 [details] [diff] [review] fix v2 Review of attachment 8472297 [details] [diff] [review]: ----------------------------------------------------------------- > The problem was with thread shutdown() function. This function blocks > current execution but continue serving the thread queue. So > mPACMan->Shutdown() is called and blocked and the next event in the main > thread queue is called which again calls mPACMan->Shutdown(). That's a good diagnosis. I had forgotten that shutdown() isn't just join() - it runs events on the current thread. Generally re-entrant stuff is a good way to have bugs - so let's go with both your patch and a simple little addition that checks ::shutdown for double calling. r=mcmanus if that tests out ok for you ::: netwerk/base/src/nsPACMan.cpp @@ +124,5 @@ > + } > + > + NS_IMETHODIMP Run() > + { > + NS_ABORT_IF_FALSE(NS_IsMainThread(), "wrong thread"); if (mPACMan->mPACThread) {} @@ +312,5 @@ > > void > nsPACMan::Shutdown() > { > NS_ABORT_IF_FALSE(NS_IsMainThread(), "pacman must be shutdown on main thread"); if (mShutdown) { return; } // that might avoid some re-entrancy problems.
Attachment #8472297 -
Flags: review?(mcmanus) → review+
Assignee | ||
Comment 20•10 years ago
|
||
https://tbpl.mozilla.org/?tree=Try&rev=6f0ee674695d
Assignee | ||
Comment 21•10 years ago
|
||
Attachment #8472297 -
Attachment is obsolete: true
Attachment #8472868 -
Flags: review+
Assignee | ||
Updated•10 years ago
|
Attachment #8472868 -
Flags: review+
Assignee | ||
Updated•10 years ago
|
Attachment #8472868 -
Flags: review?(mcmanus)
Comment 22•10 years ago
|
||
Comment on attachment 8472868 [details] [diff] [review] fix v3 Review of attachment 8472868 [details] [diff] [review]: ----------------------------------------------------------------- When I said "r=mcmanus if that tests out ok for you" that meant you didn't need to r? it again as long as you made and tested the changes I indicated (which you did! Thanks!). If you didn't like those suggestions, then the r? cycle would have to go around the merry go round again.
Attachment #8472868 -
Flags: review?(mcmanus) → review+
Assignee | ||
Updated•10 years ago
|
Keywords: checkin-needed
Comment 24•10 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/8b5435b03ae8 In the future, please include a commit message with your patch when you're requesting checkin. Thanks! https://developer.mozilla.org/en-US/docs/Developer_Guide/Committing_Rules_and_Responsibilities#Checkin_comment
Keywords: checkin-needed
Comment 25•10 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/8b5435b03ae8
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla34
Updated•10 years ago
|
Flags: qe-verify-
Reporter | ||
Comment 26•10 years ago
|
||
We've just experienced this crash on 32 (release) again: https://crash-stats.mozilla.com/report/index/384aa689-8cec-4493-a725-1d3282140826 Should the fix be uplifted at least to 33? (Given that 32 is already out the door)
Flags: needinfo?(ryanvm)
Flags: needinfo?(dd.mozilla)
Comment 27•10 years ago
|
||
Questions like that should be addressed to the bug's assignee, not the guy who landed the patch for them.
Flags: needinfo?(ryanvm)
Comment 28•10 years ago
|
||
[Tracking Requested - why for this release]: I'm not sure about the impact of this crash and any side-effects it could cause, but I think backporting it to 33 and 31ESR would be good. The crash ADI is not that high for this crash but it looks like that people on 31.0 see this most often: Firefox 31.0 93.67 % 429 Firefox 32.0b8 2.40 % 11 Firefox 32.0b7 1.31 % 6 Firefox 33.0a2 0.44 % 2 Dragana may be able to give better details about possible side-effects.
status-firefox-esr31:
--- → affected
tracking-firefox32:
--- → ?
tracking-firefox33:
--- → ?
tracking-firefox-esr31:
--- → ?
Comment 29•10 years ago
|
||
Based on comment 28, I'm marking this as won't fix in 32 (already at RC1). We can fix this in 33 and 31.2.0 (ships alongside 33).
status-firefox34:
--- → affected
tracking-firefox34:
--- → +
Assignee | ||
Updated•10 years ago
|
Flags: needinfo?(dd.mozilla)
Comment 30•10 years ago
|
||
Dragana - Now that this has been on m-c for a week, can you please request approval to uplift to 33 (aurora) and ESR31?
Flags: needinfo?(dd.mozilla)
Comment 31•10 years ago
|
||
Comment on attachment 8472868 [details] [diff] [review] fix v3 Hi--I'm filling this out for :mcmanus, who's out for a week. Sorry about imperfect information as a result. Approval Request Comment [Feature/regressing bug #]: not sure exactly :( but we're seeing crashes back to 31 [User impact if declined]: crashes [Describe test coverage new/current, TBPL]: comment 10 seems to indicate we're hitting this codepath during testing. [Risks and why]: low risk: just adds logic to wait during shutdown, and patch has been baking on m-c for a week [String/UUID change made/needed]: no If this is not a sec:{high,crit} bug, please state case for ESR consideration: crashing on ESR (see comment 28)
Attachment #8472868 -
Flags: approval-mozilla-esr31?
Attachment #8472868 -
Flags: approval-mozilla-aurora?
Flags: needinfo?(dd.mozilla)
Updated•10 years ago
|
Attachment #8472868 -
Flags: approval-mozilla-esr31?
Attachment #8472868 -
Flags: approval-mozilla-esr31+
Attachment #8472868 -
Flags: approval-mozilla-aurora?
Attachment #8472868 -
Flags: approval-mozilla-aurora+
Comment 32•10 years ago
|
||
https://hg.mozilla.org/releases/mozilla-aurora/rev/419f744ba5c9 https://hg.mozilla.org/releases/mozilla-esr31/rev/c96e9d362a23
You need to log in
before you can comment on or make changes to this bug.
Description
•