Crash in [@ @0x0 | nsHostResolver::FlushCache]
Categories
(Core :: Networking: DNS, defect, P3)
Tracking
()
People
(Reporter: calixte, Assigned: valentin)
References
(Blocks 1 open bug, Regression)
Details
(Keywords: crash, regression, Whiteboard: [necko-triaged])
Crash Data
Attachments
(2 files)
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
RyanVM
:
approval-mozilla-esr68+
|
Details | Review |
This bug is for crash report bp-deea109f-ecbc-43cd-9ccf-3d3f40190331.
Top 10 frames of crashing thread:
0 @0x0
1 dalvik-jit-code-cache_2359_2359 (deleted) dalvik-jit-code-cache_2359_2359 @0x7b7e628
2 libxul.so nsHostResolver::FlushCache netwerk/dns/nsHostResolver.cpp:740
3 libxul.so nsDNSService::Observe netwerk/dns/nsDNSService2.cpp:1134
4 libxul.so nsObserverList::NotifyObservers xpcom/ds/nsObserverList.cpp:66
5 libxul.so nsObserverService::NotifyObservers xpcom/ds/nsObserverService.cpp:286
6 libxul.so mozilla::GeckoNetworkManager::OnStatusChanged widget/android/GeckoNetworkManager.h:37
7 libxul.so mozilla::detail::RunnableFunction<mozilla::jni::detail::ProxyNativeCall<GeckoAppShellSupport, mozilla::java::GeckoAppShell, true, false, mozilla::jni::StringParam const&> >::Run widget/android/jni/Natives.h:395
8 libxul.so nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1161
9 libxul.so NS_ProcessNextEvent xpcom/threads/nsThreadUtils.cpp:474
There are 63 crashes (from 61 installations) in 66.0.2 with buildid 20190326175229. In analyzing the backtrace, the regression may have been introduced by patch [1] to fix bug 1450893.
[1] https://hg.mozilla.org/releases/mozilla-release/rev?node=1805599da5fd
Assignee | ||
Comment 1•5 years ago
|
||
I see a few of them affecting Fennec 65.0.1:
https://crash-stats.mozilla.com/report/index/5dbf9fa0-3d45-4ac7-a89e-db83a0190406
And it crashes at the same spot.
The patch shouldn't have changed the behaviour at all if TRR is off.
Dragana, did I miss something in the patch?
Comment 2•5 years ago
|
||
(In reply to Valentin Gosu [:valentin] from comment #1)
I see a few of them affecting Fennec 65.0.1:
https://crash-stats.mozilla.com/report/index/5dbf9fa0-3d45-4ac7-a89e-db83a0190406
And it crashes at the same spot.The patch shouldn't have changed the behaviour at all if TRR is off.
Dragana, did I miss something in the patch?
This is interesting. We have some random crashes in nsHostResolver they are mostly on Fennec. I have spent some time looking at them but I have not come to any conclusion and the code is same on all platforms, why it is affecting only Fennec.
What is interesting here is that the small change you made cause this crash to appear.
We do not call ClearCache with parameter true that could actually change the previous behavior.
Assignee | ||
Comment 3•5 years ago
|
||
That's what I found confusing as well. I'm inclined to say this is caused by a compiler bug, or maybe random bitflips, but I have no evidence to support that.
Updated•5 years ago
|
Updated•5 years ago
|
Assignee | ||
Comment 4•5 years ago
|
||
Assigning to myself to keep an eye on it. Given bug 1549398 it is possible we have an Android only DNS bug.
Comment 5•5 years ago
|
||
Only 98 crashes on release - relatively low volume compared to the bug referenced in Comment 4.
Updated•4 years ago
|
Assignee | ||
Comment 6•4 years ago
|
||
Assignee | ||
Comment 7•4 years ago
|
||
frame #5 of report https://crash-stats.mozilla.org/report/index/4dca6cb1-8d45-4bf5-8836-216810200217
This crash was rather obvious in retrospect, but I missed it because I was
looking at the wrong thing. We're not actually crashing in FlushCache,
instead mHostResolver is null in nsDNSService::Observe
What made it obvious is frame #5 of report https://crash-stats.mozilla.org/report/index/4dca6cb1-8d45-4bf5-8836-216810200217
Included here because crash reports expire:
1 libxul.so nsHostResolver::FlushCache(bool) netwerk/dns/nsHostResolver.cpp:740
2 libxul.so nsDNSService::Observe(nsISupports*, char const*, char16_t const*) netwerk/dns/nsDNSService2.cpp:1132
3 libxul.so nsObserverList::NotifyObservers(nsISupports*, char const*, char16_t const*) xpcom/ds/nsObserverList.cpp:66
4 libxul.so nsObserverService::NotifyObservers(nsISupports*, char const*, char16_t const*) xpcom/ds/nsObserverService.cpp:295
5 libxul.so DecreasePrivateDocShellCount() docshell/base/nsDocShell.cpp:306
6 libxul.so nsDocShell::Destroy() docshell/base/nsDocShell.cpp:5076
See the code points to this line:
https://hg.mozilla.org/releases/mozilla-esr68/annotate/ef373efc995d9350a676c4c231b344d173423e8a/docshell/base/nsDocShell.cpp#l306
As we can see, it emits the "last-pb-context-exited" notification,
and nsDNSService tries to call FlushCache.
However, it appears this notification may be called after we get the shutdown
notification and we null out the pointer. It's unclear why this crash was not
noticed before bug 1450893 landed.
Depends on D63107
Pushed by valentin.gosu@gmail.com: https://hg.mozilla.org/integration/autoland/rev/9fbafe41107e clang-format nsDNSService2.cpp r=dragana https://hg.mozilla.org/integration/autoland/rev/1e00af5facbe Add a null check before calling mHostResolver->FlushCache() r=dragana
Comment 9•4 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/9fbafe41107e
https://hg.mozilla.org/mozilla-central/rev/1e00af5facbe
Assignee | ||
Comment 10•4 years ago
|
||
Comment on attachment 9127144 [details]
Bug 1543331 - Add a null check before calling mHostResolver->FlushCache() r=dragana
ESR Uplift Approval Request
- If this is not a sec:{high,crit} bug, please state case for ESR consideration: Fixes crash on Fennec caused by null pointer dereference.
- User impact if declined: Crashes on Fennec. Fenix doesn't seem to get these - unclear if it's a difference in architecture or if it just doesn't have enough users yet.
- Fix Landed on Version: 75
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): Just adds a null check. Very low risk.
- String or UUID changes made by this patch:
Updated•4 years ago
|
Comment 11•4 years ago
|
||
Comment on attachment 9127144 [details]
Bug 1543331 - Add a null check before calling mHostResolver->FlushCache() r=dragana
Fixes a low-volume Fennec crash. Approved for 68.6esr.
Comment 12•4 years ago
|
||
bugherder uplift |
Updated•2 years ago
|
Description
•