Closed
Bug 1358043
Opened 7 years ago
Closed 6 years ago
Crash in nsCacheService::Init
Categories
(Core :: Networking: Cache, defect, P2)
Tracking
()
People
(Reporter: skywalker333, Assigned: mayhemer)
References
Details
(Keywords: crash, reproducible, Whiteboard: [necko-next])
Crash Data
Attachments
(1 file)
patch, 996 bytes (michal: review+, jcristau: approval-mozilla-beta-)
This bug was filed from the Socorro interface and is report bp-58815f30-8ed8-42ed-a856-3f3cf0170419.
Reporter | ||
Updated•7 years ago
|
status-firefox53:
--- → affected
status-firefox54:
--- → affected
status-firefox55:
--- → ?
tracking-firefox55:
--- → ?
Hardware: ARM → All
Comment 1•7 years ago
|
||
This affects Firefox as well, and I see one Android crash in 55, so I am setting it as affected. If you go back one month in crash stats, there are about 21 crashes. Do we have any steps to reproduce?
Comment 2•7 years ago
|
||
Crash volume is still pretty low for this signature, ni on reporter to see if there are any STR.
Flags: needinfo?(skywalker333)
Comment 3•7 years ago
|
||
Kind of low volume, not sure it will be useful for relman to track this.
Comment 4•7 years ago
|
||
Skywalker has been filing bugs from the crash-stats server. I don't think they have been encountering the crashes.
Reporter | ||
Comment 5•7 years ago
|
||
I don't have any particular steps to reproduce (STR). The crash occurred after having recently installed Firefox Aurora (54.0a2 2017-04-18).
Install Age: 587 seconds since version was first installed (9 minutes and 47 seconds)

I was browsing bugzilla.mozilla.org at the time of the crashes. The first crash occurred when I was looking at Bug 1164027. I experienced a crash with signature [ ElfLoader::~ElfLoader ] bp-b61973c4-6efa-43ed-9d36-25f700170419.
Uptime: 551 seconds (9 minutes and 11 seconds)
Install Age: 551 seconds since version was first installed (9 minutes and 11 seconds)
Install Time: 2017-04-19 02:20:57
Product: FennecAndroid
Release Channel: aurora
Version: 54.0a2
Build ID: 20170418074655
OS: Android
OS Version: 0.0.0 Linux 3.4.0-1974790 #1 SMP PREEMPT Fri Oct 25 08:41:54 KST 2013 armv7l
Android Version: 18 (REL)
Build Architecture: arm
Build Architecture Info: ARMv7 Qualcomm Krait features: swp,half,thumb,fastmult,vfpv2,edsp,neon,vfpv3,tls,vfpv4,idiva,idivt | 4
Android Manufacturer: samsung
Android Model: SM-N900W8
Related Bugs: Bug 1164027 NEW --- intermittent PROCESS-CRASH | autophone-s1s2 | application crashed [@ ElfLoader::~ElfLoader]

Then I restarted firefox and experienced a second crash, this time with signature [ nsCacheService::Init ] bp-58815f30-8ed8-42ed-a856-3f3cf0170419.
Uptime: 7 seconds
Last Crash: 36 seconds before submission
Install Age: 587 seconds since version was first installed (9 minutes and 47 seconds)
Startup Crash: False
MOZ_CRASH Reason: MOZ_CRASH(Can't create cache IO thread)
Crash Reason: SIGSEGV
Crash Address: 0x0
App Notes: FP(D00-L1010-W00000000-T010) EGL? EGL+ GL Context? GL Context+ AdapterDescription: 'Model: SM-N900W8, Product: hltevl, Manufacturer: samsung, Hardware: qcom, OpenGL: Qualcomm -- Adreno (TM) 330 -- OpenGL ES 3.0 V@45.0 AU@04.03.00.125.097 RVADDULA_AU_LINUX_ANDROID_JB_3.1.2.04.03.00.125.097+PATCH[ES]_msm8974_JB_3.1.2_CL3905453_release_ENGG (CL@3905453)' GL Layers? GL Layers+ samsung SM-N900W8 samsung/hltevl/hltecan:4.3/JSS15J/N900W8VLUBMJ4:user/release-keys
Processor Notes: processor_ip-172-31-11-82_1318; MozillaProcessorAlgorithm2015; skunk_classifier: reject - not a plugin hang

bp-58815f30-8ed8-42ed-a856-3f3cf0170419 4/18/17 10:30 PM
bp-b61973c4-6efa-43ed-9d36-25f700170419 4/18/17 10:30 PM
Flags: needinfo?(skywalker333)
Reporter | ||
Comment 6•7 years ago
|
||
Looking at
https://crash-stats.mozilla.com/signature/?signature=nsCacheService%3A%3AInit&date=%3E%3D2016-11-09T09%3A21%3A17.000Z&date=%3C2017-05-09T09%3A21%3A17.000Z#graphs
the number of crashes per day increased from 1 (Mar-Apr) to 10-15, starting a little over halfway through April (Apr 20-23?). This is for FennecAndroid only.
Reporter | ||
Comment 7•7 years ago
|
||
Signature report for nsCacheService::Init
Showing results from a month ago

Operating System
  Android        173   92.0%
  Windows 7       10    5.3%
  Windows 10       3    1.6%
  Windows 8.1      1    0.5%
  Windows XP       1    0.5%

Product
  FennecAndroid 53.0.1    33   45.8%   35
  FennecAndroid 53.0      20   27.8%   16
  FennecAndroid 53.0.2     7    9.7%    9
  FennecAndroid 54.0a2     2    2.8%    2
  FennecAndroid 54.0b2     2    2.8%    1
  FennecAndroid 54.0b4     1    1.4%    1

Uptime Range
  < 1 min      72   38.3%
  > 1 hour     57   30.3%
  15-60 min    21   11.2%
  1-5 min      20   10.6%
  5-15 min     18    9.6%

Architecture
  arm      161   85.6%
  x86       26   13.8%
  amd64      1    0.5%

Flash Version
  [blank]  188  100.0%
Comment 8•7 years ago
|
||
A report came in on webcompat.com regarding a crash in Fennec while on the Hacks blog.
Bug report: https://webcompat.com/issues/9010
Site URL: https://hacks.mozilla.org/2017/06/new-css-grid-layout-panel-in-firefox-nightly/

I can consistently reproduce the crash, even after a restart, though others on the webcompat team can't reproduce it at all. This is in Firefox 55 and 57, with only 1 tab open and no other running applications. My device is a Nexus 6; here's a report: https://crash-stats.mozilla.com/report/index/2cd376d0-ce87-4b0a-844f-ed9160170817

Is there anything I can do to help here, since my device reproduces the crash just by scrolling or interacting with the page?
Flags: needinfo?(mozillamarcia.knous)
Comment 9•7 years ago
|
||
I was able to reproduce this on a Nexus 6 device as well, running release. It looks as if this happens using Firefox as well, but much less frequently than Fennec. Because the crash reason is listed as MOZ_CRASH(Can't create cache IO thread), I moved it into what I think is a better component.
Component: General → Networking: Cache
Flags: needinfo?(mozillamarcia.knous)
Product: Firefox for Android → Core
Updated•7 years ago
|
Keywords: reproducible
Comment 10•7 years ago
|
||
This crash is because NS_NewNamedThread fails, which... ugh. Jason, who has a few cycles to look at this and either (1) reproduce, or (2) create a try build with some debugging for those who can reproduce?
Flags: needinfo?(jduell.mcbugs)
Whiteboard: [necko-next]
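To illustrate the failure mode named in comment 10, here is a minimal sketch in standard C++. It assumes `std::thread` as a stand-in for `NS_NewNamedThread` (it is not the Gecko API), and `TryStartThread` and `JoinIfStarted` are hypothetical helper names. The point it shows is only that thread creation can fail when the OS is out of resources, and that the caller can observe that failure instead of aborting.

```cpp
#include <cassert>
#include <cstdio>
#include <system_error>
#include <thread>
#include <utility>

// Hypothetical helper (not a Mozilla API): try to start a worker thread and
// report failure to the caller instead of aborting the process.
template <typename Fn>
std::thread TryStartThread(Fn&& fn) {
  try {
    return std::thread(std::forward<Fn>(fn));
  } catch (const std::system_error& e) {
    // std::thread's constructor throws std::system_error when the OS cannot
    // create the thread, for example when it has run out of thread resources.
    std::fprintf(stderr, "warning: thread creation failed: %s\n", e.what());
    return std::thread();  // default-constructed: not joinable, "no thread"
  }
}

// Joins the worker if one was started; returns whether a worker existed.
inline bool JoinIfStarted(std::thread& worker) {
  if (!worker.joinable()) return false;
  worker.join();
  assert(!worker.joinable());  // a joined thread is no longer joinable
  return true;
}
```

A caller would check `joinable()` (or the return value of `JoinIfStarted`) and decide how to continue when no thread is available.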
Comment 11•7 years ago
|
||
Bulk change to priority: https://bugzilla.mozilla.org/show_bug.cgi?id=1399258
Priority: -- → P2
Updated•6 years ago
|
Flags: needinfo?(jduell.mcbugs)
Comment 12•6 years ago
|
||
(In reply to Marcia Knous [:marcia - needinfo? me] from comment #9)
> I was able to reproduce this on a Nexus 6 device as well, running release.
>
> It looks as if this happens using Firefox as well, but much less frequently
> than Fennec. Because the crash reason is listed as MOZ_CRASH(Can't create
> cache IO thread), I moved it into what I think is a better component.

Hi Marcia,

Do you remember how to reproduce this crash?
If yes, could you provide detailed steps?

Thanks.
Flags: needinfo?(mozillamarcia.knous)
Comment 13•6 years ago
|
||
(In reply to Kershaw Chang [:kershaw] from comment #12)
> (In reply to Marcia Knous [:marcia - needinfo? me] from comment #9)
> > I was able to reproduce this on a Nexus 6 device as well, running release.
> >
> > It looks as if this happens using Firefox as well, but much less frequently
> > than Fennec. Because the crash reason is listed as MOZ_CRASH(Can't create
> > cache IO thread), I moved it into what I think is a better component.
>
> Hi Marcia,
>
> Do you remember how to reproduce this crash?
> If yes, could you provide detailed steps?
>
> Thanks.

Hello Kershaw - I don't recall how I was able to reproduce since it was so long ago - sorry.
Flags: needinfo?(mozillamarcia.knous)
Updated•6 years ago
|
Assignee: nobody → odvarko
Updated•6 years ago
|
Assignee: odvarko → honzab.moz
Assignee | ||
Comment 14•6 years ago
|
||
Note that when we fail to create an IO thread in cache2, we switch to a memory-only mode. We fail at [1] and then, because of the missing gInstance, we gracefully fail all IO. Surprisingly, *all* [2] the code in cache1 is already prepared for a missing IO thread, and the cache2 links to cache1 have graceful handling as well [3]. The fix here is to just turn the crash into a warning, or something else we can ignore and live with.

[1] https://searchfox.org/mozilla-central/rev/c0b26c40769a1e5607a1ae8be37fe64df64fc55e/netwerk/cache2/CacheFileIOManager.cpp#1216
[2] https://searchfox.org/mozilla-central/search?q=symbol:F_%3CT_nsCacheService%3E_mCacheIOThread&redirect=false
[3] https://searchfox.org/mozilla-central/rev/c0b26c40769a1e5607a1ae8be37fe64df64fc55e/netwerk/cache2/OldWrappers.cpp#714-734
Status: NEW → ASSIGNED
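The idea in comment 14 can be sketched roughly as follows. This is a simplified model under stated assumptions, not the actual patch: `CacheService`, `IoThread`, `CreateIoThread`, and `Result` are hypothetical stand-ins for the real cache classes. It shows initialization that only warns when the IO thread cannot be created, and a dispatch path that checks for the missing thread and fails gracefully.

```cpp
#include <cassert>
#include <cstdio>
#include <functional>
#include <memory>

// Hypothetical stand-in for the cache IO thread; not a Mozilla class.
struct IoThread {
  void Dispatch(const std::function<void()>& task) {
    assert(static_cast<bool>(task));  // callers must pass a callable task
    task();  // runs inline here; a real IO thread would queue the task
  }
};

// Hypothetical factory: returns nullptr to model the OS refusing a new thread.
std::unique_ptr<IoThread> CreateIoThread(bool resourcesAvailable) {
  if (!resourcesAvailable) return nullptr;
  return std::make_unique<IoThread>();
}

enum class Result { Ok, NotAvailable };

class CacheService {
 public:
  // Init warns instead of crashing when the IO thread is missing.
  Result Init(bool resourcesAvailable) {
    mIoThread = CreateIoThread(resourcesAvailable);
    if (!mIoThread) {
      std::fprintf(stderr, "WARNING: Can't create cache IO thread\n");
    }
    return Result::Ok;
  }

  // Every consumer of the thread checks for null and fails gracefully.
  Result DispatchToCacheIoThread(const std::function<void()>& task) {
    if (!mIoThread) return Result::NotAvailable;
    mIoThread->Dispatch(task);
    return Result::Ok;
  }

 private:
  std::unique_ptr<IoThread> mIoThread;
};
```

In this model, a service initialized with `Init(false)` stays usable, and each `DispatchToCacheIoThread` call returns `Result::NotAvailable` rather than dereferencing a null thread. This only works if every code path that touches the thread has such a check.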
Assignee | ||
Comment 15•6 years ago
|
||
Attachment #9025397 -
Flags: review?(michal.novotny)
Assignee | ||
Comment 16•6 years ago
|
||
Michal, see comment 14 for the rationale. There is no need to push this to try; there is no realistic scenario in which this could actually trigger on our test infra.
Updated•6 years ago
|
Attachment #9025397 -
Flags: review?(michal.novotny) → review+
Assignee | ||
Comment 17•6 years ago
|
||
Just in case: https://treeherder.mozilla.org/#/jobs?repo=try&revision=27731e242a1d5980c8a0565b722b91aeb6c40cb1
Assignee | ||
Comment 18•6 years ago
|
||
(In reply to Honza Bambas (:mayhemer) from comment #17)
> Just in case:
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=27731e242a1d5980c8a0565b722b91aeb6c40cb1

To explain, this is a simulated push with the old cache io thread missing (being null). I wanted to check for possible other crashes in case I missed any non-null checks.

Assertion failure: ((bool)(__builtin_expect(!!(!NS_FAILED_impl(rv)), 1))) (Unexpected state), at /builds/worker/workspace/build/src/netwerk/protocol/http/nsHttpChannel.cpp:855

is fine (we wait only for "normal" cache entry, no hangs expected)
Comment 20•6 years ago
|
||
Pushed by aciure@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/d21e9cf5a196 Produce only warning when appcache/old cache backend I/O thread can't be created for lack of resources, r=michal
Keywords: checkin-needed
Comment 21•6 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/d21e9cf5a196
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
status-firefox65:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla65
Reporter | ||
Updated•6 years ago
|
status-firefox63:
--- → affected
Comment 22•6 years ago
|
||
Seems simple enough, please nominate this for Beta/ESR60 approval.
Assignee | ||
Comment 23•6 years ago
|
||
I'm not sure we want to pass this to ESR. There still could be some corner case we haven't discovered yet that may cause a crash (or instability) somewhere in the cache or its consuming code when the thread is missing. I'd rather push this only up to beta. Note that this mainly affects Android, because of its lack of OS resources, and not desktop.
Flags: needinfo?(honzab.moz)
Assignee | ||
Updated•6 years ago
|
Assignee | ||
Comment 24•6 years ago
|
||
Comment on attachment 9025397 v1

[Beta/Release Uplift Approval Request]
Feature/Bug causing the regression: none
User impact if declined: Early startup crash when the machine is out of memory/handles (on low end HW, specifically mobile)
Is this code covered by automated tests?: No
Has the fix been verified in Nightly?: Yes
Needs manual test from QE?: No
If yes, steps to reproduce: This is hard to repro. You would need a HW with just low enough number of free thread handles to reproduce and then try to go on...
List of other uplifts needed: None
Risk to taking this patch: Medium
Why is the change risky/not risky? (and alternatives if risky): I would rather be a bit cautious here since we may still be missing some code path or missing check that will cause a crash or some unexpected state when the thread is missing. Also, when we are so much out of resources, we will likely crash somewhere else soon anyway... maybe this was an accidental 'safe check' we just removed...
String changes made/needed: none
Attachment #9025397 -
Flags: approval-mozilla-beta?
Comment 25•6 years ago
|
||
This is very low volume, we can let it ride the trains.
Updated•6 years ago
|
Attachment #9025397 -
Flags: approval-mozilla-beta? → approval-mozilla-beta-