Closed
Bug 976171
Opened 10 years ago
Closed 5 years ago
crash in mozilla::net::CacheIOThread::LoopOneLevel(unsigned int)
Categories
(Core :: Networking: Cache, defect, P3)
Tracking
()
RESOLVED
WORKSFORME
mozilla32
People
(Reporter: mayhemer, Unassigned)
References
(Depends on 1 open bug)
Details
(Keywords: crash, Whiteboard: [necko-backlog])
Crash Data
Attachments
(1 file, 1 obsolete file)
1.60 KB, patch (michal: review+)
This bug was filed from the Socorro interface and is report bp-626494ac-f0c2-4a18-b28c-b1c862140223.
=============================================================
Given how simple the IO thread is, this seems more likely to be heap corruption caused by outside code, or something actually wrong with the event (not the thread).
Reporter
Comment 1•10 years ago
One valid report [1] in 4 weeks. Maybe just a null check will do here. [1] https://crash-stats.mozilla.com/report/index/74ada869-620f-4cc1-84dc-b99562140327
Reporter
Comment 2•10 years ago
The bug may already be fixed, but we should ensure that no null runnables are added to the queue and later executed.
Assignee: nobody → honzab.moz
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Attachment #8413837 -
Flags: review?(michal.novotny)
Comment 3•10 years ago
Comment on attachment 8413837: v1
Review of attachment 8413837:
-----------------------------------------------------------------
Passing nullptr to the dispatching methods is a misuse of the API. Instead of returning an error in DispatchInternal(), add an assertion to the methods that call it, i.e. to CacheIOThread::DispatchAfterPendingOpens() and CacheIOThread::Dispatch().
Attachment #8413837 -
Flags: review?(michal.novotny) → review-
Reporter
Comment 4•10 years ago
- MOZ_ASSERTs added to the top-level methods to catch this in debug builds
- non-null check left in place to prevent unnecessary crashes in production that wouldn't tell us anything anyway
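The split described above (a debug-only assertion in the public dispatch method, plus a null check kept in the internal one) can be sketched roughly as follows. This is a minimal, self-contained illustration and not the landed patch: `IoQueue`, `Event`, `RunPending`, `DEBUG_BUILD` and `DEBUG_ASSERT` are hypothetical stand-ins, with `DEBUG_ASSERT` mimicking the debug-only behaviour of MOZ_ASSERT.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <utility>

// Debug-only assertion, mimicking how MOZ_ASSERT compiles away in release
// builds. DEBUG_BUILD is a hypothetical flag used only in this sketch.
#ifdef DEBUG_BUILD
#  define DEBUG_ASSERT(cond) assert(cond)
#else
#  define DEBUG_ASSERT(cond) ((void)0)
#endif

// Hypothetical stand-in for a runnable. An empty std::function plays the
// role of a null runnable here.
using Event = std::function<void()>;

class IoQueue {
 public:
  // Public entry point: in debug builds, fail loudly if a caller passes a
  // null event, so the misuse is caught close to its source.
  bool Dispatch(Event event) {
    DEBUG_ASSERT(static_cast<bool>(event));
    return DispatchInternal(std::move(event));
  }

  // Runs everything queued so far and returns how many events were run.
  std::size_t RunPending() {
    std::deque<Event> events;
    {
      std::lock_guard<std::mutex> lock(mMutex);
      events.swap(mEvents);
    }
    for (auto& event : events) {
      event();  // Safe: null events were never allowed into the queue.
    }
    return events.size();
  }

 private:
  // Internal helper: in release builds, refuse a null event instead of
  // queueing it and crashing later when the loop tries to run it.
  bool DispatchInternal(Event event) {
    if (!event) {
      return false;
    }
    std::lock_guard<std::mutex> lock(mMutex);
    mEvents.push_back(std::move(event));
    return true;
  }

  std::mutex mMutex;
  std::deque<Event> mEvents;
};
```

With `DEBUG_BUILD` undefined, dispatching an empty `Event` simply returns `false` and nothing is queued; with it defined, the same call trips the assertion in `Dispatch()`.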
Attachment #8413837 -
Attachment is obsolete: true
Attachment #8415835 -
Flags: review?(michal.novotny)
Updated•10 years ago
Attachment #8415835 -
Flags: review?(michal.novotny) → review+
Reporter
Comment 5•10 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/c69333201bc7
https://hg.mozilla.org/mozilla-central/rev/c69333201bc7
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla32
Comment 7•10 years ago
Looking at [1] I see that there are still some crashes:
- Firefox 32 Beta - 8 crashes ranging from 20140722030201 to 20140811180644
- Firefox 34 Nightly - 2 crashes: 20140722030201 and 20140810030204

Honza, is this acceptable, or does it need more work?
Flags: needinfo?(honzab.moz)
Comment 8•10 years ago
(In reply to Florin Mezei, QA (:FlorinMezei) from comment #7)
> Looking at [1] I see that there are still some crashes:
> - Firefox 32 Beta - 8 crashes ranging from 20140722030201 to 20140811180644
> - Firefox 34 Nightly - 2 crashes: 20140722030201 and 20140810030204
>
> Honza, is this acceptable, or does it need more work?

[1] https://crash-stats.mozilla.com/report/list?product=Firefox&range_unit=days&range_value=28&signature=mozilla%3A%3Anet%3A%3ACacheIOThread%3A%3ALoopOneLevel%28unsigned+int%29#tab-reports
Reporter
Comment 9•10 years ago
I suspect more work is needed here. One thing that comes to mind is that some event has a broken reference count. It could also be the result of heap corruption caused by completely unrelated code, but that is hard to track down.
Flags: needinfo?(honzab.moz)
Comment 10•10 years ago
Thanks Honza! I'm reopening this so it gets the needed attention.
Reporter
Updated•10 years ago
Status: REOPENED → NEW
Updated•9 years ago
Crash Signature: [@ mozilla::net::CacheIOThread::LoopOneLevel(unsigned int)] → [@ mozilla::net::CacheIOThread::LoopOneLevel(unsigned int)]
[@ mozilla::net::CacheIOThread::LoopOneLevel]
Updated•8 years ago
Whiteboard: [necko-backlog]
Reporter
Comment 11•8 years ago
Status: only one crash on aurora (47.0a2), 64 on release (45.0.1). This is a low rate, but I would still like to figure out whether this is an issue in CacheIOThread or an issue with the runnable. It's not a duplicate of bug 1257611: the report at https://crash-stats.mozilla.com/report/index/354f7ec7-806f-4db3-a584-10ed52160405 comes from a build that already has that fix (https://hg.mozilla.org/releases/mozilla-aurora/log/a4481ccef67e/netwerk/cache2/CacheFileIOManager.cpp).
Reporter
Comment 12•8 years ago
Probes from bug 1277275 show that the write queue in particular can get pretty long. Over two weeks on Nightly (50), the HTTP_CACHE_IO_QUEUE_WRITE probe collected 1630k samples overall, of which 334k (20%) hit a backlog of more than 30 events and 400k (25%) a backlog of more than 300! Just before WRITE we process MANAGEMENT, which has some 850k samples, 89k of them >300. But on MANAGEMENT we don't do any IO, and according to the numbers this is just an accumulation of operations happening around opens and reads (the sum of all OPEN*/READ* ops is almost equal to the number of MANAGEMENT operations). There is no data on how many sessions never go over a backlog of 30 events.

Anyway, this all shows that we can keep a lot of memory allocated (the suspected cause of THIS bug) just to hold these queues alive. Possible solutions: more threads, a cache/net race, smaller write op granularity, and priorities for the write queue as well, with a time limit after which a write operation is bypassed instead.
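The shares quoted above can be rechecked from the raw counts. A small sketch, assuming the counts are taken exactly as reported (in thousands of samples); `SharePercent` is a hypothetical helper name, not part of any telemetry API.

```cpp
#include <cmath>

// Returns the share of `samplesOver` within `totalSamples`, rounded to the
// nearest whole percent. Returns 0 when there are no samples at all.
inline long SharePercent(double samplesOver, double totalSamples) {
  if (totalSamples <= 0.0) {
    return 0;
  }
  return std::lround(100.0 * samplesOver / totalSamples);
}
```

For the WRITE probe this gives `SharePercent(334, 1630)` and `SharePercent(400, 1630)` for the two backlog thresholds, and `SharePercent(89, 850)` for the MANAGEMENT samples over 300, a share the comment does not state explicitly.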
Comment 13•8 years ago
Crash volume for signature 'mozilla::net::CacheIOThread::LoopOneLevel':
 - nightly (version 50): 0 crash from 2016-06-06.
 - aurora (version 49): 0 crash from 2016-06-07.
 - beta (version 48): 56 crashes from 2016-06-06.
 - release (version 47): 282 crashes from 2016-05-31.
 - esr (version 45): 37 crashes from 2016-04-07.

Crash volume on the last weeks:
             Week N-1   Week N-2   Week N-3   Week N-4   Week N-5   Week N-6   Week N-7
 - nightly          0          0          0          0          0          0          0
 - aurora           0          0          0          0          0          0          0
 - beta            13         10          7          7         11          5          2
 - release         37         43         32         46         54         56          5
 - esr              3          1          1          4          3          5          8

Affected platforms: Windows, Mac OS X
status-firefox47: --- → affected
status-firefox48: --- → affected
status-firefox-esr45: --- → affected
Comment 14•7 years ago
Bulk change to priority: https://bugzilla.mozilla.org/show_bug.cgi?id=1399258
Priority: -- → P1
Comment 15•7 years ago
Bulk change to priority: https://bugzilla.mozilla.org/show_bug.cgi?id=1399258
Priority: P1 → P3
Reporter
Comment 17•5 years ago
I checked a few of the crash reports and they are only from very old branches. WFM!
Status: NEW → RESOLVED
Closed: 10 years ago → 5 years ago
Resolution: --- → WORKSFORME
Comment 18•5 years ago
Since the bug is closed, the stalled keyword is now meaningless.
For more information, please visit auto_nag documentation.
Keywords: stalled