Closed
Bug 1247432
Opened 9 years ago
Closed 8 years ago
Don't do any unnecessary I/O in cache2 after shutdown
Categories
(Core :: Networking: Cache, defect)
Core
Networking: Cache
Tracking
()
RESOLVED
FIXED
mozilla47
People
(Reporter: mayhemer, Assigned: mayhemer)
References
Details
(Whiteboard: [necko-active])
Attachments
(1 file)
1.68 KB,
patch
|
michal
:
review+
lizzard
:
approval-mozilla-aurora+
|
Details | Diff | Splinter Review |
- don't read/open - ignore doomed handles (already moved to doomed/ dir) - write and close only for 2 seconds after shutdown Still, I'm not sure at all this will help bug 1215970. I believe there is some heavy I/O blockade happening from Skype (or whoever is the culprit) causing any op taking more than the time for shutdown. So I think there is no help with this patch. But we can try to at least make it a non-top crash :)
Updated•9 years ago
|
Whiteboard: [necko-active]
Assignee | ||
Comment 1•9 years ago
|
||
The shutdown-commenced point should be considered at profile-change-net-teardown and not xpcom-shutdown. When reproducing bug 1215970 I can see the following: 11:21:42.066000 UTC - 0[2f11140]: nsHttpHandler::Observe [topic="profile-change-net-teardown"] 11:22:22.875000 UTC - 0[2f11140]: nsHttpHandler::Observe [topic="xpcom-shutdown"] So, some 40 seconds later. During these 40 seconds, cache should be muted for reading/opening. Writes should be terminated up to xpcom-shutdown + 2 seconds.
Assignee | ||
Comment 2•9 years ago
|
||
Looking into the code this seems as trivial as shutting the cache io manager sooner from "profile-change-net-teardown". since that point entries cannot be open on the high level, internal read/open operations will be bypassed (with failure), writing has the 2 second limit that will begin sooner as well.
Assignee: nobody → honzab.moz
Status: NEW → ASSIGNED
Assignee | ||
Comment 3•9 years ago
|
||
- shutdown IO manager at profile-net-change-teardown - leaving other topics as well since this one is IMO not 100% in all cases to arrive - redundancy is OK, this always happens on the main thread https://treeherder.mozilla.org/#/jobs?repo=try&revision=469b207dea7c
Attachment #8718915 -
Flags: review?(michal.novotny)
Updated•9 years ago
|
Attachment #8718915 -
Flags: review?(michal.novotny) → review+
Assignee | ||
Updated•9 years ago
|
Keywords: checkin-needed
Assignee | ||
Comment 4•9 years ago
|
||
Better try run: https://treeherder.mozilla.org/#/jobs?repo=try&revision=07dc7aa24bad
Comment 6•9 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/9d5fa6d7558a
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla47
Comment 7•9 years ago
|
||
backout bugherder landing |
I had to back this out for apparently causing a spike in ASAN failures in test_browserElement_oop_getWebManifest.html https://treeherder.mozilla.org/logviewer.html#?job_id=21801214&repo=mozilla-inbound https://hg.mozilla.org/integration/mozilla-inbound/rev/eda42451649d
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee | ||
Comment 8•9 years ago
|
||
(In reply to Wes Kocher (:KWierso) from comment #7) > I had to back this out for apparently causing a spike in ASAN failures in > test_browserElement_oop_getWebManifest.html > > https://treeherder.mozilla.org/logviewer.html#?job_id=21801214&repo=mozilla- > inbound > https://hg.mozilla.org/integration/mozilla-inbound/rev/eda42451649d Thanks Wes! I'm currently investigating it persuaded it's from a different patch (not yet landed).
Status: REOPENED → NEW
Assignee | ||
Comment 9•9 years ago
|
||
11:03:13 INFO - ==5232==ERROR: AddressSanitizer: heap-use-after-free on address 0x6270000ff900 at pc 0x440bf3 bp 0x7fde3c742750 sp 0x7fde3c742730 11:03:13 INFO - READ of size 13552 at 0x6270000ff900 thread T15 (Cache2 I/O) 11:03:14 INFO - #0 0x440bf2 in write /builds/slave/moz-toolchain/src/llvm/projects/compiler-rt/lib/sanitizer_common/sanitizer_common_interceptors.inc:325 11:03:14 INFO - #1 0x7fde59dd52ef in pt_Write /builds/slave/try-l64-asan-00000000000000000/build/src/nsprpub/pr/src/pthreads/ptio.c:1315 11:03:15 INFO - #2 0x7fde3f5782cf in (anonymous namespace)::interposedWrite(PRFileDesc*, void const*, int) /builds/slave/try-l64-asan-00000000000000000/build/src/xpcom/build/NSPRInterposer.cpp:72 11:03:15 INFO - #3 0x7fde3f95aacf in mozilla::net::CacheFileIOManager::WriteInternal(mozilla::net::CacheFileHandle*, long, char const*, int, bool, bool) /builds/slave/try-l64-asan-00000000000000000/build/src/netwerk/cache2/CacheFileIOManager.cpp:2024 11:03:15 INFO - #4 0x7fde3f9aa9c6 in mozilla::net::WriteEvent::Run() /builds/slave/try-l64-asan-00000000000000000/build/src/netwerk/cache2/CacheFileIOManager.cpp:725 11:03:15 INFO - #5 0x7fde3f980b89 in mozilla::net::CacheIOThread::LoopOneLevel(unsigned int) /builds/slave/try-l64-asan-00000000000000000/build/src/netwerk/cache2/CacheIOThread.cpp:294 11:03:15 INFO - #6 0x7fde3f98038c in mozilla::net::CacheIOThread::ThreadFunc() /builds/slave/try-l64-asan-00000000000000000/build/src/netwerk/cache2/CacheIOThread.cpp:224 11:03:15 INFO - #7 0x7fde3f97fdd3 in mozilla::net::CacheIOThread::ThreadFunc(void*) /builds/slave/try-l64-asan-00000000000000000/build/src/netwerk/cache2/CacheIOThread.cpp:173 11:03:15 INFO - #8 0x7fde59ddd3cf in _pt_root /builds/slave/try-l64-asan-00000000000000000/build/src/nsprpub/pr/src/pthreads/ptthread.c:216 11:03:15 INFO - #9 0x7fde5e368e99 in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x7e99) 11:03:15 INFO - #10 0x7fde5d4782ec (/lib/x86_64-linux-gnu/libc.so.6+0xf42ec) 11:03:15 INFO - ASAN:SIGSEGV Michal, do you have any idea from top of your head what this could be? If not, I will look further. Trying to obtain logs (not sure this will be reproducible with sync logging) at https://treeherder.mozilla.org/#/jobs?repo=try&revision=ea3283e4df75
Flags: needinfo?(michal.novotny)
Comment 10•9 years ago
|
||
I guess we post write event via CacheFileIOManager::Write(), but we release the buffer that holds the data sooner than WriteEvent::Run() is executed. But I have no idea how this can happen since WriteEvent holds the reference to the caller (mCallback) so the caller shouldn't be destructed. If no callback is passed to the WriteEvent, then the event takes buffer ownership and releases it in the destructor.
Flags: needinfo?(michal.novotny)
Assignee | ||
Comment 11•9 years ago
|
||
Michal, can you please look at bug 1248958 soon? Thanks.
Assignee | ||
Comment 12•9 years ago
|
||
(In reply to Wes Kocher (:KWierso) from comment #7) > I had to back this out for apparently causing a spike in ASAN failures in > test_browserElement_oop_getWebManifest.html > > https://treeherder.mozilla.org/logviewer.html#?job_id=21801214&repo=mozilla- > inbound > https://hg.mozilla.org/integration/mozilla-inbound/rev/eda42451649d Trying with bug 1248958 again: https://treeherder.mozilla.org/#/jobs?repo=try&revision=7ba867dfa9bd
Assignee | ||
Comment 13•9 years ago
|
||
Bug 1248958 has landed on m-i.
Status: NEW → ASSIGNED
Keywords: checkin-needed
Comment 14•9 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/2bfa3b1f1c74
Keywords: checkin-needed
Comment 15•9 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/2bfa3b1f1c74
Status: ASSIGNED → RESOLVED
Closed: 9 years ago → 9 years ago
Resolution: --- → FIXED
Comment 16•9 years ago
|
||
Shoudn't we uplift this patch (together with bug 1248958)? Would be nice to have it in the same release as bug 1249811.
Flags: needinfo?(honzab.moz)
Comment 17•9 years ago
|
||
> in the same release as bug 1249811.
I meant 913822.
Assignee | ||
Comment 18•9 years ago
|
||
(In reply to Michal Novotny (:michal) from comment #16) > Shoudn't we uplift this patch (together with bug 1248958)? Would be nice to > have it in the same release as bug 1249811. Agree. Wanted to do it actually.
Flags: needinfo?(honzab.moz)
Assignee | ||
Comment 19•9 years ago
|
||
Comment on attachment 8718915 [details] [diff] [review] v1 Approval Request Comment [Feature/regressing bug #]: The new HTTP cache [User impact if declined]: This patch greatly helps with shutdown hangs and timeout crashes [Describe test coverage new/current, TreeHerder]: landed on m-c while ago, tested on every browser/platform shutdown [Risks and why]: after several test runs I can say zero when landed together with bug 1248958 [String/UUID change made/needed]: none IMPORTANT: Land bug 1248958 *before* this bug on m-a !!!
Attachment #8718915 -
Flags: approval-mozilla-aurora?
Comment 20•9 years ago
|
||
Tracking for 46 (and maybe 45?) as this should help our shutdown crashes including the one that looks like it's from skype click to play.
status-firefox45:
--- → ?
status-firefox46:
--- → affected
tracking-firefox46:
--- → +
tracking-firefox47:
--- → +
Comment 21•9 years ago
|
||
Comment on attachment 8718915 [details] [diff] [review] v1 This may help with shutdown crashes. Please uplift after bug 1248958
Attachment #8718915 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment 22•9 years ago
|
||
bugherder uplift |
https://hg.mozilla.org/releases/mozilla-aurora/rev/502ad41d8332
Comment 23•8 years ago
|
||
This should be backed out because it causes lot of shutdown crashes in the cache code (bugs 1251130, 1248389, 1254720, 1252746). We can re-land it once we find a proper fix for CacheFileIOManager shutdown.
Status: RESOLVED → REOPENED
Flags: needinfo?(honzab.moz)
Resolution: FIXED → ---
Assignee | ||
Comment 24•8 years ago
|
||
(In reply to Michal Novotny (:michal) from comment #23) > This should be backed out because it causes lot of shutdown crashes in the > cache code (bugs 1251130, 1248389, 1254720, 1252746). We can re-land it once > we find a proper fix for CacheFileIOManager shutdown. I can see only one bug we have here - bug 1251130 that needs a proper fix. Do you believe there will be more even if we find a proper fix for it? Otherwise, we can back *this* one out, but we'll again start crashing on shutdown timeout. Hence, similar bad impact on users. Let's try to find a proper fix for bug 1251130 first. If that is going to be complicated or hard to uplift, let's back this out (from branches or from nightly as well - what would make really sad).
Flags: needinfo?(honzab.moz) → needinfo?(michal.novotny)
Assignee | ||
Comment 26•8 years ago
|
||
We are soon going to have a patch for bug 1251130.
Status: REOPENED → RESOLVED
Closed: 9 years ago → 8 years ago
Flags: needinfo?(michal.novotny)
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•