Closed
Bug 1342366
Opened 7 years ago
Closed 7 years ago
Crash in nsWyciwygChannel::GetCharsetAndSource
Categories
(Core :: Networking, defect)
Tracking
()
RESOLVED
FIXED
mozilla55
People
(Reporter: philipp, Assigned: michal)
Details
(Keywords: crash, Whiteboard: [necko-active])
Crash Data
Attachments
(2 files, 4 obsolete files)
22.64 KB,
patch
|
michal
:
review+
gchang
:
approval-mozilla-aurora+
gchang
:
approval-mozilla-beta+
|
Details | Diff | Splinter Review |
22.47 KB,
patch
|
ritu
:
approval-mozilla-esr52+
|
Details | Diff | Splinter Review |
This bug was filed from the Socorro interface and is report bp-1b533675-4b62-4c72-9f25-5ef0c2170224. ============================================================= Crashing Thread (0) Frame Module Signature Source 0 xul.dll nsWyciwygChannel::GetCharsetAndSource(int*, nsACString_internal&) netwerk/protocol/wyciwyg/nsWyciwygChannel.cpp:641 1 xul.dll mozilla::net::WyciwygChannelParent::OnStartRequest(nsIRequest*, nsISupports*) netwerk/protocol/wyciwyg/WyciwygChannelParent.cpp:276 2 xul.dll nsWyciwygChannel::NotifyListener() netwerk/protocol/wyciwyg/nsWyciwygChannel.cpp:855 3 xul.dll mozilla::detail::RunnableMethodImpl<void ( nsDocument::*)(void), 0, 0>::Run() obj-firefox/dist/include/nsThreadUtils.h:768 4 xul.dll nsThread::ProcessNextEvent(bool, bool*) xpcom/threads/nsThread.cpp:1067 5 xul.dll mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) ipc/glue/MessagePump.cpp:96 6 xul.dll MessageLoop::RunHandler() ipc/chromium/src/base/message_loop.cc:225 7 xul.dll nsBaseAppShell::Run() widget/nsBaseAppShell.cpp:156 8 xul.dll nsAppShell::Run() widget/windows/nsAppShell.cpp:262 9 xul.dll nsAppStartup::Run() toolkit/components/startup/nsAppStartup.cpp:283 10 xul.dll XREMain::XRE_mainRun() toolkit/xre/nsAppRunner.cpp:4401 11 xul.dll XREMain::XRE_main(int, char** const, nsXREAppData const*) toolkit/xre/nsAppRunner.cpp:4534 12 xul.dll XRE_main toolkit/xre/nsAppRunner.cpp:4625 this cross-platform crash signature has been going on for a while. many user comments in crash reports mention being on imdb.com and clicking on the back button (navigating back a page) there.
Comment 1•7 years ago
|
||
Murder in a locked room...
Assignee | ||
Updated•7 years ago
|
Assignee: nobody → michal.novotny
Whiteboard: [necko-active]
Assignee | ||
Comment 2•7 years ago
|
||
It's crashing on mCacheEntry because it is nulled out on background thread in CloseCacheEntryInternal(). The patch changes: - CloseCacheEntry() is called in NotifyListener() after OnStart/OnStart is called and after mIsPending is set to false - GetCharsetAndSource() returns early if mIsPending is false, so no race is possible even if the listener calls it later than from OnStart/OnStop. - OpenCacheEntry() changed so that OnCacheEntryAvailable() is called only when opening entry for reading - OnCacheEntryAvailable() was cleaned a bit, there was IMO a bug that channel was not canceled when ReadFromCache() failed.
Attachment #8842825 -
Flags: review?(honzab.moz)
Assignee | ||
Comment 3•7 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=0f76d56a77cd3e42638d543c63afb37e5f50aacd
Comment 4•7 years ago
|
||
Comment on attachment 8842825 [details] [diff] [review] fix Review of attachment 8842825 [details] [diff] [review]: ----------------------------------------------------------------- I think with cache2 we can just back bug 599127 out (or the relevant parts of it)? That would fix this crash and remove some unnecessary complexity, IMO. ::: netwerk/protocol/wyciwyg/nsWyciwygChannel.cpp @@ +440,5 @@ > // open a cache entry for this channel... > // mIsPending set to true since OnCacheEntryAvailable may be called > // synchronously and fails when mIsPending found false. > mIsPending = true; > + nsresult rv = OpenCacheEntry(mURI, false); I liked the flags more, here you have to know what the bool means. But whatever. @@ +629,5 @@ > aCharset = mCharset; > return NS_OK; > } > > + if (!mIsPending) { so, is mIsPending accessed only on a single thread? some thread assertions would be good to add. @@ +637,1 @@ > if (!mCacheEntry) { this is accessed concurrently. the usual problem - a refptr accessed on more than one thread. this should be locked, imo. OTOH, there is no need for the CloseCacheEntryInternal method anymore. Could we just back bug 599127 out? @@ +870,5 @@ > mListener = nullptr; > mListenerContext = nullptr; > } > > mIsPending = false; hmm.. interesting we live with this so long. The correct pending state should change as following: ->OnStart() mIsPending = false ->OnStop()
Attachment #8842825 -
Flags: review?(honzab.moz) → feedback-
Comment 5•7 years ago
|
||
I frequently get this crash when using the back button to navigate through IMDB pages.
Comment 6•7 years ago
|
||
Most recent crash reports: https://crash-stats.mozilla.com/report/index/d8095c9d-502e-4974-b286-0d9bf2170307 https://crash-stats.mozilla.com/report/index/9c3d7937-0a51-4660-acc5-83a272170307
Assignee | ||
Comment 8•7 years ago
|
||
Backing out bug 599127 makes sense. I'll create a patch soon.
Flags: needinfo?(michal.novotny)
Assignee | ||
Comment 9•7 years ago
|
||
Attachment #8842825 -
Attachment is obsolete: true
Attachment #8845681 -
Flags: review?(honzab.moz)
Comment 10•7 years ago
|
||
Comment on attachment 8845681 [details] [diff] [review] patch v2 Review of attachment 8845681 [details] [diff] [review]: ----------------------------------------------------------------- lgtm, but I think we can simplify even a bit more, + few other notes (for a possible followup) ::: netwerk/protocol/wyciwyg/nsWyciwygChannel.cpp @@ -130,5 @@ > - // with bug 913828. > - MOZ_ASSERT(NS_IsMainThread()); > - nsCOMPtr<nsICacheService> service = > - do_GetService(NS_CACHESERVICE_CONTRACTID, &rv); > - } I'd rather leave this bit. I am not sure if comm-central has migrated to cache2 already or not. @@ +431,5 @@ > rv = mCacheEntry->SetMetaDataElement("inhibit-persistent-caching", "1"); > if (NS_FAILED(rv)) return rv; > } > > + if (mNeedToSetSecurityInfo) { what if you just always set it? will it have any performance/IO op effect? @@ +480,5 @@ > NS_IMETHODIMP > nsWyciwygChannel::SetSecurityInfo(nsISupports *aSecurityInfo) > { > + if (mMode == READING) { > + return NS_ERROR_UNEXPECTED; maybe even assert(false)? @@ -628,5 @@ > - if (mCharsetAndSourceSet) { > - *aSource = mCharsetSource; > - aCharset = mCharset; > - return NS_OK; > - } looks like that this was working even when we didn't have a cache entry. shouldn't we preserve it? can SetCharsetAndSource be called also when we are in the READ mode? If yes, then it's a bit weird. If no, OTOH, all these mNeedToWrite* etc flags are here IMO just because we were opening the cache entry for writing asynchronously. Now we open for write with OpenTruncate, so that after AsyncOpen we always have an entry. So, I think we can remove all these 'caches' to write to the cache entry later, or not? @@ -855,5 @@ > > void > nsWyciwygChannel::NotifyListener() > -{ > - if (mListener) { any reason not to check we have a listener? it scares me a bit... also, from experience - turned to an actual new rule when calling callbacks, the code should even be like this: nsCOMPtr<> listener = mListener; if (listener) listener->OnStart() mPending = false if (listener) listener->OnStop() mListener = nullptr; the same should be in OnStart/StopRequest but up to you to do this change in a followup to separate potential regressions.
Attachment #8845681 -
Flags: review?(honzab.moz) → feedback+
Assignee | ||
Comment 11•7 years ago
|
||
(In reply to Honza Bambas (:mayhemer) from comment #10) > ::: netwerk/protocol/wyciwyg/nsWyciwygChannel.cpp > @@ -130,5 @@ > > - // with bug 913828. > > - MOZ_ASSERT(NS_IsMainThread()); > > - nsCOMPtr<nsICacheService> service = > > - do_GetService(NS_CACHESERVICE_CONTRACTID, &rv); > > - } > > I'd rather leave this bit. I am not sure if comm-central has migrated to > cache2 already or not. And how is this related? WyciwygChannel will no longer use cache off the main thread, so why is it needed? > @@ +431,5 @@ > > rv = mCacheEntry->SetMetaDataElement("inhibit-persistent-caching", "1"); > > if (NS_FAILED(rv)) return rv; > > } > > > > + if (mNeedToSetSecurityInfo) { > > what if you just always set it? will it have any performance/IO op effect? I didn't measure it, but it seems to me unnecessary to serialize security info and write it into metadata in CacheEntry::SetSecurityInfo every time document.write is called. > @@ -628,5 @@ > > - if (mCharsetAndSourceSet) { > > - *aSource = mCharsetSource; > > - aCharset = mCharset; > > - return NS_OK; > > - } > > looks like that this was working even when we didn't have a cache entry. > shouldn't we preserve it? > > can SetCharsetAndSource be called also when we are in the READ mode? If > yes, then it's a bit weird. I'm pretty sure it is called also in READ mode, but I don't remember where does the call come from. Both GetCharsetAndSource and SetCharsetAndSource are called in READ mode when we have the entry. GetCharsetAndSource isn't called in WRITE mode and SetCharsetAndSource is called before first write so mode is still NONE. So it should work with the way how we use wyciwyg channel now. I can add couple of assertions to make sure we know when the use case changes. > If no, OTOH, all these mNeedToWrite* etc flags are here IMO just because we > were opening the cache entry for writing asynchronously. Now we open for > write with OpenTruncate, so that after AsyncOpen we always have an entry. > So, I think we can remove all these 'caches' to write to the cache entry > later, or not? No, as stated above, it is called before we open the cache entry. > > void > > nsWyciwygChannel::NotifyListener() > > -{ > > - if (mListener) { > > any reason not to check we have a listener? it scares me a bit... also, It cannot be null. Listener is checked in AsyncOpen and there is no other place where it could be nulled out. > from experience - turned to an actual new rule when calling callbacks, the > code should even be like this: > > nsCOMPtr<> listener = mListener; > if (listener) > listener->OnStart() > mPending = false > if (listener) > listener->OnStop() > > mListener = nullptr; > > the same should be in OnStart/StopRequest I'm not against.
Flags: needinfo?(honzab.moz)
Comment 12•7 years ago
|
||
(In reply to Michal Novotny (:michal) from comment #11) > (In reply to Honza Bambas (:mayhemer) from comment #10) > > ::: netwerk/protocol/wyciwyg/nsWyciwygChannel.cpp > > @@ -130,5 @@ > > > - // with bug 913828. > > > - MOZ_ASSERT(NS_IsMainThread()); > > > - nsCOMPtr<nsICacheService> service = > > > - do_GetService(NS_CACHESERVICE_CONTRACTID, &rv); > > > - } > > > > I'd rather leave this bit. I am not sure if comm-central has migrated to > > cache2 already or not. > > And how is this related? WyciwygChannel will no longer use cache off the > main thread, so why is it needed? Yes! That's true. Remove it then. > > > @@ +431,5 @@ > > > rv = mCacheEntry->SetMetaDataElement("inhibit-persistent-caching", "1"); > > > if (NS_FAILED(rv)) return rv; > > > } > > > > > > + if (mNeedToSetSecurityInfo) { > > > > what if you just always set it? will it have any performance/IO op effect? > > I didn't measure it, but it seems to me unnecessary to serialize security > info and write it into metadata in CacheEntry::SetSecurityInfo every time > document.write is called. OK, let's leave it. > > > @@ -628,5 @@ > > > - if (mCharsetAndSourceSet) { > > > - *aSource = mCharsetSource; > > > - aCharset = mCharset; > > > - return NS_OK; > > > - } > > > > looks like that this was working even when we didn't have a cache entry. > > shouldn't we preserve it? > > > > can SetCharsetAndSource be called also when we are in the READ mode? If > > yes, then it's a bit weird. > > I'm pretty sure it is called also in READ mode, but I don't remember where > does the call come from. Both GetCharsetAndSource and SetCharsetAndSource > are called in READ mode when we have the entry. GetCharsetAndSource isn't > called in WRITE mode and SetCharsetAndSource is called before first write so > mode is still NONE. So it should work with the way how we use wyciwyg > channel now. I can add couple of assertions to make sure we know when the > use case changes. That's alright, still, this is bug to a fix a crash, not wysiwyg channel. > > > If no, OTOH, all these mNeedToWrite* etc flags are here IMO just because we > > were opening the cache entry for writing asynchronously. Now we open for > > write with OpenTruncate, so that after AsyncOpen we always have an entry. > > So, I think we can remove all these 'caches' to write to the cache entry > > later, or not? > > No, as stated above, it is called before we open the cache entry. > > > > void > > > nsWyciwygChannel::NotifyListener() > > > -{ > > > - if (mListener) { > > > > any reason not to check we have a listener? it scares me a bit... also, > > It cannot be null. Listener is checked in AsyncOpen and there is no other > place where it could be nulled out. I'd still rather leave at least the null check in. > > > from experience - turned to an actual new rule when calling callbacks, the > > code should even be like this: > > > > nsCOMPtr<> listener = mListener; > > if (listener) > > listener->OnStart() > > mPending = false > > if (listener) > > listener->OnStop() > > > > mListener = nullptr; > > > > the same should be in OnStart/StopRequest > > I'm not against. OK, I'll change the f+ to r+, feel free to update the patch and just carry forward.
Flags: needinfo?(honzab.moz)
Comment 13•7 years ago
|
||
Comment on attachment 8845681 [details] [diff] [review] patch v2 r+ based on comment 12. We can't remove some internal caches for charset and secinfo.
Attachment #8845681 -
Flags: feedback+ → review+
Assignee | ||
Comment 14•7 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=29980803da0919a55062031bc305946a6e0365bb
Attachment #8845681 -
Attachment is obsolete: true
Attachment #8845971 -
Flags: review+
Assignee | ||
Comment 15•7 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=7df6560f902c0f07d671e1ec652b89b89cc9570a&selectedJob=83729531 I had to change the assertion in nsWyciwygChannel::SetCharsetAndSource. With e10s turned on, nsWyciwygChannel::SetCharsetAndSource is called in reading mode when we already closed and released the entry. This is now a new bug, it happens without this patch too. I'll file a follow-up bug for this.
Attachment #8845971 -
Attachment is obsolete: true
Attachment #8847471 -
Flags: review+
Comment 16•7 years ago
|
||
Pushed by mnovotny@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/749de78b4831 Crash in nsWyciwygChannel::GetCharsetAndSource, r=honzab
Comment 17•7 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/749de78b4831
Status: NEW → RESOLVED
Closed: 7 years ago
status-firefox55:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
Updated•7 years ago
|
Comment 18•7 years ago
|
||
No reports on Nightly since this fix landed. Is this something we can consider uplifting to Aurora/Beta/ESR52?
Flags: needinfo?(michal.novotny)
Assignee | ||
Comment 19•7 years ago
|
||
Comment on attachment 8847471 [details] [diff] [review] patch v4 Approval Request Comment [Feature/Bug causing the regression]: 599127 + e10s [User impact if declined]: possible crash when navigating back to wyciwyg document [Is this code covered by automated tests?]: there are some tests that generate wyciwyg documents as well as navigate to them [Has the fix been verified in Nightly?]: there are no reports of the crash after landing the patch [Needs manual test from QE? If yes, steps to reproduce]: no, because there are no exact STR to trigger the crash [List of other uplifts needed for the feature/fix]: none [Is the change risky?]: the risk is low [Why is the change risky/not risky?]: the patch mostly reverts changes from bug 599127 and simplifies the code a lot [String changes made/needed]: none
Flags: needinfo?(michal.novotny)
Attachment #8847471 -
Flags: approval-mozilla-beta?
Attachment #8847471 -
Flags: approval-mozilla-aurora?
Comment 20•7 years ago
|
||
Comment on attachment 8847471 [details] [diff] [review] patch v4 Fix a crash. Aurora54+ & Beta53+.
Attachment #8847471 -
Flags: approval-mozilla-beta?
Attachment #8847471 -
Flags: approval-mozilla-beta+
Attachment #8847471 -
Flags: approval-mozilla-aurora?
Attachment #8847471 -
Flags: approval-mozilla-aurora+
Comment 21•7 years ago
|
||
bugherder uplift |
https://hg.mozilla.org/releases/mozilla-aurora/rev/d40938cc2aee
Comment 22•7 years ago
|
||
needs rebasing for beta: rafting 408128:d40938cc2aee "Bug 1342366 - Crash in nsWyciwygChannel::GetCharsetAndSource, r=honzab a=gchang" merging netwerk/protocol/wyciwyg/nsWyciwygChannel.cpp warning: conflicts while merging netwerk/protocol/wyciwyg/nsWyciwygChannel.cpp! (edit, then use 'hg resolve --mark') abort: unresolved conflicts, can't continue (use 'hg resolve' and 'hg graft --continue')
Flags: needinfo?(michal.novotny)
Assignee | ||
Comment 23•7 years ago
|
||
Flags: needinfo?(michal.novotny)
Comment 24•7 years ago
|
||
Comment on attachment 8852094 [details] [diff] [review] patch for beta53 This doesn't apply to Beta at all. patching file netwerk/protocol/wyciwyg/nsWyciwygChannel.cpp Hunk #1 FAILED at 8 Hunk #3 FAILED at 416 Hunk #4 succeeded at 456 with fuzz 1 (offset 5 lines). Hunk #5 FAILED at 508 Hunk #6 FAILED at 620 Hunk #8 FAILED at 697 Hunk #9 FAILED at 793 6 out of 9 hunks FAILED -- saving rejects to file netwerk/protocol/wyciwyg/nsWyciwygChannel.cpp.rej patching file netwerk/protocol/wyciwyg/nsWyciwygChannel.h Hunk #2 FAILED at 38 Hunk #3 succeeded at 101 with fuzz 1 (offset 0 lines). 1 out of 3 hunks FAILED -- saving rejects to file netwerk/protocol/wyciwyg/nsWyciwygChannel.h.rej abort: patch failed to apply
Attachment #8852094 -
Attachment is obsolete: true
Flags: needinfo?(michal.novotny)
Assignee | ||
Comment 25•7 years ago
|
||
(In reply to Ryan VanderMeulen [:RyanVM] from comment #24) > Comment on attachment 8852094 [details] [diff] [review] > patch for beta53 > > This doesn't apply to Beta at all. Sorry, the patch was against an old beta tree (although I'm pretty sure I did hg pull && hg up). This patch should work.
Flags: needinfo?(michal.novotny)
Comment 26•7 years ago
|
||
On the bright side, I bet it'll work nicely for ESR52 when the time comes :D
Comment 27•7 years ago
|
||
bugherder uplift |
https://hg.mozilla.org/releases/mozilla-beta/rev/0c857c62e32e
Comment 28•7 years ago
|
||
Comment on attachment 8852266 [details] [diff] [review] patch for beta53 crash-stats isn't showing any hits with signature after 53.0b7, which is exactly what we'd expect given when the patch was uplifted. Fx52 hits this pretty frequently, so I think we should move forward with getting this onto esr52 for the next release. I've verified that it grafts cleanly.
Attachment #8852266 -
Flags: approval-mozilla-esr52?
Comment on attachment 8852266 [details] [diff] [review] patch for beta53 While this crash is low volume, 50 crash occurrences in past 7 days, I am still taking it because the fix has worked well on nightly/beta/release. ESR52+
Attachment #8852266 -
Flags: approval-mozilla-esr52? → approval-mozilla-esr52+
tracking-firefox-esr52:
--- → 54+
Comment 30•7 years ago
|
||
bugherder uplift |
https://hg.mozilla.org/releases/mozilla-esr52/rev/fb43f6690a26
Comment 31•7 years ago
|
||
(In reply to Michal Novotny (:michal) from comment #19) > [Is this code covered by automated tests?]: there are some tests that > generate wyciwyg documents as well as navigate to them > [Has the fix been verified in Nightly?]: there are no reports of the crash > after landing the patch > [Needs manual test from QE? If yes, steps to reproduce]: no, because there > are no exact STR to trigger the crash Setting qe-verify- based on Michal's assessment on manual testing needs and the fact that this fix has automated coverage.
Flags: qe-verify-
You need to log in
before you can comment on or make changes to this bug.
Description
•