Closed
Bug 1308652
Opened 8 years ago
Closed 8 years ago
Remove about:bloat
Categories
(Core :: Networking, defect, P3)
Core
Networking
Tracking
()
RESOLVED
FIXED
mozilla52
People
(Reporter: mccr8, Assigned: mccr8)
References
Details
Attachments
(2 files)
58 bytes,
text/x-review-board-request
|
froydnj
:
review+
ritu
:
approval-mozilla-aurora+
ritu
:
approval-mozilla-beta+
|
Details |
58 bytes,
text/x-review-board-request
|
froydnj
:
review+
ritu
:
approval-mozilla-aurora+
ritu
:
approval-mozilla-beta+
|
Details |
I'm looking at bug 1271182 some more, and in particular the negative leaks of an IPC::Message, and my current best guess is that there is some kind of double free occurring. Either that or there is some mistake with, say, the way MOZ_COUNT_CTOR or DTOR is called with a move or copy constructor for IPC::Message.
I added a check to NS_LogDtor, in the part of the code relating to tracking serial numbers of live objects, to ensure that when we destroy an object that we currently think it is live, and it fails to match that, with this stack:
#01: IPC::Message::~Message() [mfbt/RefPtr.h:77]
#02: mozilla::ipc::MessageChannel::DequeueOne(IPC::Message*) [clang/include/c++/v1/deque:2572]
#03: mozilla::ipc::MessageChannel::OnMaybeDequeueOne() [ipc/glue/MessageChannel.cpp:1567]
#04: mozilla::detail::RunnableMethodImpl<bool (mozilla::ipc::MessageChannel::*)(), false, true>::Run [xpcom/glue/nsThreadUtils.h:729]
#05: mozilla::ipc::MessageChannel::DequeueTask::Run() [ipc/glue/MessageChannel.h:561]
#06: nsThread::ProcessNextEvent(bool, bool*) [xpcom/glue/nsCOMPtr.h:402]
#07: NS_ProcessPendingEvents(nsIThread*, unsigned int) [xpcom/glue/nsThreadUtils.cpp:232]
#08: nsBaseAppShell::NativeEventCallback() [widget/nsBaseAppShell.cpp:98]
[...]
#24: XREMain::XRE_main(int, char**, nsXREAppData const*) [toolkit/xre/nsAppRunner.cpp:4542]
from this try run: https://treeherder.mozilla.org/#/jobs?repo=try&revision=022493e337d1
ASan runs haven't picked up anything like this, so a logging problem seems more likely, but then again this negative leak issue is usually happening on OSX, and we only run ASan on Linux. I'm adding some additional instrumentation to distinguish the case where we never saw a ctor at all from the case where we already saw a dtor for the object. Hopefully I'll still be able to reproduce it.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=49def6c4c5e633a160662498e6723a95e5f31513
Assignee | ||
Comment 1•8 years ago
|
||
I should also say that the negative leak of an IPC::Message always occurs alongside a negative leak of CancelableRunnable and a Runnable, but sometimes there is a negative leak of the runnables without the message.
Assignee | ||
Comment 2•8 years ago
|
||
The test in question, browser_aboutURLs.js, loads every about: page. I'm not sure what the goal of that is. In one failing run that I added logging to, the "double free" was happening during the loading of about:bloat. This page, with the behavior defined in nsAboutBloat.cpp, calls nsTraceRefcnt::DumpStatistics(). One of the first things that method does is set gLogging to NoLogging, which stops functions like NS_LogCtor from doing anything.
I suspect what is happening is that while we're in the middle of this call, some other thread (in this particular case it appears to be IPC related, but I'm not sure how this is happening on another thread) creates some stuff, but nsTraceRefcnt doesn't record that fact because it is NoLogging. Then later we destroy that object, so we get a negative leak.
The easiest fix is just to make this test not do about:bloat (there's already a mechanism for it). Fixing about:bloat doesn't really seem like it is worth the effort, but maybe at least a non-fatal assert for this situation would at least prevent this mysterious failures.
I'll confirm my hypothesis and unhide this bug if it is correct.
As a side note, the test browser_aboutURLs.js looks like it is permafail on L64 debug now:
Assertion failure: originAttrsLoadInfo.mUserContextId == originAttrsLoadContext.mUserContextId (The value of mUserContextId in the loadContext and in the loadInfo are not the same!)
https://treeherder.mozilla.org/#/jobs?repo=try&revision=0bc0f0053c844e2d614c9f585cbb7225caea2929&selectedJob=28866297
I'm not sure how bad that assertion is...
![]() |
||
Comment 3•8 years ago
|
||
(In reply to Andrew McCreight [:mccr8] from comment #2)
> The easiest fix is just to make this test not do about:bloat (there's
> already a mechanism for it). Fixing about:bloat doesn't really seem like it
> is worth the effort, but maybe at least a non-fatal assert for this
> situation would at least prevent this mysterious failures.
Does anybody really use about:bloat at this point? Could we just remove it entirely?
Assignee | ||
Comment 4•8 years ago
|
||
(In reply to Nathan Froyd [:froydnj] from comment #3)
> Does anybody really use about:bloat at this point? Could we just remove it
> entirely?
I doubt it. I'll email dbaron, as he seems like the most likely person.
Assignee | ||
Comment 5•8 years ago
|
||
dbaron said via email he does not use about:bloat. I'll remove it.
I haven't been able to verify that this is what is happening here, but I'm almost certain that is what the issue is.
There's some nsTraceRefcnt infrastructure that exists only to support about:bloat, and I'll file a followup to remove that.
Blocks: 1271182
Group: dom-core-security
Component: XPCOM → Networking
Summary: Possible double free in MessageChannel::DequeueOne() → Remove about:bloat
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Assignee | ||
Comment 8•8 years ago
|
||
try push: https://treeherder.mozilla.org/#/jobs?repo=try&revision=d160c400f75002751696c764c5490b68e661d6d9
I'm waiting on some retriggers of bc4 to try to get higher confidence that this actually fixed the issue.
![]() |
||
Comment 9•8 years ago
|
||
mozreview-review |
Comment on attachment 8799551 [details]
Bug 1308652, part 2 - Only allow nsTraceRefcnt::DumpStatistics to be called once.
https://reviewboard.mozilla.org/r/84692/#review83298
Attachment #8799551 -
Flags: review?(nfroyd) → review+
![]() |
||
Comment 10•8 years ago
|
||
mozreview-review |
Comment on attachment 8799550 [details]
Bug 1308652, part 1 - Remove about:bloat.
https://reviewboard.mozilla.org/r/84690/#review83296
Do we not need to remove any tests that reference this? Or are those tests perma-disabled everywhere?
Attachment #8799550 -
Flags: review?(nfroyd) → review+
Assignee | ||
Comment 11•8 years ago
|
||
(In reply to Nathan Froyd [:froydnj] from comment #10)
> Do we not need to remove any tests that reference this? Or are those tests
> perma-disabled everywhere?
There are no tests as far as I can tell, aside from aboutURLs which just iterates over every about: page automatically, so no changes are needed. If there were, there would be more intermittent leaks.
Comment 12•8 years ago
|
||
Pushed by amccreight@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/e90822faa5d1
part 1 - Remove about:bloat. r=froydnj
https://hg.mozilla.org/integration/autoland/rev/ebae7411d587
part 2 - Only allow nsTraceRefcnt::DumpStatistics to be called once. r=froydnj
Comment 13•8 years ago
|
||
(In reply to Andrew McCreight [:mccr8] from comment #2)
> As a side note, the test browser_aboutURLs.js looks like it is permafail on
> L64 debug now:
Is there a follow-up bug filed for fixing and re-enabling this test on Linux/Windows?
Comment 14•8 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/e90822faa5d1
https://hg.mozilla.org/mozilla-central/rev/ebae7411d587
Status: NEW → RESOLVED
Closed: 8 years ago
status-firefox52:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla52
Assignee | ||
Comment 15•8 years ago
|
||
Comment on attachment 8799550 [details]
Bug 1308652, part 1 - Remove about:bloat.
Approval Request Comment
[Feature/regressing bug #]: Sort of bug 1266022, though the underlying issue is very old. (from at least 1999...)
[User impact if declined]: None. This is just causing intermittent failures on TreeHerder.
[Describe test coverage new/current, TreeHerder]: I added some assertions to guard against this particular failure.
[Risks and why]: Very low. It just deletes some debug-only code.
[String/UUID change made/needed]: none
Attachment #8799550 -
Flags: approval-mozilla-beta?
Attachment #8799550 -
Flags: approval-mozilla-aurora?
Assignee | ||
Updated•8 years ago
|
Attachment #8799551 -
Flags: approval-mozilla-beta?
Attachment #8799551 -
Flags: approval-mozilla-aurora?
Assignee | ||
Comment 16•8 years ago
|
||
(In reply to Ryan VanderMeulen [:RyanVM] from comment #13)
> Is there a follow-up bug filed for fixing and re-enabling this test on
> Linux/Windows?
I was just going to do that in bug 1271182.
status-firefox50:
--- → affected
status-firefox51:
--- → affected
Comment on attachment 8799550 [details]
Bug 1308652, part 1 - Remove about:bloat.
Fixes an intermittent, the patch looks trivial, Aurora51+, Beta50+
Attachment #8799550 -
Flags: approval-mozilla-beta?
Attachment #8799550 -
Flags: approval-mozilla-beta+
Attachment #8799550 -
Flags: approval-mozilla-aurora?
Attachment #8799550 -
Flags: approval-mozilla-aurora+
Attachment #8799551 -
Flags: approval-mozilla-beta?
Attachment #8799551 -
Flags: approval-mozilla-beta+
Attachment #8799551 -
Flags: approval-mozilla-aurora?
Attachment #8799551 -
Flags: approval-mozilla-aurora+
Comment 18•8 years ago
|
||
bugherder uplift |
Comment 19•8 years ago
|
||
bugherder uplift |
You need to log in
before you can comment on or make changes to this bug.
Description
•