gfx crash on memory-pressure notification

RESOLVED FIXED in mozilla1.9.3a1

Status

()

defect
RESOLVED FIXED
10 years ago
10 years ago

People

(Reporter: vlad, Assigned: vlad)

Tracking

({fixed1.9.0.15})

Trunk
mozilla1.9.3a1
Points:
---
Dependency tree / graph
Bug Flags:
blocking1.9.2 +
blocking1.9.0.15 +
wanted1.9.0.x +
wanted1.8.1.x -
in-testsuite +

Firefox Tracking Flags

(status1.9.2 beta1-fixed, blocking1.9.1 .4+, status1.9.1 .4-fixed)

Details

(Whiteboard: [nv])

Attachments

(2 attachments)

Posted patch fixSplinter Review
Went over this with karlt and stuart in irc..

13:22 < vlad> so there's this function and code: http://mxr.mozilla.org/mozilla-
central/source/gfx/src/thebes/nsThebesDeviceContext.cpp#258
13:23 < vlad> it calls NS_RELEASE and then maybe addrefs again
13:23 < vlad> but
13:23 < vlad> http://mxr.mozilla.org/mozilla-central/source/gfx/src/thebes/nsThebesFontMetrics.cpp#59
13:23 < vlad> Destroy() in nsThebesFontMetrics doesn't do anything
13:24 < vlad> and the destructor just deletes a member
13:24 < vlad> so it'll never get removed from that font cache array.. and we'll addref bogus memory, no?
13:25 < vlad> as best I can tell from mxr, nothing ever calls FontMetricsDeleted in our code
13:25 < vlad> (I'm hitting a crash in Compact, when I issue a memory pressure notification)
13:32 < karl> yeah, doesn't look right to me either
13:32 < vlad> I don't even know what that font cache does
13:32 < vlad> I trhink I have it fixed, at least not to crash
13:32 < vlad> I just call the FontMetricsDeleted from the destructor on mDeviceContext
13:33 < karl> i thought it did look right at one time in the past, but maybe i just believed the comments
13:36 < karl> this pattern appears quite a bit:
13:36 < karl> fm->Destroy();
13:36 < karl> NS_RELEASE(fm);
13:37 < vlad> fm->Destroy() should set its device context to null from I can tell
13:37 < vlad> and actually releasing it shouldn' call FontMetricsDeleted if the dc is null
13:37 < vlad> in the places where that pattern appears, fm is destroyed/released only if initialization fails
13:40 < karl> are you seeing the crash in AddRef()?
13:43 < vlad> in the release, actually
13:43 < vlad> the addrefs happen too quickly, the memory still looks semi-valid
13:43 < karl> looks like FontMetricsDeleted is meant to be called from FontMetricsDeleted
13:43 < vlad> but the second time around through a memory pressure notifier after that memory got reused
13:43 < karl> i mean
13:44 < karl> looks like FontMetricsDeleted is meant to be called from ~nsThebesFontMetrics
Attachment #393386 - Flags: review?(mozbugz)
this sounds like a bug romaxa filed
Attachment #393386 - Flags: review?(mozbugz) → review+
the bug i had in mind is bug 503784.

note that not filing bugs with crash signatures as you did here means that it's hard for people to find duplicates. please cooperate with the other users of bugzilla by including crash signatures in your bug summaries.
i'd like to further point out that i specifically asked you, vlad, for a review in that bug, and you bounced the review. i consider this inappropriate.
http://hg.mozilla.org/mozilla-central/rev/11bd05bebf85

I am, as part of my job, doing fewer gfx reviews, which is why that review was bounced (as were most gfx-related reviews that would have good reviewers amongst the gfx team).
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Flags: blocking1.9.2+
Keywords: fixed1.9.2
Whiteboard: [nv]
I'm still seeing a bug that looks a lot like this when I use the faststart component's memory-pressure feature:

>	xul.dll!nsFontCache::Compact()  Line 272 + 0x9 bytes	C++
 	xul.dll!nsThebesDeviceContext::Observe(nsISupports * aSubject=0x00000000, const char * aTopic=0x0521e298, const wchar_t * aSomeData=0x0364cec8)  Line 358	C++
 	xul.dll!nsObserverList::NotifyObservers(nsISupports * aSubject=0x00000000, const char * aTopic=0x0521e298, const wchar_t * someData=0x0364cec8)  Line 129	C++
 	xul.dll!nsObserverService::NotifyObservers(nsISupports * aSubject=0x00000000, const char * aTopic=0x0521e298, const wchar_t * someData=0x0364cec8)  Line 185	C++
 	xul.dll!NS_InvokeByIndex_P(nsISupports * that=0x00000005, unsigned int methodIndex=0x00000003, unsigned int paramCount=0x0012d8f0, nsXPTCVariant * params=0x00000001)  Line 102	C++
 	xul.dll!XPCWrappedNative::CallMethod(XPCCallContext & ccx={...}, XPCWrappedNative::CallMode mode=CALL_METHOD)  Line 2710 + 0x20 bytes	C++
we're still seeing this crash
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Blocks: 451670
Hmm, my comment was lost earlier.  Need more information from the crash, like looking at data and all that.  karlt should look at/fix this, because we can probably just get rid of most of this code.
just take a look at http://mxr.mozilla.org/mozilla-central/source/gfx/src/thebes/nsThebesDeviceContext.cpp#258 -- its use of oldfm is clearly horrible.
or maybe the code is just gross, and not wrong.  can get a stack -- it crashes all the time on maemo.
A bug was fixed here.  Let's fix the next bug in bug 503784.
Status: REOPENED → RESOLVED
Closed: 10 years ago10 years ago
Resolution: --- → FIXED
No longer blocks: 451670
Group: core-security
blocking1.9.1: --- → ?
Flags: blocking1.9.0.14?
Attachment #393386 - Flags: approval1.9.1.3?
Attachment #393386 - Flags: approval1.9.0.14?
Flags: blocking1.9.0.14? → blocking1.9.0.15?
Comment on attachment 393386 [details] [diff] [review]
fix

Pushing out approval requests...
Attachment #393386 - Flags: approval1.9.1.4?
Attachment #393386 - Flags: approval1.9.1.3?
Attachment #393386 - Flags: approval1.9.0.15?
Attachment #393386 - Flags: approval1.9.0.14?
Attachment #393386 - Flags: superreview?(pavlov)
Attachment #393386 - Flags: superreview?(pavlov) → superreview+
Tony, due to the lack of a device I'm not able to verify this fix. Could you or Marcia check if the crash went away? Thanks.
Target Milestone: --- → mozilla1.9.3a1
The bugs that depend on bug 503784 seem to confirm that this is now fixed but we should be able to write a test using nsIMemory::heapMinimize().
blocking1.9.1: ? → .4+
Flags: wanted1.9.0.x+
Flags: wanted1.8.1.x-
Flags: blocking1.9.0.15?
Flags: blocking1.9.0.15+
Whiteboard: [nv] → [nv][sg:critical?]
Comment on attachment 393386 [details] [diff] [review]
fix

Approved for 1.9.1.4 and 1.9.0.15, a=dveditz for release-drivers
Attachment #393386 - Flags: approval1.9.1.4?
Attachment #393386 - Flags: approval1.9.1.4+
Attachment #393386 - Flags: approval1.9.0.15?
Attachment #393386 - Flags: approval1.9.0.15+
Assignee: nobody → vladimir
This uses the observer service to send a memory pressure notification.
It actually doesn't crash 1.9.0 and 1.9.1 because of another bug that meant the observer was not registered (bug 419125 comment 6).
http://hg.mozilla.org/mozilla-central/rev/c9b6282f0db7

The observer was registered before 1.9.2a1, so we should wait till those users are offered 1.9.2b1 before landing this test.

The fix for this bug may still be worthwhile on 1.9.0 and 1.9.1 because nsFontCache::Compact() can still be called in some unusual situations if fm->Init() should fail:
http://hg.mozilla.org/releases/mozilla-1.9.1/annotate/b7dd9891657f/gfx/src/nsDeviceContext.cpp#l519
Blocks: 419125
So, reading comment 18, there is no crash on 1.9.0 because of a blocking bug (which may or may not be addressed at some point) and the only way to crash it is via the mochitest anyway? Is this mochitest going to be checked into 1.9.0?
(In reply to comment #19)
> So, reading comment 18, there is no crash on 1.9.0 because of a blocking bug
> (which may or may not be addressed at some point)

The mochitest doesn't crash 1.9.0 because of bug 419125 comment 6, but there could still be a crash on 1.9.0.

> and the only way to crash it is via the mochitest anyway?

Mochitest enables UniversalXPConnect, which makes it easy to produce the crash on 1.9.2.  This is definitely not the only way to crash on 1.9.2.  I expect there are ways to crash on 1.9.0 also, but I don't know what they are.

The file in this patch does not need to be run under mochitest.  To produce the crash, it is enough just to open the file and manually accept the privilege request.

> Is this mochitest going to be checked into 1.9.0?

It could be.  It wouldn't test this bug (unless other things change) but it would test other things.
I'm not concerned with 1.9.2. I work on the 1.9.0 and 1.9.1 security releases. This is fixed in both of those and I'm trying to verify the fix. Since this bug lacks repro steps and the mochitest only works on 1.9.2, it is difficult to reproduce the issue on the branches with which I am concerned.
Pushed test attachment 401127 [details] [diff] [review]

http://hg.mozilla.org/mozilla-central/rev/5af76e7485d4
http://hg.mozilla.org/mozilla-central/rev/bef4ad958736
Group: core-security
Flags: in-testsuite? → in-testsuite+
Whiteboard: [nv][sg:critical?] → [nv]
You need to log in before you can comment on or make changes to this bug.