Closed
Bug 737509
Opened 12 years ago
Closed 12 years ago
[10.5] Crash in _cairo_quartz_surface_mask @ _cairo_user_data_array_fini
Categories
(Core :: Graphics, defect)
Tracking
()
RESOLVED
WORKSFORME
People
(Reporter: scoobidiver, Assigned: smichaud)
References
Details
(Keywords: crash, regression, topcrash)
Crash Data
Attachments
(4 files)
7.97 KB,
patch
|
jrmuizel
:
review+
|
Details | Diff | Splinter Review |
15.20 KB,
patch
|
Details | Diff | Splinter Review | |
14.97 KB,
patch
|
Details | Diff | Splinter Review | |
13.36 KB,
patch
|
Details | Diff | Splinter Review |
It's #4 top browser crasher in 11.0 on Mac OS X. Stacks look like: Frame Module Signature Source 0 XUL _cairo_user_data_array_fini gfx/cairo/cairo/src/cairo-array.c:389 1 XUL _moz_cairo_surface_destroy gfx/cairo/cairo/src/cairo-surface.c:654 2 XUL _cairo_surface_to_cgimage gfx/cairo/cairo/src/cairo-quartz-surface.c:1116 3 XUL _cairo_quartz_surface_mask_with_surface gfx/cairo/cairo/src/cairo-quartz-surface.c:2926 4 XUL _cairo_quartz_surface_mask gfx/cairo/cairo/src/cairo-quartz-surface.c:2999 5 XUL _cairo_surface_mask gfx/cairo/cairo/src/cairo-surface.c:2164 6 XUL _cairo_gstate_mask gfx/cairo/cairo/src/cairo-gstate.c:1124 7 XUL _moz_cairo_mask gfx/cairo/cairo/src/cairo.c:2314 8 XUL nsSVGIntegrationUtils::PaintFramesWithEffects layout/svg/base/src/nsSVGIntegrationUtils.cpp:362 9 XUL nsDisplaySVGEffects::Paint layout/base/nsDisplayList.cpp:2968 10 XUL mozilla::FrameLayerBuilder::DrawThebesLayer layout/base/FrameLayerBuilder.cpp:2158 11 XUL mozilla::layers::BasicShadowableThebesLayer::PaintBuffer gfx/layers/basic/BasicLayers.cpp:563 12 XUL mozilla::layers::BasicThebesLayer::PaintThebes gfx/layers/basic/BasicLayers.cpp:748 13 XUL mozilla::layers::BasicLayerManager::PaintLayer gfx/layers/basic/BasicLayers.cpp:1902 14 XUL mozilla::layers::BasicLayerManager::PaintLayer gfx/layers/basic/BasicLayers.cpp:1917 15 XUL mozilla::layers::BasicLayerManager::EndTransactionInternal gfx/layers/basic/BasicLayers.cpp:1614 16 XUL mozilla::layers::BasicShadowLayerManager::EndTransaction gfx/layers/basic/BasicLayers.cpp:1565 17 XUL nsDisplayList::PaintForFrame layout/base/nsDisplayList.cpp:635 18 XUL nsDisplayList::PaintRoot layout/base/nsDisplayList.cpp:540 19 XUL nsLayoutUtils::PaintFrame layout/base/nsLayoutUtils.cpp:1769 20 XUL PresShell::Paint layout/base/nsPresShell.cpp:5502 ... or Frame Module Signature Source 0 XUL _cairo_user_data_array_fini gfx/cairo/cairo/src/cairo-array.c:389 1 XUL _moz_cairo_surface_destroy gfx/cairo/cairo/src/cairo-surface.c:654 2 XUL _cairo_surface_to_cgimage gfx/cairo/cairo/src/cairo-quartz-surface.c:1116 3 XUL _cairo_quartz_surface_mask_with_surface gfx/cairo/cairo/src/cairo-quartz-surface.c:2926 4 XUL _cairo_quartz_surface_mask gfx/cairo/cairo/src/cairo-quartz-surface.c:2999 5 XUL _cairo_surface_mask gfx/cairo/cairo/src/cairo-surface.c:2164 6 XUL _cairo_gstate_mask gfx/cairo/cairo/src/cairo-gstate.c:1124 7 XUL _moz_cairo_mask_surface gfx/cairo/cairo/src/cairo.c:2314 8 XUL gfxContext::Mask gfx/thebes/gfxContext.cpp:797 9 XUL gfxAlphaBoxBlur::Paint gfx/thebes/gfxBlur.cpp:119 10 XUL nsContextBoxBlur::DoPaint layout/base/nsCSSRendering.cpp:4275 11 XUL nsTextFrame::PaintOneShadow layout/generic/nsTextFrameThebes.cpp:4976 12 XUL nsTextFrame::PaintText layout/generic/nsTextFrameThebes.cpp:5430 13 XUL nsDisplayText::Paint layout/generic/nsTextFrameThebes.cpp:4281 14 XUL mozilla::FrameLayerBuilder::DrawThebesLayer layout/base/FrameLayerBuilder.cpp:2158 15 XUL mozilla::layers::BasicThebesLayer::PaintThebes gfx/layers/basic/BasicLayers.cpp:708 16 XUL mozilla::layers::BasicLayerManager::PaintLayer gfx/layers/basic/BasicLayers.cpp:1902 17 XUL mozilla::layers::BasicLayerManager::PaintLayer gfx/layers/basic/BasicLayers.cpp:1917 18 XUL mozilla::layers::BasicLayerManager::PaintLayer gfx/layers/basic/BasicLayers.cpp:1917 19 XUL mozilla::layers::BasicLayerManager::EndTransactionInternal gfx/layers/basic/BasicLayers.cpp:1614 20 XUL mozilla::layers::BasicShadowLayerManager::EndTransaction gfx/layers/basic/BasicLayers.cpp:1565 21 XUL nsDisplayList::PaintForFrame layout/base/nsDisplayList.cpp:635 22 XUL nsDisplayList::PaintRoot layout/base/nsDisplayList.cpp:540 23 XUL nsLayoutUtils::PaintFrame layout/base/nsLayoutUtils.cpp:1769 24 XUL PresShell::Paint layout/base/nsPresShell.cpp:5502 ... or Frame Module Signature Source 0 XUL _cairo_user_data_array_fini gfx/cairo/cairo/src/cairo-array.c:389 1 XUL _moz_cairo_surface_destroy gfx/cairo/cairo/src/cairo-surface.c:654 2 XUL _cairo_surface_to_cgimage gfx/cairo/cairo/src/cairo-quartz-surface.c:1116 3 XUL _cairo_quartz_surface_mask_with_surface gfx/cairo/cairo/src/cairo-quartz-surface.c:2926 4 XUL _cairo_quartz_surface_mask gfx/cairo/cairo/src/cairo-quartz-surface.c:2999 5 XUL _cairo_surface_mask gfx/cairo/cairo/src/cairo-surface.c:2164 6 XUL _cairo_gstate_mask gfx/cairo/cairo/src/cairo-gstate.c:1124 7 XUL _moz_cairo_mask_surface gfx/cairo/cairo/src/cairo.c:2314 8 XUL gfxContext::Mask gfx/thebes/gfxContext.cpp:797 9 XUL gfxAlphaBoxBlur::Paint gfx/thebes/gfxBlur.cpp:119 10 XUL nsCSSRendering::PaintBoxShadowOuter layout/base/nsCSSRendering.cpp:4275 11 XUL nsDisplayBoxShadowOuter::Paint layout/base/nsDisplayList.cpp:1502 12 XUL mozilla::FrameLayerBuilder::DrawThebesLayer layout/base/FrameLayerBuilder.cpp:2158 13 XUL mozilla::layers::BasicShadowableThebesLayer::PaintBuffer gfx/layers/basic/BasicLayers.cpp:563 14 XUL mozilla::layers::BasicThebesLayer::PaintThebes gfx/layers/basic/BasicLayers.cpp:748 15 XUL mozilla::layers::BasicLayerManager::PaintLayer gfx/layers/basic/BasicLayers.cpp:1902 16 XUL mozilla::layers::BasicLayerManager::PaintLayer gfx/layers/basic/BasicLayers.cpp:1917 17 XUL mozilla::layers::BasicLayerManager::PaintLayer gfx/layers/basic/BasicLayers.cpp:1917 18 XUL mozilla::layers::BasicLayerManager::EndTransactionInternal gfx/layers/basic/BasicLayers.cpp:1614 19 XUL mozilla::layers::BasicShadowLayerManager::EndTransaction gfx/layers/basic/BasicLayers.cpp:1565 20 XUL nsDisplayList::PaintForFrame layout/base/nsDisplayList.cpp:635 21 XUL nsDisplayList::PaintRoot layout/base/nsDisplayList.cpp:540 22 XUL nsLayoutUtils::PaintFrame layout/base/nsLayoutUtils.cpp:1769 23 XUL PresShell::Paint layout/base/nsPresShell.cpp:5502 ... More reports at: https://crash-stats.mozilla.com/report/list?signature=_cairo_user_data_array_fini
Component: Layout → Graphics
QA Contact: layout → thebes
Comment 1•12 years ago
|
||
I'm crossing my fingers that Steven can diagnose this!
Assignee | ||
Comment 2•12 years ago
|
||
These seem to start full-blown with the 20120228210006 build (which I think is FF 11.0b5), without appearing in nightlies. That's extremely puzzling. Do you have any idea, Joe, what could have triggered this? (There are tons of other crashes with this same signature, especially on Linux. But the stacks are otherwise quite different.)
Assignee | ||
Comment 3•12 years ago
|
||
Of course my patch for bug 721663 did land on the 12 and 11 branches on 2012/02/27, just in time for the 11.0b5 "release" :-( So I suppose there's some connection. Though I can't believe my patch *caused* these crashes.
Assignee | ||
Comment 4•12 years ago
|
||
Interestingly, many of the comments indicate that these are crashes on startup.
Assignee | ||
Comment 5•12 years ago
|
||
I'll bet this is a simple NULL-deference bug -- _cairo_array_index() can return NULL. Patch coming up.
Assignee | ||
Comment 6•12 years ago
|
||
> Patch coming up.
Not quite yet.
The 'slots' pointer in _cairo_user_data_array_fini() *does* appear to be NULL. But I can't figure out how this could happen if 'num_slots' is non-zero.
I'll continue work on this tomorrow.
Reporter | ||
Comment 7•12 years ago
|
||
(In reply to Steven Michaud from comment #3) > Of course my patch for bug 721663 did land on the 12 and 11 branches on > 2012/02/27, just in time for the 11.0b5 "release" :-( The Beta-11 (50,000 ADU on Mac) regression range is indeed: http://hg.mozilla.org/releases/mozilla-beta/pushloghtml?fromchange=0338a18c2bc8&tochange=8c9e4873d419 Bug 721663 belongs to the Beta regression range. The first crash in Aurora 12 (5,000 ADU on Mac) happened in 12.0a2/20120303, but it's discontinuous across builds. There are no crashes in 13.0a1 or 14.0a1 (1,500 ADU on Mac).
Blocks: 721663
Keywords: regression
Summary: crash in nsDisplayList::PaintForFrame @ _cairo_user_data_array_fini on Mac OS X 10.5 → [10.5] Crash in _cairo_quartz_surface_mask @ _cairo_user_data_array_fini
Assignee | ||
Comment 8•12 years ago
|
||
I'm about to throw my hands up on this. The only thing that's clear is that the value of 'array->elements' in _cairo_user_data_array_t is invalid, and so also may be value of 'array' itself. I've no idea how this could happen. And until we can reproduce this bug I'm unlikely to be able to find out. I spent most of the day looking for obvious causes (like uninitialized memory), but I couldn't find any. Possibly it's worth mentioning that bug 721663 involved memory corruption. I thought I'd completely worked around that ... but maybe I didn't. Anyone know if platform font code might somehow be involved here (and/or in bug 721663)? I know we use ATS fonts on OS X 10.5, and in the past they've proved to be very flaky. Jon and Jonathan, is it possible to avoid using ATS fonts on OS X 10.5 (to make gfxMacPlatformFontList::UseATSFontEntry() return false even on 10.5) without blowing things up? I'm going to try doing that in combination with backing out my patch for bug 721663, to see if it makes any difference.
Comment 9•12 years ago
|
||
I will take a look at some of the crash reports and see if I can reproduce.
Assignee | ||
Comment 10•12 years ago
|
||
(Correcting comment #8) > The only thing that's clear is that the value of 'array->elements' in > _cairo_user_data_array_t is invalid, and so also may be value of 'array' itself. _cairo_user_data_array_t -> _cairo_user_data_array_fini() http://hg.mozilla.org/mozilla-central/annotate/9023803c5d64/gfx/cairo/cairo/src/cairo-array.c#l379
Comment 11•12 years ago
|
||
So far I generated one crash while testing this, but it was not in this stack. I looked at the correlation reports and there is small correlation to Evernote, so I installed that and the addon.
Comment 12•12 years ago
|
||
It's not clear from the notes here, do you have a testcase that reproduces the problem? The patch for bug 721663 is in the regression range from comment 7. Something to do with that patch seems like a more likely candidate for the root cause of this. > Jon and Jonathan, is it possible to avoid using ATS fonts on OS X > 10.5 (to make gfxMacPlatformFontList::UseATSFontEntry() return false > even on 10.5) without blowing things up? I wouldn't do this unless you have clear evidence that the cause is somehow font related.
Reporter | ||
Comment 13•12 years ago
|
||
(In reply to Steven Michaud from comment #8) > I'm going to try doing that in combination with backing out my patch for bug > 721663, to see if it makes any difference. I don't think it's a good idea because bug 721663 was representing 1.8% of all Mac OS X crashes while this bug represent 0.8% of all Mac OS X crashes.
Assignee | ||
Comment 14•12 years ago
|
||
> I'm going to try doing that in combination with backing out my patch for bug
> 721663, to see if it makes any difference.
I'm not suggesting this as a solution. I'll be doing it for debugging purposes -- to learn more about what's going on.
Assignee | ||
Comment 15•12 years ago
|
||
I tried turning off ATS fonts on OS X 10.5 and also backing out my patch for bug 721663. I still crashed. So it looks like this bug's crashes have nothing to do with ATS fonts, or (probably) with font code in general. By the way, turning off ATS fonts on OS X 10.5 didn't make things blow up. But it did make text very ugly (probably because we fall back to some default font).
Assignee | ||
Comment 16•12 years ago
|
||
>I tried turning off ATS fonts on OS X 10.5 and also backing out my patch for bug > 721663. I still crashed. I still crashed with the STR from bug 721663 comment #3.
Assignee | ||
Comment 17•12 years ago
|
||
So it really does look like I'm stuck until someone can figure out how to reproduce this bug's crashes.
Assignee | ||
Comment 18•12 years ago
|
||
There's been exactly one of this bug's crashes on the 13 or 14 branches in the last month: bp-76ec25e6-61f3-410f-835d-8b48b2120322 So there's some reason to hope that they've been (somehow) worked around in current code.
Assignee | ||
Comment 19•12 years ago
|
||
(Following up comment #15) I also tried turning on ATS fonts on OS X 10.6.8. This *didn't* make the STR from bug 721663 comment #3 start working on 10.6.8.
Reporter | ||
Comment 20•12 years ago
|
||
(In reply to Steven Michaud from comment #18) > So there's some reason to hope that they've been (somehow) worked around in > current code. I won't be so sure because Aurora (5,000 ADU) and Nightly (1,500 ADU) Mac populations are low comparing with 4M ADU in Release.
Assignee | ||
Comment 21•12 years ago
|
||
I've found another way to fix bug 721663 than my original patch for it -- changing my patch for bug 671064 to use a snapshot of a cairo_source_t object, instead of just retaining and releasing the original object. My "new" patch for bug 671064 is arguably more correct than my original patch. And since bug 721663 does seem to be related to this bug (bug 737509), there's a good chance my patch will also fix this bug. Using a snapshot in _cairo_surface_to_cgimage() is more resource intensive than just using the "original" cairo_source_t object, suitably addrefed. But my gut tells me the penalty won't be large -- _cairo_surface_to_cgimage() isn't called *that* often. In any case I don't know how to measure this presumed performance penalty. If anyone knows of tools that might be able to help, please comment here. Andrea, this patch follows your advice in bug 671064 comment #30. Please let me know if I've got it right. I'm still convinced, as I was at bug 721663 comment #26, that these bugs aren't caused by accessing deleted objects. (For example, it doesn't help to comment out every call to CG.*Release() in cairo-quartz-surface.c.) It's pretty clearly some sort of memory corruption, though I still haven't figured out exactly how it happens. However I do now have a hunch: As I understand it, a Cairo snapshot is immutable. My hunch is that bug 721663's and this bug's crashes happen when data is altered in the midst of Core Graphics calls to draw objects (for example CGImageMaskCreate()). So using an immutable snapshot prevents these changes from happening, thereby fixing the crashes. That the crashes happen only on OS X 10.5 may mean that on 10.6 and 10.7 the Core Graphics calls make copies of the data before acting on it. Andrea, do you think this is likely/possible?
Assignee: nobody → smichaud
Status: NEW → ASSIGNED
Attachment #608900 -
Flags: review?(ranma42)
Attachment #608900 -
Flags: review?(joe)
Assignee | ||
Comment 22•12 years ago
|
||
I've started tryserver builds, which should be available by tomorrow morning.
Assignee | ||
Comment 23•12 years ago
|
||
Here's a tryserver build made with my patch from comment #21: http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/smichaud@pobox.com-1eb583c860a1/try-macosx64/firefox-14.0a1.en-US.mac.dmg By the way, there were no non-spurious test failures during the tryserver runs.
Updated•12 years ago
|
Attachment #608900 -
Flags: review?(joe) → review?(jmuizelaar)
Reporter | ||
Comment 24•12 years ago
|
||
It's #4 top browser crasher in 11.0 and #12 in 12.0b2 on Mac OS X.
Assignee | ||
Comment 25•12 years ago
|
||
(Following up comment #21) > However I do now have a hunch: > > As I understand it, a Cairo snapshot is immutable. My hunch is that > bug 721663's and this bug's crashes happen when data is altered in > the midst of Core Graphics calls to draw objects (for example > CGImageMaskCreate()). So using an immutable snapshot prevents these > changes from happening, thereby fixing the crashes. I've used this debugging patch to confirm my hunch. On both OS X 10.5.8 and 10.6.8, following the STR from bug 721663 comment #3 causes the following sequence to be logged: _cairo_surface_to_cgimage(1): Took snapshot 0x12a113cd0 _cairo_surface_snapshot_copy_on_write(1): surface 0x12a113cd0 _cairo_surface_to_cgimage(1): Took snapshot 0x12a113e20 _cairo_surface_snapshot_copy_on_write(1): surface 0x12a113e20 _cairo_surface_to_cgimage(1): Took snapshot 0x12a113f70 _cairo_surface_snapshot_copy_on_write(1): surface 0x12a113f70 _cairo_surface_to_cgimage(1): Took snapshot 0x12a1140c0 _cairo_surface_snapshot_copy_on_write(1): surface 0x12a1140c0 DataProviderReleaseCallback(): Done with snapshot 0x12a113e20 DataProviderReleaseCallback(): Done with snapshot 0x149016d10 DataProviderReleaseCallback(): Done with snapshot 0x12a113cd0 DataProviderReleaseCallback(): Done with snapshot 0x12a113f70 DataProviderReleaseCallback(): Done with snapshot 0x12a1140c0 This shows several cases of _cairo_surface_snapshot_copy_on_write() being called on a snapshot before cairo_surface_destroy() is called on it from DataProviderReleaseCallback(). This indicates an attempt to "change" the snapshot before CGImageMaskCreate() is finished with it. But since snapshots are copy-on-write, in fact no change takes place to the snapshot itself -- the change is made to a copy of the snapshot. Which is what we want, and why my patch is a good thing. This debugging patch includes the actual patch for this bug (from comment #21). My logs don't show any "illegal" attempts to change the snapshots -- none of the "ERROR!" messages appear in them. Which is also a good thing, but isn't the result of my patch.
Comment 26•12 years ago
|
||
Comment on attachment 608900 [details] [diff] [review] Possible patch Sure. Very sorry this took so long.
Attachment #608900 -
Flags: review?(jmuizelaar) → review+
Assignee | ||
Comment 27•12 years ago
|
||
Comment on attachment 608900 [details] [diff] [review] Possible patch Landed on mozilla-central: http://hg.mozilla.org/mozilla-central/rev/17e4143dd6f0 Andrea, I'm not going to wait for your review. But let us know if you have any comments. I actually intended to land this on mozilla-inbound ... but my fingers lead the way to mozilla-central. No bad thing, though -- this should probably land as soon as possible. I'll wait a few days, then request approval for aurora. I'm also tempted to request beta approval, though it may be a bit too late for that.
Attachment #608900 -
Flags: review?(ranma42)
Assignee | ||
Updated•12 years ago
|
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Comment 28•12 years ago
|
||
Sorry, this patch may have caused this new crash [@ _cairo_surface_snapshot_copy_on_write]: https://tbpl.mozilla.org/php/getParsedLog.php?id=10755974&tree=Firefox I've backed out the patch for now, until the cause of the crash is identified: https://hg.mozilla.org/mozilla-central/rev/52ee2124247d
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee | ||
Comment 29•12 years ago
|
||
The crash was on OS X 10.5.8. I'll try running tests locally on 10.5.8 and see what happens.
Comment 30•12 years ago
|
||
I retriggered the test and the second run was green. So the crash is intermittent. (That might make it harder to debug, and it means we aren't sure this patch caused the crash.)
Assignee | ||
Comment 31•12 years ago
|
||
> I'll try running tests locally on 10.5.8 and see what happens.
The trunk won't build on 10.5.8 anymore, at least with the mozconfig I normally use.
So for the moment I'm stuck. I'll see what I can turn up tomorrow.
Assignee | ||
Comment 32•12 years ago
|
||
(Following up comment #31) Actually, Bob Clary's advice in bug 721744 comment #8 gets the build working again. But it'll be tomorrow before I start playing with it.
Assignee | ||
Comment 33•12 years ago
|
||
Last night I ran another set of tryserver tests for my patch, and this morning I discovered the same crash again, also on OS X 10.5.8. So the crash must be related to my patch, though I can't believe my patch actually causes it. (It's much more likely the crash is caused by another bug uncovered by my patch.) I still haven't managed to complete a build on 10.5.8 -- now it looks like I need to upgrade Python from 2.5 to 2.6. But I'll be spending the rest of the day on this, trying to get a gdb crash stack for the crash. My hope is to take care of this over the next few days, by fixing the bug my patch (presumably) uncovered. But I now think it's too risky trying to get a fix for this bug's crashes into FF 12. We'll just have to live with them til FF 13.
Assignee | ||
Comment 34•12 years ago
|
||
> Last night I ran another set of tryserver tests for my patch, and this morning I > discovered the same crash again, also on OS X 10.5.8. https://tbpl.mozilla.org/php/getParsedLog.php?id=10764019&tree=Try&full=1#error0
Assignee | ||
Comment 35•12 years ago
|
||
> I still haven't managed to complete a build on 10.5.8 -- now it looks like I need > to upgrade Python from 2.5 to 2.6. Turns out I just needed to install simplejson. See bug 741084 comment #7.
Assignee | ||
Comment 36•12 years ago
|
||
For the record (so it doesn't get lost), here's the log of the original crash, which occurred during a different browser-chrome test: https://tbpl.mozilla.org/php/getParsedLog.php?id=10755974&tree=Firefox&full=1#error0
Assignee | ||
Comment 37•12 years ago
|
||
And here's yet another crash, which happened during one of the tryserver runs of my patch for bug 313700 (which was built during the short time this bug's patch was on the trunk): https://tbpl.mozilla.org/php/getParsedLog.php?id=10762011&tree=Try&full=1#error0 This crash happened during the same browser-chrome test as did the one from comment #36.
Assignee | ||
Comment 38•12 years ago
|
||
I'm making progress on this ... but it's very indirect and painfully slow. The crash stacks from comment #34, comment #36 and comment #37 are at least partially corrupt. For example cairo_surface_flush() does *not* contain line 331 of cairo-surface.c. But what *does* appear at line 331 (a call (indirectly) to _cairo_surface_snapshot_copy_on_write() in cairo_surface_detach_snapshot) makes sense as part of the stack. The gdb stacks I've been able to get are much worse -- as best I can tell they're completely corrupt and useless. Maybe it's something about the test harness environment (since I'm invoking gdb with the --debugger=gdb parameter to runtests.py). But I *am* able to reproduce crashes in gdb with browser_tabview_bug627288.js with my patch, which don't happen without it. So at least I can mount a brute force attack on these crashes. I'll be spending all day tomorrow on this ... and very possibly also the rest of this week.
Assignee | ||
Comment 39•12 years ago
|
||
I've found the bug uncovered by my original patch for this bug, and I've created a new patch that fixes it. But I'm not entirely satisfied with it, because it may be too specifically tailored to the mochitest crashes. What causes these crashes is that code added by Joe's patch for bug 685767 to gfxBlur.cpp (to gfxAlphaBoxBlur::Init()) makes an assumption that turns out to be incorrect. http://hg.mozilla.org/mozilla-central/annotate/40455cbb1ad3/gfx/thebes/gfxBlur.cpp#l73 The incorrect assumption is that gfxAlphaBoxBlur::mImageSurface, created by gfxAlphaBoxBlur::Init(), will live no longer than the gfxAlphaBoxBlur object that created it: A gfxImageSurface object (or any subclass of gfxASurface) lives as long as its internal cairo_surface_t object (gfxASurface::mSurface) does, and the reference count of this object may be increased by cairo code -- which would make it live longer than its "owner" (the gfxAlphaBoxBlur object that created it). On the basis of the above (incorrect) assumption, gfxAlphaBoxBlur::Init() creates an AlphaBoxBlur object (mBlur), whose data it passes to the gfxImageSurface object (mImageSurface) as "existing" data (data not owned by the gfxImageSurface object), and which gets destroyed in the gfxAlphaBoxBlur destructor. But if the gfxImageSurface object lives longer than its "owning" gfxAlphaBoxBlur object, then the "existing" data will become invalid before the gfxAlphaBoxBlur object is destroyed. My current patch addresses this problem by making an AlphaBoxBlur object ref-countable, by passing this object (mBlur) to the gfxImageSurface constructor, and by making the gfxImageSurface object hold a reference to it. Doing this fixes the mochitest crashes. But I can't help but wonder if gfxImageSurface objects need to hold a reference to *any* "existing" data. I'm going to devise some tests for this, and start playing around with them. The first thing I'll do is make the gfxImageSurface destructor deliberately reference any non-owned data (in mData), so see if this causes crashes. I'll do these tests on OS X 10.5, on which jemalloc is disabled.
Assignee | ||
Comment 40•12 years ago
|
||
> I'm going to devise some tests for this, and start playing around > with them. The first thing I'll do is make the gfxImageSurface > destructor deliberately reference any non-owned data (in mData), so > see if this causes crashes. I'll do these tests on OS X 10.5, on > which jemalloc is disabled. I've run all our browser-chrome and chrome tests on this test build, together with most of the regular mochitests (I couldn't get that test run to finish). I found that the following regular mochitests crashed consistently in my altered gfxImageSurface destructor (even when run individually): content/canvas/test/test_2d.composite.uncovered.pattern.destination-atop.html content/canvas/test/test_2d.composite.uncovered.pattern.destination-in.html content/canvas/test/test_2d.composite.uncovered.pattern.source-in.html content/canvas/test/test_2d.composite.uncovered.pattern.source-out.html content/canvas/test/test_canvas.html layout/base/tests/test_bug731777.html And the following test crashed once (though I couldn't get it to crash again): layout/base/tests/test_bug93077-1.html So it does look like the "new" bug (the one uncovered by my original patch for this bug, and partially fixed by my patch from comment #39) does need a more general fix. I'll be working on this.
Assignee | ||
Comment 41•12 years ago
|
||
For those who are curious, here's the debugging patch I've been working with. It incorporates my patch from comment #39.
Reporter | ||
Comment 42•12 years ago
|
||
It's #5 top crasher in 13.0 and #28 in 14.0b6 on Mac OS X.
Reporter | ||
Comment 43•12 years ago
|
||
Let's close it as WFM now that Mac OS X 10.5 is no longer supported in Firefox 17.
Status: REOPENED → RESOLVED
Closed: 12 years ago → 12 years ago
Resolution: --- → WORKSFORME
Assignee | ||
Comment 44•12 years ago
|
||
(In reply to comment #43) That's fine with me. There's still unfinished work here, but I'm not going to be able to get back to it for a while. There are too many other more bugs that are more urgent than this one, especially now that OS X 10.5 has been desupported. When (if?) I do get back to this, I'll do it in a different bug.
You need to log in
before you can comment on or make changes to this bug.
Description
•