Crash [@ _cairo_image_surface_assume_ownership_of_data]

RESOLVED FIXED in mozilla2.0b8

Status

()

Core
SVG
--
critical
RESOLVED FIXED
7 years ago
4 years ago

People

(Reporter: Jesse Ruderman, Assigned: dholbert)

Tracking

(Blocks: 2 bugs, {crash, testcase})

Trunk
mozilla2.0b8
x86
Mac OS X
crash, testcase
Points:
---
Dependency tree / graph
Bug Flags:
in-testsuite +

Firefox Tracking Flags

(blocking2.0 final+)

Details

(Whiteboard: [sg:critical?], crash signature)

Attachments

(9 attachments, 2 obsolete attachments)

110 bytes, image/svg+xml
Details
42.34 KB, text/plain
Details
15.44 KB, text/plain
Details
217 bytes, image/svg+xml
Details
1.30 KB, patch
roc
: review+
Details | Diff | Splinter Review
1.64 KB, patch
roc
: review+
Details | Diff | Splinter Review
19.40 KB, text/plain
Details
4.86 KB, patch
roc
: review+
Details | Diff | Splinter Review
3.85 KB, patch
roc
: review+
Details | Diff | Splinter Review
(Reporter)

Description

7 years ago
Created attachment 490956 [details]
testcase (crashes Firefox when loaded on a Mac)

WARNING: Detected DOM modification during paint, bailing out!: file layout/base/FrameLayerBuilder.cpp, line 1704

Followed by a crash [@ _cairo_image_surface_assume_ownership_of_data]
(Reporter)

Comment 1

7 years ago
Created attachment 490957 [details]
log with stack trace
This is super bad. We should not be modifying the DOM in any way during painting.
Assignee: nobody → dholbert
blocking2.0: --- → final+
(Assignee)

Comment 3

7 years ago
On my linux x86_64 machine, I don't get a crash or the warning that Jesse mentions in Comment 0. Instead, I get this output (repeating once for each paint, it appears):

> WARNING: Invalid type to RecordMemoryUsedForSurfaceType!: file gfxASurface.cpp, line 543
> ###!!! ASSERTION: gfxASurface::CairoSurface called with mSurface == nsnull!: 'mSurface != nsnull', file gfxASurface.h, line 117
> WARNING: Invalid type to RecordMemoryUsedForSurfaceType!: file gfxASurface.cpp, line 543
(Assignee)

Comment 4

7 years ago
Confirmed that this crashes soon after loading on a Mac nightly build, though.  (No crashreporter dialog.)
(Assignee)

Comment 5

7 years ago
Just kidding, there was a crash report after all -- the dialog just didn't appear right away.  Report is: bp-95fc8b76-e6c7-4f5c-bddf-dc97c2101116
(Reporter)

Comment 6

7 years ago
fwiw, we have bug 335054 on adding assertions to ensure we don't modify the DOM during painting.
We already have that warning in comment #0. We can't make it an assertion because we know plugins can trigger it in "valid" situations without async plugin painting.
(Reporter)

Comment 8

7 years ago
I don't get the warning every time. Maybe that's because it sometimes manages to crash first?

I'm looking forward to being free of sync plugin painting and being able to enable these assertions!  Should we enable the assertions per-platform as we get async plugin painting up?
I suppose we could do that, but we have to either support async plugin painting for in-process plugins, or run all plugins out of process before we can enable that for a particular platform.
(Assignee)

Comment 10

7 years ago
Created attachment 491066 [details]
backtrace when we fire the warning

Here's the backtrace at the point where we fire the warning.

In this case, we've gotten to the line that issues the warning because, in FrameLayerBuilder::CheckDOMModified, we find the following values:
  mInitialDOMGeneration == 121
  mRootPresContext->GetDOMGeneration() == 137

Those aren't equal (which apparently means we've done some DOM modifications), so we issue the NS_WARNING.
Right. The question is, what bumped the DOM-generation during painting? We might want to add some extra debug-only code to assert immediately when that happens, so you get a useful stack.
(Assignee)

Comment 12

7 years ago
Created attachment 491320 [details]
testcase 2 (crashes Firefox when loaded on a Mac)

The initial testcase uses _the same SVG file_ as the background, which makes this trickier than it needs to be.

Here's an alternate testcase that reproduces this crash (and reproduces Comment 3's assertions on Linux) using a trivial SVG data-URI for the background (that trivial SVG being "<svg xmlns="http://www.w3.org/2000/svg"></svg>")
(Assignee)

Updated

7 years ago
Attachment #490956 - Attachment description: testcase (crashes Firefox when loaded) → testcase (crashes Firefox when loaded on a Mac)
(Assignee)

Updated

7 years ago
Attachment #491320 - Attachment is patch: false
Attachment #491320 - Attachment mime type: text/plain → image/svg+xml
(Assignee)

Comment 13

7 years ago
So the linux assertion & warning are because we try to create a gfxImageSurface with
  width = 33475
  height = 288
and that fails in "_cairo_image_surface_create_with_pixman_format()" because "_cairo_image_surface_is_size_valid()" returns false, since our width is greater than MAX_IMAGE_SIZE (32767) as defined in cairo-image-surface.c

This means we end up having a gfxImageSurface with a null "mSurface" pointer (as indicated in the assertion in Comment 3) and mSurfaceValid == PR_FALSE.

The warning from Comment 3 is harmless and actually should be avoided -- I'll attach a patch to keep us from hitting the code that warns there.
(Assignee)

Comment 14

7 years ago
Created attachment 491407 [details] [diff] [review]
Patch 1: Fix linux warning: skip call to RecordMemoryUsed if surface-allocation failed.

So in the gfxImageSurface constructor, if our surface-setup fails (e.g. due to the size being too large), then our call to gfxASurface::Init will...
 (a) set mSurfaceValid = PR_FALSE.
 (b) destroy our (invalid) surface
http://mxr.mozilla.org/mozilla-central/source/gfx/thebes/gfxASurface.cpp?mark=219-221#214

The next thing the gfxImageSurface constructor does is to call "RecordMemoryUsed".  However, we shouldn't be making that call, because:
   Part (b) above means we're basically not using any memory
   Part (a) above means that the RecordMemoryUsed call ends up failing and spamming the NS_WARNING from comment 3 (specifically because gfxASurface::GetType() fails, returning -1, when mSurfaceValid == PR_FALSE -- and RecordMemoryUsed expects a valid GetType() value to be passed in)

So, we shouldn't actually be calling RecordMemoryUsed in this case. This first patch addresses that. (Doesn't actually fix the bug, though.)
Attachment #491407 - Flags: review?(roc)
(Assignee)

Comment 15

7 years ago
(In reply to comment #13)
>   width = 33475
>   height = 288
> and that fails in "_cairo_image_surface_create_with_pixman_format()" because
> "_cairo_image_surface_is_size_valid()" returns false, since our width is
> greater than MAX_IMAGE_SIZE (32767) as defined in cairo-image-surface.c

Stepping through this on the mac right now... So far, we're ok with that large of a width, since _cairo_quartz_verify_surface_size in cairo-quartz-surface.c only mandates that the width be <= CG_MAX_WIDTH == USHRT_MAX.

My guess is that this giant (busted?) surface is the source of our trouble, even though _cairo_quartz_verify_surface_size thinks it's fine...
Attachment #491407 - Flags: review?(roc) → review+
(Assignee)

Comment 16

7 years ago
Ah, so at the point where we crash, we're calling
 _cairo_image_surface_assume_ownership_of_data (surface->imageSurfaceEquiv);
on a cairo_quartz_surface_t, and we have
  surface->base.status == CAIRO_STATUS_SUCCESS
but
  surface->imageSurfaceEquiv.status == CAIRO_STATUS_INVALID_SIZE

So maybe the "imageSurfaceEquiv" error-status just isn't getting recognized soon enough...
(Assignee)

Comment 17

7 years ago
Aha!

So on Linux, we create only one surface, with cairo_image_surface_create_for_data(), which is called in the context of the attached patch. (and that fails in this case, so we effectively bail out.)

On Mac, we create two surfaces -- one with _cairo_quartz_surface_create_internal[1] and one with the same function used on Linux -- cairo_image_surface_create_for_data [2].  The former succeeds (for reasons described in Comment 15) and the latter fails (for the reason described in comment 13).  But we don't ever check the status of the latter one (at least, not at surface-creation-time.

The former surface is "surf->base", and the latter is "surf->imageSurfaceEquiv", described in Comment 16.

[1] http://mxr.mozilla.org/mozilla-central/source/gfx/cairo/cairo/src/cairo-quartz-surface.c#3150
[2] http://mxr.mozilla.org/mozilla-central/source/gfx/cairo/cairo/src/cairo-quartz-surface.c#3160
(Assignee)

Updated

7 years ago
Attachment #491407 - Attachment description: Patch 1: Skip call to RecordMemoryUsed if surface-allocation failed. → Patch 1: Fix linux warning: skip call to RecordMemoryUsed if surface-allocation failed.
(Assignee)

Comment 18

7 years ago
Created attachment 491428 [details] [diff] [review]
Patch 2: Fix linux assertion

This patch fixes the Linux assertion from Comment 3. The assertion happens because we have an invalid gfxASurface, and we try to wrap it in a gfxContext, which calls gfxASurface::CairoSurface on it (and fails an assertion because the surface is invalid).
(Assignee)

Comment 19

7 years ago
Created attachment 491432 [details] [diff] [review]
Patch 3: Fix mac crash: set imageSurfaceEquiv to null if it fails

This patch fixes the mac crash on both testcases here, by checking "imageSurfaceEquiv" in our cairo_quartz_surface_t when it's created, and replacing it with a null pointer if it has issues.

I'm not sure what the implications of this are, but it works around this crash, at least.  (and we have checks like "if (surface->imageSurfaceEquiv)" sprinkled around our quartz cairo code, indicating that we can handle a world where imageSurfaceEquiv is null).

This patch does _not_ fix the "Detected DOM modification" warning on the first testcase, though.  I'll look into that next.
Attachment #491432 - Flags: review?(jmuizelaar)
(Assignee)

Updated

7 years ago
Attachment #491428 - Flags: review?(roc)
Comment on attachment 491432 [details] [diff] [review]
Patch 3: Fix mac crash: set imageSurfaceEquiv to null if it fails

+    if (cairo_surface_status(tmpImageSurfaceEquiv)) {

space after cairo_surface_status and cairo_surface_destroy

Looks good to me, although you'll need to package this as a patch against cairo like the other patches in gfx/cairo.
Attachment #491428 - Flags: review?(roc) → review+
(Assignee)

Comment 21

7 years ago
Created attachment 491642 [details]
backtrace of nsRootPresContext::IncrementDOMGeneration call

(In reply to comment #2)
> This is super bad. We should not be modifying the DOM in any way during
> painting.

So these DOM modifications all seem to be tweaks to XUL elements in the helper-document -- in particular, <scrollbar> elements.  This happens during our SVGDocumentWrapper::UpdateViewportBounds call (which is needed to resize our helper SVG document, at paint-time, to potentially align its viewBox with whatever viewport it has been given.)

For example, see attached backtrace for one of the hits in IncrementDOMGeneration calls.  In the backtrace, we're inside of "nsGfxScrollFrameInner::FinishReflowForScrollbar", which has four calls to SetCoordAttribute (for scrollbars), each of which is effectively a call to nsGenericElement::SetAttrAndNotify.  And SetAttrAndNotify creates a mozAutoDocUpdate which calls IncrementDOMGeneration. (and ends up freaking out our FrameLayerBuilder later on)

Note that we don't actually want or need scrollbars in our helper document -- I'm avoiding drawing them by passing nsIPresShell::RENDER_IGNORE_VIEWPORT_SCROLLING to PresShell::RenderDocument (in VectorImage.cpp).  However, the RenderDocument call happens *after* we've hit this warning.  Maybe there's something I should be doing earlier disable scrollbars entirely?  (If so, that'd fix this or at least improve the situation here.)
Hmm, so the problem here is that the image document uses itself as an image at a different size, so while we're drawing it we try to reflow it to a different size. That is actually really bad.

I doubt there are any reasonable uses of this, so I suggest we simply prevent reentrant VectorImage::Draw calls on the same object. We'll need that to prevent infinite recursion in some cases, anyway. We should have a test for the obvious infinite-recursion case.

As a side issue, we should probably not construct scrollbars for image documents, for efficiency. See nsGfxScrollFrameInner::CreateAnonymousContent.
(Assignee)

Comment 23

7 years ago
Ah, of course -- I thought I checked for that, but I guess not.  Yup -- disallowing reentrant "Draw()" calls sounds like a good plan.

I'll add a check for IsBeingUsedAsImage to nsGfxScrollFrameInner::CreateAnonymousContent, too -- thanks for the pointer on that.
(Assignee)

Comment 24

7 years ago
Created attachment 491707 [details] [diff] [review]
Patch 3 v2: Fix mac crash: set imageSurfaceEquiv to null if it fails

This addresses roc's suggestions on the cairo patch, from Comment 20.
Attachment #491707 - Flags: review?(roc)
(Assignee)

Updated

7 years ago
Attachment #491432 - Attachment is obsolete: true
Attachment #491432 - Flags: review?(jmuizelaar)
(Assignee)

Comment 25

7 years ago
(Note that the bugzilla's shiny diff-viewer chokes on lines that start with "++", so it doesn't show most of the changed lines in new patch I've added in gfx/cairo.  Those lines show up fine in the raw patch-file view, though.)
You need to add a line to the README file as well
(Assignee)

Comment 27

7 years ago
Created attachment 491711 [details] [diff] [review]
Patch 4: Prevent reentrant VectorImage::Draw calls, & disable scrollbars in image docs

As discussed in Comment 22 & Comment 23, this patch adds a flag to VectorImage to prevent reentrant Draw calls.  It also adds a tweak to prevent scrollbar creation in svg-as-an-image documents.

(For the latter tweak, my patch collapses two formerly-nested "if" checks into a single "if" check, and then adds "presContext->Document()->IsBeingUsedAsImage()" as a new possible condition for triggering the contents of that clause.)
(Assignee)

Updated

7 years ago
Attachment #491711 - Flags: review?(roc)
(Assignee)

Comment 28

7 years ago
Created attachment 491714 [details] [diff] [review]
Patch 3 v3: Fix mac crash: set imageSurfaceEquiv to null if it fails (now with README note)

Added line to README per comment 26.
Attachment #491707 - Attachment is obsolete: true
Attachment #491714 - Flags: review?(roc)
Attachment #491707 - Flags: review?(roc)
Attachment #491714 - Flags: review?(roc) → review+
Comment on attachment 491711 [details] [diff] [review]
Patch 4: Prevent reentrant VectorImage::Draw calls, & disable scrollbars in image docs

Need a testcase for recursive image drawing.
Attachment #491711 - Flags: review?(roc) → review+
(Assignee)

Comment 30

7 years ago
Yup -- I'll include both testcases from this bug as crashtests.

Thanks for the help & reviews!
Flags: in-testsuite?
(Assignee)

Comment 31

7 years ago
Landed patches:
 http://hg.mozilla.org/mozilla-central/rev/e211245e413a
 http://hg.mozilla.org/mozilla-central/rev/cb7fca54ac45
 http://hg.mozilla.org/mozilla-central/rev/eea80e8a6901
 http://hg.mozilla.org/mozilla-central/rev/d66f56009e9b
and this bug's testcases as crashtests:
 http://hg.mozilla.org/mozilla-central/rev/fe3ee697e942
Flags: in-testsuite? → in-testsuite+
Target Milestone: --- → mozilla2.0b8
(Assignee)

Updated

7 years ago
Status: NEW → RESOLVED
Last Resolved: 7 years ago
Resolution: --- → FIXED
(Assignee)

Comment 32

6 years ago
Any objections to un-hiding this bug?

This only affected gecko2-era code (after SVG-as-an-image support was added), and it was fixed quickly, 5 months ago, well before Firefox 4 was released.
Blocks: 276431
Crash Signature: [@ _cairo_image_surface_assume_ownership_of_data]
(Reporter)

Updated

4 years ago
Blocks: 335054
Group: core-security
(Reporter)

Comment 33

4 years ago
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #7)
> We already have that warning in comment #0. We can't make it an assertion
> because we know plugins can trigger it in "valid" situations without async
> plugin painting.

For fuzzing, I'd be more than happy with an assertion that ignores the plugin case. Even if it's as an exclusion as wide as MOZ_ASSERT_IF(!everLoadedAnyPlugin, ...).
You need to log in before you can comment on or make changes to this bug.