Open Bug 385228 Opened 17 years ago Updated 2 years ago

"ASSERTION: Surface size too large (would overflow)" and huge malloc

Categories

(Core :: SVG, defect)

x86
macOS
defect

Tracking

()

REOPENED

People

(Reporter: jruderman, Unassigned)

References

Details

(Keywords: assertion, testcase)

Attachments

(3 files)

The testcase triggers:

###!!! ASSERTION: Surface size too large (would overflow)!: 'Error', file /Users/jruderman/trunk/mozilla/gfx/thebes/src/gfxASurface.cpp, line 291

firefox-bin(14100,0xa000cfe0) malloc: *** vm_allocate(size=2085036032) failed (error code=3)
firefox-bin(14100,0xa000cfe0) malloc: *** error: can't allocate region
firefox-bin(14100,0xa000cfe0) malloc: *** set a breakpoint in szone_error to debug
No malloc error on Windows.

So I think there are two issues here:

a) gfxASurface::CheckSurfaceSize has some NS_ERROR calls. I think these should either be removed or changed to NS_WARNING.

b) you get a malloc error on a mac.

Attachment #269136 - Flags: superreview?(vladimir)
Attachment #269136 - Flags: review?(vladimir)
I don't understand -- why should these be NS_WARNINGs?  Nothing should be giving thebes too-big surface numbers.
You can crete arbitrary surface sizes in SVG e.g. using filters you can specify the x and y size of the surface directly with the filterRes attributes. I used this function to validate the size of svg surfaces in nsSVGUtils::ConvertToSurfaceSize.
Comment on attachment 269136 [details] [diff] [review]
change NS_ERROR to NS_WARNING

Ah ok, fair enough.. I guess I was thinking of this more as a last-ditch sanity check and not an early check, but you'd want to do the same check in both places anyway.  We may want to just get rid of the NS_WARNINGs in that case at some point, but this is fine for now, then.
Attachment #269136 - Flags: superreview?(vladimir)
Attachment #269136 - Flags: superreview+
Attachment #269136 - Flags: review?(vladimir)
Attachment #269136 - Flags: review+
Assignee: nobody → longsonr
The assertion is now gone. Presumably the Mac allocate issue remains which I can't do anything about.
Assignee: longsonr → nobody
(In reply to comment #5)
> (From update of attachment 269136 [details] [diff] [review])
> Ah ok, fair enough.. I guess I was thinking of this more as a last-ditch sanity
> check and not an early check, but you'd want to do the same check in both
> places anyway.  We may want to just get rid of the NS_WARNINGs in that case at
> some point, but this is fine for now, then.
> 

I could produce another patch to make CheckSurfaceSize take a PRBool suppressErrors argument, possibly defaulting to PR_FALSE. With that I could change the warnings back to asserts.

Depends on: 368573
Flags: in-testsuite?
Robert: over in bug 374815, comment 13, I posted some stack trace info for another large alloc, and you've asked if this one is related. It does hit the same stack, and there's similar symptoms: the callerBBox in my trace for the testcase above is 130x275, but the bbox ends up as 2600x275000 (20x and 1000x the callerBBox dimensions, obviously; which are the pattern's width and height in the test case).

This is at nsSVGPatternFrame.cpp:320. It looks like something is screwy with the pattern units. It seems to think that pattern width="x" height="y" means that the pattern is x times the width of the callerBBox and y times its height??
sorry I meant bug 374815 comment 14, obviously. Looks like both of these are hitting the same bug, other things like crashes and the size-clamping warnings are the result.
Ok, I read the corresponding bits of the spec after this, I didn't realise you were deliberately creating massive patterns here, since otherwise the behaviour is correct: patternUnits defaults to objectBoundingBox, in this case the pattern is 1000x the height of the element to which the pattern is applied, which is always going to result in an enormous bitmap.

However, if the units are objectBoundingBox and the width, height are > 1, you're always going to get a pattern bigger than the element it's being applied to. I don't see why you have to allocate something that large - can't a smaller surface be created with an appropriate offset? The size could be clamped like this somewhere like nsSVGPatternFrame.cpp line 276.

I'll shut up now as this is clearly not a cairo bug and I've strayed into code I understand even less...


Brian,

What happens is that the testcase deliberately generates a large surface (so far so good).

The SVG code then clamps the surface size using nsSVGUtils::ConvertToSurfaceSize
to a maximum of 16384 x 16384. So the surface size passed to cairo in the case above should now be 2600x16384.

On non-mac platforms we can create a surface this size without problems, on Mac, even this surface size blows up with a malloc error. Why does Mac cairo do that?
2,085,036,032 isn't 2600 * 16384 * 4; even 16384 * 16384 * 4 is only 1,073,741,824 .  Can we get a backtrace of exactly where that big number is coming from?
(In reply to comment #12)
> 2,085,036,032 isn't 2600 * 16384 * 4; even 16384 * 16384 * 4 is only
> 1,073,741,824 .  Can we get a backtrace of exactly where that big number is
> coming from?
> 

Inside CG as far as I can tell. This bug doesn't actually crash for me, but with the similar bug in 374815, I've got that failing right now with:

firefox-bin(17896,0xa079df60) malloc: *** mmap(size=1488482304) failed (error code=12)

another mystery number, > 16384^2. tracing from the malloc back to cairo I have:

#4  0x958c7f15 in malloc_zone_calloc ()
#5  0x909e29b6 in ripl_Create ()
#6  0x909ecb72 in ripc_GetColor ()
#7  0x909cd832 in ripc_Render ()
#8  0x909d68a0 in ripc_DrawPath ()
#9  0x928c0687 in CGContextDrawPath ()
#10 0x928c05d4 in CGContextFillPath ()
#11 0x1176c6de in _cairo_quartz_surface_fill (abstract_surface=0x18bdea30, op=CAIRO_OPERATOR_OVER, source=0xbfffba98, path=0xad717c, fill_rule=CAIRO_FILL_RULE_WINDING, tolerance=0.10000000000000001, antialias=CAIRO_ANTIALIAS_DEFAULT) at /Users/brianewins/projects/ffox/mozilla/gfx/cairo/cairo/src/cairo-quartz-surface.c:1273

I traced the creation of the large pattern that was used in this fill:

#0  0x1176db8c in _cairo_quartz_surface_create_internal (cgContext=0x18bdf470, content=CAIRO_CONTENT_COLOR_ALPHA, width=53160, height=3500) at /Users/brianewins/projects/ffox/mozilla/gfx/cairo/cairo/src/cairo-quartz-surface.c:1737

the bitmap size here is:
53160 * 3500 * 4 = 744,240,000
2 * 744,240,000 = 1,488,480,000

and the smallest alloc that can hold that is:
363399 * 4096 = 1,488,482,304 ... the mystery number.

So CG is allocating twice the pattern size for some reason. 

How did we get a width > 16384? Does SVG send that to cairo on Mac? If so why is it not clamped by nsSVGUtils::ConvertToSurfaceSize? 
(In reply to comment #14)
> How did we get a width > 16384? Does SVG send that to cairo on Mac? If so why
> is it not clamped by nsSVGUtils::ConvertToSurfaceSize? 
> 

As far as I can tell, it doesn't work that way on any platform; it clamps when w x h x 4 overflows a PRInt32, except on mac where height is also clamped to 32k (as it is in this bug). If clamping dimensions to 16k is intended, shouldn't nsSVGUtils.cpp#1352 read: 

if (*aResultOverflows ||
       !gfxASurface::CheckSurfaceSize(surfaceSize, NS_SVG_OFFSCREEN_MAX_DIMENSION)) {

I'm not able to debug this on other platforms, but it looks to me like on windows at least, its not doing clamp-and-scale for similarly large patterns, its just not rendering the pattern at all; eg I put a 100%x100% fill=blue rect in the pattern in attachment#1 [details] [diff] [review]; changing the pattern to 250x115 renders no stroke, and at 100x500 it only renders the stroke when zoomed out 4x (and not any other zoom level). That's in Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9b2pre) Gecko/2007120915 Minefield/3.0b2pre.
Yes, you are right Brian. I forgot that was how it was supposed to work.
Attached image testcase 2
This one ends up allocating about 1.5GB at a time.  This doesn't trigger the NS_WARNING but does trigger the malloc warning.
You've got a rect width 440, a pattern in objectBoundingBox units (the default) and a viewBox which contributes an approximate x2 factor. The pattern is 20 x 440 x 2 userSpace units wide, so it's a big pattern.

It looks like the Mac OS allocates more memory than other operating systems. Maybe CheckSurfaceSize needs to be even more restrictive on that platform.
I'm not getting assertions or malloc errors any more with these testcases.
Closing based on comment 19
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → WORKSFORME
Crashtest: http://hg.mozilla.org/mozilla-central/rev/43339764467b
Flags: in-testsuite? → in-testsuite+
Depends on: 523255
A mac tinderbox just failed with a huge (634-megabyte) malloc, in between this bug's crashtests.  I initially filed Bug 523255 on the sporadic failure.  Crash log is:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1255992265.1255994821.16757.gz

On the bright side, this shows the value of checking in crashtests! :)  But, on the sad side, this bug is still apparently broken. :(  Reopening.
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
It sounds like comment 13 says we could *legitimately* need up to 744,240,000 bytes for this bug's testcase -- is that true?  If so, maybe we *are* doing the "right thing" now, and allocating "only" ~700 megs instead of 1.5 gigs like we used to.
I've disabled this test's second crashtest for now, since it aborts the whole crashtest suite when it sporadically fails, and that's bad.

http://hg.mozilla.org/mozilla-central/rev/d0c93a3077fa

I guess subsequent action here depends on the answer to comment 24.  If the answer is "yes, this test really needs 700 megs", then we should need to shrink this unittest so that it's not quite so demanding.  I don't think it's reasonable to expect that our tinderboxen (or, for that matter, "Joe Developer" running crashtests locally) will have a contiguous 700-meg block of virtual memory available at any given time.
(In reply to comment #25)
> I've disabled this test's second crashtest for now
Er, this *bug*'s second crashtest -- 385228-2.svg.  (That appears to be what was loading when we died, in the log in comment 23 & quoted in Bug 523255 comment 0.)
Depends on: 404077
No longer depends on: 368573
Depends on: 472557
No longer depends on: 404077
No longer blocks: 675709
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: