Closed Bug 330957 Opened 18 years ago Closed 18 years ago

[FIX]Crash during startup in _cairo_ft_scaled_glyph_init

Categories

(Core :: Graphics, defect, P1)

x86
Linux
defect

Tracking

()

RESOLVED FIXED
mozilla1.9alpha1

People

(Reporter: ajschult784, Assigned: bzbarsky)

Details

(Keywords: crash)

Attachments

(1 file)

With current SeaMonkey trunk w/cairo on FC3, I get a seg fault just after it opens a window.  gdb can't get a useful stack, but valgrind sees this:

Invalid write of size 1
   at 0x4006688: memset (mac_replace_strmem.c:464)
   by 0x8A1829: (within /usr/lib/libfreetype.so.6.3.7)
   by 0x8A19AA: (within /usr/lib/libfreetype.so.6.3.7)
   by 0x8A271E: gray_raster_render (in /usr/lib/libfreetype.so.6.3.7)
   by 0x871292: FT_Outline_Render (in /usr/lib/libfreetype.so.6.3.7)
   by 0x871354: FT_Outline_Get_Bitmap (in /usr/lib/libfreetype.so.6.3.7)
   by 0x5709612: _render_glyph_outline (cairo-ft-font.c:1014)
   by 0x570A196: _cairo_ft_scaled_glyph_init (cairo-ft-font.c:1817)
   by 0x56FB4CA: _cairo_scaled_glyph_lookup (cairo-scaled-font.c:1236)
   by 0x5707FCF: _cairo_xlib_surface_show_glyphs (cairo-xlib-surface.c:2613)
   by 0x56FCEBE: _cairo_surface_show_glyphs (cairo-surface.c:1588)
   by 0x56F17D4: _cairo_gstate_show_glyphs (cairo-gstate.c:1559)
 Address 0x8966120 is not stack'd, malloc'd or (recently) free'd

Running the same exact build on FC4 works ok.  Both are using the FC4 X display.
I've seen this before, but I don't know what's causing it -- I've seen it where I get that crash when running under Xvfb, and don't get the crash when running under the normal X server on RHEL4.  It could be related to the version of freetype installed; if possible, try upgrading your FC3 freetype to whatever is on your FC4 box?
both boxes have freetype 2.1.9
A build from the day before I filed this did work, so this is probably fallout from "Re-landing fixed cairo update" on 3/17
I'm hitting this (and in fact I have the same situation -- FC3 client and FC4 server).  Unfortunately, the FC3 machine is my main development machine, and it's headless (so running with the FC4 server is the only way I can run builds on it).  Oh, and I'm seeing this with Firefox, not Seamonkey...

Does this also happen with FC3 using the FC3 server?
Flags: blocking1.9a1?
OK, so I did some debugging.  When I crash, the stack looks like:

#0  0xb7c38edb in memset () from /lib/tls/libc.so.6
#1  0x00000003 in ?? ()
#2  0x00da782a in gray_render_span (y=4968, count=-1584219336, spans=0x88923a0, 
    raster=0x8891f98) at /usr/src/debug/freetype-2.1.9/src/smooth/ftgrays.c:1310
#3  0x00da79ab in gray_hline (raster=0x8891f98, x=0, y=4969, area=-50529028, acount=1)
    at /usr/src/debug/freetype-2.1.9/src/smooth/ftgrays.c:1410

etc.

Note the bogus stuff on the stack.  Unfortunately, it looks like at least part of the stack is corrupted -- for example the |map| pointer doesn't match &raster->target and it should.  :(
OK, so I got valgrind running; now I get things like:

==12222== Invalid write of size 4
==12222==    at 0xE62ED7: memset (in /lib/tls/libc-2.3.5.so)
==12222==    by 0xDA79AA: (within /usr/lib/libfreetype.so.6.3.7)
==12222==    by 0xDA871E: gray_raster_render (in /usr/lib/libfreetype.so.6.3.7)
==12222==    by 0xD77292: FT_Outline_Render (in /usr/lib/libfreetype.so.6.3.7)
==12222==  Address 0x20F6AB5B is not stack'd, malloc'd or (recently) free'd

dbaron, do you have a pointer to your patch to make valgrind use debuginfo?  I'd love to hit this case with it.  ;)
==30440== Invalid write of size 1
==30440==    at 0x4006605: memset (mac_replace_strmem.c:464)
==30440==    by 0xDA7829: gray_render_span (ftgrays.c:1310)
==30440==    by 0xDA79AA: gray_hline (ftgrays.c:1410)
==30440==    by 0xDA871E: gray_raster_render (ftgrays.c:1493)
==30440==    by 0xD77292: FT_Outline_Render (ftoutln.c:567)
==30440==    by 0xD77354: FT_Outline_Get_Bitmap (ftoutln.c:614)
==30440==    by 0x40A8601: _render_glyph_outline (cairo-ft-font.c:1014)
==30440==    by 0x40A94F5: _cairo_ft_scaled_glyph_init (cairo-ft-font.c:1817)
==30440==    by 0x40951C7: _cairo_scaled_glyph_lookup (cairo-scaled-font.c:1212)
==30440==    by 0x40A6932: _cairo_xlib_surface_show_glyphs (cairo-xlib-surface.c:2618)
==30440==    by 0x409770F: _cairo_surface_show_glyphs (cairo-surface.c:1588)
==30440==    by 0x4087EBC: _cairo_gstate_show_glyphs (cairo-gstate.c:1559)
If you've got *-debuginfo packages, you can poke through the source in /usr/src/debug/ to figure out what's on those lines.  (I wish valgrind included the full path rather than just the basename when printing filenames...)
I did poke at the source; it's not really telling me anything enlightening. :(

I wish I could drop into gdb when I hit this error, but using --db-attach=yes just seems to hang valgrind (0 CPU usage, nothing happening).
I had a chance to try out running on FC3 with an FC3 display and that worked ok.

poking things in the source, it does:

    p = (unsigned char*)map->buffer - y * map->pitch;

y is reasonable (1 when it dies).  pitch is 4 in the calls when it doesn't crash, but when it dies, pitch is 169500.  In my FC5 build, pitch is 8 or 12.  According to the source:

The pitch's absolute value is the number of bytes taken by one bitmap row, including padding.  However, the pitch is positive when the bitmap has a `down' flow, and negative when it has an `up' flow.  In all cases, the pitch is an offset to add to a bitmap pointer in order to go down one row.

169500 seems entirely unreasonable.

bz noticed that when he crashed, map's size (rows, width) was 82855x55143, which is perhaps the first thing that went bad.  pitch was 55144.
OK, if I place a breakpoint in _cairo_ft_scaled_glyph_init I see that we crash on the very first time through that function.  The font involved is "Luxi Sans"; the style is "Regular".

I tried a build which does not crash (running local on FC4).  I see us enter _cairo_ft_scaled_glyph_init with the same font and the same glyph as far as I can tell (the "glyph index" given by scaled_glyph->cache_entry.hash is the same).  The font face info also looks identical to me.

The first place where the two build diverge, as far as I can tell, is when we make the FT_Load_Glyph call at http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/gfx/cairo/cairo/src/cairo-ft-font.c&rev=1.7#1725 -- after this point the metrics of the glyphs are different, as are the flags, etc.  Specifically:

Non-crashing build:
(gdb) p *scaled_font->unscaled->face->glyph
$11 = {library = 0x9699b70, face = 0x9c34fb8, next = 0x0, reserved = 0, generic = {
    data = 0x0, finalizer = 0}, metrics = {width = 448, height = 704, horiBearingX = 64, 
    horiBearingY = 704, horiAdvance = 576, vertBearingX = 0, vertBearingY = 0, 
    vertAdvance = 0}, linearHoriAdvance = 571858, linearVertAdvance = 0, advance = {
    x = 576, y = 0}, format = FT_GLYPH_FORMAT_OUTLINE, bitmap = {rows = 0, width = 0, 
    pitch = 0, buffer = 0x0, num_grays = 0, pixel_mode = 0 '\0', palette_mode = 0 '\0', 
    palette = 0x0}, bitmap_left = 0, bitmap_top = 0, outline = {n_contours = 1, 
    n_points = 10, points = 0x9cdf528, tags = 0x9ccb4b0 "aaaAAaaAAa", 
    contours = 0x9cde5f8, flags = 260}, num_subglyphs = 0, subglyphs = 0x0, 
  control_data = 0x9ce5432, control_len = 47, lsb_delta = 0, rsb_delta = 0, other = 0x0, 
  internal = 0x9c35630}

Crashing build:

(gdb) p *scaled_font->unscaled->face->glyph
$48 = {library = 0x8842818, face = 0x834d6d0, next = 0x0, reserved = 0, generic = {
    data = 0x0, finalizer = 0}, metrics = {width = 3529152, height = 5302720, 
    horiBearingX = 668416, horiBearingY = 5302720, horiAdvance = 4481280, 
    vertBearingX = 0, vertBearingY = 0, vertAdvance = 0}, 
  linearHoriAdvance = 2147483647, linearVertAdvance = 0, advance = {x = -4481280, 
    y = 0}, format = FT_GLYPH_FORMAT_OUTLINE, bitmap = {rows = 0, width = 0, pitch = 0, 
    buffer = 0x0, num_grays = 0, pixel_mode = 0 '\0', palette_mode = 0 '\0', 
    palette = 0x0}, bitmap_left = 0, bitmap_top = 0, outline = {n_contours = 1, 
    n_points = 10, points = 0x830b8a8, 
    tags = 0x8814bb8 "\001\001\001\001\001\001\001\001\001\001", contours = 0x8814ab0, 
    flags = 4}, num_subglyphs = 0, subglyphs = 0x0, control_data = 0x88df252, 
  control_len = 47, lsb_delta = 0, rsb_delta = 0, other = 0x0, internal = 0x8562ca0}

OK, the first place the two codepaths seem to diverge (as far as I can tell from debuginfo on optimized freetype RPMs where we seem to be executing the same statement multiple times) is that in FT_Load_Glyph in the crashing case, the driver->clazz->load_glyph call on line 553 of ftobjs.c fails or at least gives bogus data (while in the non-crashing case it succeeds).

So the main difference I see is that in _cairo_ft_scaled_glyph_init the crashing and non-crashing build come out with very different face->size.metrics:

Non-crashing:
(gdb) p face->size.metrics
$9 = {x_ppem = 14, y_ppem = 14, x_scale = 59900, y_scale = 59900, ascender = 960, 
  descender = -256, height = 1344, max_advance = 960}

Crashing:
(gdb) p face->size.metrics
$14 = {x_ppem = 49062, y_ppem = 49062, x_scale = 480660750, y_scale = 480660750, 
  ascender = 7283008, descender = -1547584, height = 10590720, max_advance = 7444288}

And in fact, at the point when we call into _cairo_ft_scaled_glyph_init we already have different scaled_font->unscaled->face->size.metrics in the two builds.
So it looks to me like the call to cairo_scaled_font_create made from _cairo_gstate_ensure_scaled_font returns very different things in the two builds (at least as far as the unscaled metrics are concerned).

In particular, it looks like in _cairo_ft_scaled_font_create at some point we end up with:

Non-crashing build:
(gdb) p scaled_font->base.scale
$12 = {xx = 14.28515625, yx = 0, xy = 0, yy = 14.28515625, x0 = 17, y0 = 19}

Crashing build:
(gdb) p scaled_font->base.scale
$15 = {xx = -2910801.1366120218, yx = 0, xy = 0, yy = -2910801.1366120218, x0 = 18, 
  y0 = 6}

which seems odd to me.
OK.  So as far as I can tell, the issue is that in DrawCairoGlyphs we do:

    if (FcPatternGetDouble(fcfont->font_pattern, FC_PIXEL_SIZE, 0, &size) != FcResultMatch)
        size = 12.0;

In the crashing build, we end up with:
(gdb) p size
$1 = -2910801.136612021

In the non-crashing build we end up with:
(gdb) p size
$2 = 14.28515625
OK, at the callsite into DrawCairoGlyphs (in gfxPangoTextRun::Draw) we have:

(gdb) p *((PangoFcFont*)layoutRun->item->analysis.font)->font_pattern->elts[16].values
$7 = {next = 0x0, value = {type = FcTypeDouble, u = {
      s = 0x917c80b3 <Address 0x917c80b3 out of bounds>, i = -1854111565, 
      b = -1854111565, d = -2910801.1366120218, m = 0x917c80b3, c = 0x917c80b3, 
      f = 0x917c80b3, p = 0x917c80b3, l = 0x917c80b3}}, binding = FcValueBindingStrong}
(gdb) p ((PangoFcFont*)layoutRun->item->analysis.font)->font_pattern->elts[16]
$8 = {object = 0xb7c1b0ef "pixelsize", values = 0x8456650}

So the layoutRun->item->analysis.font thing either isn't a PangoFcFont* or just has a totally bogus "pixelsize".
I think I'm about at the end of what I can do here; I have no idea where this pango font comes from, etc....
the fun all begins with MOZ_pango_font_description_set_absolute_size, which calls  GetXftDPI, which calls XGetDefault(GDK_DISPLAY(), "Xft", "dpi");, which returns NULL, and so GetXftDPI returns 0 (so the "size" in mPangoFontDesc becomes size * 72 / 0, where size ~= 16)
Should this copy of GetXftDPI have the same fixes bz made in the other copy?  Or should the two somehow be put in one place?
> Should this copy of GetXftDPI have the same fixes bz made in the other copy? 

Those fixes weren't cheap.  vlad, pav, how often is MOZ_pango_font_description_set_absolute_size called?  If it's called at all often, we should be doing something smarter (like caching the DPI, say).  And yes, given that this is basically inverting the transform in the other place we should be using the same DPI value.

Also, note that even the fixes I added could, in theory, fail.  In which case we should probably fall back to 96, not 0.

It'd be nice if coverity checked for divide-by-zero...
We only hit the case that calls GetXftDPI() if we're on an old version of pango that doesn't have pango_font_description_set_absolute_size; we should certainly be caching it.  I wrote this before we had gfxPlatformGtk, but that's the logical place to put in a DPI query/cache mechanism.  However, I'd suggest that we can put in a temporary crash-fix band aid, and just have GetXftDPI return 96 instead of 0 if there is no Xft.dpi property set.
This mostly just moves existing code, but I added a NSToCoordRound instead of straight cast to int at the end of GetDPIFromPangoFont.
Assignee: nobody → bzbarsky
Status: NEW → ASSIGNED
Attachment #218764 - Flags: superreview?(dbaron)
Attachment #218764 - Flags: review?(vladimir)
Priority: -- → P1
Summary: Crash during startup in _cairo_ft_scaled_glyph_init → [FIX]Crash during startup in _cairo_ft_scaled_glyph_init
Target Milestone: --- → mozilla1.9alpha
Attachment #218764 - Flags: superreview?(dbaron) → superreview+
(In reply to comment #0)
> With current SeaMonkey trunk w/cairo on FC3, I get a seg fault just after it
> opens a window.  gdb can't get a useful stack, but valgrind sees this:
> 
> Invalid write of size 1
>    at 0x4006688: memset (mac_replace_strmem.c:464)
>    by 0x8A1829: (within /usr/lib/libfreetype.so.6.3.7)

Bug 333861 shows similar stack triggered by some memory corruption due to large font-size

stack from the above bug.

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 46912558994496 (LWP 4831)]
0x00002aaaad7cc540 in memset () from /lib64/tls/libc.so.6
(gdb) info stack
#0  0x00002aaaad7cc540 in memset () from /lib64/tls/libc.so.6
#1  0x00002aaaac66a172 in gray_raster_render ()
   from /usr/lib64/libfreetype.so.6
#2  0x00002aaaac668e5a in ps_hints_apply () from /usr/lib64/libfreetype.so.6
#3  0x00002aaaac6697f1 in gray_raster_render ()
   from /usr/lib64/libfreetype.so.6
#4  0x00002aaaac62f5f4 in FT_Outline_Render () from /usr/lib64/libfreetype.so.6
#5  0x00002aaaac62f69e in FT_Outline_Get_Bitmap ()
   from /usr/lib64/libfreetype.so.6
#6  0x00002aaaaae78aa1 in _render_glyph_outline (face=0x11b78c0, 
    font_options=0x10b9b58, surface=0x7fffff833518)
    at /opt/joro/firefox/mozilla/gfx/cairo/cairo/src/cairo-ft-font.c:1014
#7  0x00002aaaaae79c17 in _cairo_ft_scaled_glyph_init (
    abstract_font=0x10b9ae0, scaled_glyph=0x10b9ea0, info=3)
Fixed.
Status: ASSIGNED → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED
Flags: blocking1.9a1?
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: