Closed
Bug 1273765
Opened 8 years ago
Closed 4 years ago
GetTextMetrics failure in _cairo_win32_scaled_font_set_metrics, when printing specific PDF
Categories
(Core :: Printing: Output, defect)
Tracking
()
People
(Reporter: n.nethercote, Unassigned)
References
Details
(Keywords: crash)
Crash Data
Attachments
(1 file)
This bug was filed from the Socorro interface and is report bp-246a432d-f35c-4593-9945-054542160517. ============================================================= This is a low-volume crash. I see 22 occurrences in Firefox this year, mostly in 47--49. https://crash-stats.mozilla.com/signature/?product=Firefox&platform=Windows&date=%3E%3D2016-01-01&signature=mozilla%3A%3Agfx%3A%3ARecordedSetTransform%3A%3APlayEvent&_columns=date&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=reason&_columns=address&page=1 I looked at a minidump and confirmed that the problem is that aTranslator->LookupDrawTarget(mDT) is returning nullptr. This clearly isn't supposed to happen, because PrintTranslator::LookupDrawTarget() has MOZ_ASSERT(result), as do all the other Lookup methods in that class. But it is, so somehow the mDrawTargets.GetWeak() call is returning null. bobowen, you've touched printing code recently. Can you please take a look?
Reporter | ||
Updated•8 years ago
|
Flags: needinfo?(bobowen.code)
Reporter | ||
Updated•8 years ago
|
Comment 1•8 years ago
|
||
Thanks njn. Printing via the parent, which uses this code, was briefly in Aurora 47, but is now Nightly only until we roll out the low integrity sandbox on content for Windows. So, those crash versions make sense. I'll try and take a look soon.
Whiteboard: sb?
Updated•8 years ago
|
Whiteboard: sb? → sbwc1
Comment 2•8 years ago
|
||
This seems really low level, but I've actually managed to reproduce it, so I'll take a closer look soon. https://crash-stats.mozilla.com/report/index/7896e4c3-0718-4477-8215-2976a2160630 https://crash-stats.mozilla.com/report/index/907a63ed-3953-455f-b622-ffea22160630 One of these is a duplicate I think, not sure how I managed that.
Updated•8 years ago
|
Assignee: nobody → bobowen.code
Flags: needinfo?(bobowen.code)
Comment 3•8 years ago
|
||
This is failing because in the creation event for that DrawTarget the check at [1] is failing. The status for mCairoSurface is CAIRO_STATUS_NO_MEMORY (1), which is a bit odd because although the child process seems to be eating lots of memory (a known problem with printing PDFs I think), the parent process looks OK. [1] http://searchfox.org/mozilla-central/rev/a7c8e9f3cc323fd707659175a46826ad12899cd1/gfx/thebes/PrintTarget.cpp#55
Status: NEW → ASSIGNED
Comment 4•8 years ago
|
||
This crash can be reproduced by printing pages 109 to 110 of: http://www.cisco.com/c/en/us/td/docs/routers/7200/install_and_upgrade/7200vxr_install_config/72vxicg.pdf The root cause of the CAIRO_STATUS_NO_MEMORY is from [1], which seems like an odd choice of default error to me. This is called at [2] (I've also seen it called at [3] while debugging). If I change this to return CAIRO_INT_STATUS_UNSUPPORTED and update _cairo_font_face_set_error and _cairo_scaled_font_set_error to ignore this, then I think the fallback code at [4] gets used and certainly the crash does not occur. Here's a try push: https://treeherder.mozilla.org/#/jobs?repo=try&revision=deb328082d000c2d6bb9b99afc126738ef5156a6 This problem exists on release, but the printing code there can pick up on the error and it fails with an error dialog saying "There is not enough free memory to print.". I'll file a separate bug over the issue that RecordedEvent::PlayEvent doesn't pass back a result to the Translator, so that errors like these could be handled without a crash. [1] https://hg.mozilla.org/mozilla-central/file/1bee8d2da23e/gfx/cairo/cairo/src/cairo-win32-surface.c#l141 [2] https://hg.mozilla.org/mozilla-central/file/1bee8d2da23e/gfx/cairo/cairo/src/cairo-win32-font.c#l976 [3] https://hg.mozilla.org/mozilla-central/file/1bee8d2da23e/gfx/cairo/cairo/src/cairo-win32-font.c#l458 [4] https://hg.mozilla.org/mozilla-central/file/1bee8d2da23e/gfx/cairo/cairo/src/cairo-surface.c#l2812
Comment 5•8 years ago
|
||
Also: * changed cairo font *_set_error functions to ignore CAIRO_INT_STATUS_UNSUPPORTED. * fixed error print format in _cairo_win32_print_gdi_error. MozReview-Commit-ID: 759hEFLd0vX
Attachment #8769686 -
Flags: review?(jfkthame)
Comment 6•8 years ago
|
||
(In reply to Bob Owen (:bobowen) from comment #4) > This crash can be reproduced by printing pages 109 to 110 of: > http://www.cisco.com/c/en/us/td/docs/routers/7200/install_and_upgrade/ > 7200vxr_install_config/72vxicg.pdf > > The root cause of the is from [1], which seems like > an odd choice of default error to me. > This is called at [2] (I've also seen it called at [3] while debugging). It does seem odd to return CAIRO_STATUS_NO_MEMORY for whatever GDI errors are happening; but what *are* the GDI failures? ISTM we should figure out why they're happening, and whether they indicate some deeper problem. If you can reproduce this in a debug build, you should be seeing the GDI errors reported to stderr by _cairo_win32_print_gdi_error(), afaict... so what are those errors? > > If I change this to return CAIRO_INT_STATUS_UNSUPPORTED and update > _cairo_font_face_set_error and _cairo_scaled_font_set_error to ignore this, > then I think the fallback code at [4] gets used and certainly the crash does > not occur. Another concern here is that surely we shouldn't be crashing even if PrintTarget::MakeDrawTarget() returns null; it looks like it has several failure modes that return null, and whoever calls it needs to be prepared to handle that safely. (I'm not saying the patch here is necessarily the wrong thing to do; the current failure handling does look weird. But before we do this, I'd like to understand better what's actually happening, as otherwise I fear we might just be papering over the real problem.)
Comment 7•8 years ago
|
||
(In reply to Jonathan Kew (:jfkthame) from comment #6) > > The root cause of the is from [1], which seems like > > an odd choice of default error to me. > > This is called at [2] (I've also seen it called at [3] while debugging). > > It does seem odd to return CAIRO_STATUS_NO_MEMORY for whatever GDI errors > are happening; but what *are* the GDI failures? ISTM we should figure out > why they're happening, and whether they indicate some deeper problem. If you > can reproduce this in a debug build, you should be seeing the GDI errors > reported to stderr by _cairo_win32_print_gdi_error(), afaict... so what are > those errors? These fprintfs are not guarded, so they come out in opt builds as well ... perhaps they shouldn't. In this particular crash the error we get is 6 - "The handle is invalid." I couldn't work out why this was happening. It is something to do with when the DWrite fonts get substituted for GDI fonts (to get PDF printing to work properly). Following that those GDI fonts are substituted again, because the surface has a ctm at [1]. I can try to investigate this a bit further, but I thought that once I'd stopped it causing an issue in those other *_set_error functions, returning CAIRO_INT_STATUS_UNSUPPORTED would be no worse than CAIRO_STATUS_NO_MEMORY and in some cases is definitely an improvement. Weirdly, once I fixed the error printing statement, in the crashtest logs for the try push in comment 4, there are a load of "The operation completed successfully.", which is 0. Seems like some of these functions don't set a last error even when they fail. > Another concern here is that surely we shouldn't be crashing even if > PrintTarget::MakeDrawTarget() returns null; it looks like it has several > failure modes that return null, and whoever calls it needs to be prepared to > handle that safely. Agreed, I've filed bug 1285942 for this. [1] https://hg.mozilla.org/mozilla-central/file/214884d507ee/gfx/cairo/cairo/src/cairo-win32-printing-surface.c#l1522
Comment 8•8 years ago
|
||
(In reply to Bob Owen (:bobowen) from comment #7) > In this particular crash the error we get is 6 - "The handle is invalid." > I couldn't work out why this was happening. > It is something to do with when the DWrite fonts get substituted for GDI > fonts (to get PDF printing to work properly). > Following that those GDI fonts are substituted again, because the surface > has a ctm at [1]. Here's a guess... might it be related to use of fonts that were loaded via @font-face? The PDF has embedded fonts; when we render it, I presume we use @font-face to load those resources and make them available for rendering. But if we load a user font resource through the DWrite font backend (i.e. via gfxDWriteFontList::MakePlatformFont, which works via the IDWriteFontFileLoader interface), I doubt that will make it available to the GDI font subsystem; that would require us to activate the resource with AddFontMemResource, as in gfxGDIFontList::MakePlatformFont. In other words, I'm suspicious that it simply doesn't work to substitute GDI fonts for DWrite fonts on the fly when we're dealing with @font-face resources. Maybe that's how we're ending up with invalid resource handles?
Comment 9•8 years ago
|
||
Comment on attachment 8769686 [details] [diff] [review] Return CAIRO_INT_STATUS_UNSUPPORTED from _cairo_win32_print_gdi_error to allow for fallback As far as the actual cairo patch here is concerned, jrmuizel might be a better reviewer; I think he has much more in-depth cairo background, and could assess whether this is the appropriate cairo-esque approach to handling the failures. And as for the root cause of the problems, can you determine whether it is in fact related to user font resources failing to "cross over" between dwrite and gdi, as suggested above? If that's the case, then even if we prevent the crash (which is a good start, obviously) we may still not be getting correct output, as we'll probably still end up using a wrong font.
Attachment #8769686 -
Flags: review?(jfkthame) → review?(jmuizelaar)
Comment 10•8 years ago
|
||
(In reply to Jonathan Kew (:jfkthame) from comment #9) > And as for the root cause of the problems, can you determine whether it is > in fact related to user font resources failing to "cross over" between > dwrite and gdi, as suggested above? If that's the case, then even if we > prevent the crash (which is a good start, obviously) we may still not be > getting correct output, as we'll probably still end up using a wrong font. This might be part of the problem, but wouldn't any web fonts fail to print in that case. I've investigated a bit further and it seems that the original failure is at [1]. We don't check the return value here, so we're left with 0 for metrics.tmPitchAndFamily, which means the is_bitmap flag gets set further down. I notice that the latest cairo code has a check on GetTextMetrics (although not the one below), but if I just add that bit of code it crashes somewhere else in cairo, because another bit of the code doesn't check to see if the scaled_font is valid. If we always want to use GDI and not DWrite fonts for printing, then I suppose I might be able to make that switch in the PrintTranslator. [1] https://hg.mozilla.org/mozilla-central/file/94c926911767/gfx/cairo/cairo/src/cairo-win32-font.c#l895
Comment 11•8 years ago
|
||
(In reply to Bob Owen (:bobowen) from comment #10) > > I've investigated a bit further and it seems that the original failure is at > [1]. > We don't check the return value here, so we're left with 0 for > metrics.tmPitchAndFamily, which means the is_bitmap flag gets set further > down. Can we find out what font GetTextMetrics is failing on?
Flags: needinfo?(bobowen.code)
Comment 12•8 years ago
|
||
I'll look into this again when I get back. Now that bug 1285942 has landed this shouldn't crash any more, so I'll remove the sandbox tracking.
Keywords: crash
Summary: Crash in mozilla::gfx::RecordedSetTransform::PlayEvent → GetTextMetrics failure in _cairo_win32_scaled_font_set_metrics, when printing specific PDF
Whiteboard: sbwc1
Comment 13•8 years ago
|
||
The crash was resolved separately, but as I'd got a fair way towards finding the underlying (possibly long-standing) bug, I'd like to re-look at this at some point. De-assigning myself for the moment as I don't have time right now.
Assignee: bobowencode → nobody
Status: ASSIGNED → NEW
Flags: needinfo?(bobowencode)
Comment 14•8 years ago
|
||
Comment on attachment 8769686 [details] [diff] [review] Return CAIRO_INT_STATUS_UNSUPPORTED from _cairo_win32_print_gdi_error to allow for fallback Clearing this request until I (or someone else) has had time to look at the underlying failure.
Attachment #8769686 -
Flags: review?(jmuizelaar)
Comment 15•7 years ago
|
||
This crash has been spiking recently. I crashed while printing a PDF receipt from regonline.com. Strangely, I've got four crash reports for a single crash: https://crash-stats.mozilla.com/report/index/35a21009-a391-45ca-a88a-a1d020170707 https://crash-stats.mozilla.com/report/index/032e304c-28bc-47bd-8032-213340170707 https://crash-stats.mozilla.com/report/index/671d0ca0-382d-43f5-b087-41c590170707 https://crash-stats.mozilla.com/report/index/ba6330f7-32f1-4d6a-b151-4acb50170707
Comment 16•7 years ago
|
||
It's intermittent for me, so finding a regression range is quite hard.
Comment 17•7 years ago
|
||
the signature is spiking up again in the past few days on 55.0.3 - some comments talk about printing from irs.gov...
Comment 19•6 years ago
|
||
I can reproduce the crash on Nightly59.0a1 x64 windows10. bp-0b61c4cb-8ee1-4d06-9369-002700171222 Reproducible: always Steps To Reproduce: 1. Open twitter e.g. https://twitter.com/d_toybox not need login 2. Print with Microsoft Print to PDF. 3. Input file name, then click on Save button Actual Results: Browser crashes
Keywords: crash,
reproducible
Updated•6 years ago
|
status-firefox59:
--- → affected
Comment 20•6 years ago
|
||
(In reply to Alice0775 White from comment #19) > I can reproduce the crash on Nightly59.0a1 x64 windows10. > bp-0b61c4cb-8ee1-4d06-9369-002700171222 > > Reproducible: always > > Steps To Reproduce: > 1. Open twitter > e.g. https://twitter.com/d_toybox not need login > 2. Print with Microsoft Print to PDF. > 3. Input file name, then click on Save button > > Actual Results: > Browser crashes I filed a separate a new Bug 1426807. Because the regression range is different.
status-firefox59:
affected → ---
Keywords: reproducible
Comment 21•4 years ago
|
||
Closing because no crashes reported for 12 weeks.
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → WORKSFORME
You need to log in
before you can comment on or make changes to this bug.
Description
•