Crashes [@ create_protected_copy ] while printing on 127 branch and up
Categories
(Core :: Printing: Output, defect)
Tracking
Version | Tracking | Status
---|---|---
firefox-esr115 | --- | unaffected
firefox126 | --- | unaffected
firefox127 | + | wontfix
firefox128 | --- | fixed
firefox129 | --- | fixed
People
(Reporter: smichaud, Assigned: jfkthame)
References
(Regression)
Details
(Keywords: regression)
Attachments
(2 files)
48 bytes, text/x-phabricator-request
48 bytes, text/x-phabricator-request (phab-bot: approval-mozilla-beta+)
These crashes happen only on macOS 13 and 14, so they may be partly an Apple bug. But they started somewhere on the 127 branch, so presumably they're also at least partly a Mozilla bug. They don't happen frequently enough to pin down a precise regression range.
Edit: They also happen on macOS 12 and 11.
They all have _cairo_quartz_snapshot_create on the stack.
Typical crash stack:
Crashing Thread (0), Name: MainThread
Frame Module Signature Source Trust
0 libsystem_platform.dylib _platform_memmove context
1 CoreGraphics create_protected_copy cfi
2 CoreGraphics CGDataProviderCreateWithCopyOfData cfi
3 CoreGraphics CGDataProviderCreateTrustedWithCopyOfData cfi
4 CoreGraphics CGBitmapContextCreateImage cfi
5 XUL _cairo_quartz_snapshot_create gfx/cairo/cairo/src/cairo-quartz-surface.c:2650 cfi
6 XUL _cairo_quartz_surface_snapshot_get_image gfx/cairo/cairo/src/cairo-quartz-surface.c:2676 cfi
7 XUL _cairo_surface_to_cgimage gfx/cairo/cairo/src/cairo-quartz-surface.c:756 cfi
8 XUL _cairo_quartz_setup_pattern_source gfx/cairo/cairo/src/cairo-quartz-surface.c:987 inlined
8 XUL _cairo_quartz_setup_state gfx/cairo/cairo/src/cairo-quartz-surface.c:1248 cfi
9 XUL _cairo_quartz_cg_fill gfx/cairo/cairo/src/cairo-quartz-surface.c:1858 cfi
10 XUL _cairo_compositor_fill gfx/cairo/cairo/src/cairo-compositor.c:245 cfi
11 XUL _cairo_surface_fill gfx/cairo/cairo/src/cairo-surface.c:2502 cfi
12 XUL _cairo_gstate_fill gfx/cairo/cairo/src/cairo-gstate.c:1352 cfi
13 XUL _moz_cairo_fill_preserve gfx/cairo/cairo/src/cairo.c:2454 cfi
14 XUL mozilla::gfx::DrawTargetCairo::DrawPattern(mozilla::gfx::Pattern const&, mozilla::gfx::StrokeOptions const&, mozilla::gfx::DrawOptions const&, mozilla::gfx::DrawTargetCairo::DrawPatternType, bool) gfx/2d/DrawTargetCairo.cpp:1051 cfi
15 XUL mozilla::gfx::DrawTargetCairo::FillRect(mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits, float> const&, mozilla::gfx::Pattern const&, mozilla::gfx::DrawOptions const&) gfx/2d/DrawTargetCairo.cpp:1101 cfi
16 XUL mozilla::gfx::RecordedFillRect::PlayEvent(mozilla::gfx::Translator*) const gfx/2d/RecordedEventImpl.h:2489 cfi
17 XUL std::__1::__function::__value_func<bool (mozilla::gfx::RecordedEvent*)>::operator()[abi:un170006](mozilla::gfx::RecordedEvent*&&) const /builds/worker/fetches/MacOSX14.4.sdk/usr/include/c++/v1/__functional/function.h:518 inlined
17 XUL std::__1::function<bool (mozilla::gfx::RecordedEvent*)>::operator()(mozilla::gfx::RecordedEvent*) const /builds/worker/fetches/MacOSX14.4.sdk/usr/include/c++/v1/__functional/function.h:1169 inlined
17 XUL mozilla::gfx::RecordedEvent::DoWithEvent<mozilla::gfx::EventStream>(mozilla::gfx::EventStream&, mozilla::gfx::RecordedEvent::EventType, std::__1::function<bool (mozilla::gfx::RecordedEvent*)> const&) gfx/2d/RecordedEventImpl.h:4514 cfi
18 XUL mozilla::layout::PrintTranslator::TranslateRecording(mozilla::layout::PRFileDescStream&) layout/printing/PrintTranslator.cpp:54 cfi
19 XUL mozilla::layout::RemotePrintJobParent::PrintPage(mozilla::gfx::IntSizeTyped<mozilla::gfx::UnknownUnits> const&, mozilla::layout::PRFileDescStream&, nsRefCountedHashtable<nsIntegralHashKey<unsigned long long, 0>, RefPtr<mozilla::gfx::RecordedDependentSurface> >*) layout/printing/ipc/RemotePrintJobParent.cpp:179 cfi
20 XUL mozilla::layout::RemotePrintJobParent::FinishProcessingPage(mozilla::gfx::IntSizeTyped<mozilla::gfx::UnknownUnits> const&, nsRefCountedHashtable<nsIntegralHashKey<unsigned long long, 0>, RefPtr<mozilla::gfx::RecordedDependentSurface> >*) layout/printing/ipc/RemotePrintJobParent.cpp:158 inlined
20 XUL mozilla::layout::RemotePrintJobParent::RecvProcessPage(int const&, int const&, nsTArray<unsigned long long>&&) layout/printing/ipc/RemotePrintJobParent.cpp:132 cfi
21 XUL mozilla::layout::PRemotePrintJobParent::OnMessageReceived(IPC::Message const&) ipc/ipdl/PRemotePrintJobParent.cpp:376 cfi
22 XUL mozilla::dom::PContentParent::OnMessageReceived(IPC::Message const&) ipc/ipdl/PContentParent.cpp:6517 cfi
23 XUL mozilla::ipc::MessageChannel::DispatchAsyncMessage(mozilla::ipc::ActorLifecycleProxy*, IPC::Message const&) ipc/glue/MessageChannel.cpp:1820 inlined
23 XUL mozilla::ipc::MessageChannel::DispatchMessage(mozilla::ipc::ActorLifecycleProxy*, mozilla::UniquePtr<IPC::Message, mozilla::DefaultDelete<IPC::Message> >) ipc/glue/MessageChannel.cpp:1739 inlined
23 XUL mozilla::ipc::MessageChannel::RunMessage(mozilla::ipc::ActorLifecycleProxy*, mozilla::ipc::MessageChannel::MessageTask&) ipc/glue/MessageChannel.cpp:1530 inlined
23 XUL mozilla::ipc::MessageChannel::MessageTask::Run() ipc/glue/MessageChannel.cpp:1630 cfi
24 XUL mozilla::RunnableTask::Run() xpcom/threads/TaskController.cpp:580 inlined
24 XUL mozilla::TaskController::DoExecuteNextTaskOnlyMainThreadInternal(mozilla::detail::BaseAutoLock<mozilla::Mutex&> const&) xpcom/threads/TaskController.cpp:907 inlined
24 XUL mozilla::TaskController::ExecuteNextTaskOnlyMainThreadInternal(mozilla::detail::BaseAutoLock<mozilla::Mutex&> const&) xpcom/threads/TaskController.cpp:730 cfi
25 XUL mozilla::TaskController::ProcessPendingMTTask(bool) xpcom/threads/TaskController.cpp:516 inlined
25 XUL mozilla::TaskController::TaskController()::$_0::operator()() const xpcom/threads/TaskController.cpp:234 inlined
25 XUL mozilla::detail::RunnableFunction<mozilla::TaskController::TaskController()::$_0>::Run() xpcom/threads/nsThreadUtils.h:548 cfi
26 XUL nsThread::ProcessNextEvent(bool, bool*) xpcom/threads/nsThread.cpp:1199 inlined
26 XUL NS_ProcessPendingEvents(nsIThread*, unsigned int) xpcom/threads/nsThreadUtils.cpp:445 cfi
27 XUL nsBaseAppShell::NativeEventCallback() widget/nsBaseAppShell.cpp:87 cfi
28 XUL nsAppShell::ProcessGeckoEvents(void*) widget/cocoa/nsAppShell.mm:541 cfi
29 CoreFoundation __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ cfi
30 CoreFoundation __CFRunLoopDoSource0 cfi
31 CoreFoundation __CFRunLoopDoSources0 cfi
32 CoreFoundation __CFRunLoopRun cfi
33 CoreFoundation CFRunLoopRunSpecific cfi
34 HIToolbox RunCurrentEventLoopInMode cfi
35 HIToolbox ReceiveNextEventCommon cfi
36 HIToolbox _BlockUntilNextEventMatchingListInModeWithFilter cfi
37 AppKit _DPSNextEvent cfi
38 AppKit -[NSApplication(NSEventRouting) _nextEventMatchingEventMask:untilDate:inMode:dequeue:] cfi
39 XUL -[GeckoNSApplication nextEventMatchingMask:untilDate:inMode:dequeue:] widget/cocoa/nsAppShell.mm:196 cfi
40 AppKit -[NSApplication run] cfi
41 XUL -[GeckoNSApplication run] widget/cocoa/nsAppShell.mm:174 cfi
42 XUL nsAppShell::Run() widget/cocoa/nsAppShell.mm:871 cfi
43 XUL nsAppStartup::Run() toolkit/components/startup/nsAppStartup.cpp:296 cfi
44 XUL XREMain::XRE_mainRun() toolkit/xre/nsAppRunner.cpp:5741 cfi
45 XUL XREMain::XRE_main(int, char**, mozilla::BootstrapConfig const&) toolkit/xre/nsAppRunner.cpp:5953 cfi
46 XUL XRE_main(int, char**, mozilla::BootstrapConfig const&) toolkit/xre/nsAppRunner.cpp:6010 cfi
47 firefox do_main(int, char**, char**) browser/app/nsBrowserApp.cpp:230 inlined
47 firefox main browser/app/nsBrowserApp.cpp:448 cfi
48 dyld start cfi
Comment 1 (Assignee) • 23 days ago
Presumably related to the cairo 1.18.0 update, but without concrete STR it may be difficult to investigate as it's clearly not affecting every print operation.
(There was a cairo-quartz fix recently landed in bug 1900028, but one of the crash reports I see comes from the RC1 build (20240603152359) which included that fix, so apparently that wasn't the issue here.)
The CGDataProviderCreateTrustedWithCopyOfData function on the stack is intriguing; I don't see any mention of that on developer.apple.com. Nor does Google have much about create_protected_copy (some internal CoreGraphics thing, presumably).
Anyhow, I'm going to mark this as a regression from bug 1892913, given that this cairo_quartz_surface code underwent significant changes then, but also call it S3 for now, unless it becomes higher-frequency.
Looking at the current crash reports, there are a couple of pairs that look like they might be the same user making two attempts to print something, and crashing both times; if so, perhaps there's hope that we'll get a bug report with a specific page/document that reproduces this.
Comment 2 (Assignee) • 23 days ago
Steven, if you know (or can discover) anything about this CGDataProviderCreateTrustedWithCopyOfData thing that is getting used internally by CGBitmapContextCreateImage, that might give us clues as to what's triggering this. My Google searches have come up with nothing so far...
Comment 3 • 23 days ago
Set release status flags based on info from the regressing bug 1892913
Comment 4 (Reporter) • 23 days ago
So I ran Hopper Disassembler and took a look at the CoreGraphics framework (on an Intel Mac running macOS 13.6.7).
CGDataProviderCreateTrustedWithCopyOfData() just calls CGDataProviderCreateWithCopyOfData() and sets a flag in the object returned. Within the CoreGraphics framework it's only called from CGBitmapContextCreateImage(), and then only if CGContextGetType() returns 0x4 (kCGContextTypeBitmap) or 0xc (unknown type).
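The relationship described above can be modelled in a few lines of portable C. This is only a sketch of the control flow as read from the disassembly: every type and function name with a `model_` prefix is hypothetical, the struct layout is invented for illustration, and nothing here is Apple's actual code or API. What happens for other context types is not known, so the sketch only exposes the type check as a predicate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a data provider; NOT Apple's real layout. */
typedef struct {
    unsigned char *bytes;
    size_t length;
    bool trusted;              /* the flag the "Trusted" variant sets */
} model_provider;

/* Context type values reported in the comment above. */
enum {
    MODEL_CONTEXT_TYPE_BITMAP  = 0x4,  /* kCGContextTypeBitmap */
    MODEL_CONTEXT_TYPE_UNKNOWN = 0xc   /* meaning not identified */
};

/* Models CGDataProviderCreateWithCopyOfData(): copies the caller's bytes. */
model_provider *model_create_with_copy(const void *data, size_t length)
{
    assert(data != NULL || length == 0);
    model_provider *p = malloc(sizeof *p);
    if (p == NULL)
        return NULL;
    p->bytes = malloc(length ? length : 1);
    if (p->bytes == NULL) {
        free(p);
        return NULL;
    }
    if (length)
        memcpy(p->bytes, data, length);
    p->length = length;
    p->trusted = false;
    return p;
}

/* Models the "Trusted" variant: same copy, plus one flag. */
model_provider *model_create_trusted_with_copy(const void *data, size_t length)
{
    model_provider *p = model_create_with_copy(data, length);
    if (p != NULL)
        p->trusted = true;
    return p;
}

/* Models the type check that gates the trusted call. */
bool model_uses_trusted_path(int context_type)
{
    return context_type == MODEL_CONTEXT_TYPE_BITMAP ||
           context_type == MODEL_CONTEXT_TYPE_UNKNOWN;
}

void model_provider_free(model_provider *p)
{
    if (p != NULL) {
        free(p->bytes);
        free(p);
    }
}
```

The point of the model is that the two variants differ only in the flag; the byte copy itself is shared, which is why both show up on the crash stack.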
create_protected_copy() creates a CFData object from raw data, and if it's not greater than vm_page_size also calls vm_protect() on it with set_maximum == true and new_protection == 1 (read-only).
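As a rough illustration of that behaviour, here is a portable C model. It is a sketch under stated assumptions: the `model_` names and the struct are hypothetical, the page size is passed in as a parameter rather than read from vm_page_size, and the vm_protect() call is only recorded in fields rather than actually made, so the code runs on any platform. The size condition is written exactly as described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Value reported above for new_protection (read-only). */
#define MODEL_PROT_READ 1

/* Hypothetical record of what the real function is described as doing. */
typedef struct {
    unsigned char *bytes;
    size_t length;
    bool protect_called;   /* would vm_protect() have been called? */
    bool set_maximum;      /* argument it would have been given */
    int new_protection;    /* argument it would have been given */
} model_protected_copy;

model_protected_copy *model_create_protected_copy(const void *src,
                                                  size_t length,
                                                  size_t page_size)
{
    assert(src != NULL && length > 0);
    model_protected_copy *c = calloc(1, sizeof *c);
    if (c == NULL)
        return NULL;
    c->bytes = malloc(length);
    if (c->bytes == NULL) {
        free(c);
        return NULL;
    }
    /* The copy step. In the crash stack, _platform_memmove is the frame
     * directly above create_protected_copy, so this is the step that
     * corresponds to the crashing frame. */
    memcpy(c->bytes, src, length);
    c->length = length;
    /* Condition as described: only when not greater than the page size. */
    if (length <= page_size) {
        c->protect_called = true;
        c->set_maximum = true;
        c->new_protection = MODEL_PROT_READ;
    }
    return c;
}

void model_protected_copy_free(model_protected_copy *c)
{
    if (c != NULL) {
        free(c->bytes);
        free(c);
    }
}
```

Because the copy reads `length` bytes from `src`, a source buffer that is shorter than the length passed in, or that has already been released, would fault in exactly this step. That is one possible reading of the stack, not a confirmed diagnosis.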
I still don't really know what "trusted" means here. But it has to do with bitmaps. I suppose it could mean "immutable", but create_protected_copy() is called from CGDataProviderCreateWithCopyOfData(), so both providers have immutable data.
Edit: Digging around on https://opensource.apple.com, I found a reference to the term "trusted UI", which might be relevant here. See comment #7 below.
Comment 5 • 23 days ago
(In reply to Jonathan Kew [:jfkthame] from comment #1)
Looking at the current crash reports, there are a couple of pairs that look like they might be the same user making two attempts to print something, and crashing both times; if so, perhaps there's hope that we'll get a bug report with a specific page/document that reproduces this.
meta-note: none of this bug's associated crash reports were submitted with a URL-of-the-page-that-was-loaded. I assume that's because these are all parent-process crashes, and the parent process isn't specific to any one page/URL. Maybe when we enter PContentParent::OnMessageReceived or somesuch, we should make a note of the content process's URL for usage in possible crash reports, in the event that there's a crash? (if the user checks the box in the crash-report dialog to include the URL of the crashing content) I'm not sure whether that's something that's already supposed to just work.
Comment 6 • 23 days ago
[Tracking Requested - why for this release]: We should make sure the volume is not concerning once this hits release.
Comment 7 (Reporter) • 20 days ago
Following up comment #4
I may have figured out what the "trusted" means in CGDataProviderCreateTrustedWithCopyOfData(). If I'm right it doesn't mean "TrustedUI". Instead it means something like "[a bitmap] created the standard way", as opposed to "[a bitmap] created using a CGContextDelegate callback".
CGBitmapContextCreateImage(), before it does anything else, first calls CGContextDelegateImplementsCallback() with type (arg1) set to 0x1a. By digging through the CoreGraphics framework I've found that 0x1a == kCGContextDelegateCreateImage. If this callback is implemented, CGBitmapContextCreateImage() calls it (indirectly, via CGContextDelegateCreateImage()). Otherwise it goes on to create a bitmap "in the standard way".
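The dispatch just described can be sketched as a small C model. All names with a `model_` prefix are hypothetical, as are the callback table, its size, and the callback signature; the CGContextDelegate API is undocumented, so this shows only the decision logic, not Apple's real data structures.

```c
#include <assert.h>
#include <stddef.h>

/* 0x1a == kCGContextDelegateCreateImage, per the analysis above. */
#define MODEL_DELEGATE_CREATE_IMAGE 0x1a
/* Hypothetical table size, chosen only so that 0x1a is a valid index. */
#define MODEL_CALLBACK_SLOTS 0x20

typedef enum {
    MODEL_IMAGE_FROM_DELEGATE,  /* image produced by a delegate callback */
    MODEL_IMAGE_STANDARD        /* image created "in the standard way" */
} model_image_origin;

typedef model_image_origin (*model_callback)(void);

/* Hypothetical delegate: a table of optional callbacks indexed by type. */
typedef struct {
    model_callback callbacks[MODEL_CALLBACK_SLOTS];
} model_delegate;

/* Models CGContextDelegateImplementsCallback(). */
int model_delegate_implements(const model_delegate *d, unsigned type)
{
    assert(type < MODEL_CALLBACK_SLOTS);
    return d != NULL && d->callbacks[type] != NULL;
}

/* Models the first decision made by CGBitmapContextCreateImage(). */
model_image_origin model_bitmap_context_create_image(const model_delegate *d)
{
    if (model_delegate_implements(d, MODEL_DELEGATE_CREATE_IMAGE)) {
        /* Delegate path, standing in for CGContextDelegateCreateImage(). */
        return d->callbacks[MODEL_DELEGATE_CREATE_IMAGE]();
    }
    /* Standard path: the one that reaches the "trusted" copy. */
    return MODEL_IMAGE_STANDARD;
}

/* Example callback, for exercising the delegate path. */
model_image_origin model_sample_callback(void)
{
    return MODEL_IMAGE_FROM_DELEGATE;
}
```

With no callback registered in slot 0x1a, the model takes the standard path, which matches the crash stack: it goes from CGBitmapContextCreateImage straight into CGDataProviderCreateTrustedWithCopyOfData.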
This distinction between "trusted" and "non-trusted" is moot, though. The whole CGContextDelegate API (including CGContextDelegateSetCallback()) is undocumented (though there's been some work to reverse engineer it). So it's highly unlikely that anyone besides Apple uses it. And Apple does use it, in a few cases, to set a kCGContextDelegateCreateImage callback. But the callback is always Apple code, usually also in the CoreGraphics framework (there's one more case in the RenderBox framework).
For the record, the RenderBox callback, whose name is mangled, is create_image(CGContextDelegate*, CGRenderingState*, CGGState*).
Comment 8 • 19 days ago
(In reply to Daniel Holbert [:dholbert] from comment #5)
we should make a note of the content process's URL for usage in possible crash reports
I filed bug 1901639 on this, FWIW.
Comment 9 • 18 days ago
Set release status flags based on info from the regressing bug 1892913
Comment 10 • 17 days ago
The bug is marked as tracked for firefox127 (release). However, the bug still isn't assigned and has low severity.
:fgriffith, could you please find an assignee and increase the severity for this tracked bug? Given that it is a regression and we know the cause, we could also simply back out the regressor. If you disagree with the tracking decision, please talk with the release managers.
For more information, please visit BugBot documentation.
Comment 11 (Assignee) • 10 days ago
This eliminates the new _cairo_quartz_surface_snapshot, and the CGContextRef-based
version of cairo_quartz_image_surface, which seems to be the potentially-problematic
codepath.
(Unfortunately, with no known-crashing URL or steps to reproduce, I don't have
any way to actually test this short of landing it and watching crash-stats.)
Comment 12 (Assignee) • 10 days ago
Here's a possible workaround that we might consider trying. Basically, the idea is to revert part of cairo_quartz_surface to the pre-1.18.0 version, where AFAIK we weren't seeing a crash like this. The implementation of _cairo_surface_to_cgimage and its dependencies is substantially different: cairo_quartz_surface_snapshot doesn't exist, and cairo_quartz_image_surface is backed by a CGImageRef rather than a CGContextRef. Hopefully that will avoid us making the CoreGraphics call that ends up crashing here.
I've pushed a try run at https://treeherder.mozilla.org/jobs?repo=try&revision=8389c00d84d37c993ec292d21de629492d0d6be0 to check how things look there. In a little bit of local testing, printing functionality still seems to work OK; @jwatt, if you're able to do a bit of testing as well, that'd be awesome.
Comment 13 • 9 days ago
Pushed by jkew@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/25ffbca1272c Revert to the pre-1.18.0 version of _cairo_surface_to_cgimage() and cairo_quartz_image_surface code. r=gfx-reviewers,lsalzman
Comment 14 • 9 days ago
bugherder
Comment 15 • 8 days ago
The patch landed in nightly and beta is affected.
:jfkthame, is this bug important enough to require an uplift?
- If yes, please nominate the patch for beta approval.
- If no, please set status-firefox128 to wontfix.
For more information, please visit BugBot documentation.
Comment 16 (Assignee) • 8 days ago
Lacking any known STR, we can't be sure whether this patch will in fact stop the crashes (though I'm hopeful, given that it removes the specific codepath where we're crashing, and reverts to older code that was working OK). Given that watching crash-stats is currently our only way to assess this (and the crash rate is too low for Nightly to provide useful data), I think we should go ahead and take it on beta.
Comment 17 (Assignee) • 8 days ago
This eliminates the new _cairo_quartz_surface_snapshot, and the CGContextRef-based
version of cairo_quartz_image_surface, which seems to be the potentially-problematic
codepath.
(Unfortunately, with no known-crashing URL or steps to reproduce, I don't have
any way to actually test this short of landing it and watching crash-stats.)
Original Revision: https://phabricator.services.mozilla.com/D214297
Comment 18 • 8 days ago
beta Uplift Approval Request
- User impact if declined: possible parent-process crash while printing on macOS
- Code covered by automated testing: no
- Fix verified in Nightly: no
- Needs manual QE test: no
- Steps to reproduce for manual QE testing: no known STR
- Risk associated with taking this patch: low-ish
- Explanation of risk level: basically reverting to an earlier version of the cairo-quartz code; but not entirely risk-free given that surrounding code has changed substantially, so there could be unanticipated side-effects (but limited to macOS printing, the only scenario where this code is used)
- String changes made/needed: none
- Is Android affected?: no
Comment 19 • 4 days ago
uplift
https://hg.mozilla.org/releases/mozilla-beta/rev/ae4a63db00f4