Closed Bug 1600491 Opened 5 years ago Closed 5 years ago

Crash in [@ mozilla::gfx::RecordedEventDerived<T>::RecordToStream]

Categories

(Core :: Graphics, defect, P2)

x86
Windows 10
defect

Tracking

()

RESOLVED FIXED
mozilla73
Root Cause Coding: Other
Tracking Status
firefox-esr68 --- unaffected
firefox71 --- unaffected
firefox72 blocking fixed
firefox73 + fixed

People

(Reporter: jseward, Assigned: mattwoodrow)

Details

(Keywords: crash, regression, topcrash)

Crash Data

Attachments

(1 file)

This bug is for crash report bp-f3aff941-e046-41cd-87c0-f99c40191130.

This crash appears in 2 different installations in the Windows nightly of
20191128094109.

Top 10 frames of crashing thread:

0 xul.dll mozilla::gfx::RecordedEventDerived<mozilla::gfx::RecordedSourceSurfaceCreation>::RecordToStream gfx/2d/RecordedEvent.h:421
1 xul.dll mozilla::gfx::DrawEventRecorderMemory::RecordEvent gfx/2d/DrawEventRecorder.cpp:58
2 xul.dll mozilla::gfx::DrawEventRecorderPrivate::StoreSourceSurfaceRecording gfx/2d/DrawEventRecorder.cpp:46
3 xul.dll static void mozilla::gfx::EnsureSurfaceStoredRecording gfx/2d/DrawTargetRecording.cpp:45
4 xul.dll void mozilla::gfx::DrawTargetRecording::FillRect gfx/2d/DrawTargetRecording.cpp:245
5 xul.dll void gfxSurfaceDrawable::DrawInternal gfx/thebes/gfxDrawable.cpp:88
6 xul.dll bool gfxSurfaceDrawable::Draw gfx/thebes/gfxDrawable.cpp:64
7 xul.dll gfxUtils::DrawPixelSnapped gfx/thebes/gfxUtils.cpp:562
8 xul.dll bool mozilla::image::imgFrame::Draw image/imgFrame.cpp:638
9 xul.dll mozilla::image::ImgDrawResult mozilla::image::RasterImage::DrawInternal image/RasterImage.cpp:1380

Flags: needinfo?(bobowencode)

Thanks, looks like this increase was down to my turning on remote canvas.
Hopefully I can address this as I'm looking at handling the failure scenarios more gracefully.
I'll leave the needinfo, until I've had time to look at the dumps.

[Tracking Requested - why for this release]:
the signature is regressing in volume during the 72.0 cycle - it's accounting for 2.15% of tab crashes on beta currently.

Bob, this one is looking bad in 72 with (IIRC) remote canvas disabled.

Yes it looks like it is enabled on beta? I notice most of the crashes are for recording images. Inlining them into the stream probably isn't the best idea, even if we do a copy. I would say we should just record a handle, and allocate a separate buffer. I would also note that with WebRender, images are stored in shared memory, and it just stores an ID in the WebRender blob image recordings; in theory remote canvas could take advantage of this. I would, on the face of it, have no problem with "upgrading" an image to shared memory for non-WR if we detect it is used with a remote canvas.

Severity: normal → major
Assignee: nobody → bobowencode
Severity: major → critical

(In reply to Andrew Osmond [:aosmond] from comment #4)

Yes it looks like it is enabled on beta?

If this is on beta then I don't think it can be the canvas remoting.

I notice most of the crashes are for recording images. Inlining them into the stream probably isn't the best idea, even if we do a copy. I would say we should just record a handle, and allocate a separate buffer.

For the canvas remoting I tried using separate shared memory for images, but because of the currently slow handle replication it was actually slower. The canvas remoting streams all of this through a ring buffer, so as you write it in, it is being read out the other side.
I do think that using separate shared memory eventually would be better.

I would also note that with WebRender, images are stored in shared memory, and it just stores an ID in the WebRender blob image recordings; in theory remote canvas could take advantage of this.

Yeah I noticed this, I intend to add code to detect and do exactly this for canvas remoting.

Flags: needinfo?(bobowencode)

This actually appears to be recording under gfx::PaintFragment::Record.
It is possible that something I have landed affected this, but I can't think what and I'm not familiar with the PaintFragment stuff.

rhunt: any ideas what might have changed here?

Flags: needinfo?(rhunt)

mattwoodrow: you might have an idea about this as well

Flags: needinfo?(matt.woodrow)
Keywords: topcrash

This is indeed for screenshotting, most likely for thumbnails.

It looks like MemStream uses fallible allocations (realloc), but we don't actually check the result anywhere.

I agree that we should probably send large SourceSurfaces as Shmem allocations, though maybe we should just be rasterizing in the content process and only sending the final result.

We actually have really bad performance currently, since we realloc this buffer a lot of times, and then still have to copy to shmem before sending cross-process.

Flags: needinfo?(matt.woodrow)
Priority: -- → P2

Ryan, do you remember if there was a specific reason we decided to serialize drawing commands between processes, rather than just sending a rasterized surface?

Oh, nevermind, I remember. We need drawing commands in case there is an OOP iframe contained within the commands, and we have to wait for that to be available (in the parent) before we can flatten things infront/behind it.

Flags: needinfo?(rhunt)
Assignee: bobowencode → nobody
Crash Signature: [@ mozilla::gfx::RecordedEventDerived<T>::RecordToStream] → [@ mozilla::gfx::RecordedEventDerived<T>::RecordToStream] [@ memcpy | mozilla::gfx::RecordedSourceSurfaceCreation::Record<T> ]
Assignee: nobody → matt.woodrow
Status: NEW → ASSIGNED
Pushed by mwoodrow@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/c238b47d7d57 Try not to crash if we fail to allocate memory in MemStream. r=bobowen
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla73

Can you request beta uplift? Thanks!

Flags: needinfo?(matt.woodrow)

Following up in email.

Comment on attachment 9115673 [details]
Bug 1600491 - Try not to crash if we fail to allocate memory in MemStream. r?bobowen

Beta/Release Uplift Approval Request

  • User impact if declined: Crashes during screenshots (tab thumbnails included).
  • Is this code covered by automated tests?: No
  • Has the fix been verified in Nightly?: No
  • Needs manual test from QE?: No
  • If yes, steps to reproduce: No know STR, likely need system with very low available memory
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): Just some basic null checks to handle allocation failure.
  • String changes made/needed: None
Flags: needinfo?(matt.woodrow)
Attachment #9115673 - Flags: approval-mozilla-beta?

Comment on attachment 9115673 [details]
Bug 1600491 - Try not to crash if we fail to allocate memory in MemStream. r?bobowen

gfx crash fix, approved for 72.0b9

Side note, Matt, it looks like this will leak if realloc fails?

Flags: needinfo?(matt.woodrow)
Attachment #9115673 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

Please specify a root cause for this bug. See :tmaity for more information.

Root Cause: --- → ?
Root Cause: ? → Coding: Other

Bug 1705173 fixed the leak mentioned in comment 19.

Flags: needinfo?(matt.woodrow)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: