Crash in [@ mozilla::layers::CanvasTranslator::TranslateRecording] - event type: 18
Categories
(Core :: Graphics, defect, P1)
Tracking
Tracking | Status
---|---
firefox-esr60 | unaffected
firefox-esr68 | unaffected
firefox67 | unaffected
firefox67.0.1 | unaffected
firefox68 | unaffected
firefox69 | disabled
firefox72 | disabled
firefox73 | disabled
firefox74 | disabled
firefox75 | fixed
People
(Reporter: calixte, Assigned: bobowen)
References
(Blocks 1 open bug, Regression)
Details
(Keywords: crash, regression)
Crash Data
Attachments
(2 files)
This bug is for crash report bp-7b64357d-3cb1-4aa2-a870-00ffc0190609.
Top 10 frames of crashing thread:
0 xul.dll void CrashStatsLogForwarder::CrashAction gfx/thebes/gfxPlatform.cpp:398
1 xul.dll mozilla::gfx::Log<1, mozilla::gfx::CriticalLogger>::Flush gfx/2d/Logging.h:288
2 xul.dll void mozilla::gfx::Log<1, mozilla::gfx::CriticalLogger>::~Log gfx/2d/Logging.h:281
3 xul.dll mozilla::layers::CanvasTranslator::TranslateRecording gfx/layers/CanvasTranslator.cpp:94
4 xul.dll void mozilla::layers::CanvasParent::StartTranslation gfx/layers/ipc/CanvasParent.cpp:144
5 xul.dll nsresult mozilla::detail::RunnableMethodImpl< xpcom/threads/nsThreadUtils.h:1176
6 xul.dll nsresult nsThreadPool::Run xpcom/threads/nsThreadPool.cpp:244
7 xul.dll nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1176
8 xul.dll NS_ProcessNextEvent xpcom/threads/nsThreadUtils.cpp:486
9 xul.dll mozilla::ipc::MessagePumpForNonMainThreads::Run ipc/glue/MessagePump.cpp:303
There are 2 crashes (from 1 installation) in nightly 69 with buildid 20190609090758. Analysis of the backtrace suggests the regression may have been introduced by patch [1], which fixed bug 1464032.
[1] https://hg.mozilla.org/mozilla-central/rev?node=04067aec22bb
Updated•5 years ago
Updated•5 years ago
Assignee
Comment 1•5 years ago
This crash occurred while creating a SourceSurface during recording playback in the GPU process.
The fact that it has gone away could simply be down to the pref being disabled again, so I'm going to set this to block release; hopefully we'll see it again on Nightly once I enable the feature by default (if it isn't already fixed).
Comment 2•5 years ago
I enabled gfx.canvas.remote yesterday, restarted Nightly, browsed the demos at https://davidwalsh.name/canvas-demos, and crashed on http://codepen.io/ara_machine/pen/nuJCG giving crash report bp-ee5e9be6-4ff5-4763-b078-3fe670190911 with this signature.
I was not able to reproduce the crash just now.
Assignee
Comment 3•5 years ago
(In reply to Alistair Vining from comment #2)
> I enabled gfx.canvas.remote yesterday, restarted Nightly, browsed the demos at https://davidwalsh.name/canvas-demos, and crashed on http://codepen.io/ara_machine/pen/nuJCG giving crash report bp-ee5e9be6-4ff5-4763-b078-3fe670190911 with this signature.
> I was not able to reproduce the crash just now.
Hi Alistair, thanks for testing and reporting this.
I think I've managed to reproduce this.
It looks like an issue in the Path recording/translation in some edge case.
Hopefully it won't take too long to pinpoint the problem.
Assignee
Comment 4•5 years ago
The crash in TranslateRecording fires whenever any event type fails to play back, so I'm going to split this out into a number of different bugs.
This bug has already discussed event type 18 (PATHCREATION), so I'll scope this one to that event type.
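For context, the playback loop reads recorded events in order and dispatches on each event's type; a failure in any one event aborts translation, which is why the crash annotation carries the failing type. A minimal sketch of that pattern, using hypothetical names (the real Gecko RecordedEvent hierarchy and CanvasTranslator are considerably richer):

```cpp
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a recorded-event handler; returns false when
// playback of that event fails (e.g. a SourceSurface could not be created).
using EventHandler = std::function<bool()>;

// Plays events back in order; on failure, reports which event type broke,
// mirroring the "event type: 18" annotation in this bug's crash signature.
std::string TranslateRecording(const std::map<int32_t, EventHandler>& aHandlers,
                               const std::vector<int32_t>& aEvents) {
  for (int32_t type : aEvents) {
    auto it = aHandlers.find(type);
    if (it == aHandlers.end() || !it->second()) {
      // In this bug's numbering, type 18 corresponds to PATHCREATION.
      return "Failed to play back event type: " + std::to_string(type);
    }
  }
  return "ok";
}
```

Recording the failing type in the error string is what lets crash reports with the same top frame be split into per-event-type bugs, as done here.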
Assignee
Updated•5 years ago
Assignee
Comment 5•5 years ago
Having split all these out, it looks like they are probably all due to reads from the ring buffer failing.
So I think I need to treat those differently, probably by adding critical notes/errors so the failures can be picked up in subsequent crash reports if they occur.
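The "critical notes" idea is to accumulate short diagnostic messages that end up attached to any later crash report, so a read failure leaves a trace even when it doesn't crash by itself. A hypothetical sketch of such an accumulator (Gecko's actual mechanism is the graphics critical-error logger feeding crash annotations, not this class):

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical accumulator for critical notes. In a real browser this would
// feed a crash-annotation string so the notes appear in subsequent reports.
class CriticalNotes {
 public:
  void Append(const std::string& aNote) { mNotes.push_back(aNote); }

  // Joins the notes into the single pipe-separated string a crash
  // annotation field would carry.
  std::string Annotation() const {
    std::ostringstream out;
    for (const auto& note : mNotes) {
      out << "|" << note;
    }
    return out.str();
  }

 private:
  std::vector<std::string> mNotes;
};
```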
Assignee
Comment 6•5 years ago
I think the read failures can only happen because the content process has timed out trying to write, or because it has shut down.
So my plan is to wait only a short timeout when writing and then create a new shared memory buffer, so that we are never left waiting.
As for the shutdown scenario, we will no longer crash on event translation failure once bug 1598585 lands.
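The plan above, waiting only briefly for the reader to catch up and otherwise switching to a fresh buffer, can be sketched as follows. All names here are hypothetical; the real CanvasEventRingBuffer and its shared-memory plumbing are more involved:

```cpp
#include <atomic>
#include <chrono>
#include <cstddef>
#include <thread>

// Hypothetical fixed-size ring buffer; only the back-pressure logic matters
// for this sketch, so the payload storage is omitted.
struct RingBuffer {
  std::atomic<size_t> read{0};
  std::atomic<size_t> write{0};
  static constexpr size_t kSize = 1024;
  bool Full() const { return write.load() - read.load() >= kSize; }
};

// Returns true if the write was accepted into aBuffer; false means the
// caller should allocate a new shared-memory buffer rather than blocking
// indefinitely on a reader that may never catch up.
bool TryWriteWithTimeout(RingBuffer& aBuffer,
                         std::chrono::milliseconds aTimeout) {
  auto deadline = std::chrono::steady_clock::now() + aTimeout;
  while (aBuffer.Full()) {
    if (std::chrono::steady_clock::now() >= deadline) {
      return false;  // Short timeout expired; switch to a new buffer.
    }
    std::this_thread::yield();
  }
  aBuffer.write.fetch_add(1);
  return true;
}
```

The design point is that the writer never waits longer than the timeout, which removes the failure mode where the GPU-side reader sees a half-written event because the content side gave up mid-write.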
Assignee
Comment 10•5 years ago
(In reply to Bob Owen (:bobowen) from comment #6)
> I think the read failures can only happen because the content process has timed out trying to write, or because it has shut down.
> So my plan is to wait only a short timeout when writing and then create a new shared memory buffer, so that we are never left waiting.
> As for the shutdown scenario, we will no longer crash on event translation failure once bug 1598585 lands.
I did get something working for creating new shared memory, but ran into a bit of a problem with passing the handle.
On reflection I decided that the shutdown scenario is probably more likely, so initially I'm just going to land some more logging given that the failure itself will no longer cause a crash.
Assignee
Comment 11•5 years ago
Depends on D61825
Assignee
Comment 12•5 years ago
If the other side crashed with AboutToWait set in CanvasEventRingBuffer, then in theory we could spin forever waiting for it to change.
Depends on D61826
Assignee
Comment 13•5 years ago
Comment 14•5 years ago
Comment 15•5 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/3b8384aee07a
https://hg.mozilla.org/mozilla-central/rev/d2a8ac40ce6b
Updated•5 years ago
Updated•5 years ago
Updated•3 years ago