Closed Bug 1660490 Opened 1 year ago Closed 1 year ago

Crash in [@ RetainedDisplayListBuilder::MergeDisplayLists]

Categories

(Core :: Print Preview, defect, P1)

Firefox 81
defect

Tracking

()

VERIFIED FIXED
82 Branch
Tracking Status
firefox-esr68 --- unaffected
firefox-esr78 --- unaffected
firefox79 --- unaffected
firefox80 --- unaffected
firefox81 --- verified
firefox82 --- verified

People

(Reporter: Anca, Assigned: hiro)

References

(Blocks 2 open bugs, Regression, )

Details

(Keywords: crash, regression, Whiteboard: [print2020_v81] [old-ui-])

Crash Data

Attachments

(3 files, 1 obsolete file)

Attached image screencast issue .gif

Crash report: https://crash-stats.mozilla.org/report/index/b0d02945-49b0-41ef-82e6-9aa120200821

Top 10 frames of crashing thread:

0 xul.dll RetainedDisplayListBuilder::MergeDisplayLists layout/painting/RetainedDisplayListBuilder.cpp:838
1 xul.dll RetainedDisplayListBuilder::MergeDisplayLists layout/painting/RetainedDisplayListBuilder.cpp:838
2 xul.dll RetainedDisplayListBuilder::MergeDisplayLists layout/painting/RetainedDisplayListBuilder.cpp:838
3 xul.dll RetainedDisplayListBuilder::MergeDisplayLists layout/painting/RetainedDisplayListBuilder.cpp:838
4 xul.dll RetainedDisplayListBuilder::MergeDisplayLists layout/painting/RetainedDisplayListBuilder.cpp:838
5 xul.dll RetainedDisplayListBuilder::MergeDisplayLists layout/painting/RetainedDisplayListBuilder.cpp:838
6 xul.dll RetainedDisplayListBuilder::AttemptPartialUpdate layout/painting/RetainedDisplayListBuilder.cpp:1498
7 xul.dll static nsLayoutUtils::PaintFrame layout/base/nsLayoutUtils.cpp:4216
8 xul.dll mozilla::PresShell::Paint layout/base/PresShell.cpp:6371
9 xul.dll nsViewManager::ProcessPendingUpdatesPaint view/nsViewManager.cpp:460

Affected versions

  • 81.0a1

Affected platforms

  • macOS 10.14
  • Windows 10

Unaffected platforms

  • Ubuntu 18.04

Steps to reproduce

  1. Launch Firefox
  2. Make sure print.tab_modal.enabled is set on true
  3. Go to https://www.w3schools.com/html/html_iframe.asp
  4. Bring the context menu up with right click
  5. Go to This Frame - Print Frame

Actual result

  • Tab crashes

Regression range

Suggested severity

  • S3

This feels like a Core bug, but I think jwatt knows about it already, so may dupe it to a different one…

Component: Printing → Print Preview
Product: Toolkit → Core

I can't reproduce with the STR on macOS 10.15.

Severity: -- → S3
Priority: -- → P1

I believe bug 1660291 fixed this, at least on Linux, I did double check that reverting bug 1660291 causes Hit MOZ_CRASH(Item found was in the wrong list! type 282 (outer type was 55 at depth 4, now is 2)).

One thing I don't yet understand is that bug 1660291 and bug 1659457 (regressor) were in the same merge window (https://hg.mozilla.org/mozilla-central/pushloghtml?changeset=e375b85cfba38ff5f49493d1d48b7561f7f2f8d7).

Also note that I tried to reproduce the crash on Window, but can't.

Anca, would you mind trying to reproduce on the latest nightly? If it doesn't reproduce there, I am going to close this.

Depends on: 1660291
Flags: needinfo?(anca.soncutean)

(In reply to Hiroyuki Ikezoe (:hiro) from comment #3)

Anca, would you mind trying to reproduce on the latest nightly? If it doesn't reproduce there, I am going to close this.

I can still reproduce the crash with the latest Nightly (2020-08-23), both on Windows 10 and macOS 10.15

Flags: needinfo?(anca.soncutean)

I couldn't repro the crash on a release build, but on a debug build I get:

[Child 236363, Main Thread] ###!!! ASSERTION: Primary child list can have at most one frame in it: 'aListID != kPrincipalList || aChildList.IsEmpty() || aChildList.OnlyChild()', file /home/emilio/src/moz/gecko-4/layout/generic/nsCanvasFrame.cpp, line 262

Which seems like it may plausibly cause such an issue.

Nah, turns out that we put multiple placeholders on the canvas frame from ReplicateFixedFrames: https://searchfox.org/mozilla-central/rev/62f6cc5d9c829bc0c6f18e25f93203a98681ac97/layout/base/nsCSSFrameConstructor.cpp#8163

I could reproduce the crash on a local build, to reproduce the crash the iframe content should have been scrolled to some extent (that said, it's not 100% reproducible).

What I've noticed so far is

  1. The crash happens in the original document, not in the cloned one.
  2. It looks like somewhat related to APZ, especially apz.allow_zooming. If apz.allow_zooming is false, I can't reproduce the crash (but I am not 100% sure). And with enabling the pref I got this assertion

Assertion failure: ancestorTransform.IsIdentity(), at gfx/layers/composite/AsyncCompositionManager.cpp:430

(That said, the assertion might not be related to the crash itself)

CCing APZ folks.

I can reproduce the tab crash in Nightly. It also seems easy for me to reproduce with apz.allow_zooming = true and I haven't reproduced with apz.allow_zooming = false. Pretty good evidence that it is zooming related. The zoom item in the display list might be the cause/related.

In a debug build, with webrender I hit this assert

https://searchfox.org/mozilla-central/rev/19c23d725f27d0989e4a60f36d64004cebb39736/gfx/layers/apz/src/APZCTreeManager.cpp#762

Without webrender I hit the same assert as hiro in comment 7.

I think those asserts indicate we are breaking some assumptions, but I'm not sure if they would have the same fix as the MergeDisplayLists assert failure or not.

The numbers in

Hit MOZ_CRASH(Item found was in the wrong list! type 282 (outer type was 55 at depth 4, now is 2))

indicate that an item was inside a display item of type subdocument on the last paint but is now inside a display item of type async zoom this paint, and that happened without an invalidation.

I dumped the display list just before hitting that assert and there are two nsDisplayAsyncZoom that are nested: one for the https://www.w3schools.com/html/html_iframe.asp document and one of the iframe inside it. We only create nsDisplayAsyncZoom if IsRootContentDocumentCrossProcess is true here

https://searchfox.org/mozilla-central/rev/19c23d725f27d0989e4a60f36d64004cebb39736/layout/generic/nsGfxScrollFrame.cpp#3563

So something is getting confused and marking that iframe as a root content document when it is definitely not. I think that is the root cause and explains all of the other problems here.

Thank you, Timothy! I am going to try find where the root flag gets confused.

Flags: needinfo?(hikezoe.birchill)

It comes from this SetContainer call in Document::CreateStaticClone. The container (docshell) is the cloned one, and its BrowsingContext doesn't have parent. That's the reason why the document is confused as top level content root. I will track down where the doc shell is created.

Note that bug 1636728 changes the cloning process significantly, probably fixing this.

So worth checking whether the bug repros on an autoland build I guess.

I still get a tab crash with autoland build from https://hg.mozilla.org/integration/autoland/rev/81666c6d64bf17651ddc3445e6629ca652dcbb7d which has the patches from bug 1636728.

Timothy beat me. :) (I am still building autoland)

This is bit annoying. In this case the cloned docshell, thus the cloned document will be a top level content document, so it's correct. What I don't quite understand is why we call SetContainer with the cloned docshell for the original document? I am asking Olli about it in the DOM matrix channel now.

Flags: needinfo?(hikezoe.birchill)

I tried to add a mochitest in printpreview_helper.xhtml, but it doesn't cause the crash unfortunately.

I haven't seen the crash on other documents having iframes such as https://developer.mozilla.org/en-US/docs/Web/API/Intersection_Observer_API#Thresholds. So presumably there are other factors to cause the crash (to be precise it's a diagnostic assertion in the RDL stuff) I haven't been aware of.

Also note that I pushed another try run with "-u all" because I am kind of feeling auto try for print stuff changes runs fewer tests than other kinds of changes.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=2163e1107e31a02e12e72a8e35635fe1a26d0371

Assignee: nobody → hikezoe.birchill
Status: NEW → ASSIGNED

As Timothy suggested to me on Matrix, I added a position:fixed element in the iframe, but it doesn't help to reproduce the crash either on Windows or on Linux.

I also pushed a try run, it didn't crash on try servers either. (It seems there is no WebRender version of Mochitest chrome tasks)
https://treeherder.mozilla.org/#/jobs?repo=try&revision=cb023c8f3552388ae8bc2affff003c70d7267059

So, I will go without any automated tests.

Attachment #9172062 - Attachment is obsolete: true
Pushed by hikezoe.birchill@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/a50f5f23987c
Reset `IsTopLevelContentDocument` and `IsContentDocument` depending on the given docshell in Document::SetContainer. r=smaug
Status: ASSIGNED → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → 82 Branch

Comment on attachment 9172078 [details]
Bug 1660490 - Reset IsTopLevelContentDocument and IsContentDocument depending on the given docshell in Document::SetContainer. r?smaug

Beta/Release Uplift Approval Request

  • User impact if declined: On betas it causes crashes. On releases it's hard to predict, apparently a bunch of APZ features will be broken.
  • Is this code covered by automated tests?: No
  • Has the fix been verified in Nightly?: No
  • Needs manual test from QE?: Yes
  • If yes, steps to reproduce: Please see comment 0
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): The change made the function in question sane, but if there is code relying on the old odd behavior, the same thing what I wrote in "User impact if declined" will happen in different way, but it unlikely happens. (I did check all call sites of the function)
  • String changes made/needed: none
Attachment #9172078 - Flags: approval-mozilla-beta?
Flags: qe-verify+

The crash is no longer reproducible with the latest Nightly (2020-08-27) on Windows 10 and macOS 10.14.

Comment on attachment 9172078 [details]
Bug 1660490 - Reset IsTopLevelContentDocument and IsContentDocument depending on the given docshell in Document::SetContainer. r?smaug

Approved for 81.0b4.

Attachment #9172078 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
QA Whiteboard: [qa-triaged]

The crash is no longer reproducible with Beta 81.0b4 on Windows 10 and macOS 10.14.

Status: RESOLVED → VERIFIED
Flags: qe-verify+
You need to log in before you can comment on or make changes to this bug.