Closed Bug 1460526 Opened 2 years ago Closed 2 years ago

Intermittent /html/semantics/embedded-content/image-maps/image-map-processing-model/hash-name-reference.html | application crashed [@ MOZ_CrashPrintf]

Categories

(Core :: Web Painting, defect, P1, critical)

defect

Tracking

()

RESOLVED FIXED
mozilla62
Tracking Status
firefox-esr52 --- unaffected
firefox-esr60 --- unaffected
firefox60 --- unaffected
firefox61 + fixed
firefox62 + fixed

People

(Reporter: intermittent-bug-filer, Assigned: mattwoodrow)

References

(Blocks 1 open bug)

Details

(Keywords: crash, intermittent-failure, Whiteboard: [retriggered][stockwell fixed:product])

Crash Data

Attachments

(1 file, 2 obsolete files)

Filed by: ncsoregi [at] mozilla.com

https://treeherder.mozilla.org/logviewer.html#?job_id=177806118&repo=mozilla-inbound

https://queue.taskcluster.net/v1/task/PI2mtHzaRFChLOhkxiIPMQ/runs/0/artifacts/public/logs/live_backing.log

[task 2018-05-10T04:29:54.339Z] 04:29:54     INFO - TEST-START | /html/semantics/embedded-content/image-maps/image-map-processing-model/hash-name-reference.html
[task 2018-05-10T04:29:55.413Z] 04:29:55     INFO - PID 9184 | [Parent 9184, Gecko_IOThread] WARNING: pipe error (67): Connection reset by peer: file /builds/worker/workspace/build/src/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 353
[task 2018-05-10T04:29:55.413Z] 04:29:55     INFO - PID 9184 | ###!!! [Parent][MessageChannel] Error: (msgtype=0x16007F,name=PBrowser::Msg_Destroy) Channel error: cannot send/recv
[task 2018-05-10T04:29:55.414Z] 04:29:55     INFO - PID 9184 | ###!!! [Parent][MessageChannel] Error: (msgtype=0x16007F,name=PBrowser::Msg_Destroy) Channel error: cannot send/recv
[task 2018-05-10T04:29:55.474Z] 04:29:55     INFO - PID 9184 | ###!!! [Parent][MessageChannel] Error: (msgtype=0x16007F,name=PBrowser::Msg_Destroy) Channel error: cannot send/recv
[task 2018-05-10T04:29:55.523Z] 04:29:55     INFO - PID 9184 | A content process crashed and MOZ_CRASHREPORTER_SHUTDOWN is set, shutting down
[task 2018-05-10T04:29:55.673Z] 04:29:55     INFO - mozcrash Downloading symbols from: https://queue.taskcluster.net/v1/task/CIeCQAHFSayVs8kf1PcGPQ/artifacts/public/build/target.crashreporter-symbols.zip
[task 2018-05-10T04:29:55.750Z] 04:29:55     INFO - PID 9184 | *** UTM:SVC TimerManager:registerTimer called after profile-before-change notification. Ignoring timer registration for id: telemetry_modules_ping
[task 2018-05-10T04:30:02.038Z] 04:30:02     INFO - mozcrash Copy/paste: /usr/local/bin/linux64-minidump_stackwalk /tmp/tmp1WkfEZ.mozrunner/minidumps/249a791f-2219-8c48-2efb-8c7fcd973073.dmp /tmp/tmpbmVSCB
[task 2018-05-10T04:30:10.501Z] 04:30:10     INFO - mozcrash Saved minidump as /builds/worker/workspace/build/blobber_upload_dir/249a791f-2219-8c48-2efb-8c7fcd973073.dmp
[task 2018-05-10T04:30:10.501Z] 04:30:10     INFO - mozcrash Saved app info as /builds/worker/workspace/build/blobber_upload_dir/249a791f-2219-8c48-2efb-8c7fcd973073.extra
[task 2018-05-10T04:30:10.564Z] 04:30:10     INFO - PROCESS-CRASH | /html/semantics/embedded-content/image-maps/image-map-processing-model/hash-name-reference.html | application crashed [@ MOZ_CrashPrintf]
From the crash stack:

 0  firefox!MOZ_CrashPrintf
 1  libxul.so!InvalidArrayIndex_CRASH
 2  libxul.so!RetainedDisplayListBuilder::MergeDisplayLists
 3  libxul.so!MergeState::ProcessItemFromNewList
 4  libxul.so!RetainedDisplayListBuilder::MergeDisplayLists
 5  libxul.so!RetainedDisplayListBuilder::AttemptPartialUpdate
 6  libxul.so!nsLayoutUtils::PaintFrame
Component: DOM → Layout: View Rendering
Hi,

I did some retriggers for this bug, and I think that the culprit is this changeset (6ae810d395d1):

https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&filter-searchStr=linux%20opt-web-platform-tests-e10s&tochange=1feb2371c435c5928e8c7b0abcc1a695b4ef4b60&fromchange=3b3ca6f189ae2981942a77aadb1059ea15e10852

The same test ran in 5 pushes before this one and all jobs are green.

@Matt, can you please take a look?
Flags: needinfo?(matt.woodrow)
Whiteboard: [stockwell needswork:owner] → [stockwell needswork:owner] [retriggered]
Disabled this on linux and osx, should we also disable this on Windows?
Flags: needinfo?(jmaher)
Attachment #8975416 - Flags: review?(jmaher)
I'm working on tracking this down.
Assignee: nobody → matt.woodrow
Flags: needinfo?(matt.woodrow)
Comment on attachment 8975416 [details] [diff] [review]
Disabled hash-name-reference.html on linux and osx

Review of attachment 8975416 [details] [diff] [review]:
-----------------------------------------------------------------

as this is a web-platform-test, skip-if would need to be disabled, for example:
https://searchfox.org/mozilla-central/source/testing/web-platform/meta/IndexedDB/idbdatabase-createObjectStore-exception-order.htm.ini#2

as this is frequent on osx, linux, windows, it would just be disabled everywhere, similar to the example I just referenced.

Keep in mind :mattwoodrow is looking into this now, I assume we will get some traction, but it is good to have a patch ready to land if we are sitting without a fix in 48 hours, so lets get a patch ready.
Attachment #8975416 - Flags: review?(jmaher) → review-
Flags: needinfo?(jmaher)
Flags: needinfo?(jmaher)
Attachment #8975493 - Flags: review?(jmaher)
Attachment #8975416 - Attachment is obsolete: true
Flags: needinfo?(jmaher)
Comment on attachment 8975493 [details] [diff] [review]
Disabled hash-name-reference.html for frequent failures

Review of attachment 8975493 [details] [diff] [review]:
-----------------------------------------------------------------

great, lets land this in 2 days unless there is a r+ patch.
Attachment #8975493 - Flags: review?(jmaher) → review+
needinfo myself to revisit
Flags: needinfo?(jmaher)
Duplicate of this bug: 1460591
Duplicate of this bug: 1461019
This crash seems to be caused by an access to invalid array index[1], in this case index 0 of an empty array. Is this patch fixing something else?

[1]: https://searchfox.org/mozilla-central/rev/2b9779c59390ecc47be7a70d99753653d8eb5afc/layout/painting/RetainedDisplayListBuilder.cpp#303
Flags: needinfo?(matt.woodrow)
(In reply to Miko Mynttinen [:miko] from comment #17)
> This crash seems to be caused by an access to invalid array index[1], in
> this case index 0 of an empty array. Is this patch fixing something else?
> 
> [1]:
> https://searchfox.org/mozilla-central/rev/
> 2b9779c59390ecc47be7a70d99753653d8eb5afc/layout/painting/
> RetainedDisplayListBuilder.cpp#303

Sorry, this is a bit confusing.

We're doing the lookup for 'does an old item exist' by checking the display items on the frame, finding one of the same key, and then returning what index it was in its display list.

This can go wrong if there was more than one match (we built two identical items) or if the new item is in a different list to the old item (and we didn't mark it as invalid).

In either of those cases, the returned index is for the wrong list, and trying to access it will either crash due to an invalid index, or crash soon after because the contents of that index was the wrong type of display item. The patch in bug 1459997 makes us crash in a clearer way if this happens.

This fixes a cases where we build duplicate items, which triggers the assertions.
Flags: needinfo?(matt.woodrow)
Comment on attachment 8975705 [details]
Bug 1460526 - Don't attempt to do a partial rebuild when a sublist DAG gets too complex, always rebuild the whole display list.

https://reviewboard.mozilla.org/r/243932/#review250120

LGTM.
Attachment #8975705 - Flags: review?(mikokm) → review+
Pushed by mwoodrow@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/21ab7bb3b993
Don't attempt to do a partial rebuild when a sublist DAG gets too complex, always rebuild the whole display list. r=miko
Comment on attachment 8975493 [details] [diff] [review]
Disabled hash-name-reference.html for frequent failures

thanks for getting a fix in
Attachment #8975493 - Attachment is obsolete: true
Flags: needinfo?(jmaher)
Backout by aiakab@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/88d727680cb5
Backed out changeset 21ab7bb3b993 for failing reftest on worker/workspace/build/src/layout/painting/RetainedDisplayListBuilder
Backed out changeset 21ab7bb3b993 (bug 1460526) for failing reftest on worker/workspace/build/src/layout/painting/RetainedDisplayListBuilder

Logs:https://treeherder.mozilla.org/logviewer.html#?job_id=178646523&repo=autoland

Backout revision https://treeherder.mozilla.org/logviewer.html#?job_id=178646523&repo=autoland
Failing push https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=21ab7bb3b993f2b2778f9dde10ff16c3c20f320f&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-classifiedState=unclassified

Part of the log:16:34:21     INFO - REFTEST TEST-START | file:///Users/cltbld/tasks/task_1526426854/build/tests/reftest/tests/layout/reftests/xul/mac-tab-toolbar.xul == file:///Users/cltbld/tasks/task_1526426854/build/tests/reftest/tests/layout/reftests/xul/mac-tab-toolbar-ref.xul
16:34:21     INFO - REFTEST TEST-LOAD | file:///Users/cltbld/tasks/task_1526426854/build/tests/reftest/tests/layout/reftests/xul/mac-tab-toolbar.xul | 7 / 76 (9%)
16:34:21     INFO - ++DOMWINDOW == 22 (0x11b51c400) [pid = 1942] [serial = 22] [outer = 0x109c3a600]
16:34:22     INFO - JavaScript error: chrome://global/content/bindings/tabbox.xml, line 209: ReferenceError: Cc is not defined
16:34:22     INFO - Assertion failure: aList->mOldItems.IsEmpty(), at /builds/worker/workspace/build/src/layout/painting/RetainedDisplayListBuilder.cpp:118
16:35:03     INFO - #01: RetainedDisplayListBuilder::AttemptPartialUpdate(unsigned int, mozilla::DisplayListChecker*) [layout/painting/RetainedDisplayListBuilder.cpp:1116]
16:35:03     INFO - 
16:35:03     INFO - #02: nsLayoutUtils::PaintFrame(gfxContext*, nsIFrame*, nsRegion const&, unsigned int, nsDisplayListBuilderMode, nsLayoutUtils::PaintFrameFlags) [layout/base/nsLayoutUtils.cpp:3685]
16:35:03     INFO - 
16:35:03     INFO - #03: mozilla::PresShell::Paint(nsView*, nsRegion const&, unsigned int) [layout/base/PresShell.cpp:6173]
16:35:03     INFO - 
16:35:03     INFO - #04: nsDOMWindowUtils::UpdateLayerTree() [gfx/src/nsRegion.h:495]
16:35:03     INFO - 
16:35:03     INFO - ###!!! [Parent][MessageChannel] Error: (msgtype=0x16007F,name=PBrowser::Msg_Destroy) Channel error: cannot send/recv
Flags: needinfo?(matt.woodrow)
Depends on: 1461812
Flags: needinfo?(matt.woodrow)
Attachment #8975493 - Attachment is obsolete: false
Attachment #8975705 - Attachment is obsolete: true
Comment on attachment 8975705 [details]
Bug 1460526 - Don't attempt to do a partial rebuild when a sublist DAG gets too complex, always rebuild the whole display list.

This is green on Try with bug 1461812 landed first.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=fcd6399f1bd6d8280850ab24f87312b5bfbac3e4
Attachment #8975705 - Attachment is obsolete: false
Attachment #8975493 - Attachment is obsolete: true
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/985c842e7cba
Don't attempt to do a partial rebuild when a sublist DAG gets too complex, always rebuild the whole display list. r=miko
Keywords: checkin-needed
Priority: -- → P1
https://hg.mozilla.org/mozilla-central/rev/985c842e7cba
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla62
Comment on attachment 8975705 [details]
Bug 1460526 - Don't attempt to do a partial rebuild when a sublist DAG gets too complex, always rebuild the whole display list.

Approval Request Comment
[Feature/Bug causing the regression]: Retained-dl, bug 1452464 made it show up on automation though.
[User impact if declined]: Crashes on some pages.
[Is this code covered by automated tests?]: Yes, existing tests caught this issue.
[Has the fix been verified in Nightly?]: Test no longer fail on mozilla-central.
[Needs manual test from QE? If yes, steps to reproduce]: No.
[List of other uplifts needed for the feature/fix]: None.
[Is the change risky?]: No.
[Why is the change risky/not risky?]: Makes the code simpler, bails out from retained-dl, rather than trying to handle the complex partial-retain case.
[String changes made/needed]: None.
Attachment #8975705 - Flags: approval-mozilla-beta?
Comment on attachment 8975705 [details]
Bug 1460526 - Don't attempt to do a partial rebuild when a sublist DAG gets too complex, always rebuild the whole display list.

Retained display list fix needed for the feature to ship in Fx61. Approved for 61.0b6.
Attachment #8975705 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
https://hg.mozilla.org/releases/mozilla-beta/rev/457462d0bb3d
Flags: in-testsuite+
Whiteboard: [retriggered][stockwell disable-recommended] → [retriggered][stockwell fixed:product]
Component: Layout: View Rendering → Layout: Web Painting
You need to log in before you can comment on or make changes to this bug.