Closed Bug 1468021 Opened 7 years ago Closed 6 years ago

null deref at [@ MergeState::ProcessItemFromNewList]

Categories

(Core :: Web Painting, defect, P2)

defect

Tracking

()

RESOLVED FIXED
mozilla67
Tracking Status
firefox-esr52 --- unaffected
firefox-esr60 --- unaffected
firefox60 --- unaffected
firefox61 + wontfix
firefox62 + wontfix
firefox63 --- wontfix
firefox65 --- wontfix
firefox66 + wontfix
firefox67 --- fixed

People

(Reporter: tsmith, Assigned: mattwoodrow)

References

(Blocks 2 open bugs)

Details

(Keywords: crash, regression, testcase)

Crash Data

Attachments

(2 files, 1 obsolete file)

Attached file testcase.html
I'm not sure if this is a new bug or if this is the testcase you are looking for. ==119535==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x7f0d2aa43e46 bp 0x7ffc3c20b0f0 sp 0x7ffc3c20af20 T0) ==119535==The signal is caused by a READ memory access. ==119535==Hint: address points to the zero page. #0 0x7f0d2aa43e45 in MergeState::ProcessItemFromNewList(nsDisplayItem*, mozilla::Maybe<Index<MergedListUnits> > const&) src/layout/painting/RetainedDisplayListBuilder.cpp #1 0x7f0d2aa43298 in RetainedDisplayListBuilder::MergeDisplayLists(nsDisplayList*, RetainedDisplayList*, RetainedDisplayList*, mozilla::Maybe<mozilla::ActiveScrolledRoot const*>&, unsigned int) src/layout/painting/RetainedDisplayListBuilder.cpp:514:36 #2 0x7f0d2aa44201 in MergeState::ProcessItemFromNewList(nsDisplayItem*, mozilla::Maybe<Index<MergedListUnits> > const&) src/layout/painting/RetainedDisplayListBuilder.cpp:293:25 #3 0x7f0d2aa43298 in RetainedDisplayListBuilder::MergeDisplayLists(nsDisplayList*, RetainedDisplayList*, RetainedDisplayList*, mozilla::Maybe<mozilla::ActiveScrolledRoot const*>&, unsigned int) src/layout/painting/RetainedDisplayListBuilder.cpp:514:36 #4 0x7f0d2aa4c4d1 in RetainedDisplayListBuilder::AttemptPartialUpdate(unsigned int, mozilla::DisplayListChecker*) src/layout/painting/RetainedDisplayListBuilder.cpp:1206:7 #5 0x7f0d2a1f377a in nsLayoutUtils::PaintFrame(gfxContext*, nsIFrame*, nsRegion const&, unsigned int, nsDisplayListBuilderMode, nsLayoutUtils::PaintFrameFlags) src/layout/base/nsLayoutUtils.cpp:3695:40 #6 0x7f0d2a0e6dda in mozilla::PresShell::Paint(nsView*, nsRegion const&, unsigned int) src/layout/base/PresShell.cpp:6314:5 #7 0x7f0d29a73067 in nsViewManager::ProcessPendingUpdatesPaint(nsIWidget*) src/view/nsViewManager.cpp:480:19 #8 0x7f0d29a71e7c in nsViewManager::ProcessPendingUpdatesForView(nsView*, bool) src/view/nsViewManager.cpp:412:33 #9 0x7f0d29a774a6 in nsViewManager::ProcessPendingUpdates() src/view/nsViewManager.cpp:1102:5 #10 0x7f0d2a0600ba in nsRefreshDriver::Tick(long, mozilla::TimeStamp) src/layout/base/nsRefreshDriver.cpp:2039:11 #11 0x7f0d2a06cccb in TickDriver src/layout/base/nsRefreshDriver.cpp:328:13 #12 0x7f0d2a06cccb in mozilla::RefreshDriverTimer::TickRefreshDrivers(long, mozilla::TimeStamp, nsTArray<RefPtr<nsRefreshDriver> >&) src/layout/base/nsRefreshDriver.cpp:301 #13 0x7f0d2a06c899 in mozilla::RefreshDriverTimer::Tick(long, mozilla::TimeStamp) src/layout/base/nsRefreshDriver.cpp:320:5 #14 0x7f0d2a06f3de in RunRefreshDrivers src/layout/base/nsRefreshDriver.cpp:760:5 #15 0x7f0d2a06f3de in mozilla::VsyncRefreshDriverTimer::RefreshDriverVsyncObserver::TickRefreshDriver(mozilla::TimeStamp) src/layout/base/nsRefreshDriver.cpp:673 #16 0x7f0d2a06efec in mozilla::VsyncRefreshDriverTimer::RefreshDriverVsyncObserver::NotifyVsync(mozilla::TimeStamp) src/layout/base/nsRefreshDriver.cpp:574:9 #17 0x7f0d2a92515f in mozilla::layout::VsyncChild::RecvNotify(mozilla::TimeStamp const&) src/layout/ipc/VsyncChild.cpp:68:16 #18 0x7f0d235ec908 in mozilla::layout::PVsyncChild::OnMessageReceived(IPC::Message const&) src/obj-firefox/ipc/ipdl/PVsyncChild.cpp:167:20 #19 0x7f0d234c3ff7 in mozilla::ipc::PBackgroundChild::OnMessageReceived(IPC::Message const&) src/obj-firefox/ipc/ipdl/PBackgroundChild.cpp:1988:28 #20 0x7f0d2302f8ce in mozilla::ipc::MessageChannel::DispatchAsyncMessage(IPC::Message const&) src/ipc/glue/MessageChannel.cpp:2134:25 #21 0x7f0d2302c7e4 in mozilla::ipc::MessageChannel::DispatchMessage(IPC::Message&&) src/ipc/glue/MessageChannel.cpp:2064:17 #22 0x7f0d2302e03c in mozilla::ipc::MessageChannel::RunMessage(mozilla::ipc::MessageChannel::MessageTask&) src/ipc/glue/MessageChannel.cpp:1910:5 #23 0x7f0d2302e698 in mozilla::ipc::MessageChannel::MessageTask::Run() src/ipc/glue/MessageChannel.cpp:1943:15 #24 0x7f0d2213f8fa in nsThread::ProcessNextEvent(bool, bool*) src/xpcom/threads/nsThread.cpp:1088:14 #25 0x7f0d2215bb54 in NS_ProcessNextEvent(nsIThread*, bool) src/xpcom/threads/nsThreadUtils.cpp:519:10 #26 0x7f0d23037576 in mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) src/ipc/glue/MessagePump.cpp:125:5 #27 0x7f0d22f8c00c in RunInternal src/ipc/chromium/src/base/message_loop.cc:325:10 #28 0x7f0d22f8c00c in RunHandler src/ipc/chromium/src/base/message_loop.cc:318 #29 0x7f0d22f8c00c in MessageLoop::Run() src/ipc/chromium/src/base/message_loop.cc:298 #30 0x7f0d29b015ea in nsBaseAppShell::Run() src/widget/nsBaseAppShell.cpp:157:27 #31 0x7f0d2dda8e8f in XRE_RunAppShell() src/toolkit/xre/nsEmbedFunctions.cpp:896:22 #32 0x7f0d22f8c00c in RunInternal src/ipc/chromium/src/base/message_loop.cc:325:10 #33 0x7f0d22f8c00c in RunHandler src/ipc/chromium/src/base/message_loop.cc:318 #34 0x7f0d22f8c00c in MessageLoop::Run() src/ipc/chromium/src/base/message_loop.cc:298 #35 0x7f0d2dda8846 in XRE_InitChildProcess(int, char**, XREChildData const*) src/toolkit/xre/nsEmbedFunctions.cpp:722:34 #36 0x4f1ca4 in content_process_main src/browser/app/../../ipc/contentproc/plugin-container.cpp:50:30 #37 0x4f1ca4 in main src/browser/app/nsBrowserApp.cpp:287 #38 0x7f0d41a9982f in __libc_start_main /build/glibc-Cl5G7W/glibc-2.23/csu/../csu/libc-start.c:291 #39 0x4210e8 in _start (firefox+0x4210e8)
Flags: in-testsuite?
I think this is a test case for bug 1462497. Given that the test case also includes progress element, we don't properly handle the frame there?
Flags: needinfo?(matt.woodrow)
This is a bug with the interactions between table captions, ::before/::after pseudo-elements, and RDL. It's a long standing bug, but RDL just adds an assertion that uncovers it. Pretty obscure, and unlikely to affect too much web-content, and won't crash in release builds, so I don't think we need to block on this. The bug here happens when we build the display items for a table that has a caption. The caption items are created after all the table content, and then we sort the resulting display list using the DOM ordering to move the caption items into their correct spot. Unfortunately, the sorting uses nsLayoutUtils::CompareTreePosition, which always returns 0 (nodes have no relative ordering) if one of the nodes is anonymous [1] (::after in this case). When we do a full build we get a display list that has ordering like: B, ::after, A Sorting that makes no changes, since CompareTreePosition returns 0 when comparing B and ::after, as well as when comparing ::after and A. When we do a partial build we can end up only building: B, A Sorting that list compares B and A, determines that there is a relative ordering and switches us to: A, B Combining those two lists gives us a cycle in our DAG, and we hit the release assertion. I believe this would be fixed by finally fixing bug 604270, and not sorting for tables. Unfortunately we also sort the Outlines() list in every stacking context, and it seems like it should be possible to hit a similar bug if you have pseudo elements that draw outlines. The simplest solution is to just walk up the DOM tree to the nearest node that isn't anonymous, which would give us deterministic sorting (with all pseudo content considered behind normal content). We could also update CompareTreePosition to try understand some pseudos more (like knowing the position of ::after), but we'd still need fallback behaviour. [1] https://searchfox.org/mozilla-central/source/layout/base/nsLayoutUtils.cpp#1696
Flags: needinfo?(matt.woodrow)
How do you feel about doing something like this David? The bug is explained in the previous comment, and this fixes it (for both table-captions, and outlines), by treating anonymous content nodes as being their nearest 'real' ancestor for sorting. It's maybe not the correct sorting order, but that's already an issue, and at least this way it's deterministic. I think we should try fix bug 604270 regardless, but at we won't assert at least with this.
Assignee: nobody → matt.woodrow
Attachment #8986094 - Flags: feedback?(dbaron)
Priority: -- → P2
If this lands, I think it is probably best to let it ride the trains with 63.
Comment on attachment 8986094 [details] [diff] [review] Skip over anonymous content when sorting the display list in content order, since CompareTreePosition gives up otherwise > The simplest solution is to just walk up the DOM tree to the nearest node that isn't anonymous, which would give us deterministic sorting (with all pseudo content considered behind normal content). > > We could also update CompareTreePosition to try understand some pseudos more (like knowing the position of ::after), but we'd still need fallback behaviour. It seems like these two solutions aren't actually compatible. i.e., if we later come along and fix CompareTreePosition, this code will make that fix not apply. It also seems like it's worth trying to fix CompareTreePosition. (I'm a little worried about ending up with dependencies on this behavior -- if we make it reliable, we're more likely to have things depend on it.) It seems like we could assign indices to different sorts of content, such as: ::marker/::-moz-list-number/::-moz-list-bullet -> -2 ::before -> -1 regular children -> 0 ::after -> 1 nsIAnonymousContentCreator content -> 2 // is this appropriate? and similar for the other sorts of anonymous content, and then CompareTreePosition could use those indices for comparison, and could call nsIAnonymousContentCreator::AppendAnonymousContentTo to compare two elements that are both from the same nsIAnonymousContentCreator. (How many other types of native-anonymous content are there? It seems like we could separate the other categories by their pseudo-element and/or NODE_IS_ANONYMOUS_ROOT/NODE_IS_NATIVE_ANONYMOUS_ROOT bits and then assume everything else came from nsIAnonymousContentCreator. And it seems like XBL and shadow DOM should have an order already...)
Attachment #8986094 - Flags: feedback?(dbaron) → feedback-
Miko, is it possible that you might have cycles to pick this up for 63?
Flags: needinfo?(mikokm)
(In reply to Ryan VanderMeulen [:RyanVM] from comment #6) > Miko, is it possible that you might have cycles to pick this up for 63? Unfortunately not at the moment.
Flags: needinfo?(mikokm)
Bughunter found this on https://terraria.gamepedia.com/Crystal_Ball Reproduces on Nightly Windows/Linux but not Beta. bp-d28370a3-d30f-4b7e-bab6-708480181201

A lot of the crashes are on https://terraria.gamepedia.com/Sawmill (and other pages on that domain), which indeed uses <table>s with captions. I can also reproduce it there.

So it looks like the crashes in the wild are indeed the same as the fuzzer testcase (which is very non-obvious from the crash reports/stacks).

I think we just need to address dbaron's feedback and update the patch. I'll try get to that this week.

Thanks Matt!

Flags: needinfo?(mikokm)

(In reply to David Baron :dbaron: 🏴󠁵󠁳󠁣󠁡󠁿 ⌚UTC-8 (if account gets disabled due to email bounces, ask a bugzilla admin to reenable it) from comment #5)

It seems like these two solutions aren't actually compatible. i.e., if we
later come along and fix CompareTreePosition, this code will make that fix
not apply.

It also seems like it's worth trying to fix CompareTreePosition. (I'm a
little worried about ending up with dependencies on this behavior -- if we
make it reliable, we're more likely to have things depend on it.)

It seems like we could assign indices to different sorts of content, such as:

::marker/::-moz-list-number/::-moz-list-bullet -> -2
::before -> -1
regular children -> 0
::after -> 1
nsIAnonymousContentCreator content -> 2 // is this appropriate?

and similar for the other sorts of anonymous content, and then
CompareTreePosition could
use those indices for comparison, and could call
nsIAnonymousContentCreator::AppendAnonymousContentTo to compare two elements
that are both from the same nsIAnonymousContentCreator.

(How many other types of native-anonymous content are there? It seems like
we could separate the other categories by their pseudo-element and/or
NODE_IS_ANONYMOUS_ROOT/NODE_IS_NATIVE_ANONYMOUS_ROOT bits and then assume
everything else came from nsIAnonymousContentCreator. And it seems like XBL
and shadow DOM should have an order already...)

Do you think we need to try to define all of these orderings? It seems like the relative ordering of most anonymous content is somewhat arbitrary.

The main thing we need for this particular bug is that the results are consistent when comparing anonymous elements to normal children, so that

B, anonymous, A

doesn't sort differently to

B, A.

Do you think it would be ok to just consider all anonymous elements to be behind regular children (and equal to each other), with the exception of ::after?

Attachment #8986094 - Attachment is obsolete: true
Pushed by mwoodrow@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/52e867c4246a Handle anonymous content when comparing using DoCompareTreePosition. r=dbaron
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla67

Can we land a test for this? Also, is this something we should consider nominating for Beta approval?

Flags: needinfo?(matt.woodrow)

I think we shouldn't uplift this, since the crash is a diagnostic assert and won't affect release builds. The patch is also non-trivial, so I think it's worth riding the trains.

Flags: needinfo?(matt.woodrow)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: