Closed Bug 1461267 Opened 7 years ago Closed 4 years ago

Crash in InvalidArrayIndex_CRASH | MergeState::Finalize

Categories

(Core :: Web Painting, defect, P3)

Unspecified
Windows 10
defect

Tracking

()

RESOLVED WORKSFORME
Tracking Status
thunderbird_esr60 --- unaffected
firefox-esr52 --- unaffected
firefox-esr60 --- unaffected
firefox60 --- unaffected
firefox61 + disabled
firefox62 - wontfix
firefox63 --- wontfix
firefox64 --- wontfix
firefox65 --- wontfix
firefox66 --- wontfix
firefox67 --- wontfix
firefox68 --- wontfix
firefox69 --- fix-optional
firefox70 --- fix-optional

People

(Reporter: calixte, Assigned: mattwoodrow)

References

(Blocks 1 open bug)

Details

(Keywords: crash, regression)

Crash Data

Attachments

(1 file)

This bug was filed from the Socorro interface and is report bp-efc900b1-21a1-4077-a46c-b4c0f0180511. ============================================================= Top 10 frames of crashing thread: 0 mozglue.dll MOZ_CrashPrintf mfbt/Assertions.cpp:63 1 xul.dll InvalidArrayIndex_CRASH xpcom/ds/nsTArray.cpp:26 2 xul.dll MergeState::Finalize layout/painting/RetainedDisplayListBuilder.cpp:303 3 xul.dll RetainedDisplayListBuilder::MergeDisplayLists layout/painting/RetainedDisplayListBuilder.cpp:491 4 xul.dll MergeState::ProcessItemFromNewList layout/painting/RetainedDisplayListBuilder.cpp:280 5 xul.dll RetainedDisplayListBuilder::MergeDisplayLists layout/painting/RetainedDisplayListBuilder.cpp:488 6 xul.dll RetainedDisplayListBuilder::AttemptPartialUpdate layout/painting/RetainedDisplayListBuilder.cpp:1172 7 xul.dll nsLayoutUtils::PaintFrame layout/base/nsLayoutUtils.cpp:3679 8 xul.dll mozilla::PresShell::Paint layout/base/PresShell.cpp:6351 9 xul.dll nsViewManager::ProcessPendingUpdatesPaint view/nsViewManager.cpp:480 ============================================================= There are 23 crashes (from 20 installations) in nightly 62 starting with buildid 20180510220127. :mattwoodrow, could you investigate please ?
Flags: needinfo?(matt.woodrow)
Bug 1453541 is targeting Fx61, so calling this affected to get it on the radar.
I believe this is a duplicate of bug 1459997 (just a slightly different callstack, but same cause), but we can leave it separate for now.
Flags: needinfo?(matt.woodrow)
Priority: -- → P1
Assignee: nobody → matt.woodrow
Still seeing a few crashes here in beta 61 and nightly 62.
(In reply to Matt Woodrow (:mattwoodrow) from comment #2) > I believe this is a duplicate of bug 1459997 (just a slightly different > callstack, but same cause), but we can leave it separate for now. Like bug 1461832, this also affects 61.0b13 beta channel builds, so we'll presumably be seeing this on release as well.
This one is interesting. We iterate mOldDAG, and use the index to lookup entries in mOldList. The assertion is an invalid array invalid, so our two structures must be different lengths. The crashing index is often, but not always 0, so it really seems like they got out of sync somehow rather than just being uninitialized (or there's two bugs). I've done an audit of the code and can't see any way this could happen :(
Comment on attachment 8986088 [details] Bug 1461267 - Add some more assertions to verify the length of the DAG. https://reviewboard.mozilla.org/r/251546/#review258166
Attachment #8986088 - Flags: review?(mstange) → review+
Keywords: leave-open
Pushed by mwoodrow@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/9e0e314fe39a Add some more assertions to verify the length of the DAG. r=mstange
As expected, this went away on 61.0b14 when we disabled RDL by default. Interestingly, there haven't been any reports from 62 yet (and b1/b2 were DevEdition-only). Matt, did this signature morph into something else now?
Flags: needinfo?(matt.woodrow)
(In reply to Ryan VanderMeulen [:RyanVM] from comment #10) > As expected, this went away on 61.0b14 when we disabled RDL by default. > Interestingly, there haven't been any reports from 62 yet (and b1/b2 were > DevEdition-only). Matt, did this signature morph into something else now? Yes, the new assertions will have given it a new signature. I found two crashes: https://crash-stats.mozilla.com/report/index/f3d3b0be-7dd2-45c6-9d69-194f00180625 https://crash-stats.mozilla.com/report/index/58c4fbd2-0a3f-4a4e-bbb0-fcdbe0180627 Both have the DAG with length of 2, and neither have the list length available in the minidump. Both are lists of type nsDisplayWrapList. Crashing in PreProcessDisplayList suggests that someone modified our list between paints. Crashing in MergeState suggests that someone modified our list while we were building the new list?! Or possibly that we managed to not call ProProcessDisplayList for this list? Both very weird, and I've audited for this pattern before and not found anything. No URLs available as of yet.
Flags: needinfo?(matt.woodrow)
No crashes in 62 yet with those signatures, so I'm dropping the tracking flag.
Crash Signature: [@ InvalidArrayIndex_CRASH | MergeState::Finalize] → [@ InvalidArrayIndex_CRASH | MergeState::Finalize] [@ RetainedDisplayListBuilder::PreProcessDisplayList] [@ MergeState::MergeState ]
I found a new signature for this, the stacks look almost identical to some of the other signatures in this bug.
Crash Signature: [@ InvalidArrayIndex_CRASH | MergeState::Finalize] [@ RetainedDisplayListBuilder::PreProcessDisplayList] [@ MergeState::MergeState ] → [@ InvalidArrayIndex_CRASH | MergeState::Finalize] [@ RetainedDisplayListBuilder::PreProcessDisplayList] [@ MergeState::MergeState ] [@ InvalidArrayIndex_CRASH | MergeState::ResolveNodeIndexesOldToMerged ]
Pretty low volume in 62 release, moderate on beta 63 and we are about to release 63. There are only a few crashes in nightly 64.
This is P1 but has been wontfixed several times in a row. Marking this fix-optional to remove it from repeated triage.

Updating flags. Still happening across 67/68 but in low volume.

we are less than a week away from RC so wontfix for 67.

Matt, this is marked as a P1 but we have shipped multiple releases with this bug and the volume of crashes is low for these signatures, should be downgrade it to P3?

Flags: needinfo?(matt.woodrow)

Yeah, definitely. I'm not actively working on this due to the low volume (though I would if a testcase/STR surfaced).

Flags: needinfo?(matt.woodrow)
Priority: P1 → P3

Happy to take a patch for 70 but since this is triaged and set to P3 priority I'm setting it as fix-optional.
That will remove the bug from weekly regression triage.

The leave-open keyword is there and there is no activity for 6 months.
:mattwoodrow, maybe it's time to close this bug?

Flags: needinfo?(matt.woodrow)

I had a quick look at the crashes and the only signature that still sees active crashes is [@ RetainedDisplayListBuilder::PreProcessDisplayList]. Under that signatures there's at least three different stacks with significantly different causes. The most common of which is hitting this assertion. The second most common is this assertion but it's a lot less frequent. Since the remaining crashes don't seem to have anything to do with the original issue I suggest to close this bug and open a new one specifically under the [@ RetainedDisplayListBuilder::PreProcessDisplayList] signature, pointing out the failing assertions.

Flags: needinfo?(matt.woodrow)

Closing this issue as Resolved > Worksforme since no crashes with this Signature were reported in the last 6 months.

Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → WORKSFORME

:andrei, what's the signature you're talking about ?

You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: