Closed Bug 898713 Opened 6 years ago Closed 6 years ago

tree closed due to unexpected assertions on Mac OS X debug mochitests (most reliably on 10.7)

Categories

(Core :: Graphics, defect, blocker)

x86_64
macOS

RESOLVED FIXED

People

(Reporter: dbaron, Unassigned)

mozilla-inbound is currently closed due to orange.

This orange comes in the form of unexpected assertions on Mac OS X debug mochitests.  They're happening most reliably on 10.7, but also on other versions.  The assertion appears to be:

14:39:58     INFO -  [Parent 334] ###!!! ASSERTION: Texture initialization failed! -- error 0x506, Source 0, Source format 0,  RGBA Compat 1: 'Error', file ../../../gfx/layers/opengl/CompositorOGL.cpp, line 904

though I have not confirmed this is the only assertion.
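
One way to check whether this is the only assertion would be to tally assertion lines in a test log by file and line number. The following is a minimal sketch, not an existing tool: the regular expression is modeled only on the single log line quoted above, and the helper name `count_assertions` is hypothetical. Other assertions in a real log may be formatted differently.

```python
import re
from collections import Counter

# Pattern modeled on the one log line quoted in this bug (assumption:
# other assertion lines follow the same shape).
ASSERTION_RE = re.compile(
    r"###!!! ASSERTION: (?P<message>.*?): '(?P<expr>[^']*)', "
    r"file (?P<file>\S+), line (?P<line>\d+)"
)


def count_assertions(log_lines):
    """Count assertion hits keyed by (source file, line number)."""
    counts = Counter()
    for text in log_lines:
        match = ASSERTION_RE.search(text)
        if match:
            key = (match.group("file"), int(match.group("line")))
            counts[key] += 1
    return counts


sample_log = [
    "14:39:58     INFO -  [Parent 334] ###!!! ASSERTION: Texture "
    "initialization failed! -- error 0x506, Source 0, Source format 0,  "
    "RGBA Compat 1: 'Error', file "
    "../../../gfx/layers/opengl/CompositorOGL.cpp, line 904",
    "14:39:59     INFO -  some unrelated log line",
]

for (source_file, line_no), hits in count_assertions(sample_log).items():
    print(source_file, line_no, hits)
```

If more than one key shows up in the tally, there is more than one distinct assertion to chase.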


The problem we have is that the orange is the result of a backout -- a backout of bug 873378.  This bug caused major regressions, probably due to the problem that bug 896250 and bug 897239 attempted to fix, but were backed out for.  (Probably for the same reason the tree is currently closed.)


I think what this means is that bug 873378 disabled some pretty significant aspects of the layers system, and on top of that some regression landed that broke something, leading to these assertions, but only leading to these assertions if bug 873378's breakage is not present.

We need to bisect to find the actual cause of the problem and back it out.  This means bisecting about a week's worth of changes.
(In reply to David Baron [:dbaron] (don't cc:, use needinfo? instead) from comment #0)
> The problem we have is that the orange is the result of a backout -- a
> backout of bug 873378.  This bug caused major regressions, probably due to
> the problem that bug 896250 and bug 897239 attempted to fix, but were backed
> out for.  (Probably for the same reason the tree is currently closed.)

Er, I didn't quite say that right.

Bug 873378 caused major regressions.  Bug 896250 and bug 897239 may well have been the fix for these regressions.

But bug 896250 and bug 897239 were backed out for causing assertions -- probably the same problem as this bug.  (Though I haven't actually dug back through tbpl and confirmed that.)
I'm going to try pushing heads of many of the merges listed in:

https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=7b2c82ae98db

to try, with a backout of bug 873378 on top of each of those heads, asking for a Mac OS X 10.7 debug mochitest-3.
Sorry for all the trouble this is causing. I saw that someone had backed out my faulty patch in bug 873378 and hoped that everything would be fine. You are correct that the bad patch to that bug completely disabled complex layer trees. I find it very concerning that a patch so wrong got merged to central (i.e. passed our tests AND our performance testing). I imagine that, with complex layer trees disabled, a regression landed on top.

Unfortunately I've had several flights canceled and am about to fly, so I can't offer any help at the moment. Sorry.

I'll discuss some of the simplification of this fallout in our GFX meeting next week and we will discuss if we can improve our testing to prevent something like this.
s/simplification/implication.
Looks like the problem is in:
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=b3fcd828cadc&tochange=a4c1961bf723
though I'm doing two reruns of both the before and after runs to double-check.
(In reply to David Baron [:dbaron] (don't cc:, use needinfo? instead) from comment #7)
> Looks like the problem is in:
> https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=b3fcd828cadc&tochange=a4c1961bf723
> though I'm doing two reruns of both the before and after runs to
> double-check.

This is confirmed.

I might not stay awake to watch the runs from comment 8 complete.  If I don't:  if the green->orange transition is between 2->3 or 3->4 (numbering them 1->4), then it should be obvious what to back out.  If not, more bisection is needed.

Either way, back the relevant bug out of inbound, wait for it to cycle if it's not certain from the bisections already done, and reopen inbound.
It appears my guesses were wrong, and that it's in the bottom half of this merge:

https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=7d66decedd21&tochange=d4afe4997be6

i.e., the part through and including d4afe4997be6.

Or, in other words, these changes on inbound:
https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=c380eb98e301&tochange=d4afe4997be6
excluding the merge from mozilla-central.
A new round, mostly examining what I'm suspicious of in this smaller set:

https://tbpl.mozilla.org/?tree=Try&rev=daccfef3d635 on 3467ec69e0dd
https://tbpl.mozilla.org/?tree=Try&rev=987d8f7ddc99 on 517eaa8ba87e
https://tbpl.mozilla.org/?tree=Try&rev=4994fc0cc2e7 on 62cea72bee19
https://tbpl.mozilla.org/?tree=Try&rev=458ac2fd1ddf on fef7cf65b4e7
https://tbpl.mozilla.org/?tree=Try&rev=d09d02c9acab on 429c2a431ba8
https://tbpl.mozilla.org/?tree=Try&rev=abac7353baa1 on fa7746ab3ddf

(focusing on bug 786303, bug 832960, and bug 894576, which seem rather related to the tests in which the assertions are happening, though less directly to the assertion itself)

I'm definitely not staying awake for these.
Er, actually.

I read the results of comment 8 wrong.

=================================
IGNORE COMMENT 10 AND COMMENT 11.
=================================

What we're looking for is in the *top* part of this merge:
https://hg.mozilla.org/mozilla-central/pushloghtml?changeset=a4c1961bf723
after 9089fe288899.
OK, a real new round this time:
https://tbpl.mozilla.org/?tree=Try&rev=55ff5ab03cf0 on 60d5f08d0a71
https://tbpl.mozilla.org/?tree=Try&rev=9a152d7ee6ec on 91356879fbfd
https://tbpl.mozilla.org/?tree=Try&rev=ff63acbf540e on 489046125fa6

As always, easiest to follow on:
https://tbpl.mozilla.org/?tree=Try&pusher=dbaron@mozilla.com
though you might have to click the green down arrow at the bottom.

I think it's likely to need one more round of bisection after these, but then that should be enough.


If somebody else wants to do that round of bisection, just take the exact patch I've been pushing to try repeatedly, update to the revision you want to test, commit that patch (this is easiest with mq) on top of that revision, and push the new commit to try.

We're looking for when things transition from green to orange.  (Yes, I caught the transition in the wrong direction in comment 10, from a random orange!)
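
To make "transition from green to orange" concrete, and to avoid being fooled by a random orange, here is a minimal sketch of how a round of results could be read. Everything in it is an assumption for illustration: the changeset names are placeholders, and the rule "good only if every run is green, bad only if every run is orange, otherwise rerun" is one possible policy, not something prescribed in this bug.

```python
def classify(outcomes):
    """Classify repeated runs of one changeset (illustrative policy)."""
    if not outcomes:
        return "ambiguous"
    if all(outcome == "green" for outcome in outcomes):
        return "good"
    if all(outcome == "orange" for outcome in outcomes):
        return "bad"
    return "ambiguous"  # mixed results: rerun before trusting it


def find_transition(results):
    """Return (last_good, first_bad) for the first good -> bad step.

    `results` is a list of (changeset, outcomes) pairs in push order,
    oldest first. Returns None if no clean good -> bad step exists.
    """
    previous = None
    for changeset, outcomes in results:
        state = classify(outcomes)
        if previous is not None and previous[1] == "good" and state == "bad":
            return previous[0], changeset
        previous = (changeset, state)
    return None


# Placeholder changeset names, oldest first.
round_results = [
    ("rev_a", ["green", "green"]),
    ("rev_b", ["green", "green"]),
    ("rev_c", ["orange", "orange"]),
]
print(find_transition(round_results))
```

A single orange followed by green reruns comes out as "ambiguous" under this policy, so it is never treated as the start of the regression range.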
(In reply to David Baron [:dbaron] (don't cc:, use needinfo? instead) from comment #13)
> OK, a real new round this time:
> https://tbpl.mozilla.org/?tree=Try&rev=55ff5ab03cf0 on 60d5f08d0a71
> https://tbpl.mozilla.org/?tree=Try&rev=9a152d7ee6ec on 91356879fbfd
> https://tbpl.mozilla.org/?tree=Try&rev=ff63acbf540e on 489046125fa6

60d5f08d0a71 and 91356879fbfd are green, 489046125fa6 is orange. So the offender should be one of these changesets: https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=91356879fbfd&tochange=489046125fa6
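
For anyone repeating this, the pushlog link for a suspect range can be built from the last green and first orange changesets. This is a small sketch; the function name `pushlog_range` is hypothetical, and the URL shape is simply copied from the links used in this bug.

```python
from urllib.parse import urlencode

# Base URL taken from the pushlog links quoted in this bug.
PUSHLOG = "https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml"


def pushlog_range(last_green, first_orange, base=PUSHLOG):
    """Build a pushlog URL covering the pushes after last_green up to first_orange."""
    query = urlencode({"fromchange": last_green, "tochange": first_orange})
    return base + "?" + query


print(pushlog_range("91356879fbfd", "489046125fa6"))
```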
(In reply to Dão Gottwald [:dao] from comment #15)
> https://tbpl.mozilla.org/?tree=Try&rev=7fb3bd9459f2 on ff8930432f20
> https://tbpl.mozilla.org/?tree=Try&rev=b0a4296daaa5 on bf3c83a7aed0
> https://tbpl.mozilla.org/?tree=Try&rev=8c4489ab05c7 on ff3b2131de12
> https://tbpl.mozilla.org/?tree=Try&rev=57951c952f47 on 0f7620a5047a
> https://tbpl.mozilla.org/?tree=Try&rev=cdf7b60699d3 on fe1213d6035d

So fe1213d6035d (bug 887868) appears to be the first orange changeset. Not sure whether that makes any sense. Pushed https://tbpl.mozilla.org/?tree=Try&rev=df7020b9c9e7 based on mozilla-inbound tip.
(In reply to Dão Gottwald [:dao] from comment #16)
> So fe1213d6035d (bug 887868) appears to be the first orange changeset. Not
> sure whether that makes any sense. Pushed
> https://tbpl.mozilla.org/?tree=Try&rev=df7020b9c9e7 based on mozilla-inbound
> tip.

That was green.

Backed out from inbound:
https://hg.mozilla.org/integration/mozilla-inbound/rev/36326783bd82

I'll leave this bug open and the tree closed until inbound turns green.
Blocks: 887868
Inbound reopened.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED