Closed Bug 1324185 Opened 3 years ago Closed 3 years ago

Massive Mozmill regressions from bug 1322414

Categories

(Thunderbird :: General, defect)

53 Branch
defect
Not set

Tracking

(Not tracked)

RESOLVED FIXED
Thunderbird 53.0

People

(Reporter: jorgk-bmo, Assigned: aleth)

At M-C rev 63b447888a64 we only had the Mozmill failure from bug 1316256.

Applying https://hg.mozilla.org/mozilla-central/rev/7ba8dff4a242 from that bug makes
mozmake SOLO_TEST=content-policy/test-js-content-policy.js mozmill-one
work.

While bug 1316256 landed between M-C rev 63b447888a64 and M-C rev 5a536a16e337, something else also landed in that range
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=63b447888a64&tochange=5a536a16e337
that broke test-js-content-policy.js and many other tests.

https://archive.mozilla.org/pub/thunderbird/try-builds/aleth@instantbird.org-db7e6b2b78ffe19d529c135fa81ad2b5feb9f4ff/try-comm-central-linux64/try-comm-central_ubuntu64_vm_test-mozmill-bm114-tests1-linux64-build4.txt.gz

The test now fails with:
INFO -  SUMMARY-UNEXPECTED-FAIL | test-js-content-policy.js | test-js-content-policy.js::test_jsContentPolicy
INFO -    EXCEPTION: troller.window.content is null
INFO -      at: test-folder-display-helpers.js line 2006
INFO -         _internal_assert_displayed test-folder-display-helpers.js:2006 1
INFO -         assert_selected_and_displayed test-folder-display-helpers.js:2100 3
INFO -         test_jsContentPolicy test-js-content-policy.js:250 3
OK, I built at the current tip of C-C rev 5a163ed948ce and M-C rev 34a1ab064cb5 and my build went through (with Rust installed).

mozmake SOLO_TEST=content-policy/test-js-content-policy.js mozmill-one passes, and so does:
mozmake SOLO_TEST=content-policy/test-dns-prefetch.js mozmill-one

So no idea what's happening on the trees. Are we chasing a ghost here?
Aleth, can you please do a try run with Rust to see whether these failures persist?
Flags: needinfo?(aleth)
I closed the tree again due to this bug. We have Mozmill failures on C-C without Rust and on Try-C-C with Rust. So the failures are real, but as stated in comment #1, I don't see them locally.

BTW: I ran the test that complains loudest locally:
mozmake SOLO_TEST=folder-display/test-deletion-with-multiple-displays.js mozmill-one
It passes.

Does anyone have a clue here?
Flags: needinfo?(aleth)
Flags: needinfo?(acelists)
The 'troller' is an internal variable in our mozmill code, https://dxr.mozilla.org/comm-central/rev/5a163ed948ceb5e930d754a6491599d12471d94a/mail/test/mozmill/shared-modules/test-folder-display-helpers.js#1997 . But maybe the problem starts somewhere else.
Update:

A try run
https://treeherder.mozilla.org/#/jobs?repo=try-comm-central&revision=ece05e1ef65cbf7f65c79815554fd474a11400a9
backing out M-C:
e3c689dbcd92	Gijs Kruitbosch — Bug 1322414 - part 2,3,4: use a separate 'primary' attribute for primary child browsers, r=bz,mconley
9a6a73f37649	Gijs Kruitbosch — Bug 1322414 - part 1 - remove GetContentShellById and id passing, r=bz
4616e2bb5fc4	Gijs Kruitbosch — Bug 1322609 - use getTabBrowser() instead of a type attribute check in marionette, r=ato
and our fix from bug 1323968 came out green.

Great work, Aleth, and thanks for the persistence. My regression range in comment #0 is wrong. So bug 1323968 really only half fixed it (as you were joking on IRC). The strange thing is that it changed one whole heap of failures into another one. And it fixed local builds.

So what's special about Mozmill on the servers?
(In reply to Jorg K (GMT+1) from comment #6)
> So what's special about Mozmill on the servers?

The logs show exactly how it is called in automation (environment, cwd, etc.).
Flags: needinfo?(aleth)
It appears this is reproducible in some cases locally after all. I've seen a local failure with

make SOLO_TEST=utils/test-iteratorUtils.js mozmill-one

Maybe it also depends on which tests have run before it?
And this https://bugzilla.mozilla.org/show_bug.cgi?id=1322414#c15 seems to match exactly what is going on here.
Blocks: 1322414
Summary: Massive Mozmill failure on 2016-12-16 → Massive Mozmill regressions from bug 1322414
Version: 52 Branch → 53 Branch
(In reply to aleth [:aleth] from comment #9)
> And this https://bugzilla.mozilla.org/show_bug.cgi?id=1322414#c15 seems to
> match exactly what is going on here.
Great detective work. How do we fix it? ;-)
(In reply to Jorg K (GMT+1) from comment #10)
> (In reply to aleth [:aleth] from comment #9)
> > And this https://bugzilla.mozilla.org/show_bug.cgi?id=1322414#c15 seems to
> > match exactly what is going on here.
> Great detective work. How do we fix it? ;-)

You'll have to debug one of the failures to find out... ;)
(In reply to aleth [:aleth] from comment #11)
> (In reply to Jorg K (GMT+1) from comment #10)
> > (In reply to aleth [:aleth] from comment #9)
> > > And this https://bugzilla.mozilla.org/show_bug.cgi?id=1322414#c15 seems to
> > > match exactly what is going on here.
> > Great detective work. How do we fix it? ;-)
> 
> You'll have to debug one of the failures to find out... ;)

My *guess* would be that window.content is now being evaluated either on the wrong window or too early, before the new attribute is set correctly.
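
To illustrate the "too early" possibility, here is a minimal sketch in plain JavaScript. It is not the actual Mozmill code and not a proposed patch: waitForContent and makeFakeWindow are hypothetical names, and the fake window only stands in for a real one whose content becomes available some time after it opens. The idea is to poll for window.content rather than dereference it immediately.

```javascript
// Hypothetical helper, not part of Mozmill: poll until win.content is
// non-null instead of dereferencing it right away.
function waitForContent(win, { tries = 50, spin = () => {} } = {}) {
  for (let i = 0; i < tries; i++) {
    if (win.content) {
      return win.content;
    }
    // Assumption: in a real test this step would let the event loop run
    // so pending attribute changes can take effect.
    spin();
  }
  throw new Error("window.content still null after " + tries + " tries");
}

// Hypothetical stand-in for a window whose content is null until
// `readyAfter` ticks have passed.
function makeFakeWindow(readyAfter) {
  let ticks = 0;
  return {
    tick() {
      ticks++;
    },
    get content() {
      return ticks >= readyAfter ? { document: {} } : null;
    },
  };
}

const fakeWin = makeFakeWindow(3);
const content = waitForContent(fakeWin, { spin: () => fakeWin.tick() });
```

If the cause is instead the "wrong window" case, polling like this would simply time out, which would at least help tell the two cases apart.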
I understood the m-c guy made a mistake and is probably supposed to fix it in m-c too?
Flags: needinfo?(acelists)
(In reply to :aceman from comment #13)
> I understood the m-c guy did some mistake and is probably supposed to fix it
> in m-c too?

No, I think all it means is that Gijs ran into the same symptom while fixing up the issues caused by the change for m-c.
Blocks: 1323968
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 53.0
Assignee: nobody → aleth