Closed Bug 1346232 Opened 7 years ago Closed 7 years ago

stylo: Reftests application timed out after 330 seconds with no output

Categories

(Core :: CSS Parsing and Computation, defect, P1)

RESOLVED FIXED
mozilla55
Tracking Status
firefox55 --- fixed

People

(Reporter: cbook, Assigned: hiro)

Details

(Keywords: intermittent-failure, Whiteboard: [stockwell fixed])

Attachments

(3 files)

[task 2017-03-10T11:47:38.437605Z] 11:47:38    ERROR - REFTEST ERROR | reftest | application timed out after 330 seconds with no output

For example: https://treeherder.mozilla.org/logviewer.html#?job_id=83018605&repo=autoland&lineNumber=1669 - not sure if this is an existing bug, but it seems to hit stylo reftests intermittently.
Hm, looks like this is happening a fair bit, and there's no obvious place to start in the logs.

Time to throw up the bat signal I think. dmajor, do you have some cycles to help us out here?
Blocks: stylo
Flags: needinfo?(dmajor)
Priority: -- → P1
Summary: Stylo Reftests application timed out after 330 seconds with no output → stylo: Reftests application timed out after 330 seconds with no output
Is treeherder the only place to see these tests? Can I run them locally without jumping through hoops? Can I run them on Windows?
Flags: needinfo?(dmajor)
(In reply to David Major [:dmajor] from comment #5)
> Is treeherder the only place to see these tests? Can I run them locally
> without jumping through hoops?

Yes, do a build with --enable-stylo in your mozconfig. Then do:

./mach reftest --disable-e10s --setpref=reftest.compareStyloToGecko=true layout/reftests/reftest-stylo.list

> Can I run them on Windows?

Xidorn does stylo development on Windows, so I think it works. That said, the only CI we have is linux64, and that's where the intermittent is, so I don't know how likely it is that the problem would reproduce on Windows.

Another approach might be to get ASAN builds working (bug 1336013) and see if that spits out anything interesting on CI.
Assigning just to avoid having unassigned p1 bugs. Let me know if you aren't able to take this.
Assignee: nobody → dmajor
Whiteboard: [stockwell needswork]
> Yes, do a build with --enable-stylo in your mozconfig. Then do:
> 
> ./mach reftest --disable-e10s --setpref=reftest.compareStyloToGecko=true
> layout/reftests/reftest-stylo.list

Running locally on Windows, I notice a few things -

- I consistently hit bug 1347399 on layout/reftests/bugs/652991-2.html. (Or rather, I get a subsequent MOZ_CRASH, since my opt build skips the asserts)

- layout/reftests/bugs/613433-*.html consistently get stuck and won't proceed until I re-focus the reftest window (even if I'm hands-off for the whole run, not giving it any reason to lose focus)

- If I click around between windows while the reftest starts up, it gets into the same "stuck" state as above, and does nothing until I give focus to the reftest. This looks like the same thing that's happening in CI, but is it really something specific to stylo? My memory is fuzzy but I thought this was just par for the course with reftests in general.
Does comment 8 help at all? If not, any tips on what to dig further into?
Flags: needinfo?(bobbyholley)
(In reply to David Major [:dmajor] from comment #10)
> Does comment 8 help at all?

From what I see in the logs, we never even run a single test, so I think it's a different issue. Either the browser or the harness is hanging on startup.

> If not, any tips on what to dig further into?

Aside from the ASAN thing, I would probably add logging in a bunch of places in the browser and harness startup sequence, and then do retriggers on try to see where it gets stuck.
Flags: needinfo?(bobbyholley)
(In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #11)
> From what I see in the logs, we never even run a single test, so I think
> it's a different issue. Either the browser or the harness is hanging on
> startup.

Are you sure? When I get my local build into the stuck-on-startup state by changing window focus, my console has the same five lines as the CI log, ending at "Marionette INFO":

[task 2017-03-10T11:42:05.234776Z] 11:42:05     INFO - REFTEST INFO | Checking for orphan ssltunnel processes...
[task 2017-03-10T11:42:05.256820Z] 11:42:05     INFO - REFTEST INFO | Checking for orphan xpcshell processes...
[task 2017-03-10T11:42:05.295845Z] 11:42:05     INFO - REFTEST INFO | Running with e10s: False
[task 2017-03-10T11:42:05.296841Z] 11:42:05     INFO - REFTEST INFO | Application command: /home/worker/workspace/build/application/firefox/firefox -marionette -profile /tmp/tmpfGoyAf.mozrunner
[task 2017-03-10T11:42:07.982796Z] 11:42:07     INFO - 1489146127979	Marionette	INFO	Listening on port 2828
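
As a quick sanity check on the "330 seconds with no output" claim, the gap between the last line of output above and the timeout error at the top of this bug can be computed from the task timestamps. This is only a sketch: the helper name `task_timestamp` is made up for illustration, and it assumes every line starts with the `[task ...Z]` prefix seen in these logs.

```python
from datetime import datetime


def task_timestamp(line):
    """Parse the ISO timestamp inside the leading "[task ...Z]" prefix.

    Hypothetical helper: assumes the prefix format seen in the logs quoted
    in this bug.
    """
    stamp = line[len("[task "):line.index("Z]")]
    return datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%f")


# Both lines are copied from the logs quoted in this bug.
last_output = (
    "[task 2017-03-10T11:42:07.982796Z] 11:42:07     INFO - "
    "1489146127979\tMarionette\tINFO\tListening on port 2828"
)
timeout_error = (
    "[task 2017-03-10T11:47:38.437605Z] 11:47:38    ERROR - REFTEST ERROR | "
    "reftest | application timed out after 330 seconds with no output"
)

silent_seconds = (
    task_timestamp(timeout_error) - task_timestamp(last_output)
).total_seconds()
print(round(silent_seconds, 1))  # about 330.5
```

The result is consistent with the Marionette line being the last thing the browser printed before the harness gave up.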
(In reply to David Major [:dmajor] from comment #12)
> (In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #11)
> > From what I see in the logs, we never even run a single test, so I think
> > it's a different issue. Either the browser or the harness is hanging on
> > startup.
> 
> Are you sure? When I get my local build into the stuck-on-startup state by
> changing window focus, my console has the same five lines as the CI log,
> ending at "Marionette INFO":

Oh, I misread your last point and didn't see that it was about startup. It could be a focus issue, though I thought that reftests were supposed to be able to deal with not being focused, or focus themselves (does the same thing happen in a non-stylo build?).

The treeherder summary also links to a failure screenshot, which looks...odd: https://public-artifacts.taskcluster.net/LSQ0KKfvTe6KIOb97HlexA/0/public/test_info//mozilla-test-fail-screenshot__bN4oS.png
(In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #13)

> The treeherder summary also links to a failure screenshot, which
> looks...odd:
> https://public-artifacts.taskcluster.net/LSQ0KKfvTe6KIOb97HlexA/0/public/
> test_info//mozilla-test-fail-screenshot__bN4oS.png

Oh, interesting! Locally I get two empty windows in the same size and position. The larger one on the left was supposed to become the window where the reftests are loaded, and the thing on the right is the auxiliary window.

I'll try non stylo...
I can make the same thing happen with a non-stylo build.
(In reply to David Major [:dmajor] from comment #15)
> I can make the same thing happen with a non-stylo build.

Bugzilla has a good number of hits for "reftest 330" in non-stylo builds. I wonder if they're all focus?
> - If I click around between windows while the reftest starts up, it gets
> into the same "stuck" state as above, and does nothing until I give focus to
> the reftest.

> I can make the same thing happen with a non-stylo build.

I get the feeling that this is a more general problem with reftests and window focus, and not really stylo-specific.

This is probably as far as I can take this without domain expertise. smaug, you show up the most on the log for nsFocusManager.cpp -- does the reftest focus code [1] seem ok to you? And does nsFocusManager::ClearFocus (which is what I assume |gBrowser.focus()| ends up calling) do the right thing if the Firefox app itself doesn't have focus?

[1] https://dxr.mozilla.org/mozilla-central/rev/ff04d410e74b69acfab17ef7e73e7397602d5a68/layout/tools/reftest/reftest.jsm#413-421
Flags: needinfo?(bugs)
I'm not familiar with reftest setup... but what if gBrowser already has focus?

And ClearFocus? Where is that coming into play here?

http://searchfox.org/mozilla-central/rev/006005beff40d377cfd2f69d3400633c5ff09127/dom/interfaces/base/nsIFocusManager.idl#50 might be relevant here, depending on how reftests run. So activate the right top level window and then focus some element in it?
Flags: needinfo?(bugs)
As noted earlier, logs show this is basically a startup hang -- no tests are run.

Debug logs have several warnings on startup:

WARNING: stylo: No docshell yet, assuming Gecko style system: file /home/worker/workspace/build/src/dom/base/nsDocument.cpp, line 12983

WARNING: attempt to modify an immutable nsStandardURL: file /home/worker/workspace/build/src/netwerk/base/nsStandardURL.cpp, line 1644

WARNING: Failed to retarget HTML data delivery to the parser thread.: file /home/worker/workspace/build/src/parser/html/nsHtml5StreamParser.cpp, line 988

WARNING: NS_ENSURE_TRUE(standardURL) failed: file /home/worker/workspace/build/src/caps/nsPrincipal.cpp, line 229

WARNING: stylo: cannot get ServoStyleSheets from XBL bindings yet. See bug 1290276.: file /home/worker/workspace/build/src/layout/base/nsCSSFrameConstructor.cpp, line 2716

WARNING: stylo: ServoStyleSets cannot handle @font-face rules yet. See bug 1290237.: file /home/worker/workspace/build/src/dom/base/nsDocument.cpp, line 12877

I don't know if any of these are cause for concern / related to the hang.


There are also screenshots, which are consistent and strange. And crash reports after the timeout...but I don't see anything unexpected in them.
The earliest stylo reftest "timed out after 330 seconds" that I can find is https://treeherder.mozilla.org/#/jobs?repo=autoland&filter-searchStr=stylo%20reftest&tochange=452781c4ee876084bdc6a05a99d21597b7445724&fromchange=bdbd9679bbf1cb4c928fb5e2e049ea9906e737fc&selectedJob=82831814 ... but I don't see anything related in that changeset or the previous few changesets.
(In reply to Geoff Brown [:gbrown] from comment #20)
> Debug logs have several warnings on startup:

All of those warnings appear in passing test runs as well.

(In reply to Olli Pettay [:smaug] from comment #19)
> And ClearFocus? Where is that coming into play here?

So the test says `gBrowser.focus()` which I assumed (guessed) lands in nsDOMWindowUtils::Focus with aElement == nullptr, which would lead to ClearFocus.
I am pretty sure this is (1) a problem with window focus and (2) not specific to stylo.

Can you find a more Gecko-knowledgeable owner to take it from here?
Flags: needinfo?(bobbyholley)
The main concern I have here is that this seems to be showing up only as a stylo-specific error (where are the other platforms?). I know this does happen on other platforms, but a few minutes of searching Bugzilla turns up only 2 bugs, both with no activity in recent months (bug 1265229 and bug 1298796).

I did look at a few logs and I see this in the runner:
REFTEST INFO | Running with e10s: False

I filed bug 1348754 to look into why this is not in e10s mode.


From a sheriff perspective this looks like a stylo-reftest-specific failure, and it is one of the top failures. Please ensure this doesn't get passed around from team to team and that it receives appropriate attention. I would like to be a bit more patient here before disabling the tests or hiding them on treeherder. So far we are 1 week into this, and I would like to see this resolved in a few days.
(In reply to David Major [:dmajor] from comment #25)
> I am pretty sure this is (1) a problem with window focus and (2) not
> specific to stylo.

This does seem to be triggered by something with stylo. Maybe it's timing-related or incidental, but the correlation noted in comment 26 seems to be strong enough that I think it's something we need to fix.

It's also not clear to me that we really know that focus is the culprit here and not just the symptom. It seems to me that retriggering with logging per comment 6 would be a good way to bisect where in the startup pipeline we're getting stuck.

I can't emphasize enough that, IME, the only fully-general and reliable way to debug intermittent CI failures is to push logging and then hit the retrigger button five or ten times until the failure appears (rinse and repeat with more logging to answer the next question that arises).

> Can you find a more Gecko-knowledgeable owner to take it from here?

I certainly can't make you work on it - but the stylo team is swamped and there's no obvious person (either inside or outside the team) with special expertise to give this to. We really just need somebody who's good at debugging to attack this from first principles and narrow down the cause to something of the form: "this bug happens because we get hung up here with these abnormal inputs/state".

Your skillset seems like a good match for this, but if you really don't want to I can try to find somebody else.
Flags: needinfo?(bobbyholley)
(In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #27)
> Your skillset seems like a good match for this, but if you really don't want
> to I can try to find somebody else.

The reason I ask, which I should have been clearer about upfront, is that I've been asked to ramp up on Quantum Flow stuff and have several weeks of work travel coming up. Combined with my general low enthusiasm for printf debugging, I don't see myself getting around to this in the immediate future.
I've been looking more at the range gbrown pointed out, specifically:

https://treeherder.mozilla.org/#/jobs?repo=autoland&filter-searchStr=stylo%20reftest&tochange=bfd89f8fb93aed915d449184213078a1b946454e&fromchange=adb5053309977cfdf18e29ab041f37abbbe00d60

The retriggers are sure starting to make it seem like this started with wlach's push. My money is on this bit:

https://hg.mozilla.org/integration/autoland/rev/ebdd7d5fa7450f7ae6d685a584f136908b69e356#l2.12

Presumably that somehow changed whether or not these tests actually run in e10s mode. Joel, what's going on here? Are we somehow running in some franken-configuration that's half-e10s half-non-e10s in stylo? That might explain why this is only showing up for stylo reftests.
Flags: needinfo?(jmaher)
(In reply to David Major [:dmajor] from comment #28)
> (In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #27)
> > Your skillset seems like a good match for this, but if you really don't want
> > to I can try to find somebody else.
> 
> I'll tell you why I ask, which I should have been more clear about upfront,
> is that I've been asked to ramp up on Quantum Flow stuff and have several
> weeks of work-travel coming up. Combined with my general low enthusiasm for
> printf debugging, I don't see myself getting around to this in the immediate
> future.

Ok, thanks for the heads-up. Hopefully this e10s configuration business will lead somewhere.
Assignee: dmajor → nobody
Component: Layout → CSS Parsing and Computation
I tried to reproduce this on try server unsuccessfully:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=d808679009eb30842e2e2af1e1fdc8e15e1d17a6&filter-resultStatus=success&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-resultStatus=retry&filter-resultStatus=running&filter-resultStatus=pending&filter-resultStatus=runnable

This has real e10s mode as well as non-e10s (which is what we appear to be running all the time anyway). I do wonder whether this will stop being an issue once we get things really running in e10s. Let's keep pushing on that until it is no longer a variable.
Flags: needinfo?(jmaher)
Ok - over to Joel for now until we get this e10s automation business sorted out in bug 1348754.
Assignee: nobody → jmaher
As a note, I landed the code to run reftests in e10s. Let's check back in on Monday and see where this is at (assuming it is merged around today).
Moving to e10s doesn't solve this; we are still seeing failures at the same rate.
:bholley, after enabling e10s for the tests we still have a high failure rate. I cannot find more signs of non-stylo reftest timeouts on startup (or other timeouts related to browser startup/shutdown). Can you find someone to look at this? Given the high frequency of failure, I would like to see this addressed soon; otherwise we move the reftests to tier-3 or disable them.
Flags: needinfo?(bobbyholley)
FWIW, I did a quick check of the difference between passing and failing logs. In the failure case, it gets stuck just before loading reftest-stylo.list.
Attachment #8852252 - Attachment is patch: true
Attachment #8852252 - Attachment mime type: text/x-patch → text/plain
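
The kind of log comparison described above (finding where a failing run stops matching a passing one) can be sketched roughly as follows. This is not the attached patch, just an illustration. The function names are hypothetical, and the `<...>` entry in the sample data is a placeholder rather than real harness output. Real logs also contain run-specific values (temporary profile paths, Marionette timestamps) that would need extra normalization.

```python
import re

# Matches the "[task <timestamp>] HH:MM:SS" prefix seen in the CI logs here.
TASK_PREFIX = re.compile(r"^\[task [^\]]+\] \d\d:\d\d:\d\d\s+")


def normalize(line):
    """Drop the task/time prefix so lines from different runs can be compared."""
    return TASK_PREFIX.sub("", line.rstrip("\n"))


def first_divergence(passing_lines, failing_lines):
    """Return (index, passing_line) at the first point the failing log differs.

    Returns None if the failing log contains everything the passing log does.
    """
    passing = [normalize(line) for line in passing_lines]
    failing = [normalize(line) for line in failing_lines]
    for i, line in enumerate(passing):
        if i >= len(failing) or failing[i] != line:
            return i, line
    return None


# Sample data: the first entry is from this bug, the second is a placeholder.
passing_log = [
    "[task 2017-03-10T11:42:05.295845Z] 11:42:05     INFO - REFTEST INFO | Running with e10s: False",
    "[task 2017-03-10T11:42:09.000000Z] 11:42:09     INFO - <line that only appears in passing runs>",
]
failing_log = [
    "[task 2017-03-11T08:00:01.000000Z] 08:00:01     INFO - REFTEST INFO | Running with e10s: False",
]
print(first_divergence(passing_log, failing_log))
```

The first line that the passing run has and the failing run lacks is a reasonable place to start adding logging.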
After the harness times out, it kills the browser to get a crash report. Most of the crash reports here are unhelpful -- either minidump_stackwalk finds a bad header in the minidump or the report is not symbolicated -- but there are a few "good" ones. Here are a few recent, symbolicated, crash reports:

https://treeherder.mozilla.org/logviewer.html#?repo=autoland&job_id=86656873&lineNumber=1834
https://treeherder.mozilla.org/logviewer.html#?repo=autoland&job_id=86656854&lineNumber=1834
https://treeherder.mozilla.org/logviewer.html#?repo=autoland&job_id=86705759&lineNumber=1677
https://treeherder.mozilla.org/logviewer.html#?repo=autoland&job_id=86700057&lineNumber=1836
https://treeherder.mozilla.org/logviewer.html#?repo=mozilla-inbound&job_id=86755890&lineNumber=1834

They seem consistent, at least for thread 0:

[task 2017-03-27T13:25:16.397866Z] 13:25:16     INFO - Thread 0 (crashed)
[task 2017-03-27T13:25:16.398422Z] 13:25:16     INFO -  0  libc-2.15.so + 0xe7993
[task 2017-03-27T13:25:16.398553Z] 13:25:16     INFO -     rax = 0xfffffffffffffffc   rdx = 0xffffffffffffffff
[task 2017-03-27T13:25:16.399129Z] 13:25:16     INFO -     rcx = 0xffffffffffffffff   rbx = 0x00007fa123bf2320
[task 2017-03-27T13:25:16.399259Z] 13:25:16     INFO -     rsi = 0x0000000000000006   rdi = 0x00007fa0fe649a00
[task 2017-03-27T13:25:16.399810Z] 13:25:16     INFO -     rbp = 0x00007ffe93ea5e90   rsp = 0x00007ffe93ea5e50
[task 2017-03-27T13:25:16.400371Z] 13:25:16     INFO -      r8 = 0x0000000000000000    r9 = 0x0000000000000292
[task 2017-03-27T13:25:16.400497Z] 13:25:16     INFO -     r10 = 0x0000000000000000   r11 = 0x0000000000000293
[task 2017-03-27T13:25:16.401083Z] 13:25:16     INFO -     r12 = 0x00007fa118748689   r13 = 0x00000000ffffffff
[task 2017-03-27T13:25:16.401213Z] 13:25:16     INFO -     r14 = 0x0000000000000006   r15 = 0x0000000000000001
[task 2017-03-27T13:25:16.401796Z] 13:25:16     INFO -     rip = 0x00007fa123ee3993
[task 2017-03-27T13:25:16.402344Z] 13:25:16     INFO -     Found by: given as instruction pointer in context
[task 2017-03-27T13:25:16.402445Z] 13:25:16     INFO -  1  libxul.so!PollWrapper [nsAppShell.cpp:ccf27d7cdcdc : 46 + 0x10]
[task 2017-03-27T13:25:16.403498Z] 13:25:16     INFO -     rbp = 0x00007ffe93ea5e90   rsp = 0x00007ffe93ea5e80
[task 2017-03-27T13:25:16.404014Z] 13:25:16     INFO -     rip = 0x00007fa1187486b5
[task 2017-03-27T13:25:16.404097Z] 13:25:16     INFO -     Found by: stack scanning
[task 2017-03-27T13:25:16.404208Z] 13:25:16     INFO -  2  libglib-2.0.so.0.3200.4 + 0x47ff6
[task 2017-03-27T13:25:16.404295Z] 13:25:16     INFO -     rbp = 0x00007fa0fe649a00   rsp = 0x00007ffe93ea5ea0
[task 2017-03-27T13:25:16.404802Z] 13:25:16     INFO -     rip = 0x00007fa11f596ff6
[task 2017-03-27T13:25:16.404922Z] 13:25:16     INFO -     Found by: call frame info
[task 2017-03-27T13:25:16.405006Z] 13:25:16     INFO -  3  libglib-2.0.so.0.3200.4 + 0x48124
[task 2017-03-27T13:25:16.405551Z] 13:25:16     INFO -     rsp = 0x00007ffe93ea5ef0   rip = 0x00007fa11f597124
[task 2017-03-27T13:25:16.405656Z] 13:25:16     INFO -     Found by: stack scanning
[task 2017-03-27T13:25:16.405760Z] 13:25:16     INFO -  4  libxul.so!nsAppShell::ProcessNextNativeEvent [nsAppShell.cpp:ccf27d7cdcdc : 279 + 0x5]
[task 2017-03-27T13:25:16.406665Z] 13:25:16     INFO -     rsp = 0x00007ffe93ea5f10   rip = 0x00007fa1187486fb
[task 2017-03-27T13:25:16.406819Z] 13:25:16     INFO -     Found by: stack scanning
[task 2017-03-27T13:25:16.407289Z] 13:25:16     INFO -  5  libxul.so!nsBaseAppShell::DoProcessNextNativeEvent [nsBaseAppShell.cpp:ccf27d7cdcdc : 138 + 0x10]
[task 2017-03-27T13:25:16.407412Z] 13:25:16     INFO -     rsp = 0x00007ffe93ea5f20   rip = 0x00007fa11871d135
[task 2017-03-27T13:25:16.407476Z] 13:25:16     INFO -     Found by: stack scanning
[task 2017-03-27T13:25:16.407628Z] 13:25:16     INFO -  6  librt-2.15.so + 0x415d
[task 2017-03-27T13:25:16.408136Z] 13:25:16     INFO -     rsp = 0x00007ffe93ea5f30   rip = 0x00007fa1249e415d
[task 2017-03-27T13:25:16.408221Z] 13:25:16     INFO -     Found by: stack scanning
[task 2017-03-27T13:25:16.408350Z] 13:25:16     INFO -  7  libxul.so!nsBaseAppShell::OnProcessNextEvent [nsBaseAppShell.cpp:ccf27d7cdcdc : 289 + 0x8]
[task 2017-03-27T13:25:16.408891Z] 13:25:16     INFO -     rsp = 0x00007ffe93ea5f60   rip = 0x00007fa118720314
[task 2017-03-27T13:25:16.408980Z] 13:25:16     INFO -     Found by: stack scanning
[task 2017-03-27T13:25:16.409113Z] 13:25:16     INFO -  8  libxul.so!nsThread::ProcessNextEvent [nsThread.cpp:ccf27d7cdcdc : 1225 + 0xf]
[task 2017-03-27T13:25:16.409632Z] 13:25:16     INFO -     rsp = 0x00007ffe93ea5fb0   rip = 0x00007fa116d501e6
[task 2017-03-27T13:25:16.409719Z] 13:25:16     INFO -     Found by: stack scanning

https://dxr.mozilla.org/mozilla-central/source/widget/gtk/nsAppShell.cpp#46

That's not providing any insight for me, but I thought I'd point it out. And of course, there are dozens of other threads for crash report experts to consider.
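
To compare thread 0 across several crash reports, the symbolicated frames can be pulled out of the log with a small script. This is a minimal sketch that assumes the `module!function [file ...]` frame layout shown above; the regular expression and function name are my own, not part of any Mozilla tool.

```python
import re

# A symbolicated frame looks like:
#   INFO -  1  libxul.so!PollWrapper [nsAppShell.cpp:ccf27d7cdcdc : 46 + 0x10]
# Frames without a "!" (for example "libc-2.15.so + 0xe7993") are skipped.
FRAME = re.compile(r"INFO -\s+(\d+)\s+(\S+?)!(\S+) \[")


def symbolicated_frames(log_lines):
    """Return (frame_number, module, function) for each symbolicated frame."""
    frames = []
    for line in log_lines:
        match = FRAME.search(line)
        if match:
            frames.append((int(match.group(1)), match.group(2), match.group(3)))
    return frames


# Lines copied from the thread 0 stack quoted above.
sample = [
    "[task 2017-03-27T13:25:16.398422Z] 13:25:16     INFO -  0  libc-2.15.so + 0xe7993",
    "[task 2017-03-27T13:25:16.402445Z] 13:25:16     INFO -  1  libxul.so!PollWrapper [nsAppShell.cpp:ccf27d7cdcdc : 46 + 0x10]",
    "[task 2017-03-27T13:25:16.403498Z] 13:25:16     INFO -     rbp = 0x00007ffe93ea5e90   rsp = 0x00007ffe93ea5e80",
    "[task 2017-03-27T13:25:16.405760Z] 13:25:16     INFO -  4  libxul.so!nsAppShell::ProcessNextNativeEvent [nsAppShell.cpp:ccf27d7cdcdc : 279 + 0x5]",
    "[task 2017-03-27T13:25:16.409113Z] 13:25:16     INFO -  8  libxul.so!nsThread::ProcessNextEvent [nsThread.cpp:ccf27d7cdcdc : 1225 + 0xf]",
]
names = [function for _, _, function in symbolicated_frames(sample)]
print(names)
```

For the stack above, the recovered names are all event-loop functions (PollWrapper, ProcessNextNativeEvent, ProcessNextEvent), which may simply mean the main thread was idle and waiting for events when the harness killed the browser.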
Depends on: 1351518
Hiro's log diff in comment 43 led us to a theory in bug 1351518. Just pushed that, fingers crossed.
Flags: needinfo?(bobbyholley)
(In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #46)
> Hiro's log diff in comment 43 led us to a theory in bug 1351518. Just pushed
> that, fingers crossed.

Didn't fix it: https://treeherder.mozilla.org/#/jobs?repo=autoland&selectedJob=87151917

I think the next step is to do some pushes with logging in the harness and try to figure out where things are getting dropped on the floor. Hiro is looking at that (though having trouble triggering the crash with logging):
https://treeherder.mozilla.org/#/jobs?repo=try&revision=0d4c84bd4fb7152307c8a6270c62645449e8eafc&selectedJob=87136973

Over to him for now.
Assignee: jmaher → hikezoe
I am almost convinced that we can't reproduce this timeout on *try servers* if we try revisions based on current mozilla-central.
Actually, I found a couple of recent try pushes [1][2][3][4][5] that hit this timeout, but all of them are based on old revisions. The newest revision among them is [5], which is based on https://hg.mozilla.org/try/rev/19289cc8bf6f .

[1] https://treeherder.mozilla.org/logviewer.html#?job_id=86546334&repo=try&lineNumber=1669
[2] https://treeherder.mozilla.org/logviewer.html#?job_id=86480286&repo=try&lineNumber=1669
[3] https://treeherder.mozilla.org/logviewer.html#?job_id=86537583&repo=try&lineNumber=1668
[4] https://treeherder.mozilla.org/logviewer.html#?job_id=86138085&repo=try&lineNumber=1796
[5] https://treeherder.mozilla.org/logviewer.html#?job_id=85807974&repo=try&lineNumber=1838

I am not sure why we still fail on m-c or other branches, but the difference is a clue to track this bug down.
Joel, do you know the difference between try server and other production servers?
Flags: needinfo?(jmaher)
OK, I just realized that the failure cases happened on Ubuntu 12.04 (at least in the several failure logs I checked).

I think using the desktop1604-test docker image will solve this.
Flags: needinfo?(jmaher)
Oh wait. We seem to use only ubuntu 12.04 for stylo reftest both on try and aurora (maybe m-c as well).
Attachment #8852324 - Attachment is obsolete: true
Attachment #8852324 - Flags: review?(jmaher)
In bug 1309086 we started running reftests on Ubuntu 16.04:
https://dxr.mozilla.org/mozilla-central/source/taskcluster/ci/test/tests.yml#960

but in reftests-stylo we do not specify the newer OS version:
https://dxr.mozilla.org/mozilla-central/source/taskcluster/ci/test/tests.yml#1042

Just adding one line will make a big difference there. Good find!
(In reply to Geoff Brown [:gbrown] from comment #53)

It looks like switching to ubuntu 16.04 introduces several failures, but they seem consistent, some are unexpected passes, and I don't see any time outs. I suspect switching to 16.04 and updating stylo reftest expectations is the way forward.

:hiro - Let me know if I can help with anything.
Geoff, I think switching to Ubuntu 16.04 is worth doing. If there is no problem with regard to server resources or the like, we should try it.

Note: As far as I can tell, we can't reproduce this timeout on the *try* server if we use a revision based on recent m-c. Joel's try in comment 8 couldn't reproduce it either. There must have been some change early this month that solved this timeout only on *try*.
Attachment #8852324 - Attachment is obsolete: false
Attachment #8852324 - Flags: review?(jmaher)
(In reply to Hiroyuki Ikezoe (:hiro) from comment #55)
> Geoff, I think switching to ubuntu 16.04 is worthwhile doing. If there is no
> problem with regard to server resources something.  We should try it.

I think it is fine for server resources; in fact, I think 16.04 is preferred.

> Note: As far as I can tell we can't reproduce this timeout on *try* server
> if we use the revision based on recent m-c.  Joel's try in comment 8
> couldn't reproduce it either.  There must be some changes in early this
> month that solved this timeout only on *try*.

That is strange, and it reminds me of bug 1348754, but I can't think of anything else which could be different on try.
(In reply to Geoff Brown [:gbrown] from comment #54)
> (In reply to Geoff Brown [:gbrown] from comment #53)
> 
> It looks like switching to ubuntu 16.04 introduces several failures, but
> they seem consistent, some are unexpected passes, and I don't see any time
> outs. I suspect switching to 16.04 and updating stylo reftest expectations
> is the way forward.

Yes, this sounds like the right approach to me.

Hiro is a hero!
Comment on attachment 8852324 [details]
Bug 1346232 - Use Ubuntu 16.04 docker image for stylo reftest to avoid timeouts.

https://reviewboard.mozilla.org/r/124590/#review127452

This looks good, but changing to ubuntu 16.04 will change which tests pass and fail, as my push in comment 53 demonstrated. :hiro, are you preparing a separate patch to update the test annotations?
OK, not yet, but I will do so.
Comment on attachment 8852324 [details]
Bug 1346232 - Use Ubuntu 16.04 docker image for stylo reftest to avoid timeouts.

https://reviewboard.mozilla.org/r/124590/#review127606

one line reviews are easy
Attachment #8852324 - Flags: review?(jmaher) → review+
Comment on attachment 8852811 [details]
Bug 1346232 - Update reftest expectations.

https://reviewboard.mozilla.org/r/124974/#review127610

this is enabling many tests, a big win!

::: layout/reftests/line-breaking/reftest-stylo.list
(Diff revision 1)
>  == punctuation-open-3.html punctuation-open-3.html
>  == punctuation-open-4.html punctuation-open-4.html
>  == quotationmarks-1.html quotationmarks-1.html
> -# The following is currently disabled on Linux because of a rendering issue with missing-glyph
> +== quotationmarks-cjk-1.html quotationmarks-cjk-1.html
> -# representations on the test boxes. See bug
> -fails == quotationmarks-cjk-1.html quotationmarks-cjk-1.html

Odd, this still fails on non-stylo. While this is valid, possibly we can remove the skip-if(gtkWidget) for the reftest.list file as well :)
Attachment #8852811 - Flags: review+
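
For a patch that touches this many manifest entries, it can help to list which tests changed expectation between the old and new reftest-stylo.list. The sketch below handles only simple lines of the form `[annotations] == test reference` (or `!=`), splitting on whitespace. It is not the real reftest manifest parser, and both function names are hypothetical.

```python
def parse_manifest_line(line):
    """Split a simple manifest line into annotations, operator, test, reference.

    Returns None for blank lines, comments, and anything not recognized.
    Simplification: annotations containing spaces are not handled.
    """
    text = line.strip()
    if not text or text.startswith("#"):
        return None
    tokens = text.split()
    if len(tokens) < 3 or tokens[-3] not in ("==", "!="):
        return None
    return {
        "annotations": tokens[:-3],
        "op": tokens[-3],
        "test": tokens[-2],
        "ref": tokens[-1],
    }


def expectation_changes(old_lines, new_lines):
    """Map each test whose annotations changed to (old, new) annotations."""
    def index(lines):
        parsed = (parse_manifest_line(line) for line in lines)
        return {entry["test"]: entry["annotations"] for entry in parsed if entry}

    old, new = index(old_lines), index(new_lines)
    return {
        test: (old[test], new[test])
        for test in old
        if test in new and old[test] != new[test]
    }


# Entries taken from the diff quoted in the review above.
old_manifest = [
    "== quotationmarks-1.html quotationmarks-1.html",
    "fails == quotationmarks-cjk-1.html quotationmarks-cjk-1.html",
]
new_manifest = [
    "== quotationmarks-1.html quotationmarks-1.html",
    "== quotationmarks-cjk-1.html quotationmarks-cjk-1.html",
]
print(expectation_changes(old_manifest, new_manifest))
```

For the two entries shown, only quotationmarks-cjk-1.html is reported, with its `fails` annotation dropped.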
(In reply to Joel Maher ( :jmaher) from comment #65)
> Comment on attachment 8852811 [details]
> Bug 1346232 - Update reftest expectations.
> 
> https://reviewboard.mozilla.org/r/124974/#review127610
> 
> this is enabling many tests, a big win!
> 
> ::: layout/reftests/line-breaking/reftest-stylo.list
> (Diff revision 1)
> >  == punctuation-open-3.html punctuation-open-3.html
> >  == punctuation-open-4.html punctuation-open-4.html
> >  == quotationmarks-1.html quotationmarks-1.html
> > -# The following is currently disabled on Linux because of a rendering issue with missing-glyph
> > +== quotationmarks-cjk-1.html quotationmarks-cjk-1.html
> > -# representations on the test boxes. See bug
> > -fails == quotationmarks-cjk-1.html quotationmarks-cjk-1.html
> 
> odd, this fails on non stylo still, while this is valid possibly we can
> remove the skip-if(gtkWidget) for the reftest.list file as well :)

Oh, indeed. I will talk with Masayuki about this tomorrow.

Thank you for the review!
Comment on attachment 8852811 [details]
Bug 1346232 - Update reftest expectations.

https://reviewboard.mozilla.org/r/124974/#review127634

This looks great. Thanks so much!
Attachment #8852811 - Flags: review?(gbrown) → review+
Pushed by jmaher@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/245fb9f42112
Use Ubuntu 16.04 docker image for stylo reftest to avoid timeouts. r=jmaher
https://hg.mozilla.org/integration/autoland/rev/26a362c81067
Update reftest expectations. r=jmaher
https://hg.mozilla.org/mozilla-central/rev/245fb9f42112
https://hg.mozilla.org/mozilla-central/rev/26a362c81067
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
Whiteboard: [stockwell needswork] → [stockwell fixed]
Depends on: 1345283
No longer depends on: 1345283
Results after landing are pretty good.  I am really happy to help you guys, Bobby and Joel!
(In reply to Hiroyuki Ikezoe (:hiro) from comment #71)
> Results after landing are pretty good.  I am really happy to help you guys,
> Bobby and Joel!

Awesome, thanks so much for figuring this one out Hiro!