Closed Bug 1304432 Opened 8 years ago Closed 8 years ago

Intermittent browser/components/originattributes/test/browser/browser_imageCacheIsolation.js | This test exceeded the timeout threshold. It should be rewritten or split up. If that's not possible, use requestLongerTimeout(N), but only as a last resort. -

Categories

(Firefox :: General, defect, P3)


Tracking


RESOLVED FIXED
Firefox 53
Tracking Status
firefox51 --- unaffected
firefox52 --- fixed
firefox53 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: huseby)

References

Details

(Keywords: intermittent-failure, Whiteboard: [OA-testing])

Attachments

(1 file, 1 obsolete file)

This test was added on 2016-09-17 in bug 1264572; failures started almost immediately.

This is a test timeout, seen primarily in Linux Debug runs. On Linux Debug this is a long-running test, typically taking ~40 seconds; intermittently the run time exceeds the 45-second limit, causing the timeout failure. requestLongerTimeout() may be appropriate, unless the test can be simplified or split into two or more tests.
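
For reference, the requestLongerTimeout() approach in a browser-chrome mochitest looks roughly like the sketch below; the task body is a hypothetical placeholder, not the actual contents of browser_imageCacheIsolation.js:

    // Multiply the default 45-second browser-chrome timeout by 2 for this file.
    // requestLongerTimeout() is called once, at the top level of the test file.
    requestLongerTimeout(2);

    // Hypothetical placeholder task standing in for the real isolation checks.
    add_task(async function test_imageCacheIsolation() {
      // ... long-running image cache isolation checks would run here ...
      ok(true, "placeholder assertion");
    });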
Blocks: 1264572
Flags: needinfo?(huseby)
I'm on it.
Flags: needinfo?(huseby)
Assignee: nobody → huseby
Whiteboard: [OA-testing]
Attached patch split up the isolation tests. (obsolete) — Splinter Review
I'm looking to make sure the debug Linux builds don't time out.
Attachment #8808389 - Flags: review?(amarchesini)
Attachment #8808389 - Flags: review?(amarchesini) → review?(ehsan)
Comment on attachment 8808389 [details] [diff] [review]
split up the isolation tests.

This patch doesn't remove browser_imageCacheIsolation.js, so please remove that file when landing.

Also, in the future you could use the much simpler solution of adding a requestLongerTimeout() to the test.  :-)  I'm not going to make you do that here since you've already done the much more complex task of splitting the test...
Attachment #8808389 - Flags: review?(ehsan) → review+
I was under the impression that requestLongerTimeout() was something to be avoided.  I can just create a new patch for that and resubmit for review.  That won't take very long.
Flags: needinfo?(ehsan)
simpler patch to add requestLongerTimeout(2)
Attachment #8808389 - Attachment is obsolete: true
Attachment #8811994 - Flags: review?(ehsan)
Imo, it's better to have smaller isolated tests than a test that takes minutes to run.
(And yes, requestLongerTimeout in browser-chrome tests was intended as a temporary solution until people could split their tests better, but it started being misused as "the solution".)
So which way do we want to go here? I have both patches: the original one with the review feedback incorporated, and the new one that just adds requestLongerTimeout(2).

Once we decide which way to go, I'll land it.
Flags: needinfo?(ehsan) → needinfo?(mak77)
Ehsan, which way should I go on this bug? See comment 21.
Flags: needinfo?(ehsan)
I didn't mean to impose roadblocks; I was just pointing out an existing de facto problem, to clarify where we are.
We could make the b-c timeout really large, like minutes, and many tests would stop failing with this error. The problem is that then most developers wouldn't notice if the test they wrote is very inefficient or just too large. Unfortunately, we tend to add more and more subtests to existing tests, until the existing test is extremely hard to maintain or debug and takes a huge amount of time to run.
The hard 45s timeout is there to prevent this explosion of test contents and save some server-side time (by forcing people to write more efficient tests).
requestLongerTimeout is basically a way to say "I don't care" and bypass all of this. It had to exist because there are hardly enough resources to fix all the tests.

There are alternative solutions to all of this, for example a dashboard showing the average run time for each test that could automatically file a bug when the time goes over an acceptable threshold, but we don't have such a thing. What we have is just this friendly "please split your very long or inefficient test so that it's more manageable".

Comment 16 gave you an r+, so I think that, after addressing the review comments, you can land the patch that was reviewed at that time. That said, the decision is up to your reviewer :)
Flags: needinfo?(mak77)
I don't really have anything against requestLongerTimeout() myself, especially as a fix for intermittent test failures.  While what Marco said is definitely good software engineering, I think in practice we gain very little from "efficient" tests.  What's more important is making sure the test is easy for a human to understand.  Oftentimes splitting a test that takes 1 minute to run may make the test less readable, in which case I actually prefer requestLongerTimeout.

At any rate, I'm happy to take your original patch here.  Sorry for starting this side conversation in the bug, I meant to save you some time in the future.  :-)
Flags: needinfo?(ehsan)
Attachment #8811994 - Flags: review?(ehsan)
(In reply to :Ehsan Akhgari from comment #24)
> Oftentimes splitting a test that
> takes 1 minute to run may make the test less readable, in which case I
> actually prefer requestLongerTimeout.

That's right. Unfortunately, what happens more often is that a simple test like browser_abouthome.js, which was initially meant to just check the page localStorage and buttons, grows another 10 subtests for each additional new feature, and then one of the subtests starts failing intermittently. It is now really hard to debug, because it is hundreds of lines long and tests barely related things; plus, sheriffs may end up disabling the whole test if it fails too often, instead of just the intermittent subtest.

Btw, sorry for the noise!
(In reply to Marco Bonardo [::mak] from comment #25)
> (In reply to :Ehsan Akhgari from comment #24)
> > Oftentimes splitting a test that
> > takes 1 minute to run may make the test less readable, in which case I
> > actually prefer requestLongerTimeout.
> 
> That's right. Unfortunately, what happens more often is that a simple test
> like browser_abouthome.js, which was initially meant to just check the page
> localStorage and buttons, grows another 10 subtests for each additional new
> feature, and then one of the subtests starts failing intermittently. It is
> now really hard to debug, because it is hundreds of lines long and tests
> barely related things; plus, sheriffs may end up disabling the whole test
> if it fails too often, instead of just the intermittent subtest.

Yes, that's a great example of a test that needs to be split; I agree.  Ideally reviewers won't let that happen, but I'm sure I have made such mistakes myself as a reviewer, since it's often hard to know exactly how big a test currently is.  :/

BTW, another case that I've seen happen at least once, in a way that indicated a bug, was a test that started to assert at some point; the test runtime got inflated by the super slow printing of stack traces for the assertions, and as a result it ended up intermittently failing.  I really wish we had per-test time tracking and would fail a test when its runtime suddenly changed drastically, instead of putting arbitrary hard limits on test run times; then we wouldn't have this conversation in the first place and hopefully would be able to get rid of all of these intermittent failures (and requestLongerTimeout altogether too).  Perhaps jmaher would be interested in thinking about this?

Now I'll stop being so off-topic in this bug.  :-)
Flags: needinfo?(jmaher)
Comment on attachment 8811994 [details] [diff] [review]
Bug_1304432.patch

Ehsan, I am submitting the patch with the requestLongerTimeout() for review for two reasons:

1) Splitting the tests up duplicated a bunch of code, potentially creating maintenance headaches later.  
2) This test only times out on Linux Debug builds. 

I think this is ultimately the best solution.

--dave
Attachment #8811994 - Flags: review?(ehsan)
Comment on attachment 8811994 [details] [diff] [review]
Bug_1304432.patch

Review of attachment 8811994 [details] [diff] [review]:
-----------------------------------------------------------------

Sure.  Thanks!
Attachment #8811994 - Flags: review?(ehsan) → review+
Thanks for pinging me on this, :ehsan.  I know that Kyle has looked into tracking per-test failures and metrics via the ActiveData project. I am not sure whether that is still active or in the works, but if it were, it would be nice to look at the average runtime per day for a given test on a platform, compare it day over day, and flag any runtime increase >10%.
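
As a rough sketch of that day-over-day comparison (the record shape, field names, and function are assumptions for illustration, not an actual ActiveData schema or API):

    // Hypothetical check: given daily average runtimes for one test on one
    // platform, sorted by date, flag any day whose average grew by more than
    // 10% over the previous day.
    function flagRuntimeRegressions(dailyAverages, threshold = 0.10) {
      const flagged = [];
      for (let i = 1; i < dailyAverages.length; i++) {
        const prev = dailyAverages[i - 1];
        const curr = dailyAverages[i];
        if (curr.avgSeconds > prev.avgSeconds * (1 + threshold)) {
          flagged.push({
            date: curr.date,
            increasePercent: ((curr.avgSeconds / prev.avgSeconds) - 1) * 100,
          });
        }
      }
      return flagged;
    }

    // Example: the jump from ~40s to 46s on 2016-11-20 would be flagged (about +15%).
    flagRuntimeRegressions([
      { date: "2016-11-18", avgSeconds: 39.5 },
      { date: "2016-11-19", avgSeconds: 40.1 },
      { date: "2016-11-20", avgSeconds: 46.0 },
    ]);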
Flags: needinfo?(jmaher) → needinfo?(klahnakoski)
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/9966dd2e2ccb
Intermittent test timeouts, added requestLongerTimeout. r=ehsan
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/9966dd2e2ccb
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 53
Priority: -- → P3
Yes, ActiveData has all the individual test runtimes, and it is certainly possible to track and alert when there are changes or additions.  The problem will be the massive number of false positives, much like we get in our performance tests, but 100x more. I have a set of specific actions that can reduce this false-positive rate; it is a machine learning solution, with additional computationally expensive features that work well for this domain. It has never been a priority to implement.
Flags: needinfo?(klahnakoski)