Closed Bug 1304432 Opened 8 years ago Closed 8 years ago

Intermittent browser/components/originattributes/test/browser/browser_imageCacheIsolation.js | This test exceeded the timeout threshold. It should be rewritten or split up. If that's not possible, use requestLongerTimeout(N), but only as a last resort. -

Categories

(Firefox :: General, defect, P3)


Tracking


RESOLVED FIXED
Firefox 53
Tracking Status
firefox51 --- unaffected
firefox52 --- fixed
firefox53 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: huseby)

References

Details

(Keywords: intermittent-failure, Whiteboard: [OA-testing])

Attachments

(1 file, 1 obsolete file)

This test was added on 2016-09-17 in bug 1264572; failures started almost immediately.

This is a test timeout, seen primarily in Linux Debug runs. On Linux Debug this is a long-running test, typically taking ~40 seconds; intermittently the run time exceeds the 45-second limit, causing the timeout failure. requestLongerTimeout() may be appropriate, unless the test can be simplified or split into two or more tests.
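
For reference, the requestLongerTimeout() approach in a browser-chrome mochitest looks roughly like the sketch below; the task body is a hypothetical placeholder, not the actual contents of browser_imageCacheIsolation.js:

    // Multiply the default 45-second browser-chrome timeout by 2 for this file.
    // requestLongerTimeout() is called once, at the top level of the test file.
    requestLongerTimeout(2);

    // Hypothetical placeholder task standing in for the real isolation checks.
    add_task(async function test_imageCacheIsolation() {
      // ... long-running image cache isolation checks would run here ...
      ok(true, "placeholder assertion");
    });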
Blocks: 1264572
Flags: needinfo?(huseby)
I'm on it.
Flags: needinfo?(huseby)
Assignee: nobody → huseby
Whiteboard: [OA-testing]
Attached patch split up the isolation tests. (obsolete) — Splinter Review
I'm looking to make sure the debug Linux builds don't time out.
Attachment #8808389 - Flags: review?(amarchesini)
Attachment #8808389 - Flags: review?(amarchesini) → review?(ehsan)
Comment on attachment 8808389 [details] [diff] [review]
split up the isolation tests.

This patch doesn't remove browser_imageCacheIsolation.js, so please remove that file when landing.

Also, in the future you could use the much simpler solution of adding a requestLongerTimeout() to the test.  :-)  I'm not going to make you do that here since you've already done the much more complex task of splitting the test...
Attachment #8808389 - Flags: review?(ehsan) → review+
I was under the impression that requestLongerTimeout() was something to be avoided.  I can just create a new patch for that and resubmit for review.  That won't take very long.
Flags: needinfo?(ehsan)
simpler patch to add requestLongerTimeout(2)
Attachment #8808389 - Attachment is obsolete: true
Attachment #8811994 - Flags: review?(ehsan)
Imo, it's better to have smaller isolated tests than a test that takes minutes to run.
(And yes, requestLongerTimeout in browser-chrome tests was intended as a temporary solution until people could split their tests better, but it started being misused as "the solution".)
So which way do we want to go here? I have both patches: the original one with the review feedback incorporated, and the new one that just adds requestLongerTimeout(2).

Once we decide which way to go, I'll land it.
Flags: needinfo?(ehsan) → needinfo?(mak77)
Ehsan, which way should I go on this bug? See comment 21.
Flags: needinfo?(ehsan)
I didn't mean to impose roadblocks; I was just pointing out an existing de facto problem, to clarify where we are.
We could make the b-c timeout really large, like minutes, and many tests would stop failing with this error. The problem is that then most developers wouldn't notice if the test they wrote is very inefficient or just too large. Unfortunately, we tend to add more and more subtests to existing tests, until the existing test is extremely hard to maintain or debug and takes a huge amount of time to run.
The hard 45s timeout is there to prevent this explosion of test contents and save some server-side time (by forcing people to write more efficient tests).
requestLongerTimeout is basically a way to say "I don't care" and bypass all of this. It had to exist because there are hardly enough resources to fix all the tests.

There are alternative solutions to all of this, for example a dashboard showing the average run time for each test that could automatically file a bug when the time goes over an acceptable threshold, but we don't have such a thing. What we have is just this friendly "please split your very long or inefficient test so that it's more manageable".

Comment 16 gave you an r+, so I think that, after addressing the review comments, you can land the patch that was reviewed at that time. That said, the decision is up to your reviewer :)
Flags: needinfo?(mak77)
I don't really have anything against requestLongerTimeout() myself, especially as a fix for intermittent test failures.  While what Marco said is definitely good software engineering, I think in practice we gain very little from "efficient" tests.  What's more important is making sure the test is easy for a human to understand.  Oftentimes splitting a test that takes 1 minute to run may make the test less readable, in which case I actually prefer requestLongerTimeout.

At any rate, I'm happy to take your original patch here.  Sorry for starting this side conversation in the bug, I meant to save you some time in the future.  :-)
Flags: needinfo?(ehsan)
Attachment #8811994 - Flags: review?(ehsan)
(In reply to :Ehsan Akhgari from comment #24)
> Oftentimes splitting a test that
> takes 1 minute to run may make the test less readable, in which case I
> actually prefer requestLongerTimeout.

That's right. Unfortunately, what happens more often is that a simple test like browser_abouthome.js, which was initially meant to just check the page localStorage and buttons, grows another 10 subtests for each additional new feature, and then one of the subtests starts failing intermittently. It is now really hard to debug, because it is hundreds of lines long and tests barely related things; plus, sheriffs may end up disabling the whole test if it fails too often, instead of just the intermittent subtest.

Btw, sorry for the noise!
(In reply to Marco Bonardo [::mak] from comment #25)
> (In reply to :Ehsan Akhgari from comment #24)
> > Oftentimes splitting a test that
> > takes 1 minute to run may make the test less readable, in which case I
> > actually prefer requestLongerTimeout.
> 
> That's right. Unfortunately, what happens more often is that a simple test
> like browser_abouthome.js, which was initially meant to just check the page
> localStorage and buttons, grows another 10 subtests for each additional new
> feature, and then one of the subtests starts failing intermittently. It is
> now really hard to debug, because it is hundreds of lines long and tests
> barely related things; plus, sheriffs may end up disabling the whole test
> if it fails too often, instead of just the intermittent subtest.

Yes, that's a great example of a test that needs to be split; I agree.  Ideally reviewers won't let that happen, but I'm sure I have made such mistakes myself as a reviewer, since it's often hard to know exactly how big a test currently is.  :/

BTW, another case that I've seen happen at least once, in a way that indicated a bug, was a test that started to assert at some point; the test runtime got inflated by the super slow printing of stack traces for the assertions, and as a result it ended up intermittently failing.  I really wish we had per-test time tracking and would fail a test when its runtime suddenly changed drastically, instead of putting arbitrary hard limits on test run times; then we wouldn't have this conversation in the first place and hopefully would be able to get rid of all of these intermittent failures (and requestLongerTimeout altogether too).  Perhaps jmaher would be interested in thinking about this?

Now I'll stop being so off-topic in this bug.  :-)
Flags: needinfo?(jmaher)
Comment on attachment 8811994 [details] [diff] [review]
Bug_1304432.patch

Ehsan, I am submitting the patch with the requestLongerTimeout() for review for two reasons:

1) Splitting the tests up duplicated a bunch of code, potentially creating maintenance headaches later.  
2) This test only times out on Linux Debug builds. 

I think this is ultimately the best solution.

--dave
Attachment #8811994 - Flags: review?(ehsan)
Comment on attachment 8811994 [details] [diff] [review]
Bug_1304432.patch

Review of attachment 8811994 [details] [diff] [review]:
-----------------------------------------------------------------

Sure.  Thanks!
Attachment #8811994 - Flags: review?(ehsan) → review+
Thanks for pinging me on this, :ehsan.  I know that Kyle has looked into tracking per-test failures and metrics via the ActiveData project. I am not sure whether that is still active or in the works, but if it were, it would be nice to look at the average runtime per day for a given test on a platform, compare it day over day, and flag any runtime increase >10%.
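
As a rough sketch of that day-over-day comparison (the record shape, field names, and function are assumptions for illustration, not an actual ActiveData schema or API):

    // Hypothetical check: given daily average runtimes for one test on one
    // platform, sorted by date, flag any day whose average grew by more than
    // 10% over the previous day.
    function flagRuntimeRegressions(dailyAverages, threshold = 0.10) {
      const flagged = [];
      for (let i = 1; i < dailyAverages.length; i++) {
        const prev = dailyAverages[i - 1];
        const curr = dailyAverages[i];
        if (curr.avgSeconds > prev.avgSeconds * (1 + threshold)) {
          flagged.push({
            date: curr.date,
            increasePercent: ((curr.avgSeconds / prev.avgSeconds) - 1) * 100,
          });
        }
      }
      return flagged;
    }

    // Example: the jump from ~40s to 46s on 2016-11-20 would be flagged (about +15%).
    flagRuntimeRegressions([
      { date: "2016-11-18", avgSeconds: 39.5 },
      { date: "2016-11-19", avgSeconds: 40.1 },
      { date: "2016-11-20", avgSeconds: 46.0 },
    ]);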
Flags: needinfo?(jmaher) → needinfo?(klahnakoski)
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/9966dd2e2ccb
Intermittent test timeouts, added requestLongerTimeout. r=ehsan
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/9966dd2e2ccb
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 53
Priority: -- → P3
Yes, ActiveData has all the individual test runtimes, and it is certainly possible to track and alert when there are changes or additions.  The problem will be the massive number of false positives, much like we get in our performance tests, but 100x more. I have a set of specific actions that can reduce this false-positive rate; it is a machine learning solution, with additional computationally expensive features that work well for this domain. It has never been a priority to implement.
Flags: needinfo?(klahnakoski)