Intermittent test_trackingprotection_whitelist.html | Test timed out.

RESOLVED WORKSFORME

Status

RESOLVED WORKSFORME

Product: Toolkit
Component: Safe Browsing
Priority: P5
Severity: normal
Reported: 2 years ago
Last modified: 4 months ago

People

(Reporter: Tomcat, Unassigned)

Tracking

({intermittent-failure})

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [stockwell unknown], URL)

(Reporter)

Description

2 years ago
https://treeherder.mozilla.org/logviewer.html#?job_id=13838896&repo=mozilla-inbound

 11:10:10     INFO -  2753 INFO TEST-UNEXPECTED-FAIL | toolkit/components/url-classifier/tests/mochitest/test_trackingprotection_whitelist.html | Test timed out.
Hasn't happened in almost a year. Closing.
Status: NEW → RESOLVED
Last Resolved: a year ago
Resolution: --- → WONTFIX
Actually, that's "hasn't happened at least 5 times a week in almost a year" - the bot got muzzled, and is only allowed to comment on things that its masters think happen often enough to care about. To find out the actual frequency you have to look at https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1203438&entireHistory=true&tree=trunk (and generally "it just stopped happening" is WORKSFORME rather than WONTFIX).
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
Priority: -- → P2
Thanks for reopening, Phil.

Any chance we can add that more useful link to the automated bug creation script?
We could, but comment 0 (which in theory is filed when there has only ever been one instance) would be an odd place to put it, and it wouldn't affect bugs like this one that were filed long ago.

If instead you go to https://bugzilla.mozilla.org/userprefs.cgi?tab=settings and scroll down to the User Interface section and toggle on "When viewing a bug, show its corresponding Orange Factor page" you'll get a link to the current week's worth (where the whole history is one click away), after the text (once OF churns around enough to notice I just starred this) "1 failure on trunk in the past week."
(In reply to Phil Ringnalda (:philor) from comment #7)
> If instead you go to https://bugzilla.mozilla.org/userprefs.cgi?tab=settings
> and scroll down to the User Interface section and toggle on "When viewing a
> bug, show its corresponding Orange Factor page" you'll get a link to the
> current week's worth (where the whole history is one click away), after the
> text (once OF churns around enough to notice I just starred this) "1 failure
> on trunk in the past week."

Oh, I didn't know about this. Thanks!
Priority: P2 → P5

Comment 9

11 months ago
5 failures in 609 pushes (0.008 failures/push) were associated with this bug in the last 7 days.  

Repository breakdown:
* mozilla-inbound: 4
* autoland: 1

Platform breakdown:
* windows7-32: 2
* linux64: 2
* windows8-64: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1203438&startday=2016-12-19&endday=2016-12-25&tree=all

Comment 10

10 months ago
7 failures in 733 pushes (0.01 failures/push) were associated with this bug in the last 7 days.  

Repository breakdown:
* try: 3
* autoland: 2
* mozilla-inbound: 1
* mozilla-central: 1

Platform breakdown:
* linux64: 5
* windows8-64: 1
* windows7-32: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1203438&startday=2017-01-30&endday=2017-02-05&tree=all

Comment 11

9 months ago
6 failures in 836 pushes (0.007 failures/push) were associated with this bug in the last 7 days.  
Repository breakdown:
* mozilla-inbound: 2
* autoland: 2
* try: 1
* mozilla-beta: 1

Platform breakdown:
* linux64: 4
* osx-10-10: 1
* linux32: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1203438&startday=2017-02-06&endday=2017-02-12&tree=all

Comment 12

9 months ago
31 failures in 173 pushes (0.179 failures/push) were associated with this bug yesterday.  
Repository breakdown:
* autoland: 20
* mozilla-inbound: 8
* mozilla-central: 2
* graphics: 1

Platform breakdown:
* linux64: 16
* linux32: 7
* windows7-32: 3
* osx-10-10: 3
* windows8-64: 2

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1203438&startday=2017-02-22&endday=2017-02-22&tree=all

Comment 13

9 months ago
33 failures in 146 pushes (0.226 failures/push) were associated with this bug yesterday.  
Repository breakdown:
* mozilla-inbound: 14
* autoland: 10
* graphics: 4
* try: 3
* mozilla-central: 2

Platform breakdown:
* linux64: 13
* windows7-32: 7
* windows8-64: 6
* linux32: 5
* osx-10-10: 2

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1203438&startday=2017-02-23&endday=2017-02-23&tree=all
The recent spike looks like either a regression from bug 1338970 or bug 1288633. Henry and Tim, please take a look.
Flags: needinfo?(ntim.bugs)
Flags: needinfo?(hchang)
Redirecting request from comment 14.
Flags: needinfo?(ntim.bugs) → needinfo?(tnguyen)
Bug 1209786 started failing very frequently at the same time.
See Also: → bug 1209786
(In reply to Sebastian Hengst [:aryx][:archaeopteryx] (needinfo on intermittent or backout) from comment #14)
> The recent spike looks like either a regression from bug 1338970 or bug
> 1288633. Henry and Tim, please take a look.

This intermittent started more than a year ago but has been failing more frequently recently. I suspect it is related to bug 1341514, because SpecialPowers flushPrefEnv may block runNextTest.
But even if I'm right, fixing that would only reduce the number of intermittent failures.
Flags: needinfo?(tnguyen)

Comment 18

9 months ago
38 failures in 182 pushes (0.209 failures/push) were associated with this bug yesterday.  
Repository breakdown:
* mozilla-inbound: 15
* autoland: 15
* try: 4
* mozilla-central: 3
* graphics: 1

Platform breakdown:
* linux64: 17
* osx-10-10: 9
* linux32: 5
* windows8-64: 4
* windows7-32: 3

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1203438&startday=2017-02-24&endday=2017-02-24&tree=all

Comment 19

9 months ago
17 failures in 47 pushes (0.362 failures/push) were associated with this bug yesterday.  
Repository breakdown:
* mozilla-inbound: 10
* mozilla-central: 5
* autoland: 2

Platform breakdown:
* linux32: 6
* linux64: 5
* windows8-64: 3
* osx-10-10: 2
* windows7-32: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1203438&startday=2017-02-25&endday=2017-02-25&tree=all

Comment 20

9 months ago
144 failures in 812 pushes (0.177 failures/push) were associated with this bug in the last 7 days. 

This is the #6 most frequent failure this week. 

** This failure happened more than 30 times this week! Resolving this bug is a high priority. **

** Try to resolve this bug as soon as possible. If unresolved for 2 weeks, the affected test(s) may be disabled. **

Repository breakdown:
* autoland: 65
* mozilla-inbound: 49
* mozilla-central: 14
* try: 10
* graphics: 6

Platform breakdown:
* linux64: 65
* linux32: 30
* windows8-64: 17
* windows7-32: 16
* osx-10-10: 16

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1203438&startday=2017-02-20&endday=2017-02-26&tree=all
Doing some retriggers here to try to find when this started:
https://treeherder.mozilla.org/#/jobs?repo=autoland&filter-searchStr=mochitest-chrome-3%20linux&tochange=51ecfa1834c3ee219899ef1096eb8a950b0b176f&fromchange=ebea9daabb964050c24edaf47a15ad6fe278f3cf&selectedJob=78980478

I will follow up here when I narrow down where this started; either way, we need to get working on fixing this :)
Flags: needinfo?(jmaher)
Whiteboard: [stockwell needswork]
With bug 1288633, we see a spike in these failures.

With 144 failures last week, I would like to know when we can expect a fix, or whether we should consider disabling the test if there is no time to investigate and fix this in the short term.
Blocks: 1288633
Flags: needinfo?(jmaher) → needinfo?(tnguyen)
I assume the failure here is the same as bug 1209786; please consider that in your investigation.

Comment 24

9 months ago
26 failures in 125 pushes (0.208 failures/push) were associated with this bug yesterday.  
Repository breakdown:
* autoland: 17
* graphics: 4
* mozilla-inbound: 3
* mozilla-central: 2

Platform breakdown:
* linux64: 12
* linux32: 6
* windows8-64: 5
* windows7-32: 3

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1203438&startday=2017-02-27&endday=2017-02-27&tree=all
I disabled a test in bug 1341514; adding that test seems to make the others unstable.
Let's watch the failure rate over the next several days so that I can narrow down the problem.
Flags: needinfo?(tnguyen)
Well, I see the intermittent has occurred more frequently because of bug 1341514. The reason is that an invalid provider ("fake") leads to a failed update:
https://searchfox.org/mozilla-central/rev/9c1c7106eef137e3413fd867fc1ddfb1d3f6728c/addon-sdk/source/test/preferences/firefox.json#12
I will fix this in bug 1341514.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=febe88b5e8f31a33d3ebdb34da577e2a0e0354e8&selectedJob=80876754
No longer blocks: 1288633
Dropping the ni since Thomas has found the root cause and is fixing it.
Flags: needinfo?(hchang)

Comment 30

9 months ago
32 failures in 783 pushes (0.041 failures/push) were associated with this bug in the last 7 days. 

This is the #39 most frequent failure this week. 

** This failure happened more than 30 times this week! Resolving this bug is a high priority. **

** Try to resolve this bug as soon as possible. If unresolved for 2 weeks, the affected test(s) may be disabled. **

Repository breakdown:
* autoland: 18
* mozilla-inbound: 5
* graphics: 5
* mozilla-central: 4

Platform breakdown:
* linux64: 15
* windows8-64: 6
* linux32: 6
* windows7-32: 5

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1203438&startday=2017-02-27&endday=2017-03-05&tree=all
Assigning to Thomas as per comment 29.
Assignee: nobody → tnguyen
Status: REOPENED → ASSIGNED
Priority: P5 → P1
This happens less frequently since bug 1341514 was fixed.
I would like to change the priority to P3.
Priority: P1 → P3
I think the problem is here:
https://searchfox.org/mozilla-central/source/toolkit/components/url-classifier/tests/UrlClassifierTestUtils.jsm#90
There may already be a running update, in which case this call bails out:
https://searchfox.org/mozilla-central/source/toolkit/components/url-classifier/nsUrlClassifierDBService.cpp#2043

We rarely set up test data while a periodic update is running, but when it happens, the test times out.
The same applies to bug 1209786, which sets up its test data the same way.
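
To make the failure mode concrete, here is a minimal, hypothetical sketch (not the actual UrlClassifierTestUtils.jsm code) of how a test-setup promise can hang when beginUpdate() is called while a periodic Safe Browsing update is already in flight. The helper name, its arguments, and the control flow are assumptions for illustration; only the contract ID and nsIUrlClassifierDBService.beginUpdate() are real.

// Hypothetical sketch of the race described above; assumes a
// chrome-privileged context (e.g. a .jsm) where Components is available.
const { classes: Cc, interfaces: Ci } = Components;

function addTestEntriesSketch(updateObserver, tables) {
  return new Promise((resolve, reject) => {
    const dbService = Cc["@mozilla.org/url-classifier/dbservice;1"]
                        .getService(Ci.nsIUrlClassifierDBService);
    try {
      // Throws if another update (e.g. the periodic one) is already running.
      dbService.beginUpdate(updateObserver, tables);
      // ...elided: stream the test table data and call finishUpdate(); the
      // update observer's success callback would eventually call resolve().
    } catch (e) {
      // If the error is only logged here instead of being passed to
      // reject(), neither resolve() nor reject() ever runs, and the
      // harness reports "Test timed out."
      dump("beginUpdate failed: " + e + "\n");
    }
  });
}
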
(In reply to Thomas Nguyen[:tnguyen] ni plz from comment #33)
> I think the problem is here:
> https://searchfox.org/mozilla-central/source/toolkit/components/url-classifier/tests/UrlClassifierTestUtils.jsm#90
> There may already be a running update, in which case this call bails out:
> https://searchfox.org/mozilla-central/source/toolkit/components/url-classifier/nsUrlClassifierDBService.cpp#2043
> 
> We rarely set up test data while a periodic update is running, but when it
> happens, the test times out.

Ah that makes sense. It will be a problem for a number of our tests in fact.

As a first step, we should change the above code so that useTestDatabase() calls the reject() callback on the promise when BeginUpdate() returns an error. That way, the cause of these test failures will be clearly labelled and no longer just a timeout.

Secondly, we should think about what we can do here. Wait for the update to finish and re-try the manual update? Disable normal updates before we add our test entries? Hardcode all of the test entries we need (like bug 1182876).
See Also: → bug 1182876
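
The "first step" above (rejecting with a clearly labelled error) and the "wait and retry" option could look roughly like the following. This is a hedged sketch rather than the actual patch: the helper names, retry count, and delay are made up, and it assumes a context where setTimeout and dump are available (in a .jsm, setTimeout would come from Timer.jsm).

// Hedged sketch, not tree code. Same chrome-privileged assumptions as the
// previous sketch.
const { classes: Cc, interfaces: Ci } = Components;

function addTestEntriesRejecting(updateObserver, tables) {
  return new Promise((resolve, reject) => {
    const dbService = Cc["@mozilla.org/url-classifier/dbservice;1"]
                        .getService(Ci.nsIUrlClassifierDBService);
    try {
      dbService.beginUpdate(updateObserver, tables);
      // ...elided: stream the test data and call finishUpdate(). Simplified
      // here to resolve immediately; the real code would resolve from the
      // update observer's success callback.
      resolve();
    } catch (e) {
      // A labelled failure instead of a silent timeout.
      reject(new Error("beginUpdate failed while setting up test data: " + e));
    }
  });
}

async function addTestEntriesWithRetry(updateObserver, tables, attempts = 3) {
  for (let i = 0; i < attempts; i++) {
    try {
      return await addTestEntriesRejecting(updateObserver, tables);
    } catch (e) {
      dump("test-data update attempt " + (i + 1) + " failed: " + e + "\n");
      // Give any in-flight periodic update a moment to finish, then retry.
      await new Promise(resolve => setTimeout(resolve, 2000));
    }
  }
  throw new Error("Could not start a url-classifier update after retries");
}
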

Updated

9 months ago
Depends on: 1345922
(In reply to François Marier [:francois] from comment #35)
> (In reply to Thomas Nguyen[:tnguyen] ni plz from comment #33)
> > I think the problem is here:
> > https://searchfox.org/mozilla-central/source/toolkit/components/url-classifier/tests/UrlClassifierTestUtils.jsm#90
> > There may already be a running update, in which case this call bails out:
> > https://searchfox.org/mozilla-central/source/toolkit/components/url-classifier/nsUrlClassifierDBService.cpp#2043
> > 
> > We rarely set up test data while a periodic update is running, but when it
> > happens, the test times out.
> 
> Ah that makes sense. It will be a problem for a number of our tests in fact.
> 
> As a first step, we should change the above code so that useTestDatabase()
> calls the reject() callback on the promise when BeginUpdate() returns an
> error. That way, the cause of these test failures will be clearly labelled
> and no longer just a timeout.

Also, we should refrain from starting the next update until the previous one
is done. I will handle all of this in bug 1345922 (see the sketch after this comment).

> 
> Secondly, we should think about what we can do here. Wait for the update to
> finish and re-try the manual update? Disable normal updates before we add
> our test entries? Hardcode all of the test entries we need (like bug
> 1182876).
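
The "don't start the next update until the previous one is done" idea from the reply above can be illustrated with a small promise chain. This is only a sketch of the approach, not the bug 1345922 patch; runOneUpdate is a hypothetical callback that performs a single url-classifier update and returns a promise.

// Hypothetical sketch: serialize url-classifier updates so a new one only
// starts after the previous one has settled.
let lastUpdate = Promise.resolve();

function queueUpdate(runOneUpdate) {
  // Chain onto whatever update is currently in flight (FIFO order). A failed
  // update must not block the updates queued after it, hence the catch().
  lastUpdate = lastUpdate.catch(() => {}).then(() => runOneUpdate());
  return lastUpdate;
}

// Usage (sketch): each queued update begins only after the previous one has
// finished, successfully or not:
//   queueUpdate(() => addTestEntriesWithRetry(observer, tables));
//   queueUpdate(() => addTestEntriesWithRetry(observer, otherTables));
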

Comment 37

9 months ago
6 failures in 790 pushes (0.008 failures/push) were associated with this bug in the last 7 days.   

Repository breakdown:
* mozilla-inbound: 2
* autoland: 2
* try: 1
* mozilla-central: 1

Platform breakdown:
* linux32: 3
* windows8-64: 1
* osx-10-10: 1
* linux64: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1203438&startday=2017-03-06&endday=2017-03-12&tree=all
Whiteboard: [stockwell needswork] → [stockwell unknown]
The failure rate is about once a month, and we are not sure the timeout is really related to url-classifier (probably some previous tests take too long).
Priority: P3 → P5
Assignee: tnguyen → nobody
Status: ASSIGNED → NEW
This failure no longer occurs.
Status: NEW → RESOLVED
Last Resolved: a year ago → 4 months ago
Resolution: --- → WORKSFORME