Closed Bug 1203438 Opened 9 years ago Closed 7 years ago

Intermittent test_trackingprotection_whitelist.html | Test timed out.

Categories

(Toolkit :: Safe Browsing, defect, P5)

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: cbook, Unassigned)

References

()

Details

(Keywords: intermittent-failure, Whiteboard: [stockwell unknown])

https://treeherder.mozilla.org/logviewer.html#?job_id=13838896&repo=mozilla-inbound

 11:10:10     INFO -  2753 INFO TEST-UNEXPECTED-FAIL | toolkit/components/url-classifier/tests/mochitest/test_trackingprotection_whitelist.html | Test timed out.
Hasn't happened in almost a year. Closing.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WONTFIX
Actually, that's "hasn't happened at least 5 times a week in almost a year" - the bot got muzzled, and is only allowed to comment on things that its masters think happen often enough to care about, and to find out about actual frequency you have to look at https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1203438&entireHistory=true&tree=trunk (and generally "it just stopped happening" is WORKSFORME rather than WONTFIX).
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
Priority: -- → P2
Thanks for reopening, Phil.

Any chance we can add that more useful link to the automated bug creation script?
We could, but comment 0 when in theory there's only supposed to have ever been one instance would be an odd place to put it, and wouldn't affect bugs like this filed long ago.

If instead you go to https://bugzilla.mozilla.org/userprefs.cgi?tab=settings and scroll down to the User Interface section and toggle on "When viewing a bug, show its corresponding Orange Factor page" you'll get a link to the current week's worth (where the whole history is one click away), after the text (once OF churns around enough to notice I just starred this) "1 failure on trunk in the past week."
(In reply to Phil Ringnalda (:philor) from comment #7)
> If instead you go to https://bugzilla.mozilla.org/userprefs.cgi?tab=settings
> and scroll down to the User Interface section and toggle on "When viewing a
> bug, show its corresponding Orange Factor page" you'll get a link to the
> current week's worth (where the whole history is one click away), after the
> text (once OF churns around enough to notice I just starred this) "1 failure
> on trunk in the past week."

Oh, I didn't know about this. Thanks!
Priority: P2 → P5
The recent spike looks like either a regression from bug 1338970 or bug 1288633. Henry and Tim, please take a look.
Flags: needinfo?(ntim.bugs)
Flags: needinfo?(hchang)
Redirecting request from comment 14.
Flags: needinfo?(ntim.bugs) → needinfo?(tnguyen)
Bug 1209786 started failing very frequently at the same time.
See Also: → 1209786
(In reply to Sebastian Hengst [:aryx][:archaeopteryx] (needinfo on intermittent or backout) from comment #14)
> The recent spike looks like either a regression from bug 1338970 or bug
> 1288633. Henry and Tim, please take a look.

This intermittent first appeared more than a year ago but has been failing more frequently recently. I suspect that is related to bug 1341514, because SpecialPowers flushEnvPref may block runnexttest.
But even if I'm right, fixing this will only reduce the number of intermittent failures.
Flags: needinfo?(tnguyen)
I am doing some retriggers here to find when this started:
https://treeherder.mozilla.org/#/jobs?repo=autoland&filter-searchStr=mochitest-chrome-3%20linux&tochange=51ecfa1834c3ee219899ef1096eb8a950b0b176f&fromchange=ebea9daabb964050c24edaf47a15ad6fe278f3cf&selectedJob=78980478

I will follow up here when I narrow down where this started; either way, we need to get working on fixing this :)
Flags: needinfo?(jmaher)
Whiteboard: [stockwell needswork]
With bug 1288633, we see a spike in these failures.

With 144 failures last week, I would like to know when we can expect a fix, or whether we should consider disabling this test if there is no time to investigate and fix it in the short term.
Blocks: 1288633
Flags: needinfo?(jmaher) → needinfo?(tnguyen)
I assume the failure here is the same as bug 1209786; please consider that in your investigations.
I disabled a test in bug 1341514; adding that test seems to have made others unstable.
Let's watch the failure rate of this intermittent over the next several days so that I can narrow down the problem.
Flags: needinfo?(tnguyen)
Well, I see the intermittent has been occurring more frequently because of bug 1341514. The reason is that an invalid provider, "fake", leads to a failed update.
https://searchfox.org/mozilla-central/rev/9c1c7106eef137e3413fd867fc1ddfb1d3f6728c/addon-sdk/source/test/preferences/firefox.json#12
Will fix in bug 1341514
No longer blocks: 1288633
Drop the ni since Thomas has found the root cause and is fixing it.
Flags: needinfo?(hchang)
Assigning to Thomas as per comment 29.
Assignee: nobody → tnguyen
Status: REOPENED → ASSIGNED
Priority: P5 → P1
It happens less frequently now that bug 1341514 has been fixed.
I would like to change the priority to P3.
Priority: P1 → P3
I think the problem is here: https://searchfox.org/mozilla-central/source/toolkit/components/url-classifier/tests/UrlClassifierTestUtils.jsm#90
We may have a running update, and this will bail out:
https://searchfox.org/mozilla-central/source/toolkit/components/url-classifier/nsUrlClassifierDBService.cpp#2043

We rarely set up test data while a periodic update is running, but when that happens, the test times out.
The same applies to bug 1209786, which uses the same idea for setting up test data.
(In reply to Thomas Nguyen[:tnguyen] ni plz from comment #33)
> I think the problem is here
> https://searchfox.org/mozilla-central/source/toolkit/components/url-
> classifier/tests/UrlClassifierTestUtils.jsm#90
> We may have an running update and this will bailed out
> https://searchfox.org/mozilla-central/source/toolkit/components/url-
> classifier/nsUrlClassifierDBService.cpp#2043
> 
> Rarely we setup test data while running periodical update, but if it occurs,
> the test will be timed out.

Ah that makes sense. It will be a problem for a number of our tests in fact.

As a first step, we should change the above code so that useTestDatabase() calls the reject() callback on the promise when BeginUpdate() returns an error. That way, the cause of these test failures will be clearly labelled and no longer just a timeout.
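
To make that first step concrete, here is a minimal sketch of the idea, not the actual patch. It assumes that a failing BeginUpdate() surfaces in JavaScript as a thrown exception; `FakeDbService` and the table name "test-table" are hypothetical stand-ins so the snippet runs on its own outside the tree.

```javascript
// Hypothetical stand-in for the real database service. Assumption for this
// sketch: beginUpdate() throws while another update is still in progress.
class FakeDbService {
  constructor() {
    this.updateInProgress = false;
    this.listener = null;
  }

  beginUpdate(listener, tableName) {
    if (this.updateInProgress) {
      throw new Error("update already in progress");
    }
    this.updateInProgress = true;
    this.listener = listener;
  }

  finishUpdate() {
    const listener = this.listener;
    this.updateInProgress = false;
    this.listener = null;
    listener.updateSuccess();
  }
}

// Sketch of a useTestDatabase()-style helper: if the update cannot even be
// started, reject right away so the test fails with a clear message instead
// of waiting for a callback that never fires and then timing out.
function useTestDatabase(dbService, table) {
  return new Promise((resolve, reject) => {
    const listener = {
      updateError: code => reject(new Error(`update failed: ${code}`)),
      updateSuccess: () => resolve(),
    };
    try {
      dbService.beginUpdate(listener, table.name);
      dbService.finishUpdate();
    } catch (e) {
      reject(new Error(`could not update ${table.name}: ${e.message}`));
    }
  });
}
```

With something like this, a test that races with a running update would report "could not update test-table: update already in progress" rather than a bare "Test timed out."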

Secondly, we should think about what we can do here. Wait for the update to finish and re-try the manual update? Disable normal updates before we add our test entries? Hardcode all of the test entries we need (like bug 1182876).
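
For the first of those options (wait, then re-try the manual update), a rough sketch could look like the following. Everything here is an assumption for illustration: `attemptUpdate` is a hypothetical function that returns a promise which rejects while another update is running, and the attempt count and delay are arbitrary.

```javascript
// Resolves after `ms` milliseconds.
const sleep = ms => new Promise(resolve => setTimeout(resolve, ms));

// Calls `attemptUpdate` until it resolves or `maxAttempts` is reached,
// pausing `delayMs` between attempts to give a running update time to finish.
async function updateWithRetry(attemptUpdate, { maxAttempts = 5, delayMs = 100 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await attemptUpdate();
    } catch (e) {
      lastError = e;
      if (attempt < maxAttempts) {
        await sleep(delayMs);
      }
    }
  }
  throw new Error(`update failed after ${maxAttempts} attempts: ${lastError.message}`);
}
```

A fixed delay keeps the sketch short; the trade-off is that a long-running update can still exhaust the attempts, which is why disabling normal updates or hardcoding the entries may be the more robust options.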
See Also: → 1182876
Depends on: 1345922
(In reply to François Marier [:francois] from comment #35)
> (In reply to Thomas Nguyen[:tnguyen] ni plz from comment #33)
> > I think the problem is here
> > https://searchfox.org/mozilla-central/source/toolkit/components/url-
> > classifier/tests/UrlClassifierTestUtils.jsm#90
> > We may have an running update and this will bailed out
> > https://searchfox.org/mozilla-central/source/toolkit/components/url-
> > classifier/nsUrlClassifierDBService.cpp#2043
> > 
> > Rarely we setup test data while running periodical update, but if it occurs,
> > the test will be timed out.
> 
> Ah that makes sense. It will be a problem for a number of our tests in fact.
> 
> As a first step, we should change the above code so that useTestDatabase()
> calls the reject() callback on the promise when BeginUpdate() returns an
> error. That way, the cause of these test failures will be clearly labelled
> and no longer just a timeout.

Also we should refrain from doing the next update until the previous one
is done. I will do them all in Bug 1345922.

> 
> Secondly, we should think about what we can do here. Wait for the update to
> finish and re-try the manual update? Disable normal updates before we add
> our test entries? Hardcode all of the test entries we need (like bug
> 1182876).
Whiteboard: [stockwell needswork] → [stockwell unknown]
The rate is about once a month, and we are not sure the timeout is really related to url-classifier (probably some previous tests take too long).
Priority: P3 → P5
Assignee: tnguyen → nobody
Status: ASSIGNED → NEW
This bug no longer occurs.
Status: NEW → RESOLVED
Closed: 8 years ago → 7 years ago
Resolution: --- → WORKSFORME