Closed Bug 1370585 Opened 2 years ago Closed 2 years ago

Intermittent toolkit/components/url-classifier/tests/mochitest/test_classify_ping.html | ping should reach page in blacklist when tracking protection is off - got true, expected false

Categories

(Toolkit :: Safe Browsing, defect, P2)

defect

Tracking

()

RESOLVED FIXED
mozilla56
Tracking Status
firefox55 --- fixed
firefox56 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: hchang)

Details

(Keywords: intermittent-failure, Whiteboard: [stockwell fixed:timing])

Attachments

(1 file)

Assignee: nobody → hchang
Priority: -- → P2
primarily on android debug, this started June 14th and has had a high rate since then.

I have started some retriggers here:
https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&filter-searchStr=mochitest-48&tochange=f65461f10e0a07b2ef94c51df003eda1612f6285&fromchange=6d2026f8b291d0aa09168d41e361dc6fafcfc410&selectedJob=106951381

Not sure if this will help find a root cause or not.


In the earlier log on the 14th:
https://queue.taskcluster.net/v1/task/NhDIeL-mTECn6aI3NrIldA/runs/0/artifacts/public/logs/live_backing.log

I see this text:
[task 2017-06-14T14:23:44.380842Z] 14:23:44     INFO -  151 INFO TEST-START | toolkit/components/url-classifier/tests/mochitest/test_classify_ping.html
[task 2017-06-14T14:23:55.326101Z] 14:23:55     INFO -  Buffered messages logged at 14:23:40
[task 2017-06-14T14:23:55.326248Z] 14:23:55     INFO -  152 INFO TEST-PASS | toolkit/components/url-classifier/tests/mochitest/test_classify_ping.html | A valid string reason is expected
[task 2017-06-14T14:23:55.326362Z] 14:23:55     INFO -  153 INFO TEST-PASS | toolkit/components/url-classifier/tests/mochitest/test_classify_ping.html | Reason cannot be empty
[task 2017-06-14T14:23:55.326414Z] 14:23:55     INFO -  Buffered messages logged at 14:23:46
[task 2017-06-14T14:23:55.326530Z] 14:23:55     INFO -  154 INFO TEST-FAIL | toolkit/components/url-classifier/tests/mochitest/test_classify_ping.html | The author of the test has indicated that flaky timeouts are expected.  Reason: Delay to make sure ping is made prior than XHR
[task 2017-06-14T14:23:55.326615Z] 14:23:55     INFO -  155 INFO TEST-PASS | toolkit/components/url-classifier/tests/mochitest/test_classify_ping.html | ping should reach page not in blacklist
[task 2017-06-14T14:23:55.326665Z] 14:23:55     INFO -  Buffered messages logged at 14:23:47
[task 2017-06-14T14:23:55.326776Z] 14:23:55     INFO -  156 INFO TEST-FAIL | toolkit/components/url-classifier/tests/mochitest/test_classify_ping.html | The author of the test has indicated that flaky timeouts are expected.  Reason: Delay to make sure ping is made prior than XHR
[task 2017-06-14T14:23:55.326825Z] 14:23:55     INFO -  Buffered messages finished
[task 2017-06-14T14:23:55.326925Z] 14:23:55     INFO -  157 INFO TEST-UNEXPECTED-FAIL | toolkit/components/url-classifier/tests/mochitest/test_classify_ping.html | ping should reach page in blacklist when tracking protection is off - got true, expected false
[task 2017-06-14T14:23:55.326991Z] 14:23:55     INFO -      SimpleTest.is@SimpleTest/SimpleTest.js:312:5
[task 2017-06-14T14:23:55.327286Z] 14:23:55     INFO -      isPingedWithRetry/xhr.onload@toolkit/components/url-classifier/tests/mochitest/test_classify_ping.html:96:9
[task 2017-06-14T14:23:55.327641Z] 14:23:55     INFO -  158 INFO TEST-FAIL | toolkit/components/url-classifier/tests/mochitest/test_classify_ping.html | The author of the test has indicated that flaky timeouts are expected.  Reason: Delay to make sure ping is made prior than XHR
[task 2017-06-14T14:23:55.328170Z] 14:23:55     INFO -  159 INFO TEST-PASS | toolkit/components/url-classifier/tests/mochitest/test_classify_ping.html | ping should not reach page in blacklist when tracking protection is on
[task 2017-06-14T14:23:55.328637Z] 14:23:55     INFO -  160 INFO TEST-OK | toolkit/components/url-classifier/tests/mochitest/test_classify_ping.html | took 14772ms


With this higher failure rate I would like to see us get some type of resolution on this in the next week (fixed, reduced, disabled)

:francois, I see you are the triage owner for this component, can you find someone from the safe browsing team to look at this?
Flags: needinfo?(francois)
Whiteboard: [stockwell needswork]
(In reply to Joel Maher ( :jmaher) from comment #2)
> primarily on android debug, this started June 14th and has had a high rate
> since then.
> 
> I have started some retriggers here:
> https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&filter-
> searchStr=mochitest-
> 48&tochange=f65461f10e0a07b2ef94c51df003eda1612f6285&fromchange=6d2026f8b291d
> 0aa09168d41e361dc6fafcfc410&selectedJob=106951381
> 
> Not sure if this will help find a root cause or not.
> 
> 
> In the earlier log on the 14th:
> https://queue.taskcluster.net/v1/task/NhDIeL-mTECn6aI3NrIldA/runs/0/
> artifacts/public/logs/live_backing.log
> 
> I see this text:
> [task 2017-06-14T14:23:44.380842Z] 14:23:44     INFO -  151 INFO TEST-START
> | toolkit/components/url-classifier/tests/mochitest/test_classify_ping.html
> [task 2017-06-14T14:23:55.326101Z] 14:23:55     INFO -  Buffered messages
> logged at 14:23:40
> [task 2017-06-14T14:23:55.326248Z] 14:23:55     INFO -  152 INFO TEST-PASS |
> toolkit/components/url-classifier/tests/mochitest/test_classify_ping.html |
> A valid string reason is expected
> [task 2017-06-14T14:23:55.326362Z] 14:23:55     INFO -  153 INFO TEST-PASS |
> toolkit/components/url-classifier/tests/mochitest/test_classify_ping.html |
> Reason cannot be empty
> [task 2017-06-14T14:23:55.326414Z] 14:23:55     INFO -  Buffered messages
> logged at 14:23:46
> [task 2017-06-14T14:23:55.326530Z] 14:23:55     INFO -  154 INFO TEST-FAIL |
> toolkit/components/url-classifier/tests/mochitest/test_classify_ping.html |
> The author of the test has indicated that flaky timeouts are expected. 
> Reason: Delay to make sure ping is made prior than XHR
> [task 2017-06-14T14:23:55.326615Z] 14:23:55     INFO -  155 INFO TEST-PASS |
> toolkit/components/url-classifier/tests/mochitest/test_classify_ping.html |
> ping should reach page not in blacklist
> [task 2017-06-14T14:23:55.326665Z] 14:23:55     INFO -  Buffered messages
> logged at 14:23:47
> [task 2017-06-14T14:23:55.326776Z] 14:23:55     INFO -  156 INFO TEST-FAIL |
> toolkit/components/url-classifier/tests/mochitest/test_classify_ping.html |
> The author of the test has indicated that flaky timeouts are expected. 
> Reason: Delay to make sure ping is made prior than XHR
> [task 2017-06-14T14:23:55.326825Z] 14:23:55     INFO -  Buffered messages
> finished
> [task 2017-06-14T14:23:55.326925Z] 14:23:55     INFO -  157 INFO
> TEST-UNEXPECTED-FAIL |
> toolkit/components/url-classifier/tests/mochitest/test_classify_ping.html |
> ping should reach page in blacklist when tracking protection is off - got
> true, expected false
> [task 2017-06-14T14:23:55.326991Z] 14:23:55     INFO -     
> SimpleTest.is@SimpleTest/SimpleTest.js:312:5
> [task 2017-06-14T14:23:55.327286Z] 14:23:55     INFO -     
> isPingedWithRetry/xhr.onload@toolkit/components/url-classifier/tests/
> mochitest/test_classify_ping.html:96:9
> [task 2017-06-14T14:23:55.327641Z] 14:23:55     INFO -  158 INFO TEST-FAIL |
> toolkit/components/url-classifier/tests/mochitest/test_classify_ping.html |
> The author of the test has indicated that flaky timeouts are expected. 
> Reason: Delay to make sure ping is made prior than XHR
> [task 2017-06-14T14:23:55.328170Z] 14:23:55     INFO -  159 INFO TEST-PASS |
> toolkit/components/url-classifier/tests/mochitest/test_classify_ping.html |
> ping should not reach page in blacklist when tracking protection is on
> [task 2017-06-14T14:23:55.328637Z] 14:23:55     INFO -  160 INFO TEST-OK |
> toolkit/components/url-classifier/tests/mochitest/test_classify_ping.html |
> took 14772ms
> 
> 
> With this higher failure rate I would like to see us get some type of
> resolution on this in the next week (fixed, reduced, disabled)
> 
> :francois, I see you are the triage owner for this component, can you find
> someone from the safe browsing team to look at this?

I had a look last week but haven't had an idea. Will spend more time looking
at this given that the failure rate is going higher and higher :(
Flags: needinfo?(francois)
Attachment #8879087 - Flags: review?(francois)
I don't know what change was made on 6/14 but the failed test
is a little timing sensitive: we trigger a "ping" for an anchor tag, 
wait a while then use XHR to check if the server state has been changed by 
the "ping" request. In theory the requests should be handled in the 
way "first-sent-first-come-first-served" so the waiting time shouldn't
matter. However, not all the operations (ping, XHR, blacklist check)
are synchronous so the waiting time might still be a thing. So, I changed
the one-time check to a periodic poll and it looks good in [1].

Francois, could you help review the patch given that the above argument?
I dup'ed 10 test cases and re-trigger a million times in [1] so it should be 
a good evidence to support what I've argued. 

Thanks.

[1] https://treeherder.mozilla.org/#/jobs?repo=try&revision=541473a1b112587bc20472b877e59b1f6c8b8651&selectedJob=108131977
Flags: needinfo?(francois)
Comment on attachment 8879087 [details]
Bug 1370585 - Periodically check the flag (at most 30 seconds) rather than checking only once.

https://reviewboard.mozilla.org/r/150424/#review155228
Attachment #8879087 - Flags: review?(francois) → review+
(In reply to Henry Chang [:hchang] from comment #6)
> Francois, could you help review the patch given that the above argument?
> I dup'ed 10 test cases and re-trigger a million times in [1] so it should be 
> a good evidence to support what I've argued. 

Sounds good to me. Let's merge and see how it goes. It should make this test more reliable.
Flags: needinfo?(francois)
Pushed by hchang@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/d06f14a3eb8a
Periodically check the flag (at most 30 seconds) rather than checking only once. r=francois
https://hg.mozilla.org/mozilla-central/rev/d06f14a3eb8a
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla56
https://hg.mozilla.org/releases/mozilla-beta/rev/92582020c573
Flags: in-testsuite+
Whiteboard: [stockwell needswork] → [stockwell fixed]
The last failure was 7 days ago (6/20) so I am confident it has been actually fixed.
Whiteboard: [stockwell fixed] → [stockwell fixed:timing]
You need to log in before you can comment on or make changes to this bug.