Closed Bug 1353182 Opened 7 years ago Closed 7 years ago

Intermittent test_safe_browsing_warning_pages.py TestSafeBrowsingWarningPages.test_warning_pages | TimeoutException: Timed out after 300.2 seconds (outage of support.mozilla.org)

Categories

(Testing :: Firefox UI Tests, defect)

Version 3
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: intermittent-bug-filer, Unassigned)

References

Details

(Keywords: intermittent-failure)

Filed by: wkocher [at] mozilla.com

https://treeherder.mozilla.org/logviewer.html#?job_id=88415215&repo=autoland

https://queue.taskcluster.net/v1/task/DWV-yEr-TfuwlZflhKPWtg/runs/0/artifacts/public/logs/live_backing.log

This started failing across inbound and autoland around the same time.
Not a very helpful assertion message. It looks like we forgot to pass the message argument to the Wait().until() call.

[task 2017-04-03T21:20:08.850324Z] 21:20:08     INFO - TEST-UNEXPECTED-ERROR | test_safe_browsing_warning_pages.py TestSafeBrowsingWarningPages.test_warning_pages | TimeoutException: Timed out after 300.2 seconds
[task 2017-04-03T21:20:08.851735Z] 21:20:08     INFO - Traceback (most recent call last):
[task 2017-04-03T21:20:08.852406Z] 21:20:08     INFO -   File "/home/worker/workspace/build/venv/local/lib/python2.7/site-packages/marionette_harness/marionette_test/testcases.py", line 166, in run
[task 2017-04-03T21:20:08.852524Z] 21:20:08     INFO -     testMethod()
[task 2017-04-03T21:20:08.852940Z] 21:20:08     INFO -   File "/home/worker/workspace/build/tests/firefox-ui/tests/functional/security/test_safe_browsing_warning_pages.py", line 60, in test_warning_pages
[task 2017-04-03T21:20:08.853038Z] 21:20:08     INFO -     self.check_report_button(unsafe_page)
[task 2017-04-03T21:20:08.853177Z] 21:20:08     INFO -   File "/home/worker/workspace/build/tests/firefox-ui/tests/functional/security/test_safe_browsing_warning_pages.py", line 90, in check_report_button
[task 2017-04-03T21:20:08.853229Z] 21:20:08     INFO -     expected.element_stale(button))
[task 2017-04-03T21:20:08.853307Z] 21:20:08     INFO -   File "/home/worker/workspace/build/venv/local/lib/python2.7/site-packages/marionette_driver/wait.py", line 150, in until
[task 2017-04-03T21:20:08.853589Z] 21:20:08     INFO -     cause=last_exc)
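
To illustrate why the missing message argument matters, here is a minimal sketch of a polling helper that appends a caller-supplied message to its timeout error. `wait_until`, `WaitTimeout`, and `describe_timeout` are hypothetical stand-ins written for this example; they are not Marionette's `Wait` class, and the message text is only an assumption about what would be useful in the log.

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition does not become true in time."""


def wait_until(condition, timeout=5.0, interval=0.1, message=""):
    """Poll `condition` until it returns a truthy value or `timeout` expires.

    Hypothetical stand-in for illustration only, not Marionette's Wait class.
    """
    start = time.monotonic()
    while True:
        result = condition()
        if result:
            return result
        elapsed = time.monotonic() - start
        if elapsed >= timeout:
            text = "Timed out after {:.1f} seconds".format(elapsed)
            if message:
                # The extra context is what was missing from the log above.
                text += " with message: " + message
            raise WaitTimeout(text)
        time.sleep(interval)


def describe_timeout(message):
    """Return the error text produced when the condition never becomes true."""
    try:
        wait_until(lambda: False, timeout=0.2, interval=0.05, message=message)
    except WaitTimeout as exc:
        return str(exc)
    return None
```

With a message such as "Report button did not become stale", the failure line would point at the page load that never happened instead of showing only the elapsed time.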

So we are waiting here for the report button to become stale. The timeout means that either the click does not trigger a page load, or a networking issue prevents the target page from loading.
This happens because support.mozilla.org was barely reachable throughout the day. Details can be found in bug 1351498.
Depends on: 1351498
I thought this bug was caused by our recent big change in url-classifier, but it seems it was not.
So far, the intermittent failure has only occurred on April 3rd, 4th, and 5th, and it dropped to zero from April 6th on, so comment 2 likely describes the root cause.
Thanks for verifying this Thomas.

Since this problem should go away once SUMO becomes reliable again, I'll mark this as WONTFIX.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → WONTFIX
We have to keep this open to be able to star the failures in Treeherder. We can close it as WFM once it appears to work fine again. If we close it now, sheriffs might simply file a different bug.
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
I checked Orange Factor and the failures are indeed gone with builds from April 6th:

https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1353182&startday=2017-04-03&endday=2017-04-12&tree=all

This means it was fixed on the SUMO side.

Francois, I wonder if we could get rid of this remote dependency by pointing some preference values at a local HTTP server, or if you would really like to keep the checks against the real site.
Status: REOPENED → RESOLVED
Closed: 7 years ago
Flags: needinfo?(francois)
Resolution: --- → WORKSFORME
Summary: Intermittent test_safe_browsing_warning_pages.py TestSafeBrowsingWarningPages.test_warning_pages | TimeoutException: Timed out after 300.2 seconds → Intermittent test_safe_browsing_warning_pages.py TestSafeBrowsingWarningPages.test_warning_pages | TimeoutException: Timed out after 300.2 seconds (outage of support.mozilla.org)
(In reply to Henrik Skupin (:whimboo) from comment #8)
> Francois, I wonder if we could get rid of this remote dependency by pointing
> some preference values at a local HTTP server, or if you would really like
> to keep the checks against the real site.

I think we only care that the button works.

In our test harness, is there a way to capture the request to SUMO and redirect it to some local webserver that will return an empty page (but still a 200)?
Flags: needinfo?(francois) → needinfo?(hskupin)
No, but I assume there is a pref with formatting options that Firefox reads, and from which the target URL (SUMO in this case) is built? We could easily update this pref. Maybe it is `app.support.baseURL;https://support.mozilla.org/1/firefox/%VERSION%/%OS%/%LOCALE%/`? If so, we can get this switched to our local server in a new bug.
Flags: needinfo?(hskupin) → needinfo?(francois)
(In reply to Henrik Skupin (:whimboo) from comment #10)
> No, but I assume there is a pref with formatting options that Firefox reads,
> and from which the target URL (SUMO in this case) is built? We could easily
> update this pref. Maybe it is
> `app.support.baseURL;https://support.mozilla.org/1/firefox/%VERSION%/%OS%/%LOCALE%/`?
> If so, we can get this switched to our local server in a new bug.

Yes, it does use that pref.

The code that opens SUMO ("Why was this page blocked?" button) is here: https://searchfox.org/mozilla-central/rev/d4eaa9c2fa54d553349ac88f0c312155a4c6e89e/browser/base/content/browser.js#3195

and it uses this code:

        openHelpLink("phishing-malware", false, "current");

The openHelpLink() function is defined here: https://searchfox.org/mozilla-central/rev/d4eaa9c2fa54d553349ac88f0c312155a4c6e89e/browser/base/content/utilityOverlay.js#906

and uses getHelpLinkUrl(): https://searchfox.org/mozilla-central/rev/d4eaa9c2fa54d553349ac88f0c312155a4c6e89e/browser/base/content/utilityOverlay.js#901

which does use the app.support.baseURL pref.
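
As a rough illustration of that chain, the following Python sketch expands the placeholders in a base URL template and appends the help topic. It only mimics the behaviour described in this thread; `format_help_url` is a hypothetical helper, not the browser's actual implementation, and the version, OS, and locale values as well as the local address are made-up examples.

```python
def format_help_url(base_template, topic, version, os_name, locale):
    """Expand the placeholders in a support base URL and append the help topic."""
    replacements = {
        "%VERSION%": version,
        "%OS%": os_name,
        "%LOCALE%": locale,
    }
    url = base_template
    for placeholder, value in replacements.items():
        url = url.replace(placeholder, value)
    return url + topic


# Pref value quoted earlier in this bug.
SUMO_TEMPLATE = "https://support.mozilla.org/1/firefox/%VERSION%/%OS%/%LOCALE%/"

# Hypothetical replacement pointing at a local test server (address is an assumption).
LOCAL_TEMPLATE = "http://127.0.0.1:8000/1/firefox/%VERSION%/%OS%/%LOCALE%/"

# Illustrative values only; real values come from the running build.
sumo_url = format_help_url(SUMO_TEMPLATE, "phishing-malware", "55.0", "Linux", "en-US")
local_url = format_help_url(LOCAL_TEMPLATE, "phishing-malware", "55.0", "Linux", "en-US")
```

Because the topic is simply appended to the formatted base URL, changing the pref alone should be enough to move the request away from SUMO.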

So if we can redirect all SUMO requests to a local server that always returns a 200, we should eliminate all of the false positives.
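
A minimal sketch of such a server, using only the Python 3 standard library, could look like the code below. `AlwaysOkHandler`, `start_stub_server`, and `fetch_status` are hypothetical names for this example, and the request path uses made-up version and locale values. In the real test suite the harness's own local web server would presumably be the better fit; this standalone stub only demonstrates the "always 200" behaviour.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class AlwaysOkHandler(BaseHTTPRequestHandler):
    """Answer every GET request with status 200 and a minimal HTML page."""

    def do_GET(self):
        body = b"<!DOCTYPE html><title>stub</title>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Suppress per-request logging to keep test output quiet.
        pass


def start_stub_server():
    """Start the stub on a free local port in a background thread."""
    server = HTTPServer(("127.0.0.1", 0), AlwaysOkHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server


def fetch_status(host, port, path):
    """Issue a GET request and return the HTTP status code."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        response.read()
        return response.status
    finally:
        conn.close()
```

A test would then point app.support.baseURL at the stub's address before clicking the button, and call `server.shutdown()` followed by `server.server_close()` during teardown.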
Flags: needinfo?(francois)