Closed Bug 1090198 Opened 11 years ago Closed 8 years ago

Intermittent e10s 014.html | WebSockets: serialize establish a connection - assert_greater_than: expected a number greater than 998 but got 979

Categories

(Core :: DOM: Workers, defect, P5)

x86
Windows XP
defect

Tracking

()

RESOLVED FIXED
mozilla55
Tracking Status
e10s + ---
firefox34 --- unaffected
firefox35 --- unaffected
firefox36 --- wontfix
firefox37 --- wontfix
firefox38 --- affected
firefox39 --- affected
firefox40 --- affected
firefox-esr31 --- unaffected
firefox-esr52 --- fixed
firefox54 --- fixed
firefox55 --- fixed

People

(Reporter: cbook, Assigned: jgraham)

References

(Blocks 1 open bug)

Details

(Keywords: intermittent-failure, Whiteboard: [stockwell disabled])

Windows XP 32-bit mozilla-inbound opt test web-platform-tests-4
https://treeherder.mozilla.org/ui/logviewer.html#?job_id=3352698&repo=mozilla-inbound
05:37:27 INFO - TEST-UNEXPECTED-FAIL | /websockets/constructor/014.html | WebSockets: serialize establish a connection - assert_greater_than: expected a number greater than 998 but got 979
Component: General → Networking: WebSockets
Blocks: 504553
Component: Networking: WebSockets → DOM: Workers
Given that this is new today in the WebSockets code, is this caused by bug 1090170?
Blocks: 1090170
Flags: needinfo?(amarchesini)
These tests were only enabled in production on Windows today, so it might be an older issue that doesn't affect linux. Notice that the tolerance of the test seems rather small. I wonder if the problem is just timer accuracy on Windows and whether having a larger tolerance and/or switching to high resolution time would help.
(In reply to James Graham [:jgraham] from comment #22)
> These tests were only enabled in production on Windows today, so it might be
> an older issue that doesn't affect linux. Notice that the tolerance of the
> test seems rather small. I wonder if the problem is just timer accuracy on
> Windows and whether having a larger tolerance and/or switching to high
> resolution time would help.

What bug enabled them? But yes, I would just make the tolerances in the test more obvious. WebSockets in Gecko is only responsible for making sure they don't overlap, so it doesn't even use a timer for this case. E.g. make the sleep 1.25s instead of 1s.
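To illustrate the tolerance argument, here is a minimal sketch in JavaScript. The helper names `gapIsAcceptable` and `elapsedSince` are hypothetical and do not come from the real test. The 998 ms threshold and the 979 ms measurement are taken from the failure message; the 21 ms shortfall is simply their difference from a nominal 1000 ms sleep. How the real test takes its timestamps is not shown in this bug, so the `performance.now()` helper only shows what "high resolution time" could look like.

```javascript
// Hypothetical helper: true when the measured gap exceeds the minimum
// that the test asserts on (mirrors assert_greater_than).
function gapIsAcceptable(measuredGapMs, minimumGapMs) {
  return measuredGapMs > minimumGapMs;
}

// Hypothetical helper: elapsed milliseconds on the monotonic,
// sub-millisecond clock instead of wall-clock Date values.
function elapsedSince(startMs) {
  return performance.now() - startMs;
}

// Shortfall seen in the failure: nominal 1000 ms sleep, 979 ms measured.
const shortfallMs = 1000 - 979;

// Server sleeps 1000 ms, test asserts > 998 ms: only 2 ms of slack.
const tightResult = gapIsAcceptable(1000 - shortfallMs, 998);

// Server sleeps 1250 ms, test still asserts > 998 ms: 252 ms of slack.
const relaxedResult = gapIsAcceptable(1250 - shortfallMs, 998);

console.log({ tightResult, relaxedResult });
```

Under these assumptions the same 21 ms shortfall fails the tight variant and passes the relaxed one, which is the effect the suggestion to sleep 1.25s is after.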
No longer blocks: 504553, 1090170
James, can we patch those tests ASAP (36 oranges in 2 days), or disable them again if the fix needs to be coordinated with upstream?
Flags: needinfo?(amarchesini) → needinfo?(james)
Flags: needinfo?(james)
Assignee: nobody → james
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla36
Seems it's still happening :(
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Andrea, do you have time to look into this pretty frequent WebSockets test failure?
Flags: needinfo?(amarchesini)
Target Milestone: mozilla36 → ---
I would propose to increase the tolerance of this test based on comment 28. Maybe 1.5/2 secs?
Flags: needinfo?(amarchesini) → needinfo?(james)
I updated this upstream, let's see if the problems get better.
Flags: needinfo?(james)
That solved the non-e10s problem, but it remains a frequent e10s-only problem.
Summary: Intermittent 014.html | WebSockets: serialize establish a connection - assert_greater_than: expected a number greater than 998 but got 979 → Intermittent e10s 014.html | WebSockets: serialize establish a connection - assert_greater_than: expected a number greater than 998 but got 979
Blocks: e10s-tests
tracking-e10s: --- → +
Intermittent e10s test failure
Priority: -- → P5
This is a badly written test. It doesn't measure the time between opening the channel and the onopen event. Instead, it opens 4 channels and measures the time between onopen events. The intervals between individual onopen events are not guaranteed in any way by the sleep(2) in the handshake handler, because the next channel is already doing its handshake while we're processing the onopen event of the previous channel. I verified this by delaying the first onopen event by 1.5s, which makes the test fail every time; the delay between the first and second onopen is then only 0.5s:

 ws[i].onopen = t.step_func(function(e) {
+  if (events == 0) {
+    var startDate = new Date();
+    var currentDate = new Date();
+
+    while (currentDate - startDate < 1500) {
+      currentDate = new Date();
+    }
+  }
   events++;
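The argument above can be sketched with a deliberately simplified model. It assumes handshakes complete on the server at a fixed spacing, and that the page observes each onopen after some extra per-event dispatch delay. The function `observedOnopenGaps` is hypothetical, the 2000 ms spacing is inferred from the 1.5s and 0.5s figures in this comment, and the model ignores event-loop queuing entirely.

```javascript
// Simplified model, not the real test: handshake i finishes on the server at
// (i + 1) * handshakeMs, and the page records onopen dispatchDelaysMs[i] later.
// Returns the gaps between consecutive observed onopen times.
function observedOnopenGaps(handshakeMs, dispatchDelaysMs) {
  const observed = dispatchDelaysMs.map(
    (delayMs, i) => (i + 1) * handshakeMs + delayMs
  );
  return observed.slice(1).map((time, i) => time - observed[i]);
}

// No dispatch delay: every gap equals the handshake spacing.
const evenGaps = observedOnopenGaps(2000, [0, 0, 0, 0]);

// First onopen observed 1500 ms late: the first gap shrinks to 500 ms,
// even though the server still spaced the handshakes 2000 ms apart.
const delayedGaps = observedOnopenGaps(2000, [1500, 0, 0, 0]);

// A 21 ms delay with 1000 ms spacing gives a 979 ms gap, the value in the
// failure message (whether that was the actual cause is an assumption).
const jitterGaps = observedOnopenGaps(1000, [21, 0]);

console.log({ evenGaps, delayedGaps, jitterGaps });
```

In this model any delay in observing one onopen is subtracted from the next gap, so the server-side sleep sets only the spacing of handshake completions, not a lower bound on the intervals the test asserts on.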
This got frequent after yesterday's wpt push:
https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&tochange=59d5c20392e6b3e5963e698c1c6b778bfbecd2de&fromchange=20dd333bb5715107c224e74ad85db5b6bfa902c8&filter-searchStr=linux%20debug%20tc-W-e10s%28wpt8%29&selectedJob=90449447
Bug 1352351 - Enable --run-by-dir for web-platform-tests, r=ato
Bug 1352351 - Fix infinite loop getting next test with --run-by-dir, r=ato
Bug 1321179 - Disable navigation test on Windows for instability r=ato
Bug 1355060 - Fix error from assert outside step in history traversal test, r=ato
James, can you take a look at these failures, please?
Flags: needinfo?(james)
Maybe the right thing is just to disable it until we have time to fix it properly. Are you able to do that?
Flags: needinfo?(james) → needinfo?(aryx.bugmail)
Pushed by archaeopteryx@coole-files.de:
https://hg.mozilla.org/integration/mozilla-inbound/rev/e27eb8e2b2e4
Disable intermittent websockets/constructor/014.html on e10s debug. r=requested-by-jgraham DONTBUILD
Status: REOPENED → RESOLVED
Closed: 11 years ago → 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
Whiteboard: [stockwell disabled]