Closed Bug 1295453 Opened 3 years ago Closed 3 years ago

Intermittent dom/network/tests/test_tcpsocket_client_and_server_basics.html, dom/network/tests/test_tcpsocket_jsm.html | Test timed out.

Categories

(Core :: DOM: Core & HTML, defect, P3)

defect

Tracking

()

RESOLVED FIXED
mozilla51
Tracking Status
firefox51 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: wlach)

References

Details

(Keywords: intermittent-failure)

Attachments

(1 file)

This is pretty frequent and tends to cause failures in subsequent tests as well. Any chance you could take a look, Josh?
Flags: needinfo?(josh)
Looks like the same problem as bug 1295483.
Blocks: 1104156
Flags: needinfo?(josh)
FWIW, I'm looking into this. In the examples I've looked at, it seems like the problem is that a socket is being closed before the event we're expecting happens, and then we eventually just time out waiting for it to happen (which it never does). Not really sure why this would be the case, it's obviously *not supposed to happen* and nothing we're explicitly doing should trigger a close.

Next steps:

1. See if we can reproduce this on a Linux machine
2. See if we can reproduce this under rr.

It would indeed not surprise me if this was related to bug 1295483 or possibly other tcp oranges as well, but this seems like one of the simpler manifestations of what (might) be a larger problem, so let's start here.
Ok, I managed to reproduce this in rr (using chaos mode), though in retrospect I probably didn't need to: the bug is pretty simple, in some cases we close immediately before writing *any* data to the socket, so we wait for that event forever. There's no guarantee even 1 byte will be written, so we shouldn't expect it to. It should be sufficient to just wait for the connection to close, then to verify that a full set of data wasn't written (in theory this could still intermittently fail depending on timing, but the chances of that are infinitesimally small)
Assignee: nobody → wlachance
Note that the above patch basically reverts a change that :jdm requested in https://bugzilla.mozilla.org/show_bug.cgi?id=1104156#c28 ("We should first use serverQueue.waitForDataWithAtLeastLength - we want to verify that at least 1 byte is received before the close message, but we should also verify that the received data is less than the length of the two big arrays. Does that make sense?"), but I'm pretty sure that isn't the right thing to assert on. Would we expect that behaviour most of the time? Yes. Is it an error if we don't see it? No.

(unless I'm missing something)
Summary: Intermittent dom/network/tests/test_tcpsocket_client_and_server_basics.html | Test timed out. → Intermittent dom/network/tests/test_tcpsocket_client_and_server_basics.html, dom/network/tests/test_tcpsocket_jsm.html | Test timed out.
Duplicate of this bug: 1295483
Bulk assigning P3 to all open intermittent bugs without a priority set in Firefox components per bug 1298978.
Priority: -- → P3
Comment on attachment 8786850 [details]
Bug 1295453 - Don't wait for data to be sent when closing socket immediately

https://reviewboard.mozilla.org/r/75734/#review74948

::: dom/network/tests/test_tcpsocket_client_and_server_basics.js:382
(Diff revision 1)
> -  is(serverReceived.length < (2 * bigUint8Array.length), true, 'Received array length less than sent array length');
> -
>    is((yield serverQueue.waitForEvent()).type, 'close',
>       'The close event is received after calling closeImmediately');
>  
> +  is(serverReceived.length < (2 * bigUint8Array.length), true, 'Received array length less than sent array length');

This serverReceived value is unrelated to the sends that happened in this particular test, so we can't assert anything about this case now.

What if we make a waitForAnyDataOrClose helper which uses Promise.race to catch both cases, and treat the close case as a length of 0?
Attachment #8786850 - Flags: review?(josh) → review-
Comment on attachment 8786850 [details]
Bug 1295453 - Don't wait for data to be sent when closing socket immediately

https://reviewboard.mozilla.org/r/75734/#review74948

> This serverReceived value is unrelated to the sends that happened in this particular test, so we can't assert anything about this case now.
> 
> What if we make a waitForAnyDataOrClose helper which uses Promise.race to catch both cases, and treat the close case as a length of 0?

Ah right, however I don't think we want Promise.race here, since we really do just want the close message to be called, at least as far as I understand things. I think this updated patch should work, let me know if I'm missing something.
Comment on attachment 8786850 [details]
Bug 1295453 - Don't wait for data to be sent when closing socket immediately

https://reviewboard.mozilla.org/r/75734/#review75264

Thanks!
Attachment #8786850 - Flags: review?(josh) → review+
try run looks happy, I'm going to try to land:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=417cc6083b7e

(missing data for runs on other platforms which I hoped to have probably because my try syntax is wrong, but I'm still pretty confident this won't break stuff)
Pushed by wlachance@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/3d0b41fdd93b
Don't wait for data to be sent when closing socket immediately r=jdm
https://hg.mozilla.org/mozilla-central/rev/3d0b41fdd93b
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla51
Duplicate of this bug: 1295625
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.