Intermittent /streams/transferable/writable-stream.html | second write should wait for first underlying write to complete - promise_test: Unhandled rejection with value: object "TypeError: resolveWrite is not a function"
Categories
(Core :: DOM: Streams, defect, P5)
Tracking
()
| Tracking | Status | |
|---|---|---|
| firefox-esr91 | --- | unaffected |
| firefox100 | --- | unaffected |
| firefox101 | --- | unaffected |
| firefox102 | --- | fixed |
People
(Reporter: intermittent-bug-filer, Assigned: saschanaz)
References
(Regression)
Details
(Keywords: intermittent-failure, regression)
Attachments
(1 file)
Filed by: csabou [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=378615548&repo=autoland
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/N2y6e3dvRq2TtKitQHJBGQ/runs/0/artifacts/public/logs/live_backing.log
Reftest URL: https://hg.mozilla.org/mozilla-central/raw-file/tip/layout/tools/reftest/reftest-analyzer.xhtml#logurl=https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/N2y6e3dvRq2TtKitQHJBGQ/runs/0/artifacts/public/logs/live_backing.log&only_show_unexpected=1
[task 2022-05-19T12:41:26.549Z] 12:41:26 INFO - TEST-PASS | /streams/transferable/writable-stream.html | effective queue size of a transferred writable should be 2
[task 2022-05-19T12:41:26.549Z] 12:41:26 INFO - TEST-UNEXPECTED-FAIL | /streams/transferable/writable-stream.html | second write should wait for first underlying write to complete - promise_test: Unhandled rejection with value: object "TypeError: resolveWrite is not a function"
[task 2022-05-19T12:41:26.550Z] 12:41:26 INFO - ..
[task 2022-05-19T12:41:26.550Z] 12:41:26 INFO - TEST-OK | /streams/transferable/writable-stream.html | took 213ms
[task 2022-05-19T12:41:26.550Z] 12:41:26 INFO - No more tests
[task 2022-05-19T12:41:26.556Z] 12:41:26 INFO - PID 14268 | 1652964086552 Marionette INFO Stopped listening on port 50602
[task 2022-05-19T12:41:26.998Z] 12:41:26 INFO - Browser exited with return code 0
[task 2022-05-19T12:41:27.001Z] 12:41:27 INFO - Closing logging queue
[task 2022-05-19T12:41:27.001Z] 12:41:27 INFO - queue closed
[task 2022-05-19T12:41:27.017Z] 12:41:27 INFO - Got 1 unexpected results, with 0 unexpected passes
[task 2022-05-19T12:41:27.019Z] 12:41:27 INFO - SUITE-END | took 5s
[task 2022-05-19T12:41:27.019Z] 12:41:27 INFO - Repetition 5 / 5
[task 2022-05-19T12:41:27.020Z] 12:41:27 INFO - SUITE-START | Running 1 tests
[task 2022-05-19T12:41:27.021Z] 12:41:27 INFO - Running testharness tests
[task 2022-05-19T12:41:27.033Z] 12:41:27 INFO - Setting up ssl
[task 2022-05-19T12:41:27.066Z] 12:41:27 INFO - certutil | b''
[task 2022-05-19T12:41:27.100Z] 12:41:27 INFO - certutil | b''
[task 2022-05-19T12:41:27.112Z] 12:41:27 INFO - certutil | b'\nCertificate Nickname Trust Attributes\n SSL,S/MIME,JAR/XPI\n\nweb-platform-tests CT,, \n'
[task 2022-05-19T12:41:27.116Z] 12:41:27 INFO - Application command: /builds/worker/workspace/build/application/firefox/firefox --marionette about:blank -profile /tmp/tmpqykr7ett
[task 2022-05-19T12:41:27.126Z] 12:41:27 INFO - Starting runner
[task 2022-05-19T12:41:27.677Z] 12:41:27 INFO - PID 14535 | 1652964087676 Marionette INFO Marionette enabled
[task 2022-05-19T12:41:27.680Z] 12:41:27 INFO - PID 14535 | 1652964087679 Marionette INFO Listening on port 55993
[task 2022-05-19T12:41:31.098Z] 12:41:31 INFO - PID 14535 | console.warn: SearchSettings: "get: No settings file exists, new profile?" (new NotFoundError("Could not open the file at /tmp/tmpqykr7ett/search.json.mozlz4", (void 0)))
[task 2022-05-19T12:41:31.796Z] 12:41:31 INFO - TEST-START | /streams/transferable/writable-stream.html
Comment 1•3 years ago
|
||
:saschanaz, since you are the author of the regressor, bug 1659025, could you take a look?
For more information, please visit auto_nag documentation.
| Assignee | ||
Comment 2•3 years ago
•
|
||
(Copying from https://bugzilla.mozilla.org/show_bug.cgi?id=1769956#c1)
Unhandled rejection with value: object "TypeError: resolveWrite is not a function"
This sounds like a JavaScript context issue. Matthew, do you have any idea? I couldn't repro this locally, not sure what's different on TVw?
Seems it doesn't fail on Windows, interesting.
Comment 3•3 years ago
•
|
||
Tier 1 failure here.
Comment 4•3 years ago
|
||
Set release status flags based on info from the regressing bug 1659025
Comment 5•3 years ago
|
||
So Test Verification tests run with a whole batch of variations to try to trip up ordering constraints. So if we got anything wrong (or if the testcase+spec is actually wrong) I'd be entirely unsurprised to see it trip up intermittents here.
Reading the test case it seems like the problem is that we're in a situation where we've made it all the way to line 107, where we invoke the binding resolveWrite before we've ever actually executed the underlying sink write method that should have actually installed it.
Tweaking the test case I was able to reproduce this on my laptop (macOS arm64):
promise_test(async () => {
let resolveWrite;
assert_true(resolveWrite == undefined, "resolveWrite ") // new assert
const orig = new WritableStream({
write() {
return new Promise(resolve => {
resolveWrite = resolve;
});
}
});
const transferred = await transfer(orig);
const writer = transferred.getWriter();
await writer.write('a');
assert_true(resolveWrite == undefined, "ResolveWrite need to wait until microtask queue is drained"); // new assert
let writeDone = false;
const writePromise = writer.write('b').then(() => {
writeDone = true;
});
await flushAsyncEvents();
assert_false(writeDone, 'second write should not have resolved yet'); // new assert
assert_true(resolveWrite instanceof Function, "ResolveWrite needed to wait until microtask queue is drained"); // new assert
resolveWrite();
await writePromise; // (makes sure this doesn't cause timeout)
assert_true(writeDone, "now we've done write two.")
}, 'second write should wait for first underlying write to complete');
The case that failed is Running tests in a loop 10 times with flags chaos_mode_flags=0xfb : FAIL
Given that this happening is reliant on events happening in the correct order, one does wonder if what is happening in these TV cases is something is causing the event loop or micro task queue to have the wrong number of events at the wrong time; I wonder how chaos achieves this.
We could potentially rewrite this test to be more robust to interference (considering flushAsyncEvents is defined as 'two event loop spins'), by having another promise that's only resolved once resolveWrite exists...
I'll keep looking at this after lunch.
| Assignee | ||
Comment 6•3 years ago
•
|
||
Oh thanks, that does look problematic. Generally anything with delay(0) seems harmful in streams tests, since there's no guarantee that it will actually wait until all the messaging finishes.
That said, I still can't repro this even with your modification 🧐
| Assignee | ||
Comment 7•3 years ago
|
||
function makePromiseAndResolveFunc() {
let resolve;
const promise = new Promise(r => { resolve = r; });
return [promise, resolve];
};
promise_test(async () => {
const [writeCalled, resolveWriteCalled] = makePromiseAndResolveFunc();
let resolveWrite;
const orig = new WritableStream({
write() {
resolveWriteCalled();
return new Promise(resolve => {
resolveWrite = resolve;
});
}
});
const transferred = await transfer(orig);
const writer = transferred.getWriter();
await writer.write('a');
let writeDone = false;
const writePromise = writer.write('b').then(() => {
writeDone = true;
});
await writeCalled;
assert_false(writeDone, 'second write should not have resolved yet');
resolveWrite();
await writePromise; // (makes sure this doesn't cause timeout)
}, 'second write should wait for first underlying write to complete');
Does this fix the problem on your machine?
| Assignee | ||
Comment 8•3 years ago
•
|
||
If that works, I'd like to refactor the tests with makePromiseAndResolveFunc which I just copied from web-locks/resources/helpers.js.
Edit: It seems it works https://treeherder.mozilla.org/jobs?repo=try&revision=c3ae797419a3c90b20d1484821fc03966202104f
Comment 9•3 years ago
|
||
Locally that definitely fixes this one for me.
(In reply to Kagami :saschanaz from comment #6)
Oh thanks, that does look problematic. Generally anything with
delay(0)seems harmful in streams tests, since there's no guarantee that it will actually wait until all the messaging finishes.
So that's a question I have for someone who know more about this than I: Are there any guarantees about messaging that have to be enforced? Or is it totally legal for messages to sit in a queue for an arbitrary amount of time (spins of the event loop/microtask queues) before being delivered? So long as that's the case, strongly agree these tests need to be refactored.
That said, I still can't repro this even with your modification 🧐
To reproduce I commented out everything else in streams/transferable/writable-stream.html, and then ran ./mach wpt --headless streams/transferable/writable-stream.html --verify
| Assignee | ||
Comment 10•3 years ago
|
||
Are there any guarantees about messaging that have to be enforced? Or is it totally legal for messages to sit in a queue for an arbitrary amount of time (spins of the event loop/microtask queues) before being delivered? So long as that's the case, strongly agree these tests need to be refactored.
Messages uses different event queue than the setTimeout does:
- https://html.spec.whatwg.org/multipage/web-messaging.html#posted-message-task-source
- https://html.spec.whatwg.org/multipage/timers-and-user-prompts.html#timer-task-source
Those two are independent, so there's no guarantee that any given setTimeout will fire after a given postMessage task.
To reproduce I commented out everything else in streams/transferable/writable-stream.html, and then ran ./mach wpt --headless streams/transferable/writable-stream.html --verify
That doesn't work either, and the TVw CI job doesn't fail on Windows either, so I guess something is different on Windows. 🤷♀️
Comment 11•3 years ago
|
||
(In reply to Kagami :saschanaz from comment #10)
Messages uses different event queue than the setTimeout does:
- https://html.spec.whatwg.org/multipage/web-messaging.html#posted-message-task-source
- https://html.spec.whatwg.org/multipage/timers-and-user-prompts.html#timer-task-source
Those two are independent, so there's no guarantee that any given setTimeout will fire after a given postMessage task.
That seems like pretty conclusive evidence these tests need to be refactored. Thank you for the explanation!
That doesn't work either, and the TVw CI job doesn't fail on Windows either, so I guess something is different on Windows. 🤷♀️
Ah; I was testing on macOS.
| Assignee | ||
Comment 12•3 years ago
|
||
postMessage is independent from setTimeout and thus delay(0) cannot gauarantee the task order.
Updated•3 years ago
|
Comment 13•3 years ago
|
||
Comment 15•3 years ago
|
||
| bugherder | ||
Updated•3 years ago
|
| Comment hidden (Intermittent Failures Robot) |
Description
•