Closed
Bug 1224825
Opened 9 years ago
Closed 9 years ago
Race condition in MessagePort::close
Categories
(Core :: DOM: Core & HTML, defect)
Core
DOM: Core & HTML
Tracking
()
RESOLVED
FIXED
mozilla45
Tracking | Status | |
---|---|---|
firefox45 | --- | fixed |
People
(Reporter: baku, Assigned: baku)
References
Details
Attachments
(2 files, 8 obsolete files)
4.18 KB,
patch
|
Details | Diff | Splinter Review | |
26.64 KB,
patch
|
Details | Diff | Splinter Review |
var mc = new MessageChannel(); worker.postMessage('', [mc.port2]);
for(var i = 0; i < 100; ++i) mc.port1.postMessage(i);
mc.port1.close();
There are 2 issues here:
1. Our MessagePort code can produce a race condition with this code because when we call Close(), we send a message to the MessagePortParent in order to close the channel between the 2 ports. But, if the worker is too slow, it will not receive the 100 messages.
2. when we call Close(), if there is a pending queue of messages, we should wait until it's fully sent to PBackground.
Attachment #8687574 -
Flags: review?(bugs)
Comment 1•9 years ago
|
||
Comment on attachment 8687574 [details] [diff] [review]
foo.patch
>@@ -262,16 +283,30 @@ MessagePortService::CloseAll(const nsID&
> data->mParent->Close();
> }
>
> for (const MessagePortServiceData::NextParent& parent : data->mNextParents) {
> parent.mParent->CloseAndDelete();
> }
>
> nsID destinationUUID = data->mDestinationUUID;
>+
>+ // If we have informations about the other port and that port has some
>+ // pending messages to deliver but the parent has not processed them yet,
>+ // because its entangling request didn't arrive yet), we cannot close this
>+ // channel.
>+ MessagePortServiceData* destinationData;
>+ if (mPorts.Get(destinationUUID, &destinationData) &&
>+ !destinationData->mMessages.IsEmpty() &&
>+ destinationData->mWaitingForNextParent) {
I don't understand this.
What is mWaitingForNextParent about here?
>+ MOZ_ASSERT(!destinationData->mNextStepCloseAll);
>+ destinationData->mNextStepCloseAll = true;
and why if we're waiting for next parent we should close something?
I think I'm missing to understand the meaning of the variables here.
>+ var MAX = 100;
>+
>+ function runTest() {
>+ var worker = new Worker('data:javascript,onmessage = function(e) { e.ports[0].onmessage = function(evt) { postMessage(evt.data);}}');
>+
>+ var count = 0;
>+ worker.onmessage = function(e) {
>+ is(e.data, count, "Correct value expected!");
>+ if (++count == MAX) {
>+ info("All the messages correctly received");
>+ SimpleTest.finish();
>+ }
>+ }
>+
>+ var mc = new MessageChannel();
>+ worker.postMessage(42, [mc.port2]);
>+
>+ for (var i = 0; i < MAX; ++i) {
>+ mc.port1.postMessage(i);
>+ }
>+
>+ mc.port1.close();
Could you add a test where you post a message after .close();
and then change onmessage handler a bit so that it calls finish() asynchronously (using some timeout, say 100ms) and checks that
e.data is never >= MAX.
Attachment #8687574 -
Flags: review?(bugs) → review-
Comment 2•9 years ago
|
||
Hmm, I guess I start to understand the variables...
but don't then understand the removal of if (!data->mNextParents.IsEmpty()) {
Assignee | ||
Comment 3•9 years ago
|
||
(In reply to Olli Pettay [:smaug] from comment #2)
> Hmm, I guess I start to understand the variables...
> but don't then understand the removal of if (!data->mNextParents.IsEmpty()) {
The reason why I removed that is because it doesn't make sense :)
The nextParent can arrive in the future and at that point that check is just misleading.
Assignee | ||
Comment 4•9 years ago
|
||
> >+ MessagePortServiceData* destinationData;
> >+ if (mPorts.Get(destinationUUID, &destinationData) &&
> >+ !destinationData->mMessages.IsEmpty() &&
> >+ destinationData->mWaitingForNextParent) {
> I don't understand this.
> What is mWaitingForNextParent about here?
We set this variable to true when a disentangle request is received but we don't have a valid 'next' parent yet.
The meaning is: eventually a new parent will arrive and will dispatch the messages to it.
(In case something else happens, StructuredClone will call ForceClose).
So, in case a Close() has been called, we must check if have to deliver messages to a parent that doesn't exist yet but eventually will come.
> >+ MOZ_ASSERT(!destinationData->mNextStepCloseAll);
> >+ destinationData->mNextStepCloseAll = true;
> and why if we're waiting for next parent we should close something?
> I think I'm missing to understand the meaning of the variables here.
Because we must deliver the pending messages that are still in the queue.
Otherwise this code:
var mc = new MessageChannel();
somewhere.postMessage(mc.port2, [mc.port2]);
mc.port1.postMessage(42);
mc.port1.close();
will not be able to dispatch '42' to the port2 when it will be received on the other side.
Comment 5•9 years ago
|
||
mWaitingForNextParent is also true by default, and that is what is a bit misleading - at least in my mind.
"next" hints that there is some current or old parent.
Perhaps mWaitingForNewParent?
Assignee | ||
Comment 6•9 years ago
|
||
Attachment #8687574 -
Attachment is obsolete: true
Attachment #8687663 -
Flags: review?(bugs)
Updated•9 years ago
|
Attachment #8687663 -
Flags: review?(bugs) → review+
Comment 8•9 years ago
|
||
Had to back this out together with bug 1221852 for M(3) failures:
https://hg.mozilla.org/integration/mozilla-inbound/rev/4922c78419fa
http://archive.mozilla.org/pub/firefox/tinderbox-builds/mozilla-inbound-win32/1447588173/mozilla-inbound_win7-ix_test-mochitest-3-bm126-tests1-windows-build74.txt.gz
05:02:40 WARNING - TEST-UNEXPECTED-FAIL | dom/security/test/csp/test_upgrade_insecure_reporting.html | application terminated with exit code 1
05:02:40 INFO - runtests.py | Application ran for: 0:01:26.973000
05:02:40 INFO - zombiecheck | Reading PID log: c:\users\cltbld\appdata\local\temp\tmpczcznlpidlog
05:02:40 INFO - mozcrash Downloading symbols from: https://queue.taskcluster.net/v1/task/9IEpt2YhSKiHy2Y4angz2w/artifacts/public/build/firefox-45.0a1.en-US.win32.crashreporter-symbols.zip
05:02:51 INFO - mozcrash Saved minidump as C:\slave\test\build\blobber_upload_dir\4a5d6160-41b7-4840-9ee8-2492525fefb6.dmp
05:02:51 INFO - mozcrash Saved app info as C:\slave\test\build\blobber_upload_dir\4a5d6160-41b7-4840-9ee8-2492525fefb6.extra
05:02:51 WARNING - PROCESS-CRASH | dom/security/test/csp/test_upgrade_insecure_reporting.html | application crashed [@ mozilla::`anonymous namespace'::RunWatchdog(void *)]
Flags: needinfo?(amarchesini)
Assignee | ||
Updated•9 years ago
|
Flags: needinfo?(amarchesini)
Comment 10•9 years ago
|
||
And out again in https://hg.mozilla.org/integration/mozilla-inbound/rev/a63e2f7359f3, RunWatchdog crashes in desktop mochitest-3 running dom/security/test/csp/test_upgrade_insecure_reporting.html, and if my bet is right also Android mochitest hangs.
Assignee | ||
Comment 11•9 years ago
|
||
It turned out that this approach is not what the spec wants.
I added 2 tests and I introduced a new method for SharedWorkers (or any other use of MessagePort): DeliverAndClose().
With this method we mark the port as closed but we finish the delivering of the pending messages. More comments on the patch.
Attachment #8687663 -
Attachment is obsolete: true
Attachment #8687829 -
Flags: review?(bugs)
Assignee | ||
Comment 12•9 years ago
|
||
With this second patch we force the shutting down of the port in case we need it in MessagePortService. The previous code is correct, but it needs some changes when ForceClose() is called.
Attachment #8687830 -
Flags: review?(bugs)
Comment 13•9 years ago
|
||
Comment on attachment 8687829 [details] [diff] [review]
patch 1 - DeliverAndClose
>+function test_closeInBetween() {
>+ var mc = new MessageChannel();
>+
>+ for (var i = 0; i < MAX; ++i) {
>+ mc.port1.postMessage(i);
>+ }
>+
>+ for (var i = 0; i < MAX; ++i) {
>+ mc.port1.postMessage(i + MAX);
>+ }
>+
>+ mc.port2.onmessage = function(e) {
>+ ok (e.data < MAX, "Correct message received!");
>+ if (e.data == MAX - 1) {
>+ mc.port2.close();
What in the spec says this is the correct behavior?
.close() should just disentangle, and I don't see disentangling meaning that other messages shouldn't be processed.
Sure, step 13 in postMessage is totally buggy, but at least it doesn't talk about disentangled ports.
Attachment #8687829 -
Flags: review?(bugs) → review-
Updated•9 years ago
|
Attachment #8687830 -
Flags: review?(bugs) → review+
Assignee | ||
Comment 14•9 years ago
|
||
Attachment #8688104 -
Flags: review?(bugs)
Assignee | ||
Comment 15•9 years ago
|
||
Full patch again, with the removed |if (mClosing)...| and a test for that scenario.
Attachment #8687829 -
Attachment is obsolete: true
Attachment #8688122 -
Flags: review?(bugs)
Updated•9 years ago
|
Attachment #8688104 -
Flags: review?(bugs)
Comment 16•9 years ago
|
||
Comment on attachment 8688122 [details] [diff] [review]
patch 1 - Close
I start to dislike MessagePort API (which forces the implementation to be so complicated)
Attachment #8688122 -
Flags: review?(bugs) → review+
Assignee | ||
Comment 17•9 years ago
|
||
Sorry for the ping-pong of review requests. Here a better approach:
1. We unref the MessagePort only when we don't have pending messages.
2. When the window is close or the worker is down, we call CloseForced. CloseForced flushed the pending messages and it starts the close procedure.
3. Dispatch() has a lot of comments about all the possible scenarios.
Attachment #8688104 -
Attachment is obsolete: true
Attachment #8688122 -
Attachment is obsolete: true
Attachment #8688171 -
Flags: review?(bugs)
Comment 18•9 years ago
|
||
Comment on attachment 8688171 [details] [diff] [review]
patch 1 - Close
case eStateDisentangling: vs case eStateDisentangled:
could use a comment. It is odd that latter state may process messages which earlier state can't. (this is because those aren't quite earlier/latter in all the cases)
Attachment #8688171 -
Flags: review?(bugs) → review+
Assignee | ||
Comment 19•9 years ago
|
||
In this patch I get rid of mNextStep and I introduce 3 new state.
Attachment #8688632 -
Flags: review?(bugs)
Comment 20•9 years ago
|
||
Comment on attachment 8688632 [details] [diff] [review]
patch 1 - interdiff
I guess this makes the code a bit easier to read.
The change is scary. Do we need this on beta too?
Attachment #8688632 -
Flags: review?(bugs) → review+
Assignee | ||
Comment 21•9 years ago
|
||
About beta: without these patches we deliver the messages in a non-consistent way after Close(). I guess it's ok if we skip beta for this code. But I would like to have in aurora.
Comment 22•9 years ago
|
||
But do we need to fix Bug 1221852 for beta?
Assignee | ||
Comment 23•9 years ago
|
||
We can easily write a patch for beta without the test included in patch for bug 1221852. In this way we will not have an intermittent failure.
Comment 24•9 years ago
|
||
Assignee | ||
Comment 25•9 years ago
|
||
Attachment #8687830 -
Attachment is obsolete: true
Attachment #8688171 -
Attachment is obsolete: true
Attachment #8688632 -
Attachment is obsolete: true
Assignee | ||
Comment 26•9 years ago
|
||
Comment 27•9 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/f04ec936e226
https://hg.mozilla.org/mozilla-central/rev/67a3836250de
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla45
Updated•6 years ago
|
Component: DOM → DOM: Core & HTML
You need to log in
before you can comment on or make changes to this bug.
Description
•