Closed Bug 1062367 Opened 10 years ago Closed 10 years ago

Intermittent test_bug883784.xul | Exception wanted - Result logged after SimpleTest.finish()

Categories

(Core :: DOM: Workers, defect)

x86_64
macOS
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla35
Tracking Status
firefox33 --- wontfix
firefox34 --- fixed
firefox35 --- fixed
firefox-esr31 --- wontfix

People

(Reporter: RyanVM, Assigned: martijn.martijn)

References

()

Details

(Keywords: intermittent-failure)

Attachments

(1 file, 3 obsolete files)

https://tbpl.mozilla.org/php/getParsedLog.php?id=47307447&tree=B2g-Inbound

Rev4 MacOSX Snow Leopard 10.6 b2g-inbound opt test mochitest-other on 2014-09-03 03:53:57 PDT for push cfd71a5a178e
slave: t-snow-r4-0143

03:59:15     INFO -  997 INFO TEST-START | chrome://mochitests/content/chrome/dom/workers/test/test_bug883784.xul
03:59:15     INFO -  998 INFO TEST-OK | chrome://mochitests/content/chrome/dom/workers/test/test_bug883784.xul | took 96ms
03:59:15     INFO -  999 INFO TEST-UNEXPECTED-FAIL | chrome://mochitests/content/chrome/dom/workers/test/test_bug883784.xul | Exception wanted - Result logged after SimpleTest.finish()
I tried to fix this in bug 1032878.
Apparently, event.data.type == 'status' can fire later than event.data.type == 'url' sometimes:
http://hg.mozilla.org/mozilla-central/rev/2f4a23d34287#l11.2
Ok, the fix for bug 1032878 is wrong.
postMessage({type: 'url', url: url}); comes before postMessage({type: 'status', status: status, msg: 'Exception wanted' }); and postMessage({type: 'finish' });.
But because of the async xmlhttprequest, the ok(true, "Blob readable!"); call is normally later than when former messages are handled, but not always (as what this bug is about).
I guess the test should track if the one or the other is already called.

Relevant files:
http://mxr.mozilla.org/mozilla-central/source/dom/workers/test/test_bug883784.jsm
http://mxr.mozilla.org/mozilla-central/source/dom/workers/test/jsm_url_worker.js
Assignee: nobody → martijn.martijn
Attached patch 1062367.diff (obsolete) — Splinter Review
This should fix it.
Attachment #8484112 - Flags: review?(amarchesini)
Attached patch 1062367.diff (obsolete) — Splinter Review
Pushed this to try: https://tbpl.mozilla.org/?tree=Try&rev=443c08f476fc
Attachment #8484112 - Attachment is obsolete: true
Attachment #8484112 - Flags: review?(amarchesini)
Attachment #8485730 - Flags: review?(amarchesini)
Comment on attachment 8485730 [details] [diff] [review]
1062367.diff

Review of attachment 8485730 [details] [diff] [review]:
-----------------------------------------------------------------

I hope the comment is understandable :)

::: dom/workers/test/test_bug883784.jsm
@@ +12,5 @@
>          xhr.open('GET', event.data.url, false);
>          xhr.onreadystatechange = function() {
>            if (xhr.readyState == 4) {
>              ok(true, "Blob readable!");
> +            setTimeout(finish, 0);

Why should this fix the issue It seems that we receive a 'status' message (line 8) after this finish() call.

So actually the issue is this:
in jsm_url_worker.js we do:

  postMessage({type: 'url', url: url});

and then, after,after a while:

  postMessage({type: 'status', status: status, msg: 'Exception wanted' });
  postMessage({type: 'finish' });

but when the 'url' is received by the main-thread we do an XHR and onreadystatechange, readyState == 4, we finish the test.
It can happen that the status "Exception wanted" comes after this finish().

What we should really do is:
1. remove the finish() when readyState is == 4.
2. add: if(event.data.type == 'finish') { finish() } in onmessage.
Attachment #8485730 - Flags: review?(amarchesini) → review-
(In reply to Andrea Marchesini (:baku) from comment #7)
> Comment on attachment 8485730 [details] [diff] [review]
> 1062367.diff
> 
> Review of attachment 8485730 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I hope the comment is understandable :)
> 
> ::: dom/workers/test/test_bug883784.jsm
> @@ +12,5 @@
> >          xhr.open('GET', event.data.url, false);
> >          xhr.onreadystatechange = function() {
> >            if (xhr.readyState == 4) {
> >              ok(true, "Blob readable!");
> > +            setTimeout(finish, 0);
> 
> Why should this fix the issue It seems that we receive a 'status' message
> (line 8) after this finish() call.

I think this should fix it, because this is the (seldom) case where this xhr runs before the 'status' message.

These status messages are all running after each other: http://mxr.mozilla.org/mozilla-central/source/dom/workers/test/jsm_url_worker.js
So adding a timeout would make sure that the xhr is finished after these messages.

> What we should really do is:
> 1. remove the finish() when readyState is == 4.
> 2. add: if(event.data.type == 'finish') { finish() } in onmessage.

Then I will get the failures back that I saw as part of bug 1032878, because xhr is then finishing and posting results after the SimpleTest.finish() call.
Sorry, I should have let you review this part of the patch that was part of that bug.
As an alternative, I could back out this part of the patch for bug 1032878. Then keep track if xhr has been run or not with the "if(event.data.type == 'finish') { finish() }" in onmessage call and then finish or not in there or in the xhr call.
What about if we move the 'url' as last test and all the status messages before that?
I'm saying: remove http://mxr.mozilla.org/mozilla-central/source/dom/workers/test/jsm_url_worker.js#81
and move http://mxr.mozilla.org/mozilla-central/source/dom/workers/test/jsm_url_worker.js#79 before line 70.

This should fix the issue. What do you think?
(In reply to Andrea Marchesini (:baku) from comment #9)
> What about if we move the 'url' as last test and all the status messages
> before that?
> I'm saying: remove
> http://mxr.mozilla.org/mozilla-central/source/dom/workers/test/
> jsm_url_worker.js#81

This will cause dom/workers/test/test_url.xul to timeout, because it will miss the 'finish' call.
But removing this:
> http://mxr.mozilla.org/mozilla-central/source/dom/workers/test/
> jsm_url_worker.js#81
is not necessary to fix this issue.

Just moving move http://mxr.mozilla.org/mozilla-central/source/dom/workers/test/jsm_url_worker.js#79 before line 70 should be enough.
Attached patch 1062367_v2.diff (obsolete) — Splinter Review
Sorry for the delay, did the suggestion in comment 11.
Pushed to try: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=f98e3414765e
Attachment #8485730 - Attachment is obsolete: true
Attachment #8500319 - Flags: review?(amarchesini)
Comment on attachment 8500319 [details] [diff] [review]
1062367_v2.diff

Review of attachment 8500319 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/workers/test/jsm_url_worker.js
@@ +75,5 @@
>      URL.createObjectURL(new Object());
>    } catch(e) {
>      status = true;
>    }
>  

the reason why this postmessage() was here is because we want to check 'status' after this try/catch.
If you move the postMessage() to line 70 you are testing something else.
Attachment #8500319 - Flags: review?(amarchesini) → review-
What I meant in comment 9 is:

1. remove postMessage({type: 'finish' });
2. move postMessage({type: 'url', url: url}); after:
   postMessage({type: 'status', status: status, msg: 'Exception wanted' });

Because the issue is that we have 2 finish() calls. THe first one happens when we do
postMessage({type: finish'); and the second one is when, on the main-thread we do: onreadystatechange...

We have to remove one of them to have a clean workflow.
(In reply to Andrea Marchesini (:baku) from comment #29)
> What I meant in comment 9 is:
> 
> 1. remove postMessage({type: 'finish' });

I can't do that, see comment 10.
Anyway, the postMessage({type: finish'); thing doesn't do anything in test_bug883784.jsm, because I removed that part in bug 1032878.
(In reply to Andrea Marchesini (:baku) from comment #29)
> Because the issue is that we have 2 finish() calls. THe first one happens
> when we do
> postMessage({type: finish'); and the second one is when, on the main-thread
> we do: onreadystatechange...
> 
> We have to remove one of them to have a clean workflow.

That's not true. The postMessage({type: finish'); call doesn't trigger a finish() anymore in test_bug883784.jsm as part of bug 1032878.

I can do option 2 in comment 29, but you don't seem to be satisfied with that?
An alternative would be to revert the changes of bug 1032878 and then to have some mechanism that checks if all the results have been fired before eventually calling finish().
Flags: needinfo?(amarchesini)
> I can do option 2 in comment 29, but you don't seem to be satisfied with
> that?

I'm totally fine with this.
Flags: needinfo?(amarchesini)
Attached patch 1062367_v3.diffSplinter Review
Like this then, pushed to try: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=29a51e5c1b30
Attachment #8500319 - Attachment is obsolete: true
Attachment #8500537 - Flags: review?(amarchesini)
Comment on attachment 8500537 [details] [diff] [review]
1062367_v3.diff

Review of attachment 8500537 [details] [diff] [review]:
-----------------------------------------------------------------

lgtm. Have you try to run it on tbpl several times?
Attachment #8500537 - Flags: review?(amarchesini) → review+
I've retriggered it on windows 7 debug something like 10 times now on treeherder, since most of the failures seemed to have come from there, lately.
All retriggers are green.
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/963d24ebf03b
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla35
You need to log in before you can comment on or make changes to this bug.