Bug 833144 (Closed) · Opened 11 years ago · Closed 11 years ago

Update basic gUM mochitests to not use setTimeout() and other fixes

Categories: Core :: WebRTC (defect)
Severity: normal
Status: RESOLVED INVALID

People: Reporter: whimboo; Assigned: whimboo
Whiteboard: [getUserMedia]
Attachments: 1 obsolete file

As we have seen in the past, the custom timeout values for the canplaythrough and timeupdate events can cause us trouble on slower machines. Furthermore, as I was told, a test should never use setTimeout(); it always indicates there is a problem.

So let's remove this timeout and rely on the general mochitest timeout, as all the other tests do. Besides that, I also fixed the MediaStreamPlayback object by adding prototype definitions and cleaning up the code.

I'm still not sure about the LocalMediaStream check at the beginning of canplaythrough. I strongly believe we should remove it, but I will tackle that later on bug 831789.
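For illustration, here is a rough sketch of the shape such a change could take. The class and method names mirror the MediaStreamPlayback helper the comment mentions, but the code below is a hypothetical reconstruction, not the actual patch:

```javascript
// Hypothetical sketch: methods moved onto the prototype, and an event
// wait with no local setTimeout, leaving any hang to the harness's
// global timeout. MediaStreamPlayback and startMedia are stand-ins for
// the real helper, not its actual implementation.
function MediaStreamPlayback(mediaElement) {
  this.mediaElement = mediaElement;
}

MediaStreamPlayback.prototype = {
  // Invoke the callback once the element can play through. There is no
  // local timer, so a missing event is caught by the harness timeout
  // rather than by a test-specific one.
  startMedia: function (onCanPlayThrough) {
    var element = this.mediaElement;
    element.addEventListener("canplaythrough", function handler() {
      element.removeEventListener("canplaythrough", handler);
      onCanPlayThrough();
    });
  }
};
```

In a mochitest the callback would run the subsequent checks; for a quick local experiment a plain EventTarget can stand in for the media element.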
Attached patch Patch v1 (obsolete) — Splinter Review
Attachment #704724 - Flags: review?(roc)
Invalid. The timeouts were introduced to encode an expectation of how long we think it is acceptable for a test step to run, and to ensure that we clean up effectively if we fail. Relying on the mochitest timeout is definitely not the right direction, as it doesn't follow the control flow of what we need to clean up when we time out.
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → INVALID
Attachment #704724 - Attachment is obsolete: true
Attachment #704724 - Flags: review?(roc)
I'd also prefer that in the future you ask why someone made a design decision instead of modifying code they've written without thinking it through.
(In reply to Henrik Skupin (:whimboo) from comment #0)
> As we have seen in the past, the custom timeout value for canplaythrough and
> timeupdate events can cause us trouble if the machines are slower. Further
> as I was told a test should never use setTimeout. This always indicates
> there is a problem.

True, we ran into trouble with a timeout value before, but we're not hitting that anymore. You'll eventually experience that with anything that requires a timeout.

And by the way, it's wrong that a test should never use setTimeout. Our entire Gaia UI Tests suite uses it so that, in case something has gone wrong and we need to give up, we can fail safely and clean up.

> 
> So lets get his timeout removed and by using the general timeout for
> mochitests, as what all the other tests are using. Beside I also fixed the
> MediaStreamPlayback object by adding prototype definitions, and enhancing
> the code.

See the comments above. Removing that is a bad idea if we want to ensure that we clean up correctly.

> 
> I'm still not sure about the LocalMediaStream check at the beginning of
> canplaythrough. I strongly believe we should remove that but I will tackle
> that later on bug 831789.

And that's also wrong. The instanceof check was added as a contract test against the interface. It's a perfectly acceptable test; other WebAPIs have used it to ensure that their interface contract holds.

Once again, I strongly suggest next time you actually talk to the person who built the code before making incorrect assumptions. Thanks.
(In reply to Jason Smith [:jsmith] from comment #2)
> Invalid. The timeouts were introduced to control an expectation of how long
> we think it is acceptable for a test to run within reason and to ensure that
> we do cleanup effectively if we fail. Relying on mochitests timeout is
> definitely not the right direction to go, as that doesn't follow the control
> flow of what we intend to need to cleanup if we timeout.

Really, I'm not sure why you think it's invalid. Please read my comments and don't immediately close bugs as invalid. I gave clear reasoning for this change; I discussed it with Roc on IRC and we both agreed on it. We can ask other developers how to proceed here if you want, but please refrain from closing this bug again.

One more thing: if such an event never fires, it's a failure. Hitting the global timeout doesn't require us to run any cleanup code in the failure case.

(In reply to Jason Smith [:jsmith] from comment #4)
> True that we ran into trouble with a timeout amount, but we're not hitting
> that anymore. Alas you experience that with anything that requires a timeout
> eventually.

Because the test machines are not that slow, which means we have been fine so far.

> And that's wrong btw that a test should never use setTimeout. Our entire
> Gaia UI Tests suite uses that because in case we need to "give up," because
> has something has happened, fail safely, and cleanup.

It's a different test framework and you can't compare them. I also believe that using setTimeout() in Marionette is not a good call; I'm not sure who started making use of it.

> > I'm still not sure about the LocalMediaStream check at the beginning of
> > canplaythrough. I strongly believe we should remove that but I will tackle
> > that later on bug 831789.
> 
> And that's also wrong. The instanceof check was added because it's a
> contract test with the interface. It's perfectly acceptable test other
> WebAPIs have used to ensure that their interface contract is held.

In that case it should check for a MediaStream, which makes no difference for the contract. The reason is that I want to re-use this class for the peer connection tests, but I can't given the strict binding to LocalMediaStream.
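A minimal sketch of the relaxation being argued for here, with plain classes standing in for the real DOM interfaces (the names and the helper `checkStreamContract` are illustrative, not code from the patch):

```javascript
// Hypothetical sketch: LocalMediaStream subclasses MediaStream, so a
// contract check against MediaStream still accepts gUM streams while
// also accepting remote streams from a peer connection. These classes
// are stand-ins for the real DOM interfaces.
class MediaStream {}
class LocalMediaStream extends MediaStream {}

function checkStreamContract(stream) {
  // instanceof MediaStream matches both local and remote streams.
  if (!(stream instanceof MediaStream)) {
    throw new Error("Expected a MediaStream instance");
  }
}
```

Under this sketch the helper could be shared by the gUM tests and the peer connection tests, while a check pinned to LocalMediaStream would reject remote streams.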

> Once again, I strongly suggest next time you actually talk to the person who
> built the code before making incorrect assumptions. Thanks.

I would suggest you be more careful with comments like those on Bugzilla. It's not the first time this has happened. Please be constructive and cooperative when it comes to improvement requests. We all want WebRTC to succeed, and that means we have to work together. Thanks.
Status: RESOLVED → REOPENED
Resolution: INVALID → ---
Comment on attachment 704724 [details] [diff] [review]
Patch v1

I have asked Roc for review and I still want that. Furthermore, I am adding Randell to the list of reviewers.
Attachment #704724 - Attachment is obsolete: false
Attachment #704724 - Flags: review?(roc)
Attachment #704724 - Flags: review?(rjesup)
Comment on attachment 704724 [details] [diff] [review]
Patch v1

I actually meant it when I said I'm taking the patch.
Attachment #704724 - Attachment is obsolete: true
Attachment #704724 - Flags: review?(roc)
Attachment #704724 - Flags: review?(rjesup)
As owner of the code, I'm not taking the patch. Invalid.
Status: REOPENED → RESOLVED
Closed: 11 years ago11 years ago
Resolution: --- → INVALID
(In reply to Henrik Skupin (:whimboo) from comment #5)
> (In reply to Jason Smith [:jsmith] from comment #2)
> > Invalid. The timeouts were introduced to control an expectation of how long
> > we think it is acceptable for a test to run within reason and to ensure that
> > we do cleanup effectively if we fail. Relying on mochitests timeout is
> > definitely not the right direction to go, as that doesn't follow the control
> > flow of what we intend to need to cleanup if we timeout.
> 
> Really, I'm not sure why you think it's invalid. Please read my comments and
> don't always start to close bugs as invalid right away. I gave a clear
> reasoning for this change. I discussed that with Roc on IRC and we both
> agreed on that. We can ask other developers if you want how we should
> proceed here, but please refrain from closing this bug again.
> 
> Just one more thing, if ever such an event is not getting triggered, it's a
> failure. Running into the global timeout doesn't require us to run any
> cleanup code in a failure case.

True, but maybe I'm intending to do something in a future test where I want the timeout to occur? That will eventually become the case when Ted lands his work, where we actually expect a timeout for dead streams.

> 
> (In reply to Jason Smith [:jsmith] from comment #4)
> > True that we ran into trouble with a timeout amount, but we're not hitting
> > that anymore. Alas you experience that with anything that requires a timeout
> > eventually.
> 
> Because the test machines are not that slow. Means we were happy so far.
> 
> > And that's wrong btw that a test should never use setTimeout. Our entire
> > Gaia UI Tests suite uses that because in case we need to "give up," because
> > has something has happened, fail safely, and cleanup.
> 
> It's a different test framework and you can't compare that. Even I believe
> that using setTimeout() in Marionette is not a good call. Not sure who
> started to make use of that.

I really strongly disagree here. And yes, you can compare them.

> 
> > > I'm still not sure about the LocalMediaStream check at the beginning of
> > > canplaythrough. I strongly believe we should remove that but I will tackle
> > > that later on bug 831789.
> > 
> > And that's also wrong. The instanceof check was added because it's a
> > contract test with the interface. It's perfectly acceptable test other
> > WebAPIs have used to ensure that their interface contract is held.
> 
> In that case it should check for a MediaStream which will make no
> difference. Reason is that I want to re-use this class for the peer
> connection tests but I can't given that strict binding to a LocalMediaStream.

No, checking for LocalMediaStream was intentional, because we previously had a bug where we got the wrong object type back from the gUM call. So it entirely makes sense to have that.

I'd refactor to pull out whatever is needed for your specific needs.

> 
> > Once again, I strongly suggest next time you actually talk to the person who
> > built the code before making incorrect assumptions. Thanks.
> 
> I would suggest you to be more careful with comments like those on Bugzilla.
> It's not the first time it happened. So please be constructive and
> cooperative when it comes to improvement requests. We all want to be
> successful in WebRTC and that means we have to work together towards it.
> Thanks.

In this case, it's not an improvement when the person writing the tests intends to extend them to distinguish "intentional" timeouts from "unintentional" timeouts. So yes, you've made a very poor assumption in deciding not to talk to the person who wrote the code and is extending it. I'll definitely be having a chat with Clint about this.
Here's the original reason why I had explicit timeouts:

To have a clear cleanup path when an event fails to fire: it means reporting a clear, accurate test failure message ("X event failed to fire") and cleaning things up in case the test would otherwise continue after the call (more logic happens afterwards). It was defensive, in case that consideration came up. I didn't feel that relying on mochitest's generic error reporting was useful here, as I'd get a general timeout error but no indication of *what* failed. This gives a clearer direction as to why something failed. Granted, it's a style choice, but I'd prefer to be more explicit in case we need to debug directly.
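The explicit-timeout style described above could be sketched roughly as follows; `waitForEventOrFail` and its parameters are illustrative names, not the actual helper from these tests:

```javascript
// Hypothetical sketch of the explicit-timeout style: a local setTimeout
// that reports *which* event failed to fire and runs the failure path,
// instead of waiting for the harness's generic global timeout.
function waitForEventOrFail(target, type, timeoutMs, onPass, onFail) {
  function handler() {
    // The event arrived in time: cancel the failure timer and continue.
    clearTimeout(timer);
    target.removeEventListener(type, handler);
    onPass();
  }
  const timer = setTimeout(() => {
    // The event never fired: detach and report a specific failure.
    target.removeEventListener(type, handler);
    onFail(type + " event failed to fire within " + timeoutMs + "ms");
  }, timeoutMs);
  target.addEventListener(type, handler);
}
```

In a mochitest, `onFail` would call ok(false, message) and then run cleanup; the trade-off against the harness timeout is exactly the style question this bug debates.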
Everybody stay calm! Nothing here is worth getting excited about IMHO :-).

As for process: As original reviewer of these tests, I think it would have been fine and normal for me to review this patch, and it wasn't wrong for Henrik to write a patch and ask me to review it. However, it is also reasonable for Jason, as original author, to add his own r-.

As for the merits of the patch: In a lot of other cases we have switched from a test-specific timeout to relying on the general mochitest timeout, to avoid false positives on slow machines. IMHO it wouldn't be wrong to do so here, but I appreciate what Jason said in comment #10 that it can be easier to debug with a local timeout. So let's keep doing that.

(In reply to Jason Smith [:jsmith] from comment #9)
> True, but that maybe be intending to do something in future test when I want
> the timeout to occur? Eventually that will become the case when Ted lands
> his stuff when we actually expecting a timeout for dead streams.

I'm not sure what that is, but using a timeout to detect a success behavior of "nothing happened" is a method of last resort, since it makes for slow tests. Sometimes we have no alternative I guess.
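The "timeout as success" pattern being cautioned against could look roughly like this; `expectNoEvent` is an illustrative name, not an existing helper:

```javascript
// Hypothetical sketch of using a timeout to assert that "nothing
// happened": pass only if no event of the given type fires during a
// quiet window. Slow by construction, since a passing test must always
// wait out the full window.
function expectNoEvent(target, type, windowMs, onPass, onFail) {
  function handler() {
    // An event arrived during the quiet window: that is the failure.
    clearTimeout(timer);
    target.removeEventListener(type, handler);
    onFail(type + " fired during the quiet window");
  }
  const timer = setTimeout(() => {
    // The window elapsed with no event: that is the success.
    target.removeEventListener(type, handler);
    onPass();
  }, windowMs);
  target.addEventListener(type, handler);
}
```

This makes the cost concrete: a suite full of such checks spends `windowMs` of wall-clock time per passing assertion, which is why it is called a method of last resort above.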
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #11)
> As for the merits of the patch: In a lot of other cases we have switched
> from a test-specific timeout to relying on the general mochitest timeout, to
> avoid false positives on slow machines. IMHO it wouldn't be wrong to do so
> here, but I appreciate what Jason said in comment #10 that it can be easier
> to debug with a local timeout. So let's keep doing that.

Sure, I was also thinking about that, which is why I put an info() line in the code so that you know what you are waiting for. With that change you would even be safer in local debugging: if for unforeseen reasons a step took 11s, the test would not abort after a 10s local timeout. Here is an example output:

 0:10.88 20 INFO TEST-PASS | /tests/dom/media/tests/mochitest/test_getUserMedia_basicVideo.html | Before starting the media element, currentTime = 0
 0:10.88 21 INFO TEST-INFO | /tests/dom/media/tests/mochitest/test_getUserMedia_basicVideo.html | Started playing media stream.
[..]
 0:10.90 31 INFO TEST-PASS | /tests/dom/media/tests/mochitest/test_getUserMedia_basicVideo.html | No src is defined
 0:10.90 32 INFO TEST-PASS | /tests/dom/media/tests/mochitest/test_getUserMedia_basicVideo.html | Start offset time is not a number
 0:10.90 33 INFO TEST-INFO | /tests/dom/media/tests/mochitest/test_getUserMedia_basicVideo.html | Started waiting for timeupdate events
 0:11.16 34 INFO TEST-END | /tests/dom/media/tests/mochitest/test_getUserMedia_basicVideo.html | finished in 573ms

You will be fully aware of which action we were waiting for before the timeout happened. It carries neither less nor more information, but it's simply more stable.
Just for reference, here is the try server run I submitted yesterday:
https://tbpl.mozilla.org/?tree=Try&rev=0780d3a0d1e9
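The logging approach described above could be sketched like this. info() is the real mochitest logger, but it is stubbed locally here so the sketch is self-contained, and `waitFor` is an illustrative name:

```javascript
// Hypothetical sketch: announce the awaited action before waiting, so
// even a harness-level timeout leaves a log line showing what the test
// was waiting for. The testLog array and info() stub stand in for the
// mochitest harness logger.
const testLog = [];
function info(msg) { testLog.push("TEST-INFO | " + msg); }

function waitFor(target, type, onDone) {
  // Log first: if the event never fires, this is the last line in the
  // log before the global timeout, pinpointing the stuck step.
  info("Started waiting for " + type + " events");
  target.addEventListener(type, function handler() {
    target.removeEventListener(type, handler);
    onDone();
  });
}
```

This is how the "Started waiting for timeupdate events" line in the sample output above would be produced.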
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #11)
> (In reply to Jason Smith [:jsmith] from comment #9)
> > True, but that maybe be intending to do something in future test when I want
> > the timeout to occur? Eventually that will become the case when Ted lands
> > his stuff when we actually expecting a timeout for dead streams.
> 
> I'm not sure what that is, but using a timeout to detect a success behavior
> of "nothing happened" is a method of last resort, since it makes for slow
> tests. Sometimes we have no alternative I guess.

Same here. A bug reference would be appreciated when making such claims. I haven't heard about this yet, so more information is welcome.
(In reply to Henrik Skupin (:whimboo) from comment #12)
> (In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #11)
> > As for the merits of the patch: In a lot of other cases we have switched
> > from a test-specific timeout to relying on the general mochitest timeout, to
> > avoid false positives on slow machines. IMHO it wouldn't be wrong to do so
> > here, but I appreciate what Jason said in comment #10 that it can be easier
> > to debug with a local timeout. So let's keep doing that.
> 
> Sure. I was also thinking about that. And that's why I put an info() line in
> the code so that you know what you are waiting for. With that change you
> would even be more safe in local debugging, given that due to unforeseen
> reasons the test would not abort after those 10s if it would take 11s. Here
> an example output:
> 
>  0:10.88 20 INFO TEST-PASS |
> /tests/dom/media/tests/mochitest/test_getUserMedia_basicVideo.html | Before
> starting the media element, currentTime = 0
>  0:10.88 21 INFO TEST-INFO |
> /tests/dom/media/tests/mochitest/test_getUserMedia_basicVideo.html | Started
> playing media stream.
> [..]
> 0:10.90 31 INFO TEST-PASS |
> /tests/dom/media/tests/mochitest/test_getUserMedia_basicVideo.html | No src
> is defined
>  0:10.90 32 INFO TEST-PASS |
> /tests/dom/media/tests/mochitest/test_getUserMedia_basicVideo.html | Start
> offset time is not a number
>  0:10.90 33 INFO TEST-INFO |
> /tests/dom/media/tests/mochitest/test_getUserMedia_basicVideo.html | Started
> waiting for timeupdate events
>  0:11.16 34 INFO TEST-END |
> /tests/dom/media/tests/mochitest/test_getUserMedia_basicVideo.html |
> finished in 573ms
> 
> You will be totally aware about which action we are waiting for before the
> timeout happened. It doesn't have less or more information but it's simply
> more stable.

Henrik, there's a difference between a style choice and a bug. You are arguing for a style choice right now, not a bug fix. My approach reports when the timeout fires and cleans up, making the failure path explicit. Your approach would report an info line that something started, but if mochitest runs out of time, we simply time out. However, this risks a bug like the following:

If, for example, a timeupdate event keeps firing on the media stream but a bug means the time on the stream never changes, you would effectively wait for a very long period, whereas my approach fails quicker. The same goes for the canplaythrough event never firing: the test would run longer rather than end quickly with an appropriate error.

I don't see any argument here that the suggested approach is more stable; it's a style choice with no stability benefit on either side. The tests already written operate under that rule, and I don't plan on changing it unless there's evidence of problems.
I should clarify after writing that: I meant to say in comment 15 that my approach fails fast so that we don't have long-running tests. That's not necessarily about a bug, but a design choice.
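The failure mode described in comment 15 (timeupdate firing while currentTime never advances) can be made concrete with a sketch; the element stub and `waitForTimeProgress` name are illustrative, not the real test code:

```javascript
// Hypothetical sketch: timeupdate may keep firing while currentTime is
// frozen, so a local deadline converts a silent spin into a fast,
// labeled failure. FakeMediaElement stands in for a real media element.
class FakeMediaElement extends EventTarget {
  constructor() {
    super();
    this.currentTime = 0;
  }
}

function waitForTimeProgress(element, startTime, deadlineMs, onPass, onFail) {
  function handler() {
    // Only count real progress; a timeupdate with a frozen clock keeps
    // us waiting until the deadline instead of passing spuriously.
    if (element.currentTime > startTime) {
      clearTimeout(timer);
      element.removeEventListener("timeupdate", handler);
      onPass();
    }
  }
  const timer = setTimeout(() => {
    element.removeEventListener("timeupdate", handler);
    onFail("currentTime never advanced past " + startTime);
  }, deadlineMs);
  element.addEventListener("timeupdate", handler);
}
```

Without the local deadline, the frozen-clock case would only be caught by the harness's global timeout, which is the "fails quicker" trade-off argued in comments 15 and 16.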
(In reply to Henrik Skupin (:whimboo) from comment #14)
> (In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #11)
> > (In reply to Jason Smith [:jsmith] from comment #9)
> > > True, but that maybe be intending to do something in future test when I want
> > > the timeout to occur? Eventually that will become the case when Ted lands
> > > his stuff when we actually expecting a timeout for dead streams.
> > 
> > I'm not sure what that is, but using a timeout to detect a success behavior
> > of "nothing happened" is a method of last resort, since it makes for slow
> > tests. Sometimes we have no alternative I guess.
> 
> Same here. A bug reference would be kinda appreciated when making such
> claims. I haven't heard about this yet. So more information is welcome.

A bug hasn't been filed yet, but the point is that there are a bunch of tests that can be added after the smoke tests and BFTs are in. There will definitely be tests along the lines of "start playing the content, kill the camera stream, verify that the appropriate events are no longer firing after X seconds." Obviously there's room for debate on the approach, but that's for a separate bug.

I do think that arguing on this bug adds no value if the majority of the proposed changes are a style choice. Please find another way to implement your patch in a new bug that actually focuses on the problem that needs to be solved, not the style piece.
No longer blocks: 831789