Bug 676533 - /restartTests/testDiscoveryPane_UpAndComingModule/test1.js sometime fails due to a TimeoutError
Status: RESOLVED WORKSFORME
[mozmill-test-failure][remote]
Product: Mozilla QA
Classification: Other
Component: Mozmill Tests
Version: unspecified
Platform: All All
Importance: -- normal
Target Milestone: ---
Assigned To: Maniac Vlad Florin (:vladmaniac)
http://mozmill-ci.blargon7.com/#/remo...
Depends on: 677383
Reported: 2011-08-04 07:33 PDT by Alex Lakatos[:AlexLakatos]
Modified: 2012-08-06 23:45 PDT


Attachments
re-enable test patch v1.0 (4.76 KB, patch)
2012-05-21 08:36 PDT, Maniac Vlad Florin (:vladmaniac)
dave.hunt: review-
re-enable test patch v1.1 (4.85 KB, patch)
2012-05-22 05:45 PDT, Maniac Vlad Florin (:vladmaniac)
dave.hunt: review-
re-enable test patch v1.2 (4.85 KB, patch)
2012-05-22 06:50 PDT, Maniac Vlad Florin (:vladmaniac)
hskupin: review+
esr patch v1.0 (5.83 KB, patch)
2012-05-24 01:01 PDT, Maniac Vlad Florin (:vladmaniac)
hskupin: review+

Description Alex Lakatos[:AlexLakatos] 2011-08-04 07:33:33 PDT
http://mozmill-release.brasstacks.mozilla.com/#/remote/report/15a8b41a151956b8d71bd0934b2c3379

TimeoutError("Modal dialog has been found and processed")@resource://mozmill/modules/utils.js:429
waitFor([object Proxy],"Modal dialog has been found and processed",25000,100,[object Proxy])@resource://mozmill/modules/utils.js:467

This happens because the addon download takes longer than the timeout.
Comment 1 Maniac Vlad Florin (:vladmaniac) 2011-08-08 02:07:21 PDT
Would manually increasing the timeout parameter value fix this? Opinions, anyone?
Comment 2 Henrik Skupin (:whimboo) 2011-08-08 02:16:54 PDT
I would like to know why the download takes longer for exactly this test. Do we know which add-on we are trying to download? Is it larger than all the others?
Comment 3 Maniac Vlad Florin (:vladmaniac) 2011-08-08 02:53:20 PDT
This failure has occurred only once. We'll look into it after lunch and report back in another comment. Meanwhile, we don't see the error when running the entire remote testrun.
Comment 4 Maniac Vlad Florin (:vladmaniac) 2011-08-08 05:46:06 PDT
(In reply to Henrik Skupin (:whimboo) from comment #2)
> I would like to know why the download takes longer for exactly this test. Do
> we know which add-on we are trying to download? Has it a larger size as all
> the others?

It seems to be independent of the add-on we install. The test installs a random add-on, so we tried each of the three listed add-ons separately. All pass here.

One add-on, "LastPass password manager", is more than 2 MB in size, while the other two are at most 400 kB each.
Comment 5 Alex Lakatos[:AlexLakatos] 2011-08-08 06:24:11 PDT
Since this is remotely served content, could it have something to do with the CDN?
The Internet connection is a variable as well. If the connection drops below 100 kb/s there is a high chance the timeout is exceeded for the "LastPass" add-on.
I'm also seeing a variable number of seconds pass before the add-on shows any download progress, so that is another variable.
Since we cannot account for all the variables, I think we only have two options here:
1. Increase "TIMEOUT_DOWNLOAD" in each test as we see it start to fail
2. Live with the fact that remote tests fail sometimes (as this one has occasionally)
Comment 6 Anthony Hughes (:ashughes) [GFX][QA][Mentor] 2011-08-08 07:19:00 PDT
What is the current timeout?
Comment 7 Maniac Vlad Florin (:vladmaniac) 2011-08-08 07:20:47 PDT
(In reply to Anthony Hughes (:ashughes) from comment #6)
> What is the current timeout?

It's in the error: "Modal dialog has been found and processed",25000,100,[object Proxy])@resource://mozmill/modules/utils.js:467. The 25000 is the timeout in milliseconds, so 25 seconds. That looks like more than enough, but still...
Comment 8 Anthony Hughes (:ashughes) [GFX][QA][Mentor] 2011-08-08 07:22:56 PDT
Yeah, 25 seconds is quite long; I'm reluctant to bump it.

I'm leaning toward allowing this to fail once in a while, provided the error message reports the download progress (in the case of download timeouts).
Comment 9 Henrik Skupin (:whimboo) 2011-08-08 12:17:57 PDT
Well, we need some kind of fix here, especially if people with a slower connection are running the test. Something we could do is add a waitFor call for the download progress. That's most likely a shared module enhancement. I'm not quite sure yet whether it would work gracefully with the modal dialog, because the last check could happen too late and the dialog blocks the execution. I assume the API we implemented in bug 600052 is now failing because it's a doorhanger?
Comment 10 Anthony Hughes (:ashughes) [GFX][QA][Mentor] 2011-08-08 12:38:17 PDT
What is the maximum file size for add-ons? 1MB @ 10 kb/s takes 100 seconds.
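The arithmetic above can be checked with a short sketch. It assumes that "kb/s" in this thread means kilobytes per second and uses decimal units (1 MB = 1000 kB); the helper name `downloadSeconds` is made up for illustration. The 2 MB and 400 kB sizes and the 25 second timeout are the figures quoted in earlier comments.

```javascript
// Hypothetical helper: seconds needed to transfer sizeKB at rateKBps.
// Assumes kilobytes per second and decimal units (1 MB = 1000 kB).
function downloadSeconds(sizeKB, rateKBps) {
  return sizeKB / rateKBps;
}

const TIMEOUT_SECONDS = 25; // timeout seen in the waitFor call of the stack trace

const cases = [
  { label: "1 MB @ 10 kB/s", sizeKB: 1000, rateKBps: 10 },
  { label: "2 MB @ 100 kB/s", sizeKB: 2000, rateKBps: 100 },
  { label: "400 kB @ 100 kB/s", sizeKB: 400, rateKBps: 100 },
];

for (const c of cases) {
  const seconds = downloadSeconds(c.sizeKB, c.rateKBps);
  const verdict = seconds > TIMEOUT_SECONDS ? "exceeds" : "fits within";
  console.log(`${c.label}: ${seconds} s, ${verdict} the ${TIMEOUT_SECONDS} s timeout`);
}
```

Under these assumptions, the 2 MB add-on needs 20 s at 100 kB/s, which leaves only 5 s of the timeout for any delay before the download starts.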
Comment 11 Henrik Skupin (:whimboo) 2011-08-08 13:38:29 PDT
I don't know of a size restriction. There are some extensions out there which are about 5 MB, so they can be even larger. I would prefer to check for changes in the download progress state. That would solve the general problem and would only fail if the connection drops and the download stalls.
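Waiting on download progress rather than on a fixed deadline could look roughly like the sketch below. It is not Mozmill's `waitForDownload()` or any existing API: `waitForProgress`, `getBytes`, `isDone`, `stallTimeout`, and the injected `clock` are all hypothetical names, and a fake clock replaces real sleeping so the example runs standalone.

```javascript
// Hypothetical sketch, not Mozmill code: wait until isDone() is true, but
// fail only when no progress has been seen for stallTimeout milliseconds.
function waitForProgress({ getBytes, isDone, stallTimeout, interval, clock }) {
  let lastBytes = getBytes();
  let lastChange = clock.now();

  while (!isDone()) {
    if (clock.now() - lastChange >= stallTimeout) {
      throw new Error(`Download stalled at ${lastBytes} bytes`);
    }
    clock.sleep(interval);

    const bytes = getBytes();
    if (bytes !== lastBytes) {
      // Progress was made, so restart the stall timer.
      lastBytes = bytes;
      lastChange = clock.now();
    }
  }
  return lastBytes;
}

// Fake clock and fake download so the sketch runs without a browser.
// If stallAtBytes is given, the download stops growing at that size.
function makeFakeDownload(totalBytes, bytesPerMs, stallAtBytes) {
  let time = 0;
  let bytes = 0;
  const clock = {
    now: () => time,
    sleep: (ms) => {
      time += ms;
      const limit = stallAtBytes === undefined ? totalBytes : stallAtBytes;
      bytes = Math.min(limit, bytes + ms * bytesPerMs);
    },
  };
  return { clock, getBytes: () => bytes, isDone: () => bytes >= totalBytes };
}

// A slow 2 MB download at 10 kB/s takes 200 s, far beyond a fixed 25 s
// timeout, but it still completes because bytes keep arriving.
const slow = makeFakeDownload(2000000, 10);
const received = waitForProgress({ ...slow, stallTimeout: 25000, interval: 100 });
console.log(`slow download finished with ${received} bytes`);

// A download that stops at 500 kB fails once the stall timeout elapses.
const stalled = makeFakeDownload(2000000, 10, 500000);
try {
  waitForProgress({ ...stalled, stallTimeout: 25000, interval: 100 });
} catch (e) {
  console.log(e.message);
}
```

The design choice is that the timer restarts on every observed change, so the total download time is unbounded while the time without progress is bounded. A real implementation would also have to deal with the modal dialog concern raised in comment 9, which this sketch ignores.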
Comment 12 Anthony Hughes (:ashughes) [GFX][QA][Mentor] 2011-08-08 13:52:14 PDT
(In reply to Henrik Skupin (:whimboo) from comment #11)
> I don't know of a size restriction. There are some extensions out there
> which are about 5MB. So those can even be larger. I would prefer to check
> for the download state progress change. That would solve the general problem
> and would only fail if the connection drops and the download has been
> stalled.

Does this mean changing waitForDownload() in the API?
Comment 13 Henrik Skupin (:whimboo) 2011-08-08 14:05:15 PDT
I would tend to say yes, if we have to change it. I haven't checked the code yet. If that's the case, let's get a new bug on file so we can update the API.
Comment 14 Maniac Vlad Florin (:vladmaniac) 2011-08-08 14:09:11 PDT
If we need to check for the download progress change, I might have some ideas :) If it needs to be done in the test, it will be my first job tomorrow. 

Thanks for helping with this!
Comment 15 Henrik Skupin (:whimboo) 2011-08-08 14:15:43 PDT
No. That's definitely an API change because it will apply to all of the tests we currently have.
Comment 16 Anthony Hughes (:ashughes) [GFX][QA][Mentor] 2011-08-08 14:56:23 PDT
(In reply to Henrik Skupin (:whimboo) from comment #15)
> No. That's definitely an API change because it will apply to all of our
> tests we currently have.

Filed. See bug 677383.
Comment 17 Anthony Hughes (:ashughes) [GFX][QA][Mentor] 2012-03-08 10:42:10 PST
I believe this bug is WONTFIX due to bug 732353. I think we should still keep bug 677383 open though as it affects all Add-on installation tests.
Comment 18 Maniac Vlad Florin (:vladmaniac) 2012-05-21 07:26:19 PDT
I will work further on re-enabling this test, since the Up&Coming module has not changed much lately and I think we can still maintain it. I will make it my task to watch this and monitor any failures.

Also, the test will get some updates along the way, so don't be alarmed if the patch is bigger than simply deleting the skip.

Reopening
Comment 19 Maniac Vlad Florin (:vladmaniac) 2012-05-21 08:36:08 PDT
Created attachment 625657 [details] [diff] [review]
re-enable test patch v1.0

Re-enables the test
Updates the test for API changes
Small optimizations
Better adherence to the style guide
Tested several times on OS X and Ubuntu Linux

There are lots of reports; I'm pasting two:
MAC: http://mozmill-crowd.blargon7.com/#/remote/report/f87375a634b1a5ba746e5f763a4e61b2 
LINUX: http://mozmill-crowd.blargon7.com/#/remote/report/f87375a634b1a5ba746e5f763a4e5441

Sometime in the near future it would also be nice to fix bug 677383. I will consider taking it when the time comes.
Comment 20 Dave Hunt (:davehunt) 2012-05-22 04:00:43 PDT
Comment on attachment 625657 [details] [diff] [review]
re-enable test patch v1.0

No issue with the patch regarding re-enabling the test, but as we pick a random addon we should at least include the name of the addon in the assertion message. That way if we have failures then we know which addon causes it.
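A minimal sketch of that suggestion follows. The add-on list, `pickRandom`, and `installMessage` are hypothetical stand-ins rather than the test's real code. Only "LastPass password manager" is named in this bug, so the other two entries are placeholders.

```javascript
// Hypothetical stand-ins for the three add-ons listed in the module.
const addons = [
  "LastPass password manager",
  "Placeholder add-on A",
  "Placeholder add-on B",
];

// Pick one entry at random; the random source is injectable for testing.
function pickRandom(items, random = Math.random) {
  return items[Math.floor(random() * items.length)];
}

// Build an assertion message that names the add-on under test, so a
// failure report shows which add-on was being installed.
function installMessage(addonName) {
  return `Add-on '${addonName}' has been installed`;
}

const addonName = pickRandom(addons);
console.log(installMessage(addonName));
```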
Comment 21 Maniac Vlad Florin (:vladmaniac) 2012-05-22 05:45:28 PDT
Created attachment 625981 [details] [diff] [review]
re-enable test patch v1.1

Fixed
Comment 22 Dave Hunt (:davehunt) 2012-05-22 06:39:41 PDT
Comment on attachment 625981 [details] [diff] [review]
re-enable test patch v1.1

+  var addonName  = randomAddon.getNode().lastElementChild.textContent;

Hey Vlad. Can you fix the double space? Also, is there a reason for lastElementChild, or would lastChild work? If so, could you update the patch? Thanks.
Comment 23 Maniac Vlad Florin (:vladmaniac) 2012-05-22 06:50:10 PDT
Created attachment 625992 [details] [diff] [review]
re-enable test patch v1.2

nit fixed 

Well, we need lastElementChild; otherwise it adds extra whitespace that we do not want. You can test it yourself to verify.
Comment 24 Dave Hunt (:davehunt) 2012-05-22 09:45:12 PDT
Landed as:
http://hg.mozilla.org/qa/mozmill-tests/rev/b0c42d67d8c0

We're going to re-enable the remote tests for central, and once we see this passing we can transplant it to the remaining branches.
Comment 25 Henrik Skupin (:whimboo) 2012-05-23 01:36:11 PDT
As you can see in the following report it is still broken in CI. What happened here? Was it an interim orange?

http://mozmill-ci.blargon7.com/#/remote/report/f87375a634b1a5ba746e5f763a67048a
Comment 26 Maniac Vlad Florin (:vladmaniac) 2012-05-23 01:54:16 PDT
(In reply to Henrik Skupin (:whimboo) from comment #25)
> As you can see in the following report it is still broken in CI. What
> happened here? Was it an interim orange?
> 
> http://mozmill-ci.blargon7.com/#/remote/report/
> f87375a634b1a5ba746e5f763a67048a

I somehow expected this to fail at some point. I will run things again to see if I can reproduce it locally, but I doubt it.
Comment 27 Dave Hunt (:davehunt) 2012-05-23 02:35:40 PDT
I think this is likely to be intermittent. We could watch it and push when we're satisfied, or increase the timeout further?
Comment 28 Henrik Skupin (:whimboo) 2012-05-23 03:12:57 PDT
The Jenkins master was under heavy load, with the Java process at about 100%. It could be related to that. So let's watch upcoming testruns for this test.
Comment 29 Henrik Skupin (:whimboo) 2012-05-23 05:27:34 PDT
No more failures on default:
http://mozmill-ci.blargon7.com/#/remote/reports

Tests have been backported to:
http://hg.mozilla.org/qa/mozmill-tests/rev/465867142118 (aurora)
http://hg.mozilla.org/qa/mozmill-tests/rev/da5626047354 (beta)
http://hg.mozilla.org/qa/mozmill-tests/rev/929830ce80c2 (release)

Vlad, please come up with a backport for esr10 because we fail to apply this patch.
Comment 30 Maniac Vlad Florin (:vladmaniac) 2012-05-23 05:30:07 PDT
I know, I know. I was waiting for the results before uploading it.
A patch will follow shortly.
Comment 31 Henrik Skupin (:whimboo) 2012-05-23 07:04:56 PDT
Please stop submitting updates if you get a mid-air collision. This warning isn't shown without a reason. Release the bug and resubmit again instead. Thanks!
Comment 32 Maniac Vlad Florin (:vladmaniac) 2012-05-24 01:01:03 PDT
Created attachment 626721 [details] [diff] [review]
esr patch v1.0

patch for mozilla-esr10 branch
Comment 33 Henrik Skupin (:whimboo) 2012-05-24 01:52:55 PDT
Comment on attachment 625992 [details] [diff] [review]
re-enable test patch v1.2

I assume that patch should have gotten a r+.
Comment 34 Henrik Skupin (:whimboo) 2012-05-24 01:57:21 PDT
Landed as:
http://hg.mozilla.org/qa/mozmill-tests/rev/5b2386f0e6f9

Let's get the Litmus tests re-enabled now.
Comment 35 Maniac Vlad Florin (:vladmaniac) 2012-05-24 02:08:32 PDT
The litmus test here is for AMO 
https://litmus.mozilla.org/show_test.cgi?id=15373 
and was already enabled. 

We do not have a mozmill subgroup for AMO at the moment
Comment 36 Anthony Hughes (:ashughes) [GFX][QA][Mentor] 2012-05-29 11:18:37 PDT
I noticed this test is still failing, though intermittently and with different errors, in Firefox 14.0a2 and 15.0a1:

Example:
http://mozmill-ci.blargon7.com/#/remote/report/fdec829b93b19c73985be1d3882430af

Also seeing the following errors intermittently:
* aElement is undefined
* AddonsManager_categories: Categories could not be found. 
* Selected category has been loaded.
Comment 37 Maniac Vlad Florin (:vladmaniac) 2012-05-30 07:40:17 PDT
(In reply to Anthony Hughes, Mozilla QA (irc: ashughes) from comment #36)
> I noticed this test is still failing though intermittently and with
> different errors in Firefox 14.0a2 and 15.0a1:
> 
> Example:
> http://mozmill-ci.blargon7.com/#/remote/report/
> fdec829b93b19c73985be1d3882430af
> 
> Also seeing the following errors intermittently:
> * aElement is undefined
> * AddonsManager_categories: Categories could not be found. 
> * Selected category has been loaded.

This test is network dependent so I was expecting it to fail at some point
Comment 38 Henrik Skupin (:whimboo) 2012-05-30 09:56:48 PDT
(In reply to Anthony Hughes, Mozilla QA (irc: ashughes) from comment #36)

> * AddonsManager_categories: Categories could not be found. 

That should never fail. Please file this as a new bug. Could be a regression caused by the different category timeout handling code.

> * aElement is undefined
> * Selected category has been loaded.

We should try to increase the timeout for the remote pane even more. Let's do it here.
Comment 39 Anthony Hughes (:ashughes) [GFX][QA][Mentor] 2012-05-30 10:56:24 PDT
(In reply to Henrik Skupin (:whimboo) from comment #38)
> (In reply to Anthony Hughes, Mozilla QA (irc: ashughes) from comment #36)
> 
> > * AddonsManager_categories: Categories could not be found. 
> 
> That should never fail. Please file this as a new bug. Could be a regression
> caused by the different category timeout handling code.

See bug 759832.
Comment 40 Maniac Vlad Florin (:vladmaniac) 2012-07-20 05:53:16 PDT
This is failing again:
http://mozmill-ci.blargon7.com/#/remote/report/89726f6b98208a209e7ce2df10359516
Comment 41 Maniac Vlad Florin (:vladmaniac) 2012-08-05 23:31:07 PDT
Andreea has been monitoring AMO for a while and it seems we no longer see the '404' error, which is also reflected in our recent reports: the timeout error is not seen.
http://mozmill-ci.blargon7.com/#/remote/reports?branch=17.0&platform=All&from=2012-08-03&to=2012-08-06

We have a new one though, 
http://mozmill-ci.blargon7.com/#/remote/report/29fc09ba0c8360d637617903a01672f1

I think we should move it to another bug and keep monitoring both.
Comment 42 Maniac Vlad Florin (:vladmaniac) 2012-08-06 23:45:29 PDT
Based on comment 41 and the report link, I am closing this as WFM.
If anybody sees the timeout again, please reopen this issue.

At the moment, we are going to continue monitoring this test in bug 780556.
