Closed Bug 676533 Opened 13 years ago Closed 12 years ago

/restartTests/testDiscoveryPane_UpAndComingModule/test1.js sometime fails due to a TimeoutError

Categories

(Mozilla QA Graveyard :: Mozmill Tests, defect)

defect
Not set
normal

Tracking

(firefox12 fixed, firefox13 fixed, firefox14 fixed, firefox15 fixed, firefox-esr10 fixed)

RESOLVED WORKSFORME

People

(Reporter: AlexLakatos, Assigned: vladmaniac)

Details

(Whiteboard: [mozmill-test-failure][remote])

Attachments

(2 files, 2 obsolete files)

http://mozmill-release.brasstacks.mozilla.com/#/remote/report/15a8b41a151956b8d71bd0934b2c3379

TimeoutError("Modal dialog has been found and processed")@resource://mozmill/modules/utils.js:429 waitFor([object Proxy],"Modal dialog has been found and processed",25000,100,[object Proxy])@resource://mozmill/modules/utils.js:467

This happens because the addon download takes longer than the timeout.
Whiteboard: [mozmill-test-failure]
Would manually increasing the timeout parameter value fix this? Opinions, anyone?
I would like to know why the download takes longer for exactly this test. Do we know which add-on we are trying to download? Is it larger than all the others?
This failure has occurred only once. We'll look into it after lunch and report back in another comment. Meanwhile, we can't reproduce the error when running the entire remote testrun.
(In reply to Henrik Skupin (:whimboo) from comment #2)
> I would like to know why the download takes longer for exactly this test. Do
> we know which add-on we are trying to download? Is it larger than all
> the others?

It seems to be independent of the add-on we install. We install a random add-on during this test, and we tried each of the three listed add-ons separately. All pass here.

One add-on, though, "LastPass password manager", is more than 2 MB in size, whereas the other two are at most 400 kB each.
Seeing as this is remotely served content, could it have something to do with the CDN?
Internet connection is a variable as well. If the connection drops below 100kb/s there is a high chance the timeout is exceeded on the "LastPass" addon.
I'm also seeing a variable number of seconds before the addon actually shows any download progress, so this is another variable.
Seeing as we cannot account for all the variables, I think we only have two options here:
1. Increase the "TIMEOUT_DOWNLOAD" in each test as we see it start to fail
2. Live with the fact that remote tests sometimes fail (as this one has occasionally)
What is the current timeout?
(In reply to Anthony Hughes (:ashughes) from comment #6)
> What is the current timeout?

It's in the error: "Modal dialog has been found and processed",25000,100,[object Proxy])@resource://mozmill/modules/utils.js:467. The 25000 is the timeout in milliseconds, i.e. 25 seconds. That looks like more than enough, but still...
Yeah, 25 seconds is quite long; I'm reluctant to bump it.

I'm leaning toward allowing this to fail once in a while if we can report the download progress in the error message (in the case of download timeouts).
Well, we need some kind of fix here, especially if people with slower connections are running the test. Something we could do is add a waitFor call for the download progress. That's most likely a shared module enhancement. I'm not quite sure yet whether it would work gracefully with the modal dialog, because the last check could happen too late and the dialog blocks the execution. I assume the API we implemented in bug 600052 is now failing because it's a doorhanger?
What is the maximum file size for add-ons? 1 MB at 10 kB/s takes 100 seconds.
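The arithmetic behind that estimate can be checked with a small sketch. It assumes the rates quoted in this thread are kilobytes per second and uses decimal units (1 MB = 1,000,000 bytes); the helper names `estimateDownloadMs` and `fitsInTimeout` are hypothetical and not part of Mozmill.

```javascript
// Hypothetical helpers, not part of Mozmill.
// Sizes are in bytes, rates in bytes per second, timeouts in milliseconds.
function estimateDownloadMs(sizeBytes, rateBytesPerSec) {
  if (rateBytesPerSec <= 0) {
    return Infinity; // a stalled connection never finishes
  }
  return (sizeBytes / rateBytesPerSec) * 1000;
}

function fitsInTimeout(sizeBytes, rateBytesPerSec, timeoutMs) {
  return estimateDownloadMs(sizeBytes, rateBytesPerSec) <= timeoutMs;
}

const KB = 1000;          // assumption: decimal kilobytes
const MB = 1000 * KB;
const TIMEOUT_MS = 25000; // the 25000 ms value visible in the error message

console.log(estimateDownloadMs(1 * MB, 10 * KB));         // 100000 (100 s)
console.log(fitsInTimeout(2 * MB, 100 * KB, TIMEOUT_MS)); // true  (20 s)
console.log(fitsInTimeout(2 * MB, 50 * KB, TIMEOUT_MS));  // false (40 s)
```

Under these assumptions a 2 MB add-on needs a sustained rate of at least 80 kB/s to finish within 25 seconds, before counting any delay until the download starts.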
I don't know of a size restriction. There are some extensions out there that are about 5 MB, so they can be even larger. I would prefer to check for changes in the download progress. That would solve the general problem and would only fail if the connection drops and the download has stalled.
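A minimal sketch of the idea in this comment: instead of one fixed overall timeout, keep waiting as long as the byte count keeps changing, and fail only when it has not moved for a given stall period. Every name here (`waitForDownload`, `makeSimulation`, `stallTimeoutMs`, the shape of `getProgress()`) is hypothetical and not the Mozmill API; the clock and the download are simulated so the sketch runs without a browser. It does not address the modal dialog concern raised earlier in the thread.

```javascript
// Hypothetical sketch: none of these names exist in Mozmill.
// getProgress() returns { bytes, done }; now() returns a time in ms;
// sleep(ms) waits for the given number of milliseconds.
function waitForDownload(getProgress, options) {
  const { stallTimeoutMs, intervalMs, now, sleep } = options;
  let lastBytes = getProgress().bytes;
  let lastChange = now();

  while (true) {
    const progress = getProgress();
    if (progress.done) {
      return progress.bytes;
    }
    if (progress.bytes !== lastBytes) {
      // Progress was made, so restart the stall clock.
      lastBytes = progress.bytes;
      lastChange = now();
    } else if (now() - lastChange >= stallTimeoutMs) {
      throw new Error("Download stalled at " + lastBytes + " bytes");
    }
    sleep(intervalMs);
  }
}

// Simulated clock and download: each sleep() call advances the clock and
// delivers bytesPerTick more bytes, up to totalBytes.
function makeSimulation(bytesPerTick, totalBytes) {
  let time = 0;
  let bytes = 0;
  return {
    now: () => time,
    sleep: (ms) => {
      time += ms;
      bytes = Math.min(totalBytes, bytes + bytesPerTick);
    },
    getProgress: () => ({ bytes: bytes, done: bytes >= totalBytes })
  };
}

// A slow but steady download: 60 simulated seconds in total.
const slow = makeSimulation(1000, 60000);
const received = waitForDownload(slow.getProgress, {
  stallTimeoutMs: 5000, intervalMs: 1000, now: slow.now, sleep: slow.sleep
});
console.log(received, slow.now());
```

In this simulation the download takes 60 seconds, which a fixed 25 second timeout would reject, yet it completes because the byte count never stays unchanged for 5 seconds. A download that delivers no bytes at all fails after the stall period instead.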
(In reply to Henrik Skupin (:whimboo) from comment #11)
> I don't know of a size restriction. There are some extensions out there
> that are about 5 MB, so they can be even larger. I would prefer to check
> for changes in the download progress. That would solve the general problem
> and would only fail if the connection drops and the download has
> stalled.

Does this mean changing waitForDownload() in the API?
I would tend to say yes, if we have to change it. I haven't checked the code yet. If that's the case, let's get a new bug on file so we can update the API.
If we need to check for the download progress change, I might have some ideas :) If it needs to be done in the test, it will be my first job tomorrow. 

Thanks for helping with this!
No. That's definitely an API change because it will apply to all of the tests we currently have.
Depends on: 677383
(In reply to Henrik Skupin (:whimboo) from comment #15)
> No. That's definitely an API change because it will apply to all of the
> tests we currently have.

Filed. See bug 677383.
I believe this bug is WONTFIX due to bug 732353. I think we should still keep bug 677383 open though as it affects all Add-on installation tests.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → WONTFIX
I will work further on re-enabling this test, since the Up&Coming module has not changed much lately and I think we can still maintain it. I will make it my task to watch this and monitor any failures.

Also, the test will need some updating along the way, so don't be alarmed if the patch is bigger than simply removing the skip.

Reopening
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
Assignee: nobody → vlad.mozbugs
Whiteboard: [mozmill-test-failure] → [mozmill-test-failure][remote]
Attached patch re-enable test patch v1.0 (obsolete) — Splinter Review
Re-enables the test
Updates the test for API changes
Small optimizations
Better style guide compliance
Tested several times on OS X and Ubuntu Linux

There are lots of reports; I'm pasting two:
MAC: http://mozmill-crowd.blargon7.com/#/remote/report/f87375a634b1a5ba746e5f763a4e61b2 
LINUX: http://mozmill-crowd.blargon7.com/#/remote/report/f87375a634b1a5ba746e5f763a4e5441

Sometime in the near future it would also be nice to fix bug 677383; I will consider taking it when its time comes.
Attachment #625657 - Flags: review?(dave.hunt)
Comment on attachment 625657 [details] [diff] [review]
re-enable test patch v1.0

No issue with the patch regarding re-enabling the test, but as we pick a random addon we should at least include the name of the addon in the assertion message. That way if we have failures then we know which addon causes it.
Attachment #625657 - Flags: review?(dave.hunt) → review-
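The review request above can be illustrated with a short sketch: when the add-on is picked at random, putting its name into the assertion message makes a failure report show which add-on was being installed. The helpers `pickRandom` and `installTimeoutMessage` and the two placeholder add-on names are hypothetical and are not taken from the actual patch; only "LastPass password manager" is named in this bug.

```javascript
// Hypothetical sketch: pickRandom, installTimeoutMessage and the placeholder
// add-on names are illustrative, not taken from the actual patch.
function pickRandom(items, random) {
  // random() must return a number in [0, 1), like Math.random.
  return items[Math.floor(random() * items.length)];
}

function installTimeoutMessage(addonName, timeoutMs) {
  return "Add-on '" + addonName + "' has been installed within " +
         timeoutMs + " ms";
}

const addons = ["Placeholder add-on A", "Placeholder add-on B",
                "LastPass password manager"];
const chosen = pickRandom(addons, Math.random);

// The message names the add-on, so a timeout report shows which one was slow.
console.log(installTimeoutMessage(chosen, 25000));
```

Passing the random source as a parameter is a design choice for the sketch only; it lets the selection be made deterministic when checking the helper.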
Attached patch re-enable test patch v1.1 (obsolete) — Splinter Review
Fixed
Attachment #625657 - Attachment is obsolete: true
Attachment #625981 - Flags: review?(dave.hunt)
Comment on attachment 625981 [details] [diff] [review]
re-enable test patch v1.1

+  var addonName  = randomAddon.getNode().lastElementChild.textContent;

Hey Vlad. Can you fix the double space? Also, is there a reason for lastElementChild, or would lastChild work? If so, could you update the patch? Thanks.
Attachment #625981 - Flags: review?(dave.hunt) → review-
Nit fixed.

Well, we need lastElementChild; otherwise it adds extra spaces, which we do not want. You can test it yourself as a sanity check.
Attachment #625981 - Attachment is obsolete: true
Attachment #625992 - Flags: review?(dave.hunt)
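One plausible reading of the exchange above: in the DOM, `lastChild` returns the last child node of any type, including a whitespace-only text node left by indentation in the markup, while `lastElementChild` skips text nodes and returns the last element. The sketch below models just those properties with plain objects so it runs outside a browser. The node layout is an assumption and is not the real discovery pane markup.

```javascript
// Minimal stand-in for DOM nodes so the sketch runs outside a browser.
// Only the properties discussed here are modelled.
const ELEMENT_NODE = 1;
const TEXT_NODE = 3;

function text(value) {
  return { nodeType: TEXT_NODE, childNodes: [], textContent: value };
}

function element(children) {
  return {
    nodeType: ELEMENT_NODE,
    childNodes: children,
    get textContent() {
      return children.map((child) => child.textContent).join("");
    },
    get lastChild() {
      return children.length ? children[children.length - 1] : null;
    },
    get lastElementChild() {
      const elements = children.filter((c) => c.nodeType === ELEMENT_NODE);
      return elements.length ? elements[elements.length - 1] : null;
    }
  };
}

// Assumed layout: an icon element, a name label, and indentation whitespace
// between and after them.
const entry = element([
  text("\n  "),
  element([]),                       // icon
  text("\n  "),
  element([text("Example Add-on")]), // name label
  text("\n")                         // trailing whitespace from indentation
]);

console.log(JSON.stringify(entry.lastChild.textContent));        // "\n"
console.log(JSON.stringify(entry.lastElementChild.textContent)); // "Example Add-on"
```

With this layout, `lastChild` yields only the trailing whitespace, whereas `lastElementChild` yields the label text.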
Landed as:
http://hg.mozilla.org/qa/mozmill-tests/rev/b0c42d67d8c0

We're going to re-enable the remote tests for central, and once we see this passing we can transplant it to the remaining branches.
Status: REOPENED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
As you can see in the following report it is still broken in CI. What happened here? Was it an interim orange?

http://mozmill-ci.blargon7.com/#/remote/report/f87375a634b1a5ba746e5f763a67048a
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Whiteboard: [mozmill-test-failure][remote] → [mozmill-test-failure][remote][mozmill-test-skipped]
(In reply to Henrik Skupin (:whimboo) from comment #25)
> As you can see in the following report it is still broken in CI. What
> happened here? Was it an interim orange?
> 
> http://mozmill-ci.blargon7.com/#/remote/report/f87375a634b1a5ba746e5f763a67048a

I somewhat expected this to fail at some point. I will run things again to see if I can reproduce it locally, but I doubt it.
I think this is likely to be intermittent. We could watch it and push when we're satisfied, or increase the timeout further?
The Jenkins master was under heavy load, with the Java process at about 100%. It could be related to that. So let's watch upcoming testruns for this test.
Status: REOPENED → ASSIGNED
No more failures on default:
http://mozmill-ci.blargon7.com/#/remote/reports

Tests have been backported to:
http://hg.mozilla.org/qa/mozmill-tests/rev/465867142118 (aurora)
http://hg.mozilla.org/qa/mozmill-tests/rev/da5626047354 (beta)
http://hg.mozilla.org/qa/mozmill-tests/rev/929830ce80c2 (release)

Vlad, please come up with a backport for esr10, because this patch fails to apply there.
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
I know, I know. I was waiting for the results before uploading it.
A patch will follow shortly.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Please stop submitting updates if you get a mid-air collision. This warning isn't shown without a reason. Release the bug and resubmit again instead. Thanks!
Status: REOPENED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Attached patch esr patch v1.0 — Splinter Review
patch for mozilla-esr10 branch
Attachment #626721 - Flags: review?(hskupin)
Comment on attachment 625992 [details] [diff] [review]
re-enable test patch v1.2

I assume that patch should have gotten a r+.
Attachment #625992 - Flags: review?(dave.hunt) → review+
Attachment #626721 - Flags: review?(hskupin) → review+
Landed as:
http://hg.mozilla.org/qa/mozmill-tests/rev/5b2386f0e6f9

Let's get the Litmus tests re-enabled now.
Flags: in-litmus?(vlad.mozbugs)
Whiteboard: [mozmill-test-failure][remote][mozmill-test-skipped] → [mozmill-test-failure][remote]
The litmus test here is for AMO 
https://litmus.mozilla.org/show_test.cgi?id=15373 
and was already enabled. 

We do not have a mozmill subgroup for AMO at the moment
Flags: in-litmus?(vlad.mozbugs)
I noticed this test is still failing though intermittently and with different errors in Firefox 14.0a2 and 15.0a1:

Example:
http://mozmill-ci.blargon7.com/#/remote/report/fdec829b93b19c73985be1d3882430af

Also seeing the following errors intermittently:
* aElement is undefined
* AddonsManager_categories: Categories could not be found. 
* Selected category has been loaded.
(In reply to Anthony Hughes, Mozilla QA (irc: ashughes) from comment #36)
> I noticed this test is still failing though intermittently and with
> different errors in Firefox 14.0a2 and 15.0a1:
> 
> Example:
> http://mozmill-ci.blargon7.com/#/remote/report/fdec829b93b19c73985be1d3882430af
> 
> Also seeing the following errors intermittently:
> * aElement is undefined
> * AddonsManager_categories: Categories could not be found. 
> * Selected category has been loaded.

This test is network-dependent, so I was expecting it to fail at some point.
(In reply to Anthony Hughes, Mozilla QA (irc: ashughes) from comment #36)

> * AddonsManager_categories: Categories could not be found. 

That should never fail. Please file this as a new bug. Could be a regression caused by the different category timeout handling code.

> * aElement is undefined
> * Selected category has been loaded.

We should try to increase the timeout for the remote pane even more. Let's do it here.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(In reply to Henrik Skupin (:whimboo) from comment #38)
> (In reply to Anthony Hughes, Mozilla QA (irc: ashughes) from comment #36)
> 
> > * AddonsManager_categories: Categories could not be found. 
> 
> That should never fail. Please file this as a new bug. Could be a regression
> caused by the different category timeout handling code.

See bug 759832.
Andreea has been monitoring AMO for a while, and it seems we no longer see the '404' error, which is also reflected in our recent reports: the timeout error no longer appears.
http://mozmill-ci.blargon7.com/#/remote/reports?branch=17.0&platform=All&from=2012-08-03&to=2012-08-06

We have a new one, though:
http://mozmill-ci.blargon7.com/#/remote/report/29fc09ba0c8360d637617903a01672f1

I think we should move it to another bug and keep monitoring both.
Based on comment 41 and the report link, I am closing this as WFM.
Anyone who sees the timeout again, please reopen this issue.

At the moment, we are going to continue monitoring this test on bug 780556
Status: REOPENED → RESOLVED
Closed: 12 years ago
Resolution: --- → WORKSFORME
Product: Mozilla QA → Mozilla QA Graveyard