Closed Bug 1333713 Opened 7 years ago Closed 7 years ago

Intermittent browser/base/content/test/general/browser_extension_update.js,browser_extension_update_interactive.js,browser_extension_update_background.js | Test timed out

Categories

(Firefox :: General, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
Firefox 54
Tracking Status
firefox52 --- unaffected
firefox-esr52 --- unaffected
firefox53 --- unaffected
firefox54 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: aswan)

References

Details

(Keywords: intermittent-failure, Whiteboard: [stockwell fixed])

Attachments

(2 files)

Other than comment 0, all failures here follow the recent change to browser_extension_update.js in bug 1330823.
Flags: needinfo?(aswan)
I didn't look back through every old starring to be sure it started separately, but it looks a bit like at least on Linux it shifted yesterday into the "hey, I'm still working here!" flavor of exceeded the timeout threshold.
Summary: Intermittent browser/base/content/test/general/browser_extension_update.js | Test timed out - → Intermittent browser/base/content/test/general/browser_extension_update.js | Test timed out, | This test exceeded the timeout threshold. It should be rewritten or split up.
this is fairly frequent now
Most failures are linux32/debug. Logs have several instances of:

Warning: attempting to write 4287 bytes to preference extensions.bootstrappedAddons. This is bad for general performance and memory usage. Such an amount of data should rather be written to an external file. This preference will not be sent to any content processes.


As philor observed, most current failures are "This test exceeded the timeout threshold. It should be rewritten or split up..." which could be addressed with requestLongerTimeout().
Hm, I did add another test to that file recently (and have another one to add shortly).  I'm surprised though, this test runs in 30s on a not-especially-fast macbook air.  I guess the combination of debug builds and slower linux vms is enough to make it 3x slower?
I'll break this up, patch coming shortly.  Leaving the ni? until the patch is up.
Flags: needinfo?(aswan)
Attachment #8835068 - Flags: review?(dtownsend)
Assignee: nobody → aswan
Comment on attachment 8835068 [details]
Bug 1333713 Split up browser_extension_update.js

https://reviewboard.mozilla.org/r/110784/#review112176
Attachment #8835068 - Flags: review?(dtownsend) → review+
Pushed by aswan@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/e5e790fb5edd
Split up browser_extension_update.js r=mossop
Summary: Intermittent browser/base/content/test/general/browser_extension_update.js | Test timed out, | This test exceeded the timeout threshold. It should be rewritten or split up. → Intermittent browser/base/content/test/general/browser_extension_update.js,browser_extension_update_interactive.js | Test timed out, | This test exceeded the timeout threshold. It should be rewritten or split up.
https://hg.mozilla.org/mozilla-central/rev/e5e790fb5edd
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 54
Dunno whether there's some larger number of chunks you could break it up into which wouldn't time out, but the answer apparently was not "two chunks."
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Summary: Intermittent browser/base/content/test/general/browser_extension_update.js,browser_extension_update_interactive.js | Test timed out, | This test exceeded the timeout threshold. It should be rewritten or split up. → Intermittent browser/base/content/test/general/browser_extension_update.js,browser_extension_update_interactive.js,browser_extension_update_background.js | Test timed out, | This test exceeded the timeout threshold. It should be rewritten or split up.
Target Milestone: Firefox 54 → ---
:mossop, can you take another look here?
Flags: needinfo?(dtownsend)
Over to Andrew
Flags: needinfo?(dtownsend) → needinfo?(aswan)
This looks like a different failure, its not the "it should be rewritten" flavor of timeout.
Can we get a new bug for this failure?  (I'm not sure how to do that and have it properly connected with brasstacks and the intermittent reporter)
Flags: needinfo?(aswan) → needinfo?(jmaher)
I see a couple problems here:
1) incorrect version of the addon installed
2) test timing out in general, from what I see it is 99% browser_extension_update_interactive.js.

I think our focus should be on browser_extension_update_interactive.js and any fixes there could be applied to browser_extension_update.js or browser_extension_update_background.js if applicable.

it looks like the last line we print status out from is:
https://dxr.mozilla.org/mozilla-central/source/browser/base/content/test/general/browser_extension_update_interactive.js#156

and we never output the ok messages from:
https://dxr.mozilla.org/mozilla-central/source/browser/base/content/test/general/browser_extension_update_interactive.js#40


here is data from a linux64 log [0]:
0:20:04     INFO - TEST-PASS | browser/base/content/test/general/browser_extension_update_interactive.js | Created install - 
10:20:04     INFO - TEST-PASS | browser/base/content/test/general/browser_extension_update_interactive.js | Addon was installed - 
10:20:04     INFO - TEST-PASS | browser/base/content/test/general/browser_extension_update_interactive.js | Version 1 of the addon is installed - 
10:20:04     INFO - Console message: Warning: attempting to write 4424 bytes to preference extensions.bootstrappedAddons. This is bad for general performance and memory usage. Such an amount of data should rather be written to an external file. This preference will not be sent to any content processes.
10:20:04     INFO - Console message: 1487096361282	addons.xpi	WARN	Failed to remove temporary file C:\Users\cltbld\AppData\Local\Temp\tmp-w85.xpi for addon https://example.com/browser/browser/base/content/test/general/browser_webext_update1.xpi: [Exception... "Component returned failure code: 0x80520015 (NS_ERROR_FILE_ACCESS_DENIED) [nsIFile.remove]"  nsresult: "0x80520015 (NS_ERROR_FILE_ACCESS_DENIED)"  location: "JS frame :: resource://gre/modules/addons/XPIProvider.jsm :: removeTemporaryFile :: line 5469"  data: no] Stack trace: removeTemporaryFile()@resource://gre/modules/addons/XPIProvider.jsm:5469 < startInstall/<()@resource://gre/modules/addons/XPIProvider.jsm:5836 < process()@resource://gre/modules/Promise.jsm -> resource://gre/modules/Promise-backend.js:922 < walkerLoop()@resource://gre/modules/Promise.jsm -> resource://gre/modules/Promise-backend.js:806
10:20:04     INFO - Console message: 1487096361893	addons.update-checker	WARN	Update manifest did not contain an entry for workerbootstrap-test@mozilla.org
10:20:04     INFO - Console message: 1487096361894	addons.update-checker	WARN	Update manifest did not contain an entry for worker-test@mozilla.org
10:20:04     INFO - Console message: 1487096361896	addons.update-checker	WARN	Update manifest did not contain an entry for mozscreenshots@mozilla.org
10:20:04     INFO - Console message: 1487096361920	addons.update-checker	WARN	Update manifest did not contain an entry for indexedDB-test@mozilla.org
10:20:04     INFO - Console message: 1487096361922	addons.update-checker	WARN	Update manifest did not contain an entry for special-powers@mozilla.org
10:20:04     INFO - Buffered messages logged at 10:19:22
10:20:04     INFO - TEST-PASS | browser/base/content/test/general/browser_extension_update_interactive.js | addon-webext-permissions notification shown - 
10:20:04     INFO - TEST-PASS | browser/base/content/test/general/browser_extension_update_interactive.js | notification panel open - 
10:20:04     INFO - TEST-PASS | browser/base/content/test/general/browser_extension_update_interactive.js | Should still be running the old version - 
10:20:04     INFO - Console message: 1487096362269	addons.update-checker	WARN	Update manifest did not contain an entry for mochikit@mozilla.org
10:20:04     INFO - Buffered messages finished
10:20:04     INFO - TEST-UNEXPECTED-FAIL | browser/base/content/test/general/browser_extension_update_interactive.js | Test timed out - 
10:20:04     INFO - Not taking screenshot here: see the one that was previously logged
10:20:04     INFO - TEST-UNEXPECTED-FAIL | browser/base/content/test/general/browser_extension_update_interactive.js | Addon update@tests.mozilla.org was still installed at the end of the test - 
10:20:04     INFO - Stack trace:
10:20:04     INFO - chrome://mochitests/content/browser/browser/base/content/test/general/browser_extension_update_interactive.js:null:9
10:20:04     INFO - Console message: Warning: attempting to write 4137 bytes to preference extensions.bootstrappedAddons. This is bad for general performance and memory usage. Such an amount of data should rather be written to an external file. This preference will not be sent to any content processes.
10:20:04     INFO - MEMORY STAT | vsize 838MB | vsizeMaxContiguous 726MB | residentFast 270MB | heapAllocated 115MB
10:20:04     INFO - TEST-OK | browser/base/content/test/general/browser_extension_update_interactive.js | took 45157ms
10:20:04     INFO - Not taking screenshot here: see the one that was previously logged
10:20:04     INFO - TEST-UNEXPECTED-FAIL | browser/base/content/test/general/browser_extension_update_interactive.js | Found a tab after previous test timed out: about:addons - 
10:20:05     INFO - checking window state


maybe the NS_ERROR_FILE_ACCESS_DENIED is the root cause?


:aswan, let me know if this doesn't help out much or if you would prefer I open a new bug, etc.

[0] https://archive.mozilla.org/pub/firefox/tinderbox-builds/mozilla-inbound-win32/1487092425/mozilla-inbound_win7_vm_test-mochitest-browser-chrome-7-bm128-tests1-windows-build155.txt.gz
Flags: needinfo?(jmaher) → needinfo?(aswan)
Summary: Intermittent browser/base/content/test/general/browser_extension_update.js,browser_extension_update_interactive.js,browser_extension_update_background.js | Test timed out, | This test exceeded the timeout threshold. It should be rewritten or split up. → Intermittent browser/base/content/test/general/browser_extension_update.js,browser_extension_update_interactive.js,browser_extension_update_background.js | Test timed out
The FILE_ACCESS_DENIED error is one we commonly get on windows that I would love to clean up, but I don't think its the cause of the more frequent failures here since we're also getting those on Linux which doesn't have the same file locking.
My original needinfo? was asking for a new bug, this bug was originally for an older test exceeding the per-file timeout, somehow these new failures got added to this bug instead of a new one getting created.
Flags: needinfo?(aswan)
Setting leave-open for now since I pushed a fix for an error message that doesn't fix this bug but will help diagnose it.
Upon closer examination, I think the Linux and Windows failures may actually be different, ugh.
Keywords: leave-open
Ugh this test seems to be failing in several different ways, its unclear to me if there's some common root cause or the whole thing is just a big train wreck.  This caught my eye though:
https://treeherder.mozilla.org/logviewer.html#?repo=mozilla-inbound&job_id=77977610&lineNumber=2998
INFO - Console message: 1487261695196	addons.xpi	WARN	Download of https://example.com/browser/browser/base/content/test/general/browser_webext_update2.xpi failed: Refusing to upgrade addon update@tests.mozilla.org to different ID update_perms@tests.mozilla.org

This test installs browser_webext_update1.xpi which then gets updated to browser_webext_update2.xpi, both of which have the ID update@tests.mozilla.org.  How did update_perms@tests.mozilla.org ever arise?  That is the ID of browser_webext_update_perms.xpi which is used in a completely different test...
:aswan, this is still failing very frequently- it has been a week since we last looked at this.  Do you have other thoughts on fixing this?  Given the frequency here, I would be inclined to disable this set of tests on windows opt/pgo in the short term until we can figure more out.
Flags: needinfo?(aswan)
Not sure what the trunk percentage is, but https://treeherder.mozilla.org/#/jobs?repo=try&revision=5baf1cfa37b52598bcacd2aadbe1abd7f5a43765&selectedJob=79584338&group_state=expanded says _interactive is going to have around a 60% failure rate on Win7 PGO (which is the only opt, on release branches) when we merge to aurora in just over a week.
Well, I've got patches for bug 1340586 nearly finished, which should address the "it should be rewritten or split up" style of failures, but there appear to be a few other type of failures that have also gotten swept into this bug, which I haven't yet been able to figure out.  If you want to disable tests until I get 1340586 in and/or manage to diagnose the other failures, I certainly understand and don't object.
Flags: needinfo?(aswan)
I need to follow up here in 2 days (March 2nd) to ensure this is fixed- outage today makes this hard to verify
Flags: needinfo?(jmaher)
Did I miss something?  The underlying problem has never been identified so I don't expect this to be fixed.  I started a try job this morning in an effort to gather more information but I think it was a victim of the aws outage.  I'll start a new one when try re-opens.
I had assumed the work in bug 1340586 would have solve this or most of this bug.  I will stay tuned- I have a few fixes blocked by the AWS outage as well.
Bug 1340586 can help with the "it should be splitten up" flavor of timeout but as I said a few times above (eg comment 17, comment 34), there are other failures that have gotten lumped into this bug.  Splitting those out would be a fine thing but I don't know how to do that and keep all the intermittent tracking working properly so I'm just proceeding using this bug for now.
Update: I'm narrowing in on the most common failure here, for every update method we do the following sequence:
- trigger an update check
- expect an update notification, check its contents
- cancel the update, ensure that the update it not applied
- trigger a new update check that will eventually be applied...

There appears to be a race between the 3rd and 4th steps where the cancellation of the first update has not been fully applied so triggering the new update at step 4 turns into a no-op (since it appears that there is already an update check in progress)
I'm actively on this, hope to make further progress today.
thanks for working on this :aswan, Please ask for help if needed, I am happy to push to try or try a few things out in a one click loaner.
Flags: needinfo?(jmaher)
Whiteboard: [stockwell needswork]
Attachment #8842969 - Flags: review?(dtownsend)
Comment on attachment 8842969 [details]
Bug 1333713 Ensure last update check finished before starting a new one

https://reviewboard.mozilla.org/r/116686/#review118496

Dropping review for the changes we discussed over IRC for adding an observer notification when the update checks complete. It's probably also worth waitying for that after starting an update check rather than waiting for it before starting a new one.
Attachment #8842969 - Flags: review?(dtownsend)
Comment on attachment 8842969 [details]
Bug 1333713 Ensure last update check finished before starting a new one

https://reviewboard.mozilla.org/r/116686/#review118542

::: browser/base/content/test/webextensions/browser_extension_update_interactive.js:77
(Diff revision 2)
>  
>    let win = yield BrowserOpenAddonsMgr("addons://list/extension");
>  
>    // Trigger an update check
>    let popupPromise = promisePopupNotificationShown("addon-webext-permissions");
> -  yield triggerUpdate(win, addon);
> +  let {promise: checkPromise} = yield triggerUpdate(win, addon);

Hrm, can we not just wait on the update check to complete here rather than later? That sucks.
Attachment #8842969 - Flags: review?(dtownsend) → review+
Comment on attachment 8842969 [details]
Bug 1333713 Ensure last update check finished before starting a new one

https://reviewboard.mozilla.org/r/116686/#review118542

> Hrm, can we not just wait on the update check to complete here rather than later? That sucks.

Well the "check for updates" command doesn't get re-enabled until all the installs have resolved somehow (finished, failed, or canceled).  And at this point, we still have to interact with the permission prompt for an updated extension (which is the whole purpose of this test) before its install can complete.  Which is a long way of saying, no, I don't think so.
Pushed by aswan@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/eb6b0553d079
Ensure last update check finished before starting a new one r=mossop
Whiteboard: [stockwell needswork] → [stockwell fixed]
See Also: → 1343612
Well this bug was tracking multiple failures and the underlying files got moved and renamed a few times so its kind of hard to say precisely what this bug is and hence if it is fixed.  But we're getting new bugs filed for intermittents in the new location for webextensions notification tests and the orange tracker isn't finding any new instances of this bug, so lets call it done.
Status: REOPENED → RESOLVED
Closed: 7 years ago7 years ago
Keywords: leave-open
Resolution: --- → FIXED
Target Milestone: --- → Firefox 54
You need to log in before you can comment on or make changes to this bug.