Closed Bug 1082133 Opened 10 years ago Closed 10 years ago

browser_*_behavior.js fails when run as a standalone directory

Categories

(Firefox :: General, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
Firefox 36

People

(Reporter: jmaher, Assigned: jmaher)

References

Details

Attachments

(1 file, 5 obsolete files)

we are close to changing our mochitests to run one directory at a time.  Unfortunately while doing this we run into cases where tests fail, usually because they accidentally depend on the state of the browser being set by a previous test.

In this case we have a consistent set of bc1 failures:
https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=6d8041d9592d

running this locally, I can reproduce the problems easily:
./mach mochitest-browser browser/components/search/test

775 INFO TEST-UNEXPECTED-FAIL | chrome://mochitests/content/browser/browser/components/search/test/browser_bing_behavior.js | uncaught exception - NotFoundError: The operation failed because the requested database object could not be found. For example, an object store did not exist but was being opened. at chrome://browser/content/abouthome/aboutHome.js:256
Stack trace:
chrome://mochikit/content/tests/SimpleTest/SimpleTest.js:simpletestOnerror:1317
null:null:0
776 INFO TEST-UNEXPECTED-FAIL | chrome://mochitests/content/browser/browser/components/search/test/browser_google_behavior.js | uncaught exception - NotFoundError: The operation failed because the requested database object could not be found. For example, an object store did not exist but was being opened. at chrome://browser/content/abouthome/aboutHome.js:256
Stack trace:
chrome://mochikit/content/tests/SimpleTest/SimpleTest.js:simpletestOnerror:1317
null:null:0
777 INFO TEST-UNEXPECTED-FAIL | chrome://mochitests/content/browser/browser/components/search/test/browser_yahoo_behavior.js | uncaught exception - NotFoundError: The operation failed because the requested database object could not be found. For example, an object store did not exist but was being opened. at chrome://browser/content/abouthome/aboutHome.js:256
Stack trace:
chrome://mochikit/content/tests/SimpleTest/SimpleTest.js:simpletestOnerror:1317


If there are any pointers for me to hack on this, please speak up.
strange, these tests use ignoreAllUncaughtExceptions
Thanks Mak.

I looked into this and we do ignore uncaught exceptions:
http://dxr.mozilla.org/mozilla-central/source/browser/components/search/test/browser_google_behavior.js#146

but this is for 1 of the many tests, could it be that the other tests hit this scenario?
oh this doesn't look right, but in my testing we were always resetting the value of ignoreunexpectedexceptions BEFORE we checked for them.  I had even ignored it for every single test case in every single file and it was still failing.  

This is my version 1, maybe we can take this or come closer to a real solution.
Assignee: nobody → jmaher
Status: NEW → ASSIGNED
Attachment #8504952 - Flags: review?(mak77)
I suspect something is fishy in the way the test proceeds, like it proceeds to the next test too early and thus it resets ignore exceptions too early... may you check if the test using about:home is properly waiting both for the page to be loaded and the attribute mutation before proceeding?
I'm not sure why standalone directory makes a difference here though...
this new approach isn't ideal, but it works locally 5 times in a row for me.  Maybe this will highlight the problem that we are having and help expose the bug elsewhere?
Attachment #8504952 - Attachment is obsolete: true
Attachment #8504952 - Flags: review?(mak77)
Attachment #8505454 - Flags: feedback?
Attachment #8505454 - Flags: feedback? → feedback?(mak77)
I have pushed this to try server and it is green.  I would be willing to try anything else out, but would like to get this figured out sooner rather than later.
yeah sorry, but before taking a workaround I would like to understand a little bit better what's up... So I need to reserve some hours for trying to debug it.
fair enough; I am a bit out of my knowledge domain with this bug- but hopefully my work has at least given you a starting point for debugging.  If you have other data you think I could collect, do ask and I will test/patch/collect/report what I can!
:mak, have you had a chance to look into this at all?  I would be excited to get this landed this week.
Flags: needinfo?(mak77)
Sorry for late, I just finished the assigned work and now I can look into this.
Unfortunately I cannot reproduce the failure locally, maybe it's related to the machine speed or the OS (I tried on Win8.1).

I think to move on in a safer way we should do a couple things:

1. split out of the default_behavior tests the about:home part of the test.

This basically means copy the test to a browser_*_abouthome_behavior.js file, remove all of the tests but the about:home one, then in the original test remove the about:home one. This way we can set IgnoreAllUncaughtExceptions globally for the whole test rather than for a single piece of it, that should make it more reliable.

2. cleanup a couple details based on browser_abouthome, for example disconnect the mutation observer once we get the mutation, and executeSoon before doing any job inside the mutation observer. See:
http://mxr.mozilla.org/mozilla-central/source/browser/base/content/test/general/browser_aboutHome.js#575

Regarding the reason, it's pretty clear sometimes, by the time we run, the indexeddb database is not ready yet. It's possible the executeSoon in the mutation observer will help, in any case once the test is splitted out it's easier to work on it and we can also skip it without touching the other tests.
Flags: needinfo?(mak77)
Comment on attachment 8505454 [details] [diff] [review]
alternative solution, wait 100ms before going to the next test (0.9)

Review of attachment 8505454 [details] [diff] [review]:
-----------------------------------------------------------------

I don't think I can approve solutions based on setTimeout, sorry. We had too many intermittents in the past due to setTimeout being a "solution" :/
Attachment #8505454 - Flags: feedback?(mak77) → feedback-
Great idea Mak.  I have done this and tested locally.  Unfortunately it fails for me, but at least we have all the other tests running.  It would be nice to agree that the new test file is doing the right thing and the other cleanup is done appropriately.  

Once there, maybe I can hack around a bit and get the new test file to work partially.  Thanks again for helping out here!
Attachment #8505454 - Attachment is obsolete: true
Attachment #8511294 - Flags: feedback?(mak77)
Comment on attachment 8511294 [details] [diff] [review]
move about:home tests to their own file, comment out for now (1.0)

Review of attachment 8511294 [details] [diff] [review]:
-----------------------------------------------------------------

yeah i think this is a good path forward, so we could also eventually skip this and handle it in a separate bug so it won't block by-dir running.

let's see if the suggested changes here help.

::: browser/components/search/test/browser.ini
@@ +35,5 @@
>  [browser_yahoo.js]
>  [browser_yahoo_behavior.js]
>  skip-if = e10s # Bug ?????? - some issue with progress listeners [JavaScript Error: "req.originalURI is null" {file: "chrome://mochitests/content/browser/browser/components/search/test/browser_bing_behavior.js" line: 127}]
> +[browser_abouthome_behavior.js]
> +skip-if = e10s || true # Bug ???????; bug 1082133 - fails in run-by-dir mode due to leaking windows until shutdown 

trailing space

::: browser/components/search/test/browser_abouthome_behavior.js
@@ +2,5 @@
> + * http://creativecommons.org/publicdomain/zero/1.0/ */
> +
> +/*
> + * Test Google search plugin URLs
> + */

this comment needs to be updated

@@ +10,5 @@
> +const BROWSER_SEARCH_PREF      = "browser.search.";
> +
> +const MOZ_PARAM_LOCALE         = /\{moz:locale\}/g;
> +const MOZ_PARAM_DIST_ID        = /\{moz:distributionID\}/g;
> +const MOZ_PARAM_OFFICIAL       = /\{moz:official\}/g;

would be nicer to add a _REGEX suffix/prefix

@@ +12,5 @@
> +const MOZ_PARAM_LOCALE         = /\{moz:locale\}/g;
> +const MOZ_PARAM_DIST_ID        = /\{moz:distributionID\}/g;
> +const MOZ_PARAM_OFFICIAL       = /\{moz:official\}/g;
> +
> +let runtime = Cc["@mozilla.org/xre/app-info;1"].getService(Ci.nsIXULRuntime);

Services.appinfo should do

@@ +17,5 @@
> +// Custom search parameters
> +const MOZ_OFFICIAL = runtime.isOfficialBranding ? "official" : "unofficial";
> +const MOZ_DISTRIBUTION_ID = runtime.distributionID;
> +
> +var google_client;

please just use "let" everywhere in this new test

@@ +33,5 @@
> +  google_client = "firefox-a";
> +  break;
> +}
> +
> +const GOOGLE_CLIENT = google_client;

XPCOMUtils.defineLazyGetter(this, "GOOGLE_CLIENT", () => {
  switch (services.appinfo.defaultUpdateChannel) {
    case "beta":
     return "firefox-beta";
    case...
  }
});

@@ +44,5 @@
> +  try {
> +    distributionID = Services.prefs.getCharPref(BROWSER_SEARCH_PREF + "distributionID");
> +  } catch (ex) {
> +    distributionID = MOZ_DISTRIBUTION_ID;
> +  }

I'd also move this to a global lazyGetter

@@ +59,5 @@
> +
> +  var gCurrTest;
> +  var gTests = [
> +    {
> +      name: "home page search",

test names should be personalized adding the search engine name

@@ +69,5 @@
> +        let previouslySelectedEngine = Services.search.currentEngine;
> +        Services.search.currentEngine = engine;
> +
> +        // load about:home, but remove the listener first so it doesn't
> +        // get in the way

all of the code waiting for load and the mutation could be factored out into an helper, it's the same for each engine...
That should quite simplify the test structure.

@@ +164,5 @@
> +          let doc = gBrowser.contentDocument;
> +          let mutationObserver = new MutationObserver(function (mutations) {
> +            for (let mutation of mutations) {
> +              if (mutation.attributeName == "searchEngineName") {
> +                // Re-add the listener, and perform a search

As I suggested, once we get the right attribute mutation we should executeSoon before doing anything, so we give the page one tick to handle the same mutation.

@@ +262,5 @@
> +      info("Running : " + gCurrTest.name);
> +      executeSoon(gCurrTest.run);
> +    } else {
> +      // Make sure we listen again for uncaught exceptions in the next test or cleanup.
> +      ignoreAllUncaughtExceptions(false);

there should be no need for this, indeed I think you should not unset it here, it's less safe for intermittent oranges. the test harness will take care of that.

::: browser/components/search/test/browser_amazon_behavior.js
@@ +62,2 @@
>      {
>        name: "new tab search",

why did you disable the newtab test?
Attachment #8511294 - Flags: feedback?(mak77) → feedback+
thanks for the ping- I had this buried in my queue- let me spend some time this morning on redoing the patch!
updated test with feedback, verified it locally in a few ways and it is up on try!
Attachment #8511294 - Attachment is obsolete: true
Attachment #8519008 - Flags: feedback?
Attachment #8519008 - Flags: feedback? → review?(mak77)
Comment on attachment 8519008 [details] [diff] [review]
move about:home search testing to own file (2.0)

Review of attachment 8519008 [details] [diff] [review]:
-----------------------------------------------------------------

I didn't see the try results, but I hope they are good!

::: browser/components/search/test/browser_abouthome_behavior.js
@@ +22,5 @@
> +  } catch (ex) {
> +    return Services.appinfo.distributionID;
> +  }
> +
> +});

nit: pointless newline

@@ +44,5 @@
> +
> +  function replaceUrl(base) {
> +    base = base.replace(MOZ_PARAM_LOCALE_REGEX, getLocale());
> +    base = base.replace(MOZ_PARAM_DIST_ID_REGEX, distributionID);
> +    base = base.replace(MOZ_PARAM_OFFICIAL_REGEX, MOZ_OFFICIAL);

return base.replace(...)
           .replace(...)
           .replace(...);
Attachment #8519008 - Flags: review?(mak77) → review+
I noticed on try  server I was failing, the reason was two parts:
1)     default:
      // In automation I get org.mozilla, locally I get firefox-a
      return "firefox-a";

2) I would get windows leaking from this test.  So we need to figure this out or land it disabled.  Since we are not ready to flip the switch (new tests have failed), then I can keep working on this.

open to any thoughts on this.
(In reply to Joel Maher (:jmaher) from comment #18)
> I noticed on try  server I was failing, the reason was two parts:
> 1)     default:
>       // In automation I get org.mozilla, locally I get firefox-a
>       return "firefox-a";

how does this differ from before? Why were the previous tests passing?

maybe the problem is that "GOOGLE_CLIENT" should use the distributionID lazyGetter, rather than directly return Services.appinfo.distributionID;

> 2) I would get windows leaking from this test.  So we need to figure this
> out or land it disabled.  Since we are not ready to flip the switch (new
> tests have failed), then I can keep working on this.

I'd be fine landing disabled with a follow-up to figure the leak. What is leaking?

FWIW, looks like you are not calling .disconnect() (IIRC that's the right name) on the mutation observer once you get the right mutation. Not sure if this is related or not.
I'd also try to executeSoon(finish) rather than finish()
taking review feedback into account;  I had to do more than I thought was carryover r+ worthy for the mutationObserver .disconnect(), so I am asking for another review.

Running locally this is great; I am testing on try for a few scenarios and if try is good and r+, this will land!
Attachment #8519008 - Attachment is obsolete: true
Attachment #8520704 - Flags: review?(mak77)
Comment on attachment 8520704 [details] [diff] [review]
move about:home search testing to own file (2.1)

Review of attachment 8520704 [details] [diff] [review]:
-----------------------------------------------------------------

I hope that's a green!

::: browser/components/search/test/browser_abouthome_behavior.js
@@ +48,5 @@
> +  }
> +
> +  let gMutation = null;
> +
> +  function cleanup_mutation() {

instead of this I'd do:

registerCleanupFunction(() => {
  if (gMutation)
    gMutation.disconnect();
});

...and see below...

@@ +77,5 @@
> +      tab.linkedBrowser.removeEventListener("load", load, true);
> +
> +      // Observe page setup
> +      let doc = gBrowser.contentDocument;
> +      let mutationObserver = new MutationObserver(function (mutations) {

...here I'd rather do
gMutation = new MutationObserver...

@@ +81,5 @@
> +      let mutationObserver = new MutationObserver(function (mutations) {
> +        for (let mutation of mutations) {
> +          if (mutation.attributeName == "searchEngineName") {
> +            // Re-add the listener, and perform a search
> +            gBrowser.addProgressListener(listener);

gMutation.disconnect();
gMutation = null;

then you can avoid all of the cleanup_mutation calls
Attachment #8520704 - Flags: review?(mak77) → review+
updated with review feedback, will wait for a few more test results before checkin-needed, but so far it is looking great!
Attachment #8520704 - Attachment is obsolete: true
Attachment #8520810 - Flags: review+
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/4e83df89ebb8
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Whiteboard: [fixed-in-fx-team]
Target Milestone: --- → Firefox 36
my most recent sanity check yields that this is leaking windows until shutdown on win7|8:
https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=f0cfa6ce5f87

Any thoughts on things I could try?  I am glad to see this working well on other platforms though.  Worse case scenario I could just not run this specific test on windows until we figure this out.
Status: RESOLVED → REOPENED
Flags: needinfo?(mak77)
Resolution: FIXED → ---
The only thing I recall is that indexedDB when a tab is closed takes some time to close and in the meanwhile it keeps the global window alive.
To cope with this in the past we made b-c tests run gc+cc twice before reporting leaks, with a timeout.
It's possible for whatever reason the timeout is not enough for this test that opens about:home so many times?

You could do a try push increasing the timeout to 3 seconds and see what happens:
http://mxr.mozilla.org/mozilla-central/source/testing/mochitest/browser-test.js#561
Flags: needinfo?(mak77)
hmm, this is still failing on try:
https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=a5d5cf77c0a0

Any other ideas?  Maybe I could reduce the scope of the searches we do?  We have 5 tests and are leaking 4 windows, so either we do something right in one of our searches, or the first one has enough time to properly close down.  Maybe we settimeout(finish, 5000) ?  I really don't like doing that, but it could be a solution.
I'd first test if increasing the timeout to a large value (let's say 10s) would still fail, in such a case it's likely there's a real leak...
It appears we probably have a leak:
https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=8d915f6709d7

If you look at the changes I did, you can see there are some longer delays there:
https://hg.mozilla.org/try/rev/8d915f6709d7

I am leaning towards disabling this test on windows for the short term, should I file a new bug for the leaking windows?
(In reply to Joel Maher (:jmaher) from comment #30)
> I am leaning towards disabling this test on windows for the short term,
> should I file a new bug for the leaking windows?

I think it's ok to disable on Windows and have a separate bug about the leak, the skip-if line in browser.ini should point to that filed bug.
Thanks for the tries.
Depends on: 1100301
taking the leaking windows over to bug 1100301
Status: REOPENED → RESOLVED
Closed: 10 years ago10 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: