Closed Bug 1246695 Opened 4 years ago Closed 4 years ago

Fix e10s TART newtab race condition on newtab change

Categories

(Testing :: Talos, defect, P1)

defect

Tracking

(e10s+, firefox46+ fixed, firefox47+ fixed)

RESOLVED FIXED
mozilla47
Iteration:
47.2 - Feb 22
Tracking Status
e10s + ---
firefox46 + fixed
firefox47 + fixed

People

(Reporter: oyiptong, Assigned: oyiptong)

References

(Blocks 1 open bug)

Details

(Keywords: regression)

Attachments

(1 file)

There is a race condition present in deaf7c6ca55b and before which causes frequent TART timeouts.

The behavior was fixed in bug 1218996, but introduced undesirable behavior for addon developers. A change in 1240169 reverted the behavior, but re-introduced the TART timeouts.
I propose a patch that fixes the race conditions that occur between the time TART sets a newtab URL and opens a newtab page.

There are still some other race conditions, but this should make the situation better than it was on and prior to deaf7c6ca55b, so we will be able to collect TART data points.
Assignee: nobody → oyiptong
Blocks: 1240169
Assignee: oyiptong → nobody
No longer blocks: 1240169
Sorry, I accidentally reverted some changes to this bug :/
Assignee: nobody → oyiptong
Blocks: 1240169
Olivier, thanks for filing this bug and pushing on a fix!
Blocks: 1230978
Comment on attachment 8717086 [details]
MozReview Request: Bug 1246695 - Fix e10s TART newtab race condition on newtab change r?mconley

https://reviewboard.mozilla.org/r/34041/#review31059

This looks fine - there are a few places where we've broken up single functions into many smaller functions that I don't think we really need to do. See below.

Thanks for looking into this, oyiptong!

::: testing/talos/talos/tests/tart/addon/content/tart.js:494
(Diff revision 1)
> -        function(){aboutNewTabService.newTabURL = "about:blank";
> -                   Services.prefs.setBoolPref("browser.newtabpage.enabled", true); // preview images if using about:newtab
> +        function(){Services.prefs.setBoolPref("browser.newtabpage.enabled", true); next();},
> +        function(){Services.prefs.setBoolPref("browser.newtab.preload", false); next();},

Is it really necessary to break these up like this into all of these functions? I suspect we can get the same behaviours with:

```JavaScript:

init: [
  function() {
    Services.prefs.setBoolPref("browser.newtabpage.enabled", true);
    Services.prefs.setBoolPref("browser.newtab.preload", false);
    self.pinTart();
    self.makeNewTabURLChangePromise("about:blank").then(next);
  }
]
```

I realize there are other items in this `subtests` array that break things up like that, but I'd rather we not transmute this single function into so many.

::: testing/talos/talos/tests/tart/addon/content/tart.js:502
(Diff revision 1)
> -        function(){aboutNewTabService.resetNewTabURL();
> -                   Services.prefs.setBoolPref("browser.newtabpage.enabled", origNewtabEnabled);
> -                   Services.prefs.setBoolPref("browser.newtab.preload", origPreload);
> -                   Services.prefs.setCharPref("layout.css.devPixelsPerPx", origDpi);
> -                   if (origPinned) self.pinTart(); else self.unpinTart();
> +        function(){Services.prefs.setBoolPref("browser.newtabpage.enabled", origNewtabEnabled); next();},
> +        function(){Services.prefs.setBoolPref("browser.newtab.preload", origPreload); next();},
> +        function(){Services.prefs.setCharPref("layout.css.devPixelsPerPx", origDpi); next();},
> +        function(){self.makeNewTabURLChangePromise("about:newtab").then(next);},
> +        function(){if (origPinned) self.pinTart(); else self.unpinTart(); next();},

Same as above - breaking each of these synchronous operations into individual functions doesn't seem necessary. The only one we really care about is the `self.makeNewTabURLChangePromise` one, which we can put at the end of a single function, calling `next` on resolve.

::: testing/talos/talos/tests/tart/addon/content/tart.js:539
(Diff revision 1)
> -        function(){aboutNewTabService.resetNewTabURL();
> -                   Services.prefs.setCharPref("layout.css.devPixelsPerPx", "-1");
> -                   Services.prefs.setBoolPref("browser.newtab.preload", false);
> +        function(){Services.prefs.setCharPref("layout.css.devPixelsPerPx", "-1"); next();},
> +        function(){Services.prefs.setBoolPref("browser.newtab.preload", false); next();},
> +        function(){self.makeNewTabURLChangePromise("about:newtab").then(next);},

Let's not split these up into separate functions, and instead keep the single function, calling next once the `makeNewTabURLChangePromise` resolves.

::: testing/talos/talos/tests/tart/addon/content/tart.js:548
(Diff revision 1)
> -        function(){aboutNewTabService.resetNewTabURL();
> -                   Services.prefs.setCharPref("layout.css.devPixelsPerPx", "-1");
> -                   Services.prefs.setBoolPref("browser.newtab.preload", true);
> +        function(){Services.prefs.setCharPref("layout.css.devPixelsPerPx", "-1"); next();},
> +        function(){Services.prefs.setBoolPref("browser.newtab.preload", true); next();},
> +        function(){self.makeNewTabURLChangePromise("about:newtab").then(next);},

Same as above.
Attachment #8717086 - Flags: review?(mconley) → review+
> Is it really necessary to break these up like this into all of these functions?

I don't mind one way or another, just wanted to make the code consistent with some other function chains.
(In reply to Olivier Yiptong [:oyiptong] from comment #8)
> > Is it really necessary to break these up like this into all of these functions?
> 
> I don't mind one way or another, just wanted to make the code consistent
> with some other function chains.

Unless accidentally, the approach should be that things which are synchronous by nature (such as changing a pref) and/or don't get measured (such as synchronous blocks of setups) don't have to be split into asynchronous chains.
Comment on attachment 8717086 [details]
MozReview Request: Bug 1246695 - Fix e10s TART newtab race condition on newtab change r?mconley

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/34041/diff/1-2/
https://hg.mozilla.org/mozilla-central/rev/dfce706d1c8f
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla47
Comment on attachment 8717086 [details]
MozReview Request: Bug 1246695 - Fix e10s TART newtab race condition on newtab change r?mconley

Approval Request Comment
[Feature/regressing bug #]: Fixes Talos TART timeouts
[User impact if declined]: We currently collect TART data unreliably. Our animation data collection/regression testing will get spotty if we don't uplift
[Describe test coverage new/current, TreeHerder]: https://treeherder.mozilla.org/#/jobs?repo=try&revision=00e1c17a72ce
[Risks and why]: The risk is that newer TART performance numbers may not be compatible with the current ones. This is because the older TART did no waiting on asynchronous events that is the root cause of the problems. So far, it doesn't look different: https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=08eb5c1cd197&newProject=try&newRevision=00e1c17a72ce&framework=1&showOnlyImportant=0
[String/UUID change made/needed]: N/A
Attachment #8717086 - Flags: approval-mozilla-aurora?
Tracking for 46+, sounds like this is important performance test support for e10s.
Comment on attachment 8717086 [details]
MozReview Request: Bug 1246695 - Fix e10s TART newtab race condition on newtab change r?mconley

OK to uplift to aurora. Sounds like this will fix a race condition maybe improving new tab performance and making our test results more reliable.
Attachment #8717086 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
I'll rebase on top of aurora and take a look
Flags: needinfo?(oyiptong)
I've found the root cause of the issue on aurora. It's because the TART patch relies on bug 1240169 and assumes it is applied. D'oh.

I have a fix to bug 1246695 fix that works both with or without bug 1240169 applied: https://hg.mozilla.org/try/rev/31c976cc2b15

Here are the results on treeherder:

Fix, Before bug 1240169: https://treeherder.mozilla.org/#/jobs?repo=try&revision=d275ff7d139b
Fix, With bug 1240169 applied: https://treeherder.mozilla.org/#/jobs?repo=try&revision=d120ac73ce16

If we want to apply bug 1246695 independently from bug 1240169, we will also need to apply the fix.

What's the best way to move forward?
Flags: needinfo?(lhenry)
Here is a build of bug 1240169 and bug 1246695 applied on top of the latest aurora: https://treeherder.mozilla.org/#/jobs?repo=try&revision=54202240a496
Let's uplift both then. Is there a particular order that will work best?
Flags: needinfo?(lhenry) → needinfo?(oyiptong)
Comment on attachment 8717086 [details]
MozReview Request: Bug 1246695 - Fix e10s TART newtab race condition on newtab change r?mconley

Should support better perf testing and a regression fix for 46, please uplift
bug 1240169 and bug 1246695 should ideally be pushed together
Flags: needinfo?(oyiptong)
You need to log in before you can comment on or make changes to this bug.