Closed Bug 1499609 Opened 7 years ago Closed 7 years ago

Convert talos tabpaint and cpstartup to webextensions

Categories

(Testing :: Talos, defect)

Version 3
defect
Not set
normal

Tracking

(firefox65 fixed)

RESOLVED FIXED
mozilla65
Tracking Status
firefox65 --- fixed

People

(Reporter: aswan, Assigned: aswan)

References

Details

Attachments

(2 files, 2 obsolete files)

These two are similar enough that I think it makes sense to convert them at the same time.
mconley, I've got a wip patch here for tabpaint. In addition to the webextension conversion, it also moves the pages used in the test from chrome: to http: (we discussed this briefly on vidyo a few weeks ago). As part of this I moved the code that computes how long a page took to open from target.html into a frame script (since target.html is now unprivileged). The measurement code derives the load time from window.performance.timing.fetchStart and the paintTimeStamp in the MozAfterPaint event, so I would expect it to come up with the same result regardless of where it runs? But in limited local testing, I'm now getting values that are consistently much higher than without this patch, and the values for opening from parent versus content are about equal (as opposed to before when opening from the parent was consistently faster than opening from content). The higher values could be explained by a performance difference for loading from chrome: vs http:, but it doesn't make immediate sense to me that it would also eliminate any measurable difference between opening a new tab from the parent versus opening from content. I just realized there's a preference (dom.send_after_paint_to_content) that will let unprivileged content listen for MozAfterPaint, perhaps I'll give that a try tomorrow. But something seems fishy here and I can't figure out what it is. I'll keep digging at this but wondering if you have any advice or notice any obvious explanation for these results from the patch. For what its worth, making the target page be a chrome: url is easy enough, I can also try that and if that gets us to numbers that are comparable to the current (ie without this patch) numbers, I assume we'd want to use that approach?
Flags: needinfo?(mconley)
(In reply to Andrew Swan [:aswan] from comment #2) > For what its worth, making the target page be a chrome: url is easy enough, > I can also try that and if that gets us to numbers that are comparable to > the current (ie without this patch) numbers, I assume we'd want to use that > approach? Considering the timeline to remove bootstrap extensions from the codebase, I think we should try to avoid getting caught in a rathole here. I advise testing the chrome:// route, and pushing for that if the numbers are comparable, and then filing a new follow-up to transition us to HTTP (and do investigations on the difference there).
Flags: needinfo?(mconley)
I attached a revised patch that keeps target.html using a chrome: url. It still moves the test driver (tabpaint.html) to http since we want to avoid a race where pageloader tries to load that page before the test extension is fully loaded. The actual test and measurement process should be identical here to the existing (non-webextension) version, but I'm still getting different results (same as what I described in comment 2) in local tests. I'll give this a whirl on try but I'm really not sure what's going on...
Flags: needinfo?(mconley)
Attached patch minimalistic tabpaint conversion (obsolete) — Splinter Review
I've been banging my head on this most of today. Even the minimal patch from comment 6 results in significant changes to the measurements. At one point, I also added a setTimeout() to tabpaint.html to delay the initial TabPaint:Go message and, with the existing bootstrap extension that increased the from-parent measurements noticeably. Not directly related but also confusing me is that the test reports separate figures for opening tabs from the parent process and from content, but in talos we have a single "tabpaint" statistics that I guess is derived from those two measurements? I'm not sure where or how those figures are combined though.
Are you able to produce and post profiles for the original case, and the WebExtension case? You can produce profiles using --geckoProfile with ./mach talos-test You might also need to build symbols, and point ./mach talos-test at it with --symbolsPath in order to get symbolication. You can build symbols with ./mach build symbols. These usually get placed in your objdir under dist, and choose either of the crashreporter symbols zips. So: ./mach build symbols ./mach talos-test -a tabpaint --symbolsPath <path to symbols zip file> --geckoProfile Look through the log spew, and towards the end it should dump out the path to a ZIP file with the symbolicated profiles. You can then upload those to perf.html and share them, and I can take a look.
Flags: needinfo?(mconley) → needinfo?(aswan)
Okay, I think I see what's happening. So in the original bootstrap.js version, target.html is loading in the created content process, and we eventually get its paint event. Here's a single run of tabpaint, just focusing on the parent opening case: https://perfht.ml/2EtdO3D I've filtered out everything in the content process except the load of target.html, and the paint (that's the green splotch). So, ideally, the WebExtension version of tabpaint would do the same thing. Unfortunately, that doesn't appear to be the case: https://perfht.ml/2EqH9vs target.html here is being loaded in the parent process. And then it paints in the parent process. That load and paint is blocked in the parent while we do the machinery of actually switching the tab (which the content process never really needs to care about). So what happens if we make it so that the chrome:// target.html can be loaded in the content process? Does the difference go away?
Flags: needinfo?(aswan)
As discussed on IRC, making target.html load from chrome: but load in a content process is not possible with the programmatic chrome registration that we're now using (as opposed to chrome.manifest). I did another try run with my original patch which loads target.html via http: https://treeherder.mozilla.org/#/jobs?repo=try&revision=c1a1b1b0f0a50664fbbc86c93dfc56027f34235a&selectedJob=206435785 The corresponding tabpaint profile is here: https://perf-html.io/from-url/https%3A%2F%2Fqueue.taskcluster.net%2Fv1%2Ftask%2FMrMqjNh7TleJHeEKFBVGNA%2Fruns%2F0%2Fartifacts%2Fpublic%2Ftest_info%2Fprofile_tpaint.zip/calltree/?globalTrackOrder=&localTrackOrderByPid=&thread&v=3
Flags: needinfo?(aswan)
(In reply to Andrew Swan [:aswan] from comment #12) > The corresponding tabpaint profile is here: > https://perf-html.io/from-url/https%3A%2F%2Fqueue.taskcluster. > net%2Fv1%2Ftask%2FMrMqjNh7TleJHeEKFBVGNA%2Fruns%2F0%2Fartifacts%2Fpublic%2Fte > st_info%2Fprofile_tpaint.zip/calltree/ > ?globalTrackOrder=&localTrackOrderByPid=&thread&v=3 I think that's the tpaint profile. I think this is the tabpaint profile: https://perfht.ml/2yKrocM
What seems to be happening is a whole bunch of Activity Stream stuff is happening in the same content process as the tabpaint.html page, _before_ tabpaint.html can be rendered and fire its MozAfterPaint: https://perfht.ml/2yLcGlV (this is the "bad" case, with the WebExtension version) For some reason, that wasn't happening as early before - I think it was happening _after_ the parent open test finished, and just before the content open started: https://perfht.ml/2yILmEA (this is the current "good" case) So is it possible that the WebExtension APIs are being given lower priority, and so Activity Stream starts to load on its own? If so, this might just get fixed once I land bug 1472212, which puts Activity Stream into its own special content process. If you flip the browser.tabs.remote.separatePrivilegedContentProcess to true right now, does the regression go away?
Flags: needinfo?(aswan)
In a local run it appeared to improve the time for the open-from-parent-process case, but did not fully close the gap. But here's a try run with profiles for further analysis: https://treeherder.mozilla.org/#/jobs?repo=try&revision=bfe4e315f7fe7b546073e9abae0fb1efedeac407
Flags: needinfo?(mconley)
Flags: needinfo?(aswan)
Also, random observation: the idea that the test is sensitive to activity stream (or anything really) being busy at the moment it is executing is consistent with the note in comment 7 about using setTimeout() to add some delay between pageloader initially loading tabpaint.html and when the test starts running causing an *increase* in the from-parent measurements...
Looking at the new profile, it looks like we're not painting as soon as we were because we have to go through the HTTP networking stack to load the page, as opposed to getting it through the chrome registry / directly from the file system. That's probably an okay compromise here, for simplicity's sake. I don't mind that extra bit, now that we understand where it's coming from. What I'd recommend is that you modify the definition of the tabpaint test here: https://searchfox.org/mozilla-central/rev/9cb3e241502a2d47e2d5057ca771324a446b6695/testing/talos/talos/test.py#325 to disable Activity Stream preloading altogether, by setting browser.newtab.preload to false. That'll get rid of that wildcard that's probably adding noise to our measurements. With that, even if we're getting a bit of added overhead from the HTTP stuff, I think it's fine to re-baseline on that. Does that give you enough to move forward?
Flags: needinfo?(mconley) → needinfo?(aswan)
Attachment #9017953 - Attachment is obsolete: true
Attachment #9018025 - Attachment is obsolete: true
Flags: needinfo?(aswan)
Whoops didn't mean to clear the needinfo without responding but yes, I think I'm clear to move ahead. Many thanks!
Pushed by aswan@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/3d84f10a7e99 Convert tabpaint to webextension r=mconley https://hg.mozilla.org/integration/autoland/rev/e19cd8da255d Convert cpstartup to a webextension r=mconley
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla65
Depends on: 1501963
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: