Closed Bug 1013650 (mozbench) Opened 6 years ago Closed 4 years ago

Browser/Game Benchmark Automation (mozbench)

Categories

(Testing :: General, defect, P3)

All
Other
defect

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: cpeterson, Unassigned)

References

(Blocks 1 open bug)

Details

(Keywords: meta)

>>Problem:
The Games and JS teams would like to run automated browser benchmarks on all release channels: Firefox vs. Chrome, and Firefox OS vs. Android.

>>Solution:
Alan Kligman (akligman) has written most of the automation framework (mozbench) for test running and reporting:

https://github.com/modeswitch/mozbench

Unfortunately, Alan has been reassigned to a different project. He has since been working with Joel Maher to bring him up to speed on mozbench. Alan estimates that there are "a couple weeks" of work left to finish the mozbench framework and then automate the test suites.

The primary test suites are those used by Tom's Hardware Guide's "Web Browser Grand Prix":

https://wiki.mozilla.org/Web_Browser_Grand_Prix

>>Mozilla Top Level Goal:
The goal of the mozbench project is to track performance improvements and regressions for the Firefox browser and Firefox OS compared to their competition (Chrome and Android, respectively).

>>Existing Bug:
No bug

>>Per-Commit:
The mozbench tests would be run either weekly or nightly. The goal is to track trends, not to bisect individual changesets.

>>Data other than Pass/Fail:
mozbench will report performance metrics and has its own data collection and reporting server owned by Kyle Lahnakoski (klahnakoski).

>>Prototype Date:
Flexible, but hopefully 1–2 months.

>>Production Date:
No hard deadline.

>>Most Valuable Piece:
Completing and standing up the mozbench framework. Automation of individual test suites can be done later.

>>Responsible Engineer:
akligman developed the framework, and Kamil Jozwiak (kjozwiak) will maintain the test hardware in Toronto.

>>Manager:
Project manager Chris Peterson (cpeterson) and JS engineering manager Naveed Ihsanullah (nihsanullah).

>>Other Teams/External Dependencies:
Martin Best (mbest) from the Games Initiative and Milan Sreckovic (msreckovic) from the GFx team are also interested in these tests.

>>Additional Info:
mozbench differs from Are We Fast Yet (AWFY) in that it runs the actual browser, not just the JS shell.
Summary: Brower Benchmark Automation (mozbench) → Browser Benchmark Automation (mozbench)
The current repo contains the skeleton of a new continuous-integration framework.  What's the motivation for creating this versus using an off-the-shelf solution like Jenkins?
Alan: can you please share your design rationale with Jonathan?

My understanding was that the test framework needed to be mobile-friendly for automation on device benchmarks.
Flags: needinfo?(akligman)
* small, lightweight (JS with only pure-JS deps)
* portable wherever Node is (covers Android & FxOS)
* plugs into our existing reporting system
* more scalable than perfy

Not sure which of those are covered by Jenkins.
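(Editorial illustration, not part of the original comment: none of the names below are mozbench's actual API — `runBenchmark`, the record fields, and the "noop" benchmark are invented to sketch what a small, pure-JS, dependency-free harness of the kind Alan describes might look like.)

```javascript
// Hypothetical sketch of a minimal pure-JS benchmark harness:
// run a named benchmark N times, collect timings, and build a
// JSON-serializable result record for a reporting server.
function runBenchmark(name, fn, iterations) {
  const timings = [];
  for (let i = 0; i < iterations; i++) {
    const start = Date.now();
    fn();                                  // the benchmark workload
    timings.push(Date.now() - start);      // elapsed ms for this run
  }
  const mean = timings.reduce((a, b) => a + b, 0) / timings.length;
  // This record is what would be POSTed to the reporting system.
  return { benchmark: name, iterations, mean_ms: mean, raw_ms: timings };
}

const result = runBenchmark("noop", () => {}, 5);
console.log(JSON.stringify(result));
```

Because it uses only language built-ins, a harness shaped like this runs anywhere Node does, which is the portability property claimed above for Android and FxOS.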
Flags: needinfo?(akligman)
(In reply to Alan K [:ack] from comment #3)
> * small, lightweight (JS with only pure-JS deps)
> * portable wherever Node is (covers Android & FxOS)
> * plugs into our existing reporting system
> * more scalable than perfy
> 
> Not sure which of those are covered by Jenkins.

What happened to the existing automation system we developed last summer, which we already had working on Android and Firefox OS for this exact effort? Beyond my general aversion to re-inventing wheels, I want to understand why it was evidently deemed insufficient; that might be important in deciding on the proper approach here, so that we don't repeat the same mistakes.
I think it might be helpful to better understand the ultimate goal, as well:

* What platforms do we want to run the tests on?
* Do we have existing hardware for this or will we need to order more?
* Where do the tests report results, and what interface is used for that?
(In reply to Jonathan Griffin (:jgriffin) from comment #5)
> I think it might be helpful to better understand the ultimate goal, as well:
> 
> * What platforms do we want to run the tests on?

Android, Firefox OS, Linux, OS X, and Windows. I believe the Android, Linux, and Windows tests are already running on an older version of Alan's test framework.

> * Do we have existing hardware for this or will we need to order more?

We have existing hardware (desktop machines and Android devices in Toronto) running an older version of Alan's test framework.

> * Where do the tests report results, and what interface is used for that?

The framework posts (JSON?) test results to a reporting database server maintained by Kyle Lahnakoski.
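(Editorial illustration, not part of the original comment: the bug doesn't document the result schema — the question mark after "JSON" above is the reporter's own uncertainty — so every field name below is a guess, shown only to make the idea of a posted result record concrete.)

```json
{
  "benchmark": "webaudio-demo",
  "browser": "firefox-nightly",
  "platform": "windows",
  "timestamp": "2014-05-21T12:00:00Z",
  "mean_ms": 42.3,
  "raw_ms": [41.9, 42.8, 42.2]
}
```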
Thanks.  Do we not care to run these tests on B2G?

For the desktop platforms, do we care which flavors we use?

For Android, do we want to test against a variety of hardware, or only one or two reference devices?

I agree with Clint that I'd prefer to stand this up using existing tools if possible, rather than developing a custom CI/runner.  I'd like to assign an engineer to examine the currently running tests, identify pain points, and come up with a least-cost solution to getting these running long-term.
Jonathan: sorry for the late reply.

(In reply to Jonathan Griffin (:jgriffin) from comment #7)
> Thanks.  Do we not care to run these tests on B2G?
> 
> For the desktop platforms, do we care which flavors we use?

We have three dimensions of testing: browsers, operating systems, and benchmarks. After talking with ack, we think the best place to start would be comparing Firefox Nightly and Chrome Canary running a webaudio benchmark on a recent version of Windows. ack has a particular webaudio benchmark in mind.


Operating system priorities:

1. Latest version of Windows
2. Compare current version of B2G to previous B2G releases
3. Compare current version of B2G to Android (Fennec and Chrome) on the same hardware
4. Latest version of OS X
5. Linux
6. Android
7. Older versions of Windows like 7 or XP?

Browser priorities:

1. Compare Firefox Nightly vs Chrome Canary
2. Compare Firefox's Nightly, Aurora, Beta, and Release channels
3. Compare Chrome's Canary, Beta, and Release channels

Benchmark priorities:

1. webaudio (waiting for link from ack)
2. WebGL and canvas
3. TBD: Other random benchmarks like azakai's MASSIVE (for asm.js) or Browsermark and Peacekeeper (for Tom's Hardware)


> For Android, do we want to test against a variety of hardware, or only one
> or two reference devices?

Only one or two reference Android devices would be needed, but Android is a lower priority platform than desktop.


> I agree with Clint in that I'd prefer to stand this up using existing tools
> if possible, rather than develop a custom CI/runner.  I'd like to assign an
> engineer to examine the currently running tests, identify pain points, and
> come up with a least-cost solution to getting these running long-term.

Yes, it makes sense to use the team's standard tools.

What are the next steps forward? Mark Cote scheduled a "Planning for games benchmarking" meeting for next week. What information should I bring to that meeting or share before then?
Blocks: WBGP
Flags: needinfo?(jgriffin)
Alan's performance results were sent to an ES cluster.  A small set of dashboards used the cluster to compare the various versions of FF and Google Chrome: http://people.mozilla.org/~klahnakoski/perfy/Perfy-Overview.html#
(In reply to Kyle Lahnakoski [:ekyle] from comment #9)
> Alan's performance results were sent to an ES cluster.  A small set of
> dashboards used the cluster to compare the various versions of FF and Google
> Chrome: http://people.mozilla.org/~klahnakoski/perfy/Perfy-Overview.html#

Ah, thanks, that was my last question.  :)
Flags: needinfo?(jgriffin)
Depends on: 974464
Keywords: meta
Assignee: jgriffin → dminor
Status: NEW → ASSIGNED
Depends on: 1039637
Depends on: 1050880
Depends on: 1050645
Alias: mozbench
Summary: Browser Benchmark Automation (mozbench) → Browser/Game Benchmark Automation (mozbench)
Depends on: 1051798
Depends on: 1056094
Depends on: 1066056
Depends on: 1066657
Depends on: 1066665
Depends on: 1067403
Depends on: 970432
Depends on: 1079438
Depends on: 1101551
Depends on: 1103035
Depends on: 1103060
Depends on: 1103062
Depends on: 1103063
Depends on: 1103064
Depends on: 1103134
Depends on: 1103995
Depends on: 1110270
Depends on: 1113227
Depends on: 1113611
Depends on: 1128908
Priority: -- → P3
I'm in maintenance mode for mozbench, so I'm unassigning myself in case others are interested in picking up the remaining open bugs.
Assignee: dminor → nobody
Status: ASSIGNED → NEW
We're replacing mozbench with arewefastyet.
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → WONTFIX