Closed Bug 1040054 Opened 10 years ago Closed 7 years ago

[new test] consider running bitcoinbench in talos

Categories

(Testing :: Talos, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: jmaher, Unassigned)

References

Details

as seen in bug 969391, bitcoinbench is a great way to run a more modern javascript benchmark:
https://www.bitgo.com/bitcoinbenchmark/

heck, maybe we could even generate money as we run this benchmark over thousands of platforms/checkins :)

Seriously though, our dromaeo, kraken, and v8 benchmarks serve a purpose, but are they becoming obsolete?  Should we add more or replace them?
Joel, Vladan,

Do we have some strategy for adding talos tests? There are a _lot_ of tests we could add, but do we try to follow some regime here, e.g. to cover specific areas or needs?

So far we've added tests on a hunch, of sorts - "this is important to have", "we'd like to have this", etc. but adding tests takes time, and talos runtime is (so I'm told) expensive, so we can't just add as many tests as we'd like.

Do you want to meet and discuss this?

I think it'd be nice to have a strategy where if we bump into a new test, we could check if it fits this strategy or not, and act accordingly.
Flags: needinfo?(vdjeric)
Flags: needinfo?(jmaher)
We need a process for new tests; I figure filing bugs as a reminder of what is out there is a good way to keep things around.  I am open to a process, maybe we can discuss that in our bi-weekly perftesting meeting.
Flags: needinfo?(jmaher)
To give us something to think about, I think the top-level strategy should be: "to have good coverage of stuff we care about".

We should then list what we care about, check which areas are covered and which are not, prioritize those that lack good coverage, and then try to have at least one, possibly two, tests covering each subject.

Another question I'm not sure of is "how much do we mind that the test comes from an external source?".

(In reply to Joel Maher (:jmaher) from comment #2)
> we need a process for new tests, I figure filing bugs as a reminder of what
> is out there is a good way to keep things around.  I am open to a process,
> maybe we can discuss that in our bi-weekly perftesting meeting.

In that case, can we file a meta bug for new test suggestions? Off the top of my head I can recall the web responsiveness benchmark, the SVG (and dhtml/flash/canvas) animation benchmark, and now this bitcoinbench.
A meta bug would be a good place; it gives anyone a way to suggest a benchmark :)
Blocks: 1040081
Thanks a lot for working on this!

It'd be good to discuss this with the JS team as well though. Many benchmarks are crap or just not interesting enough to run on every checkin. We'd be happy to help you guys find some good benchmarks.
(I'm not saying bitcoinbench is bad of course, just that it'd be good to investigate this.)
(In reply to Avi Halachmi (:avih) from comment #1)
> Do we have some strategy for adding talos tests? There are a _lot_ of tests
> we could add, but do we try to follow some regime here? to cover specific
> areas or needs, etc?

As discussed during our meeting today, we don't have a well-defined strategy for adding tests. When we work on improving some aspect of Firefox performance, we usually create a new test if one doesn't already exist.

We certainly don't want to add a test when we are not confident it will be valuable. Maintaining tests and resolving regression alerts takes time so we need tests that are 1) reliable (sensitive but not noisy), and 2) that are meaningful, i.e. measure aspects of performance we really care about. We also have to take into account test machine time and whether teams have sufficient resources to fix the regressions the test will find. 

We should have a blog post or wiki about the current state of Firefox testing and outline a policy we can refer to when deciding on new tests.

We can certainly discuss this during the Performance Testing meeting on Friday.
Flags: needinfo?(vdjeric)
Oops, I meant the performance testing meeting this Wednesday.
We also add new tests to match industry benchmarks.
Closing out old bugs that haven't been a priority.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → WONTFIX