Closed Bug 1392123 Opened 7 years ago Closed 7 years ago

AWFY - Run speedometer test for stylo

Categories

(Testing Graveyard :: AWFY, enhancement)

Version 3
enhancement
Not set
normal

Tracking

(Performance Impact:?)

RESOLVED FIXED
Performance Impact ?

People

(Reporter: bholley, Assigned: bc)

References

(Blocks 2 open bugs)

Details

(Whiteboard: [PI:August][stylo])

Attachments

(4 files)

> Please go ahead and file a bug in Testing:General and assign to me.

> Which machines? Just the quantum reference machine? Or Win8 and OSX too?

On all of them if it's easy enough.

> Which repos? inbound? beta?

central for sure, try also. autoland if it's cheap enough, but we could live without it. inbound isn't necessary because we land stylo patches to autoland rather than inbound.

> Frequency the same as other inbound, beta schedules?

Not sure what frequencies those are. At minimum we'd want every push to central. If it's easier to add more granularity, autoland would be the place to do it.
jryans, what are the currently-recommended environmental variables?

STYLO_FORCE_ENABLED=1 STYLO_THREADS=4  ?
Flags: needinfo?(jryans)
Whiteboard: [PI:August]
(In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #1)
> jryans, what are the currently-recommended environmental variables?
> 
> STYLO_FORCE_ENABLED=1 STYLO_THREADS=4  ?

Yes, you'll want STYLO_FORCE_ENABLED=1 to enable Stylo at all. 

Most test environments also use STYLO_THREADS=4 to force a known number of threads in parallel mode, but Talos does not (it leaves it unset so it would vary according to the machine).  For sequential mode, everyone including Talos sets STYLO_THREADS=1 as expected.

So, if you want to match most test harnesses in parallel mode, then STYLO_FORCE_ENABLED=1 STYLO_THREADS=4 is what you want.
Flags: needinfo?(jryans)
(In reply to J. Ryan Stinnett [:jryans] (use ni?) from comment #2)
> Most test environments also use STYLO_THREADS=4 to force a known number of
> threads in parallel mode, but Talos does not (it leaves it unset so it would
> vary according to the machine).  For sequential mode, everyone including
> Talos sets STYLO_THREADS=1 as expected.

Seems like we should test Stylo's default thread configuration to make sure it scales as expected for each of our test machines.
Summary: Run speedometer test for stylo → AWFY - Run speedometer test for stylo
For the purposes of the QF project, we'd like to be able to track this sooner than later if possible.
Whiteboard: [PI:August] → [PI:August][qf]
Yes, we really need to get this up ASAP.
fyi, I started looking at this yesterday and it is my focus for today.
(In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #0)
> 
> > Which repos? inbound? beta?
> 
> central for sure, try also. autoland if it's cheap enough, but we could live
> without it. inbound isn't necessary because we land stylo patches to
> autoland rather than inbound.

We currently *do not test* mozilla-central or autoland. The tests are only defined for mozilla-inbound or mozilla-beta as of now.

> 
> > Frequency the same as other inbound, beta schedules?
> 
> Not sure what frequencies those are. At minimum we'd want every push to
> central. If it's easier to add more granularity, autoland would be the place
> to do it.

The frequency is determined by a delay period in between test runs. Every hour, 4 hours, 12 hours, etc.

I am going to do mozilla-inbound and mozilla-beta for this bug. Filed Bug 1393068 to discuss the overall repo situation.
This adds stylo to the possible configs. I'll attach the proposed sql statements in a moment.
Attachment #8900348 - Flags: review?(bbouvier)
Comment on attachment 8900348 [details] [diff] [review]
https://github.com/mozilla/arewefastyet/pull/165

Review of attachment 8900348 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good to me, thanks!
Attachment #8900348 - Attachment is patch: true
Attachment #8900348 - Attachment mime type: text/x-github-pull-request → text/plain
Attachment #8900348 - Flags: review?(bbouvier) → review+
Attached file bug-1392123.sql
I think this covers it. It includes a change to the beta vendor to get rid of the extra parens around Beta and to sync the rangeURL to match the same pattern as the other repos.
Attachment #8900349 - Flags: review?(bbouvier)
Attachment #8900349 - Flags: review?(armenzg)
Attachment #8900349 - Attachment mime type: application/sql → text/plain
Comment on attachment 8900349 [details]
bug-1392123.sql

I'm really not that familiar with the tables so I can't really add much there.

What you're doing looks about right.
Attachment #8900349 - Flags: review?(armenzg) → review+
Comment on attachment 8900349 [details]
bug-1392123.sql

For Firefox Stylo mode: it I think vendor_id = 2 is correct, but I see other modes also use vendor_id = 5. In the vendor table, the lines seem to perfectly match, so maybe one is just a duplicate and could be removed (that could be done as a follow up issue).

I think we need to change all the --submitter-mode arguments from "firefox,default:firefox_stylo" to "firefox,stylo:firefox_stylo". My understanding from a quick glance at the code is that the mapping means [ENGINE],[CONFIG]:[MODE_NAME_ID] where ENGINE is among firefox, chrome, etc.; CONFIG is the config's name (as in config.py); MODE_NAME_ID is the mode name identifier (`name` column in `awfy_mode`).

To check my understanding:
- Speedometer2 will be run on Firefox Beta on the Windows (non QF reference) machine every 12 hours, on 32-bits and 64-bits.
- it will also be run on inbound builds on the QF ref machine every hour, on 32-bits and 64-bits.

If that sounds right, r=me. Thanks!
Attachment #8900349 - Flags: review?(bbouvier) → review+
Attached file bug-1392123-v2.sql
carrying forward r+ with changes.

(In reply to Benjamin Bouvier [:bbouvier] from comment #12)
> Comment on attachment 8900349 [details]
> bug-1392123.sql
> 
> For Firefox Stylo mode: it I think vendor_id = 2 is correct, but I see other
> modes also use vendor_id = 5. In the vendor table, the lines seem to
> perfectly match, so maybe one is just a duplicate and could be removed (that
> could be done as a follow up issue).
> 

awfy_vendor.id = 5 appears to be the more common choice. I'll go ahead and change it to use that. I agree that we can clean up later.

> I think we need to change all the --submitter-mode arguments from
> "firefox,default:firefox_stylo" to "firefox,stylo:firefox_stylo". My
> understanding from a quick glance at the code is that the mapping means
> [ENGINE],[CONFIG]:[MODE_NAME_ID] where ENGINE is among firefox, chrome,
> etc.; CONFIG is the config's name (as in config.py); MODE_NAME_ID is the
> mode name identifier (`name` column in `awfy_mode`).
> 

Thanks.

> To check my understanding:
> - Speedometer2 will be run on Firefox Beta on the Windows (non QF reference)
> machine every 12 hours, on 32-bits and 64-bits.
> - it will also be run on inbound builds on the QF ref machine every hour, on
> 32-bits and 64-bits.
> 
> If that sounds right, r=me. Thanks!

Alrighty then. I'll deploy in a little bit.
Attachment #8900768 - Flags: review+
https://github.com/mozilla/arewefastyet/commit/3435238d05ea26e5a8b45d87737cb0f16b89293e
database updated.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Thanks a lot Bob!
So looking at https://arewefastyet.com/#machine=17&view=breakdown&suite=speedometer-misc and https://arewefastyet.com/#machine=35&view=breakdown&suite=speedometer-misc I see that Stylo results are marked as "Beta".  That doesn't mean these are run on Beta builds, does it?

Also, should these be running on the reference hardware as well?
Flags: needinfo?(bob)
Also, do these tests run on PGO builds or normal inbound builds? (I'm assuming "Beta" in the name is a bug...) :)
"Beta" in the name isn't a bug. As stated comment 12, the Stylo runs are configured as follows:

1. Speedometer2 will be run on Firefox Beta on the Windows (non QF reference) machine every 12 hours, on 32-bits and 64-bits.
2. it will also be run on inbound builds on the QF ref machine every hour, on 32-bits and 64-bits.

1. is what you saw yesterday; runs for 2. have started in the meanwhile on the QF machines.

Of course, the configurations can be changed. Having runs on Firefox Beta/Stylo + inbound/Stylo on different machines seemed like a nice tradeoff to not overload a single machine.
I think we are at the limit of what we can handle for "frequent" jobs running every hour. I'll make both run on Win 8 and the QR machine and adjust the timing to every two hours.

:kmag, you had problems with the colors? Which ones are a problem and how can I change them to make them better?
Status: RESOLVED → REOPENED
Flags: needinfo?(bob) → needinfo?(kmaglione+bmo)
Resolution: FIXED → ---
Attachment #8900768 - Attachment mime type: application/sql → text/plain
Attached file bug-1392123-v3.sql
Ben,

This changes the repo directory names for the tasks I added previously to include the bits and changes the frequency for the mozilla-inbound tasks to once every two hours.

It also adds:

Firefox Stylo mozilla-inbound 32bit and 64bit to the Win8 machine also scheduled for every two hours

Firefox Beta Stylo 32bit and 64bit to the Win10 Quantum machine scheduled for twice per day.

The query

select * from control_tasks, control_unit, awfy_machine where control_tasks.control_unit_id = control_unit.id and control_tasks.machine_id = awfy_machine.id and control_tasks.mode_id = 0 ORDER BY `control_tasks`.`id` ASC 

appears to be ok.
Attachment #8901237 - Flags: review?(bbouvier)
Comment on attachment 8901237 [details]
bug-1392123-v3.sql

Looks good to me, thanks.
Attachment #8901237 - Flags: review?(bbouvier) → review+
Deployed. Leaving open until the tests run.
(In reply to Bob Clary [:bc:] from comment #19)
> :kmag, you had problems with the colors? Which ones are a problem and how
> can I change them to make them better?

I originally had a problem with the Firefox vs. Firefox Stylo, but now that they have lines rather than just circles, I find them much easier to tell apart.

That said, in general it helps a lot if graphs that use similar colors for some series use different shapes for the points in those series.
Flags: needinfo?(kmaglione+bmo)
Yeah we should just turn off the beta graph. It's not really useful and confusing at first glance.
Whiteboard: [PI:August][qf] → [PI:August][qf][stylo]
This has been deployed and I'm calling it fixed. However with the upcoming changes to running Stylo by default, this will have to be removed in bug 1394489.
Status: REOPENED → RESOLVED
Closed: 7 years ago7 years ago
Resolution: --- → FIXED
Component: General → AWFY
Product: Testing → Testing Graveyard
Performance Impact: --- → ?
Whiteboard: [PI:August][qf][stylo] → [PI:August][stylo]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: