Closed Bug 1386806 Opened 7 years ago Closed 7 years ago

SETA data for new win32/win64 stylo tests

Categories

(Tree Management Graveyard :: Treeherder: SETA, enhancement)

enhancement
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: kmoir, Assigned: jryans)

References

Details

(Whiteboard: [stylo])

Attachments

(2 files, 1 obsolete file)

In bug 1386405, Armen had seta updated to reflect the data for macosx stylo tests.  In bug 1385027, the same tests are going to be enabled for win32/win64.  We should ensure that seta is ready for these tests so we don't run into the same scenario with pending counts.  Of course, with windows, many of the tests run on AWS but we still don't want to spend money when the results of the tests are not useful, especially on high volume branches like autoland and m-i.
jmaher armenzg: Did we make a decision going forward on whether we would pre-seed seta with new tests before they are enabled?  In bug 1385027 they want to enable tests on Windows that run in parallel to the existing tests but with a stylo flag enabled, which will add significantly to load.  It should be able to autoscale better than macosx tc, as many tests run on tc and AWS, but it is still worth reducing unneeded test runs.
Flags: needinfo?(jmaher)
Flags: needinfo?(armenzg)
Whiteboard: [stylo]
I think we should preseed on seta, here is the file:
https://github.com/mozilla/treeherder/blob/master/treeherder/seta/preseed.json

since this is inside of treeherder, it will take a bit to review/deploy, so possibly we land what is ready now and be ready with preseed inside of treeherder.  It will also help to see the jobs on treeherder so we can preseed them properly :)

do we have a list of jobs that we are adding for this?
Flags: needinfo?(jmaher)
Flags: needinfo?(armenzg)
Attached file List of new Windows Stylo tasks (obsolete) —
Here's a list of the new tasks when the patches from bug 1385027 are applied.

Is that what you are looking for?
Flags: needinfo?(jmaher)
thanks!  one thing to adjust here is we turned off many of those tests which are non-e10s earlier this week.  In addition win10 doesn't run reftest reliably yet, likewise a few other suites.
Flags: needinfo?(jmaher)
(In reply to Joel Maher ( :jmaher) (UTC-9) from comment #4)
> thanks!  one thing to adjust here is we turned off many of those tests which
> are non-e10s earlier this week.  In addition win10 doesn't run reftest
> reliably yet, likewise a few other suites.

About non-e10s, I just hadn't rebased past the change that removed them.

For reftest on Windows 10, most runs have been okay, except debug reftest-stylo, so I've disabled those for now.

Is there a general bug for reftest issues on Windows 10?

Updated task list attached.
Attachment #8893911 - Attachment is obsolete: true
Flags: needinfo?(jmaher)
that tasklist looks much better. I am sure there are small details I am overlooking, but for now let's assume that is enough work to get turned on and all greened up!

there is no bug for the reftest failures we had been seeing on win10 in general.  We are making a larger push to get all win8 converted to win10 (as much as possible in AWS on a VM, with any perma-failing test cases in a separate job on hardware).  I have a couple of try pushes today looking into this; I think in the next 2 weeks we will have almost all tests running, so don't change anything based on my comments about reftests.

I am looking forward to this- I expect the preseed.json work to be resolved and deployed by Tuesday at the latest.
Flags: needinfo?(jmaher)
Where are we with this work?  Still on track to deploy some time today?
Flags: needinfo?(jmaher)
I think you can go ahead and deploy- treeherder will be updated in a few hours- so if you want to deploy now that is ok or whenever you have time
Flags: needinfo?(jmaher)
Spoke to :jmaher on IRC, we need to add a line for windows7-32-stylo as well.  I'm going to make a PR.
This was deployed last night.
Assignee: nobody → jryans
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
This has not worked as expected.
Due to bug 1368982, the 'analyze failures' process repeatedly failed to complete and therefore did not update the JobPriority table.
Eventually, overnight, the JobPriority table got updated. This inserted a job priority per TH job.

The way we've used preseed.json does not work with "*".
I will be requesting the job priorities for these jobs to be 5 instead of 1.


------------------------------------------------------------
Here are my notes for what I was doing yesterday.
I can see the wildcard entries in the table:
https://sql.telemetry.mozilla.org/queries/14771/source#table
> id      testtype  buildtype  platform            priority  expiration_date   buildsystem
> 13,082  *         *          windows7-32-stylo   5         2017-09-01 00:00  *
> 13,083  *         *          windows10-64-stylo  5         2017-09-01 00:00  *
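
For readers unfamiliar with the wildcard rows, here is a minimal sketch of how a "*" entry could be expanded into one concrete priority row per matching job. This is not Treeherder's actual code: the function names `matches` and `expand`, the dictionary layout, and the sample job `reftest-stylo-e10s` are assumptions made for illustration, with field names borrowed from the column headers quoted above.

```python
from typing import Any, Dict, List

# Field names borrowed from the column headers quoted above (an assumption;
# the real JobPriority model may differ).
FIELDS = ("testtype", "buildtype", "platform", "buildsystem")


def matches(preseed: Dict[str, Any], job: Dict[str, Any]) -> bool:
    """True if every non-wildcard preseed field equals the job's field."""
    return all(preseed[f] == "*" or preseed[f] == job[f] for f in FIELDS)


def expand(preseed: Dict[str, Any], jobs: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return one concrete row per matching job, carrying the preseed priority."""
    return [
        {**job, "priority": preseed["priority"]}
        for job in jobs
        if matches(preseed, job)
    ]


# Hypothetical wildcard entry mirroring row 13,082 above.
preseed_entry = {
    "testtype": "*", "buildtype": "*", "platform": "windows7-32-stylo",
    "buildsystem": "*", "priority": 5,
}

# Illustrative runnable jobs; the first job name is made up for this example.
runnable = [
    {"testtype": "reftest-stylo-e10s", "buildtype": "opt",
     "platform": "windows7-32-stylo", "buildsystem": "taskcluster"},
    {"testtype": "mochitest-style-e10s", "buildtype": "debug",
     "platform": "macosx64-stylo", "buildsystem": "taskcluster"},
]

expanded = expand(preseed_entry, runnable)
```

Only the Windows job is matched here, and it inherits priority 5 from the wildcard entry.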

The jobs are running on every push:
https://treeherder.mozilla.org/#/jobs?repo=autoland&filter-searchStr=win%20reft%20stylo&group_state=expanded

The stylo jobs do show up in here (which only lists known Treeherder jobs):
https://treeherder.mozilla.org/api/project/autoland/seta/job-types/

The stylo jobs do *not* show up in here:
https://treeherder.mozilla.org/api/project/autoland/seta/job-priorities/?build_system_type=taskcluster&format=json
even though they do show up in runnable_jobs:
https://treeherder.mozilla.org/api/project/autoland/runnable_jobs/

I set Treeherder up and I believe I did the following:

vagrant up
vagrant ssh (3 different tabs)

1st tab
-------
./manage.py runserver

2nd tab
-------
yarn install --no-bin-links
yarn start:local

3rd tab
-------
./manage.py load_preseed
./manage.py update_runnable_jobs (wait a few minutes)
./manage.py update_job_priority_table

On your browser load the following:
http://localhost:8000/api/project/autoland/runnable_jobs/?format=json

On your browser open the following link and click "stop" when Firefox tells you it is taking forever; check the output of runserver to know when.
http://localhost:8000/api/project/autoland/seta/job-priorities/?build_system_type=taskcluster&format=json&priority=5

You will see tons of "test-windows7-32-stylo/*" jobs listed.
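
If you prefer to build the job-priorities URLs from a script rather than pasting them, here is a small sketch. The helper name `job_priorities_url` is made up for this example; the path and query parameters are the ones used in the links above.

```python
from urllib.parse import urlencode


def job_priorities_url(base: str, repo: str, **params: object) -> str:
    """Build a SETA job-priorities URL; parameters are sorted for stable output."""
    query = urlencode(sorted(params.items()))
    return f"{base}/api/project/{repo}/seta/job-priorities/?{query}"


# Reproduces the local URL from the steps above.
url = job_priorities_url(
    "http://localhost:8000",
    "autoland",
    build_system_type="taskcluster",
    format="json",
    priority=5,
)
```

Swapping the base for https://treeherder.mozilla.org gives the production form of the same query.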
Blocks: 1389118
See bug 1389118 on bumping the priority to every 5th push.
After 2 weeks, the expiration date (grace period) will clear up.
Analyze failures will then consider the jobs and bump them to priority 1 if they catch a regression that *only* that specific job would have caught.
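
To make the priority and grace-period semantics concrete, here is a rough sketch under the assumptions stated in this bug: priority 1 means the job runs on every push, priority 5 means it runs on every 5th push, and a job with an unexpired expiration date is still in its grace period. The function names are hypothetical and the real SETA scheduling logic may take other factors into account.

```python
from datetime import datetime
from typing import Optional


def in_grace_period(expiration_date: Optional[datetime], now: datetime) -> bool:
    """A job is in its grace period while it has an expiration date in the future."""
    return expiration_date is not None and now < expiration_date


def should_schedule(priority: int, push_number: int) -> bool:
    """Assumed rule: a job with priority N runs on every Nth push."""
    if priority < 1:
        raise ValueError("priority must be a positive integer")
    return push_number % priority == 0


# With priority 5, four pushes are skipped between scheduled runs.
scheduled = [n for n in range(1, 11) if should_schedule(5, n)]

# The preseeded rows above expire on 2017-09-01.
still_preseeded = in_grace_period(datetime(2017, 9, 1), datetime(2017, 8, 15))
```

Under these assumptions `scheduled` contains pushes 5 and 10 only, and a priority 1 job would be scheduled on all ten.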
I spoke too soon.
I had to remove the expiration dates as well.

I can now see the jobs in here:
https://treeherder.mozilla.org/api/project/autoland/seta/job-priorities/?build_system_type=taskcluster&priority=5

I still see some oddities on SETA which we will investigate in bug 1389524.
> WARNING [treeherder.seta.job_priorities:54] Job priority (taskcluster,mochitest-style-e10s,debug,macosx64-stylo) not found in accepted jobs list

Also, is there a reason we make linux64-stylo run on every push?
I will leave this open until I fix the last few stylo jobs in the dep bug.
No longer blocks: 1389118
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Summary: seta data for new win32/win64 stylo tests → SETA data for new win32/win64 stylo tests
Depends on: 1389524, 1389118
As expected, the stylo jobs are now skipped 4 pushes at a time (except mochitests; see bug 1389524).
https://cl.ly/411L0z0k342N
I fixed the Windows mochitest stylo jobs.
Status: REOPENED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Product: Tree Management → Tree Management Graveyard