Open Bug 1500512 Opened 2 years ago Updated 2 months ago

Cleanup stylo handling in AWSY

Categories

(Testing :: AWSY, enhancement, P3)

enhancement

Tracking

(Not tracked)

People

(Reporter: erahm, Unassigned)

References

Details

Currently we force enable stylo [1] during AWSY testing, which is a bit overboard since it's always enabled now. I think we could get rid of force enabling without an issue.

That leaves us with a small problem: we setup awsy to do test runs with stylo enabled back when stylo wasn't prefed on. To differentiate the results from non-stylo enabled we annotated with "stylo", as in "JS opt stylo". We've long since enabled stylo by default, so having this annotation seems kind of excessive. I'm inclined to remove it *but* that would make my data disjoint. That leaves us with a couple options:

#1 - Just keep annotating with 'stylo'
#2 - Remove the annotation and accept a disjoint data set
#3 - The perfherder folks might be able to help us massage the data so we have continuity

Additionally we support and run a 'stylo-sequential' version that limits the stylo threads to 1. I don't think we need this anymore, but if I understand correctly that's tied to an entire stylo-sequential platform. I wonder if we should just retire that. Judging by treeherder it looks like AWSY is the only test run for this platform [2].

[1] https://searchfox.org/mozilla-central/rev/9cb3e241502a2d47e2d5057ca771324a446b6695/testing/mozharness/scripts/awsy_script.py#228-229
[2] https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&selectedJob=206613964
(In reply to Eric Rahm [:erahm] from comment #0)
> That leaves us with a small problem: we setup awsy to do test runs with
> stylo enabled back when stylo wasn't prefed on. To differentiate the results
> from non-stylo enabled we annotated with "stylo", as in "JS opt stylo".
> We've long since enabled stylo by default, so having this annotation seems
> kind of excessive. I'm inclined to remove it *but* that would make my data
> disjoint. That leaves us with a couple options:
> 
> #1 - Just keep annotating with 'stylo'
> #2 - Remove the annotation and accept a disjoint data set
> #3 - The perfherder folks might be able to help us massage the data so we
> have continuity

Will, do you have any thoughts on the best way to proceed here?
Flags: needinfo?(wlachance)
(In reply to Eric Rahm [:erahm] from comment #0)
> Additionally we support and run a 'stylo-sequential' version that limits the
> stylo threads to 1. I don't think we need this anymore, but if I understand
> correctly that's tied to an entire stylo-sequential platform. I wonder if we
> should just retire that. Judging by treeherder it looks like AWSY is the
> only test run for this platform [2].

Bobby, does it make sense to rip out the stylo-sequential test(s)?
Flags: needinfo?(bobbyholley)
(In reply to Eric Rahm [:erahm] from comment #1)
> (In reply to Eric Rahm [:erahm] from comment #0)
> > That leaves us with a small problem: we setup awsy to do test runs with
> > stylo enabled back when stylo wasn't prefed on. To differentiate the results
> > from non-stylo enabled we annotated with "stylo", as in "JS opt stylo".
> > We've long since enabled stylo by default, so having this annotation seems
> > kind of excessive. I'm inclined to remove it *but* that would make my data
> > disjoint. That leaves us with a couple options:
> > 
> > #1 - Just keep annotating with 'stylo'
> > #2 - Remove the annotation and accept a disjoint data set
> > #3 - The perfherder folks might be able to help us massage the data so we
> > have continuity
> 
> Will, do you have any thoughts on the best way to proceed here?

(3) is certainly possible, but it involves either me doing a bunch of work (maybe an hour or two) or training someone to do it (probably about the same amount of time, realistically, and of questionable value given how infrequently this kind of scenario pops up).

Basically it would involve doing (2), letting that run for a while, then matching up the stylo-annotated signatures with the non-stylo-annotated signatures, and rewriting all datapoints assigned from the former to the latter.

I don't really mind doing this, but am not sure if I can justify the time. Can you quantify what value it would have to everything in one set? If it's not clear, I'd just proceed with (2) for now -- we can always do (3) later if it turns out there is a compelling reason to.
Flags: needinfo?(wlachance)
(In reply to William Lachance (:wlach) (use needinfo!) from comment #3)

> I don't really mind doing this, but am not sure if I can justify the time.
> Can you quantify what value it would have to everything in one set? If it's
> not clear, I'd just proceed with (2) for now -- we can always do (3) later
> if it turns out there is a compelling reason to.

tl;dr management folks look at the longer term data

My best justification is that we use perfherder now in lieu of areweslimyet.com: the goal is to be able to track memory usage over time and is in particular used to track quarterly progress on the memshrink-content and plain old memshrink projects. If the data becomes disjoint, people who use it to track progress over longer periods of time are no longer going to be able to do so. It will also become confusing to have both stylo and non-stylo data available when in fact they represent the same setup. In theory people could select *both* data points and get reasonable graphs, but that's harder to explain and more tedious, particularly when subtests are involved. Hopefully that's enough to convince you it's worth doing, but if you don't have time right now we can hold of on doing this.
Yeah, long term, it would be great to get rid of the stylo annotations from all of our jobs. This is bigger than just AWSY though, and I agree it's worth investing some amount of effort to make the graphs line up.

I think there is some value in the stylo-sequential stuff (I believe we currently run it under Talos (Tss), and also AWSY), but I'm also fine with retiring it. We can always trigger try pushes with sequential mode if we need to track something down.
Flags: needinfo?(bobbyholley)
(In reply to Eric Rahm [:erahm] from comment #4)
> (In reply to William Lachance (:wlach) (use needinfo!) from comment #3)
> 
> > I don't really mind doing this, but am not sure if I can justify the time.
> > Can you quantify what value it would have to everything in one set? If it's
> > not clear, I'd just proceed with (2) for now -- we can always do (3) later
> > if it turns out there is a compelling reason to.
> 
> tl;dr management folks look at the longer term data
> 
> ...

Ok, I can do this if you're pretty sure it would be valuable. I'm guessing we could get away with just changing the series on mozilla-central, if it's just about looking at the historical trend?

It's about the same effort to do this for just awsy as it would be to do it for everything, so it might be preferable to doing some large scale purge of the `e10s` and `stylo` annotations.
Priority: -- → P3
You need to log in before you can comment on or make changes to this bug.