Closed Bug 403724 Opened 17 years ago Closed 17 years ago

Ts regression on Nov 13th

Categories

(Core :: General, defect, P2)

defect

Tracking

()

RESOLVED WONTFIX

People

(Reporter: sicking, Unassigned)

References

Details

(Keywords: perf, regression)

Attachments

(1 file)

Marking blocking on bugs that were checked in the regression range (8am +- a few hours)
Bugs 400327, 118312 and 386202 looks suspect. I'd say especially bug 118312
Adding a few more suspects since it's hard to tell exactly when the regression occurred given the noise.

This makes bugs 396487, 397289 and 397289 likely too.
CCing the people on those bugs to get proper attention.
And for the record. The regression could probably have been anytime between 8am and 4pm. There is too much noise to know for sure.
The one thing today that I would call a clear regression was bug 403354 - prior to that it was all over the upper 920s, immediately after it hit the first 934, and has only dipped below 930 a couple of times. Everything else looks to be within the noise, and if you're going to call hitting 930 a regression, there was talk of calling your checkin yesterday that triggered a 930 a regression, too ;)
I'd say that there is a pretty clear regression, but I agree it's small. It's too noisy to just look at individual values, but you can clearly see a trend in the graph.

It doesn't make much sense that bug 403354 would cause the regression as it's simply using more efficient data structures.
FWIW, it seems very unlikely that
[ Bug 402439 – Clipboard is emptied on application exit ]
would regress anything, as it's removing/simplifying code...

If anything, it should rather make a (tiny) improvement.
(which would mean the regression is a bit worse actually.)
Ts doesn't run print or print preview, so I don't think [Bug 396278 – should take edge values from print settings instead of prefs] could have anything to do with the regression.
I do not see how JS bugs can affect Ts:

Bug 396487 effectively changes which call in a sequence set a bit in a word. 

Bug 397289 removes a field and the corresponding initialization code from a struct that represents a node in the parse tree. The price for that is passing an extra parameter to JS error reporting API.

Bug 400687 removes an extra remapping of small ints representing JS trace and GC types. Since TS is sensitive to the cycle and garabge collectors performance, a bug in the patch could lead to Ts increase. But any bug in that code would inevitably affects leak countaers. But that remained the same. 
>	let me once again express my ignorance - In ref to the Ts issue I see that igor checked in 398609 which actually dropped Ts from 925 to 919, later backed out due to mochitest failures - and the Ts went back up - and stayed in the 925-930 range , are we sure there was a 'real Ts' regress ?
Also wonder if a clobber on the box would help. I know you guys are a lot smarter than me, I'm just invoking some food for thought.

This seems to be the best scale for illustrating it: the + is on the first build after campd's checkin, then his backout is the clear drop just after the last 931.

So, even though at the time I thought I'd unfairly blamed him for just being at the top of a cycle, I think now his patch was a 0.125% Ts regression, and there may have been another 0.125% to 0.25% regression somewhere yesterday, which I would expect to find between 10:00 and 12:00, rather than 08:00 to 10:00.

So, since we're looking for a regression on the order of half or less the size of the normal daily cycle, a quarter or less the size of the bigger cycles that we get around once a week, shouldn't someone be stepping up to do some backing out? I wouldn't expect to see clear results with less than 4-6 hours of runs after a backout, so if we're going to wait a day or two, then back out one patch at a time, we're going to be closed a week or more.
Phil:
Your analysis once again points to bug 118312 (11:14, nsNativeThemeGTK.cpp et al. which are at least files used in the default theme on Linux, not sure if we load a tree on startup) or bug 400327 (10:53, direct effect on FF toolbar splitter, visible on every default window) as suspects, I'd think.
Backed out bug 400327 too see if that helps.
Moving to blocking list.  Did this get resolved?
Flags: blocking1.9? → blocking1.9+
Priority: -- → P2
Still showing up, but it's very small so I'm not sure if we should care. Note that I think that it's cross platform, at least Mac looks like it was affected too.
My only remaining suspect is basically bug 400687. Unfortunately I don't have commit access to mozilla/js so I can't back it out :(
Why is that your only remaining suspect? The Mac graph looks more like unfortunate timing - it's been having spikes like that for four weeks now, but if it was an actual regression, it would have been a regression at either 18:00ish or midnight, so there's no reason to restrict yourself to XP patches.

I don't really know whether we've successfully hidden trees from startup completely, but if we haven't (maybe the search engine manager isn't hidden?), then bug 118312 seems like a very strong candidate: perfect time and perfect OS to regress this single test on this single machine, and the times were 925, 926, 927, 926, 926, patch, 930, 926, 926, 928, 929. Hardly a smoking gun, but at this point we're looking for 1ms, 2 tops, and having devoted a day of tree closure to this already, it seems a shame to just give up without seeing what it looks like backed out. Unless we would take it even if it costs 0.25% Ts, in which case we should just reopen and get on with our lives.
Severity: normal → blocker
Igor's patch won't cause any such regression. Phil's probably on the right trail. But the tree opened without bug 118312 being backed out, so will we ever know?

/be
(In reply to comment #20)
> Igor's patch won't cause any such regression. Phil's probably on the right
> trail. But the tree opened without bug 118312 being backed out, so will we ever
> know?

Bug 118312 was backed out at 2007-11-14 18:36 and was subsequently relanded at 2007-11-14 19:47 after it was shown to have not affected Ts.
Seems to me like it would have been a better idea to back *everything* out in the appropriate range and then land bugs individually to determine which caused the regression.  That's basically what the current policy (linked on the left at the bottom of the Firefox tinderbox page) says:

http://www.mozilla.org/hacking/regression-policy.html

Why wasn't that policy followed here?
Basically it wasn't followed since the regression was so small (order of .2-.3%) and the regression range so big. Because the regression was smaller than the noise each landed/backed out patch had to bake for a number of hours before a trend could be seen. So the payoff would have been very small, and the work big, to pin down exactly what patch cased this. And given that it was so small it might not even have been real.

So I'm going to go ahead and WONTFIX this. Yes, it would have been nice to have figured out exactly what happened, but I don't think it was worth doing more than we did unfortunately.
Status: NEW → RESOLVED
Closed: 17 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: