Closed Bug 134137 Opened 22 years ago Closed 22 years ago

Tp increased 10+ ms on March 28

Categories

(Core Graveyard :: Tracking, defect)

x86
Linux
defect
Not set
critical

Tracking

(Not tracked)

VERIFIED INVALID

People

(Reporter: stephend, Assigned: cathleennscp)

Details

(Keywords: perf, regression, smoketest)

Tp increased 50+ ms after Hewitt's outliner to tree carpool.

Btek is showing Tp in the 1260+ ms range, as opposed to 1204-1214 ms before
Joe's landing.

http://bonsai.mozilla.org/cvsquery.cgi?module=MozillaTinderboxAll&date=explicit&mindate=1017369900
Note that jkeiser also checked in a regression and backed it out after the
hewitt checkin.  So we got some of the time back; hewitt is the other part.
Random notes:
1) I'm not sure how the tree conversion has affected pageload. I know that btek
runs with a default new profile, sidebar open to bookmarks (a tree). I suppose
the other visible tree is in the urlbar/autocomplete (well, not visible, but
'there').

2) I'm kicking off jprof builds on Linux to be able to do some profiling and
testing tomorrow (going home soon).

3) Note that files were copied (or moved) in the cvs repository, so backing out 
is, afaik, not a simple operation and will need help from leaf or seawood.

4) I've looked at the per-url times, and there is no standout page. The slowdown
is across the board. (I'll attach a chart in a moment).

Obviously the tree should not open tomorrow _for_any_reason_ until we either (a)
have a fix checked in or (b) have backed this out while investigation continues
(this conversion _must_ happen for mozilla1.0).
Severity: critical → blocker
The latest cycles show timings back to normal: 1215-1222.
A clobber brought the increase down to the 10-15ms range.  See bug 134250 for why.
Summary: Tp increased 50+ ms after Hewitt's outliner to tree carpool. → Tp increased 10+ ms after Hewitt's outliner to tree carpool.
Actually, though, it was up higher already yesterday from earlier checkins. 
(jrgm said, I think, that the first run after a full rebuild tends to be slower.
 If that's the case, the 1215 is within the range for before hewitt landed, and
none of the increase is related to his checkin, although we should wait a few
more cycles to confirm that.  The numbers were a bit more variable yesterday
afternoon than they had been in the past.)
I don't think this can be blamed on hewitt.  The Tp times from btek for March
28/29, leaving out the builds that had the messed-up dependency problem caused
by hewitt's changes (see bug 134250; during that time the only thing that
happened was that jkeiser landed and then backed out), were:

time  Tp(ms)
 3:51 1200 [March 28]
 4:17 1197
 4:57 1197
 5:50 1198
 6:24 1197
 7:01 1199
 7:35 1197
 8:13 1199
 8:55 1201
 9:30 1200
10:05 1207
10:37 1197
11:15 1204
12:08 1199
12:39 1198
13:17 1206 (first build that took checkins)
14:55 1216
15:40 1205
16:22 1206
17:05 1214
17:44 1214
18:27 1203
 9:50 1222 * [March 29] (clobber build - can be longer for some reason)
11:31 1215
12:14 1215
13:00 1215
13:36 1217

I think it's more important to look for the larger increase that may have
happened earlier in the day, although there's a little too much variation to
really tell.
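
To put rough numbers on the table above, the samples can be grouped and
averaged. This is only a sketch: the split into "before checkins", "after
checkins" and "March 29" is one reading of the annotations in the table, the
clobber build is left out of the last group because it is flagged as possibly
slower, and the names `tp_before`, `tp_after`, `tp_mar29` and `summarize` are
made up for illustration.

```python
from statistics import mean

# Tp medians (ms) copied from the btek table above.
tp_before = [1200, 1197, 1197, 1198, 1197, 1199, 1197, 1199,
             1201, 1200, 1207, 1197, 1204, 1199, 1198]   # March 28, 3:51-12:39
tp_after = [1206, 1216, 1205, 1206, 1214, 1214, 1203]    # March 28, 13:17-18:27
tp_mar29 = [1215, 1215, 1215, 1217]                      # March 29, clobber build (1222) left out


def summarize(samples):
    """Return (mean, min, max) of a list of Tp samples in ms."""
    return mean(samples), min(samples), max(samples)


for label, samples in [("before checkins", tp_before),
                       ("after checkins", tp_after),
                       ("March 29", tp_mar29)]:
    avg, low, high = summarize(samples)
    print(f"{label:16s} mean={avg:7.1f}  range={low}-{high}")
```

On these numbers the mean moves from about 1199 ms to about 1209 ms on March 28
and to 1215.5 ms on March 29, while the ranges of the first two groups overlap
(1197-1207 vs. 1203-1216).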
Summary: Tp increased 10+ ms after Hewitt's outliner to tree carpool. → Tp increased 10+ ms on March 28
For kicks, here are some more data.  I was curious whether any of the other
numbers the test reports is more stable than the one we track; none of them
seems to be.  Here are the average average, average median (which is Tp),
average cached load, and average uncached load times, in ms:

AAvg AMed ACac AUnc  mmdd-HHMM
1228 1197 1363 1195  0328-1037
1236 1204 1384 1199  0328-1115
1230 1199 1370 1195  0328-1208
1229 1198 1365 1195  0328-1239
1242 1206 1395 1204  0328-1317
1248 1216 1389 1210  0328-1455
1238 1205 1381 1203  0328-1540
1240 1206 1383 1204  0328-1622
1247 1214 1388 1212  0328-1705
1248 1214 1391 1212  0328-1744
1238 1203 1385 1201  0328-1827
1258 1222 1406 1221  0329-0950
1250 1215 1402 1213  0329-1131
1248 1215 1391 1213  0329-1214
1246 1215 1380 1213  0329-1300
1250 1217 1389 1216  0329-1336
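
One simple way to compare how stable the four columns are is to look at the
min-max spread of each over these runs. The sketch below assumes that spread is
an acceptable stand-in for "stability"; `TABLE`, `COLUMNS` and `column_spreads`
are illustrative names, not part of any test harness.

```python
# Rows copied from the table above: AAvg AMed ACac AUnc mmdd-HHMM (times in ms).
TABLE = """\
1228 1197 1363 1195  0328-1037
1236 1204 1384 1199  0328-1115
1230 1199 1370 1195  0328-1208
1229 1198 1365 1195  0328-1239
1242 1206 1395 1204  0328-1317
1248 1216 1389 1210  0328-1455
1238 1205 1381 1203  0328-1540
1240 1206 1383 1204  0328-1622
1247 1214 1388 1212  0328-1705
1248 1214 1391 1212  0328-1744
1238 1203 1385 1201  0328-1827
1258 1222 1406 1221  0329-0950
1250 1215 1402 1213  0329-1131
1248 1215 1391 1213  0329-1214
1246 1215 1380 1213  0329-1300
1250 1217 1389 1216  0329-1336
"""

COLUMNS = ["AAvg", "AMed", "ACac", "AUnc"]


def column_spreads(table_text):
    """Return {column: (min, max, max - min)} for the four timing columns."""
    rows = [line.split() for line in table_text.strip().splitlines()]
    spreads = {}
    for index, name in enumerate(COLUMNS):
        values = [int(row[index]) for row in rows]
        spreads[name] = (min(values), max(values), max(values) - min(values))
    return spreads


spreads = column_spreads(TABLE)
for name, (low, high, spread) in spreads.items():
    print(f"{name}: {low}-{high} (spread {spread} ms)")
```

Over these 16 runs the spreads come out between 25 and 43 ms, with AMed (Tp)
the narrowest at 25 ms, which fits the observation that none of the
alternatives is more stable.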
So this isn't hewitt, yet it's holding the tree closed and it's still assigned
to hewitt. We need to either open the tree, reassign this to someone who's
responsible, or back everyone out (besides hewitt) and get things moving again.
Cathleen took btek down and was investigating.  Since we really should have btek
up before we open the tree, and because cathleen was investigating, reassigning
to cathleen.
Assignee: hewitt → cathleen
So this could be due to either bug 132329 or bug 133382.

I would have backed out my patch, but btek doesn't pull right now, so I wouldn't
be able to check how it affects Tp.
OK, so I was told that bug 132329 can't have caused it.

That leaves bug 133382... hm... it was already backed out once on tbox (at
23:14), but the numbers don't clearly show whether this fixed the regression.
Is this still blocking the tree from opening?
The tree has opened...

Is this bug still relevant?
If yes, shouldn't it be downgraded from smoketest blocker?
The btek reboot at 03/30 19:59 seemed to give back over 10 ms (I don't think
any of the checkins around that time could have contributed: tests, OS/2, ftp,
editorUtils.js to comply with an API change, and string properties), which
brought us into the range of the old variation before hewitt's checkin. Later,
after several checkins that may have affected perf numbers, we are slightly
(1 ms avg) better than before hewitt's checkin.

I am in favor of removing the smoketest keyword. Any objections?
downgrade to critical.
Severity: blocker → critical
Is this one still valid?
From the daily loadtime test results that I run, it looks to have been holding
in the same range (1.29-1.35 s) for several months.
http://ftp.mozilla.org/pub/data/loadtimes/daily_loadtime.html
This bug is over two months old.  I think we're hitting diminishing returns
and need to wrap this one up.  Cathleen or hewitt, any comments?
Adding hewitt.
we should kill this bug - any objections?
I'm invalidating this bug.  I think we figured out it was someone else.

If not, and I'm incorrect, you know the drill.
Status: NEW → RESOLVED
Closed: 22 years ago
Resolution: --- → INVALID
verified.
Status: RESOLVED → VERIFIED
Product: Core → Core Graveyard