large test-only RLk regression on balsa 1.8 branch

RESOLVED FIXED

Status

Core
General
11 years ago
11 years ago

People

(Reporter: ted, Unassigned)

Tracking

({memory-leak, regression})

1.8 Branch
x86
Linux
memory-leak, regression
Points:
---
Bug Flags:
blocking1.8.1.12 -

Firefox Tracking Flags

(Not tracked)

Details

(URL)

(Reporter)

Description

11 years ago
See URL for the graph. RLk regressed pretty badly (although intermittently) on balsa 1.8 branch: it occasionally leaks ~45Kb. Looking at the raw data, the first occurrence was 2007-09-27 08:58:53. Checkins from midnight to 8:58AM include:
http://bonsai.mozilla.org/cvsquery.cgi?treeid=default&module=all&branch=MOZILLA_1_8_BRANCH&branchtype=match&dir=&file=&filetype=match&who=&whotype=match&sortby=Date&hours=2&date=explicit&mindate=2007-09-27&maxdate=2007-09-27+08%3A58&cvsroot=%2Fcvsroot

Three of those are calendar checkins, one touches bloatcycle.html(!) (bug 388854) and the other touches JS GC (bug 390078).
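The narrowing step above (find the first bad data point, then list checkins from midnight up to it) can be sketched roughly as follows. This is only an illustration: the sample rows, the `first_spike` and `checkin_window` helpers, and the 1 KB margin are all hypothetical. Only the 08:58:53 timestamp, the ~45Kb spike size, and the 1204 baseline come from this bug, and treating 1204 as the baseline on that date is an assumption.

```python
from datetime import datetime

# Hypothetical samples: (timestamp of a tinderbox run, RLk in bytes).
# Only the 08:58:53 timestamp and the ~45Kb spike size are from this bug;
# the other rows are made up, and 1204 as the baseline is an assumption.
SAMPLES = [
    ("2007-09-26 21:14:02", 1204),
    ("2007-09-27 03:40:17", 1204),
    ("2007-09-27 08:58:53", 45 * 1024),
    ("2007-09-27 10:02:31", 1204),
]

def first_spike(samples, baseline, margin=1024):
    """Return the time of the first sample above baseline + margin, or None.

    Assumes the samples are already sorted by time.
    """
    for stamp, rlk in samples:
        if rlk > baseline + margin:
            return datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
    return None

def checkin_window(spike):
    """Window from midnight of the spike's day up to the spike itself."""
    start = spike.replace(hour=0, minute=0, second=0, microsecond=0)
    return start, spike

spike = first_spike(SAMPLES, baseline=1204)
start, end = checkin_window(spike)
print(start, "->", end)
```

The resulting window is what would be fed into a bonsai query like the one linked above. Because the spike is intermittent, the true regression may predate the first observed spike, which is why the previous 24 hours were also checked.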
Flags: blocking1.8.1.10+
Keywords: mlk

Comment 1

11 years ago
So it's either the test change or the JS GC bug, right?
(Reporter)

Comment 2

11 years ago
I don't see any other non-calendar checkins in the previous 24 hours.

Updated

11 years ago
Keywords: regression

Comment 3

11 years ago
I'll try backing out my bloatcycle patch locally and running a few quick iterations of --testonly to see if I can catch an intermittent spike.
Assignee: nobody → ccooper

Updated

11 years ago
Status: NEW → ASSIGNED

Comment 4

11 years ago
So I managed to freeze balsa-18branch after it reported a RLk spike:

http://tinderbox.mozilla.org/showlog.cgi?log=Mozilla1.8/1193420160.1193421133.5614.gz

I then tried to quickly cycle on that build in --testonly mode to reproduce the RLk numbers, but didn't see the spike again in 50 iterations.

http://tinderbox.mozilla.org/showbuilds.cgi?tree=MozillaTest

The spike only seems to appear once every 2-6 hours. I'm going to back out the bloatcycle.html patch now locally and let it run normally on the Mozilla1.8 tree over the weekend to see whether the spike still appears.
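As a rough sanity check on why 50 clean iterations do not rule the spike out, here is a small calculation. It assumes each run spikes independently with some fixed probability `p`. That per-run rate is not given anywhere in this bug, so the 0.05 below is purely illustrative, and the function names are hypothetical.

```python
import math

def miss_probability(p, runs):
    """Chance of seeing no spike at all in `runs` independent runs."""
    return (1.0 - p) ** runs

def runs_needed(p, confidence):
    """Smallest number of runs that shows at least one spike with the given confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))

# Illustrative only: p = 0.05 is an assumed per-run spike rate, not a measured one.
p = 0.05
print(f"P(no spike in 50 runs) = {miss_probability(p, 50):.3f}")
print(f"runs for 95% confidence = {runs_needed(p, 0.95)}")
```

The independence assumption may well not hold here. If the spike depends on wall-clock time or on a network condition rather than on the number of test cycles, quick back-to-back --testonly runs could miss it no matter how many are run, which is one reason to let the machine cycle normally over the weekend instead.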

Comment 5

11 years ago
balsa-18branch has been running for >12 hours now without a leak spike. I'm going to go ahead and back out that patch officially.

Backing out the patch will remove the leak spike from tinderbox, but this means that we still do have a leak *somewhere* in JS on the 1.8 branch that is exposed by this patch:

http://bonsai.mozilla.org/cvsview2.cgi?diff_mode=context&whitespace_mode=show&subdir=mozilla/build&command=DIFF_FRAMESET&file=bloatcycle.html&rev1=1.1&rev2=1.1.88.1&root=/cvsroot

Not sure about next steps. Could jesse's tools be used to narrow this down?
Assignee: ccooper → nobody
Status: ASSIGNED → NEW

Comment 6

11 years ago
Given that we've identified that this is due to a testing change rather than an actual leak regression, why back it out?  Is it because the leak is intermittent, making it hard to spot other leaks?

I'm not sure which of my tools would be helpful here.  My guess is that what's needed is for someone with leak-debugging skills to reproduce and debug the leak.

Comment 7

11 years ago
By backing out the patch, I hope to unmask any potential leaks that could appear as things land on the branch for, e.g., 2.0.0.9.
(Reporter)

Comment 8

11 years ago
As coop explained to me, the bloatcycle.html fix was only to fix Mac testing, by allowing the test to shut down cleanly.  Since we're not running the bloat tests on Mac on the 1.8 branch, we don't really need this fix anyway.

FWIW, bsmedberg and I poked around the leak logs, and the leak may have been due to a DNS lookup continuing past shutdown, given the assertion:
  WARNING: unable to post DNS lookup complete message, file /builds/tinderbox/Fx-Mozilla1.8-gcc3.4/Linux_2.4.7-10_Depend/mozilla/netwerk/base/src/nsSocketTransport2.cpp, line 1852

(from http://tinderbox.mozilla.org/showlog.cgi?log=Mozilla1.8/1193420160.1193421133.5614.gz&fulltext=1 )
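To check whether the warning lines up with the spiking runs, one could scan each saved full-text log for it. The sketch below is a minimal, hypothetical helper (`find_warning` is not an existing tinderbox tool), and the sample log is a placeholder: only the warning text is taken from the log above, with the file path shortened.

```python
import io

# Warning text as quoted from the tinderbox log in this bug.
NEEDLE = "unable to post DNS lookup complete message"

def find_warning(lines, needle=NEEDLE):
    """Yield (line_number, text) for every line that contains the needle."""
    for number, line in enumerate(lines, start=1):
        if needle in line:
            yield number, line.rstrip("\n")

# Placeholder log contents; only the WARNING text is real, path shortened.
SAMPLE_LOG = io.StringIO(
    "placeholder line before\n"
    "WARNING: unable to post DNS lookup complete message, "
    "file nsSocketTransport2.cpp, line 1852\n"
    "placeholder line after\n"
)

hits = list(find_warning(SAMPLE_LOG))
for number, text in hits:
    print(number, text)
```

Running this over logs from both spiking and non-spiking runs would show whether the warning only appears alongside the ~45Kb leak, which would support (but not prove) the DNS-lookup theory.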

Not sure why this change would have exposed it though.  Also, we should knock the max RLk setting for balsa down to 1204, since that's what it consistently reports now.
Blocks: 388854
Summary: large RLk regression on balsa 1.8 branch → large test-only RLk regression on balsa 1.8 branch

Comment 9

11 years ago
(In reply to comment #8)
> FWIW, bsmedberg and I poked around the leak logs, and the leak may have been
> due to a DNS lookup continuing past shutdown, given the assertion:
>   WARNING: unable to post DNS lookup complete message, file
> /builds/tinderbox/Fx-Mozilla1.8-gcc3.4/Linux_2.4.7-10_Depend/mozilla/netwerk/base/src/nsSocketTransport2.cpp,
> line 1852

bug 102229?
(Reporter)

Comment 10

11 years ago
Marking this FIXED.  We should still knock the leak setting down on balsa.
Status: NEW → RESOLVED
Last Resolved: 11 years ago
Resolution: --- → FIXED

Comment 11

11 years ago
It appears this is a testing change and not something that requires a fix on the mozilla 1.8 branch. Please re-nom if we do need this.
Flags: blocking1.8.1.12+ → blocking1.8.1.12-