Closed Bug 401122 Opened 17 years ago Closed 17 years ago

large test-only RLk regression on balsa 1.8 branch

Categories

(Core :: General, defect)

Version: 1.8 Branch
Platform: x86, Linux
Priority: Not set
Severity: normal

RESOLVED FIXED

People

(Reporter: ted, Unassigned)

Details

(Keywords: memory-leak, regression)

See URL for the graph. RLk regressed pretty badly (although intermittently) on balsa 1.8 branch.  It occasionally leaks ~45 KB.  Looking at the raw data, the first occurrence was 2007-09-27 08:58:53.  Checkins from midnight to 8:58 AM include:
http://bonsai.mozilla.org/cvsquery.cgi?treeid=default&module=all&branch=MOZILLA_1_8_BRANCH&branchtype=match&dir=&file=&filetype=match&who=&whotype=match&sortby=Date&hours=2&date=explicit&mindate=2007-09-27&maxdate=2007-09-27+08%3A58&cvsroot=%2Fcvsroot

Three of those are calendar checkins, one touches bloatcycle.html(!) (bug 388854) and the other touches JS GC (bug 390078).
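
For anyone repeating the "look at the raw data" step, here is a rough sketch of how the first spike could be located automatically. It is not the tooling used on tinderbox: the function name `first_spike`, the `(timestamp, bytes)` sample layout, the 40,000-byte threshold, and all the sample values are assumptions made up for illustration.

```python
from datetime import datetime
from statistics import median


def first_spike(samples, min_jump=40_000):
    """Return the timestamp of the earliest sample whose RLk value exceeds
    the median of all samples by at least min_jump bytes, or None.

    samples:  list of (datetime, leaked_bytes) pairs (hypothetical layout).
    min_jump: assumed threshold, chosen to sit just under the ~45 KB spike.
    """
    if not samples:
        return None
    # The spike is intermittent, so the median approximates the normal level.
    baseline = median(value for _, value in samples)
    for stamp, value in sorted(samples):
        if value - baseline >= min_jump:
            return stamp
    return None


# Illustrative data only; these are not the real tinderbox numbers.
samples = [
    (datetime(2007, 9, 27, 6, 0, 0), 1204),
    (datetime(2007, 9, 27, 7, 0, 0), 1204),
    (datetime(2007, 9, 27, 8, 58, 53), 46_284),
    (datetime(2007, 9, 27, 10, 0, 0), 1204),
]
print(first_spike(samples))
```

Using the median rather than the mean as the baseline keeps a handful of intermittent spikes from dragging the "normal" level upward.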
Flags: blocking1.8.1.10+
Keywords: mlk
So it's either the test change or the JS GC bug, right?
I don't see any other non-calendar checkins in the previous 24 hours.
Keywords: regression
I'll try backing out my bloatcycle patch locally and running a few quick iterations of --testonly to see if I can catch an intermittent spike.
Assignee: nobody → ccooper
Status: NEW → ASSIGNED
So I managed to freeze balsa-18branch after it reported an RLk spike:

http://tinderbox.mozilla.org/showlog.cgi?log=Mozilla1.8/1193420160.1193421133.5614.gz

I then tried to quickly cycle on that build in --testonly mode to reproduce the RLk numbers, but didn't see the spike again in 50 iterations.

http://tinderbox.mozilla.org/showbuilds.cgi?tree=MozillaTest

The spike only seems to appear once every 2-6 hours. I'm going to back out the bloatcycle.html patch now locally and let it run normally on the Mozilla1.8 tree over the weekend to see whether the spike still appears.
balsa-18branch has been running for >12 hours now without a leak spike. I'm going to go ahead and back out that patch officially.

Backing out the patch will remove the leak spike from tinderbox, but this means that we still do have a leak *somewhere* in JS on the 1.8 branch that is exposed by this patch:

http://bonsai.mozilla.org/cvsview2.cgi?diff_mode=context&whitespace_mode=show&subdir=mozilla/build&command=DIFF_FRAMESET&file=bloatcycle.html&rev1=1.1&rev2=1.1.88.1&root=/cvsroot

Not sure about next steps. Could jesse's tools be used to narrow this down?
Assignee: ccooper → nobody
Status: ASSIGNED → NEW
Given that we've identified that this is due to a testing change rather than an actual leak regression, why back it out?  Is it because the leak is intermittent, making it hard to spot other leaks?

I'm not sure which of my tools would be helpful here.  My guess is that what's needed is for someone with leak-debugging skills to reproduce and debug the leak.
By backing out the patch I hope to unmask any potential leaks that could appear as things get landed on branch for, e.g. 2.0.0.9.
As coop explained to me, the bloatcycle.html fix was only to fix Mac testing, by allowing the test to shut down cleanly.  Since we're not running the bloat tests on Mac on the 1.8 branch, we don't really need this fix anyway.

FWIW, bsmedberg and I poked around the leak logs, and the leak may have been due to a DNS lookup continuing past shutdown, given the assertion:
  WARNING: unable to post DNS lookup complete message, file /builds/tinderbox/Fx-Mozilla1.8-gcc3.4/Linux_2.4.7-10_Depend/mozilla/netwerk/base/src/nsSocketTransport2.cpp, line 1852

(from http://tinderbox.mozilla.org/showlog.cgi?log=Mozilla1.8/1193420160.1193421133.5614.gz&fulltext=1 )

Not sure why this change would have exposed it though.  Also, we should knock the max RLk setting for balsa down to 1204, since that's what it consistently reports now.
Blocks: 388854
Summary: large RLk regression on balsa 1.8 branch → large test-only RLk regression on balsa 1.8 branch
(In reply to comment #8)
> FWIW, bsmedberg and I poked around the leak logs, and the leak may have been
> due to a DNS lookup continuing past shutdown, given the assertion:
>   WARNING: unable to post DNS lookup complete message, file
> /builds/tinderbox/Fx-Mozilla1.8-gcc3.4/Linux_2.4.7-10_Depend/mozilla/netwerk/base/src/nsSocketTransport2.cpp,
> line 1852

bug 102229?
Marking this FIXED.  We should still knock the leak setting down on balsa.
Status: NEW → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
It appears this is a testing change and not something that requires a fix on the mozilla 1.8 branch. Please re-nom if we do need this.
Flags: blocking1.8.1.12+ → blocking1.8.1.12-