Closed Bug 10847 Opened 26 years ago Closed 25 years ago

[Webshell] test for webshell leaking?

Categories

(SeaMonkey :: Build Config, defect, P1)

defect

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: bruce, Assigned: nisheeth_mozilla)

References

Details

(Whiteboard: [PDT-] need help to verify plus for beta2)

Attachments

(1 file)

It was discussed during one of the previous rounds of beeting on nsWebShell leaks that there should be a way to track the number of allocations and deallocations of nsWebShells and then at exit time, print an error notice if some were leaked. This could be tied into the smoke tests as well so that this type of thing became very noticable very quickly. Thoughts?
Status: NEW → ASSIGNED
Summary: test for webshell leaking? → [Webshell] test for webshell leaking?
Target Milestone: M10
We probably need to implement an observer that gets notified on webshell allocations and deallocations and prints out its tally at exit time. This should be relative simple to do. Setting milestone to M10. I'm away on a seminar for most of M9.
Bul re-assigning webshell bugs to the new webshell owner, Scott Collins. Scott, please adjust the target milestone assignments as you see fit. Thanks.
Assignee: scc → chofmann
Severity: normal → critical
Status: ASSIGNED → NEW
if scc no longer owns webshell, then this needs a new owner. this is critical for us to not start leaking the webshell again once it gets fixed. I understand that there is now code in the tree to do this, but it isn't turned on, and it isn't tied into the tinderbox smoke tests (or daily verifications).
Target Milestone: M10 → M12
this is a small amount of work to help avert huge problems. Should be done ASAP. Setting to M12. If the code already exists, it should be turned on and the bug should be assigned to somebody on the build team to get it hooked up to tinderbox, or at least QA should verify the output during daily tests.
Summary: [Webshell] test for webshell leaking? → [BETA] [Webshell] test for webshell leaking?
nisheeth, do you have the code for this or could you put it together quickly as one last contribution to webshell?
Sure, I'll look at this today...
Priority: P3 → P1
Target Milestone: M12 → M11
The code was in there but ifdef'd debug in the nsWebshell.cpp and ifdef'd out in nsApprunner.cpp. I'm running with it enabled by default in my local build right now. If everything looks kosher after some testing, I'll check in my changes. I'm adding Chris Yeh to the cc list. Chris, what kind of behaviour do we want when the webshell leaks are found. The code in there right outputs a message on the console and stops for user input before proceeding to shutdown. If we want to hook this up to tinderbox, I'm guessing that we'll have to change the "wait for user input" part of the behavior. I'm upping the priority of this to P1 and moving the target milestone in to M11
cyeh is on his way out the door for vacation. I'll offer my opinion. We definitely can't hang up waiting for user input on any tinderbox builds. That defeats the ability to build continously without user intervention at any hour... maybe a pref that enables this behavior, or behavior slightly modified behavior that kicks the message to console and then exits is most effective route.
Blocks: 14469
is one this ready?
Yes it is. I've enabled code and made build system changes for Windows that check for webshell leaks on shutdown of apprunner and print the following message to the console if any leaks are found: "XXX WARNING: Number of webshells being leaked: 1" Patrick has enabled this on the mac. I need somebody who knows how autoconf works to do three build system changes: a) define DETECT_WEBSHELL_LEAKS globally b) define an environment variable DETECT_WEBSHELL_LEAKS globally. c) link in the webshell dll into apprunner inside mozilla/xpfe/bootstrap if DETECT_WEBSHELL_LEAKS is defined in the environment. Once the above is done, webshell leak detection should get enabled on Unix also. I'm ccing Ramiro who'll probably be able to help.
I'd confused as to why we need to change our linkage to make this work, but I'll look more closely. I think we should make the tinderboxes build with DETECT_WEBSHELL_LEAKS and turn the tree orange if any are detected.
Blocks: 15160
No longer blocks: 14469
Whiteboard: what's the status here?
I can't comment on whether this has been hooked into tree status yet but I can confirm(verify) that my Win32 commercial opt build from 1999093012 did print out: XXX WARNING: Number of webshells being leaked: 30 when I shut it down
The status is that is enabled on mac and windows. A Unix guy needs to make the necessary autoconf changes to make this work on Unix when an appropriate build flag (-enable-leak-checking or something) is used to generate the build.
Assignee: chofmann → nisheeth
Status: NEW → RESOLVED
Closed: 26 years ago
Resolution: --- → FIXED
I've checked in the autoconf flag into unix. Leaf needs to add --enable-detect-webshell-leaks to the default unix builds. I'll send him mail about that. Re-assigning this bug to myself and marking it fixed.
Whiteboard: what's the status here? → waiting for unix to get hooked up
I can't verifiy this until I see it in the unix builds. so as of now I'm waiting on leaf I guess.
Whiteboard: waiting for unix to get hooked up → NOV-18 still waiting for unix to get hooked up
Status: RESOLVED → REOPENED
Target Milestone: M11 → M13
Assignee: nisheeth → leaf
Status: REOPENED → NEW
Resolution: FIXED → ---
re-opening and assigning to leaf
isn't this hooked up now? i.e . should be marked fixed?
Assignee: leaf → briano
You are so going to hate me for this, brian.... but i need your help. We're being asked to turn something on in the daily release builds that shouldn't get turned on for the *real* release builds. Do we have a way to do this with spawnbuild and rbuild?
Status: NEW → ASSIGNED
I think I have a way to do this seamlessly using my spawn_build script, but it will require a change to the way flags get passed to rbuild. I'll let you know when I've got this in place.
Status: ASSIGNED → RESOLVED
Closed: 26 years ago25 years ago
Resolution: --- → FIXED
Okay, the necessary changes are checked in, and the test build did the right thing, so I'm marking this fixed. The spawn_build script now has the ability to accept configure flags on the command line and pass them on appropriately. So now we can do "custom" builds using --enable-perf-metrics and/or --enable-detect-webshell-leaks or whatever whenever we want to.
I still don't see the XXX WARNING for webshell leaks in the Linux builds as of 2000011110. Unless someone explains to me why I shouldn't see this I'm going to have to reopen this bug.
Status: RESOLVED → VERIFIED
figured it out. marking VERIFIED, 2000011212 build
Status: VERIFIED → REOPENED
I tricked myself into verifying this bug and shouldn't have. REOPENING. the linux builds as of 12-JAN aren't demonstrating the full test. I can see the webshell count incrementing and decrementing in an appropriate fashion but in the case where they aren't fully decremented at shutdown and we leak a webshell(or several) I do NOT see the XXX Warning message. So seeing as that is the stated original aim of this bug, I am reopening it.
Resolution: FIXED → ---
clearing the fixed resolution
Assignee: briano → travis
Status: REOPENED → NEW
Target Milestone: M13
I'm confused. C, are you saying we see something like the tail of this log that we get on windows, just no XXX message. If we have a count of number of webshells that should be good enough. bug should now go to travis if there is anything else that needs fixing. not an m13 stopper. removing t milestone ! nsSecureBrowserUIImpl found: http://espn.go.com/ change lock icon to unlock ?(http 804b0002) WEBSHELL- = 8 WEBSHELL- = 7 WEBSHELL- = 6 WEBSHELL- = 5 WEBSHELL- = 4 WEBSHELL- = 3 Hey : You are in QFA Shutdown XXX WARNING: Number of webshells being leaked: 3 WEBSHELL- = 2
Target Milestone: M14
Travis, is this yours?
If this still isn't there this sounds more like a builds problem than a coding problem. We know the code is there and works on most platforms, just seems the right environments aren't being set on some of the release or daily builds.
this has always been a build bug AFAIK. Remember that the XXX: Warning message predates by a lot the console messages about incrementing and decrementing the Webshell count. That part just never, never got hooked up on Unix. See Niseeth's comments IN THIS BUG on 9/21, 10/01, and 10/27.
to which I question why it is assigned to me and not someone who would know what needs to change to hook it up on the build correctly. Leaf?
Assignee: travis → leaf
Yes, i'll take this puppy back.
clearing old status, where are we at now with this?
Whiteboard: NOV-18 still waiting for unix to get hooked up
Due to Beta indication in Summary, putting beta1 into keyword field.
Keywords: beta1
Whiteboard: [PDT-]
Whiteboard: [PDT-] → [PDT+]
Argh, i tried to verify that this builds on linux (--enable-detect-webshell-leaks), and i get linker errors. Could be a dependency problem; trying to make clean now.
Summary: [BETA] [Webshell] test for webshell leaking? → [Webshell] test for webshell leaking?
the beta1 stopper is that we still leak a webshell or more on the builds where its enabled (win32). having this assigned to leaf isn'getting us any closer to diagnosing and fixing that leak. travis, how is the best person to start the investigation about why we show one or more webshell leaks as shutdown.
If this bug is about getting the test hooked up on unix, it can probaby be done by friday, provided i get nisheeth's help getting it to build. If it's about actually finding the cause of the leaks, and fixing that, then it should get reassigned.
Whiteboard: [PDT+] → [PDT+] February 11th
this bug as written was for getting the monitoring system hooked up. Unix is the last remaining platform w/o the system fully implemented. We leak webshells left and right in different instances, those are and should be different, specific bugs.
Here is a patch that fixed the problem with docshell. I helped Nisheeth with this bug. Index: Makefile.in =================================================================== RCS file: /cvsroot/mozilla/xpfe/bootstrap/Makefile.in,v retrieving revision 1.78 diff -r1.78 Makefile.in 64c64 < XP_LIBS += -lraptorwebwidget --- > XP_LIBS += -lraptorwebwidget -ldocshell -ljsdom
Waqar's patch is checked in, getting review by cls for the turning on of detect-webshell-leaks by default.
Status: NEW → ASSIGNED
Component: Embedding APIs → Build Config
Whiteboard: [PDT+] February 11th → [PDT+] I have a reviewed diff, waiting for open tree
Fixed as of 2/15/00 2:34pm; this will appear in the next verification builds performed, and any developer's builds (unless they turn off debugging or explicitly switch off this code)
Status: ASSIGNED → RESOLVED
Closed: 25 years ago25 years ago
Resolution: --- → FIXED
I'm hanging on exit (bug 26608 ?) so i can't verify this yet
anyone know of activities likely to result in a leaked webshell on Linux? ( so i can prove this is fixed)
might be able to find some by running browser buster and then narrowing down to specific sites. we have found some in the past using this method.
Is this running at the correct time? On Solaris, it claims that a webshell is leaking ... and then later on after it claims that, the webshell gets released.
PDT, A more sophisticated brand of testing may be in order here. I'm going to give it some more tries but I have not been able to leak a webshell and exit normally at the same time (to prove that I've leaked) so the verification here is still pending. Currently, I'm working with chofmann's browser buster page but I'm still open to suggestion.
Whiteboard: [PDT+] I have a reviewed diff, waiting for open tree → [PDT+] hard to verify
Maybe bruce or someone on the CC list can help verify this?
Whiteboard: [PDT+] hard to verify → [PDT+] need help to verify
Please read my last comment on this bug. It runs at shutdown, but appears to run at the _wrong_ time. It reports a webshell leaking, but that webshell gets freed subsequently.
If that's the case bruce then this bugs need to be REOPENED for reexamination.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Is this bug a total of 1 object leaked by the time we shut down? Or is the issue that arbitrarilly many objects will be leaked (the longer you work... the more you do... the more you leak)? If this is a fixed amount of "leak," it is really equivalent to overhead bloat, and we can certainly live with that for beta. ...and of course if the leak tool is in error... then it is not even a bug. :-) If this is not a growing (unbounded) leak, can we get this off the beat1 radar? Thanks, Jim
Whiteboard: [PDT+] need help to verify → [PDT+] [NEED INFO] need help to verify
The bug is that we haven't yet got a working system to _prevent_ us from regressing in terms of webshell leaks. They are traditionally a source of long trains of leaked objects, so we've been talking for a long time about making this one of the tinderbox tests. We're so close...
This bug is _not_ about any particular instance of a webshell leak. What it is about is that for the longest time, we were continually seeing new webshell leaks being introduced into the tree. They were very damaging, a source of regressions, and difficult to track down. To help with this, there was a desire to add something so that anyone could tell when they'd introduced a new webshell leak. That's what this is about. All of the other myriad of webshell leaks need bugs of their own.
Finding webshell leak regressions is still a very real problem. I've seen numerous instances over the last few days of mozilla not exiting, or crashing on exit, when the webshell count printed out at the end was 3 or more. Having a reliable webshell leak detector would make it a lot easier to track these.
jar: note this code has been up and running for Win and Mac for a long time now. It just took extra work to get implemented on Linux. This bug is really just, 'make the webshell test work on linux'. Given that, you can make a pdt plusness evaluation. *Question: Are we even shipping a commercial version on linux with all the console messages still turned on?
This bug has nothing to do with what we ship and whether or not it shows console messages. This is something for developers to use to make sure they don't leak webshells. Leaked webshells can make the product completely unusable and are a big nuisance for developers. This may not need to be PDT+, but it is important to trying to get these webshell leaks fixed. Please chat with some of the people who know about webshells and what's involved. This isn't "just another object" leaking.
My last comment wasn't intended as a flame against claudius. I've apologized and explained in separate email.
Thanks for the better description... and sorry if I was missing it. It is clear that this is a call for a fix for the leak measuring infrastructure. That is clearly a Very Good Thing. On the other hand, it is not (an this point) critical for beta1. More simply, we wouldn't hold beta1 if this bug wasn't fixed. I'll mark the bug as going to PDT- on 3/3. That will give you a shot at clearing this up RSN, but we need to move it off the radar, to help focus on the items that might hold beta.
Whiteboard: [PDT+] [NEED INFO] need help to verify → [PDT+] [NEED INFO] need help to verify w/b minus on 3/3
We passed the 3/3 date... so I'm transitioning this to PDT- for beta1, but it will be PDT+ for beta2. We really want this change soon... but it is now off the beat1 radar.
Whiteboard: [PDT+] [NEED INFO] need help to verify w/b minus on 3/3 → [PDT-] need help to verify plus for beta2
So, to be *perfectly* clear: this bug is to track the webshell detection tool? If so, this should be closed, because the tool is checked in. If the bug is open because it's not built by default, then i need to reassign to nisheeth to make the peformance problem with the test go away, so that it can be on by default in the debug builds at least. Performance hits of 100% make the debug builds impossible to realistically test.
Status: REOPENED → ASSIGNED
leaf: yes. this bug is for the tool. Morever, it has been working fine on Mac and Win32. (w/o the console i can no longer verify this) It has never worked on Linux. It did get checked in finally for Linux but it does not work. It was very difficult to verify and for good reason - it's not working. Currently (03/14 builds) it's firing off at the wrong time(too early). I've seen it report 1 webshell leaked only to see the webshell released shortly afterward and decrement to 0.
Ok, so it's an issue with the tool itself still, reassigning to nisheeth. Nisheeth, if you shouldn't own this bug, push back on your management so we can find the appropriate owner.
Assignee: leaf → nisheeth
Status: ASSIGNED → NEW
I have a potential fix which should prevent the error message from printing too early. I've tested it on windows and am about to test it on unix. Will check it in today once the tree opens. Accepting bug and setting target milestone to M15.
Status: NEW → ASSIGNED
Target Milestone: M14 → M15
The fix is checked in.
Status: ASSIGNED → RESOLVED
Closed: 25 years ago25 years ago
Resolution: --- → FIXED
Finally! I apparently leaked two webshells on linux yesterday. The warning message appeared to fire off at the correct time. Marking verified with the 2000033009 Linux build
Status: RESOLVED → VERIFIED
Putting on beta2 keyword radar since this was "plus for beta2" during beta1 triaging
Keywords: beta2
Keywords: nsbeta2
Product: Browser → Seamonkey
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: