Closed Bug 1218297 Opened 4 years ago Closed 4 years ago

Intermittent leakcheck | default process: 1060 bytes leaked (CondVar, EventTokenBucket, Mutex, nsDeque, nsRunnable, ...)

Categories

(Core :: Networking, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla46
Tracking Status
firefox44 + wontfix
firefox45 --- fixed
firefox46 --- fixed

People

(Reporter: cbook, Assigned: mcmanus)

References

()

Details

(Keywords: intermittent-failure, memory-leak)

Attachments

(1 file)

https://treeherder.mozilla.org/logviewer.html#?job_id=16232834&repo=mozilla-inbound

 19:57:02 WARNING - TEST-UNEXPECTED-FAIL | leakcheck | default process: 1060 bytes leaked (CondVar, EventTokenBucket, Mutex, nsDeque, nsRunnable, ...)
this goes back past oct 24th where the builds live no more.
This is a memory leak and callek mentioned this as "always" rather than intermittent at least for the past week or so.
David, thought I would start by pinging you about this memory leak. Though it is only a 1K leak, it is unclear how and whether it grows over time. Would you be able to suggest someone who ought to investigate? Or is this too minor to be looked into? Any help would be awesome. Thanks!
Flags: needinfo?(dbaron)
Not quite always: 2 green out of 25 m3 triggered, and 6 orange out of 25 m5 triggered, which is a bit worse than it was while 44 was on aurora, but not all that much worse, really.

Conveniently, we now run tests, and thus do leak checking, by directory, so we know the m3 leak is something in dom/permissions/ and the m5 is in htmlparser.

I'd like to claim it's just something like a shutdown race that Win7 loses by being the fastest debug platform, but although some of the non-Win7 things Orange Factor shows are just misstars, some of the WinXP instances are actually this, and there's no sense in which you can say that WinXP debug is fast.
Leaking the STS and related paraphernalia -> Necko
Component: General → Networking
Flags: needinfo?(dbaron)
My guess is the EventTokenBucket timer fires right as the SocketTransportService thread is shutting down. The timer event gets dropped and the EventTokenBucket is kept alive in a cycle with its timer past shutdown. nsHttpHandler probably needs to explicitly shutdown the EventTokenBucket on the STS thread during shutdown.
Flags: needinfo?(mcmanus)
Duplicate of this bug: 1233774
Why does nsThread::DispatchInternal leak the event if it fails to dispatch?  That seems ... really dumb.  And it certainly hasn't done that historically?
Flags: needinfo?(nfroyd)
thanks kyle.

This is a long standing small shutdown-only leak... a few weeks back we made a change that would have made executing this code path more common which is probably why its a recent entrant on the orange list.
Assignee: nobody → mcmanus
Flags: needinfo?(mcmanus)
Attachment #8701487 - Flags: review?(valentin.gosu) → review+
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #56)
> Why does nsThread::DispatchInternal leak the event if it fails to dispatch? 
> That seems ... really dumb.  And it certainly hasn't done that historically?

Dispatching to the CurrentThread shouldn't do this.  Dispatching to a different thread almost has to, because otherwise you might be releasing objects on the wrong thread.  It's certainly possible that I screwed up a review in the last couple months, though.
Flags: needinfo?(nfroyd)
Dispatching to a different thread didn't use to do this. Runnables were assumed to be threadsafe, and could be refcounted on any thread. I assume when we added the already_AddRefed stuff we changed this (because then we didn't create the reference so we don't know if it's safe to destroy) but that seems like a design problem ...
Flags: needinfo?(nfroyd)
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #62)
> Dispatching to a different thread didn't use to do this. Runnables were
> assumed to be threadsafe, and could be refcounted on any thread. I assume
> when we added the already_AddRefed stuff we changed this (because then we
> didn't create the reference so we don't know if it's safe to destroy) but
> that seems like a design problem ...

Well, the runnables themselves are certainly safe to refcount anywhere, but the contents of the runnables might not be...
Flags: needinfo?(nfroyd)
Right ... and that's a problem of the runnable ...

Maybe we should take this to a mailing list?
https://hg.mozilla.org/mozilla-central/rev/3aef96e9b67b
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla46
Patrick, should we uplift this patch to Beta44? If yes, please nominate for uplift.
Flags: needinfo?(mcmanus)
(In reply to Ritu Kothari (:ritu) from comment #67)
> Patrick, should we uplift this patch to Beta44? If yes, please nominate for
> uplift.

no - this is a small leak that only happens during shutdown. It has no impact on end users.
Flags: needinfo?(mcmanus)
Wontfix for 44 based on comment 69
Please nominate this for Aurora uplift, though. Gecko 45 is going to be with us in ESR form for the next year, so it would be good to clean up oranges like this on it.
Flags: needinfo?(mcmanus)
(In reply to Ryan VanderMeulen [:RyanVM] from comment #73)
> Please nominate this for Aurora uplift, though. Gecko 45 is going to be with
> us in ESR form for the next year, so it would be good to clean up oranges
> like this on it.

do you really feel that's worthwhile? I think changes that only benefit developers, such as shutdown leaks, should stay on central.
Flags: needinfo?(mcmanus)
No, I don't think it's a good idea to leave a frequent intermittent failure in place for the next year on ESR45, especially when there's a low-risk fix in hand. It causes unnecessary confusion and frustration for both developers and Release Management. Unless my assumption about risk is incorrect? What's the downside here?
Flags: needinfo?(mcmanus)
We do often backport fixes for frequent intermittent failures to ESR branches.
Comment on attachment 8701487 [details] [diff] [review]
eventtokenbucket shutdown leak  try: -b do -p linux64,macosx64,win32 -u all -t none

Approval Request Comment
[Feature/regressing bug #]: long time feature (2+ years)
[User impact if declined]: absolutely none
[Describe test coverage new/current, TreeHerder]: improves intermittent orange
[Risks and why]: low - shutdown path change to improve usability of test tools
[String/UUID change made/needed]: none
Flags: needinfo?(mcmanus)
Attachment #8701487 - Flags: approval-mozilla-aurora?
Comment on attachment 8701487 [details] [diff] [review]
eventtokenbucket shutdown leak  try: -b do -p linux64,macosx64,win32 -u all -t none

As KWierso pointed out, taking this in Aurora45 is helpful also to get this ESR45 branch (to be created soon). Taking it.
Attachment #8701487 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
(In reply to Ritu Kothari (:ritu) from comment #79)
> Comment on attachment 8701487 [details] [diff] [review]
> eventtokenbucket shutdown leak  try: -b do -p linux64,macosx64,win32 -u all
> -t none
> 
> As KWierso pointed out, taking this in Aurora45 is helpful also to get this
> ESR45 branch (to be created soon). Taking it.

RyanVM also mentioned it in comment 75 and I am always nice to Sheriffs (new and ex)! :)
You need to log in before you can comment on or make changes to this bug.