Intermittent leakcheck | tab process: 10136 bytes leaked (ChannelEventQueue, ChildDNSService, CondVar, ConsoleReportCollector, DataStorage, ...)

RESOLVED FIXED in Firefox 55

Status

()

defect
RESOLVED FIXED
2 years ago
2 years ago

People

(Reporter: intermittent-bug-filer, Assigned: schien)

Tracking

({intermittent-failure, memory-leak})

unspecified
mozilla56
Points:
---

Firefox Tracking Flags

(firefox-esr52 unaffected, firefox54 unaffected, firefox55 fixed, firefox56 fixed)

Details

(Whiteboard: [necko-active][stockwell fixed])

Attachments

(2 attachments)

on my radar, will check if it related to my recent PBg-Http patches.
Assignee: nobody → schien
Keywords: mlk
Whiteboard: [necko-active]
Comment hidden (Intermittent Failures Robot)
looks like it is reproducible on Mac, might have better change to reproduce it locally.
Comment hidden (Intermittent Failures Robot)
Comment hidden (Intermittent Failures Robot)
Comment hidden (Intermittent Failures Robot)
this is failing quite frequently on osx/win8 debug.  Win8 is non-e10s, osx is e10s.

here is a more recent osx debug log from trunk:
https://treeherder.mozilla.org/logviewer.html#?repo=mozilla-inbound&job_id=113535234&lineNumber=14171

this looks to be leaking in this directory:
chrome://mochitests/content/chrome/dom/xul/test

here is the bloat view:
21:25:24     INFO - == BloatView: ALL (cumulative) LEAK AND BLOAT STATISTICS, tab process 664
21:25:24     INFO - 
21:25:24     INFO -      |<----------------Class--------------->|<-----Bytes------>|<----Objects---->|
21:25:24     INFO -      |                                      | Per-Inst   Leaked|   Total      Rem|
21:25:24     INFO -    0 |TOTAL                                 |       49    10248|   48754      103|
21:25:24     INFO -   30 |ChannelEventQueue                     |      160      160|       1        1|
21:25:24     INFO -   31 |ChildDNSService                       |      168      168|       1        1|
21:25:24     INFO -   43 |CondVar                               |       64      128|      59        2|
21:25:24     INFO -   45 |ConsoleReportCollector                |      112      112|       7        1|
21:25:24     INFO -   57 |DataStorage                           |      456      912|       3        2|
21:25:24     INFO -   93 |HttpBackgroundChannelChild            |       72       72|       1        1|
21:25:24     INFO -   94 |HttpChannelChild                      |     2232     2232|       1        1|
21:25:24     INFO -  113 |LoadInfo                              |      200      200|     184        1|
21:25:24     INFO -  126 |Mutex                                 |       80      880|     160       11|
21:25:24     INFO -  142 |PHttpBackgroundChannelChild           |       40       40|       1        1|
21:25:24     INFO -  143 |PHttpChannelChild                     |       40       40|       1        1|
21:25:24     INFO -  161 |PollableEvent                         |       24       24|       1        1|
21:25:24     INFO -  185 |ReentrantMonitor                      |       40       80|      32        2|
21:25:24     INFO -  187 |RequestContext                        |       64       64|       2        1|
21:25:24     INFO -  188 |RequestContextService                 |       88       88|       1        1|
21:25:24     INFO -  194 |Runnable                              |       40       80|     366        2|
21:25:24     INFO -  197 |SchedulerEventTarget                  |       48       48|      16        1|
21:25:24     INFO -  236 |TabGroup                              |      240      240|       1        1|
21:25:24     INFO -  264 |WeakReference<nsDocShell>             |       24       24|      10        1|
21:25:24     INFO -  313 |nsAuthURLParser                       |       24       24|       2        1|
21:25:24     INFO -  359 |nsDSURIContentListener                |       88       88|       1        1|
21:25:24     INFO -  362 |nsDefaultURIFixup                     |       24       24|       1        1|
21:25:24     INFO -  369 |nsDocLoader                           |      424      424|       2        1|
21:25:24     INFO -  370 |nsDocShell                            |     1168     1168|       1        1|
21:25:24     INFO -  371 |nsDocShell::InterfaceRequestorProxy   |       32       32|       1        1|
21:25:24     INFO -  377 |nsDocumentOpenInfo                    |       88       88|       2        1|
21:25:24     INFO -  406 |nsHttpAuthCache::OriginClearObserver  |       32       64|       2        2|
21:25:24     INFO -  407 |nsHttpHandler                         |      720      720|       1        1|
21:25:24     INFO -  408 |nsHttpRequestHead                     |      136      136|       1        1|
21:25:24     INFO -  410 |nsIOService                           |      288      288|       1        1|
21:25:24     INFO -  427 |nsJSPrincipals                        |       24       24|      33        1|
21:25:24     INFO -  430 |nsLoadGroup                           |       16       16|       2        1|
21:25:24     INFO -  432 |nsMainThreadPtrHolder<T>              |       40       80|       2        2|
21:25:24     INFO -  477 |nsSiteSecurityService                 |       80       80|       1        1|
21:25:24     INFO -  478 |nsSocketTransportService              |      320      320|       1        1|
21:25:24     INFO -  479 |nsStandardURL                         |      272      272|     834        1|
21:25:24     INFO -  483 |nsStringBuffer                        |        8      144|   14195       18|
21:25:24     INFO -  494 |nsTArray_base                         |        8      168|    9096       21|
21:25:24     INFO -  502 |nsURILoader                           |       32       32|       1        1|
21:25:24     INFO -  505 |nsVariant                             |       72       72|       2        1|
21:25:24     INFO -  508 |nsWeakReference                       |       40      360|      44        9|
21:25:24     INFO -  511 |nsWebNavigationInf



I assume this is in the core::networking because of the ChildDNSService and related bits on the bloatview.

:mcmanus, I see you are the triage owner for core::networking, can you find someone on the networking team to look into this failure sometime in the next two weeks?
Flags: needinfo?(mcmanus)
Whiteboard: [necko-active] → [necko-active][stockwell needswork]
sc already took the bug
Flags: needinfo?(mcmanus)
Comment hidden (Intermittent Failures Robot)
Sorry for the late response, I'll provide analysis in these two days.
Reproduced on my local machine with |./mach mochitest --run-until-failure dom/xul/test/test_bug1271240.xul| for few times.

The leakage is caused by the reference cycle between HttpChannelChild and HttpBackgroundChannelChild.
The reference cycle is supposed to be removed when HttpBackgroundChannelChild::ActorDestroyed, however this operation is pending due until HttpChannelChild receives OnStartRequest IPC message. Unfortunately the OnStartRequest IPC message is dropped because of PContent shutdown procedure. So, the reference removal is never executed, which leaks memory until content process is terminated.
Posted file fail-http.log
relevant HTTP MOZ log.

HttpBackgroundChannelChild @0x7f3a48dafd30 and HttpChannelChild @0x7f3a48d40000 are the leaked object.
Comment hidden (mozreview-request)
Comment on attachment 8887431 [details]
Bug 1369571 - break reference cycle between HttpChannelChild and HttpBackgroundChannelChild while PHttpChannel is destroyed.

cancel r? because try result looks bad.
Attachment #8887431 - Flags: review?(honzab.moz)
Comment hidden (mozreview-request)
Comment hidden (Intermittent Failures Robot)
@mayhemer, is there anything blocking the review? I'd like to land this patch before merge day.
Flags: needinfo?(honzab.moz)

Comment 20

2 years ago
mozreview-review
Comment on attachment 8887431 [details]
Bug 1369571 - break reference cycle between HttpChannelChild and HttpBackgroundChannelChild while PHttpChannel is destroyed.

https://reviewboard.mozilla.org/r/158278/#review167396

::: netwerk/protocol/http/HttpBackgroundChannelChild.cpp:124
(Diff revision 2)
>  
>    // HttpChannelChild is not going to handle any incoming message.
>    mChannelChild = nullptr;
> +
> +  // Remove pending IPC messages as well.
> +  mQueuedRunnables.Clear();

can any of the messages hold a reference to an object we expect to release on some particular thread?  in that case you would need to release them on the respective threads one by one.

::: netwerk/protocol/http/HttpChannelChild.cpp:3620
(Diff revision 2)
> +  MOZ_ASSERT(NS_IsMainThread());
> +
> +  // OnStartRequest might be dropped if IPDL is destroyed abnormally
> +  // and BackgroundChild might have pending IPC messages.
> +  // Clean up BackgroundChild at this time to prevent memleak.
> +  if (aWhy != Deletion) {

any reason why we shouldn't do this all the time?  why we don't need to do this when the reason is Deletion?  (no docs for this state..)
c20
Flags: needinfo?(honzab.moz)
Assignee

Comment 23

2 years ago
mozreview-review
Comment on attachment 8887431 [details]
Bug 1369571 - break reference cycle between HttpChannelChild and HttpBackgroundChannelChild while PHttpChannel is destroyed.

https://reviewboard.mozilla.org/r/158278/#review167616

::: netwerk/protocol/http/HttpBackgroundChannelChild.cpp:124
(Diff revision 2)
>  
>    // HttpChannelChild is not going to handle any incoming message.
>    mChannelChild = nullptr;
> +
> +  // Remove pending IPC messages as well.
> +  mQueuedRunnables.Clear();

No, all the runnables are contains a copy of data received from IPC. Nothing is required to release on specific thread.

::: netwerk/protocol/http/HttpChannelChild.cpp:3620
(Diff revision 2)
> +  MOZ_ASSERT(NS_IsMainThread());
> +
> +  // OnStartRequest might be dropped if IPDL is destroyed abnormally
> +  // and BackgroundChild might have pending IPC messages.
> +  // Clean up BackgroundChild at this time to prevent memleak.
> +  if (aWhy != Deletion) {

The only document I found is in the original bug: https://bugzilla.mozilla.org/show_bug.cgi?id=529005#c6
Deletion is paired with Send__delete__(), which is the normal case of closing an IPDL actor.

On parent side we might close the background channel immediately after all messages are sent. If we handle the ActorDestroy immediately we will either lose the chance to process those enqueued messages. Normal Deletion should be treated in order with other IPC messages.
@mayhemer, please see my response in comment #23.
Flags: needinfo?(honzab.moz)
Thanks for this reply!  This is r+ then.
Flags: needinfo?(honzab.moz)

Comment 26

2 years ago
mozreview-review
Comment on attachment 8887431 [details]
Bug 1369571 - break reference cycle between HttpChannelChild and HttpBackgroundChannelChild while PHttpChannel is destroyed.

https://reviewboard.mozilla.org/r/158278/#review167748
Attachment #8887431 - Flags: review?(honzab.moz) → review+
(maybe these comments in some simplified form should appear in the code as well.)

Comment 28

2 years ago
Pushed by schien@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/039ee2964fb9
break reference cycle between HttpChannelChild and HttpBackgroundChannelChild while PHttpChannel is destroyed. r=mayhemer
Please request Beta approval on this too.

Comment 30

2 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/039ee2964fb9
Status: NEW → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla56
Comment on attachment 8887431 [details]
Bug 1369571 - break reference cycle between HttpChannelChild and HttpBackgroundChannelChild while PHttpChannel is destroyed.

Approval Request Comment
[Feature/Bug causing the regression]: Bug 1015466
[User impact if declined]: memleak until content process shutdown
[Is this code covered by automated tests?]: yes
[Has the fix been verified in Nightly?]: yes
[Needs manual test from QE? If yes, steps to reproduce]: no
[List of other uplifts needed for the feature/fix]: n/a
[Is the change risky?]: low
[Why is the change risky/not risky?]: change abnormal shutdown handling
[String changes made/needed]: no
Flags: needinfo?(schien)
Attachment #8887431 - Flags: approval-mozilla-beta?
Comment hidden (Intermittent Failures Robot)
Comment on attachment 8887431 [details]
Bug 1369571 - break reference cycle between HttpChannelChild and HttpBackgroundChannelChild while PHttpChannel is destroyed.

necko leak fix, for 55 rc1
Attachment #8887431 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

Comment 34

2 years ago
bugherderuplift
https://hg.mozilla.org/releases/mozilla-beta/rev/2cd8c7f13e6c
Whiteboard: [necko-active][stockwell needswork] → [necko-active][stockwell fixed]
Duplicate of this bug: 1379094
You need to log in before you can comment on or make changes to this bug.