Closed Bug 1078448 Opened 5 years ago Closed 2 years ago

Callscreen app instances linger in bfcache after being closed

Categories

(Firefox OS Graveyard :: Gaia::Dialer, defect)

defect
Not set

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: drs, Unassigned)

References

Details

(Whiteboard: [planned-sprint c=6])

Attachments

(3 files, 5 obsolete files)

When the callscreen closes because all calls have ended, we kill the app and restart it with no calls. While investigating bug 1070066, we found that the old instances linger in the bfcache.

In particular, see the conversation beginning at bug 1070066 comment 19.
I don't know what I should say here.

Are we sure those window objects are in bfcache?

Has anyone tried to explicitly disable bfcache for callscreen?
(add dummy unload event listener)
Flags: needinfo?(bugs)
I'll take the liberty of need-infoing Doug regarding comment 1 ;)
Flags: needinfo?(drs+bugzilla)
Assignee: nobody → drs+bugzilla
Flags: needinfo?(drs+bugzilla)
Whiteboard: [planned-sprint c=]
Target Milestone: --- → 2.1 S7 (24Oct)
Whiteboard: [planned-sprint c=] → [planned-sprint c=3]
Whiteboard: [planned-sprint c=3] → [planned-sprint c=1]
I tried adding a dumb listener on the "unload" event, to see if it helps with bug 1081714, but so far it does nothing and I still see a lot of DOMWINDOW callscreen.
Doug, I have some PR to submit, sorry for stealing this :)
Assignee: drs+bugzilla → lissyx+mozillians
Attached file Gaia PR (obsolete) —
With this change, I'm not able to reproduce anymore bug 1074379 and bug 1081714, as documented on bug 1074379 comment 44 and bug 1074379 comment 45.
We don't see anymore callscreen leaking in this memory report done after running the 50 voicemail calls UI test.
[Blocking Requested - why for this release]: if it's confirmed to be fixing bug 1081714 and bug 1074379, those two are 2.1+
blocking-b2g: --- → 2.1?
Assignee: lissyx+mozillians → nobody
blocking-b2g: 2.1? → ---
Component: Gaia::Dialer → Event Handling
Product: Firefox OS → Core
Target Milestone: 2.1 S7 (24Oct) → ---
(In reply to Olli Pettay [:smaug] from comment #1)
> I don't know what I should say here.
> 
> Are we sure those window objects are in bfcache?
> 
> Has anyone tried to explicitly disable bfcache for callscreen?
> (add dummy unload event listener)

Yes, that does the trick. I'm resetting this bug to track the underlying platform issue, and I'll file a followup to use this workaround for now in Gaia.
Flags: needinfo?(bugs)
What is the underlying platform issue?
Flags: needinfo?(bugs)
The fact the pages may go to bfcache?
Duplicate of this bug: 1074379
Duplicate of this bug: 1081714
[Blocking Requested - why for this release]: This patch fixes bug 1074379 and bug 1081714 which were both 2.1 blockers. Carrying the status of them to that bug.
blocking-b2g: --- → 2.1?
Is anyone doing some investigation into the underlying platform problem? We are seeing other problems with pages not loading, like bug 1080753.
Flags: needinfo?(lissyx+mozillians)
Flags: needinfo?(drs.bugzilla)
(In reply to Kevin Grandon :kgrandon from comment #14)
> Is anyone doing some investigation into the underlying platform problem? We
> are seeing other problems with pages not loading, like bug 1080753.

Not that I'm aware of.
Flags: needinfo?(drs.bugzilla) → needinfo?(bugs)
No, I don't know bfcache and I'm already busy with settings.
Flags: needinfo?(lissyx+mozillians)
Some answer to comment 9 and comment 10 would be nice. I really don't understand what the platform issue is here. Do we have some APIs which haven't been designed so that the page can be in bfcache, yet the implementation doesn't prevent bfcache on the pages they are used or what?
(I don't know which APIs are being used in Callscreen app)
Flags: needinfo?(bugs)
taking for 2.1 since it blocks a blocker.
blocking-b2g: 2.1? → 2.1+
Duplicate of this bug: 1087179
We worked around this issue in bug 1083729 to fix bug 1074379, and bug 1081714. Thus I don't think this should be a blocker.
No longer blocks: 1074379, 1081714
blocking-b2g: 2.1+ → 2.1?
See Also: → 1074379, 1081714
Based on my discussion with Olli, it seems that this is expected behavior, and I agree.

Here, we reload the Callscreen window from the System app:
https://github.com/mozilla-b2g/gaia/blob/master/apps/system/js/callscreen_window.js#L152

Olli says that we can remove the Callscreen iframe and re-add it back to the document. This destroys and recreates the DocShell, and the bfcache along with it.
Component: Event Handling → Gaia::Dialer
Product: Core → Firefox OS
Target Milestone: --- → 2.1 S8 (7Nov)
Thanks, removing the 2.1 nom.
blocking-b2g: 2.1? → ---
I still think we should be investigating the platform issue as a 2.1+ blocker here. This may be causing other problems in gaia.
I'm investigating this a bit, but so far I haven't really seen any real platform issue.
System app just loads a page, then about:blank, then a page ,then about:blank and so on.

But we could possibly clean up session history/bfcache in the relevant iframe had some specific
attribute. (I was first thinking not using bfcache for mozbrowser things, but apparently we're dealing with a mozbrowser iframe here, and bfcache is very useful in the normal browsing case.)
But perhaps mozapp should disabled bfcache? I never remember how mozbrowser and mozapp are related to each other (like, when do we need mozapp)
Bfcache is very useful, and I think we should be moving to a state where we have *more* of it, not less. I have a hunch that bug 1080753 is being caused by the same issue here though, still investigating.
Oops, forgot to mention that bug 1080753 has problems when dealing with web content, in a non-mozapp state.
So in normal context we wouldn't use bfcache for iframe, but the whole point of
mozbrowser/app is that iframe starts to behave as a top level browsing context and we do have
bfcache enabled.
Setting browser.sessionhistory.max_total_viewers to 0 should disable the whole bfcache.
Perhaps we want that in the parent process and child processes could have
browser.sessionhistory.max_total_viewers 1 (b2g.js has 1 for that).
Also, if one wants to explicitly do some bfcache eviction, there are various
nsISHistoryInternal.evict*() methods for that.
Assignee: nobody → thills
See Also: → 1083729
Status: NEW → ASSIGNED
Attached patch WIP patch for feedback (obsolete) — Splinter Review
WIP patch for feedback.  Couple of notes:
1) removed the workaround in index.js.  
2) I have a doubt about whether the _unregisterEvents is necessary.  My concern was that we'll be leaking the events if we constantly re-register them without unregistering.

Thanks,
-tamara
Attachment #8513472 - Flags: feedback?(drs.bugzilla)
Comment on attachment 8513472 [details] [diff] [review]
WIP patch for feedback

Review of attachment 8513472 [details] [diff] [review]:
-----------------------------------------------------------------

::: apps/system/js/app_window.js
@@ +1011,5 @@
> +    this.constructor.REGISTERED_EVENTS.forEach(function iterator(evt) {
> +      this.debug('removing ' + evt + ' event handler ...');
> +      this.element.removeEventListener(evt, this);
> +    }, this);
> +  };

From my memory profiling, it looks like we missed this originally. I'm not sure where the additional listeners are being registered from, though. It seems that they will only be attached when we call `this.render()`. Please investigate this further.

::: apps/system/js/callscreen_window.js
@@ +165,5 @@
> +        }
> +      }
> +    }
> +
> +    this.render();

This is overly complicated. All we have to do is detach `this.browser.element` from its parent and then re-attach it.
Attachment #8513472 - Flags: feedback?(drs.bugzilla) → feedback-
Attached patch WIP 1078448v3.diff (obsolete) — Splinter Review
Hi Doug,

I did more investigation on eventListeners all around.  

1. After looking at the memory profiles, I don't have any evidence that we are leaking eventListeners in the Callscreen due to the changes. 

2. After looking at the memory profiles on the system, I don't believe it's necessary to unregister the eventListeners.  I removed that and did not find any evidence that we are leaking them on system and found the code to be unnecessary.  

As for the other listeners in system, each Window app that inherits itself from AppWindow.js has a list of REGISTERED_EVENTS... Callscreen's is a copy of what AttentionWindow's is.  there are close to 1000 eventListeners registered before we even make a call.  When I looked at stacktrace for AppWindow._registerEvents HomeScreenWindow was one of the ones that also called this function.

There are probably tons more eventListeners in system not coming from _registerEvents, but there is no way to get a list of all the eventListeners from the memory tool, just a count.

Thanks,

-tamara
Attachment #8513472 - Attachment is obsolete: true
Attachment #8514419 - Flags: feedback?(drs.bugzilla)
Comment on attachment 8514419 [details] [diff] [review]
WIP 1078448v3.diff

Review of attachment 8514419 [details] [diff] [review]:
-----------------------------------------------------------------

Looks reasonable. When we're ready to go into review with this, we should flag Etienne as well.
Attachment #8514419 - Flags: feedback?(drs.bugzilla) → feedback+
Are we still not doing investigation into the platform issue here? Are we sure that this can't impact other apps?
Attached file PR
Attachment #8518533 - Flags: review?(drs.bugzilla)
Attachment #8518533 - Flags: feedback?(etienne)
Comment on attachment 8518533 [details] [review]
PR

See my comments on the PR.

(In reply to Kevin Grandon :kgrandon from comment #34)
> Are we still not doing investigation into the platform issue here? Are we
> sure that this can't impact other apps?

Setting ni on myself to file bugs and do a writeup on this.
Flags: needinfo?(drs.bugzilla)
Attachment #8518533 - Flags: review?(drs.bugzilla)
Attachment #8518533 - Flags: review-
Attachment #8518533 - Flags: feedback?(etienne)
Whiteboard: [planned-sprint c=1] → [planned-sprint c=1][in-sprint=v2.1-S8]
Target Milestone: 2.1 S8 (7Nov) → 2.1 S9 (21Nov)
Attached file Updated PR after last review (obsolete) —
Hi Doug,

I updated the PR with changes to the test to use validate the iframe is removed/added, vs. using the spies.

To address your comment about the setTimeout.  I don't believe we need this for the functionality as I tested without it and saw no leaks and everything seemed to function fine.  It was there in the code, so I did not remove it as there could be something we're not aware of.  Also, I suspect it may have been there to help with the testing.

Thanks,

-tamara
Attachment #8505522 - Attachment is obsolete: true
Attachment #8514419 - Attachment is obsolete: true
Attachment #8525045 - Flags: review?(drs.bugzilla)
Comment on attachment 8525045 [details] [review]
Updated PR after last review

Please rebase this before we review it.
Attachment #8525045 - Flags: review?(drs.bugzilla)
Attached file updated PR after rebase (obsolete) —
Hi Doug,

This is updated PR after rebasing and retesting bug 834530 to make sure it still works after rebase.

Thanks,

-tamara
Attachment #8525045 - Attachment is obsolete: true
Attachment #8525385 - Flags: review?(drs.bugzilla)
Comment on attachment 8525385 [details] [review]
updated PR after rebase

Looks good! I left some nit-picky comments on the PR. Please address them and then we'll merge.

As I said, we'll have to discuss the platform side of this more. Leaving my ni for now.

Please remember to make a demo and put it on the sprint demo page:
https://wiki.mozilla.org/FirefoxOS/Comms/Dialer/Sprint/v2.1-S9#Demos
Flags: needinfo?(thills)
Attachment #8525385 - Flags: review?(drs.bugzilla) → review+
Hi Doug,

I had a broken test in index_test.js.  This update has that fixed.

Thanks,
-tamara
Attachment #8525385 - Attachment is obsolete: true
Attachment #8526341 - Flags: review?(drs.bugzilla)
Comment on attachment 8526341 [details] [review]
Update to patch -- broken test

In the future, please try to remember to include changes as additional commits. I'm assuming that the only changes were to 'index_test.js'.

I left one nit comment on the PR, but it otherwise looks good.
Attachment #8526341 - Flags: review?(drs.bugzilla) → review+
Backed out: https://github.com/mozilla-b2g/gaia/commit/a3e2f4d39082ffc0ee67c72c48d6e7be2486010e

For causing bug 1105747.
Status: RESOLVED → REOPENED
Depends on: 1105747
Resolution: FIXED → ---
Flags: needinfo?(drs.bugzilla)
Whiteboard: [planned-sprint c=1][in-sprint=v2.1-S8] → [c=1][in-sprint=v2.1-S9]
Target Milestone: 2.1 S9 (21Nov) → 2.2 S1 (5dec)
For comment 36.
Flags: needinfo?(drs.bugzilla)
Olli suggested calling `nsIDocument::DisableBFCaching()` from somewhere in Gecko to disable the BFCache for System iframes, which I believe would fix this issue and any others like it. Unfortunately, I don't know enough about the Gecko code that supports the System app to know where this would go. Kevin, do you know who we could talk with about this?
Flags: needinfo?(drs.bugzilla) → needinfo?(kgrandon)
(In reply to Doug Sherk (:drs) (use needinfo?) from comment #46)
> Olli suggested calling `nsIDocument::DisableBFCaching()` from somewhere in
> Gecko to disable the BFCache for System iframes,

Well, to be precise I suggested calling DisableBFCaching in the API implementation of the thing
callscreen uses. If the API implementation can't deal with pages being in bfcache, it should disable bfcache.
(I don't know which all APIs callscreen uses.).
Callscreen just uses a mozapp="", no? I suppose we could potentially disable it, or maybe create some new iframe property to disable it? I don't really know that code all too well though.

I'm not sure if it's realted or not, but I've just filed bug 1106280 which I think could be due to some bfcache bug?
Flags: needinfo?(kgrandon)
I mean the API which one can use to answer to a call and so.  What is actually causing the issues?
Usually web pages work just fine when they enter to bfcache.
Whiteboard: [c=1][in-sprint=v2.1-S9] → [planned-sprint c=?]
Target Milestone: 2.2 S1 (5dec) → 2.2 S4 (23jan)
Whiteboard: [planned-sprint c=?] → [planned-sprint c=6]
Target Milestone: 2.2 S4 (23jan) → 2.2 S5 (6feb)
Target Milestone: 2.2 S5 (6feb) → 2.2 S6 (20feb)
Target Milestone: 2.2 S6 (20feb) → ---
Assignee: thills → administration
Firefox OS is not being worked on
Status: REOPENED → RESOLVED
Closed: 5 years ago2 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.