Closed Bug 993892 Opened 6 years ago Closed 3 years ago

[Message] Threads and messages are kept loading after switch to background

Categories

(Firefox OS Graveyard :: Gaia::SMS, defect, P2)

ARM
Gonk (Firefox OS)
defect

Tracking

(tracking-b2g:backlog)

RESOLVED WONTFIX
tracking-b2g backlog

People

(Reporter: ting, Unassigned)

References

Details

Attachments

(2 files)

STR:
  1. $ cd gaia; make reference-workload-heavy
  2. Launch Messages application, enter BIG-THREAD-MIXED
  3. Press power button to lock screen, but thlui_renderThreads() is still called

Version:
  v1.3t
  gaia  = d9ac78148591a50356dd5a82d2d295eeb7c234a6
  gecko = 04cea893d42ad3291e2e75b1d66a6441ec8f2716

Loading threads/messages when it is in background causes problems like:

  - Messages application to be killed easily by LMK as it consumes memory
  - The foreground application needs to compete for resource

One possible solution is to suspend/resume loading by visibilitychange event.
Blocks: 989713
Assignee: nobody → schung
blocking-b2g: --- → 1.3T?
Summary: Threads and messages are kept loading after switch to background → [Message] Threads and messages are kept loading after switch to background
Would be even better if the loading can be done incrementally while scrolling.
(In reply to Ting-Yu Chou [:ting] from comment #1)
> Would be even better if the loading can be done incrementally while
> scrolling.

This is bug 865743 comment 7.

This is a lot easier to say than to do properly ;)


About the main problem, we'll need to figure what takes the memory: is it the Gecko API or is the Gaia application. If it's the Gaia application, we can try to do what you're saying (even if it's challenging) but if it's the Gecko API then it's probably even more difficult.
On Tarako, USS goes to 2xMB while at STR step 2, and stays around 16MB when there's no cursor.continue(). But I didn't check which part of continue() requires memory.
Yan's patch in bug 972731 might also help with this. I think that if we can make it to land it might also be easier to start/stop adding the threads if need be.
Guys, do you feel confident we can have a not-too-risky solution in the coming weeks?
Ni? Steve for comment 5
Flags: needinfo?(schung)
It's very easy to just stop rendering, what's not easy is to resume it. If we want a hacky solution, we could start rendering from scratch instead of resuming, I think that would be safe enough.

Steve, do you have a better idea?
(In reply to Julien Wajsberg [:julienw] from comment #7)
> It's very easy to just stop rendering, what's not easy is to resume it.

If you have time, could you please let me know why is that?
(In reply to Julien Wajsberg [:julienw] from comment #7)
> It's very easy to just stop rendering, what's not easy is to resume it. If
> we want a hacky solution, we could start rendering from scratch instead of
> resuming, I think that would be safe enough.
> 
> Steve, do you have a better idea?

I agree that stop rendering is easy and resume is another story. Maybe we can keep the thread-list and thread cursor as global object. These 2 cursors will be cleaned up when traversal finished and reset when new request start. Hey julien , I'll try to give a test WIP this week and feel free to ping me if you think this idea won't work or too risky, thanks!
Flags: needinfo?(schung)
(In reply to Steve Chung [:steveck] from comment #9)
> I agree that stop rendering is easy and resume is another story. Maybe we
> can keep the thread-list and thread cursor as global object. These 2 cursors
> will be cleaned up when traversal finished and reset when new request start.
> Hey julien , I'll try to give a test WIP this week and feel free to ping me
> if you think this idea won't work or too risky, thanks!

Careful with that as the SMS app has multiple entry-points (activities mainly) so you'll have to accommodate different scenarios. An additional issue is that part of the SMS app logic is dependent on the DOM (it's bad, we know) so if you haven't inserted all the nodes certain things will behave differently (e.g. selecting all messages).
I don't like keeping the cursor around because I fear this would prevent some resources from being freed. That's why I'd rather do a request again... We don't need to clear whatever has already been added though, so I don't think it's that bad.

(In reply to Ting-Yu Chou [:ting] from comment #8)
> (In reply to Julien Wajsberg [:julienw] from comment #7)
> > It's very easy to just stop rendering, what's not easy is to resume it.
> 
> If you have time, could you please let me know why is that?

Especially because the code we already have has nothing to do this, so we need to do it from scratch. And new code inherently can have more bugs, that's why for 1.3t I'd prefer a simpler solution.
Just tried to stop the message cursor today, but it doesn't work as expect because:

1) Moving message app to backgroung might happend before visibility changed event. Message app would be killed once in backgroung and system does not have sufficient memory.
2) Cursor still has to wait for this round complete then stops traversal and memory would not be freed immediately. So there still a lot of time between we detect the app is hidden and memory comsumption actually drop down.

These 2 issues makes this 'stop cursor while in the background' idea no benifit to the oom problem. I would prefer to focus the patch in bug 972731 that we traverse the cursor incrementally by page scrolling.

Ting-Yu has some rough measurement about memory usage. He said the usage would increase 4~5 MB or more while cursor is running. I'll investigate the comsumption bottleneck is in element rendering or cursor api.
To me, the main issue here is that the foreground application competes for memory/cpu. I don't really care that the Messages app is killed on Tarako, I think that's expected anyway.

(In reply to Steve Chung [:steveck] from comment #12)
> Just tried to stop the message cursor today, but it doesn't work as expect
> because:
> 
> 1) Moving message app to backgroung might happend before visibility changed
> event. Message app would be killed once in backgroung and system does not
> have sufficient memory.

If the messages app is killed, then the competition with the foreground application ends and I think it's fine.

> 2) Cursor still has to wait for this round complete then stops traversal and
> memory would not be freed immediately. So there still a lot of time between
> we detect the app is hidden and memory comsumption actually drop down.

What is "a lot of time" for you? Have you tried to measure it?

> 
> These 2 issues makes this 'stop cursor while in the background' idea no
> benifit to the oom problem. I would prefer to focus the patch in bug 972731
> that we traverse the cursor incrementally by page scrolling.
> 
> Ting-Yu has some rough measurement about memory usage. He said the usage
> would increase 4~5 MB or more while cursor is running. I'll investigate the
> comsumption bottleneck is in element rendering or cursor api.

I bet it's in the cursor API because it tries to cache the db result to make later requests faster. You can ask Vicamo.


I'd like that we focus on the CPU consumption issue here.

Here is some ideas:
* the simplest one, hence my prefered idea:
 + have a boolean "isRenderingFinished" to know whether we finished rendering.
 + on visibilitychange event, if document.hidden is true and isRenderingFinished is false, stop iterating the cursor and appending threads.
 + on visibilitychange event, if document.hidden is false and isRenderingFinished is false, do the rendering again, but don't change threads that haven't change.
 + This probably needs to put all the thread data in the thread dataset so that updateThread knows whether this has changed. Or (java style) we can make Threads calculate a hash of a Thread data and store that hash instead (I prefer this because this keeps the things to check in one place).

* on visibilitychange event, we stop appending threads, but we still iterate the cursor and store the data in an array. When the application is displayed again, we resume displaying threads, using the stored data. The main difficulty here is that we need to update the array data (for example if we receive a message). The good thing is that I think we should eventually have such an array to make an efficient list.

Some more thought:
* We can do the updateThread change of the first idea anyway because updateThread is too dumb right now.
* the first idea seems better for a 1.3+ patch
* the second idea seems better for a master patch
(In reply to Julien Wajsberg [:julienw] from comment #13)
> * the simplest one, hence my prefered idea:
>  + have a boolean "isRenderingFinished" to know whether we finished
> rendering.
>  + on visibilitychange event, if document.hidden is true and
> isRenderingFinished is false, stop iterating the cursor and appending
> threads.
>  + on visibilitychange event, if document.hidden is false and
> isRenderingFinished is false, do the rendering again, but don't change
> threads that haven't change.
>  + This probably needs to put all the thread data in the thread dataset so
> that updateThread knows whether this has changed. Or (java style) we can
> make Threads calculate a hash of a Thread data and store that hash instead
> (I prefer this because this keeps the things to check in one place).

I think this is a no brainer, if we can keep the patch small enough we could probably land this on V1.3T too.

> * on visibilitychange event, we stop appending threads, but we still iterate
> the cursor and store the data in an array. When the application is displayed
> again, we resume displaying threads, using the stored data. The main
> difficulty here is that we need to update the array data (for example if we
> receive a message). The good thing is that I think we should eventually have
> such an array to make an efficient list.

This  would be lovely. In general decoupling the code dealing with threads' data (i.e. the data model) from the rendering part (view) will be beneficial in the long run.
triage: this is nice to have for tarako. let's not block on this unless we have something safe to land in tarako
blocking-b2g: 1.3T? → backlog
Comment on attachment 8406017 [details] [diff] [review]
0001-Bug-993892-Message-Threads-and-messages-are-kept-loa.patch

It's a quick WIP for stopping the cursor while app switch to backgroung. Hi Julien and Gabriele, it would be great if you could confirm resume part or any other feedbacks because I didn't clear the list re-render again. I've tested it briefly but the render part might be introduced in many different scenario.

Hi ting, could you give it a try on tarako? My experiment result is the app could be kill immediatly once the app switch into backgroung... I will discuss problem with you later.
Attachment #8406017 - Flags: feedback?(tchou)
patch for 1.3t
Attachment #8406059 - Flags: feedback?(tchou)
Attachment #8406017 - Flags: feedback?(tchou)
Comment on attachment 8406017 [details] [diff] [review]
0001-Bug-993892-Message-Threads-and-messages-are-kept-loa.patch

Review of attachment 8406017 [details] [diff] [review]:
-----------------------------------------------------------------

some comments,

especially I'd like MessageManager to stay as much close to the API as possible, and not do UI stuff.

Also, I'm afraid of doing 1 more DOM operation for every thread/message... I don't know a good idea right now, besides storing more state...

::: apps/sms/js/message_manager.js
@@ +391,5 @@
>          each && each(this.result);
>  
> +        if (!document.hidden) {
> +          this.continue();
> +        }

I'd prefer not change message_manager.js which should stay at the API level.

Here we can implement something like getMessages' "shouldContinue" so that we can control it from ThreadListUI.renderThreads.

@@ +443,5 @@
>            shouldContinue = each(this.result);
>          }
>          // if each returns false the iteration stops
> +        // if 'shouldContinue' is undefined this is fine
> +        if (shouldContinue !== false && !document.hidden) {

same remark here, we can just return false from the each callback.

::: apps/sms/js/thread_list_ui.js
@@ +27,4 @@
>    // Set to |true| when in edit mode
>    inEditMode: false,
>  
> +  isRenderingFinished: true,

I don't like this as default value, this should be "null".

You need to reset it in "init" for unit tests too.

Maybe we can invert the meaning and use "isRenderingStarted" instead? So that it's "false" by default, and "true" when it's started? It could make namings easier.

@@ +445,5 @@
>      var firstPanelCount = 9; // counted on a Peak
>  
> +    // If the rendering is not started yet, do prepareRendering task for
> +    // reset thread container and append draft.
> +    if (this.isRenderingFinished) {

re: isRenderingFinished naming, this is especially strange to read

@@ +452,4 @@
>  
>      function onRenderThread(thread) {
>        /* jshint validthis: true */
> +      var li = document.getElementById('thread-' + thread.id);

Can't we try to use Threads here? I don't like the idea of doing a DOM operation for each thread

@@ +461,5 @@
>  
> +      if (!li) {
> +        this.appendThread(thread);
> +      } else if (+li.dataset.time < +thread.timestamp) {
> +        this.updateThread(Threads.get(thread.id));

This is not the purpose of this patch, but I don't like the fact that "Threads.get" returns a message; and that updateThread takes a message as parameter. We should eventually rethink our APIs...

@@ +490,5 @@
> +    this.isRenderingFinished = false;
> +  },
> +
> +  onVisibilityChange: function thlui_onVisibilityChange(e) {
> +    if (!document.hide && !this.isRenderingFinished) {

nit: "hidden"

::: apps/sms/js/thread_ui.js
@@ +1483,5 @@
> +      var messageDOM = document.getElementById('message-' + message.id);
> +      if (messageDOM) {
> +        // Skip message append if element is alredy existed.
> +        return true;
> +      }

same thing here, I don't like doing a DOM operation :( Maybe it doesn't have a big performance cost, I'd like that we do a profile to test this if possible...
Attachment #8406017 - Flags: feedback?(felash)
Comment on attachment 8406017 [details] [diff] [review]
0001-Bug-993892-Message-Threads-and-messages-are-kept-loa.patch

Everything that Julien said, especially keeping out UI-related code from the application logic. In addition I'm unsure about the impact of keeping the cursor open for such a long time. Have you tried scenarios were we start rendering the threads, then stop because the application is invisible and then the app receives a message that creates a new thread? What happens then?
Attachment #8406017 - Flags: feedback?(gsvelto)
Status: NEW → ASSIGNED
Thanks for the feedbacks, this quick patch is mainly for testing the possibility that avoid message app OOM while loading threads in the backgroung, but it didn't perform as we expected :( (Maybe bug 996516 will help this problem)

About keeping the cursor, vicamo thought it shouldn't be the bottleneck for memory usage because it just keep the refernce and the memory usage is trivial comparing to UI dom element access(but we didn't verify it yet)

> Have you tried scenarios were we start rendering the threads, then stop because the application is
> invisible and then the app receives a message that creates a new thread? What happens then?
In this patch, it will append the new thread to the unfinished list. When app move back to foreground again, db cursor restart but the unfinished list will be kept on screen and only the unrendered will be created and inseted into the related position in the list.

For the long term solution, I think bug 972731 could be more generic because it could solve not only the unnecessary background loading issue, but also in foreground. Since this issue is moved to backlog, maybe we should focus on 972731 and close this one?
(In reply to Steve Chung [:steveck] from comment #21)
> Thanks for the feedbacks, this quick patch is mainly for testing the
> possibility that avoid message app OOM while loading threads in the
> backgroung, but it didn't perform as we expected :( (Maybe bug 996516 will
> help this problem)
> 
> About keeping the cursor, vicamo thought it shouldn't be the bottleneck for
> memory usage because it just keep the refernce and the memory usage is
> trivial comparing to UI dom element access(but we didn't verify it yet)
> 
> > Have you tried scenarios were we start rendering the threads, then stop because the application is
> > invisible and then the app receives a message that creates a new thread? What happens then?
> In this patch, it will append the new thread to the unfinished list. When
> app move back to foreground again, db cursor restart but the unfinished list
> will be kept on screen and only the unrendered will be created and inseted
> into the related position in the list.
> 
> For the long term solution, I think bug 972731 could be more generic because
> it could solve not only the unnecessary background loading issue, but also
> in foreground. Since this issue is moved to backlog, maybe we should focus
> on 972731 and close this one?

IMO the long term solution is to display less DOM elements. See bug 865743 comment 7.

First step is to keep all informations in a local array, second step is to have a real db if necessary.

I'm still not sure about bug 972731 to be honest; especially I'm not sure at all that it improves the memory pattern...
Attachment #8406059 - Flags: feedback?(tchou)
I think that bug 972731 holds some promise on the performance front and we should focus on it for the time being, we could even take into account the issue here as part of that bug. On the other hand it will not reduce memory consumption in the least, we would need to add only a subset of the elements to the DOM for that; like the visible messages/threads +/- a few so that we can scroll more in as needed. However doing this is even more work.
Another idea: couldn't we always stop adding threads when we have some already displayed (about 10 or 20 threads)? Like we do for a thread already (even if everything's not perfect, see bug 903763)?
blocking-b2g: backlog → ---
Priority: -- → P2
Assignee: schung → nobody
Status: ASSIGNED → NEW
Mass closing of Gaia::SMS bugs. End of an era :(
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → WONTFIX
Mass closing of Gaia::SMS bugs. End of an era :(
You need to log in before you can comment on or make changes to this bug.