Open Bug 1002430 Opened 10 years ago Updated 2 years ago

Discard graphics resources on memory-pressure only

Categories

(Core :: Graphics: Layers, defect)

defect

Tracking

()

tracking-b2g backlog

People

(Reporter: vingtetun, Unassigned)

References

Details

Attachments

(3 files)

The current codebase optimize memory on the device by clearing cached layers and decoded data of images when an application is going offscreen (when the user hit the home button, or when a new app covers it).

This is nice, and especially on low end device with a very small amount of memory, but for many parts it degrades the UX, and more especially on device with a bit more RAM where it's cool to use it.

For example the layer discarding strategy and decoded image data discarding strategy, actually removes everything when the app receive a setVisible(false).

As a result it forces us to do a few hacks in Gaia in order to try to make a better UX. When an attention screen is opened, we wait five seconds in order to change the visibility of the underlying window (which also change its priority!), or when the home button is clicked, we asked for a screenshot of the homescreen in order to revive it's rendering before switching to it (screenshots are slow and usually takes hundreds of ms and bust the ipc bus).

It would be much better / easier if the resources of an app can be discarded only when it really needs to be discarded.

Asking feedback? to Fabrice, not really for the code but really for the approach as I'm curious to see if it will scale from Tarako to high-end devices.
This is also really not clear to me why we send a low-memory memory-pressure when the app is minimized instead of a heap-minimize ?
Attachment #8413696 - Flags: feedback?(fabrice)
(In reply to Vivien Nicolas (:vingtetun) (:21) (NOT reading bugmails, needinfo? please) from comment #2)
> This is also really not clear to me why we send a low-memory memory-pressure
> when the app is minimized instead of a heap-minimize ?

That was a side effect of a change I made; after seeing the results of that I think it's better to go back to heap-minimize instead. A low-memory event is dropping the WebGL context too and that's quite problematic (though it made a few hidden WebGL issues in Gaia apparent). We could even devise a new subclass of the memory-pressure event that drops only what we want it to drop, it's not really complicated as there's just a handful of observers in the gecko codebase.
(In reply to Gabriele Svelto [:gsvelto] from comment #3)
> (In reply to Vivien Nicolas (:vingtetun) (:21) (NOT reading bugmails,
> needinfo? please) from comment #2)
> > This is also really not clear to me why we send a low-memory memory-pressure
> > when the app is minimized instead of a heap-minimize ?
> 
> That was a side effect of a change I made; after seeing the results of that
> I think it's better to go back to heap-minimize instead. A low-memory event
> is dropping the WebGL context too and that's quite problematic (though it
> made a few hidden WebGL issues in Gaia apparent). We could even devise a new
> subclass of the memory-pressure event that drops only what we want it to
> drop, it's not really complicated as there's just a handful of observers in
> the gecko codebase.

More granularity would be helpful for some UX things.

One thing that could have been nice as well is an notification before the phone fire the low-memory message. That would have let us drop some of the resources like layers and image data, before the phone goes onto low-memory.

I have been told that it is hard to have though ?
Question: is gaia able to know the iframe keep or drop the rendering state? I wanna to know if we could skip the screenshot call if we still have the rendering in hidden state.
(In reply to Vivien Nicolas (:vingtetun) (:21) (NOT reading bugmails, needinfo? please) from comment #4)
> More granularity would be helpful for some UX things.
> 
> One thing that could have been nice as well is an notification before the
> phone fire the low-memory message. That would have let us drop some of the
> resources like layers and image data, before the phone goes onto low-memory.

What kind of scenario do you have in mind? Currently we have only two scenarios that trigger memory pressure events: sending an application to the background or when we hit the low-memory threshold. In the first case we could behave differently, possibly by sending a different string to the memory-pressure event so that observers would behave differently. The idea for that is that background applications shouldn't hold to unnecessary data but we might want to retain certain things. We could also delay this memory pressure event and send another event first (or let the application use their visibility status to decide what to do so that we don't make events proliferate too much). In fact delaying the minimization of a background app could also help with app startup. We might also want to distinguish between an application really going into the background and one being just hidden because we turned off the screen. In the latter case minimization doesn't really make sense as the application will go into the foreground again when the user turns on the phone.

In the second scenario we have no choice: when we hit a low-memory situation we inform the applications ASAP that we're low on memory and they should drop as much as they can. In this scenario if an application doesn't free enough memory fast enough it will be reaped by the LMK so introducing another event before memory-pressure wouldn't help at all1.
(In reply to Alive Kuo [:alive][NEEDINFO!][OOO 4/30 - 5/4] from comment #5)
> Question: is gaia able to know the iframe keep or drop the rendering state?
> I wanna to know if we could skip the screenshot call if we still have the
> rendering in hidden state.

My understanding of the memory-pressure event (already forwarded to Gaia) is that it is sent to all apps when needed. So Gaia could rely on it, assuming that all background apps resources are released on such an event.

Gabriele, does my understanding of the memory-pressure message is correct ?
(In reply to Gabriele Svelto [:gsvelto] from comment #6)
> (In reply to Vivien Nicolas (:vingtetun) (:21) (NOT reading bugmails,
> needinfo? please) from comment #4)
> > More granularity would be helpful for some UX things.
> > 
> > One thing that could have been nice as well is an notification before the
> > phone fire the low-memory message. That would have let us drop some of the
> > resources like layers and image data, before the phone goes onto low-memory.
> 
> What kind of scenario do you have in mind? Currently we have only two
> scenarios that trigger memory pressure events: sending an application to the
> background or when we hit the low-memory threshold. In the first case we
> could behave differently, possibly by sending a different string to the
> memory-pressure event so that observers would behave differently. The idea
> for that is that background applications shouldn't hold to unnecessary data
> but we might want to retain certain things. We could also delay this memory
> pressure event and send another event first (or let the application use
> their visibility status to decide what to do so that we don't make events
> proliferate too much). In fact delaying the minimization of a background app
> could also help with app startup. We might also want to distinguish between
> an application really going into the background and one being just hidden
> because we turned off the screen. In the latter case minimization doesn't
> really make sense as the application will go into the foreground again when
> the user turns on the phone.
> 
> In the second scenario we have no choice: when we hit a low-memory situation
> we inform the applications ASAP that we're low on memory and they should
> drop as much as they can. In this scenario if an application doesn't free
> enough memory fast enough it will be reaped by the LMK so introducing
> another event before memory-pressure wouldn't help at all1.

I was thinking that a kind of 'will-be-soon-in-low-memory' which is basically a higher threshold than low-memory would be nice.

With such an event we could have throw away the resources of some apps based on how long has happen since the app has been used. So the most recent apps will have kept their resources, while apps that have not been used since a long time will just discard everything they can.

This could also be used by Gaia to know if we need to keep the app screenshot in the card view for old apps.

My ideas are not 100% clear on this topic, but a few more threshold would open the door to as smarter resources discarding strategy.
(In reply to Vivien Nicolas (:vingtetun) (:21) (NOT reading bugmails, needinfo? please) from comment #7)
> My understanding of the memory-pressure event (already forwarded to Gaia) is
> that it is sent to all apps when needed. So Gaia could rely on it, assuming
> that all background apps resources are released on such an event.
> 
> Gabriele, does my understanding of the memory-pressure message is correct ?

Currently the event is visible only from chrome code, we made a special exception for the system app because we explicitly needed it here. This event is used in various parts of Gecko to purge caches (images, sqlite, etc...), run the garbage and cycle collectors and also do low-level activities such as returning dirty unused pages to the kernel or releasing graphics buffers.

I'm not sure if directly exposing it to Web content would be a good idea; on the other hand it might be possible to use this functionality indirectly. For example, how does WeakMaps interact with memory pressure?

(In reply to Vivien Nicolas (:vingtetun) (:21) (NOT reading bugmails, needinfo? please) from comment #8)
> I was thinking that a kind of 'will-be-soon-in-low-memory' which is
> basically a higher threshold than low-memory would be nice.
>
> [...]
>
> My ideas are not 100% clear on this topic, but a few more threshold would
> open the door to as smarter resources discarding strategy.

Technically it would be possible, AFAIK we've already experimented on the Tarako with a daemon that monitors the available memory and acts accordingly.

That being said I'm not sure that having explicit levels of memory pressure is a good idea, let me argue why. First of all it would be hard to define them: we could be running out of memory because we have a single app open, then it's important to try and force that app to release memory to keep it alive, but we could also be running out of memory because we have too many apps open in which case it would preferable to just kill a long unused app instead. So the same threshold could apply to different situations making it complicated to choose what to do when it's hit. The second issue is that exposing this explicitly to gaia means that the apps also have to explicitly support those thresholds and react to them. I don't think that's a strategy that would scale well.

On the other hand having some "weak" resources that can disappear when you're low on memory and might need to be re-created seems more natural to me. AFAIK there's no notion of weak references in JavaScript besides WeakMap but some things already behave this way (e.g. a WebGL context can be unloaded and you might need to re-create it).
(In reply to Gabriele Svelto [:gsvelto] from comment #9)
> (In reply to Vivien Nicolas (:vingtetun) (:21) (NOT reading bugmails,
> needinfo? please) from comment #7)
> > My understanding of the memory-pressure event (already forwarded to Gaia) is
> > that it is sent to all apps when needed. So Gaia could rely on it, assuming
> > that all background apps resources are released on such an event.
> > 
> > Gabriele, does my understanding of the memory-pressure message is correct ?
> 
> Currently the event is visible only from chrome code, we made a special
> exception for the system app because we explicitly needed it here. This
> event is used in various parts of Gecko to purge caches (images, sqlite,
> etc...), run the garbage and cycle collectors and also do low-level
> activities such as returning dirty unused pages to the kernel or releasing
> graphics buffers.
> 
> I'm not sure if directly exposing it to Web content would be a good idea; on
> the other hand it might be possible to use this functionality indirectly.
> For example, how does WeakMaps interact with memory pressure?

I don't want to expose that to Web Content. I was only speaking of the system app. Sorry for any confusion here.

> 
> (In reply to Vivien Nicolas (:vingtetun) (:21) (NOT reading bugmails,
> needinfo? please) from comment #8)
> > I was thinking that a kind of 'will-be-soon-in-low-memory' which is
> > basically a higher threshold than low-memory would be nice.
> >
> > [...]
> >
> > My ideas are not 100% clear on this topic, but a few more threshold would
> > open the door to as smarter resources discarding strategy.
> 
> Technically it would be possible, AFAIK we've already experimented on the
> Tarako with a daemon that monitors the available memory and acts accordingly.
> 
> That being said I'm not sure that having explicit levels of memory pressure
> is a good idea, let me argue why. First of all it would be hard to define
> them: we could be running out of memory because we have a single app open,
> then it's important to try and force that app to release memory to keep it
> alive, but we could also be running out of memory because we have too many
> apps open in which case it would preferable to just kill a long unused app
> instead. So the same threshold could apply to different situations making it
> complicated to choose what to do when it's hit.

I think that's part of my point. I feel like we could be smarter about memory management for some cases.

One of thing I would like to experiment with for example when Haida will come alive is to be able to remove from the DOM one top level iframe mozbrowser of an app from the system app, without having to kill the whole app.

For example some old inline activities, or old panels may just be removed, instead of killing the whole app. 

> The second issue is that
> exposing this explicitly to gaia means that the apps also have to explicitly
> support those thresholds and react to them. I don't think that's a strategy
> that would scale well.
> 

I never wanted to expose that to apps other than the system app.

> On the other hand having some "weak" resources that can disappear when
> you're low on memory and might need to be re-created seems more natural to
> me. AFAIK there's no notion of weak references in JavaScript besides WeakMap
> but some things already behave this way (e.g. a WebGL context can be
> unloaded and you might need to re-create it).
(In reply to Vivien Nicolas (:vingtetun) (:21) (NOT reading bugmails, needinfo? please) from comment 
> I think that's part of my point. I feel like we could be smarter about
> memory management for some cases.
> 
> One of thing I would like to experiment with for example when Haida will
> come alive is to be able to remove from the DOM one top level iframe
> mozbrowser of an app from the system app, without having to kill the whole
> app.
> 
> For example some old inline activities, or old panels may just be removed,
> instead of killing the whole app.

I see, sorry for misunderstanding what you were trying to do. So your idea would be to throw away "old" iframes we know are not used as part of a response to a certain level of memory pressure, correct? I've thought a bit about this and I think there's a way to leverage the existing code to use multiple thresholds, it would go roughly like this:

The kernel gives us only one level of memory pressure but we can adjust it at runtime so there's nothing preventing us from setting it at a high bar, I'll call that "light memory pressure". Once we hit that level we send an appropriate event to which the system app can respond by dropping iframes. In the memory pressure detector we always monitor the memory level so if after a few seconds we're still hitting it we can adjust it at a lower bar and wait for that (let's call that dire memory pressure). Vice-versa if we don't hit that level for a certain amount of time we can bring it back to the light memory pressure threshold and start over.

This would not be free of issues, we would need a form of hysteresis to prevent the system from oscillating uselessly between the two (or more) thresholds but it's definitely be worth an experiment.

BTW naturally this should work well on devices with a reasonable amount of memory; trying something like this on the Tarako would be asking for trouble as we're practically always in a low-memory situation there :-)
That's the POC I'm using on the Gaia side to make the situation better.
Spoke a bit with roc (CC'ed him) during a b2g meeting in MV. Seems like one thing that can go wrong is gpu memory consumption. In order to make sure the layer tree is not retained too aggressively we may need something:
 1. a gpu-memory-pressure event. My current understanding is that it depends a lot on drivers and may be hard to ensure it works across all devices.
 2. Rely on allocation failure for layers. When such an allocation failure happens, a message may be broadcast in order to release some of the unused layer tree.
 3. A tool to know the amount of images allocated for a specific <iframe mozbrowser>. Then the gpu-memory management will be more based on a the amount of memory used for a specific iframe, and the time spent since this iframe has been used.
Tim, this is the bug I mentioned in order to make the iframe.setVisible calls to not discard the layer tree / rendering instantly, and get rid of the attention screen hack, as well as the hack iframe.getScreenshot hack when the user hit the home button.
Vivian, FYI, Fabrice already handled the screenshot calls in bug 1014899
Flags: needinfo?(fabrice)
(In reply to Naoki Hirata :nhirata (please use needinfo instead of cc) from comment #15)
> Vivian, FYI, Fabrice already handled the screenshot calls in bug 1014899

Sadly, those are not useless and affects the UX. They are here in order to force the underlying app to repaint itself and minimize the amount of time the user see an empty screen when switching back and forth between apps and the homescreen.

The underlying issue is that app rendering is thrown away while an application goes to background, and it needs to be reconstruct (takes a few hundred ms) when it goes foreground. So removing it for the Tarako only is OK but it affects the UX. I would like a solution that keeps a good UX.
Flags: needinfo?(21)
Well, as Vivien said, bug 1014899 is a workaround only toward tarako, and the idea is given by me. For normal device we definitly needs the screenshot before this issue is fixed.
Flags: needinfo?(fabrice)
Ah ok.  Thanks for the information, Alive and Vivien.
I would really like this stuff for 2.1.
blocking-b2g: --- → 2.1?
More like feature-b2g: 2.1 than blocking I think....
blocking-b2g: 2.1? → ---
feature-b2g: --- → 2.1
Component: General → Graphics: Layers
Product: Firefox OS → Core
Vivien - thanks for letting me know about this bug.  For all involved - it is bad form to commit to a release on a feature for another team without anybody from that team being included on the bug, not to mention part of the discussion that leads to it.
I will take this under discussion in the graphics planning, but at this point we're pretty much full for 2.1, so I wouldn't count on it.  I don't know a way of making this feature-b2g: 2.1?, so I'm removing that flag for now.
feature-b2g: 2.1 → ---
(In reply to Milan Sreckovic [:milan] from comment #21)
> Vivien - thanks for letting me know about this bug.  For all involved - it
> is bad form to commit to a release on a feature for another team without
> anybody from that team being included on the bug, not to mention part of the
> discussion that leads to it.
> I will take this under discussion in the graphics planning, but at this
> point we're pretty much full for 2.1, so I wouldn't count on it.  I don't
> know a way of making this feature-b2g: 2.1?, so I'm removing that flag for
> now.

Asking 2.1 again - because this issue cause us to make many workarounds in Gaia since a long time - and even with the workarounds, it's not perfect.
blocking-b2g: --- → 2.1?
Blocks: b2g-nexuss
Attachment #8413693 - Flags: feedback?(fabrice)
Attachment #8413696 - Flags: feedback?(fabrice)
It sounds like we could use a longer design conversation here, and I'm not sure bugzilla is the best place to figure it out (thought we'd want to record the result there.)  There are some good ideas in here, and we should sort them out.
See Also: → 1039883
There is still nothing actionable here.  We have competing requests - free up as much memory vs. this one - keep it around.  We need strategic, product level direction for "behave differently on different devices" that would include memory management, perhaps behavior, look, feel, etc.  Filed bug 1049673 to track the meta aspect of this.
blocking-b2g: 2.1? → backlog
blocking-b2g: backlog → ---
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.