Closed Bug 867264 Opened 12 years ago Closed 11 years ago

Queuing up too many images to be decoded when visiting Pinterest can cause a OOM

Categories

(Core :: Graphics: ImageLib, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

()

RESOLVED WORKSFORME
blocking-b2g -

People

(Reporter: jsmith, Assigned: bugs)

Details

(Keywords: memory-footprint, perf, Whiteboard: [MemShrink:P1] [c=memory p= s=2013.12.20 u=])

Device: Unagi Build: 4/30/2013 STR 1. Go to m.pinterest.com in the FF OS browser on a weak network 2. Scroll down the content at a normal user's speed (like you are reading the content) Expected We don't OOM. Actual In some cases, we can OOM since as we bring new images into view, more and more images are getting queued up to be decoded. As a result, with too many images being decoded at once on Pinterest, we can OOM. See https://bugzilla.mozilla.org/show_bug.cgi?id=860818#c46 for Justin's analysis on this well.
Whiteboard: [MemShrink]
> As a result, with too many images being decoded at once on Pinterest, we can OOM. Maybe. I tried to emphasize in bug 860818 comment 48 that this is only one of several possible explanations for this particular OOM. Can we get a triage decision on this now, so we have the maximum amount of time to fix this bug, if that's necessary?
blocking-b2g: --- → tef?
Jet can you find a victim here?
Assignee: nobody → bugs
Whiteboard: [MemShrink] → [MemShrink:P1]
Longer-term, we want the fix for bug 689623 ported to B2G. That's too much code to backport now, so we need a safer way to catch the low-memory pressure event and have imagelib stop decoding, discard memory, and fail gracefully on B2G. +cc: Milan, Joe, Seth for more info.
> so we need a safer way to catch the low-memory pressure event and have imagelib stop > decoding, discard memory, and fail gracefully on B2G. Well, we first need to establish whether the memory-pressure event is getting delivered quickly enough. If it's not, then no amount of monitoring will save us. We've gotten into trouble in the past for assuming that memory-pressure events are received quickly; we shouldn't make that mistake again.
(In reply to Jet Villegas (:jet) from comment #3) > Longer-term, we want the fix for bug 689623 ported to B2G. That's too much > code to backport now, so we need a safer way to catch the low-memory > pressure event and have imagelib stop decoding, discard memory, and fail > gracefully on B2G. > > +cc: Milan, Joe, Seth for more info. None of this is in the context of tef? and 1.0.1. After bug 689623 and with some additional patches, we'd be in a situation where we are not requesting decoding until the images are visible. I don't know of any work on the "cancel the decoding request if not needed anymore", which would also include some kind of tuning to make sure we only do this if there is a chance of running out of memory. Either way, before we proceed, we really should decide if it's tef or not, because the paths will be drastically different.
> I don't know of any work on the "cancel the decoding request if not needed anymore", We can't do this, because we don't know which are needed. It's "cancel the decoding request if we're running out of memory, or something like that. Maybe we can preferentially cancel requests that we think probably aren't needed, but that's probably the best we can do. > which would also include some kind of tuning to make sure we only do this if there is a > chance of running out of memory. It's the other way around; we don't know whether a request is needed anymore, but upon OOM, we can try canceling existing requests.
Right, in the context of this bug, that's what we would do. I was thinking ahead and of the extension to "only request what's needed". Depending on how expensive it is to check if items are needed, it may be worth doing that check just before the decoding starts. Anyway, different bug.
> I was thinking ahead and of the extension to "only request what's needed". I'm still not sure I'm being clear, so just to be sure: That is not possible on Gecko 18. It's not a question of how expensive it is; we just can't do it without bug 689623.
FWIW, I've been testing Pinterest, and it doesn't crash most of the time for me. The one time I got it to crash, the browser process did not receive a low-memory event before it died. Other processes did. That puts us into the first case from comment 4, where low-memory notifications can't help. Perhaps we can go back to the idea of checking the amount of free memory as we decode images.
(In reply to Justin Lebar [:jlebar] from comment #8) > > I was thinking ahead and of the extension to "only request what's needed". > > I'm still not sure I'm being clear, so just to be sure: That is not possible > on Gecko 18. It's not a question of how expensive it is; we just can't do > it without bug 689623. Yes, we are in violent agreement. Bug 689623 is the "only request what's needed", so the extension to it would only be Gecko 23+.
Based on the comments I'd suggset not to block on this one.
Whiteboard: [MemShrink:P1] → [MemShrink:P1][tef-triage]
blocking-b2g: tef? → -
Whiteboard: [MemShrink:P1][tef-triage] → [MemShrink:P1] c=
I can't receive phone calls when I just have pinterest open. STR: Make pinterest bookmark from e.me Open pinterest. (I don't even have to scroll down) Wait 20 sec and call the phone. Nothing happens. I don't even get the missed call notification. My phone is doing strange stuff lately so it would be great if someone from QA could confirm this.
Keywords: qawanted
Can't reproduce on b2g18 for what was cited in comment 12. I tried hammering many calls to the device being tested at different points while Pinterest loads and was able to get the call UI to come up to allow me to answer the call.
Keywords: qawanted
fwiw I created a perf profile from the time I start the call on the other phone until the call gets disconnected. We do a ton of rendering in the parent process but I also see a lot of idle time. I don't know what's going on but I can reproduce this 100%. When I close pinterest I can receive calls and when I open it I can't receive calls :( http://people.mozilla.com/~bgirard/cleopatra/#report=54b9e3719558db724ec86ba301df944db45d1624
Ok it works for me if I am on wifi but not if I use my data connection. jsmith just said he used data as well. Strange.
> I can't receive phone calls when I just have pinterest open. Can we track this in a different bug? I don't even see the communications app in this profile. Maybe it's getting OOM'ed early in startup or something (I thought we'd fixed that). In this new bug, maybe we could check a dmesg trace? If you can reproduce 100%, we should definitely try to figure it out.
FYI - this appears to also be reproducible on Google+'s mobile site.
Has anyone tried to reproduce this on trunk Gecko? I don't think we're going to have the resources to backport anything like bug 689623 to b2g-18.
Bumping this bug up for some verification on current B2G.
Flags: needinfo?(jsmith)
(In reply to Jet Villegas (:jet) from comment #19) > Bumping this bug up for some verification on current B2G. Sarah - Can you retest this bug?
Flags: needinfo?(jsmith) → needinfo?(sparsons)
I am not able to repro this issue on the Buri 1.3 Build ID: 20131210004003 Gaia 3452fbdb5e1bed0cd27cc6173136537a03e8072f SourceStamp e0c328d99742 BuildID 20131210004003 Version 28.0a2 I scrolled through m.pinterest.com and the Pinterest App for 20 minutes and no OOM occured.
Flags: needinfo?(sparsons)
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → WORKSFORME
Keywords: footprint, perf
Whiteboard: [MemShrink:P1] c= → [MemShrink:P1] [c=memory p= s=2013.12.20 u=]
You need to log in before you can comment on or make changes to this bug.