Closed Bug 1128042 Opened 9 years ago Closed 8 years ago

[Flame][Browser]Pinch to zoom in screen several times, device will auto exit browser.

Categories

(Core :: Graphics: Layers, defect)

ARM
Gonk (Firefox OS)
defect
Not set
major

Tracking

()

VERIFIED FIXED
mozilla39
blocking-b2g 2.2+
Tracking Status
firefox37 --- wontfix
firefox38 --- wontfix
firefox39 --- fixed
b2g-v2.1 --- fixed
b2g-v2.1S --- fixed
b2g-v2.2 --- verified
b2g-v2.5 --- verified
b2g-master --- verified

People

(Reporter: BenWa, Assigned: kats)

References

Details

Attachments

(2 files)

+++ This bug was initially created as a clone of Bug #1121871 +++

Continuing on bug 1121871. We're still seeing OOM crashes with sina.com.cm when zooming in and panning. It's doesn't seem to be the same cause the problem in bug 1121871.
Actually I can reproduce the crash just by letting the page load without panning. I can also reproduce it without tiling. I think this page is just too heavy.
[Blocking Requested - why for this release]: a blocker follow up, nominate it
blocking-b2g: --- → 2.1?
The bug still exist on latest Flame 3.0 build.

Steps:
1. Launch Browser and go to "www.sina.com".
2. Pinch to zoom in screen several times, and pan around in any direction.

Expected Result: 
2. User could zoom in webpage normally and it shouldn't auto exit Browser.

Actual Result: 
2. Device will auto exit browser and go back to Home screen.

Fail rate:1/5

Flame 3.0 version:
Build ID               20150215010209
Gaia Revision          f0b93e0668ef9565bd6f050b15b4f794d59feb65
Gaia Date              2015-02-13 13:13:27
Gecko Revision         https://hg.mozilla.org/mozilla-central/rev/e0cb32a0b1aa
Gecko Version          38.0a1
Device Name            flame
Firmware(Release)      4.4.2
Firmware(Incremental)  eng.cltbld.20150215.043133
Firmware Date          Sun Feb 15 04:31:43 EST 2015
Bootloader             L1TC000118D0
QA Whiteboard: [MGSEI-Triage+]
blocking-b2g: 2.1? → 2.2?
Please re-test on latest master/2.2 builds. Since comment 5 we have landed a few changes to the tiling code which may have fixed this. If it still reproduces please confirm that you are testing on sina.com and not sina.com.cn - there was some confusion in previous bugs about which website was being used. Thanks!
Keywords: qawanted
QA Contact: ychung
This issue is still easily reproducible on Flame Master and 2.2. After zooming in all the way and panning the screen around will cause the browser app to crash.
 
Device: Flame Master (KK, 319mb, full flash)
Build ID: 20150312010235
Gaia: 0c4e8b0b330757e261b031b7e7f326ef419c9808
Gecko: 5334d2bead3e
Gonk: ebad7da532429a6f5efadc00bf6ad8a41288a429
Version: 39.0a1 (3.0)
Firmware Version: v18D-1
User Agent: Mozilla/5.0 (Mobile; rv:39.0) Gecko/39.0 Firefox/39.0

Device: Flame 2.2 (KK, 319mb, full flash)
Build ID: 20150312002501
Gaia: 572d60e0a440ee4af50bc6b6adad8876eadbdb4d
Gecko: 244e6ba3c20e
Gonk: ebad7da532429a6f5efadc00bf6ad8a41288a429
Version: 37.0 (2.2)
Firmware Version: v18D-1
User Agent: Mozilla/5.0 (Mobile; rv:37.0) Gecko/37.0 Firefox/37.0
QA Whiteboard: [MGSEI-Triage+] → [QAnalyst-Triage?][MGSEI-Triage+]
Flags: needinfo?(ktucker)
Keywords: qawanted
QA Contact: ychung
QA Whiteboard: [QAnalyst-Triage?][MGSEI-Triage+] → [QAnalyst-Triage+][MGSEI-Triage+]
Flags: needinfo?(ktucker)
Please confirm you were using sina.com and not sina.com.cn as the website under test.
Flags: needinfo?(ychung)
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #8)
> Please confirm you were using sina.com and not sina.com.cn as the website
> under test.

Yes, I tested on sina.com, NOT sina.com.cn as instructed in Comment 6.
Flags: needinfo?(ychung)
Component: Gaia::Browser → Panning and Zooming
Product: Firefox OS → Core
check to see if it's same as the cnn issue
Flags: needinfo?(npark)
Kats, is this the same issue as bug 1129126?
Flags: needinfo?(npark) → needinfo?(bugmail.mozilla)
See Also: → 1129126
It's hard to say. A process getting OOM-killed can happen for any number of reasons, it depends on what's using up the memory. It's best to keep the bugs separate for now, until we can track down the code responsible.
Flags: needinfo?(bugmail.mozilla)
Blocking - 20% crash rate
blocking-b2g: 2.2? → 2.2+
Hi Peter, can you also help to take a look? thanks.
Flags: needinfo?(pchang)
Some initial investigation seems to indicate that we are creating and keeping a lot of tiles. Even though the texture pool has a max size of 50 tiles we end up having up to 98 outstanding tiles in the pool. We hit the branch at [1] quite a lot on this page. It will need some more investigation to figure out why this is happening on this page specifically. I can look more into it since nical is swamped with other stuff and BenWa is away.

[1] http://mxr.mozilla.org/mozilla-central/source/gfx/layers/client/TextureClientPool.cpp?rev=a4249d571bbb#175
Assignee: nobody → bugmail.mozilla
Component: Panning and Zooming → Graphics: Layers
So as far as I can tell we're not actually leaking anything. It's just that the transient memory usage gets really high and on a 319MB device it gets high enough to trigger OOMing. There might be more than one cause for this, but the one I just spent a day tracking down is... rounding error!

In the example I logged, layout computes a critical displayport with w=1152.000000, h=2048.000000 which is exactly 4x8=32 tiles. However by the time the ClientTiledPaintedLayer gets a hold of this, it has turned into 1152x2049 which then requires 5x9=45 tiles. So rounding error is causing a 41% increase in the number of tiles (and therefore memory) on this layer, which is quite a lot.

Tracking down the rounding error a bit more:
At [1], screenRect.width is 1152 and res is 3.53899. result.width ends up as 13021 app units (note that there is a 1.5 CSS to LayoutDevice scale), and that's what gets returned from the function, back out to [2].
At [2], the app units value is converted to CSS pixels, and gets stored as w=217.016663.
At [3], this value is fetched and multiplied by the zoom (5.31) and rounded out, resulting in w=1153.

[1] http://mxr.mozilla.org/mozilla-central/source/layout/base/nsLayoutUtils.cpp?rev=d226d104ee7d#1002
[2] http://mxr.mozilla.org/mozilla-central/source/layout/base/nsDisplayList.cpp?rev=d226d104ee7d#748
[3] http://mxr.mozilla.org/mozilla-central/source/gfx/layers/client/ClientTiledPaintedLayer.cpp?rev=d226d104ee7d#174
I think the other factor that makes it so apparent on this sina.com website is that there's 3 tiled layers that all end up with this giant screen-sized displayport which amplifies the bug. I'll do a bit more digging on that front as well.
Attachment #8580772 - Flags: review?(botond) → review+
https://hg.mozilla.org/mozilla-central/rev/c1b4043ebe54
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla39
qawanted to see if the patch helped much. I don't expect the OOM-kills to have gone away entirely, but it should at least be harder to hit. If it's still unacceptably easy to hit then I can investigate further.
Keywords: qawanted
Flags: needinfo?(npark)
Attached file zoom.log
I tried out the latest patch, and while the OOM exit does still happen, it appears that the brower holds out longer than before.  Before it used to OOM exit on 1st or 2nd serious attempt of zooming / orientation switch, but now it takes about 3~4 attempts to cause OOM.  Attached is the logcat taken when accessing sina.com and opening a news to do pinch zoom.
Flags: needinfo?(npark)
Ok, thanks. I'll clone this bug yet again to investigate more.
Filed bug 1146387 to continue investigation.
The browser did not crash as much while zooming in and out, however it did crash twice within 30 minutes as testing.  I did crash more while loading sites (5 times) to test this bug than just zooming in and out after it was loaded so it might be related to heavy websites like sina.com and a google.com search results page than the zooming in and out.

Environmental Variables:
Device: Flame 3.0
BuildID: 20150323010204
Gaia: 9b6f3024e4d0e62dd057231f4b14abe1782932ab
Gecko: e730012260a4
Gonk: b83fc73de7b64594cd74b33e498bf08332b5d87b
Version: 39.0a1 (3.0) 
Firmware Version: v18D-1
User Agent: Mozilla/5.0 (Mobile; rv:39.0) Gecko/39.0 Firefox/39.0
QA Whiteboard: [QAnalyst-Triage+][MGSEI-Triage+] → [QAnalyst-Triage?][MGSEI-Triage+]
Flags: needinfo?(ktucker)
Keywords: qawanted
QA Whiteboard: [QAnalyst-Triage?][MGSEI-Triage+] → [QAnalyst-Triage+][MGSEI-Triage+]
Flags: needinfo?(ktucker)
Please request b2g37 approval on this when you get a chance.
Flags: needinfo?(bugmail.mozilla)
Flags: needinfo?(pchang)
Comment on attachment 8580772 [details] [diff] [review]
Don't round out

NOTE: Please see https://wiki.mozilla.org/Release_Management/B2G_Landing to better understand the B2G approval process and landings.

[Approval Request Comment]
Bug caused by (feature/regressing bug #): tiling code
User impact if declined: sometimes when painting stuff we use way more memory than we should because we allocate tile surfaces that are hundreds of pixels wide but only contain one pixel of actual content.
Testing completed: on m-c, no regressions so far
Risk to taking this patch (and alternatives if risky): fairly low risk. the only risk here is that now we might err on the other side and end up with a row of blank pixels when we didn't before, but even so it should be at the edge of the critical displayport (which is generally outside the visible area) so nobody would notice anyway.
String or UUID changes made by this patch: none
Flags: needinfo?(bugmail.mozilla)
Attachment #8580772 - Flags: approval-mozilla-b2g37?
Comment on attachment 8580772 [details] [diff] [review]
Don't round out

Considering the low risk and high win potential this is probably good for 2.1 as well. On 2.1 this needs to be applied to gfx/layers/client/ClientTiledThebesLayer.cpp (the file was renamed since then).
Attachment #8580772 - Flags: approval-mozilla-b2g34?
Attachment #8580772 - Flags: approval-mozilla-b2g37? → approval-mozilla-b2g37+
Attachment #8580772 - Flags: approval-mozilla-b2g34? → approval-mozilla-b2g34+
This issue is verified fixed in Flame 512MB memory. Pinch-zoomed in and out ~20 times and didn't see browser crash. The website reacts to these actions much quicker than 319MB as well.

The issue IS still occurring in Flame 319MB memory though, it happened about 1 out of ~7 zoom-in or out attempt. In 319MB, just loading the website (without doing any panning or zooming) didn't seem to crash at all for me - loaded the website (www.sina.com) 10 times and crashed 0 times.

On Flame 2.2 319MB, the issue happens MUCH LESS frequently. In fact I only saw the crash once within the 10 minutes I tried. 2.2 definitely handles low memory situation much better.

Verified fixed on:
Device: Flame 2.6
BuildID: 20151204030224
Gaia: e9419046f360dd05b2717c4994990608519b93e4
Gecko: e02b17a2b5b8df7bb84f325fc08eedd2f3cab755
Gonk: 205ac4204bbbb2098a8046444acba551ba5dc75a
Version: 45.0a1 (2.6) 
Firmware Version: v18Dv4
User Agent: Mozilla/5.0 (Mobile; rv:45.0) Gecko/45.0 Firefox/45.0

Device: Flame 2.5
BuildID: 20151203091655
Gaia: 2d54c29f429bed790b5d8284633812dc2b782518
Gecko: 241f079cd53c932561c6aa32b9b93c44cd0846d0
Gonk: 205ac4204bbbb2098a8046444acba551ba5dc75a
Version: 44.0a2 (2.5) 
Firmware Version: v18Dv4
User Agent: Mozilla/5.0 (Mobile; rv:44.0) Gecko/44.0 Firefox/44.0

Device: Flame 2.2
BuildID: 20151204032501
Gaia: 885647d92208fb67574ced44004ab2f29d23cb45
Gecko: 4381c4b69b9c
Gonk: bd9cb3af2a0354577a6903917bc826489050b40d
Version: 37.0 (2.2) 
Firmware Version: v18Dv4
User Agent: Mozilla/5.0 (Mobile; rv:37.0) Gecko/37.0 Firefox/37.0
Status: RESOLVED → VERIFIED
QA Whiteboard: [QAnalyst-Triage+][MGSEI-Triage+] → [QAnalyst-Triage?][MGSEI-Triage+]
Flags: needinfo?(jmercado)
QA Whiteboard: [QAnalyst-Triage?][MGSEI-Triage+] → [QAnalyst-Triage+][MGSEI-Triage+]
Flags: needinfo?(jmercado)
You need to log in before you can comment on or make changes to this bug.