Closed Bug 1052095 Opened 10 years ago Closed 10 years ago

Trivial fullscreen app causes 412% overdraw on Flame

Categories

(Core :: Graphics: Layers, defect, P1)

x86
macOS
defect

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: BenWa, Assigned: roc)

References

Details

(Keywords: perf, Whiteboard: [c=automation p= s= u=])

Attachments

(2 files)

Attached file reduced testcase app
This is a simplified testcase of what the Camera app appears to be running into.

We need to fix this to get the Camera app' overdraw under control.
Depends on: 1051101, 1049086
Blocks: 1049086, 1051101
No longer depends on: 1049086, 1051101
Keywords: perf
Whiteboard: [c=automation p= s= u=]
I'm looking at what is wrong with the code but meanwhile if someone can run a regression window this would really be helpful.
STR:
1) Install the app
2) Turn on the FPS counter, the 3rd counter will report 'overdraw'.

Ideal overdraw is 100, the overdraw I see on the flame is 412. A regression window to see which patch(es) regress this would be helpful.
Any change you could help with the above window Jason? This is a minification of a bigger problem with the Camera app to make it easier to bisect and/or fix.
Flags: needinfo?(jsmith)
Jason/Tony,

We need help from QA to do bisection here. BenWa suspects some changes to platform/system app that is causing this regression. 

Thanks
Hema
[Blocking Requested - why for this release]:

Blocks a CAF blocker.

(In reply to Benoit Girard (:BenWa) from comment #3)
> Any change you could help with the above window Jason? This is a
> minification of a bigger problem with the Camera app to make it easier to
> bisect and/or fix.

Sure - I'll get someone assigned here to help with the window.

Josh - Can you assign someone to this bug?
blocking-b2g: --- → 2.0?
Flags: needinfo?(jsmith) → needinfo?(jmitchell)
Flags: needinfo?(jmitchell)
QA Contact: ckreinbring
On the earlist build we have, the background app provided has an overdraw of 200.  Just sitting on the homescreen has an overdraw of 384.

BuildID: 20140417000006
Gaia: 7591e9dc782ac2e97d63a96f9deb71c7b3588328
Gecko: e71ed4135461
Platform Version: 31.0a1
Firmware Version: V123
User Agent: Mozilla/5.0 (Mobile; rv:32.0) Gecko/32.0 Firefox/32.0
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(jmitchell)
Based on the prior comment I'm uncertain that we can do a regression-window here. Regression windows need a clear "broken state" and "working state" and I don't feel this app, these numbers and this set-up provide such an opportunity.
Flags: needinfo?(jmitchell) → needinfo?(jsmith)
Benoit - Any ideas on comment 7?
Flags: needinfo?(jsmith) → needinfo?(bgirard)
Finding any point where there's a non trivial change in numbers is very valuable. Knowing when the numbers go from 200 to 384 for example will be valuable for instance.
Flags: needinfo?(bgirard)
Can someone attach a layer tree dump of either the camera or the test app?
I'm being told this is urgent, so I'm adding needinfo to Brian to have someone from his team look into this.

Based on Benoit's comment 9, one way we could do the window is to look for abnormal changes in the overdraw counter between builds & flag the relevant push log between those builds as a data point for investigation. Let's start by doing weekly overdraw measurements over the past 12 weeks to see if any abnormalities pop up.

Brian - Can you have someone from your team look into this?
Flags: needinfo?(brhuang)
Mike, pls help on this. Thx.
Flags: needinfo?(brhuang) → needinfo?(mlien)
(In reply to Vladimir Vukicevic [:vlad] [:vladv] from comment #10)
> Can someone attach a layer tree dump of either the camera or the test app?

There's a layer tree dump for Camera here: https://bugzilla.mozilla.org/show_bug.cgi?id=1049198
Here's the overdraw numbers I'm getting on Flame with the latest v1.4, v2.0 and v2.1 builds:

v2.1:
Gaia      8f955d80d175e52283275d3197e4eae2574b389f
Gecko     https://hg.mozilla.org/mozilla-central/rev/391f42c733fc
BuildID   20140811160202
Version   34.0a1
Overdraw  400

v2.0:
Gaia      8cb1a949f2e9650bb2c5598e78a6f24a58bbaf97
Gecko     https://hg.mozilla.org/releases/mozilla-aurora/rev/4bd4b0ae7bbe
BuildID   20140721000201
Version   32.0a2
Overdraw  295

v1.4:
Gaia      5587baf1eab05f5a9b3e5718cff719043b91a5bb
Gecko     https://hg.mozilla.org/releases/mozilla-b2g30_v1_4/rev/c668474f8c1d
BuildID   20140811160200
Version   30.0
Overdraw  200

So, we added an additional ~100 to the overdraw between v1.4 - v2.0 and v2.0 - v2.1.
Interesting observation on master (v2.1):

If I launch this test app from the Homescreen, I get an overdraw of "400". However, if I re-open it from the task manager, I get an overdraw of "305".
I flashed "mozilla-central-flame" PVT builds going backwards from today in 1-month intervals until I found the point where this test case no longer regressed. From there, I was able to narrow in on the build where the overdraw number doubled:

OVERDRAW: 400
Gaia      3a1d67246a79e3632c3b3f2460a25291e7e2714c
Gecko     https://hg.mozilla.org/mozilla-central/rev/e4843f4f08a7
BuildID   20140514160245
Version   32.0a1

OVERDRAW: 200
Gaia      a13d42ab5240008e042d0c61bf9c9d05174e70e4
Gecko     https://hg.mozilla.org/mozilla-central/rev/5c99d5136ad7
BuildID   20140514040204
Version   32.0a1

From there, we should be able to get the commits that fall within that range. Hope this helps!
Based on comment 11, TPE QA verify the last 12 weeks and this week's build to see the overdraw trend on both mozilla-central and v2.0 development channel
As comment 15 mentioned, if there has background apps, overdraw result will be influenced
We test three situations' overdraw: Bug attached test app, Camera app's picture mode, and Camera app's video mode
When the first time launch Camera app, geolocation message will prompt and it will raise higher overdraw result
For getting a stable result, we close and remember geolocation setting then kill camera app and re-launch again to test Camera overdraw

All results please refer to: http://goo.gl/PsAzqE
Flags: needinfo?(mlien)
Attached file layer-tree (test-case)
Vladimir: Here is the layer tree dump you requested for this bug's test-case.
Attachment #8471681 - Flags: feedback?(vladimir)
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=5c99d5136ad7&tochange=e4843f4f08a7

Looking at that the most suspicious change would be layout first then graphics (since it's a culling problem). Maybe it's caused by bug 1009679.

Going to try a local backout.
It's not bug 1009679. That's a no-op for b2g.
Thanks BenWa, Justin, Vlad, and QA Team to help move this forward!

BenWa, Based on your last comment since you are actively working on this, I am assigning this to you. 

This is blocking a P1 issue raised by CAF for 2.0, hence making it a blocker. 

Thanks
Hema
Assignee: nobody → bgirard
blocking-b2g: 2.0? → 2.0+
Setting needinfo for Alive based on Justin's findings in comment 15. Alive: could there be something happening in the window management code of the system app that could cause us to be doing extra drawing? (Or could there be something related to how the keyboard is displayed that is related to this extra overdraw?)
Flags: needinfo?(alive)
Regression is coming from a gaia change. Bisecting now.
Has anyone looked at this on other devices? What's the overdraw on Hamachi?  I imagine that it might be helpful to check a device with a devicePixelRatio of 1 to make sure this isn't related to that.
(In reply to David Flanagan [:djf] from comment #24)
> Has anyone looked at this on other devices? What's the overdraw on Hamachi? 
> I imagine that it might be helpful to check a device with a devicePixelRatio
> of 1 to make sure this isn't related to that.

Good idea! I'll flash my Hamachi now and check it. May also be related, but I just noticed that in the Camera app, the reason the "overdraw" number increases when in video mode is because on Flame, our video previewSize is a higher resolution than our photo previewSize.
(In reply to Justin D'Arcangelo [:justindarc] from comment #25)
> 
> . May also be related, but
> I just noticed that in the Camera app, the reason the "overdraw" number
> increases when in video mode is because on Flame, our video previewSize is a
> higher resolution than our photo previewSize.

I've wondered about that to. On IRC, Benoit told me he thought the overdraw numbers were based on output pixels (i.e. screen size) not input pixels (i.e. raw size of the video streams). He could have been mistaken about that, however.

(I did find it interesting, however, that turning on the grid overlay in the camera app increased the number by exactly 100 indicating exactly one additional full screen of pixels being drawn in that case.)
(In reply to David Flanagan [:djf] from comment #26)
> (In reply to Justin D'Arcangelo [:justindarc] from comment #25)
> > 
> > . May also be related, but
> > I just noticed that in the Camera app, the reason the "overdraw" number
> > increases when in video mode is because on Flame, our video previewSize is a
> > higher resolution than our photo previewSize.
> 
> I've wondered about that to. On IRC, Benoit told me he thought the overdraw
> numbers were based on output pixels (i.e. screen size) not input pixels
> (i.e. raw size of the video streams). He could have been mistaken about
> that, however.

That is correct, but it doesn't mean there something else is acting differently because of the content scale which can cause the overdraw numbers to be different. For example see bug 1033538 for such a bug.
(In reply to David Flanagan [:djf] from comment #26)
> (I did find it interesting, however, that turning on the grid overlay in the
> camera app increased the number by exactly 100 indicating exactly one
> additional full screen of pixels being drawn in that case.)

Sounds like an additional ThebesLayer was being created for the grid. That can probably be fixed in CSS easily.

So, I'm building flame-kk to flash to my Flame running the KK base image now so I can check the overdraw on KK. According to this:

https://bugzilla.mozilla.org/show_bug.cgi?id=1051101#c9

Sushil is saying that the Camera is already using HWC on v2.0 on Flame (KK). However, on my Flame (JB), it most certainly isn't (the FPS counter is still visible).

Also, going off of what you were suggesting in Comment 24, I flashed my Hamachi to latest v2.0, and the Camera app *IS* using HWC there (the FPS counter disappears when the viewfinder comes up).
(In reply to Mike Lien[:mlien] from comment #17)
> Based on comment 11, TPE QA verify the last 12 weeks and this week's build
> to see the overdraw trend on both mozilla-central and v2.0 development
> channel
> As comment 15 mentioned, if there has background apps, overdraw result will
> be influenced
> We test three situations' overdraw: Bug attached test app, Camera app's
> picture mode, and Camera app's video mode
> When the first time launch Camera app, geolocation message will prompt and
> it will raise higher overdraw result
> For getting a stable result, we close and remember geolocation setting then
> kill camera app and re-launch again to test Camera overdraw
> 
> All results please refer to: http://goo.gl/PsAzqE

That's very nice spreadsheet!

As we can tell the regression window is complex. 2014/7/20 on master shows what a 'good' state is. The 3rd party app should be at 200 and the camera is at 300.

This bug should try to get the 3rd party app back to 200.
From the spreadsheet, I've narrowed down the recent regression window where we went from "200" back up to "400":

Last-known "working" build (overdraw = 200):
Gecko     https://hg.mozilla.org/mozilla-central/rev/e9cdcf646d1c
BuildID   20140720160202
Version   33.0a1

First-known "broken" build (overdraw = 400):
Gecko     https://hg.mozilla.org/mozilla-central/rev/42c6a5418370
BuildID   20140721040210
Version   33.0a1
The regression on that range it mostly likely bug 1022612.
Depends on: 1022612
A big thank you to everyone who contributed in the spreadsheet and getting us a clear regression window on the complex regression range!

From what I can tell we have 3 issues identified by the spread sheet:
(1) There's a serious regression on master caused by bug 1022612. This explains the issue that this bug is seeing with 412% overdraw numbers. We should aim to get these numbers back to 200% like we had them on 2014/7/20 before bug 1022612 landed.
(2) Looks like master never hit 200% overdraw for the simple app or camera picture mode. Perhaps it is missing an optimization that needs to be uplifted on 2.0. bug 1022612 wont be an issue since it's not on v2.0. This will be tracked by bug 1052751.
(3) There's a 100% constant overdraw different between camera picture/video mode which is likely a problem with the app. This will be tracked by bug 1052742.

Moving forward this bug should track issue (1) only. Issues (2)+(3) are forked to their own bugs.
[Blocking Requested - why for this release]: The 400% regression is a master only issue. This should block 2.1 not 2.0. For 2.0 better bugs to focus on are bug 1052751 + bug 1052742.
blocking-b2g: 2.0+ → 2.1?
Can we get an automated test for this?
We already do. It's bug 968290. I asked mchang to up the priority. As we can see in the spreadsheet our overdraw numbers are a real rollercoaster and we need to get them in check.
I asked roc to look into this since it's a 2.1 issue. :justinarc is looking at (3) and I'm looking at (2).
Assignee: bgirard → roc
I am afraid system app has done nothing as I know to a regular fullscreen app. Looks like it's a platform issue by comments above?
Flags: needinfo?(alive)
Priority: -- → P1
Unfortunately I've had great difficult getting a B2G build working on a phone over the last week, and the emulator is so slow it's practically unusable.
Un-blocking Bug 1049086 because we were able to get Camera back on HWC without resolving this issue.
No longer blocks: 1049086
On my Flame, the test app currently gives 210% overdraw or thereabouts.

We paint every pixel twice because of this chunk of the layer tree:

ContainerLayerComposite (0xab88e000) [shadow-clip=(x=0, y=0, w=480, h=854)] [shadow-visible=< (x=0, y=0, w=480, h=854); >] [clip=(x=0, y=0, w=480, h=854)] [visible=< (x=0, y=0, w=480, h=854); >] [opaqueContent]
...
  ColorLayerComposite (0xab986000) [shadow-visible=< (x=0, y=0, w=480, h=854); >] [visible=< (x=0, y=0, w=480, h=854); >] [opaqueContent] [color=rgba(255, 255, 255, 1)]

RefLayerComposite (0xab987000) [shadow-clip=(x=0, y=0, w=480, h=854)] [shadow-visible=< (x=0, y=0, w=480, h=854); >] [clip=(x=0, y=0, w=480, h=854)] [visible=< (x=0, y=0, w=480, h=854); >] [id=8]
  ContainerLayerComposite (0xaac93c00) [shadow-visible=< (x=0, y=0, w=480, h=854); >] [visible=< (x=0, y=0, w=480, h=854); >] [opaqueContent] [metrics={ cb=(x=0.000000, y=0.000000, w=480.000000, h=854.000000) sr=(x=0.000000, y=0.000000, w=320.000000, h=569.333313) s=(x=0.000000, y=0.000000) dp=(x=0.000000, y=0.000000, w=320.000000, h=569.333313) cdp=(x=0.000000, y=0.000000, w=320.000000, h=569.333313) scrollId=3 z=1.500 }]
...
    ColorLayerComposite (0xaac95000) [shadow-visible=< (x=0, y=0, w=480, h=854); >] [visible=< (x=0, y=0, w=480, h=854); >] [opaqueContent] [color=rgba(255, 0, 0, 1)]

ContainerLayerComposite should have determined that layer 0xab88e000 is covered by layer 0xaac93c00. It didn't, because layer 0xaac93c00's transform is not just a translation, so GetOpaqueRect bailed out. That transform is there because the app starts with a default scale (like a regular HTML page would).

So I added this to the test app's <head>:
  <meta name="viewport" content="width=device-width, user-scalable=no, initial-scale=1">
The problem went away and my overdraw is down to 105%, which is about right since there is some developer overlay at the bottom of the screen.

Most apps have this already. The Camera app certainly does.

So I think this bug, as filed, is WORKSFORME.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → WORKSFORME
If someone thinks it's important to fix the overdraw in this app without the <meta viewport>, feel free to reopen the bug. I think it's probably not important since very few apps want the user-scaling behavior that you get without <meta viewport>.
That was just an omission in my test case. It's nice-to-have but not important enough to fix.

The real problem was that container layer we're not flagged as opaqueContent but if that's fixed now on trunk then great.
blocking-b2g: 2.1? → ---
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: