Closed Bug 1149461 Opened 9 years ago Closed 9 years ago

[Message]Tap a new number in recipients to edit, a blank or black area appears.

Categories

(Core :: Graphics: Layers, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla40
blocking-b2g 2.2+
Tracking Status
firefox38 --- wontfix
firefox39 --- wontfix
firefox40 --- fixed
b2g-v2.2 --- verified
b2g-master --- verified

People

(Reporter: huayu.li, Assigned: kats)

References

Details

(Keywords: regression, Whiteboard: [2.2-nexus-5-l] gfx-noted)

Attachments

(6 files)

Attached file logcat3.txt
[1.Description]:
[Nexus 5 v2.2&v3.0][Flame 2.2][Message]There are some new number you added in recipients, tap anyone to edit, there will be a blank or black area.
Found at:15:57
See attachments:logcat3.txt, VIDEO0467.3gp

[2.Testing Steps]: 
1.Launch message.
2.Tap edit button.
3.Tap the recipient inut box.
4.Add some recipients[Input number and tap enter key].
5.Tap the number you added.
6.Try to delete or add more number and tap enter key on keyboard.

[3.Expected Result]: 
6.There should not have any blank or black area.

[4.Actual Result]: 
6.Sometimes, the number added in step 6 cannot be displayed, just a blank area, if we tap enter key, a black area will appear.

[5.Reproduction build]: 
Device:Nexus5_2.2[Affected]
Build ID               20150330162503
Gaia Revision          cc11248ab69f13e89416c8e6bb2e184187e72088
Gaia Date              2015-03-30 22:22:58
Gecko Revision         https://hg.mozilla.org/releases/mozilla-b2g37_v2_2/rev/90a26917ee8f
Gecko Version          37.0
Device Name            hammerhead
Firmware(Release)      5.0
Firmware(Incremental)  eng.cltbld.20150330.201653
Firmware Date          Mon Mar 30 20:17:07 EDT 2015
Bootloader             HHZ12d

Device:Nexus5_3.0[Affected]
Build ID               20150330160204
Gaia Revision          be25b16efa19bab8d54be08f8fe45dcc93bf93d0
Gaia Date              2015-03-29 10:19:00
Gecko Revision         https://hg.mozilla.org/mozilla-central/rev/6dedce1ca673
Gecko Version          40.0a1
Device Name            hammerhead
Firmware(Release)      5.0
Firmware(Incremental)  eng.cltbld.20150330.191451
Firmware Date          Mon Mar 30 19:15:04 EDT 2015
Bootloader             HHZ12d

Device:Flame 2.2[Affected]
Build ID               20150330162503
Gaia Revision          cc11248ab69f13e89416c8e6bb2e184187e72088
Gaia Date              2015-03-30 22:22:58
Gecko Revision         https://hg.mozilla.org/releases/mozilla-b2g37_v2_2/rev/90a26917ee8f
Gecko Version          37.0
Device Name            flame
Firmware(Release)      4.4.2
Firmware(Incremental)  eng.cltbld.20150330.201412
Firmware Date          Mon Mar 30 20:14:21 EDT 2015
Bootloader             L1TC000118D0

[6.Reproduction Frequency]: 
occasionally Recurrence,5/10

[7.TCID]: 
Free Test
Attached video VIDEO0467.3gp
Looks like Graphics issues...

Milan, can you help redirecting this?
blocking-b2g: --- → 2.2?
Component: Gaia::SMS → Graphics: Layers
Product: Firefox OS → Core
Flags: needinfo?(milan)
Does this also happen on flame on master? Or only 2.2? From comment 0 it says Flame 2.2 but the tracking flags say v3.0 as well.
See Also: → 1149457
Flags: needinfo?(huayu.li)
And it happens on Flame 2.2 KitKat (I don't know if we have Flame L yet) as well?  Regression?
Flags: needinfo?(milan)
Whiteboard: [2.2-nexus-5-l] → [2.2-nexus-5-l] gfx-noted
Keywords: regression
QA Contact: ktucker
Hi mason, this issue also exist on Flame 3.0
Reproduction Frequency: 2/10
Device:flame 3.0
Build ID               20150331160205
Gaia Revision          03164bd160809747e6a198e0dba1b7c3ee7789f5
Gaia Date              2015-03-31 14:48:14
Gecko Revision         https://hg.mozilla.org/mozilla-central/rev/18a8ea7c2c62
Gecko Version          40.0a1
Device Name            flame
Firmware(Release)      4.4.2
Firmware(Incremental)  eng.cltbld.20150331.191641
Firmware Date          Tue Mar 31 19:16:50 EDT
Flags: needinfo?(huayu.li)
blocking-b2g: 2.2? → 2.2+
Tagged as a regression, but I didn't see a mention that it works on 2.1?  Kats, it seems to happen when we start being able to scroll the field, any thoughts?
Flags: needinfo?(bugmail.mozilla)
Mozilla-Inbound Regression Window:

Last working Mozilla-Inbound build:
Device: Flame 3.0
BuildID: 20150224155557
Gaia: f6bfd854fe4746f21bc006eac145365e85f98808
Gecko: b3cf9862c075
Version: 39.0a1 (3.0)
Firmware Version: v18D-1
User Agent: Mozilla/5.0 (Mobile; rv:39.0) Gecko/39.0 Firefox/39.0

First broken Mozilla-Inbound build:
Device: Flame 3.0
BuildID: 20150224210729
Gaia: f6bfd854fe4746f21bc006eac145365e85f98808
Gecko: 9d4ffa427cf8
Version: 39.0a1 (3.0)
Firmware Version: v18D-1
User Agent: Mozilla/5.0 (Mobile; rv:39.0) Gecko/39.0 Firefox/39.0

Working Gaia with Broken Gecko issue DOES reproduce:
Gaia: f6bfd854fe4746f21bc006eac145365e85f98808
Gecko: 9d4ffa427cf8

Working Gecko with Broken Gaia issue does NOT reproduce:
Gaia: f6bfd854fe4746f21bc006eac145365e85f98808
Gecko: b3cf9862c075

Mozilla-Inbound Pushlog:
http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=b3cf9862c075&tochange=9d4ffa427cf8

This looks to be caused by either bug 1135677 or bug 1134762


Kartikaya, can you take a look at this please? Unfortunately, we can't do a gecko bisection here yet so we are only guessing here.
Blocks: 1134762, 1135677
QA Whiteboard: [QAnalyst-Triage+]
(In reply to Mason Chang [:mchang] from comment #3)
> Does this also happen on flame on master? Or only 2.2? From comment 0 it
> says Flame 2.2 but the tracking flags say v3.0 as well.

Hi! Mason,

This looks like not Lollipop specific issue. It also happens on flame-kk.
Do you agree to remove "Blocks: 1094121"? Thanks

--
Keven
Flags: needinfo?(mchang)
I was able to reproduce this on a Flame running master, and grabbed a tiling log which shows that there's a mismatch between the client side and the compositor with respect how much of the layer they think is visible. The compositor thinks more of the layer is visible, and therefore thinks that layout didn't paint enough of the layer and that we are checkerboarding. Therefore it aborts high-res painting and drops down to low-res painting, but it often aborts that too because there's always new paints on the way (presumably due to the blinking cursor).

However, I think there's some other underlying issue here which might be aggravating it. I "fixed" the tiling problem by changing AboutToCheckerboard always return false, but I can still reproduce some strange behaviour with the positioning of the elements as you switch between the two views ("to" list visible vs. text entry). I'll attach a video in a sec. I think that fixing that issue might just make this problem go away. Or it might not. I'll keep looking into this but somebody on the Gaia side should look into the positioning issue.
The video I took is just a little too large to attach. I posted it at people.mozilla.org/~kgupta/bug/1149461.mp4 instead - note how when I start typing again the field scrolls off completely and doesn't get repositioned properly until I hit enter. Even then I'm not sure if the layout of this app (on the gaia side) is the way it's intended to be.
Yes, the layout at the very end of the video looks correct.
(In reply to Keven Kuo [:kkuo] from comment #8)
> (In reply to Mason Chang [:mchang] from comment #3)
> > Does this also happen on flame on master? Or only 2.2? From comment 0 it
> > says Flame 2.2 but the tracking flags say v3.0 as well.
> 
> Hi! Mason,
> 
> This looks like not Lollipop specific issue. It also happens on flame-kk.
> Do you agree to remove "Blocks: 1094121"? Thanks
> 
> --
> Keven

Sure, removed the blocking bug.
No longer blocks: gonk-L
Flags: needinfo?(mchang)
Assignee: nobody → bugmail.mozilla
Flags: needinfo?(bugmail.mozilla)
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #9)
> I was able to reproduce this on a Flame running master, and grabbed a tiling
> log which shows that there's a mismatch between the client side and the
> compositor with respect how much of the layer they think is visible. The
> compositor thinks more of the layer is visible, and therefore thinks that
> layout didn't paint enough of the layer and that we are checkerboarding.

So what's happening here is that we have a scrollable frame that is CSS transformed so that part of it is offscreen. For instance, here is the container layer with a CSS transform of y=-667.5

ClientContainerLayer (0xb18ede00) [transform=[ 1 0; 0 1; 0 -667.5; ]] [visible=< (x=0, y=742, w=480, h=83); >]

and then contained inside is the scrollable frame:

  ClientTiledPaintedLayer (0xb1967800) [clip=(x=58, y=0, w=347, h=825)] [transform=[ 1 0; 0 1; 58 0; ]] [bounds=(x=0, y=9, w=326, h=807)] [visible=< (x=0, y=9, w=326, h=807); >] { hitregion=< (x=0, y=0, w=347, h=825); > dispatchtocontentregion=< (x=0, y=0, w=347, h=825); >} [metrics0={ [cb=(x=58.000000, y=0.000000, w=347.000000, h=825.000000)] [sr=(x=0.000000, y=0.000000, w=231.333328, h=550.000000)] [s=(0,0)] [dp=(x=0.000000, y=0.000000, w=231.333328, h=550.000000)] [cdp=(x=0.000000, y=341.333344, w=231.333328, h=208.666672)] [color=rgba(0, 0, 0, 0)] [scrollId=6] [scrollParent=3] [z=1.5] }] [valid=< (x=0, y=512, w=326, h=151); (x=0, y=672, w=248, h=42); (x=0, y=723, w=266, h=42); (x=0, y=774, w=234, h=42); >]

The critical displayport for this layer starts at y=341.33 because layout knows that only the bottom bit of the layer is visible. But the tiling code doesn't know that, so it thinks layout didn't paint everything it should have, and considers things to be in a state of checkerboarding.
Rather than try to add more smarts to the tiling code (which has not worked out well in the past) I think it's safer to re-enable the code that Benoit added a while back (in bug 1112332) and then disabled (in bug 1134762). This code disables drawing a layer progressively unless we suspect it is being actively scrolled on the compositor. The rationale here is that if it's not being actively scrolled, then there's no point to drawing it progressively because we shouldn't be aborting between tiles anyway.

Note that when Benoit first landed this code it was in a different place (in UseFastPath()) which has since been removed and some of the code has been rearranged. However during the rearrangement Benoit and I agreed that keeping this around made sense and that we would want to re-enable it some day once things were more stable. I think that day has come, as it fixes this bug nicely.
Attachment #8588660 - Flags: review?(nical.bugzilla)
No-Jun, this may be worth a bit of an extra testing.
Flags: needinfo?(npark)
Found Bug 1151635, it is sort of similar but the patch for this bug did not seem to make the bug go away.
Flags: needinfo?(npark)
Attachment #8588660 - Flags: review?(nical.bugzilla) → review+
https://hg.mozilla.org/mozilla-central/rev/9a902a85c828
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla40
Will let this bake a few days before requesting uplift.
I tested the patch, and I couldn't repro the issue on my local build (while I was able to repro the issue on today's nightly build) and other basic graphics beahvior seemed good to me.  I will check again on tomorrow's build when I do the graphics sanity test.  ni?ing myself as a reminder.
Flags: needinfo?(npark)
This issue had been successfully verified on Flame 3.0 and nexus5 3.0
Reproducing rate: 0/5
Device: Flame 3.0[Verified]
Build ID               20150407160201
Gaia Revision          84cbd4391fb7175d5380fa72c04d68873ce77e6d
Gaia Date              2015-04-07 17:33:14
Gecko Revision         https://hg.mozilla.org/mozilla-central/rev/078128c2600a
Gecko Version          40.0a1
Device Name            flame
Firmware(Release)      4.4.2
Firmware(Incremental)  eng.cltbld.20150407.193600
Firmware Date          Tue Apr  7 19:36:12 EDT 2015
Bootloader             L1TC000118D0

Device: Nexus5 3.0[Verified]
Build ID               20150407160201
Gaia Revision          84cbd4391fb7175d5380fa72c04d68873ce77e6d
Gaia Date              2015-04-07 17:33:14
Gecko Revision         https://hg.mozilla.org/mozilla-central/rev/078128c2600a
Gecko Version          40.0a1
Device Name            hammerhead
Firmware(Release)      5.0
Firmware(Incremental)  eng.cltbld.20150407.192944
Firmware Date          Tue Apr  7 19:30:00 EDT 2015
Bootloader             HHZ12d
didn't find any obvious new graphics glitch after it landed in mozilla-central
Flags: needinfo?(npark)
Comment on attachment 8588660 [details] [diff] [review]
Disable progressive drawing unless the layer is actively scrolling

NOTE: Please see https://wiki.mozilla.org/Release_Management/B2G_Landing to better understand the B2G approval process and landings.

[Approval Request Comment]
Bug caused by (feature/regressing bug #): progressive painting code
User impact if declined: sometimes the progressive painting code aborts drawing a little too eagerly, causing unpainted areas/glitching.
Testing completed: locally, baked on m-c for a few days
Risk to taking this patch (and alternatives if risky): I think the main risk here is that we might end up regressing performance in some cases. Without this patch, we abort painting more aggressively, so we can respond to changes faster but at the cost of sometimes having glitchy/bad visible areas. With this patch, we go the other way, so will abort less which means we will paint unnecessarily at times. Still I think it's worth it for the correctness win. I don't think the perf hit will be very noticeable in most cases.
String or UUID changes made by this patch: none
Attachment #8588660 - Flags: approval-mozilla-b2g37?
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #23)
> Comment on attachment 8588660 [details] [diff] [review]
> Disable progressive drawing unless the layer is actively scrolling
> 
> NOTE: Please see https://wiki.mozilla.org/Release_Management/B2G_Landing to
> better understand the B2G approval process and landings.
> 
> [Approval Request Comment]
> Bug caused by (feature/regressing bug #): progressive painting code
> User impact if declined: sometimes the progressive painting code aborts
> drawing a little too eagerly, causing unpainted areas/glitching.
> Testing completed: locally, baked on m-c for a few days
> Risk to taking this patch (and alternatives if risky): I think the main risk
> here is that we might end up regressing performance in some cases. Without
> this patch, we abort painting more aggressively, so we can respond to
> changes faster but at the cost of sometimes having glitchy/bad visible
> areas. With this patch, we go the other way, so will abort less which means
> we will paint unnecessarily at times. Still I think it's worth it for the
> correctness win. I don't think the perf hit will be very noticeable in most
> cases.
hey kats, are there any performance test cases that you can run locally to make sure things have not regressed? Do you think running the raptor tests (http://mzl.la/1DRdrJo)to confirm atleast the metrics we have in place :  would be worth it here? Or any other ideas to keep a tap on plausible perf fallouts ?
> String or UUID changes made by this patch: none
Flags: needinfo?(bugmail.mozilla)
(In reply to bhavana bajaj [:bajaj] from comment #24)
> hey kats, are there any performance test cases that you can run locally to
> make sure things have not regressed? Do you think running the raptor tests
> (http://mzl.la/1DRdrJo)to confirm atleast the metrics we have in place : 
> would be worth it here? Or any other ideas to keep a tap on plausible perf
> fallouts ?

There isn't much beyond talos tests that I can run locally for this, and those haven't picked up any regressions since this landed on inbound/central. I haven't heard of the raptor tests before but from reading the wiki page it does sound like it might be worth running. The gaia team also runs startup time tests (not sure if that's the same as raptor) and so far I haven't heard of any regressions in those either.
Flags: needinfo?(bugmail.mozilla)
Yeah, Raptor is the new tool to run startup time test... But I think the current results are broken and do not have any measurement from before you landed your patch :/

So maybe it would be best if you can take measures before/after your patch using instructions at [1], for some sample apps.

It takes time but you can let it run while working on other things :)

[1] https://developer.mozilla.org/en-US/Firefox_OS/Automated_testing/Raptor#Running_a_performance_test
Attached file Rapter log after patch
These are the raptor timing info for the clock, dialer, contacts, and settings apps with 10 runs each. It seems to me that there wasn't any significant regression (or improvement), the numbers seem to be within standard deviation.
Yep, seems about right.

The first app seems to have got more perf regression than the others, but with only 10 runs it's difficult to have an absolute certainty.

But we now recovered the old measures on raptor main UI and I don't see any spike around April 7th. I'd say it's good to go.
Attachment #8588660 - Flags: approval-mozilla-b2g37? → approval-mozilla-b2g37+
This issue had been successfully verified on Flame 2.2and nexus5 2.2
Reproducing rate: 0/5
Device:Flame 2.2[Verified]
Build ID               20150414162502
Gaia Revision          16e948bfaaa15dbc0200135d52f16257b4eab193
Gaia Date              2015-04-14 21:08:25
Gecko Revision         https://hg.mozilla.org/releases/mozilla-b2g37_v2_2/rev/0eec28e78eb1
Gecko Version          37.0
Device Name            flame
Firmware(Release)      4.4.2
Firmware(Incremental)  eng.cltbld.20150414.201515
Firmware Date          Tue Apr 14 20:15:24 EDT 2015
Bootloader             L1TC000118D0

Devie:Nexus5 2.2[Verified]
Build ID               20150413162504
Gaia Revision          73645b097720f3ca594a14d288b87d3885d7fc9d
Gaia Date              2015-04-13 19:30:36
Gecko Revision         https://hg.mozilla.org/releases/mozilla-b2g37_v2_2/rev/85ea1be9ac7d
Gecko Version          37.0
Device Name            hammerhead
Firmware(Release)      5.1
Firmware(Incremental)  eng.cltbld.20150413.202024
Firmware Date          Mon Apr 13 20:20:40 EDT 2015
Bootloader             HHZ12f

There is no device of firefox40, leave it for other QA to verify.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: