Closed Bug 1018387 Opened 10 years ago Closed 10 years ago

[B2G] [Browser] Texts in textboxes become unreadably blurry when made to scroll

Categories

(Core :: Panning and Zooming, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla33
blocking-b2g 2.0+
Tracking Status
firefox31 --- wontfix
firefox32 --- fixed
firefox33 --- fixed
b2g-v1.4 --- unaffected
b2g-v2.0 --- fixed
b2g-v2.1 --- fixed

People

(Reporter: ckreinbring, Assigned: kats)

References

Details

(Keywords: regression, Whiteboard: [2.0-flame-test-run-1])

Attachments

(7 files, 4 obsolete files)

2.93 KB, text/plain
Details
38.36 KB, image/jpeg
Details
2.66 KB, text/plain
Details
1.81 KB, patch
cwiiis
: review+
Details | Diff | Splinter Review
13.80 KB, patch
cwiiis
: review+
Details | Diff | Splinter Review
18.19 KB, patch
kats
: review+
Details | Diff | Splinter Review
3.58 KB, patch
botond
: review+
Details | Diff | Splinter Review
Description:
When a large amount of text is entered in some browser textboxes, the text becomes blurry and unreadable.

Repro Steps:
1) Update Flame to Build ID: 20140530040207
2) Launch the browser and navigate to http://mozilla.github.io/qa-testcase-data/webapi/notifications/index.html
3) Tap one of the textboxes in the Show Notification section and enter enough text to exceed the textbox boundaries.
4) Observe the text in the textbox when it is forced to scroll.

Actual:
The text becomes a blurry mess that does not match up with the entered text.

Expected:
The text is scrolled with no errors.

Environmental Variables
Device: Flame 2.0
Build ID: 20140530040207
Gecko: https://hg.mozilla.org/mozilla-central/rev/e6f113c83095
Gaia: 26d8fcab9b61f46451600f39c51e0387ef3c4f88
Platform Version: 32.0a1
Firmware Version: v10G-2
User Agent: Mozilla/5.0 (Mobile; rv:32.0) Gecko/32.0 Firefox/32.0

Notes:
Repro frequency: 100%
See attached screenshot and logcat logs
Also repros on today's Open C and Buri 2.0, but today's Flame 1.4 renders the textbox text normally.

Open C 2.0
Build ID: 20140530040207
Gecko: https://hg.mozilla.org/mozilla-central/rev/e6f113c83095
Gaia: 26d8fcab9b61f46451600f39c51e0387ef3c4f88
Platform Version: 32.0a1
Firmware Version: P821A10v1.0.0806_LOG_DL
User Agent: Mozilla/5.0 (Mobile; rv:32.0) Gecko/32.0 Firefox/32.0

Buri 2.0
Build ID: 20140530040207
Gecko: https://hg.mozilla.org/mozilla-central/rev/e6f113c83095
Gaia: 26d8fcab9b61f46451600f39c51e0387ef3c4f88
Platform Version: 32.0a1
Firmware Version: V1.2-device.cfg
User Agent: Mozilla/5.0 (Mobile; rv:32.0) Gecko/32.0 Firefox/32.0

Flame 1.4
Build ID: 20140530000202
Gecko: https://hg.mozilla.org/releases/mozilla-b2g30_v1_4/rev/25011f9a8f26
Gaia: fe612fd21389193a8e593aa718831602e5086a62
Platform Version: 30.0
Firmware Version: V10g-2
User Agent: Mozilla/5.0 (Mobile; rv:30.0) Gecko/30.0 Firefox/30.0


The text is also rendered correctly on yesterday's build on all devices.

Build ID: 20140529040201
Gecko: https://hg.mozilla.org/mozilla-central/rev/1e712b724d17
Gaia: 4142df90c71abdc1e3a87cd37dff1a22d5e38b34
Platform Version: 32.0a1
Firmware Version: V10g-2
User Agent: Mozilla/5.0 (Mobile; rv:32.0) Gecko/32.0 Firefox/32.0
Maybe graphics? We'll know after getting the range.
blocking-b2g: --- → 2.0?
Component: Gaia::Browser → Graphics
Product: Firefox OS → Core
Version: unspecified → Trunk
Flame logcat logs may not be accurate doe to bug 1010993, so attaching logs from today's Open C 2.0
QA Contact: pcheng
Mozilla-inbound regression window:

Last Working Environmental Variables:
Device: Flame
BuildID: 20140529073016
Gaia: 4142df90c71abdc1e3a87cd37dff1a22d5e38b34
Gecko: 1c823c4af278
Version: 32.0a1
Firmware Version: v10g-2

First Broken Environmental Variables:
Device: Flame
BuildID: 20140529103002
Gaia: b669dd2cc321f37cebc7081a79b968cac36b4200
Gecko: 29ca8bc78484
Version: 32.0a1
Firmware Version: v10g-2

Last Working Gaia / First Broken Gecko: Issue DOES reproduce
Gaia: 4142df90c71abdc1e3a87cd37dff1a22d5e38b34
Gecko: 29ca8bc78484

Last Working Gecko / First Broken Gaia: Issue does NOT reproduce
Gaia: b669dd2cc321f37cebc7081a79b968cac36b4200
Gecko: 1c823c4af278

Gecko pushlog:
http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=1c823c4af278&tochange=29ca8bc78484

Notes:
I'm not confident about the above pushlog, so here is the Central Tinderbox gecko pushlog:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=b85b57f05fda&tochange=38c5e21a80fa

And here is the b2g-inbound gecko pushlog:
http://hg.mozilla.org/integration/b2g-inbound/pushloghtml?fromchange=0ad306bee50e&tochange=6b4c05be821f
Both push logs seem a bit off here - they both are showing pushes from other branches.

Can we double check this?
I switched to using Buri to find the window (as Flame seems to always give me weird gecko pushlogs) and had JMitchell double-check my window before posting.

Mozilla inbound regression window:

Last Working Environmental Variables:
Device: Buri MOZ
BuildID: 20140529090251
Gaia: b669dd2cc321f37cebc7081a79b968cac36b4200
Gecko: 811e55feb157
Version: 32.0a1
Firmware Version: v1.2-device.cfg

First Broken Environmental Variables:
Device: Buri MOZ
BuildID: 20140529095252
Gaia: b669dd2cc321f37cebc7081a79b968cac36b4200
Gecko: ee8f081ebce6
Version: 32.0a1
Firmware Version: v1.2-device.cfg

Gaia is the same on both builds, indicating this as a Gecko issue.

Gecko pushlog:
http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=811e55feb157&tochange=ee8f081ebce6
Okay - that window looks correct.

This is either a regression from bug 897996 or bug 994293.

kats - Can you investigate?
Flags: needinfo?(bugmail.mozilla)
We can't really ship this way, but if we decide to turn off low res tiling, this will stop being a blocker.
Component: Graphics → Panning and Zooming
blocking-b2g: 2.0? → 2.0+
Yeah I'll take a look
Assignee: nobody → bugmail.mozilla
Flags: needinfo?(bugmail.mozilla)
Stephany, just wanted a UX confirmation that "we can't ship this way".
Flags: needinfo?(swilkes)
(In reply to Milan Sreckovic [:milan] from comment #11)
> Stephany, just wanted a UX confirmation that "we can't ship this way".

Isn't this not obvious? The text isn't readable.
You know what - you're right, I was thinking of another bug when I made this comment.  OK, I stand by my original 2.0+ from comment 9.
Flags: needinfo?(swilkes)
It's been a long day of fighting with coordinate systems and I might be hallucinating, but it looks to me that Layer::mVisibleRegion is in LayerPixels. But in the tiling code we do a whole bunch of stuff that treats it like it's in LayoutDevicePixels. It's a minor miracle anything works at all.
To Milan: We shouldn't ship this way. To Kartikaya: Your comment #14 just made my (very long) day. :)
Turns out most of the math is still right even though it's mislabeled. So it's not as bad as I thought..
This patch may not work so well with unified builds. Waiting on try results to see what happens.
No functional changes here. mVisibleRegion and mValidRegion are in LayerPixel space, so we need to keep everything else in LayerPixel space as well to do meaningful operations between them.
Attachment #8436492 - Flags: review?(chrislord.net)
Attached patch Part 4 - The fix (obsolete) — Splinter Review
This is the actual fix, which makes perfect sense once everything is in LayerPixels as it's supposed to be. I need to test this on Fennec, but on B2G it seems to be fine. Try pushes:

https://tbpl.mozilla.org/?tree=Try&rev=135bd80fc902
https://tbpl.mozilla.org/?tree=Try&rev=db161bd578be
Comment on attachment 8436489 [details] [diff] [review]
Part 1 - Log composition bounds in layer dump

Review of attachment 8436489 [details] [diff] [review]:
-----------------------------------------------------------------

LGTM.
Attachment #8436489 - Flags: review?(chrislord.net) → review+
Comment on attachment 8436492 [details] [diff] [review]
Part 3 - Relabel code and comments

Review of attachment 8436492 [details] [diff] [review]:
-----------------------------------------------------------------

LGTM, with the 2nd comment addressed.

::: gfx/layers/client/TiledContentClient.cpp
@@ +988,5 @@
>      transformedCompositionBounds.Intersect(aPaintData->mCompositionBounds);
>  
>    // Offset by the viewport origin, as the composition bounds are stored in
>    // Layer space and not LayoutDevice space.
> +  // TODO(kats): does this make sense?

I'm also uncertain of this.

::: gfx/layers/client/TiledContentClient.h
@@ +279,5 @@
>     */
>    CSSToParentLayerScale mResolution;
>  
>    /*
>     * The composition bounds of the layer, in LayoutDevice coordinates. This is

This comment needs updating.
Attachment #8436492 - Flags: review?(chrislord.net) → review+
Comment on attachment 8436493 [details] [diff] [review]
Part 4 - The fix

Review of attachment 8436493 [details] [diff] [review]:
-----------------------------------------------------------------

I have a feeling this patch may break fennec, but it's no more than a feeling - make sure to test it thoroughly (which I'm sure you will).
There's an error in part 2 (see bug 1022381) which makes part 4 not work at all. :( Will need more investigation.
s/part 2/part 3/
Updated to fix the error I described in bug 1022381 and to address the review comment. Carrying r+
Attachment #8436492 - Attachment is obsolete: true
Attachment #8436536 - Flags: review+
Attached patch Part 4 - The fix (v2) (obsolete) — Splinter Review
Still needs testing on Fennec. Try push at https://tbpl.mozilla.org/?tree=Try&rev=9bbe10cf2fa7
Attachment #8436493 - Attachment is obsolete: true
Comment on attachment 8436537 [details] [diff] [review]
Part 4 - The fix (v2)

Review of attachment 8436537 [details] [diff] [review]:
-----------------------------------------------------------------

Tested on Fennec, seems good.
Attachment #8436537 - Flags: review?(chrislord.net)
Attachment #8436537 - Flags: review?(botond)
Comment on attachment 8436535 [details] [diff] [review]
Part 2 - Add some logging code for ClientTiledThebesLayer

Review of attachment 8436535 [details] [diff] [review]:
-----------------------------------------------------------------

I'm not sure I'm the right person to review this, but it looks sound to me.

::: gfx/layers/TiledLayerBuffer.h
@@ +31,5 @@
> +// of a .cpp file), and then run with NSPR_LOG_MODULES=tiling:5
> +// in your environment at runtime.
> +#ifdef PR_LOGGING
> +#  define TILING_PRLOG(_args) PR_LOG(gTilingLog, PR_LOG_DEBUG, _args)
> +#  define TILING_PRLOG_OBJ(_args, obj) \

I'm not keen on the whole tmpstr thing here, but I assume this is what we do elsewhere?

::: gfx/layers/client/ClientTiledThebesLayer.cpp
@@ +252,5 @@
>  
>    // Calculate everything we need to perform the paint.
>    BeginPaint();
>    if (mPaintData.mPaintFinished) {
> +    TILING_PRLOG(("TILING 0x%p: Paint finished\n", this));

Would it make more sense to put this where mPaintFinished gets set? I'm wondering if this is hit if there are no repeated transactions.
Attachment #8436535 - Flags: review?(chrislord.net) → review+
Comment on attachment 8436537 [details] [diff] [review]
Part 4 - The fix (v2)

Review of attachment 8436537 [details] [diff] [review]:
-----------------------------------------------------------------

Again, looks sound to me, but I'd wait for Botond's review too, and I'd like the comment addressed.

::: gfx/layers/client/ClientTiledThebesLayer.cpp
@@ +136,5 @@
>  
>    mPaintData.mTransformDisplayPortToLayer = transformToDisplayPort.Inverse();
>  
> +  CSSToParentLayerScale zoomToParent = displayportMetrics.mDevPixelsPerCSSPixel
> +                                     * displayportMetrics.GetParentResolution();

Can we have a comment here as to why GetZoomToParent isn't used? I can see someone just removing this thinking it's an optimisation/better for readability (or is it a mistake?).
Attachment #8436537 - Flags: review?(chrislord.net) → review+
(In reply to Chris Lord [:cwiiis] from comment #30)
> I'm not keen on the whole tmpstr thing here, but I assume this is what we do
> elsewhere?

Not really. I adapted the MOZ_LAYERS_LOG stuff in LayersTypes.h and introduced tmpStr to make the logging callsites cleaner. It's a little ugly I admit but I couldn't think of (and didn't try too hard to find) a better way.

> >    if (mPaintData.mPaintFinished) {
> > +    TILING_PRLOG(("TILING 0x%p: Paint finished\n", this));
> 
> Would it make more sense to put this where mPaintFinished gets set? I'm
> wondering if this is hit if there are no repeated transactions.

Yeah I can do that. When I added that log I was still trying to wrap my head around the flow in this function. I'll probably refactor this function at some point to be more like a state machine so it's clearer to see what's going on.

(In reply to Chris Lord [:cwiiis] from comment #31)
> > +  CSSToParentLayerScale zoomToParent = displayportMetrics.mDevPixelsPerCSSPixel
> > +                                     * displayportMetrics.GetParentResolution();
> 
> Can we have a comment here as to why GetZoomToParent isn't used? I can see
> someone just removing this thinking it's an optimisation/better for
> readability (or is it a mistake?).

Yeah I'll add a comment. GetZoomToParent is wrong to use here because on the client side the mTransformScale component of the FrameMetrics is not going to be set. This means the nontransient async transform is ignored, and throws off the calculations.
Comment on attachment 8436537 [details] [diff] [review]
Part 4 - The fix (v2)

Review of attachment 8436537 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/layers/client/ClientTiledThebesLayer.cpp
@@ -134,5 @@
>    gfx3DMatrix transformToDisplayPort =
>      GetTransformToAncestorsParentLayer(this, displayPortAncestor);
> -  transformToDisplayPort.ScalePost(scrollMetrics.mCumulativeResolution.scale,
> -                                   scrollMetrics.mCumulativeResolution.scale,
> -                                   1.f);

I think this change would more naturally go into Part 3. (I spent a lot of time staring at Part 3 wondering why it's OK to say that this transform starts from Layer space, before I saw the removal of this scale here.)

@@ +136,5 @@
>  
>    mPaintData.mTransformDisplayPortToLayer = transformToDisplayPort.Inverse();
>  
> +  CSSToParentLayerScale zoomToParent = displayportMetrics.mDevPixelsPerCSSPixel
> +                                     * displayportMetrics.GetParentResolution();

While GetZoomToParent() and mDevPixelsPerCSSPixel * GetParentResolution() are both CSSToParentLayerScales, they are actually different quantities:

 - GetZoomToParent() goes from the layer's own CSS space to
   the parent layer's Layer space.

 - mDevPixelsPerCSSPixel * GetParentResolution() goes from
   the parent layer's CSS space to the parent layer's Layer
   space

(This can be seen from the diagram at https://bug935219.bugzilla.mozilla.org/attachment.cgi?id=8380975).

I believe mCriticalDisplayPort is in the layer's own CSS space, so based on the usage here, it sounds like we want GetZoomToParent().

Moreover:

> GetZoomToParent is wrong to use here because on the client 
> side the mTransformScale component of the FrameMetrics is 
> not going to be set. This means the nontransient async 
> transform is ignored, and throws off the calculations. 

mTransformScale contains not only the scale of the nontransient async transform, but also the scale of the CSS transform. These two scales typically cancel each other out, yielding an mTransformScale of 1.

Meanwhile, (mDevPixelsPerCSSPixel * GetParentResolution()) is not just mTransformScale away from GetZoomToParent(): it's also missing the layer's own resolution. 

So, I'm surprised that this value works, and I'd like to better understand why.

@@ -166,5 @@
>    gfx3DMatrix transformToCompBounds =
>      GetTransformToAncestorsParentLayer(this, scrollAncestor);
> -  mPaintData.mCompositionBounds = TransformTo<LayerPixel>(
> -    transformToCompBounds.Inverse(),
> -    scrollMetrics.mCompositionBounds / scrollMetrics.GetParentResolution());

Is the removal of the division by GetParentResolution() related to other changes in this patch, or is that something that was just plain wrong before and now we're fixing it?
(In reply to Botond Ballo [:botond] from comment #33)
> I think this change would more naturally go into Part 3. (I spent a lot of
> time staring at Part 3 wondering why it's OK to say that this transform
> starts from Layer space, before I saw the removal of this scale here.)

I thought it would simpler to split the functional and non-functional changes into separate patches, specially because after changing the types of things, part 4 should "just make sense" in terms of aligning values to their expected coordinate systems. Technically all of part 4 should be rolled into part 3 otherwise.

> @@ +136,5 @@
> While GetZoomToParent() and mDevPixelsPerCSSPixel * GetParentResolution()
> are both CSSToParentLayerScales, they are actually different quantities:

I spent some more time thinking about this, and yeah, my patch is wrong (even though it does work). The fundamental problem I think is that GetTransformToAncestorsParentLayer doesn't do what it says it does. It multiplies all the layer transforms up to the Ancestor's parent layer, but it ignores the LN transforms which exist only in the compositor. That is, multiplying GetTransform()s together will gather all the LC..PC terms but not the LN..PN terms, and so it's not an accurate representation of the final transformation the layer goes through in the compositor. Using GetZoomToParent() however *does* include the LN terms (implicitly, because they cancel out with the LC terms as you said). This is the mismatch that causes the problem.

Because the "ancestor" in this case is the closest ancestor with a displayport, I think there is only one layer which actually has a nontransient async transform, and that is the ancestor/displayport layer itself. The nontransient async transform on that layer will be exactly the mResolution on that layer. This is why using mDevPixelsPerCSSPixel * GetParentResolution() (which excludes the mResolution) makes it match what GetTransformToAncestorsParentLayer does, and makes the patch work.

However, an alternative (and clearer) fix might be to modify GetTransformToAncestorsParentLayer to be GetTransformToAncestorsLayer - that is, instead of going up to the ParentLayer units for the ancestor, just stop at the Layer units. I don't think it's necessary to go to the ParentLayer units at all, because the displayport layer is nearest common ancestor layer for the display port (trivially) and the ClientTiledThebesLayer. Therefore using the displayport's Layer units should give us sufficient information for transforming stuff. Mathematically, this will exclude both the PN and PC terms from the GetTransformToAncestorsLayer, and I would replace the GetZoomToParent() call with LayersPixelsPerCSSPixel().

I'll try this out.

> @@ -166,5 @@
> >    gfx3DMatrix transformToCompBounds =
> >      GetTransformToAncestorsParentLayer(this, scrollAncestor);
> > -  mPaintData.mCompositionBounds = TransformTo<LayerPixel>(
> > -    transformToCompBounds.Inverse(),
> > -    scrollMetrics.mCompositionBounds / scrollMetrics.GetParentResolution());
> 
> Is the removal of the division by GetParentResolution() related to other
> changes in this patch, or is that something that was just plain wrong before
> and now we're fixing it?

I didn't actually check if this change by itself has a noticeable impact, but it looked wrong according to my mental model of how this should work and so I fixed it. :)
Actually because part of the transformation involves the displayportMetrics.mCompositionBounds which is in ParentLayer coordinates it might be simpler to add the missing LN transforms into GetTransformToAncestorsParentLayer using the mResolution on the FrameMetricses.
Attachment #8436537 - Attachment is obsolete: true
Attachment #8436537 - Flags: review?(botond)
Attachment #8436939 - Flags: review?(botond)
Comment on attachment 8436939 [details] [diff] [review]
Part 4 - The fix (v3)

Review of attachment 8436939 [details] [diff] [review]:
-----------------------------------------------------------------

> I thought it would simpler to split the functional and non-functional
> changes into separate patches, specially because after changing the types of
> things, part 4 should "just make sense" in terms of aligning values to their
> expected coordinate systems. Technically all of part 4 should be rolled into
> part 3 otherwise.

Fair enough :)

::: gfx/layers/client/ClientTiledThebesLayer.cpp
@@ +144,5 @@
>  
>    // Compute the critical display port that applies to this layer in the
>    // LayoutDevice space of this layer.
>    ParentLayerRect criticalDisplayPort =
>      (displayportMetrics.mCriticalDisplayPort * displayportMetrics.GetZoomToParent())

Having identified that using GetZoomToParent() on the client side has the conceptual problem that mTransformScale isn't actually set, and that it only happens to work because we currently expect mTransformScale to be 1 (this might change when we add proper support for CSS transforms), let's add a comment to that effect in this function.
Attachment #8436939 - Flags: review?(botond) → review+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: