Investigate using 24-bit or 32-bit color

RESOLVED FIXED in Firefox 25

Status

()

defect
RESOLVED FIXED
7 years ago
3 months ago

People

(Reporter: luke, Assigned: blassey)

Tracking

(Depends on 3 bugs, {productwanted})

unspecified
Firefox 25
ARM
Android
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox23 affected, firefox24 affected)

Details

Attachments

(9 attachments, 14 obsolete attachments)

399.94 KB, image/png
Details
19.63 KB, patch
cwiiis
: review+
Details | Diff | Splinter Review
4.07 KB, patch
cwiiis
: review+
Details | Diff | Splinter Review
16.52 KB, patch
cwiiis
: review+
Details | Diff | Splinter Review
1.48 KB, patch
cwiiis
: review+
Details | Diff | Splinter Review
7.56 KB, patch
cwiiis
: review+
Details | Diff | Splinter Review
3.14 KB, patch
cwiiis
: review+
Details | Diff | Splinter Review
5.26 KB, patch
cwiiis
: review+
Details | Diff | Splinter Review
2.62 KB, patch
blassey
: review+
Details | Diff | Splinter Review
I first noticed then visiting marketplace.mozilla.org on FF mobile: the gradients all have these terrible lines that make the whole thing look rather cheap.  I think the problem is most easily seen with the little interactive example on this page:
  http://www.cambridgeincolour.com/tutorials/bit-depth.htm
The stock browser is fairly comparable to the desktop at 24-bit: no lines.  On FF mobile, 24-bit looks more like 16-bit.
They are.
Oh, ouch.  How critical is this to (I assume) performance?  I've been noticing (what I thought were) cheap looking sites (usually when they use a gradual gradient like marketplace) for a while now.  I always assumed the site was doing something silly because it looked so bad.
Improving gradients is bug 795247
(In reply to Luke Wagner [:luke] from comment #2)
> Oh, ouch.  How critical is this to (I assume) performance?

Quite. I don't have exact numbers, but I have some memory of 10-30% difference.
That's unfortunate.  Does the stock browser just eat the perf hit or are they doing any tricks that we aren't?

Will the gradients improvement take away the contour lines?
(In reply to Joe Drew (:JOEDREW! \o/) from comment #4)
> (In reply to Luke Wagner [:luke] from comment #2)
> > Oh, ouch.  How critical is this to (I assume) performance?
> 
> Quite. I don't have exact numbers, but I have some memory of 10-30%
> difference.

I'll revisit these measurement for 32-bit content with progressive drawing. But even if the drawing performance is much worse that's means we can only cache half the tiles using the same amount of memory.
is not*
This is a somewhat common request in the forums. There is a certain class of users where image fidelity is important.
tracking-fennec: --- → ?
Summary: images look like they have 16-bit color depth on FF mobile → Investigate using 24-bit or 32-bit color
Just following planet mozilla and hacker news feeds in my RSS reader on Fennec, I find a few images a day that have visible artifacts.  It really cheapens the experience (maybe because it awakens old memories of when Windows 95 didn't recognize my graphics adapter and everything was 16-bit and slow).
Bug 821450 would mitigate this some.
Depends on: 821450
Keywords: qawanted
Keywords: qawanted
There's some more info in bug 688812 and its dependency tree.  I'm not sure which of these bugs if any should be resolved duplicate.
Depends on: 688812
Duplicate of this bug: 811675
tracking-fennec: ? → -
Adding myself to this bug to track as this will help us quite a bit with our site designs. 

We're using CSS more and more to avoid raster graphics, and box-shadow and gradient tags get hit hard visually at 16-bit thus limiting our designs and scaling back designing with code.

Would love to see this happen.

---

Reference graphic: http://cl.ly/image/2a0l2q17473h/o
Duplicate of this bug: 856411
Friendly ping to try and push this forward -- we fall pretty dramatically behind Chrome, especially when we get into displaying gradients: http://cl.ly/image/0b3l353v3m2T
Can somebody take the patch from bug 821450 and see what the results look like for this case?
Here is the original logo image you can test with http://cl.ly/image/2U1H1E2a2X0M/o
I have 32-bit colour enabled locally and on a Galaxy Nexus, the performance isn't noticeably different but the image quality is very noticeably better. Running talos for numbers, but I think if a device has a 32-bit screen, we should render in 32-bit (with the possible exception of armv6 devices/low memory devices).
Also, we should dither on 16-bit too, I don't think this is an either-or situation.
I don't plan on pushing this until we have consensus, but here's a patch that does what I think we ought to be doing.
Assignee: nobody → chrislord.net
Status: NEW → ASSIGNED
Attachment #743774 - Flags: review?(bugmail.mozilla)
Comment on attachment 743774 [details] [diff] [review]
Use 24-bit colour on 24-bit screens

Review of attachment 743774 [details] [diff] [review]:
-----------------------------------------------------------------

::: widget/android/nsScreenManagerAndroid.cpp
@@ +46,5 @@
>  nsScreenAndroid::GetPixelDepth(int32_t *aPixelDepth)
>  {
>      // XXX do we need to lie here about 16bpp?  Or
>      // should we actually check and return the right thing?
> +    *aPixelDepth = 24;

Er, pretend this calls AndroidBridge::GetScreenDepth :)
(In reply to Chris Lord [:cwiiis] from comment #21)
> Created attachment 743774 [details] [diff] [review]
> Use 24-bit colour on 24-bit screens
> 
> I don't plan on pushing this until we have consensus, but here's a patch
> that does what I think we ought to be doing.

I think we should push this behind a preference right away. Low code size, makes it easier to test the performance difference and it lets power users that really care about depth opt into it while we decide if it's a good tradeoff for defaults.

I though in some cases we preferred using a 565 layer. I don't see your patch changing that.
Comment on attachment 743774 [details] [diff] [review]
Use 24-bit colour on 24-bit screens

Review of attachment 743774 [details] [diff] [review]:
-----------------------------------------------------------------

Looks fine to me. You probably also want to change the tab thumbnails to not be 16bpp all the time. This is hard-coded in AndroidBridge::CaptureThumbnail (the stride declaration as well as the gfxImageSurface constructor) and ThumbnailHelper.java (capacity declaration). That can be done as a second patch on this bug since it's independent of drawing the main content area in 24bpp.

::: mobile/android/base/GeckoAppShell.java
@@ +1290,5 @@
> +    /**
> +     * Returns the colour depth of the default screen. This will either be
> +     * 24 or 16.
> +     */
> +    public static int getScreenDepth() {

Add a note to the comment that this is called from JNI.

::: widget/android/nsScreenManagerAndroid.cpp
@@ +46,5 @@
>  nsScreenAndroid::GetPixelDepth(int32_t *aPixelDepth)
>  {
>      // XXX do we need to lie here about 16bpp?  Or
>      // should we actually check and return the right thing?
> +    *aPixelDepth = 24;

Also remove the comment here.
Attachment #743774 - Flags: review?(bugmail.mozilla) → review+
Oh, there's also some 565ness hard-coded in nsWindow::OnDraw that needs fixing.
If this is easy to put behind a pref, let's do it. That said, I'd be OK landing this on Nightly to get feedback without a pref. As long as we can put it behind a build flag and keep it from making it to Beta or Release. At least until we are happy with the way it's working.
Talos for 16-bit: https://tbpl.mozilla.org/?tree=Try&rev=d1818ff538db
Talos for 32-bit: https://tbpl.mozilla.org/?tree=Try&rev=67da640de01f

Either there's no appreciable difference in performance or our tests aren't stressful enough. Intuition says that it's a combination of the two; any site that stresses us in terms of fps will probably be due to layer complexity, and in that situation, you'll almost certainly have alpha layers, which are 32-bit regardless of the final compositing depth.

I'd say another revision of the patch with the fix-ups that kats mentions and we push it and see if anyone notices anything bad :)
Another note; there is already a force-16-bit pref, but we don't respect it Java-side, which is where the EGL config is created (and with this patch, where the depth is requested from).

I'm not certain about the startup ordering of things to know if a pref read can/will return before surface creation... It'd be a good idea to respect this pref, even if it isn't on by default in the initial push.

I'll look into it alongside the other fix-ups.
(In reply to Chris Lord [:cwiiis] from comment #27)
> Talos for 16-bit: https://tbpl.mozilla.org/?tree=Try&rev=d1818ff538db
> Talos for 32-bit: https://tbpl.mozilla.org/?tree=Try&rev=67da640de01f
> 
> Either there's no appreciable difference in performance or our tests aren't
> stressful enough. Intuition says that it's a combination of the two; any
> site that stresses us in terms of fps will probably be due to layer
> complexity, and in that situation, you'll almost certainly have alpha
> layers, which are 32-bit regardless of the final compositing depth.

We should also run those builds through the eideticker tests.
This catches the cases you mentioned, but I think warrants re-reviewing. I'll address respecting the force-16-bit pref in a separate patch.
Attachment #743774 - Attachment is obsolete: true
Attachment #744103 - Flags: review?(bugmail.mozilla)
Actually it looks like we regressed drawing tiles with 565:
http://mxr.mozilla.org/mozilla-central/source/gfx/thebes/gfxAndroidPlatform.cpp#102

Before tiles would just default to 565 but now they honor the screen dept since the layers refactoring. We should compare against actually using 565 tiles.
(In reply to Benoit Girard (:BenWa) from comment #31)
> Actually it looks like we regressed drawing tiles with 565:
> http://mxr.mozilla.org/mozilla-central/source/gfx/thebes/gfxAndroidPlatform.
> cpp#102
> 
> Before tiles would just default to 565 but now they honor the screen dept
> since the layers refactoring. We should compare against actually using 565
> tiles.

Screen depth is hard-coded to 16-bit before this patch, so tiles would still be 565. (see widget/android/nsScreenManager.cpp::GetScreenDepth (or something like that))
Small fix for an apparently unused code-path(?)
Attachment #744103 - Attachment is obsolete: true
Attachment #744103 - Flags: review?(bugmail.mozilla)
Attachment #744109 - Flags: review?(bugmail.mozilla)
Comment on attachment 744103 [details] [diff] [review]
Use 24-bit colour on 24-bit screens

Review of attachment 744103 [details] [diff] [review]:
-----------------------------------------------------------------

Found a few more things this time through :)

Also, do you know if this will affect GL readback at all? I imagine that if we're in 32-bit mode then our pixel checks in the robocop tests will be more accurate and we might be able to reduce the fuzz amount in FennecMochitestAssert::checkPixel. We probably don't actually want to do that unless we know for sure we're going to always be 32-bit, which is probably not anytime soon.

::: mobile/android/base/Tab.java
@@ +182,5 @@
>          }
>  
>          if (mThumbnailBitmap == null) {
> +            Bitmap.Config config = (GeckoAppShell.getScreenDepth() == 16) ?
> +                Bitmap.Config.RGB_565 : Bitmap.Config.ARGB_8888;

For consistency with other pieces of code I would rather invert this check. That is, check for 24 and use ARGB_8888 in that case, and use 565 for everything else. That way if GeckoAppShell.getScreenDepth returns $random we'll consistently default to 16-bit in all the pieces of code. See for example GLController.chooseConfig and nsWindow::OnDraw where you properly default to 16-bit.

::: mobile/android/base/ThumbnailHelper.java
@@ +104,5 @@
>  
>          mWidth &= ~0x1; // Ensure the width is always an even number (bug 776906)
>          mHeight = Math.round(mWidth * THUMBNAIL_ASPECT_RATIO);
>  
> +        int pixelSize = (GeckoAppShell.getScreenDepth() == 16) ? 2 : 4;

Ditto, check for 24 and default to 16.

@@ +188,5 @@
> +        } else {
> +            // Unfortunately, Gecko's 32-bit format is BGRA and Android's is
> +            // ARGB, so we need to manually swizzle.
> +            for (int x = 0; x < mWidth; x++) {
> +                for (int y = 0; y < mHeight; y++) {

Make the mHeight loop the outer loop. Iterating over y in the inner loop is generally much worse for data read caching in the hardware.

::: mobile/android/base/widget/TopSitesView.java
@@ +315,5 @@
>  
> +                Bitmap thumbnail = null;
> +                try {
> +                    thumbnail = BitmapUtils.decodeByteArray(b);
> +                } catch(IllegalArgumentException e) {

nit: space between catch and "("

::: widget/android/AndroidBridge.cpp
@@ +2665,5 @@
>               nsPresContext::CSSPixelsToAppUnits(srcW / scale),
>               nsPresContext::CSSPixelsToAppUnits(srcH / scale));
>  
> +    bool is16bit = (GetScreenDepth() == 16);
> +    uint32_t stride = bufW * (is16bit ? 2 : 4);

Invert this check as well; make it is24bit

::: widget/android/nsWindow.cpp
@@ +1063,5 @@
>  
>      layers::renderTraceEventStart("Get surface", "424545");
>      static unsigned char bits2[32 * 32 * 2];
>      nsRefPtr<gfxImageSurface> targetSurface =
>          new gfxImageSurface(bits2, gfxIntSize(32, 32), 32 * 2,

I think the size of bits2 and the stride passed to gfxImageSurface here also needs to be updated.
Attachment #744103 - Attachment is obsolete: false
Mid-aired with your obsoletion... please post a new patch incorporating all the fixes.
(In reply to Chris Lord [:cwiiis] from comment #32)
> (In reply to Benoit Girard (:BenWa) from comment #31)
> > Actually it looks like we regressed drawing tiles with 565:
> > http://mxr.mozilla.org/mozilla-central/source/gfx/thebes/gfxAndroidPlatform.
> > cpp#102
> > 
> > Before tiles would just default to 565 but now they honor the screen dept
> > since the layers refactoring. We should compare against actually using 565
> > tiles.
> 
> Screen depth is hard-coded to 16-bit before this patch, so tiles would still
> be 565. (see widget/android/nsScreenManager.cpp::GetScreenDepth (or
> something like that))

Where do you see this? From what I see tiles are now calling this:
http://mxr.mozilla.org/mozilla-central/source/gfx/thebes/gfxPlatform.cpp#1733
from:
http://mxr.mozilla.org/mozilla-central/source/gfx/layers/client/TiledContentClient.cpp#153

which is hardcoded to 24-bits.
(In reply to Benoit Girard (:BenWa) from comment #36)
> (In reply to Chris Lord [:cwiiis] from comment #32)
> > (In reply to Benoit Girard (:BenWa) from comment #31)
> > > Actually it looks like we regressed drawing tiles with 565:
> > > http://mxr.mozilla.org/mozilla-central/source/gfx/thebes/gfxAndroidPlatform.
> > > cpp#102
> > > 
> > > Before tiles would just default to 565 but now they honor the screen dept
> > > since the layers refactoring. We should compare against actually using 565
> > > tiles.
> > 
> > Screen depth is hard-coded to 16-bit before this patch, so tiles would still
> > be 565. (see widget/android/nsScreenManager.cpp::GetScreenDepth (or
> > something like that))
> 
> Where do you see this? From what I see tiles are now calling this:
> http://mxr.mozilla.org/mozilla-central/source/gfx/thebes/gfxPlatform.cpp#1733
> from:
> http://mxr.mozilla.org/mozilla-central/source/gfx/layers/client/
> TiledContentClient.cpp#153
> 
> which is hardcoded to 24-bits.

Sorry, yes, I missed that this uses GetOffscreenFormat... Should we override this in gfxPlatformAndroid to return the screen depth? That seems sensible to me.
(In reply to Chris Lord [:cwiiis] from comment #37)
> (In reply to Benoit Girard (:BenWa) from comment #36)
> > (In reply to Chris Lord [:cwiiis] from comment #32)
> > > (In reply to Benoit Girard (:BenWa) from comment #31)
> > Where do you see this? From what I see tiles are now calling this:
> > http://mxr.mozilla.org/mozilla-central/source/gfx/thebes/gfxPlatform.cpp#1733
> > from:
> > http://mxr.mozilla.org/mozilla-central/source/gfx/layers/client/
> > TiledContentClient.cpp#153
> > 
> > which is hardcoded to 24-bits.
> 
> Sorry, yes, I missed that this uses GetOffscreenFormat... Should we override
> this in gfxPlatformAndroid to return the screen depth? That seems sensible
> to me.

In fact, looks like this is meant to be the case - gfxPlatformAndroid stores mOffscreenFormat, but that's unused currently. Will attach a patch.
(In reply to Chris Lord [:cwiiis] from comment #38)
> (In reply to Chris Lord [:cwiiis] from comment #37)
> > (In reply to Benoit Girard (:BenWa) from comment #36)
> > > (In reply to Chris Lord [:cwiiis] from comment #32)
> > > > (In reply to Benoit Girard (:BenWa) from comment #31)
> > > Where do you see this? From what I see tiles are now calling this:
> > > http://mxr.mozilla.org/mozilla-central/source/gfx/thebes/gfxPlatform.cpp#1733
> > > from:
> > > http://mxr.mozilla.org/mozilla-central/source/gfx/layers/client/
> > > TiledContentClient.cpp#153
> > > 
> > > which is hardcoded to 24-bits.
> > 
> > Sorry, yes, I missed that this uses GetOffscreenFormat... Should we override
> > this in gfxPlatformAndroid to return the screen depth? That seems sensible
> > to me.
> 
> In fact, looks like this is meant to be the case - gfxPlatformAndroid stores
> mOffscreenFormat, but that's unused currently. Will attach a patch.

And ignore me again - I hadn't read the header. We are using 16-bit layers as gfxAndroidPlatform overrides GetOffscreenFormat and returns mOffscreenFormat, which is either 565 or 888, depending on the depth of the primary screen (which is hard-coded to 16 currently).
(In reply to Chris Lord [:cwiiis] from comment #27)
> Talos for 16-bit: https://tbpl.mozilla.org/?tree=Try&rev=d1818ff538db
It looks like this is actually 24-bit...

> Talos for 32-bit: https://tbpl.mozilla.org/?tree=Try&rev=67da640de01f
...and this is actually mozilla-central with no changes (16-bit).

The compare-talos link below shows some significant regressions, including about 400% regression in trobopan and 30-80% regression in tcheck2.  I triggered some more test runs on both the before and after changesets.  If you reload the link after those jobs finish then it should have more accurate stats:

http://perf.snarkfest.net/compare-talos/index.html?oldRevs=60e522be9d08&newRev=d1818ff538db&submit=true
wlach very kindly ran eideticker with and without the patch on a Samsung Galaxy Nexus (which has a 24-bit screen) on the taskjs test:

http://www.pastebin.mozilla.org/2366441

Similar to talos, no appreciable difference in performance. Things are looking good for this I think, given the UX needs and visual fidelity improvement.
(In reply to Matt Brubeck (:mbrubeck) from comment #40)
> (In reply to Chris Lord [:cwiiis] from comment #27)
> > Talos for 16-bit: https://tbpl.mozilla.org/?tree=Try&rev=d1818ff538db
> It looks like this is actually 24-bit...
> 
> > Talos for 32-bit: https://tbpl.mozilla.org/?tree=Try&rev=67da640de01f
> ...and this is actually mozilla-central with no changes (16-bit).
> 
> The compare-talos link below shows some significant regressions, including
> about 400% regression in trobopan and 30-80% regression in tcheck2.  I
> triggered some more test runs on both the before and after changesets.  If
> you reload the link after those jobs finish then it should have more
> accurate stats:
> 
> http://perf.snarkfest.net/compare-talos/index.
> html?oldRevs=60e522be9d08&newRev=d1818ff538db&submit=true

Ah, I must've missed the pan regression... The difference in tcheck2 is tiny, the 80% is a misnomer (I bet the variance between builds is near if not greater than the lower end of those bounds) - but that trobopan doesn't look great...

I think this could do with a better look at what trobopan actually measures, and what such a drop actually means.
Make the suggested changes.
Attachment #744103 - Attachment is obsolete: true
Attachment #744109 - Attachment is obsolete: true
Attachment #744109 - Flags: review?(bugmail.mozilla)
Attachment #744584 - Flags: review?(bugmail.mozilla)
Comment on attachment 744584 [details] [diff] [review]
Part 1 - Use 24-bit colour on 24-bit screens

Review of attachment 744584 [details] [diff] [review]:
-----------------------------------------------------------------

I assume the bits2 allocation can't use the actual size because it's a stack allocation? I wonder if there's some way to make sure that won't blow the stack, since it is pretty large. Looks fine otherwise.
Attachment #744584 - Flags: review?(bugmail.mozilla) → review+
This reads the pref on startup and overrides the detected screen depth. I've also added the pref to mobile.js, but disabled (so 24-bit colour is enabled).
Attachment #744625 - Flags: review?(bugmail.mozilla)
Attachment #744625 - Flags: review?(bugmail.mozilla) → review+
Funny enough, the bustage may not have been yours. The Android reftest failures definitely were, though.
https://tbpl.mozilla.org/?tree=Mozilla-Inbound&rev=0cb5418906cf
Attachment #744625 - Attachment description: Respect the existing force-16bit pref → Part 2 - Respect the existing force-16bit pref
I assume we're not going to ask too many questions about tests that used to fail and now pass :)
Attachment #752207 - Flags: review?(blassey.bugs)
I don't really know anything about icc to know why this fails in 24-bit and not 16-bit (maybe the pass is just coincidental?) - History only goes as far back as incriminating Jeff on this test, feel free to pass this along if you aren't appropriate.

reftest for reference (it's R1): https://tbpl.mozilla.org/?tree=Try&rev=c18bdec733a8
Attachment #752208 - Flags: review?(jmuizelaar)
Some tests now fail because 24-bit can express more colours - these are all small differences and don't worry me.
Attachment #752209 - Flags: review?(blassey.bugs)
Up to part 5, we're just left with some SVG failures (which I want to run by :jwatt) and a couple of webgl canvas failures that I want to look at a bit more carefully (there might be a real bug there).

https://tbpl.mozilla.org/?tree=Try&rev=a5c0927bf285
Attachment #752207 - Flags: review?(blassey.bugs) → review+
Attachment #752209 - Flags: review?(blassey.bugs) → review+
Comment on attachment 744625 [details] [diff] [review]
Part 2 - Respect the existing force-16bit pref

Review of attachment 744625 [details] [diff] [review]:
-----------------------------------------------------------------

::: mobile/android/base/GeckoAppShell.java
@@ +1310,5 @@
>  
> +    public static synchronized void setScreenDepthOverride(int aScreenDepth) {
> +        if (sScreenDepth != 0) {
> +            Log.e(LOGTAG, "Tried to override screen depth after it's already been set");
> +            return;

For the record I feel this is going to cause problems later by introducing a race condition. Gecko may take a bit of time to respond to the pref request and if we have any Java code running on the UI thread that calls getScreenDepth() then it may well race with this code. I don't really have a good solution to this though, and at least the log message will point out the problem clearly enough.
Comment on attachment 752208 [details] [diff] [review]
Part 4 - Mark test jpg-srgb-icc as failing on Android

hmm, looks like this is an intermittent - if you could comment on how you feel about this being random-if(Android) too, that'd be good, possibly save an extra review round.
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #53)
> Comment on attachment 744625 [details] [diff] [review]
> Part 2 - Respect the existing force-16bit pref
> 
> Review of attachment 744625 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: mobile/android/base/GeckoAppShell.java
> @@ +1310,5 @@
> >  
> > +    public static synchronized void setScreenDepthOverride(int aScreenDepth) {
> > +        if (sScreenDepth != 0) {
> > +            Log.e(LOGTAG, "Tried to override screen depth after it's already been set");
> > +            return;
> 
> For the record I feel this is going to cause problems later by introducing a
> race condition. Gecko may take a bit of time to respond to the pref request
> and if we have any Java code running on the UI thread that calls
> getScreenDepth() then it may well race with this code. I don't really have a
> good solution to this though, and at least the log message will point out
> the problem clearly enough.

I feel this too... I think it doesn't at the moment because we call for the pref before the surface initialisation happens, and given that both are synchronous on the Gecko thread (I think), the pref should come back first. I'm not sure about this though. Thankfully, this is only really a problem if you set a pref that defaults to off, so I don't feel too bad about it right now.
Reading bug 685516 it makes sense that this is intermittent - I guess switching to 24-bit colour has changed the timings enough that this occasionally fails(?)

Reference: https://tbpl.mozilla.org/?tree=Try&rev=a5c0927bf285
Attachment #752208 - Attachment is obsolete: true
Attachment #752208 - Flags: review?(jmuizelaar)
Attachment #752648 - Flags: review?(jmuizelaar)
Given what the WebGL tests do, it appears that the read-back is just broken.
(In reply to Chris Lord [:cwiiis] from comment #57)
> Given what the WebGL tests do, it appears that the read-back is just broken.

Ah, also note these are marked to fail in B2G, which uses 24-bit colour... If there isn't an obvious fix for this, I'll attach a patch doing the same.
(In reply to Chris Lord [:cwiiis] from comment #58)
> (In reply to Chris Lord [:cwiiis] from comment #57)
> > Given what the WebGL tests do, it appears that the read-back is just broken.
> 
> Ah, also note these are marked to fail in B2G, which uses 24-bit colour...
> If there isn't an obvious fix for this, I'll attach a patch doing the same.

It looks like this might be another case of async image loading not completing before screenshotting... The tests passes fine on the GN, and the analyser shows the correct rendering on the test part, and is incorrect on the reference (which compares with an image).

I think this test could be fixed by crafting a reference page that doesn't use an image, so I'll do that - I don't know if that's why it was marked as failing on b2g...
Comment on attachment 752648 [details] [diff] [review]
Part 4 - Mark test jpg-srgb-icc as random on Android

Review of attachment 752648 [details] [diff] [review]:
-----------------------------------------------------------------

This at least needs a comment pointing at bug about why it's random.
Attachment #752648 - Flags: review?(jmuizelaar) → review-
Added comment.
Attachment #752648 - Attachment is obsolete: true
Attachment #753233 - Flags: review?(jmuizelaar)
Another image decoding thing it seems... Replacing black.png with black.html fixes the intermittent failures and saves us some space.
Attachment #753234 - Flags: review?(jmuizelaar)
Some SVG tests that were (mostly) already fuzzed due to layer rendering incorrectly snapping to pixel boundaries for SVG rendering now require a little more fuzzing with the higher colour depth.
Attachment #753279 - Flags: review?(jwatt)
Attachment #753279 - Flags: review?(jwatt) → review+
Comment on attachment 753234 [details] [diff] [review]
Part 6 - Replace WebGL black.png with black.html

Review of attachment 753234 [details] [diff] [review]:
-----------------------------------------------------------------

This should also probably have a comment as to why it's not using wrapper like the rest of the tests.
Attachment #753234 - Flags: review?(jmuizelaar) → review+
Attachment #753233 - Flags: review?(jmuizelaar) → review+
Also produced unexpected passes in webgl-color-alpha-test.html?colorVal=0.0&alphaVal=1.0&nogl and webgl-color-alpha-test.html?colorVal=0.0&alphaVal=1.0 in b2g's reftest-5, dunno if that's an artifact of annotations, or reflecting that you were actually making changes affecting b2g.
Confirmed with blassey that it's ok to add the extra fuzzing necessary for this test to pass, will push again when the tree's open.
And some follow-up fuzzing (part of which got lost in a rebase..)

https://hg.mozilla.org/integration/mozilla-inbound/rev/f0311781c218
This made Tpan and Tcheckerboard worse, and got backed out as a result:

https://hg.mozilla.org/integration/mozilla-inbound/rev/281dc9793a73
To clarify, the justifications for the performance regression that this causes weren't clear to anyone in the Toronto office.
We were also still seeing Android reftest failures like the below:
https://tbpl.mozilla.org/php/getParsedLog.php?id=23546549&tree=Mozilla-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=23547542&tree=Mozilla-Inbound

Please run this through Try with a lot of reftest retriggers to make sure you've got the intermittent failures found and dealt with before relanding.
To make this more actionable, this should have a clear summary in the bug/commit message of the performance regressions that it causes and why we believe they are acceptable. It's worth noting that the performance impact can be different on different devices as it's doubling the amount of memory that we texture upload. This can be especially bad on some devices where we end up hitting the readback path on texture upload.

One thing that would make this change more palatable would be to only use 24bit color when using SurfaceTexture for uploads as that would make the upload performance much easier to reason about.
(In reply to Jeff Muizelaar [:jrmuizel] from comment #77)
> To make this more actionable, this should have a clear summary in the
> bug/commit message of the performance regressions that it causes and why we
> believe they are acceptable. It's worth noting that the performance impact
> can be different on different devices as it's doubling the amount of memory
> that we texture upload. This can be especially bad on some devices where we
> end up hitting the readback path on texture upload.

The rationale is that the image quality just isn't good enough at 16-bit colour - We get terrible banding artifacts - dithering would make this better, but it still isn't ideal. Similar to when we enabled linear filtering on zoom even though it performed badly on Tegra 2, I think this is a situation where the image quality is unacceptable. Especially considering every other Android browser uses 24-bit colour. We've discussed this above, I can add some info to the commit message.

> One thing that would make this change more palatable would be to only use
> 24bit color when using SurfaceTexture for uploads as that would make the
> upload performance much easier to reason about.

Re the texture upload, to say it's doubling is an over-simplification. For the basic one-opaque-layer page, it's doubling, but I would posit that we have bandwidth to spare in this case - the pages that have real difficulty are pages that have multiple content-alpha layers, and in this case, there's little difference.

All the same, do we have a SurfaceTexture upload path? I thought there was some reason that it can't work for us? Assuming that we get to use this path on any ics+ device, I'd be ok with compromising here.
Flags: needinfo?(snorp)
(In reply to Chris Lord [:cwiiis] from comment #78)
> 
> All the same, do we have a SurfaceTexture upload path? I thought there was
> some reason that it can't work for us? Assuming that we get to use this path
> on any ics+ device, I'd be ok with compromising here.

We do not have this today, but I believe we came to the conclusion in Taiwan last week that we probably want it. However, I don't know if this will really mitigate the performance concerns Jeff has. You're still pushing a bunch more pixels around (albeit less frequently, I hope). That being said, I really do think we need to just take the hit here and do what we can to optimize after that.
Flags: needinfo?(snorp)
Duplicate of this bug: 694828
Rebased - I've not added the comment about the perf regressions yet.
Attachment #744584 - Attachment is obsolete: true
Attachment #756655 - Flags: review+
Rebased.
Attachment #744625 - Attachment is obsolete: true
Attachment #756656 - Flags: review+
Rebased.
Attachment #752207 - Attachment is obsolete: true
Attachment #756657 - Flags: review+
Rebased.
Attachment #753233 - Attachment is obsolete: true
Attachment #756658 - Flags: review+
Rebased and fuzzed a couple of extra tests - verified it was ok with :blassey.
Attachment #752209 - Attachment is obsolete: true
Attachment #756659 - Flags: review+
Rebased and also marked b2g as passing (because it does, with this change).
Attachment #753234 - Attachment is obsolete: true
Attachment #756660 - Flags: review+
Rebased.
Attachment #753279 - Attachment is obsolete: true
Attachment #756662 - Flags: review+
I've attached the latest, rebased patches that result in everything green (https://tbpl.mozilla.org/?tree=Try&rev=973742b25d53)

Reassigning this to :blassey to find someone to take care of it while I'm on PTO. Personally, I think this is good to go, but I think :jrmuizel wanted more convincing that we're ok to take the perf hit.
Assignee: chrislord.net → blassey.bugs
I was looking at the areweslimyet numbers, and it looks like this patch increases resident memory usage by ~4.7 MiB on Startup +30s. Something else to justify if we're going to go ahead with these patches.
Karen, can you make a call here? We're trading slower rendering and more memory consumption for things to look prettier.
Flags: needinfo?(krudnitski)
Keywords: productwanted
Blocks: 845944
Is it possible to limit the patches with devices with higher amounts of RAM? I'm uncomfortable making any perceptible impact to performance, but aware that devices with loads of memory will have limited impact.

Which does mean finding a good threshold of RAM, of course, which would mean a few more tests on a range of devices (unfortunately).
Flags: needinfo?(krudnitski)
(In reply to Karen Rudnitski [:kar] from comment #91)
> Is it possible to limit the patches with devices with higher amounts of RAM?
> I'm uncomfortable making any perceptible impact to performance, but aware
> that devices with loads of memory will have limited impact.
> 
> Which does mean finding a good threshold of RAM, of course, which would mean
> a few more tests on a range of devices (unfortunately).

My gut feeling on this is that any device that has 768mb of ram or more should have this enabled by default - these are high-end phones, where the ram increase shouldn't really be felt, and the perceived lack of image quality would feel weird.

Certainly any device that's capable of running Chrome ought to have 32-bit colour by default, as that will be the comparison point and the image quality difference is noticeable.

Perhaps we should say >=honeycomb && >=768mb ram = 32-bit by default?
(In reply to Chris Lord [:cwiiis] from comment #92)
> 
> Perhaps we should say >=honeycomb && >=768mb ram = 32-bit by default?

Assuming it's really a 1GB phone (with some space reserved for GPU), this sounds reasonable to me as well.
blassey suggests using IsOnLowMemoryPlatform, though that's <384mb ram, which I think may perhaps be too little.

Just to explain my choice of honeycomb/768;

- Chrome requires ICS - I think we should use 32-bit colour anywhere we can be compared to Chrome, when possible
- Honeycomb is tablet-only, so we may as well extend to these platforms - banding looks *terrible* on larger screens, especially ones with slightly lower pixel density, which is more common for honeycomb
- I'd say 1gb, but I imagine there are plenty of actually 1gb devices that have some memory reserved for various things (either crappy vendor-specific code, or reserved for things like gpus and cameras), so 768 seems like a nice compromise. This should catch all high-end devices and most mid-range too


I'm happy to base this decision on memory alone, but perhaps 384mb is a little ambitious. Thoughts?
I agree that 384mb as a threshold seems pretty low for this feature. I agree the honeycomb/768 makes sense.
Jeff, what do you think of enabling this with a honeycomb && ram >= 768mb check? This will be a performance hit, but I think it's generally agreed that the image quality isn't good enough at 16bpp.

If it's so much of a hit that we're seriously concerned about panning perf again, we could take another run at optimising the display-port/cancelling/progressive-update region code, I think there's still room for improvement there.
Flags: needinfo?(jmuizelaar)
(In reply to Chris Lord [:cwiiis] from comment #94)
> - Chrome requires ICS

Chrome isn't the only competitor, though. What do stock, Opera (Chromium-based) and Dolphin do on Android 2.3?

Anyway, having proper colors even on devices that satisfy the test you suggest would be awesome compared to status quo.
(In reply to Henri Sivonen (:hsivonen) from comment #97)
> (In reply to Chris Lord [:cwiiis] from comment #94)
> > - Chrome requires ICS
> 
> Chrome isn't the only competitor, though. What do stock, Opera
> (Chromium-based) and Dolphin do on Android 2.3?
> 
> Anyway, having proper colors even on devices that satisfy the test you
> suggest would be awesome compared to status quo.

We spoke about this in this week's mobile meeting and I think we decided to relax the honeycomb requirement and just do it everywhere where memory is reported as >= 768mb. Just waiting on confirmation from :jrmuizel.
(In reply to Brad Lassey [:blassey] (use needinfo?) from comment #99)
> Created attachment 769277 [details] [diff] [review]
> patch to force 16bit on devices with lower that 768Mb of RAM

This sounds reasonable to me.
Flags: needinfo?(jmuizelaar)
Comment on attachment 769277 [details] [diff] [review]
patch to force 16bit on devices with lower that 768Mb of RAM

Review of attachment 769277 [details] [diff] [review]:
-----------------------------------------------------------------

::: mobile/android/base/GeckoApp.java
@@ +1155,5 @@
>          GeckoAppShell.registerGlobalExceptionHandler();
> +        ActivityManager am = (ActivityManager)getSystemService(Context.ACTIVITY_SERVICE);
> +        ActivityManager.MemoryInfo memInfo = new ActivityManager.MemoryInfo();
> +        am.getMemoryInfo(memInfo);
> +        Log.i("GeckoMem", "avail: " + memInfo.availMem);

Remove the changes in this file. Doesn't seem worth it to do all this just to log something that we're not even using.

::: mobile/android/base/GeckoAppShell.java
@@ +1339,5 @@
> +        try {
> +            BufferedReader fr = new BufferedReader(new FileReader("/proc/meminfo"));
> +            String line = fr.readLine();
> +            while (line != null && !line.startsWith("MemTotal"))
> +                line = fr.readLine();

nit: braces

@@ +1342,5 @@
> +            while (line != null && !line.startsWith("MemTotal"))
> +                line = fr.readLine();
> +            String[] tokens = line.split("\\s+");
> +            if (tokens.length >= 2 && Long.parseLong(tokens[1]) >= 786432 /* 768MB in kb*/) {
> +                return false;

Shouldn't this be return true?

@@ +1344,5 @@
> +            String[] tokens = line.split("\\s+");
> +            if (tokens.length >= 2 && Long.parseLong(tokens[1]) >= 786432 /* 768MB in kb*/) {
> +                return false;
> +            }
> +        } catch (Exception ex) { Log.i("GeckoMemory", "exception", ex);}

You need to close the FileReader in a finally clause. Make sure to also catch the scenario where the "new FileReader" works but the "new BufferedReader" throws; it's impossible to handle correctly if you chain the constructors like you did here.

@@ +1345,5 @@
> +            if (tokens.length >= 2 && Long.parseLong(tokens[1]) >= 786432 /* 768MB in kb*/) {
> +                return false;
> +            }
> +        } catch (Exception ex) { Log.i("GeckoMemory", "exception", ex);}
> +        return true;

This should be return false, I think.

@@ +1362,5 @@
> +                    case PixelFormat.RGBX_8888 :
> +                    case PixelFormat.RGB_888 :
> +                    {
> +                        if (isHighMemoryDevice())
> +                            sScreenDepth = 24;

nit: braces
Attachment #769277 - Flags: review?(bugmail.mozilla) → review-
sorry for the obvious issues with the last patch
Attachment #769417 - Flags: review?(bugmail.mozilla)
Comment on attachment 769417 [details] [diff] [review]
patch to force 16bit on devices with lower that 768Mb of RAM

Review of attachment 769417 [details] [diff] [review]:
-----------------------------------------------------------------

r=me with comments addressed

::: mobile/android/base/GeckoAppShell.java
@@ +1354,5 @@
> +        } catch (Exception ex) {
> +        } finally {
> +            try {
> +                if (br != null)
> +                    br.close();

This still fails to close fr if the "new BufferedReader" statement throws, because br will not get assigned to. Best to close fr directly here. Closing the BufferedReader just passes it through to the underlying input stream and doesn't need any extra work.

@@ +1357,5 @@
> +                if (br != null)
> +                    br.close();
> +            } catch (IOException ioe) {}
> +        }
> +        

nit: trailing whitespace
Attachment #769417 - Flags: review?(bugmail.mozilla) → review+
Everything rebased - there's a questionable change in one of the reftest files, so this may have some intermittent oranges - I'll use this push to weed those out and then push to inbound.

https://tbpl.mozilla.org/?tree=Try&rev=a07c26f06c6a
As before/as expected, this caused Android Talos regressions. From dev-tree-management Digest, Vol 55, Issue 18:

Regression: Mozilla-Inbound - Robocop Checkerboarding Real User Benchmark - Android 2.2 (Native) - 69.7% increase
-----------------------------------------------------------------------------------------------------------------
    Previous: avg 3.558 stddev 0.219 of 12 runs up to revision bc99f68f8946
    New     : avg 6.036 stddev 0.391 of 12 runs since revision b257784cdfdb
    Change  : +2.478 (69.7% / z=11.308)
    Graph   : http://mzl.la/17PxdRT

Regression: Mozilla-Inbound - Robocop Pan Benchmark - Android 2.2 (Native) - 443% increase
------------------------------------------------------------------------------------------
    Previous: avg 11096.429 stddev 2288.228 of 12 runs up to revision bc99f68f8946
    New     : avg 60268.692 stddev 5727.549 of 12 runs since revision b257784cdfdb
    Change  : +49172.262 (443% / z=21.489)
    Graph   : http://mzl.la/17Pxe8D
Is this regression only seen on Android 2.2? Can we also disable this feature for Android 2.2? These numbers feel that it would be right noticeable for our 2.2 users.
Spoke to Brad who explained that the tegras (used to running the tests) are running Android 2.2, and hence labelled as such. He also talked me through the numbers a bit more. Going to do some ad hoc testing at the moment. Apologies in advance for the red herring!
Depends on: 892934
Depends on: 894087
Depends on: 890590
Depends on: 899733
Bug 845944 was tracked for 23.0. It seems risky to uplift to 24.0 Beta though.
tracking-fennec: - → ?
Given 845944 is a long standing regression, we may take an uplift here for our second Beta but not sure why we would track this.The uplift totally depends on the risk here.

Kats/:benwa can you help with risk analysis here ?
I would lean towards not uplifting. There have already been a few regressions from this patch (see the depends-on list) and the regressions may be device-specific, so we might not catch them before release.
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #113)
> I would lean towards not uplifting. There have already been a few
> regressions from this patch (see the depends-on list) and the regressions
> may be device-specific, so we might not catch them before release.

+1 on this, I think this is risky enough that we should let it ride the trains. From what I hear, the 4.3 update fixed the slowness of 16-bit rendering on the Nexus 10, and I can't imagine there are too many other affected devices.
4.3 on Nexus 10 is pretty hosed right now (bug 900020).
Given the risk analysis, will let it ride the trains.
tracking-fennec: ? → ---
You need to log in before you can comment on or make changes to this bug.