Closed Bug 806369 Opened 12 years ago Closed 12 years ago

Blacklisting for H.264 decoding

Categories

(Firefox for Android Graveyard :: Toolbar, defect)

x86_64
Linux
defect
Not set
critical

Tracking

(firefox17+ fixed, firefox18+ fixed, firefox19 fixed)

RESOLVED FIXED
Firefox 19
Tracking Status
firefox17 + fixed
firefox18 + fixed
firefox19 --- fixed

People

(Reporter: bjacob, Assigned: bjacob)

References

Details

Attachments

(3 files, 5 obsolete files)

We're going to extend nsIGfxInfo to support H.264 decoding. It's not great and we're considering refactoring/rewriting it soon, but it should work well enough in the present case and it's what we have.

My understanding is that part of the work that must be done here is also determining the initial blacklist. The crashes for which we'll want to blacklist here are typically crashes with signature containing "stagefright".

I've also been told that we'll want this to be a blacklist on Android 4.1+ and a whitelist on Android 4.0-.

We will specifically want support for downloadable blacklist entries here so we can update it without chemspilling.
This:
 - adds a FEATURE_H264 to nsIGfxInfo
 - adds individual Android versions (e.g. "4.0") as gfxinfo-known OSes
 - defaults to blocking h264 if android version <= 4.0 and allowing h264 if android version >= 4.1.
 - adds some Android-specific gfxinfo fields: product, model, hardware, manufacturer. Reads them from the XML downloadable blocklist; checks them when evaluating blocklist entries.
 - reads the new fields from the device using AndroidBridge

TODO:
 - test it
 - write XPCshell test
 - write initial blocklist entries
Per doublec:
For the pre-JB devices whitelist -- Samsung Galaxy S3, S2, Nexus S, Note, Note2 - almost any Samsung device works running ICS.
For the JB+ blacklist-- any Sony Ericsson device running JB (at least until Bug 787319 is resolved).
Comment on attachment 676346 [details] [diff] [review]
preliminary patch: runs, but not tested at all

Review of attachment 676346 [details] [diff] [review]:
-----------------------------------------------------------------

::: widget/android/GfxInfo.cpp
@@ +141,5 @@
> +      mOS = DRIVER_OS_ANDROID_4_2;
> +    else if (mAndroidAPIVersion >= 16)
> +      mOS = DRIVER_OS_ANDROID_4_1;
> +    else if (mAndroidAPIVersion >= 14)
> +      mOS = DRIVER_OS_ANDROID_4_0;

Are we able to differentiate  between API version 14 and 15? The former is Android 4.0, 4.0.1, 4.0.2 and the latter is 4.0.3 and 4.0.4.

::: widget/nsIGfxInfo.idl
@@ +77,5 @@
>    const long FEATURE_WEBGL_ANGLE = 7;
>    /* Whether WebGL antialiasing is supported. */
>    const long FEATURE_WEBGL_MSAA = 8;
> +  /* Whether hardware H.264 is supported */
> +  const long FEATURE_H264 = 9;

Referring to the comment, this really checks if libstagefright decoding is supported, not whether hardware H.264 is supported. Devices support software playback via libstagefright for example.
(In reply to Maire Reavy [:mreavy] from comment #2)
> Per doublec:
> For the JB+ blacklist-- any Sony Ericsson device running JB (at least until
> Bug 787319 is resolved).

It's any Sony Ericsson device running ICS, sorry.
(In reply to Chris Double (:doublec) from comment #4)
> (In reply to Maire Reavy [:mreavy] from comment #2)
> > Per doublec:
> > For the JB+ blacklist-- any Sony Ericsson device running JB (at least until
> > Bug 787319 is resolved).
> 
> It's any Sony Ericsson device running ICS, sorry.

Do we know of any JB devices that don't work?  (To populate the JB+ blacklist)
(In reply to Maire Reavy [:mreavy] from comment #5)
> Do we know of any JB devices that don't work?  (To populate the JB+
> blacklist)

I've only ever seen 3 JB devices (Nexus, Nexus S, and Note 2) and they've all worked. I don't know if any that don't - I'm sure there will be some though!
Summary: Driver blacklisting for H.264 decoding → Blacklisting for H.264 decoding
Depends on: 799863
Blocks: 802620
Blocks: 802827
Blocks: 804768
Blocks: 803794
Adjusting target milestone and importance to get on everyone's radar for Firefox 17.
Severity: normal → critical
Target Milestone: --- → Firefox 17
New patch:
 - allows blocklisting by specific API version (e.g. 14 for Android 4.0.0)
 - has built-in blocklist according to above comments:
   -- on Android < 4.0, everything is blacklisted
   -- on Android 4.0, Samsung phones are whitelisted, the rest is blacklisted
   -- on Android >= 4.1, only Sony phones are blacklisted

Still need:
 - XPCshell test
Attachment #676346 - Attachment is obsolete: true
I'm not sure I understand if it was actually useful to add the blacklist rule for Sony devices: if they were running Android 4.0, they would be blacklisted by default anyways. This is only useful if some faulty Sony devices run 4.1.
New patch: actually checks the blacklist before initializing OMX. Renamed 'H264' to 'OPENMAX' throughout the patch now that I understand that it's about OpenMAX in general more than specifically about H264.

Still need: xpcshell test.
Attachment #676702 - Attachment is obsolete: true
Removed the part where we actually query the blacklist: the previous patch did it in a place that is never hit in http://www.quirksmode.org/html5/tests/video.html and I don't know what is the correct place.

^^^ info needed.
Attachment #676709 - Attachment is obsolete: true
Thanks, now the blacklisting code is hit.

Tested: that a Galaxy Nexus with JB passes the blacklist.
Attachment #676785 - Attachment is obsolete: true
(In reply to Maire Reavy [:mreavy] from comment #7)
> Adjusting target milestone and importance to get on everyone's radar for
> Firefox 17.

Tracking for FF17 (please nominate similar bugs in the future by setting the tracking flag to ?).

This needs to land before Tuesday's b5 go to build.
(In reply to Benoit Jacob [:bjacob] from comment #9)
> I'm not sure I understand if it was actually useful to add the blacklist
> rule for Sony devices: if they were running Android 4.0, they would be
> blacklisted by default anyways. This is only useful if some faulty Sony
> devices run 4.1.

We know the Sony devices running 4.0.x are problematic, but we don't have any Sony devices running 4.1 to test. Our tentative assumption was that the 4.1 devices would be guilty until proven innocent.

I think we should blacklist Sony devices running 4.0 or 4.1 for Beta 17 and Aurora 18. We can remove them from Nightly 19's 4.1 blacklist to get some test coverage.
(In reply to Chris Peterson (:cpeterson) from comment #14)
> (In reply to Benoit Jacob [:bjacob] from comment #9)
> > I'm not sure I understand if it was actually useful to add the blacklist
> > rule for Sony devices: if they were running Android 4.0, they would be
> > blacklisted by default anyways. This is only useful if some faulty Sony
> > devices run 4.1.
> 
> We know the Sony devices running 4.0.x are problematic, but we don't have
> any Sony devices running 4.1 to test. Our tentative assumption was that the
> 4.1 devices would be guilty until proven innocent.
> 
> I think we should blacklist Sony devices running 4.0 or 4.1 for Beta 17 and
> Aurora 18. We can remove them from Nightly 19's 4.1 blacklist to get some
> test coverage.

cpeterson -- I thought that our tentative assumption was 4.1 devices were innocent until proven guilty.  Are you saying these devices are "guilty until proven innocent" because they are Sony devices, or are we being extra conservative because Firefox 17 is so close to shipping?  (Just trying to understand the reasoning.)
Comment on attachment 676813 [details] [diff] [review]
preliminary patch: runs, still needs xpcshell test

Review of attachment 676813 [details] [diff] [review]:
-----------------------------------------------------------------

Drive by review and question:

Can the blacklist feature implemented here be used to query the stagefright/android versions to pick a particular shared library to load in nsMediaPluginHost::TryLoad? For example:

On Samsung Galaxy S2 running 2.3.4 => load libomxplugin_a.so
On Samsung Galaxy S2 running 2.3.6 => load libomxplugin_b.so
On Samsung Galaxy S2 running 4.0+ => load libomxplugin_c.so

etc.

Or should this be implemented by query gfxinfo (or whatever other class holds the information) before querying the blacklist?

::: gfx/layers/TiledLayerBuffer.h
@@ +355,1 @@
>                            !IsPlaceholder(newRetainedTiles.

Was this change included by mistake?

::: widget/nsIGfxInfo.idl
@@ +77,5 @@
>    const long FEATURE_WEBGL_ANGLE = 7;
>    /* Whether WebGL antialiasing is supported. */
>    const long FEATURE_WEBGL_MSAA = 8;
> +  /* Whether hardware H.264 is supported */
> +  const long FEATURE_STAGEFRIGHT = 9;

Remove reference to "hardware H.2645" in the comment. It's is if libstagefright decoding is supported.

::: widget/xpwidgets/GfxInfoBase.cpp
@@ +121,5 @@
>      case nsIGfxInfo::FEATURE_WEBGL_MSAA:
>        name = BLACKLIST_PREF_BRANCH "webgl.msaa";
>        break;
> +    case nsIGfxInfo::FEATURE_STAGEFRIGHT:
> +      name = BLACKLIST_PREF_BRANCH "h264";

should "h264" be "stagefright"?
(In reply to Chris Double (:doublec) from comment #16)
> Comment on attachment 676813 [details] [diff] [review]
> preliminary patch: runs, still needs xpcshell test
> 
> Review of attachment 676813 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Drive by review and question:
> 
> Can the blacklist feature implemented here be used to query the
> stagefright/android versions to pick a particular shared library to load in
> nsMediaPluginHost::TryLoad? For example:
> 
> On Samsung Galaxy S2 running 2.3.4 => load libomxplugin_a.so
> On Samsung Galaxy S2 running 2.3.6 => load libomxplugin_b.so
> On Samsung Galaxy S2 running 4.0+ => load libomxplugin_c.so
> 
> etc.
> 
> Or should this be implemented by query gfxinfo (or whatever other class
> holds the information) before querying the blacklist?

You'll have to query gfxinfo to get information about the phone (e.g. android model/hardware/manufacturer strings), make your decision yourself based on that regarding which actual 'feature' to use, and they query gfxinfo for the blacklisting status of that feature.

Note that gfxinfo is known to suck and is pending a rewrite. It's the right time to point out such feature requests.



> 
> ::: gfx/layers/TiledLayerBuffer.h
> @@ +355,1 @@
> >                            !IsPlaceholder(newRetainedTiles.
> 
> Was this change included by mistake?
> 
> ::: widget/nsIGfxInfo.idl
> @@ +77,5 @@
> >    const long FEATURE_WEBGL_ANGLE = 7;
> >    /* Whether WebGL antialiasing is supported. */
> >    const long FEATURE_WEBGL_MSAA = 8;
> > +  /* Whether hardware H.264 is supported */
> > +  const long FEATURE_STAGEFRIGHT = 9;
> 
> Remove reference to "hardware H.2645" in the comment. It's is if
> libstagefright decoding is supported.
> 
> ::: widget/xpwidgets/GfxInfoBase.cpp
> @@ +121,5 @@
> >      case nsIGfxInfo::FEATURE_WEBGL_MSAA:
> >        name = BLACKLIST_PREF_BRANCH "webgl.msaa";
> >        break;
> > +    case nsIGfxInfo::FEATURE_STAGEFRIGHT:
> > +      name = BLACKLIST_PREF_BRANCH "h264";
> 
> should "h264" be "stagefright"?
Hit save too soon. Regarding your other comments:
 - the TiledLayerBuffer.h change won't be part of the final patch. It's just something that I need to do to get my debug android builds to not crash. The bug is known already.
 - the remnants of 'h264' are just search-and-replace omissions. will fix.
(In reply to Benoit Jacob [:bjacob] from comment #17)
> (In reply to Chris Double (:doublec) from comment #16)
> > Or should this be implemented by query gfxinfo (or whatever other class
> > holds the information) before querying the blacklist?
> 
> You'll have to query gfxinfo to get information about the phone (e.g.
> android model/hardware/manufacturer strings), make your decision yourself
> based on that regarding which actual 'feature' to use, and they query
> gfxinfo for the blacklisting status of that feature.


Exception: if you can do what you want by a series of fall-backs, then you can let gfxinfo do all the work. Like this:

   if (gfxinfo says it's OK to use library A)
     use library A;
   else if (gfxinfo says it's OK to use library B)
     use library B;
   else
     use library C;
(In reply to Maire Reavy [:mreavy] from comment #15)
> cpeterson -- I thought that our tentative assumption was 4.1 devices were
> innocent until proven guilty.  Are you saying these devices are "guilty
> until proven innocent" because they are Sony devices, or are we being extra
> conservative because Firefox 17 is so close to shipping?  (Just trying to
> understand the reasoning.)

I was suggesting Sony 4.1 devices be blacklisted in Firefox 17 because we don't know if they will have the same problems as Sony 4.0 devices.

However, I have changed my mind and I don't think we should bothering blacklisting Sony 4.1 devices. According to [1], Sony will provide a 4.1 update for (some of) its 2012 devices in 2013 Q1 or later. Sony won't provide a 4.1 update for any of its 2011 devices. Firefox 17 will be old news by 2013 Q1, so a blacklist would be pointless. :)

[1] http://www.pocket-lint.com/news/46277/jelly-bean-when-update-coming
What method in gfxInfo provides the actual version number (eg. 2.3,4, 2.3,6, 4.0.4, etc)?
You'll have to apply my patch, and then GfxInfo::OperatingSystemVersion() returns the Android API version.
1) See attachment for the signatures and related bugs w/ links to crash kills for each signature:
Most of the top crashes for nonflashplayer libstagefright:
https://bugzilla.mozilla.org/show_bug.cgi?id=802629
https://bugzilla.mozilla.org/show_bug.cgi?id=802620
https://bugzilla.mozilla.org/show_bug.cgi?id=787319

2) I hadn't look from 10 crashes down, did you want me to do so? 
107 crash signatures w/ 1 only crash each = 107 crashes
27 signatures w/ 2 crashes each = 54 crashes
19 signatures w/ 3 crashes each  = 57 crashes
13 signatures w/ 4 crashes each = 52 crashes
6 signatures w/ 5 crashes each = 30 crashes
3 signatures w/ 6 crashes each = 18 crashes
2 signatures w/ 7 crashes each = 14 crashes
3 signatures w/ 8 crashes each = 24 crashes
3 signatures w/ 9 crashes each = 27 crashes
3 signatures w/ 10 crashes each = 30 crashes
= 413 crashes in the long tail
Adding Scoobi, kbrosnan, aaronmt so they are aware of this bug.
I'm giving up on making a XPCShell test because:

 - even if we make a XPCShell test, we currently don't run those on Android slaves so this won't prevent regressions.
 - trying to make a XPCShell test I ran into lots of bugs that make it hard to do that: bug 790817, bug 807413, bug 807416.

So instead I'm trying to do enough testing with the devices that we have in the Toronto office.
Depends on: 807413, 807416, 790817
(In reply to Naoki Hirata :nhirata from comment #23)
> Created attachment 677036 [details]
> crash signatures - devices

It's unclear to me how we decide that a device should be blacklisted for stagefright. For example a very popular device could look like it's giving many stagefright crashes, but this could be only because it is so popular.

I'm happy to implement whatever blacklist you guys decide on.
Removed the Sony blacklist rule per above discussion.

This patch adds a basic built-in blacklist, blocking all pre-4.0 devices and all non-Samsung 4.0 devices, and allowing all 4.1+ devices. It also *in principle* adds the ability to ship new blacklist rules by Android API version, Model, Product, Hardware, Manufacturer, but that part is only lightly tested by some local testing on a couple of phones. See previous comment for what's blocking from having real good testing here. XPCShell tests are really the only way to go because we need to spoof strings at startup.

r? joe for the most part (GfxInfo); r? doublec for the media-specific part and please also take a look at the current built-in blacklist rules.
Attachment #676813 - Attachment is obsolete: true
Attachment #677099 - Flags: review?(joe)
Attachment #677099 - Flags: review?(chris.double)
Attachment #677099 - Flags: review?(chris.double) → review+
Comment on attachment 677099 [details] [diff] [review]
Stagefright blacklisting. Also extends Android Gfxinfo to support blacklist rules by Android API version, Model, Product, Hardware, Manufacturer

Review of attachment 677099 [details] [diff] [review]:
-----------------------------------------------------------------

::: modules/libpref/src/init/all.js
@@ +3628,5 @@
>  pref("webgl.prefer-16bpp", false);
>  pref("webgl.default-no-alpha", false);
>  pref("webgl.force-layers-readback", false);
>  
> +//OpenMax prefs

name

::: widget/android/GfxInfo.cpp
@@ +346,5 @@
> +
> +        if (!isWhitelisted) {
> +          *aStatus = nsIGfxInfo::FEATURE_BLOCKED_DEVICE;
> +          return NS_OK;
> +        }

Oughtn't we be looking at the GfxDriverInfo array? As-is, we're just going to return the same thing always, and never evaluate the downloaded blocklist.
Attachment #677099 - Flags: review?(joe) → review-
I am uncertain how we should create a blacklist either.  I have links to each crashes, so if you want to look at specific crashes... we could do it that way.
(In reply to Naoki Hirata :nhirata from comment #29)
> I am uncertain how we should create a blacklist either.  I have links to
> each crashes, so if you want to look at specific crashes... we could do it
> that way.

So, I think our best bet there is to follow up on the work that's been done in bug 797068 on correlations of crash signatures to GPU info. I meant to go back to that, but really didn't have time this week. Until we have that, I suppose we can continue by addressing only the major bugs that have been discussed on bugzilla e.g. the Sony bug discussed above.
Fixed the all.js comment.

Regarding the other objection, we discussed and found that the code was actually OK.
Attachment #677495 - Flags: review?
Attachment #677495 - Flags: review? → review?(joe)
Comment on attachment 677495 [details] [diff] [review]
Stagefright blacklisting. Also extends Android Gfxinfo to support blacklist rules by Android API version, Model, Product, Hardware, Manufacturer

Review of attachment 677495 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry for the mistaken r- earlier!
Attachment #677495 - Flags: review?(joe) → review+
(In reply to Naoki Hirata :nhirata from comment #29)
> I am uncertain how we should create a blacklist either.
You should file a bug as stated in https://wiki.mozilla.org/Blocklisting/Blocked_Graphics_Drivers

> I have links to each crashes, so if you want to look at specific crashes... we
> could do it that way.
You can use rkaiser's files over the beta phase to find all devices for each signature:
https://crash-analysis.mozilla.com/rkaiser/2012-10-31/2012-10-31.fennecandroid.beta.17.0.devices.weekly.html
https://crash-analysis.mozilla.com/rkaiser/2012-10-24/2012-10-24.fennecandroid.beta.17.0.devices.weekly.html
https://crash-analysis.mozilla.com/rkaiser/2012-10-17/2012-10-17.fennecandroid.beta.17.0.devices.weekly.html
Previous try was broken on android opt because required new code was mistakenly put inside a #ifdef DEBUG. New try for android opt:
https://tbpl.mozilla.org/?tree=Try&rev=8fff8de3e6a7
Assignee: nobody → bjacob
Comment on attachment 677495 [details] [diff] [review]
Stagefright blacklisting. Also extends Android Gfxinfo to support blacklist rules by Android API version, Model, Product, Hardware, Manufacturer

[Approval Request Comment]
Bug caused by (feature/regressing bug #): Since we enabled stagefright-based h264 decoding in version 17 we got crashes in libstagefright. See dependent bugs.
User impact if declined: bugs when trying to view h264 videos on certain devices. Like, many Sony Ericsson devices.
Testing completed (on m-c, etc.): just landed on m-i. But we really want this on 17.
Risk to taking this patch (and alternatives if risky): This patch touches mostly gfxinfo. Existing gfxinfo functionality is decently covered by xpcshell tests on desktop, so I'm not concerned about desktop regressions. The worry is that we are not currently running xpcshell tests on android. Fortunately we don't rely too much on gfxinfo on android as of yet. The biggest thing we relied on it for, was WebGL. So there you go, if this patch introduces a bug it could have effects such as disabling WebGL on Android, or wrongly enabling it on devices where it's unsafe.
String or UUID changes made by this patch: none
Attachment #677495 - Flags: approval-mozilla-beta?
Attachment #677495 - Flags: approval-mozilla-aurora?
https://hg.mozilla.org/mozilla-central/rev/f710f9f6f7ec

Possible to test this?
Status: NEW → RESOLVED
Closed: 12 years ago
Flags: in-testsuite?
Resolution: --- → FIXED
(In reply to Ryan VanderMeulen from comment #38)
> https://hg.mozilla.org/mozilla-central/rev/f710f9f6f7ec
> 
> Possible to test this?

See above. This should be covered by an XPCShell test. But we don't currently run XPCShell tests on android. So testing this is blocked on enabling XPCShell tests on Android.
(In reply to Benoit Jacob [:bjacob] from comment #37)
> Risk to taking this patch (and alternatives if risky): This patch touches
> mostly gfxinfo. Existing gfxinfo functionality is decently covered by
> xpcshell tests on desktop, so I'm not concerned about desktop regressions.
> The worry is that we are not currently running xpcshell tests on android.
> Fortunately we don't rely too much on gfxinfo on android as of yet. The
> biggest thing we relied on it for, was WebGL. So there you go, if this patch
> introduces a bug it could have effects such as disabling WebGL on Android,
> or wrongly enabling it on devices where it's unsafe.

Really appreciate the thoughtful risk evaluation. Once this lands, we'll want to do some testing around WebGL on Android in that case. We should test in QA using a couple of widely used phones as well as a Sony Ericsson device running JB, from my read of the bug.
Keywords: qawanted, verifyme
QA Contact: kbrosnan
Attachment #677495 - Flags: approval-mozilla-beta?
Attachment #677495 - Flags: approval-mozilla-beta+
Attachment #677495 - Flags: approval-mozilla-aurora?
Attachment #677495 - Flags: approval-mozilla-aurora+
Please make sure to land before Tuesday to make our fifth beta. Thanks!
Target Milestone tracks when a fix lands on m-c. Using it for other things can break bug queries that depend on that, so please don't.
Target Milestone: Firefox 17 → Firefox 19
For some reason, this was causing Android NoIon test timeouts on Aurora. No idea why m-c didn't have this problem (and NoIon isn't built on beta). Backed out.
https://hg.mozilla.org/releases/mozilla-aurora/rev/f5f7309eb01a
Blocks: 808375, 808378, 802629
Keywords: qawanted
Depends on: 813818
Blocks: 817478
Keywords: verifyme
Product: Firefox for Android → Firefox for Android Graveyard
See Also: → 1714673
You need to log in before you can comment on or make changes to this bug.