Add fallback for GMP downloads if AUS is down

RESOLVED FIXED in Firefox 52

Status

()

defect
P3
normal
RESOLVED FIXED
3 years ago
2 years ago

People

(Reporter: cpearce, Assigned: daleharvey)

Tracking

(Blocks 1 bug)

unspecified
Firefox 52
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox49+ wontfix, firefox50+ wontfix, firefox51 affected, firefox52 verified)

Details

Attachments

(1 attachment, 8 obsolete attachments)

Recently the GMP AUS server went down, and new Firefox installs weren't able to play Netflix or use GMPs. We'd like to harden ourselves against this failure being an issue in future.

One solution suggested was to bake in the URL that we'd use to download GMPs into Firefox directly, and use this if the GMP is not installed and the update server is down. So likely this would only affect new Firefox installs/profiles.

We would need to bake in the hash, the version, and the URL. We can probably just put these in prefs.

We'd need to ensure the Adobe/Widevine CDMs only downloaded if MOZ_{ADOBE,WIDEVINE}_EME are #defined.

There may be an issue with the GMP downloader causing Firefox to try to make a network request during mochitests; IIRC if a network request to the outside world is made during mochitests, Firefox is terminated. May need to do something like this:
http://mxr.mozilla.org/mozilla-central/source/testing/profiles/prefs_general.js#46
Kirk: Can you take this?
Flags: needinfo?(ksteuber)
Summary: Add fallback for GMP downloads is AUS is down → Add fallback for GMP downloads if AUS is down
Shouldn't the effort go into hardening the GMP AUS server against outages? Isn't the user running into the same issue should the GMP download server go down?
We can't guarantee that the GMP AUS server won't go down. In the last outage, it went down due to problems with other services, not with the update server itself.

We can't control our partners' download sites.
It seems like this is going to be a pain to maintain and to keep in sync with our online version. Is there a doc that describes the expected behavior when things start getting out of sync? Will you be pushing updates to the baked-in download URLs via system addons?
I will be keeping the baked in URLs in sync with the URLS on the GMP AUS.

Our plan is to push new GMPs to pre-release Firefox versions, and let them ride the trains out to release. We're not intending to be changing GMPs in the release population without a period of baking in pre-release, unless there's an emergency. So I imagine we'll only need to change the baked-in URLs in pre-release versions of Firefox.

Note: we should *only* use the baked in URLs if the GMP AUS is down *and* the GMP is not yet installed. Even if we don't manage to keep the baked-in URLs in sync, I don't think users will switch to Chrome if their Firefox client ends up downloading an older version of the GMP; it'll be updated once the GMP AUS comes back online at the next daily update check. Users will switch to Chrome however, if they can't watch Netflix in their new Firefox install.
This makes it sound like we have low confidence in the reliability of our GMP AUS server. And adding this workaround seems like a sledgehammer for what you correctly describe as affecting a narrow subset of users (server down and no version of GMP installed).
This is about defense in depth. Our AUS servers are pretty good; they have been down for only about 5 days in the past year. But downtime is inevitable. When the AUS servers went down a few weeks back, Netflix emailed me within an hour asking why EME wasn't working. When video is broken, people care, so we need to ensure it is robust and keeps working.
True "defense in depth" would be to package a version of each GMP with Firefox, which is a problematic path to be on.

Assuming this will come my way for review:
1. Can we have the team maintaining the AUS server confirm here that we cannot bring reliability of the server to a level that's sufficient for GMP's purposes, and that we cannot move to a server with higher reliability?
2. Can you provide numbers of users that you expect to be affected without this workaround?

Alternatively, can we get an executive decision from someone that greenlights this change?

I must admit that at this point, I don't believe the added code complexity and (manual) maintenance overhead is justified.
I am currently transitioning away from media and towards e10s. Since this bug does not seem to be part of my previous work and the specifics of the work do not yet seem to be certain, I would prefer not to take this bug right now.
Flags: needinfo?(ksteuber)
Please note that the blocklist also bakes in a fallback URL for the case where the update server goes down. So this isn't an unprecedented request.

(In reply to Stephen A Pohl [:spohl] from comment #8)
> True "defense in depth" would be to package a version of each GMP with
> Firefox, which is a problematic path to be on.
> Assuming this will come my way for review:
> 1. Can we have the team maintaining the AUS server confirm here that we
> cannot bring reliability of the server to a level that's sufficient for
> GMP's purposes, and that we cannot move to a server with higher reliability?

Lawrence: Do you recall some weeks back the GMP update servers went down for several days and fresh Firefox installs weren't able to download GMPs and so Netflix didn't work for new Firefox installs?

Can you please confirm whether it's reasonable to expect the GMP update servers to provide 100% uptime?


> 2. Can you provide numbers of users that you expect to be affected without
> this workaround?

I don't have numbers for how many new installs we get per day. But, I do know that video playback is one of the key things people do on the web. When it doesn't work well, our users switch browsers. So it has to work and it has to work well.

 
> Alternatively, can we get an executive decision from someone that
> greenlights this change?

We discussed baking in the GMP download URLs to harden Firefox against the update server going down at the platform managers meeting a few weeks back. The platform leadership including the VP and directors were present and agreed that this was a reasonable solution.


 
> I must admit that at this point, I don't believe the added code complexity
> and (manual) maintenance overhead is justified.

Could you elaborate where the complexity is? Could you just keep a copy of the fallback update.xml for each platform in the tree/resdir, and if the download of the update.xml fails with a time out and no GMPs are installed yet, just parse the baked-in update.xml in place of the one that was supposed to have been downloaded over the network?

The platform team manages GMP updates, so I expect we'd be in responsible for the manual maintenance of keeping the baked-in update.xml file up to date as new GMPs are rolled out.

To reduce maintenance, the baked-in update.xml files could be kept in sync by a script which just downloaded the update.xml file for each platform/arch combination.
Flags: needinfo?(lmandel)
(In reply to Chris Pearce (:cpearce) from comment #10)
> Please note that the blocklist also bakes in a fallback URL for the case
> where the update server goes down. So this isn't an unprecedented request.

I'm not familiar with the fallback URL used for the blocklist. Is it referring to several different third-party modules like the GMP URL and does it require the manual maintenance that we would require here?

> (In reply to Stephen A Pohl [:spohl] from comment #8)
> > True "defense in depth" would be to package a version of each GMP with
> > Firefox, which is a problematic path to be on.
> > Assuming this will come my way for review:
> > 1. Can we have the team maintaining the AUS server confirm here that we
> > cannot bring reliability of the server to a level that's sufficient for
> > GMP's purposes, and that we cannot move to a server with higher reliability?
> 
> Lawrence: Do you recall some weeks back the GMP update servers went down for
> several days and fresh Firefox installs weren't able to download GMPs and so
> Netflix didn't work for new Firefox installs?
> 
> Can you please confirm whether it's reasonable to expect the GMP update
> servers to provide 100% uptime?

Note that I'm not suggesting that we get GMP update servers to 100% uptime. What I'm suggesting is that if GMP downloads are really that mission critical, we should explore if a move to a server with the same reliability as (for example) our Firefox downloads (probably the most mission-critical product of ours) is possible.

> > 2. Can you provide numbers of users that you expect to be affected without
> > this workaround?
> 
> I don't have numbers for how many new installs we get per day. But, I do
> know that video playback is one of the key things people do on the web. When
> it doesn't work well, our users switch browsers. So it has to work and it
> has to work well.

By moving to a server with the same reliability as Firefox downloads, this should be fixed.

> > I must admit that at this point, I don't believe the added code complexity
> > and (manual) maintenance overhead is justified.
> 
> Could you elaborate where the complexity is?

I believe you've answered this yourself:

> Could you just keep a copy of
> the fallback update.xml for each platform in the tree/resdir, and if the
> download of the update.xml fails with a time out and no GMPs are installed
> yet, just parse the baked-in update.xml in place of the one that was
> supposed to have been downloaded over the network?
> 
> The platform team manages GMP updates, so I expect we'd be in responsible
> for the manual maintenance of keeping the baked-in update.xml file up to
> date as new GMPs are rolled out.
> 
> To reduce maintenance, the baked-in update.xml files could be kept in sync
> by a script which just downloaded the update.xml file for each platform/arch
> combination.

No matter how we slice it, this is inherently more complex, maintenance-heavy and error-prone than moving GMP updates to a server with the same/similar reliability as our Firefox downloads, especially considering the small number of people that we actually expect to run into this issue.
(In reply to Stephen A Pohl [:spohl] from comment #11)
> (In reply to Chris Pearce (:cpearce) from comment #10)
> > (In reply to Stephen A Pohl [:spohl] from comment #8)
> > > True "defense in depth" would be to package a version of each GMP with
> > > Firefox, which is a problematic path to be on.
> > > Assuming this will come my way for review:
> > > 1. Can we have the team maintaining the AUS server confirm here that we
> > > cannot bring reliability of the server to a level that's sufficient for
> > > GMP's purposes, and that we cannot move to a server with higher reliability?
> > 
> > Lawrence: Do you recall some weeks back the GMP update servers went down for
> > several days and fresh Firefox installs weren't able to download GMPs and so
> > Netflix didn't work for new Firefox installs?
> > 
> > Can you please confirm whether it's reasonable to expect the GMP update
> > servers to provide 100% uptime?
> 
> Note that I'm not suggesting that we get GMP update servers to 100% uptime.
> What I'm suggesting is that if GMP downloads are really that mission
> critical, we should explore if a move to a server with the same reliability
> as (for example) our Firefox downloads (probably the most mission-critical
> product of ours) is possible.

No service has 100% uptime. You noted yourself that AUS4 was down a few weeks ago. We should expect that the service will go down on occasion and have a graceful fallback.

Note that AWS EC2 has an uptime SLA of 99.95% [1].

[1] https://aws.amazon.com/ec2/sla/

> > > 2. Can you provide numbers of users that you expect to be affected without
> > > this workaround?
> > 
> > I don't have numbers for how many new installs we get per day. But, I do
> > know that video playback is one of the key things people do on the web. When
> > it doesn't work well, our users switch browsers. So it has to work and it
> > has to work well.
> 
> By moving to a server with the same reliability as Firefox downloads, this
> should be fixed.

I think you mean Firefox updates. Two notes:
1. Firefox updates are advertised by AUS4. (Same service as GMP updates.) Actual bits are served from our CDN.
2. Firefox updates have graceful degradation. If AUS4 is unreachable, the browser will try again later.
Flags: needinfo?(lmandel)
(In reply to Lawrence Mandel [:lmandel] (use needinfo) from comment #12)
> (In reply to Stephen A Pohl [:spohl] from comment #11)
> > (In reply to Chris Pearce (:cpearce) from comment #10)
> > > (In reply to Stephen A Pohl [:spohl] from comment #8)
> > > > True "defense in depth" would be to package a version of each GMP with
> > > > Firefox, which is a problematic path to be on.
> > > > Assuming this will come my way for review:
> > > > 1. Can we have the team maintaining the AUS server confirm here that we
> > > > cannot bring reliability of the server to a level that's sufficient for
> > > > GMP's purposes, and that we cannot move to a server with higher reliability?
> > > 
> > > Lawrence: Do you recall some weeks back the GMP update servers went down for
> > > several days and fresh Firefox installs weren't able to download GMPs and so
> > > Netflix didn't work for new Firefox installs?
> > > 
> > > Can you please confirm whether it's reasonable to expect the GMP update
> > > servers to provide 100% uptime?
> > 
> > Note that I'm not suggesting that we get GMP update servers to 100% uptime.
> > What I'm suggesting is that if GMP downloads are really that mission
> > critical, we should explore if a move to a server with the same reliability
> > as (for example) our Firefox downloads (probably the most mission-critical
> > product of ours) is possible.
> 
> No service has 100% uptime. You noted yourself that AUS4 was down a few
> weeks ago. We should expect that the service will go down on occasion and
> have a graceful fallback.
> 
> Note that AWS EC2 has an uptime SLA of 99.95% [1].
> 
> [1] https://aws.amazon.com/ec2/sla/

An uptime SLA of 99.95% is insufficient for GMP downloads? I'm not suggesting that it is or isn't, but this is surprising to me.

> > > > 2. Can you provide numbers of users that you expect to be affected without
> > > > this workaround?
> > > 
> > > I don't have numbers for how many new installs we get per day. But, I do
> > > know that video playback is one of the key things people do on the web. When
> > > it doesn't work well, our users switch browsers. So it has to work and it
> > > has to work well.
> > 
> > By moving to a server with the same reliability as Firefox downloads, this
> > should be fixed.
> 
> I think you mean Firefox updates. Two notes:
> 1. Firefox updates are advertised by AUS4. (Same service as GMP updates.)
> Actual bits are served from our CDN.
> 2. Firefox updates have graceful degradation. If AUS4 is unreachable, the
> browser will try again later.

I am referring to first time Firefox installer downloads. The users that run into the issue described here are users who have no GMP installed, which is typically the case after a first-time install. What's good enough for Firefox installers should be good enough for GMP downloads. Also, my understanding is that we already have graceful degradation for GMP downloads. If the server is down, we will try again later just like Firefox updates.
To clarify my reservations: The belief here is that the currently implemented fallback when servers are down (try to download again later) along with the uptime SLA of 99.95% matches what we do for Firefox updates and should be good enough, and that the additional fallback suggested in this bug adds unnecessary code complexity, maintenance overhead and points of failure for little to no benefit for a very narrow subset of users (no GMP installed, AUS servers down).
Hey cpearce - you'd brought this up as potential work for the Platform UI team, but as far as I can tell, this doesn't have much in the way of a UI component to it... am I misunderstanding?

Removing from our list until I hear back.
No longer blocks: platform-ui-team
Flags: needinfo?(cpearce)
(In reply to Mike Conley (:mconley) - (needinfo me!) from comment #15)
> Hey cpearce - you'd brought this up as potential work for the Platform UI
> team, but as far as I can tell, this doesn't have much in the way of a UI
> component to it... am I misunderstanding?
> 
> Removing from our list until I hear back.

This work doesn't need UI design, it needs implementation in the Firefox code. That is, this is a change that we need made to the GMP downloader, which is code that's in the front end.
Flags: needinfo?(cpearce)
Benjamin - if AUS4 is down then we get no EME (or openh264) for new Firefox installs. This happened for a few days earlier this year which caused Netflix to be broken for a number with clean profiles (presumably new users). Lawrence suggested that it would be worthwhile having a backup.

How important do you think it is?
Flags: needinfo?(benjamin)
If you're talking about pre-shipping the GMP download list with Firefox, the way we pre-ship the blocklist, I think that's probably reasonable. I don't think it makes much sense to have a second backup service, given the huge complexity that would be involved.

But I also don't think this is a very high priority relative to other work.
Flags: needinfo?(benjamin)
Priority: P2 → P3
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #18)
> If you're talking about pre-shipping the GMP download list with Firefox, the
> way we pre-ship the blocklist, I think that's probably reasonable. I don't
> think it makes much sense to have a second backup service, given the huge
> complexity that would be involved.

Consider the following STR:

* Disable AUS - it was down for several days this year making EME unusable on new Firefox installs.
* Download and install Firefox
* Navigate to Netflix

Expected results:

Netflix should work

Actual results:

Firefox shows a message explaining that the CDM will be installed shortly. This is not true because AUS is down so Firefox doesn't know the download URL. New installs of Firefox are unusable for the purpose of DRM video until Firefox has successfully contacted AUS.

Proposed solution:

Embed a fallback XML in the Firefox binary to use when there is no CDM installed (i.e. a new Firefox install) and AUS is not available.
For reference, what is the URL end-point we should be falling back to? I can't seem to find it in this bug.
Flags: needinfo?(ajones)
(In reply to Mike Conley (:mconley) - (Needinfo me!) from comment #20)
> For reference, what is the URL end-point we should be falling back to? I
> can't seem to find it in this bug.

We need a copy of the XML file from AUS. This would require occasional syncing between AUS and m-c. One possibility would be to invert the dependency and make m-c the authority on which CDM gets served to which Firefox version.
Flags: needinfo?(ajones)
Bug 1300633 means that it would be fantastic if we can hustle and quickly put together a fallback.
See Also: → 1300633
(In reply to Anthony Jones (:kentuckyfriedtakahe, :k17e) from comment #21)
> (In reply to Mike Conley (:mconley) - (Needinfo me!) from comment #20)
> > For reference, what is the URL end-point we should be falling back to? I
> > can't seem to find it in this bug.
> 
> We need a copy of the XML file from AUS. This would require occasional
> syncing between AUS and m-c. One possibility would be to invert the
> dependency and make m-c the authority on which CDM gets served to which
> Firefox version.

I think we should be updating all branches (central, aurora, beta, release, and esr) each time the plugins change. If we don't, it will be exceedingly rare that the majority of users would fall back to a current version of the plugin.

As for the data itself, AUS stores it as JSON and builds the XML on the fly. The simplest thing to do from an administrative standpoint would be to embed those same objects into Firefox rather than translating them to XML first. They look as follows:
{
  "vendors": {
    "gmp-widevinecdm": {
      "platforms": {
        "WINNT_x86-msvc-x64": {
          "alias": "WINNT_x86-msvc"
        },
        "WINNT_x86_64-msvc": {
          "fileUrl": "https://reabcdirector.gvt1.com/crx/blobs/QwAAAHF3InbmK-wFIemaY3I3BCMfKkG1qK-QjduNYAV8iOvMDJ4F-eYG99nyObODFXBo4JlOpkZcZdTq5UX6NspNYV2NfMZSYBiE9bPVePiERn4lAMZSmuUqhMZt9jz3vFFOYvPvjR_f9H5xCQ/",
          "hashValue": "e6b7c40047772660d78be998f0dcff3adff5198a38f3dd004a035fcb8005d3bf4d5c34beb5f7aa609abd03fb03f57117edbdad127cf85191acdf062bade246e3",
          "filesize": 2852841
        },
        "WINNT_x86-msvc": {
          "fileUrl": "https://reabcdirector.gvt1.com/crx/blobs/QwAAAHF3InbmK-wFIemaY3I3BCMfKkG1qK-QjduNYAV8iOvMDJ4F-eYG99nyObODFXBo4JlOpkZcZdTq5UX6NspNYV2NfMZSYBiE9bPVePiERn4lAMZSmuUqhMZt9jz3vFFOYvPvjR_f9H5xCQ/",
          "hashValue": "f34227b20ab4c57bdb9dfe092b1dff102a0b92dc36f6b3c80c37b87dd2cffe6acd87a601acc58e7cf2036fb9a2ba389b95f5b9b8f8235776564e71994cbf5dcd",
          "filesize": 2879474
        },
        "WINNT_x86-msvc-x86": {
          "alias": "WINNT_x86-msvc"
        },
        "Darwin_x86-gcc3-u-i386-x86_64": {
          "fileUrl": "https://reabcdirector.gvt1.com/crx/blobs/QwAAAHF3InbmK-wFIemaY3I3BCMfKkG1qK-QjduNYAV8iOvMDJ4F-eYG99nyObODFXBo4JlOpkZcZdTq5UX6NspNYV2NfMZSYBiE9bPVePiERn4lAMZSmuUqhMZt9jz3vFFOYvPvjR_f9H5xCQ/",
          "hashValue": "3b0027f8e4f52b93ec3492fde00c76b49ffe8500b41b2ef7ca021534565cc44921c061fbb304676b1ad11ba68a02026aa0f923810ea5b0ea400a8e31f4bfc324",
          "filesize": 2160515
        },
        "Darwin_x86_64-gcc3": {
          "alias": "Darwin_x86-gcc3-u-i386-x86_64"
        },
        "Darwin_x86-gcc3": {
          "alias": "Darwin_x86-gcc3-u-i386-x86_64"
        },
        "WINNT_x86_64-msvc-x64": {
          "alias": "WINNT_x86_64-msvc"
        },
        "Darwin_x86_64-gcc3-u-i386-x86_64": {
          "alias": "Darwin_x86-gcc3-u-i386-x86_64"
        }
      },
      "version": "1.4.8.893"
    }
  },
  "hashFunction": "sha512",
  "schema_version": 1000
}
Assignee

Comment 24

3 years ago
Mike has asked if I could take a look at this, just to clarify is there any deadline involved here other than asap? Otherwise looks like enough information to go on, will follow up if I hit trouble.
Assignee: nobody → dale
(In reply to Ben Hearsum (:bhearsum) from comment #23)
> I think we should be updating all branches (central, aurora, beta, release,
> and esr) each time the plugins change. If we don't, it will be exceedingly
> rare that the majority of users would fall back to a current version of the
> plugin.

This precisely. A solution that requires human intervention (manually generated csets to all the branches for example) will inevitably fall out of sync at some point. What is the proposed solution to do this without human intervention?
(In reply to Dale Harvey (:daleharvey) from comment #24)
> Mike has asked if I could take a look at this, just to clarify is there any
> deadline involved here other than asap? Otherwise looks like enough
> information to go on, will follow up if I hit trouble.

Hrm - it's 2AM right about now for k17e (ajones), so I doubt we'll get an answer for that right now from him. Needinfo'ing anyways, but also going to needinfo jet in case he can get you an answer faster.

ni? for comment 24.
Flags: needinfo?(bugs)
Flags: needinfo?(ajones)
I should note that whatever in-product fallback code is produced has the potential to be extra problematic.

the update code for GMP stuff has the ability (and seen it many times) for some OS combos to not use a specific plugin combo, or a specific plugin version, while the same *FIREFOX* binary is used with a different plugin version on a different ver of the OS. E.g. XP vs 8 vs 10.

Or using feature detection, I could see being a thing, like "don't use this version of Widevine if SSE3 isn't available"...

Thats the type of stuff that would make automated updating and automated fallbacks a bit harder if we want to continue to support them.
(In reply to Stephen A Pohl [:spohl] from comment #25)
> This precisely. A solution that requires human intervention (manually
> generated csets to all the branches for example) will inevitably fall out of
> sync at some point. What is the proposed solution to do this without human
> intervention?

I propose we exclude that from the MVP in that something is better than nothing right now. We don't want to make the perfect the enemy of the good here. We will need to do automatic syncing in a follow up.

(In reply to Mike Conley (:mconley) - (Digging through needinfos and reviews) from comment #26)
> (In reply to Dale Harvey (:daleharvey) from comment #24)
> > Mike has asked if I could take a look at this, just to clarify is there any
> > deadline involved here other than asap? Otherwise looks like enough
> > information to go on, will follow up if I hit trouble.

It would be ideal if we can get this in for 49 but we're fairly late it the day for that release.
 
> Hrm - it's 2AM right about now for k17e (ajones), so I doubt we'll get an
> answer for that right now from him. Needinfo'ing anyways, but also going to
> needinfo jet in case he can get you an answer faster.

Use IRC if you want a fast answer. I check that more often than needinfo. My phone number is in phonebook if you need to wake me up to get an answer.

(In reply to Justin Wood (:Callek) from comment #27)
> I should note that whatever in-product fallback code is produced has the
> potential to be extra problematic.

The alternative here is no plug-in. I want to address the initial install issue ahead of worrying about updates. Today we have an issue with the initial install. Updates are a problem for another day.

> the update code for GMP stuff has the ability (and seen it many times) for
> some OS combos to not use a specific plugin combo, or a specific plugin
> version, while the same *FIREFOX* binary is used with a different plugin
> version on a different ver of the OS. E.g. XP vs 8 vs 10.
> 
> Or using feature detection, I could see being a thing, like "don't use this
> version of Widevine if SSE3 isn't available"...
> 
> Thats the type of stuff that would make automated updating and automated
> fallbacks a bit harder if we want to continue to support them.

Having a fallback when there is no GMP and aus is inaccessible doesn't make a difference to any of that.
Flags: needinfo?(ajones)
(In reply to Stephen A Pohl [:spohl] from comment #25)
> (In reply to Ben Hearsum (:bhearsum) from comment #23)
> > I think we should be updating all branches (central, aurora, beta, release,
> > and esr) each time the plugins change. If we don't, it will be exceedingly
> > rare that the majority of users would fall back to a current version of the
> > plugin.
> 
> This precisely. A solution that requires human intervention (manually
> generated csets to all the branches for example) will inevitably fall out of
> sync at some point. What is the proposed solution to do this without human
> intervention?

We don't need to update all branches when we roll out a new GMP in the case of a normal roll out.

Unless there's some kind of emergency, our recent approach to updating GMPs has been to push the new GMP to a pre-release branch (normally up to aurora or beta), and let that ride the trains into release. So we've not been pushing GMP updates to the release branch, as if we did that we'd not have the opportunity to let the new GMP bake time on the version of Firefox we're shipping with.

We're quite happy to have users whose release Firefox is out-of-date to not have the current GMP. They will however have a version which was tested against the version of Firefox they're using.
(In reply to Anthony Jones (:kentuckyfriedtakahe, :k17e) from comment #28)
> (In reply to Justin Wood (:Callek) from comment #27)
> > I should note that whatever in-product fallback code is produced has the
> > potential to be extra problematic.
> 
> The alternative here is no plug-in. I want to address the initial install
> issue ahead of worrying about updates. Today we have an issue with the
> initial install. Updates are a problem for another day.
> 
> > the update code for GMP stuff has the ability (and seen it many times) for
> > some OS combos to not use a specific plugin combo, or a specific plugin
> > version, while the same *FIREFOX* binary is used with a different plugin
> > version on a different ver of the OS. E.g. XP vs 8 vs 10.
> > 
> > Or using feature detection, I could see being a thing, like "don't use this
> > version of Widevine if SSE3 isn't available"...
> > 
> > Thats the type of stuff that would make automated updating and automated
> > fallbacks a bit harder if we want to continue to support them.
> 
> Having a fallback when there is no GMP and aus is inaccessible doesn't make
> a difference to any of that.

It sort of does matter though.  Because you're providing a fallback without taking into account the actual state of "OS" support for the plugins. E.g. if we don't ship plugin X to Windows XP (or android/linux) due to criteria that is not === to every user on the same build.

These "updates" are "update from no GMP plugin at all" not "update only if plugin already exists".

For example Widevine is not installed (ever) on winXP...

While from Firefox 42 to Firefox 45 the "CDM" ver 16 is used on x64 windows, and the CDM ver 15 is used on "others" (Firefox 46 and up uses CDM 17)

I completely do not imagine this being the only magic we keep going forward, as I've seen even more special requests/needs in the past.
Assignee

Comment 31

3 years ago
Got a patch for this mostly done, one small question, I want to prevent the addon from installing if it is an update but cant figure out how to distinguish updates from new installs, there is |isInstalled| @ https://dxr.mozilla.org/mozilla-central/source/toolkit/modules/GMPInstallManager.jsm#316 but that looks like its "isthisversioninstalled" (https://dxr.mozilla.org/mozilla-central/source/toolkit/modules/GMPInstallManager.jsm#220) 

Hopefully Anothy is the best to answer, but if not could you forward this along, thanks
Flags: needinfo?(ajones)
Assignee

Comment 32

3 years ago
s/Anothy/Anthony

apologies
spohl or Mossop might know. Let's try spohl.
Flags: needinfo?(spohl.mozilla.bugs)
(In reply to Justin Wood (:Callek) from comment #30)
> It sort of does matter though.  Because you're providing a fallback without
> taking into account the actual state of "OS" support for the plugins. E.g.
> if we don't ship plugin X to Windows XP (or android/linux) due to criteria
> that is not === to every user on the same build.

That logic needs to be captured for a specific Widevine release.

> These "updates" are "update from no GMP plugin at all" not "update only if
> plugin already exists".
> 
> For example Widevine is not installed (ever) on winXP...

We don't run the updater on Windows XP.

> While from Firefox 42 to Firefox 45 the "CDM" ver 16 is used on x64 windows,
> and the CDM ver 15 is used on "others" (Firefox 46 and up uses CDM 17)

That isn't today's problem. If this is the kind of problem that can be sorted out in less than a week then I suggest we deal with it when it is an impending problem. While there is a small risk of a panic, I'd rather not put work into it now.

> I completely do not imagine this being the only magic we keep going forward,
> as I've seen even more special requests/needs in the past.

What are you suggesting we do about it?

(In reply to Dale Harvey (:daleharvey) from comment #31)
> Hopefully Anothy is the best to answer, but if not could you forward this
> along, thanks

I can't answer that question.
Flags: needinfo?(ajones)
I no longer follow the issues that we're trying to work around here. It seems to have gone from AUS being down to some kind of certificate issue. I also don't follow why we believe that any possible workaround will be more reliable than addressing the actual issues in the first place.

Having taken on new responsibilities that require my attention it's best for someone else to ramp up on this and follow this through from a feedback and review perspective.
Flags: needinfo?(spohl.mozilla.bugs)
Assignee

Comment 36

3 years ago
hey Ben, any chance you know where the code is that takes that json and spits out the right xml? I want to make sure to match the logic in terms of the alias' etc
Flags: needinfo?(bhearsum)
(In reply to Dale Harvey (:daleharvey) from comment #36)
> hey Ben, any chance you know where the code is that takes that json and
> spits out the right xml? I want to make sure to match the logic in terms of
> the alias' etc

That's https://github.com/mozilla/balrog/blob/master/auslib/blobs/gmp.py, with a bit of help from the base class in https://github.com/mozilla/balrog/blob/master/auslib/blobs/base.py. getInnerXML is probably the part you care about.

It's a little bit more involved for GMP updates these days - we actually take multiple JSON blobs and combine them into one XML response. Doing this boils down to starting with <updates><addons>, appending the return value from getInnerXML for each blob, and then appending </addons></updates>.

Feel free to find me on IRC if you have more questions, I'm happy to help.
Flags: needinfo?(bhearsum)
(In reply to Anthony Jones (:kentuckyfriedtakahe, :k17e) from comment #34)
> (In reply to Justin Wood (:Callek) from comment #30)
> > It sort of does matter though.  Because you're providing a fallback without
> > taking into account the actual state of "OS" support for the plugins. E.g.
> > if we don't ship plugin X to Windows XP (or android/linux) due to criteria
> > that is not === to every user on the same build.
> 
> That logic needs to be captured for a specific Widevine release.

I think you need to include a Widevine release for each supoprted platform, not just the latest Widevine release. Otherwise platforms that don't support the latest version won't have any fallback, right?

> > These "updates" are "update from no GMP plugin at all" not "update only if
> > plugin already exists".
> > 
> > For example Widevine is not installed (ever) on winXP...
> 
> We don't run the updater on Windows XP.

Are you sure we don't run the GMP updater at all on Windows XP? AFAICT, XP is supposed to get both OpenH264 and EME, based on https://aus5.mozilla.org/update/3/GMP/48.0/20120222174716/WINNT_x86-msvc-x64/en-US/release/Windows_NT%205.2.2.0%20(x64)/default/default/update.xml.

> > While from Firefox 42 to Firefox 45 the "CDM" ver 16 is used on x64 windows,
> > and the CDM ver 15 is used on "others" (Firefox 46 and up uses CDM 17)
> 
> That isn't today's problem. If this is the kind of problem that can be
> sorted out in less than a week then I suggest we deal with it when it is an
> impending problem. While there is a small risk of a panic, I'd rather not
> put work into it now.

I think the problem here is that the latest version of the CDM plugin doesn't work on all the platforms that Firefox ships on, which means even if you're on the latest Firefox, you may not be able to use the latest plugin. This is the case with all of the GMP plugins we ship, and it generally gets worse over time because Firefox tends to support older platforms than the plugins.
(In reply to Ben Hearsum (:bhearsum) from comment #38)
> Are you sure we don't run the GMP updater at all on Windows XP? AFAICT, XP
> is supposed to get both OpenH264 and EME, based on
> https://aus5.mozilla.org/update/3/GMP/48.0/20120222174716/WINNT_x86-msvc-x64/
> en-US/release/Windows_NT%205.2.2.0%20(x64)/default/default/update.xml.

We do install OpenH264 on Windows XP.

> I think the problem here is that the latest version of the CDM plugin
> doesn't work on all the platforms that Firefox ships on, which means even if
> you're on the latest Firefox, you may not be able to use the latest plugin.

There are no legacy versions of anything.

> This is the case with all of the GMP plugins we ship, and it generally gets
> worse over time because Firefox tends to support older platforms than the
> plugins.

There is nothing on the horizon.
Assignee

Comment 40

3 years ago
Posted patch 1267495-1.patch (obsolete) — Splinter Review
This is the patch so far, a few minor TODO's in there mostly wanted to get the patch up partly because I lost my branch and had to redo this earlier today and to get any early feedback, currently switching the format of eme_source.json to the format given in https://bugzilla.mozilla.org/show_bug.cgi?id=1267495#c23

Ben thanks for the links to the server source, is there any chance you could clarify exactly what you would expect the format of the embedded JSON files to look like? currently I am going with your example in #23 but if they are different then would be a good time to know, cheers (sorry missed you on irc yesterday)
Flags: needinfo?(bhearsum)
(In reply to Anthony Jones (:kentuckyfriedtakahe, :k17e) from comment #39)
> (In reply to Ben Hearsum (:bhearsum) from comment #38)
> > I think the problem here is that the latest version of the CDM plugin
> > doesn't work on all the platforms that Firefox ships on, which means even if
> > you're on the latest Firefox, you may not be able to use the latest plugin.
> 
> There are no legacy versions of anything.

Looks like you're right - I must've misread something. We don't have multiple different versions shipping to Firefox 49.0. However, I think we should take into account that Widevine isn't shipping to XP, and 64-bit Vista?

> > This is the case with all of the GMP plugins we ship, and it generally gets
> > worse over time because Firefox tends to support older platforms than the
> > plugins.
> 
> There is nothing on the horizon.

OK, but this is going to happen at some point. It would behoove us to have a plan of how to handle it when it does.
Flags: needinfo?(bhearsum)
Assignee

Comment 42

3 years ago
Posted patch 1267495-2.patch (obsolete) — Splinter Review
Ok so this is a working patch for me, there are 2 open questions still to do.

1. I havent figured out how to implement |isUpdate()|, if anyone knows any more about that it would be great.

2. This is missing some platform specific rules, as mentioned by bhearsum on IRC we need to skip installing widevine on 64 bit vista + all versions of XP, now these are easy enough to hardcode however the rules may change and we want to ensure syncing them is maintainable between m-c and balrog

I think the easiest way to do this would be to add a field to the vendor JSON file that contains an encoding of these rules

{
  "vendors": {
    "ignorePlatforms": ["XP", "Vista64"],
    "gmp-widevinecdm": {
    
Then maintaining these rules can occur whenever the data is synced to M-C

I dont have an exact format in my head yet but will experiment, bhearsum does that sound like a suitable solution?

(This patch also needs tests, working on them now)
Attachment #8790671 - Attachment is obsolete: true
Flags: needinfo?(bhearsum)
Attachment #8792002 - Flags: feedback?(ksteuber)
Assignee

Comment 43

3 years ago
Also given that this is time sensitive, it is also an option to hardcode the current rules and work on a follow up that makes them more maintainable
(In reply to Dale Harvey (:daleharvey) from comment #43)
> Also given that this is time sensitive, it is also an option to hardcode the
> current rules and work on a follow up that makes them more maintainable

I think this the best option for right now.


> I think the easiest way to do this would be to add a field to the vendor
> JSON file that contains an encoding of these rules

Just in case this wasn't clear before: this isn't JSON provided by vendors, it's stuff that we put together.

> {
>   "vendors": {
>     "ignorePlatforms": ["XP", "Vista64"],
>     "gmp-widevinecdm": {
>     
> Then maintaining these rules can occur whenever the data is synced to M-C
> 
> I dont have an exact format in my head yet but will experiment, bhearsum
> does that sound like a suitable solution?

I'm not a huge fan of this idea because it violates the model that we use in Balrog. Releases contain all the metadata about one particular thing, eg: CDM 17. This includes things like Build Targets, which are the target architecture for a binary. Rules decide who should get that thing. This separation of concerns makes it much easier for humans to reason about things - we only have to look at or modify the Rules table to change who gets what update. I'm sure we can come up with some sort of way to do this, though.

However, I think we should also take a step back and think about a higher level solution. As I understand it, there is two drivers behind this bug:
1) The update server was down for a couple of days earlier this year. This was due to a security incident that caused it to be down for a couple of days (along with tons of other Mozilla systems). It was a very special case, and not one that I think we should design for. That aside, Balrog is a service with extremely high uptime.
2) Some Antivirus software man-in-the-middles the update requests, and that causes certificate pinning to fail. This is a problem that we've already solved for Firefox by signing the update packages instead of pinning (see https://bugzilla.mozilla.org/show_bug.cgi?id=1063111). I think this is a more robust (and secure) solution than pinning + fallback.
Flags: needinfo?(bhearsum)
Comment on attachment 8792002 [details] [diff] [review]
1267495-2.patch

Review of attachment 8792002 [details] [diff] [review]:
-----------------------------------------------------------------

First off, my experience with this code is fairly limited, but I will do my best here.

::: toolkit/modules/GMPInstallManager.jsm
@@ +235,5 @@
> +        // if (usedFallback && gmpAddon.isUpdate()) {
> +        //  log.info("Addon |" + gmpAddon.id + "| not installing updates based" +
> +        //           "on fallback.");
> +        //  return false;
> +        //}

For |isUpdate()|, maybe you could just see if the addon is already installed. This function looks like it might do the job:
http://searchfox.org/mozilla-central/rev/f6c298b36db67a7109079c0dd7755f329c1d58e2/toolkit/mozapps/extensions/AddonManager.jsm#2441

::: toolkit/mozapps/extensions/internal/ProductAddonChecker.jsm
@@ +241,5 @@
> +        if (platforms[target].alias) {
> +          target = platforms[target].alias;
> +        } else {
> +          details = platforms[target];
> +        }

To me it looks like the |detail = ...| line should be outside of the |else| block here.
Attachment #8792002 - Flags: feedback?(ksteuber) → feedback+
(In reply to Ben Hearsum (:bhearsum) from comment #44)
> However, I think we should also take a step back and think about a higher
> level solution. As I understand it, there is two drivers behind this bug:
> 1) The update server was down for a couple of days earlier this year. This
> was due to a security incident that caused it to be down for a couple of
> days (along with tons of other Mozilla systems). It was a very special case,
> and not one that I think we should design for. That aside, Balrog is a
> service with extremely high uptime.
> 2) Some Antivirus software man-in-the-middles the update requests, and that
> causes certificate pinning to fail. This is a problem that we've already
> solved for Firefox by signing the update packages instead of pinning (see
> https://bugzilla.mozilla.org/show_bug.cgi?id=1063111). I think this is a
> more robust (and secure) solution than pinning + fallback.

The plan here is to fix (1) and act as a stop gap for (2). Balrog will no longer need to contain rules for obsolete Firefox versions except where we do a post-release update. The downloader code doesn't handle the case where the GMP is no longer available. It would be good to propagate a 404 up to the UI to tell the user "This component is no longer available".

This is not an either/or situation. We need to fix the fallback and we need to sort out the CA pinning.
Comment on attachment 8792002 [details] [diff] [review]
1267495-2.patch

Review of attachment 8792002 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/modules/GMPInstallManager.jsm
@@ +227,5 @@
>          if (gmpAddon.isInstalled) {
>            log.info("Addon |" + gmpAddon.id + "| already installed.");
>            return false;
>          }
>  

I don't know how the AddonManager stuff works, but when the install succeeds, we write the version installed to a pref. So you can just check whether the KEY_PLUGIN_VERSION pref is non-empty to determine whether the plugin has ever been installed.
(In reply to Anthony Jones (:kentuckyfriedtakahe, :k17e) from comment #46)
> (In reply to Ben Hearsum (:bhearsum) from comment #44)
> > However, I think we should also take a step back and think about a higher
> > level solution. As I understand it, there is two drivers behind this bug:
> > 1) The update server was down for a couple of days earlier this year. This
> > was due to a security incident that caused it to be down for a couple of
> > days (along with tons of other Mozilla systems). It was a very special case,
> > and not one that I think we should design for. That aside, Balrog is a
> > service with extremely high uptime.
> > 2) Some Antivirus software man-in-the-middles the update requests, and that
> > causes certificate pinning to fail. This is a problem that we've already
> > solved for Firefox by signing the update packages instead of pinning (see
> > https://bugzilla.mozilla.org/show_bug.cgi?id=1063111). I think this is a
> > more robust (and secure) solution than pinning + fallback.
> 
> The plan here is to fix (1) and act as a stop gap for (2). Balrog will no
> longer need to contain rules for obsolete Firefox versions except where we
> do a post-release update. The downloader code doesn't handle the case where
> the GMP is no longer available. It would be good to propagate a 404 up to
> the UI to tell the user "This component is no longer available".

OK, so the idea is that we wouldn't publish anything to a just-released version from Balrog - right? We'd only start publishing when we issue an out-of-band update?
Flags: needinfo?(ajones)
Assignee

Comment 49

3 years ago
Posted patch 1267495-3.patch (obsolete) — Splinter Review
Thanks for the feedback, the |} else {| was on purpose, but added a comment to clarify, thanks for the help on |isUpdate|, I should have picked up on what was already there in |isInstalled|.

Have the patch ready for review, with:

  pref("media.gmp-manager.url.override", "https://pouchdb.com/test.html");
  pref("media.gmp.log.dump", true);
  pref("media.gmp-manager.cert.checkAttributes", false);

in there to simulate aus being down I can still start with a fresh profile and watch Netflix immediately (after ~1 minute), I have modified the tests to match the current behaviour, I somewhat suspect that it would be good to have a few more tests however I didnt want to hard code the configuration values into the tests so if you think more testing is needed then some guidance on that would be great. Thanks
Attachment #8792002 - Attachment is obsolete: true
Attachment #8793282 - Flags: review?(ksteuber)
Attachment #8793282 - Flags: feedback?(bhearsum)
Comment on attachment 8793282 [details] [diff] [review]
1267495-3.patch

Review of attachment 8793282 [details] [diff] [review]:
-----------------------------------------------------------------

I'm not comfortable reviewing this beyond confirming that the JSON blobs you've got are up-to-date...and they are :)
Attachment #8793282 - Flags: feedback?(bhearsum) → feedback+
Assignee

Comment 51

3 years ago
Comment on attachment 8793282 [details] [diff] [review]
1267495-3.patch

Sorry flagged the wrong reviewer
Attachment #8793282 - Flags: review?(ksteuber) → review?(rhelmer)
Tracking this for now, sounds like something we should be aiming to stabilize on 52-50 and then maybe consider for a possible dot release.
Comment on attachment 8793282 [details] [diff] [review]
1267495-3.patch

Review of attachment 8793282 [details] [diff] [review]:
-----------------------------------------------------------------

Looks like the contents of the JSON files were already reviewed so did not look at these in-depth. Maybe a followup to figure out an automated way to either keep these up to date or have a test fail when they fall out-of-date?

Code changes lgtm, please go ahead and land after fixing the small issues I found, no need to re-review.

::: toolkit/modules/GMPInstallManager.jsm
@@ +246,5 @@
> +        // Do not install from fallback if already installed as it
> +        // may be a downgrade
> +        if (usedFallback && gmpAddon.isUpdate) {
> +         log.info("Addon |" + gmpAddon.id + "| not installing updates based" +
> +                  "on fallback.");

Need a space between "based" and "on fallback".

::: toolkit/mozapps/extensions/internal/ProductAddonChecker.jsm
@@ +218,5 @@
> +  };
> +}
> +
> + /**
> + * If downloading from the netork fails (AUS server is down),

s/netork/network/

@@ +256,5 @@
> +  })).then(addons => {
> +
> +    // Some filters may not match this platform so
> +    // filter those out
> +    addons = addons.filter(x => { return x !== false; });

This can use the "concise body" form of the arrow function to get implicit return: addons.filter(x => x !== false)

@@ +261,5 @@
> +
> +    return {
> +      usedFallback: true,
> +      gmpAddons: addons
> +    };

This (and the equiv change in parseXML) change the return type documented in the jsdoc above GMPInstallManager.checkForAddons() - can you modify that? Right now it reads "The promise is resolved with an array of GMPAddons", it should be this new object instead:

https://dxr.mozilla.org/mozilla-central/rev/62f79d676e0e11b3ad59a5425b3ebb3ec5bbefb5/toolkit/modules/GMPInstallManager.jsm#82-87
Attachment #8793282 - Flags: review?(rhelmer) → review+
Getting this onto RyanVM's radar, as this is something we might want to bang the tires on once it lands.
Flags: needinfo?(ryanvm)
Assignee

Comment 55

3 years ago
Updated with review comments addressed and pushing to try before landing https://treeherder.mozilla.org/#/jobs?repo=try&revision=6eeadfeb35cf

Cheers for the review
Attachment #8793282 - Attachment is obsolete: true
Assignee

Comment 57

3 years ago
Added a pref to disable hitting the network during tests - https://treeherder.mozilla.org/#/jobs?repo=try&revision=7d2e37d82d82
Attachment #8794137 - Attachment is obsolete: true
Attachment #8794199 - Flags: review?(mconley)
Flags: qe-verify+
Flags: needinfo?(ryanvm)
Flags: needinfo?(andrei.vaida)
Assignee

Comment 58

3 years ago
Fixed a typo, I asked Mike for a follow on review due to changes made based on try results - https://treeherder.mozilla.org/#/jobs?repo=try&revision=0af6f387dd7c
Attachment #8794199 - Attachment is obsolete: true
Attachment #8794199 - Flags: review?(mconley)
Attachment #8794233 - Flags: review?(mconley)
Comment on attachment 8794199 [details] [diff] [review]
Bug 1267495 - Fallback to local config for fresh GMP installs. r=ksteubertch

Review of attachment 8794199 [details] [diff] [review]:
-----------------------------------------------------------------

Hey Dale. This looks really reasonable, but I have a few small concerns - see below.

Also, I think your commit message needs to be updated - I think it should be:

Bug 1267495 - Fallback to local config for fresh GMP installs. r=rhelmer, feedback=ksteuber

(I guess you can add me to the r list as well when we're through).

::: toolkit/modules/GMPInstallManager.jsm
@@ +92,5 @@
>    /**
>     * Performs an addon check.
>     * @return a promise which will be resolved or rejected.
> +   *         The promise is resolved with an object with properties:
> +   *           addons: array of GMPAddons

Should this be "gmpAddons"?

@@ +199,5 @@
>     * @return a promise which will be resolved if all addons could be installed
>     *         successfully, rejected otherwise.
>     */
>    simpleCheckAndInstall: Task.async(function*() {
> +

Let's get rid of this newline.

@@ +204,3 @@
>      let log = getScopedLogger("GMPInstallManager.simpleCheckAndInstall");
>  
> +    if (!GMPPrefs.get(GMPPrefs.KEY_UPDATE_ENABLED, true)) {

Because GMPPrefs.KEY_UPDATE_ENABLED was accidentally GMPPrefs.KEY_UPDATE_ENEABLED, I think we never enter this codepath with this patch.

After fixing the above typo, we should make sure this pref works and we enter this block.

@@ +380,5 @@
>      return this.id == "gmp-widevinecdm" || this.id.indexOf("gmp-eme-") == 0;
>    },
> +  get isUpdate() {
> +    return this.version &&
> +      GMPPrefs.get(GMPPrefs.KEY_PLUGIN_VERSION, "", this.id);

Hmmm... I don't think I fully understand this.

this.version is presumably a string or a number > 0. I'm not actually sure which - maybe somebody can educate me there.

The pref is an empty string, or some string.

And we're doing an && between them... are we certain we're not supposed to be checking that one value is greater than another?

::: toolkit/modules/GMPUtils.jsm
@@ +133,5 @@
>    KEY_CERT_CHECKATTRS:          "media.gmp-manager.cert.checkAttributes",
>    KEY_CERT_REQUIREBUILTIN:      "media.gmp-manager.cert.requireBuiltIn",
>    KEY_UPDATE_LAST_CHECK:        "media.gmp-manager.lastCheck",
>    KEY_SECONDS_BETWEEN_CHECKS:   "media.gmp-manager.secondsBetweenChecks",
> +  KEY_UPDATE_ENEABLED:          "media.gmp-manager.updateEnabled",

Typo: KEY_UPDATE_ENEABLED -> KEY_UPDATE_ENABLED.

::: toolkit/mozapps/extensions/internal/ProductAddonChecker.jsm
@@ +161,5 @@
> +  return new Promise((resolve, reject) => {
> +    let xmlHttp = new XMLHttpRequest({mozAnon: true});
> +
> +    xmlHttp.onload = function(aResponse) {
> +      resolve(JSON.parse(this.responseText));

We might want to wrap this in a try/catch and reject in the catch, since JSON.parse will throw if the returned string is invalid.

@@ +218,5 @@
> +    gmpAddons: results
> +  };
> +}
> +
> + /**

Nit: alignment.

@@ +225,5 @@
> + */
> +function downloadLocalConfig() {
> +
> +  return Promise.all(LOCAL_EME_SOURCES.map(conf => {
> +    return downloadJSON(conf.src).then(addons => {

I guess stuff can still go wrong in here. We should probably have reject handlers, and maybe log the hell out of this.

@@ +257,5 @@
> +  })).then(addons => {
> +
> +    // Some filters may not match this platform so
> +    // filter those out
> +    addons = addons.filter(x => x !== false);

Maybe where we set "false" up above, add a comment saying that we'll filter them out later from the Array.
Attachment #8794199 - Attachment is obsolete: false
Comment on attachment 8794233 [details] [diff] [review]
Bug 1267495 - Fallback to local config for fresh GMP installs. r=ksteubertch

Review of attachment 8794233 [details] [diff] [review]:
-----------------------------------------------------------------

I think a lot of my review comments from https://bugzilla.mozilla.org/show_bug.cgi?id=1267495#c59 still apply - can you give those a look and let me know if I'm off my rocker?

::: toolkit/modules/GMPInstallManager.jsm
@@ +92,5 @@
>    /**
>     * Performs an addon check.
>     * @return a promise which will be resolved or rejected.
> +   *         The promise is resolved with an object with properties:
> +   *           addons: array of GMPAddons

As before, shouldn't this be "gmpAddons"?

@@ +378,5 @@
>    },
>    get isEME() {
>      return this.id == "gmp-widevinecdm" || this.id.indexOf("gmp-eme-") == 0;
>    },
> +  get isUpdate() {

Same question from https://bugzilla.mozilla.org/show_bug.cgi?id=1267495#c59 applies here.
Attachment #8794233 - Flags: review?(mconley) → review-
Assignee

Comment 61

3 years ago
> Hmmm... I don't think I fully understand this.
> this.version is presumably a string or a number > 0. I'm not actually sure which - maybe somebody can educate me there.
> The pref is an empty string, or some string.
> And we're doing an && between them... are we certain we're not supposed to be checking that one value is greater than another?

So the this.version check was copied over from isInstalled, not sure why its particularly necessary in either case but figured I would follow the existing code. As for the check Chris said we want to not install from fallback in the case that it was previously installed so we arent doing a version check here, just a check that any version at all has been installed.

> We might want to wrap this in a try/catch and reject in the catch, since JSON.parse will throw if the 
> returned string is invalid.

If JSON.parse throws inside a promise then it will be rejected, any reason to not use Promise semantics?

> Typo: KEY_UPDATE_ENEABLED -> KEY_UPDATE_ENABLED.

Heh apologies I had fixed that and r?'d you on the fixed patch but didnt notify you, sorry. The tests failed with this code being broken which is good, however we dont have full end to end tests that the plugins get installed, right now verifying manually

> I guess stuff can still go wrong in here. We should probably have reject handlers, and maybe log 
> the hell out of this.

Theres a reject handler in GMPInstallManager.simpleCheckAndInstall, but will add some more logging 

The rest of the issues I have addressed.

One more issue, these are failing on android, Geoff I added a pref |user_pref("media.gmp-manager.updateEnabled", false)| to testing/profiles/prefs_general.js in order for the tests to not trigger connections when running this code however this pref does not look like it is being picked up on android, do you know how to make it do so? Thanks
Flags: needinfo?(gbrown)
Attachment #8794323 - Flags: review?(mconley)
Assignee

Updated

3 years ago
Attachment #8794233 - Attachment is obsolete: true
Assignee

Updated

3 years ago
Attachment #8794199 - Attachment is obsolete: true
(In reply to Dale Harvey (:daleharvey) from comment #61)
> Created attachment 8794323 [details] [diff] [review]
> Bug 1267495 - Fallback to local config for fresh GMP installs
>
> So the this.version check was copied over from isInstalled, not sure why its
> particularly necessary in either case but figured I would follow the
> existing code. As for the check Chris said we want to not install from
> fallback in the case that it was previously installed so we arent doing a
> version check here, just a check that any version at all has been installed.
>

Okay, I understand. Can you add a comment above the definition of isUpdate laying that out? (That isUpdate returns true if a version of this GMP has been installed before).

> > We might want to wrap this in a try/catch and reject in the catch, since JSON.parse will throw if the 
> > returned string is invalid.
> 
> If JSON.parse throws inside a promise then it will be rejected, any reason
> to not use Promise semantics?

I'd forgotten that throwing inside a Promise rejects automatically! Thanks for reminding me!

> 
> Heh apologies I had fixed that and r?'d you on the fixed patch but didnt
> notify you, sorry. The tests failed with this code being broken which is
> good, however we dont have full end to end tests that the plugins get
> installed, right now verifying manually

No worries. :)

> 
> Theres a reject handler in GMPInstallManager.simpleCheckAndInstall, but will
> add some more logging 

Thanks!

> 
> One more issue, these are failing on android, Geoff I added a pref
> |user_pref("media.gmp-manager.updateEnabled", false)| to
> testing/profiles/prefs_general.js in order for the tests to not trigger
> connections when running this code however this pref does not look like it
> is being picked up on android, do you know how to make it do so? Thanks

From gbrown in IRC:

"daleharvey: I would expect prefs_general.js to be used for android tests...certainly reftests
2:13 PM I can't think of why your change would not apply to android"

Which tests are you trying to set this pref for?
I have confirmed that changes to prefs_general.js do not affect Android reftests; prefs_general.js is used by Android mochitests. I don't see any code that attempts to use prefs_general.js for reftests on any platform. I'm a little hazy on our test profile preferences story in general now, but I think you may need to update both prefs_general.js and layout/tools/reftest/reftest-preferences.js to cover all tests.
Flags: needinfo?(gbrown)
Comment on attachment 8794323 [details] [diff] [review]
Bug 1267495 - Fallback to local config for fresh GMP installs

Review of attachment 8794323 [details] [diff] [review]:
-----------------------------------------------------------------

Whoops, forgot to set the flag here (review is in comment 62).
Attachment #8794323 - Flags: review?(mconley) → review-
(In reply to Ben Hearsum (:bhearsum) from comment #48)
> OK, so the idea is that we wouldn't publish anything to a just-released
> version from Balrog - right? We'd only start publishing when we issue an
> out-of-band update?

It looks like Maire's team want to push updates to OpenH264 to all channels whereas my team have been letting Widevine ride the trains and follow normal uplifting rules. I suspect we all need to have a proper conversation about this rather than trying to figure it out in bugzilla.
Flags: needinfo?(ajones) → needinfo?(mreavy)
My team and I actually don't feel strongly about having non-security OpenH264 updates being associated with a particular release (i.e. "ride the trains") vs not.  We actually talked about changing it to "ride the trains", but there was lots of overhead to making the switch and everyone was busy.  To be frank, releasing a plugin update currently is a bunch of overhead for me right now,  and most of that would go away if it just rode the trains.  So if anything, I'm in favor modulo we have a lot on our plates now; so if it requires a lot of time from me or my team members, I'd want to carefully choose "when".

I'm happy to have this conversation offline (or off of bugzilla), but I wanted to capture the tl;dr here.
Flags: needinfo?(mreavy)
Assignee

Comment 67

3 years ago
Ok, as far as I can tell it looks like any failures on https://treeherder.mozilla.org/#/jobs?repo=try&revision=27f26b78422f&selectedJob=28057082 are unrelated to my patch and its all looking pretty green.

I applied the |media.gmp-manager.updateEnabled| pref to anywhere that defined |media.gmp-manager.url.override| to ensure the tests dont hit the network.

I also found that XPIProvider.jsm was a consumer of |ProductAddonChecker.getProductAddonList| so I moved the pref that disables hitting the network into that function so it is shared with GMPInstallManager.

Also clarified the previous review points + added comments, Cheers
Attachment #8794323 - Attachment is obsolete: true
Attachment #8794871 - Flags: review?(mconley)
Comment on attachment 8794871 [details] [diff] [review]
Bug 1267495 - Fallback to local config for fresh GMP installs

Review of attachment 8794871 [details] [diff] [review]:
-----------------------------------------------------------------

Hey Dale,

This looks great - there's a removal I'm not sure about though. See below.

::: toolkit/modules/GMPInstallManager.jsm
@@ +379,5 @@
> +   */
> +  get isUpdate() {
> +    return this.version &&
> +      GMPPrefs.get(GMPPrefs.KEY_PLUGIN_VERSION, false, this.id);
> +  }

Nit - convention seems to be to add a trailing comma, so let's put one here as well.

::: toolkit/mozapps/extensions/internal/ProductAddonChecker.jsm
@@ +30,5 @@
>  Cu.import("resource://gre/modules/FileUtils.jsm");
>  Cu.import("resource://gre/modules/NetUtil.jsm");
>  Cu.import("resource://gre/modules/osfile.jsm");
> +/*globals GMPPrefs */
> +Cu.import("resource://gre/modules/GMPUtils.jsm");

It might be worth making this a lazy module getter as well.

::: toolkit/mozapps/extensions/test/xpcshell/test_system_update.js
@@ -288,5 @@
>   * finalState: An optional property, the expected final state of system add-ons,
>   *             if missing the test condition's initialState is used.
>   */
>  const TESTS = {
> -  // Test that an error response does nothing

What's this removal all about? I don't think test_system_update.js has anything to do with the GMP fallback... is this accidental?
Attachment #8794871 - Flags: review?(mconley) → review-
Assignee

Comment 69

3 years ago
> What's this removal all about? I don't think test_system_update.js has anything to do with the 
> GMP fallback... is this accidental?

Not accidental. test_system_update.js calls XPIProvider.updateSystemAddons which calls ProductAddonChecker.getProductAddonList(url), previously this call would throw / reject with the invalid network, now it succeeds (with an empty addon list due to being within tests)
Comment on attachment 8794871 [details] [diff] [review]
Bug 1267495 - Fallback to local config for fresh GMP installs

Review of attachment 8794871 [details] [diff] [review]:
-----------------------------------------------------------------

(In reply to Dale Harvey (:daleharvey) from comment #69)
> > What's this removal all about? I don't think test_system_update.js has anything to do with the 
> > GMP fallback... is this accidental?
> 
> Not accidental. test_system_update.js calls XPIProvider.updateSystemAddons
> which calls ProductAddonChecker.getProductAddonList(url), previously this
> call would throw / reject with the invalid network, now it succeeds (with an
> empty addon list due to being within tests)

Ah, thanks for clearing that up! With the above nits fixed, r=me! Thanks so much!
Attachment #8794871 - Flags: review- → review+

Comment 71

3 years ago
Pushed by dharvey@mozilla.com:
https://hg.mozilla.org/integration/fx-team/rev/7c1929f35c5d
Fallback to local config for fresh GMP installs. r=mconley, r=rhelmer, feedback=ksteuber

Comment 72

3 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/7c1929f35c5d
Status: NEW → RESOLVED
Last Resolved: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 52
Thanks Dale.

k17e - I understood there was some level of urgency behind this bug. Should we be attempting to uplift this as far as beta?

Hey Dale - do you have some instructions for SV folks who might want to test this work manually? Is there a set of steps they can follow to simulate a downed AUS to make sure they can still download the GMPs?
Flags: needinfo?(dale)
Flags: needinfo?(ajones)
Assignee

Comment 74

3 years ago
To reproduce how I tested manually, I set

pref("media.gmp-manager.cert.checkAttributes", false);
pref("media.gmp-manager.url.override", "https://pouchdb.com/test.html"); (any url that 404's / errors should do)

With that I start from a fresh profile, 1 minute after booting it should attempt the install and so from ~1m30s I check that I can watch something on netflix or not
Flags: needinfo?(dale)
Great, thanks Dale. Andrei, is that enough information? (comment 74)
(In reply to Mike Conley (:mconley) - (needinfo me!) from comment #75)
> Great, thanks Dale. Andrei, is that enough information? (comment 74)

I believe so, at least for now. Dale, Mike, thank you for taking the time to provide this information.

Bogdan is looking at this as we speak -- we'll follow up with test results as soon as possible.
Flags: needinfo?(andrei.vaida) → needinfo?(bogdan.maris)
Hi Dale, should we uplift this fix to Aurora51 and Beta50?
Flags: needinfo?(dale)
We should uplift it when we're happy that it is working. See bug 1306516.
Flags: needinfo?(ajones)
Assignee

Comment 79

3 years ago
Yeh its not my call whether to uplift but if theres any work needed to uplift then can do, one thing would like to check is whether the blob versions are dependent on fx versions / we need different configs for uplifts Ben?
Flags: needinfo?(dale) → needinfo?(bhearsum)
(In reply to Dale Harvey (:daleharvey) from comment #79)
> Yeh its not my call whether to uplift but if theres any work needed to
> uplift then can do, one thing would like to check is whether the blob
> versions are dependent on fx versions / we need different configs for
> uplifts Ben?

If we don't care about ESR, the only special cases I see are Windows 7 (64-bit) and Windows Vista (32 & 64-bit) do not support Widevine at all.

If we're backporting to ESR, there's additional restrictions:
* Widevine is not supported at all on the current ESR (45.0)
* 64-bit versions of Windows use CDM-16, except for NT 5.* (XP, 2000, 2003) which do not support the CDM
* All other platforms (Linux, Mac, versions of Windows not mentioned above) use CDM-15
Flags: needinfo?(bhearsum)
Reporter

Updated

3 years ago
Depends on: 1307997
(In reply to Dale Harvey (:daleharvey) from comment #74)
> To reproduce how I tested manually, I set
> 
> pref("media.gmp-manager.cert.checkAttributes", false);
> pref("media.gmp-manager.url.override", "https://pouchdb.com/test.html");
> (any url that 404's / errors should do)
> 
> With that I start from a fresh profile, 1 minute after booting it should
> attempt the install and so from ~1m30s I check that I can watch something on
> netflix or not

The netflix accounts we at Softvision have are not working anymore (for some time now), is there another website that I can use? Or if someone who has access to a netflix account could verify this instead on Nightly would be greatly appreciated.
Flags: needinfo?(bogdan.maris) → needinfo?(cpearce)
Flags: needinfo?(cpearce) → needinfo?(dale)
(In reply to Bogdan Maris, QA [:bogdan_maris] from comment #81)
> The netflix accounts we at Softvision have are not working anymore (for some
> time now), is there another website that I can use? Or if someone who has
> access to a netflix account could verify this instead on Nightly would be
> greatly appreciated.

There are Widevine test videos on the Shaka Player demo page (though I just hit crash bug 1273372 when trying to load the page in 64-bit Nightly 52):

http://shaka-player-demo.appspot.com/demo/
(In reply to Chris Peterson [:cpeterson] from comment #82)
> (In reply to Bogdan Maris, QA [:bogdan_maris] from comment #81)
> > The netflix accounts we at Softvision have are not working anymore (for some
> > time now), is there another website that I can use? Or if someone who has
> > access to a netflix account could verify this instead on Nightly would be
> > greatly appreciated.
> 
> There are Widevine test videos on the Shaka Player demo page (though I just
> hit crash bug 1273372 when trying to load the page in 64-bit Nightly 52):
> 
> http://shaka-player-demo.appspot.com/demo/

Thanks for this Chris.
Based on the instructions from comment 74 I was able to confirm that Widevine is properly installed even if I change those two prefs, but only using 32-bit builds (Windows 10, 7 and 8.1). Latest Nightly in 64-bit mode did not had problems in installing Widevine.

For some reason Widevine does not install in 64-bit builds on Windows, I also commented in bug 1306516, which is resolved fixed.

I will not say this is verified until I can also check with 64-bit builds as well.
Flags: needinfo?(dale)
Assignee

Comment 84

3 years ago
There has been a follow up bug so we should wait verifying until the issues are dealt with https://bugzilla.mozilla.org/show_bug.cgi?id=1309463
50 is in RC mode, too late to fix this in 50
We could still take this uplift in 51 along with the patch from bug 1309463.
Flags: needinfo?(dale)
Assignee

Comment 87

3 years ago
answered in https://bugzilla.mozilla.org/show_bug.cgi?id=1309463
Flags: needinfo?(dale)
Getting late in the 51 cycle, but we could still potentially take a patch for 51.
Also verified on Firefox 52 beta 3 on Windows 8.1 32bit and Windows 10 64bit. Leaving the status of the bug as Resolved for now until the fix hits Fx 51.
Removing qe-verify flag since the QA work on this bug has been completed.
Flags: qe-verify+
I'm guessing we don't need this ni? from jet now...
Flags: needinfo?(bugs)
You need to log in before you can comment on or make changes to this bug.