Closed Bug 1230206 Opened 9 years ago Closed 9 years ago

Disable Fennec FHR uploads for Fx43, 44 and 45

Categories

(Android Background Services Graveyard :: Firefox Health Report Service, defect)

Firefox 43
All
Android
defect
Not set
normal

Tracking

(firefox43+ fixed, firefox44+ fixed, firefox45+ fixed, firefox46+ fixed, fennec43+)

RESOLVED FIXED
Firefox 46
Tracking Status
firefox43 + fixed
firefox44 + fixed
firefox45 + fixed
firefox46 + fixed
fennec 43+ ---

People

(Reporter: Margaret, Assigned: rnewman)

References

Details

Attachments

(4 files)

Splitting this off of bug 1183320.
Flags: needinfo?(rnewman)
From bug 1183320:

We should still keep the data reporting pref and corresponding UI around, since we use that for opting out of Adjust data collection.

So, to *just* disable the uploading of any data (not the collecting locally), what's the simplest thing we can do? Are there a few lines of code we can comment out?
Unclear if this code path is tested, but it looks like we can just flip `HealthReportConstants.UPLOAD_FEATURE_DISABLED` [1] to true.

afaict, the only place HealthReportUploadService is triggered is in the HealthReportBroadcastService [2], which only occurs through that code path.

[1]: https://mxr.mozilla.org/mozilla-central/source/mobile/android/services/src/main/java/org/mozilla/gecko/background/healthreport/HealthReportBroadcastService.java#124
[2]: https://mxr.mozilla.org/mozilla-central/search?string=healthreportuploadserv&find=mobile%2Fandroid&findi=&filter=^[^\0]*%24&hitlimit=&tree=mozilla-central
That's the idea, yeah. Services crew doin' flags rite.
Flags: needinfo?(rnewman)
The patch is untested because I'm trying to remember how! I can either:
  * Run the services' tests
  * Use an addon I can't find to force the services to run immediately

I will test before landing, but I'm pretty confident it will work by looking at the code.

That being said, it might be a good idea to fix (remove?) the tests associated with FHR upload to ensure the tests, which don't run in automation afaik, continue to pass (/me crosses fingers they are still passing).
Comment on attachment 8695423 [details]
MozReview Request: Bug 1230206 - Disable FHR uploads while preserving deletion functionality. r=rnewman

https://reviewboard.mozilla.org/r/27103/#review24505

Testing notes:

* Make sure that _unchecking_ the FHR checkbox in Settings still sends the Bagheera delete request!
* Figure out how to get FHR to upload on a current build. That might happen on first run, or 24 hours later. It might happen if you uncheck then re-check the pref in Settings.
* Failing that, https://github.com/ncalexan/healthreport-upload-test-addon
Attachment #8695423 - Flags: review?(rnewman) → review+
(In reply to Michael Comella (:mcomella) from comment #6)
>   * Use an addon I can't find to force the services to run immediately

Found it: https://github.com/ncalexan/healthreport-upload-test-addon

But it inexplicably crashes the browser when installed. I don't see anything obvious in the logs. I found:

12-03 11:56:19.481 W/GeckoConsole(28092): [JavaScript Warning: "expression closures are deprecated" {file: "resource://basicnative-mobile/jni.jsm" line: 785 column: 26 source: "var sigRegex = function() /\[*([VZBCSIJFD]|L([^.\/;]+(\/[^.\/;]+)*);)/g;
12-03 11:56:19.481 W/GeckoConsole(28092): "}]
12-03 11:56:20.506 I/Choreographer(28092): Skipped 52 frames!  The application may be doing too much work on its main thread.
12-03 11:56:20.641 W/ActivityManager( 3036): destPackageName = null
12-03 11:56:20.801 D/AndroidRuntime(28226):
12-03 11:56:20.801 D/AndroidRuntime(28226): >>>>>> AndroidRuntime START com.android.internal.os.RuntimeInit <<<<<<

If it was the fault of the patch, I'd expect some obvious Java stack traces. Guess I'll look into running the tests.
(In reply to Richard Newman [:rnewman] from comment #7)
> * Make sure that _unchecking_ the FHR checkbox in Settings still sends the
> Bagheera delete request!

If we're disabling uploading health reports and no longer using the data in FHR (afaik), why is this necessary? The data will be deleted when the servers are taken offline, right?
Flags: needinfo?(rnewman)
I'm leery of removing the ability for users to proactively delete their data, even if only for a month. IANAL.
Flags: needinfo?(rnewman)
We delete inside SubmissionPolicy.tick [1] but the SubmissionPolicy is only created and ticked when we handle an Intent in HealthReportUploadService [2], which we don't when we change this constant [3]. Alas.

Margaret, is it worth my time to repair the code path to ensure deletion? Richard suggested asking legal. If it's necessary for users to proactively delete their data and we're not using the data anymore, it sounds like it'd be better worth my time for us to delete it all now.

[1]: https://mxr.mozilla.org/mozilla-central/source/mobile/android/services/src/main/java/org/mozilla/gecko/background/healthreport/upload/SubmissionPolicy.java#111
[2]: https://mxr.mozilla.org/mozilla-central/source/mobile/android/services/src/main/java/org/mozilla/gecko/background/healthreport/upload/HealthReportUploadService.java#87
[3]: https://mxr.mozilla.org/mozilla-central/source/mobile/android/services/src/main/java/org/mozilla/gecko/background/healthreport/upload/HealthReportUploadService.java#52
Flags: needinfo?(margaret.leibovic)
I just waited it out and added Log statements to ensure the patch works – this is the patch I used.

So the only thing left to do is decide whether or not we want the "delete" action to work.
Had a meeting w/ Margaret & Richard – instead of using the flag, we can `return` before we upload the document, keeping the deletion mechanism (and other code paths) in place.
Flags: needinfo?(margaret.leibovic)
Blocked bug 1183320 is tracking 43 so so should this, I think.
tracking-fennec: --- → 43+
Comment on attachment 8695423 [details]
MozReview Request: Bug 1230206 - Disable FHR uploads while preserving deletion functionality. r=rnewman

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/27103/diff/1-2/
Attachment #8695423 - Attachment description: MozReview Request: Bug 1230206 - Disable FHR upload. r=rnewman → MozReview Request: Bug 1230206 - Disable FHR uploads while preserving deletion functionality. r=rnewman
Comment on attachment 8695423 [details]
MozReview Request: Bug 1230206 - Disable FHR uploads while preserving deletion functionality. r=rnewman

https://reviewboard.mozilla.org/r/27103/#review24669

I think the simplest fix for this is to change

```
    boolean uploadEnabled = intent.getBooleanExtra("uploadEnabled", false);
```

to

```
    boolean uploadEnabled = false;   // intent.getBooleanExtra("uploadEnabled", false);
```

in `HealthReportUploadService.java`. That should be all you need in order to preserve deletions -- it's exactly the state we'd reach if the user unchecked the box in Settings.
Attachment #8695423 - Flags: review?(rnewman)
Michael, did you give that a try? If this is going in 43 we shouldn't let it drop.
Flags: needinfo?(michael.l.comella)
(In reply to Richard Newman [:rnewman] from comment #17)
> Michael, did you give that a try? If this is going in 43 we shouldn't let it
> drop.

Sorry, I got caught up in orlando and didn't have a chance to try this. I spoke with Margaret today and she feels it's too late to safely land this in 43 and proposed disabling FHR (deletions included) in 44 – my first patch – since the servers will likely be down by then.

Richard, do you see any issues with the servers going down during the 43 cycle and thus us sending pings that are never returned? Assuming the error handling code is correct, it seems fairly safe with exception to more attempted uploads (3 per day, iirc) which can translate to more network and CPU usage, in theory.
Flags: needinfo?(michael.l.comella) → needinfo?(rnewman)
My perspective is that it's safer to flip the uploadEnabled bool for all users -- which takes the existing well-tested code path for when users opt out -- than to subject millions of users to a thorough test of our error recovery code.

I particularly feel this way because I don't know what the decommissioning flow looks like yet -- whether we'll have a 503inator, a quiet-200inator, a timeout, or a DNS failure! Preventing clients from making upload requests altogether seems safest.

I'm happy to take this over if you're swamped or traveling.
Flags: needinfo?(rnewman)
Spoke w/ rnewman on irc – he'll take it over for time constraints and I'll review tonight.
Assignee: michael.l.comella → rnewman
Liz, is there a chance of this making it into 43 if a patch is reviewed by EOD?

In my opinion it's totally safe -- it effectively flips the "opt out" FHR checkbox for every user -- but we're very close to the merge.

I only ask because I think there's more risk in _not_ doing this than in doing it; see Comment 19.

If you think there will be a 43.0.1 within a few weeks, I'd take that option.
Status: NEW → ASSIGNED
Flags: needinfo?(lhenry)
OS: Unspecified → Android
Hardware: Unspecified → All
Version: unspecified → Firefox 43
Manual testing results below. I've added logging to make behavior abundantly clear.

Old behavior was like this:

12-11 12:59:47.240 I/GeckoHealth(23879): fennec_rnewman :: SubmissionPolicy :: Uploading FHR document with ID cc3744ea-8f42-4891-866f-6da05dec5e5b


New behavior is this (I think I'm being rate limited, which is why my uploads are being rejected and the deletions fail):

12-11 13:00:08.335 I/GeckoHealth(24305): fennec_rnewman :: SubmissionPolicy :: Upload disabled and nothing to delete.
12-11 13:00:08.350 I/GeckoHealth(24305): fennec_rnewman :: SubmissionPolicy :: Got soft failure at 1449867607470 deleting obsolete document with id cc3744ea-8f42-4891-866f-6da05dec5e5b: Got status 500 from server. Trying again later.
12-11 13:00:08.510 I/GeckoHealth(24305): fennec_rnewman :: SubmissionPolicy :: Got soft failure at 1449867608342 deleting obsolete document with id cc3744ea-8f42-4891-866f-6da05dec5e5b: Got status 500 from server. Trying again later.


followed evermore by this:

12-11 13:00:20.860 I/GeckoHealth(24305): fennec_rnewman :: SubmissionPolicy :: Upload disabled and nothing to delete.


The success deletion case looks like this:

12-11 12:23:03.040 I/GeckoHealth( 5886): fennec_rnewman :: SubmissionPolicy :: Deleted an obsolete document at 1449865382494.
No bug - Fix logging error in SwitchBoard.
Bug 1230206 - Disable upload of new FHR documents. r?mcomella
Attachment #8697693 - Flags: review?(michael.l.comella)
Comment on attachment 8697692 [details]
MozReview Request: No bug - Fix logging error in SwitchBoard.

https://reviewboard.mozilla.org/r/27693/#review24911

Given HTTP schemes, this correction makes sense for me.

Was this intended to land with this patch though?
Attachment #8697692 - Flags: review+
Comment on attachment 8697693 [details]
MozReview Request: Bug 1230206 - Disable upload of new FHR documents. r?mcomella

https://reviewboard.mozilla.org/r/27695/#review24913

It looks like this patch acts like the preference has been flipped off, causing every client to request a deletion from the server. Is this what we want? It sounds like it'll cause all clients to perform data transmission via deletions, which may also fail to reach the server (so in the worst case it can be just as bad as keeping uploading in). Also, do we actually want to delete all of the data from the server or is this something we were planning on archiving?

It's worth noting that in the best case, once the deletions are complete, the clients will stop transmitting, which is what we want.

I think we actually want to stop uploading but keep the ability for users to delete their data by turning off the preference, as my second patch does.
Attachment #8697693 - Flags: review?(michael.l.comella)
(In reply to Michael Comella (:mcomella) from comment #25)

> Was this intended to land with this patch though?

I wasn't going to go through the work of creating a Mercurial bookmark and a MozReview request for a No Bug no-review-required fix :)


(In reply to Michael Comella (:mcomella) from comment #26)

> It sounds like it'll cause all clients to perform data transmission
> via deletions, which may also fail to reach the server (so in the worst case
> it can be just as bad as keeping uploading in). 

To be clear: every client deletes the previous document on upload via an HTTP header. Now we simply stop uploading the replacement document.

The upload process is continual: we will try three times, every day, forever.

By contrast, the deletion process converges:

* If it succeeds, we'll never make another request.
* If it hard-fails, we'll never make another request.
* If it soft-fails 21 times (HealthReportConstants.DELETION_ATTEMPTS_PER_OBSOLETE_DOCUMENT_ID), we'll never make another request.

If we get this in 43, almost every client will succeed in deletion. If the service is turned off such that we trigger a hard error (which I think is every status code less than 500), we won't try again.


> Also, do we actually want to
> delete all of the data from the server or is this something we were planning
> on archiving?

I have no idea who is shepherding the process of decommissioning -- and, to a point, that I don't know who they are implies that they don't care.

I know there are snapshots of FHR data, and I know that a gap in collection is expected, so I don't think this is a big area of concern.
Comment on attachment 8697693 [details]
MozReview Request: Bug 1230206 - Disable upload of new FHR documents. r?mcomella

mcomella r+'ed via SMS.
Attachment #8697693 - Flags: review+
Katie: could you opine on whether it's acceptable for all active Android FHR documents to be deleted via ordinary Bagheera deletion requests prior to FHR being shut down?
Flags: needinfo?(kparlante)
Summary: Disable FHR uploads for Fx43 → Disable Fennec FHR uploads for Fx43
Richard - we are planning to ship 43.0 build 1 tomorrow. It is too late for another build unless we want to delay the release. Unless this seriously affects stability or security I don't want to do that.  Does this make Firefox for Android unusable? If so then we shouldn't ship and we should go ahead with an RC2 build and delay the release. If we can live with this for a few days, then I'd prefer to wait and do a 43.0.1 Firefox for Android release.
Flags: needinfo?(lhenry)
https://hg.mozilla.org/integration/fx-team/rev/7e6ca87a61bbfa4d0e907a0272f0ef3db81360ee
Bug 1230206 - Disable upload of new FHR documents. r=mcomella, a=kwierso/android-only on a CLOSED TREE
(In reply to Richard Newman [:rnewman] from comment #29)
> Katie: could you opine on whether it's acceptable for all active Android FHR
> documents to be deleted via ordinary Bagheera deletion requests prior to FHR
> being shut down?

mreid's opinion is going to be more useful here
Flags: needinfo?(kparlante) → needinfo?(mreid)
(In reply to Richard Newman [:rnewman] from comment #19)
> My perspective is that it's safer to flip the uploadEnabled bool for all
> users -- which takes the existing well-tested code path for when users opt
> out -- than to subject millions of users to a thorough test of our error
> recovery code.
> 
> I particularly feel this way because I don't know what the decommissioning
> flow looks like yet -- whether we'll have a 503inator, a quiet-200inator, a
> timeout, or a DNS failure! Preventing clients from making upload requests
> altogether seems safest.
Our plan on the pipeline team is to quietly 200. See Bug 1227279.
(In reply to Mark Reid [:mreid] from comment #33)

> Our plan on the pipeline team is to quietly 200. See Bug 1227279.

Great, thanks for the pointer!

Could you also confirm that it's OK for all clients to issue deletes for their existing documents? See Comment 29 and Comment 27.
Depends on: 1227279
(In reply to Richard Newman [:rnewman] from comment #29)
> Katie: could you opine on whether it's acceptable for all active Android FHR
> documents to be deleted via ordinary Bagheera deletion requests prior to FHR
> being shut down?

My current understanding is that it would be acceptable if everyone effectively opted out of Android FHR. The only caveat is that this should clearly *not* opt people out of Unified Telemetry :)

Per my previous comment, we intend to implement a happy little service that just returns 200s so that we can minimally monitor that the FHR request volume declines as expected over time.  It will not retain the submissions themselves, just the basic request counts.

This will also ensure that anyone who does not upgrade to 43 won't need to continually exercise the FHR "upload failure" case.
Flags: needinfo?(mreid)
(In reply to Mark Reid [:mreid] from comment #35)

> My current understanding is that it would be acceptable if everyone
> effectively opted out of Android FHR. The only caveat is that this should
> clearly *not* opt people out of Unified Telemetry :)

Yeah, this is Android-side only, and doesn't flip the pref itself.

The behavior after this patch lands and ships is that every client will send a single Bagheera delete request for the last document they uploaded, and if that's answered with a 200 will never touch the server again, exactly as if they'd opted out.

When we start sending UT pings (Bug 1220177), those will obey the user's previous FHR (opt-out) preference.

The reason I ask is that this behavior is not the same as if you decommissioned the service first -- you'll receive deletions for about half of the stored FHR v3 documents by the time you pull the plug, so you'll be reliant on snapshots before this change reaches release if you want to do any analysis on that dataset.


> Per my previous comment, we intend to implement a happy little service that
> just returns 200s so that we can minimally monitor that the FHR request
> volume declines as expected over time.  It will not retain the submissions
> themselves, just the basic request counts.

Yup.

For Android we expect this to fall off in line with our update rate, which (very roughly) means 50% of our installs within two weeks.


> This will also ensure that anyone who does not upgrade to 43 won't need to
> continually exercise the FHR "upload failure" case.

Always appreciated :D
OK, we are now aiming to land this on m-r, and once it does, I will start a 43.0 RC build 2. We are planning to hold back the 43 mobile release to fix this. I agree with Richard, best not to risk subjecting users to unnecessary and excessive data upload attempts.  Richard will land this on m-r and watch the tree.
I have this patch staged to land, but I'm wrestling with Android tooling to get a release build that I can test before I land it on m-r.

If I continue to fail I'm just going to double-check the fx-team build from treeherder and cross my fingers.
https://hg.mozilla.org/mozilla-central/rev/7e6ca87a61bb
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 46
Any idea on how to test this? Thanks!
Flags: needinfo?(rnewman)
Comment on attachment 8697693 [details]
MozReview Request: Bug 1230206 - Disable upload of new FHR documents. r?mcomella

Approval Request Comment
[Feature/regressing bug #]:
  Decommissioning of FHR.

[User impact if declined]:
  Unnecessary network traffic.

[Describe test coverage new/current, TreeHerder]:
  Manually tested.

[Risks and why]: 
  Already in mozilla-release. Low risk, particularly compared to the alternative.

[String/UUID change made/needed]:
  None.
Attachment #8697693 - Flags: approval-mozilla-beta?
Attachment #8697693 - Flags: approval-mozilla-aurora?
(In reply to Mihai Pop from comment #41)
> Any idea on how to test this? Thanks!

* Set logging.

  adb shell setprop log.tag.GeckoHealth VERBOSE
  adb shell setprop log.tag.SubmissionPolicy VERBOSE
  adb shell setprop log.tag.HealthReportUploadService VERBOSE
  
* Install the earlier version if you haven't already. Wait 24 hours for an FHR document upload to occur, if you haven't already. If you already have a long-term install, you can skip this step.

You'll see logging like:

    Health report upload feature is compile-time enabled; handling upload intent.

You'll see no logging on success, because all of our success logging is guarded by PII flags.


* Install the new version. Wait 24 hours. You'll see logging like:

    Upload disabled. Deleting obsolete document.


* Wait 24 more hours. You'll see logging like:

    Upload disabled and nothing to delete.


We had a test add-on to speed this along (<https://github.com/ncalexan/healthreport-upload-test-addon>), but it's not signed, so it no longer works.
Flags: needinfo?(rnewman)
Comment on attachment 8697693 [details]
MozReview Request: Bug 1230206 - Disable upload of new FHR documents. r?mcomella

Given that this fix was taken in FF43 yesterday before going live, it definitely should be uplifted to Beta44 and Aurora45.
Attachment #8697693 - Flags: approval-mozilla-beta?
Attachment #8697693 - Flags: approval-mozilla-beta+
Attachment #8697693 - Flags: approval-mozilla-aurora?
Attachment #8697693 - Flags: approval-mozilla-aurora+
Summary: Disable Fennec FHR uploads for Fx43 → Disable Fennec FHR uploads for Fx43, 44 and 45
Probably a good idea to track this for all releases since we changed its state so quickly. Just to make sure we remember the followup uplifts!
I'm hitting conflicts trying to apply this to beta 44. I believe it's because we don't have bug 1107811. Can we either get a rebased patch in here to work around it, or get bug 1107811 uplifted first?
Flags: needinfo?(rnewman)
Flags: needinfo?(nalexander)
(In reply to Wes Kocher (:KWierso) from comment #46)
> I'm hitting conflicts trying to apply this to beta 44. I believe it's
> because we don't have bug 1107811. Can we either get a rebased patch in here
> to work around it, or get bug 1107811 uplifted first?

KWierso and I discussed in IRC -- he'll use `hg graft -r 082d21137c0e` to backport the patch from mozilla-release.
Flags: needinfo?(rnewman)
Flags: needinfo?(nalexander)
N.B., either the fx-team or the m-r commits should graft cleanly onto m-a.
Wes, any reason why it didn't land in aurora/45 ? Thanks
Flags: needinfo?(wkocher)
Aurora was still closed at that time for all of the leaks. I'll make a pass through aurora uplifts here today.
Flags: needinfo?(wkocher)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: