Closed Bug 1063714 Opened 10 years ago Closed 10 years ago

Migrate FHR files and preferences in FirefoxProfileMigrator.js

Categories

(Firefox :: Migration, defect)

defect
Not set
normal
Points:
2

Tracking

()

VERIFIED FIXED
Firefox 36
Iteration:
36.1
Tracking Status
firefox35 --- verified
firefox36 --- verified
firefox37 --- verified

People

(Reporter: adw, Assigned: markh)

References

Details

(Whiteboard: [fxgrowth])

Attachments

(1 file, 2 obsolete files)

Broken down from bug 860238.

FirefoxProfileMigrator.js should copy FHR files and preferences from the old profile to the new one.  times.json is one such file, and there may be others.  I don't know what prefs need to be copied, but bug 860238 comment 1 mentions the possibility.

http://mxr.mozilla.org/mozilla-central/source/browser/components/migration/FirefoxProfileMigrator.js

Looks like file migration can be part of OTHERDATA, and prefs should be copied using the SETTINGS type defined in MigrationUtils.jsm.  (The IE and Safari migrators use SETTINGS to migrate prefs to Firefox prefs.)
Component: General → Migration
Flags: qe-verify?
Flags: firefox-backlog+
Points: --- → 2
Flags: qe-verify? → qe-verify+
Whiteboard: [fxgrowth]
Richard, can you provide a complete list of the files and/or prefs that need preserving?
Flags: needinfo?(rnewman)
I think just these things:


Files:

$profile/times.json
$profile/healthreport/
$profile/healthreport.sqlite*  (make sure DB is healthy)

Prefs:

datareporting.*
(perhaps omit current session prefs; take a look in about:config to see what I mean)
Flags: needinfo?(rnewman) → needinfo?(gps)
Assignee: nobody → gijskruitbosch+bugs
Status: NEW → ASSIGNED
Iteration: --- → 35.3
Mark, I heard you were already working on this, did you get to the point of a patch and/or should I finish that and/or do you want to, ... ? :-)
Flags: needinfo?(mhammond)
 jjensen and I curious about what happens *currently* when a profile is reset. Is the existing FHR records just deleted along with the profile? John is wondering if profile reset might be a source of orphaning. Thoughts?
Flags: needinfo?(rnewman)
(In reply to brendan c from comment #4)
>  jjensen and I curious about what happens *currently* when a profile is
> reset.

Basically, we copy the current profile out somewhere, and then we make a new one. Then we migrate a bunch of stuff:
- sessionstore file (open tabs and windows)
- homepage pref
- buildid pref so you don't get a "firstrun" page
- history/bookmarks
- passwords
- form history
- cookies
- personal dictionary (spellchecker)

pretty much everything else (add-ons, other prefs) is gone, for all intents and purposes.

> Is the existing FHR records just deleted along with the profile? John
> is wondering if profile reset might be a source of orphaning. Thoughts?

I don't know what orphaning is for FHR purposes, but considering we don't migrate them, I would expect that for FHR purposes, this currently looks just like a new profile.
(In reply to brendan c from comment #4)
>  jjensen and I curious about what happens *currently* when a profile is
> reset. Is the existing FHR records just deleted along with the profile? John
> is wondering if profile reset might be a source of orphaning. Thoughts?

It is definitely a source of orphaning and that's why I pointed Saptarshi to it in bug 860238. My understanding was that the Metrics team was very much aware of this issue.
Blocks: 1075171
Echoing MattN, yes, Reset Firefox would cause orphaning -- the old data is abandoned, the old prefs lost, and the old doc just stops being uploaded. A new doc will be generated, and FHR will act as if it's a brand new profile.

mreid flagged this as something to address in April 2013: Bug 860238.
Flags: needinfo?(rnewman)
FWIW, from what I'm hearing you guys describing here, this is actually conceptually different from orphaning; it sounds more like churn, i.e., equivalent to a user uninstalling and reinstalling the browser. That phenomenon is expected and I'm not losing any sleep over it.

Orphaning would refer to the case in which an FHR record becomes unlinked from the client that submitted it, but the client continues to submit new records which contain the same data as the first record. Check out this for a picture of orphaning:
https://metrics.mozilla.com/protected/bcolloran/recordDivergence/dayDivergenceTimeline.html

In the case of orphaning, we get duplicate data for some days (which is why it's problematic); in the case you're describing, we simply fail to link a machine's history before and after the profile reset event.

So, unless I'm missing something, this sounds like churn, which is not really orphaning or even bad data per se-- it's something we can deal with (and must deal with for many users anyway)

(I must also admit, the notion of copying the FHR record across profile resets makes me nervous. It feels like another crack through which more orphan-creating bugs could sneak... just make sure that all docIds, stableIds etc are copied too, or else *real* orphans could be introduced, which is much more pernicious than churn.)
(In reply to brendan c from comment #8)
> FWIW, from what I'm hearing you guys describing here, this is actually
> conceptually different from orphaning; it sounds more like churn, i.e.,
> equivalent to a user uninstalling and reinstalling the browser. That
> phenomenon is expected and I'm not losing any sleep over it.

No - in the uninstall/reinstall case, we re-use the same profile (and thus the same FHR record, at least as far as I understand those). The "real life" use case that Reset simulates is a user creating a new profile and never re-using the old one again (that's what happens under the hood).

> Orphaning would refer to the case in which an FHR record becomes unlinked
> from the client that submitted it, but the client continues to submit new
> records which contain the same data as the first record. Check out this for
> a picture of orphaning:
> https://metrics.mozilla.com/protected/bcolloran/recordDivergence/
> dayDivergenceTimeline.html

I don't understand what you mean by "unlinked from the client that submitted it", can you elaborate? The link is not particularly helping me understand this.
(In reply to :Gavin Sharp [email: gavin@gavinsharp.com] from comment #9)

> No - in the uninstall/reinstall case, we re-use the same profile (and thus
> the same FHR record, at least as far as I understand those). The "real life"
> use case that Reset simulates is a user creating a new profile and never
> re-using the old one again (that's what happens under the hood).

Ah ok. Well, in any case, it's like churn attributable to to hardware upgrades, complete OS reinstallation, that kind of thing.
 
> I don't understand what you mean by "unlinked from the client that submitted
> it", can you elaborate? The link is not particularly helping me understand
> this.

Oy, the explanation of orphaning is somewhat involved, more than I'm probably able to type coherently ATM. Happy to do a vidyo call at some point, or maybe someone has sufficiently clarifying bugs to hand?
(In reply to :Gijs Kruitbosch from comment #3)
> Mark, I heard you were already working on this, did you get to the point of
> a patch and/or should I finish that and/or do you want to, ... ? :-)

The patch in bug 1063710 does copy times.json, but nothing else listed in this bug.  It would be great if you can continue to take it as there may well be some devil in the detail - eg:

(In reply to Richard Newman [:rnewman] from comment #2)
> $profile/healthreport.sqlite*  (make sure DB is healthy)

Might be interesting :)  I'd *expect* FHR would cope with corruption during normal course of action, so copying a corrupted DB should be (a) unlikely and (b) already handled.  But I'm just guessing here, so some investigation is necessary.
Flags: needinfo?(mhammond)
Database corruption is currently not recovered, bug 855413.
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #12)
> Database corruption is currently not recovered, bug 855413.

How do I check for it in order to avoid copying the DB in that case?
Flags: needinfo?(benjamin)
That's for gps or somebody who knows our DB stuff. I don't know.
Flags: needinfo?(benjamin)
(In reply to Mark Hammond [:markh] from comment #11)

> Might be interesting :)  I'd *expect* FHR would cope with corruption during
> normal course of action, so copying a corrupted DB should be (a) unlikely
> and (b) already handled.  But I'm just guessing here, so some investigation
> is necessary.

IIRC it does indeed handle a corrupt DB; gps knows his stuff!

I'd just suggest handling the obvious here -- truncated or non-regular files, files that the current user can't read and/or write, etc.

Belt and braces.
(In reply to Mark Hammond [:markh] from comment #11)
> I'd *expect* FHR would cope with corruption during
> normal course of action, so copying a corrupted DB should be (a) unlikely
> and (b) already handled.

I think FHR handling it itself is the better way since it should deal with it anyways. Reset currently doesn't do any database checks and leaves that to each component to handle.
No, FHR does not handle corrupted databases. See bug 855413. At least report only a month ago, only 0.2% of users have corrupted databases, which is a reason implementing recovering from corrupted databases hasn't been done yet.

If I'm interpreting bsmedberg from bug 1063704 comment 12 correctly, it is acceptable lose the old FHR data in the SQLite database. However, we still want FHR to report when profile reset occurs. There are multiple considerations at play.

If you preserve the client ID across reset but don't preserve the data, this will manifest to Metrics as a loss of data, especially in session use time. You had a client uploading tons of data and now that data is gone. That's probably not good.

If you don't preserve the client ID across reset, you don't delete old data from the server. However, now you have multiple client IDs for Metrics to deal with. This could artificially inflate the installed user base count.

Given the concerns about data integrity, migrating FHR state as part of reset and having FHR report on reset are really interchangeable. I'm not sure why we have separate bugs tracking them. Anyway, the relationship between these two bugs is definitely inverted. You want FHR to be reset aware before the state files are actually migrated. That way we have a solution for the data transition issue before it manifests in real life.

Here's how I would implement it.

1) FHR provides an API to write profile continuity information to disk. e.g. healthreporter.saveStateToSurviveReset().then(continueReset). This would collect relevant preferences, etc, save them to disk, and pass the filename during promise resolution.

2) At FHR init time, FHR starts looking for a well-defined file indicating a profile reset has occurred. If found, it loads the file and imports old state. This file would be an instance of the file produced in #1.

3) Profile reset calls the API from #1 during reset and ensures the written file is preserved across reset.

I would have FHR capture the the following for preservation across reset:

1) Probably the entire datareport.healthreport branch
2) The contents of healthreport/state.json
3) Selected critical data from the database. This would include the times of previous profile resets.

During initial FHR start after profile reset, FHR would import the old state. Of particularly importance would be the client ID. I would have FHR generate a new client ID. However, the old client ID would be present in the backed up state. FHR would immediately insert data that says a profile reset occurred. And, it would write the old client ID to the state.json file. FHR would also forever report the list of old client IDs as part of uploads. By doing it this way, Metrics doesn't lose data from old client IDs (they still have the last uploaded record) and Metrics can examine reset information to string multiple client IDs together.

Most of the complexity would live in services/healthreport. That's intentional. Domain specific code for profile migration should live with the feature itself, not with the profile migrator. If there is more than 10 lines of code for FHR migration in browser/components/migration, we likely have a problem. (I would implement profile migration as a category manager or some such so components could register with the profile migrator without involvement from the profile migrator. But that's beyond the scope of this bug.)

I hope this helps. Please needinfo if you have more questions.
No longer blocks: 1075171
Depends on: 1075171
Flags: needinfo?(gps)
Gah, I didn't realize bug 1075171 was for the web component, not FHR itself. Disregard my comments about the separate bugs :)

I do think this bug should be moved into the FHR component, however.
Blocks: 1075171
No longer depends on: 1075171
QA Contact: kamiljoz
(In reply to Gregory Szorc [:gps] from comment #17)
> If you preserve the client ID across reset but don't preserve the data, this
> will manifest to Metrics as a loss of data, especially in session use time.
> You had a client uploading tons of data and now that data is gone. That's
> probably not good.
> 
> If you don't preserve the client ID across reset, you don't delete old data
> from the server. However, now you have multiple client IDs for Metrics to
> deal with. This could artificially inflate the installed user base count.

Yea, we REALLY don't want that :-)

> During initial FHR start after profile reset, FHR would import the old
> state. Of particularly importance would be the client ID. I would have FHR
> generate a new client ID. However, the old client ID would be present in the
> backed up state. FHR would immediately insert data that says a profile reset
> occurred. And, it would write the old client ID to the state.json file. FHR
> would also forever report the list of old client IDs as part of uploads. By
> doing it this way, Metrics doesn't lose data from old client IDs (they still
> have the last uploaded record) and Metrics can examine reset information to
> string multiple client IDs together.

TLDR: if during a profile reset the old clientId is for some reason not available, it's best to just start from scratch entirely; if the old clientId is available, it's best to just keep using it.

longer version:
If we're migrating any historical data, it is essential that we migrate the clientId with it; having multiple clientIds floating around makes me very uncomfortable, and seems unnecessary. If we're going to migrate FHR data, I would only ever generate a new clientId when the existing clientId is for some reason irretrievable, and in that case I would not attempt migrate the existing FHR data-- we should start a new FHR record that maybe has a flag saying that it was reset with a failure to load the old clientId, but we should never allow a situation to exist in which the same client can report the same day/session under multiple clientIds.

If a client did end up in the situation of resetting and being able to access its old record but not its old clientId, it would simply generate a new clientId, discard the old FHR data, and basically start fresh, but with an indication that it had been reset and encountered a problem. No data would be lost on the server, and the worst consequence would be that the longitudinal histories for this client/user would not be linked across the reset; but as I said in a comment above, that basically mirrors phenomena like hardware upgrade, os reinstallation, etc, and it's not sneaky like orphaning.

If we imagine generating a new clientId be default, in the case where the old data is available but the old clientId is not, we'd have the possibility that the new FHR record would include data that was already submitted under the old Id, but since the new record does not contain the old Id, we'd need to resort to the kind of hacks we're using today to link and de-orphan our data.

Also, keeping the same Id is conceptually clearer and more closely resembles the ontological ideal of our situation: that there's one clientId per profile, even if that profile is reset. :-)



> Given the concerns about data integrity, migrating FHR state as part of
> reset and having FHR report on reset are really interchangeable. I'm not
> sure why we have separate bugs tracking them. Anyway, the relationship
> between these two bugs is definitely inverted. You want FHR to be reset
> aware before the state files are actually migrated. That way we have a
> solution for the data transition issue before it manifests in real life.
> 
> Here's how I would implement it.
> 
> 1) FHR provides an API to write profile continuity information to disk. e.g.
> healthreporter.saveStateToSurviveReset().then(continueReset). This would
> collect relevant preferences, etc, save them to disk, and pass the filename
> during promise resolution.
> 
> 2) At FHR init time, FHR starts looking for a well-defined file indicating a
> profile reset has occurred. If found, it loads the file and imports old
> state. This file would be an instance of the file produced in #1.
> 
> 3) Profile reset calls the API from #1 during reset and ensures the written
> file is preserved across reset.
> 
> I would have FHR capture the the following for preservation across reset:
> 
> 1) Probably the entire datareport.healthreport branch
> 2) The contents of healthreport/state.json
> 3) Selected critical data from the database. This would include the times of
> previous profile resets.
> 
> During initial FHR start after profile reset, FHR would import the old
> state. Of particularly importance would be the client ID. I would have FHR
> generate a new client ID. However, the old client ID would be present in the
> backed up state. FHR would immediately insert data that says a profile reset
> occurred. And, it would write the old client ID to the state.json file. FHR
> would also forever report the list of old client IDs as part of uploads. By
> doing it this way, Metrics doesn't lose data from old client IDs (they still
> have the last uploaded record) and Metrics can examine reset information to
> string multiple client IDs together.
> 
> Most of the complexity would live in services/healthreport. That's
> intentional. Domain specific code for profile migration should live with the
> feature itself, not with the profile migrator. If there is more than 10
> lines of code for FHR migration in browser/components/migration, we likely
> have a problem. (I would implement profile migration as a category manager
> or some such so components could register with the profile migrator without
> involvement from the profile migrator. But that's beyond the scope of this
> bug.)
> 
> I hope this helps. Please needinfo if you have more questions.
(In reply to Gregory Szorc [:gps] from comment #17)
> Given the concerns about data integrity, migrating FHR state as part of
> reset and having FHR report on reset are really interchangeable. I'm not
> sure why we have separate bugs tracking them. Anyway, the relationship
> between these two bugs is definitely inverted. You want FHR to be reset
> aware before the state files are actually migrated. That way we have a
> solution for the data transition issue before it manifests in real life.

Doesn't this mean fixing bug 1063704 first?

Generally, I'm confused about what needs to happen here. This started out as "copy some files and prefs" and now it seems to be "do a lot of FHR work", on which I have 0 background. I don't think it makes sense for me to own that work, especially not if we want to be fixing this sooner rather than later.
Assignee: gijskruitbosch+bugs → nobody
Iteration: 35.3 → ---
It will never be as simple as "copy some files" because it is not safe to copy sqlite files while the database is open. To do it right requires touching FHR code. That being said, copying sqlite files is probably safe. But you will be implicitly invoking sqlite's repair code if you copied files during a transaction. That's wrong and should be avoided, if possible. Granted, this likely happens millions of times per day due to unclean shutdown, etc. Still, that's a poor excuse.
(In reply to Gregory Szorc [:gps] from comment #21)
> It will never be as simple as "copy some files" because it is not safe to
> copy sqlite files while the database is open. To do it right requires
> touching FHR code. That being said, copying sqlite files is probably safe.
> But you will be implicitly invoking sqlite's repair code if you copied files
> during a transaction. That's wrong and should be avoided, if possible.
> Granted, this likely happens millions of times per day due to unclean
> shutdown, etc. Still, that's a poor excuse.

The migration code runs very early in the new profile's (and process') lifetime and copies the database out of the old profile, which should be closed because the previous process is dead... this has nothing to do with sqlite or how copying files is hard, it has to do with the approach in comment #0 being completely different from the one in comment #17.
Err, I assumed the profile being reset had an opportunity to collect data from the running profile. Isn't profile reset triggered from an already-running profile (most of the time)?

Having FHR grab data from another, non-running profile should be doable, although a bit more complicated, especially the SQLite bits. We may want to consider storing all transferred data in a JSON file instead of SQLite to reduce that complexity.
(In reply to Gregory Szorc [:gps] from comment #23)
> Err, I assumed the profile being reset had an opportunity to collect data
> from the running profile.

I'm not aware of any notifications or such, but in theory, sure, assuming users use our UI to trigger it...

> Isn't profile reset triggered from an
> already-running profile (most of the time)?

Most of the time, yes, although you can trigger it from the command line + some env vars, I think 99% of our users don't do that (but the ability to do that in case your profile causes Firefox to crash on startup is still useful).
The command line arguments were mostly intended for when it gets triggered from the Windows installer but the current plan for bug 750979 is to detect that after installation and do it from within the app.
(In reply to Gregory Szorc [:gps] from comment #23)
> Having FHR grab data from another, non-running profile should be doable,
> although a bit more complicated, especially the SQLite bits.

Can you please elaborate on this?  The migration code is executed very early at startup and before fhr can possibly run. Couldn't we just attempt to open the DB (as an attempt to demonstrate it's not corrupted), then close it and copy it to the new profile dir being set up - which seems to work fine in my experiments.  Ditto with the .json files in the healthreporter directory.
Flags: needinfo?(gps)
SQLite corruption isn't necessarily detected at db open time: you often have to run some queries (even an insert) before you detect it.

I don't think it is profile migrator's concern to worry about FHR corruption. IMO this is purely an FHR implementation detail. Sure, if you migrate a corrupt database across profiles it is annoying. But the responsibility to deal with corrupted databases should reside with FHR and FHR only.

As long as you manage to copy all the relevant files (including the SQLite WAL) *while the source profile is not running*, I can be convinced to r+ a patch.
Flags: needinfo?(gps)
fwiw, we don't care about corruption for other dbs we migrate (for example places.sqlite), it's indeed responsibility of the component using the db to check that and do whatever possible to avoid such issues (WAL+sync_normal, or rollback+sync_full, or other strategies). And it's not always trivial to check for corruption.
This patch copies the relevant FHR files to the new profile.  It copies the .sqlite file and all top-level files in the 'healthreport' directory.  If the .sqlite file is missing it doesn't copy anything and reports success (as best I can tell, the callback's param is to reflect errors, rather than simply "didn't find something to copy" - requesting feedback from MattN mainly for this).  It also doesn't consider the 'healthreport' directory missing a problem - it still copies the .sqlite file in that case.  It *does* consider 'healthreport' being a regular file, and doesn't copy anything in that case.  

It has tests!  To make the tests sane, I ended up adding a .wrappedJSObject to the object, and also split the getResources() function to make calling the new migration function possible without having multiple profiles etc in place.

Note too that no FHR-related preferences are migrated, and sadly we don't have a good story for migrating preferences between profiles.  However, a quick look at the FHR code implies this might not be a big problem - alot of FHR state is persisted in state.json - so we are just copying the files in the hope FHR does the right thing.  gps, does this sound workable even if not ideal?  (I'd like to avoid the yak-shaving that migrating certain preferences is almost certain to cause from blocking this landing).  Requesting feedback from gps on these FHR parts.
Assignee: nobody → mhammond
Attachment #8501534 - Flags: feedback?(gps)
Attachment #8501534 - Flags: feedback?(MattN+bmo)
Iteration: --- → 35.3
Comment on attachment 8501534 [details] [diff] [review]
0005-Bug-1063714-migrate-FHR-related-data-to-new-profile-.patch

Review of attachment 8501534 [details] [diff] [review]:
-----------------------------------------------------------------

On the right track. There are some preferences we may want to copy if they have a user-set value. Outside of the "upload enabled" flag, not copying the preferences shouldn't result in too much pain. I'd like a second opinion here. rnewman perhaps.

Things that happen if prefs are not copied:

* FHR will re-enable itself if the user has disabled it (datareporting.healthreport.uploadEnabled).
* The user will see the data reporting notification bar on first start (datareporting.policy.dataSubmissionPolicyAcceptedVersion)
* FHR won't schedule submission for +24h from last time (datareporting.healthreport.nextDataSubmissionTime). Instead, it will schedule for +24h from first application run of the new profile.
* Failure backoff will get reset (datareporting.healthreport.lastDataSubmissionRequestedTime, datareporting.healthreport.lastDataSubmissionFailureTime, datareporting.healthreport.lastDataSubmissionSuccessfulTime)
* Custom submission URLs and report URLs won't be carried over (datareporting.healthreport.about.reportUrl, datareporting.healthreport.documentServerURI)
* Custom logging settings will be lost (datareporting.healthreport.logging.*)

I think I'm fine losing everything except datareporting.healthreport.uploadEnabled. We should have a debate about uploadEnabled. I think we should carry it over because it was likely unset by the user and re-opting the user into data collection if they unchecked that box seems like a very un-Mozillian thing to do. On the flip side, maybe a user going through the experience of profile reset will then realize that submitting data to Mozilla would help Firefox get better so they wouldn't have to do a profile reset. I can see both sides of the argument here. Will needinfo bsmedberg to provide an answer.

::: browser/components/migration/FirefoxProfileMigrator.js
@@ +140,5 @@
> +      // the health-reporter can't have been initialized yet so it's safe to
> +      // copy the SQL file.
> +
> +      // We only support the default database name - copied from healthreporter.jsm
> +      const DEFAULT_DATABASE_NAME = "healthreport.sqlite";

You'll also need to copy a healthreport.sqlite-wal file. This contains the write-ahead log for SQLite and is used to replay the state of the database when the database is opened. Please read https://sqlite.org/wal.html.

There is also a healthreport.sqlite-shm file. However, this is a mmap'd shared memory file used by multiple consumers of SQLite databases using the WAL. I *think* this file is only relevant if the database is open. I *think* that if this file exists on disk it means that the SQLite database wasn't properly closed before the process exited. If this is true, not copying the shm file should be OK. I could be very wrong about this. You should find the answer and add a source comment.

@@ +145,5 @@
> +      let path = OS.Path.join(sourceProfileDir.path, DEFAULT_DATABASE_NAME);
> +      let sqliteFile = FileUtils.File(path);
> +      if (!sqliteFile.exists()) {
> +        aCallback(true);
> +        return;

The assumption here is that FHR doesn't store any state worth migrating if the database file that doesn't.

This is a dangerous assumption.

First, gfritzsche is doing things around the persisted client ID in bug 1064333. I haven't seen his patch yet, but we've talked about moving client ID generation *before* FHR init. We could very well have FHR generate state before opening a SQLite database.

Second, this migration code is prone to breaking if FHR ever changes how it deals with corrupt database. For example, I could see FHR dealing with corrupt databases by moving the sqlite file out of the way. You've just broken profile migrator.

We need 2 things: 1) have gfritzsche f+ these patches to ensure they don't interfere with his plans 2) add some comments into the FHR code that opens the database to document assumptions in the migration code. That way if/when we rewrite the database code, people should notice the warning to update the migrator.

I think a generally safer approach is to copy everything FHR related that you see and let FHR deal with it. i.e. if there is no SQLite database, move on and copy the healthreport directory. If FHR gets confused, that's a bug in FHR and not a concern of the profile migrator.
Attachment #8501534 - Flags: feedback?(gps) → feedback+
Georg: please keep an eye on this bug and possibly weigh in with how it will interact with the refactoring you are doing.
Flags: needinfo?(georg.fritzsche)
Benjamin: please see my question in comment 30 abort carrying over uploadEnabled as part of migration.
Flags: needinfo?(benjamin)
Richard: please weigh in on ignoring the FHR prefs during migration. I think it's mostly safe but I'd appreciate extra confirmation.
Flags: needinfo?(rnewman)
We should copy uploadEnabled.
Flags: needinfo?(benjamin)
(In reply to Gregory Szorc [:gps] from comment #30)
> I think I'm fine losing everything except
> datareporting.healthreport.uploadEnabled. We should have a debate about
> uploadEnabled. I think we should carry it over because it was likely unset
> by the user and re-opting the user into data collection if they unchecked
> that box seems like a very un-Mozillian thing to do. 

(In reply to Gregory Szorc [:gps] from comment #32)
> Benjamin: please see my question in comment 30 abort carrying over
> uploadEnabled as part of migration.

During a reset we don't preserve any preferences so there are many other privacy-related preferences that we don't preserve (with the idea that they could be part of the reason for the reset) and FHR is already one of them. Examples of others include opting out of Telemetry, shutting off application/extension/search engine updates, enabling DNT, disabling Safe Browsing, etc. Given the fact that we already weren't preserving the FHR uploadEnabled pref and that we don't preserve other opt-outs plus the fact that we will show the notification bar again, my opinion is that this is okay for a user-initiated action like Reset which states that it doesn't preserve customizations.
Comment on attachment 8501534 [details] [diff] [review]
0005-Bug-1063714-migrate-FHR-related-data-to-new-profile-.patch

Review of attachment 8501534 [details] [diff] [review]:
-----------------------------------------------------------------

::: browser/components/migration/FirefoxProfileMigrator.js
@@ +67,5 @@
>  
> +  return this._getResources(sourceProfileDir, currentProfileDir);
> +}
> +
> +FirefoxProfileMigrator.prototype._getResources = function(sourceProfileDir, currentProfileDir) {

Nit: _getResourcesInternal?

@@ +153,5 @@
> +      if (!subdir) {
> +        // no healthreport directory is somewhat normal, so we do copy the
> +        // sqlite file and report success.
> +        sqliteFile.copyTo(currentProfileDir, "");
> +        aCallback(true);

Calling the callback with true is to indicate completion so doing so when you decide it doesn't need to migrate a file/dir. is fine.
Attachment #8501534 - Flags: feedback?(MattN+bmo) → feedback+
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #34)
> We should copy uploadEnabled.

In addition to what Matt said, note that we don't currently have any mechanism for migrating certain prefs, and it is likely to be a can of worms - there are a couple of approaches we could take, but none are trivial.  Benjamin, given this and comment 35, are you OK with doing this as a follow-up?
Flags: needinfo?(benjamin)
Thanks for the quick reviews!

(In reply to Gregory Szorc [:gps] from comment #30)

> Things that happen if prefs are not copied:

This version is still not touching any prefs.

> You'll also need to copy a healthreport.sqlite-wal file.

Done.  I found https://www.sqlite.org/tempfiles.html which says you are correct about the -wal and -shm files, so I've added a descriptive comment and copy the -wal but not the -shm (and updated the tests accordingly)

> The assumption here is that FHR doesn't store any state worth migrating if
> the database file that doesn't.
> 
> This is a dangerous assumption.

Fair enough!  This version copies the .sqlite, .sqlite-wal and healreport directory regardless if any exist without the other.  Even the .sqlite-wal will be copied if no .sqlite which seems a little strange but makes sense in the context of "just let FHR recover what it can".

Note that I still refuse to copy a "healthreport" *file* - it's gotta be a directory.

> We need 2 things: 1) have gfritzsche f+ these patches to ensure they don't
> interfere with his plans

Requesting feedback (and clearing his needinfo as they are for the same thing)

> 2) add some comments into the FHR code that opens
> the database to document assumptions in the migration code. That way if/when
> we rewrite the database code, people should notice the warning to update the
> migrator.

I had trouble finding where to put such a warning.  The code that opens the database is a long way from the code that maintains the state json file etc, and they don't seem to rely on each other.  Could you please suggest a specific place to add such a warning (or gfritzsche, if you can instead I'd appreciate it).

(In reply to Matthew N. [:MattN] (behind on reviews - focused on Loop) from comment #36)

> Nit: _getResourcesInternal?

Done.

> Calling the callback with true is to indicate completion so doing so when
> you decide it doesn't need to migrate a file/dir. is fine.

Great, thanks.
Attachment #8501534 - Attachment is obsolete: true
Attachment #8502263 - Flags: feedback?(georg.fritzsche)
Flags: needinfo?(georg.fritzsche) → needinfo?(gps)
Comment on attachment 8502263 [details] [diff] [review]
0005-Bug-1063714-migrate-FHR-related-data-to-new-profile-.patch

Review of attachment 8502263 [details] [diff] [review]:
-----------------------------------------------------------------

This doesn't affect the Telemetry & FHR unification at the current stage, with one exception:
Bug 1064333 moves the stable client ID to the data reporting service, so you should migrate <ProfD>/datareporting/state.json as well.
(We may have more shared data coming up in that directory in the future)
Attachment #8502263 - Flags: feedback?(georg.fritzsche)
(In reply to Gregory Szorc [:gps] from comment #33)
> Richard: please weigh in on ignoring the FHR prefs during migration. I think
> it's mostly safe but I'd appreciate extra confirmation.

In datareporting.healthreport:

The only ones we actually set are uploadEnabled (which other discussion in this bug indicates is acceptable), firstRun (which only controls first run upload timing) and submission timestamps.

These all seem safe to discard.


In datareporting.policy:

If we discard these, we'll re-show the data reporting notification. Presumably that's acceptable.


In datareporting.sessions:

These are to track the current and previous sessions. Safe to discard.
Flags: needinfo?(rnewman)
I've been trying to follow along but with limited success... can you all clarify for me how the implementation here will handle the clientID? Will it be along the lines of what GPS descibed in comment #17, what I requested in comment #19, or something else entirely?

Again, to avoid double counting it is important that we only migrate data if are able to migrate the clientID. Please confirm that we check that condition before migrating data.
Flags: needinfo?(mhammond)
(In reply to brendan c from comment #41)
> I've been trying to follow along but with limited success... can you all
> clarify for me how the implementation here will handle the clientID? Will it
> be along the lines of what GPS descibed in comment #17, what I requested in
> comment #19, or something else entirely?
> 
> Again, to avoid double counting it is important that we only migrate data if
> are able to migrate the clientID. Please confirm that we check that
> condition before migrating data.

Best I can tell, the client ID is stored in healthreport/state.json, which is copied during the migration.  As per comment 39, I'll update the patch to also copy datareporting/state.json too.  Chatting with gps and gfritzsche on IRC, we all feel confident that will copy the client ID to the new profile.
Flags: needinfo?(mhammond)
This attachment also copies datareporting/state.json (and has tests specifically for that).  Given comment 40 etc, I'm assuming we are fine to go ahead without prefs and if necessary do that in a followup (so clearning needinfo bsmedberg).  There's a needinfo on gps to clarify where he'd like the comments he requested in healthreporter, but clearing needinfo for that under the assumption he'll let me know in review.

Thanks all!
Attachment #8502263 - Attachment is obsolete: true
Attachment #8502890 - Flags: review?(gps)
Flags: needinfo?(gps)
Flags: needinfo?(benjamin)
Iteration: 35.3 → 36.1
Comment on attachment 8502890 [details] [diff] [review]
0005-Bug-1063714-migrate-FHR-related-data-to-new-profile-.patch

Review of attachment 8502890 [details] [diff] [review]:
-----------------------------------------------------------------

Code looks great! I almost forgot what synchronous I/O looked like :)

Feedback is only in tests. It works as is. I'll leave it up to your discretion on whether to improve the test code.

::: browser/components/migration/tests/unit/test_fx_fhr.js
@@ +10,5 @@
> +function checkDirectoryContains(dir, files) {
> +  print("checking " + dir.path + " - should contain " + Object.keys(files));
> +  let seen = new Set();
> +  let enumerator = dir.directoryEntries;
> +  while (enumerator.hasMoreElements()) {

Can't we use OS.File in the tests?

@@ +57,5 @@
> +
> +  do_register_cleanup(() => {
> +    print("Removing temp dirs " + srcDir.path + " and " + targetDir.path);
> +    srcDir.remove(true);
> +    targetDir.remove(true);

You do not need to clean do_get_tempdir(): the xpcshell harness will do it for you.

1) https://hg.mozilla.org/mozilla-central/file/62f0b771583c/testing/xpcshell/head.js#l996
2) https://hg.mozilla.org/mozilla-central/file/62f0b771583c/testing/xpcshell/runxpcshelltests.py#l437
Attachment #8502890 - Flags: review?(gps) → review+
I changed the test so it didn't clean the testdirs, but didn't bother with using OS.File seeing the code under test doesn't and I got a bit lost with recursive promises etc.

https://hg.mozilla.org/integration/fx-team/rev/daf56e706c3c
https://hg.mozilla.org/mozilla-central/rev/daf56e706c3c
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 36
Comment on attachment 8502890 [details] [diff] [review]
0005-Bug-1063714-migrate-FHR-related-data-to-new-profile-.patch

Approval Request Comment
[Feature/regressing bug #]: N/A
[User impact if declined]: This allows the health report to get better statistics about usage and growth.
[Describe test coverage new/current, TBPL]: Reasonable test coverage is part of the patch.
[Risks and why]: Relatively low risk
[String/UUID change made/needed]: None

Note to sheriffs: assuming approval, this patch will need to land before the one in bug 1063710.
Attachment #8502890 - Flags: approval-mozilla-aurora?
Attachment #8502890 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Is it possible that this bug to be verified by manual testing team? 

Mark, if the answer is *yes* can you provide me some reliable steps in order to verify it?
Flags: needinfo?(mhammond)
I'd treat it as low priority, but:

In both cases:

* Start firefox using the default profile. Wait a minute or so and a file "healthreport.sqlite" and/or directories "datareporting" and "healthreport" should exist in the root of your profile dir.

* Help -> Troubleshooting -> Reset Firefox

* Firefox should end up restarting - immediately exit.  Note that the profile dir will have now changed and the old one should have been removed.

Before this patch the files and/or directories above would not exist in that new profile directory.  With this patch, they would.
Flags: needinfo?(mhammond)
(In reply to Mark Hammond [:markh] from comment #51)
> I'd treat it as low priority, but:
> 
> In both cases:
> 
> * Start firefox using the default profile. Wait a minute or so and a file
> "healthreport.sqlite" and/or directories "datareporting" and "healthreport"
> should exist in the root of your profile dir.
> 
> * Help -> Troubleshooting -> Reset Firefox
> 
> * Firefox should end up restarting - immediately exit.  Note that the
> profile dir will have now changed and the old one should have been removed.
> 
> Before this patch the files and/or directories above would not exist in that
> new profile directory.  With this patch, they would.

Verified fixed on Firefox 37.0a1 (2015-01-07), Firefox 36.0a2 (2015-01-07) and on Firefox 35 RC build 2 (20150106233618) using Windows 7 x64, Ubuntu 14.04 x86 and Mac OSX 10.9.5.
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: