Closed
Bug 1361242
Opened 8 years ago
Closed 8 years ago
Implement "Prefer" operation for Sync engine.
Categories
(Toolkit :: Form Manager, enhancement, P1)
Tracking
()
RESOLVED
WONTFIX
People
(Reporter: selee, Assigned: selee)
References
Details
(Whiteboard: [form autofill])
Attachments
(1 file)
59 bytes,
text/x-review-board-request
|
Details |
After having a Sync + FormAutofill Work Week, "Prefer" operations should be implemented for Sync Engine to finish data synchronization. "Prefer" function is used for conflict resolution and determining which records should be for merging.
Assignee | ||
Updated•8 years ago
|
Summary: Implement "Update" operation for Sync engine. → Implement "Prefer" operation for Sync engine.
Assignee | ||
Updated•8 years ago
|
Assignee: nobody → selee
Status: NEW → ASSIGNED
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Assignee | ||
Comment 4•8 years ago
|
||
Comment on attachment 8867630 [details]
Bug 1361242 - Implement prefer function to handle conflict autofill object.;
Hey Mark, Kit,
Could you take a look the interface of prefer function can satisfy the requirement of Sync engine or not?
I am also curious that the either deleted records should be handled in prefer function? or Sync engine will take care the cases?
Thanks.
Attachment #8867630 -
Flags: feedback?(markh)
Attachment #8867630 -
Flags: feedback?(kit)
Comment 5•8 years ago
|
||
mozreview-review |
Comment on attachment 8867630 [details]
Bug 1361242 - Implement prefer function to handle conflict autofill object.;
https://reviewboard.mozilla.org/r/139206/#review145246
Thanks, Sean!
::: browser/extensions/formautofill/ProfileStorage.jsm:420
(Diff revision 3)
> + * @param {Object} localRecord
> + * The record in local already to be merged.
> + * @param {Object} remoteRecord
> + * The record from sync server to be merged.
> + * @returns {Object}
> + * An object with the merged result. `null` means these two are not
I'm a bit concerned that `null` can mean several things here.
* In the "different GUIDs" case, we already know `localRecord` matches `remoteRecord`. Otherwise, our `findDupe` function wouldn't have returned it as a match. I think `findDupe` will need to migrate as well, since we might be comparing two records with different versions and different GUIDs, but matching properties. So `findDupe` and `prefer` will end up repeating the same work, which seems inefficient.
* In the "same GUIDs" case, `null` means we'll fork `remoteRecord` by giving it a different GUID. Or it could mean migration failed, in which case forking seems like the wrong thing to do. Though, in that case, what happens to the record?
I wonder, then, if it makes sense to split these two cases into separate functions. The two functions could also update storage directly. Currently, we return the merged result to Sync, but then Sync takes that and passes it right back to `update`. We could save a round trip here.
::: browser/extensions/formautofill/ProfileStorage.jsm:424
(Diff revision 3)
> + * @returns {Object}
> + * An object with the merged result. `null` means these two are not
> + * able to be merged.
> + */
> + prefer(localRecord, remoteRecord) {
> + if (localRecord.deleted && !remoteRecord.deleted) {
For now, I made the sync engine take care of the `deleted` case (https://github.com/mhammond/gecko/pull/1), but maybe handling them here is better. Mark pointed out in https://github.com/mhammond/gecko/pull/1#discussion_r117659739 that taking the undeleted record isn't always right, and I need to think some more about that.
Comment 6•8 years ago
|
||
mozreview-review |
Comment on attachment 8867630 [details]
Bug 1361242 - Implement prefer function to handle conflict autofill object.;
https://reviewboard.mozilla.org/r/139206/#review145304
Thanks for the patch, but I Think this needs more work for Sync to be able to leverage it. I also agree with Kit that I now believe (even though I wasn't sure before :) that conflating the "is there a possible duplicate of this item" and "please merge these records into what I should store locally" probably shouldn't be conflated as the semantics are very unclear. I suspect both of them might share common code, but the semantics exposed via the API should be separated.
::: browser/extensions/formautofill/ProfileStorage.jsm:356
(Diff revision 3)
> + let schemaVersion = Math.min(schemaVersion1, schemaVersion2);
> +
> + // If the minimal schema version of two migrated records are still greater
> + // than the current version, returning null indicates the valid fields can
> + // not been determined.
> + if (schemaVersion != this.version) {
The comment above talks about "greater than the current version", but the check is equality.
Also, this is indirectly called by prefer(), where the first arg is defined to be a local record, but I'm unclear on the semantics. Let's say:
* On day 1, we have an incoming record GUID1 that has a version > this.version (eg, created on a device that has a newer version of the addon) - when we apply the record locally, what will its version be? I'm guessing it will remain as the incoming remote version
* On day 2, we have the same incoming record as it was changed remotely.
ISTM that on day 2 we will fail here - both records will have the same version number, but be > this.version, so we will fail to merge them?
::: browser/extensions/formautofill/ProfileStorage.jsm:363
(Diff revision 3)
> + }
> + return this.VALID_FIELDS;
> + }
> +
> + _isRawIdentical(record1, record2) {
> + let validFields = this._findCommonFieldsBySchemaVersion(record1.version,
I find `_findCommonFieldsBySchemaVersion` somewhat strange, both in that it is a helper given it is so trivial, and the name seems misleading. ISTM that just writing:
`
if (Math.min(record1.version, record2.version) != this.version) {
return false;
}
`
makes this a bit clearer (but also clearer that it doesn't seem to capture the semantics we actually want)
::: browser/extensions/formautofill/ProfileStorage.jsm:377
(Diff revision 3)
> + }
> + return true;
> + }
> +
> + _migrate(record) {
> + // There is no any migration process for the first version, so any records
I don't understand the intent here - I guess later versions will start looking at specific versions, but I'd expect something like `_findCommonFieldsBySchema` to be used to handle, for example, that the incoming record's version to be > this.version.
I'd also expect `_migrate()` to never fail to migrate a smaller version to this.version, as the code will need to reliably handle this case whenever this.version is bumped (eg, to migrate existing record regardless of sync).
IOW, in what cases will `_migrate` return null? I understand there might be a "garbage record" case (eg, record.version missing etc), but that seems a different scenario than "can't migrate a valid record"
::: browser/extensions/formautofill/ProfileStorage.jsm:403
(Diff revision 3)
> + oldSchemaRecord.timeCreated);
> + mergedRecord.timeLastUsed = Math.max(newSchemaRecord.timeLastUsed,
> + oldSchemaRecord.timeLastUsed);
> + mergedRecord.timeLastModified = Math.max(newSchemaRecord.timeLastModified,
> + oldSchemaRecord.timeLastModified);
> + if (newSchemaRecord.guid != oldSchemaRecord.guid) {
I think this is the same point that Kit made about some confusion about same vs different GUIDs, but regardless, if the intent really is that these 2 records are to be treated as the same record, we should be adding the timesUsed.
And more generally, timesUsed is going to be tricky for Sync - I don't think we want to make a record as "dirty" and upload it to sync just because it was used locally as that will mean far more sync traffic and almost certainly cause many conflicts (as records will regularly be considered changed locally and remotely).
IOW, I think the semantics of both timesUsed and timeLastUsed should be "last used *on this device*"
::: browser/extensions/formautofill/ProfileStorage.jsm:424
(Diff revision 3)
> + * @returns {Object}
> + * An object with the merged result. `null` means these two are not
> + * able to be merged.
> + */
> + prefer(localRecord, remoteRecord) {
> + if (localRecord.deleted && !remoteRecord.deleted) {
To clarify, the scenario we've hit in the past looks something like:
* Device A is rarely used. It last synced 2 weeks ago.
* Device B (call it "this" device) is syncing regularly. A record is deleted locally and the deletion was put on the server.
* The sync server dies (or there's some other operational issue), so the server gets wiped - the expectation is that devices will then upload all their records to repopulate it.
* But - the first device to reconnect after the server wipe turns out to be device A. It didn't sync the deletion, so doesn't know the record was deleted and uploads the copy it had before the deletion happened.
* Device B now syncs - it seems this non-deleted item coming in from the server, but also knows it was deleted locally as it holds a tombstone.
Preferring the remote non-deleted copy means the deleted item was "resurrected" - the user knows they deleted it, but it now reappears on every device.
IMO, we should consider deletion as a "latch" - an item can only transtion from not-deleted to deleted and never the other way. This means the user really did modify it on one device but modify it on another, the deletion wins.
::: browser/extensions/formautofill/ProfileStorage.jsm:431
(Diff revision 3)
> + }
> + if (remoteRecord.deleted && !localRecord.deleted) {
> + return localRecord;
> + }
> + if (remoteRecord.deleted && localRecord.deleted) {
> + return null;
I think this speaks to Kit's point above a little - here it seems like returning *either* record is fine - we want to keep a single record with a tombstone. null seems to imply they are different and should be "forked"?
::: browser/extensions/formautofill/ProfileStorage.jsm:441
(Diff revision 3)
> +
> + if (!migratedLocalRecord || !migratedRemoteRecord) {
> + return null;
> + }
> +
> + if (this._isRawIdentical(migratedLocalRecord, migratedRemoteRecord)) {
IIUC, if we find 2 records both with .version > this.version, then we will not check fields we don't know about?
eg,
this.version = 1
this.VALID_FIELDS = ["foo"]
and here we have 2 records:
{ version: 2, foo: "bar", pet: "dog" }
{ version: 2, foo: "bar", pet: "cat" }
(ie, version 2 of the addon introduced "pet", but this device is on version 1 so has no knowledge of that)
ISTM that `_isRawIdentical` would return true, which seems wrong.
::: browser/extensions/formautofill/test/unit/test_addressRecords.js:301
(Diff revision 3)
> +
> + const ITEM_2_modified_familyName = Object.assign({}, ITEM_2, {
> + "family-name": "DUMMY",
> + });
> +
> + Assert.equal(profileStorage.addresses._isRawIdentical(ITEM_1, ITEM_2_modified_familyName), false);
The semantics here seem strange - I'd have thought that if one of the records is missing a field and the other has a value, we'd want them treated the same.
::: browser/extensions/formautofill/test/unit/test_addressRecords.js:541
(Diff revision 3)
> + deleted: true,
> + }, {
> + guid: "DUMMY_1000",
> + deleted: true,
> + }), null);
> +});
In general, I think these tests are faithfully checking the implementation does what it does. However, they don't seem to be capturing any of the semantics we desire (eg, as I mentioned above, how we handle versions > this.version when unknown fields are identical vs when they are different, when records of the same version are identical except one has an additional field not present in one, etc.) Comments indicating *why* we get these results would also help - eg, "an incoming record has one additional field that is not present locally. We treat this as {how}, because {why}". IOW, the semantics Sync actually needs here don't seem to be covered.
I also think a separate test file makes sense, and we shuold avoid too much duplication between CC and addresses - eg, making adjustments to these tests seems more difficult than it should be, as identical changes would need to be made in 2 places when there's actually only 1 implementation in use.
::: browser/extensions/formautofill/test/unit/test_creditCardRecords.js:240
(Diff revision 3)
> () => profileStorage.creditCards.update(guid, TEST_CREDIT_CARD_WITH_INVALID_FIELD),
> /"invalidField" is not a valid field\./
> );
> });
>
> +add_task(async function test_findCommonFieldsBySchemaVersion() {
TBH I don't think copy/paste of these tests between these files are worthwhile - we know that the implementation being tested is identical between the 2.
Comment 7•8 years ago
|
||
I wonder if it would help if we defined the semantics we want before we commit it to code. Maybe these could be done by way of "scenarios"?
eg (and note that my "results" below are just a strawman - it may be that we choose a different result - the point is to clarify the semantics even if I haven't got them completely correct here):
In all scenarios below, this.version = 1 (so a field "pet" is unknown to this implementation)
Duplicate semantics (ie, where records have different GUIDs)
local-record: {version: 1, "given-name": "Mark"}
remote-record: {version: 1, "given-name": "Bob"}
result: not duplicate
local-record: {version: 1, "given-name": "Mark"}
remote-record: {version: 1, "given-name": "Mark", "family-name": "Hammond}
result: considered a duplicate, local record updated to remote record? (or maybe not - discuss?)
local-record: {version: 2, "given-name": "Mark", pet: "cat"}
remote-record: {version: 1, "given-name": "Mark"}
result: considered a duplicate, local record is unchanged
local-record: {version: 1, "given-name": "Mark"}
remote-record: {version: 2, "given-name": "Mark", pet: "cat"}
result: considered a duplicate, local record is updated to the remote copy.
local-record: {version: 2, "given-name": "Mark"}
remote-record: {version: 2, "given-name": "Mark", pet: "cat"}
result: considered a duplicate, local record is updated to the remote copy.
local-record: {version: 2, "given-name": "Mark", pet: "dog"}
remote-record: {version: 2, "given-name": "Mark", pet: "cat"}
result: not a duplicate, even though we don't know anything about the fields that differ.
etc:
Merging semantics (ie, where records have the same GUIDs)
local-record: {version: 1, "given-name": "Mark"}
remote-record: {version: 1, "given-name": "Bob"}
result: records are forked
local-record: {version: 2, "given-name": "Mark", pet: "cat"}
remote-record: {version: 1, "given-name": "Mark"}
result: local record is unchanged
local-record: {version: 2, "given-name": "Mark", pet: "cat"}
remote-record: {version: 2, "given-name": "Mark", pet "cat"}
result: local record is unchanged (even though we don't know about pet)
local-record: {version: 2, "given-name": "Mark", pet: "cat"}
remote-record: {version: 2, "given-name": "Mark", pet: "dog"}
result: forked (even though we don't know anything about pet)
local-record: {version: 1, "given-name": "Mark", "family-name": "Hammond"}
remote-record: {version: 2, "given-name": "Mark", pet: "dog"}
result: merged - local record ends up as version 2 with *both* family-name and pet fields, record probably needs reupload. Or maybe not - discuss?
Updated•8 years ago
|
Attachment #8867630 -
Flags: feedback?(markh)
Comment 8•8 years ago
|
||
mozreview-review |
Comment on attachment 8867630 [details]
Bug 1361242 - Implement prefer function to handle conflict autofill object.;
https://reviewboard.mozilla.org/r/139206/#review145326
I think mark and kit already provided lots of great comments on the bug, and it seems we're still not in the same page about the timing of triggering `prefer`. I thought it's a pure function that return a merged record or null if unable to be merged(and let others to keep both records with some proper metadata modification), but mark/kit would prefer to return more information return from the `prefer` even it's unable to be merged(or expecting more actions inside prefer?). Maybe we'll need more time to finalize prefer's output and definition.
::: browser/extensions/formautofill/ProfileStorage.jsm:377
(Diff revision 3)
> + }
> + return true;
> + }
> +
> + _migrate(record) {
> + // There is no any migration process for the first version, so any records
I agree with mark that I'm not sure why we need this `_migrate` here. You could simply return the boolean in this function, otherwise people would expect that the the return record will be chagned based on the specific schema or something.
::: browser/extensions/formautofill/ProfileStorage.jsm:413
(Diff revision 3)
> + }
> + return mergedRecord;
> + }
> +
> + /**
> + * Returns the merged result of local record and remote record.
I guess we might need more context about when the `prefer` will be in used.
::: browser/extensions/formautofill/ProfileStorage.jsm:441
(Diff revision 3)
> +
> + if (!migratedLocalRecord || !migratedRemoteRecord) {
> + return null;
> + }
> +
> + if (this._isRawIdentical(migratedLocalRecord, migratedRemoteRecord)) {
That's a good point, even it should rarely happen in real. I think we can keep 2 records in the local(means unable to merge) since it might introduce less problem when upgrading the local schema.
Attachment #8867630 -
Flags: review?(schung)
Updated•8 years ago
|
Updated•8 years ago
|
Whiteboard: [form autofill:M3] → [form autofill:M4]
Comment 9•8 years ago
|
||
Bug 1363995 is going to give us everything we need here, so I don't think we need this.
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → WONTFIX
Updated•8 years ago
|
Attachment #8867630 -
Flags: review?(MattN+bmo)
Attachment #8867630 -
Flags: feedback?(kit)
Updated•8 years ago
|
Whiteboard: [form autofill:M4] → [form autofill]
You need to log in
before you can comment on or make changes to this bug.
Description
•