Use opaque version numbers instead of timestamps

RESOLVED FIXED

Status

RESOLVED FIXED
6 years ago
6 years ago

People

(Reporter: rfkelly, Unassigned)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [qa?])

Attachments

(2 attachments, 2 obsolete attachments)

(Reporter)

Description

6 years ago
Created attachment 657773 [details] [diff] [review]
patch to use opaque version numbers in sync2.0 protocol

Here's a little experiment based on the comments from Bug 778994 - replace all use of timestamps in Sync2.0 with opaque version numbers.

This doesn't actually change the flow of the protocol at all, since we were basically already treating the timestamps as atomic version identifiers.  But removing the restriction that they be actual millisecond timestamps has some advantages:

   * more flexibility in server implementation, (we don't need to throttle writes
     to one per millisecond, we can use normal ints rather than bigints in the db)
   * easier to explain the concurrency model in terms of versions than in terms of 
     timestamps.

The disadvantage is that we no longer track explicit timestamp information at all, so we can't generate human-readable information like "your bookmarks were last synced on Thursday at 2pm".  Not sure this is any great loss, but it's something to consider.

Thoughts?

Anant, if this is positively received then I would also be interested in making a similar change to the AitC protocol. f?ing you on this one for some initial impressions rather than duplicating identical changes into the AitC protocol spec at this stage.
Attachment #657773 - Flags: feedback?(gps)
Attachment #657773 - Flags: feedback?(anant)
Whiteboard: [qa?]

Comment 1

6 years ago
Comment on attachment 657773 [details] [diff] [review]
patch to use opaque version numbers in sync2.0 protocol

rnewman definitely needs to weigh in here.
Attachment #657773 - Flags: feedback?(rnewman)
Considering the protocol and the server alone, I think this solves a lot of problems. But to apply it to the Sync system as a whole and achieve a net improvement, we need a little more finesse.

There are a few places that timestamps are currently used in Sync.

info/collections: when was a collection last changed?
  - Clients use this *only* to decide whether they need to sync. The timestamp isn't compared to a locally generated value.

Protocol headers: what time does the server think it is?
  - Clients use this to bump tracking timestamps, in subsequent requests (XIUS), and potentially could use it to correct clock drift. Ignore the latter for now.

Inside records: when was this record written to the server?
  - Clients use this primarily as a conflict resolution tool. If a record has been modified both locally and on the server, the reconciling algorithm needs to know which one is newer. Right now we do that by assuming that the server clock and the local clock are comparable (ignoring drift).


The first two locations can easily be switched to arbitrarily increasing values. The last is not so straightforward.

The important part is for there to be a comparison operator of fairly fine resolution that can take R(s) and R(c) and determine which is newer. (It's not really enough to know "the local record was modified since I last synced", because by definition any conflict involves both local and remote records having been modified since you last synced!)

There are a few ways I can see this happening.

1. Keep timestamps for records, using version numbers only for the protocol.

2. Keep both timestamps and version numbers for records. Timestamps would only be used if necessary for reconciling, and otherwise we have a Perfect System® ("CLUUUUUUU!").

3. Use frequent syncs and bookkeeping on the client to maintain an approximate range mapping between local timestamps and remote versions. (My theory is rusty, but I think this is equivalent to using a local vector clock and synchronizing the two.)

4. Provide a clock lookup facility on the server to allow clients to determine an approximate time for a given version. This could be packed into an info/collections response as a timeline…


Note that some of these approaches also approximate a solution to human-readable timestamping, as described in Comment 0.
I see no reason why we couldn't continue to store a timestamp associated with each record that isn't indexed in any particular way and just comes back as a piece of metadata. We lose the ability to say "you last synched at 3PM" (though there are ways we could do this if we felt it was important), but I think that addresses the major concerns.

This eliminates clock drift. Info/collections would just return a set of version numbers. Internal timestamps might help with reconciliation, but an increasing set of versionids would do that too.

Timestamp is only meaningful given direct user contextualization. For the computer, it's no different than any other increasing value.
(In reply to Toby Elliott [:telliott] from comment #3)
> I see no reason why we couldn't continue to store a timestamp associated
> with each record that isn't indexed in any particular way and just comes
> back as a piece of metadata. We lose the ability to say "you last synched at
> 3PM" (though there are ways we could do this if we felt it was important),
> but I think that addresses the major concerns.
> 
> Internal timestamps might help with reconciliation, but an
> increasing set of versionids would do that too.

Except that clients (such as Fennec) are only aware of the time, not the corresponding server version at the instant at which a local change has occurred.

The problem with version IDs for reconciling is that for reconciling we really care about which happened first, so the version IDs (from both local and remote sources) need a discoverable order.

> Timestamp is only meaningful given direct user contextualization. For the
> computer, it's no different than any other increasing value.

Not quite true. A timestamp is synchronized against an external sequence; two timestamps taken from arbitrary computer systems are (modulo errors) directly comparable.

By contrast, you can't take the internal IDs from my places.sqlite and Fennec's browser.db (two monotonically increasing values…) and deduce which one changed most recently. This comparison is what we need to reconcile two conflicting records from different sources.
Ah, I see your concern. You're expecting that we wouldn't store the associated versionid somewhere in places, making it impossible to reconcile individual records. That is true. Is there nowhere we can stash this info? We don't need to index on it.

If not, we'd need to get creative with the version number and go with seconds+versionid. That should give us enough granularity for client purposes while still being technically opaque.
(In reply to Toby Elliott [:telliott] from comment #5)
> Ah, I see your concern. You're expecting that we wouldn't store the
> associated versionid somewhere in places, making it impossible to reconcile
> individual records. That is true. Is there nowhere we can stash this info?
> We don't need to index on it.

Storing the version ID from the server is enough to tell if the server version is newer than an unmodified local record -- the version from the server is larger than the version we have stored. (But that's kinda implied in any case, or we wouldn't be fetching it.)

It's not enough to resolve conflicts. The client would need a 'dictionary' timeline to determine the associated timestamp of each server version (or, conversely, the version range in which a local change occurred), because the server timeline is progressing between syncs.

Let's say Fennec syncs at time T0 = V0.

At T5, it modifies a bookmark.

Another client is making concurrent modifications (it doesn't matter in what time period!), and syncs a few times.

At T20 = V10, Fennec fetches that bookmark from the server, and sees that it was modified at V4.

Did V4 occur before or after T5?

That's the problem.


"OK, so what if we use versions on the client?" Doesn't solve the problem. Just change 'T' to 'V[c]' in the above.

Reconciling works right now because both server and client are using the same scale, so we don't need to coordinate explicitly. To remove that external tie to a universal clock, we need to solve distributed version alignment.

We can do that for clients that are always connected -- Tnow is always newer. We can do it if we use some kind of version interleaving, where a client is issued a token and locally stamps it when a change occurs, then asks the server to figure things out and give it a real ordered version number. But it's probably easier to just timestamp each record, either on the server as we do now, or on the client in each record payload that we upload.

> If not, we'd need to get creative with the version number and go with
> seconds+versionid. That should give us enough granularity for client
> purposes while still being technically opaque.

So long as a function exists:

  [-1, 0, 1] compare(millseconds local, version remote);

the client will be happy.
(Moving this out of Server.)
Component: Server: Sync → General
OS: Linux → All
Hardware: x86 → All
I think the expectation is that the version modified at T5 has a version number of V0 - until the client reconciles the item with the remote server, it's assumed to be behind.

What's the strategy for this in the current system? If the client hasn't updated since T1 and modifies a bookmark at T3, then discovers a T2 version on the server blocking the upload, how does it reconcile? It doesn't seem like the relative timestamps are all that important there.
(In reply to Toby Elliott [:telliott] from comment #8)

> What's the strategy for this in the current system? If the client hasn't
> updated since T1 and modifies a bookmark at T3, then discovers a T2 version
> on the server blocking the upload, how does it reconcile? It doesn't seem
> like the relative timestamps are all that important there.

If the local version was modified at T3, and the server at T2, the local version wins.

Similarly, if the local version is modified earlier, it 'loses' to the server version.

Right now, we have a sub-second granularity for this, so == conflicts are rare.


> I think the expectation is that the version modified at T5 has a version
> number of V0 - until the client reconciles the item with the remote server,
> it's assumed to be behind.

It's exactly that reconciling which requires timestamps for accuracy. Let's take my example and put some real numbers in.

(Android devices sync every *eight hours* in practice.)

0730: You add a bookmark on your desktop and sync.
0800: Fennec syncs.
0815: On your desktop you rename the bookmark and sync.
0900: You're on Caltrain, and changed your mind. You rename the bookmark in Fennec.
1800: You get home, and finally your phone has a reliable network connection. It syncs.

You made a change on the phone *after* the desktop change. Desktop synced first against an unmodified server, so the server reflects its change at 0815.

The "assumed to be behind" approach that you outline would have the 45-minute-stale data win the contest when Fennec syncs.

The user expectation (and what the current code does) is that the *chronologically later* change wins in the case of a hard conflict. To preserve this behavior and this user expectation, we need to be able to synchronize the activities of two devices without network connections.
OK, That's the important detail.

It still doesn't require version to be timestampy. It DOES require that timestamp be somewhere in the record body, though. Then, in your above example, the phone sees that there's a newer version of the bookmark, downloads it, compares timestamps and sees that it's safe to overwrite.
(Reporter)

Comment 11

6 years ago
I was *this* *close* to leaving a timestamp field in the body of each BSO record, in addition to the version number.  In the end, I took it out just to see what would happen :-)

If each BSO contained a "modified" field giving last-modified version number, and a "timestamp" field giving last-modified timestamp, would this be sufficient?

Two other possibilities:

 * Clients can put their own timestamp in the BSO payload, making conflict resolution a purely client-driven affair.  This would likely exacerbate any problems with clock drift.

 * We could have the version number include a timestamp in an easily-parsable format, like <timestamp>:<private-server-data>.  This feels clunky to me.
(In reply to Toby Elliott [:telliott] from comment #10)

> It still doesn't require version to be timestampy. It DOES require that
> timestamp be somewhere in the record body, though. Then, in your above
> example, the phone sees that there's a newer version of the bookmark,
> downloads it, compares timestamps and sees that it's safe to overwrite.

"But it's probably easier to just timestamp each record, either on the server as we do now, or on the client in each record payload that we upload."

:)
(Reporter)

Updated

6 years ago
Blocks: 787871
(In reply to Ryan Kelly [:rfkelly] from comment #11)

> If each BSO contained a "modified" field giving last-modified version
> number, and a "timestamp" field giving last-modified timestamp, would this
> be sufficient?

Yes.

You could even go so far as to partition by version in the BSO body, rather than including the version in every object:

{v12345: [{id: ...}, {...}],
 v12346: ...}


>  * Clients can put their own timestamp in the BSO payload, making conflict
> resolution a purely client-driven affair.  This would likely exacerbate any
> problems with clock drift.

Indeed.

>  * We could have the version number include a timestamp in an
> easily-parsable format, like <timestamp>:<private-server-data>.  This feels
> clunky to me.

High-resolution fake decimals? That's how a lot of microsecond random number generators work.
(Reporter)

Comment 14

6 years ago
Created attachment 658791 [details] [diff] [review]
updated patch to use opaque version numbers in sync2.0 protocol

OK, based on this feedback I have added a "timestamp" field to the BSO in addition to its "modified" feel.  This seems like the simplest/cleanest approach to me, keeping versions as an opaque number while still allowing clients to resolve conflicts.  Updated patch attached.

I've also added some more words on the X-Timestamp header, which we would keep given these latest changes.
Attachment #657773 - Attachment is obsolete: true
Attachment #657773 - Flags: feedback?(rnewman)
Attachment #657773 - Flags: feedback?(gps)
Attachment #657773 - Flags: feedback?(anant)
Attachment #658791 - Flags: review?(rnewman)
Attachment #658791 - Flags: review?(gps)
Attachment #658791 - Flags: feedback?(telliott)
Attachment #658791 - Flags: feedback?(anant)
Comment on attachment 658791 [details] [diff] [review]
updated patch to use opaque version numbers in sync2.0 protocol

AitC doesn't strictly need timestamps (opaque version numbers would do just fine), but the presence of them certainly doesn't hurt. Looks good to me!

If we decide to make changes to the BSO as served by the AitC server, the client will need to be updated as we currently reject all unknown  properties. I suspect that's a separate bug though.
Attachment #658791 - Flags: feedback?(anant) → feedback+
Comment on attachment 658791 [details] [diff] [review]
updated patch to use opaque version numbers in sync2.0 protocol

Review of attachment 658791 [details] [diff] [review]:
-----------------------------------------------------------------

f+, but I'd like to see the version-based headers be very distinct from timestamp-esque headers, and there are some clarifications to add.

::: source/storage/apis-2.0.rst
@@ +87,5 @@
>  
> +Version Numbers
> +---------------
> +
> +In order to make it easy for multiple clients to coordinate their changes,

s/make it easy for/allow

@@ +95,5 @@
> +are tracked:
> +
> +    * the current version number for the user's data as a whole.
> +    * the version number at which each individual collection was last modified.
> +    * the version number at which each individual BSO was last modified.

Are these all on the same 'timeline'? I would imagine so, but this document doesn't say that they're comparable and each is modified when the next changes.

@@ +148,5 @@
>  
>      Possible HTTP status codes:
>  
>      - **304 Not Modified:**  no collections have been modified or deleted
> +      since the version in the *X-If-Modified-Since* header.

Alarm! Alarm!

You're now extending a header in a *really weird way*, and one that doesn't make intuitive sense.

Can we please change this to X-If-Version-Outdated, or something similar?

@@ +222,3 @@
>  
> +    - **newer**: a version number. Only objects that were last modified in
> +      versions following the one specified will be returned.

Be explicit about inclusive/exclusive.

@@ +234,5 @@
>        a previous request using the **limit** parameter.
>  
>      - **sort**: sorts the output:
> +       - 'oldest' - orders by last-modified version, oldest first
> +       - 'newest' - orders by last-modified version, newest first

"orders by version number"

@@ +463,4 @@
>      response.
>  
>      It is similar to the standard HTTP **If-Unmodified-Since** header, but the
> +    value is an opaque version number rather than a timestamp.

See earlier comment.

@@ +499,5 @@
>  
> +    This header gives the last-modified version number of the target resource
> +    as seen during processing of the request, and will be included in all
> +    success responses (200, 201, 204).  When given in response to a write
> +    request, this will be equal to the new version number any BSOs created or

X-Current-Version?

@@ +828,5 @@
>  * The WBO fields "parentid" and "predecessorid" have been removed, along with
>    the corresponding query parameters on all requests.
>  
> +* Opaque integer version numbers are now used for tracking and coordination,
> +  rather than decimal second timestamps.

Note that timestamps are still reported in milliseconds rather than seconds! We just do versioning using versions.
Attachment #658791 - Flags: review?(rnewman) → feedback+
(Reporter)

Comment 17

6 years ago
(In reply to Richard Newman [:rnewman] from comment #16)
> @@ +95,5 @@
> > +are tracked:
> > +
> > +    * the current version number for the user's data as a whole.
> > +    * the version number at which each individual collection was last modified.
> > +    * the version number at which each individual BSO was last modified.
> 
> Are these all on the same 'timeline'? I would imagine so, but this document
> doesn't say that they're comparable and each is modified when the next
> changes.

Yes, that's my intention.  I think this ambiguity is key to all of your other comments.

> @@ +148,5 @@
> >  
> >      Possible HTTP status codes:
> >  
> >      - **304 Not Modified:**  no collections have been modified or deleted
> > +      since the version in the *X-If-Modified-Since* header.
> 
> Alarm! Alarm!
> 
> You're now extending a header in a *really weird way*, and one that doesn't
> make intuitive sense.
> 
> Can we please change this to X-If-Version-Outdated, or something similar?

I see your point.  But I wonder if it's the headers or the terminology that needs to change.  Maybe both :-)

In my mind, the "version number" is still a very timeline-y sort of thing.  It's a single monotonically-increasing integer that is a property of the user's data-store as a whole.  But unlike an explicit timestamp, it only changes when the data changes and is not tied to anything in the outside world.

It's possible that calling it something other than "version number" would help the header make more sense.  Saying "no collections have been modified or deleted since point X in the datastore's internal timeline" makes intuitive sense to me.
 
> > +    This header gives the last-modified version number of the target resource
> > +    as seen during processing of the request, and will be included in all
> > +    success responses (200, 201, 204).  When given in response to a write
> > +    request, this will be equal to the new version number any BSOs created or
> 
> X-Current-Version?

Given the above timeline-y view of version number, I don't think it makes sense to speak of "the current version of a resource".

I guess it's like a revision number in subversion.  IMO it doesn't make sense to talk about the "current revision number of a file" in svn, but rather of "the revision number at which this file was last changed".

I'm open to the suggestion that this is not a helpful model - after all, I just compared it to svn!  But I think that *if* the version number should be thought of as an explicitly timeline-like quantity, then it makes sense to use timeline-like headers to deal with it.

At the cost of some rather long headers, could X-Last-Modified-Version, X-If-Modified-Since-Version and X-If-Unmodified-Since-Version be acceptable?
(In reply to Ryan Kelly [:rfkelly] from comment #17)

It's so nice having this timezone offset! :D

> > Are these all on the same 'timeline'? I would imagine so, but this document
> > doesn't say that they're comparable and each is modified when the next
> > changes.
> 
> Yes, that's my intention.  I think this ambiguity is key to all of your
> other comments.

Please clarify in the doc! :D  I already knew the answer, but I could see that some readers would not.

(Loaded question.)

> It's possible that calling it something other than "version number" would
> help the header make more sense.  Saying "no collections have been modified
> or deleted since point X in the datastore's internal timeline" makes
> intuitive sense to me.

To me the problem is that reusing time-based header names implies a correlation with a real-world clock. Correlation with an abstract "clock" is not the issue (for me, anyway).

> > X-Current-Version?
> 
> Given the above timeline-y view of version number, I don't think it makes
> sense to speak of "the current version of a resource".

Agreed.

> I guess it's like a revision number in subversion.  IMO it doesn't make
> sense to talk about the "current revision number of a file" in svn, but
> rather of "the revision number at which this file was last changed".

Yes. That's the thing I'd like to communicate, but reusing the terms from real-time-based headers is misleading, IMO.

> I'm open to the suggestion that this is not a helpful model - after all, I
> just compared it to svn!  But I think that *if* the version number should be
> thought of as an explicitly timeline-like quantity, then it makes sense to
> use timeline-like headers to deal with it.

I think it's valuable to think of it as a timeline-like quantity, in that it's monotonically increasing. I even think it would be valuable to use timeline-like headers… just not ones that are already used to work with seconds and milliseconds. That would violate the principle of least surprise. Version-based clocks are not the obvious implication of "since" etc.

> At the cost of some rather long headers, could X-Last-Modified-Version,
> X-If-Modified-Since-Version and X-If-Unmodified-Since-Version be acceptable?

That works for me! "Since" implies sequentiality, and "Version" says it ain't a UTC clock.

You could use "X-Modified-Version" if you wanted to save some bytes. The others I'm not sure how to abbreviate and retain readability. (Damn you, HTTP!)

Maybe X-V-If-Modified-Since etc.?
(Reporter)

Comment 19

6 years ago
Comment on attachment 658791 [details] [diff] [review]
updated patch to use opaque version numbers in sync2.0 protocol

cancelling remaining flags on this version of the patch, I'll redo it with :rnewman's feedback
Attachment #658791 - Flags: review?(gps)
Attachment #658791 - Flags: feedback?(telliott)
(Reporter)

Comment 20

6 years ago
Created attachment 662446 [details] [diff] [review]
updated version-numbers spec based on feedback

OK, taking another crack at this based on the above feedback.  I've tried to clarify that the version number is a single integer associated with the stored data as a whole, and that the last-modified version numbers point to previous incarnations of this value.

I'm not thrilled about the new, longer header names, but I don't have anything better at this stage.  X-If-[Not]-Version-Above?  Bikeshedding welcome!
Attachment #662446 - Flags: review?(rnewman)
Attachment #662446 - Flags: review?(gps)
Comment on attachment 662446 [details] [diff] [review]
updated version-numbers spec based on feedback

Review of attachment 662446 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good!

::: source/storage/apis-2.0.rst
@@ +689,5 @@
>  
> +  * Allocate a new version number, larger than the current version number
> +    of user's stored data.  Call this version number `V`.
> +  * Create any new BSOs as specified by the request, setting their "modified"
> +    field to `V`.

Also mention timestamps here; we've gone from one field to two.

@@ +809,4 @@
>  process should be repeated; if it does not then all available items have been
>  returned.
>  
> +To guard against other clients making concurrent changes to the,

Comma in the wrong place.
Attachment #662446 - Flags: review?(rnewman) → review+
Comment on attachment 662446 [details] [diff] [review]
updated version-numbers spec based on feedback

Review of attachment 662446 [details] [diff] [review]:
-----------------------------------------------------------------

I could probably bikeshed about header names, but I've got nothing better to suggest so I won't.

::: source/storage/apis-2.0.rst
@@ +41,5 @@
>  +---------------+-----------+------------+---------------------------------------------------------------+
> +| modified      | none      | integer    | The version number at which this object was last modified.    |
> +|               |           | 16 digits  | This is a server-assigned integer that is set automatically   |
> +|               |           |            | on each write; any client-supplied value for this field is    |
> +|               |           |            | ignored.                                                      |

How about "serverversion"? This states what it actually is and avoids confusion with previous versions of the protocol.
Attachment #662446 - Flags: review?(gps) → review+
(Reporter)

Comment 23

6 years ago
(In reply to Gregory Szorc [:gps] from comment #22)
> How about "serverversion"? This states what it actually is and avoids
> confusion with previous versions of the protocol.

Would just "version" be acceptable?  The longer version seems a little unwieldy to me, since nothing else in the data talks about "client" or "server".
(Reporter)

Comment 24

6 years ago
Comment on attachment 658791 [details] [diff] [review]
updated patch to use opaque version numbers in sync2.0 protocol

marking previous patch obsolete
Attachment #658791 - Attachment is obsolete: true
(Reporter)

Comment 26

6 years ago
Created attachment 667840 [details] [diff] [review]
code to use opaque version numbers instead of timestamps

Attaching the code that needs to go along with this change.  It's a whole lot of renaming, both of internal function/variable names and of the client-facing header names.

This does not change of version numbers are allocated, there's still just millisecond timestamps internally.  I will file a separate bug for changing them to something better.
Attachment #667840 - Flags: review?(telliott)
Comment on attachment 667840 [details] [diff] [review]
code to use opaque version numbers instead of timestamps

Code looks fine. As I currently read it, is there ever a difference between the timestamp and version?
Attachment #667840 - Flags: review?(telliott) → review+
(Reporter)

Comment 28

6 years ago
(In reply to Toby Elliott [:telliott] from comment #27)
> As I currently read it, is there ever a difference between
> the timestamp and version?

They are no longer synchronized.  So if you happen to do a write at precisely the right time, I think you might observe a version number that does not match the corresponding timestamp.
(Reporter)

Comment 29

6 years ago
Committed in https://github.com/mozilla-services/server-syncstorage/commit/52e62baf8c8883622aa96abedac8b8492f41a1d5

There's plenty more than can be done to implement this better, but I'll leave that for separate follow-up bugs.
Status: NEW → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.