Closed Bug 951986 Opened 11 years ago Closed 10 years ago

Documentation for "sync1.5" protocol

Categories

(Cloud Services Graveyard :: Server: Sync, defect, P1)

defect

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: rfkelly, Assigned: rfkelly)

References

(Blocks 1 open bug)

Details

(Whiteboard: [qa+])

Attachments

(2 files, 2 obsolete files)

As part of the push to ship a firefox-accounts-enabled sync, we plan to make a new revision of the sync storage protocol.  It will be a much smaller diff than the ill-fated "sync2.0" protocol, but the very least we can use this opportunity to officially remove a bunch of deprecated protocol features.
Blocks: 951987
OS: Windows 7 → All
Hardware: x86_64 → All
Whiteboard: [qa+]
Linking bug with description of the protocol-deprecation indicators.
Depends on: 908467
Attached file apis-1.2.rst (obsolete) —
OK, here's basically what I have in mind based on discussions last month.  I've used the style and detail of the old sync2.0 docs (because hey, they already exist and have cool examples and stuff) but tried to change as little as possible from the current protocol.  I wound up going with "sync1.2" rather than "sync1.5" because the delta really is pretty small.

The bottom of the document contains a table of changes to the protocol, along with a brief justification for each.  They fall into two categories:  removal of things that the client doesn't actually use, and useful additions we can make without requiring existing client code to change.  In theory, we could point existing client code at a server speaking this protocol and it would Just Work.  I think.

AFAICT the only one that doesn't fit these categories is the "409 Conflict" response.  Personally I think it's low-risk and high-value enough to consider, but defer to client-side judgement for the final call on that one.

Thoughts?

If we feel there's enough good stuff here to justify going ahead with all this, my next step is to try implementing it as a delta against one of the server codebases (likely sync2.0) and sanity-check that we could actually stand up such a server by the looming deadline.
Attachment #8355442 - Flags: feedback?(telliott)
Attachment #8355442 - Flags: feedback?(rnewman)
Comment on attachment 8355442 [details]
apis-1.2.rst

> store and retrieve simple objects called **Weave Basic Objects**

I think we should remove as many Weave references as possible here. Basic Storage Object?

> They *should* be exactly 12 characters; while this isn't enforced by the server, the Firefox client expects it in most cases.

Why don't we enforce it? Going down from var-64 to fixed-12 characters in the db would be a pretty solid storage win (it's part of all the indexes). If we're doing 12-character IDs, let's mandate it.

> If not specified or null, the record will not expire.

Do we need to do this? Can't we specify a default per-collection so that if you want to keep a record, you need to explicitly specify a large ttl? Not having a default puts us in that ugly place where we don't know user intent.

Is ttl returned by the server? It's implied here, but not in the example. I don't think we do.

>Collection names may only contain characters from the urlsafe-base64 alphabet (i.e. alphanumeric characters, underscore and hyphen) and the period.

We should specify a maximum length. We should also revisit the idea of custom collections and make sure everyone is in the same place here. We've shifted thinking on that a few times.

>The user's SyncStorage endpoint URL can be obtained via the Sagrada Discovery and Authentication workflow [1]_.

I don't think we should have references to Sagrada at this point. The discovery workflow is almost certainly nonexistent. In general, this section needs to be rewritten to point to tokenserver specifically.

>Request and response bodies are all UTF8-encoded JSON unless otherwise specified.

I think this should be in the opening section.

>APIs in this section provide a facility for obtaining general info for the authenticated user.

I like your second phrasing of this better: "APIs in this section provide a mechanism for getting high-level user storage information" or something.


>**304 Not Modified:**

You cover this pretty well in the response codes section. I don't think it needs additional highlighting in the individual request documentation. That's true of a lot of the response-possibilities in this section. I'd only call them out if there's something special in their meaning

> The second item will be null if the server does not enforce quotes

Typo at the end there. I also don't think 304 is a likely response for this request.

> **https://<endpoint-url>/info/collection_usage**
> **https://<endpoint-url>/info/collection_counts**

Is the client actually making use of these things right now? If not, I'd kind of like to remove them for a while. They're expensive.

> The response will include an *X-Weave-Records* header indicating the total number of records to expect in the body.

This is actually a subtle change from 1.1. In 1.1, it was an optional header if the db supported it, and if they don't, then providing it is really ineffecient. I'd like to be certain that this is valuable before we make promises about it. (It may have been one of those things the early devs asked for and then possibly abandoned)

> - **400 Bad Request:**  too many ids where included in the query parameter.

s/where/were/


>The request body must be a JSON object giving new data for the WBO.

s/giving/containing/

>Successful responses will contain a JSON object with details of success or failure for each WBO.

If there are no successes/failures, do we return that element?

> Multi-Collection Interaction

Given that there's only the one, I think you could throw it into the high-level collection interaction section.

>**X-If-Modified-Since**

Does the header have to have 2 decimal points of precision? Do we throw an error if it's an integer or provides more?

>**513 Is This Even A Real Code?**

Snerk


> all the examples on how to do concurrency

These are fine, but should be a separate page, since they're not really the API.
Attachment #8355442 - Flags: feedback?(telliott) → feedback-
Awesome, thanks for the detailed feedback.  Some initial thoughts below:

>> store and retrieve simple objects called **Weave Basic Objects**
> I think we should remove as many Weave references as possible here. Basic Storage Object?

Works for me.

> I don't think we should have references to Sagrada at this point. The discovery workflow is almost
> certainly nonexistent. In general, this section needs to be rewritten to point to tokenserver
> specifically.

Cool, then I can replace links to wiki pages with internal links within the API docs.

>> The response will include an *X-Weave-Records* header indicating the total number of records to expect in the body.
> This is actually a subtle change from 1.1. In 1.1, it was an optional header if the db supported it,

Good catch; I intend the behavior to stay unchanged here but the wording got lost.

>> **513 Is This Even A Real Code?**
> Snerk

In all seriousness though, my google-fu failed to turn up a standard "reason" string for this response code.  Is it from an extension RFC, or something we just decided to use internally?
513 was proposed (and accepted) in Bug 895518. Dunno where the inspiration for that was.
Aha, I guess it's a telliott original: https://bugzilla.mozilla.org/show_bug.cgi?id=895518#c5

Something like "513 Service Obsolete" works for me :-)
>> They *should* be exactly 12 characters; while this isn't enforced by the server, the Firefox client expects it in most cases.
> Why don't we enforce it? Going down from var-64 to fixed-12 characters in the db would be a pretty
> solid storage win (it's part of all the indexes). If we're doing 12-character IDs, let's mandate it.

I defer to :rnewman as to whether this is safe.  It might break some third-party sync engines if they're not following this guideline as strictly as they should.
>> If not specified or null, the record will not expire.
> Do we need to do this? Can't we specify a default per-collection so that if you want to keep a record, > you need to explicitly specify a large ttl? Not having a default puts us in that ugly place where we
> don't know user intent.

IIRC the current client code depends on "never expires" being the default, I don't think we can change this in a backwards-compatible manner.

> Is ttl returned by the server? It's implied here, but not in the example. I don't think we do.

It's not, will clarify.
(In reply to Ryan Kelly [:rfkelly] from comment #8)

> IIRC the current client code depends on "never expires" being the default, I
> don't think we can change this in a backwards-compatible manner.

Correct. (I'm pretty sure we talked about this before, no?)

We can change _our_ clients. The key is that third-party Sync engines probably don't set a TTL. We could work around that, too, but how much do you care? All we'd do instead is send some huge TTL as part of the request; right now that's implicit, which is a little cheaper both on the client and on the wire.


(In reply to Ryan Kelly [:rfkelly] from comment #7)

> I defer to :rnewman as to whether this is safe.  It might break some
> third-party sync engines if they're not following this guideline as strictly
> as they should.

All of our own code should be 12-char-safe, yes. There's nothing in the Sync implementation on desktop to restrict third-party engines, so it's quite probable that they don't follow the same guideline.

Again, this is something we can change if we want; it'll only affect documentation on the client. But I'd probably push some of these changes back to a future storage implementation, in which we can offer things like salted IDs to avoid imposing pain on engines.
>> **https://<endpoint-url>/info/collection_usage**
>> **https://<endpoint-url>/info/collection_counts**
> Is the client actually making use of these things right now? If not, I'd kind
> of like to remove them for a while. They're expensive.

Unsure about collection_counts, but the usage info is exposed in the UI.
(In reply to Richard Newman [:rnewman] from comment #9)

> All of our own code should be 12-char-safe, yes. There's nothing in the Sync
> implementation on desktop to restrict third-party engines, so it's quite
> probable that they don't follow the same guideline.
> 

I'll follow up with bobm to see if we can find >12 character records in the system. Aside from Dolphin, do we know of any third party engines?


> Again, this is something we can change if we want; it'll only affect
> documentation on the client. But I'd probably push some of these changes
> back to a future storage implementation, in which we can offer things like
> salted IDs to avoid imposing pain on engines.

It's a potentially big win that can be done invisibly if nobody is using it. It's the same calculus as dropping predecessorid - if nobody is using it, we benefit from killing it.

Assuming we're keeping separate dbs for old and new (still under debate, obviously), then we'll have an api bump to accommodate the auth changes. Obviously, we're trying to minimize your work, but this is one that, as you say, is only documentation.
(In reply to Ryan Kelly [:rfkelly] from comment #10)
> >> **https://<endpoint-url>/info/collection_usage**
> >> **https://<endpoint-url>/info/collection_counts**
> > Is the client actually making use of these things right now? If not, I'd kind
> > of like to remove them for a while. They're expensive.
> 
> Unsure about collection_counts, but the usage info is exposed in the UI.

Right. The question is more whether the new UI uses it. We aren't going to turn it off in the old one, and I'd like a conscious statement of "we'd like to retain it in the new UI" before just keeping it by default.
Attached file apis-1.2.rst (obsolete) —
Attaching updated version based on Toby's feedback.  I've kept the examples in the same document for now just to make managing this bug easier, happy to split them out into a sub-page for final commit.
Attachment #8355442 - Attachment is obsolete: true
Attachment #8355442 - Flags: feedback?(rnewman)
Attachment #8355973 - Flags: feedback?(rnewman)
(In reply to Ryan Kelly [:rfkelly] from comment #10)

> Unsure about collection_counts, but the usage info is exposed in the UI.

Android uses collection_counts to find out how many bookmarks we're about to screw up.

(In reply to Toby Elliott [:telliott] from comment #12)

> Right. The question is more whether the new UI uses it. We aren't going to
> turn it off in the old one, and I'd like a conscious statement of "we'd like
> to retain it in the new UI" before just keeping it by default.

I've seen a number of users try to find out how much data they have in the system, for various reasons ("did my stuff just disappear? am I near my quota?"). That doesn't mean that collection_usage is the best way to do that, but I'd be hesitant to remove the only way to ask the server for how much data it's storing.
(In reply to Toby Elliott [:telliott] from comment #11)

> I'll follow up with bobm to see if we can find >12 character records in the
> system. Aside from Dolphin, do we know of any third party engines?

There are third-party clients, third-party apps, and third-party engines.

Clients: Blackberry (BerrySync?), maybe others. These might be all over the map (e.g., syncing emails, contacts, yadda yadda).

Apps: I've seen reference made to todo apps etc. in previous collection inquiries, and several StackOverflow etc. posts about getting libraries to talk to a Sync server.

Engines: AdBlock Plus, Flashblock, probably others.

Grabbing the list of custom collection names is probably a good starting point.


> It's a potentially big win that can be done invisibly if nobody is using it.
> It's the same calculus as dropping predecessorid - if nobody is using it, we
> benefit from killing it.
> 
> Assuming we're keeping separate dbs for old and new (still under debate,
> obviously), then we'll have an api bump to accommodate the auth changes.
> Obviously, we're trying to minimize your work, but this is one that, as you
> say, is only documentation.

Yup. The trick will be somehow letting add-on authors know that their Sync engine no longer works.
(In reply to Richard Newman [:rnewman] from comment #14)
> 
> Android uses collection_counts to find out how many bookmarks we're about to
> screw up.
> 

Bleah, OK. We'll keep it. It's a known pain point, though.

(In reply to Richard Newman [:rnewman] from comment #14)

> Yup. The trick will be somehow letting add-on authors know that their Sync
> engine no longer works.

Well, we already have that problem insofar as their engine isn't going to work moving forwards and we have to communicate that. This change goes hand in hand with the auth change (with the caveats noted above that migration is still not locked down). We're not going to do it to people using the current system.

If we go with the sync-1.1-supports-new-auth approach and share a db, then all bets are off.
(In reply to Toby Elliott [:telliott] from comment #16)

> Bleah, OK. We'll keep it. It's a known pain point, though.

I'd be happy to switch to a method that returns the count for a single collection, if that would help. (Extra points if it's back-compatible, like ?only=bookmarks, returning the same structured response.)

> Well, we already have that problem insofar as their engine isn't going to
> work moving forwards and we have to communicate that.

Note that it will lock out third-party *apps* and *clients*. Engines -- the pluggable bits of JS that conform to desktop Sync's internal APIs -- will continue to work. They just might misbehave (such as if we now mandate a TTL, and they don't provide one).
Attached file apis-1.5.rst
Ugh, sync1.1 doesn't restrict the character set used for WBO ids.  I've dropped it from "urlsafe-base64" to "printable ASCII" for compatibility.  Bob is running some server db analysis to see if we can tighten it up any further.

I also went back to "sync1.5" since that's what everyone is using to talk about this thing.
Attachment #8355973 - Attachment is obsolete: true
Attachment #8355973 - Flags: feedback?(rnewman)
Attachment #8356884 - Flags: feedback?(rnewman)
Comment on attachment 8356884 [details]
apis-1.5.rst

The documentation gods thank you for your offering. Many pages, much pleased.

Did you intentionally leave in mentions of Sagrada? And are we still using something we'd call HAWK?

> | The previously-undocumented               | This actually *is* used so we better document it. |
> | *X-Weave-Quota-Remaining* header has been |    

Period after "been".

> | added and can be used on all GET request. | will allow future client code to avoid            |

s/request/requests

Could you please file four bugs for desktop and Android clients to stop sending X-Confirm-Delete, and to start handling 409 Conflict?
Attachment #8356884 - Flags: feedback?(rnewman) → feedback+
> Did you intentionally leave in mentions of Sagrada?

No, I must have missed one, will remove.

> And are we still using something we'd call HAWK?

Yes, the request-signing scheme is still HAWK.
Blocks: 959032
Blocks: 959033
Blocks: 959034
Blocks: 959035
Depends on: 959878
Notes from IRC:  if we can get some of the linked bugs fixed by FF29, we'll drop the max-length of BSO ids to 16.  This should provide a nice win for the server by reducing the size of the index.

At the very least, we should include a sternly-worded warning in the docs that they'll be limited to 16 in the next iteration of the protocol.

Richard, while we're on the subject, can we safely reduce the allowed character set from "printable ascii" to something a bit stricter?
Flags: needinfo?(rnewman)
(In reply to Ryan Kelly [:rfkelly] from comment #21)

> Richard, while we're on the subject, can we safely reduce the allowed
> character set from "printable ascii" to something a bit stricter?

Can I get back to you on that? Unlikely to change in Mozilla clients before 29, so I presume this is just a doc thing…
Flags: needinfo?(rnewman)
> Can I get back to you on that? Unlikely to change in Mozilla clients before 29,
> so I presume this is just a doc thing…

Yep.  Not looking to change the code, just wondering if we can tighten up the docs based on what the code actually does.
Leaving needinfo on me for that, then.
Flags: needinfo?(rnewman)
Depends on: 960878
Testing uncovered another difference that's not highlighted in the doc - "GET /storage/collection" on a non-existent collection now returns a 404, whereas in sync1.1 it returns an empty list.  We're having a quick IRL chat about whether to keep it or revert it...
Preliminary opinion is to keep it as a 404 and shake out any hopefully-super-easy-to-fix client bugs that this triggers as we go.
Review of android codebase suggests that the changed emtpy-collection handling is too much risk for too little gain.  I'm adding an explicit backwards-compatibility note to the docs and will update the server to handle this explicitly.
Attachment #8364699 - Flags: review?(nalexander)
Comment on attachment 8364699 [details] [diff] [review]
sync1.5-empty-collections.diff

clearing a tiny and rather pointless review from :nalexander's plate
Attachment #8364699 - Flags: review?(nalexander)
Depends on: 972074
Priority: -- → P1
No longer depends on: 959878
:rfkelly hows are progress here?
What changes are likely to be added now that 1.5 is post-release?
Status: NEW → ASSIGNED
It's released, whatever we've got is what we're stuck with :-)
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Flags: needinfo?(rnewman)
Resolution: --- → FIXED
Status: RESOLVED → VERIFIED
Product: Cloud Services → Cloud Services Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: