Closed Bug 1361178 Opened 7 years ago Closed 7 years ago

go-sync limits GET BSO payload sizes

Categories

(Cloud Services Graveyard :: Server: Sync, enhancement)


Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: mostlygeek, Assigned: mostlygeek)

References

Details

A sync server should not limit the number of BSO records that can be returned on a fetch. This has been addressed in PR 167 [1].

As a workaround, the production go-sync box's GET limit has been increased to 100,000 records by setting LIMIT_MAX_BSO_GET_LIMIT=100000 in the configuration.

[1] https://github.com/mozilla-services/go-syncstorage/pull/167
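
For context on the failure mode: the server applies a hard cap to the limit on BSO GET requests, so a fetch the client intended to be unlimited comes back silently truncated. Below is a minimal Go sketch of that clamping behaviour; the names and the cap value are illustrative, not the actual go-syncstorage code.

    // Sketch of how a configured maximum silently truncates an
    // "unlimited" GET. Names are illustrative.
    package main

    import (
        "fmt"
        "strconv"
    )

    const maxBSOGetLimit = 1000 // cf. LIMIT_MAX_BSO_GET_LIMIT

    // effectiveLimit clamps the client's requested limit to the server cap.
    // A client that sends no limit (expecting everything back) still gets
    // at most maxBSOGetLimit records.
    func effectiveLimit(rawLimit string) int {
        if rawLimit == "" {
            return maxBSOGetLimit // "unlimited" request is silently capped
        }
        n, err := strconv.Atoi(rawLimit)
        if err != nil || n <= 0 || n > maxBSOGetLimit {
            return maxBSOGetLimit
        }
        return n
    }

    func main() {
        fmt.Println(effectiveLimit(""))     // 1000, not "all records"
        fmt.Println(effectiveLimit("5000")) // 1000
        fmt.Println(effectiveLimit("200"))  // 200
    }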
Added a few people to the CC list. 

Not sure how bad the effect of this bug is. If a client requests an unbounded (no limit) GET for all of their BSOs and only gets back 1000 of them, how broken are they? Do they have any chance of recovering?

I suppose if they didn't get their bookmark folder BSOs, they'd be in a pretty bad state.
Status: NEW → ASSIGNED
Component: Firefox Sync: Backend → Server: Sync
(In reply to Benson Wong [:mostlygeek] from comment #1)
> Not sure how bad the effect of this bug is. If a client requests an
> unbounded (no limit) GET for all of their BSOs and only gets back 1000 of
> them how broken are they?

Very :)

> Do they have any chance of recovering? 

Probably not, unless the BSOs were returned in date order (in which case a next sync should get more, and they might correctly reparent on the later sync when they appear).
> (in which case a next sync should get more, and they might correctly reparent on the later sync when they appear.)

Even then, ISTM the client will think "I've got all BSOs modified before time X" and then any future requests will have "?newer=X" on them, and will not return the skipped bookmarks.
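
To make that concrete, here is an illustrative Go sketch of the bookkeeping being described (assumed client behaviour, not actual desktop or Android code): on a "successful" fetch the client advances its high-water mark to the collection's last-modified time, so anything the cap dropped falls permanently behind ?newer=X.

    // Illustrative only. Five records exist; the cap lets three through,
    // yet the client records the collection's last-modified time as its
    // high-water mark, so the skipped records are never requested again.
    package main

    import "fmt"

    type bso struct {
        id       string
        modified float64 // server timestamp, seconds
    }

    func main() {
        collection := []bso{
            {"a", 100}, {"b", 200}, {"c", 300}, {"d", 400}, {"e", 500},
        }
        const serverCap = 3
        returned := collection[:serverCap]

        // Client books a successful sync: X := collection last-modified
        // (500 here), even though d and e were never delivered.
        lastSync := collection[len(collection)-1].modified

        // Every later incremental sync asks for ?newer=500 ...
        var later []bso
        for _, b := range collection {
            if b.modified > lastSync {
                later = append(later, b)
            }
        }
        fmt.Printf("first fetch delivered %d of %d records\n",
            len(returned), len(collection))
        fmt.Printf("?newer=%v returns %d records; the skipped ones never come back\n",
            lastSync, len(later))
    }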
(In reply to Mark Hammond [:markh] from comment #2)

> Probably not, unless the BSOs returned were returned in date order (in which
> case a next sync should get more, and they might correctly reparent on the
> later sync when they appear.)

Nah, the client will have advanced its timestamp on success -- it'll only retrieve the skipped records if they're modified or something causes the client to reset and resync.
Hmm. Even if the bug is fixed, it seems safe to migrate everybody off the server.
Deployed mozilla/go-syncstorage:1.6.7r3 to the two production go servers.
:grisha, should we migrate iOS and Android users from these servers due to this bug?
Flags: needinfo?(gkruglov)
(In reply to Bob Micheletto [:bobm] from comment #7)
> :grisha, should we migrate iOS and Android users from these servers due to
> this bug?

Why do we think this only affects mobile clients?
(In reply to Richard Newman [:rnewman] from comment #8)

> Why do we think this only affects mobile clients?

As far as I can tell from reading the code, desktop still fetches non-history collections without a limit, and the history limit is 5000, so I expect desktop to be just as affected as any other client.

You should node-reassign every user with more than 1000 records in any collection.
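
A hedged Go sketch of that selection criterion; how the per-user, per-collection counts are actually gathered (SQL against the storage backend, an admin tool, etc.) is deployment-specific and not shown here.

    // Pick users to node-reassign: anyone with more than 1000 BSOs in
    // any single collection. Data shape and numbers are illustrative.
    package main

    import "fmt"

    const getCap = 1000 // the cap that truncated GET responses

    // counts: userID -> collection name -> number of BSOs.
    func affectedUsers(counts map[string]map[string]int) []string {
        var out []string
        for uid, cols := range counts {
            for _, n := range cols {
                if n > getCap {
                    out = append(out, uid)
                    break // one oversized collection is enough
                }
            }
        }
        return out
    }

    func main() {
        counts := map[string]map[string]int{
            "user-1": {"bookmarks": 250, "history": 4800},
            "user-2": {"bookmarks": 120, "history": 900},
        }
        fmt.Println(affectedUsers(counts)) // [user-1]
    }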
Seems like a reasonable thing to do, unfortunately. Otherwise we have clients with partial views of their collections. Migration will mean that clients might even get a chance to upload their partial view of the data to a new node, which is bad. It seems like either way there's a good chance we're going to corrupt data, and migrating sooner will hopefully make the situation a bit better.

Alternatively:
I wonder if bumping modified timestamps for all records in all collections might also do the trick? It will force clients to re-download all of their collections. We're probably going to run into various reconciliation bugs here, but I reckon that's better than potentially losing subsets of users' data by having them upload incomplete collections in some cases.

Perhaps we could do that only for the more structure-sensitive collections.

Is that a reasonable approach, or would it be too hard to orchestrate?
Flags: needinfo?(gkruglov)
(In reply to :Grisha Kruglov from comment #10)

> Migration will mean that
> clients might even get a chance to upload their partial view of the data to
> a new node, which is bad. 

Client's conflict resolution couldn't/wouldn't catch this?
> > Migration will mean that
> > clients might even get a chance to upload their partial view of the data to
> > a new node, which is bad. 

Older Android clients will have _already_ screwed everything up: once Android is confident that it's downloaded everything, it fixes up bookmark trees and uploads whatever changes it needs.


> Client's conflict resolution couldn't/wouldn't catch this?

There's no conflict to detect. This is just part and parcel of Sync: we assume that it's safe for any client to restore to a server, and that once all clients have synced they'll all converge on the same good place. Naturally that assumption isn't true if a client is forced to write before they can finish reading.


> I wonder if bumping modified timestamps for all records in all collections
> might also do the trick?

On an updated Go node, that would force a redownload, but it would also cause data loss on clients as the remote timestamp wins a conflict.

If you want to force a resync on all clients without a node migration, simply permute the meta/global syncID for that collection.

I'd be inclined to simply node-reassign, tbh. It's a more tested flow.
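
For reference, a hedged Go sketch of what "permute the meta/global syncID" looks like for a single engine. The field names follow the storage format 5 meta/global record; the values and helper names are illustrative.

    // Rewriting the per-engine syncID in meta/global makes every client
    // treat that collection as brand new and resync it from scratch.
    package main

    import (
        "crypto/rand"
        "encoding/base64"
        "encoding/json"
        "fmt"
    )

    type engineInfo struct {
        Version int    `json:"version"`
        SyncID  string `json:"syncID"`
    }

    type metaGlobal struct {
        SyncID         string                `json:"syncID"`
        StorageVersion int                   `json:"storageVersion"`
        Engines        map[string]engineInfo `json:"engines"`
    }

    // newSyncID returns a fresh 12-character ID, the length Sync clients use.
    func newSyncID() string {
        buf := make([]byte, 9)
        if _, err := rand.Read(buf); err != nil {
            panic(err)
        }
        return base64.URLEncoding.EncodeToString(buf)
    }

    func main() {
        mg := metaGlobal{
            SyncID:         "aaaaaaaaaaaa",
            StorageVersion: 5,
            Engines: map[string]engineInfo{
                "bookmarks": {Version: 2, SyncID: "bbbbbbbbbbbb"},
                "history":   {Version: 1, SyncID: "cccccccccccc"},
            },
        }

        // Permuting only the bookmarks syncID forces a bookmarks resync
        // without touching the other engines.
        e := mg.Engines["bookmarks"]
        e.SyncID = newSyncID()
        mg.Engines["bookmarks"] = e

        out, _ := json.MarshalIndent(mg, "", "  ")
        fmt.Println(string(out))
    }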
Depends on: 1362180
Resolving this issue. The bug was fixed in go-syncstorage and all users were migrated away in Bug 1362180.
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Product: Cloud Services → Cloud Services Graveyard