Incremental sync follow up: user 'older' parameter to make recent history stage more efficient

NEW
Unassigned

Status

()

defect
P3
normal
3 years ago
8 months ago

People

(Reporter: Grisha, Unassigned, Mentored)

Tracking

unspecified
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(fennec+)

Details

Reporter

Description

3 years ago
With reintroduction of the 'older' sync query param, we can now improve the recent history stage, quoting Richard:

"
What we'd like to do is fetch the 500 most recent, sort=newest, newer=lastsync (defaulting to zero), then fetch all, sort=oldest, older=oldest-new-record+1, newer=lastsync.

We can indeed generalize that to run on each sync, and remove the logic for doing this only once.

Very often we'd find that newer=older, and we'd have no work to do in the second stage (when < 500 changed items).

If we're able to specify both newer and older, I'd like to do this: it'll improve responsiveness in the general case, always fetching most recent history, and it'll avoid potentially pulling down hundreds of KB of redundant records on a first sync. (The most recently visited history is likely to also be your most visited, so each will have 20 visits…)
"
Reporter

Updated

3 years ago
Depends on: 1291821
See Also: 1291821
Reporter

Updated

3 years ago
Priority: -- → P2
Whiteboard: [MobileAS]
Priority: P2 → P3
Reporter

Updated

2 years ago
Mentor: gkruglov
Reporter

Comment 1

2 years ago
Hi Nevin,

It seems like since this bug isn't part of the Firefox for Android component, I can't set a fennec tracking flag. Could you include it in your triage nevertheless?

Thanks!
Flags: needinfo?(cnevinchen)
tracking-fennec: --- → ?
Assignee: nobody → cnevinchen
Flags: needinfo?(cnevinchen)
p2 per triage meeting.
tracking-fennec: ? → +
Priority: P3 → P2
Reporter

Comment 4

2 years ago
(In reply to Nevin Chen [:nechen] from comment #3)
> I'm now tracing the code in
> http://searchfox.org/mozilla-central/source/mobile/android/base/java/org/
> mozilla/gecko/dlc/SyncAction.java#126
> Will try to make a patch this week

The link you posted is about "download content" stuff, and isn't part of Sync.

Here are the docs for the Sync Storage API 1.5 you'll be interacting with: https://docs.services.mozilla.com/storage/apis-1.5.html#individual-collection-interaction (older parameter doesn't seem documented yet, so see Bug 1314171).

And this document should help you get a broad understanding of what's going on in Android Sync; specifically, read the "implementation notes, high level": https://docs.google.com/a/mozilla.com/document/d/17njXz0MryGT7_KdXgHGM0VIv3PDtvhV479tatxa6fcM/edit?usp=sharing
(I need to move this doc into the tree...)

So, if you were to look at the HTTP requests we're making, the current behaviour is roughly this:

# On first sync:
# fetch recent history first:
https://sync-server/bookmarks/?sort=newest&newer=0&limit=500
# ui updates, user sees some history!
# fetch all history (might take a long time), including the 500 items you just fetched
https://sync-server/bookmarks/?sort=oldest&newer=0

# assuming that the most recent record's last-modified is 10...

# on a subsequent (non-first) sync, we skip the "recent history" stage, and only fetch "all history":
# this gets us all records with last-modified>1
https://sync-server/bookmarks/?sort=oldest&newer=10

Now, the proposed behaviour that this Bug sets out to achieve is the following:

# On first sync, get the recent history:
https://sync-server/bookmarks/?sort=newest&newer=0&limit=500
# assuming the oldest item we just fetched has a last-modified of 5
# fetch all history, excluding the items you just fetched (note the `older` param):
https://sync-server/bookmarks/?sort=oldest&newer=0&older=5

# on subsequent syncs, just repeat the above:
# assuming the most recent thing we fetched in first sync had last-modified of 10
# fetch recent first
https://sync-server/bookmarks/?sort=newest&newer=10&limit=500
# assuming the oldest item we fetched has a last-modified of 15
https://sync-server/bookmarks/?sort=oldest&newer=10&older=15

Plotted against last-modified time, we get:

first sync:
      full         recent
0              5             10
|--------------|--------------|

second sync:
     full          recent
10            15             20
|--------------|--------------|


Our assumption is that most of the time the "recent" stage will actually cover all of the records that changed, and the "full" stage won't download much, if anything at all. We also remove the duplication of record downloads on that first sync, assuming that users have move than 500 history records.

Another thing we should consider is that each "stage" consists of two parts:
- download & merge records
- upload our current state

It doesn't seem wise to upload our current state in the "recent" stage, before we're sure we got all of the changed records. So I propose we investigate feasibility of disabling the upload part of the "recent history" stage, and ensuring that the "full history" stage will continue uploading any changed records correctly.

In terms of changes required:
- allow the "recent history" stage to run on every sync, removing logic at [0]
- modify the regular history stage to include "older" parameter in the GET queries
- investigate disabling upload part of the "recent" stage - perhaps short-circuiting success of the second "flow" in SynchronizerSession

[0] https://dxr.mozilla.org/mozilla-central/source/mobile/android/services/src/main/java/org/mozilla/gecko/sync/stage/AndroidBrowserRecentHistoryServerSyncStage.java#105
Reporter

Comment 5

2 years ago
Err, DLC is "downloadable content", not "download content".
(In reply to :Grisha Kruglov from comment #4)

> It doesn't seem wise to upload our current state in the "recent" stage,
> before we're sure we got all of the changed records.

Indeed: this will potentially result in two different history records, with different GUIDs, describing the same URL, which is not ideal.

Another thing to think about: if we're able to fetch sort=newest, with older=, then we can actually use that for both parts: on a first sync we fetch all records from newest to oldest, then on subsequent syncs we continue with the previous download and add an additional one to catch newer records.

Given that it's possible for hundreds of new records to be added since our last sync, history sync becomes a _collection_ of non-overlapping ranges that haven't yet been fetched, with one new one each time, and we try to make progress on narrowing those ranges each time we sync. Typically we'll knock them all out within one or two syncs, but in the worst case…
Assign to :jwu since he'll work on Sync.
Assignee: cnevinchen → topwu.tw
Grisha, is this still relevant to activity stream? If not, could you remove the mobileAS whiteboard tag? Thanks!
Flags: needinfo?(gkruglov)
Reporter

Updated

2 years ago
Flags: needinfo?(gkruglov)
Whiteboard: [MobileAS]
Reporter

Updated

2 years ago
Assignee: topwu.tw → nobody
Reporter

Updated

2 years ago
Priority: P2 → P3

Updated

2 years ago
Product: Android Background Services → Firefox for Android
See Also: → 1414437
You need to log in before you can comment on or make changes to this bug.