Closed Bug 1465705 Opened 7 years ago Closed 2 years ago

Sanity-check whether storing tabs in memcached is still a good tradeoff

Categories

(Cloud Services Graveyard :: Server: Sync, enhancement)

Type: enhancement
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED INVALID

People

(Reporter: rfkelly, Unassigned)

References

(Blocks 1 open bug)

Details

The sync storage server currently has some special-case handling of the "tabs" collection, on the assumption that client behaviour for that collection is special. Clients only ever upload a single item, and we expect that item to change frequently, so rather than writing it to disk in the DB we maintain it directly in memcached.

This has negative consequences if memcached ever goes down, e.g. because we need to reboot the storage node. All other collections will survive a reboot, but the "tabs" collection will be wiped. See additional discussion in the sync-dev thread here: https://mail.mozilla.org/pipermail/sync-dev/2018-May/001642.html

How confident are we that this special treatment of "tabs" is still the right tradeoff in practice? Do we have numbers to support "tabs" being significantly more churny than, say, "history"? ISTM we could get both a complexity reduction and a durability increase if we didn't need to treat tabs as special in this way.

A good starting point may be to crunch the logs to compare write characteristics of "tabs" to other collections. We could also consider A/B testing a couple of nodes in production, since this special treatment can easily be preffed on or off in config (see the sketch below).
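For illustration, a minimal sketch of what that config toggle might look like. The section, option names, and backend path here are assumptions modelled on the public server-syncstorage memcached settings, not taken from this bug:

  # Hypothetical storage-node config sketch; option names are assumed.
  [storage]
  backend = syncstorage.storage.memcached.MemcachedStorage
  # Collections held *only* in memcached, never written to the DB;
  # dropping "tabs" from this list would make tabs durable across reboots.
  cache_only_collections = tabs
  # Collections written through to the DB but also cached for speed.
  cached_collections = meta,clients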
(In reply to Ryan Kelly [:rfkelly] from comment #0)
> How confident are we that this special treatment of "tabs" is still the
> right tradeoff in practice? Do we have numbers to support "tabs" being
> significantly more churny than, say, "history"?

Uploading tabs has been the second most common request after getting info/collections.

> A good starting point may be to crunch the logs to compare write
> characteristics of "tabs" to other collections. We could also consider A/B
> testing a couple of nodes in production, since this special treatment can be
> easily preffed on or off in config.

Some basic load testing might be in order first, but I agree making this change on a few hosts and observing is the only way to know for sure.
(In reply to Bob Micheletto [:bobm] from comment #1)
> Uploading tabs has been the second most common request after getting
> info/collections.

A quick sanity check on a production node shows that I'm way off on that. It's much lower volume than the posts we see to bookmarks, history, and some other collections. We should do a larger study to be sure, but it looks to be low enough.
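As a rough sketch of what that larger study could look like, here is a hedged Python example that tallies storage POSTs per collection from a node's access log. The log path and line format are assumptions for illustration, not the node's actual layout:

  # Count storage POSTs per collection in an nginx-style access log.
  # The path and URL pattern are assumptions; adjust to the real format.
  import re
  from collections import Counter

  POST_RE = re.compile(r'"POST [^"]*/storage/(\w+)')

  counts = Counter()
  with open("/var/log/nginx/access.log") as f:
      for line in f:
          m = POST_RE.search(line)
          if m:
              counts[m.group(1)] += 1

  for collection, n in counts.most_common():
      print(f"{collection}: {n}")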
Most tab activity bumps the engine score by 1 or less. The multi-device threshold is 300. The single-device threshold is 1000. We will upload tabs quite rarely.

I have 460 Sync log files in my profile directory. Across those I count 430 info/collections fetches, 402 history uploads, and only 249 tab uploads.
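For context, a sketch of the scheduling check being described. This is illustrative Python rather than the actual Firefox JS; the function and constant names are assumptions, with the threshold values taken from the comment above:

  # Illustrative sketch: a sync fires only once the accumulated engine
  # score crosses a threshold. Names are assumptions; values are from
  # the comment above.
  MULTI_DEVICE_THRESHOLD = 300
  SINGLE_DEVICE_THRESHOLD = 1000

  def should_sync(global_score: int, num_clients: int) -> bool:
      threshold = (MULTI_DEVICE_THRESHOLD if num_clients > 1
                   else SINGLE_DEVICE_THRESHOLD)
      return global_score >= threshold

  # Most tab events bump the score by <= 1, so tab activity alone
  # rarely pushes the score past either threshold.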
(In reply to Richard Newman [:rnewman] from comment #3)
> Most tab activity bumps the engine score by 1 or less. The multi-device
> threshold is 300. The single-device threshold is 1000.

Yep - but that just controls *when* a sync happens.

> We will upload tabs quite rarely.

I think it's more like "tabs changing will force a sync quite rarely" - but if the tabs collection believes anything has changed (which it does on many tab operations), then it will be synced the next time a sync happens.

> I have 430 info/collections fetches, 402 history uploads, and only 249 tab
> uploads.

That surprises me a little - I'd expect that any time there's history to upload there will usually be tabs to upload too, as most things that cause new history happen in a tab. E.g., in my local profile I have log files from May-16 to Jun-02 and see:

  % cat * | grep -c "POST.*/storage/history"
  429
  % cat * | grep -c "POST.*/storage/tabs"
  488
  % cat * | grep -c "POST.*/storage/bookmarks"
  149
I'd like to make this change in production. We can roll out the change slowly. Thoughts?
Flags: needinfo?(rfkelly)
Flags: needinfo?(markh)
I support a slow staged rollout to see what the effect is in practice, downside risk seems pretty minimal.
Flags: needinfo?(rfkelly)
👍
Flags: needinfo?(markh)

As this is no longer relevant to the sync storage architecture, I am closing it as invalid.

Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → INVALID
Product: Cloud Services → Cloud Services Graveyard