Closed Bug 1408243 Opened 8 years ago Closed 5 years ago

[meta] Simpler Sync

Categories

(Firefox for Android Graveyard :: Android Sync, enhancement, P5)

enhancement

Tracking

(Not tracked)

RESOLVED INCOMPLETE

People

(Reporter: Grisha, Unassigned)

References

Details

(Keywords: meta)

Below I describe what I think of as a "conservative" approach to simplifying Android Sync codebase. It heavily leans toward "modify" vs "rewrite", and looks for minimal, incremental, positive changes that can be made to the current system. Simplest yet significant positive transformation we can make: make record flow sequential and "as single-threaded as possible". We're inherently I/O bound during a sync. When downloading records, we ensure we have everything before we store records locally. We essentially do the same on upload, except a few steps removed - any local fetch failures will fail a sync. So, for "fetch" methods of our sessions (both local and server), we want to ensure we fetched everything before attempting to store or upload. Making that shift essentially changes the flow of records from: - fetch modified records from the "source" session - often in "batches" - and flow them one-by-one into the "sink" session to: - fetch modified records from the "source" session - often in "batches" - ensure all are fetched, then flow them into the "sink" session. We currently perform buffering of records as they're flowing into a "sink" session, via buffering middleware which wraps a sink session, overriding "store" calls. After receiving a "storeDone" call the middleware flows all of the buffered records. Reversing this will support the proposed flow much better: - buffer records when they're fetched, not when they're being stored - let buffered records flow out of the "source" session after all have been fetched - that is, after "onFetchCompleted" call. A notable exception: history records. They're not buffered as they're downloaded and stored; we decided on this for memory usage reasons. This allows us to resume interrupted downloads, since we can maintain a "high water-mark". This will need to be dealt with, somehow. A sidenote: Additionally, this creates a clear point at which we can inspect a set of all flowing records as a whole before attempting to store them on the other end. The "inspection" needs differ depending on which way the records are flowing. If we're uploading, we'd like to ensure all of our records respect various server size limits. Unless we can assume that our server supports batching mode, we can't do _all_ of the necessary checks - but at the very least we should be able to perform a basic size check for individual records, and possibly filter out the offending ones if record type allows (e.g. skip 'history' records which are too large, but fail record flow if a 'bookmark' record is too large). Now, let's take a look at what's between two sessions - RecordsChannel. It plays a few roles: - maintaining a queue of records to be processed - managing a "concurrent records consumer" which flows records out of the queue - managing some session "lifecycle" logic and 'kicking off' a flow (session's 'begin' callbacks, 'fetchModified' call) - maintaining "flow stats" - fetched/stored counters - short-circuiting a flow when there's no data available - calling into sessions' "side-effect" methods at appropriate moments - "cleanUp", "storeIncomplete" - calling into its delegate (SynchronizerSession) at appropriate moments - flow completed, flow failed We can simplify RecordsChannel by: - removing the queue of records - removing the concurrent records consumer and all of the associated logic - "begin" work that sessions perform is all synchronous, so the "begin delegates" could be removed Each session also has two "work queues": one for "fetch" calls (delegateQueue) and one for "store" calls (storeWorkQueue). The simplest change is to turn these two queues into one single-threaded work queue, ensuring that everything is well-ordered. TODO: our flow is already sequential and single-threaded at this point, do we still need this work queue? Records will then be stored on the "sink" in the same thread they were fetched on the "source" session. The indirection of the RecordsChannel is removed, along with the concurrent record flow and its implications (possibility of races in the uploader, etc). The records queue is also the biggest source of memory churn, so this should have a sizeable performance impact. The above largely limits the work to RecordsChannel and the few layers below it: - "reversal" of the buffering middleware - generally light modifications to repository sessions - possible light modifications on the "periphery" of these layers The goal here is to minimize scope of this work as much as possible while still achieving a sizeable impact. I'm sure to have missed some non-insignificant details already, so I feel that the less of this system we touch, the better. In a way, my proposal suffers from a classic "I didn't build it, so I still feel uneasy about it" syndrome - but hopefully not too much. Overall impact of this work should be a much simpler and more efficient flow of records (I expect a performance improvement that will be felt by end users - not just profilers - especially on older devices), codebase that's easier to think about, setting stage for further improvements around record flow as a "set" instead of "one by one", and likely we'll even knock off a few multi-threading bugs. A worthy follow-up improvement to consider as we're going up the abstraction layers is to turn SynchronizerSession/ServerLocalSynchronizerSession classes (which coordinate channel flow and synchronization success/failure) into a (fairly simple) finite state machine. It already is one, except one has to trace through a bunch of code and up/down the delegate chains to figure out what all the various states are, and what the transitions are between them.
Flags: needinfo?(rnewman)
In general I don't understand the details of the android sync codebase to comment on most specifics, they all seem reasonable to me. (In reply to :Grisha Kruglov from comment #0) > Below I describe what I think of as a "conservative" approach to simplifying > Android Sync codebase. It heavily leans toward "modify" vs "rewrite", and > looks for minimal, incremental, positive changes that can be made to the > current system. This also allows work to be done in parallel, although we should probably consider rewriting parts that are only the way they are due to the multithreaded nature of the code. > Simplest yet significant positive transformation we can make: make record > flow sequential and "as single-threaded as possible". This does seem to be the biggest bang for our buck. > A sidenote: Additionally, this creates a clear point at which we can inspect > a set of all flowing records as a whole before attempting to store them on > the other end. The "inspection" needs differ depending on which way the > records are flowing. If we're uploading, we'd like to ensure all of our > records respect various server size limits. Unless we can assume that our > server supports batching mode, we can't do _all_ of the necessary checks - > but at the very least we should be able to perform a basic size check for > individual records, and possibly filter out the offending ones if record > type allows (e.g. skip 'history' records which are too large, but fail > record flow if a 'bookmark' record is too large). As discussed in IRC, we can basically support all necessary checks if we implement the bounds checking the way desktop does it (see http://searchfox.org/mozilla-central/source/services/sync/modules/record.js#798-1004). Specifically, most things just work if you set missing limits to Infinity (or, for Java, probably something like Long.MAX_VALUE). See also bug 1407438 (which I'll probably be taking soon). > In a way, my proposal suffers from a classic "I didn't build it, so I still > feel uneasy about it" syndrome - but hopefully not too much. I think "removing unnecessary use of threads" is pretty hard to argue against. > A worthy follow-up improvement to consider as we're going up the abstraction > layers is to turn SynchronizerSession/ServerLocalSynchronizerSession classes > (which coordinate channel flow and synchronization success/failure) into a > (fairly simple) finite state machine. It already is one, except one has to > trace through a bunch of code and up/down the delegate chains to figure out > what all the various states are, and what the transitions are between them. It's not clear to me that what they do actually models a state machine, but simplifying the abstractions there sounds good.
(In reply to :Grisha Kruglov from comment #0) > Simplest yet significant positive transformation we can make: make record > flow sequential and "as single-threaded as possible". > > We're inherently I/O bound during a sync. > > When downloading records, we ensure we have everything before we store > records locally. We essentially do the same on upload, except a few steps > removed - any local fetch failures will fail a sync. … I think it might be worth summarizing here the original state of the system, in order to understand what's different about your proposal. Android Sync was originally designed to be quite concurrent; indeed, it was intended to work with an async HTTP client, making fetches in parallel if desired, and async storage. We thought we might have history records queued for application while we're already fetching logins, and be uploading tabs, too. It's designed around lots of delegate interfaces which pass records and status events around — "here's a record", "I'm done storing records", "that upload failed". Changed records are fetched from one 'repository' and stored into another. The two repositories then switch sides and do the same again. This flow of records is implemented by a `RecordsChannel` and a `RecordsChannelDelegate`. The channel itself pulls records from upstream as fast as it can, buffering them, and delivering them to the consumer, which applies them to storage as fast as it can. The delegate is informed of flow-level status: failures in begin, fetch, or store, and completion of the flow. Remember that in 2011 or 2012, a write to the local Android DB might take 100ms or more. We didn't locally buffer records, either. So the channel was the pipeline that fetched records from the server in chunks and played them into the local store, and it took care of keeping the pipeline full. Things are different now. We buffer downloaded records in many cases; we don't know about store failures until we're done mirroring the contents of the server, because we don't even try to apply them. We get no real advantage to having that buffering abstracted by the channel interface. We buffer outgoing records, too -- we don't want to upload _any_ bookmarks if we would fail to store some. We have not implemented parallelism in the HTTP layer; while we do want syncing to complete more quickly, particularly first syncs, we probably would not choose to get there by arbitrarily throwing threads at the problem. Our synchronization code would be dramatically simpler if it were entirely sequential — after all, we only do Sync work in the `SyncAdapter`, in a dedicated thread. In essence, we have a sequential system implemented via chained delegate calls across single-threaded Executors. iOS is very different: a type-safe storage client exists to fetch records, and the synchronizer itself decides what to do (typically fetching everything, writing it somewhere, and then uploading changes). While the storage and network operations themselves are asynchronous, they're chained together via `Deferred`, and no delegate interfaces or channel abstractions exist.
(In reply to :Grisha Kruglov from comment #0) > We're inherently I/O bound during a sync. Don't forget that there are two sides to the IO: network and disk. Part of the point of the record channel was to keep both busy. I think you're betting, when changing this code, that one of three things is true: 1. Thread/executor/code thrashing is costly enough that making this code sequential -- that is, not having disk writes overlap with network fetch -- doesn't slow it down too much. Note that we still get simultaneous fetch and write when we're writing into a buffer in the current model. 2. That we are currently harming the user experience by having Sync throw lots of threads and lots of simultaneous network and disk access at its work while the browser is in the foreground. 3. That the code is complex enough that any speedup from concurrent writes isn't worth the impact to programmer productivity, dex size, etc. > - buffer records when they're fetched, not when they're being stored > - let buffered records flow out of the "source" session after all have been > fetched - that is, after "onFetchCompleted" call. I'm not sure that this kind of change is really worthwhile. The complexity is still there; it's just shuffled around. IMO if you make a change here, it should be eliminating delegate interfaces. > - "begin" work that sessions perform is all synchronous, so the "begin > delegates" could be removed I think this is a great place to start. Take all the places that invoke delegate callbacks, and have them throw exceptions or return values instead. Eventually you'll have something like the `RecordsChannel` that has a constructor, a `start` method that throws, and nothing else. Then you can start to think about where buffering happens, because you'll have simpler primitives to work with. > In a way, my proposal suffers from a classic "I didn't build it, so I still > feel uneasy about it" syndrome - but hopefully not too much. I did build it, and I'm still uneasy about it :D > Overall impact of this work should be a much simpler and more efficient flow > of records (I expect a performance improvement that will be felt by end > users - not just profilers - especially on older devices) Can you explain your instinct? > easier to think about, setting stage for further improvements around record > flow as a "set" instead of "one by one" I think that's a good concept.
Flags: needinfo?(rnewman)
Depends on: 1408585
(In reply to Richard Newman [:rnewman] from comment #3) > (In reply to :Grisha Kruglov from comment #0) > > > We're inherently I/O bound during a sync. > > Don't forget that there are two sides to the IO: network and disk. Part of > the point of the record channel was to keep both busy. > > I think you're betting, when changing this code, that one of three things is > true: I think my bet, and an intuition about this thing, is a combination of your points. Gains the concurrent nature of the current system hopes to provide are dwarfed by environment factors (slow mobile networks, potentially slow disks), by implications of its design (intense rate of memory allocation at the channel bottleneck), and by poor programmer productivity (things get complex in details and require a mental ramp up whenever you step away for a bit). > I'm not sure that this kind of change is really worthwhile. The complexity > is still there; it's just shuffled around. IMO if you make a change here, it > should be eliminating delegate interfaces. I completely agree after doing some of simplification work. > > Overall impact of this work should be a much simpler and more efficient flow > > of records (I expect a performance improvement that will be felt by end > > users - not just profilers - especially on older devices) > > Can you explain your instinct? Specifically for the performance, profiling a running Sync was illuminating as to how inefficient the current flow is in a memory constrained environment. The largest bottleneck I see is the RecordsChannel's moving of records between repositories. This is also the largest source of programmer (read, myself) confusion when it comes to details (memory visibility, etc). Solving this shouldn't be very difficult - mostly "churny" - and I think will bring the biggest wins overall. Then, there's the record parsing, encryption/decryption, etc. They're generally straightforward to think about, but we aren't fast. Bug 1204559 might help some, and re-thinking flow in terms of sets of records should probably help even more - aiming to reduce any overhead around parsing/crypto to the minimum. In contrast to concurrency, while I find having to trace through chains of delegates annoying, they're not inherently complicated to think about; it's mostly just tedious, and I find myself having to recreate these chains in my head after not looking at them for a bit. There, I'm mostly concerned about flow inefficiencies, and hopefully making things more linear bit by bit.
Depends on: 1408710
Priority: -- → P3
Summary: Synchronous record flow → [meta] Simpler Sync
Re-triaging per https://bugzilla.mozilla.org/show_bug.cgi?id=1473195 Needinfo :susheel if you think this bug should be re-triaged.
Priority: P3 → P5
Keywords: meta
We have completed our launch of our new Firefox on Android. The development of the new versions use GitHub for issue tracking. If the bug report still reproduces in a current version of [Firefox on Android nightly](https://play.google.com/store/apps/details?id=org.mozilla.fenix) an issue can be reported at the [Fenix GitHub project](https://github.com/mozilla-mobile/fenix/). If you want to discuss your report please use [Mozilla's chat](https://wiki.mozilla.org/Matrix#Connect_to_Matrix) server https://chat.mozilla.org and join the [#fenix](https://chat.mozilla.org/#/room/#fenix:mozilla.org) channel.
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → INCOMPLETE
Product: Firefox for Android → Firefox for Android Graveyard
You need to log in before you can comment on or make changes to this bug.