Closed Bug 1408710 Opened 7 years ago Closed 6 years ago

Serialize record flow

Categories

(Firefox for Android Graveyard :: Android Sync, enhancement, P1)

enhancement

Tracking

(firefox60 fixed)

RESOLVED FIXED
Firefox 60
Tracking Status
firefox60 --- fixed

People

(Reporter: Grisha, Assigned: Grisha)

References

Details

Attachments

(5 files)

Part 2 of (many), split from Bug 1408243.

The goal of this bug is to:
- remove concurrent record consumer from RecordsChannel
- move buffering of records to RecordsChannel (thus removing buffering middleware!)
- maintain prior behaviour - don't buffer history records (same concerns apply around memory usage, etc)
Sketched out an approach, it's very straightforward, appears to work (heh), and look at the change count: +64/-772. Tests don't pass (or even complete), but that's for another afternoon.

I also did some profiling and:
- removed a few churny log statements from some hot paths
- increased BaseNCodec's default buffer size from 8kb to 16kb (used in base64 encoding when doing crypto on records), which prevents most of the buffer resizing (big bottleneck). Need to actually think about this change.

This seems to have removed most of the "low-hanging fruit" memory allocation bottlenecks.

Preliminary results from a mechanical chronograph-driven profiling (it's Saturday...) of a full sync of a profile with 1100 bookmarks and ~33000 history records.

Running on API16 emulator:
- before: ~4m on
- after: ~2m15s

Running on API26 emulator:
- before: crashes since I hit Bug 1401415 due to too much garbage collection going wrong :/
- after: ~30s-40s
Comment on attachment 8918601 [details]
Bug 1408710 - Serialize RecordsChannel

https://reviewboard.mozilla.org/r/189436/#review194938

::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/synchronizer/RecordsChannel.java:23
(Diff revision 3)
>  import org.mozilla.gecko.sync.repositories.NoStoreDelegateException;
>  import org.mozilla.gecko.sync.repositories.RepositorySession;
>  import org.mozilla.gecko.sync.repositories.delegates.DeferredRepositorySessionStoreDelegate;
>  import org.mozilla.gecko.sync.repositories.delegates.RepositorySessionFetchRecordsDelegate;
>  import org.mozilla.gecko.sync.repositories.delegates.RepositorySessionStoreDelegate;
> +import org.mozilla.gecko.sync.repositories.domain.HistoryRecord;

?

::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/synchronizer/RecordsChannel.java:95
(Diff revision 3)
>      this.source    = source;
>      this.sink      = sink;
>      this.delegate  = delegate;
>    }
>  
> -  /*
> +  private ArrayList<Record> toProcess = new ArrayList<>();

OK, so the block comment that's missing from the commit is: rather than queueing records in a `ConcurrentLinkedQueue` for delivery to a consumer, we now always buffer downloaded records in an `ArrayList`. We can then hand it over to the receiving `RepositorySession`.

That lets us kill the buffering middleware.

::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/synchronizer/RecordsChannel.java:202
(Diff revision 3)
> -    // Sink will be informed once consumer finishes.
> +    delegate.onFlowCompleted(this);
> -    this.consumer.halt();
>    }
>  
>    @Override
>    public void onFetchedRecord(Record record) {

We are missing a trick here. When we download items, the `X-Weave-Records` header tells us how many items will be returned. When we know that, we should call `toProcess.ensureCapacity(size)`, which will avoid array resizing costs.

Indeed, `onFetchedRecord` is perhaps a noticeably sub-optimal way of doing this work. Can we get more of the records in advance, so we can call `toProcess.addAll(records)`, rather than `toProcess.add(record)` over and over?

::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/synchronizer/RecordsChannel.java:210
(Diff revision 3)
> +    // not buffering:
> +    // - roughly half the memory footprint
> +    // - ability to resume downloads, since records are stored as they're fetched, and we can
> +    // -- maintain a "high watermark"
> +    // All other record types are buffered.
> +    if (record instanceof HistoryRecord) {

Oh, now I see why there's that import!

Can we do this in a way that isn't horrifically gross? A delegate method, a subclass with a class method, or something else?

::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/synchronizer/RecordsChannel.java:230
(Diff revision 3)
> -    Logger.trace(LOG_TAG, "onFetchCompleted. Stopping consumer once stores are done.");
> -    this.consumer.queueFilled();
> +    fetchedCount.set(toProcess.size());
> +
> +    Logger.info(LOG_TAG, "onFetchCompleted. Fetched " + fetchedCount.get() + " records. Storing...");
> +
> +    try {
> +      for (Record record : toProcess) {

Is `onFetchCompleted` guaranteed to be called on the same thread as `onFetchedRecord`?

If not, then you're relying on `toProcess` being thread-safe, and `ArrayList` is not.

::: mobile/android/thirdparty/org/mozilla/apache/commons/codec/binary/BaseNCodec.java:66
(Diff revision 3)
>  
>      /**
>       * Defines the default buffer size - currently {@value}
>       * - must be large enough for at least one encoded block+separator
>       */
> -    private static final int DEFAULT_BUFFER_SIZE = 8192;
> +    private static final int DEFAULT_BUFFER_SIZE = 2 * 8192;

Can you explain this change?
Attachment #8918601 - Flags: review-
Priority: -- → P1
Comment on attachment 8953670 [details]
Bug 1408710 - Pre: Remove ServerLocalSynchronizer*

https://reviewboard.mozilla.org/r/222888/#review229140

I assume this is just straight inlining of stuff before a `super.doWhatever()` call -- ReviewBoard won't show me the deleted file contents. If so, rubberstamp!
Attachment #8953670 - Flags: review?(rnewman) → review+
Comment on attachment 8953671 [details]
Bug 1408710 - Pre: Just use the delegateQueue in the downloader instead of creating a new one

https://reviewboard.mozilla.org/r/222890/#review229142
Attachment #8953671 - Flags: review?(rnewman) → review+
Comment on attachment 8953672 [details]
Bug 1408710 - Pre: for clarity, rename session's delegateQueue to a more appropriate name

https://reviewboard.mozilla.org/r/222892/#review229146
Comment on attachment 8918601 [details]
Bug 1408710 - Serialize RecordsChannel

https://reviewboard.mozilla.org/r/189436/#review229150

::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/synchronizer/RecordsChannel.java:76
(Diff revision 7)
>  
>    // Expected value relationships:
>    // attempted = accepted + failed
>    // reconciled <= accepted <= attempted
>    // reconciled = accepted - `new`, where `new` is inferred.
> -  private final AtomicInteger storeAttemptedCount = new AtomicInteger();
> +  // TODO these don't need to be atomic anymore, right? they're written from the same thread, and might

File a follow-up?

::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/synchronizer/RecordsChannel.java:222
(Diff revision 7)
> -  @Override
> -  public void onBatchCompleted() {
> -    this.sink.storeFlush();
> +    fetchedCount.set(toProcess.size());
> +
> +    Logger.info(LOG_TAG, "onFetchCompleted. Fetched " + fetchedCount.get() + " records. Storing...");
> +
> +    try {
> +      for (Record record : toProcess) {

Future: use an `ArrayDeque` instead of an `ArrayList`, so you can push records into it in amortized constant time, and incrementally pop records off it in amortized constant time right here. For large records, that will give the GC opportunity to free memory.
Attachment #8918601 - Flags: review?(rnewman) → review+
Comment on attachment 8953709 [details]
Bug 1408710 - Don't pass around GUIDs of individual record store success, just an aggregate counter

https://reviewboard.mozilla.org/r/222944/#review229156

Let me get this straight.

- At the start of the sync we have a set of changed records, `R`.
- During a sync we upload all of `R`, and we receive success and error results `S` and `E`.
- At the end of a sync the local store has a set of changed records `R'`, which is a superset of `R`.

Android Sync currently assumes the following:

- `R` == `R'`
- `S` + `E` == `R`

Of course, the first is incorrect. If you remove this capability, will it hamper your ability to improve things?

::: mobile/android/services/src/main/java/org/mozilla/gecko/sync/net/BaseResource.java:308
(Diff revision 1)
>    }
>  
>    private void execute() {
>      HttpResponse response;
>      try {
> +      Logger.info("GRIGRI", request.toString());

Ahem.
(In reply to Richard Newman [:rnewman] from comment #14)
> I assume this is just straight inlining of stuff before a
> `super.doWhatever()` call -- ReviewBoard won't show me the deleted file
> contents. If so, rubberstamp!

That's correct!
(In reply to Richard Newman [:rnewman] from comment #18)
> Comment on attachment 8953709 [details]
> Bug 1408710 - Don't pass around GUIDs of individual record store success,
> just an aggregate counter
> 
> https://reviewboard.mozilla.org/r/222944/#review229156
> 
> Let me get this straight.
> 
> - At the start of the sync we have a set of changed records, `R`.
> - During a sync we upload all of `R`, and we receive success and error
> results `S` and `E`.
> - At the end of a sync the local store has a set of changed records `R'`,
> which is a superset of `R`.
> 
> Android Sync currently assumes the following:
> 
> - `R` == `R'`

This isn't correct for version-tracked repositories (bookmarks), where it's made explicit that R' != R. Non-versioned repositories are very vague about this, and just sort of hope that things will work out.

(R' won't be a superset, either - but that depends on your definition of "changed" in the context of "end of sync".)

> - `S` + `E` == `R`

I expect this to be true if the system is functioning correctly. I think we can plot these counts from the data we get in the sync ping to confirm.

> If you remove this capability, will it hamper your ability to improve things?

We still maintain the "this GUID failed" delegate channel, and if S+E=R relationship holds, we don't loose any information by removing the "this GUID succeeded" messaging. In theory we don't even need the "15 records succeeded" channel as that number could be implied...

Right now it's a waste of resources. If it becomes necessary, we can always grow this back easily enough.
Blocks: 1441281
Comment on attachment 8953709 [details]
Bug 1408710 - Don't pass around GUIDs of individual record store success, just an aggregate counter

https://reviewboard.mozilla.org/r/222944/#review229346
Attachment #8953709 - Flags: review?(rnewman) → review+
Comment on attachment 8953672 [details]
Bug 1408710 - Pre: for clarity, rename session's delegateQueue to a more appropriate name

https://reviewboard.mozilla.org/r/222892/#review229348
Attachment #8953672 - Flags: review?(rnewman) → review+
We're sorry, Autoland could not rebase your commits for you automatically. Please manually rebase your commits and try again.

hg error in cmd: hg rebase -s e49e0f9461742880c1af1d9e640ff928c9fdba2e -d 1d5ae4b0b826: rebasing 448996:e49e0f946174 "Bug 1408710 - Pre: Remove ServerLocalSynchronizer* r=rnewman"
other [source] changed mobile/android/base/android-services.mozbuild which local [dest] deleted
use (c)hanged version, leave (d)eleted, or leave (u)nresolved? u
unresolved conflicts (see hg resolve, then hg rebase --continue)
Rebases manually just fine.
We're sorry, Autoland could not rebase your commits for you automatically. Please manually rebase your commits and try again.

hg error in cmd: hg rebase -s 09a15c9692ad55b07083070a4f083b6de09605f9 -d 1ff022a3d2b7: rebasing 449010:09a15c9692ad "Bug 1408710 - Pre: Remove ServerLocalSynchronizer* r=rnewman"
other [source] changed mobile/android/base/android-services.mozbuild which local [dest] deleted
use (c)hanged version, leave (d)eleted, or leave (u)nresolved? u
unresolved conflicts (see hg resolve, then hg rebase --continue)
We're sorry, Autoland could not rebase your commits for you automatically. Please manually rebase your commits and try again.

hg error in cmd: hg rebase -s 09a15c9692ad55b07083070a4f083b6de09605f9 -d 1ff022a3d2b7: rebasing 449010:09a15c9692ad "Bug 1408710 - Pre: Remove ServerLocalSynchronizer* r=rnewman"
other [source] changed mobile/android/base/android-services.mozbuild which local [dest] deleted
use (c)hanged version, leave (d)eleted, or leave (u)nresolved? u
unresolved conflicts (see hg resolve, then hg rebase --continue)
Pushed by gkruglov@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/42ccbb911db3
Pre: Remove ServerLocalSynchronizer* r=rnewman
https://hg.mozilla.org/integration/autoland/rev/82986cd4578a
Pre: Just use the delegateQueue in the downloader instead of creating a new one r=rnewman
https://hg.mozilla.org/integration/autoland/rev/430a0bfcdb93
Pre: for clarity, rename session's delegateQueue to a more appropriate name r=rnewman
https://hg.mozilla.org/integration/autoland/rev/a2227052b4aa
Serialize RecordsChannel r=rnewman
https://hg.mozilla.org/integration/autoland/rev/ef963ffb6d1b
Don't pass around GUIDs of individual record store success, just an aggregate counter r=rnewman
Depends on: 1442351
Depends on: 1442248
Depends on: 1445462
Depends on: 759338
Product: Firefox for Android → Firefox for Android Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: