1328123 - Full re-sync triggered; investigation and breakdown needed

Reporter

Description

•

8 years ago

I noticed my local Fennec was syncing longer than I expected and sucking battery. At around the same time I enabled passcode and touch ID, but I believe the long syncs were already occurring. My device was already populated with data and syncing: ``` [SyncStateMachine.swift:667] advance() > Cached meta/global fetched at 1479165390330, newer than server modified 1467128053210. Using cached meta/global. … [Profile.swift:1119] takeActionsOnEngineStateChanges > No collections need reset. Moving on. … [ClientsSynchronizer.swift:353] synchronizeLocalClients(_:withServer:info:) > No remote changes for clients. (Last fetched 1483303746260.) ``` My logins DB already was configured: ``` [BrowserDB.swift:473] tablesExist > 2 tables exist. Expected 2 ``` but for some reason we decided to download _everything_, and the local DB was empty when the records were applied: ``` [LoginsSynchronizer.swift:169] synchronizeLocalLogins(_:withServer:info:) > Synchronizing passwords. Last fetched: 0. [LoginsSynchronizer.swift:175] synchronizeLocalLogins(_:withServer:info:) > Records header hint: Optional(739) [SQLiteLogins.swift:740] applyChangedLogin(_:local:mirror:) > Never seen remote record {941bd3e5-0541-134c-9b47-d8ec03693788}. Mirroring. ``` Bookmarks got to run next, but had no work to do: ``` 2017-01-02 09:09:44.362 [Info] [Downloader.swift:45] init(collectionClient:basePrefs:collection:) > Downloader configured with prefs 'profile.sync.synchronizer.bookmarks..downloader.bookmarks..'. 2017-01-02 09:09:44.362 [Debug] [BookmarksDownloader.swift:119] go(_:greenLight:) > Downloading up to 100 records. 2017-01-02 09:09:44.363 [Debug] [Downloader.swift:127] go(_:limit:) > Modified: 1483272229100; last 1483272229100. 2017-01-02 09:09:44.363 [Debug] [Downloader.swift:129] go(_:limit:) > No more data to batch-download. ``` That download was on Sunday Jan 1 at noon local time, when I was at home. Then history continued from apparently running earlier, given newer=0 and a continuation token: ``` 2017-01-02 09:09:44.365 [Info] [Downloader.swift:45] init(collectionClient:basePrefs:collection:) > Downloader configured with prefs 'profile.sync.synchronizer.history..downloader.history..'. 2017-01-02 09:09:44.366 [Debug] [Downloader.swift:127] go(_:limit:) > Modified: 1483305044110; last 0. 2017-01-02 09:09:44.367 [Debug] [Downloader.swift:157] downloadNextBatchWithLimit(_:infoModified:) > Fetching newer=0, offset=Optional("1481690632950:94"). ``` and downloaded 80,000 records, applying them one by one to a dirty DB: ``` 2017-01-02 09:09:46.792 [Debug] [SQLiteHistory.swift:911] insertOrUpdatePlace(_:modified:) > Warning: history item WHYYzJNZRRh2 changed both locally and remotely. Comparing timestamps from different clocks! … 2017-01-02 09:21:38.564 [Debug] [Downloader.swift:192] handleSuccess > Advancing baseTimestamp to 1483345007450 - 1 2017-01-02 09:21:38.565 [Debug] [Downloader.swift:204] handleSuccess > Advancing lastModified to Optional(1483345007450) ?? 1483345007450. 2017-01-02 09:21:38.572 [Debug] [BrowserDB.swift:306] checkpoint() > Checkpointing a BrowserDB. ``` Then the horror: ``` [HistorySynchronizer.swift:174] uploadOutgoingFromStorage(_:lastTimestamp:withServer:) > Uploading 84852 modified places. ``` These logs are congruent with logins.db having been discarded, and a full sync reset having occurred for both logins.db and browser.db. My best course of action now is to uninstall and set up Sync again: I can't possibly wait for 80,000 records to upload, and then redownload on each of my other devices. There are probably multiple bugs here: - logins.db should not have been trashed. We should re-examine our recovery approach: can we recover without losing the contents of the file? SQLite itself should not lose data even in a crash. - This re-download and re-apply shouldn't result in 80,000 records to upload. I'd need to leave my device at home on wifi for hours. Those history items aren't really changed. - I'm puzzled why we were triggered to re-download bookmarks and history. Bookmarks that I created locally (which can't have synced) are still present in the DB, which implies that browser.db was not deleted and recreated. My hypothesis is that logins.db got trashed, NotificationDatabaseWasRecreated was sent, but the filename was not "logins.db", so handleRecreationOfDatabaseNamed reset Bookmarks and History, too. - We should again reconsider whether we download all history. A user with a decent backlog but a poor network connection might never get fully in sync!

Jenn Chaulk (:jchaulk)

Updated

•

8 years ago

tracking-fxios: ? → 6.0+

Priority: -- → P2

Whiteboard: [MobileCore]

Stefan Arentz | :st3fan | ⏰ EST | he/him

Comment 1

•

8 years ago

Richard, can you find out if this happens on the v6.x branch and give us an idea if this is serious enough to block 6.0? We are on a very tight schedule. With 6.0 supposed to go in Review on Friday. So if this is happening on 6.0 we need to act fast.

Richard Newman [:rnewman]

Reporter

Comment 2

•

8 years ago

There's no chance I can try to repro this on 6.x; I'm traveling internationally (in the lounge right now) and over-committed until at least Friday morning. The only things that you might want to check immediately: - Does adding a passcode or touch ID have any chance of leaving logins.db in a bad state, thus causing a sync reset (particularly a reset of more than just logins)? This should be easy enough to determine just by watching logs in Xcode. - If logins.db is rendered unopenable, and is thus replaced, is the filename in the notification "logins.db" or something else? If it's something else, I would suggest blocking ship -- it's not a regression, but that would be serious. This should be easy enough to manually fake to test. Someone could also pre-emptively refactor that code to not use filenames for the notification, and instead to use some enum passed in when BrowserDB is instantiated. Everything else can realistically wait for a 6.x or 7.0, because the investigation will be more involved.

Stephan Leroux [:sleroux]

Updated

•

8 years ago

Assignee: nobody → sleroux

Status: NEW → ASSIGNED

Stefan Arentz | :st3fan | ⏰ EST | he/him

Comment 3

•

8 years ago

I can't reproduce these results on v6.x. I have tried adding a passcode and enabled touch id and i see nothing weird happen. My sync is pretty stable. I don't see any messages about logins.db or any retries.

Stephan Leroux [:sleroux]

Comment 4

•

8 years ago

> - Does adding a passcode or touch ID have any chance of leaving logins.db in a bad state, thus causing a sync reset (particularly a reset of more than just logins)? This should be easy enough to determine just by watching logs in Xcode. Not that I can see. The only thing I can see in the code is both the touch ID/passcode metadata and logins.db SQLCipher key are both stored in the same keychain but the keys don't collide so I don't see how one would affect the other. > - If logins.db is rendered unopenable, and is thus replaced, is the filename in the notification "logins.db" or something else? If it's something else, I would suggest blocking ship -- it's not a regression, but that would be serious. This should be easy enough to manually fake to test. I confirmed that if logins.db is unable to be created or opened, the NotificationDatabaseWasRecreated will pass the correct database filename to Profile.swift to only regenerate the logins.db and does not touch browser.db and vice-versa. I could preemptively refactor to enums but probably best to leave it along until we know what's going on just in case. Could this be related to our springboard/background crash at all? I can't see the DB being in an inconsistent state if the OS won't allow disk access at all in that case but who knows.

Richard Newman [:rnewman]

Reporter

Comment 5

•

8 years ago

So then it's back to the root facts: - Somehow history and logins (and perhaps other engines?) were forced to re-sync from 0. - The server and local database still had all of my history data. - After downloading and applying all of the history records, Firefox decided that every record needed reupload. I haven't been node-reassigned. My #1 concern is the reupload of history. Everything else is an annoyance, but the reupload is a killer. I'll file a separate bug for that. (In reply to Stephan Leroux [:sleroux] from comment #4) > Could this be related to our springboard/background crash at all? I can't > see the DB being in an inconsistent state if the OS won't allow disk access > at all in that case but who knows. I'd be surprised: my profile wasn't missing data, which is why I hypothesized that we might be accidentally resetting timestamps too broadly.

Richard Newman [:rnewman]

Reporter

Updated

•

8 years ago

Depends on: 1330045

Jenn Chaulk (:jchaulk)

Updated

•

8 years ago

tracking-fxios: 6.0+ → 7.0+

Farhan Patel [:farhan]

Updated

•

8 years ago

tracking-fxios: 7.0+ → 8.0+

Farhan Patel [:farhan]

Updated

•

8 years ago

Priority: P2 → P3

Justin D'Arcangelo [:justindarc]

Updated

•

8 years ago

Assignee: sleroux → nobody

Status: ASSIGNED → NEW

Justin D'Arcangelo [:justindarc]

Comment 6

•

8 years ago

We should add some Sentry logging to find out if this is happening frequently.

Severity: critical → normal

tracking-fxios: 8.0+ → +

Flags: needinfo?(jdarcangelo)

Whiteboard: [MobileCore] → [MobileCore][Sync]

BMO Automation

Updated

•

2 years ago

Severity: normal → S3

BugBot [:suhaib / :marco/ :calixte]

Comment 7

•

1 years ago

Clear a needinfo that is pending on an inactive user.

Inactive users most likely will not respond; if the missing information is essential and cannot be collected another way, the bug maybe should be closed as INCOMPLETE.

For more information, please visit BugBot documentation.

Flags: needinfo?(justindarc)

Bugzilla

Full re-sync triggered; investigation and breakdown needed

Categories

(Firefox for iOS :: General, defect, P3)

Tracking

()

People

(Reporter: rnewman, Unassigned)

References

(Depends on 1 open bug)

Details

(Whiteboard: [MobileCore][Sync])

Crash Data

Security

(public)

User Story

Description

Updated

Comment 1

Comment 2

Updated

Comment 3

Comment 4

Comment 5

Updated

Updated

Updated

Updated

Updated

Comment 6

Updated

Comment 7