I'm not able to replicate the issue of the wrong scroll restoration manually, because it tests scroll positions in background tabs. Since it happens when we trigger a process switch, I figured maybe I could try to reproduce it without bfcache enabled but with a navigation that would cause a process switch, and this was unsuccessful as well. I suspected this might be due to racing issues, but I added a timeouts
await new Promise(resolve => setTimeout(resolve, 1000)) in the test after we update scrolls, which didn't help either.
here comes the interesting part:
The scroll update for page1 (so i.e. the
collector->RecordInputChange() call) gets issued when we disconnect listeners and as DocumentChannel is process switching, but somehow it ends up taking much longer to complete than the update that gets issued later for page2.
So, perhaps, in
nsDocShell::EndPageLoad, if we have been initially called with aStatus != 0, we shouldn't update Session Store data (solution 1). Running sessionstore tests with such a change didn't show any errors. This might be the solution to go with here (so, don't call this code if we are taking this branch).
Otherwise, we might need to think about adding guarantees to ensure that calling
collector->RecordInputChange() (or more generally
SessionStoreDataCollector::Collect()) doesn't override updates from later loads, or in other words, updates that were issued later cannot be overriden by updates issued earlier (solution 2). We have a SessionStore epoch, but we don't seem to have anything more granular than that (which I understand is way more difficult if we are issuing updates from different processes).