Closed Bug 1707138 Opened 2 months ago Closed 1 month ago

[BFCache] Don't send session store updates after we start process switching

Categories

(Core :: DOM: Navigation, defect, P3)

defect

Tracking

()

RESOLVED FIXED
90 Branch
Fission Milestone M7a
Tracking Status
firefox90 --- fixed

People

(Reporter: annyG, Assigned: annyG)

References

(Blocks 1 open bug)

Details

Attachments

(1 file)

No description provided.

As per Olli's comment https://bugzilla.mozilla.org/show_bug.cgi?id=1701735#c8 test browser/components/sessionstore/test/browser_scrollPositions.js is failing for fission + bfcache

Severity: -- → S3
Status: NEW → ASSIGNED
Fission Milestone: --- → M7a
Priority: -- → P3

So the part of the test that is actually failing with bfcache enabled is not the same what bug 1701735 reported (assertions failing in test_scroll_nested).

Instead, these are the tests that are failing with bfcache enabled. Specifically, the assertion that is failing is this one

After debugging, I see that we are restoring the second page with the wrong scroll and realized that the problem can be traced back to us not saving the scroll correctly. Even though here, we check that the scroll has been set correctly, it somehow gets overwritten later.

Investigating the process of how session store data gets saved, I see that when we loading page2, because of bfcache code here, which flags that a replacement of a browsing context is needed and uses process switching mechanisms for that, we end up process switching, and thus DocumentLoadListener disconnects listeners and sends NS_BINDING_ABORTED error when it triggers a process switch. This eventually causes nsDocShell's EndPageLoad to be called, which tries to collect and update session store data.

I'm not able to replicate the issue of the wrong scroll restoration manually, because it tests scroll positions in background tabs. Since it happens when we trigger a process switch, I figured maybe I could try to reproduce it without bfcache enabled but with a navigation that would cause a process switch, and this was unsuccessful as well. I suspected this might be due to racing issues, but I added a timeouts await new Promise(resolve => setTimeout(resolve, 1000)) in the test after we update scrolls, which didn't help either.

here comes the interesting part:
The scroll update for page1 (so i.e. the collector->RecordInputChange() call) gets issued when we disconnect listeners and as DocumentChannel is process switching, but somehow it ends up taking much longer to complete than the update that gets issued later for page2.

So, perhaps, in nsDocShell::EndPageLoad, if we have been initially called with aStatus != 0, we shouldn't update Session Store data (solution 1). Running sessionstore tests with such a change didn't show any errors. This might be the solution to go with here (so, don't call this code if we are taking this branch).

Otherwise, we might need to think about adding guarantees to ensure that calling collector->RecordInputChange() (or more generally SessionStoreDataCollector::Collect()) doesn't override updates from later loads, or in other words, updates that were issued later cannot be overriden by updates issued earlier (solution 2). We have a SessionStore epoch, but we don't seem to have anything more granular than that (which I understand is way more difficult if we are issuing updates from different processes).

Andreas, do you have any thoughts on this? I can put up a patch for solution 1, but I'm afraid it's not really a solution and just a band-aid.

Flags: needinfo?(afarre)

Let's see if I understand this correctly:

  1. We start a load, which eventually will end up in EndPageLoad
  2. We realize that the load needs a process switch so we process switch
  3. We start a new load that also ends up in EndPageLoad
    4 or 5) The load in 1 calls EndPageLoad
    4 or 5) The load in 3 calls EndPageLoad

and the fact that we can't guarantee that 3 always comes after 1 means that 3 (which is correct)

I think it's the actual solution. If we're aborting a load, and not ending it, I don't think that we should collect anything for it. One successful EndPageLoad or another should get through at one point.

Flags: needinfo?(afarre)

Sounds good. The first load ends up in EndPageLoad twice - first time for regular reasons, because the load is done, and second time, because of document channel disconnecting listeners and process switching. During the second time, a session store update call is issued (by the first load) which somehow ends up taking longer then the first session store update call issued by the second load, in EndPageLoad, when its load is done.

Summary: [BFCache] Fix browser/components/sessionstore/test/browser_scrollPositions.js for Fission + Bfcache → [BFCache] Don't send session store updates after we start process switching
Pushed by agakhokidze@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/ccc0c54f621c
Don't send session store updates after we start process switching, r=farre
Status: ASSIGNED → RESOLVED
Closed: 1 month ago
Resolution: --- → FIXED
Target Milestone: --- → 90 Branch
See Also: → 1705689
You need to log in before you can comment on or make changes to this bug.