Open Bug 396519 Opened 12 years ago Updated 2 years ago

Encapsulate content viewer eviction in session history

Categories

(Core :: DOM: Navigation, defect)

defect
Not set

Tracking

()

REOPENED
mozilla1.9.1b1
Tracking Status
blocking2.0 --- -

People

(Reporter: ajschult784, Assigned: ajschult784)

References

Details

Attachments

(7 files, 3 obsolete files)

7.72 KB, patch
Details | Diff | Splinter Review
31.43 KB, patch
Details | Diff | Splinter Review
2.24 KB, patch
Details | Diff | Splinter Review
2.96 KB, patch
Details | Diff | Splinter Review
83.16 KB, application/zip
Details
88.29 KB, application/zip
Details
87.11 KB, application/zip
Details
Spun off from bug 358599:

Content viewer eviction currently happens in nsSHistory via EvictContentViewers (called by nsDocumentViewer), but the parameters it passes are obtained from nsDocShell.  nsDocShell got those parameters from... nsSHistory (the same one).  It makes sense to have nsSHistory handle content viewer eviction internally.

[comment 40 from bug 358599]
This implements what I mentioned in the previous comment.  nsSHistory takes
responsibility for evicting content viewers (based on distance from current
focus).  Global eviction still happens from nsDocumentViewer::Show. 
Conceivably, this could happen from within nsSHistory in some method that gets
called regularly (to further simplify things), but I'm not what would be most
appropriate.

The docshell state bits (previous and loaded indices) that were causing
problems don't exist.

The one issue I found was that if GotoIndex is called (and the new index is far
away from the old one), the old index won't have a content viewer (yet), but
once the load is finished, docshell will save the content viewer, and it's
effectively lost (never evicted).  Current behavior is the it gets saved and
immediately evicted, which (while better than losing it) seems silly.  With
this patch, nsSHistory tags the entry so that docshell won't save the content
viewer in the first place.  The tag should get removed if the entry is ever
loaded again.

I've left nsSHistory's mRequestedIndex field and docshell still calls
nsSHistory::UpdateIndex.  It seems the primary use of these is gone, but I
don't know that removing them wouldn't have side effects in some cases.
Attachment #281274 - Flags: review?(bzbarsky)
this verifies that content viewers get evicted when they should.  They pass (before and after this patch) with the patch from bug 396649.
To be honest, I don't know when I'll get to this.  This is scary code that I don't really know that well, so I'll have to relearn it to review. Given my time constraints, right now, that does not sound promising.

Sadly, I'm not sure whom else to recommend, other than bryner.  :(
I've been running with this patch for a few weeks now and I've not encountered any of the session history eviction crashes that I used to get intermittently.
(In reply to comment #2)
> Sadly, I'm not sure whom else to recommend, other than bryner.  :(

Would any of you two (now) have time to review the patch ?
Flags: in-testsuite?
Flags: blocking1.9?
Whiteboard: [has-patch]
I'm not likely to until spring, to be honest.  Maybe a miracle will happen before that, though.
Comment on attachment 281274 [details] [diff] [review]
encapsulate content viewer eviction

(Officially asking bryner, per comment 2 and 5.)
Attachment #281274 - Flags: review?(bzbarsky) → review?(bryner)
Attachment #281274 - Flags: superreview?(cbiesinger)
Comment on attachment 281274 [details] [diff] [review]
encapsulate content viewer eviction

Boris emailed me:
{{
bryner doesn't work on Mozilla anymore.
}}

Restoring request to Boris...
(Sorry, I didn't know...)
Attachment #281274 - Flags: review?(bryner) → review?(bzbarsky)
fixes crashes, has tests, but is risky. Must land by beta2, and that means reviewers must be rounded up. Not going to block.
Flags: wanted1.9+
Flags: blocking1.9?
Flags: blocking1.9-
b2 freeze is tomorrow... poke?
any chance we still can get this fixed somewhere?
Blocks: 398751
Comment on attachment 281424 [details] [diff] [review]
content viewer mochitests
[Checkin: Comment 15]

I was going to land these, but a few cases fail randomly -- mostly it seems that doTest gets called before viewer eviction happens (but that the appropriate content viewers are saved/evicted eventually).  Changing the timeout to 1s helped some cases but not others.  And gdb seems uncooperative, so I'll need to re-investigate things before landing the tests.
Can you find a regression range, since they were passing before (c.f. comment 1)?

You have my gdb sympathies, for whatever it's worth.
I don't know of any straightforward way to run a chrome mochitest in an old nightly build.
Comment on attachment 281424 [details] [diff] [review]
content viewer mochitests
[Checkin: Comment 15]

landed.  this just needed to listen for pageshow instead of pagehide.
Attachment #281424 - Attachment description: content viewer mochitests → content viewer mochitests [Checkin: Comment 15]
Awesome, thanks!
Flags: in-testsuite? → in-testsuite+
Comment on attachment 281274 [details] [diff] [review]
encapsulate content viewer eviction

I'm no docshell expert, but I wonder if the SaveContentViewerFlag accessors shouldn't be infallible, and just return the PRBool for get, void for set.

Then the code pattern would be

if (!mOSHE->ShouldSaveContentViewer())
    return NS_ERROR_FAILURE;

or

mOSHE->SetShouldSaveContentViewer(PR_FALSE);

(You already don't check the result of SetSCVF in GotoIndex or LoadEntry, so I bet you really want this change in your heart of hearts. :) )
> just return the PRBool for get

Hmm... but only if I cast mOSHE to an nsSHEntry, right?
Comment on attachment 281274 [details] [diff] [review]
encapsulate content viewer eviction

I think (I could be wrong!) you still want to call shInternal->UpdateIndex() in nsWebShell.cpp so it updates nsSHistory's mIndex after e.g. going back in history. I think you want mIndex updated so that the call to SetHistoryEntry() succeeds (see the comment right above it).
Right.  I wasn't sure how much we'd still need UpdateIndex after this patch, but I was intending to leave the calls.  Looking through CVS history, it looks like it's there so that the index (returned by GetIndex) only changes after a page starts loading.
Comment on attachment 281274 [details] [diff] [review]
encapsulate content viewer eviction

>Index: docshell/base/nsDocShell.cpp
> nsDocShell::HistoryPurged(PRInt32 aNumEntries)

This seems to be a no-op now.  Can we just get rid of it?

> nsDocShell::CaptureState()
>+    PRBool shouldSaveContentViewer = PR_FALSE;
>+    nsresult rv = mOSHE->GetSaveContentViewerFlag(&shouldSaveContentViewer);
>+    if (NS_FAILED(rv) || !shouldSaveContentViewer)
>+        return NS_ERROR_FAILURE;

This is to handle the "we're about to evict it anyway" case, right?

>Index: docshell/base/nsWebShell.cpp
>-            shInternal->UpdateIndex();

That call needs to stay here.

>Index: docshell/shistory/public/nsISHEntry.idl
>+    /** attribute to indicate whether the contentViewer should be saved */
>+    attribute boolean saveContentViewerFlag;

Would it be reasonable to put this at the end of the interface?  It's not a huge deal, but nice from a compat point of view.

>Index: docshell/shistory/src/nsSHistory.cpp
>+      currentEntry->SetSaveContentViewerFlag(PR_FALSE);

So this works because we only do bfcache for toplevel loads, right?

And we restore the PR_TRUE when we load the entry again, not right after the LoadEntry call?  I guess the check that needs that boolean is done async, not inside LoadEntry.  OK.

r+sr=bzbarsky with the webshell fix and maybe the HistoryPurged removal.  Or you can do HistoryPurged nixing in a separate bug.  I promise a _much_ faster review.

Thank you for cleaning this stuff up, and sorry it took me so long to get to the point where I could review this....
Attachment #281274 - Flags: superreview?(cbiesinger)
Attachment #281274 - Flags: superreview+
Attachment #281274 - Flags: review?(bzbarsky)
Attachment #281274 - Flags: review+
Attachment #281274 - Attachment is obsolete: true
Attachment #337741 - Flags: superreview+
Attachment #337741 - Flags: review+
Pushed changeset ff043f7356f7.  Andrew, thank you again for doing this!
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Attachment #337741 - Attachment description: Updated to my comments → Updated to my comments [Backout: Comment 24]
OK, this fixes the memory issue.  What was happening was that once we started purging history (so after 50 pageloads) the PurgeHistory call would decrement mIndex, so that oldIndex would end up equal to the new mIndex and hence we would not evict any content viewers; the result was that after 50 loads we would behave as if gHistoryMaxViewers were 50 or so.

This patch fixes that, and moves some of the code in EvictWindowContentViewers around so that we don't early-return in that situation and instead hit the assert about viewers being too far away from aFromIndex, which starts firing in the situation above.
Attachment #337741 - Attachment is obsolete: true
Pushed changeset 60fc5f224a5b.
Status: REOPENED → RESOLVED
Closed: 11 years ago11 years ago
Resolution: --- → FIXED
Whiteboard: [has-patch]
Target Milestone: --- → mozilla1.9.1b1
Attachment #340734 - Attachment description: With the memory issue fixed → With the memory issue fixed [Checkin: Comment 27]
Backed out again.  There were no rss issues, but there was still a Mac-only Tp regression.  I have no idea why, but I'll dig into it; this bug is blocking my fixing bug 449780 in a sane way.
Blocks: 449780
Flags: blocking1.9.1?
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
blocking+ based on blocking 449870. bz, can I assign this to you based on your last comment?
Assignee: ajschult → bzbarsky
Flags: blocking1.9.1? → blocking1.9.1+
I bit the bullet and took the perf hit of fixing bug 449870 without fixing this bug.  No longer blocking.  I've spent a good bit of time trying to figure out why Tp regressed here, but with no luck.

Back to Andrew.  I'll post a version of this patch merged to live on top of my fix for bug 449870.
Assignee: bzbarsky → ajschult
Flags: blocking1.9.1+
No longer blocks: 449780
What bug are you really talking about? Bug 449870 isn't what you think it is.
it was bug 449780
Attached patch Said mergeSplinter Review
Attachment #340734 - Attachment is obsolete: true
Er, yes.  What andrew said in comment 32.  I coulda sworn I copy/pasted that bug number....
So just to summarize what I know so far:

1)  This patch in causes a Tp regression of about 20ms on the mac try server
    box but no obvious regression on the others.  It also causes a Tp
    regression on Mac when checked into m-c.
2)  This might be related to the fact that the Mac try server is the one
    closest to our 250ms timeout for paint suppression in its Tp times, but
    there are boxes that are comparably fast on Windows on the m-c tinderbox. 
    I don't recall whether those showed the regression.
3)  The hypothesis that the problem is that eviction currently happens after
    onload for fast-loading pages (because we unsuppress after onload) and that
    this patch is just moving it to before onload was tested (by moving the
    paint unsuppression to before onload) and found to not account for the full
    regression.  In fact, that change only changed the timings by about 4-5ms
    at the most.
4)  I tried timing the evictions, both old and new.  The actual call to evict
    doesn't take long enough, on average, to account for the full regression.  
5)  Keeping the current eviction mechanism (so not actually landing this patch)
    but moving the evict call to right after AddToSessionHistory in docshell
    leads to exactly the same Tp regression as this patch.  In other words, the
    issue is certainly the eviction timing, not the actual eviction procedure
    itself.
Oh, and:

6)  I checked on try server and there seem to be no GC or CC calls happening
    between the time the new eviction happens and the time the old eviction
    happens.
blocking2.0: --- → ?
I don't see a reason for this to block 1.9.3, please speak up if this should block.
blocking2.0: ? → -
You need to log in before you can comment on or make changes to this bug.