Last Comment Bug 320742 - crash [@ nsSHistory::EvictGlobalContentViewer] because session history listener shouldn't get called in middle of nsSHistory::AddEntry()
: crash [@ nsSHistory::EvictGlobalContentViewer] because session history listen...
: fixed1.8.1, verified1.8.0.2
Product: Core
Classification: Components
Component: Document Navigation (show other bugs)
: Trunk
: x86 Windows XP
: -- critical (vote)
: ---
Assigned To: Brian Ryner (not reading)
: Andrew Overholt [:overholt]
Depends on:
  Show dependency treegraph
Reported: 2005-12-18 14:09 PST by Matthew Waymost
Modified: 2008-07-31 02:49 PDT (History)
6 users (show)
dveditz: blocking1.8.0.2+
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---

example javascript code (12.96 KB, application/x-javascript)
2005-12-18 14:17 PST, Matthew Waymost
no flags Details
simpler bug example code (1.64 KB, application/x-javascript)
2006-01-22 10:40 PST, Matthew Waymost
no flags Details
my test code (1.04 KB, text/plain)
2006-01-22 13:54 PST, Nickolay_Ponomarev
no flags Details
minimum testcase (1.26 KB, application/x-javascript)
2006-01-23 10:12 PST, Matthew Waymost
no flags Details
patch (2.41 KB, patch)
2006-01-25 16:58 PST, Matthew Waymost
no flags Details | Diff | Splinter Review
patch v2 (2.30 KB, patch)
2006-01-25 23:24 PST, Matthew Waymost
bryner: review-
Details | Diff | Splinter Review
retrieve the currentTxn again after notifying observers (1.13 KB, patch)
2006-01-27 08:46 PST, Marria Nazif
bryner: review-
Details | Diff | Splinter Review
patch v3.1 (2.21 KB, patch)
2006-01-27 11:33 PST, Matthew Waymost
bryner: review+
Details | Diff | Splinter Review
final patch (2.21 KB, patch)
2006-01-27 11:46 PST, Matthew Waymost
no flags Details | Diff | Splinter Review
patch that was checked in (1.20 KB, patch)
2006-01-27 12:49 PST, Brian Ryner (not reading)
dveditz: approval1.8.0.2+
Details | Diff | Splinter Review

Description Matthew Waymost 2005-12-18 14:09:26 PST
User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8) Gecko/20051111 Firefox/1.5
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8) Gecko/20051111 Firefox/1.5

When an entry is added to the session history, the session history listeners get called by AddEntry() before the entry is added to the history, but after the nsSHTransaction corresponding to the current entry is called. If someone registers a nsISHistoryListener that changes mIndex, the browser will crash from an access violation in EvictGlobalContentViewer (see TB13034494Y).

Either the listeners should be called after the entry is added or before currentTxn is acquired.

Reproducible: Always

Steps to Reproduce:
1. Register nsISHistoryListener that changes mIndex at OnHistoryNewEntry (by calling GetEntryAtIndex(entry, true).
2. Run Fx and surf to new page not in session history.

Actual Results:  
Browser crash.

Expected Results:  
Browser should not crash upon changing nsSHistory.mIndex using nsISHistoryListener.OnHistoryNewEntry via GetEntryAtIndex(entry, true).

This bug prevents a nsISHistoryListener from changing the index that the session history is currently at. This affects extensions working with the session history as well as updates/modifications to the session history made by the browser on the JS-level.
Comment 1 Matthew Waymost 2005-12-18 14:17:13 PST
Created attachment 206262 [details]
example javascript code

Here's an example of code that causes this bug. See lines 253-272.
Comment 2 Adam Guthrie 2005-12-18 14:24:01 PST
This might be related to bug 320488.

Incident ID: 13034494
Stack Signature	nsSHistory::EvictGlobalContentViewer 25b32c3e
Product ID	Firefox15
Build ID	2005111116
Trigger Time	2005-12-16 14:46:11.0
Platform	Win32
Operating System	Windows NT 5.1 build 2600
Module	firefox.exe + (0032183a)
URL visited	
User Comments	SHE test crash
Since Last Crash	2224 sec
Total Uptime	596063 sec
Trigger Reason	Access violation
Source File, Line No.	c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/xpfe/components/shistory/src/nsSHistory.cpp, line 852
Stack Trace 	
nsSHistory::EvictGlobalContentViewer  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/xpfe/components/shistory/src/nsSHistory.cpp, line 852]
nsSHistory::EvictContentViewers  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/xpfe/components/shistory/src/nsSHistory.cpp, line 646]
nsPresContext::EnsureVisible  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/layout/base/nsPresContext.cpp, line 1294]
PresShell::UnsuppressAndInvalidate  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/layout/base/nsPresShell.cpp, line 5022]
PresShell::UnsuppressPainting  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/layout/base/nsPresShell.cpp, line 5070]
nsDocShell::EndPageLoad  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/docshell/base/nsDocShell.cpp, line 4747]
nsWebShell::EndPageLoad  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/docshell/base/nsWebShell.cpp, line 664]
nsDocShell::OnStateChange  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/docshell/base/nsDocShell.cpp, line 4673]
nsDocLoader::FireOnStateChange  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/uriloader/base/nsDocLoader.cpp, line 1210]
nsDocLoader::doStopDocumentLoad  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/uriloader/base/nsDocLoader.cpp, line 844]
nsDocLoader::OnStopRequest  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/uriloader/base/nsDocLoader.cpp, line 665]
nsLoadGroup::RemoveRequest  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/netwerk/base/src/nsLoadGroup.cpp, line 732]
nsDocument::UnblockOnload  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/content/base/src/nsDocument.cpp, line 4968]
DestroyImagePLEvent  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/content/base/src/nsImageLoadingContent.cpp, line 670]
SHELL32.dll + 0x4e0c24 (0x778b0c24)
Comment 3 Nickolay_Ponomarev 2006-01-21 15:12:53 PST
I was unable to crash a recent trunk build by evaluating this code in the scope of browser window, calling the onload handler manually, and navigating to a page not in session history.
1) Is it still a problem with trunk builds?
2) If so, could you provide some code that can be used to reproduce the crash.
Comment 4 Matthew Waymost 2006-01-22 10:40:55 PST
Created attachment 209283 [details]
simpler bug example code

The relevant source (nsSHistory::AddEntry and nsSHistory::EvictGlobalContentViewer) hasn't changed between the branch and trunk so the bug should still be there. I am including simpler example code to make it easier to review. All you should have to do is make the SomeHistoryListener object a SHistoryListener. Then you should get a dialog right before the crash occurs. (This is to help make sure that the listener is being added correctly, since I had issues doing this through the js console, as well as easily verify the crash).

As I mentioned earlier, it should be a pretty simple fix: just move the shistorylistener notification code to either before all the add entry code or after it all, not in the middle of it like it is now.
Comment 5 Nickolay_Ponomarev 2006-01-22 13:54:44 PST
Created attachment 209297 [details]
my test code

Okay, I'm probably not helping, since I'm actually not familiar with this code and just trying to reproduce.

That last testcase has syntax error. Here's the (a bit simplified, but I believe equivalent) code I'm testing. I get the alert, then no crash. The session history is truncated to the first two entries. This happens on both trunk debug and 1.5 release, so no difference there :)
Comment 6 Matthew Waymost 2006-01-22 22:13:46 PST
Looking back at my original testcase, the crash may also have to do with the replaceEntry calls, which I originally didn't think we're playing a role. The location in the code where the history listeners get called should be fixed regardless, but it sounds like something else in that original block of code is contributing to the crash. With the correct minimum case, it should definitely be crashing, because when I first came upon this, I tried plenty of different things and it always crashed at the exact same spot in the code.

If you could e-mail the js to get the listener working through the console (which would be easier than reconstructing the old extension code at this point), then I can hammer out a correct minimum testcase that actually reproduces the crash, and repost it here for confirmation.

(btw, Nikolay, you definitely are helping; since this is turning out to be caused by something different than what I suspected initially)
Comment 7 Matthew Waymost 2006-01-23 10:12:20 PST
Created attachment 209367 [details]
minimum testcase

So I'm pretty sure I got to the bottom of this bug and the recent confusion. It does have to do with the listener being called in the middle of the AddEntry function, but there's another wrinkle I didn't notice until this morning.

What my original and new code does, when the listener is registered and a new history entry is being created, is get the index and highest index (history.count - 1) in the session history, see if they are unequal and, if so, go to a different entry in the session history -- the last entry. That's the wrinkle. The code I posted to Nikolay's initial comment didn't get the last entry, it got the entry at index 1, because I hadn't realized this aspect of the problem. AFAICT, this is what's happening in the code.

Given: the session history currently has an mLength of 5, is current at mIndex=2, my most recent shistorylistener is attached
1) I click on a link, generating a new history entry, calling nsSHistory:AddEntry
2) The code before the listener runs: the current transaction is gotten and the persist flag is verified. (If the persist flag is false, the new entry won't get created and we'll be okay, but that's only the case with things like chrome urls.)
3) The SHistoryListeners are iterated through and each one's OnHistoryNewEntry method is called.
4) My listener verifies that we are not at the last entry in the session history (which, given the assumptions, is true) and then sets the current entry of the session history to the last entry in it. This causes mIndex to be set to 4 (mLength - 1).
5) AddEntry then creates a new transaction for the new entry and appends it to the current transaction. (At this point, the transaction list is actually okay. The only problem we have is mIndex being wrong.)
6) mLength is set to (++mIndex)+1. Now mLength is set to 6 since mIndex is erroneously 4. This is the crux of the problem. The session history should only have a length of 4 (with mIndex = 3). So now, any code that uses mLength and uses it to iterate along the transaction list will break...
7) which happens in nsSHistory::EvictGlobalContentViewers. The code uses mLength to set the end condition of the for loop. SHTransactions are iterated through. The last transaction is reached, GetNext is called, and *pop* we get an access violation and crash.

So that's what's going on. The attachment here should create the crash by attaching the listener, getting a few entries in the session history, go back a few entries, and click a link. You'll get the polite dialog saying you're going to crash, and you should. I don't have the trunk installed, but I looked at the code and it's exactly the same from what I can tell, so it should crash on either.

Here's a talkback entry for a crash with the attached code. TB14319458X

Stack Signature	 nsSHistory::EvictGlobalContentViewer bcd008dc
Product ID	Firefox15
Build ID	2005111116
Trigger Time	2006-01-23 09:28:54.0
Platform	Win32
Operating System	Windows NT 5.1 build 2600
Module	firefox.exe + (0032183a)
URL visited
User Comments	bug 320742 crash
Since Last Crash	161211 sec
Total Uptime	3577157 sec
Trigger Reason	Access violation
Source File, Line No.	c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/xpfe/components/shistory/src/nsSHistory.cpp, line 852
Stack Trace 	
nsSHistory::EvictGlobalContentViewer  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/xpfe/components/shistory/src/nsSHistory.cpp, line 852]
nsSHistory::EvictContentViewers  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/xpfe/components/shistory/src/nsSHistory.cpp, line 646]
nsPresContext::EnsureVisible  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/layout/base/nsPresContext.cpp, line 1294]
PresShell::UnsuppressAndInvalidate  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/layout/base/nsPresShell.cpp, line 5022]
PresShell::ProcessReflowCommands  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/layout/base/nsPresShell.cpp, line 6938]
ReflowEvent::HandleEvent  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/layout/base/nsPresShell.cpp, line 6696]
PL_HandleEvent  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/xpcom/threads/plevent.c, line 689]
SHELL32.dll + 0x4e0c24 (0x778b0c24)
Comment 8 Matthew Waymost 2006-01-25 13:23:13 PST
With regards to where the listener should be called, it makes sense to me that it be called after the new entry is added to the history. That way, one can write code that can access the new entry. Currently, the newURI is passed, but that doesn't do anything for the actual new entry. The AddEntry call can't be overriden by a SHistoryListener either (unlike every other action a session history listener can react to), so why not allow access to the new entry via OnHistoryNewEntry?

As an aside, why don't we allow the listener to prevent the addition of the new entry? It can stop any other session history navigation, purge, and so forth.

Also, should we have this bug block bug 324285 so the IDL documentation can be more specific as to when OnHistoryNewEntry is called?
Comment 9 Boris Zbarsky [:bz] (still a bit busy) 2006-01-25 13:45:04 PST
I think fully adding and only then notifying makes the most sense, yes.

We might as well fix the IDL here.  That is, no need to block bug 324285 -- just let it land and then patch on top of it.

Matthew, would you be willing to take a stab at fixing this?  At this point you probably know this code as well as anyone else does....
Comment 10 Matthew Waymost 2006-01-25 14:24:04 PST
Sure, I'm more than willing to write up a fix.
Comment 11 Matthew Waymost 2006-01-25 16:58:26 PST
Created attachment 209647 [details] [diff] [review]

Here's the patch with the listener called after the entry has been completely added. The only code that comes after the listener call is whether to purge the history of extra entries, since the code in the listener might add/remove entries.
Comment 12 Matthew Waymost 2006-01-25 17:57:40 PST
Hmm, I just realized an unintended consequence of the patch. If we complete the addition of the entry, then we lose the index that the session history was previously at, which might also be useful. We could pass this through the OnHistoryNewEntry as an argument, but that, of course, would require the API to be changed.

We could also put the call to the listener before we add the entry to the history (which would be consistent with when other functions of SHistoryListener is called since every other action is stoppable). Then we don't have access to the added entry.

For my purposes as someone who uses the SHistoryListener in extensions, I'd find the first option easiest. Currently, Session History Extension uses the index of where the session history was previously and then adds a WebProgressListener to get the newly added entry. Avoiding this hacky solution would only be accomplished by passing the index through to the method via arguments. I'm sure there are other concerns as well, but that's my two cents as an extension author. (IMHO, the current SHistoryListener interface is somewhat inflexible and not as useful as it could be, but that's a discussion which probably doesn't belong in this bug.)

(Also, I guess I can't assign myself to the bug? Can a driver assign me then?)
Comment 13 Matthew Waymost 2006-01-25 22:18:21 PST
Comment on attachment 209647 [details] [diff] [review]

I'm working on a new version of the patch bcause I think there's a better solution than just moving the listener calling code around.
Comment 14 Matthew Waymost 2006-01-25 23:24:02 PST
Created attachment 209676 [details] [diff] [review]
patch v2

I figured the best thing to do for now was to fix the bug while keeping everything else consistent. Since every other method in SHistoryListener is called prior to the event occurring, I think keeping that consistent for now makes most sense.

So the listener is actually called in the exact same spot, except I store mIndex prior to calling the listeners. Then, if mIndex exceeds the original current index, I set it back (basically nullify the change). This gets rid of the bug without changing how anything else works.
Comment 15 Brian Ryner (not reading) 2006-01-27 00:34:12 PST
Comment on attachment 209676 [details] [diff] [review]
patch v2

It seems like a better solution would be to refetch currentTxn after notifying the listener.  That way, the listener can still change mIndex if desired (certainly that seems like what your extension wants to do), and we'll still cut off the list at the right place.
Comment 16 Marria Nazif 2006-01-27 08:42:54 PST
bryner and I stepped through this and found that what caused unexpected behavior was when a listener changes mIndex and then currentTxn is not pointing to the same place as mIndex.  An easy fix to this is retrieving the currentTxn again after the listeners are called.
Comment 17 Marria Nazif 2006-01-27 08:46:36 PST
Created attachment 209851 [details] [diff] [review]
retrieve the currentTxn again after notifying observers
Comment 18 Matthew Waymost 2006-01-27 10:20:27 PST
(In reply to comment #15)
> (From update of attachment 209676 [details] [diff] [review] [edit])
> It seems like a better solution would be to refetch currentTxn after notifying
> the listener.  That way, the listener can still change mIndex if desired
> (certainly that seems like what your extension wants to do), and we'll still
> cut off the list at the right place.

That definitely makes more sense since it actually fixes the expected behavior as opposed to working around it. And you're absolutely right, that is what my extension wants to do.

Should we add a comment to the new currentTxn retrieval, so its clear why we're getting it a second time? Maybe something along the lines of "Getting currentTxn again in case a listener modified mIndex." I know if I looked at the code without having gone through this, I'd be a little confused by what would look like a redundant call.
Comment 19 Brian Ryner (not reading) 2006-01-27 10:50:00 PST
Comment on attachment 209851 [details] [diff] [review]
retrieve the currentTxn again after notifying observers

>--- docshell/shistory/src/nsSHistory.cpp	5 Jan 2006 21:01:24 -0000	1.74
>+++ docshell/shistory/src/nsSHistory.cpp	27 Jan 2006 16:37:51 -0000
>@@ -298,16 +298,19 @@ nsSHistory::AddEntry(nsISHEntry * aSHEnt
>       nsCOMPtr<nsIHistoryEntry> hEntry(do_QueryInterface(aSHEntry));
>       if (hEntry) {
>         hEntry->GetURI(getter_AddRefs(uri));
>         listener->OnHistoryNewEntry(uri);
>       }
>     }
>   }
>+  if(mListRoot)
>+    GetTransactionAtIndex(mIndex, getter_AddRefs(currentTxn));

I'm a little worried about this slowing down entry insertion in the common case (where mIndex will be unchanged) since it has to walk a linked list up to mIndex.  So I'd suggest two changes:

1. Compare the index before and after calling the listener, and only call GetTransactionAtIndex a second time if the index has changed.  Add a comment saying that the listener could change the index, since it's clearly not obvious given this bug.

2. Move the whole thing up into the |if (hEntry)| block, since we know that a history listener is the only thing that could change mIndex during execution of this method.
Comment 20 Matthew Waymost 2006-01-27 11:33:35 PST
Created attachment 209874 [details] [diff] [review]
patch v3.1

Patch updated per bryner's changes.

When deciding whether to refetch currentTxn, I only check whether currentIndex == mIndex. If we're inside |if (hEntry)| block, we should already know that there's an mListRoot already, making that check unnecessary.
Comment 21 Brian Ryner (not reading) 2006-01-27 11:40:20 PST
Comment on attachment 209874 [details] [diff] [review]
patch v3.1

>--- xpfe/components/shistory/src/nsSHistory.cpp	30 Sep 2005 02:14:22 -0000
>+++ xpfe/components/shistory/src/nsSHistory.cpp	27 Jan 2006 19:25:06 -0000
>@@ -292,25 +292,32 @@ nsSHistory::AddEntry(nsISHEntry * aSHEnt
>     NS_ENSURE_SUCCESS(currentTxn->SetSHEntry(aSHEntry),NS_ERROR_FAILURE);
>     currentTxn->SetPersist(aPersist);
>     return NS_OK;
>   }
>   nsCOMPtr<nsISHTransaction> txn(do_CreateInstance(NS_SHTRANSACTION_CONTRACTID));
>+  PRInt32 currentIndex = mIndex;

This should move inside the |if (hEntry)| block as well.  r=me with that fixed, I'll check it in.
Comment 22 Matthew Waymost 2006-01-27 11:46:42 PST
Created attachment 209876 [details] [diff] [review]
final patch

Final patch
Comment 23 Brian Ryner (not reading) 2006-01-27 12:34:18 PST
Comment on attachment 209876 [details] [diff] [review]
final patch

this doesn't actually have the change I mentioned.  don't worry about attaching another patch though.
Comment 24 Matthew Waymost 2006-01-27 12:41:08 PST
Whoops, attached the same patch again, instead of the new one. Sorry about that.
Comment 25 Brian Ryner (not reading) 2006-01-27 12:47:38 PST
/cvsroot/mozilla/docshell/shistory/src/nsSHistory.cpp,v  <--  nsSHistory.cpp
new revision: 1.75; previous revision: 1.74

I'm shooting to get this in on the branches as well.
Comment 26 Brian Ryner (not reading) 2006-01-27 12:49:41 PST
Created attachment 209884 [details] [diff] [review]
patch that was checked in
Comment 27 Brian Ryner (not reading) 2006-01-28 10:07:57 PST
Comment on attachment 209884 [details] [diff] [review]
patch that was checked in

drivers approval isn't needed for 1.8.1
Comment 28 Brian Ryner (not reading) 2006-01-28 10:11:00 PST
checked in on MOZILLA_1_8_BRANCH.
Comment 29 Brian Ryner (not reading) 2006-02-05 13:43:24 PST
This crash is still showing up a little bit on, so it would be good to get this fix in for the next update.
Comment 30 Matthew Waymost 2006-02-13 08:25:17 PST
This bug fix has been checked in on both the 1.8.1 branch and trunk for over two week without any reports of regression. Based on that and what bryner mentioned in comment 29, this should get checked in for
Comment 31 Boris Zbarsky [:bz] (still a bit busy) 2006-02-13 08:44:07 PST
Matthew, in case you missed it the patch is waiting on driver approval for  Drivers are currentl working on the 1.7 branch releases, after which they will look at stuff.
Comment 32 Daniel Veditz [:dveditz] 2006-02-22 00:50:30 PST
Comment on attachment 209884 [details] [diff] [review]
patch that was checked in

approved for 1.8.0 branch, a=dveditz
Comment 33 Brian Ryner (not reading) 2006-02-22 09:46:20 PST
checked in on MOZILLA_1_8_0_BRANCH
Comment 34 Brian Ryner (not reading) 2006-02-23 11:06:51 PST
oops, fixed1.8.0.2, not
Comment 35 Dave Liebreich [:davel] 2006-03-02 10:47:08 PST
Marking [rft-dl] (ready for testing in Firefox release candidates)
Comment 36 Tracy Walker [:tracy] 2006-03-02 12:14:41 PST
Verified with Fx on Windows build 2006-03-02-06-mozilla1.8.0

Note You need to log in before you can comment on or make changes to this bug.