Last Comment Bug 311269 - Crash on when sending password request [@ nsSHEntry::ContentInserted] [@ nsSHEntry::DropPresentationState] [@ nsSHEntry::DocumentMutated]
: Crash on when sending password request [@ nsSHEntry::ContentInse...
: crash, fixed1.8, regression, topcrash
Product: Core
Classification: Components
Component: Document Navigation (show other bugs)
: 1.8 Branch
: All All
: -- critical (vote)
: ---
Assigned To: Jonas Sicking (:sicking) PTO Until July 5th
: 311144 311351 (view as bug list)
Depends on:
Blocks: 292962
  Show dependency treegraph
Reported: 2005-10-05 15:01 PDT by Marcia Knous [:marcia - use ni]
Modified: 2011-08-05 22:33 PDT (History)
19 users (show)
asa: blocking1.8rc1+
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---

stack trace from talkback report (3.53 KB, text/plain)
2005-10-05 15:23 PDT, Asa Dotzler [:asa]
no flags Details
another stack with one more line at the top (2.15 KB, text/plain)
2005-10-05 15:28 PDT, Asa Dotzler [:asa]
no flags Details
back out 292962 (2.62 KB, patch)
2005-10-05 15:49 PDT, Scott MacGregor
no flags Details | Diff | Splinter Review
Set up mutation observer at right place (2.66 KB, patch)
2005-10-05 17:37 PDT, Jonas Sicking (:sicking) PTO Until July 5th
bryner: review+
bryner: superreview+
asa: approval1.8rc1+
Details | Diff | Splinter Review

Description Marcia Knous [:marcia - use ni] 2005-10-05 15:01:43 PDT
From Asa's blog, I filed this bug because we are getting numerous reports:

Build 2005100418 for Mac OS X crashed on each time I tried
to send the password via the request form of the e-mail module,
which is NOT the standard login of The nightly of 2005-10-03 (06
or 07, can't remember) didn't have that problem.

I also am getting a consistent crash visiting site and logging
in. I can reproduce this 100% on the 2005100418 Mac build but not on the 2005103

The first lines of my Apple stack show commonality with the yahoo stack trace -

1 org.mozilla.firefox 0x002f1610 nsSHEntry::DocumentMutated() + 224
2 org.mozilla.firefox 0x001b159c nsDocument::ContentRemoved(nsIContent*,
nsIContent*, int) + 140
3 org.mozilla.firefox 0x00248e60 nsGenericElement::RemoveChildAt(unsigned, int)
+ 856
Comment 1 Scott MacGregor 2005-10-05 15:05:09 PDT
Some folks have suggested that this might be related to Bug 292962
Comment 2 Jesse Ruderman 2005-10-05 15:08:16 PDT
This crash accounts for at least 11 of the 24 crashes reported against the
2005-10-04-18 branch builds of Firefox:

This crash looks exploitabe.
Comment 3 Asa Dotzler [:asa] 2005-10-05 15:16:48 PDT
cc'ing some people that were involved at bug 292962. Sounds like we have to get
this fixed for beta 2. 
Comment 4 Asa Dotzler [:asa] 2005-10-05 15:20:47 PDT
This is all platforms. 

Comment 5 Asa Dotzler [:asa] 2005-10-05 15:23:33 PDT
Created attachment 198634 [details]
stack trace from talkback report
Comment 6 Asa Dotzler [:asa] 2005-10-05 15:28:11 PDT
Created attachment 198636 [details]
another stack with one more line at the top
Comment 7 Boris Zbarsky [:bz] 2005-10-05 15:34:23 PDT
So.. we should really not be triggering on the DOM mutations Sanitize()
triggers, at the very least.
Comment 8 Brian Ryner (not reading) 2005-10-05 15:37:59 PDT
No, we shouldn't, or most pages won't be cached.
Comment 9 Boris Zbarsky [:bz] 2005-10-05 15:40:51 PDT
That's what both stacks are showing -- we hit a DOM mutation from inside
Sanitize(), which causes us to drop everything.  Not sure why that crashes, but...
Comment 10 Scott MacGregor 2005-10-05 15:49:01 PDT
Created attachment 198642 [details] [diff] [review]
back out 292962

After a mini pow-wow with bryner and Asa. We think we should just back out
292962, respin some bits so we have something that doesn't crash for our end

292962 wasn't fixing any known bug or crash.

This at least gives us a build in hand we can ship with as a backup plan.
Comment 11 Scott MacGregor 2005-10-05 16:22:59 PDT
backed out on the branch for now. 
Comment 12 Jonas Sicking (:sicking) PTO Until July 5th 2005-10-05 16:42:07 PDT
Ok, this is wrong on many levels

1. We set up the mutation-observer before we sanitize the document (wrong)
2. The sanitation causes document mutations (why?)
3. We end up in nsSHEntry::DropPresentationState which sets
   mContentViewer = nsnull
   where mContentViewer is 0xdddddddd  (why)

Still looking, but there are bigger problems here then just bug 292962
Comment 13 Boris Zbarsky [:bz] 2005-10-05 16:53:23 PDT
> 2. The sanitation causes document mutations (why?)

It resets the value of inputs, which means modifying the anonymous content
inside the editor.  That notification goes to the document observers, of course.

>   where mContentViewer is 0xdddddddd  (why)

Sounds like the nsSHEntry has been deleted.  Presumably, the only ref being held
to it at this point is by the content viewer, so it dies when
mContentViewer->ClearHistoryEntry() is called...
Comment 14 Jonas Sicking (:sicking) PTO Until July 5th 2005-10-05 16:59:29 PDT
Ok, so what happens is that we first set up the mutationobserver. We then
sanitize the document causing a mutation and DropPresentationState is called.
The first thing it does is to tell the contentviewer to release the reference to
the SHEntry. However the contentviewer holds the last entry to the SHEntry and
so the SHEntry dies. When DropPresentationState then continues all members are
of course deleted.

Working on a patch now.
Comment 15 Jonas Sicking (:sicking) PTO Until July 5th 2005-10-05 17:15:33 PDT
So I can fix 1 and I understand 2, but 3 is still a mystery to me. If i put a
breakpoint in nsDocumentViewerImpl::Destroy mSHEntry usually has a refcount of
2. However on the page where the crash happens the refcount is for some reason
1. So even once i've fixed the observer to be set up at the right time we are
still going to go through a bunch of hassle to save the presentation in the
SHEntry and then just delete the SHEntry when the contentviewer releases its
reference to it.

I'll file that as a new bug, but ideas are welcome.
Comment 16 Jonas Sicking (:sicking) PTO Until July 5th 2005-10-05 17:37:40 PDT
Created attachment 198647 [details] [diff] [review]
Set up mutation observer at right place

I think we want this patch either way. Without it there is going to be a lot of
pages that we don't store in the bfcache. Specifically all pages with forms
modified by the user (not sure the 'modified by user' requirement is there
Comment 17 Jonas Sicking (:sicking) PTO Until July 5th 2005-10-05 17:42:50 PDT
Btw, steps I used to reproduce:

1. Enter the mozilla store.
2. If you are logged in, log out.
3. Click the 'My Account' link.
4. Enter a bogus email and passwd and press 'sign in'
5. Repeat 4

I always crash when pressing 'sign in' the second time.
Comment 18 Boris Zbarsky [:bz] 2005-10-05 19:24:32 PDT
So the reason the SHEntry is about to die is that we're doing a LOAD_LINK load
of the same URI.  We get into nsDocShell::OnNewURI (called from
CreateContentViewer _after_ the last time we check CanSavePresentation) and this
resets mLoadType to LOAD_NORMAL_REPLACE (see line 7278).  So we save the
presentation in the old SHEntry, then do a replace and drop the old SHEntry on
the floor.
Comment 19 Marcia Knous [:marcia - use ni] 2005-10-05 19:50:07 PDT
anyone on this cc list - if you can please try to reproduce the crash on on the new set of builds we just put out -
It should no longer crash due to a backout.

On the Mac build from that directory, I no longer see the crash.
Comment 20 logan 2005-10-05 20:29:19 PDT
crash on confirmed with 2005100504 linux build, 2005100517 is ok
Comment 21 Mike Schroepfer 2005-10-05 20:44:24 PDT

On windows cannot repro crash
Comment 22 dex_sf 2005-10-06 04:07:55 PDT
I wrote the comment on Asa's blog you quoted. 

With the Build 2005100518 on MAC the password request form of the
email module works perfectly again. Not the slightest problem anymore, thanks!!
No problem with the Build 2005100517 on WinXP Pro SP2 either.

Comment 23 Andreas Morawietz 2005-10-06 07:07:54 PDT
*** Bug 311144 has been marked as a duplicate of this bug. ***
Comment 24 Jesse Ruderman 2005-10-06 10:49:08 PDT
*** Bug 311351 has been marked as a duplicate of this bug. ***
Comment 25 Frank Yan (:fryn) 2005-10-06 14:21:06 PDT
someone label this with the keyword:

Comment 26 Brian Ryner (not reading) 2005-10-06 14:29:12 PDT
Comment on attachment 198647 [details] [diff] [review]
Set up mutation observer at right place

Would be nice if we could avoid the kungFuDeathGrip, but that's fairly
difficult when both the SHEntry and the ContentViewer are going to be
Comment 27 Scott MacGregor 2005-10-06 14:38:15 PDT
putting the keyword back. This was fixed for 1.5 Beta 2 by backing out 292962.
We need this keyword or it shows up as a blocker which hasn't been checked in
yet in our queries. Please don't remove it. 
Comment 28 Jonas Sicking (:sicking) PTO Until July 5th 2005-10-07 16:27:39 PDT
Comment on attachment 198647 [details] [diff] [review]
Set up mutation observer at right place

So I think we should land this for 1.8.

Without this patch we will never put pages with password fields or fields with
autocomplete=off in the bfcache.

Not sure if we should try to also reland the patch that got backed out, though
this patch does fix the crash that we saw on two levels.
Comment 29 Boris Zbarsky [:bz] 2005-10-07 16:32:04 PDT
Fwiw, I think we should take 292962 if there are no outstanding issues from it.
Comment 30 Asa Dotzler [:asa] 2005-10-10 14:55:34 PDT
Comment on attachment 198647 [details] [diff] [review]
Set up mutation observer at right place

It's too late to make serious changes after beta2 since we won't get any more
serious testing opportunities. The added perf we'd get from this on password
pages isn't worth the risk at this point.
Comment 31 Boris Zbarsky [:bz] 2005-10-10 16:30:04 PDT
Asa, I'm pretty unhappy about shipping bfcache enabled if we're not planning to
fix bug 292962.  I realize that testing is going to be hard to come by, but then
again I think this bug should have blocked 1.8b5 like it's flagged to...
Comment 32 Asa Dotzler [:asa] 2005-10-18 14:21:22 PDT
Comment on attachment 198647 [details] [diff] [review]
Set up mutation observer at right place

OK. let's get this landed ASAP.
Comment 33 Jonas Sicking (:sicking) PTO Until July 5th 2005-10-18 16:23:51 PDT
Does this mean that we want to reverse the backout done on the branch too?
Comment 34 Scott MacGregor 2005-10-18 16:26:42 PDT
yup, we plussed 292962 again for the branch so it can be re-landed.
Comment 35 Jonas Sicking (:sicking) PTO Until July 5th 2005-10-18 16:31:39 PDT
done and done

Note You need to log in before you can comment on or make changes to this bug.