Closed Bug 348497 Opened 15 years ago Closed 15 years ago

crashes enabling designMode [@ nsQueryInterfaceWithError::operator()]

Categories

(Core :: DOM: Editor, defect)

1.8 Branch
defect
Not set
critical

Tracking

()

VERIFIED FIXED
mozilla1.8.1beta2

People

(Reporter: dbaron, Assigned: pkasting)

References

(Blocks 1 open bug)

Details

(5 keywords)

Crash Data

Attachments

(2 files, 1 obsolete file)

A set of crashes started showing up in Firefox nightlies on the 1.8 branch, with the following top-of-stack:

nsQueryInterfaceWithError::operator()  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/xpcom/build/nsCOMPtr.cpp, line 64]
nsEditor::AddDocumentStateListener  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/editor/libeditor/base/nsEditor.cpp, line 2018]
nsEditingSession::SetupEditorOnWindow  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/editor/composer/src/nsEditingSession.cpp, line 445]
nsEditingSession::MakeWindowEditable  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/editor/composer/src/nsEditingSession.cpp, line 225]
nsHTMLDocument::SetDesignMode  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/content/html/document/src/nsHTMLDocument.cpp, line 3649]

These first appeared in the 2006-08-09 (03:00 PDT) build, and have been appearing at a rate of 2-3 crashes per day since then:
http://talkback-public.mozilla.org/reports/firefox/FF2x/FF2x-topcrashers.html
Flags: blocking1.8.1?
This testcase is based on an url from one of the talkback reports: http://beta.zb5.zeroboard.com
It crashes on current branch builds for me, but not on trunk builds.
It regressed on branch between 2006-08-08 and 2006-08-09:
http://bonsai.mozilla.org/cvsquery.cgi?treeid=default&module=all&branch=MOZILLA_1_8_BRANCH&branchtype=match&dir=&file=&filetype=match&who=&whotype=match&sortby=Date&hours=2&date=explicit&mindate=2006-08-08+05&maxdate=2006-08-09+06&cvsroot=%2Fcvsroot
Probably a regression from bug 347200.
Blocks: 347200
Keywords: regression, testcase
So yeah, this looks like a regression from bug 347200.  The basic problem is the following stack (somewhat abridged):

#6  0xb295f16c in nsEditingSession::MakeWindowEditable (this=0xb07e19e8, 
    aWindow=0xb07d6360, aEditorType=0xb5aa4674 "html", aDoAfterUriLoad=0)
    at ../../../../mozilla/editor/composer/src/nsEditingSession.cpp:220
#7  0xb577da51 in nsHTMLDocument::SetDesignMode (this=0xb0707b70, 
    aDesignMode=@0xbfffbdec)
    at ../../../../../mozilla/content/html/document/src/nsHTMLDocument.cpp:3648
#8  0xb5420959 in nsSubDocumentFrame::ShowDocShell (this=0xb0750264)
    at ../../../mozilla/layout/generic/nsFrameFrame.cpp:771
#9  0xb541edbf in nsSubDocumentFrame::Init (this=0xb0750264, aPresContext=0xb27c4fd0, 
    aContent=0xb0147c48, aParent=0xb27f1050, aContext=0xb0750214, aPrevInFlow=0x0)
    at ../../../mozilla/layout/generic/nsFrameFrame.cpp:278
...
#15 0xb5376259 in nsCSSFrameConstructor::RecreateFramesForContent (this=0xb079f730, 
    aContent=0xb0147c48) at ../../../mozilla/layout/base/nsCSSFrameConstructor.cpp:11944
...
#23 0xb5775db2 in nsHTMLDocument::FlushPendingNotifications (this=0xb0707b70, 
    aType=Flush_Frames)
    at ../../../../../mozilla/content/html/document/src/nsHTMLDocument.cpp:1264
#24 0xb295f85b in nsEditingSession::SetupEditorOnWindow (this=0xb07e19e8, 
    aWindow=0xb07d6360)
    at ../../../../mozilla/editor/composer/src/nsEditingSession.cpp:396

This is the stack to nsComposerCommandsUpdater::~nsComposerCommandsUpdater.

So the problem is that in SetupEditorOnWindow we create an nsComposerCommandsUpdater, take a ref to it in mStateMaintainer, but pass around the stateMaintainer pointer to things like nsTransactionManager::AddListener.  In this case, we reenter SetupEditorOnWindow, which clobbers mStateMaintainer and deletes the original (from outer invokation) value.  Then we unwind to the outer SetupEditorOnWindow call, use the now-deleted object, and crash.

The reason bug 347200 is responsible is because it changed what GetDesignMode() reports during editor setup, so that we hit the block of code that toggles designMode in nsSubDocumentFrame::ShowDocShell.

What we should probably do is:

1)  Fix SetupEditorOnWindow to hold an nsRefPtr on the stack to the object it's 
    gonna pass around.
2)  Flush a lot earlier.  Either at the top of SetupEditorOnWindow (and add
    code to detect that this reentered and bail out of the outer invokation) or
    even in nsHTMLDocument::SetDesignMode...
3)  Add an internal API that does NOT flush for reenabling designMode, since
    flushing from inside a frame Init() method is just bad juju.

In all this I'm not sure why this is not crashing on trunk; it looks like the same problems should happen there.
Flags: blocking1.9?
Another option (possibly sanest) is to move the designmode toggling of nsSubDocumentFrame::ShowDocShell onto an event.
Though that might not work if a script causes a restyle and then tries to perform editor commands on the iframe.... 
Flags: blocking1.8.1? → blocking1.8.1+
Oh dear.  I have time to help with this, but I don't understand much of what Boris just said.
Flags: blocking1.8.1+ → blocking1.8.1?
Flags: blocking1.8.1? → blocking1.8.1+
Peter, catch me on IRC and we can talk about this?
From IRC:

<bz> So the sequence of events is the following:
<bz> 1)  We enter SetupEditorOnWindow because the JS calls SetDesignMode()
<bz> 2)  SetupEditorOnWindow flushes.
<bz> 3)  There is a pending "display" change which gets processed.
<bz> 4)  The nsSubDocumentFrame for the <iframe> gets recreated
<bz> 5)  nsSubDocumentFrame calls SetDesignMode()
<bz> (this last is to work around the fact that editor caches the presshell and we just created a new one)
<bz> 6)  We reenter SetupEditorOnWindow
<bz> So the really unsafe part is step 5
<bz> Which [was] the fix for bug 284245
Attached patch hack fix, not usable (obsolete) — Splinter Review
I don't think jst's fix from bug 284245 was the right thing to do.  This patch tried changing it to just update the presShell on the editor directly.  Sadly, it doesn't really work: it does fix the crash on this bug, but it re-breaks bug 284245.  This is being posted for bz's amusement.
Assignee: nobody → pkasting
Target Milestone: --- → mozilla1.8.1beta2
This is the bare minimum I can think to do to prevent this crash.

This is basically item 2 from comment 2 above, but without even any reentrancy checking (I didn't need it).

The testcase for bug 284245 ( https://bugzilla.mozilla.org/attachment.cgi?id=175927 ) "works" equally well with and without this patch (I think), but there seem to be weird bugs there either way.  For example, if you enable designmode, type a character into the box, and then disable designmode, then you can still edit in the box--even after going back and then forward.  This is just all screwy.  I think there are major problems going on here.

Also, I note that nsSubDocumentFrame::ShowDocShell() isn't the only place that twiddles designmode off and then on; nsHTMLDocument::OpenCommon() does it too.

Someone really needs to take a step back and figure out how designMode is supposed to work, and then get it right.  Unfortunately, no one seems well-suited to do that, and in the meantime we have a topcrash that needs to get fixed by B2...
Attachment #233692 - Attachment is obsolete: true
Attachment #233906 - Flags: review?(bzbarsky)
Comment on attachment 233906 [details] [diff] [review]
attempted band-aid, branch version

I think I'd prefer the bandaid either in nsHTMLDocument::SetDesignMode (just flush there) or earlier in nsEditingSession::MakeWindowEditable before we call TearDownEditorOnWindow.  Either one will ensure that we don't need obvious reentry protection.  I don't know why not dealing with reentry works, so I'm not sure I want to trust it to continue working...

I do agree that if ReInit() doesn't obviously work we should do this and file a followup to make this work right (possibly trunk-only, certainly with blocking1.9? set).
Comment on attachment 233906 [details] [diff] [review]
attempted band-aid, branch version

pkasting convinced me that flushing in SetDesignMode is more fragile if something in between introduces new pending notifications.  Let's do this for branch and file a followup for trunk.
Attachment #233906 - Flags: superreview+
Attachment #233906 - Flags: review?(bzbarsky)
Attachment #233906 - Flags: review+
Checked in bandaid on trunk to bake.

/mozilla/editor/composer/src/nsEditingSession.cpp 1.46

I will file a followup bug on fixing the issues here more properly, but this will hopefully be "good enough" for branch.
Whiteboard: [baking]
Attachment #233906 - Flags: approval1.8.1?
Blocks: 348802
Comment on attachment 233906 [details] [diff] [review]
attempted band-aid, branch version

a=dbaron on behalf of drivers.  Please land on MOZILLA_1_8_BRANCH and add the fixed1.8.1 keyword once you have done so.

Although if you end up having to continue to tweak this code to fix further regressions at some point we'll have to back the whole thing (i.e., the fix that caused this regression) out.
Attachment #233906 - Flags: approval1.8.1? → approval1.8.1+
Fixed on branch.

/mozilla/editor/composer/src/nsEditingSession.cpp 1.43.4.1
Status: NEW → RESOLVED
Closed: 15 years ago
Keywords: fixed1.8.1
Resolution: --- → FIXED
Whiteboard: [baking]
verified with Windows and Mac 2.0b2 builds from 20060821
Status: RESOLVED → VERIFIED
Flags: blocking1.9?
RCS file: /cvsroot/mozilla/testing/mochitest/tests/test_bug348497.html,v
done
Checking in tests/test_bug348497.html;
/cvsroot/mozilla/testing/mochitest/tests/test_bug348497.html,v  <--  test_bug348497.html
initial revision: 1.1
done
Flags: in-testsuite+
Crash Signature: [@ nsQueryInterfaceWithError::operator()]
You need to log in before you can comment on or make changes to this bug.