Closed Bug 859099 Opened 11 years ago Closed 11 years ago

crash in nsObjectLoadingContent::DoStopPlugin

Categories

(Core Graveyard :: Plug-ins, defect, P1)

21 Branch
x86
Windows XP
defect

Tracking

(firefox20 unaffected, firefox21+ fixed, firefox22 unaffected, firefox23 unaffected)

RESOLVED FIXED
Tracking Status
firefox20 --- unaffected
firefox21 + fixed
firefox22 --- unaffected
firefox23 --- unaffected

People

(Reporter: scoobidiver, Assigned: johns)

References

Details

(Keywords: crash, regression, topcrash, Whiteboard: [qa-])

Crash Data

Attachments

(1 file, 3 obsolete files)

It's currently #21 top browser crasher in 21.0b1.

Without the fix of bug 711953 it's hard to determine the regression range but it seems to have showed up in 21.0a2/20130328. The regression range is:
http://hg.mozilla.org/releases/mozilla-aurora/pushloghtml?fromchange=34f00081ec61&tochange=b8ceed3fed5d
It's likely a regression from bug 851378.

There are no crashes in branches 22 and 23 so far so I assume a fix exists for 22.0 but bug 851378 has no dependent bugs.

Signature 	msvcr100.dll@0x8af06 More Reports Search
UUID	43e810cf-4a3d-4f2e-8350-4638d2130405
Date Processed	2013-04-05 09:02:05
Uptime	31307
Install Age	8.7 hours since version was first installed.
Install Time	2013-04-05 00:19:01
Product	Firefox
Version	21.0
Build ID	20130401192816
Release Channel	beta
OS	Windows NT
OS Version	5.1.2600 Service Pack 3
Build Architecture	x86
Build Architecture Info	GenuineIntel family 15 model 4 stepping 1
Crash Reason	EXCEPTION_NONCONTINUABLE_EXCEPTION
Crash Address	0x0
User Comments	It just totally quit on me w/o a moment's notice
App Notes 	
AdapterVendorID: 0x8086, AdapterDeviceID: 0x2582, AdapterSubsysID: 01791028, AdapterDriverVersion: 6.14.10.4543
D3D10 Layers? D3D10 Layers- D3D9 Layers? D3D9 Layers- 
Processor Notes 	sp-processor08.phx1.mozilla.com_23314:2008
EMCheckCompatibility	True
Adapter Vendor ID	0x8086
Adapter Device ID	0x2582
Total Virtual Memory	2147352576
Available Virtual Memory	1590165504
System Memory Use Percentage	95
Available Page File	330821632
Available Physical Memory	23764992

Frame 	Module 	Signature 	Source
0 	msvcr100.dll 	msvcr100.dll@0x8af06 	
1 	xul.dll 	nsObjectLoadingContent::DoStopPlugin 	content/base/src/nsObjectLoadingContent.cpp:2519
2 	xul.dll 	nsObjectLoadingContent::StopPluginInstance 	content/base/src/nsObjectLoadingContent.cpp:2566
3 	xul.dll 	nsObjectLoadingContent::~nsObjectLoadingContent 	content/base/src/nsObjectLoadingContent.cpp:684
4 	xul.dll 	mozilla::dom::HTMLObjectElement::~HTMLObjectElement 	content/html/content/src/HTMLObjectElement.cpp:153
5 	xul.dll 	mozilla::dom::HTMLObjectElement::`vector deleting destructor' 	
6 	xul.dll 	nsNodeUtils::LastRelease 	content/base/src/nsNodeUtils.cpp:258
7 	xul.dll 	mozilla::dom::FragmentOrElement::Release 	content/base/src/FragmentOrElement.cpp:1723
8 	xul.dll 	ReleaseSliceNow 	js/xpconnect/src/XPCJSRuntime.cpp:636
9 	xul.dll 	XPCIncrementalReleaseRunnable::ReleaseNow 	js/xpconnect/src/XPCJSRuntime.cpp:691
10 	winmm.dll 	timeGetTime 	
11 	xul.dll 	XPCIncrementalReleaseRunnable::Run 	js/xpconnect/src/XPCJSRuntime.cpp:720
12 	xul.dll 	nsCOMPtr_base::assign_with_AddRef 	obj-firefox/xpcom/build/nsCOMPtr.cpp:48
13 	ntdll.dll 	RtlEnterCriticalSection 	
14 	nspr4.dll 	PR_Lock 	nsprpub/pr/src/threads/combined/prulock.c:201
15 	nspr4.dll 	PR_Unlock 	nsprpub/pr/src/threads/combined/prulock.c:315
16 	xul.dll 	mozilla::Mutex::Unlock 	obj-firefox/dist/include/mozilla/Mutex.h:83
17 	xul.dll 	NS_ProcessNextEvent_P 	obj-firefox/xpcom/build/nsThreadUtils.cpp:238
18 	xul.dll 	mozilla::ipc::MessagePump::Run 	ipc/glue/MessagePump.cpp:82
19 	xul.dll 	_SEH_epilog4 	
20 	xul.dll 	MessageLoop::Run 	ipc/chromium/src/base/message_loop.cc:182
21 	xul.dll 	nsBaseAppShell::Run 	widget/xpwidgets/nsBaseAppShell.cpp:163
22 	xul.dll 	nsAppShell::Run 	widget/windows/nsAppShell.cpp:154
23 	xul.dll 	XREMain::XRE_mainRun 	toolkit/xre/nsAppRunner.cpp:3871
24 	xul.dll 	XREMain::XRE_main 	toolkit/xre/nsAppRunner.cpp:3938
25 	xul.dll 	XRE_main 	toolkit/xre/nsAppRunner.cpp:4141
26 	firefox.exe 	do_main 	browser/app/nsBrowserApp.cpp:224
27 	firefox.exe 	wmain 	toolkit/xre/nsWindowsWMain.cpp:105
28 	firefox.exe 	__tmainCRTStartup 	crtexe.c:552
29 	kernel32.dll 	BaseProcessStart 	

More reports at:
https://crash-stats.mozilla.com/report/list?signature=msvcr100.dll%400x8af06
Assignee: nobody → jschoenick
Keywords: topcrash
Something especially bad is happening here - we shouldn't have a plugin alive in the destructor:

http://hg.mozilla.org/releases/mozilla-beta/annotate/2e91ff229d84/content/base/src/nsObjectLoadingContent.cpp#l683
Priority: -- → P2
Re steps-wanted: Is there something worthwile in the report comments maybe?
I confirm 22.0 and 23.0 are unaffected at least with the above crash signature and have never been.
I got these links from the comments:
www.centrum.sk
twinspires.com
HOTMAIL.COM
facebook
http://ivytech.readi.info/
ebay
gmail
satramana.org 
yahoo
But none of them crashed my FF 19.0.2, Win 7
(In reply to Paul Silaghi [QA] from comment #4)
> But none of them crashed my FF 19.0.2, Win 7

We're looking at the 21 situation here.
Yeah, but the most of the crashes were reported on 19.0.2.
I'll try again on 21b2.
Actually I've tried on FF 21b1 Win XP, as stated in comment 0.
But with no success either.
(In reply to Paul Silaghi [QA] from comment #6)
> Yeah, but the most of the crashes were reported on 19.0.2.
There are no crashes with the stack trace of comment 0 in 19.0.2 or 20.0.
Paul, I found some other comments which might be good areas to test:

> Printing a pdf from a web page

> The problem happens in a website with <audio> tag, which contains two <source> tags
> one with MP3 and another with OGG file. Audio continues playing even when closing 
> the page and a Firefox crash happens 1-2 minutes later.

> It always crashes when I'm on my email, Juno

> slashdot.org often hangs with 50% cpu usage. 
> if i don't manage to close the tab, page file usage climbs over 2GB and 
> firefox eventually crashes

> This notice came up after I had closed Firefox and put the laptop to sleep for the night!

> Repeatable crash with XKCD "Time" & NoScript
QA Contact: paul.silaghi
> The problem happens in a website with <audio> tag, which contains two <source> tags
> one with MP3 and another with OGG file. Audio continues playing even when closing 
> the page and a Firefox crash happens 1-2 minutes later.

This is interesting. It's possible the commenter is wrong about the source of the continuing audio -- if it is a plugin that didn't stop when it should have, then the next GC/CC that nuked this element would hit this exact codepath.
See Also: → 856171
Comment on attachment 735462 [details] [diff] [review]
TeardownProtoChain differences 22 -> 21

@bz - When we landed bug 851378 we moved code from nsJSNPRuntime -> nsObjectLoadingContent, the content of which changed between 21 and 22. This is the relevant diff of aurora -> beta, where we go through a bunch of extra stuff to avoid |thisContent->GetWrapper|. Does any of this code seem suspect?

I also found that bug 724532 has a very similar crash that occurred in this exact code, when it lived in nsJSNPRuntime...
Attachment #735462 - Attachment description: TeardownProtoChain differences 21 -> 22 → TeardownProtoChain differences 22 -> 21
Attachment #735462 - Flags: feedback?(bzbarsky)
Comment on attachment 735462 [details] [diff] [review]
TeardownProtoChain differences 22 -> 21

Hmm.  I'd expect GetWrapper to work on 21 as well... why is it not working there?
Flags: needinfo?(jschoenick)
(In reply to Boris Zbarsky (:bz) from comment #13)
> Comment on attachment 735462 [details] [diff] [review]
> TeardownProtoChain differences 22 -> 21
> 
> Hmm.  I'd expect GetWrapper to work on 21 as well... why is it not working
> there?

It looks like you changed this to use GetWrapper in bug 827158 patch 7. If that would work as well on beta it can be uplifted... but do you see any reason why the current code would be crashing on beta?
Flags: needinfo?(jschoenick)
Ah, I see.

So given the lack of info in the stacks, my best guess is that sgo->GetGlobalJSObject() is null at this point in page teardown.  So we get a null deref there.

But generally, this setup of passing around, and doing virtual calls on, objects in the middle of their destructor is not great.  I think we should just avoid calling this method under the destructor, since if we're there we have no wrapper anyway.
Attachment #735462 - Flags: feedback?(bzbarsky)
Priority: P2 → P1
(In reply to Anthony Hughes, Mozilla QA (:ashughes) from comment #9)
> Paul, I found some other comments which might be good areas to test:
Only comments for 21.0 are applicable (see https://crash-stats.mozilla.com/report/list?version=Firefox:21.0&version=Firefox:21.0a2&signature=msvcr100.dll%400x8af06):
* Updated to new flash player
* I had just updated Adobe Flash this morning, after which the installer attempted to open a web page in Firefox, causing the crash.

(In reply to Scoobidiver from comment #0)
> It's currently #21 top browser crasher in 21.0b1.
It's now #10 top browser crasher in 21.0b2 maybe related to the recent release of Flash 11.7.
In case it's not clear, these are crashes calling the MSVC pure-virtual handler, not null-derefs.

We're operating on an already-deleted nsObjectLoadingContent `this` here: http://hg.mozilla.org/releases/mozilla-beta/annotate/c4dfe07f855c/content/base/src/nsObjectLoadingContent.cpp#l2768

We could probably make this a lot more reproducible by poisoning nsObjectLoadingContent.

If we *do* get here under the destructor, we're going to call NPP_Destroy, which not only is likely to recreate the wrapper for the object we're deleting (Flash almost always scripts its object during destruction), but it can potentially spin the event loop and do all sorts of other weird things.

I definitely agree that it should be an invariant that we should rip out http://hg.mozilla.org/releases/mozilla-beta/annotate/c4dfe07f855c/content/base/src/nsObjectLoadingContent.cpp#l684, and perhaps poke around for bugs filed with that assertion...  I think I may have seen at least one recently. I might even support making that at NS_RUNTIMEABORT (on trunk, not beta) to measure if we're hitting it a lot.
Attached file Testcase (obsolete) —
Okay! This horrible mess causes the crash. Requires the nptest plugin. Patch in progress.
Attachment #735462 - Attachment is obsolete: true
What's happening:

- A plugin runs script/events when it is instantiating
- During that script the document is torn down, going inactive and removing the plugin
- The plugin misses both the UnbindFromTree and NotifyOwnerDocumentActivityChanged chances to despawn, because it's nested inside InstantiatePluginInstance call and hasn't set mInstanceOwner
- After returning from Instantiate mType == eType_Null, but mInstanceOwner is set. The node has no references left.
- The destructor is called, sees mInstanceOwner, asserts, and tries to stop the plugin.

This worked in 20, but in 21 StopPluginInstance chain calls TeardownProtoChain which is pure virtual at this point. In 22+ we also check for despawning plugins when we remove the frame due to bug 784131, but I suspect it can be triggered there if we nuke the frame inside the callback.
Attachment #738624 - Attachment is obsolete: true
This happens on all platforms for 21+, but the complicated steps necessary only seem to be happening on the Adobe Flash update page for windows users only, and missing the step needed to crash 22/23, hence the lack of reports.
OS: Windows XP → All
Hardware: x86 → All
removing a couple keywords since we have a reduced test case.
What are the crash ID/signature in the trunk?
(In reply to Scoobidiver from comment #23)
> What are the crash ID/signature in the trunk?

Actually I wasn't paying enough attention here, the testcase runs into bug 854082 on trunk, but I believe it would hit this if it survived long enough -- so that's probably masking the crashes. I'm working on fixes for both, so I'll verify that this is hit on trunk w/854082 fixed. (The test case is actually very similar to the one on 854082)

https://crash-stats.mozilla.com/report/index/5cc05431-046e-4c77-979b-ab2bd2130417
See Also: 856171
Track if we enter into a new load during instantiate, and delete the dangling
plugin if so. This is ugly because DoStopPlugin() touches protochains and tries
to do (broken) delayed stop magic, but bug 767635 / bug 818785 would be better for addressing that on trunk.
Who is going to review the patch?
So the testcase here does not trigger the issue if OOP plugins are enabled, which would make sense if the majority of these crashes are XP. I did find at least one crash report from Vista, but it's possible that a very precisely timed plugin crash could cause this, or possibly a re-entry from layout, both of which my patch would fix.

I'm going to spin out the patch into its own issue, and leave this bug about the XP crashes.

@Scoobidiver - can you tell how often these crashes are happening on Vista+, where OOP plugins are forced? If they're common, there might be another vector causing this :(
Status: NEW → ASSIGNED
Flags: needinfo?(scoobidiver)
Opened bug 863792 for the re-entrance issue, so we can use this to track specific crashes on 21/XP (whereas the root issue affects all branches/platforms)
Depends on: 863792
OS: All → Windows XP
Hardware: All → x86
Comment on attachment 738821 [details] [diff] [review]
Handle re-entry during plugin instantiation

Moving to bug 863792
Attachment #738821 - Attachment is obsolete: true
Comment on attachment 738655 [details]
Reduced testcase that crashes when plugins are in-process

I suspect there's another way to trigger this with OOP plugins as well, but it's probably not worth spending hours messing with event ordering to make a test case.
Attachment #738655 - Attachment description: Reduced testcase that also crashes trunk → Reduced testcase that crashes when plugins are in-process
(In reply to John Schoenick [:johns] from comment #28)
> @Scoobidiver - can you tell how often these crashes are happening on Vista+,
> where OOP plugins are forced? If they're common, there might be another
> vector causing this :(
The breakdown per OS in 21.0 Beta is the following:
Windows XP 	45.286 %
Windows 7 	42.78 %
Windows 8 	6.189 %
Windows Vista 	5.629 %
Windows Unknown 0.116 %
Several comments still talk about updating Flash.
Flags: needinfo?(scoobidiver)
(In reply to Scoobidiver from comment #32)
> Several comments still talk about updating Flash.

Do we have any way of knowing which version of Flash users are updating from/to?
I suspect |navigator.plugins.refresh(true)| is called by the flash update page, possibly while inside InstantiatePluginInstance. There are a few other bugs with this call I discovered while looking at this I need to file, but i didn't find a concrete way to use this to cause this exact crash.

I'm about 80% certain the patch on bug 863792 covers these cases, but we should get it landed ASAP to make sure that it solves the problem.
(In reply to Anthony Hughes, Mozilla QA (:ashughes) from comment #33)
> Do we have any way of knowing which version of Flash users are updating
> from/to?
Comments were written between April 10 and 15 so I assume it's an update from Flash 11.6.602.180 to 11.7.700.169 (see http://www.adobe.com/support/security/bulletins/apsb13-11.html).
Tracking note: Bug 863792 will fix this and is nom'd for 21
Mozilla/5.0 (Windows NT 5.1; rv:20.0) Gecko/20100101 Firefox/20.0
Build ID: 20130409194949

I could not reproduce the crash on Firefox 20.0.1. I updated several flash versions (pave over, install/uninstall, PFS) to the latest one (Flash 11.7.700.169) while Firefox was running and Flash content was loaded.

I updated the following Flash versions:
10.1.102.64
10.2.159.1
10.3.183.15
11.0.1.152
11.3.300.273
11.4.402.265
11.5.502.149
11.6.602.168
11.6.602.171
11.6.602.180.
(In reply to Simona B [QA] from comment #37)
> I could not reproduce the crash on Firefox 20.0.1.
Firefox 20 is unaffected. See the tracking flags.
(In reply to Simona B [QA] from comment #37)
> Mozilla/5.0 (Windows NT 5.1; rv:20.0) Gecko/20100101 Firefox/20.0
> Build ID: 20130409194949
> 
> I could not reproduce the crash on Firefox 20.0.1. I updated several flash
> versions (pave over, install/uninstall, PFS) to the latest one (Flash
> 11.7.700.169) while Firefox was running and Flash content was loaded.
> 

I'm not sure exactly how you updated Flash Player while content was running. I think by default it asks you to close your browser when you install in interactive (UI) mode. One of the most common install vectors these days is the silent background install that you can trigger like so:
1. Install old version of Flash Player, start browser and load Flash content.
2. Open command line and install the latest Flash Player like so:
     <installer-name>.exe -install -iv 9

The next time that a SWF is loaded, the old version of Flash Player detects the new version on the system and will attempt to do a navigator.plugins.refresh(false).

If this is how you did it, please ignore my comment.
(In reply to Stephen Pohl [:spohl] from comment #39)
> (In reply to Simona B [QA] from comment #37)
> > Mozilla/5.0 (Windows NT 5.1; rv:20.0) Gecko/20100101 Firefox/20.0
> > Build ID: 20130409194949
> > 
> > I could not reproduce the crash on Firefox 20.0.1. I updated several flash
> > versions (pave over, install/uninstall, PFS) to the latest one (Flash
> > 11.7.700.169) while Firefox was running and Flash content was loaded.
> > 
> 
> I'm not sure exactly how you updated Flash Player while content was running.

Another way to do it:
1. Install an older Flash version (for e.g 11.6.602.180)
2. Launch Firefox and play any video that is using Flash
3. Download the latest Flash installer
4. Run the installer and at the end click on the Finish button
5. Go to the tab where the video is playing, refresh it and check for the used Flash version.
(In reply to Scoobidiver from comment #38)
> (In reply to Simona B [QA] from comment #37)
> > I could not reproduce the crash on Firefox 20.0.1.
> Firefox 20 is unaffected. See the tracking flags.

Why is Firefox 20.0.1 accounting for nearly 42% of the crashes here then:
https://crash-stats.mozilla.com/report/list?signature=msvcr100.dll%400x8af06

Am I reading this wrong?
(In reply to Scoobidiver from comment #38)
> (In reply to Simona B [QA] from comment #37)
> > I could not reproduce the crash on Firefox 20.0.1.
> Firefox 20 is unaffected. See the tracking flags.

Is there a way to make sure that such a bug is not reproducible on Firefox 20 when there are so many crash reports for that version?
(In reply to Anthony Hughes, Mozilla QA (:ashughes) from comment #41)
> Why is Firefox 20.0.1 accounting for nearly 42% of the crashes here then:
> https://crash-stats.mozilla.com/report/list?signature=msvcr100.dll%400x8af06

This is for all crashes at that address in that Microsoft DLL. They all end in msvcr100.dll@0x8af06, but the code that triggers it isn't the same.

Looking through a few sample for Fx20, nsObjectLoadingContent::DoStopPlugin doesn't appear in the stacks.
Thanks for clarifying. Given this has a minimized testcase and is assigned to :johns does this need reproducing or regressing further?
(In reply to Anthony Hughes, Mozilla QA (:ashughes) from comment #44)
> Thanks for clarifying. Given this has a minimized testcase and is assigned
> to :johns does this need reproducing or regressing further?

STR on a real site would be helpful to verify that my testcase matches what is actually happening. Bug 863792 is landing with a test today, though, so if the crashes stop it's a good bet that its test is sufficient.
(In reply to Stephen Pohl [:spohl] from comment #39)
> I'm not sure exactly how you updated Flash Player while content was running.
> I think by default it asks you to close your browser when you install in
> interactive (UI) mode
Newer versions can be installed while content is running
(In reply to Paul Silaghi [QA] from comment #46)
> (In reply to Stephen Pohl [:spohl] from comment #39)
> > I'm not sure exactly how you updated Flash Player while content was running.
> > I think by default it asks you to close your browser when you install in
> > interactive (UI) mode
> Newer versions can be installed while content is running

I should have given a bit more detail: The application that is downloaded from Adobe's website to install Flash Player is a wrapper. It wraps the actual Flash Player installer executable and executes it silently with arguments similar to the ones given in comment 39, just that the number after '-iv' will be different.

Since the actual installer executable takes a different execution path depending on the number after -iv, I was trying to point out that the most frequently executed option is '-iv 9'. 9 is the install vector for the Flash Player Background Updater[1]. Since the crash seems difficult to reproduce, it may be that it only happens for specific install vectors. Just a well-intentioned guess, no guarantees. :-)

If you want to give the different install vectors a shot make sure to download the offline installer[2], not the wrapper.

[1] http://www.adobe.com/devnet/flashplayer/articles/background-updater-windows.html
[2] http://download.macromedia.com/pub/flashplayer/current/support/install_flash_player.exe
Mozilla/5.0 (Windows NT 5.1; rv:21.0) Gecko/20100101 Firefox/21.0
Build ID: 20130423212553

I couldn't crash Firefox 21 beta 4 either - this time the update was done in the background using the guidance from Comment 39 and 47, while Flash Player content was running.

Updated the following Flash versions:
- 11.4.402.287
- 11.5.502.135
- 11.6.602.168
- 11.6.602.171
- 11.6.602.180
Bug 863792 is on nightly+aurora and might be uplifted to beta, which we're moderately sure would stop this crash.

Another alternative for fixing this would be backing out bug 851378 from beta, and disabling the new plugin-crashed UI (which is broken without 851378)
(In reply to John Schoenick [:johns] from comment #49)
> Bug 863792 is on nightly+aurora and might be uplifted to beta, which we're
> moderately sure would stop this crash.
> 
> Another alternative for fixing this would be backing out bug 851378 from
> beta, and disabling the new plugin-crashed UI (which is broken without
> 851378)

I am leaning towards the alternative here which is definitely a less risky option for Fx 21 as https://bugzilla.mozilla.org/show_bug.cgi?id=863792#c16 sounds rather riskier . WE discovered this top-crasher two weeks back given Bug 851378 almost landed a month back. So it may happen that the fallout's from the forward fix(even if taken)  may be worse may and not come to light before we ship Fx21.

Wanted to consider :bsmedberg's read on this before finalizing as he was not in favor of backing out the comment/URL UI. Benjamin, do we think we can apply any short term alternatives that could be applied for Fx 21 to fix Bug 843671, which was the main reason we took bug 851378 for, causing instability now.
I'm torn. The plugin URL/comment UI is pretty important for future stability work. The patch in bug 863792 looks pretty straightforward, but obviously we're dealing with plugins so the risk is medium in general.

OTOH, backing out both bug 851378 *and* bug 648675 also carried some risk.
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #51)
> I'm torn. The plugin URL/comment UI is pretty important for future stability
> work. The patch in bug 863792 looks pretty straightforward, but obviously
> we're dealing with plugins so the risk is medium in general.
> 
> OTOH, backing out both bug 851378 *and* bug 648675 also carried some risk.

Can you explain a bit more about why it would be risky to be backout bug 851378 *and* bug 648675.Wouldn't we be in the same state as FX 20 ? Was there any additional plugin code land which may depend on that ?

Also if going either ways is risky, can we do something in either approach to mitigate it here ? will have extensive QA testing or having a testday around plugin's gain more confidence in either approach.Is it easier to find out regressions/fallout's in one way or the other ?  

In addition we should also discuss what are next steps would be in case Bug 863792 does not the resolve the crashes here ? We are strictly relying on crash-stat and a sample test-case to fix the issue. The crashes do not have on aurora either to make sure they were fixed.
I just talked with bbajaj, leaving some comments here about the situation.

(In reply to bhavana bajaj [:bajaj] from comment #52)
> Can you explain a bit more about why it would be risky to be backout bug
> 851378 *and* bug 648675.Wouldn't we be in the same state as FX 20 ? Was
> there any additional plugin code land which may depend on that ?

Many of our recent plugin issues, including bug 863792, are actually old bugs that happened to spike for one reason or another. The plugin code is generally very old and touchy, so our ongoing work to improve it uncovers these latent problems. For instance this crash appeared due to bug 851378 uncovering the bad behavior in bug 863692 -- by no fault of bug 851378 specifically.

The risk then, if we backout 851378, 843671, and 648675, is that other plugin work (and work in other areas of the code that affect plugins) has landed, and we'd have a new configuration of code that, while by itself is known good, might aggravate an entirely different set of latent issues.

> Also if going either ways is risky, can we do something in either approach
> to mitigate it here ? will have extensive QA testing or having a testday
> around plugin's gain more confidence in either approach.Is it easier to find
> out regressions/fallout's in one way or the other ?  

Unfortunately many of these issues occur in only specific use cases of plugins that are hard to predict. Plugins are also used in a wide variety of places around the web, so its very difficult to come up with any specific test plan that would help narrow down issues.

> In addition we should also discuss what are next steps would be in case Bug
> 863792 does not the resolve the crashes here ? We are strictly relying on
> crash-stat and a sample test-case to fix the issue. The crashes do not have
> on aurora either to make sure they were fixed.

We don't have specific URLs or steps to reproduce outside of this testcase (which made tracking down 863692 rather hard). We can take another look at URLs and see if QA can find any specific reproduction, but otherwise we are just relying on crash stats.
Keywords: needURLs
bug 863792 was just landed on beta, let's hope this signature disappears in b5
Flags: needinfo?(kairo)
Apart from banging on the URLs and testcases called out in this bug, is there any way QA can discretely verify this or is this more speculative?
Keywords: verifyme
(In reply to Anthony Hughes, Mozilla QA (:ashughes) from comment #55)
> Apart from banging on the URLs and testcases called out in this bug, is
> there any way QA can discretely verify this or is this more speculative?

Unfortunately this is somewhat speculative. The testcase triggers this crash reliably when plugins are in-process, though I was not able to modify it to also affect OOP plugins -- the timing and event ordering ends up getting very complicated. Aside from verifying that the testcase does not crash when plugins are in-process, the only other verification we can come up with is looking for STR on live sites w/URLs and watching crash stats.
After some overnight testing we were not able to come up with a reproducible testcase of verifying this bug; thus marking as [qa-].
Keywords: verifyme
Whiteboard: [qa-]
@scoobidiver Does it look like this has stopped in 21b6? It looks like there have been no reports on that build but I'm pretty terrible at navigating crash stats
Flags: needinfo?(scoobidiver)
(In reply to John Schoenick [:johns] from comment #59)
> @scoobidiver Does it look like this has stopped in 21b6?
Yes completely (see https://crash-stats.mozilla.com/report/list?version=Firefox:21.0b6&signature=msvcr100.dll%400x8af06). There are still crashes with the msvcr100.dll@0x8af06 signature but none related to this bug.
Thus, it's fixed by bug 863792.
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Flags: needinfo?(scoobidiver)
Resolution: --- → FIXED
Flags: needinfo?(kairo)
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: