Closed Bug 715885 Opened 13 years ago Closed 12 years ago

Crash in -[ChildView draggedImage:endedAt:operation:] @ PrepareAndDispatch while dragging, in 32-bit mode, with Torbutton extension

Categories

(Core :: Widget: Cocoa, defect)

All
macOS
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla12
Tracking Status
firefox9 --- affected
firefox10 + affected
firefox11 + wontfix

People

(Reporter: scoobidiver, Assigned: smichaud)

References

Details

(4 keywords, Whiteboard: [inbound])

Crash Data

Attachments

(4 files)

It's #7 top crasher in 9.0.1 on Mac OS X.

Signature 	PrepareAndDispatch More Reports Search
UUID	e0d2f715-d8ef-4f81-a3c1-be5aa2120103
Date Processed	2012-01-03 06:50:51.693212
Uptime	4599
Last Crash	11.7 hours before submission
Install Age	9.0 hours since version was first installed.
Install Time	2012-01-03 05:47:46
Product	Firefox
Version	9.0.1
Build ID	20111220165912
Release Channel	release
OS	Mac OS X
OS Version	10.6.8 10K549
Build Architecture	x86
Build Architecture Info	family 6 model 14 stepping 8
Crash Reason	EXC_BAD_ACCESS / KERN_PROTECTION_FAILURE
Crash Address	0xd
User Comments	try to drag tab so that it's in a new window not new tab
App Notes 	
Renderers: 0x21900,0x20400

Frame 	Module 	Signature [Expand] 	Source
0 	XUL 	PrepareAndDispatch 	xptinfo.h:191
1 	XUL 	-[ChildView draggedImage:endedAt:operation:] 	widget/src/cocoa/nsChildView.mm:4554
2 	AppKit 	-[NSCoreDragManager _dragUntilMouseUp:accepted:] 	
3 	AppKit 	-[NSCoreDragManager dragImage:fromWindow:at:offset:event:pasteboard:source:slideBack:] 	
4 	AppKit 	-[NSWindow dragImage:at:offset:event:pasteboard:source:slideBack:] 	
5 	XUL 	nsDragService::InvokeDragSession 	widget/src/cocoa/nsDragService.mm:318
6 	XUL 	XUL@0xfeb0af 	
7 	XUL 	XPCWrappedNative::CallMethod 	js/src/xpconnect/src/xpcwrappednative.cpp:3150
8 	XUL 	XPC_WN_CallMethod 	js/src/xpconnect/src/xpcwrappednativejsops.cpp:1629
9 	XUL 	js::InvokeKernel 	js/src/jscntxtinlines.h:296
10 	XUL 	js::Interpret 	js/src/jsinterp.cpp:4036
11 	libSystem.B.dylib 	szone_realloc 	
12 	libSystem.B.dylib 	szone_realloc 	

More reports at:
https://crash-stats.mozilla.com/report/list?signature=PrepareAndDispatch
This isn't new -- FF versions at least as far back as 7.0.1 are also effected.

But it may have increased in volume recently.
These started occurring in substantial volume on 2011/12/16 ... but in both FF 9 and 8.0.1 at the same time.

So presumably whatever happened was an external factor.  Or maybe this is just a fault (or an artifact) in the Socorro data.
These crashes have a high correlation with QuickTime and (especially) the Torbutton extension:

100% (10/10) vs.  54% (786/1449) QuickTime
100% (10/10) vs.   1% (21/1449) {e0204bd5-9d31-402b-a99d-a6aa8ffebdca}
                      (Torbutton, https://addons.mozilla.org/addon/2275)

At the Torbutton AMO psge (listed above), it says "this add-on had been removed by its author".  At the Torbutton home page (https://www.torproject.org/torbutton/), it appears that Torbutton users are now encouraged to use the "Tor Browser Bundle" (including a bundled (forked?) version of Firefox).

But the Torbutton add-on is still available, and a new "stable" version was released on or just before 2011/12/17.

Whoever's willing and able, please install the Torbutton extension and see if you can find STR for this bug.  I'll try it later myself, when I have time.
Summary: Crash in -[ChildView draggedImage:endedAt:operation:] @ PrepareAndDispatch while dragging → Crash in -[ChildView draggedImage:endedAt:operation:] @ PrepareAndDispatch while dragging, with Torbutton extension
Version: 9 Branch → unspecified
Correlations show it happens with the latest version of TorButton:
* 20120108:    100% (16/16) vs.   3% (22/809) {e0204bd5-9d31-402b-a99d-a6aa8ffebdca} (Torbutton, https://addons.mozilla.org/addon/2275) (1.4.5.1)
*20120109:    100% (11/11) vs.   2% (22/1033) {e0204bd5-9d31-402b-a99d-a6aa8ffebdca} (Torbutton, https://addons.mozilla.org/addon/2275) (1.4.5.1)

Here is one comment:
"Firefox crashes every (!) time an image is dragged from a site and released in the same window. Reproducable: yes. Goto firefox.com, drag the Firefox-logo an drop it in the very same tab or window. Tadaa, crash! firefox-9.0.1, firefox-aurora, firefox-8, doesn't matter. System was Mac OS 10.5."
Version: unspecified → 9 Branch
Version: 9 Branch → unspecified
These crashes happen up through the aurora branch (currently FF 11), where they're the #12 topcrasher.  They don't happen on the trunk (FF 12), but that's presumably because Torbutton won't run FF 12 (the trunk).
A substantial number of crashes at PrepareAndDispatch also happen on Windows.  So we need to check whether not they're related.  But https://crash-stats.mozilla.com is currently broken, so we can't do it now.
(Following up comment #6)

Crash-stats is now back up ... but it's very difficult to tell whether or not the Windows crashes are related to this one, or even to each other.

It'll be difficult to learn more without being able to reproduce this bug.  But now that we know about the Torbutton correlation the chances of finding STR look pretty good.

Once we have STR, it should also be easier to tell whether this is our bug or Torbutton's.
STR are in comment 4.
Keywords: reproducible
> STR are in comment 4.

Those aren't yet *proven* STR.  There's a big difference.  Often (in fact usually) user comments leave out crucial steps or conditions.

If no-one beats me to it, I'll be looking for true STR later today or tomorrow.
A quick look indicates that there are in fact a *lot* of steps missing.  There's more to using Torbutton than just installing the extension.  I'll keep working on it ... but it may take a while.

Part of the problem is that the Tor instructions now assume that you're running the Tor Browser Bundle (with it's bundled copy of Firefox).
Keywords: reproducible
OK, I've now figured out the missing steps!

With these I crash easily in 32-bit mode, even with today's
mozilla-central nightly.  I don't crash in 64-bit mode (which is of
course the default on OS X 10.6 and 10.7).  This matches Socorro's
crash stats -- almost all of these crashes are in 32-bit mode. 

Full STR:

1) If need be download and install the "Vidalia Bundle" for OS X from
   https://www.torproject.org/download/download.  Once the DMG is
   downloaded, all you need to do is drag "Vidalia" to the
   Applications folder.

2) If need be run Vidalia.

3) Run FF (any version) in 32-bit mode.

4) If need be install the Torbutton extension (version 1.4.5.1) from
   https://www.torproject.org/dist/torbutton/torbutton-current.xpi.

5) By default Torbutton is disabled when you install it or restart
   Firefox -- its button (just to the left of the location bar) is
   X-ed out.  To enable it click on the button and choose "Toggle for
   status".

6) Do either of the following.	Both crash for me in 32-bit mode.

   a) Visit any page in a new tab, then drag the tab to the Desktop to
      turn it into a window.

   b) Visit any page that has an icon on it, then drag and drop the
      icon somewhere else in the same pate.
Keywords: reproducible
Attached file Gdb crash stack
Assignee: nobody → smichaud
Status: NEW → ASSIGNED
Summary: Crash in -[ChildView draggedImage:endedAt:operation:] @ PrepareAndDispatch while dragging, with Torbutton extension → Crash in -[ChildView draggedImage:endedAt:operation:] @ PrepareAndDispatch while dragging, in 32-bit mode, with Torbutton extension
Including Mike Perry, who maintains TorButton, here. Since this started occurring in FF8/9 at the same time, an add-on update likely tickled something here.
These crashes go back to FF 3.5.19, but not FF 3.0.9!

I'll download some ancient trunk nightlies to look for a regression range :-)
I've found the regression range!

firefox-2009-05-26-03-mozilla-1.9.1
firefox-2009-05-27-03-mozilla-1.9.1

Next I'll find which patch was the cause/trigger.
Here's the patch that caused/triggered this bug's crashes.  Tomorrow I'll try to figure out why.

author	Asaf Romano <aromano@mozilla.com>
	Wed May 27 02:30:38 2009 +0300 (at Wed May 27 02:30:38 2009 +0300)
changeset 25748	92b062d4764e
Bug 494795 - tabs do not tear off unless you drag them vertically out of the tab strip. r=enn,sr=roc.
Actually, as I found when I got my gdb crash stack, these crashes also happen in current trunk code.
Attached patch Possible fixSplinter Review
This is a subtle bug.  But it's definitely ours (not Torbutton's).

Here's a fix for it.  But the bug involves xptcall code, which is very
complex and about which I know very little.  So my fix may actually be
a workaround for a bug (as yet unidentified) in xptcall code.  I'll
try to find a reviewer for this patch who knows something about
xptcall code, and who can decide whether this is really an xptcall
bug, and if so whether we should try to fix it there (rather than
trying to work around it in widget code).

The reason Torbutton triggered this bug is that it hooks an
nsIDragService method (invokeDragSessionWithImage(), in
components/external-app-blocker.js), which means (among other things)
that all nsIDragService methods in -[ChildView
draggedImage:endedAt:operation:] are called via xptcall.  Both
nsIDragService and nsIDragSession methods are called in -[ChildView
draggedImage:endedAt:operation:].  The patch for bug 494795 added a
call to an nsIDragSession method from a pointer (dragService) to an
object that instantiates a non-XPCOM class that inherits from both
nsIDragService and nsIDragSession.  This is what caused all the
trouble.  I won't repeat my more detailed explanation in the patch
comment.

My patch rewrites the patch for bug 494795 to avoid confusing xptcall.

It also adds a null check to PrepareAndDispatch() in
xpcstubs_gcc_x86_unix.cpp (used in 32-bit mode), corresponding to a
null check already present in xpcstubs_x86_64_darwin.cpp (used in
64-bit mode).  That this null-check was already present in 64-bit mode
is the reason this bug's crashes only happen in 32-bit mode.

The tryservers are currently closed.  Once they reopen I'll start a
tryserver build.
Comment on attachment 588524 [details] [diff] [review]
Possible fix

The only reason I'm asking you to review this patch is that I need someone who knows xptcall, and you're a module peer :-)

Feel free to pass the review to someone else if you think that's appropriate.
Attachment #588524 - Flags: review?(benjamin)
Comment on attachment 588524 [details] [diff] [review]
Possible fix

This is a very roundabout way of saying that this line is incorrect:

nsDragService* dragService = static_cast<nsDragService *>(mDragService);

That line can only be correct if nsDragService is the only implementation of nsIDragService. It's really up to you/roc whether that should actually be true or not. If it is true, then torbutton should not be allowed to replace the default drag service, and we should mark nsIDragService as [builtinclass] (which means that JS is not allowed to implement that interface.

If torbutton *should* be allowed to replace the default implementation of nsIDragService, then the line I noted above should be removed. In any case, the big comment and the QI contortions aren't necessary (and there is no bug in xptcall).
Attachment #588524 - Flags: review?(benjamin) → review-
(In reply to comment #20)

I'm tired and am going to save thinking about the major issues until next week.

But don't you think it's a good idea to port the null-check from xpcstubs_x86_64_darwin.cpp to xpcstubs_gcc_x86_unix.cpp?
This should fix the problem. It may cause torbutton to not work properly, but I think that is the correct decision in this case.
Attachment #589918 - Flags: review?(roc)
(In reply to comment #22)

I disagree rather strongly with this.

My suggestion from comment #21 would temporarily work around this problem without breaking the Torbutton extension, while we look for a better fix.
The fix from comment 21 is patently incorrect. The null check is unnecessary when there is correct code and in every other version of xptcall is just an assertion. Here we are just enforcing an assumption our widget code already makes.
Can we land this on m-c in time for tonight's nightly (via m-i or directly)?  Like to gather enough data to evaluate this by Monday for Aurora and Beta.
Comment on attachment 589918 [details] [diff] [review]
Mark nsIDragService as a builtin class, rev. 1

Apparently bsmedberg isn't going to land his patch, so I've done it myself (to avoid messing up our testing schedule).

Landed on mozilla-inbound:
http://hg.mozilla.org/integration/mozilla-inbound/rev/d59f9cae9cd4

I still disagree with landing this patch now, for reasons I've already given.  Yes, my suggestion from comment #21 is "incorrect".  But it's also harmless and doesn't break the Torbutton extension.  It would also have given us time to try to find an equally harmless and more correct solution.
Whiteboard: [inbound]
https://hg.mozilla.org/mozilla-central/rev/d59f9cae9cd4
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla12
Please nominate for Beta 11 approval if we think that the patch for this topcrasher is sufficiently low risk and well-tested on m-c.
It's in the top 10 crashes for mac on 11b3. Not super high volume but would be nice to take in beta if the fix is low risk.
This bug's current patch (as anticipated) breaks the Torbutton
extension, but in an interesting way:  The extension is allowed to
load, but then all image-dragging stops working!

I've got a patch that unbreaks the Torbutton extension "correctly"
(and without reintroducing any crashes).  But it's rather complex, and
I doubt there will be any enthusiasm for landing it.

I'll post my new patch later this week or next week.  But in the
meantime I *don't* think the current patch should be rushed into a
release -- not unless the number of crashes grows much larger.
Using the latest nightly, I tried Steven's steps in Comment 11. The issue I am having is I cannot tear away any tabs. I am using a new profile with only the Tor button extension installed.

Build ID: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:13.0) Gecko/20120228 Firefox/13.0a1
(In reply to Steven Michaud from comment #30)
> I'll post my new patch later this week or next week.  But in the
> meantime I *don't* think the current patch should be rushed into a
> release -- not unless the number of crashes grows much larger.

Thanks for the risk evaluation - marking as wontfix for 11.
Wow. This bug is a rather impressive display of apathy fail. (Except for you, Steve. Thanks for trying).

For the record, I need to wrap the drag and drop service to prevent the OS from recording and pre-loading urls of tabs and images that are dragged. This happens on Ubuntu and Mac, and possibly also Windows 7, and causes bypass of the Tor proxy settings. Pretty critical issue for us.

If there's any other way of preventing the OS from snooping on DnD content, I'm all ears.

Otherwise Steve, I would like to land your fix for this in Tor Browser.

I will be letting Torbutton continue to break DnD as a result. It's better than the alternative.
I missed comment 13 where my unused gmail account was Cc'd. I've stopped using that account in favor of this one.

So for my part, sorry for that and for not noticing this bug until today as a result. This new bugzilla account goes directly to my main inbox.
Thanks Steven. That patch seems to fix the drag crash bug in TBB on Mac (though now we've got fresh unrelated one).

However, I had to add in a QueryInterface to nsIDOMNSDataTransfer to fix a compile issue on MacOS, and there were of course a few minor tweaks needed to get the patch to apply to FF12 source. Our version is kept here: https://gitweb.torproject.org/torbrowser.git/blob/maint-2.2:/src/current-patches/firefox/0016-Adapt-Steven-Michaud-s-Mac-crashfix-patch-for-FF12.patch
Resolved, fixed? In 12-rel dragging functionality just completely BROKEN after installing Torbutton on Windows XP and Windows 2003 Server. TBB works ok, but "balloon" messages (after installing addons, for example - BROKEN (no background, just text)). You try to fix crash on Mac only, but do that Torbutton can usable only in TBB.
FYI: Unrelated crash definitely was unrelated. (There appear to be build issues using Xcode 4's gcc/g++ as opposed to clang).

Should we reopen this bug (or create a new one) to make sure the patch actually gets merged in mainline? I am still holding on to the last shreds of hope that Mozilla will actually decide to care about privacy against the ad networks and Tor Browser can stop being a fork someday. As such, I'd love to see the real fix merged.
OK, I'll ask for review(s) of my patch, and see where that leads.

Even if it (or something like it) is accepted, though, it will take a while to get into a Firefox release (as do most patches).
Attachment #618644 - Flags: review?(benjamin)
Comment on attachment 618644 [details] [diff] [review]
Unbreak Torbutton extension "correctly"

None of the new methods (on nsPIDragService) are [noscript]. So why bother with the new interface, instead of just putting them on nsIDragService itself? But I really don't think that torbutton replacing the drag service is the long-term solution we want... instead we should deal with the specific issue of privacy leakage via drag events and either build that into the interface directly or have a better way of filtering that activity.
Attachment #618644 - Flags: review?(benjamin) → review-
> But I really don't think that torbutton replacing the drag service
> is the long-term solution we want... instead we should deal with the
> specific issue of privacy leakage via drag events and either build
> that into the interface directly or have a better way of filtering
> that activity.

Fair enough.  "Hooking" a service seems an unusual thing for
extensions to do (when I looked earlier, I couldn't find any other
examples).  So it's not terribly surprising that we don't support it
very well.

I don't	know if	there's	any other way for the Torbutton	extension to
do what it needs to do.	 But it's worth	trying to find one.

Mike, could you	open a new bug on this subject and CC me and Benjamin?

> None of the new methods (on nsPIDragService) are [noscript]. So why
> bother with the new interface, instead of just putting them on
> nsIDragService itself?

If we can't find any other way, would you (Benjamin) be willing to r+
a patch that just adds these new methods to nsIDragService?
XPCOM hooking is a technique employed by extensions that need to substantially alter browser functionality. It has been used for addons such as SafeCache and SafeHistory, and is still used in Torbutton for applying similar filters to prevent auto-launch of external apps. Removing the ability of extensions to hook XPCOM components would severely cripple the ability to extend Firefox in novel ways...

I've created Bug 752625 for finding another way to do this for Drag and Drop, but what about in the meantime? Is the plan really to just break Drag and Drop for all non-Tor Browser Torbutton users instead of properly fixing the crash?

I mean, we're thinking about deprecating Torbutton support for Firefox sooner or later because the toggling is really hard to keep safe, but it might be nice to have *some* way of using normal Firefox safely with Tor so long as you don't toggle...
Testing nightly version with Torbutton. Tooltips are not displayed, moving does not work. So much time has passed, but the problem was not resolved. What a shame. Firefox authors wants to kill Torbutton and Tor Browser?
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: