Closed Bug 243343 Opened 20 years ago Closed 19 years ago

[gtk2] Segfault when saving large files

Categories

(Core Graveyard :: GFX: Gtk, defect)

x86
Linux
defect
Not set
critical

Tracking

(Not tracked)

RESOLVED EXPIRED

People

(Reporter: wsheets, Assigned: blizzard)

References

Details

(Keywords: regression)

Attachments

(1 file)

User-Agent:       Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8a) Gecko/20040511
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8a) Gecko/20040511

Mozilla segfaults when saving (for example) any jpeg over 73612 bytes in size.
(I have also seen the segfault occasionally when saving html, but using a jpeg
will reproduce the segfault every time.)

This began sometime in the past week (1st week in May 2004).  If someone can
confirm this bug for me I will be happy to track down the exact date/commit
which causes this bug.

Reproducible: Always
Steps to Reproduce:
1. Load a jpg image (> 73612 bytes in size) from local disk or web site.
2. Use either "Save Image As" or "Save Page As" to save the loaded image to disk.


Actual Results:  
The image *is* stored to disk, but mozilla segfaults just afterwards.

Expected Results:  
No segfault.
are you using a build with talkback? if so, could you find out the talkback id?
(run components/talkback/talkback)
also, do you have an example url?
this worksforme using a screenshot I just made, size 192547 bytes, build
2004051006, gtk1 (no xft)
(In reply to comment #1)
> are you using a build with talkback? if so, could you find out the talkback id?
> (run components/talkback/talkback)

No, I'm compiling from source each day.  I can't tell from the mozilla web
site whether I can configure it to build talkback.  Is it possible?
(In reply to comment #2)
> also, do you have an example url?

Any large jpeg does it, but here's one:
http://www.nasa.gov/multimedia/imagegallery/image_feature_156.html
I notice that 'Save Page As' does not cause a segfault -- only 'Save Image As'.
(In reply to comment #4)
> No, I'm compiling from source each day.  I can't tell from the mozilla web
> site whether I can configure it to build talkback.  Is it possible?

you can't, and it wouldn't be useful (the server needs to have symbols for the
builds)

on the other hand, that means you could use gdb on your build (unless you used
--enable-strip), like this:
./mozilla -g
then type "run" at the gdb prompt
then let mozilla crash
type "bt" at the gdb prompt

paste output of that here.

(In reply to comment #5)
> http://www.nasa.gov/multimedia/imagegallery/image_feature_156.html

doesn't crash here :(
Here is the output from gdb.  Let me know if there is any other info I
can supply:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 16384 (LWP 21713)]
0x40fb295a in IM_get_input_context(_MozDrawingarea*) () from
/usr/local/lib/mozilla-1.8a/components/libwidget_gtk2.so
(gdb) bt
#0  0x40fb295a in IM_get_input_context(_MozDrawingarea*) () from
/usr/local/lib/mozilla-1.8a/components/libwidget_gtk2.so
#1  0x40fb2267 in nsWindow::IMEGetContext() () from
/usr/local/lib/mozilla-1.8a/components/libwidget_gtk2.so
#2  0x40fb1ec6 in nsWindow::IMELoseFocus() () from
/usr/local/lib/mozilla-1.8a/components/libwidget_gtk2.so
#3  0x40fb1dce in nsWindow::IMEDestroyContext() () from
/usr/local/lib/mozilla-1.8a/components/libwidget_gtk2.so
#4  0x40faa430 in nsWindow::Destroy() () from
/usr/local/lib/mozilla-1.8a/components/libwidget_gtk2.so
#5  0x414893aa in nsView::~nsView() () from
/usr/local/lib/mozilla-1.8a/components/libgklayout.so
#6  0x4148c4d3 in nsViewManager::~nsViewManager() () from
/usr/local/lib/mozilla-1.8a/components/libgklayout.so
#7  0x4148c65e in nsViewManager::Release() () from
/usr/local/lib/mozilla-1.8a/components/libgklayout.so
#8  0x08062266 in nsCOMPtr_base::~nsCOMPtr_base() ()
#9  0x412fd564 in DocumentViewerImpl::~DocumentViewerImpl() () from
/usr/local/lib/mozilla-1.8a/components/libgklayout.so
#10 0x412fcf66 in DocumentViewerImpl::Release() () from
/usr/local/lib/mozilla-1.8a/components/libgklayout.so
#11 0x08062266 in nsCOMPtr_base::~nsCOMPtr_base() ()
#12 0x414a4cb0 in GlobalWindowImpl::Close() () from
/usr/local/lib/mozilla-1.8a/components/libgklayout.so
#13 0x40a2a287 in XPTC_InvokeByIndex () from /usr/local/lib/mozilla-1.8a/libxpcom.so
#14 0x40a9efec in XPCWrappedNative::CallMethod(XPCCallContext&,
XPCWrappedNative::CallMode) ()
   from /usr/local/lib/mozilla-1.8a/components/libxpconnect.so
#15 0x40aa581e in XPC_WN_CallMethod(JSContext*, JSObject*, unsigned, long*,
long*) ()
   from /usr/local/lib/mozilla-1.8a/components/libxpconnect.so
#16 0x4004d961 in js_Invoke () from /usr/local/lib/mozilla-1.8a/libmozjs.so
#17 0x40055d5d in js_Interpret () from /usr/local/lib/mozilla-1.8a/libmozjs.so
#18 0x4004d9b6 in js_Invoke () from /usr/local/lib/mozilla-1.8a/libmozjs.so
#19 0x4004dbdf in js_InternalInvoke () from /usr/local/lib/mozilla-1.8a/libmozjs.so
#20 0x4004dd5c in js_InternalGetOrSet () from
/usr/local/lib/mozilla-1.8a/libmozjs.so
#21 0x40062bc3 in js_SetProperty () from /usr/local/lib/mozilla-1.8a/libmozjs.so
#22 0x4005511d in js_Interpret () from /usr/local/lib/mozilla-1.8a/libmozjs.so
#23 0x4004d9b6 in js_Invoke () from /usr/local/lib/mozilla-1.8a/libmozjs.so
#24 0x40a9a7d2 in nsXPCWrappedJSClass::CallMethod(nsXPCWrappedJS*, unsigned
short, nsXPTMethodInfo const*, nsXPTCMiniVariant*) () from
/usr/local/lib/mozilla-1.8a/components/libxpconnect.so
#25 0x40a95f13 in nsXPCWrappedJS::CallMethod(unsigned short, nsXPTMethodInfo
const*, nsXPTCMiniVariant*) ()
   from /usr/local/lib/mozilla-1.8a/components/libxpconnect.so
#26 0x40a2a41a in PrepareAndDispatch () from /usr/local/lib/mozilla-1.8a/libxpcom.so
#27 0x41a2bf2f in nsDownload::OnStateChange(nsIWebProgress*, nsIRequest*,
unsigned, unsigned) ()
   from /usr/local/lib/mozilla-1.8a/components/libappcomps.so
#28 0x416756f2 in nsWebBrowserPersist::OnStopRequest(nsIRequest*, nsISupports*,
unsigned) ()
   from /usr/local/lib/mozilla-1.8a/components/libembedcomponents.so
#29 0x40c9444d in nsFileChannel::OnStopRequest(nsIRequest*, nsISupports*,
unsigned) ()
   from /usr/local/lib/mozilla-1.8a/components/libnecko.so
#30 0x40c3766d in nsInputStreamPump::OnStateStop() () from
/usr/local/lib/mozilla-1.8a/components/libnecko.so
---Type <return> to continue, or q <return> to quit---./mozilla: line 184: 21703
Terminated              "$dist_bin/run-mozilla.sh" $script_args
"$dist_bin/$MOZILLA_BIN" "$@"
huh. I wonder who is to blame here :) probably widget or view manager, no idea
which of those.

are you using any IMEs?
Any chance of a build with -g (it'll be about 2GB on disk instead of the 200MB
of a normal build) so we can get some line numbers too?
(In reply to comment #8)
> huh. I wonder who is to blame here :) probably widget or view manager, no idea
> which of those.
> 
> are you using any IMEs?

Sorry, I dunno what IME means  :o(
(In reply to comment #10)
> Sorry, I dunno what IME means  :o(

input method editor. this means you probably don't :)
(They are used for entering chinese/japanese/... characters. which is about
where my knowledge of them ends. but your stack had some functions with IM in the
name which I thought might indicate that)
This seems to be gtk2-specific.. I don't see the problems in gtk1 builds, and
the stack passes through the gtk2 widget lib....
Assignee: general → blizzard
Component: Browser-General → GFX: Gtk
Depends on: 243436
QA Contact: general → ian
Summary: Segfault when saving large files → [gtk2] Segfault when saving large files
(In reply to comment #9)
> Any chance of a build with -g (it'll be about 2GB on disk instead of the 200MB
> of a normal build) so we can get some line numbers too?
Well, I'm a bit over my head here, but I tacked a -g onto the end of my
ac_add_options --enable-optimize="-O -march=athlon-xp -pipe -g"

If you intended something else please let me know...

(gdb) run
Starting program: /usr/local/lib/mozilla-1.8a/mozilla-bin
[Thread debugging using libthread_db enabled]
[New Thread 16384 (LWP 17902)]
[New Thread 32769 (LWP 17908)]
[New Thread 16386 (LWP 17909)]
Detaching after fork from child process 17910.
[New Thread 32771 (LWP 17923)]
[New Thread 49156 (LWP 17924)]
[New Thread 65541 (LWP 17925)]
[New Thread 81926 (LWP 17926)]
[New Thread 98311 (LWP 17927)]
[New Thread 114696 (LWP 17928)]
[New Thread 131081 (LWP 17929)]
[New Thread 147464 (LWP 17930)]
Detaching after fork from child process 17931.
Detaching after fork from child process 17932.
Detaching after fork from child process 17933.
Detaching after fork from child process 17934.

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 16384 (LWP 17902)]
0x404f42a9 in g_type_check_instance_cast () from /usr/lib/libgobject-2.0.so.0
(gdb) bt
#0  0x404f42a9 in g_type_check_instance_cast () from /usr/lib/libgobject-2.0.so.0
#1  0x40fb06d2 in get_gtk_widget_for_gdk_window (window=0x44776569) at
nsWindow.cpp:2972
#2  0x40fb29a2 in IM_get_input_context (aArea=0x44776569) at nsWindow.cpp:4273
#3  0x40fb22b7 in nsWindow::IMEGetContext() (this=0x44776569) at nsWindow.cpp:3981
#4  0x40fb1f16 in nsWindow::IMELoseFocus() (this=0x8cd19e8) at nsWindow.cpp:3904
#5  0x40fb1e1e in nsWindow::IMEDestroyContext() (this=0x8cd19e8) at
nsWindow.cpp:3875
#6  0x40faa480 in nsWindow::Destroy() (this=0x8cd19e8) at nsWindow.cpp:367
#7  0x414893fa in ~nsView (this=0x8cd1990) at nsView.cpp:151
#8  0x4148c523 in ~nsViewManager (this=0x8cd1840) at nsViewManager.h:350
#9  0x4148c6ae in nsViewManager::Release() (this=0x8c69688) at nsViewManager.cpp:508
#10 0x08062266 in ~nsCOMPtr_base (this=0x44776569) at nsCOMPtr.cpp:81
#11 0x412fd5b4 in ~DocumentViewerImpl (this=0x8cd0bd8) at nsPrintData.h:75
#12 0x412fcfb6 in DocumentViewerImpl::Release() (this=0x8c69688) at
nsDocumentViewer.cpp:513
#13 0x08062266 in ~nsCOMPtr_base (this=0x44776569) at nsCOMPtr.cpp:81
#14 0x414a4d00 in GlobalWindowImpl::Close() (this=0x83e5760) at
nsGlobalWindow.cpp:2153
#15 0x40a2a2e7 in XPTC_InvokeByIndex () at xptcinvoke_gcc_x86_unix.cpp:69
#16 0x40a9f03c in XPCWrappedNative::CallMethod(XPCCallContext&,
XPCWrappedNative::CallMode) (ccx=@0xbfffe120,
    mode=CALL_METHOD) at xpcwrappednative.cpp:2026
#17 0x40aa586e in XPC_WN_CallMethod(JSContext*, JSObject*, unsigned, long*,
long*) (cx=0x8282558, obj=0x44776569,
    argc=1148675433, argv=0x8c69688, vp=0x44776569) at
xpcwrappednativejsops.cpp:1287
---Type <return> to continue, or q <return> to quit---./mozilla: line 184: 17892
Killed                  "$dist_bin/run-mozilla.sh" $script_args
"$dist_bin/$MOZILLA_BIN" "$@"
> but I tacked a -g onto the end

Perfect.

It's interesting that this is crashing inside get_gtk_widget_for_gdk_window
instead of on dereferencing its return value....
(In reply to comment #15)

> It's interesting that this is crashing inside get_gtk_widget_for_gdk_window
> instead of on dereferencing its return value....

Got it!  Your mention of widgets made me think to change my Preferences for
'Downloads' from 'Open a progress dialog' to either of the other two settings:
The crashing disappears :o)

Anything I can do to narrow it down even more?  FWIW, the bug started
with the commits made in the 24 hours between May 6 and May 7, noon PST.
(In reply to comment #17)
> None of these seem gtk-specific...

Oops -- this time I'll give you the right dates :o(  May 5-6:

These are likely candidates:
U mozilla/widget/src/gtk2/mozcontainer.c
U mozilla/widget/src/gtk2/mozcontainer.h
U mozilla/widget/src/gtk2/mozdrawingarea.c
U mozilla/widget/src/gtk2/nsWindow.cpp
U mozilla/widget/src/gtk2/nsWindow.h
U mozilla/widget/src/xpwidgets/nsBaseFilePicker.cpp
Ahh, the dangers of messing with widget destruction.
(In reply to comment #19)
> ccing bryner, since that's his checkin.
> 
> Actual checkin range when this broke, per Walter's comment:
> 
> http://bonsai.mozilla.org/cvsquery.cgi?treeid=default&module=all&branch=HEAD&branchtype=match&dir=&file=&filetype=match&who=&whotype=match&sortby=Date&hours=2&date=explicit&mindate=2004-05-05+12%3A00%3A00&maxdate=2004-05-06+12%3A00%3A00&cvsroot=%2Fcvsroot

Actually it seems I was too hasty in blaming the gtk2 changes.  Since then I've
narrowed the time down to between 20:00 and 21:00 on May05, which puts it in
roc's court.  See bug 233441 comment #10.  Unfortunately this was a huge commit
and I can't even begin to debug it on my own.

Any ideas how that commit might be causing a problem with the download progress
dialog widget?
I backed out 233441 so if it was that checkin, this should be fixed now.
(In reply to comment #22)
> I backed out 233441 so if it was that checkin, this should be fixed now.

On two out of three nearly identical machines it is indeed fixed.  I just
realized this today because I work mostly on the third machine which still
has the same crash with the same backtrace in spite of freshly updated sources.

Maybe some sticky tags in the source tree on this box -- I'll see if I can
find out why.  Thanks.
I'm delighted to say that my third machine now is working properly after
re-starting the source directory from scratch.

I think this bug can be marked Resolved.  Thanks!
A heads-up unrelated to this bug, just because of the distinguished audience ;o)

I discovered this morning that I was forced to add 'disable-gnomeui' and
'disable-gnomevfs' to my .mozconfig to avoid unwanted linkage to gnome
libraries.  This is brand new today.

There were some strange new crashes due to undefined gnome-related symbols
(I'm not running gnome on these machines, but I do have some gnome libraries
installed.)

The bad part, of course, is that I didn't see the 'missing symbol' messages
until I used gdb and/or started mozilla from the command-line, which most
people don't do.

So, beware reports of strange new crashes from people (like me) who compile
from source.

Thanks!
> There were some strange new crashes due to undefined gnome-related symbols

do you have stacktraces for those crashes? please file bugs about them.
(In reply to comment #26)
> > There were some strange new crashes due to undefined gnome-related symbols
> 
> do you have stacktraces for those crashes? please file bugs about them.

No.  When I ran gdb I got messages saying 'could not find thread nnnn'
or similar.  However, the good part is that gdb did display the 'missing
symbol' message before returning to the gdb prompt, which led me to run
mozilla from a command prompt, where I got the same missing-symbol error.

I puzzled over the error for awhile before I realized that I damned-well-
should-never-get-a-missing-gnome-symbol error when I'm not running gnome!

Using ldd on the culprit library (libimglib2.so) showed multiple linkages
to gnome libraries that never were linked before today, and should not
have been linked according to my .mozconfig file.

I noticed some checkins for mozilla/configure today.  Perhaps someone more
skilled than I could figure this out.


ew, libimglib2.so? what symbols were that?
(In reply to comment #28)
> ew, libimglib2.so? what symbols were that?

This is from memory, so it may not be exact:
_Z2_gnome_icon_theme_nvw (or some three-
letter combination I can't quite remember).
Definitely involved icon_theme, however.

I'm running kde, so gnome should NOT be
involved -- except that the 'pan' news
client pulls in gnome libraries whether or
not you want them.  I would guess that the
'configure' script tests for gnome headers
and includes the gnome libraries by default,
(starting today.)  Just a guess.

Adding the disable-gnome* lines to my
.mozconfig file completely solved the
problem, in case I didn't make it clear.
Flags: blocking1.8a? → blocking1.8a-
(In reply to comment #29)
> This is from memory, so it may not be exact:
> _Z2_gnome_icon_theme_nvw (or some three-
> letter combination I can't quite remember).
> Definitely involved icon_theme, however.

ok. this should be fixed in current builds (the symbol should now be
resolvable); although it will still link to libgnomeui. this is intentional.

> I'm running kde, so gnome should NOT be
> involved

why not?

> I would guess that the
> 'configure' script tests for gnome headers
> and includes the gnome libraries by default,

indeed.

> Adding the disable-gnome* lines to my
> .mozconfig file completely solved the
> problem, in case I didn't make it clear.

that should not be needed to fix anything, is my point.
> ok. this should be fixed in current builds (the symbol should now be
> resolvable); although it will still link to libgnomeui. this is intentional.

Correct.  I deleted the 'disable-gnome*' lines and recompiled and everything
works again.  I checked and the linkage to gnome-related libraries is back
again.
 
> > I'm running kde, so gnome should NOT be involved
 
> why not?

Well, no good reason -- as long as everything works ;-)
I would love to know the reason for the gnome libraries.
What do you (we) gain from using them?
(In reply to comment #31)
> What do you (we) gain from using them?

Nice icons in download manager and the helper app dialog :)
(that's the one thing - the libgnomeui dependency)

Another thing (that bryner implemented) is using the file associations made in
gnome. (this one uses dlopen/dlsym, thus has no configure option)

And the third thing, thanks to darin, is support for gnome-vfs; that currently
means you can use smb: urls.
I've made a build with the patch from bug 233441 and I don't see this bug at all
:-(.
(In reply to comment #33)
> I've made a build with the patch from bug 233441 and I don't see this bug at all

The big checkin from May 5 that was backed out a few days later?  There were
some significant changes to gtk2 also made on May5 and those files have been
modified since then -- perhaps there was something fixed in the meantime
since your patch was backed out?

I can try applying your patch at this end if you like, and see if I still get
the crashes.
(In reply to comment #33)
> I've made a build with the patch from bug 233441 and I don't see this bug at all

Did the original patch (attachment to bug 233441) apply cleanly to recent sources?
I had 7 or 8 hunks fail so I didn't attempt to compile it.
This looks like one manifestation of a problem I noted with toplevel window
destruction on gtk2.  Basically, what happens is that nsIWidgets are created
with a _native_ parent set to the toplevel window's GdkWindow, but not using
that nsIWidget parent (this happens via nsIView::CreateWidget).  These child
nsIWidgets are then parented to the toplevel widget's mContainer.

Then, when teardown of the window happens, the toplevel window ends up being the
first thing we try to destroy.  The toplevel window calls Destroy() on any child
nsIWidgets, but this completely misses the nsIWidgets which are inserted as
described above.  When the toplevel window destroys mShell, any GdkWindows which
are owned by that toplevel window are also destroyed.

Now we come to the crash stack listed.  This is when we try to destroy one of
the nsIWidgets which is a not-quite-child of the toplevel window.  In
IM_get_input_context, aArea->inner_window is a GdkWindow that was already freed
since it was contained within the toplevel mShell.  We end up reading freed
memory (an FMR), and things go downhill from there.
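
To make the ordering problem concrete, here is a minimal standalone C++ sketch of it. Nothing in it is Mozilla or GTK code: NativeWindow, Widget and OrphanSeesFreedWindow are hypothetical stand-ins for a GdkWindow, an nsWindow and a test driver. A std::weak_ptr takes the place of the raw inner_window pointer so the stale state can be observed without actually reading freed memory.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-in for a GdkWindow; not the real GDK type.
struct NativeWindow { int id; };

// Hypothetical stand-in for an nsWindow-like widget.
struct Widget {
    std::vector<Widget*> children;                     // nsIWidget-level children
    std::vector<std::shared_ptr<NativeWindow>> owned;  // native windows owned by the shell
    std::weak_ptr<NativeWindow> inner;                 // native window this widget uses
    bool destroyed = false;

    // Models the toplevel teardown: destroy the children we know about,
    // then free every native window the shell owns.
    void Destroy() {
        assert(!destroyed);
        for (Widget* c : children) c->Destroy();
        owned.clear();
        destroyed = true;
    }

    // True while this widget still refers to a live native window.
    bool InnerWindowAlive() const { return !inner.expired(); }
};

// Builds a toplevel plus a "not-quite-child": natively parented under the
// toplevel, but absent from toplevel.children. Returns true if the orphan is
// left undestroyed and holding a dead native window after teardown.
bool OrphanSeesFreedWindow() {
    Widget toplevel;
    Widget orphan;
    auto native = std::make_shared<NativeWindow>(NativeWindow{1});
    toplevel.owned.push_back(native);
    orphan.inner = native;
    native.reset();      // the toplevel is now the sole owner
    toplevel.Destroy();  // frees the native window; never reaches the orphan
    return !orphan.destroyed && !orphan.InnerWindowAlive();
}
```

In the real code the orphan holds a raw pointer rather than a weak_ptr, so the later Destroy() call dereferences it instead of noticing that it is stale.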

I tried briefly to make the widget parenting work the way it seems like it
should (that is, the widgets are parented at the nsIWidget level as well as at
the native level) but I ended up with a blank browser window with a tiny content
area and didn't really feel like figuring out why.  Maybe roc knows.  Instead I
came up with a patch that tries harder to destroy nsWindow children when we
destroy the toplevel window, by finding the nsWindow for any drawing areas or
child MozContainers and calling Destroy on them.  It seems to fix the crashes
for me and I've seen no other problems (other than that it's ugly).
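
A rough standalone model of that approach, for illustration only. This is not the attached patch: NativeWindow, Widget, DestroyNativeDescendants and CountLiveWidgetsAfterTeardown are hypothetical names, and the sketch assumes each native window can be mapped back to the widget that wraps it (the owner pointer here).

```cpp
#include <cassert>
#include <vector>

struct Widget;

// Hypothetical stand-in for a native window that remembers which widget,
// if any, wraps it.
struct NativeWindow {
    Widget* owner = nullptr;
    std::vector<NativeWindow*> children;
};

// Hypothetical stand-in for an nsWindow-like widget.
struct Widget {
    NativeWindow* native = nullptr;
    std::vector<Widget*> children;  // nsIWidget-level children
    bool destroyed = false;

    void Destroy() {
        if (destroyed) return;  // tolerate being reached by more than one path
        destroyed = true;
        for (Widget* c : children) c->Destroy();
        // Also walk the native tree so widgets that are only natively
        // parented get destroyed before their windows go away.
        if (native) DestroyNativeDescendants(native);
        native = nullptr;
    }

    static void DestroyNativeDescendants(NativeWindow* w) {
        for (NativeWindow* child : w->children) {
            if (child->owner) {
                child->owner->Destroy();          // handles its own subtree
            } else {
                DestroyNativeDescendants(child);  // plain window: keep looking
            }
        }
    }
};

// Builds toplevel -> plain native window -> orphan's native window, with the
// orphan deliberately missing from toplevel.children. Returns how many of the
// two widgets are still undestroyed after tearing down the toplevel.
int CountLiveWidgetsAfterTeardown() {
    NativeWindow topNative, plainNative, orphanNative;
    Widget toplevel, orphan;
    toplevel.native = &topNative;
    topNative.owner = &toplevel;
    orphan.native = &orphanNative;
    orphanNative.owner = &orphan;
    topNative.children.push_back(&plainNative);
    plainNative.children.push_back(&orphanNative);
    toplevel.Destroy();
    return (toplevel.destroyed ? 0 : 1) + (orphan.destroyed ? 0 : 1);
}
```

The early return in Destroy() matters because a widget can be reached both through the nsIWidget child list and through the native walk.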
Attached patch: patch
Attachment #149322 - Flags: superreview?(blizzard)
Attachment #149322 - Flags: review?(blizzard)
(In reply to comment #36)
> I came up with a patch that...seems to fix the crashes
> for me and I've seen no other problems (other than that it's ugly).

I have seen no more of these crashes since roc backed out his
checkin of May05.  Are you still able to reproduce them with recent
sources?
Comment on attachment 149322 [details] [diff] [review]
patch

Is there any reason why we can't just use gdk_window_get_children on the
widget's window and use that to get the objects in question?
Attachment #149322 - Flags: superreview?(blizzard)
Attachment #149322 - Flags: superreview-
Attachment #149322 - Flags: review?(blizzard)
Attachment #149322 - Flags: review-
> I have seen no more of these crashes since roc backed out his
> checkin of May05.  Are you still able to reproduce them with recent
> sources?

Doesn't really matter.. there's definitely an FMR (free memory read) condition that happens on
window teardown and I thought this bug would be a convenient place to address
that.  Seemingly unrelated changes can tickle this sort of condition, as
happened with my NS_THEMECHANGED patch.

(In reply to comment #39)
> (From update of attachment 149322 [details] [diff] [review])
> Is there any reason why we can't just use gdk_window_get_children on the
> widget's window and use that to get the objects in question?
> 

Er, I thought you were the one who suggested I do this the way I did.  At any
rate, I'll look into this suggestion.
(In reply to comment #40)
> > I have seen no more of these crashes since roc backed out his
> > checkin of May05. 
> 
> Doesn't really matter.. there's definitely a FMR condition that happens on
> window teardown and I thought this bug would be a convenient place to address
> that...

I was intending to test your patch, but since I can't crash mozilla anymore
I wouldn't know if the problem was fixed or not.  Is there a way to know
for sure?
This is an automated message, with ID "auto-resolve01".

This bug has had no comments for a long time. Statistically, we have found that
bug reports that have not been confirmed by a second user after three months are
highly unlikely to be the source of a fix to the code.

While your input is very important to us, our resources are limited and so we
are asking for your help in focussing our efforts. If you can still reproduce
this problem in the latest version of the product (see below for how to obtain a
copy) or, for feature requests, if it's not present in the latest version and
you still believe we should implement it, please visit the URL of this bug
(given at the top of this mail) and add a comment to that effect, giving more
reproduction information if you have it.

If it is not a problem any longer, you need take no action. If this bug is not
changed in any way in the next two weeks, it will be automatically resolved.
Thank you for your help in this matter.

The latest beta releases can be obtained from:
Firefox:     http://www.mozilla.org/projects/firefox/
Thunderbird: http://www.mozilla.org/products/thunderbird/releases/1.5beta1.html
Seamonkey:   http://www.mozilla.org/projects/seamonkey/
This bug has been automatically resolved after a period of inactivity (see above
comment). If anyone thinks this is incorrect, they should feel free to reopen it.
Status: UNCONFIRMED → RESOLVED
Closed: 19 years ago
Resolution: --- → EXPIRED
Product: Core → Core Graveyard