Bug 678637 - crash libxul (SEGV @ 0x0) at startup of ChatZilla
Status: RESOLVED FIXED
Keywords: crash, regression
Product: Core
Classification: Components
Component: Security: PSM
Version: Trunk
Platform: All All
Importance: -- critical with 1 vote
Target Milestone: ---
Assigned To: Nobody; OK to take it and work on it
Depends on: 674597
Reported: 2011-08-12 15:10 PDT by Tony Mechelynck [:tonymec]
Modified: 2011-08-18 14:50 PDT


Attachments
verbose log of Mercurial changesets in ancestors(bad) but not in ancestors(good) (28.31 KB, text/plain)
2011-08-12 16:53 PDT, Tony Mechelynck [:tonymec]
the same verbose log, with long lines reformatted (31.11 KB, text/plain)
2011-08-12 18:12 PDT, Tony Mechelynck [:tonymec]
cZ customizations (testcase) (3.65 KB, application/octet-stream)
2011-08-13 06:29 PDT, Tony Mechelynck [:tonymec]
Stack trace for abort from bug 674597 (3.94 KB, text/plain)
2011-08-17 15:28 PDT, James Ross

Description Tony Mechelynck [:tonymec] 2011-08-12 15:10:46 PDT
Mozilla/5.0 (X11; Linux x86_64; rv:8.0a1) Gecko/20110812 Firefox/8.0a1 SeaMonkey/2.5a1 ID:20110812003110

ChatZilla 0.9.87

This crash did not happen yesterday, with the previous SeaMonkey nightly.

I got it four times in succession, at the very start of startup, with exactly identical stacks:
bp-16f28731-aacc-40ef-bc46-c6f552110812
bp-6d158edf-ce0b-4d7a-8b58-090012110812
bp-f81ccbec-24b4-43c8-93f5-654682110812
bp-82e8b4f3-b92d-45de-b081-2adde2110812

Then I started only the mailer (with -mail on the command-line: didn't crash), unset ChatZilla in the windows to be opened at startup (in "Appearance" preferences), shut down Sm & started again, and the browser & mailer started up with no crash.

After I report this bug, I'll try to start ChatZilla from an already running SeaMonkey, to see if it crashes.

Here is the Socorro data from the first of these identical crashes:
Signature	libxul.so@0xebc281
UUID	16f28731-aacc-40ef-bc46-c6f552110812
Date Processed	2011-08-12 11:17:29.610672
Uptime	35
Last Crash	15.7 hours before submission
Install Age	35 seconds since version was first installed.
Install Time	2011-08-12 18:16:24
Product	SeaMonkey
Version	2.5a1
Build ID	20110812003110
Release Channel	nightly
Branch	8
OS	Linux
OS Version	0.0.0 Linux 2.6.37.6-0.7-desktop #1 SMP PREEMPT 2011-07-21 02:17:24 +0200 x86_64
CPU	amd64
CPU Info	family 15 model 4 stepping 1
Crash Reason	SIGSEGV
Crash Address	0x0
User Comments	starting up
Processor Notes 	
EMCheckCompatibility	False
Winsock LSP	
Adapter Vendor ID	
Adapter Device ID	
Crashing Thread
Frame 	Module 	Signature 	Source
0 	libxul.so 	libxul.so@0xebc281 	
1 	libxul.so 	libxul.so@0xeb840f 	
2 	libnspr4.so 	pt_PostNotifies 	nsprpub/pr/src/pthreads/ptsynch.c:138
Comment 1 Tony Mechelynck [:tonymec] 2011-08-12 15:26:28 PDT
(In reply to Tony Mechelynck [:tonymec] from comment #0)
[...]
> After I report this bug, I'll try to start ChatZilla from an already running
> SeaMonkey, to see if it crashes.
[...]
It did: bp-5a596f7e-127a-48c1-989e-7ea742110812, again with the same 3-frame crash stack, and again with a SEGV at address zero.
Comment 2 James Ross 2011-08-12 15:38:42 PDT
So opening CZ is triggering the crash, no matter when you open CZ?

In any case, I don't see how this belongs in the CZ component; we have no native code. :)
Comment 3 Tony Mechelynck [:tonymec] 2011-08-12 15:43:08 PDT
Today's nightly of SeaMonkey (bad) was built from http://hg.mozilla.org/mozilla-central/rev/f262c389193e

Yesterday's (good) from http://hg.mozilla.org/mozilla-central/rev/7871abb0e291

I don't know the comm-central changesets.
Comment 4 Tony Mechelynck [:tonymec] 2011-08-12 15:46:24 PDT
In reply to comment #2: yes, starting cZ triggers the crash.

If you have the least idea where this bug belongs, feel free to move it. I filed it in cZ because:
ChatZilla => crash
no ChatZilla => no crash.
Comment 5 Tony Mechelynck [:tonymec] 2011-08-12 16:01:48 PDT
Mercurial changelog on mozilla-central: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=7871abb0e291&tochange=f262c389193e
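
The pushlog link above follows a simple pattern: the "good" changeset goes in fromchange and the "bad" one in tochange. A minimal sketch for building such a link from two changeset IDs; the function name pushlog_url is only an illustration, and the URL layout is copied from the link in this comment rather than from any documented API:

```python
from urllib.parse import urlencode

# Base URL as it appears in the link above.
PUSHLOG = "https://hg.mozilla.org/mozilla-central/pushloghtml"


def pushlog_url(good: str, bad: str) -> str:
    """Return a pushlog URL covering the pushes between `good` and `bad`."""
    query = urlencode({"fromchange": good, "tochange": bad})
    return f"{PUSHLOG}?{query}"


if __name__ == "__main__":
    # Changesets of the good and bad nightlies from comment #3.
    print(pushlog_url("7871abb0e291", "f262c389193e"))
```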
Comment 6 Tony Mechelynck [:tonymec] 2011-08-12 16:09:42 PDT
The crash happens after opening the network tabs but before opening the autostart channel and query tabs.
Comment 7 Tony Mechelynck [:tonymec] 2011-08-12 16:53:53 PDT
Created attachment 552800 [details]
verbose log of Mercurial changesets in ancestors(bad) but not in ancestors(good)

These are the mozilla-central changesets generated (with verbose on) on the mozilla-central repo by "hg glog -r 'ancestors(f262c389193e) - ancestors(7871abb0e291)'" i.e. the verbose graph of changesets in the bad build not present in the good build.

I'm not sure how to determine the corresponding comm-central changesets but maybe they are not relevant for ChatZilla.
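
For readers unfamiliar with the revset used above, here is a rough sketch of what "ancestors(bad) - ancestors(good)" computes, written as a set difference over a toy history. The graph and its changeset names are invented for illustration only; they are not the real mozilla-central history, and this is not how Mercurial implements revsets.

```python
def ancestors(dag, rev):
    """Return `rev` plus every changeset reachable through parent links."""
    seen = set()
    stack = [rev]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(dag.get(node, ()))
    return seen


def regression_range(dag, good, bad):
    """Changesets that are in the bad build but not in the good build."""
    return ancestors(dag, bad) - ancestors(dag, good)


# Hypothetical toy history, mapping each changeset to its parents.
toy_dag = {
    "good": ["base"],
    "a": ["good"],
    "b": ["base"],       # landed on a side branch, merged later
    "merge": ["a", "b"],
    "bad": ["merge"],
}

if __name__ == "__main__":
    print(sorted(regression_range(toy_dag, "good", "bad")))
```

Note that "b" is in the range even though it branches off before "good": anything merged in after the good build counts, which is why the real list is so long.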
Comment 8 Tony Mechelynck [:tonymec] 2011-08-12 18:12:01 PDT
Created attachment 552812 [details]
the same verbose log, with long lines reformatted

the same verbose log, with long lines reformatted to no more than 100 characters in the hope (not always vindicated) of making the result more legible.

Maybe this inspires some of you developers, me it doesn't. I'm now gonna try and test some hourlies in the hope of narrowing the regression range.
Comment 9 Tony Mechelynck [:tonymec] 2011-08-12 18:19:40 PDT
haha. No more hourlies available that are earlier than the bad build.
Comment 10 Tony Mechelynck [:tonymec] 2011-08-12 21:13:50 PDT
Mozilla/5.0 (X11; Linux x86_64; rv:8.0a1) Gecko/20110811 Firefox/8.0a1 SeaMonkey/2.5a1 ID:20110811003048
Built from http://hg.mozilla.org/mozilla-central/rev/7871abb0e291

After reinstalling this build, ChatZilla can again be started with no crash.

Since it's the same version of cZ that crashes in the next nightly and not in this one, I suppose the bug must be elsewhere.

Here is the list of bugs which were FIXED on mozilla-central between the "good" build and the "bad" one: https://bugzilla.mozilla.org/buglist.cgi?cmdtype=dorem&remaction=run&namedcmd=bug678637&sharer_id=104443.

For comm-central, only two changesets are in range:
http://hg.mozilla.org/comm-central/rev/8f7fe249332e - bug 678356
http://hg.mozilla.org/comm-central/rev/a4076abb21b7 - bug 644567
and at first glance they don't look very likely to be the cause of the crash.
Comment 11 James Ross 2011-08-13 02:22:13 PDT
Have you tried before and after Firefox builds? It seems unlikely to be a comm-central-specific change that started the crashing.

According to the crash reports, the main thread is waiting on something and it is another thread which has crashed. What's really needed is a symboled stack trace, so people can see where libxul is actually crashing. Which I guess means someone needs to make a debug build.
Comment 12 Tony Mechelynck [:tonymec] 2011-08-13 04:48:02 PDT
(In reply to James Ross from comment #11)
> Have you tried before and after Firefox builds? It seems unlikely to be a
> comm-central-specific change that started the crashing.
> 
> According to the crash reports, the main thread is waiting on something and
> it is another thread which has crashed. What's really needed is a symboled
> stack trace, so people can see where libxul is actually crashing. Which I
> guess means someone needs to make a debug build.

I'll try some Nightly nightlies, or if you can request a try-build with debug, I can test that.
Comment 13 Tony Mechelynck [:tonymec] 2011-08-13 05:08:53 PDT
Mozilla/5.0 (X11; Linux x86_64; rv:8.0a1) Gecko/20110812 Firefox/8.0a1 ID:20110812030744
Built from http://hg.mozilla.org/mozilla-central/rev/f262c389193e

ChatZilla did not crash when started up; however, it has far fewer startup actions (no autoconnects, no scripts, no auto-joins or -queries) than my usual SeaMonkey chatZilla, and it is also a much simpler browser profile (much more like a new profile: fewer extensions, fewer tabs, fewer "user set" prefs in about:config). Now I'm making the cZ prefs in that Firefox profile more like what I use in SeaMonkey.
Comment 14 Tony Mechelynck [:tonymec] 2011-08-13 06:21:06 PDT
After adding "General" cZ prefs including a few auto-connected networks, but not yet the auto-join and auto-queries for these networks, Firefox crashes on restart of ChatZilla. My next task (after this comment) will be to prepare a .tar.gz of chatZilla prefs (from prefs.js), scripts and plugins so I can attach a testcase to this bug. Apparently the Firefox prefs.js wasn't saved so I include the "extensions.irc.*" prefs (except the autojoins and autoqueries by network, which I hadn't yet added) from my SeaMonkey prefs.js.

Firefox crash: bp-fb685e98-428c-49cc-a193-657a52110813
The stack is different, and longer; but the immediate cause is still a SEGV fault at address zero.
Here comes the Socorro report:
Signature	nsNSSSocketInfo::EnsureDocShellDependentStuffKnown
UUID	fb685e98-428c-49cc-a193-657a52110813
Date Processed	2011-08-13 05:56:45.408120
Uptime	3753
Last Crash	8.1 weeks before submission
Install Age	1.1 hours since version was first installed.
Install Time	2011-08-13 11:50:01
Product	Firefox
Version	8.0a1
Build ID	20110812030744
Release Channel	nightly
Branch	2.2
OS	Linux
OS Version	0.0.0 Linux 2.6.37.6-0.7-desktop #1 SMP PREEMPT 2011-07-21 02:17:24 +0200 x86_64
CPU	amd64
CPU Info	family 15 model 4 stepping 1
Crash Reason	SIGSEGV
Crash Address	0x0
User Comments	at chatzilla startup
App Notes 	OpenGL: Mesa Project -- Software Rasterizer -- 1.4 (2.1 Mesa 7.10.2) -- texture_from_pixmap
WebGL? WebGL-
Processor Notes 	
EMCheckCompatibility	False
Winsock LSP	
Adapter Vendor ID	
Adapter Device ID	
Related Bugs

    636810 NEW crash [@ nsNSSSocketInfo::EnsureDocShellDependentStuffKnown] (also intermittently during tests/security/ssl/mixedcontent/test_bug383369.html)

Crashing Thread
Frame 	Module 	Signature 	Source
0 	libxul.so 	nsNSSSocketInfo::EnsureDocShellDependentStuffKnown 	nsIInterfaceRequestorUtils.h:55
1 	libxul.so 	nsNSSSocketInfo::GetPreviousCert 	security/manager/ssl/src/nsNSSIOLayer.cpp:849
2 	libxul.so 	HandshakeCallback 	security/manager/ssl/src/nsNSSCallbacks.cpp:924
3 	libssl3.so 	ssl3_HandleFinished 	ssl3con.c:8501
4 	libssl3.so 	ssl3_HandleHandshakeMessage 	ssl3con.c:8657
5 	libssl3.so 	ssl3_HandleRecord 	ssl3con.c:8725
6 	libssl3.so 	ssl3_GatherCompleteHandshake 	ssl3gthr.c:209
7 	libssl3.so 	ssl_GatherRecord1stHandshake 	sslcon.c:1258
8 	libssl3.so 	ssl_Do1stHandshake 	sslsecur.c:151
9 	libssl3.so 	ssl_SecureSend 	sslsecur.c:1222
10 	libssl3.so 	ssl_Write 	sslsock.c:1659
11 	libxul.so 	nsSSLThread::Run 	security/manager/ssl/src/nsSSLThread.cpp:1047
12 	libnspr4.so 	_pt_root 	nsprpub/pr/src/pthreads/ptthread.c:187
13 	libpthread-2.11.3.so 	libpthread-2.11.3.so@0x6a3e
Comment 15 Tony Mechelynck [:tonymec] 2011-08-13 06:29:36 PDT
Created attachment 552869 [details]
cZ customizations (testcase)

This is a .tar.gz archive containing the following:
czcrash/
czcrash/chatzilla/
czcrash/chatzilla/scripts/       the installed scripts
czcrash/chatzilla/scripts/joinint/
czcrash/chatzilla/scripts/joinint/init.js
czcrash/chatzilla/scripts/away-marker/
czcrash/chatzilla/scripts/away-marker/init.js
czcrash/chatzilla/scripts/ctcp-notification/
czcrash/chatzilla/scripts/ctcp-notification/init.js
czcrash/chatzilla/networks.txt   the customized network definitions
czcrash/prefs.js                 the possibly relevant extensions.irc.* prefs

czcrash/chatzilla/ can be merged with your <profile>/chatzilla/ (you might not have customized scripts & networks at all, in which case there will be no conflicts).
czcrash/prefs.js has to be merged with your <profile>/prefs.js while the application (Firefox or SeaMonkey) is not running.
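
A possible way to do the prefs.js merge by script rather than by hand, as a sketch only. It assumes every pref sits on a single line of the form user_pref("name", value); and it copies only names starting with "extensions.irc.". The function names and the pref names in the usage example are hypothetical. It should be run only while the application is closed, and on a backup copy first; lines that are not user_pref() calls (such as comments) are dropped from the output.

```python
import re

# Matches one single-line pref: user_pref("name", value);
PREF_RE = re.compile(r'^user_pref\("([^"]+)",\s*(.*)\);\s*$')


def parse_prefs(text):
    """Map pref name -> raw value text for each single-line user_pref() call."""
    prefs = {}
    for line in text.splitlines():
        match = PREF_RE.match(line.strip())
        if match:
            prefs[match.group(1)] = match.group(2)
    return prefs


def merge_prefs(profile_text, testcase_text, prefix="extensions.irc."):
    """Overlay the testcase's `prefix` prefs on the profile's prefs."""
    merged = parse_prefs(profile_text)
    for name, value in parse_prefs(testcase_text).items():
        if name.startswith(prefix):
            merged[name] = value
    lines = [f'user_pref("{name}", {value});' for name, value in sorted(merged.items())]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical contents, for illustration only.
    profile = 'user_pref("browser.example", 1);\nuser_pref("extensions.irc.demo", "old");\n'
    testcase = 'user_pref("extensions.irc.demo", "new");\n'
    print(merge_prefs(profile, testcase), end="")
```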
Comment 16 Tony Mechelynck [:tonymec] 2011-08-16 23:59:58 PDT
Silver: I see a lot of "JS Engine" patches between the "good" and "bad" builds. Any idea which (if any) of them could cause a crash at cZ startup? FWIW, I have a custom networks.txt, autoconnected networks, and some "well-known" scripts: the whole of that seems to be enough to produce the crash. See the attachments for details.
Comment 17 James Ross 2011-08-17 13:14:02 PDT
Okay so .tar.gz is just irritating but luckily looking at the stack has been enough to trivially reproduce the issue on my own build: connect to ircs://moznet. BOOM.

There's at least one cz: logged error related to SSL too.

I'll see what I can narrow down here.
Comment 18 James Ross 2011-08-17 15:28:01 PDT
There is an abort added by "74214 (be91fb29d950) Bug 674597 - abort if attempting to create an xpcom proxy for wrapped JS (r=bsmedberg)" for when GetProxyForObject is called on a wrapped JS object. The NSS SSL code is smacking into that.

I imagine this relates to the CZ code setting transport.securityCallbacks = new BadCertHandler(), judging by the QI going on and the code around.
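
To make the suspected interaction easier to follow, here is a toy model of it in Python. This is only an analogy under loud assumptions: the real code is C++ inside libxul, it aborts the process rather than raising an exception, and every class and function name below is a hypothetical stand-in rather than the real XPCOM API.

```python
class WrappedJS:
    """Stand-in for an XPCOM object implemented in JavaScript (hypothetical)."""

    def __init__(self, name):
        self.name = name


class NativeObject:
    """Stand-in for an XPCOM object implemented in native code (hypothetical)."""


class ProxyAbort(RuntimeError):
    """Models the abort; the real build terminates instead of raising."""


def get_proxy_for_object(obj):
    """Toy analogue of the new check: refuse to proxy wrapped-JS objects."""
    if isinstance(obj, WrappedJS):
        raise ProxyAbort(f"refusing to create a proxy for wrapped JS: {obj.name}")
    return ("proxy", obj)


def ssl_handshake_callback(security_callbacks):
    """Models the SSL thread asking for a proxy to the socket's callbacks."""
    return get_proxy_for_object(security_callbacks)


if __name__ == "__main__":
    # ChatZilla-style setup: the callbacks object is implemented in JS.
    callbacks = WrappedJS("BadCertHandler")
    try:
        ssl_handshake_callback(callbacks)
    except ProxyAbort as exc:
        print("abort:", exc)
```

In this model a native callbacks object passes through untouched, which would match the observation that only setups with a JS-implemented handler on an SSL connection hit the crash.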
Comment 19 James Ross 2011-08-17 15:28:39 PDT
Created attachment 553928 [details]
Stack trace for abort from bug 674597
Comment 20 James Ross 2011-08-17 15:40:51 PDT
There are a load of related bugs, like bug 679140 and bug 679144, but I'm not sure exactly what relevance they have to the problem here.
Comment 21 James Ross 2011-08-17 15:43:20 PDT
I *think* this is a Core: Security: PSM issue overall; even if not, hopefully the people there know more about the whole proxies/dispatch stuff than I.
Comment 22 Luke Wagner [:luke] 2011-08-17 16:35:13 PDT
The relevant patches have been backed out and the crashes should go away in tonight's nightly or the next.
Comment 23 Tony Mechelynck [:tonymec] 2011-08-18 10:18:37 PDT
(In reply to Luke Wagner [:luke] from comment #22)
> The relevant patches have been backed out and the crashes should go away in
> tonight's nightly or the next.

Mozilla/5.0 (X11; Linux x86_64; rv:9.0a1) Gecko/20110818 Firefox/9.0a1 ID:20110818030747

I can now indeed start ChatZilla on Firefox without crashing. I'll now try it with the latest SeaMonkey nightly.
Comment 24 Tony Mechelynck [:tonymec] 2011-08-18 14:10:26 PDT
Mozilla/5.0 (X11; Linux x86_64; rv:9.0a1) Gecko/20110818 Firefox/9.0a1 SeaMonkey/2.6a1 ID:20110818003112
Built from http://hg.mozilla.org/mozilla-central/rev/8b970cb862f2

:-( Here I still crash, bp-eb5eb4bb-1a16-40b5-ae6a-294ab2110818 and bp-d73c00e7-4a9f-4e9a-bbfe-a7dff2110818

But this SeaMonkey nightly was built a few hours earlier than the Fx nightly from comment #23, which was built from http://hg.mozilla.org/mozilla-central/rev/f69a10f23bf3 at the next mozilla-inbound merge to m-c. Changeset 5f0596a0b81e (bug 674597 comment #9) lies between the two; see http://hg.mozilla.org/mozilla-central/graph/75479?revcount=120

It's too early for the 2011-08-19 nightly of SeaMonkey, but let's see if I can find an hourly... Aha! From the .txt file, I see a SeaMonkey build from http://hg.mozilla.org/mozilla-central/rev/fb919f4fa210 (slightly later than the Fx nightly), let's download it, close this one and try again...
Comment 25 Tony Mechelynck [:tonymec] 2011-08-18 14:50:08 PDT
Mozilla/5.0 (X11; Linux x86_64; rv:9.0a1) Gecko/20110818 Firefox/9.0a1 SeaMonkey/2.6a1 ID:20110818044401
Built from http://hg.mozilla.org/mozilla-central/rev/fb919f4fa210

No crash.

I'm setting this bug FIXED (not WORKSFORME) because after the above (collective) detective work I think I can say with reasonable confidence that this bug was fixed at changeset 5f0596a0b81e backing out a patch for bug 674597. For the same reason I'm adding a comment in bug 674597 about an easy test (thanks Silver) at comment #17 above.
