Fennec crash [@ libz.so@0x14b4 ] [@ libz.so@0x14d4 ] [ @ nsACString_internal::Replace | mozilla::ipc::AsyncChannel::OnDispatchMessage ]

VERIFIED FIXED in mozilla5

Status

Core
General
--
critical
VERIFIED FIXED
7 years ago
6 years ago

People

(Reporter: Scoobidiver (away), Assigned: mwu)

Tracking

({crash, relnote, topcrash})

Trunk
mozilla5
ARM
Android
crash, relnote, topcrash
Points:
---

Firefox Tracking Flags

(blocking2.0 final+, fennec4.0.1+)

Details

(Whiteboard: [hardblocker], crash signature)

Attachments

(1 attachment, 3 obsolete attachments)

(Reporter)

Description

7 years ago
It is a new crash signature in Fennec 4.0b3.
It is #4 top crasher in Fennec 4.0b3 for the last week.

Signature	libz.so@0x14b4
UUID	b3a334be-117d-4a2f-aa64-f01ca2110114
Time 	2011-01-14 23:05:24.64066
Uptime	0
Install Age	131 seconds since version was first installed.
Product	Fennec
Version	4.0b3
Build ID	20101221205132
Branch	1.9
OS	Linux
OS Version	0.0.0 Linux 2.6.32.28-cyanogenmod #1 PREEMPT Wed Jan 12 15:08:49 CST 2011 armv7l
CPU	arm
Crash Reason	SIGSEGV
Crash Address	0xc6e0df26

Frame 	Module 	Signature 	Source
0 	libz.so 	libz.so@0x14b4 	
1 	org.mozilla.firefox-1.apk 	org.mozilla.firefox-1.apk@0x1adc00 	
2 	org.mozilla.firefox-1.apk 	org.mozilla.firefox-1.apk@0x1ade13 	
3 	plugin-container 	__gnu_unwind_pr_common 	unwind-arm.c:1225
4 	libmozalloc.so 	moz_malloc 	memory/mozalloc/mozalloc.cpp:109
5 		@0xbecf0900 	
6 	libz.so 	libz.so@0x12447 	
7 	libz.so 	libz.so@0x12447 	
8 	libz.so 	libz.so@0x132bf 	
9 	libz.so 	libz.so@0x1246d 	
10 	libxul.so 	nsACString_internal::Replace 	xpcom/string/src/nsTSubstring.cpp:488
11 	libxul.so 	nsFrameScriptExecutor::LoadFrameScriptInternal 	content/base/src/nsFrameMessageManager.cpp:659
12 	libxul.so 	mozilla::dom::TabChild::RecvLoadRemoteScript 	dom/ipc/TabChild.cpp:749
13 	libxul.so 	mozilla::dom::PBrowserChild::OnMessageReceived 	PBrowserChild.cpp:1211
14 	libxul.so 	mozilla::dom::PContentChild::OnMessageReceived 	PContentChild.cpp:949
15 	libxul.so 	mozilla::ipc::AsyncChannel::OnDispatchMessage 	ipc/glue/AsyncChannel.cpp:262
16 	libxul.so 	mozilla::ipc::RPCChannel::OnMaybeDequeueOne 	ipc/glue/RPCChannel.cpp:440
17 	libxul.so 	RunnableMethod<mozilla::ipc::RPCChannel, bool , Tuple0>::Run 	ipc/chromium/src/base/task.h:308
18 	libxul.so 	mozilla::ipc::RPCChannel::DequeueTask::Run 	RPCChannel.h:475
19 	libxul.so 	MessageLoop::RunTask 	ipc/chromium/src/base/message_loop.cc:344
20 	libxul.so 	MessageLoop::DeferOrRunPendingTask 	ipc/chromium/src/base/message_loop.cc:354
21 	libxul.so 	MessageLoop::DoWork 	ipc/chromium/src/base/message_loop.cc:451
22 	libxul.so 	mozilla::ipc::MessagePump::Run 	ipc/glue/MessagePump.cpp:115
23 	libxul.so 	mozilla::ipc::MessagePumpForChildProcess::Run 	ipc/glue/MessagePump.cpp:230
24 	libxul.so 	MessageLoop::RunInternal 	ipc/chromium/src/base/message_loop.cc:220
25 	libxul.so 	MessageLoop::Run 	ipc/chromium/src/base/message_loop.cc:512
26 	libxul.so 	nsBaseAppShell::Run 	widget/src/xpwidgets/nsBaseAppShell.cpp:198
27 	libxul.so 	XRE_RunAppShell 	toolkit/xre/nsEmbedFunctions.cpp:631
28 	libxul.so 	mozilla::ipc::MessagePumpForChildProcess::Run 	ipc/glue/MessagePump.cpp:222
29 	libxul.so 	MessageLoop::RunInternal 	ipc/chromium/src/base/message_loop.cc:220
30 	libxul.so 	MessageLoop::Run 	ipc/chromium/src/base/message_loop.cc:512
31 	libxul.so 	XRE_InitChildProcess 	toolkit/xre/nsEmbedFunctions.cpp:510
32 	libmozutils.so 	ChildProcessInit 	other-licenses/android/APKOpen.cpp:691
33 	plugin-container 	main 	ipc/app/MozillaRuntimeMainAndroid.cpp:69
34 	libc.so 	libc.so@0x14c74 	

More reports at:
http://crash-stats.mozilla.com/query/query?product=Fennec&range_value=4&range_unit=weeks&query_search=signature&query_type=startswith&query=libz.so@0x14&build_id=&process_type=any&hang_type=any&do_query=1
(Reporter)

Updated

7 years ago
tracking-fennec: --- → ?

Comment 1

7 years ago
I find it particularly strange how 779 crashes showed up in the past two days for a previously unknown signature.

Comment 2

7 years ago
Actually, the fact that it showed up on both 4.0b3 and 4.0b4pre at the same time leads me to suspect that something external changed, like a system upgrade perhaps?
(Reporter)

Comment 3

7 years ago
It is now #1 top crasher in Fennec 4.0b3 (33% of all crashes) so comment 2 is good.
I'm wondering if some set of devices got an update with a new libz that we don't get along with. Unfortunately, I don't see any crash reports with device information so we can't narrow down which devices are affected. 

I wonder if using our own libz would make this go away.
(Reporter)

Comment 5

7 years ago
Among crashes, libz.so debug identifier is mainly:
D85FCD31A7FE605564DE920A8F1117460
Sometimes: D85FCD31A7FE605564DE920A8F1167E90

Updated

7 years ago
blocking2.0: --- → ?
tracking-fennec: ? → 2.0+
Assignee: nobody → mwu
(Reporter)

Comment 6

7 years ago
First crash happened on 1/14/11 at 4:02 in 4.0b3.
First crash happened on 1/14/11 at 9:51 in 4.0b4pre.
The stack beyond the top frame is probably completely bogus, let's assign this to general for now.
blocking2.0: ? → final+
Component: IPC → General
QA Contact: ipc → general
Actually, from about frame 11 on down looks totally sane. 2-10 are definitely off in the weeds, though. If someone can find a copy of that libz.so that matches one of these crash reports, we can probably get a more sensible stack out of it (even with just export symbols, there should be enough CFI to get us to the right caller frame).
Whiteboard: [hardblocker]
(Assignee)

Comment 9

7 years ago
As far as I can tell, most if not all of the devices are using some sort of hacked firmware. All the ones that say 2.6.32.28.S10.4.OC-ga36929b-dirty, for example, are using some cyanogen gingerbread build, and a bunch more actually identify themselves as cyanogen in the kernel version string.
(Assignee)

Comment 10

7 years ago
We're crashing due to a cyanogen specific change in zlib: https://github.com/CyanogenMod/android_external_zlib/commit/e6981afc21ff7b315c945b062763db11ef231ef4
(Assignee)

Comment 11

7 years ago
As an android only issue, this probably isn't blocking2.0. (but blocking-fennec, certainly)
blocking2.0: final+ → ?
2.0 and fennec are roughly the same thing. Although given what we know, is this just a cyanogen bug found by people who aren't running production devices?
blocking2.0: ? → final+
(Assignee)

Comment 13

7 years ago
(In reply to comment #12)
> 2.0 and fennec are roughly the same thing. Although given what we know, is this
> just a cyanogen bug found by people who aren't running production devices?

This is a cyanogen bug found by people running production devices. Cyanogen however, is a non-stock non-production Android build/firmware that features a great number of non-upstream changes. One of them is a change to zlib which crashes us. Apparently, a large number of our nightly users like to run cyanogen.

The particular bug that's biting us/them should be addressed "upstream" with the cyanogen devs IMHO, since we do want to take advantage of zlib optimization where it exists.

Comment 14

7 years ago
I can confirm that CM7 (the Gingerbread version) is affected. When the bug hits, it hits a large number of times in a row and creates a crash report for each, which is probably why the number of crashes is so large. I'm easily looking at over 50 reports, six of which have bp- ID numbers, from the last time this struck.

Due to CM policy, I can't file an issue in their tracker, but I will go ahead and put a patch up on CM gerrit to revert the change mwu identified and bring it to the attention of the core team.

Comment 15

7 years ago
I've reverted the patch, and contacted the author with a link to the crash report and this bug report. Nobody wants to break apps, thanks for the heads up.

Comment 16

7 years ago
I can now also confirm that the rollback of the zlib optimization in CyanogenMOD has cleared this up for me.

Comment 17

7 years ago
If you ever see any insanity like this in the future on a CM build of Android, feel free to contact me directly. CM7 isn't actually released yet; we only have nightly builds that aren't RC yet. We usually ask that people not file bugs on nightlies because they are almost always feature requests or things in rapid development, but app breakage is a different story, especially for native apps like Fennec.
Let's resolve this, then.
Status: NEW → RESOLVED
Last Resolved: 7 years ago
Resolution: --- → WORKSFORME
Keywords: relnote
We're now seeing this in builds other than old CM7 nightlies, including a report that it occurs in the official version of Android 2.3.3 for the new HTC Desire S:

https://support.mozilla.com/en-US/questions/803031

This is the #1 topcrasher for Fennec 4.0.  Re-opening and nominating for 4.0.1.
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
Whiteboard: [hardblocker] → [hardblocker][4.0.1?]

Comment 20

6 years ago
Given this looks to be our #1 crasher, +'ing for 4.0.1
tracking-fennec: 2.0+ → 4.0.1+

Comment 21

6 years ago
Bummer, I wish CodeAurora Forum would have responded when I tried to contact them about the ABI break.
Created attachment 523344
WIP

This disables system zlib, and tries to fix the build to work with the in-tree zlib instead.  But freetype still fails to build without system zlib, and I haven't yet figured out how to fix that.  If someone who understands the build system wants to take this, please do.
(Assignee)

Comment 23

6 years ago
Created attachment 523660
Optimize script reading

Well, for some reason, this optimized zlib copy doesn't like it when we decompress in 8 KB chunks. Which is fine, since reading all at once is how we do it everywhere else, and it involves fewer copies and fewer lines of code.
Attachment #523344 - Attachment is obsolete: true
Attachment #523660 - Flags: review?(Olli.Pettay)
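
For readers following along, here is a minimal sketch of the two reading strategies discussed in comment 23: appending fixed-size chunks versus sizing one buffer up front and reading once. It is not the code from the attached patch. `FakeStream`, `ReadInChunks`, and `ReadAllAtOnce` are hypothetical names invented for illustration, and `FakeStream` is a stand-in that always hands back everything requested, which a real stream may not do.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <utility>

// Hypothetical in-memory stream; NOT the real Gecko input stream interface.
class FakeStream {
 public:
  explicit FakeStream(std::string data) : mData(std::move(data)) {}

  // Number of bytes not yet consumed.
  uint32_t Available() const {
    return static_cast<uint32_t>(mData.size() - mPos);
  }

  // Copies up to `count` bytes into `buf` and reports the amount in `*read`.
  bool Read(char* buf, uint32_t count, uint32_t* read) {
    assert(buf && read);
    uint32_t n = std::min(count, Available());
    std::memcpy(buf, mData.data() + mPos, n);
    mPos += n;
    *read = n;
    ++mCalls;
    return true;
  }

  // How many times Read has been called so far.
  int Calls() const { return mCalls; }

 private:
  std::string mData;
  std::size_t mPos = 0;
  int mCalls = 0;
};

// Chunked strategy: read `chunkSize` bytes at a time and append each piece.
std::string ReadInChunks(FakeStream& in, uint32_t chunkSize) {
  std::string out;
  std::string chunk(chunkSize, '\0');
  uint32_t read = 0;
  while (in.Read(&chunk[0], chunkSize, &read) && read > 0) {
    out.append(chunk.data(), read);  // extra copy per chunk
  }
  return out;
}

// One-shot strategy: size the buffer from Available() and read once.
// Returns an empty string if the single read comes back short.
std::string ReadAllAtOnce(FakeStream& in) {
  uint32_t avail = in.Available();
  std::string out(avail, '\0');
  uint32_t read = 0;
  if (!in.Read(&out[0], avail, &read) || read != avail) {
    return std::string();
  }
  return out;
}
```

With a 20000-byte payload and 8192-byte chunks, the chunked version calls `Read` four times (three with data, one returning 0), while the one-shot version calls it once and writes straight into the final buffer.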
Comment on attachment 523660
Optimize script reading

>+    if (!buffer ||
>+        NS_FAILED(input->Read(buffer, avail, &read)) ||
>+        read != avail) {
>+      return;
I asked biesi about this and currently it works, but
it is not promised by the contract.
So better to call Read in a loop until it returns 0.

Could you update the patch.
Attachment #523660 - Flags: review?(Olli.Pettay)
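
The review point above (a single Read is not promised to return everything, so call it in a loop until it returns 0) could be sketched roughly as follows. This is an illustration only, not the patch under review. `ShortReadStream` and `ReadFully` are hypothetical names, and the stream is deliberately built to return at most a few bytes per call so the short-read case is visible.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <utility>

// Hypothetical stream that never returns more than `maxPerRead` bytes per
// call; NOT the real Gecko input stream interface.
class ShortReadStream {
 public:
  ShortReadStream(std::string data, uint32_t maxPerRead)
      : mData(std::move(data)), mMaxPerRead(maxPerRead) {}

  // Copies at most min(count, maxPerRead, remaining) bytes into `buf`.
  // A result of 0 in `*read` means end of stream.
  bool Read(char* buf, uint32_t count, uint32_t* read) {
    assert(buf && read);
    uint32_t remaining = static_cast<uint32_t>(mData.size() - mPos);
    uint32_t n = std::min(std::min(count, mMaxPerRead), remaining);
    std::memcpy(buf, mData.data() + mPos, n);
    mPos += n;
    *read = n;
    return true;
  }

 private:
  std::string mData;
  uint32_t mMaxPerRead;
  std::size_t mPos = 0;
};

// Keeps calling Read until `count` bytes have arrived or Read reports 0.
// `*total` receives the number of bytes actually stored in `buf`.
bool ReadFully(ShortReadStream& in, char* buf, uint32_t count,
               uint32_t* total) {
  *total = 0;
  while (*total < count) {
    uint32_t n = 0;
    if (!in.Read(buf + *total, count - *total, &n)) {
      return false;  // read error
    }
    if (n == 0) {
      break;  // end of stream before `count` bytes arrived
    }
    *total += n;
  }
  return true;
}
```

A caller would still compare `*total` against the expected size afterwards, since the loop can end early at end of stream.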
I was pointed to http://mxr.mozilla.org/mozilla-central/source/netwerk/base/public/nsNetUtil.h#1176
That could be useful here.
(In reply to comment #19)
> We're now seeing this in builds other than old CM7 nightlies, including a
> report that it is in the official version of Android 2.3.3. for the new HTC
> Desire S:

I can confirm this on my HTC Desire S with both Fennec 4.0 and Fennec nightlies (last tried with a 20110406xx build). The device was also reset, and the error still shows up on a freshly installed device.

One crash report (others were throttled):
http://crash-stats.mozilla.com/report/index/bp-16ce1dd4-723d-48e3-a6d8-c1dcd2110405
(Assignee)

Comment 27

6 years ago
Created attachment 524242
Optimize script reading, v2
Attachment #523660 - Attachment is obsolete: true
Attachment #524242 - Flags: review?(Olli.Pettay)
Comment on attachment 524242
Optimize script reading, v2

So why can you just use NS_ReadInputStreamToString which I linked to?
  rv = NS_ReadInputStreamToString(input, data, avail);
  if (NS_FAILED(rv)) {
    return;
  }
why can't you...
(Assignee)

Comment 30

6 years ago
Created attachment 524250
Optimize script reading, v3
Attachment #524242 - Attachment is obsolete: true
Attachment #524250 - Flags: review?(Olli.Pettay)
Attachment #524242 - Flags: review?(Olli.Pettay)

Updated

6 years ago
Attachment #524250 - Flags: review?(Olli.Pettay) → review+
(Assignee)

Comment 31

6 years ago
http://hg.mozilla.org/mozilla-central/rev/ee12989404ec
Status: REOPENED → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → FIXED
This should also be landed on mozilla-2.1.
Whiteboard: [hardblocker][4.0.1?] → [hardblocker][needs to land on mozilla-2.1]
(Assignee)

Updated

6 years ago
Keywords: checkin-needed
(Assignee)

Comment 33

6 years ago
http://hg.mozilla.org/releases/mozilla-2.1/rev/d0344a994a44
Keywords: checkin-needed
Whiteboard: [hardblocker][needs to land on mozilla-2.1] → [hardblocker]
Duplicate of this bug: 649588
Duplicate of this bug: 650803
Target Milestone: --- → mozilla5
Verified Fixed using a Desire S Mozilla/5.0 (Android; Linux armv7l; rv:2.1.1) Gecko/20110415 Firefox/4.0.2pre Fennec/4.0.1 ID:20110415172201
Status: RESOLVED → VERIFIED
v. Mozilla/5.0 (Android; Linux armv7l; rv:6.0a1) Gecko/20110419 Firefox/6.0a1 Fennec/6.0a1 ID:20110419042214
Crash Signature: [@ libz.so@0x14b4 ] [@ libz.so@0x14d4 ]