Installing Bugzilla Tweaks 1.8 leads to a crash

RESOLVED DUPLICATE of bug 637915

Status

Firefox for Android Graveyard
General
RESOLVED DUPLICATE of bug 637915
7 years ago
7 years ago

People

(Reporter: jdm, Assigned: dougt)

Tracking

({crash})

Trunk
ARM
Android
crash

Details

Attachments

(1 attachment, 4 obsolete attachments)

Install bugzilla tweaks by searching in the addon screen and pressing "add to fennec".  Watch the crash reporter dialog appear.
This is on the 2/24 nightly for me.  about:crashes doesn't seem to show any new entries for some reason.
tracking-fennec: --- → ?
logcat shows this as soon as the log messages from the alertsProgressListener_OnProgress reach 100%:

W/dalvikvm(12390): ReferenceTable overflow (max=1024)
W/dalvikvm(12390): Last 10 entries in JNI pinned array reference table:
W/dalvikvm(12390):  1014: 0x400328f0 cls=[C (20 bytes)
W/dalvikvm(12390):  1015: 0x400328f0 cls=[C (20 bytes)
W/dalvikvm(12390):  1016: 0x400328f0 cls=[C (20 bytes)
W/dalvikvm(12390):  1017: 0x400328f0 cls=[C (20 bytes)
W/dalvikvm(12390):  1018: 0x400328f0 cls=[C (20 bytes)
W/dalvikvm(12390):  1019: 0x400328f0 cls=[C (20 bytes)
W/dalvikvm(12390):  1020: 0x400328f0 cls=[C (20 bytes)
W/dalvikvm(12390):  1021: 0x400328f0 cls=[C (20 bytes)
W/dalvikvm(12390):  1022: 0x400328f0 cls=[C (20 bytes)
W/dalvikvm(12390):  1023: 0x400328f0 cls=[C (20 bytes)
W/dalvikvm(12390): JNI pinned array reference table summary (1024 entries):
W/dalvikvm(12390):  1024 of [C 20B (1 unique)
W/dalvikvm(12390): Memory held directly by native code is 20 bytes
E/dalvikvm(12390): Failed adding to JNI pinned array ref table (1024 entries)
I/dalvikvm(12390): "Thread-11" prio=5 tid=21 RUNNABLE
I/dalvikvm(12390):   | group="main" sCount=0 dsCount=0 s=N obj=0x47c50158 self=0x13dc38
I/dalvikvm(12390):   | sysTid=12399 nice=0 sched=0/0 cgrp=default handle=1301368
I/dalvikvm(12390):   at org.mozilla.gecko.GeckoAppShell.nativeRun(Native Method)
I/dalvikvm(12390):   at org.mozilla.gecko.GeckoAppShell.nativeRun(Native Method)
I/dalvikvm(12390):   at org.mozilla.gecko.GeckoAppShell.runGecko(GeckoAppShell.java:324)
I/dalvikvm(12390):   at org.mozilla.gecko.GeckoApp$2.run(GeckoApp.java:165)
I/dalvikvm(12390): 
E/dalvikvm(12390): VM aborting
D/Zygote  ( 2190): Process 12390 exited cleanly (1)
I/WindowManager( 2254): WIN DEATH: Window{47b75ab8 SurfaceView paused=false}
I/WindowManager( 2254): WIN DEATH: Window{47b4a8d0 org.mozilla.fennec/org.mozilla.fennec.App paused=false}
I/ActivityManager( 2254): Process org.mozilla.fennec (pid 12390) has died.
This stackoverflow question [1] claims that the cause is leaking strings.

[1] http://stackoverflow.com/questions/4959495/jni-reference-table-overflow
CCing the usual android embedding suspects.
(Assignee)

Comment 5

7 years ago
on desktop i see:

error: An exception occurred.
Traceback (most recent call last):
  File "resource://jid0-qbniplfdfa4lpdrjhac6vbqn20q-addon-kit-lib/context-menu.js", line 47, in 
    ].join(""));
Error: The context-menu module currently supports only Firefox.  In the future we would like it to support other applications, however.  Please see https://bugzilla.mozilla.org/show_bug.cgi?id=560716 for more information.

Not sure if it is related yet.
(Assignee)

Comment 6

7 years ago
wfm on a g2.  build from yesterday.
(Assignee)

Comment 7

7 years ago
bah.  i lie.

it doesn't happen every time, but it does happen more than 1/3 of the time.


logcat:

D/szipinf (10666): Initializing inflate state
D/szipinf (10666): Initializing inflate state
D/szipinf (10666): Initializing inflate state
D/szipinf (10666): Initializing inflate state
D/szipinf (10666): Initializing inflate state
D/szipinf (10666): Initializing inflate state
D/szipinf (10666): Initializing inflate state
D/szipinf (10666): Initializing inflate state
W/dalvikvm(10666): ReferenceTable overflow (max=1024)
W/dalvikvm(10666): Last 10 entries in JNI pinned array reference table:
W/dalvikvm(10666):  1014: 0x400200f8 cls=[C (20 bytes)
W/dalvikvm(10666):  1015: 0x400200f8 cls=[C (20 bytes)
W/dalvikvm(10666):  1016: 0x400200f8 cls=[C (20 bytes)
W/dalvikvm(10666):  1017: 0x400200f8 cls=[C (20 bytes)
W/dalvikvm(10666):  1018: 0x400200f8 cls=[C (20 bytes)
W/dalvikvm(10666):  1019: 0x400200f8 cls=[C (20 bytes)
W/dalvikvm(10666):  1020: 0x400200f8 cls=[C (20 bytes)
W/dalvikvm(10666):  1021: 0x400200f8 cls=[C (20 bytes)
W/dalvikvm(10666):  1022: 0x400200f8 cls=[C (20 bytes)
W/dalvikvm(10666):  1023: 0x400200f8 cls=[C (20 bytes)
W/dalvikvm(10666): JNI pinned array reference table summary (1024 entries):
W/dalvikvm(10666):  1024 of [C 20B (1 unique)
W/dalvikvm(10666): Memory held directly by tracked refs is 20 bytes
E/dalvikvm(10666): Failed adding to JNI pinned array ref table (1024 entries)
I/dalvikvm(10666): "Thread-12" prio=5 tid=11 RUNNABLE
I/dalvikvm(10666):   | group="main" sCount=0 dsCount=0 obj=0x4054c4d8 self=0x148e68
I/dalvikvm(10666):   | sysTid=10676 nice=0 sched=0/0 cgrp=default handle=1006664
I/dalvikvm(10666):   | schedstat=( 19562835692 3778076159 10072 )
I/dalvikvm(10666):   at org.mozilla.gecko.GeckoAppShell.nativeRun(Native Method)
I/dalvikvm(10666):   at org.mozilla.gecko.GeckoAppShell.nativeRun(Native Method)
I/dalvikvm(10666):   at org.mozilla.gecko.GeckoAppShell.runGecko(GeckoAppShell.java:324)
I/dalvikvm(10666):   at org.mozilla.gecko.GeckoApp$2.run(GeckoApp.java:165)
I/dalvikvm(10666): 
E/dalvikvm(10666): VM aborting




from gdb:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 10676]
0xaca401f8 in dvmAbort () from /home/dougt/tegra-gdb-20100902/lib/libdvm.so
(gdb) bt
#0  0xaca401f8 in dvmAbort () from /home/dougt/tegra-gdb-20100902/lib/libdvm.so
#1  0xaca44b54 in ?? () from /home/dougt/tegra-gdb-20100902/lib/libdvm.so
#2  0xaca44b54 in ?? () from /home/dougt/tegra-gdb-20100902/lib/libdvm.so
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

dvmAddToReferenceTable is failing.  

Sounds like a leak in android and we are overflowing a table.  

http://code.google.com/p/android/issues/detail?id=2123
http://code.google.com/p/android/issues/detail?id=8868

maybe we can work around.  What is confusing right now is why this addon installation is tickling this bug, and not everything else we do.
tracking-fennec: ? → 2.0+
(Assignee)

Comment 8

7 years ago
reproducible by simply downloading the XPI from a http server (not using the built in UI).
(Assignee)

Comment 9

7 years ago
to make matters worse, you can get into a state where restarting after the crash will trigger this abort.  this requires that you kill your profile.
(Assignee)

Updated

7 years ago
Assignee: nobody → doug.turner
(In reply to comment #9)
> to make matters worse, you can get into a state where restarting after the
> crash will trigger this abort.  this requires that you kill your profile.

This is not actually true, in my experience.  I've found that Android seems to clear out the pinned table after some period of time (a couple minutes, maybe?) and I can get Fennec to start without problems after that.
<devils_advocate>
If this crash is limited to Jetpack SDK based add-ons, which Fennec technically doesn't support, then I don't think we should hard block on it. If you think we should hard block because of the potential underlying leak then I guess I could agree. Let's just not let other hard blockers stagnate while we put a lot of effort into this one.
</devils_advocate>
(Assignee)

Comment 12

7 years ago
It is very reproducible if you call into:

AndroidBridge::GetMimeTypeFromExtensions
or
AndroidBridge::GetHandlersForMimeType

a few dozen times.  Looks like it is just leaky on the android side.
(Assignee)

Comment 13

7 years ago
Created attachment 515741 [details] [diff] [review]
log for android bridge

super helpful when debugging these sorts of problem.
Attachment #515741 - Flags: review?(blassey.bugs)
(Assignee)

Comment 14

7 years ago
Created attachment 515742 [details] [diff] [review]
call local delete

not strictly required, but it does hint to js that our local reference is not needed.  with this patch, we continue to crash.
Attachment #515742 - Flags: review?(blassey.bugs)
Comment on attachment 515741 [details] [diff] [review]
log for android bridge

I don' think we always want this extra traffic, please wrap them with their own macro like ALOGIME

https://mxr.mozilla.org/mozilla-central/source/widget/src/android/nsWindow.cpp#1554
(Assignee)

Comment 16

7 years ago
Created attachment 515749 [details] [diff] [review]
move js to the default list

this moves js to the default mime list.
Attachment #515749 - Flags: review?(blassey.bugs)
Comment on attachment 515742 [details] [diff] [review]
call local delete

I can't review this. but note that you've got css twice in that list
Attachment #515742 - Flags: review?(blassey.bugs) → review?(cbiesinger)
Attachment #515749 - Flags: review?(blassey.bugs) → review?(cbiesinger)
Attachment #515741 - Flags: review?(blassey.bugs) → review-
Comment on attachment 515742 [details] [diff] [review]
call local delete

wrong patch?
Attachment #515742 - Flags: review?(cbiesinger)
(Assignee)

Updated

7 years ago
Attachment #515749 - Flags: review?(cbiesinger) → review?(blassey.bugs)
(Assignee)

Updated

7 years ago
Attachment #515742 - Attachment is obsolete: true
(Assignee)

Comment 19

7 years ago
Comment on attachment 515749 [details] [diff] [review]
move js to the default list

bz,is this the right thing here?

Basically what happens on android is that when we unzip this XPI files, we ask the OS what the mime type is for each and every file.  There appears to be a bug where Android (Dalvik) leaks during these calls.

Forcing .js files to always have this mimetype does resolve the issue (since we never go out to the OS).

It isn't clear to me when we should ever ask the operating system if it should/can open a .js file.
Attachment #515749 - Flags: review?(blassey.bugs) → review?(bzbarsky)
(Assignee)

Comment 20

7 years ago
Created attachment 515757 [details] [diff] [review]
delete local references

hints to js.
(Assignee)

Comment 21

7 years ago
^^ hints to java.  i do not think that this is required (per docs).  I continue to crash with this applied.
(Assignee)

Comment 22

7 years ago
Comment on attachment 515749 [details] [diff] [review]
move js to the default list

(ignore the extra +  { TEXT_CSS, "css" },)
Comment on attachment 515757 [details] [diff] [review]
delete local references

most of these functions have a AutoLocalJNIFrame in them, does that not handle releasing these local references?
(Assignee)

Comment 24

7 years ago
blassey - i do not think any of these are needed.  these basically tell java that the native side is done with the string.  gc() should figure this out without this call.
(Assignee)

Comment 25

7 years ago
(btw, i don't really want you to review that patch.)
Comment on attachment 515749 [details] [diff] [review]
move js to the default list

TEXT_CSS is already in the list, please don't add it again :)
(In reply to comment #19)
> It isn't clear to me when we should ever ask the operating system if it
> should/can open a .js file.

Why does unzipping a file call this code at all?
(Assignee)

Comment 28

7 years ago
Created attachment 515821 [details] [diff] [review]
move js to the default list
Attachment #515749 - Attachment is obsolete: true
Attachment #515749 - Flags: review?(bzbarsky)
Attachment #515821 - Flags: review?(bzbarsky)
(Assignee)

Updated

7 years ago
Attachment #515757 - Attachment is obsolete: true
(Assignee)

Updated

7 years ago
Attachment #515741 - Attachment is obsolete: true
(Assignee)

Comment 29

7 years ago
sorry for so much noise.

-- Recap -- 

Bz, i'd like to move the .js mime type from being an extra mime type (which is looked at after going to the OS) to being a default in nsExternalHelperAppService.  See comment #19.

Most everything else can be ignored.
So historically we've tried to avoid adding things to this list unless absolutely needed for performance, because once it's on that list there is no way for the user to affect what type the extension gets mapped to.  From some quick googling, there are at least three things other than JavaScript that use this extension...

Note that a web site can trigger an arbitrary number of calls to GetTypeFromExtension pretty trivially.  So if we have a problem with that function, we really need to fix it.
(Assignee)

Updated

7 years ago
Attachment #515821 - Flags: review?(bzbarsky) → review-
(Assignee)

Comment 31

7 years ago
This is just a memory leak in our string class... fail fail fail.
Status: NEW → RESOLVED
Last Resolved: 7 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 637915
You need to log in before you can comment on or make changes to this bug.