Closed Bug 751343 Opened 11 years ago Closed 11 years ago

Crash [@ libdvm.so] after keeping http://www.iex.nl open for a while

Categories

(Firefox for Android Graveyard :: General, defect)

ARM
Android
defect
Not set
critical

Tracking

(blocking-fennec1.0 +)

RESOLVED DUPLICATE of bug 756068
Tracking Status
blocking-fennec1.0 --- +

People

(Reporter: martijn.martijn, Assigned: kats)

References

Details

(Keywords: crash, Whiteboard: [native-crash])

Crash Data

Attachments

(1 file)

Spawned off from bug 730890, comment 49:

I can sort of reproduce this by loading http://www.iex.nl/ and leaving it in the background for some time while the phone's screen is turned off.
This is on the Samsung Galaxy Nexus, and Flash was not turned on for that site.

So I guess this crash isn't only Flash-related.

https://crash-stats.mozilla.com/report/index/bp-ca1996d5-3d12-446d-af7a-0db962120427
0 	libdvm.so 	libdvm.so@0x50a2a 	
1 	libstlport.so 	libstlport.so@0x3327a

I tried to reproduce this with this testcase:
http://people.mozilla.org/~mwargers/tests/forms/formautosubmit_parentframes.htm
I thought it might have something to do with submitting forms in iframes. It crashed once with that testcase, but not afterwards.

I originally saw this crash after leaving http://www.iex.nl open in Fennec overnight.

Related bugs, I guess: bug 730890, bug 750965 and bug 751262.
Crash Signature: [@ libstlport.so@0x3327a]
Whiteboard: [native-crash]
blocking-fennec1.0: --- → ?
It might very well be that this is the same as bug 750965. It turns out that just calling window.location.reload() on a timer crashes Fennec after a while.
Depends on: 750965
QA Contact: general → cpeterson
Julian, can you have valgrind take a crack at this?  We are concerned that we have some devious leak.
JP, I am certainly happy to do so; however I'm currently on vacation
700 miles distant from my Android devices.  I will do it as soon as
I am back, that is, first thing Tuesday.
This file lists the definitely lost blocks (no pointer found to them)
and possibly lost blocks (pointer found only to the interior) for a 
disable-jemalloc build of m-c on Android, idling www.iex.nl for
about 5 hours.

Read from the bottom upwards -- the biggest leaks are reported at the
end.  There's nothing "obviously bad" afaics.  Summary is:

    in use at exit: 11,277,376 bytes in 43,649 blocks
  total heap usage: 11,219,194 allocs, 11,175,545 frees,
                    18,552,329,983 bytes allocated

   definitely lost: 271,303 bytes in 3,302 blocks
   indirectly lost: 569,603 bytes in 4,903 blocks
     possibly lost: 4,002,401 bytes in 3,592 blocks
   still reachable: 6,434,069 bytes in 31,852 blocks
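
As a sanity check on the summary above, here is a minimal Python sketch (not part of the original report) that confirms the four leak categories add up to the "in use at exit" totals and that allocs minus frees equals the outstanding block count. The figures are copied from the summary; the `share` helper is a hypothetical name introduced only for illustration.

```python
# Figures copied from the Valgrind leak summary above.
in_use_bytes, in_use_blocks = 11_277_376, 43_649
allocs, frees = 11_219_194, 11_175_545

# (bytes, blocks) per leak category, as reported.
categories = {
    "definitely lost": (271_303, 3_302),
    "indirectly lost": (569_603, 4_903),
    "possibly lost": (4_002_401, 3_592),
    "still reachable": (6_434_069, 31_852),
}

total_bytes = sum(nbytes for nbytes, _ in categories.values())
total_blocks = sum(nblocks for _, nblocks in categories.values())
outstanding = allocs - frees  # blocks never freed


def share(name):
    """Fraction of in-use bytes in one category (hypothetical helper)."""
    return categories[name][0] / in_use_bytes


print(total_bytes == in_use_bytes, total_blocks == in_use_blocks)
print(outstanding == in_use_blocks)
print(f"definitely lost: {share('definitely lost'):.1%}")
```

By this arithmetic the definitely lost blocks account for only a few percent of the bytes in use at exit, which fits the "nothing obviously bad" reading.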
This is on a Xoom w/ ICS, I should add, not on my Nexus S.
Does that make a difference?
See also bug 750965, which has a simple testcase and which very well might be the same as this bug.
These ones are worrying to me:

131,744 bytes in 5 blocks are definitely lost in loss record 6,438 of 6,447
   at 0x4805440: malloc (vg_replace_malloc.c:263)
   by 0x22267CEB: Java_org_mozilla_gecko_GeckoAppShell_allocateDirectBuffer (nsGeckoUtils.cpp:69)
   by 0x540CBF3: dvmPlatformInvoke (CallEABI.S:258)

2,621,440 bytes in 2 blocks are possibly lost in loss record 6,447 of 6,447
   at 0x4805440: malloc (vg_replace_malloc.c:263)
   by 0x22267CEB: Java_org_mozilla_gecko_GeckoAppShell_allocateDirectBuffer (nsGeckoUtils.cpp:69)
   by 0x540CBF3: dvmPlatformInvoke (CallEABI.S:258)

We've seen a bunch of OOM crashes where it appears that we're leaking java buffer objects.
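
To put a number on the two records above, here is a rough Python sketch that sums the bytes of every loss record whose stack mentions a given symbol. It only handles the record layout shown in this excerpt; the function name `bytes_lost_via` and the regular expression are assumptions made for illustration, not part of Valgrind or of the attached file.

```python
import re

# The two loss records quoted above, copied verbatim.
report = """\
131,744 bytes in 5 blocks are definitely lost in loss record 6,438 of 6,447
   at 0x4805440: malloc (vg_replace_malloc.c:263)
   by 0x22267CEB: Java_org_mozilla_gecko_GeckoAppShell_allocateDirectBuffer (nsGeckoUtils.cpp:69)
   by 0x540CBF3: dvmPlatformInvoke (CallEABI.S:258)

2,621,440 bytes in 2 blocks are possibly lost in loss record 6,447 of 6,447
   at 0x4805440: malloc (vg_replace_malloc.c:263)
   by 0x22267CEB: Java_org_mozilla_gecko_GeckoAppShell_allocateDirectBuffer (nsGeckoUtils.cpp:69)
   by 0x540CBF3: dvmPlatformInvoke (CallEABI.S:258)
"""

# Matches the first line of a loss record and captures the byte count.
HEADER = re.compile(
    r"^([\d,]+) bytes in ([\d,]+) blocks are "
    r"(definitely|possibly|indirectly) lost"
)


def bytes_lost_via(text, symbol):
    """Sum bytes over loss records whose stack frames mention `symbol`."""
    total = 0
    # Records are separated by blank lines.
    for record in re.split(r"\n\s*\n", text.strip()):
        lines = record.splitlines()
        match = HEADER.match(lines[0])
        if match and any(symbol in frame for frame in lines[1:]):
            total += int(match.group(1).replace(",", ""))
    return total


total = bytes_lost_via(report, "allocateDirectBuffer")
print(total)
```

Run over the full attachment instead of this excerpt, the same filter would show how much of the lost memory is attributed to allocateDirectBuffer overall.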
This might be fixed now that bug 750965 seems to be worksforme too.
At least, I haven't been able to reproduce the crash on the Samsung Galaxy Nexus, Galaxy SII and HTC Desire HD, thus far.
(In reply to Kartikaya Gupta (:kats) from comment #7)
> These ones are worrying to me:
> 
> 131,744 bytes in 5 blocks are definitely lost in loss record 6,438 of 6,447
>    at 0x4805440: malloc (vg_replace_malloc.c:263)
>    by 0x22267CEB: Java_org_mozilla_gecko_GeckoAppShell_allocateDirectBuffer (nsGeckoUtils.cpp:69)
>    by 0x540CBF3: dvmPlatformInvoke (CallEABI.S:258)
> 
> 2,621,440 bytes in 2 blocks are possibly lost in loss record 6,447 of 6,447
>    at 0x4805440: malloc (vg_replace_malloc.c:263)
>    by 0x22267CEB: Java_org_mozilla_gecko_GeckoAppShell_allocateDirectBuffer (nsGeckoUtils.cpp:69)
>    by 0x540CBF3: dvmPlatformInvoke (CallEABI.S:258)
> 
> We've seen a bunch of OOM crashes where it appears that we're leaking java
> buffer objects.

Yeah.  In fact I did a number of runs and those ones _seem_ to be a
bit larger for the longer runs.

I have a stack-scan hack that might get frames below
dvmPlatformInvoke.  Would those be useful?  I can give it a try later
today.
(In reply to Julian Seward from comment #9)
> 
> I have a stack-scan hack that might get frames below
> dvmPlatformInvoke.  Would those be useful?  I can give it a try later
> today.

That can't hurt, but I don't expect it to provide much useful info. The allocations are being triggered from the java code via JNI calls, and I don't think valgrind will be able to point to the java code location of the allocation. I did just audit the relevant code and found some errors that may have caused the problem. I filed bug 753334 with my patch. It might be more helpful if you could apply that patch and see if that makes the leaks go away.
Assignee: nobody → bugmail.mozilla
blocking-fennec1.0: ? → +
QA Contact: cpeterson → general
Is this still happening? It seems like a lot of the libdvm crashes have been fixed or went away with the JNI fixes. Adding qawanted to see if this is still reproducible.
Keywords: qawanted
I was unable to reproduce the issue on Nightly 15.0a1 2012-05-14 on HTC Desire (Android 2.2), Samsung Captivate (Android 2.2) or Samsung Galaxy Nexus (Android 4.0.2) using http://www.iex.nl/.
Leaving QAWanted for others to try and reproduce the crash.
Wfm too, see comment 8.
Status: NEW → RESOLVED
Closed: 11 years ago
Keywords: qawanted
Resolution: --- → WORKSFORME
Crash was reproduced once on Aurora 14.0a2 2012-06-17 on Samsung Galaxy Nexus (Android 4.0.2) using the steps from:
https://bugzilla.mozilla.org/show_bug.cgi?id=730890#c10

Crash report:
https://crash-stats.mozilla.com/report/index/bp-93d83b0f-2790-4072-8e6e-55c8c2120517
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
If this is only happening on Aurora 14, then this bug should still be worksforme, I think.
Based on STR in comment 14, let's dupe it to bug 756068.
Status: REOPENED → RESOLVED
Closed: 11 years ago
Resolution: --- → DUPLICATE
Product: Firefox for Android → Firefox for Android Graveyard