Bug 700124 - Attempting to use JNI in child process [@ mozilla::AndroidBridge::EnsureJNIThread]
Keywords: crash, topcrash
Product: Core
Classification: Components
Component: Widget: Android
: Trunk
: x86 Mac OS X
-- critical
: ---
Assigned To: Nobody; OK to take it and work on it
: Jim Chen [:jchen] [:darchons]
Duplicates: 700322
Depends on:
Blocks: 668004 706702
Reported: 2011-11-06 06:29 PST by Josh Matthews [:jdm]
Modified: 2012-01-25 11:26 PST
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---

avoid calling GfxInfoBase::GetFeatureStatusImpl for now on Android (743 bytes, patch)
2011-11-06 12:00 PST, Benoit Jacob [:bjacob] (mostly away)
josh: review+

Description Josh Matthews [:jdm] 2011-11-06 06:29:43 PST
This bug was filed from the Socorro interface and is report bp-dab1a704-3c13-4109-8c6c-349c12111106.

Here's some craziness: we're using the JNI to retrieve driver information for GfxInfo, but that's called from the content process.

0 	mozilla::AndroidBridge::EnsureJNIThread 	widget/src/android/AndroidBridge.cpp:239
1 	mozilla::AndroidBridge::AutoLocalJNIFrame::AutoLocalJNIFrame 	widget/src/android/AndroidBridge.h:105
2 	mozilla::AndroidBridge::GetStaticStringField 	widget/src/android/AndroidBridge.cpp:856
3 	mozilla::widget::GfxInfo::GetAdapterVendorID 	widget/src/android/GfxInfo.cpp:192
4 	mozilla::widget::GfxInfoBase::GetFeatureStatusImpl 	widget/src/xpwidgets/GfxInfoBase.cpp:597
5 	mozilla::widget::GfxInfo::GetFeatureStatusImpl 	widget/src/android/GfxInfo.cpp:321
6 	mozilla::widget::GfxInfoBase::GetFeatureStatus 	widget/src/xpwidgets/GfxInfoBase.cpp:574
7 	mozilla::WebGLContext::SetDimensions 	content/canvas/src/WebGLContext.cpp:616
8 	nsHTMLCanvasElement::UpdateContext 	content/html/content/src/nsHTMLCanvasElement.cpp:622
9 	nsHTMLCanvasElement::GetContext 	content/html/content/src/nsHTMLCanvasElement.cpp:540
10 	nsIDOMHTMLCanvasElement_GetContext 	obj-firefox/js/xpconnect/src/dom_quickstubs.cpp:18716
11 	js::InvokeKernel 	js/src/jscntxtinlines.h:297
12 	js::Interpret 	js/src/jsinterp.cpp:3948
13 	js::RunScript 	js/src/jsinterp.cpp:584
14 	js::Execute 	js/src/jsinterp.cpp:783
15 	JS_EvaluateUCScriptForPrincipalsVersion 	js/src/jsapi.cpp:5038
16 	nsJSContext::EvaluateString 	dom/base/nsJSEnvironment.cpp:1490
17 	nsScriptLoader::EvaluateScript 	content/base/src/nsScriptLoader.cpp:905
18 	nsScriptLoader::ProcessRequest 	content/base/src/nsScriptLoader.cpp:799
19 	nsScriptLoader::ProcessPendingRequests 	content/base/src/nsScriptLoader.cpp:948
20 	nsScriptLoader::OnStreamComplete 	content/base/src/nsScriptLoader.cpp:1182
21 	nsStreamLoader::OnStopRequest 	netwerk/base/src/nsStreamLoader.cpp:125
22 	nsHTTPCompressConv::OnStopRequest 	netwerk/streamconv/converters/nsHTTPCompressConv.cpp:127
23 	mozilla::net::HttpChannelChild::OnStopRequest 	netwerk/protocol/http/HttpChannelChild.cpp:484
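
The failure mode in the stack above can be illustrated with a minimal C++ sketch. Every name here (`ProcessType`, `QueryVendorViaJNI`, `GetAdapterVendorIDSafe`) is a hypothetical stand-in, not the real AndroidBridge or GfxInfo API, and the returned string is a placeholder. The sketch assumes the JNI is only usable in the parent process and shows a caller checking the process type before taking the JNI path, rather than running into the assertion.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical process kinds; not the real Gecko enum.
enum class ProcessType { Parent, Content };

// Hypothetical JNI-backed query. The assertion mimics a check that
// refuses to use the JNI outside the parent process.
std::string QueryVendorViaJNI(ProcessType aType) {
  assert(aType == ProcessType::Parent && "JNI used outside the parent process");
  return "placeholder-vendor";  // placeholder, not real driver data
}

// Guarded wrapper: returns no value in a content process instead of
// reaching the assertion above.
std::optional<std::string> GetAdapterVendorIDSafe(ProcessType aType) {
  if (aType != ProcessType::Parent) {
    return std::nullopt;
  }
  return QueryVendorViaJNI(aType);
}
```

A guard like this only avoids the crash; the content process still has no driver information unless the parent supplies it some other way.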
Comment 1 Josh Matthews [:jdm] 2011-11-06 06:33:06 PST
This is killing Fennec nightly on every use of a canvas element. We should either back out the Android changes or properly serialize the relevant information in the parent and send it on process creation.
Comment 2 Josh Matthews [:jdm] 2011-11-06 06:34:15 PST
And when I say canvas, I really mean WebGL canvas. Still, this is a topcrash.
Comment 3 Benoit Jacob [:bjacob] (mostly away) 2011-11-06 12:00:16 PST
Created attachment 572313
avoid calling GfxInfoBase::GetFeatureStatusImpl for now on Android

this should remove this crash, according to the call stack.

So... if we can't use this interface to query hardware information in the content process on Android, what do you think we should do to get that information? Query it from the chrome process and pipe it to the content process?
Comment 4 Josh Matthews [:jdm] 2011-11-06 15:10:44 PST
I think the best plan would be to query and send the information in the chrome process inside ContentParent::ContentParent, and keep it all under an #ifdef ANDROID.
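
As a rough illustration of the "query in the chrome process and send" idea, here is a minimal C++ sketch. It is not Gecko code: the `AndroidGfxInfo` struct, its fields, and the newline-delimited payload are assumptions made for the example, and a real implementation would presumably go through Gecko's IPC layer rather than a hand-rolled string format. The parent packs the values it queried once; the child unpacks them without touching the JNI.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical bundle of values the parent would query once.
struct AndroidGfxInfo {
  std::string vendor;
  std::string hardware;
  int sdkVersion = 0;
};

// Parent side: pack the fields into a newline-delimited payload.
// Assumes the string fields themselves contain no newlines.
std::string Serialize(const AndroidGfxInfo& aInfo) {
  assert(aInfo.vendor.find('\n') == std::string::npos);
  assert(aInfo.hardware.find('\n') == std::string::npos);
  std::ostringstream out;
  out << aInfo.vendor << '\n' << aInfo.hardware << '\n' << aInfo.sdkVersion;
  return out.str();
}

// Child side: rebuild the struct from the payload sent at process creation.
AndroidGfxInfo Deserialize(const std::string& aPayload) {
  std::istringstream in(aPayload);
  AndroidGfxInfo info;
  std::getline(in, info.vendor);
  std::getline(in, info.hardware);
  in >> info.sdkVersion;
  return info;
}
```

In the plan described here, the serialize step would run in the parent when the content process is created, and the whole exchange would sit under `#ifdef ANDROID`.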
Comment 5 Benoit Jacob [:bjacob] (mostly away) 2011-11-06 15:19:06 PST

Reopen if crashes persist.
Comment 6 Naoki Hirata :nhirata (please use needinfo instead of cc) 2011-11-07 08:51:34 PST
*** Bug 700322 has been marked as a duplicate of this bug. ***
Comment 7 Doug Sherk (:drs) (inactive) 2011-11-21 20:16:45 PST
I'm unable to repro this when I back out the patch. jdm: what hardware/setup are you using, and is there any other info you can give me to repro it? I'd also appreciate a more in-depth description of how you think this should be dealt with properly.
Comment 8 Josh Matthews [:jdm] 2011-11-21 21:22:25 PST
I merely saw this topcrash on crash-stats, I didn't try reproducing it. I'm not certain it's worth solving properly, given that desktop e10s is on hold and mobile e10s is going away imminently on nightlies.
Comment 9 Doug Sherk (:drs) (inactive) 2011-11-21 22:35:52 PST
This is a problem that still has to be dealt with. It was caused by some additions I made to support blocklisting Android devices from graphics features (OpenGL Layers, WebGL, etc). Benoit wrote up a temporary fix for it, but I need to reactivate this code properly. Any info you can provide would be appreciated.
Comment 10 Josh Matthews [:jdm] 2011-11-21 22:40:07 PST
It looks like any content process which triggers a SetDimensions call on a WebGL canvas should cause the problem. I outlined a plan to solve this properly in comment 4: we send other bits of information to child processes already, so this would just be querying and sending one more thing, then storing that data somewhere so that we don't need to hit the JNI for the feature status later.
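
The "store that data somewhere" step could look roughly like the following C++ sketch. The names (`GfxInfoCache`, `FeatureStatus`, `GetFeatureStatusFromCache`) and the single-vendor blocklist check are assumptions for illustration only, not the real GfxInfo interface. The point is that the child answers feature-status queries from data it received earlier and never calls into the JNI.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>

// Hypothetical child-side store for data received from the parent.
class GfxInfoCache {
 public:
  void Store(std::string aVendor) { mVendor = std::move(aVendor); }
  bool HasData() const { return mVendor.has_value(); }
  const std::string& Vendor() const {
    assert(mVendor.has_value());  // callers must check HasData() first
    return *mVendor;
  }

 private:
  std::optional<std::string> mVendor;
};

enum class FeatureStatus { Unknown, Allowed, Blocked };

// Answers from the cache only. If the parent has not sent anything yet,
// report Unknown instead of falling back to a JNI query.
FeatureStatus GetFeatureStatusFromCache(const GfxInfoCache& aCache,
                                        const std::string& aBlockedVendor) {
  if (!aCache.HasData()) {
    return FeatureStatus::Unknown;
  }
  return aCache.Vendor() == aBlockedVendor ? FeatureStatus::Blocked
                                           : FeatureStatus::Allowed;
}
```

Returning `Unknown` for an empty cache is one possible choice; whether an unknown status should allow or block a feature such as WebGL is a policy decision this sketch leaves open.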
Comment 11 Doug Sherk (:drs) (inactive) 2011-11-22 18:41:55 PST
Ok, I think I completely missed the e10s being phased out part. It sounds like based on that, there's no reason not to do these queries on a single process once this is done, i.e. the way we were already doing them minus the mistakes in handling SDK versions detailed in bug 700931.
Comment 12 Josh Matthews [:jdm] 2011-11-22 19:02:56 PST
Actually, it might remain necessary to solve this properly for content processes if B2G is planning to be using them, which I believe they are.
Comment 13 Josh Matthews [:jdm] 2011-11-22 19:03:55 PST
mwu tells me that Gonk doesn't care about this, so we can probably leave it be.
Comment 14 Doug Sherk (:drs) (inactive) 2011-11-22 19:04:43 PST
Ok, thanks for checking into that.
Comment 15 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2011-11-22 19:08:17 PST
We should definitely fix this the "right way", if it's not incredibly difficult to solve generally. Rumors of the death of multi-process Gecko are greatly exaggerated, even if particular products (desktop Firefox) aren't planning on using it in the near future.
Comment 16 Doug Sherk (:drs) (inactive) 2011-11-23 17:22:25 PST
Ok, I will start looking into solving it correctly then. If anyone has any repro steps or additional information on it, please provide it, as right now I can't repro it.
Comment 17 Naoki Hirata :nhirata (please use needinfo instead of cc) 2012-01-25 08:14:37 PST
Occurs in XUL; reopening based on comment 5.

Using the Nightly XUL 2012-01-24, created a new tab and tried to go to the address while another tab was already trying to load 


Thunderbolt, 2.3
Comment 18 Josh Matthews [:jdm] 2012-01-25 11:26:43 PST
Comment 17 is a different crash, tracked in bug 720400.
