Closed Bug 983976 Opened 11 years ago Closed 11 years ago

crash [@ libq3dtools_adreno200.so@0x335fa ] when launching Usage app after app's initial setup

Categories

(Firefox OS Graveyard :: Vendcom, defect)

Hardware: ARM
OS: Gonk (Firefox OS)
Type: defect
Priority: Not set
Severity: critical

Tracking

(tracking-b2g:backlog)

RESOLVED FIXED

People

(Reporter: aryx, Assigned: jld)

Details

(Keywords: crash, Whiteboard: [b2g-crash])

Attachments

(2 files)

Geeksphone Keon running 1.4.0.0-prerelease 20140315024229. When launching the Usage app after flashing the device (including a wipe of usage data), it crashes after the initial setup. Crash report: https://crash-stats.mozilla.com/report/index/92368b9a-e352-4a76-88bc-f39482140315
Leaving qawanted to find out if this happens on a production device on 1.4.
Severity: major → critical
Keywords: crash, qawanted
Whiteboard: [b2g-crash]
Crash Signature: [@ libq3dtools_adreno200.so@0x335fa ]
I was unable to reproduce this after multiple attempts on a Buri or a Leo with latest v1.4. Tried with both a TMobile SIM and an AT&T SIM for both devices.

v1.4 Environmental Variables:
Device: Buri v1.4 MOZ
BuildID: 20140317040204
Gaia: 8f802237927c7d5e024fb7dca054dd5efef6b2e6
Gecko: 25cfa01ba054
Version: 30.0a1
Firmware Version: v1.2-device.cfg

v1.4 Environmental Variables:
Device: Leo v1.4 MOZ
BuildID: 20140317040204
Gaia: 8f802237927c7d5e024fb7dca054dd5efef6b2e6
Gecko: 25cfa01ba054
Version: 30.0a1
Firmware Version: v10d
Keywords: qawanted
QA Contact: pbylenga
Sounds like this is geeksphone-specific then.
blocking-b2g: 1.4? → backlog
Component: Gaia::Cost Control → Vendcom
This looks like a graphics driver issue: libq3dtools_adreno200.so is an Adreno 2xx user-mode Android JB graphics driver. We would need to be on a device that has an Adreno200 JB graphics driver. I'm not sure which of the devices we support have that...
This is most likely a driver issue looking at the file affected.
Flags: needinfo?(gp)
(In reply to Archaeopteryx [:aryx] from comment #0)
> Crash report: https://crash-stats.mozilla.com/report/index/92368b9a-e352-4a76-88bc-f39482140315

SIGSYS, so it's a seccomp sandbox violation. Relevant registers:

r7 = 0x0000011a
r0 = 0xffffffff
r1 = 0x4530ceb4
r2 = 0x00000010

Which means bind(-1, 0x4530ceb4, 16).

I think I've figured out most of what's going on here. It's trying to create a TCP service bound to 127.0.0.1 port 0x5151 (== 20817), which seems to be associated with some kind of GL profiler support. This has never worked in b2g content processes or in Android apps without Internet permission, due to CONFIG_ANDROID_PARANOID_NETWORK. But q3dtools seems to call bind() with the return value of socket() even if it failed (i.e., if the value is -1); normally this would cause bind() to fail with EBADF, but our sandbox disallows bind() and so we crash instead.

STR that works on my keon: `adb shell setprop debug.egl.profiler 1` and use WebGL (either with an app like CrystalSkull or CubeVid, or load a web page that uses WebGL in the browser app).

Possible fix: `adb shell setprop debug.egl.profiler 0`.
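For anyone following the register decoding above, here is a rough Python sketch of the same reasoning. It assumes the 32-bit ARM EABI syscall numbering (where bind is 282, i.e. 0x11a) and the usual Linux `sockaddr_in` layout. The `sample` buffer is hypothetical: it is rebuilt from the address and port described in the comment, not taken from the crash dump. The second helper uses a closed Python socket, whose descriptor is -1, as a stand-in for a failed socket() call, to show that bind() on such a descriptor normally fails with EBADF rather than crashing.

```python
import errno
import socket
import struct

# Syscall number of bind(2) on 32-bit ARM EABI Linux; matches r7 = 0x11a.
ARM_EABI_NR_BIND = 282


def decode_sockaddr_in(raw):
    """Decode a 16-byte Linux sockaddr_in taken from a little-endian process."""
    if len(raw) != 16:
        raise ValueError("sockaddr_in is 16 bytes")
    (family,) = struct.unpack_from("<H", raw, 0)  # sin_family, host byte order
    (port,) = struct.unpack_from("!H", raw, 2)    # sin_port, network byte order
    host = socket.inet_ntoa(raw[4:8])             # sin_addr
    return family, port, host


def bind_errno_on_invalid_fd():
    """Return the errno reported when bind() is called on descriptor -1."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.close()  # descriptor is now -1, like the result of a failed socket()
    try:
        sock.bind(("127.0.0.1", 0))
    except OSError as exc:
        return exc.errno
    return None


# Hypothetical buffer (not from the crash dump): AF_INET (2), port 0x5151,
# address 127.0.0.1, followed by 8 bytes of padding.
sample = (
    struct.pack("<H", 2)
    + struct.pack("!H", 0x5151)
    + socket.inet_aton("127.0.0.1")
    + bytes(8)
)

if __name__ == "__main__":
    print(hex(ARM_EABI_NR_BIND), decode_sockaddr_in(sample))
    print(errno.errorcode.get(bind_errno_on_invalid_fd()))
```

Under seccomp the call never gets as far as returning EBADF, because the filter rejects bind() itself and the process is killed with SIGSYS.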
Hi Jed, wouldn't we see this on other seccomp enabled devices?
Flags: needinfo?(jld)
(In reply to Naoki Hirata :nhirata (please use needinfo instead of cc) from comment #8)
> Hi Jed, wouldn't we see this on other seccomp enabled devices?

The drivers would have to support the same profiling functionality (this is a Qualcomm-specific feature), and have the same bug with not checking for socket() failure, and the GL profiler would have to be enabled. It isn't enabled on my local keon builds; I don't know what Geeksphone did to cause it to be (apparently?) enabled in the builds they're providing.
Flags: needinfo?(jld)
I think I know the rest of what's going on here. Our builds use https://github.com/mozilla-b2g/device-gp-keon, but the Geeksphone builds are using https://github.com/gp-b2g/device-gp-keon, and we haven't merged their changes in quite some time. See, in particular:

https://github.com/gp-b2g/device-gp-keon/commit/92d0e6c0d294
https://github.com/gp-b2g/device-gp-peak/commit/2bcc7c899073

In addition to "debug.egl.profiler=1", there are a number of other changes to graphics-related settings which could explain why I couldn't reproduce the bug the same way that the various bugs' reporters did.

To summarize, a list of things that are wrong here:

1. The mozilla-b2g repo being behind the gp-b2g repo. Alternately, that we aren't just using the gp-b2g repo in our manifest, given that we don't seem to have local changes in our fork.
2. The gp-b2g system.prop turning on debug.egl.profiler, given that the GL profiler isn't allowed to create its AF_INET socket regardless of whether seccomp is enabled, so it can't possibly work.
3. The Qualcomm driver not checking for errors from socket().
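As a rough illustration of item 2, a device tree's system.prop could be checked for the profiler setting with something like the sketch below. The function names and the sample contents are hypothetical. Only the `debug.egl.profiler=1` line comes from this bug; the other property is a made-up placeholder, and the parser assumes plain `key=value` lines with `#` comments.

```python
def parse_props(text):
    """Parse key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def gl_profiler_enabled(text):
    """Return True if the properties turn on debug.egl.profiler."""
    return parse_props(text).get("debug.egl.profiler", "0") == "1"


# Hypothetical system.prop contents; only debug.egl.profiler=1 is from the bug.
SAMPLE_SYSTEM_PROP = """\
# graphics settings
ro.example.placeholder=value
debug.egl.profiler=1
"""

if __name__ == "__main__":
    print(gl_profiler_enabled(SAMPLE_SYSTEM_PROP))
```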
Flags: needinfo?(gp)
(In reply to Jed Davis [:jld] from comment #12)
> 1. The mozilla-b2g repo being behind the gp-b2g repo. Alternately, that we
> aren't just using the gp-b2g repo in our manifest, given that we don't seem
> to have local changes in our fork.

That sounds like it warrants a different bug (and a patch).

> 2. The gp-b2g system.prop turning on debug.egl.profiler, given that the GL
> profiler isn't allowed to create its AF_INET socket regardless of whether
> seccomp is enabled, so it can't possibly work.

That sounds like the issue here that can (hopefully) be fixed. I'm not sure what our process is to get fixes like that in GP. Who knows that?

> 3. The Qualcomm driver not checking for errors from socket().

It surely makes sense to contact QC on this, but I doubt they'll fix it in the old driver for those chipsets that have been used by GP for those devices. Might be a good idea to get it fixed in newer drivers of theirs, though. Who can contact them?
A fix in adreno200 would involve going through the OEM/ODM, but the Keon is on a geriatric gonk, so a fix is less likely here given that the bug is not known to affect any current commercial devices.
Assignee: nobody → jld
Landed for keon: https://github.com/gp-b2g/device-gp-keon/commit/eaeb7a0c5a18d581ffdfb27a9e7a27cb07928674 device-gp-peak almost certainly needs the same patch, but I'd like to test it before sending a PR, which means finding someone who owns a peak (and doesn't mind doing a full build/flash) to reproduce the crash and test the fix.
(In reply to Jed Davis [:jld] from comment #16)
> Landed for keon: https://github.com/gp-b2g/device-gp-keon/commit/eaeb7a0c5a18d581ffdfb27a9e7a27cb07928674

…except not really, because the versions of keon.xml in gp-b2g/b2g-manifest point to the v1.1.0 branch of device-gp-keon, even for versions 1.2 through master, so that needs another PR. (Also, the v1.1.0 branch has patches that aren't on master. I seem to have misunderstood the branching scheme in use here.)
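A quick way to spot this kind of mismatch is to list which revision each project in a repo manifest resolves to. The sketch below assumes the standard repo manifest layout (`<project>` elements with an optional `revision` attribute, falling back to `<default revision=...>`). The sample XML is hypothetical and is not the real keon.xml; its paths and the second project are invented for illustration.

```python
import xml.etree.ElementTree as ET


def project_revisions(manifest_xml):
    """Map each project name to its revision, falling back to <default>."""
    root = ET.fromstring(manifest_xml)
    default_el = root.find("default")
    default_rev = default_el.get("revision") if default_el is not None else None
    return {
        project.get("name"): project.get("revision", default_rev)
        for project in root.iter("project")
    }


# Hypothetical manifest; paths and the second project are placeholders.
SAMPLE_MANIFEST = """\
<manifest>
  <default revision="master"/>
  <project name="device-gp-keon" path="device/example/keon" revision="v1.1.0"/>
  <project name="example-other" path="example/other"/>
</manifest>
"""

if __name__ == "__main__":
    for name, revision in project_revisions(SAMPLE_MANIFEST).items():
        print(name, revision)
```

A project pinned to a branch other than the one that received the fix, as device-gp-keon was here, will not pick up the change until the manifest is updated.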
See Also: → 996857
(In reply to Jed Davis [:jld] from comment #16)
> device-gp-peak almost certainly needs the same patch, but I'd like to test
> it before sending a PR, which means finding someone who owns a peak (and
> doesn't mind doing a full build/flash) to reproduce the crash and test the
> fix.

I have a Peak and use it with Nightly all the time, but I have no idea how to reproduce this one. We don't see the same or similar signature on the Peak from all I can see in our stats. What I can easily repro on the Peak is crashes like bp-afebdc9c-f2a3-407f-a826-99c9b2140415 but that looks very different in terms of signal and signature, right?
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
This crash is still occurring even after the fix: https://crash-stats.mozilla.com/report/index/de316d03-98fd-4c6b-b73b-6d1f02140424 Build ID for this crash: 20140424135124. Comment 17 worries me some, as it sounds like there might be more to this than what jld is stating.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
I can't reproduce the crash with the current Geeksphone build, 20140428030526. Note that this needs a full flash to get the fix, not just Gecko/Gaia.
Status: REOPENED → RESOLVED
Closed: 11 years ago
Resolution: --- → WORKSFORME
Geeksphone is always full flashed. That's how they run their script. If there are crash reports coming in after the socorro report, then there may be other issues causing the crash.
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
(In reply to Naoki Hirata :nhirata (please use needinfo instead of cc) from comment #24)
> If there are crash reports coming in after the socorro report, then there
> may be other issues causing the crash.

Actually, even though there were still 3 crashes with the 20140424 build, no other build after 20140422 is affected, so this really does look fixed after all. See the "build id facet" in https://crash-stats.mozilla.com/search/?signature=~libq3dtools_adreno200.so%400x335fa&date=%3E2014-04-01&_facets=build_id&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform (also the 3 on the 4/24 build are pretty low compared to what every build has been seeing before).
Status: REOPENED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Lower occurrence doesn't necessarily mean fixed, does it? Leaving as RESOLVED FIXED.
(In reply to Naoki Hirata :nhirata (please use needinfo instead of cc) from comment #26)
> Lower occurrence doesn't necessarily mean fixed, does it?

As I said in my comment, there were no crashes at all in later builds. I'm not sure what that fluke of 3 crashes in the 4/24 build is actually about; maybe someone fiddled with things and managed to manually turn on the debugging mode that causes this.
blocking-b2g: backlog → ---