Closed
Bug 952170
Opened 11 years ago
Closed 11 years ago
FX OS crash in mozilla::dom::TabChild::UpdateTapState(mozilla::WidgetTouchEvent const&, nsEventStatus)
Categories
(Core :: DOM: Core & HTML, defect)
Tracking
()
People
(Reporter: marcia, Assigned: kats)
References
Details
(4 keywords, Whiteboard: [b2g-crash] [CR 593145])
Crash Data
Attachments
(3 files, 2 obsolete files)
1.11 KB,
patch
|
smaug
:
review+
kats
:
checkin+
|
Details | Diff | Splinter Review |
1.13 KB,
patch
|
smaug
:
review+
kats
:
checkin+
|
Details | Diff | Splinter Review |
1.17 KB,
patch
|
smaug
:
review+
RyanVM
:
checkin+
|
Details | Diff | Splinter Review |
This bug was filed from the Socorro interface and is
report bp-af8f8ccd-8e1c-4fc8-8322-64b1f2131219.
=============================================================
Seen on Buri while running the latest nightly.
Gaia e6276c96fcf087a12800419be8a9e25fcbbc924d
SourceStamp 5c7fa2bfea8b
BuildID 20131219040202
Version 29.0a1
STR:
1. Last thing done on the phone was operating in gallery (editing a photo). Put phone in my pocket and when I retrieved it received this crash.
No logcat available since I was not tethered at the time.
Comment 3•11 years ago
|
||
This has to be a regression from something that landed yesterday.
blocking-b2g: --- → 1.4?
Keywords: regression,
steps-wanted
Comment 4•11 years ago
|
||
Vivien - Did bug 950489 cause this crash? It appears to have touched the TabChild.cpp file within the regression range that matches the crash stack.
Flags: needinfo?(21)
Comment 5•11 years ago
|
||
STR |
I assume it should be related. Will try to dig into it. I can reproduce with Naoki's step from one of the dup.
STR:
Occurred when trying to update a contact with a picture really quick after creating a new contact with an image from the gallery.
(ie. gallery opened / closed and tried to open it again )
Flags: needinfo?(21)
Updated•11 years ago
|
status-b2g-v1.3:
--- → affected
Updated•11 years ago
|
Keywords: steps-wanted → reproducible
Updated•11 years ago
|
blocking-b2g: 1.3? → 1.3+
Comment 6•11 years ago
|
||
With bug 949347 now fixed, this crash is being hit instead during our smoketests, so this is now a smoketest blocker.
Keywords: smoketest
Assignee | ||
Comment 7•11 years ago
|
||
I don't think this will fix the problem but unnecessary casts like this bug me.
Attachment #8356078 -
Flags: review?(21)
Assignee | ||
Comment 8•11 years ago
|
||
I suspect this might be the problem. WidgetTouchEvent contains an nsTArray but has no explicit copy constructor, so when we do things like WidgetTouchEvent localEvent(aEvent) I think the nsTArray doesn't get copied properly. Specifically I don't think the refcount of the touches in the list gets incremented, and so they could get deleted out from under us, resulting in a dangling pointer and this error. I haven't confirmed this theory but putting the patch up anyway - it should be safe to land even if this isn't the problem.
Try push for builds at https://tbpl.mozilla.org/?tree=Try&rev=e3245179b723
Attachment #8356087 -
Flags: review?(21)
Comment 9•11 years ago
|
||
I feel a bit uncomfortable reviewing those. Is Chris Lord a good reviewer for those?
Comment 10•11 years ago
|
||
/me reviews
Updated•11 years ago
|
Attachment #8356078 -
Flags: review?(21) → review+
Comment 11•11 years ago
|
||
Comment on attachment 8356087 [details] [diff] [review]
Add explicit copy constructor
Could you rather fix AssignTouchEventData and
disabled copy-ctor.
Or if that ends up being tricky for some reason, fix AssignTouchEventData
and use that in copy-ctor.
Attachment #8356087 -
Flags: review?(21) → review-
Comment 12•11 years ago
|
||
Comment on attachment 8356087 [details] [diff] [review]
Add explicit copy constructor
Or actually, this setup is rather error prone already, and the patch
doesn't make it any worse.
I'll file a followup to clean up this.
Attachment #8356087 -
Flags: review- → review+
Assignee | ||
Comment 13•11 years ago
|
||
Also I screwed up that try push. Better one here:
https://tbpl.mozilla.org/?tree=Try&rev=9679bb7f7957
Assignee | ||
Comment 14•11 years ago
|
||
The try run got a green build for each platform so I aborted the rest and landed:
https://hg.mozilla.org/integration/mozilla-inbound/rev/343ac499c993
https://hg.mozilla.org/integration/mozilla-inbound/rev/bf2f2356f3b6
Comment 15•11 years ago
|
||
Kats,
Is anything pending to land for the solution to be completed here?
Flags: needinfo?(bugmail.mozilla)
Assignee | ||
Comment 16•11 years ago
|
||
Not really. But I don't know if the patches I landed will actually fix the crash or not. If the STR are reliable can somebody from QA test a build with these patches to see if it fixes the problem?
Flags: needinfo?(bugmail.mozilla)
Comment 17•11 years ago
|
||
STR |
Issue appears to occur reliably if user scrolls when loading any app that has ability to horizontally or vertically scroll (e.g. Calender, Gallery, Homescreen [from lockscreen])
https://hg.mozilla.org/mozilla-central/rev/343ac499c993
https://hg.mozilla.org/mozilla-central/rev/bf2f2356f3b6
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla29
Comment 20•11 years ago
|
||
https://hg.mozilla.org/releases/mozilla-aurora/rev/758629e75ddf
https://hg.mozilla.org/releases/mozilla-aurora/rev/d8852933a8b4
Assignee: nobody → bugmail.mozilla
status-firefox27:
--- → fixed
status-firefox28:
--- → fixed
status-firefox29:
--- → fixed
Assignee | ||
Comment 21•11 years ago
|
||
Whoops, I probably should have marked this as no_uplift until we verified the fix. Ah well the uplift shouldn't hurt. But we'll still need to verify the fix.
status-firefox27:
fixed → ---
Comment 22•11 years ago
|
||
We just confirmed that the patches here didn't fix the crash, so I'm reopening.
Status: RESOLVED → REOPENED
status-b2g-v1.3:
fixed → ---
status-firefox28:
fixed → ---
status-firefox29:
fixed → ---
Resolution: FIXED → ---
Comment 23•11 years ago
|
||
Comment 24•11 years ago
|
||
Kats,
To make sure this blocker is on your radar I am ni you.
As you know this is a QC blocker and they are expecting a fix by 1/10/14.
Flags: needinfo?(bugmail.mozilla)
Assignee | ||
Comment 25•11 years ago
|
||
Jason, did you try backing out bug 950489 to see if it fixes the crash?
Flags: needinfo?(bugmail.mozilla) → needinfo?(jsmith)
Comment 26•11 years ago
|
||
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #25)
> Jason, did you try backing out bug 950489 to see if it fixes the crash?
Not yet.
Flags: needinfo?(jsmith)
Updated•11 years ago
|
status-b2g-v1.3:
--- → affected
Assignee | ||
Comment 30•11 years ago
|
||
Interesting to note that this crash has actually been around for a long time. See for instance https://crash-stats.mozilla.com/report/index/4535287d-a194-4524-a005-649e32131215 which occurred on B2G-18 build from 20130621133927. The frequency just spiked way up on Dec 19, probably from the landing of bug 950489. Backing that out will probably reduce the frequency back to old levels but won't fix it completely.
Crash Signature: [@ mozilla::dom::TabChild::UpdateTapState(mozilla::WidgetTouchEvent const&, nsEventStatus)] → [@ mozilla::dom::TabChild::UpdateTapState(mozilla::WidgetTouchEvent const&, nsEventStatus)]
[@ mozilla::dom::TabChild::UpdateTapState ]
Assignee | ||
Comment 31•11 years ago
|
||
Also, this crash should only happen when APZC is disabled, and as of last night APZC is enabled by default so this crash shouldn't get very often. Nonetheless I will spend some time today to see if I can figure it out.
Comment 32•11 years ago
|
||
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #31)
> Also, this crash should only happen when APZC is disabled, and as of last
> night APZC is enabled by default so this crash shouldn't get very often.
> Nonetheless I will spend some time today to see if I can figure it out.
Per talking w/Marcia - she mentioned on a build with APZC enabled, the crash rate is definitely lower. However, it's still possible to hit in Gaia. She can add more details on the reproduction steps.
Reporter | ||
Comment 33•11 years ago
|
||
Here are the steps from my crash today:
1. Incoming call log, select a new number and "Create a new contact"
2. Contacts app: Start typing a name in the first name field, and then backspace a few times using the delete key on the keyboard
I put in a patch just to null check and see what was going on. without APZ turned on, I got an error :
01-09 17:09:32.759: E/GeckoConsole(1653): [JavaScript Error: "NS_ERROR_ILLEGAL_VALUE: Component returned failure code: 0x80070057 (NS_ERROR_ILLEGAL_VALUE) [nsIContentFrameMessageManager.content]" {file: "chrome://global/content/BrowserElementChildPreload.js" line: 737}]
And I did not crash. Having said that I don't think I fixed it, something else must have fixed it. We should probably check in tomorrow's build if we crash or not without APZ turned on. If we still do, we can try creating a new build and seeing if someone else crashes with this patch?
Gaia a5288df8de82cd84be14391c96f99a140535a78e
Gecko 78ebe7f47e81d6ffbaff3ff12f32597aca457205
BuildID 20140109155817
Version 29.0a1
ro.build.version.incremental=eng.archermind.20131114.105818
ro.build.date=Thu Nov 14 10:58:33 CST 2013
Updated•11 years ago
|
Attachment #8358119 -
Flags: review?(bugmail.mozilla)
Apparently, this still crashed even with APZ turned on according to the smoke testers.
Comment 37•11 years ago
|
||
Kats,
Please prioritize this blocker. it blocks QC
Flags: needinfo?(bugmail.mozilla)
Assignee | ||
Comment 38•11 years ago
|
||
I have not yet been able to reproduce this. I'll keep trying.
Flags: needinfo?(bugmail.mozilla)
Assignee | ||
Updated•11 years ago
|
Attachment #8356078 -
Flags: checkin+
Assignee | ||
Updated•11 years ago
|
Attachment #8356087 -
Flags: checkin+
Assignee | ||
Comment 39•11 years ago
|
||
Comment on attachment 8358119 [details] [diff] [review]
bug952170.patch
Review of attachment 8358119 [details] [diff] [review]:
-----------------------------------------------------------------
I assume that since you didn't mention it, you never saw the "touch is null" output get hit, which means that the null check here never triggered at all. So I don't think this patch will have much effect.
Also according to the crashreports the crashes were a SIGBUS rather than a SIGSEGV so I don't think it's a null pointer dereference that's causing the problem.
Attachment #8358119 -
Flags: review?(bugmail.mozilla) → review-
Reporter | ||
Comment 40•11 years ago
|
||
(In reply to Marcia Knous [:marcia - use needinfo] from comment #33)
> Here are the steps from my crash today:
>
> 1. Incoming call log, select a new number and "Create a new contact"
> 2. Contacts app: Start typing a name in the first name field, and then
> backspace a few times using the delete key on the keyboard
Another way I was able to reproduce it three times in a row using today's build involved installing the XE Currency app and then trying to add a currency by Country - typing a few characters in the text field caused the crash to happen three times: https://crash-stats.mozilla.com/report/index/58d37b4e-53c2-40a2-9b89-9e7d82140113
Comment 41•11 years ago
|
||
Caveat: Slightly older build and on Peak
https://crash-stats.mozilla.com/report/index/fd0896c7-ec57-4f67-b35e-7de842140113
STR:
1. Tap settings icon to load app.
2. Immediately tap the screen again before the app loads fully.
Assignee | ||
Comment 42•11 years ago
|
||
I think I can reproduce it now, using the STR in comment 41 along with a call to nsDOMWindowUtils::GarbageCollectNow() just before the UpdateTapState call at http://mxr.mozilla.org/mozilla-central/source/dom/ipc/TabChild.cpp?rev=49c6393d37f3#1875
Investigating...
Assignee | ||
Comment 43•11 years ago
|
||
Well that was a bit anticlimactic. DispatchWidgetEvent was removing all the touch points from the event, so when we did event.touches[0] we got back a pointer that had a 0x1 value. Dereferencing that caused the SIGBUS.
The GarbageCollect() call I mentioned above seems unrelated except that maybe it slows down things a bit and widens the window in which I could trigger the bug.
Attachment #8359361 -
Flags: review?(bugs)
Comment 44•11 years ago
|
||
I am hitting this crash with a 100% repro rate by replying to an email in the Email app and typing in the body field. The keyboard disappears after a few characters are typed. When this process of selecting the body and attempting to type is repeated 1 to 4 more times this crash is experienced.
Environmental Variables
Device: Buri v 1.4 Mozilla RIL
Build ID: 20140113040202
Gecko: http://hg.mozilla.org/mozilla-central/rev/12d3ba62a599
Gaia: fd3b9a97cb3c41cfa56be387b46a51db136b4422
Platform Version: 29.0a1
Firmware Version: V1.2-device.cfg
Comment 45•11 years ago
|
||
Comment on attachment 8359361 [details] [diff] [review]
Patch
I think we should just null check, since if we remove stuff from the touch list,
we didn't dispatch events for the touch.
Attachment #8359361 -
Flags: review?(bugs) → review-
Assignee | ||
Comment 46•11 years ago
|
||
Ok, that's even easier then.
Attachment #8359361 -
Attachment is obsolete: true
Attachment #8359400 -
Flags: review?(bugs)
Updated•11 years ago
|
Attachment #8359400 -
Flags: review?(bugs) → review+
Comment 47•11 years ago
|
||
I get this crash after a while, while running:
http://people.mozilla.org/~mwargers/tests/contenteditable/contenteditable_focus_blur_parentframe.htm
Assignee | ||
Comment 48•11 years ago
|
||
(In reply to Martijn Wargers [:mwargers] (QA) from comment #47)
> I get this crash after a while, while running:
> http://people.mozilla.org/~mwargers/tests/contenteditable/
> contenteditable_focus_blur_parentframe.htm
Presumably this is without the patch?
Assignee | ||
Comment 49•11 years ago
|
||
Assignee | ||
Updated•11 years ago
|
Status: REOPENED → ASSIGNED
Assignee | ||
Updated•11 years ago
|
status-b2g-v1.2:
--- → wontfix
status-b2g-v1.4:
--- → affected
status-firefox27:
--- → wontfix
status-firefox28:
--- → affected
status-firefox29:
--- → affected
Status: ASSIGNED → RESOLVED
Closed: 11 years ago → 11 years ago
Resolution: --- → FIXED
Updated•11 years ago
|
Attachment #8358119 -
Attachment is obsolete: true
Updated•11 years ago
|
Attachment #8359400 -
Flags: checkin+
Comment 51•11 years ago
|
||
Updated•11 years ago
|
Whiteboard: [b2g-crash] → [b2g-crash] [CR 593145]
Updated•6 years ago
|
Component: DOM → DOM: Core & HTML
You need to log in
before you can comment on or make changes to this bug.
Description
•