Closed Bug 1263001 Opened 8 years ago Closed 8 years ago

Still crashing in nsTArray_Impl<T>::IndexOf<T> | mozilla::dom::Geolocation::RemoveRequest

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla48

Tracking Flags:

Tracking

Status

firefox46

---

wontfix

firefox47

fixed

firefox48

fixed

firefox-esr45

---

affected

People

(Reporter: khuey, Assigned: mccr8)

References

Details

(Keywords: crash, Whiteboard: btpp-active)

Crash Data

Attachments

(1 file)

Don't Notify() an unlinked nsGeolocationRequest. 8 years ago Andrew McCreight [:mccr8] 1011 bytes, patch	jdm : review+ ritu : approval-mozilla-beta+	Details \| Diff \| Splinter Review

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Reporter

Description

•

8 years ago

This bug was filed from the Socorro interface and is 
report bp-a472c842-2c7b-4ca7-a267-478cf2160405.
=============================================================

Actual stack, from MSVC:

 	xul.dll!nsTArray_Impl<nsIContent * __ptr64,nsTArrayInfallibleAllocator>::IndexOf<nsIContent * __ptr64,nsDefaultComparator<nsIContent * __ptr64,nsIContent * __ptr64> >(nsIContent * const & aItem, unsigned __int64) Line 1098	C++
>	xul.dll!mozilla::dom::Geolocation::RemoveRequest(nsGeolocationRequest * aRequest) Line 1415	C++
 	xul.dll!nsGeolocationRequest::NotifyErrorAndShutdown(unsigned short aErrorCode) Line 426	C++
 	xul.dll!nsGeolocationRequest::TimerCallbackHolder::Notify(nsITimer * __formal) Line 781	C++
 	xul.dll!nsTimerImpl::Fire() Line 541	C++
 	xul.dll!nsTimerEvent::Run() Line 290	C++
 	xul.dll!nsThread::ProcessNextEvent(bool aMayWait, bool * aResult) Line 994	C++
 	xul.dll!NS_ProcessNextEvent(nsIThread * aThread, bool aMayWait) Line 297	C++
 	xul.dll!mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate * aDelegate) Line 98	C++
 	xul.dll!MessageLoop::RunHandler() Line 224	C++
 	xul.dll!MessageLoop::Run() Line 204	C++
 	xul.dll!nsBaseAppShell::Run() Line 158	C++
 	xul.dll!nsAppShell::Run() Line 264	C++
 	xul.dll!nsAppStartup::Run() Line 282	C++
 	xul.dll!XREMain::XRE_mainRun() Line 4343	C++
 	xul.dll!XREMain::XRE_main(int argc, char * * argv, const nsXREAppData * aAppData) Line 4439	C++
 	xul.dll!XRE_main(int argc, char * * argv, const nsXREAppData * aAppData, unsigned int aFlags) Line 4546	C++

This is 1% of our nightly crashes over the last three days.

Alphan Chen [:alchen]

Comment 1

•

8 years ago

It should be related to bug 1256061.

Hi Andrew,
could you take a look?

Flags: needinfo?(continuation)

Andrew McCreight [:mccr8]

Assignee

Comment 2

•

8 years ago

Bah, I guess this is another variant of bug 1256061. I even see the same crash signatures. It looks like it is on the first clause of the inequality in RemoveRequest() rather than the second, so maybe the geolocation object is null instead of the request? I'm not sure how mLocator can end up null here, as I don't think it gets cleared outside of the CC unlink method and the destructor, and the object is being held live on the stack. Maybe I'll just throw a null check in...

Assignee: nobody → continuation

Blocks: 1256061

Crash Signature: [@ nsTArray_Impl<T>::IndexOf<T>] → [@ nsTArray_Impl<T>::IndexOf<T>][@ nsTArray_Impl<T>::IndexOf<T> | mozilla::dom::Geolocation::RemoveRequest]

Flags: needinfo?(continuation)

Andrew Overholt [:overholt]

Updated

•

8 years ago

Whiteboard: btpp-active

(not currently active) Ted Mielczarek

Comment 3

•

8 years ago

I looked at why Breakpad fails to get a good stack here, and it's actually pretty straightforward. Frame 0 is a leaf function, so it doesn't have an entry in the unwind table. However, if Breakpad can't unwind using unwind info it skips to trying to use a frame pointer, and %rbp is not actually a frame pointer here, but it points to valid memory so Breakpad hands that back as the return address and falls apart. The actual return address is just sitting there at the top of the stack. I think I know how to fix Breakpad to do the right thing here without breaking other cases, so I'll try to fix that upstream.

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Reporter

Comment 4

•

8 years ago

I hit this twice in a row on http://beta.temblor.net/

Andrew McCreight [:mccr8]

Assignee

Comment 5

•

8 years ago

(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #4)
> I hit this twice in a row on http://beta.temblor.net/

Did you share your location? Did you close the tab quickly?

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Reporter

Comment 6

•

8 years ago

I shared my location and then tried to zoom in and move around.  I didn't close the tab quickly.

(not currently active) Ted Mielczarek

Comment 7

•

8 years ago

I wrote a patch to Breakpad's stackwalker that allows it to get the correct stack in this situation:
https://codereview.chromium.org/1902783002

Robert Kaiser

Comment 8

•

8 years ago

(In reply to Ted Mielczarek [:ted.mielczarek] from comment #7)
> I wrote a patch to Breakpad's stackwalker that allows it to get the correct
> stack in this situation:
> https://codereview.chromium.org/1902783002

Awesome! Is there a place where we track getting that deployed on our side?

(not currently active) Ted Mielczarek

Comment 9

•

8 years ago

No, but I'm going to land it upstream today, and spin a new Breakpad build for Socorro, so the next Socorro production push after that should pick it up.

Robert Kaiser

Comment 10

•

8 years ago

(In reply to Ted Mielczarek [:ted.mielczarek] from comment #9)
> No, but I'm going to land it upstream today, and spin a new Breakpad build
> for Socorro, so the next Socorro production push after that should pick it
> up.

OK, thanks, we should get a bunch of recent crashes with this signature reprocessed then (and attach the new signature here) and we probably should get a message to stability@m.o to watch out for changing signatures (and better stacks).

Andrew McCreight [:mccr8]

Assignee

Comment 11

•

8 years ago

Attached patch Don't Notify() an unlinked nsGeolocationRequest. — Details — Splinter Review

If an unlinked nsGeolocationRequest somehow stays alive, then calling
Notify() on it will likely cause a null-deref crash. I'm not sure
exactly how that could happen, but maybe this will at least fix the
crash. mLocator is only cleared in Unlink.

Attachment #8744402 - Flags: review?(josh)

Josh Matthews [:jdm]

Updated

•

8 years ago

Attachment #8744402 - Flags: review?(josh) → review+

Pulsebot

Comment 12

•

8 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/1e006c7b5eda

Comment hidden (spam)

[Tracking Requested - why for this release]:

Chris Peterson [:cpeterson]

Updated

•

8 years ago

blocking-b2g: 2.2r? → ---

tracking-b2g: backlog → ---

Carsten Book [:Tomcat]

Comment 14

•

8 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/1e006c7b5eda

Status: NEW → RESOLVED

Closed: 8 years ago

status-firefox48: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla48

[:philipp]

Comment 15

•

8 years ago

[Tracking Requested - why for this release]:
this is around in 47 builds as well. in the last builds of the aurora cycle the signature is at #7 with 1.7% of crashes. maybe the fix could be uplifted to 47?

status-firefox47: --- → affected

tracking-firefox47: --- → ?

Andrew McCreight [:mccr8]

Assignee

Comment 16

•

8 years ago

(In reply to philipp from comment #15)
> maybe the fix could be uplifted to 47?

Yeah, definitely, if it actually works this time. :)

Kyle, once you've updated your Nightly, could you see if you can still reproduce this? Thanks. Not that it is a guarantee, but it would give us at least some partial information.

Flags: needinfo?(khuey)

Andrew McCreight [:mccr8]

Assignee

Updated

•

8 years ago

Summary: crash in nsTArray_Impl<T>::IndexOf<T> → Still crashing in nsTArray_Impl<T>::IndexOf<T> | mozilla::dom::Geolocation::RemoveRequest

[:philipp]

Comment 17

•

8 years ago

so far there are no more crashes after build 20160424030601, which is a good sign :-)

Ritu Kothari (:ritu) (Inactive, please n-i to RyanVM, jcristau, or pascal)

Comment 18

•

8 years ago

Hi Andrew, given the promising data from Nightly, should we uplift this to Beta47? Thanks!

tracking-firefox47: ? → +

tracking-firefox48: ? → +

Flags: needinfo?(continuation)

Andrew McCreight [:mccr8]

Assignee

Comment 19

•

8 years ago

Comment on attachment 8744402 [details] [diff] [review]
Don't Notify() an unlinked nsGeolocationRequest.

Approval Request Comment
[Feature/regressing bug #]: bug 1238427
[User impact if declined]: crashes
[Describe test coverage new/current, TreeHerder]: this code in general is tested, but this particular part isn't well-tested
[Risks and why]: low-risk: it just adds a null check
[String/UUID change made/needed]: none

Flags: needinfo?(khuey)

Flags: needinfo?(continuation)

Attachment #8744402 - Flags: approval-mozilla-beta?

Attachment #8744402 - Flags: approval-mozilla-aurora?

Ritu Kothari (:ritu) (Inactive, please n-i to RyanVM, jcristau, or pascal)

Comment 20

•

8 years ago

Comment on attachment 8744402 [details] [diff] [review]
Don't Notify() an unlinked nsGeolocationRequest.

Nightly data on this crash fix looks promising, Aurora48+, Beta47+

Attachment #8744402 - Flags: approval-mozilla-beta?

Attachment #8744402 - Flags: approval-mozilla-beta+

Attachment #8744402 - Flags: approval-mozilla-aurora?

Attachment #8744402 - Flags: approval-mozilla-aurora+

Ritu Kothari (:ritu) (Inactive, please n-i to RyanVM, jcristau, or pascal)

Comment 21

•

8 years ago

Comment on attachment 8744402 [details] [diff] [review]
Don't Notify() an unlinked nsGeolocationRequest.

This is already in Fx48 (which was Nightly pre-merge) when the fix landed.

Attachment #8744402 - Flags: approval-mozilla-aurora+

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Reporter

Comment 22

•

8 years ago

FWIW I wasn't able to reproduce this on yesterday's nightly.

Andrew McCreight [:mccr8]

Assignee

Updated

•

8 years ago

status-firefox46: --- → wontfix

Andrew McCreight [:mccr8]

Assignee

Comment 23

•

8 years ago

I don't think this is common enough of a crash to justify backporting to ESR45, but it would be easy to do.

Blocks: 1238427
No longer blocks: 1256061

status-firefox-esr45: --- → affected

Wes Kocher (:KWierso) (Not reading bugmail; email directly if needed)

Comment 24

•

8 years ago

bugherder uplift

https://hg.mozilla.org/releases/mozilla-beta/rev/28738ca1f22a

status-firefox47: affected → fixed

You need to log in before you can comment on or make changes to this bug.