Crash in JS:CollectRuntimeStats while stability testing

RESOLVED DUPLICATE of bug 1132502

Status

defect
P2
critical
4 years ago
3 years ago

People

(Reporter: ggrisco, Assigned: terrence)

Tracking

({crash})

unspecified
ARM
Gonk (Firefox OS)
Points:
---

Firefox Tracking Flags

(blocking-b2g:2.2?)

Details

(Whiteboard: [b2g-crash][caf-crash 643][caf priority: p1][CR 848251], crash signature)

Attachments

(11 attachments)

Reporter

Description

4 years ago
Saw this stack trace while running stability tests overnight:

[@ AddClassInfo | StatsCellCallback<(Granularity)0u> | IterateCompartmentsArenasCells | js::IterateZonesCompartmentsArenasCells ]

cafbot will upload the logs.
Whiteboard: [CR 848251] → [caf priority: p1][CR 848251]
Whiteboard: [caf priority: p1][CR 848251] → [b2g-crash][caf-crash 643][caf priority: p1][CR 848251]
Keywords: crash
Looks like a use after free accessing the contents of a BaseShape's JSClass.

Comment 7

4 years ago
Naveed/Jon, 

Your help with this stability bug is appreciated. If you need further info, feel free to ping ggrisco or ikumar from CAF.

(Also NI Josh Cheng - 2.2 RM to triage and track 2.2 crash issues)

Thanks
Hema
Flags: needinfo?(nihsanullah)
Flags: needinfo?(joshcheng)
Flags: needinfo?(jcoppeard)
Flags: needinfo?(jcoppeard)
It seems there have been no dynamically allocated JSClasses since FF31 (bug 990290), so not UAF of a JSClass.

Does this happen every time?  Is it possible to reproduce?
Flags: needinfo?(ggrisco)
jonco is active here so pulling off the needinfo on me
Flags: needinfo?(nihsanullah)
Reporter

Comment 10

4 years ago
(In reply to Jon Coppeard (:jonco) from comment #8)
> It seems there have been no dynamically allocated JSClasses since FF31 (bug
> 990290), so not UAF of a JSClass.
> 
> Does this happen every time?  Is it possible to reproduce?

So far we have seen this crash only once, on AU 196 after many hours of testing, so it is not easily reproduced.
Flags: needinfo?(ggrisco)

Updated

4 years ago
Flags: needinfo?(joshcheng) → needinfo?(jocheng)
Dear Jon,
Thanks for your help.
Is it possible to find any clue from the log provided earlier?
Flags: needinfo?(jocheng) → needinfo?(jcoppeard)
It's not a lot to go on.  I'm wondering if something in the memory reporter can end up triggering a GC while we're iterating through the arenas, but I'd be surprised if that didn't trigger an assert somewhere.
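
For readers unfamiliar with the kind of assert mentioned above, here is a minimal sketch of a "no GC while iterating" guard. It is a toy model, not SpiderMonkey code: ToyRuntime, ToyNoGCScope, maybeGC and violations are hypothetical names invented for illustration. A real debug build would normally assert at the point marked below; the sketch records the violation instead so the behaviour can be checked.

```cpp
#include <cassert>

// Toy runtime: all names here are hypothetical, not SpiderMonkey APIs.
struct ToyRuntime {
    int noGCDepth = 0;   // > 0 while some code is walking the heap
    int gcCount = 0;     // number of collections actually performed
    int violations = 0;  // GC requests made while a guard was active

    // Request a collection. Inside a guarded region the request is refused
    // and counted; a debug build would typically assert here instead.
    bool maybeGC() {
        if (noGCDepth > 0) {
            ++violations;
            return false;
        }
        ++gcCount;
        return true;
    }
};

// RAII guard: while an instance is alive, collections are not allowed.
class ToyNoGCScope {
  public:
    explicit ToyNoGCScope(ToyRuntime& rt) : rt_(rt) { ++rt_.noGCDepth; }
    ~ToyNoGCScope() { --rt_.noGCDepth; }
    ToyNoGCScope(const ToyNoGCScope&) = delete;
    ToyNoGCScope& operator=(const ToyNoGCScope&) = delete;

  private:
    ToyRuntime& rt_;
};
```

If the arena iteration were wrapped in such a guard, a collection triggered from inside the reporter would show up as a recorded violation (or an assertion failure) rather than as a later crash on stale memory.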
Flags: needinfo?(jcoppeard)
(In reply to Jon Coppeard (:jonco) (PTO until 21st July) from comment #12)
> It's not a lot to go on.  I'm wondering if something in the memory reporter
> can end up triggering a GC while we're iterating through the arenas, but I'd
> be surprised if that didn't trigger an assert somewhere.

Thanks Jon,
It seems we can only wait until the same issue happens again? Before then, is there any additional logging we can ask Greg to provide?
Flags: needinfo?(jcoppeard)

Updated

4 years ago
Flags: needinfo?(jocheng)
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → WORKSFORME
"Closing issue which has not been seen since 06/30/15 20:21"

Updated

4 years ago
Flags: needinfo?(jocheng)
Flags: needinfo?(jcoppeard)

Updated

4 years ago
blocking-b2g: 2.2? → ---
Reporter

Comment 15

4 years ago
Re-opening since this was seen again on AU 214.  cafbot will follow up with the latest logs.
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
Reporter

Updated

4 years ago
blocking-b2g: --- → 2.2?
Hi Jon,
Could you check whether there is any useful information in the new log?
Thanks!
Flags: needinfo?(jcoppeard)
It's the same problem as before, crashing when trying to dereference the heap free pattern while inside the memory reporter.

Nick, have you seen anything like this before?
Flags: needinfo?(jcoppeard) → needinfo?(n.nethercote)
The JS memory reporter iterates over every live thing in the GC heap, and measures most of the malloc'd blocks that hang off those things. So if there's any kind of GC heap corruption there's a good chance that it will manifest in the reporter.

This reminds me slightly of part 6 in bug 972712 which involved identifying which class each object belonged to. It originally landed in March 2014 but had to be backed out due to intermittent ASAN failures. I was eventually able to land it three months later (in bug 1023719) despite not having made any changes. I concluded that I had been hitting some kind of latent heap corruption and then it went away, either via luck or via someone fixing something.

So this one is going to be difficult to debug. I wonder if implementing a GC heap sanity checker would be a good idea. It would be similar to the reporter -- iterate over every GC thing, checking that everything looks ok. I've seen this kind of thing be implemented in other systems, mostly checking IR in compilers, and they typically find real problems.
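
As a rough illustration of the sanity-checker idea above, the following sketch walks a toy heap and flags live cells whose class pointer is null or consists entirely of a free-poison byte. Everything in it is an assumption made for illustration: ToyCell, ToyArena, checkHeap and the poison value 0x5a are hypothetical and are not taken from SpiderMonkey. The check only compares pointer bits and never dereferences a suspect pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-ins for GC structures; not SpiderMonkey types.
constexpr uint8_t kFreePattern = 0x5a;  // assumed poison byte for freed memory

struct ToyClass { const char* name; };

struct ToyCell {
    bool live;              // whether the cell is currently allocated
    const ToyClass* clasp;  // class pointer that a reporter would dereference
};

struct ToyArena { std::vector<ToyCell> cells; };

struct HeapProblem {
    size_t arena;
    size_t cell;
    std::string what;
};

// True if every byte of the pointer value equals the poison byte.
bool looksPoisoned(const void* p) {
    uintptr_t pattern = 0;
    std::memset(&pattern, kFreePattern, sizeof pattern);
    return reinterpret_cast<uintptr_t>(p) == pattern;
}

// Visit every live cell and report anything that looks wrong.
std::vector<HeapProblem> checkHeap(const std::vector<ToyArena>& arenas) {
    std::vector<HeapProblem> problems;
    for (size_t a = 0; a < arenas.size(); ++a) {
        for (size_t c = 0; c < arenas[a].cells.size(); ++c) {
            const ToyCell& cell = arenas[a].cells[c];
            if (!cell.live)
                continue;  // free cells are expected to hold garbage
            if (cell.clasp == nullptr)
                problems.push_back({a, c, "null class pointer"});
            else if (looksPoisoned(cell.clasp))
                problems.push_back({a, c, "class pointer matches free pattern"});
        }
    }
    return problems;
}
```

A real checker would need many more invariants than this one, but even a single check of this shape, run periodically, could report the corrupt cell closer to the time of corruption than a crash inside the memory reporter does.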
Flags: needinfo?(n.nethercote)

Comment 27

4 years ago
Please feel free to provide us a debug patch to collect more logs. We are seeing this issue more consistently now after many hours of stability testing so we need to resolve it asap.
Hi Nick,
Is it possible to provide a debug patch here? Thanks!
Flags: needinfo?(n.nethercote)
> Is it possible to provide a debug patch here? Thanks!

I don't have anything specific for you, sorry.
Flags: needinfo?(n.nethercote)
Hi Bobby,
Could you help find someone who can help here?
Thanks!
Flags: needinfo?(bchien)
This looks like the same crash reported in bug 1189934.

Comment 32

4 years ago
Similar crash from memory reference in bug 1189934 and bug 1132502.
Flags: needinfo?(bchien)
See Also: → 1189934, 1132502

Comment 33

4 years ago
Jason, per the research in comment 32, this looks like a similar crash in JavaScript. Could you help with this?
Flags: needinfo?(jorendorff)

Comment 34

4 years ago
Naveed, could you help with this bug? Thanks.
Flags: needinfo?(nihsanullah)
Terrence, check out Nick's comment, https://bugzilla.mozilla.org/show_bug.cgi?id=1180954#c23. Would that help us out in the future?
Assignee: nobody → terrence
Flags: needinfo?(nihsanullah)

Updated

4 years ago
Flags: needinfo?(jorendorff)
Priority: -- → P2

Comment 37

4 years ago
on a OS capable up to 3,3 Gb ram.
Comment hidden (obsolete)

Comment 39

4 years ago
I got it, Yorgos. Sorry for confusion.
Flags: needinfo?(condacum)

Comment 40

4 years ago
Revisited the minidump attachments and searched for the signature on the crash report site. No new similar crashes for a long time. Closing as WORKSFORME.
Status: REOPENED → RESOLVED
Closed: 4 years ago
Resolution: --- → WORKSFORME
Assignee

Updated

4 years ago
Duplicate of this bug: 1189934

Comment 42

4 years ago
@Bobby: I don't understand your request.
Minidump folder: empty.
I found only
https://crash-stats.mozilla.com/report/index/6aaae493-b969-41da-ae75-badf82151102
Where should I search for the signature?

Comment 44

4 years ago
(In reply to Yorgos from comment #42)
> @Bobby: I don't understand your request.
> Minidump folder: empty.
> I found only
> https://crash-stats.mozilla.com/report/index/6aaae493-b969-41da-ae75-badf82151102
> Where should I search for the signature?

For your crash (https://crash-stats.mozilla.com/report/index/6aaae493-b969-41da-ae75-badf82151102), you can copy the signature "js::detail::HashTable<T>::lookupForAdd" into the search box (top-right corner) of the crash report site to query its statistics. Let me know if you still have trouble. Thanks.

Comment 45

4 years ago
Done.
http://postimg.org/image/sgc0wl7yb/
I am not the only one with this crash; I hope for a fix.

Comment 46

4 years ago
Hi Yorgos, I am closing this bug for Firefox OS. I see you are following bug 1132502 for Firefox. Please keep track of it there. Many thanks.
Yeah, turns out this is actually a dup of bug 1132502.
Resolution: WORKSFORME → DUPLICATE
Duplicate of bug: 1132502
No longer blocks: CAF-v2.2-metabug