stylo: Intermittent REFTEST PROCESS-CRASH | Main app process exited normally | application crashed [@ DynamicAtom::GCAtomTableLocked(mozilla::BaseAutoLock<mozilla::Mutex> const &,DynamicAtom::GCKind)]

RESOLVED FIXED in Firefox 57

Status

Product: Core
Component: XPCOM
Priority: P3
Severity: critical
Status: RESOLVED FIXED
Reported: 11 months ago
Last modified: 10 months ago

People

(Reporter: Treeherder Bug Filer, Assigned: bholley)

Tracking

Keywords: crash, intermittent-failure
Version: unspecified
Target Milestone: mozilla57
Points: ---

Firefox Tracking Flags

(firefox-esr52 unaffected, firefox55 unaffected, firefox56 wontfix, firefox57 fixed)

Details

(Whiteboard: [stockwell unknown], crash signature)

Duplicate of this bug: 1388711

Comment 2

11 months ago
20 failures in 179 pushes (0.112 failures/push) were associated with this bug yesterday.   

Repository breakdown:
* autoland: 17
* mozilla-inbound: 2
* try: 1

Platform breakdown:
* windows7-32-stylo: 10
* linux64-stylo: 7
* windows10-64-stylo: 2
* macosx64-stylo: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1388632&startday=2017-08-10&endday=2017-08-10&tree=all

Comment 3

11 months ago
37 failures in 901 pushes (0.041 failures/push) were associated with this bug in the last 7 days. 

This is the #41 most frequent failure this week.  

** This failure happened more than 30 times this week! Resolving this bug is a high priority. **

** Try to resolve this bug as soon as possible. If unresolved for 2 weeks, the affected test(s) may be disabled. ** 

Repository breakdown:
* autoland: 32
* mozilla-inbound: 3
* try: 1
* mozilla-central: 1

Platform breakdown:
* windows7-32-stylo: 19
* linux64-stylo: 9
* windows10-64-stylo: 5
* macosx64-stylo: 4

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1388632&startday=2017-08-07&endday=2017-08-13&tree=all

Updated

11 months ago
Component: JavaScript Engine → XPCOM

Comment 4

10 months ago
42 failures in 949 pushes (0.044 failures/push) were associated with this bug in the last 7 days. 

This is the #37 most frequent failure this week.  

** This failure happened more than 30 times this week! Resolving this bug is a high priority. **

** Try to resolve this bug as soon as possible. If unresolved for 2 weeks, the affected test(s) may be disabled. ** 

Repository breakdown:
* autoland: 25
* try: 6
* mozilla-central: 6
* mozilla-inbound: 3
* mozilla-beta: 2

Platform breakdown:
* windows7-32-stylo: 14
* linux64-stylo: 12
* macosx64-stylo: 10
* windows10-64-stylo: 6

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1388632&startday=2017-08-14&endday=2017-08-20&tree=all

Comment 5

10 months ago
:froydnj, I see you are listed as the triage owner for the XPCOM component. I assume this frequent intermittent is filed in the right Bugzilla component; could you help find an owner to look into this? It seems to be a stylo-specific crash.
Flags: needinfo?(nfroyd)
Whiteboard: [stockwell needswork]

Comment 6

10 months ago
(In reply to Joel Maher ( :jmaher) (UTC-5) from comment #5)
> :froydnj, I see you are listed as the triage owner for the XPCOM component.
> I assume this frequent intermittent is filed in the right Bugzilla
> component; could you help find an owner to look into this? It seems to be a
> stylo-specific crash.

Chris, do you know who could look into this intermittent?
Flags: needinfo?(nfroyd) → needinfo?(cpeterson)

Comment 7

10 months ago
Xidorn, do you think this GCAtom leak assertion could be related to the JSRuntime and cairo leaks in bug 1384701?
Flags: needinfo?(cpeterson) → needinfo?(xidorn+moz)
Priority: P5 → P2
Summary: Intermittent REFTEST PROCESS-CRASH | Main app process exited normally | application crashed [@ DynamicAtom::GCAtomTableLocked(mozilla::BaseAutoLock<mozilla::Mutex> const &,DynamicAtom::GCKind)] → stylo: Intermittent REFTEST PROCESS-CRASH | Main app process exited normally | application crashed [@ DynamicAtom::GCAtomTableLocked(mozilla::BaseAutoLock<mozilla::Mutex> const &,DynamicAtom::GCKind)]

Comment 8

10 months ago
I have no idea.

Looking at the code before the assertion, it seems to me that, if we are indeed leaking dynamic atoms, it should produce a message about what atoms are leaked? [1]

Not sure why that doesn't work. If that code works properly, but doesn't show any leaking dynamic atoms, the only explanation would be that gUnusedAtomCount is still bogus at shutdown time. In that case, this is unrelated to any real leak. Otherwise, no idea.


[1] https://searchfox.org/mozilla-central/rev/89e125b817c5d493babbc58ea526be970bd3748e/xpcom/ds/nsAtomTable.cpp#438-462
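
For readers without the tree handy, here is a minimal, hypothetical sketch of the logic described above: a sweep that frees zero-refcount dynamic atoms, reports still-referenced dynamic atoms at shutdown, and asserts that the number it freed matches the running gUnusedAtomCount bookkeeping. All names and types here are stand-ins; this is not the nsAtomTable.cpp code linked in [1].

// Hypothetical, simplified model of the GC pass described above. The names
// (Atom, gUnusedAtomCount, GCKind) loosely mirror the real ones, but this is
// not the actual nsAtomTable.cpp implementation.
#include <atomic>
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>
#include <unordered_map>

struct Atom {
  std::string mString;
  std::atomic<uint32_t> mRefCnt{0};
};

// Running count of atoms believed to have dropped to refcount zero.
static std::atomic<int32_t> gUnusedAtomCount{0};

enum class GCKind { RegularOperation, Shutdown };

// Sweep the table and free atoms whose refcount is zero. At shutdown, any
// dynamic atom that still has references is reported as a leak, and the
// number of atoms actually freed is checked against the running counter.
static void GCAtomTable(std::unordered_map<std::string, Atom*>& aTable,
                        GCKind aKind) {
  int32_t removedCount = 0;
  for (auto it = aTable.begin(); it != aTable.end();) {
    Atom* atom = it->second;
    if (atom->mRefCnt.load() == 0) {
      delete atom;
      it = aTable.erase(it);
      ++removedCount;
    } else {
      if (aKind == GCKind::Shutdown) {
        // The "message about what atoms are leaked" mentioned above.
        fprintf(stderr, "leaked atom: %s (refcnt=%u)\n", atom->mString.c_str(),
                atom->mRefCnt.load());
      }
      ++it;
    }
  }
  if (aKind == GCKind::Shutdown) {
    // The assertion in the crash signature is of this flavor: the bookkeeping
    // counter must agree with what the sweep actually found.
    assert(removedCount == gUnusedAtomCount.load() &&
           "gUnusedAtomCount is out of sync with the atom table");
  }
  gUnusedAtomCount.fetch_sub(removedCount);
}

If the reporting loop runs but prints nothing while the assertion still fires, that points at the counter rather than at a real leak, which matches the second scenario above.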
Flags: needinfo?(xidorn+moz)

Comment 9

10 months ago
23 failures in 908 pushes (0.025 failures/push) were associated with this bug in the last 7 days.   

Repository breakdown:
* try: 9
* autoland: 6
* mozilla-central: 4
* mozilla-inbound: 2
* oak: 1
* mozilla-beta: 1

Platform breakdown:
* windows7-32-stylo: 9
* linux64-stylo: 7
* windows10-64-stylo: 5
* macosx64-stylo: 2

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1388632&startday=2017-08-21&endday=2017-08-27&tree=all
(Assignee)

Comment 10

10 months ago
IIUC, this assertion isn't about leaking, but rather an accounting inconsistency within the atom table. We've tried to fix that in the past, but it doesn't seem to have fully worked. Unless I'm missing something, the solution would presumably be in the atom code, not in any consumer.

This would be stylo-specific only to the extent that we're doing more OMT atom stuff in stylo. It's not clear to me that this blocks shipping stylo, though it would certainly be good to come up with an answer here.
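
To make "accounting inconsistency" concrete, here is a hypothetical sketch (not the real DynamicAtom code) of the kind of bookkeeping that is hard to keep exact once atoms are addrefed and released from multiple threads: the refcount transition and the counter update are two separate atomic operations, so concurrent AddRef/Release calls can interleave in ways the counter doesn't faithfully track.

// Hypothetical illustration of the bookkeeping hazard; names are stand-ins
// for the real DynamicAtom refcounting, not copies of it.
#include <atomic>
#include <cstdint>

static std::atomic<int32_t> gUnusedAtomCount{0};

struct DynamicAtomModel {
  std::atomic<uint32_t> mRefCnt{1};

  void AddRef() {
    // A previous value of 0 means the atom was "unused" and is live again,
    // so undo the increment that Release() performed (or will perform).
    if (mRefCnt.fetch_add(1) == 0) {
      gUnusedAtomCount.fetch_sub(1);
    }
  }

  void Release() {
    // A previous value of 1 means this release took the count to zero,
    // so the atom is now eligible for the next GC pass.
    if (mRefCnt.fetch_sub(1) == 1) {
      gUnusedAtomCount.fetch_add(1);
    }
  }
};

// One problematic interleaving (two threads, one atom with refcount 1):
//
//   style worker: Release()  mRefCnt 1 -> 0   (its counter increment is pending)
//   main thread:  AddRef()   mRefCnt 0 -> 1   decrements gUnusedAtomCount (-1)
//   style worker:            increments gUnusedAtomCount (back to 0)
//
// Every step is individually atomic, but the pair (refcount transition,
// counter update) is not, so anything that reads or adjusts the counter in
// such a window -- e.g. a concurrent GC pass -- can act on a stale value and
// leave the counter out of step with the table.

Whether this exact interleaving is what's happening here is an open question; the sketch just shows why a fix would live in the atom code rather than in any consumer, as noted above.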
Priority: P2 → P3

Comment 11

10 months ago
1 failure in 939 pushes (0.001 failures/push) was associated with this bug in the last 7 days.

Repository breakdown:
* mozilla-beta: 1

Platform breakdown:
* linux64-stylo: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1388632&startday=2017-08-28&endday=2017-09-03&tree=all
Whiteboard: [stockwell needswork] → [stockwell unknown]

Comment 12

10 months ago
This doesn't seem to happen on trunk anymore, presumably because of bug 1393632 and bug 1383332, which make us less parallel. The point at which this intermittent stopped happening also coincides with when bug 1393632 landed, so parallel traversal is likely the reason we hit this.

In that case, it means we are still cumulatively messing up gUnusedAtomCount, because otherwise this assertion shouldn't trigger at shutdown, by which point no style thread should be running.
(Assignee)

Comment 13

10 months ago
This is almost certainly the same issue as bug 1397052.
Depends on: 1397052

Comment 14

10 months ago
I believe this is the same issue as bug 1397052, so closing this as well.
Status: NEW → RESOLVED
Last Resolved: 10 months ago
Resolution: --- → FIXED
Assignee: nobody → bobbyholley
status-firefox55: --- → unaffected
status-firefox56: --- → wontfix
status-firefox57: --- → fixed
status-firefox-esr52: --- → unaffected
Target Milestone: --- → mozilla57

Comment 15

10 months ago
1 failure in 924 pushes (0.001 failures/push) was associated with this bug in the last 7 days.

Repository breakdown:
* try: 1

Platform breakdown:
* linux64: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1388632&startday=2017-09-04&endday=2017-09-10&tree=all