Comment 2

I have a membership management app that I use for Boy Scouts that is built on Silverlight, and I'm in it frequently enough that I tend to leave it open in a tab. It times out after 30 minutes of inactivity and kicks you out, and the login screen isn't Silverlight, which means the app would be frequently getting unloaded and reloaded. I don't know if that's related, but I seem to get this crash every day or two. Said app is usually in a background tab when it crashes.

Steven Michaud [:smichaud] (Retired)

Comment 11

•

11 years ago

For what it's worth, these are currently the #1 Mac topcrasher on the 31 branch and #3 on the 30 branch. They're down in the 20s and 30s on the 29 and 28 branches, and not even in the top 100 on the 27 branch. So though this problem has existed for a long time, we appear to have done something recently to make it much worse -- especially on the Mac.

status-firefox30: --- → affected

status-firefox31: --- → affected

tracking-firefox30: --- → ?

tracking-firefox31: --- → ?

Keywords: topcrash-mac

Steven Michaud [:smichaud] (Retired)

Comment 12

•

11 years ago

Could some kind QA person please find a list of URLs most closely associated with these crashes?

Whiteboard: [QA wanted]

Andrew McCreight [:mccr8]

Updated

•

11 years ago

Keywords: needURLs

Steven Michaud [:smichaud] (Retired)

Comment 13

•

11 years ago

Attached patch Potential bandaid patch (obsolete) — Details — Splinter Review

Most of the Mac crashes are null-dereferences, so this might work as a bandaid patch. Kyle, I'm asking you to review because you have hg blame for the code I'm changing. I've started an all-platform tryserver build, whose results should eventually be available here: https://tbpl.mozilla.org/?tree=Try&rev=45714bba0168

Attachment #8409191 - Flags: review?(khuey)

Steven Michaud [:smichaud] (Retired)

Comment 14

•

11 years ago

Comment on attachment 8409191 [details] [diff] [review] Potential bandaid patch

But apparently Kyle Huey is away, so I'll try Andrew McCreight.

Attachment #8409191 - Flags: review?(khuey) → review?(continuation)

Steven Michaud [:smichaud] (Retired)

Comment 15

•

11 years ago

Forgot to mention that this is where these crashes happen: http://hg.mozilla.org/mozilla-central/annotate/7fe3ee0cf8be/xpcom/base/CycleCollectedJSRuntime.cpp#l1006

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Comment 16

•

11 years ago

Comment on attachment 8409191 [details] [diff] [review] Potential bandaid patch

Review of attachment 8409191 [details] [diff] [review]:
-----------------------------------------------------------------

My vacation starts in 10 minutes, so you should request review from mccr8 again.

::: xpcom/base/CycleCollectedJSRuntime.cpp
@@ +1003,5 @@
>
>      nsISupports* wrapper = items->ElementAt(lastItemIdx);
> +    if (!wrapper) {
> +      continue;
> +    }

It seems quite surprising to me that we would get a null pointer in here, but if we do, we should still remove it from the array to avoid having to memcpy as we destroy the rest of the array. So you should use NS_IF_RELEASE rather than adding an early continue.
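The review suggestion above can be illustrated with a minimal, self-contained sketch. It is not the Mozilla code: `FakeSupports`, `ReleaseIfNonNull` and `ReleaseSlice` are hypothetical stand-ins for `nsISupports`, `NS_IF_RELEASE` and the slice-release loop, and `std::vector` stands in for `nsTArray`. The point it shows is only that a null slot is still popped from the end of the array, while the release itself is made null-safe.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a refcounted nsISupports object.
struct FakeSupports {
  int refCount = 1;
  void Release() {
    assert(refCount > 0);  // releasing a dead object would be a bug
    --refCount;
  }
};

// Null-safe release, in the spirit of NS_IF_RELEASE (assumed behavior:
// do nothing for a null pointer, otherwise call Release()).
inline void ReleaseIfNonNull(FakeSupports* aPtr) {
  if (aPtr) {
    aPtr->Release();
  }
}

// Releases up to aSliceSize entries from the end of aItems and returns
// how many slots were removed. Every slot, null or not, is popped, so the
// array only ever shrinks from the tail and nothing has to be shifted.
inline std::size_t ReleaseSlice(std::vector<FakeSupports*>& aItems,
                                std::size_t aSliceSize) {
  std::size_t removed = 0;
  while (removed < aSliceSize && !aItems.empty()) {
    FakeSupports* wrapper = aItems.back();
    aItems.pop_back();
    ReleaseIfNonNull(wrapper);
    ++removed;
  }
  return removed;
}
```

With an early `continue` placed before the removal, a null entry would skip the removal step as well; folding the null check into the release keeps the removal unconditional.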

Attachment #8409191 - Flags: review?(continuation) → review-

Steven Michaud [:smichaud] (Retired)

Comment 17

•

11 years ago

Attached patch Bandaid patch rev1 — Details — Splinter Review

Thanks, Kyle, for your suggestion.

Attachment #8409191 - Attachment is obsolete: true

Attachment #8409205 - Flags: review?(continuation)

Steven Michaud [:smichaud] (Retired)

•

11 years ago

I reproduce it pretty reliably on https://lodgemaster.oa-bsa.org/lodge/client but that requires a login. FWIW it typically happens after I let the page idle for several hours in a background tab after logging in.

Benjamin Kerensa [:bkerensa]

Comment 23

•

11 years ago

Tracking for now due to topcrash.

tracking-firefox30: ? → +

tracking-firefox31: ? → +

Steven Michaud [:smichaud] (Retired)

Comment 26

•

11 years ago

(In reply to comment #22)

Dave, could you try the following tryserver build for a day or two?

http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/smichaud@pobox.com-76e2a3694db2/try-macosx64/firefox-31.0a1.en-US.mac.dmg

It was made using my rev1 patch.

Dave Miller [:justdave]

Reporter

Comment 27

•

11 years ago

(In reply to Steven Michaud from comment #26)
> Dave, could you try the following tryserver build for a day or two?

OK, running it now, got the above page opened, and I'll let you know what happens.

Dave Miller [:justdave]

Reporter

Comment 28

•

11 years ago

Possibly of note, I was previously running Aurora, and this is a trunk build, so hopefully this crash was happening on Nightly, too.

Steven Michaud [:smichaud] (Retired)

Comment 29

•

11 years ago

> so hopefully this crash was happening on Nightly, too.

It certainly seems so. This is a Mac topcrasher on the 30 and 31 branches.

It's my hunch that the recent Mac null-dereferences have a different origin from the other crashes -- those that go back many FF versions, and many of which aren't null-dereferences. Your results should help us tell whether or not I'm right. If you don't see any of these crashes with my tryserver build, I'm likely to be right. But if you do see even one of *these* crashes (in ReleaseSliceNow()), I'm likely to be wrong.

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 30

•

11 years ago

Comment on attachment 8409205 [details] [diff] [review] Bandaid patch rev1

We can try this, but I'll look at the real fix too. This doesn't sound like a CC bug, but rather a bug in something using DeferredFinalize.

Attachment #8409205 - Flags: review?(bugs) → review+

Liz Henry (:lizzard) (relman/hg->git project)

Updated

•

11 years ago

QA Contact: lhenry

Steven Michaud [:smichaud] (Retired)

Comment 31

•

11 years ago

Comment on attachment 8409205 [details] [diff] [review] Bandaid patch rev1

Landed on mozilla-inbound: https://hg.mozilla.org/integration/mozilla-inbound/rev/bace819903bb

Carsten Book [:Tomcat]

Comment 32

•

11 years ago

https://hg.mozilla.org/mozilla-central/rev/bace819903bb

Assignee: nobody → smichaud

Status: NEW → RESOLVED

Closed: 11 years ago

status-firefox31: affected → fixed

•

11 years ago

A bunch of (supposedly) null-dereference crashes have happened in the Mac m-c nightlies for 2014-04-26 and 2014-04-27. My bandaid patch didn't fix them :-(

Status: RESOLVED → REOPENED

Resolution: FIXED → ---

Liz Henry (:lizzard) (relman/hg->git project)

Updated

•

11 years ago

Whiteboard: [QA wanted]

Liz Henry (:lizzard) (relman/hg->git project)

Comment 41

•

11 years ago

Steven let me know if there's anything you need. It still looks like a top crashing signature for Firefox 31.0a1.

Steven Michaud [:smichaud] (Retired)

Comment 42

•

11 years ago

Right now I'm waiting for my patch for bug 1002564 to get into trunk nightlies (starting with tomorrow's). It won't fix this bug. But I think it'll show that the relatively large number of recent Mac crashes, which Socorro currently reports as null-dereference crashes, aren't that at all. With luck it'll also provide some clues about the actual cause.

u279076

Comment 43

•

11 years ago

Dropping qawanted keyword from this bug as it doesn't look like there's much QA can do to help here.

From bug 1002564 comment #13 by Steven Michaud:
> ...it doesn't seem to have fixed the issue with the Socorro stacks for bug 997908:
> All the recent Mac crashes (those in builds containing the patch) are still reported as
> NULL-dereferences, even though they can't possibly be (as best I can tell).
>
> I've now thought of more tests I can run to try to get to the bottom of this
> (for example to find out if this is an Apple bug). Let me do them and get
> back before we decide either way.

I think we should block this bug until bug 1002564 is addressed.

Depends on: 1002564

Keywords: qawanted

Steven Michaud [:smichaud] (Retired)

Comment 44

•

11 years ago

justdave: Are you still seeing this bug? And if so are you willing to try another tryserver build for a day or two?

http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/smichaud@pobox.com-b93830179c78/try-macosx64/firefox-32.0a1.en-US.mac.dmg

This build contains my latest patch for bug 1002564. It doesn't fix this bug, but it *does* partly fix the problem with Breakpad falsely reporting some crashes as null-dereferences. My hope is that if/when you see this crash again with this build, you'll be able to tell us the actual crash address.

Marcia Knous [:marcia]

Comment 45

•

11 years ago

Steven: I just hit this signature running the beta. The browser crashed when Firefox was idle and I believe only one tab was open at the time: https://crash-stats.mozilla.com/report/index/df40a1d3-18f0-4fa1-b3bb-a7ab02140520 (Note: this wasn't just a plugin crash - it took the whole browser down).

Steven Michaud [:smichaud] (Retired)

Comment 46

•

11 years ago

Marcia: Since comment #6 I haven't thought this was a plugin bug, so your report doesn't surprise me. Do you see this bug often? If so, could you try out my tryserver build from comment #44 for a couple of days?

Dave Miller [:justdave]

Reporter

Comment 47

•

11 years ago

•

Edited

(In reply to Steven Michaud from comment #44)
> justdave: Are you still seeing this bug? And if so are you willing to try
> another tryserver build for a day or two?

Yes, frequently (well, every few days still). Sure, I'll give it a run.

Dave Miller [:justdave]

Reporter

Comment 48

•

11 years ago

(In reply to Steven Michaud from comment #44)
> http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/smichaud@pobox.com-b93830179c78/try-macosx64/firefox-32.0a1.en-US.mac.dmg

https://app.work.com renders as a blank page using this build. Is that specific to this build or shall I file a bug against nightly?

Andrew McCreight [:mccr8]

Comment 49

•

11 years ago

(In reply to Dave Miller [:justdave] (justdave@bugzilla.org) from comment #48)
> https://app.work.com renders as a blank page using this build. Is that
> specific to this build or shall I file a bug against nightly?

That sounds vaguely like a cache loading bug that has been very recently fixed. Try doing force reload (shift click on the reload) and see if that helps.

Steven Michaud [:smichaud] (Retired)

Comment 50

•

11 years ago

> https://app.work.com renders as a blank page using this build.

I just tried loading this page in my tryserver build (on OS X 10.8.5), and had no trouble.

Dave Miller [:justdave]

Reporter

Comment 51

•

11 years ago

Force reload didn't help (that was the first thing I tried). Were you logged in when you visited work.com? Their site is one of those ones that behaves very differently if you're logged in. I'm on OS X 10.9.3 also. I'm just running Safari for that site for the time being, so I'm okay so far. I'm told on IRC that Asa said there's a ton of sites busted on Nightly right now (should be fixed in a couple days) but that's the only one I've run into so far.

Steven Michaud [:smichaud] (Retired)

•

11 years ago

Benjamin, are you the right person to look to on this for a deeper dive into the CC code? If not, can you recommend someone?

Flags: needinfo?(sledru) → needinfo?(benjamin)

Benjamin Smedberg

Comment 63

•

11 years ago

Not me, no. If this were a CC bug, probably mccr8, but without steps we really don't know whether the bug is actually in the CC or elsewhere in the codebase. Other than asking justdave to do an ASAN/valgrind run and see if it finds something interesting, I don't think we have clear next steps here.

Flags: needinfo?(benjamin)

Andrew McCreight [:mccr8]

Comment 64

•

11 years ago

As Olli said in comment 30, it seems like the most likely cause is something misusing DeferredFinalize. Though the fact that we're hitting a null deref immediately after a null check is mysterious...

Benjamin Smedberg

Comment 65

•

11 years ago

The null isn't real, it's a crash reporting artifact. Assume that the pointer is non-null.

Andrew McCreight [:mccr8]

Comment 66

•

11 years ago

I'll try to think about something we can do.

Assignee: nobody → continuation

Steven Michaud [:smichaud] (Retired)

Comment 67

•

11 years ago

For what it's worth, I'm doing a Mac opt universal ASan build for justdave.

Steven Michaud [:smichaud] (Retired)

Comment 68

•

11 years ago

I was unable to do a universal build (for some reason the clang I need to do ASan builds won't do 32-bit builds). So here's a 64-bit only build: http://people.mozilla.org/~stmichaud/bmo/firefox-asan.dmg Please try it, justdave, and let us know your results. Some of your plugins (notably Silverlight) won't work. And it may turn out that, for arbitrary reasons, you can't reproduce the ReleaseSliceNow() crashes with this build. But it seems like you're our best hope for moving forward on this bug :-)

Steven Michaud [:smichaud] (Retired)

Comment 78

•

11 years ago

In that case we will need to leave this to be fixed (hopefully) in FF31 as we're too late to take on any speculative work in FF30 and this has no current, verified fix.

status-firefox30: affected → wontfix

tracking-firefox32: --- → +

Steven Michaud [:smichaud] (Retired)

•

11 years ago

(In reply to Steven Michaud from comment #79)
> Dave, I've got another tryserver build for you to try, if you're willing:

Running it now, will let you know.

Christian Holler (:decoder)

Comment 83

•

11 years ago

(In reply to Steven Michaud from comment #80)
>
> This leads me to suspect that, even if we can fix the build problem, 32-bit
> ASan builds will be unusable.
>
> Am I right, Christian?

Yes, I think so. The reduced stack space on 32 bit previously caused startup errors for Firefox (e.g. "Too much recursion" from the JS engine because stack space was exhausted too quickly). This typically happens in the parser because our parsing works recursively and stack frames grow really large with Clang Inlining+ASan.

We should still try to fix the build errors (or just create 64-bit only ASan builds). There is another problem on Mac though that would have to be solved too, that is bug 923916. It has to do with the fact that on Mac, ASan uses a dylib instead of statically linking the ASan runtime to the target binary.

Flags: needinfo?(choller)

Dave Miller [:justdave]

Reporter

Comment 84

•

10 years ago

Attached file Crash data from syslog — Details

Crash triggered. Syslog data attached.

https://crash-analysis.mozilla.com/hang-reports/2014/05-29/hr-20140529-880016b7-3d3b-48ff-b0bc-c28b93eb95f1.html

bp-347f536c-345b-4c80-9a0d-81aff2140529

Dave Miller [:justdave]

Reporter

Comment 85

•

10 years ago

FWIW, the event that appears to have triggered the crash was clicking a URL to a PNG file on dropbox in my IRC client, which brought Firefox to the front, hung for a minute or so, then crashed.

Steven Michaud [:smichaud] (Retired)

•

10 years ago

Yep, I'll keep trying.

Steven Michaud [:smichaud] (Retired)

Comment 90

•

10 years ago

Also, could you post Breakpad crash IDs for both of the plugin-container crashes?

Steven Michaud [:smichaud] (Retired)

Comment 91

•

10 years ago

>> bp-347f536c-345b-4c80-9a0d-81aff2140529
>
> This isn't a ReleaseSliceNow() crash. Also Firefox is running in 32-bit mode (which
> you weren't doing previously).

This crash ID is actually for the Silverlight plugin. But (apparently) you were also running two other 64-bit plugins. It's the crash IDs for those that I want to see -- particularly the one whose rbx == 0x5a5a5a5a5a5a5a5a. Also the Firefox crash ID, if you can find it. It *might* have a ReleaseSliceNow() crash.

Dave Miller [:justdave]

Reporter

Comment 92

•

10 years ago

Those are the only breakpad crashes that were dropped. The next one in the list by date is from the 25th.

Robert Kaiser

Comment 93

•

10 years ago

(In reply to Steven Michaud from comment #86)
> rbx: 0x5a5a5a5a5a5a5a5a

Hmm, that's the value we use for poisoning on freeing memory.
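To see why a register full of 0x5a bytes points at freed memory, here is a small sketch of the poisoning idea. It is an illustration only, not the allocator's actual code: `kPoisonByte`, `PoisonBuffer` and `ReadWord` are hypothetical names, and the only fact taken from the thread is that freed memory is filled with 0x5a.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Poison byte, taken from the register value quoted in the thread.
constexpr unsigned char kPoisonByte = 0x5a;

// Overwrites a buffer with the poison byte, roughly what an allocator
// that poisons on free might do to the freed block.
inline void PoisonBuffer(void* aBuffer, std::size_t aLength) {
  assert(aBuffer != nullptr);
  std::memset(aBuffer, kPoisonByte, aLength);
}

// Reads a 64-bit word out of (possibly poisoned) memory, the way stale
// code would load a pointer field from an object that was already freed.
inline std::uint64_t ReadWord(const void* aBuffer) {
  assert(aBuffer != nullptr);
  std::uint64_t word = 0;
  std::memcpy(&word, aBuffer, sizeof(word));
  return word;
}
```

Any pointer-sized field loaded from a poisoned block therefore reads as 0x5a5a5a5a5a5a5a5a on a 64-bit build, which matches the rbx value above and suggests a use-after-free rather than a null dereference.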

Steven Michaud [:smichaud] (Retired)

Comment 94

•

10 years ago

As per bug 1002564 comment #39, the latest minidump-stackwalk will display all the user-level registers for the top frame of the crashing thread. And as minidumps have contained this information all along, it just occurred to me to download a few corresponding to recent instances of these crashes. They all have the following:

rbx = 0x5a5a5a5a5a5a5a5a

So this is the crash address for these crashes.

You need special permissions to download minidumps from http://crash-stats.mozilla.com.

Here's how to download and build the latest minidump-stackwalk:

1) Install a reasonably recent XCode (I have 4.5.2) and the latest XCode commandline tools.
2) Visit https://github.com/mozilla/socorro and click "download zip".
3) CC=clang CXX=clang++ make breakpad
4) CC="clang -Wno-switch" CXX="clang++ -Wno-switch" make stackwalker

Summary: crash in ReleaseSliceNow(unsigned int, void*) → crash in ReleaseSliceNow(unsigned int, void*) at 0x5a5a5a5a5a5a5a5a

Steven Michaud [:smichaud] (Retired)

Comment 95

•

10 years ago

Something else also occurred to me: It may be possible to do a universal build that combines a 64-bit ASan build with a 32-bit "regular" build. I'll try that on Monday.

Steven Michaud [:smichaud] (Retired)

Comment 96

•

10 years ago

Dave, you don't have to keep waiting for a ReleaseSliceNow crash to happen in the build I gave you. But if I can manage to do a special ASan universal build as per comment #95, I'll post the instructions here.

Steven Michaud [:smichaud] (Retired)

Updated

•

10 years ago

Summary: crash in ReleaseSliceNow(unsigned int, void*) at 0x5a5a5a5a5a5a5a5a → crash in ReleaseSliceNow(unsigned int, void*) accessing memory at 0x5a5a5a5a5a5a5a5a

Steven Michaud [:smichaud] (Retired)

Comment 97

•

10 years ago

Comment 113

•

10 years ago

The (ICC-related) patch for bug 1023758 hasn't stopped these crashes from happening on trunk (the 33 branch). And though ICC has been turned off on the 32 (Aurora) branch as of the 2014-07-03 nightly (bug 911246 comment #20), these crashes are also still happening there. So, though mozilla::IncrementalFinalizeRunnable::Run() is on the stack, these crashes aren't (directly) related to incremental cycle collection.

Flags: needinfo?(justdave)

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 114

•

10 years ago

IncrementalFinalizeRunnable has nothing to do with the incremental cycle collector. IncrementalFinalizeRunnable is about the GC calling Release on refcounted objects after the GC has collected the relevant JS wrappers.

Dave Miller [:justdave]

Reporter

Comment 115

•

10 years ago

(In reply to Steven Michaud from comment #112)
> Dave, are you still seeing these crashes in current trunk nightlies?

I don't use Nightly on a day-to-day basis, I primarily use Aurora. I am still seeing this crash on Aurora.

Steven Michaud [:smichaud] (Retired)

Comment 116

•

10 years ago

I keep updating my Mac ASan build about once a week. I've just done so again:

http://people.mozilla.org/~stmichaud/bmo/firefox-asan.dmg
http://people.mozilla.org/~stmichaud/bmo/firefox-asan-howto.txt

These are based on current trunk code. But could you use one for a few days, Dave, to see if you can reproduce one of these crashes in it?

To launch it, you'd typically do something like this (at a Terminal prompt):

1) cd ~/Downloads
2) ASAN_SYMBOLIZER_PATH=~/Desktop/Nightly\ ASan.app/Contents/MacOS/llvm-symbolizer nohup ~/Desktop/Nightly\ ASan.app/Contents/MacOS/firefox 2>&1 | tee firefox-asan.log &

When/if you crash, you'll see an ASan log (fully symbolized) in firefox-asan.log.

Sylvestre Ledru [:Sylvestre]

Comment 117

•

10 years ago

Same as 30, wontfix for 31.

status-firefox31: affected → wontfix

status-firefox33: --- → affected

tracking-firefox33: --- → +

Dave Miller [:justdave]

Reporter

Comment 118

•

10 years ago

I finally got a good breaking point to restart my browser and try this out... It dies at launch with:

dyld: lazy symbol binding failed: Symbol not found: ___asan_init_v3
  Referenced from: /Applications/Nightly.app/Contents/MacOS/libmozglue.dylib
  Expected in: flat namespace

dyld: Symbol not found: ___asan_init_v3
  Referenced from: /Applications/Nightly.app/Contents/MacOS/libmozglue.dylib
  Expected in: flat namespace

And it leaves an OS X crash dump (which I can upload if you want it). The llvm-symbolizer is what crashes, the app itself continues running and works fine (using it to type this comment).

Dave Miller [:justdave]

Reporter

Comment 119

•

10 years ago

Apparently llvm-symbolizer crashes every time I open a new page.

Steven Michaud [:smichaud] (Retired)

•

10 years ago

I've typically had to be using Firefox about a day and a half to trigger this crash, and I can't keep this running long enough to crash it. It leaks like a sieve, and is using 14 GB of physical RAM right now (I only have 16 GB) and I've only been running it for 24 hours. As soon as I'm done typing this comment I'm restarting it to clear up the memory leak, and that may also reset my chances of reproducing the crash :(

Steven Michaud [:smichaud] (Retired)

Comment 125

•

Comment 137

•

10 years ago

Andrew - You're listed as the owner, although from the recent comments it looks like Steven may actually be driving this one. I don't have a great deal of confidence right now that this will be resolved in 32. Do we have next steps here?

Flags: needinfo?(smichaud)

Flags: needinfo?(continuation)

Andrew McCreight [:mccr8]

Comment 138

•

10 years ago

Not really. There's just not any information to go on.

Flags: needinfo?(continuation)

Steven Michaud [:smichaud] (Retired)

Comment 139

•

10 years ago

This bug is currently stuck hard in the mud. My last hope was to get someone to reproduce these crashes (or their precursor) in an ASan build. But apparently that's not going to work.

Flags: needinfo?(smichaud)

Lawrence Mandel [:lmandel] (use needinfo)

Comment 140

•

10 years ago

Comment 149

•

10 years ago

Steven, if you want to land that, feel free to do so with r=me. We should back it out at some point if it doesn't turn up anything, but it should be relatively harmless.

Andrew McCreight [:mccr8]

Comment 150

•

10 years ago

Ideally that will crash even with non-ASan builds thanks to poisoning, if the object is in fact bad there.

Steven Michaud [:smichaud] (Retired)

Comment 151

•

10 years ago

> Steven, if you want to land that, feel free to do so with r=me.

Will do. Thanks!

Crash reports for tryserver builds don't get properly symbolized, so you have to symbolize them by hand (using atos). So having this in m-c nightlies will be *much* more convenient.

I'll change the comment to indicate that the patch should come out at some point -- either when this bug gets fixed, or if the patch doesn't help fix it.

Steven Michaud [:smichaud] (Retired)

Comment 152

•

10 years ago

Attached patch Patch for testing (*not* a fix) -- should make crash stacks more informative (obsolete) — Details — Splinter Review

What I'll land on m-i.

Attachment #8492208 - Attachment is obsolete: true

Andrew McCreight [:mccr8]

Comment 153

•

•

10 years ago

Attached patch Patch for testing (*not* a fix) (obsolete) — Details — Splinter Review

•

10 years ago

Like previous releases, wontfix for 33

status-firefox33: affected → wontfix

status-firefox35: --- → affected

tracking-firefox34: --- → +

tracking-firefox35: --- → +

Liz Henry (:lizzard) (relman/hg->git project)

Comment 174

•

10 years ago

This is still the #10 topcrash on Firefox 34.0a2, though it isn't very high volume, with 64/3985 crashes in the last 7 days.

Steven Michaud [:smichaud] (Retired)

Comment 175

•

10 years ago

Attached patch Another patch for testing (*not* a fix) (obsolete) — Details — Splinter Review

Let me try my hand at this again. This patch is *very* simple and *very* ugly, but I think it'll work -- that it won't get optimized away and will crash when it needs to (giving us better stacks). I'll run this through the tryservers before landing it. But they're currently closed.

Attachment #8493487 - Attachment is obsolete: true

Attachment #8513044 - Flags: review?(continuation)

Andrew McCreight [:mccr8]

Comment 176

•

10 years ago

Comment on attachment 8513044 [details] [diff] [review] Another patch for testing (*not* a fix)

Review of attachment 8513044 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for being persistent on this.

Attachment #8513044 - Flags: review?(continuation) → review+

Steven Michaud [:smichaud] (Retired)

Comment 177

•

10 years ago

Comment on attachment 8513044 [details] [diff] [review] Another patch for testing (*not* a fix)

I've started my tryserver builds: https://tbpl.mozilla.org/?tree=Try&rev=f3e383000027

These are all-platform because some of the B2G builds are done on OS X.

Steven Michaud [:smichaud] (Retired)

Comment 178

•

•

10 years ago

Comment on attachment 8513821 [details] [diff] [review] Patch for testing (*not* a fix), rev7

Review of attachment 8513821 [details] [diff] [review]:
-----------------------------------------------------------------

Inline assembly patches are the best kind of patches.

::: xpcom/base/CycleCollectedJSRuntime.cpp
@@ +1063,5 @@
> +#if defined(XP_MACOSX) && defined(__LP64__)
> +  // We'll crash here if aSupports is poisoned (== 0x5a5a5a5a5a5a5a5a).  This
> +  // is better (more informative) than crashing in ReleaseSliceNow().  See
> +  // bug 997908.  This patch should get backed out when bug 997908 gets fixed,
> +  // or if it doesn't actually help fix that bug.

Nit: maybe "or if it doesn't actually help diagnose that bug"
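The idea behind the diagnostic patch reviewed above is to fail at the point where a poisoned pointer is queued, instead of later when it is released. A rough sketch of that idea follows. It is not the landed patch, which per the review uses inline assembly on 64-bit OS X; this version uses a plain comparison and `assert`, and `kPoisonedPointer`, `LooksPoisoned` and `AssertNotPoisoned` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// The pointer value a freed, poisoned slot would hold on a 64-bit build
// (the value from the crash reports in this bug).
constexpr std::uintptr_t kPoisonedPointer =
    static_cast<std::uintptr_t>(0x5a5a5a5a5a5a5a5aULL);

// Returns true if the pointer's bit pattern equals the poison pattern.
// The pointer is only compared, never dereferenced.
inline bool LooksPoisoned(const void* aSupports) {
  return reinterpret_cast<std::uintptr_t>(aSupports) == kPoisonedPointer;
}

// Debug-build tripwire: stop where the bad pointer is handed over, so the
// stack shows the caller rather than the later release loop.
inline void AssertNotPoisoned(const void* aSupports) {
  assert(!LooksPoisoned(aSupports) &&
         "poisoned pointer passed to deferred finalization");
}
```

A check like this only catches the case where the pointer itself is the poison value; it says nothing about a valid-looking pointer to an object that has already been freed.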

Attachment #8513821 - Flags: review?(nfroyd) → review+

Nathan Froyd [:froydnj]

Updated

•

Steven Michaud [:smichaud] (Retired)

Comment 226

•

10 years ago

Comment on attachment 8539512 [details] [diff] [review] Another test/debug patch (*not* a fix)

Oops, sorry. I already noticed a problem. New patch (and new tryserver tests) coming up.

Attachment #8539512 - Attachment is obsolete: true

Attachment #8539512 - Flags: review?(continuation)

Steven Michaud [:smichaud] (Retired)

Comment 227

•

10 years ago

Attached patch Another test/debug patch (*not* a fix) — Details — Splinter Review

Let's try this again. Here's another set of tryserver builds: https://tbpl.mozilla.org/?tree=Try&rev=adb35338f548

Attachment #8539518 - Flags: review?(continuation)

Andrew McCreight [:mccr8]

Comment 228

•

•

10 years ago

Attached patch Possible fix — Details — Splinter Review

Given what I said in comment #232, I started poking around in nsTArray code for whatever clues I might be able to find. I'm pretty sure nsTArray::SwapElements() isn't giving us trouble -- for non-auto arrays (as we're using), it just swaps pointers.

But then I noticed that nsTArray code tries to use constructors and destructors as it adds and removes elements, or moves them around. Already we've seen with my first (abortive) test patch (attachment 8492208 [details] [diff] [review]) that we aren't dealing here with "ordinary" nsISupports objects -- just adding the following two lines to CycleCollectedJSRuntime::DeferredFinalize() caused all kinds of trouble:

+  NS_IF_ADDREF(aSupports);
+  NS_IF_RELEASE(aSupports);

So I don't really think we want nsTArray to be calling constructors and destructors on the objects in CycleCollectedJSRuntime::mDeferredSupports and IncrementalFinalizeRunnable::mSupports. So why not change these nsTArrays from nsTArray<nsISupports*> to nsTArray<void*>? That's what this patch does.

I've started a full set of tryserver builds, here: https://tbpl.mozilla.org/?tree=Try&rev=c5a99eb900c9

I'm leaving tomorrow on my Christmas vacation. So even if you really like this patch, Andrew, I think it's probably best to wait to land it on the trunk until I get back (in late December or January).

Attachment #8540413 - Flags: review?(continuation)

Andrew McCreight [:mccr8]

Updated

•

10 years ago

Depends on: 1114804

Andrew McCreight [:mccr8]

•

10 years ago

I got one today in 35 during Firefox Hello call while sitting back and chatting. Lots of tabs in the background. https://crash-stats.mozilla.com/report/index/f72e5bc6-7c63-487a-890c-2df6e2150117

Steven Michaud [:smichaud] (Retired)

Updated

•

10 years ago

Comment 239

•

10 years ago

As of the patch for bug 1114804 (which landed on trunk on 2015-03-10 and first appeared in the 2015-03-11 mozilla-central nightly), the code where these crashes happen has disappeared. As best I can tell, that code has been replaced by DeferredFinalizerImpl::DeferredFinalize(), here:

https://hg.mozilla.org/mozilla-central/annotate/cf1060d8ce9f/dom/bindings/BindingUtils.h#l2910

The big difference is that the "new" code uses nsTArray::RemoveElementsAt() once instead of nsTArray::RemoveElementAt() repeatedly. And that seems to have been enough to fix this bug. I see no crashes at all at DeferredFinalizerImpl::DeferredFinalize() in the last week on the 39 branch.

Steven Michaud [:smichaud] (Retired)

Updated

•

10 years ago

Status: REOPENED → RESOLVED

Closed: 11 years ago → 10 years ago

Resolution: --- → FIXED

Andrew McCreight [:mccr8]

Comment 240

•

10 years ago

Great! Thanks for looking through the crash stats.

Keywords: leave-open

Steven Michaud [:smichaud] (Retired)

Comment 241

•

10 years ago

(Following up comment #239)

> The big difference is that the "new" code uses
> nsTArray::RemoveElementsAt() once instead of
> nsTArray::RemoveElementAt() repeatedly.

Actually another (and likely more important) difference is that ReleaseSliceNow() called Release() on each element it removed, but DeferredFinalizerImpl::DeferredFinalize() never calls Release(). Could the calls to Release() in ReleaseSliceNow() have been redundant? And could that have been the cause of bug 997908?

Andrew McCreight [:mccr8]

Comment 242

•

10 years ago

The old code was an array of raw pointers, while the new code is an array of smart pointers, so it will be automatically released when things are removed.
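The difference described above, an array of raw pointers with explicit Release() calls versus an array of smart pointers, can be sketched in standard C++. This is an analogy under stated assumptions, not the Gecko implementation: `std::shared_ptr` and `std::vector` stand in for Gecko's smart pointers and `nsTArray`, and `Tracked` and `RemoveTail` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical object that records its own destruction in a counter.
struct Tracked {
  explicit Tracked(int* aCounter) : mCounter(aCounter) { assert(mCounter); }
  ~Tracked() { ++*mCounter; }
  int* mCounter;
};

// Removes up to aCount owning pointers from the end of the array in one
// call. No explicit release is written here: destroying each removed
// smart pointer drops its reference automatically.
inline std::size_t RemoveTail(std::vector<std::shared_ptr<Tracked>>& aItems,
                              std::size_t aCount) {
  const std::size_t n = aCount < aItems.size() ? aCount : aItems.size();
  aItems.erase(aItems.end() - static_cast<std::ptrdiff_t>(n), aItems.end());
  return n;
}
```

Because ownership lives in the element type, there is no separate release step that can be forgotten or run twice; removing a range and releasing its objects are the same operation.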

Andrew McCreight [:mccr8]

Updated

•

9 years ago

Updated

•

7 years ago

Blocks: 1045992

plugin-container_2014-04-17-134244_Dave-Millers-MacBook-Pro.crash 11 years ago Dave Miller [:justdave] 65.00 KB, text/plain		Details
plugin-container_2014-04-17-134244-1_Dave-Millers-MacBook-Pro.crash 11 years ago Dave Miller [:justdave] 70.73 KB, text/plain		Details
Potential bandaid patch 11 years ago Steven Michaud [:smichaud] (Retired) 1.01 KB, patch	khuey : review-	Details | Diff | Splinter Review
Bandaid patch rev1 11 years ago Steven Michaud [:smichaud] (Retired) 1.00 KB, patch	smaug : review+	Details | Diff | Splinter Review
Mozconfig for 64-bit ASan build 11 years ago Steven Michaud [:smichaud] (Retired) 1019 bytes, text/plain		Details
Crash data from syslog 10 years ago Dave Miller [:justdave] 2.98 KB, text/plain		Details
Mozconfig for universal ASan build 10 years ago Steven Michaud [:smichaud] (Retired) 1.82 KB, text/plain		Details
llvm-dump.txt 10 years ago Dave Miller [:justdave] 33.67 KB, text/plain		Details
Patch for testing (not a fix) 10 years ago Steven Michaud [:smichaud] (Retired) 853 bytes, patch		Details | Diff | Splinter Review
Patch for testing (not a fix) -- should make crash stacks more informative 10 years ago Steven Michaud [:smichaud] (Retired) 970 bytes, patch	mccr8 : review+	Details | Diff | Splinter Review
Patch for testing (not a fix) 10 years ago Steven Michaud [:smichaud] (Retired) 1.36 KB, patch		Details | Diff | Splinter Review
Another patch for testing (not a fix) 10 years ago Steven Michaud [:smichaud] (Retired) 1.19 KB, patch	mccr8 : review+	Details | Diff | Splinter Review
Patch for testing (not a fix) 10 years ago Steven Michaud [:smichaud] (Retired) 1.17 KB, patch		Details | Diff | Splinter Review
Patch for testing (not a fix), rev7 10 years ago Steven Michaud [:smichaud] (Retired) 1.22 KB, patch	froydnj : review+	Details | Diff | Splinter Review
Add basic thread checking to incremental finalize. 10 years ago Andrew McCreight [:mccr8] 6.17 KB, patch		Details | Diff | Splinter Review
Another test/debug patch (not a fix) 10 years ago Steven Michaud [:smichaud] (Retired) 2.67 KB, patch		Details | Diff | Splinter Review
Another test/debug patch (not a fix) 10 years ago Steven Michaud [:smichaud] (Retired) 2.68 KB, patch	mccr8 : review+	Details | Diff | Splinter Review
Possible fix 10 years ago Steven Michaud [:smichaud] (Retired) 5.67 KB, patch	mccr8 : review-	Details | Diff | Splinter Review