Closed Bug 968956 Opened 10 years ago Closed 10 years ago

[DMD] crash in mozilla::dmd::ClearReports

Categories

(Core :: DMD, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla30

People

(Reporter: erahm, Assigned: erahm)

Details

Attachments

(2 files, 1 obsolete file)

Attached file stack trace
I'm seeing sporadic crashes when dumping memory info on a DMD enabled build on my buri devices. The crash itself is in mozilla::dmd::ClearReports(), AFAICT this looks like a multi-threaded access of a hash table issue.
Good catch!

  // This lock must be held while manipulating global state, such as
  // gStackTraceTable, gBlockTable, etc.
  static Mutex* gStateLock = nullptr;

And it looks like ClearReports() does not get the lock.

Care to write a patch? It should be easy. Use the RAII AutoLockState class. And while you're doing that can you check that the lock is held for other uses of those global variables? Thanks!
Assignee: nobody → erahm
Sure thing, I'll get a patch together and check the rest of the usages.
Bonus points if you create a wrapper class around js::HashSet that asserts if gStateLock is not held during accesses, and convert the global hash tables to use it :)
Attached patch Initial patch (obsolete) — Splinter Review
Moved the core ClearReports logic to an unsafe (internal) function. The external interfaces now properly acquires the state lock. Properly checking that the current thread actually owns the mutex can probably be addressed in a separate bug.
Attachment #8372010 - Flags: review?(n.nethercote)
Also of note: the DMD tests compiled, ran, and passed on my OS X machine. I also looked through for any other unlocked accesses to the has table and only found one case (which I don't think would have been an issue, but went ahead and added a lock anyhow).
Comment on attachment 8372010 [details] [diff] [review]
Initial patch

Review of attachment 8372010 [details] [diff] [review]:
-----------------------------------------------------------------

Did you run the test?

Looks good. My only thought is if "Unsafe" is the best suffix; it's vague. Maybe ClearReportsNoLock()? I'll let you decide.
Attachment #8372010 - Flags: review?(n.nethercote) → review+
(In reply to Nicholas Nethercote [:njn] from comment #6)
> Comment on attachment 8372010 [details] [diff] [review]
> Initial patch
> 
> Review of attachment 8372010 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Did you run the test?
> 
> Looks good. My only thought is if "Unsafe" is the best suffix; it's vague.
> Maybe ClearReportsNoLock()? I'll let you decide.

Tests were run (see Comment 5). Maybe just use| ClearReportsInternal| like SizeOf does?
Comment on attachment 8372010 [details] [diff] [review]
Initial patch

Review of attachment 8372010 [details] [diff] [review]:
-----------------------------------------------------------------

Ok, ClearReportsInternal() sounds ok, since we have precedent.

::: memory/replace/dmd/DMD.cpp
@@ +2078,3 @@
>    if (!gIsDMDRunning) {
>      return;
>    }

Oh wait, this test should be in ClearReports(), not ClearReportsInternal().
Attached patch Renamed functionSplinter Review
Switch from Unsafe to Internal and moved DMD running check to ClearReports
Attachment #8372631 - Flags: review?(n.nethercote)
Attachment #8372010 - Attachment is obsolete: true
Comment on attachment 8372631 [details] [diff] [review]
Renamed function

Review of attachment 8372631 [details] [diff] [review]:
-----------------------------------------------------------------

Since I gave r+ on the previous patch, you didn't need to ask again for review -- unless you really wanted it -- and you could have carried over the r+ yourself. But no matter.

I'll land this on Monday morning. Thanks.
Attachment #8372631 - Flags: review?(n.nethercote) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/b667bd72d909

Eric: I added "DONTBUILD because DMD is NPOTB." after the first line of the commit message. DMD (unfortunately) isn't tested on TBPL. The presence of "DONTBUILD" anywhere in the commit message tells TBPL not to run tests for this push. We do this for patches that are NPOTB ("not part of the build") in order to save machine time.
https://hg.mozilla.org/mozilla-central/rev/b667bd72d909
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla30
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: