Closed
Bug 1157343
Opened 9 years ago
Closed 9 years ago
TSan: data race image/src/ProgressTracker.cpp:384 SyncNotify
Categories
(Core :: Graphics: ImageLib, defect)
Tracking
()
RESOLVED
FIXED
mozilla40
Tracking | Status | |
---|---|---|
firefox40 | --- | fixed |
People
(Reporter: froydnj, Assigned: seth)
References
(Blocks 1 open bug)
Details
(Whiteboard: [tsan])
Attachments
(2 files)
28.26 KB, text/plain
7.13 KB, patch (tnikkel: review+)
The attached logfile shows a thread/data race detected by TSan (ThreadSanitizer).

* Specific information about this bug

Looks like we're accessing mImage across threads without locks. There's a second race included in the log where TSan detects that we're racing on the vtable of RasterImage, which is probably its own brand of fun and crashes. There's also a host of other races following after these two in the TSan log in calls from SyncNotify; I'm not going to file races for those, since I suspect they'd all have the same root cause as this one.

* General information about TSan, data races, etc.

Typically, races reported by TSan are not false positives, but it is possible that the race is benign. Even in this case though, we should try to come up with a fix unless this would cause unacceptable performance issues. Also note that seemingly benign races can possibly be harmful (also depending on the compiler and the architecture) [1][2]. If the bug cannot be fixed, then this bug should be used to either make a compile-time annotation for blacklisting or add an entry to the runtime blacklist.

[1] http://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong
[2] _How to miscompile programs with "benign" data races_: https://www.usenix.org/legacy/events/hotpar11/tech/final_files/Boehm.pdf
Assignee | ||
Comment 1•9 years ago
|
||
I think the two races are really the same thing. In theory we could get away with just using an Atomic<Image*> for mImage, but I feel like we're playing a bit fast-and-loose with a weak pointer in this code. I'm just going to use a mutex instead, so we can take a strong reference atomically, which will help me sleep at night. I'm not convinced this code is hot enough that it's worth preferring an atomic.
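To illustrate the trade-off described in this comment, here is a minimal sketch of a mutex-guarded weak pointer that hands out strong references. It is not the patch attached to this bug: the Tracker and Image classes are hypothetical stand-ins, and std::mutex, std::weak_ptr and std::shared_ptr are assumed here in place of Gecko's own mutex and refcounting types.

```cpp
#include <cassert>
#include <memory>
#include <mutex>

// Hypothetical stand-in for the image type; not the real ImageLib class.
struct Image {
  int width = 0;
};

// Hypothetical tracker that holds a non-owning (weak) pointer to its image.
class Tracker {
public:
  // Called by whichever thread makes the image available.
  void SetImage(const std::shared_ptr<Image>& aImage) {
    std::lock_guard<std::mutex> lock(mMutex);
    mImage = aImage;
  }

  // Called from any thread. Reading the weak pointer and taking a strong
  // reference happen under the same lock, so a concurrent SetImage() cannot
  // interleave between the two steps. Returns nullptr if the image is gone.
  std::shared_ptr<Image> GetImage() const {
    std::lock_guard<std::mutex> lock(mMutex);
    return mImage.lock();
  }

private:
  mutable std::mutex mMutex;    // Guards mImage.
  std::weak_ptr<Image> mImage;  // Weak: does not keep the image alive.
};
```

With a bare atomic raw pointer, the load and the reference-count increment would be two separate steps, and the object could in principle be destroyed in between. Holding the lock across both steps is what closes that window in this sketch.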
Assignee | ||
Updated•9 years ago
|
Assignee: nobody → seth
Status: NEW → ASSIGNED
Assignee | ||
Comment 3•9 years ago
|
||
BTW, I think it's worth noting that I suspect the reason this happens is ultimately because of this NotifyListener() call in imgLoader::LoadImage(), and the analogous one in imgLoader::LoadImageWithChannel():

https://dxr.mozilla.org/mozilla-central/source/image/src/imgLoader.cpp#2336

This ends up creating an AsyncNotifyRunnable to do the notification. If this is an additional load for an image which is already loading, then it's possible for AsyncNotifyRunnable::Run() (where we'll touch mImage via ProgressTracker::SyncNotify) to race with imgRequest::OnDataAvailable() (where we'll touch mImage via ProgressTracker::SetImage).
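The interleaving described here can be sketched with two threads: one plays the role of the notification runnable and reads the image, the other plays the role of the data-available path and sets it. Everything in this sketch is a hypothetical simplification (the Tracker class, the id field, RunBothPaths) and it uses std::thread and std::mutex rather than Gecko's threading primitives. The lock is what makes either ordering safe.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical stand-in for the image type.
struct Image {
  int id = 0;
};

// Hypothetical tracker; both entry points take the same lock.
class Tracker {
public:
  // Stand-in for the path that installs the image.
  void SetImage(std::shared_ptr<Image> aImage) {
    std::lock_guard<std::mutex> lock(mMutex);
    mImage = std::move(aImage);
  }

  // Stand-in for the notification path. Copies the pointer under the lock,
  // then uses the copy. Returns the image id, or -1 if no image is set yet.
  int SyncNotify() const {
    std::shared_ptr<Image> image;
    {
      std::lock_guard<std::mutex> lock(mMutex);
      image = mImage;
    }
    return image ? image->id : -1;
  }

private:
  mutable std::mutex mMutex;      // Guards mImage.
  std::shared_ptr<Image> mImage;
};

// Runs both paths concurrently and returns what the notifier observed.
int RunBothPaths() {
  Tracker tracker;
  int seen = 0;
  std::thread notifier([&] { seen = tracker.SyncNotify(); });
  std::thread loader([&] {
    auto image = std::make_shared<Image>();
    image->id = 42;
    tracker.SetImage(std::move(image));
  });
  notifier.join();
  loader.join();
  // Either ordering is valid; what matters is that no torn state is seen.
  assert(seen == -1 || seen == 42);
  return seen;
}
```

Without the lock, the read in SyncNotify() and the write in SetImage() would be an unsynchronized access to the same member from two threads, which is the kind of access TSan reports.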
Updated•9 years ago
|
Attachment #8596224 -
Flags: review?(tnikkel) → review+
Assignee | ||
Comment 4•9 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=82803b3040db
Assignee | ||
Comment 5•9 years ago
|
||
Thanks for the review! Try also looks good, so I think we're ready to push this.
Assignee | ||
Comment 6•9 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/5f97bf645c13
Comment 7•9 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/5f97bf645c13
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
status-firefox40:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla40
Comment 8•9 years ago
|
||
Backed out for various tp crashes.
https://hg.mozilla.org/integration/mozilla-inbound/rev/0c438bdbc45f
https://treeherder.mozilla.org/logviewer.html#?job_id=9262137&repo=mozilla-inbound
https://treeherder.mozilla.org/logviewer.html#?job_id=9268382&repo=mozilla-inbound
https://treeherder.mozilla.org/logviewer.html#?job_id=9260538&repo=mozilla-inbound
Status: RESOLVED → REOPENED
status-firefox40:
fixed → ---
Resolution: FIXED → ---
Target Milestone: mozilla40 → ---
Comment 9•9 years ago
|
||
Merge of backout: https://hg.mozilla.org/mozilla-central/rev/0c438bdbc45f

FWIW, I've seen more crashes in various talos suites on the other branches where the backout hasn't landed yet and not one has had an identical signature.
Assignee | ||
Comment 10•9 years ago
|
||
I investigated this failure, and it pointed to a further _serious_ problem - in my opinion, more serious than the TSan issue.

We use ProgressTrackerInit to register the image with its ProgressTracker _while the image's constructor is running_. ProgressTracker can then both internally touch the refcount on the image and hand out a reference to the image via ProgressTracker::GetImage(). This is totally unsafe, because if we take such a reference and then drop it while the constructor is still running, the image's refcount can hit zero before it's been assigned to an nsRefPtr. We can end up running its destructor before the constructor even returns. This is very bad news.

I'm going to file another bug to fix that issue. Fixing it should make the patch in this bug safe.
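The refcount arithmetic behind this hazard can be sketched as follows. This is a simplified model, not ImageLib code: RefCounted, Image, TouchImage and the two Register functions are all hypothetical, and Release() only reports that the count reached zero instead of actually deleting the object, so the sketch stays well defined.

```cpp
#include <cassert>

// Hypothetical intrusive refcount. The count starts at zero, modelling a
// freshly constructed object that no owning pointer holds yet.
class RefCounted {
public:
  void AddRef() { ++mRefCnt; }

  // Returns true if this release dropped the count to zero, which is the
  // point where real refcounted code would destroy the object.
  bool Release() {
    assert(mRefCnt > 0);
    return --mRefCnt == 0;
  }

private:
  int mRefCnt = 0;
};

struct Image : RefCounted {};

// Hypothetical tracker operation that briefly takes and drops a reference.
// Returns true if dropping it would have destroyed the image.
bool TouchImage(Image& aImage) {
  aImage.AddRef();
  return aImage.Release();
}

// Models registering while no owner holds a reference yet (for example,
// from inside the constructor): the count goes 0 -> 1 -> 0.
bool RegisterBeforeOwnerHoldsRef() {
  Image image;
  return TouchImage(image);
}

// Models registering only after an owner has taken its reference:
// the count goes 1 -> 2 -> 1, so the temporary reference is harmless.
bool RegisterAfterOwnerHoldsRef() {
  Image image;
  image.AddRef();                         // Owner's reference.
  bool wouldDestroy = TouchImage(image);
  image.Release();                        // Owner lets go at the end.
  return wouldDestroy;
}
```

In this model, RegisterBeforeOwnerHoldsRef() returns true and RegisterAfterOwnerHoldsRef() returns false. Deferring registration until an owner holds a reference is one general way to avoid the problem; whether the follow-up bug took that exact approach is not stated here.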
Assignee | ||
Comment 11•9 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=bcab57198d1b
Assignee | ||
Comment 12•9 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=de2118aac2ee
Assignee | ||
Comment 13•9 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=1fdb62f5beb1
Assignee | ||
Comment 14•9 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=6cc7b68737a0
Assignee | ||
Comment 15•9 years ago
|
||
OK, looks like tp5o is nice and green in those tests, so I think this should be safe to reland now that bug 1159409 has landed.
Assignee | ||
Comment 16•9 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/33d085b0ca40
Comment 17•9 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/33d085b0ca40
Status: REOPENED → RESOLVED
Closed: 9 years ago → 9 years ago
status-firefox40:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla40