Closed Bug 1532803 Opened 5 years ago Closed 5 years ago

Give Js::HelperThread tasks a common interface

Categories

(Core :: JavaScript Engine, enhancement, P2)

enhancement

Tracking

()

RESOLVED FIXED
mozilla68
Tracking Status
firefox68 --- fixed

People

(Reporter: KrisWright, Assigned: KrisWright)

References

Details

Attachments

(1 file)

All js::HelperThread tasks have different task types and handle their tasks based on that type, as seen in HelperThread::TaskSpecs [1]. We want all of these task types to inherit from a common interface which XPCOM thread pools will eventually be able to understand. Each inheriting task will then define a method to handle their particular workload.

[1] https://searchfox.org/mozilla-central/rev/3e0f1d95fcf8832413457e3bec802113bdd1f8e8/js/src/vm/HelperThreads.cpp#2397

Priority: -- → P2

Added 'RunnableTask' struct to Utility.h to unify HelperThread task types in a way that can be exposed to XPCOM thread pools. Re-implemented tasks within native HelperThreads using their runnableTask method.

Blocks: 1539270

Sorry about the delay reviewing this patch, but I'm in two minds about this approach.

There's a problem with using a class with virtual methods and virtual destructor for tasks that run on another thread: if an object of such a class gets deleted then C++ will write to the vtable pointer without any synchronisation while calling the destructors for the class. This can lead to subtle race conditions / undefined behvaiour.

Here's more description of the problem: https://github.com/google/sanitizers/wiki/ThreadSanitizerPopularDataRaces#data-race-on-vptr

This is the reason why GCParallelTask was changed to use a function pointer in bug 1457703 (it originally used a virtual method) after Helgrind reported this problem as occurring in the original code.

Making this change opens up the possibility of people thinking they can synchronise inside virtual destructors and breaking things. Probably it will be fine as long as we're careful. But I'd like it if there was some way we could ensure this problem can't happen.

(In reply to Jon Coppeard (:jonco) from comment #2)

Sorry about the delay reviewing this patch, but I'm in two minds about this approach.

There's a problem with using a class with virtual methods and virtual destructor for tasks that run on another thread: if an object of such a class gets deleted then C++ will write to the vtable pointer without any synchronisation while calling the destructors for the class. This can lead to subtle race conditions / undefined behvaiour.

Here's more description of the problem: https://github.com/google/sanitizers/wiki/ThreadSanitizerPopularDataRaces#data-race-on-vptr

This is the reason why GCParallelTask was changed to use a function pointer in bug 1457703 (it originally used a virtual method) after Helgrind reported this problem as occurring in the original code.

Making this change opens up the possibility of people thinking they can synchronise inside virtual destructors and breaking things. Probably it will be fine as long as we're careful. But I'd like it if there was some way we could ensure this problem can't happen.

Thanks so much for the info. What would be a viable workaround for this problem? All I can think of is only taking the common interface from a point where any joining destructor is final, which it looks like would require defining more handler functions. I've also toyed with exposing a wrapper object to xpconnect for each type, which only holds and handles a HelperThread task in such a way that XPCOM thread pools can use without modifying the existing offthread tasks, if that would be preferable.

(In reply to Kristen Wright :KrisWright from comment #3)

I think what you've done makes sense and realised after I wrote this that we probably do this elsewhere in the browser already.

Steve, can we make the hazard analysis detect problematic use of synchronisation from virtual destructors? (NI for visibility of what I asked you on email). If so then I'll r+ this patch.

Flags: needinfo?(sphink)

Yay, I'm back on bugzilla.

I think the hazard analysis can check this, though after reading through the description of the problem it's not totally clear to me what the checks should be. What functions should be considered to be problematic synchronization primitives? Condvar waits, certainly, and releasing mutexes. Spinlocks? Do unlocks count? And then what exceptions would we need? It seems useful to check for the benign case mentioned in that link, where you don't subclass the class with synchronization in its destructor -- but even there, I don't know if that matters in our code base. (But generally, it seems like the problem is overriding a destructor that does synchronization.)

At the moment, the analysis really only computes the set of functions that can reach a GC. It would need a separate pass to compute what can reach these various synchronization calls. And each of these will require its own set of annotations to handle the unknown cases (like function pointers). Hopefully calls through function pointers should be relatively uncommon from destructors.

I do have code that at least generalizes the basic notion of decorating a known set of functions with "feature bits", and then doing a whole-callgraph traversal to propagate them everywhere to compute reachability. Once I switch to that, then the basic static reachability will just be a matter of giving synchronization functions a feature bit of their own similar to the "GC Call" feature bit, and it'll compute "can reach a synchronization primitive" globally at the same time as it does the "Can GC" (aka "can reach a GC call") pass. But again, that doesn't handle function pointers.

There's also the question of whether tsan is good enough to check this already. But it's a dynamic analysis, so I guess it won't always see the problem.

Flags: needinfo?(sphink)

(In reply to Steve Fink [:sfink] [:s:] from comment #5)

Filed bug 1543029 to investigate creating an analysis.

Keywords: checkin-needed

Pushed by apavel@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/aacb0c9ecddb
Common interface for JSThreadPool tasks r=jonco

Keywords: checkin-needed
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla68
Assignee: nobody → kwright
Blocks: 1556861
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: