Closed Bug 575544 Opened 14 years ago Closed 13 years ago

Thread Safepoints

Categories

(Tamarin Graveyard :: Virtual Machine, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: siwilkin, Assigned: siwilkin)

References

Details

Attachments

(2 files, 15 obsolete files)

73.79 KB, patch
Details | Diff | Splinter Review
58.41 KB, patch
Details | Diff | Splinter Review
Due to the possibility of distinct regions of a thread's stack being managed by different GCs, the safepointing system does not consider the safety of individual threads, rather, regions on a thread's stack. Hence, a SafepointRecord describes the bounds and safepointing information for a single stack region. Each thread can have any number of non-overlapping SafepointRecords; the 'current' SafepointRecord of a thread describes the region up to the thread's current stackpointer. SafepointManager objects maintain a set of references to SafepointRecords which cover regions from any number of thread's stacks. Multiple SafepointManager objects may be alive in a process, however, their sets of managed SafepointRecords must be mutually exclusive.

To request a safepoint, a thread calls a SafepointManager's requestSafepoint() method and passes in a callback. Requests to a SafepointManager are serialized, however, requests to different SafepointManagers may proceed in parallel (including the case where a thread's stack is segmented between multiple SafepointManagers).  The callback is dispatched on the requesting thread, and the following conditions are guaranteed to hold throughout its execution:

   1) All threads whose current SafepointRecord is managed by the SafepointManager will be suspended at a safepoint. Each thread's registers will be flushed to the stack.
   
   2) All non-current SafepointRecords managed by the SafepointManager will remain non-current. In other words, threads will continue to execute unless they unwind into a stack region managed by a SafepointManager with an ongoing safepoint request.
   
   3) No new SafepointRecords will be added to the SafepointManager until after the callback.
   
The patch includes a safepoint-aware implementation of the RAII locker for WaitNotifyMonitors (see bug 555765).
Attached patch Initial patch (obsolete) — Splinter Review
Depends on: 555765
There are now 2 types of safepoint-aware RAII-lockers available for WaitNotifyMonitors: the original version that is not Exception aware, and a new version that will automatically unlock and then rethrow any exception that is thrown out of the critical section. (Clearly this refers to the AVM's exception mechanism, not the unused C++ one).

The general form of the RAII-locking macros are:

SCOPE_LOCK_SP(monitor) {
    // Critical section
}

and

SCOPE_LOCK_SP_RETHROW(monitor, core) {
    // Critical section.
    // Any exceptions thrown out of the critical section will
    // unlock the monitor.
}
Attachment #454804 - Attachment is obsolete: true
Blocks: asymmGC
Updates in this patch:

- Added initial profiling / debug support to safepoint requests
- Safepoint tasks are now dispatched via the Runnable interface, rather than a function pointer.
Attachment #457237 - Attachment is obsolete: true
Blocks: 600723
Assignee: nobody → siwilkin
Flags: flashplayer-qrb+
Depends on: 609837
Depends on: 611232
Attached patch Main patch. Latest. (obsolete) — Splinter Review
The documentation still needs some work, but time is pressing, so I'm requesting to start the reviews now.
Attachment #470449 - Attachment is obsolete: true
Attachment #493935 - Flags: review?(lhansen)
Attachment #493935 - Flags: review?(fklockii)
Attachment #493935 - Flags: review?(edwsmith)
Attached patch Initial selftest (obsolete) — Splinter Review
As the safepointing code does not get properly exercised until the thread-safe GC work lands, the self tests must demonstrate/document safepoint usage, as well as verifying the implementation.

Currently this patch performs only one test, but it does show the simplest pattern that is used throughout the GC work.

More tests will be added asap.
Attachment #493936 - Flags: feedback?(lhansen)
Attachment #493936 - Flags: feedback?(fklockii)
Attachment #493936 - Flags: feedback?(edwsmith)
Comment on attachment 493935 [details] [diff] [review]
Main patch. Latest.

(fyi for the reviewers: if you want to apply this patch and build it, you'll need to also apply the patches on the dependent bugs...)
(In reply to comment #6)
> Comment on attachment 493935 [details] [diff] [review]
> Main patch. Latest.
> 
> (fyi for the reviewers: if you want to apply this patch and build it, you'll
> need to also apply the patches on the dependent bugs...)

At Dan's and Ed's suggestion, I have made a feature-branched clone of TR, named tr-asymm, that is a suitable target for applying the patches on this ticket itself.

You can use this as the base for applying the patches:
  http://asteam.corp.adobe.com/hg/users/fklockii/tr-asymm/rev/76fd6b88114b

If you'd prefer to jump in with the patches applied, you can use:
  http://asteam.corp.adobe.com/hg/users/fklockii/tr-asymm/rev/74106539c6bc
This revised selftest patch has got much better documentation and, hopefully, clearly demonstrates safepoint usage and mechanism.

I'm also continuing to add docs to https://zerowing.corp.adobe.com/display/~siwilkin/Safepoints
Attachment #493936 - Attachment is obsolete: true
Attachment #495307 - Flags: review?(lhansen)
Attachment #495307 - Flags: review?(fklockii)
Attachment #495307 - Flags: review?(edwsmith)
Attachment #493936 - Flags: feedback?(lhansen)
Attachment #493936 - Flags: feedback?(fklockii)
Attachment #493936 - Flags: feedback?(edwsmith)
Fixed a bug in the ProducerConsumer selftest.

I'll stop re-adding the review requests to the selftests, as I assume they're going to keep being updated for a while.
Attachment #495307 - Attachment is obsolete: true
Attachment #495307 - Flags: review?(lhansen)
Attachment #495307 - Flags: review?(fklockii)
Attachment #495307 - Flags: review?(edwsmith)
Comment on attachment 493935 [details] [diff] [review]
Main patch. Latest.

I'm not a domain experts so it's hard for me to review the code very carefully, but I have looked at the overall structure, API documentation, etc - more a superreview I guess.

In general the documentation is not quite adequate.  One issue is that there's a reference to an Adobe-internal wiki, which won't do.  VM documentation must go in text or html files in the doc/ directory.  Thus I've not followed that link and only looked at what's here.

The big piece of documentation that appears to be missing here would address how SafepointRecords are created, and who's responsible for doing that.

I'm on the fence but I'm giving it an R- because I think that as an API it needs more work on the documentation and I think that that rework needs a re-review.  We'll live with this API for a long time.

Some specific comments (lots of niggly details, nothing major):

Style: the house constructor style is better

  MyClass::MyClass()
      : f1(v1)
      , f2(v2) 
  {
      code goes here
  }

than what you have (note both initializer list formatting and brace placement).

Style: /most/ code in the VM does not use Java-style function bracing but always places the opening brace on the line below.  But it's not uniform (esp for inline functions) and I'm not really objecting, just noting.

SafePointRecord::setAsClear: is it really a good idea to test !isClear() here?  In most situations clearing a cleared object is OK.

SafePointRecord::isClear does not test the condition established by setAsClear.

Safepoint.h:

Before landing the documentation now on the wiki must be moved into the doc directory and linked from the header file (Tamarin is open-source, we can't point to an internal wiki, and keeping docs far away from the code is a bad idea anyway).  If there's a rev of the patch it should incorporate that documentation.

I'm a little concerned about the scoped-initialization form used in if statements, eg in SCOPE_LOCK_SP, but if all evidence is that all compilers of interest handle this then I'm OK with it.

Names like "reason_QTY" need documentation, debug code or not. (Quantity? Quality?  Why an abbreviation in the first place? Abbreviations are almost always bad.)

class SafePointManager: house style is open brace on line below "class", "public:", etc aligned with the "class" and "{" lines, eg, 

  class Fnord
  {
  public:
      virtual void doIt();
  }

Documentation for requestSafePoint, "it" on penultimate line is ambiguous (the thread? which thread? the safepoint?)  Do the last three lines add much information?

Documentation for pollSafePoint, presumably "imply ... will" should be "require ... must"?  Furthermore, "task" should be "request's task"? Does the information about register dumping add any information here, given that that information is part of the general documentation?

Documentation for inSafePointTask, "Tests is this thread is current" should be "Tests _if_ this thread is current_ly_".

Documentation for addRecord, a bit of ambiguity here, what happens with currently executing safepoint tasks?  From other docs it looks like addRecord might block, might be useful to note that.

Documentation for removeRecord, "herewith" => "hereafter", "henceforth", "subsequently"?

I see "Qty" rears its head again in the variables section here, pls document and/or expand.

SafepointRecord docn - the comment about safepoint operations referring to SafepointRecords, "*not* threads", is confusing.   There's apparently a mapping from a safepointrecord to a unique thread, ergo, an operation on a safepointrecord is also an operation on a specific thread.  I bring this up only because I misread the comment as suggesting that my operation on the
safepointrecord could map (in some manner) to various threads.  I think the underlying problem here is that safepointrecords are not well documented.

Suggest you remove the constructor documentation and move the comment about the linked list into the safepointrecord class documentation, where it is more useful.

hasCurrent documentation "it's" => "its".

"FIXME" in documentation for SafepointableMutexLocker.

In requestSafepoint(), should two instances of "friend" be "friendly"?
Attachment #493935 - Flags: review?(lhansen) → review-
Attachment #493935 - Flags: review?(edwsmith)
(i'll do a proper review today, so that Simon can incorporate feedback from both sources.)
Comment on attachment 493935 [details] [diff] [review]
Main patch. Latest.

R- because I should be part of the next round of reviews.

Nit: Find out whether the appropriate practice is to put the #ifdef
VMCFG_SAFEPOINTS into a header or to instead guard the #include of the
header.  I briefly tried to survey the code base to determine what
team practice is here, but that failed to yield conclusive result.

I'd prefer to read/write a procedure call form of SAFEPOINT_POLL_FAST
(that would presumably be inlined into nothingness when the feature is
disabled) rather than leave it as a macro.  (Or maybe just make the
macro not use all-caps, since it has the semantics of a procedure call
when it is present.)  I might be in the minority on this point though;
to my knowledge the team does not have a strict policy on this sort of
thing.

I do not understand the SafepointTaskDebug class or its 'reason'
constructor parameter.
- I skimmed through later code from our patch queue but cannot find
  uses of it, though there are hints in the SafepointDebugging class.
- Are these for your personal use?  (If so, please remove from the
  patch.)
- (If they are meant for non-personal use, then I'm not sure if its
  such a hot idea to interchangably typedef SafepointTaskDebug or
  Runnable as SafepointTaskBase when they do not have identical
  interfaces.)
- I suspect that even if they are meant to be useful to a broader
  audience, you may be better off teasing them out of this initial
  patch and leaving them for a later review...
- (I'm going to ignore the SAFEPOINT_DEBUGGING portions in the
  remainder of the patch)

In the spec for SafepointManager::pollSafepointPostExplicit(), it says
"must be called when a thread returns into safepointable code after it
has been declared explicitly safe" -- when you say "must" here, do you
mean before performing any operations that might violate the
task-constraints?  Or do you mean it must be called immediately after
a call to pollSafepoint returns?  Or does this method effectively
perform a pollSafepoint as well?
- (I guess I am still trying to process when threads are expected to
   be suspended while a safepoint-task is run to completion, versus
   when the other threads under the same SafepointManager can continue
   doing unrelated work... )
- Since it says "must", the method sounds important -- but then why is
  the pollSafepointPostExplicit method not illustrated on the wiki
  pages -- are its uses hidden behind one of your other abstractions?
  If so, please point that out.
- If, on the other hand, this method is really just meant to
  interoperate with SafepointRecord::setAsSafeExplicit (and is not
  used except in concert with the latter), then maybe the best thing
  would be to include a pointer to that method from
  SafepointManager::pollSafepointPostExplicit.

It seems bizarre terminologically that one meaning of
Status_SAFE_EXPLICIT is that the thread is at an /implicit/ safepoint.
Are there unrelated meanings of explicit/implicit at play here?  Or???

Should the SafepointRecord::Status_FOO's be encoded as an enum?  Or is
there a reason they must be explicitly typed as a uintptr_t?  (Perhaps
for proper interaction with the AtomicOps primitives?)
- Ah, I see later on that you are doing bit twiddling with these
  things.
- You could still make it an enum though
- Either way, add a note saying that space of Status values is limited to 0..3.
- (A short description of legal Status transitions would also be good;
   they are now only encoded as assertions in the code, I think; though perhaps
   it is as simple as saying that only clear <-> non-clear is allowed.)

In the implementation of SafepointManager::requestSafepoint, you have
some inlined assembly code for the IA32 pause instruction... is this a
sign of another abstraction we should be adding to VMPI rather than
putting it here?  For example we probably should be using the
intrinsic form if it is available, rather than inline assembly...

Nit: miscellaneous paragraphs with extra spaces between words,
questionable selection of line breaks, and/or other small typos.  See:
- below "Implicit safepoints (safepoint-aware locking/waiting/sleeping):"
- above SafepointManager::requestSafepoint declaration
- above SafepointableMutexLocker example: s/contented by/contended by/
  (not sure the latter is proper english, but it seems closer to your intent)
Attachment #493935 - Flags: review?(fklockii) → review-
Depends on: 622608
Attached patch Post-review API changes (obsolete) — Splinter Review
This is a thorough re-working of the safepointing API:

- The public APIs of SafepointManager and SafepointRecord have been greatly simplified.
- SafepointRecord now only has 2 states (described by an enum). The bit twiddling in the composite field has been removed.
- SafepointableMutexLocker and SafepointableMonitorLocker have been folded into templatized versions of MutexLocker and MonitorLocker.
- Explicit pre and post safepoint operations have been moved into a RAII pattern with the SafepointGate class.
- The Safepoint task debugging define/API has been removed. To be added in a later patch.

This is a WIP patch (some of the new API requires updated docs).
Attachment #493935 - Attachment is obsolete: true
Added the NestedProducerConsumerTest selftest to exercise nested and recursive SafepointManager entry.
Attachment #495373 - Attachment is obsolete: true
(In reply to comment #10)
> Comment on attachment 493935 [details] [diff] [review]
> Main patch. Latest.
> In general the documentation is not quite adequate.  One issue is that there's
> a reference to an Adobe-internal wiki, which won't do.  VM documentation must
> go in text or html files in the doc/ directory.  Thus I've not followed that
> link and only looked at what's here.

I'll move it into doc/safepoint.txt

> The big piece of documentation that appears to be missing here would address
> how SafepointRecords are created, and who's responsible for doing that.

This will be documented in doc/safepoint.txt

> I'm on the fence but I'm giving it an R- because I think that as an API it
> needs more work on the documentation and I think that that rework needs a
> re-review.  We'll live with this API for a long time.

The revised API is much simpler, and hopefully, the docs will be too.

> Some specific comments (lots of niggly details, nothing major):
> 
> Style: the house constructor style is better
> 
>   MyClass::MyClass()
>       : f1(v1)
>       , f2(v2) 
>   {
>       code goes here
>   }
> 
> than what you have (note both initializer list formatting and brace placement).
> 
> Style: /most/ code in the VM does not use Java-style function bracing but
> always places the opening brace on the line below.  But it's not uniform (esp
> for inline functions) and I'm not really objecting, just noting.

The whole patch has been updated to house style.
My New Year's resolution is to conform from now on!

> SafePointRecord::setAsClear: is it really a good idea to test !isClear() here? 
> In most situations clearing a cleared object is OK.

This is removed and not-applicable to the revised API

> SafePointRecord::isClear does not test the condition established by setAsClear.

This is removed and not-applicable to the revised API

> Safepoint.h:
> 
> Before landing the documentation now on the wiki must be moved into the doc
> directory and linked from the header file (Tamarin is open-source, we can't
> point to an internal wiki, and keeping docs far away from the code is a bad
> idea anyway).  If there's a rev of the patch it should incorporate that
> documentation.

I'll move it into doc/safepoint.txt

> I'm a little concerned about the scoped-initialization form used in if
> statements, eg in SCOPE_LOCK_SP, but if all evidence is that all compilers of
> interest handle this then I'm OK with it.

All compilers seem to be happy with this. There was a discussion of the merits of the style in the comments of bug 555765. The consensus was generally positive. 

> Names like "reason_QTY" need documentation, debug code or not. (Quantity?
> Quality?  Why an abbreviation in the first place? Abbreviations are almost
> always bad.)

reason_QTY is removed in the new API, but I'll do this from now on...

> class SafePointManager: house style is open brace on line below "class",
> "public:", etc aligned with the "class" and "{" lines, eg, 
> 
>   class Fnord
>   {
>   public:
>       virtual void doIt();
>   }

Fixed to house style

> Documentation for requestSafePoint, "it" on penultimate line is ambiguous (the
> thread? which thread? the safepoint?)  Do the last three lines add much
> information?

This and all other documentation points are either fixed in the new WIP patches, or will be before the next review.
(In reply to comment #12)
> Comment on attachment 493935 [details] [diff] [review]
> Main patch. Latest.
> 
> R- because I should be part of the next round of reviews.
> 
> Nit: Find out whether the appropriate practice is to put the #ifdef
> VMCFG_SAFEPOINTS into a header or to instead guard the #include of the
> header.  I briefly tried to survey the code base to determine what
> team practice is here, but that failed to yield conclusive result.

Currently Safepoints.h needs unconditional inclusion as it conditionally defines macros with #ifdef VMCFG_SAFEPOINTS that are used unconditionally elsewhere in the VM and GC.

E.g. we have the equivalent of:
#ifdef VMCFG_SAFEPOINTS
#define SCOPE_LOCK_SP(_m_) if (vmbase::MutexLocker<true> __locker</snip>
#else
#define SCOPE_LOCK_SP(_m_) SCOPE_LOCK(_m_) 
#endif


> I'd prefer to read/write a procedure call form of SAFEPOINT_POLL_FAST
> (that would presumably be inlined into nothingness when the feature is
> disabled) rather than leave it as a macro.  (Or maybe just make the
> macro not use all-caps, since it has the semantics of a procedure call
> when it is present.)  I might be in the minority on this point though;
> to my knowledge the team does not have a strict policy on this sort of
> thing.

In the revised API, SAFEPOINT_POLL_FAST and SAFEPOINT_POLL do more work, and are probably best left as macros.

> I do not understand the SafepointTaskDebug class or its 'reason'
> constructor parameter.
> - I skimmed through later code from our patch queue but cannot find
>   uses of it, though there are hints in the SafepointDebugging class.
> - Are these for your personal use?  (If so, please remove from the
>   patch.)
> - (If they are meant for non-personal use, then I'm not sure if its
>   such a hot idea to interchangably typedef SafepointTaskDebug or
>   Runnable as SafepointTaskBase when they do not have identical
>   interfaces.)
> - I suspect that even if they are meant to be useful to a broader
>   audience, you may be better off teasing them out of this initial
>   patch and leaving them for a later review...
> - (I'm going to ignore the SAFEPOINT_DEBUGGING portions in the
>   remainder of the patch)

I've removed it from this patch.

> In the spec for SafepointManager::pollSafepointPostExplicit(), it says
> "must be called when a thread returns into safepointable code after it
> has been declared explicitly safe" -- when you say "must" here, do you
> mean before performing any operations that might violate the
> task-constraints?  Or do you mean it must be called immediately after
> a call to pollSafepoint returns?  Or does this method effectively
> perform a pollSafepoint as well?
> - (I guess I am still trying to process when threads are expected to
>    be suspended while a safepoint-task is run to completion, versus
>    when the other threads under the same SafepointManager can continue
>    doing unrelated work... )
> - Since it says "must", the method sounds important -- but then why is
>   the pollSafepointPostExplicit method not illustrated on the wiki
>   pages -- are its uses hidden behind one of your other abstractions?
>   If so, please point that out.
> - If, on the other hand, this method is really just meant to
>   interoperate with SafepointRecord::setAsSafeExplicit (and is not
>   used except in concert with the latter), then maybe the best thing
>   would be to include a pointer to that method from
>   SafepointManager::pollSafepointPostExplicit.


Explicit pre and post safepoint operations have been moved into a RAII
pattern with the SafepointGate class.

> It seems bizarre terminologically that one meaning of
> Status_SAFE_EXPLICIT is that the thread is at an /implicit/ safepoint.
> Are there unrelated meanings of explicit/implicit at play here?  Or???

It's just an unfortunate convention that has been removed in the revised API. There is now only SP_SAFE and SP_UNSAFE.

> Should the SafepointRecord::Status_FOO's be encoded as an enum?  Or is
> there a reason they must be explicitly typed as a uintptr_t?  (Perhaps
> for proper interaction with the AtomicOps primitives?)
> - Ah, I see later on that you are doing bit twiddling with these
>   things.
> - You could still make it an enum though
> - Either way, add a note saying that space of Status values is limited to 0..3.
> - (A short description of legal Status transitions would also be good;
>    they are now only encoded as assertions in the code, I think; though perhaps
>    it is as simple as saying that only clear <-> non-clear is allowed.)

The bit-twiddling has been removed and the status is now an enum (as stated previously, only with the states SP_SAFE and SP_UNSAFE)

> In the implementation of SafepointManager::requestSafepoint, you have
> some inlined assembly code for the IA32 pause instruction... is this a
> sign of another abstraction we should be adding to VMPI rather than
> putting it here?  For example we probably should be using the
> intrinsic form if it is available, rather than inline assembly...

Added to VMPI as VMPI_spinloopPause()
See bug 622608

> Nit: miscellaneous paragraphs with extra spaces between words,
> questionable selection of line breaks, and/or other small typos.  See:
> - below "Implicit safepoints (safepoint-aware locking/waiting/sleeping):"
> - above SafepointManager::requestSafepoint declaration
> - above SafepointableMutexLocker example: s/contented by/contended by/
>   (not sure the latter is proper english, but it seems closer to your intent)

Fixed in the new WIP patches, or will be before the next review.
Attached patch Revised API and docs (obsolete) — Splinter Review
Attachment #500985 - Attachment is obsolete: true
Attached patch Selftests. Latest. (obsolete) — Splinter Review
Attachment #500986 - Attachment is obsolete: true
Attached patch Latest API and docs (obsolete) — Splinter Review
Attachment #502869 - Attachment is obsolete: true
Current patches are rev 215 of:
http://asteam.corp.adobe.com/hg/users/fklockii/tr-patch-queues
Attachment #502870 - Flags: superreview?(lhansen)
Attachment #502870 - Flags: review?(kpalacz)
Attachment #502870 - Flags: review?(fklockii)
Attachment #502917 - Flags: superreview?(lhansen)
Attachment #502917 - Flags: review?(kpalacz)
Attachment #502917 - Flags: review?(fklockii)
The below is slightly edited and copied over from bug 623233.

Felix:

.."From GC.h:

  // A safepoint cannot be called while holding this lock

I'm not sure about the phrase "safepoint cannot be called" -- do you apply the verb "call" to safepoints elsewhere in your documentation?  I think I've seen "request", "execute", and "reach" as verbs for it, but not "call" as I recall.)"

Simon:

The word 'safepoint' is very overloaded. It can be used in *at least* the following three ways:
1) noun. A static code location that is 'safe' with respect to some task.
2) noun. The dynamic property of a group of threads all reaching some 'safe' state. Example usage: "the VM is at a safepoint", "the GC's mutators are in a safepoint", "let's request a safepoint", "a safepoint cannot be called while holding this lock".
3) verb. "To safepoint". I.e. to cause 2) to happen.

In Felix's example above, 'call' is used in the context of meaning 2, as is 'request'. 'Reach' a safepoint could be either meaning 1 or 2. 'Execute' a safepoint is probably a mistake and should be 'execute a safepoint task'.

In the revised safepointing API I've tried to only use 'safepoint' in context of meaning 1, except when describing a 'safepoint task' when it is used in the context of meaning 2. For most of the meaning 2 contexts I have written "threads have been safepointed", or "threads have all reached a safepoint (location)". Meaning 3 is written as "submit/request a SafepointTask".
Please note that contrary to my comments above, all of the safepointing docs live in Safepoint.h. I hope they are concise enough not to require a home in doc/
In the past (and regarding a separate ticket, I think), I had asked Simon if template flags could be enums rather than booleans, so that they would be readable at the point of use.  In this context, the question is whether this:

+#ifdef VMCFG_SAFEPOINTS
+    template <bool SAFEPOINT_AWARE>
+#endif
     class MutexLocker

should become something more like:


+#ifdef VMCFG_SAFEPOINTS
+    template <safepoint_aware_t SAFEPOINT_AWARE>
+#endif
     class MutexLocker

where safepoint_aware_t is an two-value enum defined elsewhere.

If the expectation is that clients can be expected to remember what the meanings of MutexLocker<true> versus MutexLocker<false>are, then obviously the boolean can stay.

But as far as I know there is not a technical reason for it to be a boolean rather than an enum with named elements.  Thoughts?
Oh, Comment 23 was prompted by this note from Simon: "I'll pre-empt your comment about having a bool template parameter rather than an enum for the Mutex/MonitorLockers. Bah."

I do not know whether the "Bah" is directed at the idea itself or at the uninteresting effort involved in the translation.  But Simon definitely knows that I would raise the issue.
+1 for enum vs bool; increased readability FTW!
(In reply to comment #24)
> I do not know whether the "Bah" is directed at the idea itself or at the
> uninteresting effort involved in the translation.  But Simon definitely knows
> that I would raise the issue.

Bah for the realization that your previous comment on this issue (which was on IRC IIRC) was probably before I posted this patch. So I could have changed it already.
Having re-read the docs after a few week's break, I think the section 'Attempted acquisition of locks held by a safepointed thread' is confusing and doesn't really explain what's happening. Let me know your thoughts.
Comment on attachment 502917 [details] [diff] [review]
Latest API and docs

R+.  I'm not sure I can fit all of the pieces of this into my head at
once, so I'm a little nervous about my R+.  But I also don't have any
suggestions for how to simplify the API or its presentation.  The
abstraction makes sense, I probably just need a higher level view of
what an isolated well-written code example using this API looks like.
(The MMgc changes to support AsymmGC do not count as *isolated*).

(Just in case its not obvious: usually when I have a question in the
 review below, if the answer does not immediately follow from
 something I overlooked in the doc, then I am implicitly asking you to
 revise the doc to incorporate the answer to my question or make it
 clearer.)

== VMThread.* ==

* Above MonitorLocker, you have a very similar body of text in
comments as you have above MutexLocker; *so* similar that I think you
incorrectly say "MutexLocker" in two spots where I think you instead
should say "MonitorLocker" ...?  But better still might be to follow a
DontRepeatYourself (DRY) principle and collapse both pieces of text
into separate documentation in Safepoint.h, and just leave a pointer
to that documentation above MutexLocker and MonitorLocker?

* On that note, the dangling "and" in here can't be right:
"See Safepoint.h and for further details."

* One last note: Above MonitorLocker and MutexLocker, you say "See
Safepoint.h for further details."  In Safepoint.h you say, "See the
MutexLocker class for full details" and "See the MonitorLocker class
for full details."  I dislike the reference cycle here.  Is there
really useful info that you're trying to point the reader to here?  If
so, that's a good reason to put the core documentation into one file
under docs/ rather than a header: it discourages this kind of
convoluted cross-referencing structure.  But this issue might get
resolved just by following the DRY suggestion I wrote above.

* If I'm in the midst of debugging, I can't just blindly replace
SCOPE_LOCK's with SCOPE_LOCK_SP and expect to pay solely some amount
of extra performance-overhead, right?  Or can I do that without worry
about e.g. new opportunities for deadlock?  The documentation
discussing the SCOPE_LOCK macros says "The SCOPE_LOCK macros defined
by VMThread.h create a safepoint-unaware MutexLocker.", but I cannot
determine if safepoint-unawareness is sometimes a necessity to avoid
deadlock, or if it is just a performance boost.  The documentation in
Safepoint.h similarly points out conditions where one needs to be in
the scope of a SafepointGate when acquiring a lock, but it still is
unclear whether it would be bad to always use the safepoint-aware
forms.  I'd probably be satisfied with just a note pointing out the
answer to this question in the spot(s) where you mention the
SCOPE_LOCK macros.

* nit: repeating from an earlier review: "contented by ... threads" is
probably not what you want to say.  Either convince me I am wrong, or
put in "contended by ... threads" (or better yet, find the 100%
correct english for this, rather than my 80% attempt).


== Safepoint.h ==

* Safepoint.h doc:

  "Note, however, that a thread does not have to be suspended to
  be 'safe'; a thread could be 'safe' by remaining in library
  code for the task's duration. A thread that is either suspended
  at a safepoint or 'safe' by some other means is 'safepointed'."

I think it would be worthwhile to establish the terminology for the
two cases above in the text above, to avoid later confusion.  I
*think* that the suspending case corresponds to what you later call
"explicit safepoints", and the safe-by-other-means case corresponds to
"implicit safepoints."  If this is true, then I suggest adding
parenthetical references to the terms at appropriate points above.

(If my hypothesis is false and the distinction between "implicit" and
"explicit" means something else, then invent appropriate terms (eg
"suspending" vs. "non-suspending"), insert parentheticals in the text
to act as effective definitions for the terms, and add an explanation
of the difference that I was missing between the two pairs of terms
when I made the hypothesis.)

* Woot, I like the pictures!  (I'm a visual guy.)

* Safepoint.h doc:

 "Note that SafepointRecords that are not top-most on a thread's
  stack are always SP_SAFE."

Is this invariant enforced, or is this a requirement for the client of
the API to uphold?  (It looks like its enforced by the Safepoint
implementation from reading the code.)

* Safepoint.h doc:

  "no threads are allowed to enter or leave a SafepointManager's
  context (i.e. push or pop a StackRecord)"

You mean "SafepointRecord" here, right?

* Safepoint.h doc nit: I don't mind the abbreviations "ctor"
  and "dtor" in code comments, but they seem out of place in the
  middle of these large blocks of pure English text.

* Safepoint.h doc nit:

  "Only implicit safepoints take true advantage of the RAII
   pattern"

"take true advantage of" may be better phrased as "make full use of""

* Safepoint.h doc: I dislike the use of "safepoint" as a verb, as in
  "Explicit safepoints are the code locations where a thread may
  safepoint itself."  I know Simon explained the meaning of this in
  comment 21, but he also said in comment that he was going to get rid
  of uses of "safepoint" as a verb.

* Safepoint.h doc: 

  "The solution here is to prepare the sleeping thread to fulfill
  safepointing constraints before it goes to sleep. This is
  achieved through the RAII pattern of a SafepointGate..."

This is illustrated quite nicely in the implementation of the
VMThread::sleep() procedure (VMCFG_SAFEPOINTS version), right?  I'm
wondering if a cross-reference from here to there would be
worth-while.  (Maybe not; could fall victim to code/comment-rot over
time.)

* Safepoint.h doc:

 "If a thread, T2, which is running in the same context as T1's
  SafepointManager reaches L1 before being safepointed, ..."

When you say "reaches L1", do you mean "attempts to acquire L1"?

* Safepoint.h doc uber-nit: did you actually set your line width
  to 82 columns to edit this, or was that random?

* SafepointManager::requestSafepointTask doc:

  "Makes all threads safepointed with respect to this
   SafepointManager before executing the given SafepointTask on
   the calling thread."

should this instead say "Blocks until all threads managed by this
SafepointManager are in safepoints, then executes the given
SafepointTask on the calling thread."?  (The main distinction I
am thinking of is that threads not managed by this
SafepointManager won't get safepointed, right? Or is there a
technicality here along the lines of unmanaged threads are *by
definition* safepointed with respect to this SafepointManager, in
the sense that they must be "safe" or else they would have been
managed??)

== Safepoint-inlines.h ==

* Why have the (void*) cast in
  SafepointHelper_WaitNotifyMonitor::wait(WaitNotifyMonitor&) but
  no (void*) cast on the analogous argument in
  SafepointHelper_WaitNotifyMonitor::wait(WaitNotifyMonitor&,int32_t)
  ?

* MonitorLocker<SAFEPOINT_AWARE>::MonitorLocker(WaitNotifyMonitor&)
Consider changing or expanding upon the comments to make it clear that
the dynamic scope of the safepoint *ends* once the lock is acquired.
(The code looks fine, and its consistent with the relevant note in
Safepoint.h, but it took me a while to remember this detail when I was
reviewing the code; I think the comment "Only prepare to safepoint if
we don't get the lock immediately" led me astray here, because I
initially interpreted it as talking about a broader scope than just
the lock acquisition.  Perhaps just it rephrasing as "Only pass
through SafepointGate if ..." would suffice.)
Attachment #502917 - Flags: review?(fklockii) → review+
(In reply to comment #28)
> The abstraction makes sense, I probably just need a higher level view of
> what an isolated well-written code example using this API looks like.
> (The MMgc changes to support AsymmGC do not count as *isolated*).

(Simon reminds me that there is still the Safepoint Selftests to review; they may yet serve as the higher level example I mused about here.)
Depends on: 629435
Comment on attachment 502917 [details] [diff] [review]
Latest API and docs

vmbase/safepoint.h:

Documentation block: Generally very readable and clear, but it breaks down a couple of places.  Specifically I'm currently unclear on the following:

- What is the meaning of the sentence "Note that these macros must be placed in code locations that are 'safe' with respect to any SafepointTask that can be submitted to a SafepointManager polled from the macro." ?  I think I'm confused because (by earlier definitions) threads, not code locations, are "safe".  

- The discussion in the subsection "Attempted acquisition of locks held by a safepointed thread" is not clear.  For one there is the matter of the use of the undefined term "safepointable thread"; safepoints are code locations, not threads, by earlier definitions.  I suspect an example would be helpful here.

The last two paragraphs of the doc block appear to be dangling; the penultimate sets something up: "Only for those locks where the following hold:" but that thought never seems to be completed.

Some of the documentation for functions/macros make up for that confusion, still might be worthwhile to think about improvements.


Code:

Style nit (and I mean that): house style for macros is

  #define FNORD(a,b) \
     ... \
     ...

not

  #define FNORD(a,b)    .... \
                        ....

since the latter becomes painful to maintain with all that indentation.

(Re the rest of the code, it looks credible to me but I'm not the expert on the fine details here.)
Attachment #502917 - Flags: superreview?(lhansen) → superreview+
Attachment #502870 - Flags: superreview?(lhansen) → superreview+
Comment on attachment 502870 [details] [diff] [review]
Selftests. Latest.

Nice tests.

I wouldn't mind a little more elaboration in the comment block above NestedProducerConsumerTest, outlining what sorts of badness this test hopes to uncover, and perhaps also a paragraph easing the reader into the structure of the nesting.  For example, there's a series of managers but then after the series is exhausted, the test reuses the _oldest_ one; was that an arbitrary choice, or a requirement, or an interesting corner-case to test?

I was going to comment some of the code is not following house-style for braces; (e.g. the methods of  SafepointTestBase is written with its open curly-bracket on the same line as the method header, but house style is to put it at the start of the next line).  But then I skimmed over the other self-tests like ST_mmgc_dependent.st, and it looks like they don't follow the house-style either.

----

Last minute thoughts about both patches:

1. Is it really a good thing that SafepointTask is just defined as a typedef of Runnable?  I guess the run() method is not documented in either case, but if one were to attempt to add documentation for Runnable::run, one would then run into the problem that the context for SafepointTask::run (the guarantees one assumes when implementing it) is quite different from that of Runnable::run (right?).  I don't see an immediate benefit to conflating the two (are there cases where you expect a single class to be used in distinct contexts, one as Runnable, and another as SafepointTask?  Maybe you can point me at a pre-existing case.)

An alternative here would be to give SafepointTask its own class definition with a differently named method, like "runSafely", "runOnOwn", or "runFromSafeContext()"; an ugly name does not matter, since the method is not meant to be called directly from client code.

2. I've been objecting to your use of "safepoint" as a verb; does that imply that you should find an alternative term to the explicitly-defined adjective "safepointed", so that people don't fall into the trap of back-infering that "safepoint" can be used as a verb?
Attachment #502870 - Flags: review?(fklockii) → review+
Attached patch Latest API and docs. MQ rev 232 (obsolete) — Splinter Review
Attachment #502917 - Attachment is obsolete: true
Attachment #511152 - Flags: review?(kpalacz)
Attachment #502917 - Flags: review?(kpalacz)
The latest patch addresses the latest documentation comments raised by Lars and Felix (except Felix's 'last minute thoughts' comment, above).

It also converts the previously ambiguous definition:

template <bool SAFEPOINT_AWARE> class MutexLocker

into:

enum BlockingMode
{
    IMPLICIT_SAFEPOINT, // When blocked, a thread should be considered 'safe' with respect to its current safepoint context
    NO_SAFEPOINT        // Blocking has no effect on a thread's safepoint status
};

template <BlockingMode BLOCKING_MODE> class MutexLocker
Attachment #502870 - Attachment is obsolete: true
Attachment #511283 - Flags: review?(kpalacz)
Attachment #502870 - Flags: review?(kpalacz)
Given this: "SCOPE_LOCK_SP is *always* safe to use, whereas use of SCOPE_LOCK can lead to deadlock if used incorrectly in the presence of safepoints", should we consider alpha renaming these macros as follows:

SCOPE_LOCK ==> SCOPE_LOCK_UNSAFEPOINTED
SCOPE_LOCK_SP ==> SCOPE_LOCK

[[ the second one is optional ]]

The point is that I don't want it to be easier to type or overlook the more dangerous operation.

Its a pretty last minute suggestion, I admit.
(The suggestion from comment 35 is assuming that once the Safepoint infrastructure lands, it is going to be used (at least, used by any client that would consider using the VMThread infrastructure in the first place).  Otherwise I would not want to pollute the VMThread API with details about the Safepoint API...)
(In reply to comment #32)
> Created attachment 511152 [details] [diff] [review]
> Latest API and docs. MQ rev 232

While working through a merge, noticed typo: "Tamarain"
Comment on attachment 511152 [details] [diff] [review]
Latest API and docs. MQ rev 232

Couple of nits:

Safepoint.h:130 
The numbering of safepoint managers seems inconsistent with previous picture and text, m1 -> m0, m2->m1? 

Safepoint.h:154
This section is hard to parse, in particular, there's a 'then' clause without a clear 'if' clause and a lot of parentheses.

Safepoint.h:194
Nit: the paragraph would IMHO be easier to follow if you first stated the prescribed safepoint placement policy, and then listed how the policy constraints could be relaxed.

Safepoint.h:487 
very minor: private undefined new operator doesn't guarantee stack-only allocation.

SafepointRecord::m_safeRecordEnd 
seems to be maintained but currently not used for anything? 

SafepointGate::SafepointGate
There is no membar after write to m_safe, so the write won't be visible until the next membar (e.g., when entailed by a lock acquisition). In current use the membar happens promptly, but this may not be the case in the future, at which point an explicit membar could be beneficial, and it would be worthwhile to document it. An example how to wrap a blocking call in a SafepointGate would also be useful.


SafepointGate::~SafepointGate 
IMHO blocking destructors are sneaky (often invisible in the caller), which makes me uneasy but I won't insist on changing the idiom.

I think I'd side with Felix on the alpha-renaming of mutex locker macros (with the safepointable version as default, the other could be SCOPE_LOCK_NO_SP).
Attachment #511152 - Flags: review?(kpalacz) → review+
(In reply to comment #35)
> Given this: "SCOPE_LOCK_SP is *always* safe to use, whereas use of SCOPE_LOCK
> can lead to deadlock if used incorrectly in the presence of safepoints", should
> we consider alpha renaming these macros as follows:
> 
> SCOPE_LOCK ==> SCOPE_LOCK_UNSAFEPOINTED
> SCOPE_LOCK_SP ==> SCOPE_LOCK
> 
> [[ the second one is optional ]]
> 
> The point is that I don't want it to be easier to type or overlook the more
> dangerous operation.

How about we have:

#define SCOPE_LOCK_NO_SP(m) if (vmbase::MutexLock</snip>
#define SCOPE_LOCK_SP(m)    if (vmbase::MutexLock</snip>
#define SCOPE_LOCK(m)       SCOPE_LOCK_SP(m)

This will allow clients to either document their true intention, or just go with the safe default.

I'd also be tempted to use the explicit _SP and _NO_SP versions throughout the VMThread, Safepoint and asymmGC implemenations.

Thoughts?
Attachment #511152 - Attachment is obsolete: true
Attached patch Latest selftests. MQ rev 239 (obsolete) — Splinter Review
Attachment #511283 - Attachment is obsolete: true
Attachment #514706 - Flags: review?(kpalacz)
Attachment #511283 - Flags: review?(kpalacz)
(In reply to comment #38)
> Comment on attachment 511152 [details] [diff] [review]
> Latest API and docs. MQ rev 232
> 
> Couple of nits:
> 
> Safepoint.h:130 
> The numbering of safepoint managers seems inconsistent with previous picture
> and text, m1 -> m0, m2->m1? 

Fixed

> Safepoint.h:154
> This section is hard to parse, in particular, there's a 'then' clause without a
> clear 'if' clause and a lot of parentheses.

Fixed

> Safepoint.h:194
> Nit: the paragraph would IMHO be easier to follow if you first stated the
> prescribed safepoint placement policy, and then listed how the policy
> constraints could be relaxed.

Fixed

> Safepoint.h:487 
> very minor: private undefined new operator doesn't guarantee stack-only
> allocation.

I'll leave it in there to stop gratuitous mis-use.

> SafepointRecord::m_safeRecordEnd 
> seems to be maintained but currently not used for anything? 

m_safeRegionEnd? It's used in later patches.

> SafepointGate::SafepointGate
> There is no membar after write to m_safe, so the write won't be visible until
> the next membar (e.g., when entailed by a lock acquisition). In current use the
> membar happens promptly, but this may not be the case in the future, at which
> point an explicit membar could be beneficial, and it would be worthwhile to
> document it. An example how to wrap a blocking call in a SafepointGate would
> also be useful.

You're right that it does need a 2nd membar.
There is already the case of VMThread::sleep() being wrapped by a SafepointGate.
However, your analysis is slightly incorrect:
The membar is needed even if the thread is guaranteed to reach a lock promptly, as it could block. We require *release* semantics to fence the write to m_status.

> SafepointGate::~SafepointGate 
> IMHO blocking destructors are sneaky (often invisible in the caller), which
> makes me uneasy but I won't insist on changing the idiom.
> 
> I think I'd side with Felix on the alpha-renaming of mutex locker macros (with
> the safepointable version as default, the other could be SCOPE_LOCK_NO_SP).

As per my comment above, we now have:

#define SCOPE_LOCK_NO_SP(m) if (vmbase::MutexLock</snip>
#define SCOPE_LOCK_SP(m)    if (vmbase::MutexLock</snip>
#define SCOPE_LOCK(m)       SCOPE_LOCK_SP(m)

..and similarly for the _NAMED variants.
Comment on attachment 514706 [details] [diff] [review]
Latest selftests. MQ rev 239

Looks good to me.
SafepointTestBase and TestRunner look similar to what's in vmbase_concurrency, might we worthwhile to factor it out, it's prolly not the last time we use this type of logic. Not necessarily in this patch.

What's the role of VMPI_threadYield() in these tests? Ensure interleavings, ensure that threads don't just finish their job in a single quantum without rescheduling or otherwise keep the scheduler busy? May be worth a comment.
Attachment #514706 - Flags: review?(kpalacz) → review+
Depends on: 637233
Attachment #514706 - Attachment is obsolete: true
changeset: 6010:f52ad6bae096
user:      Simon Wilkinson <siwilkin>
summary:   Bug 575544: Safepoint tweaks for C++ concurrency abstraction library (r=fklockii, r=kpalacz, sr=lhansen)

http://hg.mozilla.org/tamarin-redux/rev/f52ad6bae096
changeset: 6046:70b2ce98570d
user:      Simon Wilkinson <siwilkin@adobe.com>
summary:   Bug 575544: Selftests for Thread Safepoints (r=fklockii,r=kpalacz,sr=lhansen).

http://hg.mozilla.org/tamarin-redux/rev/70b2ce98570d
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: