Closed Bug 1465470 Opened Last year Closed Last year

Web Replay: JS recording tweaks

Categories

(Core :: JavaScript Engine, enhancement, P1)

enhancement

Tracking

()

RESOLVED FIXED
mozilla63
Tracking Status
firefox62 --- wontfix
firefox63 --- fixed

People

(Reporter: bhackett, Assigned: bhackett)

References

Details

Attachments

(9 files, 4 obsolete files)

5.52 KB, patch
jonco
: review+
Details | Diff | Splinter Review
11.52 KB, patch
jandem
: review+
Details | Diff | Splinter Review
1.71 KB, patch
jandem
: review+
Details | Diff | Splinter Review
768 bytes, patch
till
: review+
Details | Diff | Splinter Review
5.36 KB, patch
till
: review+
Details | Diff | Splinter Review
1.48 KB, patch
jorendorff
: review+
Details | Diff | Splinter Review
1.84 KB, patch
jorendorff
: review+
Details | Diff | Splinter Review
1.38 KB, patch
jandem
: review+
Details | Diff | Splinter Review
454 bytes, patch
sfink
: review+
Details | Diff | Splinter Review
Some additional changes to the JS engine, beyond those already reviewed in bug 1207696, are necessary for recording and replaying.
This adds a template parameter to JS hashtables and some of the classes which use them, to specify that the hashtable should preserve its iteration order when recording or replaying.  Only a few non-perf-sensitive tables will need this, so we ensure that iteration order is preserved by using the same hash number for all entries in recording/replaying processes.
Assignee: nobody → bhackett1024
Attachment #8981884 - Flags: review?(luke)
Frontend name tables are iterated over to affect how various NAME ops are emitted in scripts, and the order of those ops affects the order in which resolve hooks are called and the order of recorded events in the browser.  This patch uses a consistent iteration order for these tables.
Attachment #8981885 - Flags: review?(jorendorff)
This is pretty similar to part 8d of bug 1207696 but is much simpler now.
Attachment #8981886 - Flags: review?(jorendorff)
Promises keep track of their creation and resolved times, but only when they have debug info.  Whether a promise has debug info is not consistent between recording and replaying, so recorded events can differ as a result.  This patch always computes these times for promises.
Attachment #8981888 - Flags: review?(till)
In addition to the places in part 5b of bug 1207696, this adds some more places where adding recorded events for times for GC statistics etc. is not appropriate.
Attachment #8981889 - Flags: review?(jcoppeard)
There are a fair amount of new atomics in the JS engine that should not be recorded, on top of those in part 4e of bug 1207696.
Attachment #8981890 - Flags: review?(jdemooij)
There are a couple places where getenv is called at non-deterministic points in the JS engine.
Attachment #8981891 - Flags: review?(jdemooij)
pthread_setname_np sometimes fails when recording/replaying, though I'm not sure why (possibly because we reuse physical threads for pthread_create() calls for threads whose lifetime does not overlap, and the name has already been set).
Attachment #8981893 - Flags: review?(till)
Attachment #8981889 - Flags: review?(jcoppeard) → review+
Comment on attachment 8981884 [details] [diff] [review]
Part 1 - Allow JS hashtables to preserve iteration order when recording/replaying.

Review of attachment 8981884 [details] [diff] [review]:
-----------------------------------------------------------------

Some questions, for reference:
 - Is the address, and its use to compute the hash, the source of the nondeterminism here?
 - Why are only a few hash tables are affected?
 - I'm guessing it's because, due to the nondeterminism of hash table enumeration, we avoid making hash table enumeration semantically visible.  So what are the cases where it is visible?
 - To avoid the terrible O(n^2) cliffs, could we change the handful of sites that do this visible nondeterministic enumeration to, when recording/replaying, put the elements into a Vector and Sort() that Vector based on something deterministic?
(ni? so I won't miss the answer)
Flags: needinfo?(bhackett1024)
Comment on attachment 8981888 [details] [diff] [review]
Part 4 - Always calculate creation/resolved time for promises.

Review of attachment 8981888 [details] [diff] [review]:
-----------------------------------------------------------------

How much overhead does calling mozilla::TimeStamp::Now() add? This is performance-critical code that's not as fast as it should be as-is, so I'm worried about adding anything to the non-debug path. Could we do this only if Web Replay is either recording or replaying?
Attachment #8981888 - Flags: review?(till)
Attachment #8981885 - Flags: review?(jorendorff) → review+
Attachment #8981886 - Flags: review?(jorendorff) → review+
(In reply to Luke Wagner [:luke] from comment #9)
> Some questions, for reference:
>  - Is the address, and its use to compute the hash, the source of the
> nondeterminism here?

Yes, pointer addresses will not be consistent between recording and replaying.

>  - Why are only a few hash tables are affected?
>  - I'm guessing it's because, due to the nondeterminism of hash table
> enumeration, we avoid making hash table enumeration semantically visible. 
> So what are the cases where it is visible?

Yeah, the main way the JS engine affects the recording is in the JS code that is executed and the events triggered via the natives etc. which that code calls.  Most engine internals --- the GC, the JITs --- are allowed to behave non-deterministically, mainly to give the debugger the elbow room it needs to do things while replaying that it didn't while recording, such as setting breakpoints and other hooks, creating GC things, and so forth.

The cases where this is visible are in parts 2 and 3 of this bug.

>  - To avoid the terrible O(n^2) cliffs, could we change the handful of sites
> that do this visible nondeterministic enumeration to, when
> recording/replaying, put the elements into a Vector and Sort() that Vector
> based on something deterministic?

We did something like this in part 8d of bug 1207696, but it's ugly code and requires some Web Replay APIs that aren't used anywhere else --- for the frontend name table we could sort lexicographically, but for the transferable objects we'd have to keep track of the order in which the objects were added to the table, then sort by that later on.  I started experimenting with approaches similar to what we do with PL and PLD hashtables in Gecko (where we do retain the amortized computational complexity of the normal non-recording/replaying case) but it just seemed like overkill given the use cases.
Flags: needinfo?(bhackett1024)
Comment on attachment 8981893 [details] [diff] [review]
Part 8 - Tolerate pthread_setname_np failing when recording or replaying.

Review of attachment 8981893 [details] [diff] [review]:
-----------------------------------------------------------------

r=me
Attachment #8981893 - Flags: review?(till) → review+
Comment on attachment 8982351 [details] [diff] [review]
Part 4 - Always calculate creation/resolved time for promises.

Review of attachment 8982351 [details] [diff] [review]:
-----------------------------------------------------------------

Thank you!
Attachment #8982351 - Flags: review?(till) → review+
(In reply to Brian Hackett (:bhackett) from comment #12)
> The cases where this is visible are in parts 2 and 3 of this bug.

Ah, I see now.  While rare, many-local-variable situations certainly exist in the wild so it seems likely an O(n^2) hang would likely bite in the future and thus eventually require a lexicographic sort, yes?  Agreed that the transferrable map is probably always going to be tiny.

Do you think these are likely the only remaining cases of hash-table-of-pointer-enumeration?
Flags: needinfo?(bhackett1024)
Comment on attachment 8981890 [details] [diff] [review]
Part 6 - Avoid recording various JS atomics.

Review of attachment 8981890 [details] [diff] [review]:
-----------------------------------------------------------------

I assume there will be a big comment somewhere explaining when it's appropriate to not record atomics, with some typical examples for both the record/dont-record cases?
Attachment #8981890 - Flags: review?(jdemooij) → review+
Comment on attachment 8981891 [details] [diff] [review]
Part 7 - Avoid calling getenv at non-deterministic points in the JS engine.

Review of attachment 8981891 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jsutil.h
@@ +346,5 @@
>  static inline void
>  Poison(void* ptr, uint8_t value, size_t num, MemCheckKind kind)
>  {
> +    static bool disablePoison = !mozilla::recordreplay::IsRecordingOrReplaying()
> +                             && bool(getenv("JSGC_DISABLE_POISONING"));

This is a static bool, so getenv is only called once, right? Is that still non-deterministic?
Attachment #8981891 - Flags: review?(jdemooij) → review+
(In reply to Luke Wagner [:luke] from comment #16)
> (In reply to Brian Hackett (:bhackett) from comment #12)
> > The cases where this is visible are in parts 2 and 3 of this bug.
> 
> Ah, I see now.  While rare, many-local-variable situations certainly exist
> in the wild so it seems likely an O(n^2) hang would likely bite in the
> future and thus eventually require a lexicographic sort, yes?

Well, either a lexicographic sort or more sophisticated instrumentation, similar to what we do with PLDHashTables.

> Do you think these are likely the only remaining cases of
> hash-table-of-pointer-enumeration?

It's hard to say, though in the near-mid term we'll have both more testing being done in the wild, and more automated testing by recording/replaying mochitests, so should have a better idea then.
Flags: needinfo?(bhackett1024)
(In reply to Jan de Mooij [:jandem] from comment #17)
> Comment on attachment 8981890 [details] [diff] [review]
> Part 6 - Avoid recording various JS atomics.
> 
> Review of attachment 8981890 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I assume there will be a big comment somewhere explaining when it's
> appropriate to not record atomics, with some typical examples for both the
> record/dont-record cases?

There isn't one yet, but I'll add one to mfbt/Atomics.h
(In reply to Jan de Mooij [:jandem] from comment #18)
> Comment on attachment 8981891 [details] [diff] [review]
> Part 7 - Avoid calling getenv at non-deterministic points in the JS engine.
> 
> Review of attachment 8981891 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: js/src/jsutil.h
> @@ +346,5 @@
> >  static inline void
> >  Poison(void* ptr, uint8_t value, size_t num, MemCheckKind kind)
> >  {
> > +    static bool disablePoison = !mozilla::recordreplay::IsRecordingOrReplaying()
> > +                             && bool(getenv("JSGC_DISABLE_POISONING"));
> 
> This is a static bool, so getenv is only called once, right? Is that still
> non-deterministic?

It is non-deterministic if the first call to Poison() happens somewhere that executes non-deterministically, such as during a GC.
(In reply to Brian Hackett (:bhackett) from comment #19)
> It's hard to say, though in the near-mid term we'll have both more testing
> being done in the wild, and more automated testing by recording/replaying
> mochitests, so should have a better idea then.

If it's just the one case, then putting the ordered Vector to use for enumeration order into the structured clone buffer seems not too much worse and keeps warts out of HashTable.  If it's more cases, adding the order-preserving support to the HashTable (like it sounds like you did with PLDHash) seems like a good idea to avoid the possible O(n^2) cliffs for each new case.
Comment on attachment 8981884 [details] [diff] [review]
Part 1 - Allow JS hashtables to preserve iteration order when recording/replaying.

OK, I see a simple solution to this hashtable problem now.  Let's leave the hashtable classes alone, and just use a hasher for the relevant tables that ensures consistent hash numbers when recording or replaying.

I think a solution like what we do with PLDHashTables is overkill because we only need to do this for a couple tables in JS, whereas we do it for all PLDHashTables in Gecko (there are lots of tables that affect the recording there).  Instrumenting HashTable.h to always behave consistently when recording/replaying would be trickier as well, due to the highly templatized nature of the class.  (While most PLDHashTables are specified using templates, they are wrapping an implementation based on callbacks).
Attachment #8981884 - Flags: review?(luke)
Frontend name tables can get consistent iteration order when recording/replaying, O(1) table operations, and minimal code modification by using the atom's precomputed hash as the table's hash code.
Attachment #8981884 - Attachment is obsolete: true
Attachment #8981885 - Attachment is obsolete: true
Attachment #8983638 - Flags: review?(jorendorff)
The transferable objects table can use a zero hash number when recording/replaying to ensure consistent iteration order.  (Sorry for asking you to review this three times now, I hope this is the last one!)
Attachment #8981886 - Attachment is obsolete: true
Attachment #8983639 - Flags: review?(jorendorff)
Attachment #8983638 - Flags: review?(jorendorff) → review+
Comment on attachment 8983639 [details] [diff] [review]
Part 3 - Preserve iteration order for structured clone transferable object table.

Review of attachment 8983639 [details] [diff] [review]:
-----------------------------------------------------------------

Yep, this should be fine.
Attachment #8983639 - Flags: review?(jorendorff) → review+
Priority: -- → P1
A couple new atomics appeared in the last month and a half that shouldn't be recorded.
Attachment #8993170 - Flags: review?(jdemooij)
This fixes a problem that showed up after bug 1405374.  JS helper threads are not allowed to perform recorded events (allowing e.g. Ion compilations and GC activity to differ between recording and replaying), and this check fails when the helper thread registers itself with the profiler.
Attachment #8993171 - Flags: review?(sphink)
Attachment #8993171 - Flags: review?(sphink) → review+
Pushed by bhackett@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/b17cfac5aa3f
Part 6 - Avoid recording various JS atomics, r=jandem.
Whiteboard: leave-open
Comment on attachment 8993170 [details] [diff] [review]
Part 9 - Avoid recording additional atomics.

Review of attachment 8993170 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry for the delay.
Attachment #8993170 - Flags: review?(jdemooij) → review+
Pushed by bhackett@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/636405b06158
Part 2 - Preserve iteration order for frontend name tables, r=jorendorff.
https://hg.mozilla.org/integration/mozilla-inbound/rev/cc6f7cf73260
Part 3 - Preserve iteration order for structured clone transferable object table, r=jorendorff.
https://hg.mozilla.org/integration/mozilla-inbound/rev/6a97ba8c172a
Part 5 - Avoid recording time events at non-deterministic points in the JS engine, r=jonco.
https://hg.mozilla.org/integration/mozilla-inbound/rev/ac9f2aa84644
Part 7 - Avoid calling getenv at non-deterministic points in the JS engine, r=jandem.
https://hg.mozilla.org/integration/mozilla-inbound/rev/e4adbe48c7a4
Part 8 - Tolerate pthread_setname_np failing when recording or replaying, r=till.
https://hg.mozilla.org/integration/mozilla-inbound/rev/e5a6b23ce42d
Part 9 - Avoid recording some more JS atomics, r=jandem.
https://hg.mozilla.org/integration/mozilla-inbound/rev/a372dc28d27d
Part 10 - Avoid registering helper threads with the profiler when recording/replaying, r=sfink.
Pushed by bhackett@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/171dfd07ecb4
Part 4 - Always calculate creation/resolved time for promises, r=till.
Status: NEW → RESOLVED
Closed: Last year
Resolution: --- → FIXED
Whiteboard: leave-open
Target Milestone: --- → mozilla63
You need to log in before you can comment on or make changes to this bug.