Closed Bug 981991 Opened 7 years ago Closed 2 years ago
Investigate defining JS
_POISON in release builds, not just #ifdef JS _CRASH _DIAGNOSTICS
As of bug 860254, jemalloc now poisons freed memory in both debug and release builds. bsmedberg recommends poisoning JS memory allocations, too: https://groups.google.com/forum/#!topic/mozilla.dev.tech.js-engine.internals/dUlUxYsFz5Q
We started poisoning freed Ion code in release builds with bug 977810. This bug is about enabling (just some?) JS_POISON calls in release builds.
Is the scope of this bug just changing the defines here: https://searchfox.org/mozilla-central/source/js/src/jsutil.h#357-362 Or did we also want to add more JS_POISON calls? Assuming it's just changing these, my proposal would be to switch it to |EARLY_BETA_OR_EARLIER|, let it ride through beta, and then switch to always enabling it once beta looked good. I'm inferring from the fact that this is happilly on in nightly that there aren't any (known) critical performance issues or crashes :-) Does that seem reasonable?
bsmedberg's original suggestion was to define JS_POISON in all builds and channels. Enabling JS_CRASH_DIAGNOSTICS would have more overhead than enabling JS_POISON, but perhaps JS_CRASH_DIAGNOSTICS (or a subset) could be enabled with EARLY_BETA_OR_EARLIER? https://searchfox.org/mozilla-central/search?case=true&q=JS_CRASH_DIAGNOSTICS
Ah, ok I didn't realize JS_CRASH_DIAGNOSTIC would be a bigger. Just switching to JS_POISON always being on seems straightforward then. I'll upload a patch momentarily.
Previously JS_POISON was defined for debug and nightly builds. Now it'll be enabled for all builds. JS_POISON makes it easier to identify UAFs and other memory lifetime issues.
Poisoning sometimes affects performance a lot, see the JSGC_DISABLE_POISONING environment var. We set it on AWFY and when running Talos: https://searchfox.org/mozilla-central/rev/cf464eabfeba64e866c1fa36b9fefd674dca9c51/testing/talos/talos/ttest.py#126 At the very least we should compare Talos runs with that enabled/disabled, and probably also try Octane in the browser or JS shell (js/src/octane/run.js runs in the shell).
(In reply to Alex Gaynor [:Alex_Gaynor] from comment #2) > Or did we also want to add more JS_POISON calls? I do not think we want to add more for the moment. > Does that seem reasonable? As Jan mentioned in comment 6, we should be careful at not regressing performance too much, and maybe we can filter the slow poisoning from the fast one. Otherwise this sounds good to me. Another alternative might be to enable it on dev-edition, but not on beta, and compare the two.
Ok, I did a talos run: https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-central&newProject=try&newRevision=9251f89d196a3c0a3ca744dc9c254482ec669e5c&framework=1&showOnlyComparable=1&showOnlyImportant=1&selectedTimeRange=172800 Looks like there's a decent number of 2-4% regressions, and a lot of talos tests which are unaffected. Is that an acceptable level of regression, or should I spend some time investigating which of the poisons can safely be enabled vs. not?
(In reply to Alex Gaynor [:Alex_Gaynor] from comment #8) > Is that an acceptable level of regression, or should I spend some time > investigating which of the poisons can safely be enabled vs. not? Maybe try without these: * https://searchfox.org/mozilla-central/rev/c621276fbdd9591f52009042d959b9e19b66d49f/js/src/gc/Allocator.cpp#692 This poisons a whole chunk (1 MB) so it's not great for caches and doing this seems less useful than poisoning when we sweep something. * https://searchfox.org/mozilla-central/rev/c621276fbdd9591f52009042d959b9e19b66d49f/js/src/gc/Nursery.cpp#92 * https://searchfox.org/mozilla-central/rev/c621276fbdd9591f52009042d959b9e19b66d49f/js/src/gc/Nursery.cpp#104 These are a bit more useful, but I can see them causing regressions when we do lots of nursery GCs.
Talos run with those lines turned into |JS_EXTRA_POISON|: https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-central&newProject=try&newRevision=7108218a522014b92453d59ad5901b90c14a0352&framework=1&showOnlyComparable=1&selectedTimeRange=172800 Removes most of the significant results, however there are still a few.
Jan, do you have an opinion on these results? (I just noticed that while one of the motionmark benchmarks shows a regression, another one shows a win; the joys of benchmarking).
Sorry for the delay. If Talos and GC telemetry (when this moves to beta) are okay with it, this is probably fine. Jon owns the GC and most of the poisoning is GC-related, so I'll forward to him. FWIW see bug 1467697 comment 12 for a recent profile where poisoning during nursery collection showed up in Nightly. Also I assume JS_EXTRA_POISON is temporary and will be replaced with JS_NIGHTLY_POISON/JS_DIAGNOSTIC_POISON or something that matches the current behavior.
Flags: needinfo?(jdemooij) → needinfo?(jcoppeard)
Assignee: nobody → jcoppeard
Comment on attachment 9035369 [details] [diff] [review] bug981991-refactor-macros Review of attachment 9035369 [details] [diff] [review]: ----------------------------------------------------------------- Much nicer. ::: js/src/jsutil.h @@ +344,5 @@ > +} > + > +// Poison a region of memory in debug builds. Can be disabled by setting the > +// JSGC_DISABLE_POISONING environment variable. > +static inline void ExtraPoison(void* ptr, uint8_t value, size_t num, Pre-existing: what do you think about renaming this to DebugPoison or DebugOnlyPoison?
Attachment #9035369 - Flags: review?(jdemooij) → review+
Comment on attachment 9035370 [details] [diff] [review] bug981991-poison-more Review of attachment 9035370 [details] [diff] [review]: ----------------------------------------------------------------- Thanks, this makes sense I think. If we do get talos/raptor regressions or see this in profiles we can reconsider.
Attachment #9035370 - Flags: review?(jdemooij) → review+
Pushed by email@example.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/77dfbff37444 Replace JS_*_POISON macros with inline functions r=jandem https://hg.mozilla.org/integration/mozilla-inbound/rev/65174e301470 Make most poisoning unconditional r=jandem
You need to log in before you can comment on or make changes to this bug.