Closed Bug 981991 Opened 7 years ago Closed 2 years ago

Investigate defining JS_POISON in release builds, not just #ifdef JS_CRASH_DIAGNOSTICS

Categories

(Core :: JavaScript: GC, defect, P3)

defect

Tracking

()

RESOLVED FIXED
mozilla66
Tracking Status
firefox66 --- fixed

People

(Reporter: cpeterson, Assigned: jonco)

References

(Blocks 2 open bugs)

Details

(Keywords: sec-want, Whiteboard: [js:p1][adv-main66-])

Attachments

(2 files, 1 obsolete file)

As of bug 860254, jemalloc now poisons freed memory in both debug and release builds. bsmedberg recommends poisoning JS memory allocations, too:

https://groups.google.com/forum/#!topic/mozilla.dev.tech.js-engine.internals/dUlUxYsFz5Q
Keywords: sec-want
We started poisoning freed Ion code in release builds with bug 977810. This bug is about enabling (just some?) JS_POISON calls in release builds.
Blocks: 588087
Is the scope of this bug just changing the defines here: https://searchfox.org/mozilla-central/source/js/src/jsutil.h#357-362

Or did we also want to add more JS_POISON calls?

Assuming it's just changing these, my proposal would be to switch it to |EARLY_BETA_OR_EARLIER|, let it ride through beta, and then switch to always enabling it once beta looked good. I'm inferring from the fact that this is happilly on in nightly that there aren't any (known) critical performance issues or crashes :-)

Does that seem reasonable?
Flags: needinfo?(nicolas.b.pierron)
bsmedberg's original suggestion was to define JS_POISON in all builds and channels. Enabling JS_CRASH_DIAGNOSTICS would have more overhead than enabling JS_POISON, but perhaps JS_CRASH_DIAGNOSTICS (or a subset) could be enabled with EARLY_BETA_OR_EARLIER?

https://searchfox.org/mozilla-central/search?case=true&q=JS_CRASH_DIAGNOSTICS
Ah, ok I didn't realize JS_CRASH_DIAGNOSTIC would be a bigger. Just switching to JS_POISON always being on seems straightforward then. I'll upload a patch momentarily.
Previously JS_POISON was defined for debug and nightly builds. Now
it'll be enabled for all builds. JS_POISON makes it easier to identify UAFs and
other memory lifetime issues.
Assignee: nobody → agaynor
Poisoning sometimes affects performance a lot, see the JSGC_DISABLE_POISONING environment var. We set it on AWFY and when running Talos:

https://searchfox.org/mozilla-central/rev/cf464eabfeba64e866c1fa36b9fefd674dca9c51/testing/talos/talos/ttest.py#126

At the very least we should compare Talos runs with that enabled/disabled, and probably also try Octane in the browser or JS shell (js/src/octane/run.js runs in the shell).
(In reply to Alex Gaynor [:Alex_Gaynor] from comment #2)
> Or did we also want to add more JS_POISON calls?

I do not think we want to add more for the moment.

> Does that seem reasonable?

As Jan mentioned in comment 6, we should be careful at not regressing performance too much, and maybe we can filter the slow poisoning from the fast one.

Otherwise this sounds good to me.
Another alternative might be to enable it on dev-edition, but not on beta, and compare the two.
Flags: needinfo?(nicolas.b.pierron)
Ok, I did a talos run:

https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-central&newProject=try&newRevision=9251f89d196a3c0a3ca744dc9c254482ec669e5c&framework=1&showOnlyComparable=1&showOnlyImportant=1&selectedTimeRange=172800

Looks like there's a decent number of 2-4% regressions, and a lot of talos tests which are unaffected.

Is that an acceptable level of regression, or should I spend some time investigating which of the poisons can safely be enabled vs. not?
(In reply to Alex Gaynor [:Alex_Gaynor] from comment #8)
> Is that an acceptable level of regression, or should I spend some time
> investigating which of the poisons can safely be enabled vs. not?

Maybe try without these:

* https://searchfox.org/mozilla-central/rev/c621276fbdd9591f52009042d959b9e19b66d49f/js/src/gc/Allocator.cpp#692

This poisons a whole chunk (1 MB) so it's not great for caches and doing this seems less useful than poisoning when we sweep something.

* https://searchfox.org/mozilla-central/rev/c621276fbdd9591f52009042d959b9e19b66d49f/js/src/gc/Nursery.cpp#92
* https://searchfox.org/mozilla-central/rev/c621276fbdd9591f52009042d959b9e19b66d49f/js/src/gc/Nursery.cpp#104

These are a bit more useful, but I can see them causing regressions when we do lots of nursery GCs.
Jan, do you have an opinion on these results? (I just noticed that while one of the motionmark benchmarks shows a regression, another one shows a win; the joys of benchmarking).
Flags: needinfo?(jdemooij)
Sorry for the delay. If Talos and GC telemetry (when this moves to beta) are okay with it, this is probably fine.

Jon owns the GC and most of the poisoning is GC-related, so I'll forward to him. 

FWIW see bug 1467697 comment 12 for a recent profile where poisoning during nursery collection showed up in Nightly. Also I assume JS_EXTRA_POISON is temporary and will be replaced with JS_NIGHTLY_POISON/JS_DIAGNOSTIC_POISON or something that matches the current behavior.
Component: JavaScript Engine: JIT → JavaScript: GC
Flags: needinfo?(jdemooij) → needinfo?(jcoppeard)
Assignee: agaynor → nobody
Priority: -- → P3
Attachment #8983927 - Attachment is obsolete: true

First of all, replace the JS_*_POISON macros with inline functions.

Attachment #9035369 - Flags: review?(jdemooij)

Make most poisoning unconditional. This leaves poisoning for new chunks and the nursery the same as it is now, i.e. nightly-only.

Attachment #9035370 - Flags: review?(jdemooij)
Comment on attachment 9035369 [details] [diff] [review]
bug981991-refactor-macros

Review of attachment 9035369 [details] [diff] [review]:
-----------------------------------------------------------------

Much nicer.

::: js/src/jsutil.h
@@ +344,5 @@
> +}
> +
> +// Poison a region of memory in debug builds. Can be disabled by setting the
> +// JSGC_DISABLE_POISONING environment variable.
> +static inline void ExtraPoison(void* ptr, uint8_t value, size_t num,

Pre-existing: what do you think about renaming this to DebugPoison or DebugOnlyPoison?
Attachment #9035369 - Flags: review?(jdemooij) → review+
Comment on attachment 9035370 [details] [diff] [review]
bug981991-poison-more

Review of attachment 9035370 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks, this makes sense I think. If we do get talos/raptor regressions or see this in profiles we can reconsider.
Attachment #9035370 - Flags: review?(jdemooij) → review+

One issue is that "-t all" doesn't run all talos tests :/ "mach try fuzzy" is needed for the Raptor and jsshell-bench tests (Octane might be interesting for the GC poisoning).

(In reply to Jan de Mooij [:jandem] from comment #16)

Pre-existing: what do you think about renaming this to DebugPoison or
DebugOnlyPoison?

Good idea, I'll go with DebugOnlyPoison.

One issue is that "-t all" doesn't run all talos tests

Ah that's annoying. Well, I've run octane locally and didn't see any regressions so I'm going to land this and we can back it out if it causes problems.

Pushed by jcoppeard@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/77dfbff37444
Replace JS_*_POISON macros with inline functions r=jandem
https://hg.mozilla.org/integration/mozilla-inbound/rev/65174e301470
Make most poisoning unconditional r=jandem
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla66
Whiteboard: [js:p1] → [js:p1][adv-main66-]
You need to log in before you can comment on or make changes to this bug.