Closed Bug 1199203 Opened 4 years ago Closed 4 years ago

Allow per-thread testing with oomAfterAllocations/oomAtAllocation

Categories

(Core :: JavaScript Engine, defect, major)

All
Linux
defect
Not set
major

Tracking

()

RESOLVED FIXED
mozilla43
Tracking Status
firefox43 --- fixed

People

(Reporter: decoder, Assigned: decoder)

References

Details

(Keywords: sec-want, Whiteboard: [adv-main43-])

Attachments

(1 file, 3 obsolete files)

Right now, I often hit OOM conditions caused by the two named functions but the point where the OOM happens is in a helper thread. These tests typically don't reduce well because the main thread (or some other thread) is racing with the buggy thread about who goes OOM first since the testing functions affect all threads in the same way.

I propose to add a second argument to the testing functions that allows specifying a thread type for which the testing functions apply.

Whenever a helper thread picks up a certain workload, it sets a TLS variable to indicate the type of work it is performing. Only if this type matches the specified target type (second argument to oomAfterAllocations), the OOM counters are increased and an artificial OOM can happen.

I wrote a proof-of-concept patch that already allowed me to reproduce a bug that would otherwise only end up in an "unhandlable oom" on the main thread when reducing. I'll attach the patch after some cleanup and request feedback.
Attached patch oom-thread.patch (obsolete) — Splinter Review
Proof-of-concept patch. Requesting feedback from jandem, but anyone who feels like providing feedback, please go on :)
Attachment #8653497 - Flags: feedback?(jdemooij)
Comment on attachment 8653497 [details] [diff] [review]
oom-thread.patch

Review of attachment 8653497 [details] [diff] [review]:
-----------------------------------------------------------------

f+ from me, thanks for doing this!

::: js/public/Utility.h
@@ +81,5 @@
> + * without causing an OOM in the main thread first.
> + */
> +extern JS_PUBLIC_DATA(uint32_t) OOM_target_thread;
> +extern JS_PUBLIC_DATA(mozilla::ThreadLocal<uint32_t>) OOM_thread;
> +#define JS_OOM_SET_THREAD_TYPE(TYPE) OOM_thread.set(TYPE)

This could be an function rather than a define.

@@ +88,5 @@
> +#define JS_OOM_THREAD_TYPE_ION          3
> +#define JS_OOM_THREAD_TYPE_PARSE        4
> +#define JS_OOM_THREAD_TYPE_COMPRESS     5
> +#define JS_OOM_THREAD_TYPE_GCHELPER     6
> +#define JS_OOM_THREAD_TYPE_GCPARALLEL   7

Should probably be an enum.

@@ +112,5 @@
>  static inline bool
>  ShouldFailWithOOM()
>  {
> +    if (OOM_target_thread && OOM_target_thread != OOM_thread.get())
> +        return false;

The thread condition appears here and in IsSimulatedOOMAllocation().  This could be factored out so it only appears in one place.

We also only want to do this once per allocation, which I think means we can take the check out of IsSimulatedOOMAllocation() and put it in its callers.

::: js/src/builtin/TestingFunctions.cpp
@@ +2801,3 @@
>  "  After 'count' js_malloc memory allocations, fail the next allocation\n"
> +"  (return NULL). The optional thread type limits the effect to the specified\n"
> +"  type of helper thread."),

Would be nice to pass a string indicating the thread type rather than a number.
Duplicate of this bug: 1199227
Comment on attachment 8653497 [details] [diff] [review]
oom-thread.patch

Review of attachment 8653497 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good, just some nits.

::: js/public/Utility.h
@@ +62,5 @@
>  #if defined JS_USE_CUSTOM_ALLOCATOR
>  # include "jscustomallocator.h"
>  #else
>  # if defined(DEBUG) || defined(JS_OOM_BREAKPOINT)
> +#include "mozilla/ThreadLocal.h"

Not sure about this #include here in Utility.h. I think it'd be nice to keep the ThreadLocal stuff in jsutil.cpp and add a function there to return the integer/enum value. Then we can call that function here and don't need to know anything about the ThreadLocal magic in this file :)

::: js/src/builtin/TestingFunctions.cpp
@@ +1002,4 @@
>          return false;
>  
> +    if (args.length() == 2) {
> +        if (!ToUint32(cx, args.get(1), &OOM_target_thread))

Nit: you could remove the args.length() == 2 check here and below as args.get(i) returns |undefined| if i is out of bounds.

That way we'll also correctly reset OOM_target_thread if it's called first with 2 arguments and then with 1 argument.

@@ +2795,2 @@
>  "  After 'count' js_malloc memory allocations, fail every following allocation\n"
> +"  (return NULL). The optional thread type limits the effect to the specified\n"

Nit: s/NULL/nullptr/

@@ +2801,2 @@
>  "  After 'count' js_malloc memory allocations, fail the next allocation\n"
> +"  (return NULL). The optional thread type limits the effect to the specified\n"

And here.

::: js/src/jsapi.cpp
@@ +588,5 @@
>      using js::TlsPerThreadData;
>      if (!TlsPerThreadData.initialized() && !TlsPerThreadData.init())
>          return false;
>  
> +    if (!OOM_thread.initialized() && !OOM_thread.init())

This also needs an

#if defined(DEBUG) || defined(JS_OOM_BREAKPOINT)

Maybe we could always define JS_OOM_BREAKPOINT in DEBUG builds, so that you only need #ifdef JS_OOM_BREAKPOINT everywhere, but that's not really necessary for this bug.

::: js/src/vm/HelperThreads.cpp
@@ +1452,1 @@
>              MOZ_CRASH("No task to perform");

Nit: add {} (either none or all branches of an if-else should have them)
Attachment #8653497 - Flags: feedback?(jdemooij) → feedback+
(In reply to Jan de Mooij [:jandem] from comment #4)
> Comment on attachment 8653497 [details] [diff] [review]
> oom-thread.patch
> 
> Review of attachment 8653497 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Looks good, just some nits.
> 
> ::: js/public/Utility.h
> @@ +62,5 @@
> >  #if defined JS_USE_CUSTOM_ALLOCATOR
> >  # include "jscustomallocator.h"
> >  #else
> >  # if defined(DEBUG) || defined(JS_OOM_BREAKPOINT)
> > +#include "mozilla/ThreadLocal.h"
> 
> Not sure about this #include here in Utility.h. I think it'd be nice to keep
> the ThreadLocal stuff in jsutil.cpp and add a function there to return the
> integer/enum value. Then we can call that function here and don't need to
> know anything about the ThreadLocal magic in this file :)

So you propose to add a getter and a setter function in jsutil.cpp to encapsulate the ThreadLocal magic. But then I have to include jsutil.h in js/public/Utility.h, is that right? Because right now, it's not.

> 

> 
> @@ +2795,2 @@
> >  "  After 'count' js_malloc memory allocations, fail every following allocation\n"
> > +"  (return NULL). The optional thread type limits the effect to the specified\n"
> 
> Nit: s/NULL/nullptr/
> 
> @@ +2801,2 @@
> >  "  After 'count' js_malloc memory allocations, fail the next allocation\n"
> > +"  (return NULL). The optional thread type limits the effect to the specified\n"
> 
> And here.

Will do this, just wanted to point out that it said NULL before my patch as well.
Flags: needinfo?(jdemooij)
(In reply to Jon Coppeard (:jonco) from comment #2)

> @@ +112,5 @@
> >  static inline bool
> >  ShouldFailWithOOM()
> >  {
> > +    if (OOM_target_thread && OOM_target_thread != OOM_thread.get())
> > +        return false;
> 
> The thread condition appears here and in IsSimulatedOOMAllocation().  This
> could be factored out so it only appears in one place.
> 
> We also only want to do this once per allocation, which I think means we can
> take the check out of IsSimulatedOOMAllocation() and put it in its callers.

I quickly checked which code uses this function and it's only used here:

http://mxr.mozilla.org/mozilla-central/source/js/src/vm/Runtime.cpp#758

I don't even understand what this is used for. Why is the simulated OOM handled differently to a regular OOM here? Please let me know how/if I should change this.

> 
> ::: js/src/builtin/TestingFunctions.cpp
> @@ +2801,3 @@
> >  "  After 'count' js_malloc memory allocations, fail the next allocation\n"
> > +"  (return NULL). The optional thread type limits the effect to the specified\n"
> > +"  type of helper thread."),
> 
> Would be nice to pass a string indicating the thread type rather than a
> number.

We could do this, but we don't do it for gczeal either and using numbers makes it a little easier to test automatically. On the other hand, it makes documenting the options a little easier as well, I'll think about that.
Flags: needinfo?(jcoppeard)
(In reply to Christian Holler (:decoder) from comment #5)
> So you propose to add a getter and a setter function in jsutil.cpp to
> encapsulate the ThreadLocal magic. But then I have to include jsutil.h in
> js/public/Utility.h, is that right? Because right now, it's not.

I think you can add the declaration to Utility.h and have the definition in jsutil.cpp, so you don't need jsutil.h?
Flags: needinfo?(jdemooij)
(In reply to Christian Holler (:decoder) from comment #6)

> Why is the simulated OOM handled differently to a regular OOM here?

We try to recover from a real OOM, and if we can then the caller never sees that it has happened.  That would defeat the point of simulating an OOM which is to check that the caller does the right thing when it sees a OOM failure.

> Please let me know how/if I should change this.

The code in Runtime.cpp just needs to be correct as to whether a simulated OOM has happened.  I think I was suggesting a refactoring that meant we only check the thread once when simulating OOM, but it doesn't matter too much.

> > Would be nice to pass a string indicating the thread type rather than a
> > number.
> 
> We could do this, but we don't do it for gczeal either and using numbers
> makes it a little easier to test automatically. On the other hand, it makes
> documenting the options a little easier as well, I'll think about that.

Good point about automatic testing.  It's up to you.
Flags: needinfo?(jcoppeard)
Attached patch oom-thread.patch (obsolete) — Splinter Review
Patch with some improvements. Changes are:

* Added the ThreadType enum, moved it into the js::oom namespace 
* Moved all the other variables (only those that I add) into the js::oom namespace
* Added the JS_OOM_THREAD_CHECK macro to avoid code duplication
* Rewrote the shell functions as suggested and additionally made sure that only assignments are made when all arguments have been checked
* Added a MAX check to the shell functions
* Added Get/Set/InitThreadType functions in the js::oom namespace in jsutil.cpp


We should get this landed now and see if it helps us finding new bugs (I will also add support for it in my test script that I gave Hannes).
Attachment #8653497 - Attachment is obsolete: true
Attachment #8658653 - Flags: review?(jdemooij)
Comment on attachment 8658653 [details] [diff] [review]
oom-thread.patch

Review of attachment 8658653 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good to me!

::: js/public/Utility.h
@@ +109,5 @@
> +extern bool InitThreadType(void);
> +extern void SetThreadType(ThreadType);
> +extern uint32_t GetThreadType(void);
> +
> +#define JS_OOM_THREAD_CHECK() (!js::oom::targetThread || js::oom::targetThread == js::oom::GetThreadType())

Please make this an inline function.

@@ +123,5 @@
>  ShouldFailWithOOM()
>  {
> +    if (!JS_OOM_THREAD_CHECK())
> +        return false;
> +    

Extra whitespace here.

::: js/src/builtin/TestingFunctions.cpp
@@ +1000,5 @@
> +    uint32_t targetThread = 0;
> +    if (!ToUint32(cx, args.get(1), &targetThread))
> +        return false;
> +
> +    if (targetThread < 0 || targetThread >= js::oom::THREAD_TYPE_MAX) {

targetThread is never going to be less than zero because it's unsigned, so we can remove that check.

@@ +1033,5 @@
> +    uint32_t targetThread = 0;
> +    if (!ToUint32(cx, args.get(1), &targetThread))
> +        return false;
> +
> +    if (targetThread < 0 || targetThread >= js::oom::THREAD_TYPE_MAX) {

Ditto above.
Attachment #8658653 - Flags: review?(jdemooij) → review+
This has caused build bustage. I'm going to back it out unless you can fix in the next 5 mins or so.
Flags: needinfo?(choller)
Simple issue, forgot some ifdefs for non-debug builds. Will fix that later and repush, and do an opt build before that locally.
Attached patch oom-thread.patch (obsolete) — Splinter Review
Updated the #ifdefs in the patch to make the functions in jsutil.cpp no-ops in those builds that don't have OOM testing code. Had to move the enum definition outside the #ifdefs for that.

I suggest that in some other bug, we move all of the legacy OOM testing code also into the js::oom:: namespace to make this code a little cleaner.

Keeping r+ since this should not change functionality, just make the opt-builds work.
Attachment #8658653 - Attachment is obsolete: true
Attachment #8659661 - Flags: review+
This is green on try for Linux, but I'm making a try push with Windows to ensure that the build bustage there is also resolved before re-landing this.
Attached patch oom-thread.patchSplinter Review
Final version, moved the function definitions outside the #ifdef linux (I heard it helps to not #ifdef linux your code if you want to run it on mac and windows as well... -.-) and made these functions inline no-ops if the build is not an OOM build.
Attachment #8659661 - Attachment is obsolete: true
Attachment #8659883 - Flags: review+
https://hg.mozilla.org/mozilla-central/rev/9c1c2581ad65
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla43
Whiteboard: [adv-main43-]
You need to log in before you can comment on or make changes to this bug.