Closed Bug 642327 Opened 13 years ago Closed 13 years ago

Add OOM regression checking to |make check|

Categories

(Core :: JavaScript Engine, defect)

x86
macOS
defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: paul.biggar, Unassigned)

References

Details

(Whiteboard: [fixed-in-tracemonkey])

Attachments

(1 file)

This patch attempts to prevent regressions in our OOM handling.

Summary of the patch:
 - in DEBUG mode, add a |-A NUM| flag to the shell, causing memory allocation
to fail on the NUM'th allocation
 - add a script to cycle through all memory allocations, and see if we fail
properly
 - call the script from |make check|

The output of the script is:

  "Found the expected number of OOM errors (128)"


I asked the security team for a review-of-the-concept (bug 635413), and it was the consensus that including something like this in the tree is not a security problem.
Attachment #519813 - Flags: review?(nnethercote)
Comment on attachment 519813 [details] [diff] [review]
Add OOM checking to |make check|

>diff --git a/config/find_OOM_errors.py b/config/find_OOM_errors.py
>new file mode 100755
>--- /dev/null
>+++ b/config/find_OOM_errors.py

The attached patch includes this file twice somehow.  Weird.

I don't know Python, so consider this a very light review of this script.

I'd like a high-level comment at the top of the script explaining what it's
doing, particularly for the Valgrind-related bits.  A usage message is also
needed AFAICT.


>+# How often is a particular lines important for this.

I don't understand that sentence.


>+# Help ensure that the number of OOM errors in SpiderMonkey doesn't increase.
>+# If the number of OOM errors, update the number below. We intend this number
>+# to go down over time, by fixing OOMs.

"If the number of OOM errors _changes_..." ?


>+#ifdef DEBUG
>+/*
>+ * For JS_OOM_POSSIBLY_FAIL in jsutil.h. We use UINT_MAX as a signal to the
>+ * shell to print the number of allocations in a normal program, so initialize
>+ * OOM_max_allocations to -2 instead.
>+ */
>+JS_PUBLIC_DATA(unsigned int) OOM_max_allocations = (unsigned int)(-2);
>+JS_PUBLIC_DATA(unsigned int) OOM_counter = 0;
>+#endif

No magic numbers, please!



>+#ifdef DEBUG
>+/*
>+ * In order to test OOM conditions, when the shell command-line option
>+ * |-A NUM| is passed, we fail continuously after the NUM'th allocation.
>+ */
>+extern JS_PUBLIC_DATA(unsigned int) OOM_max_allocations; /* set once from shell/js.cpp */
>+extern JS_PUBLIC_DATA(unsigned int) OOM_counter; /* data race, who cares. */

Is 32 bits enough for the counter?  Probably, but it makes me slightly
nervous.  64 bits would be safer.


>+#define JS_OOM_POSSIBLY_FAIL() \
>+    do \
>+    { \
>+        if (OOM_counter++ >= OOM_max_allocations) { \
>+            return NULL; \
>+        } \
>+    } while (0)
>+
>+#else
>+#define JS_OOM_POSSIBLY_FAIL() do {} while(0)
>+#endif

You could make the RHS empty instead of |do {} while(0)|.


>+#ifdef DEBUG
>+        case 'A':
>+            if (++i == argc)
>+                return usage();
>+            OOM_max_allocations = atoi(argv[i]);
>+            break;
>+#endif

Since you've already looked for a -A option in main(), this is just
duplicating work, right?  Could you just skip over the option and argument?
(With a comment explaining that main() already handled it.)


>     result = Shell(cx, argc, argv, envp);
> 
>+#ifdef DEBUG
>+    if (OOM_max_allocations == (unsigned int)(-1))
>+        printf("OOM max count: %d\n", OOM_counter);
>+#endif

I don't understand what this is for?

r=me with the above addressed.
Attachment #519813 - Flags: review?(nnethercote) → review+
(In reply to comment #2)
> >diff --git a/config/find_OOM_errors.py b/config/find_OOM_errors.py
> The attached patch includes this file twice somehow.  Weird.

They're two different files: config/find_OOM_errors.py and js/src/config/find_OOM_errors.py, due to the build-sync-check stuff.



> I'd like a high-level comment at the top of the script explaining what it's
> doing, particularly for the Valgrind-related bits.  A usage message is also
> needed AFAICT.

Good points. The valgrind bits aren't actually used in |make check| but I think it's better to add a flag to the full script for |make check\, then to port it into a script of its own. Either way, more comments required.



> >+# How often is a particular lines important for this.
> 
> I don't understand that sentence.

New comment:
 """Keep track of the amount of times individual lines occur, in order to
     prioritize the errors which occur most frequently."""



 
> >+#ifdef DEBUG
> >+/*
> >+ * For JS_OOM_POSSIBLY_FAIL in jsutil.h. We use UINT_MAX as a signal to the
> >+ * shell to print the number of allocations in a normal program, so initialize
> >+ * OOM_max_allocations to -2 instead.
> >+ */
> >+JS_PUBLIC_DATA(unsigned int) OOM_max_allocations = (unsigned int)(-2);
> >+JS_PUBLIC_DATA(unsigned int) OOM_counter = 0;
> >+#endif
> 
> No magic numbers, please!

But its sooo much easier!

OK, fine.

 
 
> >+#ifdef DEBUG
> >+/*
> >+ * In order to test OOM conditions, when the shell command-line option
> >+ * |-A NUM| is passed, we fail continuously after the NUM'th allocation.
> >+ */
> >+extern JS_PUBLIC_DATA(unsigned int) OOM_max_allocations; /* set once from shell/js.cpp */
> >+extern JS_PUBLIC_DATA(unsigned int) OOM_counter; /* data race, who cares. */
> 
> Is 32 bits enough for the counter?  Probably, but it makes me slightly
> nervous.  64 bits would be safer.

I think it's plenty, but no harm going with 64 bits here.


> >+#define JS_OOM_POSSIBLY_FAIL() do {} while(0)
> You could make the RHS empty instead of |do {} while(0)|.

AFAIK, the Windows build in the full build complains when there are random semi-colons lying around.

 
 
> >+#ifdef DEBUG
> >+        case 'A':
> >+            if (++i == argc)
> >+                return usage();
> >+            OOM_max_allocations = atoi(argv[i]);
> >+            break;
> >+#endif
> 
> Since you've already looked for a -A option in main(), this is just
> duplicating work, right?  Could you just skip over the option and argument?
> (With a comment explaining that main() already handled it.)

Yes.


> >     result = Shell(cx, argc, argv, envp);
> > 
> >+#ifdef DEBUG
> >+    if (OOM_max_allocations == (unsigned int)(-1))
> >+        printf("OOM max count: %d\n", OOM_counter);
> >+#endif
> 
> I don't understand what this is for?

That's to print out the total number of allocations, so that the find_OOM script will know how often to run the test.



> r=me with the above addressed.


Thanks! I'm just fixing some errors that tryserver found, and will commit soon I hope.
http://hg.mozilla.org/tracemonkey/rev/b0461952d5d3
Whiteboard: [fixed-in-tracemonkey]
http://hg.mozilla.org/mozilla-central/rev/b0461952d5d3
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
With this landed, we see a11y oranges in M(other). dvanders patch from bug 638680 also triggered those errors, and was backed out. So backing this out for now, and if things stay orange, I'll push it again, otherwise I'll block on bug 652459.

Backout: http://hg.mozilla.org/tracemonkey/rev/41b74f0b32fe
Merge: http://hg.mozilla.org/tracemonkey/rev/b5302c13e059
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Whiteboard: [fixed-in-tracemonkey]
Disregard comment 6, that was for bug 628332.
Status: REOPENED → RESOLVED
Closed: 13 years ago13 years ago
Resolution: --- → FIXED
Whiteboard: [fixed-in-tracemonkey]
You need to log in before you can comment on or make changes to this bug.