Closed
Bug 383932
Opened 17 years ago
Closed 14 years ago
JS engine OOM testing
Categories
(Core :: JavaScript Engine, enhancement)
Tracking
()
RESOLVED
DUPLICATE
of bug 624094
People
(Reporter: gavin.reaney, Assigned: gavin.reaney)
References
Details
Attachments
(1 file, 5 obsolete files)
38.54 KB,
patch
|
Details | Diff | Splinter Review |
The attached zip file contains a basic malloc replacement to facilitate OOM testing of the JS shell.
Previous OOM testing that I've undertaken didn't use this code (it was part of a larger application and test framework so the JS shell wasn't involved, and the SM code tested was from older branches). Since this code is new, it may contain bugs.
There are three parts:
- the malloc replacement code
- diffs to the engine to ensure malloc calls get replaced
- a driver shell script to force failure of each malloc in turn
The test method is very simple - we set a limit beyond which all malloc requests fail. The driver script simply starts with a specified limit (e.g. 0) and repeatedly runs the JS shell, increasing the limit each time until no more allocations are requested.
This won't catch all possible crashes due to OOM, but it has proven to be effective. It's possible to add support for failing a proportion of malloc requests too, although I haven't done that here.
There are probably other malloc replacement libraries that could be used, and I have no objection to that (what I've put together here is really just to show the principle involved).
The code detects memory leaks and over/under writes, which are sometimes triggered by bad OOM handling. Add JS_OOM_TESTING=1 to the make command line to enable the new code.
When running this with the JS shell, a lot of memory leaks are reported, as well as a few crashes (due to bad OOM handling), so there are some things to fix as a result of this.
Some examples of how to use the driver script:
#start with a malloc limit of zero, testing with myscript.js
./test.sh 0 ./Linux_All_DBG.OBJ/js myscript.js
#use the jsDriver to run a test
./jsDriver.pl -e smdebug -s "../src/test.sh" --opt "0 ../src/WINNT5.1_DBG.OBJ/js.exe" -l js1_6/decompilation/regress-352613-01.js
Assignee | ||
Comment 1•16 years ago
|
||
This diff gets us back to being able to do low memory testing for code that uses standard malloc/free. More attention is needed to properly test code that uses C++ operator new/delete.
Much of the code that invokes new/delete is in header files, which may cause some headaches if we try to use macro replacements here.
Also, we may wish to pay attention to any areas of code that use mmap, since these won't be covered by the current test strategy.
After applying the patch it may be necessary to make the driver script (test.sh) executable.
This has been tested on OS X only.
A chunk of memory leaks are reported up to allocation number 205, but these seem harmless since they are caused by the JS shell not cleaning up in error cases.
Attachment #267873 -
Attachment is obsolete: true
Assignee | ||
Comment 2•16 years ago
|
||
(Hopefully) remove the need to have jsdmalloc.h at the end of the include list. Put ifdefs into jsdmalloc.h header instead of all files that include it.
Attachment #353193 -
Attachment is obsolete: true
Assignee | ||
Comment 3•16 years ago
|
||
This code finds OOM crashes in some of the existing regression tests so I'll open new bugs for those soon.
Attachment #353774 -
Attachment is obsolete: true
Assignee | ||
Comment 4•16 years ago
|
||
Added some additional commentary, and moved code into jsdbgapi instead of jsapi. Slightly updated usage instructions can be found in oom-test.sh.
This still doesn't test use of the C++ new operator, but I'm happy to leave it out of the initial commit (I may be setting this ready for review soon on that basis). I think OOM handling wrt the new operator, and C++ constructors deserves special attention so I'll open a separate bug about that.
I was able to use this code to track down a couple of OOM crashes. The general approach is as follows:
- run the oom-test.sh script according to the examples in oom-test.sh (I used the jsDriver to run a set of regression tests).
- have a look at the failure logs output by the jsDriver (or directly by the shell). In these you should see what allocation limit caused a crash/leak/assert etc.
- if a debugger wasn't automatically launched, you need to re-run the application inside your favourite debugger with the problem allocation limit set. You can set the allocation limit via an environment variable (JS_OOM_ALLOC_LIMIT) or via the jsdbgapi or by setting the correct global variable inside jsdmalloc.cpp from your debugger.
- run the application inside the debugger. The full command line for your application should be available in the output produced by oom-test.sh. The same failure should now be repeated inside the debugger. It may be helpful to set breakpoints on 'alloc_limit_reached' or 'failure_detected' which are called by the malloc replacement code whenever it fails an allocation or detects a problem.
Additional note: when tracking down memory leaks, the file/line info of the allocator may not be very useful (since most report JS_Malloc at the moment). To help us here, we provide an 'id' number for each allocation. If you see 'allocation 10 leaked' then the allocation limit can then be set to 10, and a breakpoint set on 'alloc_limit_reached' to discover what code is performing that allocation.
Attachment #354686 -
Attachment is obsolete: true
Assignee | ||
Comment 5•16 years ago
|
||
Brendan> please nominate another reviewer if you wish.
My intention after pushing this to tracemonkey is to raise bugs (and address some of them) for crashes that I've run into.
Attachment #354817 -
Attachment is obsolete: true
Attachment #355935 -
Flags: review?(brendan)
Comment 6•16 years ago
|
||
Comment on attachment 355935 [details] [diff] [review]
A few improvements to oom-test.sh driver script
>+ifdef JS_OOM_TESTING
>+DEFINES += -DJS_OOM_TESTING
>+endif
We'll want configure --enable-oom-testing support too I expect.
>@@ -725,24 +725,24 @@ Options(JSContext *cx, JSObject *obj, ui
> found = JS_TRUE;
> names = JS_sprintf_append(names, "%s%s",
> names ? "," : "", js_options[i].name);
> if (!names)
> break;
> }
> }
> if (!found)
>- names = strdup("");
>+ names = JS_strdup(cx, "");
> if (!names) {
> JS_ReportOutOfMemory(cx);
JS_strdup reports OOM so the JS_ReportOutOfMemory should go here or you'll get two OOM reports.
> return JS_FALSE;
> }
> str = JS_NewString(cx, names, strlen(names));
> if (!str) {
>- free(names);
>+ JS_free(cx, names);
OTOH, what was wrong with the original code? Is the idea to put all C-string buffer allocations through JS_malloc and not malloc (whether via strdup or not)? Yikes.
Will review more when I have time, but I'm short on time right now. Set me straight on comments/questions above in the mean time.
/be
Assignee | ||
Comment 7•16 years ago
|
||
(In reply to comment #6)
> > return JS_FALSE;
> > }
> > str = JS_NewString(cx, names, strlen(names));
> > if (!str) {
> >- free(names);
> >+ JS_free(cx, names);
>
> OTOH, what was wrong with the original code? Is the idea to put all C-string
> buffer allocations through JS_malloc and not malloc (whether via strdup or
> not)? Yikes.
In js.cpp, we don't do malloc replacement (I mean any calls to malloc in this file still go to the system malloc as normal). Since the engine is using our malloc replacement, we need to be careful about memory blocks which are passed across the JS API. In this line of code:
str = JS_NewString(cx, names, strlen(names));
we are passing ownership of 'names' into the engine. Since the engine will use our special version of free to release it this would look like a corrupt block because it wouldn't have our special header and so on (if it was allocated by system malloc). So the change was to ensure that the version of 'malloc' and 'free' that is used are consistent.
I know this is a subtle change to the API requirements for JS_NewString for this test code, but given the small number of the changes to support it, I didn't worry too much about it...
Comment 8•16 years ago
|
||
Actually, it isn't a change at all:
https://developer.mozilla.org/en/SpiderMonkey/JSAPI_Reference/JS_NewString
Rather, it's just a bug in js.cpp.
Assignee | ||
Comment 9•16 years ago
|
||
(In reply to comment #8)
> Actually, it isn't a change at all:
>
> https://developer.mozilla.org/en/SpiderMonkey/JSAPI_Reference/JS_NewString
>
> Rather, it's just a bug in js.cpp.
Excellent, thanks for pointing that out.
Comment 10•16 years ago
|
||
The API doc on MDC is not the primary source of truth. For ~13 years, the code has allowed malloc and free to be used to manage a char buffer passed to JS_NewString. It is helpful to change this for the purposes of this bug (OOM testing) but we cannot change the API to require any such thing for all or even Mozilla-hosted SpiderMonkey embeddings.
Don't pretend doc trumps code, esp. code in the field. The field will bite back.
/be
Comment 11•16 years ago
|
||
Generally, the point is to use the docs instead of the code, because studying the code to figure out such things is non-trivial. I assumed that was the case here. Also, what for those people who depended on the documentation to be correct? I don't know whether the docs or the code needs changing, but certainly one or the other needs to be changed, and I don't think it's obvious js.cpp *doesn't* need to be changed:
http://google.com/codesearch/p?hl=en#Ir51gE6JS7o/trunk/src/couchdb/couch_js.c&q=JS_NewString&l=510
http://google.com/codesearch/p?hl=en#s7LHiE7n3MA/gxine-0.4.1/src/script_engine.c&q=JS_NewString&l=404
http://google.com/codesearch/p?hl=en#Q5u4n3fNfJo/trunk/pacparser.c&q=JS_NewString&l=125
These are just a few of the hits I found in a codesearch on Google; I assume there are others. Luckily a lot of the hits went through JS_vsnprintf or similar, so the most important thing here seems to be to ensure the two use the same allocator. As for the code hits, hopefully those links won't go stale, but who knows, the URLs aren't obviously permanent to me..
Comment 12•16 years ago
|
||
(In reply to comment #11)
> Generally, the point is to use the docs instead of the code, because studying
> the code to figure out such things is non-trivial.
That's why we earn the big bux :-P.
Seriously, the doc is epiphenomenal. The code is the primary source of truth, for better and worse. It's what has been embedded all over for 10+ years. It is what it is.
Changing it to be better, and documenting the new "is" state, sounds fine but not in the abstract, and not doc-first. We need to be concrete in order to be careful about compatibility, so: bugs with patches.
> I assumed that was the case
> here. Also, what for those people who depended on the documentation to be
> correct? I don't know whether the docs or the code needs changing, but
> certainly one or the other needs to be changed, and I don't think it's obvious
> js.cpp *doesn't* need to be changed:
>
> http://google.com/codesearch/p?hl=en#Ir51gE6JS7o/trunk/src/couchdb/couch_js.c&q=JS_NewString&l=510
> http://google.com/codesearch/p?hl=en#s7LHiE7n3MA/gxine-0.4.1/src/script_engine.c&q=JS_NewString&l=404
> http://google.com/codesearch/p?hl=en#Q5u4n3fNfJo/trunk/pacparser.c&q=JS_NewString&l=125
The code can certainly change, but it does not *need* to just because it's not obvious.
The fact remains that embedders have been able to pass malloc'ed buffers into JS_NewString for 10+ years. That is not going to change just because the doc changes, and breaking the code to require JS_malloc as the allocator will break some embeddings.
> These are just a few of the hits I found in a codesearch on Google; I assume
> there are others. Luckily a lot of the hits went through JS_vsnprintf or
> similar, so the most important thing here seems to be to ensure the two use the
> same allocator.
The assumption since day one has been that everyone using the JS API including the engine uses the same malloc and free.
/be
Comment 13•16 years ago
|
||
To be clear: recommending JS_malloc'ed buffers exclusively may be good future proofing, but I have no idea when we might reject or fail on malloc'ed buffers. The doc as informative best-practices guide can help establish new norms. We just do not know when the existing norms might no longer apply.
/be
Comment 14•16 years ago
|
||
(In reply to comment #12)
> (In reply to comment #11)
> > Generally, the point is to use the docs instead of the code, because studying
> > the code to figure out such things is non-trivial.
>
> That's why we earn the big bux :-P.
>
> Seriously, the doc is epiphenomenal. The code is the primary source of truth,
> for better and worse. It's what has been embedded all over for 10+ years. It is
> what it is.
The code may be what matters in reality, but I don't think the presumption that our docs are generally reliable is unreasonable; they're as much time-savers for us as for embedders.
Comment 15•16 years ago
|
||
No straw men, please. I didn't say the docs were unreliable, or any such thing. The arrow of causality can go both ways, but differently:
code (phenomena) => docs (epiphenomena)
docs (norms) => future code (stricter behavior, new errors or failures)
But docs saying X when code allows X or Y and has for a decade does not mean you can force the second kind of flow. That's all.
People use JS_malloc because they suspect it's required, even if they don't read the docs. This can lead to extra copying, in a single-malloc-under-everything embedding. It has not been required, and the copying can hurt (the call overhead probably doesn't). This is one reason for the asymmetric and partial use of JS_malloc within SpiderMonkey.
/be
Comment 16•16 years ago
|
||
(In reply to comment #15)
> No straw men, please. I didn't say the docs were unreliable, or any such
> thing.
I wasn't trying to imply that, but in this case (barring the last paragraph in this message) they are. I simply think that when there are docs, I should be able to rely on them. If they're not, we should either fix them or remove them so as not to confuse readers.
> But docs saying X when code allows X or Y and has for a decade does not mean
> you can force the second kind of flow. That's all.
Not force, but when people rely on inaccurate docs for too long, it should at least be a factor.
> People use JS_malloc because they suspect it's required, even if they don't
> read the docs. This can lead to extra copying, in a
> single-malloc-under-everything embedding. It has not been required, and the
> copying can hurt (the call overhead probably doesn't). This is one reason for
> the asymmetric and partial use of JS_malloc within SpiderMonkey.
That I didn't know, that it was required that malloc and JS_malloc wrap to the same thing. Yet another difference from the docs (last note):
https://developer.mozilla.org/en/SpiderMonkey/JSAPI_Reference/JS_malloc
I'm not sure this is going anywhere particularly productive at this point, but I think I've said what I need to say, and my positions are clear here, so I'll try to shut up now.
Comment 17•16 years ago
|
||
In this case the doc's inaccuracy, over-constraining the allocator to be used with JS_NewString, could be a good thing in the long run. I think we agree on that (right?).
Also we agree the code change can go in. I'm a bit jammed with other work, though. Waldo, you probably are too but if you want to take this review request from me I'd be happy to advise (in a low-key friendly way :-P).
/be
Comment 18•14 years ago
|
||
Comment on attachment 355935 [details] [diff] [review]
A few improvements to oom-test.sh driver script
I've been a bad reviewer these many years, but the good news is that pbiggar picked up the OOM testing torch in bug 635413.
/be
Attachment #355935 -
Flags: review?(brendan)
Comment 19•14 years ago
|
||
See also bug 624094. Gavin, I'll let you dup forward.
/be
Updated•14 years ago
|
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → DUPLICATE
You need to log in
before you can comment on or make changes to this bug.
Description
•