jsemit.c's SpanDep allocation code should not use arenas

RESOLVED FIXED in mozilla1.9beta4

Status

()

Core
JavaScript Engine
RESOLVED FIXED
11 years ago
10 years ago

People

(Reporter: Brian Crowder, Assigned: Brian Crowder)

Tracking

Trunk
mozilla1.9beta4
Points:
---
Dependency tree / graph
Bug Flags:
blocking1.9 +
in-testsuite -
in-litmus -

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment, 8 obsolete attachments)

(Assignee)

Description

11 years ago
Created attachment 303782 [details] [diff] [review]
JSArenas abandoned in favor of good ol' malloc/realloc/free

This code is the only code using the (potentially unsound?) JS_ArenaFreeAllocation() routine.  Also, the chunk-sizes it works in are far larger than the arenasize of the tempPool it uses, so we're actually paying for arena code/checks/linked-listing AND a malloc/realloc whenever the buffer here is initialized or grows.  Let's cut out the middle man.

I am in the process of testing this patch further and will request review shortly, if I don't run into hang-ups.
(Assignee)

Comment 1

11 years ago
Comment on attachment 303782 [details] [diff] [review]
JSArenas abandoned in favor of good ol' malloc/realloc/free

This does not introduce any new testsuite failures.
Attachment #303782 - Flags: review?(igor)

Comment 2

11 years ago
Comment on attachment 303782 [details] [diff] [review]
JSArenas abandoned in favor of good ol' malloc/realloc/free

>Index: jsemit.c
>===================================================================
>         if (!sdbase) {
>             size = SPANDEPS_SIZE_MIN;
>-            JS_ARENA_ALLOCATE_CAST(sdbase, JSSpanDep *, &cx->tempPool, size);
>+            sdbase = (JSSpanDep *)malloc(size);
>         } else {
>             size = SPANDEPS_SIZE(index);
>-            JS_ARENA_GROW_CAST(sdbase, JSSpanDep *, &cx->tempPool, size, size);
>+            sdbase = (JSSpanDep *)realloc(sdbase, size + size);
>+            if (!sdbase)
>+                free(cg->spanDeps);
>         }

Single realloc is sufficient here as it accepts null as an initial pointer. 

>         if (!sdbase) {
>             js_ReportOutOfScriptQuota(cx);

That has to be changed into js_ReportOutOfMemory(cx). 

>             return JS_FALSE;
>         }

on OOM this return will keep the free pointer in cg->spanDeps.
Attachment #303782 - Flags: review?(igor) → review-

Comment 3

11 years ago
Comment on attachment 303782 [details] [diff] [review]
JSArenas abandoned in favor of good ol' malloc/realloc/free

>     size = SPANDEPS_SIZE(JS_BIT(JS_CeilingLog2(cg->numSpanDeps)));
>-    JS_ArenaFreeAllocation(&cx->tempPool, cg->spanDeps,
>-                           JS_MAX(size, SPANDEPS_SIZE_MIN));
>+    free(cg->spanDeps);

In addition to this place the free call should go to js_FinishCodeGenerator.
(Assignee)

Updated

11 years ago
Blocks: 417999
(Assignee)

Comment 4

11 years ago
Created attachment 303786 [details] [diff] [review]
Igor's review changes

Thanks, how's this?
Assignee: general → crowder
Attachment #303782 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #303786 - Flags: review?(igor)
(Assignee)

Comment 5

11 years ago
Comment on attachment 303786 [details] [diff] [review]
Igor's review changes

Woops, missed comment #3.  New one coming right up.
Attachment #303786 - Attachment is obsolete: true
Attachment #303786 - Flags: review?(igor)
(Assignee)

Comment 6

11 years ago
Created attachment 303787 [details] [diff] [review]
like so
Attachment #303787 - Flags: review?(igor)
(Assignee)

Comment 7

11 years ago
Actually, the cleanup in the realloc() failure case isn't necessary since the CodeGenerator will get finished shortly.  I'll remove and submit one more.
(Assignee)

Comment 8

11 years ago
Created attachment 303788 [details] [diff] [review]
better
Attachment #303787 - Attachment is obsolete: true
Attachment #303788 - Flags: review?(igor)
Attachment #303787 - Flags: review?(igor)

Comment 9

11 years ago
Comment on attachment 303788 [details] [diff] [review]
better

>Index: jsemit.c
>===================================================================
>RCS file: /cvsroot/mozilla/js/src/jsemit.c,v
>retrieving revision 3.305
>diff -u -p -8 -r3.305 jsemit.c
>--- jsemit.c	16 Feb 2008 02:46:46 -0000	3.305
>+++ jsemit.c	17 Feb 2008 00:20:46 -0000
>@@ -101,16 +101,17 @@ js_InitCodeGenerator(JSContext *cx, JSCo
> }
> 
> JS_FRIEND_API(void)
> js_FinishCodeGenerator(JSContext *cx, JSCodeGenerator *cg)
> {
>     TREE_CONTEXT_FINISH(&cg->treeContext);
>     JS_ARENA_RELEASE(cg->codePool, cg->codeMark);
>     JS_ARENA_RELEASE(cg->notePool, cg->noteMark);
>+    free(cg->spanDeps);

Here cg->spanDeps most likely is null. Plus explicit check for non-null pointer will follow, for example, JS_free that calls free only with non-null argument. r+ with that fixed.
(Assignee)

Comment 10

11 years ago
Created attachment 303791 [details] [diff] [review]
for posterity

Thanks to Igor for excellent reviews.
Attachment #303788 - Attachment is obsolete: true
Attachment #303791 - Flags: review+
Attachment #303788 - Flags: review?(igor)
(Assignee)

Comment 11

11 years ago
Created attachment 303793 [details] [diff] [review]
one more refinement

A hint that we don't expect spanDeps to be non-NULL here.
Attachment #303791 - Attachment is obsolete: true
Attachment #303793 - Flags: review+
(Assignee)

Comment 12

11 years ago
Comment on attachment 303793 [details] [diff] [review]
one more refinement

This is needed as part of the effort to resolve bug 408921, which is a blocker.
Attachment #303793 - Flags: approval1.9?
(Assignee)

Comment 13

11 years ago
Comment on attachment 303793 [details] [diff] [review]
one more refinement

Ugh, the use of JS_UNLIKELY here causes a warning about cast-to-int, style choice:
    if (JS_UNLIKELY(cg->spanDeps != NULL))
or
    if (JS_UNLIKELY((int)cg->spanDeps))
Attachment #303793 - Attachment is obsolete: true
Attachment #303793 - Flags: approval1.9?
We see this elsewhere, != NULL is the right way to cope. However, IIRC an if-then with no else will predict the branch as taken (the if condition as false), at least on modern x86. Someone correct me if I'm mistaken. It makes me question the use of JS_{UN,}LIKELY in any thing less than interpreter-loop perf critical. The hint could still be good on architectures with explicit branch-likely/unlikely instructions to select, of course. This just seems less than critical-path, and so not worth the hassle.

/be
I'll plus a final patch that just uses if-then (assuming I'm correct about modern branch prediction hardware removing the instruction selection burden from hackers and compiler writers).

/be
Flags: blocking1.9+
(Assignee)

Comment 16

11 years ago
Comment on attachment 303791 [details] [diff] [review]
for posterity

This one uses if-then rather than the branch-prediction.  The idea of using JS_UNLIKELY was more to provide a hint to the reader that the condition should only happen in the rarest of circumstances (oom).
Attachment #303791 - Flags: approval1.9?
(Assignee)

Comment 17

11 years ago
Bugzilla interdiff seems to like these patches, you can review the delta using it, surprisingly enough.
(Assignee)

Updated

11 years ago
Attachment #303791 - Attachment is obsolete: false
Comment on attachment 303791 [details] [diff] [review]
for posterity

>+        size = sdbase ? SPANDEPS_SIZE(index) : SPANDEPS_SIZE_MIN;
>+        sdbase = (JSSpanDep *) realloc(sdbase, size + size);

So in the basis case (sdbase == NULL), 2 * SPANDEPS_SIZE_MIN bytes are allocated, which is not what the old code did:

>         if (!sdbase) {
>-            size = SPANDEPS_SIZE_MIN;
>-            JS_ARENA_ALLOCATE_CAST(sdbase, JSSpanDep *, &cx->tempPool, size);

To preserve the plain sense of SPANDEPS_SIZE_MIN, suggest initializing size in the basis case to SPANDEPS_SIZE_MIN / 2.

>-        } else {
>-            size = SPANDEPS_SIZE(index);
>-            JS_ARENA_GROW_CAST(sdbase, JSSpanDep *, &cx->tempPool, size, size);
>-        }
>-        if (!sdbase) {
>-            js_ReportOutOfScriptQuota(cx);
>+            js_ReportOutOfMemory(cx);

If you use JS_realloc instead of realloc, you don't need to call JS_ReportOOM here, it does it for you. It also updates the operation counter, FWIW.

I'll approve a patch that fixes these fine points.

/be

Comment 19

11 years ago
(In reply to comment #14)
> The hint could still be good on architectures with explicit
> branch-likely/unlikely instructions to select, of course. This just seems less
> than critical-path, and so not worth the hassle.

JS_UNLIKELY hint here serves self-documenting purpose indicating that the extra check is a conscious choice. Otherwise one day a purge to replace if (x) free(x) by just free(x) accidentally remove the check.

It could be argued that a comment is still necessary here. But then JS_UNLIKELY allows to be very short with comment's text.
True, but the mandatory != NULL bugged me enough to doubt the value. I'm ok either way (comment or JS_UNLIKELY), it's not big deal.

/be
(Assignee)

Comment 21

11 years ago
Created attachment 303885 [details] [diff] [review]
with Brendan's points addressed
Attachment #303791 - Attachment is obsolete: true
Attachment #303885 - Flags: review?(brendan)
Attachment #303791 - Flags: approval1.9?
(Assignee)

Comment 22

11 years ago
Created attachment 304065 [details] [diff] [review]
fix bad logic
Attachment #303885 - Attachment is obsolete: true
Attachment #304065 - Flags: review?(brendan)
Attachment #303885 - Flags: review?(brendan)
Comment on attachment 304065 [details] [diff] [review]
fix bad logic

>+        JS_free(cg->spanDeps);
[...]
>+        sdbase = (JSSpanDep *) JS_realloc(sdbase, size + size);
[...]
>+    JS_free(cg->spanDeps);

These can't compile, because JS_*alloc and JS_free take a leading cx parameter. Which means this patch can't have been tested, either. Please compile and test before requesting review!

/be
Attachment #304065 - Flags: review?(brendan) → review-
(Assignee)

Comment 24

11 years ago
Created attachment 304076 [details] [diff] [review]
with a compile and a test-suite run

Sorry for sloppiness.
Attachment #304065 - Attachment is obsolete: true
(Assignee)

Updated

11 years ago
Attachment #304076 - Flags: review?(brendan)
Comment on attachment 304076 [details] [diff] [review]
with a compile and a test-suite run

>+    if (cg->spanDeps) /* only non-null after OOM */

English nit: "non-null only after OOM". r/a=me with that, thanks.

/be
Attachment #304076 - Flags: review?(brendan)
Attachment #304076 - Flags: review+
Attachment #304076 - Flags: approval1.9+
(Assignee)

Comment 26

11 years ago
Checked-in with comment reworded:

jsemit.c:3.306
Status: ASSIGNED → RESOLVED
Last Resolved: 11 years ago
Resolution: --- → FIXED

Updated

11 years ago
Flags: in-testsuite-
Flags: in-litmus-

Comment 27

10 years ago
(In reply to comment #14)
> if-then with no else will predict the branch as taken (the if condition as
> false), at least on modern x86. Someone correct me if I'm mistaken.

The Intel docs say that the static branch prediction algorithm predicts forward conditional branches to be NOT taken.  This means that if-bodies are assumed to be executed. [http://www.intel.com/design/processor/manuals/248966.pdf page 88]  

That means that the common js coding practice of
  if (unlikely_condition)
      return NULL;
  happyPath();
  return whatever;

is incurring a small performance penalty on the Intel processors that use static branch prediction.  (Intel recommends that coders assume that all Intel processors use the static algorithm, even though they don't.)

This would be faster:
  if (likely_condition) {
     happyPath();
     return whatever;
  }
  return NULL;

Though, I don't necessarily think that that coding practice is the best.

I'm not complaining about this bug, merely responding to Brendan's "correct me if I'm wrong" request.

Robin, thanks for the info. I was thinking of something I'd read a while ago. It may have been about SGI/MIPS compilers, even (dating myself badly here), based on studies showing that control flow tends to avoid then clauses of else-less if statements, but to prefer then clauses of if-else statements.

But none of this can be relevant on modern micro-architectures because of caching and branch prediction independently wanting to favor what was prefetched, what's "next". Anything else is wasted i-cache lines of abnormal early-out "then" code, and unnecessarily allocated BTBs (dynamic branch prediction resources) to target around the exceptional code.

So really, the "normal code least indented, cast out errors and hard cases" style of SpiderMonkey is all wrong for modern CPUs! This is one reason the PGO builds show such a speedup. It isn't only SpiderMonkey that favors that style, BTW. The folks at Intel recommend not doing violence to such code, instead using PGO as we are now for Firefox 3.

/be
Back at Netscape, Intel folks came to talk about Pentium IV (I think) and some advised us to bend our C code about uarch economics: "don't use shifts, multiply instead". Clearly a job for the compiler, not the programmer. How hard can it be to tweak instruction selection to strength-increase shift to multiply instead of strength-reduce multiply to shift? ;-).

PGO is a bigger trick to require of compilers, but it seems to be working for us. Still, this is all food for thought and worth reflecting on, so thanks again.

/be
Indeed all our fast path optimizations (the polymorphic property cache; shaver's array work in js_Interpret) are if-then structures where everything in the pipe is the fast code. So less optimal un-PGO'ed code elsewhere fades out of profiles.

/be
You need to log in before you can comment on or make changes to this bug.