Bug 454435 (Closed): Opened 11 years ago, Closed 10 years ago

RefillDoubleFreeList runtime locking causes poor performance on parallel ray tracing

Categories: Core :: JavaScript Engine (enhancement)
Status: RESOLVED FIXED
People: Reporter: bent.mozilla; Assigned: igor
References: Blocks 1 open bug
Details: Keywords: perf; Whiteboard: fixed-in-tracemonkey
Attachments: 2 files, 6 obsolete files

I picked up vlad's parallel ray tracing demo yesterday and realized that the runtime locking issue in RefillDoubleFreeList still exists. On a multicore machine we get some performance wins using multiple threads but hardly the win we'd hope for.

Here are some suggestions Igor had:

> Another suggestion is to increase the free list size for double from
> the current maximum of 32 values to n*32 where n is at least 2. Yet
> another possibility is to use compare-and-swap to implement lock-free
> grabbing of allocation bitmap word that is used to build the
> context-private list. This way the lock will be taken only when
> allocating a new GC arena holding about 500 doubles.
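
A minimal sketch of the compare-and-swap idea (GrabBitmapWord is an illustrative name, not tree code; js_CompareAndSwap is the existing primitive from jslock.h, and a set bit means the cell is allocated, matching the current bitmap convention):

static jsbitmap
GrabBitmapWord(volatile jsbitmap *wordp)
{
    for (;;) {
        jsbitmap bits = *wordp;
        if (bits == jsbitmap(-1))
            return 0;           /* every cell in this word is taken */

        /* Claim the whole word: mark all of its cells as allocated. */
        if (js_CompareAndSwap((jsword *) wordp, (jsword) bits, (jsword) -1))
            return ~bits;       /* the free cells this thread just won */
    }
}

The winning thread then threads the cells for its ~bits onto the context-private list, so the GC lock is needed only when a fresh arena must be allocated.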
Bah, my CC list disappeared somehow.
This may be a bit late, if not heretical: why don't we use a separate JSRuntime per thread, and turn off JS_THREADSAFE?

We would need to do something to jsdtoa.cpp to get rid of its global locks, or turn them on without JS_THREADSAFE if they are not showing up on profiles.

We'd get perf wins on all threads from the lack of JS_THREADSAFE, and we would get better isolation and faster GCs on workers (and on the main thread, lesser issue with its bigger heap).

/be
I have a patch in progress which does just that. There are some issues with script-blocking in the DOM that I may need help with. See http://hg.mozilla.org/users/bsmedberg_mozilla.com/xpcthreading , which appears to be misconfigured somehow at the moment.
I'm all for it, though it'd be perhaps too big an API incompatibility to spring on our extension authors for 3.1?  I wonder if any of them are still left using multiple threads after the ThreadManager punking, though, so maybe now is indeed the time.  Worth reaching out to extension and app folks to see whether they need shared objects for their multi-threading needs.  (Copying Blizzard here to that end.)

With a separate runtime (and therefore object pool) per thread, we would also be able to just rid ourselves of the XPConnect monitor, long an ambition of ours. :)
Yeah, I ripped out all the xpconnect locks and monitors.
The runtime-per-thread idea was floated at the start of the workerthreads stuff but I think everyone was concerned about the added costs of copying strings across runtimes... Not a big deal for passing simple messages across workers but it might be for large blobs of JSON'd objects. Is that still a concern?
My guess is that the cost of copying string data will be much lower than the pervasive locking costs we pay today.
If it becomes a concern, we can probably use the external-string facility to special-case string handling.  memcpy of string data should be dwarfed by the work done by the threads themselves, I would sort of hope, and perhaps even offset completely by the locking wins.  Brendan believes that crossing between threads *should* hurt, and he's convinced me of that path's righteousness.
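
A hedged sketch of that external-string special case (JS_AddExternalStringFinalizer and JS_NewExternalString are the real JSAPI entry points; FinalizeSharedChars, chars, and len are hypothetical):

/* Once, at startup: register a finalizer type for shared buffers. */
intN type = JS_AddExternalStringFinalizer(FinalizeSharedChars);

/* On the receiving thread: wrap the sender's immutable jschar buffer
 * instead of memcpy'ing it into the destination runtime's heap. */
JSString *str = JS_NewExternalString(cx, chars, len, type);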
(In reply to comment #8)
> If it becomes a concern, we can probably use the external-string facility to
> special-case string handling.  memcpy of string data should be dwarfed by the
> work done by the threads themselves, I would sort of hope, and perhaps even
> offset completely by the locking wins.

I think that a bigger issue might be memory usage more than the memcpy itself; but as you say, it can be avoided esp. just for strings.  It does make it harder to have some future world where real JS objects can be passed around (chrome) threads, but maybe that world is terrifying to begin with..
(In reply to comment #6)
> The runtime-per-thread idea was floated at the start of the workerthreads stuff
> but I think everyone was concerned about the added costs of copying strings
> across runtimes... Not a big deal for passing simple messages across workers
> but it might be for large blobs of JSON'd objects. Is that still a concern?

Let's have that problem.

Vlad: we were never gonna multi-thread Gecko, so I don't know what future world that was. A dystopian comic-book one where everyone dies, even the plucky comic relief, probably.

/be
Hooray! I'm sure we can figure out how to share string data or even more complex stuff if/when that becomes important.
FWIW I suspect that future versions of the Worker spec will allow passing Blob objects across workers. However this might actually be a good thing since Blobs are read-only and so can easily be made thread safe and shared across threads.

Further, if we allow JSON objects to be passed across workers we can serialize directly to a shared string buffer that can be passed across threads without copying.
(In reply to comment #9)
> I think that a bigger issue might be memory usage more than the memcpy itself;
> but as you say, it can be avoided esp. just for strings.  It does make it
> harder to have some future world where real JS objects can be passed around
> (chrome) threads, but maybe that world is terrifying to begin with..

(In reply to comment #11)
> Hooray! I'm sure we can figure out how to share string data or even more
> complex stuff if/when that becomes important.
Right now actually.  Over in Bug 450290 I'm using the thread manager to do work in JS on a background thread.  I do believe the code could be refactored such that this is OK, but the tests are a bit more difficult.

Additionally, we plan on doing more work in the background for Places so writes don't block the UI whenever possible. We haven't fully spec'd it out yet, but not being able to pass objects around the different threads makes things more difficult to do.


Would it still be safe to create, in JS, an object that implements nsIRunnable on one thread, and then it's run on another thread?
(In reply to comment #13)
> Would it still be safe to create, in JS, an object that implements nsIRunnable
> on one thread, and then it's run on another thread?

No.

However, if you can refactor your code to avoid sharing objects, you will probably find it easier to maintain in the long run.
(In reply to comment #14)
> No.
So, that means anybody who wants to use the thread manager in JS to create a background thread cannot anymore, right?  There are very good reasons to do this, so I don't see why we'd want to take that away now.

> However, if you can refactor your code to avoid sharing objects, you will
> probably find it easier to maintain in the long run.
While this is generally true, in the current use case it isn't.  I'm sharing a variable that stores a wrapped native mozIStorageConnection.  Having to get it for each thread would just be annoying.
(In reply to comment #15)
> I'm sharing a
> variable that stores a wrapped native mozIStorageConnection.  

Why?
(In reply to comment #16)
> Why?
I define a smart getter on the main thread that is used on the background thread.  Regardless though, this change would break my code because I create objects that implement nsIRunnable on the main thread and dispatch them to my background thread.
Ok, so, sorry to throw another wrench in here, but sicking realized that I actually lied earlier when I said that worker threads only pass strings across threads...

Workers are run on a thread pool currently. Each thread that gets created by the pool is given its own context. When it's time to schedule a worker we assign the worker's global object to the context, run the worker, and then clear the global from the context so that another worker may reuse the thread. If the thread is busy then the pool will create a new thread (up to some magic limit) and run on that one. A worker itself is only ever running on one thread at a time, but it may be a different thread each time it runs. This lets us limit the number of OS threads that are created while also allowing unlimited workers.
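
In pseudo-JSAPI terms, the scheduling step is roughly this (RunQueuedWorker, threadContext, and worker->globalObj are illustrative names; JS_SetGlobalObject is the real call):

/* Executed on whichever pool thread picks up the worker; each pool
 * thread owns one JSContext for its lifetime. */
JSContext *cx = threadContext;
JS_SetGlobalObject(cx, worker->globalObj);  /* bind this worker's global */
RunQueuedWorker(cx, worker);                /* run the pending work */
JS_SetGlobalObject(cx, NULL);               /* clear it so another worker can reuse the thread */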

It should be possible to make a new runtime per worker (although this sounds expensive and wasteful), but even that would cause problems because the workers use xpconnect to reflect XHR, (a special flavor of) DOM events, and listeners. If xpconnect has its own runtime then I couldn't use it any longer (or I'd have to do some cross-runtime-proxy calls)...
I think we can have an xpconnect per worker, and just run that script on whatever physical thread is available. I'll need to do some rework and thinking, because all objects associated with an xpconnect will need to be able to get back to it via a pointer chain, rather than using nsXPConnect::GetXPConnect()
I'm pretty sure we can get you from JSObject* to JSRuntime* in O(1), using something similar to what we do for mapping JSGCThing* to the GC flags.
Yeah, I found JS_SetRuntimePrivate and I'm going to set that to the matching nsXPConnect*.
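
For reference, a sketch of that pairing (rt and xpc assumed to be in scope):

/* At runtime creation: remember the owning nsXPConnect. */
JS_SetRuntimePrivate(rt, xpc);

/* Later, anywhere the runtime is reachable, in O(1): */
nsXPConnect *owner = static_cast<nsXPConnect *>(JS_GetRuntimePrivate(rt));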
Oh, right, I left out Vlad and Roc's suggestion that XPConnect be somehow per-runtime rather than per-thread.
So has anyone looked into what extensions might actually be bitten by this work? I know there are extensions out there that use threads (I don't know which ones, though), I know Songbird uses threads, and who knows what other apps. Having some sort of idea of the impact of this change for those would seem to be a requirement before we even consider taking a change like this, IMO.
Benjamin: If you need to go from JSObject* to JSRuntime*, without any cx param, we should generalize and rename js_GetGCStringRuntime.

Johnny's right, this is a big shift, and we do have MT JS API using plugin authors out there (see bug 424376, but I can't find the reference to the outfit whose plugin we broke -- Igor, do you remember?).

We need a way for such platform customers to cope. One step forward would be to leave JS_THREADSAFE on but still use a runtime per worker. The main thread and any plugin-created threads would share the main (currently only) runtime used by main-thread XPConnect. Do-able?

With enough hand-holding and bridging help such plugins might migrate to a world where JS_THREADSAFE is off. Can we do this in 3.1 timeframe? I don't think so. We could jump to Mozilla 2 at any time and declare API shenanigans, but really: we need a plan and hand-holders.

/be
Could we maybe get a post in dev.platform about this since it is not going to get a lot of exposure sitting in a bug?
So having one XPConnect per runtime only helps if we create one runtime per worker, no? This is certainly something we could do, but it would be a fairly costly solution. Especially considering that the number of workers is controlled by the page and we can't really set any reasonably low cap on it. It would also negate the current limits we have on how much memory JS is allowed to eat since that limit is per runtime.
Why is a runtime per-worker costly? I was imagining that it would be less costly, because you have smaller heaps to walk. Is the dynamic footprint of a JSRuntime larger than I imagine?

We'd certainly have to think about how resource limits were handled.
(In reply to comment #24)
> We need a way for such platform customers to cope.

Wild and crazy idea: enable JS_THREADSAFE dynamically for those customers?
Blocks: ST-XPConnect
I've cloned bug 455548 to discuss single-threading xpconnect, so that this bug can just be about RefillDoubleFreeList if necessary.
(In reply to comment #30)
> Does
> http://crash-stats.mozilla.com/report/index/cace737a-846a-11dd-a334-001cc45a2c28
> Have any relevance to this bug.

No.
The test is for an optimized thread-safe js shell and measures the performance of running a function N times sequentially versus running it via scatter in parallel. For the function, the test first uses code with a loop that creates a double on each iteration, then similar code that involves only int calls. By default N is set to 4 and should not exceed the number of cores on the system.

Here are the results of a typical run on a 4-core 2.6 GHz Xeon CPU under 64-bit Ubuntu Linux (no jit):

~/m/31-ff/js/src $ ~/m/31-ff/js/src/Linux_All_OPT.OBJ/js ~/s/b.js 2
2 - number of threads
double case
62.5 - average time for uncontested case
120 - scatter timing
1.92 - overhead
int case
51.5 - average time for uncontested case
81 - scatter timing
1.5728155339805825 - overhead

~/m/31-ff/js/src $ ~/m/31-ff/js/src/Linux_All_OPT.OBJ/js ~/s/b.js 3
3 - number of threads
double case
62.333333333333336 - average time for uncontested case
192 - scatter timing
3.0802139037433154 - overhead
int case
51.666666666666664 - average time for uncontested case
153 - scatter timing
2.9612903225806453 - overhead

~/m/31-ff/js/src $ ~/m/31-ff/js/src/Linux_All_OPT.OBJ/js ~/s/b.js 4
4 - number of threads
double case
63.25 - average time for uncontested case
285 - scatter timing
4.5059288537549405 - overhead
int case
51.25 - average time for uncontested case
214 - scatter timing
4.175609756097561 - overhead

Here the overhead is the scatter timing divided by the average timing of the sequential case. With ideal parallelism it would be 1, but the real world is far from ideal.
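
For instance, the 2-thread double run above gives overhead = 120 / 62.5 = 1.92.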

Already with 3 threads the overhead of the scatter case is about 3, and with 4 threads it exceeds 4 for both the double and int cases. So according to the results, scatter may speed things up only with 2 threads, and even there it makes sense only for ints, as the overhead of double allocation offsets any wins from the parallel execution.
The previous test case is not good, as it measures the maximum scatter timing, not the average one. This penalizes the parallel execution time. The new test fixes that and improves the reporting. Here are the results of a typical run on a 2.66 GHz 4-core Xeon CPU under 64-bit Ubuntu, no jit:

~/m/31-ff/js/src $ ~/m/31-ff/js/src/Linux_All_OPT.OBJ/js ~/s/b.js
Best timing of 10 scatter runs:
double case
threads=1:      62.0
threads=2:      84.0    (overhead 1.35)
threads=3:      122.3   (overhead 1.97)
threads=4:      169.8   (overhead 2.74)
int case
threads=1:      50.0
threads=2:      59.5    (overhead 1.19)
threads=3:      93.7    (overhead 1.87)
threads=4:      131.5   (overhead 2.63)

Here again both int and double cases show substantial slowdown compared with the sequential case even with 2 threads. But the double case slows down more than the int case (1.35 versus 1.19 overhead).
Attachment #339752 - Attachment is obsolete: true
So I don't think we can conclude much from these numbers yet because of bug 419086. Here's the shark profile of the int case with 4 threads on my 2xdual-core machine:

42.2% pthread_mutex_lock  
42.1%  PR_Lock  
24.9%   JS_BeginRequest 
24.9%    JS_ResumeRequest 
24.9%     JS_YieldRequest 
24.9%      my_BranchCallback(JSContext*, JSScript*) 
24.9%       js_ResetOperationCount  
24.9%        js_Interpret 
24.9%         js_Invoke 
24.9%          js_InternalInvoke  
24.9%           JS_CallFunctionValue  
24.9%            DoScatteredWork(JSContext*, ScatterThreadData*)  
18.7%             RunScatterThread(void*) 
18.7%              _pt_root 
18.7%                thread_start
With the patch from bug 419086 I get these numbers on my 2xCore2 machine:

Best timing of 10 scatter runs:

double case
threads=1:  192.0
threads=2:  344.5   (overhead 1.79)
threads=3:  670.7   (overhead 3.49)
threads=4:  1002.5  (overhead 5.22)

int case
threads=1:  137.0
threads=2:  179.0   (overhead 1.31)
threads=3:  209.3   (overhead 1.53)
threads=4:  231.2   (overhead 1.69)
And without the patch:

double case
threads=1:  203.0
threads=2:  440.5   (overhead 2.17)
threads=3:  1098.7  (overhead 5.41)
threads=4:  1752.5  (overhead 8.63)

int case
threads=1:  159.0
threads=2:  298.0   (overhead 1.87)
threads=3:  420.0   (overhead 2.64)
threads=4:  476.0   (overhead 2.99)
Attached patch work in progress (obsolete) — Splinter Review
The patch prototypes the idea of using the whole page as a local free list for doubles. The patch stores the list in JSGCFreeListSet to ensure that a non-empty free list can be used by other contexts and threads when the current context leaves the current request.
Attached patch tested implementation (obsolete) — Splinter Review
This is what I am going to benchmark.

To Ben Turner:

Could you check whether, with this patch, the timing is still dominated by the contested double lock in RefillDoubleFreeList?
Attachment #340608 - Attachment is obsolete: true
I'm on a different machine today (1xCore2Duo) but here are my results. Both of these runs have the patch from bug 419086 applied.

Without patch (attachment 340789 [details] [diff] [review]):

double case
threads=1:	83.0
threads=2:	284.5	(overhead 3.43)
threads=3:	550.0	(overhead 6.63)
threads=4:	613.0	(overhead 7.39)
int case
threads=1:	58.0
threads=2:	103.5	(overhead 1.78)
threads=3:	137.0	(overhead 2.36)
threads=4:	150.8	(overhead 2.60)

---------------------------------------

With patch (attachment 340789 [details] [diff] [review]):

double case
threads=1:	80.0
threads=2:	158.5	(overhead 1.98)
threads=3:	225.0	(overhead 2.81)
threads=4:	255.5	(overhead 3.19)
int case
threads=1:	59.0
threads=2:	103.0	(overhead 1.75)
threads=3:	137.3	(overhead 2.33)
threads=4:	155.2	(overhead 2.63)

In the 2-thread double case my profile says that 76% of the time is spent in js_Interpret, 9.7% in js_NewDoubleInRootedValue, and 1.5% in RefillDoubleFreeList.
Sadly attachment 340789 [details] [diff] [review] doesn't help parallel raytrace as much as I'd hoped, but it is definitely closer! This sped up a five run average of 2-thread parallel raytrace on my core2 machine from 13015 to 11539 ms. Looks like the contention now is between js_NewGCThing and RefillDoubleFreeList.
Attached patch whole page as free list (obsolete) — Splinter Review
With the new patch the free list spans the whole GC page, both for doubles and for ordinary GC things.

What are the performance numbers with this patch?
Attachment #340789 - Attachment is obsolete: true
The patch required an update due to TM merge.
Attachment #341128 - Attachment is obsolete: true
Severity: normal → enhancement
(In reply to comment #41)
> What are the performance numbers with this patch?

They were great, but the output of the raytrace was incorrect, so I suspect that this patch broke something.

We're closing in on 3.1; is this something that should block?
Flags: blocking1.9.1?
Wasn't the incorrect output due to a TM bug, since fixed?

/be
(In reply to comment #44)
> Wasn't the incorrect output due to a TM bug, since fixed?

That was something different. Workers still don't use JIT so this was always tested with the interpreter only.
Flags: wanted1.9.1+
Flags: blocking1.9.1?
Flags: blocking1.9.1-
Updated patch and results with jit enabled for workers coming?

/be
I should update the patch after addressing my P1 and P2. With luck it could even be this week.
/me cheers!

I'll be happy to benchmark.
Sayrer, given bug 487948 should we reconsider blocking on this? Seems like any math-heavy worker could trigger this bottleneck...
Flags: blocking1.9.1- → blocking1.9.1?
I don't think we can block on a worker performance bug, as there won't be a lot of web content depending on it. Thoughts?
I would be worried about taking this into final without beta exposure. So if we block we have to fix this immediately for b4.
I doubt this bug only hurts workers, though. Double allocation performance is pretty horrible even for single-threaded use. bz's mandelbrot benchmark shows that well (trace-test.js).
Yes, it probably bites in more than just workers, but it's not worse than Firefox 3.
Flags: blocking1.9.1? → blocking1.9.1-
Flags: blocking1.9.2?
Keywords: perf
Flags: wanted-next+
Flags: blocking1.9.2?
Flags: blocking1.9.2-
sayrer, this is the bug I was talking about the other day. Any way we can bump the priority on this? It sailed right through 1.9.1 and looks like it may miss 1.9.2 as well unless some action is taken.
I have a couple ideas for this. bent, can you catch me on irc?
I will update/fix the patch before Wednesday.
Attached patch whole page as free list v2 (obsolete) — Splinter Review
Here is an updated but completely untested patch. The idea is to minimize the amount of time the GC spends inside GC_LOCK even when preparing the free lists. In particular, the lists are assembled outside the lock.
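
The pattern, in sketch form (TakeDoubleArena is a stand-in name for the patch's list manipulation; TurnUsedArenaIntoDoubleList is the patch's real helper, and data is the JSThreadData):

JSGCArenaInfo *a;

JS_LOCK_GC(rt);
a = TakeDoubleArena(rt);    /* cheap under the lock: unlink one arena */
JS_UNLOCK_GC(rt);

if (a) {
    /* The O(cells) bitmap scan and list threading run lock-free. */
    data->doubleFreeList = TurnUsedArenaIntoDoubleList(a);
}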
Assignee: general → igor
Attachment #341260 - Attachment is obsolete: true
updated patch that passes shell tests
Attachment #403848 - Attachment is obsolete: true
With the patch from the comment above, the test case from comment 33 changes its output from:

Best timing of 10 scatter runs:
double case
threads=1:      76.0
threads=2:      79.5    (overhead 1.05)
threads=3:      99.0    (overhead 1.30)
threads=4:      106.0   (overhead 1.39)

to:
Best timing of 10 scatter runs:
double case
threads=1:      77.0
threads=2:      77.0    (overhead 1.00)
threads=3:      78.0    (overhead 1.01)
threads=4:      79.3    (overhead 1.03)

This is on 4-core CPU.
Comment on attachment 404249 [details] [diff] [review]
whole page as free list v3

Asking for review, as the patch has passed the try server.
Attachment #404249 - Flags: review?(brendan)
Comment on attachment 404249 [details] [diff] [review]
whole page as free list v3

>@@ -104,16 +104,20 @@ FinishThreadData(JSThreadData *data)
> #if defined JS_TRACER
>     js_FinishJIT(&data->traceMonitor);
> #endif
> }
> 
> static void
> PurgeThreadData(JSContext *cx, JSThreadData *data)
> {
>+    /*
>+     * Clear the double free list to release all the pre-allocated doubles.
>+     */
>+    data->doubleFreeList = NULL;
>     js_PurgeGSNCache(&data->gsnCache);

Blank line after the data->doubleFreeList = NULL; line.

>+const uint32 DOUBLES_PER_ARENA =
>+    (ARENA_INFO_OFFSET * JS_BITS_PER_BYTE) / (JS_BITS_PER_DOUBLE + 1);
. . .
>+const uint32 DOUBLES_ARENA_BITMAP_WORDS =
>+    JS_HOWMANY(DOUBLES_PER_ARENA, JS_BITS_PER_WORD);

Do we want size_t instead of uint32 for these consts, for better 64-bit perf? They should all fit in 32 bits but their initializers involve sizeof, so size_t is the "obvious" type that doesn't require more analysis by the code reader.

Note for ref below.

>+const uint32 DOUBLES_ARENA_BITMAP_OFFSET =
>+    ARENA_INFO_OFFSET - DOUBLES_ARENA_BITMAP_WORDS * sizeof(jsuword);
. . .
> #define DOUBLE_ARENA_BITMAP(arenaInfo)                                        \
>     (CHECK_DOUBLE_ARENA_INFO(arenaInfo),                                      \
>      (jsbitmap *) arenaInfo - DOUBLES_ARENA_BITMAP_WORDS)

jsbitmap must be word-sized, and it is -- but without any static assertion or other such sanity check. Do we care? Perhaps use jsbitmap instead of jsuword throughout?
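
(For example, JS_STATIC_ASSERT(sizeof(jsbitmap) == sizeof(jsuword)); next to the typedef would lock that in.)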

>+TurnUsedArenaIntoDoubleList(JSGCArenaInfo *a)
> {
>+    /*
>+     * When m below points the last bitmap's word in the arena, its high high

"high high" typo -- "high half"?

>+     * corresponds to non-existing cells and cellptr is outside the space
>+     * allocated for doubles. ClearDoubleArenaFlags sets such bits to 1. Thus
>+     * even for this last word its bit is unset iff the corresponding cell
>+     * exists and free.
>+     */
>+    JSGCDoubleCell *list = NULL;
>+    jsuword cellptr = ARENA_INFO_TO_START(a) +
>+                      DOUBLES_ARENA_BITMAP_WORDS * JS_BITS_PER_WORD *
>+                      sizeof(jsdouble);

Use DOUBLES_PER_ARENA instead of the second line of the cellptr initializer to simplify?

>+    for (jsbitmap *m = DOUBLE_ARENA_BITMAP_END(a);
>+         m != DOUBLE_ARENA_BITMAP(a);) {
>+        JS_ASSERT(cellptr >= ARENA_INFO_TO_START(a));

Here isn't cellptr strictly > ARENA_INFO_TO_START(a)?

>+        JS_ASSERT((cellptr - ARENA_INFO_TO_START(a)) %
>+                  (JS_BITS_PER_WORD * sizeof(jsdouble)) == 0);
>+        --m;
>+        jsbitmap bits = *m;
>+        if (bits == jsbitmap(-1)) {
>+            cellptr -= JS_BITS_PER_WORD * sizeof(jsdouble);
>+        } else {
>+            const unsigned unroll = 4;
>+            const jsbitmap unrollMask = (jsbitmap(1) << unroll) - 1;
>+            JS_STATIC_ASSERT(JS_BITS_PER_WORD % unroll == 0);

Asserting ((JS_BITS_PER_WORD & unrollMask) == 0) is more righteous ;-).

>+            for (unsigned n = JS_BITS_PER_WORD; n != 0; ) {
>+                n -= unroll;

I'd use a do {...} while ((n -= unroll) != 0); loop, with unsigned n = JS_BITS_PER_WORD; before. The block scope ends with the loop anyway.

>+static JSGCDoubleCell *
>+TurnEmptyArenaIntoDoubleList(JSGCArenaInfo *a)
>+{
>+    JSGCDoubleCell *head = reinterpret_cast<JSGCDoubleCell*>(ARENA_INFO_TO_START(a));
>+    JSGCDoubleCell *lastcell = head + DOUBLES_PER_ARENA - 1;
>+    for (JSGCDoubleCell *cell = head; cell != lastcell; ++cell)

Perhaps a blank line before the for loop to let things breathe a bit.

>         /*
>+         * Loop until we find arena with some free doubles. We turn arenas
>+         * into free lists outside the to minimize lock contention.

Missing word ("lock") after "outside the".

>          */
>+        while ((a = rt->gcDoubleArenaList.cursor)) {

Nit: house style favors redundant != NULL here.

Nice patch, r=me with these addressed.

/be
Attachment #404249 - Flags: review?(brendan) → review+
http://hg.mozilla.org/tracemonkey/rev/1c011b6b4f57

I nominate this for 1.9.2 per comment 54.
Flags: wanted1.9.2?
Whiteboard: fixed-in-tracemonkey
ok wanted+, but I'll merge it.
Flags: wanted1.9.2? → wanted1.9.2+
http://hg.mozilla.org/mozilla-central/rev/1c011b6b4f57
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Woo! Thanks guys!
(In reply to comment #63)
> ok wanted+, but I'll merge it.

Did this happen? 1.9.2 flag doesn't indicate so, but I'm not sure it was updated correctly.  Might be relevant for a worker-based JPEG encoder performance problem, though that's speculation on my part.
1.9.2 and m-c have more or less the same behaviour with that JPEG encoder test.
(In reply to comment #66)
> Did this happen? 1.9.2 flag doesn't indicate so, but I'm not sure it was
> updated correctly.

This did not make it to 1.9.2.