Closed Bug 120992 Opened 23 years ago Closed 23 years ago

Decimal to string conversion very slow because of locking

Tracking

()

Status:

VERIFIED FIXED

Milestone:

mozilla1.1alpha

People

(Reporter: bratell, Assigned: bratell)

References

Details

(Keywords: js1.5, perf)

Attachments

(4 files, 10 obsolete files)

Patch that uses a stack based buffer. mk1 23 years ago Daniel Bratell 22.38 KB, patch		Details \| Diff \| Splinter Review
New better patch. 23 years ago Daniel Bratell 23.82 KB, patch		Details \| Diff \| Splinter Review
With one more stack->heap move 23 years ago Daniel Bratell 23.83 KB, patch		Details \| Diff \| Splinter Review
fast -> local naming change 23 years ago Daniel Bratell 23.88 KB, patch		Details \| Diff \| Splinter Review
Merged with latest on the trunk 23 years ago Daniel Bratell 25.11 KB, patch		Details \| Diff \| Splinter Review
benchmark timing test for binary->decimal->binary 23 years ago Kenton Hanson (gone) 516 bytes, text/plain		Details
New tuned patch 23 years ago Daniel Bratell 25.68 KB, patch		Details \| Diff \| Splinter Review
The previous testcase webified. 23 years ago Daniel Bratell 872 bytes, text/html		Details
Higher lock fix for jsdtoa.c 23 years ago Ivan Eisen 81.99 KB, text/plain		Details
Graph over alternatives 23 years ago Daniel Bratell 25.51 KB, image/png		Details
Updated high level locking. No inline. 23 years ago Daniel Bratell 3.17 KB, patch	khanson : review+	Details \| Diff \| Splinter Review
Patch addressing Brendan's comments 23 years ago Daniel Bratell 5.55 KB, patch		Details \| Diff \| Splinter Review
The New patch which will bring happiness to all the world. 23 years ago Daniel Bratell 6.25 KB, patch	khanson : review+ brendan : superreview+	Details \| Diff \| Splinter Review
fixed webified testcase 23 years ago Jacob Kjome 1.23 KB, text/html		Details

Daniel Bratell

Assignee

Description

•

23 years ago

js_dtoa is very slow compared to how fast it should be. It's for instance 250 times slower than converting an integer to a string. That's all because of locking. For some reason js_dtoa help functions diff, lshift, i2b, d2b and mult all call Balloc to get a buffer which ends up doing locking with PR_Lock and PR_Unlock. For 10000 calls to js_dtoa I see 235 000 calls to Balloc. Balloc and Bfree is about 75% of the function. The rest is divided between quorem, multadd, diff and the function JS_dtoa iself. I tried to look at the function but it looks to pure greek to me. In fact I think greek is somewhat easier unless you were the one to write the function.

Daniel Bratell

Assignee

Comment 1

•

23 years ago

Attached patch Patch that uses a stack based buffer. mk1 (obsolete) — Details — Splinter Review

This patch passes around a stack based allocator that in the normal cases I've tested means that we never use the heap or the shared buffer. This saves 70% or more of the function time (that is 200ms on the sort test case page). It seems to work, so I could use some reviews. The patch is a couple of hundred lines long, but most of the lines are just changed to pass around a context (ctx) that's used when allocating memory. Brendan?

Daniel Bratell

Assignee

Comment 2

•

23 years ago

Assigning to myself since I did a patch after all.

Assignee: rogerl → bratell

Keywords: patch, perf

Daniel Bratell

Assignee

Updated

•

23 years ago

Blocks: 117611

Brendan Eich [:brendan]

Comment 3

•

23 years ago

Yeah, the David M. Gay (based on Guy Steele) dtoa code is crufty, and the way thread-safety was hacked in without undoing the needless contention through Balloc, is all wrong. Thanks for the patch. Kenton, could you review first and comment? I'm tied up elsewhere. /be

Keywords: js1.5, mozilla0.9.9

Daniel Bratell

Assignee

Comment 4

•

23 years ago

Just don't review this just yet. I've found a problem with the p5s routine that keeps Bigints on a static list. It used to work fine, but not anymore when the Bigints are coming from the stack. I didn't see that before because apparently the stack buffer was so small that all bigints in p5s happened to be allocated on the heap. I then tuned the stack size and now it misbehaves. Patch coming up tonight.

Daniel Bratell

Assignee

Comment 5

•

23 years ago

Attached patch New better patch. (obsolete) — Details — Splinter Review

This is the promised patch. dtoa is now 2.7 times faster than before in my tests. Since the previous patch I've tuned the constants so that the heap is very seldom needed, after all, almost all decimals sent in is small. Also, the Bigints are moved to the heap before being put on the shared list so that they can be reused. In 20000 calls I see locking twice and heap allocation once. The rest of the function time is spent in maths that is better left to people who understands it. khanson, can you review this?

Attachment #65835 - Attachment is obsolete: true

Kenton Hanson (gone)

Comment 6

•

23 years ago

Comment on attachment 65894 [details] [diff] [review] New better patch. Yes I will review this patch. I also have been thinking about ways to optimize dtoa =Kenton=

Kenton Hanson (gone)

Comment 7

•

23 years ago

@@ -702,10 +805,10 @@ } PR_Unlock(p5s_lock); if (wasted_effort) { - Bfree(wasted_effort); + fastBfree(ctx, wasted_effort); } #else - p51 = p5->next = mult(p5,p5); + p51 = p5->next = moveToSharedMemory(mult(ctx, p5,p5)); should probably be + p51 = p5->next = moveToSharedMemory(ctx, mult(ctx, p5,p5)); I am having trouble. I enter to the console "x = 3.3e199;" then when I enter "x" to the console I get an "unmapped memory exception." FastBalloc is producing an unintialized value in bi->k, leading me to believe that a pointer is invalid. I am using Codewarrior on a Mac. =Kenton=

Daniel Bratell

Assignee

Comment 8

•

23 years ago

Yes, the ctx should be there. I guess there must be some bug when building without JS_THREADSAFE. How do I build and run the console (I presume you don't meen the JavaScript Console in Mozilla)? I tried 3.3e199 in the javascript console in the tools menu in Mozilla and got "3.3e+199" back. The bi->k returned from fastBalloc should have been set in initFastDToAContext so a damaged pointer sounds like a plausible explanation. I will try to find it as soon as I figure out how to build single threaded.

Daniel Bratell

Assignee

Comment 9

•

23 years ago

Attached patch With one more stack->heap move (obsolete) — Details — Splinter Review

I found one bug in the not threadsafe code. 625 wasn't moved to the heap before being put on the shared powers-of-five-list. That would probably cause a crash. khanson, what do you think now?

Attachment #65894 - Attachment is obsolete: true

Kenton Hanson (gone)

Comment 10

•

23 years ago

*** Bug 99120 has been marked as a duplicate of this bug. ***

Kenton Hanson (gone)

Comment 11

•

23 years ago

99120 not a duplicate of this bug. I hit the wrong button!

Kenton Hanson (gone)

Comment 12

•

23 years ago

Daniel, I agree with your analysis. The last patch works fine. I want to do a some more testing and will r= it soon. I was a little concerned with the prefix "fast". As the code ages hopefully the improvements will be viewed as normal. Thanks for the patch, =Kenton=

Daniel Bratell

Assignee

Comment 13

•

23 years ago

I'm more than willing to change that name. It started as a working name while hacking and then I never changed it. It's not guaranteed to be "fast" so fast is not a very good prefix anyway. I'll change it to something neutral before checking in (Balloc? LocalBalloc?).

Brendan Eich [:brendan]

Comment 14

•

23 years ago

Local sounds good to me -- my 2 cents. Or Auto if that's the storage class (cf nsAutoString in mozilla/string code). /be

Daniel Bratell

Assignee

Comment 15

•

23 years ago

Attached patch fast -> local naming change (obsolete) — Details — Splinter Review

New patch, but the only thing that has changed is that "fast" is replaced by "local". I beg for more reviews. :-)

Attachment #67224 - Attachment is obsolete: true

Kenton Hanson (gone)

Comment 16

•

23 years ago

Comment on attachment 67480 [details] [diff] [review] fast -> local naming change r=khanson Great patch! I am wondering if keeping all the powers of 5 less than say 5^50, would improve performance. These get rebuilt over and over during binary <-> decimal conversions. Daniel, you might consider reviewing bug #85267. Thanks Kenton

Attachment #67480 - Flags: review+

Phil Schwartau

Comment 17

•

23 years ago

Just need sr= to get this patch in -

Daniel Bratell

Assignee

Comment 18

•

23 years ago

Brendan, does this deserve your sr=

Daniel Bratell

Assignee

Updated

•

23 years ago

Blocks: 85267

Phil Schwartau

Comment 19

•

23 years ago

Looking for sr= on this patch

Kenton Hanson (gone)

Comment 20

•

23 years ago

Comment on attachment 67480 [details] [diff] [review] fast -> local naming change Patch needs a minor update to reflect changes made by bug#130711

Attachment #67480 - Flags: review+ → needs-work+

Daniel Bratell

Assignee

Comment 21

•

23 years ago

Attached patch Merged with latest on the trunk (obsolete) — Details — Splinter Review

Oh, I thought I had already uploaded this patch. I've had the patch with the bug 130711 included in this fix all the time. So, no changes, just merged with the trunk. New r and sr, khanson, brendan, whoever feels competent?

Attachment #67480 - Attachment is obsolete: true

Kenton Hanson (gone)

Comment 22

•

23 years ago

Attached file benchmark timing test for binary->decimal->binary — Details

Daniel, I'm not seeing much of a performance improvement with this patch. Granted, I may be looking at a biased benchmark. Could you please provide a test program that exhibits the performance increase referenced in comment #5. Thanks, Kenton

Daniel Bratell

Assignee

Comment 23

•

23 years ago

The testcase I think I used was the array sorting test on <http://www.formula-one.nu/mozilla/jstestcases/sortArray.htm>. I have no clean build to compare with so I can't do the test now. Also, I see that you're test is full of integers. Notice that this code is for decimal (floating point) number conversion to string. It might be so that your test don't exercise this code at all. Filling an array with fibonacci numbers which are very integery by nature will probably use int->string conversion.

m_mozilla

Comment 24

•

23 years ago

You might also find a test case in the long javascript URL at http://bugzilla.mozilla.org/showattachment.cgi?attach_id=72417 I'd be interested to know if this patch affects performance in that test. On that test, NN is an order of magnitude faster than Mozilla. For details, see http://bugzilla.mozilla.org/show_bug.cgi?id=70054#c5 -matt

Kenton Hanson (gone)

Comment 25

•

23 years ago

Daniel, The fibonacci series I used produces mostly numbers that can only be represented in floating-point (greatly exceeding the C int range.) My test program should be a good test for binary<->decimal conversions and should exercise your patch. I worked with the test program you referenced and did not see any appreciable improvement. I might be testing in the wrong environment or not testing the correct profile of numbers. I would really like to reproduce the 2.7 factor of improvement you achieved in jsdtoa.c (comment #5.) I?m open to suggestions. =Kenton=

Kenton Hanson (gone)

Comment 26

•

23 years ago

Changed the powers of 5 routine to be table driven for all powers less than or equal to 32. Then tried all powers of 5 less than or equal to 64. I found no appreciable performance improvement.

Daniel Bratell

Assignee

Comment 27

•

23 years ago

I'm a little curious why you don't see anything with this patch. I did both profiles and wall time measurement when I wrote it and both showed an improvement. Now I have no build to compare with so I can not repeat my tests. It may be that your tests have something else taking so much time that the conversion isn't measurable, or that the numbers you have in your test for some reason don't trigger the code that used to be very, very slow. I don't know more than that this was a win for the sort test and removed something that shouldn't have been there for such a low level routine (shared buffers requiring locking all the time). I wish I had the time to look into this more thouroghly but I don't have the time right now to set up a testing environment. :-( Could you look at it in a profiler or debugger? Just to see if js_dtoa is called and how often? I wouldn't expect a table for p5s to gain much, but it would probably make the code much easier to understand which is a good thing.

Daniel Bratell

Assignee

Comment 28

•

23 years ago

Ok, now I've looked at the profile of the testcase in a profiler. It seems as if it tests the right things quite well. When running everything in the browser, js_dtoa is still 42% of the CPU time. I do see that the test causes localBalloc to revert to heap based allocation in 20% of the allocation cases which is almost half of the time in js_dtoa. I will try to see why. It's probably just some buffer size that has to be tuned. I'll be back.

Daniel Bratell

Assignee

Comment 29

•

23 years ago

Attached patch New tuned patch (obsolete) — Details — Splinter Review

Yes, the MAX_STACKBUFFER_EXP had to be increased from 5 to 6. Here is a new patch that is even faster. I made a webbified test from the one Kenton attached (I will attach my test too) and that one became 10% faster with this change. js_dtoa went from 2.2s to 1.5s. Kenton or Brendan, can you check the test I will attach and then review this patch? I really think we're there now. Any more improvements will have to be algorithmic. (multadd is slow, quorem and diff too)

Attachment #75816 - Attachment is obsolete: true

Daniel Bratell

Assignee

Comment 30

•

23 years ago

Attached file The previous testcase webified. (obsolete) — Details

This test runs in 5.5-6.0 seconds on my Duron 800MHz with the newest patch applied.

Daniel Bratell

Assignee

Comment 31

•

23 years ago

I get much lower times when running the test from bugzilla.mozilla.org then when I ran it from my own local disk. Strange. The last patch times in at 4.25s. The next to last patch is at 4.55s. I have no clean build to test with but I would guess that a trunk mozilla would be at ~5-6s.

Brendan Eich [:brendan]

Comment 32

•

23 years ago

Best to test a .js file via xpcshell, not a .html page via mozilla, I think. /be

Daniel Bratell

Assignee

Comment 33

•

23 years ago

Oh, I didn't know about xpcshell. Looks useful. But, anyway, for this I was only interested in an easy testcase in a real environment (multi threaded, GC triggering, external events....) that could show that the patch make a difference, and for that the test suffices.

Kenton Hanson (gone)

Comment 34

•

23 years ago

I tested the latest patch #79060 in the Codewarrior environment on a Macintosh. The patch resulted in a 6% degradation of performance. I will try the VC++ environment on Windows.

Daniel Bratell

Assignee

Comment 35

•

23 years ago

How can that be even remotely possible???? The patch adds an argument to the function calls. It adds an initialization that might be ~100 machine code instructions. It uses ~1KB more of the stack. That's the downside. The other side is that it removes doussins of calls to PR_Lock and malloc and free, which are costly both in heap usage and in clock cycles. Comparing the two sides, this should save us immensely more then it costs. I've studied this in Windows profilers. I've checked every line of the patch. I can see nothing that show anything else than a great speedup. In profilers. In real life. In Mozilla. Could you, please, do a profile to see what _really_ goes on? What environment are you testing in? Are you sure it's quiet, no external processes that disturb? Is everything really built optimized?

Ivan Eisen

Comment 36

•

23 years ago

Attached file Higher lock fix for jsdtoa.c (obsolete) — Details

Locks are done at a higher level with this new code. All changes are marked by //perf.

Daniel Bratell

Assignee

Comment 37

•

23 years ago

Ah, inline. Do you know how much of a difference that made? Btw, jsdtoa.c is a C file. Didn't inline appear first in C99? In that case this might break a few compilers. It should be possible though to make it a #define so that supporting platforms can get it if it looks promising.

Ivan Eisen

Comment 38

•

23 years ago

The inline helped the OS/2 a great deal. I did not see very much improvement for NT. The OS/2 compiler is really bad. I will try to get my results out on bugzilla.

Ivan Eisen

Comment 39

•

23 years ago

I ran 2 sets of tests on 2 operating systems (OS/2 and NT 4.0). I ran the test Daniel included in bugzilla and the Parseint test that is part of http://www.formula-one.nu/mozilla/jsTimeTest.htm . I tested using the original nightly builds of 4/15/02. I rebuild the javascript dll using Daniel's stack fix and my higher lock fix (see attachment). The tests which were run on a dual boot 333Mhz Pentium 2 are displayed below: O/2 Tests 4/15 orig 4/15 w/stack fix 4/15 w/lock fix Parseint test 45.8 secs 37.5 secs 32.5 secs Daniels's test 17.6 secs 14.5 secs 15.5 secs NT 4.0 Tests 4/15 orig 4/15 w/stack fix 4/15 w/lock fix IE6.0 Parseint test 37.1 secs 35.3 secs 32.4 secs 18.0 secs Daniels's test 15.4 secs 14.8 secs 14.2 secs 4.2 secs Test machine was a 333 MHz - 128MB, Ethernet,Win NT 4.0 SP4, OS/2 CP4.52 I think the problem with the stack fix may be the extra parm that is passed around. Our traces show that on NT before the fixes were applied about 15% is spent using locks. In OS/2 the number is about 35%. Ivan

Ivan Eisen

Comment 40

•

23 years ago

I ran 2 sets of tests on 2 operating systems (OS/2 and NT 4.0). I ran the test Daniel included in bugzilla and the Parseint test that is part of http://www.formula-one.nu/mozilla/jsTimeTest.htm . I tested using the original nightly builds of 4/15/02. I rebuild the javascript dll using Daniel's stack fix and my higher lock fix (see attachment). The tests which were run on a dual boot 333Mhz Pentium 2 are displayed below: O/2 Tests 4/15 orig 4/15 w/stack fix 4/15 w/lock fix Parseint test 45.8 secs 37.5 secs 32.5 secs Daniels's test 17.6 secs 15.5 secs* 14.5 secs* NT 4.0 Tests 4/15 orig 4/15 w/stack fix 4/15 w/lock fix IE6.0 Parseint test 37.1 secs 35.3 secs 32.4 secs 18.0 secs Daniels's test 15.4 secs 14.8 secs 14.2 secs 4.2 secs Test machine was a 333 MHz - 128MB, Ethernet,Win NT 4.0 SP4, OS/2 CP4.52 I think the problem with the stack fix may be the extra parm that is passed around. Our traces show that on NT before the fixes were applied about 15% is spent using locks. In OS/2 the number is about 35%. * corrected previous data. Ivan

Steven Cole

Comment 41

•

23 years ago

You guys may want to look at bug 138666. I found another memory leak (very similar to the last one I found), and you may want to incorporate the small change there into your patches here. --scole

Daniel Bratell

Assignee

Comment 42

•

23 years ago

Attached image Graph over alternatives — Details

Now, I've done thourogh measurements of the alternatives on Windows 2000. I've done the tests in xpcshell (thank you Brendan!) and I've done tests with the current code in CVS, the stack option and the high level lock option. I've also tested the high level lock option and stack option with and without inlining of free and cmp. The test was Hanson's test but with 50 iterations. I then ran the test 25-30 times and recorded the times as reported by the test. The difference between the options are obvious when graphed. For some reason all test runs follow the same pattern making them extra easy to compare. Neglecting the inline the high level lock is 250ms faster than the original and the stack buffer is 150ms faster. Since js_dtoa is ~50% of the total time, that would account for a speedup of 12% and 8%. The inline of free and cmp then brings another 150ms and that is regardless of locking solution. The tree also contained code from the next version of NSPR which has faster locking on Windows so the improvements in a current tree (with an old NSPR) are probably bigger. All results are also confirming Ivan's numbers so now I think we know where we are. My proposal is therefore to use Ivan's high level locks without inlining and then open a new bug about inlining where we can discuss which functions should be inlined, along with compatibility and footprint issues.

Daniel Bratell

Assignee

Comment 43

•

23 years ago

Attached patch Updated high level locking. No inline. (obsolete) — Details — Splinter Review

Here is Ivan's patch without inlines and with matching ACQUIRE_LOCK and FREE_LOCK. The locks are placed in the JS_FRIEND_API functions. I've opened bug 138741 about inlining functions. khanson, review? The patch is very short.

Ivan Eisen

Comment 44

•

23 years ago

Sam discovered that prdtoa.c has the same locking problems as jsdtoa.c . Should another bug be opened for this fix to be applied to prdtoa.c ? Note, the fix would look pretty much the same as for jsdtoa.c .

Daniel Bratell

Assignee

Comment 45

•

23 years ago

Better be another bug. prdtoa, I guess that's NSPR, and NSPR have a different release schedule, different module owners and checkin policy.

Kenton Hanson (gone)

Comment 46

•

23 years ago

Comment on attachment 80175 [details] [diff] [review] Updated high level locking. No inline. r=khanson. This looks good to me. Thank you Ivan and Daniel for the fix and anaylsis.

Attachment #80175 - Flags: review+

Brendan Eich [:brendan]

Comment 47

•

23 years ago

Comment on attachment 80175 [details] [diff] [review] Updated high level locking. No inline. >Index: jsdtoa.c >=================================================================== >RCS file: /cvsroot/mozilla/js/src/jsdtoa.c,v >retrieving revision 3.16 >diff -u -r3.16 jsdtoa.c >--- jsdtoa.c 19 Mar 2002 04:28:20 -0000 3.16 >+++ jsdtoa.c 20 Apr 2002 08:17:04 -0000 >@@ -331,7 +331,8 @@ > > static Bigint *freelist[Kmax+1]; > >-/* Allocate a Bigint with 2^k words. */ >+/* Allocate a Bigint with 2^k words. >+ * This is not threadsafe. The caller must use thread locks */ Nit: prevailing JS comment style for multiline comments is /* * blah blah * blah. */ But maybe this matches jsdtoa.c's "style" (not coherent enough for me to avoid sarcastic quotes :-). > static Bigint *Balloc(int32 k) > { > int32 x; >@@ -340,10 +341,8 @@ > uint32 len; > #endif > >- ACQUIRE_DTOA_LOCK(0); > if ((rv = freelist[k]) != NULL) > freelist[k] = rv->next; >- FREE_DTOA_LOCK(0); Nit: the rest of the engine uses FREE for memory deallocation and RELEASE as the antonym to ACQUIRE for locking. Another nit: the (n) parameter to these ACQUIRE/FREE_DTOA_LOCK macros seems useless. We own this fork of the old DMG code, why not clean it up? >@@ -1093,6 +1090,8 @@ > #ifdef JS_THREADSAFE > if (!initialized) InitDtoa(); > #endif >+ /* Locking for Balloc's shared buffers */ >+ ACQUIRE_DTOA_LOCK(0); > > *err = 0; > >@@ -1679,6 +1678,7 @@ > ret: > if (se) > *se = (char *)s; >+ FREE_DTOA_LOCK(0); > return sign ? -rv : rv; > } > Good to bracket the whole routine and consolidate returns (as was already done), but why lock longer than needed? Couldn't you ACQUIRE just after the break2: label, and RELEASE before setting *se conditionally -- that is, immediately after the ret: label? >@@ -2496,14 +2496,19 @@ > /* the sign and/or decimal point */ > char *numEnd; /* Pointer past the digits returned by JS_dtoa */ > >+ /* Locking for Balloc's shared buffers */ >+ ACQUIRE_DTOA_LOCK(0); >+ > JS_ASSERT(bufferSize >= (size_t)(mode <= DTOSTR_STANDARD_EXPONENTIAL ? DTOSTR_STANDARD_BUFFER_SIZE : > DTOSTR_VARIABLE_BUFFER_SIZE(precision))); > > if (mode == DTOSTR_FIXED && (d >= 1e21 || d <= -1e21)) > mode = DTOSTR_STANDARD; /* Change mode here rather than below because the buffer may not be large enough to hold a large integer. */ > >- if (!JS_dtoa(d, dtoaModes[mode], mode >= DTOSTR_FIXED, precision, &decPt, &sign, &numEnd, numBegin, bufferSize-2)) >+ if (!JS_dtoa(d, dtoaModes[mode], mode >= DTOSTR_FIXED, precision, &decPt, &sign, &numEnd, numBegin, bufferSize-2)) { >+ FREE_DTOA_LOCK(0); > return 0; >+ } > > nDigits = numEnd - numBegin; > Same comment here: why are we holding the lock across all the string fiddling that doesn't need it? >@@ -2598,6 +2603,8 @@ > (word1(d) || (word0(d) & Frac_mask)))) { > *--numBegin = '-'; > } >+ >+ FREE_DTOA_LOCK(0); > return numBegin; > } > IOW, lock around the JS_dtoa call only. BTW, that function is static, and callers must be in jsdtoa.c because they must lock around calls to it, so it should be named js_dtoa (JS_ is for public APIs). >@@ -2657,6 +2664,9 @@ > double di; /* d truncated to an integer */ > double df; /* The fractional part of d */ > >+ /* Locking for Balloc's shared buffers */ >+ ACQUIRE_DTOA_LOCK(0); >+ > JS_ASSERT(base >= 2 && base <= 36); > > buffer = (char*) malloc(DTOBASESTR_BUFFER_SIZE); Same deal here: do not lock around a malloc call, it increases contention and risks deadlock. >@@ -2674,6 +2684,7 @@ > /* Check for Infinity and NaN */ > if ((word0(d) & Exp_mask) == Exp_mask) { > strcpy(p, !word1(d) && !(word0(d) & Frac_mask) ? "Infinity" : "NaN"); >+ FREE_DTOA_LOCK(0); > return buffer; > } > >@@ -2813,5 +2824,7 @@ > JS_ASSERT(p < buffer + DTOBASESTR_BUFFER_SIZE); > *p = '\0'; > } >+ >+ FREE_DTOA_LOCK(0); > return buffer; > } Thanks for the great work here, I hope you can fix the over-long critical section problems above, and pick the nits. I'll gladly sr= a patch that addresses those issues. /be

Daniel Bratell

Assignee

Comment 48

•

23 years ago

Attached patch Patch addressing Brendan's comments (obsolete) — Details — Splinter Review

This should address all your points I think. The locks are held for shorter times and the function is renamed to js_dtoa (lowercase js).

Attachment #79060 - Attachment is obsolete: true

Attachment #79867 - Attachment is obsolete: true

Attachment #80175 - Attachment is obsolete: true

Steven Cole

Comment 49

•

23 years ago

Urk. Would it be too much to ask that ACQUIRE_DTOA_LOCK be changed to ACQUIRE_DTOA_LOCK() so that it still looks like a function? (And the RELEASE too, naturally...) Without the parens it looks too much like a constant macro, not like a code macro... --scole

Brendan Eich [:brendan]

Comment 50

•

23 years ago

Comment on attachment 83926 [details] [diff] [review] Patch addressing Brendan's comments Yeah, what scole said. Make macros that expand to non-constant expressions look like function calls by taking 0 params if necessary. /be

Daniel Bratell

Assignee

Comment 51

•

23 years ago

Attached patch The New patch which will bring happiness to all the world. — Details — Splinter Review

With macros looking like macros (my fault; I misunderstood) and some comments updated.

Attachment #83926 - Attachment is obsolete: true

Kenton Hanson (gone)

Comment 52

•

23 years ago

Comment on attachment 84018 [details] [diff] [review] The New patch which will bring happiness to all the world. r=khanson. Looks good. Did some testing and saw a 1.3% increase in speed (similar to the results in attachment #80173 [details]).

Attachment #84018 - Flags: review+

Brendan Eich [:brendan]

Comment 53

•

23 years ago

Comment on attachment 84018 [details] [diff] [review] The New patch which will bring happiness to all the world. sr=brendan@mozilla.org, thanks! Let's get this into the 1.1alpha trunk. /be

Attachment #84018 - Flags: superreview+

Daniel Bratell

Assignee

Comment 54

•

23 years ago

Fix is checked in. Anyone interested in the inline issue should head over to bug 138741.

Status: NEW → RESOLVED

Closed: 23 years ago

Resolution: --- → FIXED

Target Milestone: --- → mozilla1.1alpha

R.K.Aa.

Comment 55

•

23 years ago

Daniel - see bug 145493

Daniel Bratell

Assignee

Comment 56

•

23 years ago

I just made a small adjustment to the checked in patch. Bad indentation (no indentation at all) made me miss that one of the ACQUIRE_JSDTOA_LOCK was inside a for loop that sometimes wasn't entered. It asserted badly but I hadn't run a debug build since doing the changes wanted by reviewers so I didn't see it until now. Bad me. I got mozbot to do a quick review. diff -u -r3.18 jsdtoa.c --- jsdtoa.c 18 May 2002 06:21:49 -0000 3.18 +++ jsdtoa.c 18 May 2002 17:29:00 -0000 @@ -1097,6 +1097,10 @@ bb = bd = bs = delta = NULL; sign = nz0 = nz = 0; rv = 0.; + + /* Locking for Balloc's shared buffers that will be used in this block */ + ACQUIRE_DTOA_LOCK(); + for(s = s00;;s++) switch(*s) { case '-': sign = 1; @@ -1119,8 +1123,6 @@ goto break2; } break2: - /* Locking for Balloc's shared buffers that will be used in this block */ - ACQUIRE_DTOA_LOCK(); if (*s == '0') { nz0 = 1;

Jacob Kjome

Comment 57

•

23 years ago

The html testcase is messed up for this bug. It has a call to print(c) which pops up the print dialog in a nearly endless loop. Someone is a bit too used to writing C, me thinks. It should be document.write(c)...however, when I run the testcase in Mozilla, the last document.write statement never gets printed and the browser just keeps running the testcase. IE finishes up just fine. Attaching testcase in a minute.... Jake

Jacob Kjome

Comment 58

•

23 years ago

Attached file fixed webified testcase — Details

This is an html4.01 Strict valid testcase + in uses document.write(c) instead of print(c) so that the print dialogue won't be popped up in a nearly endless loop.

Attachment #79063 - Attachment is obsolete: true

Phil Schwartau

Comment 59

•

23 years ago

Marking Verified Fixed. Here are some of my results (all times in ms). WinNT 4.0 (SP6); 500MHz CPU; 128M RAM Kenton's test from Comment #22 run directly in the JS shell: RC4 (Jan 01) RC4a (Mar 21) Current tip (May 20) OPTIMIZED JS SHELL 1360 1235 1260 DEBUG JS SHELL 2688 2625 2641 The parseInt() test from http://www.formula-one.nu/mozilla/jsTimeTest.htm Using Mozilla trunk binaries: 2002-05-14 2002-05-20 27296 23797 26906 23782 27109 23828 The JS shells I used are separated by months, and span fixes for other bugs as well as this one. That may explain the slight decrease in performance on Kenton's test between RC4a and the current tip. The Mozilla builds come from just a couple days before and after this fix, and show a clear improvement -

Status: RESOLVED → VERIFIED

Bob Clary [:bc] (inactive)

Updated

•

20 years ago

Flags: testcase?

Bob Clary [:bc] (inactive)

Updated

•

19 years ago

Flags: testcase? → testcase-

You need to log in before you can comment on or make changes to this bug.