Closed Bug 169559 Opened 18 years ago Closed 16 years ago

Accessing global and local variables performance diff

Categories

(Core :: JavaScript Engine, defect, P1)

defect

Tracking

()

RESOLVED FIXED
mozilla1.8alpha1

People

(Reporter: zbraniecki, Assigned: brendan)

References

Details

(Keywords: js1.5, perf, testcase)

Attachments

(9 files, 11 obsolete files)

5.24 KB, text/javascript
Details
655 bytes, text/html
Details
20.83 KB, patch
shaver
: review+
Details | Diff | Splinter Review
18.55 KB, text/plain
Details
174.03 KB, patch
Details | Diff | Splinter Review
169.94 KB, patch
shaver
: review+
Details | Diff | Splinter Review
1.58 KB, patch
bryner
: review+
Details | Diff | Splinter Review
5.24 KB, text/plain
Details
7.43 KB, text/plain
Details
I created this bug in result of discussion in bug 169442 .

Mozilla (like IE) slower access global variables than local. But the difference
is very big in Mozilla. I'm just wondering if we can do anything with this?
(like checking Security only at the beggining, and after value changing?)

http://alladyn.art.pl/gandalf/MozillaGlobal/
Keywords: perf
Keywords: testcase
Is there any chance to find reason for this?

IE6:
Global: 812
Local: 344

Mozilla:
Global: 1578
Local: 297
Mass-reassigning bugs to dom_bugs@netscape.com
Assignee: jst → dom_bugs
Mozilla 20030402

Global: 1719
Local: 344

IE6:

Global: 875
Local: 360

So nothing changed... Anyone could look it?
*** Bug 200848 has been marked as a duplicate of this bug. ***
Reassigning to Security:General to see if we can mitigate some of
the security checks or not; cc'ing jst. Note the testcase given by
Tim Powell, and his results, in bug 200848 comment 1.
Assignee: dom_bugs → mstoltz
Component: DOM Level 0 → Security: General
QA Contact: desale → carosendahl
While I'm continuing to work on optimizing the security check on global variable
access, I'm wondering how important this is to real-world applications. Are you
accessing global vars in a tight loop, so that the speed of access will actually
have some impact on the performance of your application?
Global vars are quite common, and often you *have* to use them, although the more
common cases are just forgotten |var|s. Remember that most function calls, every
use of the |Math| object, |undefined|, |parseInt|, |document| etc. constitute
accesses to global objects.

For an extreme real-world case see bug 203699.

IMHO, the way to go is to check at compile time whether an identifier ever needs
a security check. This can be done for every code not within a |with| or |catch|.
Blocks: 203699
Global vars are indeed really common on DHTML based sites /applications.
Blocks: 21762, 203448
No longer blocks: 203699
Blocks: 203699
Hmm.. Yes. They are still very common, even if they shouldn't be.
Most of DHTML scripts should work on objects and if so, global variables
shouldn't be used. But there are at least two situations where global variables
are logical and natural.

1) variables which determinating platform on which script works. For example
varibales ie,ns4,op,up5 and so on... 

2) variables for things that are gobal in window scope (for example handlers of
drag&droped objects), or variables passed between two frames, windows and so on.

Second one is not a problem, but many scripts including widly used DynApi/DynDuo
use this determination in every step of animation. So it's about in every 10,20
ms!!! Example:

function Step () {
 if(ie)document.all['obj'].style.top=(T+=10),break;
 if(ns4)document.layers.['obj'].top=(T+=10),break;
 if(up5)document.getElementByID('obj').style.top=(T+=10),break;
}
Anyone working on this bug should try out zack-weg@gmx.de's
testcases in bug 203699 comment 19 and bug 203699 comment 20. 

The former takes a LONG time to load in Mozilla. The latter does not.
The only difference is that in the latter, the top-level script is
wrapped in an anonymous function to make the global variables local.
It's not just the DOM security checks that slow global variables down.
They are much slower than local variables per se.

This simple code:

  var x= 1000000; while ( --x ) ;

takes still more than double time in global than in local context if I run it
in the js shell.

AIUI, most local variables are connected directly to a specific "slot" in the
function, i.e. they don't have to be looked up by name at runtime, but their
location is well known. In contrast, global variables are just properties of
the global object, which is typically a host object, and the host might want to
do something special if a property of its objects is accessed. E.g., if you say
|var status= "some text"| at top level, a browser window displays "some text"
in the status bar, but only if the user has allowed to do so.

But most global variables are not meaningful to the host environment, yet it
has to handle them. It is a design flaw of JavaScript that it is not possible
to differentiate between ordinary global variables and properties of the global
object.

However, if the host would supply the engine with a list of properties of the
global object it knows about resp. it is interested in, it should be possible
to connect the remaining global identifiers directly to slots in the global
object, just the same way it is done with local variables -- at least if they
are undeletable (this covers all /declared/ global variables, functions and
most (all?) native properties of the global object like |Math|, |parseInt| and
|Infinity|, unless the host claims a name for its own use). An implementation
of this suggestion would automatically bypass most unneeded DOM security checks
as a side effect.

I have no deeper understanding of the js engine code, but I think all problems
araising from this suggestion should be solvable. There might be some additional
difficulties when enumerating properties, however, and (unlike local variables)
global variables must still be accessible as properties of the global object,
in which case security checks are still needed (e.g. |frames[0].globalVar| might
be forbidden to access).
Confirming the timings zack-weg@gmx.de reports from the optimized JS shell:

----------------------- GLOBAL ACCESS ------------------------
js> var start = Date.now(); var x= 1000000;
    while (--x){}; print(Date.now() - start);
750

----------------------- LOCAL ACCESS -------------------------
js> (function() {var start = Date.now(); var x= 1000000;
                 while (--x){}; print(Date.now() - start);})()
328
Severity: normal → major
Hardware: PC → All
Severity: major → normal
Status: NEW → ASSIGNED
Target Milestone: --- → Future
Has anybody of the js engine team looked at my suggestion in comment 11 before
letting this bug laying around futured at the security component, while it could
better be fixed in the engine?
In both testcases you can adjust the loop count. In the js shell set
|this.count|
to do so, in the HTML testcase set |count| in the query string. Default is
10000.

My results for the global/local ratio are ~ 5.2 in the js shell and ~ 13.2 in
the
browser.
CCing the other JS engine guy.
Re: comment 13 -- yes, the engine could do something like what comment 11
sketches if it had enough reliable hints from the DOM.  Note that per ECMA-262
Edition 3, a var statement does not replace a pre-existing property, so we can't
do anything unless 'var foo' is creating foo (in which case, it will be
permanent, so we can, in the current JS engine, count on its slot not changing).

I was talking to jst the other week about taking more aggressive steps to solve
the performance disparity.  Thanks, zack-weg@gmx.de, for pushing on this topic.
 Cc'ing jst, taking bug.

/be
Assignee: mstoltz → brendan
Status: ASSIGNED → NEW
Keywords: js1.5
Priority: -- → P1
Target Milestone: Future → mozilla1.5alpha
The hint we need from the DOM is actually already implemented, via the
JSClass.resolve hook.  At the point (case JSOP_DEFVAR: in jsinterp.c) where
OBJ_DEFINE_PROPERTY is being called after js_CheckRedeclaration has ascertained
that no pre-existing property is named by this var statement, we know that the
property's slot won't change, provided (attrs & JSPROP_PERMANENT).

The trick will be specializing the code at compile time, if possible. 
Respecializing at runtime would be a pain, introducing bloat, requiring
specialized code caching, etc.

/be  
Status: NEW → ASSIGNED
Component: Security: General → JavaScript Engine
QA Contact: carosendahl → pschwartau
Target Milestone: mozilla1.5alpha → mozilla1.6alpha
May get to this in 1.7.

/be
Target Milestone: mozilla1.6alpha → mozilla1.7alpha
Blocks: js-perf
This patch includes other fixes and #ifdef'd instrumentation.

/be
Attached patch lightly tested mega-patch (obsolete) — Splinter Review
Includes fixes for bug 165201 and bug 206599.

/be
Blocks: 165201, 206599
Still need some work; JSSLOT_FREE doesn't include computed reserved slots.

/be
Attachment #144014 - Attachment is obsolete: true
I doubt this has anything to do with DHTML performance.  If bug 21762 is worth
keeping open, it should be about DHTML, which uses (or should be fixed to use)
function code, not global code.

New patch and some benchmark results soon.

/be
No longer blocks: 21762
I can minimize this by splitting patches out for other bugs.  Some files are
just being tidied (e.g. WIN16 cruft removal).

/be
Attachment #144389 - Attachment is obsolete: true
gcc3.3.2 Fedora Core 1 -O2 js shell build, without (<) and with (>) minimal patch:

3,5c3,5
< Global: 764
< Local: 132
< Ratio: 5.788
---
> Global: 223
> Local: 124
> Ratio: 1.798
8,10c8,10
< Global: 700
< Local: 130
< Ratio: 5.385
---
> Global: 160
> Local: 127
> Ratio: 1.260
13,15c13,15
< Global: 700
< Local: 131
< Ratio: 5.344
---
> Global: 164
> Local: 124
> Ratio: 1.323

These are from three runs in a row after warming up with three runs where the
output was not saved.  It looks like the patch speeds up global variables 4x, so
that they are about 25% slower than local variables, for the benchmark at
attachment 122533 [details].

/be
Attached patch minimal patch (obsolete) — Splinter Review
I'll attach the cleanup-only and strict-alias-fix-only parts separately.

/be
Attachment #144635 - Attachment is obsolete: true
This should go into 1.8a ASAP.

/be
Attached patch strict-alias fixes only (obsolete) — Splinter Review
Other strict alias fixes are in the main patch (attachment 144682 [details] [diff] [review]).

/be
Attached patch fixed minimal patch (obsolete) — Splinter Review
- fixed LookupArgOrVar to use -1 for slot instead of default-0 sprop->shortid
  if !(sprop->flags & SPROP_HAS_SHORTID), to avoid matching eval-created var
  references from eval code against argument slot range, selecting JSOP_GETARG
  etc. -- this was broken in the non-minimal patch, but not exposed until I
  added the
    (getter == js_CallClass.getProperty &&
     fp->fun && (uintN) slot < fp->fun->nargs)
  disjunction right operand to the (getter == js_GetArgument) condition
  governing JOF_QARG opcode selection in LookupArgOrVar.
- fixed two bogus JOF_CONST tests in jsemit.c (regressed after non-minimal
  patch, although second instance is not apparently caught by any test);
- renamed SPROP_HAS_DEFAULT_[GS]ETTER to SPROP_HAS_STUB....

This attachment passes the testsuite.

/be
Attachment #144682 - Attachment is obsolete: true
Attachment #144683 - Flags: review?(shaver)
Comment on attachment 144687 [details] [diff] [review]
strict-alias fixes only

The jsdbgapi.c change is more than just a strict aliasing fix: it fixes an old
bug where early error returns toward the bottom of JS_SetWatchPoint would leave
obj locked.

/be
Attachment #144687 - Flags: review?(bryner)
Attachment #144728 - Flags: review?(shaver)
Latest patch, warmed-up best of three before/after diff:

13,15c13,15
< Global: 700
< Local: 131
< Ratio: 5.344
---
> Global: 150
> Local: 127
> Ratio: 1.181

/be
Target Milestone: mozilla1.7alpha → mozilla1.8alpha
shaver: look for the XYZZY comment in jsemit.c and pretend I make that line look
like this:

              case JSOP_SETCONST: /* NB: no change */ break;

and similarly for the other "no change" line nearby (no need to set op to its
present value).

/be
The previous patch added one word to JSFunction's size.  This patch makes use
of an old patch I had lying around to combine fun->native and fun->script via a
union discriminated by the new fun->interpreted packed bool that fits in a
spare byte.

This patch also contains the XYZZY removal fix mentioned previously, and a few
comment tweaks.

/be
Attachment #144726 - Attachment is obsolete: true
Attachment #144728 - Attachment is obsolete: true
Attachment #144728 - Flags: review?(shaver)
Comment on attachment 144814 [details] [diff] [review]
diff -w version of last attachment (for reviewing)

Shaver, this is money -- still passes the testsuite, works in mozilla, and I
hope for some Ts, Txul, and maybe even Tp wins.

/be
Attachment #144814 - Flags: review?(shaver)
Argh, one more jsemit.c tweak, diff of diff format, new patch on the right:

784c784
< +              case JSOP_DELNAME:  op = JSOP_FALSE; break;
---
> +              case JSOP_DELNAME:  /* NB: no change */ break;

IOW, for fast global variable access, we have to let DELNAME look up the slow
property and test whether it is permanent.  If it is, no-op (and the fast path
may apply if other conditions are true -- see the new JSOP_DEFVAR/CONST case in
jsinterp.c).

If not, the fast path can't be used, fp->vars[atomIndex] will be JSVAL_NULL, and
DELNAME will delete the impermant global property that existed before the
variable declaration with the same name was processed.

/be
> If [DELNAME finds an impermanent property], the fast path can't be used,
> fp->vars[atomIndex] will be JSVAL_NULL, and DELNAME will delete the
> impermanent global property that existed before the variable declaration
> with the same name was processed.

(Typo fix above.)

And any JSOP_*GVAR ops that execute after the DELNAME will find JSVAL_NULL in
fp->vars[atomIndex] and not a int-jsval-tagged slot number, and take the slow
path, for example failing to find the now-deleted property in the case of
JSOP_GETGVAR/JSOP_NAME.

/be
Comment on attachment 144814 [details] [diff] [review]
diff -w version of last attachment (for reviewing)

New patch incorporating patch for bug 238881 and another fix found while
testing.

I'll give a diff-of-diff run-down soon.

/be
Attachment #144814 - Attachment is obsolete: true
Attachment #144814 - Flags: review?(shaver)
Attachment #144813 - Attachment is obsolete: true
All should be clear.  Shaver, ask me anything.

/be
Attachment #144894 - Attachment is obsolete: true
Attachment #145002 - Flags: review?(shaver)
Attachment #144687 - Flags: review?(bryner) → review+
Comment on attachment 144687 [details] [diff] [review]
strict-alias fixes only

Oops, the jsxdrapi.c change is all wrong.  What was I thinking?

/be
Attachment #144687 - Attachment is obsolete: true
Attachment #144687 - Flags: review+
Attachment #145030 - Flags: review?(bryner)
Comment on attachment 145002 [details] [diff] [review]
diff -w version of last attachment (for reviewing)

Looks good, with the assertion change from IRC.  Let's see some T* numbers!
Attachment #145002 - Flags: review?(shaver) → review+
Attachment #145030 - Flags: review?(bryner) → review+
Fix is in.

/be
Status: ASSIGNED → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Comment on attachment 144683 [details] [diff] [review]
WIN16 and other ifdef and lesser cleanups only

r=shaver, RIP WIN16
Attachment #144683 - Flags: review?(shaver) → review+
Did this really help? I ran attachment 122533 [details] 10 times each in js.exe built from
Mozilla 1.4.3, Mozilla 1.7.6 and Mozilla 1.8b from today on a dual Xeon 3.06Ghz
w/1G Ram running winxpsp2 and it looks like things have gotten worse over time
rather than better.

Mozilla 1.4.3
	Global	Local	Ratio
	282.00	46.00	6.13
	281.00	47.00	5.98
	312.00	47.00	6.64
	297.00	62.00	4.79
	282.00	46.00	6.13
	282.00	47.00	6.00
	281.00	47.00	5.98
	281.00	47.00	5.98
	266.00	47.00	5.66
	281.00	47.00	5.98
Average	284.50	48.30	5.93

Mozilla 1.7.6			
	297.00	46.00	6.46
	281.00	63.00	4.46
	297.00	47.00	6.32
	297.00	47.00	6.32
	266.00	47.00	5.66
	281.00	63.00	4.46
	281.00	47.00	5.98
	282.00	62.00	4.55
	297.00	47.00	6.32
	281.00	62.00	4.53
Average	286.00	53.10	5.51

Mozilla 1.8b			
	312.00	63.00	4.95
	312.00	78.00	4.00
	313.00	62.00	5.05
	375.00	62.00	6.05
	328.00	62.00	5.29
	328.00	63.00	5.21
	344.00	78.00	4.41
	312.00	63.00	4.95
	313.00	62.00	5.05
	313.00	78.00	4.01
Average	325.00	67.10	4.90
1.4.3 and 1.7.6 look about the same.  This bug's patch affects the 1.8 trunk
only, and per comment 32 it did give a big win.  Bob, could you try pulling
js/src, js/src/config, and js/src/editline by date from just after I checked in,
and retesting?  I'll retest too.  I hope there isn't some OS dependency here --
I've been testing only with gcc on Linux.

/be
Hmm, best 1 of 5 tries with an optimized js shell on my 2GHz thinkpad running
FC3 gcc gives:

$ Linux_All_DBG.OBJ/js t.js
Testing...
Global: 196
Local: 161
Ratio: 1.217

That seems a bit off from the initial gain reported in comment 32.  It'll take
some pull-by-date rebuild tedium to see whether there was a regression.

The results a bit noisier than I recall, too.  The other four tries' results were:

$ Linux_All_DBG.OBJ/js t.js
Testing...
Global: 247
Local: 160
Ratio: 1.544
$ Linux_All_DBG.OBJ/js t.js
Testing...
Global: 245
Local: 159
Ratio: 1.541
$ Linux_All_DBG.OBJ/js t.js
Testing...
Global: 196
Local: 160
Ratio: 1.2
$ Linux_All_DBG.OBJ/js t.js
Testing...
Global: 197
Local: 161
Ratio: 1.224

/be
I pulled js/src, js/src/config, js/src/editline from 2004-04-13 and built using 
make -f Makefile.ref BUILD_OPT=1
then ran 10 tests.

Xeon 3.06Ghz, winxpsp2, VC6 = Average	281.50	62.10	4.59
Gahh.  What about a build pulled from 4/11?

/be
To make sure I am not doing anything stupid, I will attach the test case I am
using and list exactly what I did to get these results. If anyone else has
WinXPSP2, I would appreciate confirmation. I did get even worse results for a
current trunk build on a thinkpad t42 1.7Ghz machine running WinXPSP2

Visual C++ 6, SP5, Processor Pack and Platform SDK

.mozconfig
mk_add_options MOZ_OBJDIR=@TOPSRCDIR@/obj-@CONFIG_GUESS@-release
mk_add_options MOZ_CO_PROJECT=suite
ac_add_options --enable-application=suite

ac_add_options --enable-optimize
ac_add_options --disable-activex
ac_add_options --disable-activex-scripting

ac_add_options --disable-tests
ac_add_options --disable-debug

ac_add_options --enable-crypto

cd mozilla/js
cvs update -D 2004-04-11 src src/config src/editline
cd src
rm WINNT5.1_OPT.OBJ/*
make -f Makefile.ref BUILD_OPT=1
cd WINNT5.1_OPT.OBJ
js
js>load('bug-169559.js')

Ten runs 3.06Ghz Xeon 
(Average is average of the numbers in the column):

	281.00	47.00	5.98
	282.00	31.00	9.10
	266.00	47.00	5.66
	266.00	47.00	5.66
	282.00	46.00	6.13
	266.00	31.00	8.58
	281.00	47.00	5.98
	266.00	46.00	5.78
	266.00	47.00	5.66
	266.00	46.00	5.78
Average 272.20	43.50	6.43
zack-weg, with your permission this will be included in the javascript test
library. Could you send me the name and email address you would like included
under "Contributor(s)" ?
I didn't get a reply on permission, so I checked in a different testcase which I
believe illustrates the difference between global and local variables as
js/tests/Regress/regress-169559.js
QA Contact: pschwartau → moz
The test for this currently checks that global is no more than twice as fast as
local access. This is failing due to the ratio randomly coming in up to 2.3 for
some runs. I intend to change the ratio to 2.5 to eliminate spurious random
failures.
Flags: testcase+
Windows is getting slower and Linux is getting faster in 1.9. vs 1.8.

Ratio  OS      Build   
 2.51  windows opt 1.8.0.1_2006011106
 2.51  windows opt 1.8.0.1_2006011223
 2.51  windows opt 1.8_2006011300
 2.90  windows opt 1.9a1_2006011306
 3.00  linux   opt 1.9a1_2006011107
 3.00  linux   opt 1.9a1_2006011300
 3.02  linux   opt 1.9a1_2006011300
 3.03  windows opt 1.8_2006011305
 3.03  windows opt 1.8_2006011305
 3.30  linux   opt 1.9a1_2006011107
 3.54  windows opt 1.8.0.1_2006011303
 3.78  macosx  opt 1.8.0.1_2006011223
 3.78  macosx  opt 1.8.0.1_2006011223
 3.81  macosx  opt 1.8.0.1_2006011106
 4.24  macosx  opt 1.8_2006011107
 4.26  linux   opt 1.8.0.1_2006011106
 4.26  linux   opt 1.8.0.1_2006011106
 4.27  macosx  opt 1.8_2006011300
 4.30  linux   opt 1.8_2006011106
 4.33  macosx  opt 1.8_2006011300
 4.38  linux   opt 1.8.0.1_2006011223
 4.40  linux   opt 1.8.0.1_2006011223
 4.42  linux   opt 1.8_2006011300
 4.52  linux   opt 1.8_2006011223
 4.58  linux   opt 1.8_2006011107
 4.87  windows opt 1.8_2006011300
 4.87  windows opt 1.9a1_2006011108
 4.87  windows opt 1.9a1_2006011301
 4.87  windows opt 1.9a1_2006011301
 5.03  windows dbg 1.8.0.1_2006011106
 5.03  windows dbg 1.8.0.1_2006011223
 5.03  windows dbg 1.8_2006011300
 5.03  windows dbg 1.8_2006011300
 5.03  windows dbg 1.9a1_2006011109
 5.03  windows dbg 1.9a1_2006011301
 5.06  windows dbg 1.8.0.1_2006011107
 5.06  windows dbg 1.8.0.1_2006011223
 5.06  windows dbg 1.8_2006011107
 5.06  windows dbg 1.9a1_2006011108
 5.20  windows opt 1.8_2006011107
 5.20  windows opt 1.9a1_2006011108
 5.26  windows opt 1.8_2006011107
 5.34  windows dbg 1.8_2006011108
 5.52  macosx  dbg 1.8.0.1_2006011300
 5.56  linux   dbg 1.9a1_2006011108
 5.62  linux   dbg 1.9a1_2006011301
 5.64  linux   dbg 1.8.0.1_2006011106
 5.64  linux   dbg 1.9a1_2006011301
 5.70  macosx  opt 1.9a1_2006011108
 5.72  linux   dbg 1.8.0.1_2006011106
 5.75  macosx  opt 1.9a1_2006011301
 5.77  linux   dbg 1.9a1_2006011108
 5.78  linux   dbg 1.8_2006011300
 5.87  windows opt 1.8.0.1_2006011106
 5.94  linux   dbg 1.8_2006011300
 5.96  linux   dbg 1.8_2006011107
 6.00  linux   dbg 1.8.0.1_2006011223
 6.02  macosx  dbg 1.8_2006011108
 6.04  macosx  opt 1.9a1_2006011301
 6.05  linux   dbg 1.8.0.1_2006011223
 6.11  linux   dbg 1.8_2006011107
 6.26  windows opt 1.8.0.1_2006011223
 6.32  macosx  dbg 1.8_2006011301
 6.33  macosx  dbg 1.8.0.1_2006011300
 6.33  macosx  dbg 1.8_2006011301
 6.42  macosx  dbg 1.8.0.1_2006011107
 8.85  macosx  dbg 1.9a1_2006011302
 9.14  macosx  dbg 1.9a1_2006011302
 9.31  macosx  dbg 1.9a1_2006011109
10.75  windows dbg 1.9a1_2006011301
How about in the js shell?  Should be no significant difference there.

jst, bz: are we doing anything extra on Windows in window resolve or class-wide getProperty/setProperty hooks?

Bob: you may have to profile to find out more.

/be
QA Contact: bclary → general
See bug 376291 for the 1.9 issue (I think).

/be
You need to log in before you can comment on or make changes to this bug.