Closed
Bug 578761
Opened 14 years ago
Closed 14 years ago
JM: PIC for |o[e]| (JSOP_GETELEM)
Categories
(Core :: JavaScript Engine, defect)
Core
JavaScript Engine
Tracking
()
RESOLVED
FIXED
People
(Reporter: dmandelin, Assigned: cdleary)
References
Details
(Whiteboard: fixed-in-jaegermonkey)
Attachments
(5 files, 10 obsolete files)
18.56 KB,
patch
|
Details | Diff | Splinter Review | |
23.03 KB,
text/plain
|
Details | |
4.68 KB,
text/plain
|
Details | |
815 bytes,
application/javascript
|
Details | |
46.64 KB,
patch
|
dvander
:
review+
|
Details | Diff | Splinter Review |
We need a pic for getelem expressions of the form o[e]. It is the main cause of slowdown on fasta. Previous measurements suggest that we don't have too worry too much about megamorphism: a simple solution, like going to a slow call after 16 stubs, will probably do. I suspect that most getelems we see will either use integer indexes, thus skipping this entirely, or will go over a relatively small number of property names.
JM wanted SS win: 20 ms
Assignee | ||
Comment 1•14 years ago
|
||
Ooh, I'll take this one.
Assignee: general → cdleary
Status: NEW → ASSIGNED
Comment 2•14 years ago
|
||
For string-fasta.js's two for-in loops, we could do even better, but maybe it is overkill: the def-use chain for the loops has the def in the loop head dominating all (few) uses, so we could try a bit more static analysis and predict the next value (property name).
/be
What about the idea mentioned a few times of iterating over JSScopeProperties? It'd be pretty cool if we could just read the last sprop out of the iterator and avoid the PIC.
Comment 4•14 years ago
|
||
Other than the sprop linkage being up the property tree, so in reverse for-in order, it's great.
This isn't a big deal given the fastiterators bug's work, but you'd need to tweak that code to build an array of sprops and know how to get the next one in for-in enumeration order from it. Or something like that.
/be
Assignee | ||
Comment 5•14 years ago
|
||
Very WIP: needs cleanup and the integer fast path added back in.
Comment 6•14 years ago
|
||
FWIW here's the patch from last week that keeps the last-seen value around which would avoid a MIC/PIC in the common case.
Hi, patch attached which saves 8-9ms on string-fasta for TM and JM. 'for in' loops keep an array of (id,slot) instead of id, which is inspected before doing an arbitrary name lookup (not fast-pathed yet for JM, just non-as-slow-pathed). Doesn't handle property deletes (don't know how those work). Should this go on 560989 or a new bug?
Brian
Reporter | ||
Comment 7•14 years ago
|
||
(In reply to comment #6)
> Should this go on 560989 or a new bug?
I think this belongs in bug 578756.
I tried to apply it to check out the perf gain, but it needs rebasing. We are hoping to get 23 ms total from improving that particular loop of fasta, so if we aren't getting all of it, we need to know why to decide if the current patches are going to take care of it or if we need to do more.
Assignee | ||
Updated•14 years ago
|
Summary: JM: PIC for |o[e]| → JM: PIC for |o[e]| (JSOP_GETELEM)
Assignee | ||
Comment 8•14 years ago
|
||
Shoehorning the int path back in...
Attachment #457736 -
Attachment is obsolete: true
Comment 9•14 years ago
|
||
Iterator patch rebased to current JM. I still get a 9ms speedup on string-fasta with this (35ms -> 26ms).
Attachment #457885 -
Attachment is obsolete: true
Comment 10•14 years ago
|
||
Chris, what speedup are you seeing on string-fasta with your patch (can you test it yet)?
Looking at this some more, unless I'm missing something I don't see why using a PIC makes sense for the workloads seen on SS. There are 330k slow paths taken for getelem on SS. 175k are from one site in string-fasta, 80k are from four sites in string-unpack-code, leaving 75k (iirc 30k are gets on string elements, 8k are getting holes, remainder are misc).
If you PIC on (shape x id) (you need to do this to determine the slot, right?), then for the one site in string-fasta there are 19 cases that get hit (the total number of properties in the two objects that are repeatedly iterated over), and for the sites in string-unpack-code there are between 600-2000 cases that get hit (all 1-2 character strings).
The iterator patch works on the 175k in string-fasta but not the ones in string-unpack-code; I think these require old fashioned hashtable lookups.
Assignee | ||
Comment 11•14 years ago
|
||
I've only tested on string-fasta (because that has no dense array hot paths): 13.5% == 4ms speedup.
Assignee | ||
Comment 12•14 years ago
|
||
About to test on a machine whose benchmark results I trust...
Attachment #458032 -
Attachment is obsolete: true
Assignee | ||
Comment 13•14 years ago
|
||
4ms on SunSpider, wash on V8. I'll see if I can do anything to cut down on extraneous spillage to make these numbers better.
How many times does the PIC miss on SS, and how many times does it hit?
Comment 15•14 years ago
|
||
General concern: how reliable are these benchmark comparisons? Comparison from comment 13 shows 'significant' differences in tests completely unrelated to these PICs (binary-trees, bitwise-and, plenty that only have integer array accesses). string-fasta is .9ms faster, and string-unpack-code is .5ms faster; if you repeat the comparison, how much do things change?
Comment 16•14 years ago
|
||
SunSpider overstates significance all the time; a while ago dmandelin thought he saw a flaw in its pstats but then I think thought it was ok, but it "feels" all wrongly overstated, way too often.
Of course SS has ND in it too (Date, Math.random). Benchmarketing at its best!
/be
Assignee | ||
Comment 17•14 years ago
|
||
(In reply to comment #15)
> General concern: how reliable are these benchmark comparisons?
It's reproducible. There are a lot of little effects from adding these PICs that can have emergent behavior, though most of the ones I can think of are bad: self-modifying code causes I$ flushes, PIC guards alias other branches in the BHTs, there's more I$ pressure because of the inline PIC code and out-of-line slow call code that gets tacked on to the inline buffer. I look a quick look at cachegrind diff results yesterday, but I'll look into it more today.
@dvander: I'll investigate hit counts.
Assignee | ||
Comment 18•14 years ago
|
||
I wrote a little script to compare stub call counts using our stub call profiling mode and I got the following:
sunspider-0.9.1 aggregate
JSOP_NOP: +999
JSOP_PROPINC: +1
JSOP_EXCEPTION: -1
JSOP_CONCATN: -1
JSOP_CALL: -3
JSOP_REGEXP: -999
JSOP_GETELEM: -24425
Total: -24429
I think that the stub profiling data (or my script) may be a little buggy because running the untouched shell against itself I got:
crypto-aes.js
JSOP_CONCATN: -12
JSOP_CALL: -36
Total: -48
sunspider-0.9.1 aggregate
JSOP_CONCATN: -12
JSOP_CALL: -36
Total: -48
Investigating...
Assignee | ||
Comment 19•14 years ago
|
||
Yeah, there's definitely nondeterminism in the stub profiling, although it's always for the same opcodes. Still looking, but CCing dmandelin -- maybe this is a known issue.
Assignee | ||
Comment 20•14 years ago
|
||
In any case, this is the bit that matters to us:
string-tagcloud.js
JSOP_GETELEM: -4998
date-format-xparb.js
JSOP_GETELEM: -3999
string-fasta.js
JSOP_GETELEM: -15428
sunspider-0.9.1 aggregate
JSOP_GETELEM: -24425
Those numbers should represent the PIC hits.
Comment 21•14 years ago
|
||
Is this nondeterminism in the stub profiling or nondeterminism in SS itself (see the nonce variable in AESEncryptCtr in crypto-aes.js)? As Brendan pointed out, SS doesn't do the same thing every time, including differing numbers of calls (bitten me before).
Assignee | ||
Comment 22•14 years ago
|
||
(In reply to comment #21)
> (see the nonce variable in AESEncryptCtr in crypto-aes.js)
That makes a lot of sense for crypto-aes, didn't see the call to Date until you mentioned it. :-)
Still not sure how exception is becoming propinc or regexp is becoming jsop_nop -- we don't even have a slow call for anything noppish, so that'll be what I check out next.
(In reply to comment #20)
What about PIC misses? Trying to see what percentage of PIC'able callsites we're missing. And how many times do we hit max stubs?
Assignee | ||
Comment 24•14 years ago
|
||
Excellent suggestion by dvander -- rooted out a bug. Re-analyzing with that fix.
Attachment #458158 -
Attachment is obsolete: true
Assignee | ||
Comment 25•14 years ago
|
||
1 disabled due to failed lookup at date-format-xparb.js:19. 4000 iterations where the first iteration is used in a "try to look it up, and if it turned out it wasn't in there, put it in there" fashion.
3 disabled due to abnormal object ops (2 in regexp-dna.js:1707, 1 in string-tagcloud.js:185) TODO: investigate what these objects actually are.
1 disabled in string-fasta.js:68 disabled due to megamorphism, could up the maximum a little bit to catch all of these.
4 disabled in string-unpack-code.js (:16, :30, :51, :66) due to megamorphism, but this one is definitely correct -- the strings that are being used for indexing are dynamically generated, so the atom comparison fails even when the characters are the same, i.e. "L" shows up three times with different string pointers.
In summary, I'll try to treating failed lookups as nops for the benefit of xparb and bump the stub cap for the (small) benefit of fasta.
Will rerun the benchmarks on my macbook tonight and give V8 a look-over tomorrow.
Assignee | ||
Comment 26•14 years ago
|
||
16.8ms win on SunSpider without any of the modifications I mentioned at the end of comment 25, but looks like it regresses V8 by 80ms (1%) in its current form. Will investigate tomorrow.
Attachment #458162 -
Attachment is obsolete: true
Assignee | ||
Comment 27•14 years ago
|
||
#if out some debug stuff I accidentally left in there. (I removed it before profiling in the above comment.)
Attachment #458282 -
Attachment is obsolete: true
Comment 28•14 years ago
|
||
Re: comment 25: non-native object ops in SS must be dense arrays.
ITM we want a different approach here from a PIC. We want prediction based on likelihood or (better) a bit of static analysis, not prediction based on what just missed the cache and forced us to fill it with a key that won't be matched again all too often.
/be
On my machine, with the patch from comment #27, I don't see any regressions.
string-fasta speeds up by 34% (6.5ms), and that's the only clearly-affected test.
Comment 30•14 years ago
|
||
I get a 9ms speedup (35ms -> 26ms; how are you computing the 34%?) with the patch from comment 27, the same as for the iterators patch. That other patch could get a few ms more by fast-pathing the cache lookup (the calls to GetElem and associated spills/tests are still performed) and maybe by using an array of JSScopeProperty*. Both patches are attacking string-fasta; how to proceed?
Assignee | ||
Comment 31•14 years ago
|
||
Discussed offline with dvander, dmandelin, and bhackett. We're going to go with the GETELEM PIC over the iterator patch. The GETELEM patch shows a little more speedup on microbenchmarks because the iterator patch is lacking a fast path in its current implementation.
Attachment #458289 -
Attachment is obsolete: true
Attachment #458497 -
Flags: review?(dvander)
Comment 32•14 years ago
|
||
Could you post a summary of the discussion?
Assignee | ||
Comment 33•14 years ago
|
||
(In reply to comment #32)
> Could you post a summary of the discussion?
Surely! The patches do the same thing (make getelem faster) -- the GETELEM PICs are a bit more generic and have a larger speedup ATM, so it doesn't seem worthwhile to spend more engineering effort duplicating something we already have the win for.
Since GETELEM PICs work and have a slightly larger speedup, into the tree they go.
Comment 34•14 years ago
|
||
Sorry if I missed this: What PIC hit rates do you get for the string-fasta loops?
/be
Assignee | ||
Comment 35•14 years ago
|
||
(In reply to comment #33)
> the GETELEM PICs
> are a bit more generic
To clarify, when I say "a bit more generic" I mean that they work on a larger
set of cases than indexing into the innermost iterator.
Comment 36•14 years ago
|
||
fasta does a property walk over an object. why does a PIC outperform an optimized iterator data structure?
Assignee | ||
Comment 37•14 years ago
|
||
(In reply to comment #36)
> fasta does a property walk over an object. why does a PIC outperform an
> optimized iterator data structure?
As I mentioned in comment 33, because we have a fast path implemented for it. The current iterator patch performs a slow call on each iteration.
Assignee | ||
Comment 38•14 years ago
|
||
Bumps the megamorphic limit to 20, so we don't disable at all on fasta.
Attachment #458497 -
Attachment is obsolete: true
Attachment #458505 -
Flags: review?(dvander)
Attachment #458497 -
Flags: review?(dvander)
Comment 39•14 years ago
|
||
Your summary above is completely misleading. So we are choosing a less optimal algorithm that we implemented a fast path for over a better algorithm that we didn't finish implementing? This doesn't seem like a good approach to me.
Comment 40•14 years ago
|
||
Tuning for SunSpider is one thing (20 may be overkill -- isn't the magic number for IUB 15?).
Giving the win to the patch that has had some fast-path love, even though it uses caching which is in general not effective on real for-in loops, instead of giving the same advantage of fast-path love to the iterator path, and then judging, is a path-dependent mistake.
Apples to apples, if the fast-path'ed iterator wins, then it should go in.
/be
We should do both, they're not mutually exclusive, and this is ready now.
We fast-path FORLOCAL and MOREITER, we can have a separate bug for noticing where FORLOCAL results flow into.
Comment 42•14 years ago
|
||
Ok, so let's do both -- not make this an either-or on bogus grounds :-/. Also, be clear this is tuning for SunSpider. We can take it, but it should be described as "more general".
Is the other bug on file?
/be
Assignee | ||
Comment 43•14 years ago
|
||
(In reply to comment #34)
> Sorry if I missed this: What PIC hit rates do you get for the string-fasta
> loops?
Not sure that I understand the metric you're looking for, but I'll take a shot.
The patch elides 147331 slow calls to GETELEM versus TIP, so with 52 slow calls
for PIC updates, that's a hit rate of 99.9647%.
Comment 44•14 years ago
|
||
er, "should *not* be described as 'more general'."
/be
Comment 45•14 years ago
|
||
(In reply to comment #43)
> (In reply to comment #34)
> > Sorry if I missed this: What PIC hit rates do you get for the string-fasta
> > loops?
>
> Not sure that I understand the metric you're looking for,
A PIC is a cache. Caches are tested, which either hits or misses (if miss, then fill according to a replacement policy). The hit rate is the ratio of hits to tests. What's the hit rate?
> but I'll take a shot.
> The patch elides 147331 slow calls to GETELEM versus TIP, so with 52 slow calls
> for PIC updates, that's a hit rate of 99.9647%.
That's pretty awesome. Thanks!
/be
Comment 46•14 years ago
|
||
Chris, what happens if you shrink the PIC size to 16? Does the hit rate drop a whole lot?
/be
Assignee | ||
Comment 47•14 years ago
|
||
(In reply to comment #42)
> Is the other bug on file?
dmandelin mentioned in comment 7 that the iterator work is living in bug
578756. This bug happened to become a de-facto string-fasta optimization bug,
which was the cause for the either-or confusion.
Assignee | ||
Comment 48•14 years ago
|
||
(In reply to comment #45)
> > Not sure that I understand the metric you're looking for,
>
> A PIC is a cache.
I may have misrepresented my level of cluelessness...
You were right, 16 is okay.
PIC Size Hit Rate
-------- --------
2: 49322 / 147384 = 33.471%
4: 64493 / 147384 = 43.758%
8: 79014 / 147384 = 53.611%
15: 122866 / 147384 = 83.365%
16: 147331 / 147384 = 99.964%
32: 147331 / 147384 = 99.964%
Just so happens only 17 of the 19 shape/atom combos flow through that GETELEM.
In light of this pathetic analysis, I would like to apologize for the impression I gave that PICing is more general in any serious way.
On the mac-mini:
fasta: 1.40x as fast 41.9ms +/- 0.6% 29.8ms +/- 0.9% significant
The only consistent regression I see across multiple computers is this:
fannkuch: *1.052x as slow* 20.8ms +/- 1.0% 21.8ms +/- 1.0% significant
When there's only a single GETELEM site, things are fine. As more get added the performance degrades. To demonstrate see the attached script. It runs in 2500ms before this patch, and 3000ms after.
It's only 1ms on the benchmark, but my gut feeling is we should investigate this.
Comment 50•14 years ago
|
||
Is it still the case that JM does not handle megamorphism explicitly? I.e. too much PIC thrashing -> recompile a generic stub call?
/be
(In reply to comment #50)
Yeah, though this problem case is the reverse - lots of call sites, each monomorphic.
Comment on attachment 458505 [details] [diff] [review]
PIC GETELEM, max count bumped
>+ /* FIXME: we should factor this out. Easy to miss new members. */
Forgetting to set a member will light up pretty quickly. Omit this or explain it better (maybe even turn it into documentation).
>+ script->pics[i].idReg = pics[i].idReg;
This should be in the |u.get| union struct of ic::PICInfo.
>+ pic.hotPathBegin = masm.label();
>+ /* Guard on shape. */
Newline before comment.
>+ pic.shapeGuard = masm.label();
>+ Jump shapeGuard = masm.branch32(Assembler::NotEqual, shapeReg,
>+ Imm32(int32(JSObjectMap::INVALID_SHAPE)));
>+ /* Guard on id identity. */
Ditto.
>+ static const int32 BOGUS_ATOM = 0xdeadbeef;
>+ Jump idGuard = masm.branch32(Assembler::NotEqual, idReg, Imm32(BOGUS_ATOM));
branchPtr, ImmPtr - or better yet, put a // :FIXME: X64 in there somewhere.
>+ stubcc.linkExit(idGuard, Uses(2));
>+ stubcc.linkExitDirect(shapeGuard, pic.slowPathStart);
This must be linkExit(), not linkExitDirect(), because if the branch is taken registers won't be synced.
>+ void jsop_getelem_dense(FrameEntry *obj, FrameEntry *id, RegisterID objReg,
>+ MaybeRegisterID &idReg, RegisterID shapeReg);
> inline JSC::MacroAssembler::RegisterID
>+FrameState::tryRegForType(FrameEntry *fe, RegisterID fallback)
More on this later.
>+ if (obj->isDenseArray()) {
>+ start = masm.label();
>+ atomGuard = masm.branchPtr(Assembler::NotEqual, pic.idReg, ImmPtr(id));
>+ shapeGuard = masm.branchPtr(Assembler::NotEqual,
>+ Address(pic.objReg, offsetof(JSObject, clasp)),
>+ ImmPtr(obj->getClass()));
Dense arrays are non-native, so an earlier check made this all dead. You can just remove it.
>+ masm.loadData32(proto, pic.objReg);
loadPtr() - looks like an existing bug that got copy pasta'd. proto used to be a jsval.
>+ if (obj->isDenseArray())
>+ disable("dense array");
Can remove this too.
> // This struct comes out to 61 bits.
65 now (70 when idReg moves in).
>+ }
>+ /* Postcondition: type should be in tmpReg, data should be in objReg. */
>+ /* Note: linkExits will be hooked up to a leave() after this method completes. */
Newline before comment, merge comments or use //
>+ if (id->isTypeKnown() && id->getKnownType() == JSVAL_TYPE_INT32 && id->isConstant() &&
>+ id->getValue().toInt32() < 0) {
>+ jsop_getelem_slow();
>+ return;
>+ }
Can this check be moved into the jsop_getelem_dense path?
>+
>+ if (id->isTypeKnown() && id->getKnownType() == JSVAL_TYPE_STRING && id->isConstant()) {
>+ /* Never happens, or I'd optimize it. */
>+ jsop_getelem_slow();
>+ return;
>+ }
Similarly, into the jsop_getelem_pic path.
>+ RegisterID tmpReg = frame.allocReg();
>+ if (!obj->isTypeKnown()) {
>+ RegisterID typeReg = frame.tryRegForType(obj, tmpReg);
>+ Jump objGuard = masm.testObject(Assembler::NotEqual, typeReg);
>+ stubcc.linkExit(objGuard, Uses(2));
>+ }
I'd prefer playing along with the register allocator. Use copyTypeIntoReg, and allocReg on the other side of the |if|. Right now, you might be spilling and then re-loading the type.
>+ if (!id->isConstant())
>+ idReg.setReg(frame.copyDataIntoReg(id));
>+ /* Meat. */
Delicious, but more savory with a preceding newline.
>+ jsop_getelem_dense(obj, id, objReg, idReg, tmpReg);
>+ stubcc.leave();
>+ stubcc.call(stubs::GetElem);
>+ /* Epilogue. */
Ditto.
>+ RegisterID objReg = frame.copyDataIntoReg(obj);
>+ RegisterID idReg = frame.copyDataIntoReg(id);
>+ /* Meat. */
>+ jsop_getelem_pic(obj, id, objReg, idReg, tmpReg);
>+ /* Epilogue. */
Ditto * 2.
>+#ifdef JS_POLYIC
>+ JS_ASSERT(!id->isTypeKnown());
Taking another look, this function is *really* long. Would be nice to separate into smaller pieces (getelem_known_id_type, getelem_with_pic, getelem_without_pic), but up to you.
>+ RegisterID typeReg = frame.tryRegForType(id, tmpReg);
>+ Jump intGuard = masm.testInt32(Assembler::NotEqual, typeReg);
Nice, this is a good use case for tryRegForType, because anything else will be less optimal. Let's overload it to "tempRegForType" and give it a "hint" parameter. Assert that the hint reg is owned.
>+ Jump performedDense = masm.jump();
Hrm... I think we could do better by keeping one case OOL, or PIC'ing everything. For example, if the type isn't known, we'd always go to a PIC. If it saw a string id, it'd guard on that and build string PICs. If it saw an int32 id, it'd guard on that, and build a dense elem path. We most likely don't need or want both in one path, for locality, branch prediction, or code size.
This should be a follow-up bug though.
Nice work so far, looking forward to new patch.
Assignee | ||
Comment 53•14 years ago
|
||
(In reply to comment #52)
> >+ stubcc.linkExit(idGuard, Uses(2));
> >+ stubcc.linkExitDirect(shapeGuard, pic.slowPathStart);
>
> This must be linkExit(), not linkExitDirect(), because if the branch is taken
> registers won't be synced.
I think the same set of registers is dirty on the idGuard taken edge and the shapeGuard taken edge.
> >+ if (obj->isDenseArray()) {
> >+ start = masm.label();
> >+ atomGuard = masm.branchPtr(Assembler::NotEqual, pic.idReg, ImmPtr(id));
> >+ shapeGuard = masm.branchPtr(Assembler::NotEqual,
> >+ Address(pic.objReg, offsetof(JSObject, clasp)),
> >+ ImmPtr(obj->getClass()));
>
> Dense arrays are non-native, so an earlier check made this all dead. You can
> just remove it.
We checked in the caller that the *holder* had a native JSObjectOps, not obj. Is it possible for a dense array to have a prototype with native object ops? I'm somewhat confused by the terminology here, may have to discuss offline.
> Would be nice to separate into smaller pieces
I'd like that as well, will break it up some.
> >+ if (id->isTypeKnown() && id->getKnownType() == JSVAL_TYPE_INT32 && id->isConstant() &&
> >+ id->getValue().toInt32() < 0) {
> >+ jsop_getelem_slow();
> >+ return;
> >+ }
>
> Can this check be moved into the jsop_getelem_dense path?
It's easier to emit all the shortcut-to-slow stuff before the shared isObject guard. I can duplicate that guard a few times if you hate the way it is now.
> Let's overload it to "tempRegForType" and give it a "hint"
> parameter. Assert that the hint reg is owned.
So this just means rename tryRegForType to tempRegForType? tryRegForType already asserted that the fallback register was owned.
(In reply to comment #53)
> I think the same set of registers is dirty on the idGuard taken edge and the
shapeGuard taken edge.
Whoops, yup, you're right. Can you leave a comment in there to avoid accidental breakage?
> We checked in the caller that the *holder* had a native JSObjectOps, not obj.
Okay, yeah, we need to check |obj|, as well as everything in that protochain walk. Another existing bug, we can fix the other cases separately.
> I can duplicate that guard a few times if you hate the way it is now.
If it's a big deal then nevermind.
> So this just means rename tryRegForType to tempRegForType?
Yes.
Comment 55•14 years ago
|
||
(In reply to comment #53)
> > Dense arrays are non-native, so an earlier check made this all dead. You can
> > just remove it.
>
> We checked in the caller that the *holder* had a native JSObjectOps, not obj.
> Is it possible for a dense array to have a prototype with native object ops?
> I'm somewhat confused by the terminology here, may have to discuss offline.
David pointed out js::PropertyCache::fill over IRC, filling a PIC needs to have the same logic about non-native followed by native on proto chain. Fortunately this is rare apart from dense arrays, and for dense arrays we special-case all proto chain identifier-based (atom literal) tests/lookups to skip the direct (dense array) object, since it cannot have named properties.
GETELEM could be accessing any computed property name, though, so if the id at runtime is not an index, but the direct object is a dense array, then you should be able to skip up one level on the proto chain.
/be
Assignee | ||
Comment 56•14 years ago
|
||
Attachment #458505 -
Attachment is obsolete: true
Attachment #458799 -
Flags: review?(dvander)
Attachment #458505 -
Flags: review?(dvander)
Assignee | ||
Comment 57•14 years ago
|
||
Shouldn't have marked for re-review, fixing native logic now...
Assignee | ||
Comment 58•14 years ago
|
||
Attachment #458799 -
Attachment is obsolete: true
Attachment #458846 -
Flags: review?(dvander)
Attachment #458799 -
Flags: review?(dvander)
Comment on attachment 458846 [details] [diff] [review]
PIC GETELEM, take 2, for real.
>+ /*
>+ * Check all nodes along proto chain are JS-native; otherwise,
>+ * obj->shape() is undefined.
>+ */
>+ for (JSObject *protoNode = obj; protoNode; protoNode = protoNode->getProto()) {
As discussed IRL the isNative check should be in the proto chain walk in generateStub(), don't need need two loops.
r=me with that.
The regression to fannkuch is still there, but the attached test case is somewhat better (2800ms from 2500ms). I'll file a follow-up bug to investigate this.
Attachment #458846 -
Flags: review?(dvander) → review+
Comment 60•14 years ago
|
||
Canonical variable names are important in house style, so s/protoNode/pobj/ (it's not a "node" :_P). Drive-by nit, hope you can get it in before it's forgotten.
/be
Assignee | ||
Comment 61•14 years ago
|
||
Whiteboard: fixed-in-jaegermonkey
Updated•14 years ago
|
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•