<a class="header-button" href="https://bugzilla.mozilla.org/home" title="Go to home page"> Bugzilla

Reporter

Comment 3

•

14 years ago

This is current mozilla-central trunk JS code, so TM. If JM helps with this, I'll cheer a lot for sure! Still, it's a tight loop, so TM should try to jump in in theory, but it happens to be thrown off a lot.

Reporter

Comment 4

•

14 years ago

Actually, what I get with TMFLAGS=minimal,abort is this: Recording starting from chrome://dataman/content/dataman.js:649@242 (FragID=000000) trace stopped: 3512: non-stub getter trace stopped: 12975: name() not accessing a valid slot Abort recording of tree chrome://dataman/content/dataman.js:649@242 at chrome://global/content/bindings/tree.xml:0@8: name. (this is with code I already modified somewhat, so what that output say is line 649 is what's 613 in comment #0) What I wonder about is how it comes to calling into tree.xml there...

Reporter

Comment 5

•

14 years ago

I optimized that loop so that it doesn't fall off trace in http://hg.mozilla.org/users/kairo_kairo.at/dataman/rev/f42548e3f20b but I still get the same effect, i.e. selecting a domain in Data Manager being quite slow with JIT (about 3 seconds for most domains in my debug build, I saw one pretty random case that only has one cookie that took 180 ms, all others I tried were 1.5 to 3 seconds) and very fast without JIT (4 to 8 milliseconds!) - I don't see any trace aborts in anything that would loop significantly now (with TMFLAGS=minimal,abort) so I really wonder what's up there. If you want to try experimenting with this, putting the dataman repository into extensions/dataman and building with --enable-extensions=default,dataman gives it to you as pre-installed add-on, and the version the repo was at in comment #0 is version 0.9 on https://addons.mozilla.org/addon/162068

Comment 6

•

14 years ago

Robert, where are you placing your probes to measure the times you mention in comment 5? Assume I'm at data manager rev f42548e3f20b.

Reporter

Comment 7

•

14 years ago

(In reply to comment #6) > Robert, where are you placing your probes to measure the times you mention in > comment 5? Assume I'm at data manager rev f42548e3f20b. That rev includes the very basic way I took for measuring, just turn the gDatamanDebug var to true at the very beginning of dataman.js and look into error console when selecting different domains. I just watched the differences between "Domain selected" and "Domain select finished", so that's the start of end of domain_select - by very simple |Date.now()/1000| printouts (sorry, I don't know more involved actual profiling tools too well).

Mike Shaver (:shaver emeritus)

Comment 8

•

14 years ago

OK, I just tried doing that, and I can reproduce the bug. Profiles show us spending a lot more time in DOM manipulation, and in fact this loop is responsible (in sort_cookies): for (let node = column.parentNode.firstChild; node; node = node.nextSibling) { node.removeAttribute("sortActive"); node.removeAttribute("sortDirection"); } Note the let. If let is replaced with var the bug goes away. Simple logging indicates that if the loop is run without jit it runs 5 iterations, as expected. With jit it runs ... a variable number of times much bigger than 5. My last two tests were 7593 iterations and 28730 iterations. Nominating for blocking, since we do use let in our chrome and all....

Severity: normal → major

blocking2.0: --- → ?

Comment 9

•

14 years ago

This could be related to the tbird chrome-jit problem, too.

Comment 10

•

14 years ago

If I log the actual value of node as I go (or rather push into an array and then join the array on '\n' at the end, so as to stay on trace) and pipe the result through uniq -c I get things like: 1 [object XULElement @ 0x2067b2d0 (native @ 0x2064a040)] 1 [object XULElement @ 0x206ca3a0 (native @ 0x2064a0a0)] 1 [object XULElement @ 0x2067b620 (native @ 0x2064a0e0)] 52458 [object XULElement @ 0x206ca430 (native @ 0x2064a150)] 1 [object XULElement @ 0x2067b9f0 (native @ 0x2064a190)] So for some reason we spin for a while there on the fourth loop iteration... Robert is going to see if he can create a smaller testcase here.

Reporter

Comment 11

•

14 years ago

Attached file Simplified test case — Details

Here's a quite simple test case. Open it, click the button and have fun!

Reporter

Comment 12

•

14 years ago

BTW, with JIT turned on, I get different counts every time, but things like "Your actual count is 25870809." - with JIT off, it's always 7 as it should be.

Bruno 'Aqualon' Escherl

Comment 13

•

14 years ago

I've run the test case with some older builds, Mozilla/5.0 (X11; Linux i686 (x86_64); en-US; rv:2.0b2pre) Gecko/20100714 Minefield/4.0b2pre (rev 5355f245b916) works, Mozilla/5.0 (X11; Linux i686 (x86_64); en-US; rv:2.0b2pre) Gecko/20100715 Minefield/4.0b2pre (rev 5fda39cd70) shows the problem. Probably was introduced with the merge of tracemonkey into mozilla-central: http://hg.mozilla.org/mozilla-central/pushloghtml?changeset=c26c255bade9

Reporter

Updated

•

14 years ago

Keywords: regression

Summary: JIT vastly slows down Data Manager → JIT vastly slows down Data Manager (let in .nextSibling loop makes TM loop way too often)

Comment 14

•

14 years ago

Bruno, are you willing to try some tracemonkey nightlies to pin this down to a smaller range?

Justin Wood (:Callek)

Comment 15

•

14 years ago

TraceMonkey: July 11: |468e317da3d8| WFM July 12: |cf557e3fc53d| Fail So range is: http://hg.mozilla.org/tracemonkey/pushloghtml?fromchange=468e317da3d8&tochange=cf557e3fc53d (which also has a m-c merge in it)

Reporter

Comment 16

•

14 years ago

As m-c only regressed later, it can't really be that merge, so http://hg.mozilla.org/tracemonkey/rev/cf557e3fc53d from bug 577764 looks very suspect. njn did that patch, dvander reviewed, I guess they should know most about what's going on with this regression.

Comment 17

•

14 years ago

hg bisect says: The first bad revision is: changeset: 47491:cf557e3fc53d user: Nicholas Nethercote <nnethercote@mozilla.com> date: Mon Jul 12 11:01:58 2010 +1000 summary: Bug 577764 - TM: avoid a dead guard at the end of loop fragments. r=dvander.

Blocks: 577764

Assignee

Comment 18

•

14 years ago

That's really weird that bug 577764 would cause this, it was a very simple patch. I'll take a look on Monday.

Assignee

Comment 19

•

14 years ago

Yes, the patch for bug 577764 did cause this. Here's an even simpler test case that can be run from the JS shell: var fourth = { nextSibling: null }; var third = { nextSibling: fourth }; var second = { nextSibling: third }; var first = { nextSibling: second }; function runLoop() { let loopcount = 0; for (let node = first; node; node = node.nextSibling) { loopcount++; } return loopcount; } print(runLoop()); Here's the LIR for it without the bogus patch from bug 577764: ebx = parami 0 ebx esi = parami 1 esi edi = parami 2 edi state = parami 0 ecx label1: sp = ldi.o state[8] (void *) &JS_THREAD_DATA(cx)->operationCallbackFlag = immi ...... ldi1 = ldi.o/v (void *) &JS_THREAD_DATA(cx)->operationCallbackFlag/*...... */[0] immi1 = immi 0 eqi1 = eqi ldi1, immi1/*0*/ xf1: xf eqi1 -> pc=...... imacpc=(nil) sp+8 rp+0 (GuardID=001) ldi2 = ldi.s sp[-8] immi2 = immi 1 addxovi1 = addxovi ldi2, immi2/*1*/ -> pc=...... imacpc=(nil) sp+8 rp+0 (G uardID=002) sti.s sp[-8] = addxovi1 $stack5 = ldi.s sp[0] # !!! map = ldi.o $stack5[0] shape = ldi.o map[0] immi3 = immi 192 guard_kshape = eqi shape, immi3/*192*/ xf2: xf guard_kshape -> pc=...... imacpc=(nil) sp+8 rp+0 (GuardID=003) ldi3 = ldi.o $stack5[36] JSVAL_TAG_OBJECT = immi ...... eqi2 = eqi ldi3, JSVAL_TAG_OBJECT/*......*/ xf3: xf eqi2 -> pc=...... imacpc=(nil) sp+8 rp+0 (GuardID=004) ldi4 = ldi.o $stack5[32] ldi5 = ldi.o ldi4[4] clasp = immi ...... guard(class is Function) = eqi ldi5, clasp/*......*/ xt1: xt guard(class is Function) -> pc=...... imacpc=(nil) sp+8 rp+0 (Guar dID=005) sti.s sp[0] = ldi4 # !!! j -> label1 livei state x1: x -> pc=...... imacpc=(nil) sp+8 rp+0 (GuardID=006) When the patch is applied, the final 'x1' guard is removed. As a result, the StackFilter also removes the final store to sp[0], which I've marked with "!!!". So when we loop and load from sp[0] the next time (also marked with "!!!") the value hasn't been set appropriately. So that final guard, even though it's dead, is serving a purpose... it's indicating which stack values are live at the end of the loop. Wow, nothing like a "trivial" patch that goes subtly wrong to sap one's confidence!

David Anderson [:dvander] - inactive, e-mail if emergency

Assignee

Comment 20

•

14 years ago

Attached patch patch — Details — Splinter Review

This patch reverses the patch from bug 577764 and also adds a comment explaining why the dead guard must be present.

Assignee: general → nnethercote

Status: NEW → ASSIGNED

Attachment #461966 - Flags: review?(dvander)

Comment 21

•

14 years ago

Comment on attachment 461966 [details] [diff] [review] patch Oh, how gross. Sorry for not catching this.

Attachment #461966 - Flags: review?(dvander) → review+