Closed Bug 601794 Opened 15 years ago Closed 15 years ago

Excessive startup time due to quadratic VarTracker behavior

Tracking

(Not tracked)

Status:

VERIFIED FIXED

Milestone:

Q2 11 - Wasabi

People

(Reporter: wmaddox, Assigned: wmaddox)

Details

(Whiteboard: WE:2700709,fixed-in-salt,fixed-in-spicy,fixed-in-spicier)

Attachments

(5 files, 4 obsolete files)

Workaround for lengthy initialization methods 15 years ago William Maddox 3.81 KB, patch		Details \| Diff \| Splinter Review
Data showing verifyonly vs. interp times for tr-argo release 15 years ago William Maddox 9.33 KB, text/plain		Details
(framesize, codesize) tuples for 30K brightspot swfs 15 years ago Edwin Smith 94.40 KB, text/plain		Details
Workaround for lengthy initialization methods (v2, with explanatory comment) 15 years ago William Maddox 4.15 KB, patch	rreitmai : review+ edwsmith : superreview+	Details \| Diff \| Splinter Review
Fail-fast for JIT when framesize too large for NanoJIT stack limit 15 years ago William Maddox 2.85 KB, patch	edwsmith : feedback+	Details \| Diff \| Splinter Review
Fail-fast for JIT when framesize too large for NanoJIT stack limit (v2) 15 years ago William Maddox 4.68 KB, patch		Details \| Diff \| Splinter Review
Fail-fast for JIT when framesize too large for NanoJIT stack limit (v3) 15 years ago William Maddox 4.69 KB, patch	rreitmai : review+ edwsmith : superreview+	Details \| Diff \| Splinter Review
An AS3 function with just enough local variables to exceed the JIT stack frame size limit 15 years ago William Maddox 31.08 KB, text/plain		Details
Fail-fast for JIT when framesize too large for NanoJIT stack limit (for TR) 15 years ago William Maddox 5.29 KB, patch	rreitmai : review+ edwsmith : superreview+	Details \| Diff \| Splinter Review

William Maddox

Assignee

Description

•

15 years ago

Time spent in trackForwardEdge() is O(number_of_forward_edges * number_of_vars). Certain mechanically-generated methods can be pathogical. One method observed in the wild is 1809481 bytes long with with 149988 variables.

William Maddox

Assignee

Comment 1

•

15 years ago

Attached patch Workaround for lengthy initialization methods (obsolete) — Details — Splinter Review

In the motivating example of a lengthy initialization method, the code is all straight-line control flow without explicit branches appearing in the ABC. The problematic forward edges arise due to null checks, which branch to common code to throw an exception on a null pointer. This patch is a workaround for this case. We take advantage of the fact that the branch target immediately throws an exception, and that vartracker state is not propagated to the handler in this case. (This is actually a bit subtle -- it is only because we throw the exception in the epilogue, rather than at the point the null check is performed, that this is the case.) Note: This patch is against tamarin-redux-argo, though similar issues apply to TR.

William Maddox

Assignee

Updated

•

15 years ago

Attachment #480806 - Attachment is patch: true

William Maddox

Assignee

Updated

•

15 years ago

Whiteboard: WE:2700709

William Maddox

Assignee

Comment 2

•

15 years ago

tr-argo/deb/shell/avmshell -Dverifyonly ~/avmglue.abc ~/es29.swf Patch reduces time for -Dverifyonly from 4m25s to 28s, tr-argo release-debugger build. Time with -Dinterp is 1s, so clearly there's still a lot of time going to the JIT. Applied to fp10-salt, es29.swf starts successfully, albeit slowly.

William Maddox

Assignee

Comment 3

•

15 years ago

Attached file Data showing verifyonly vs. interp times for tr-argo release — Details

There are multiple cases of ~4:1 verifyonly/interp ratio in this dataset, but the ~9:1 seen with es29.swf is a bit of an outlier. Absolute verify times are much lower, as are interpreter times. It's unclear how much any of this means without looking at how much of the swf size is actually do to AS3 code, and the distribution of the method sizes.

Edwin Smith

Updated

•

15 years ago

Assignee: nobody → wmaddox

Edwin Smith

Comment 4

•

15 years ago

Attached file (framesize, codesize) tuples for 30K brightspot swfs — Details

Only lists methods if framesize > 60 or codesize > 10000. out of 30K swfs, there's only 12K 'large' methods, and only a handful look like outliers.

William Maddox

Assignee

Comment 5

•

15 years ago

(In reply to comment #2) > Applied to fp10-salt, es29.swf starts successfully, albeit slowly. For release builds, es29.swf starts in 14s with the patch, compared to 24s on fp10-coral, so we're seeing a nice speedup from 10.1 once the regression is mitigated.

William Maddox

Assignee

Comment 6

•

15 years ago

Attached patch Workaround for lengthy initialization methods (v2, with explanatory comment) (obsolete) — Details — Splinter Review

TR-argo passes acceptance tests with this patch. Patch needs to be carefully reviewed for correctness issues.

Attachment #480806 - Attachment is obsolete: true

Attachment #480995 - Flags: review?(edwsmith)

Attachment #480995 - Flags: feedback?(rreitmai)

Edwin Smith

Comment 7

•

15 years ago

The code might be slightly cleaner if the trackEdge flag were on CodegenLabel. We don't need it for this app, but the same flag probably makes sense for the other handlers too (upe, npe, interrupt, mops-range-check). For the sake of Salt or another maintenance release, we should keep the scope limited, and the patch is probably fine as-is (pending more testing).

William Maddox

Assignee

Updated

•

15 years ago

Attachment #480995 - Flags: superreview?(edwsmith)

Attachment #480995 - Flags: review?(rreitmai)

Attachment #480995 - Flags: review?(edwsmith)

Attachment #480995 - Flags: feedback?(rreitmai)

Rick Reitmaier

Comment 8

•

15 years ago

Also, not that it really matter much, but just curious why we made it a generic trackEdge flag rather than trackForwardEdge flag, seeing as this is supposed to be a short-lived spot fix, no?

William Maddox

Assignee

Comment 9

•

15 years ago

(In reply to comment #8) > Also, not that it really matter much, but just curious why we made it a generic > trackEdge flag rather than trackForwardEdge flag, seeing as this is supposed to > be a short-lived spot fix, no? It's not clear that it will be short-lived, in the sense that TR will likely need something similar until selective translation / OSR is implemented. That said, the only edges that are not tracked are known to be forward edges, so it's not strictly necessary. I had in mind the notion of "not tracking an edge" and implementing it completely, rather than specializing the API to its only use. That would save a few instructions.

Edwin Smith

Comment 10

•

15 years ago

Someone earlier had the idea to just punt to the interpreter if a function is too big. I didn't like the idea because of bug 561368, and its hard to define "too big". But, there's an easy sub-case we could handle. if (framesize*8 + K > NJ_MAX_STACK_ENTRY*4), where K is a conservative estimate of the extra space we always need, e.g. sizeof(MethodFrame) and other odds+ends, then the JIT is guaranteed to fail, but not until we reach the Assembler phase. We could just fail sooner and faster. Even K=0 is okay. Here, "conservative" means we only fail early if the assembler is guaranteed to fail anyway. I still like the earlier patch as a refinement for TR, but failing sooner is so simple/easy we should probably do it, and it will solve the problem in WE:2700709 for Salt.

Trevor Baker

Updated

•

15 years ago

Status: NEW → ASSIGNED

Flags: flashplayer-qrb+

Priority: -- → P3

Target Milestone: --- → flash10.2.x-Spicy

Dan Smith

Updated

•

15 years ago

Target Milestone: flash10.2.x-Spicy → flash10.1.x-Salt

Trevor Baker

Comment 11

•

15 years ago

VarTracker is not in FP9, no backport required. Is this an injection?

Edwin Smith

Comment 12

•

15 years ago

Yes, the extreme slowness was an Argo injection (see the Watson notes).

William Maddox

Assignee

Comment 13

•

15 years ago

Attached patch Fail-fast for JIT when framesize too large for NanoJIT stack limit (obsolete) — Details — Splinter Review

Attachment #481611 - Flags: feedback?(edwsmith)

William Maddox

Assignee

Updated

•

15 years ago

Attachment #481611 - Attachment is patch: true

Edwin Smith

Comment 14

•

15 years ago

Comment on attachment 481611 [details] [diff] [review] Fail-fast for JIT when framesize too large for NanoJIT stack limit Looks exactly like what I'd imagined. Does it work? specifically: - makes the test app start fast(er) - jit doesn't fail any more often than before Nit: jitMustFail() probably should be a static method on CodegenLIR. Nit: calculation of jitMustFail could be after if ((core->isJITEnabled()...

Attachment #481611 - Flags: feedback?(edwsmith) → feedback+

William Maddox

Assignee

Comment 15

•

15 years ago

Attached patch Fail-fast for JIT when framesize too large for NanoJIT stack limit (v2) (obsolete) — Details — Splinter Review

Attachment #481611 - Attachment is obsolete: true

Attachment #481721 - Flags: review?(edwsmith)

Attachment #481721 - Flags: feedback?(rreitmai)

William Maddox

Assignee

Comment 16

•

15 years ago

Comment on attachment 481721 [details] [diff] [review] Fail-fast for JIT when framesize too large for NanoJIT stack limit (v2) While focusing on testing the assertion, I broke the non-debug case, and was too quick on the trigger to post the revised before retesting the non-debug case. New patch forthcoming.

Attachment #481721 - Flags: review?(edwsmith)

Attachment #481721 - Flags: feedback?(rreitmai)

William Maddox

Assignee

Comment 17

•

15 years ago

Attached patch Fail-fast for JIT when framesize too large for NanoJIT stack limit (v3) — Details — Splinter Review

Revised per Edwin's comments, with some infrastructure for testing agreement between fast-fail test and actual JIT failure. Assertion is OK on testcase es29.swf (-Dverifyonly), the cases that JIT fail in the top 100 AS3 swfs (by ABC size), and an artificial test case with lots of local variables showing exactly the point at which framesize-related failure occurs. It turns out the metric used is rather conservative (K = 0), but it should be safe and robust and is adequate to the purpose at hand. In fp10-salt with this fix, es29.swf initializes in ~5s.

Attachment #481721 - Attachment is obsolete: true

Attachment #481741 - Flags: review?(edwsmith)

William Maddox

Assignee

Updated

•

15 years ago

Attachment #481741 - Flags: review?(rreitmai)

William Maddox

Assignee

Comment 18

•

15 years ago

Attached file An AS3 function with just enough local variables to exceed the JIT stack frame size limit — Details

William Maddox

Assignee

Updated

•

15 years ago

Attachment #481743 - Attachment mime type: application/octet-stream → text/plain

Edwin Smith

Comment 19

•

15 years ago

Comment on attachment 480995 [details] [diff] [review] Workaround for lengthy initialization methods (v2, with explanatory comment) (In reply to comment #17) > Assertion is OK on testcase > es29.swf (-Dverifyonly), the cases that JIT fail in the top 100 AS3 swfs (by > ABC size), and an artificial test case with lots of local variables showing > exactly the point at which framesize-related failure occurs. When I first read this, I thought you were saying "its okay for the assert to fire" (no its not). But I reread it and I think you mean "the assert doesn't fire", which is good. Functionally looks right. nit: I would have passed in a framesize or MethodSignaturep into jitCanFail() instead of Verifier& (in TR, CodegenLIR never references Verifier directly), but its fine, lets ship it.

Attachment #480995 - Flags: superreview?(edwsmith) → superreview+

Rick Reitmaier

Comment 20

•

15 years ago

Comment on attachment 481741 [details] [diff] [review] Fail-fast for JIT when framesize too large for NanoJIT stack limit (v3) Looks good.

Attachment #481741 - Flags: review?(rreitmai) → review+

Rick Reitmaier

Updated

•

15 years ago

Attachment #480995 - Flags: review?(rreitmai) → review+

William Maddox

Assignee

Updated

•

15 years ago

Attachment #481741 - Flags: superreview?(edwsmith)

Attachment #481741 - Flags: review?(edwsmith)

Attachment #481741 - Flags: review+

William Maddox

Assignee

Updated

•

15 years ago

Attachment #481741 - Flags: review?(rreitmai)

William Maddox

Assignee

Comment 21

•

15 years ago

Comment on attachment 480995 [details] [diff] [review] Workaround for lengthy initialization methods (v2, with explanatory comment) This patch is obsolete, though something similar is appropriate for TR.

Attachment #480995 - Attachment is obsolete: true

Edwin Smith

Updated

•

15 years ago

Attachment #481741 - Flags: superreview?(edwsmith) → superreview+

Rick Reitmaier

Updated

•

15 years ago

Attachment #481741 - Flags: review?(rreitmai) → review+

Rick Reitmaier

Comment 22

•

15 years ago

submit to salt cl 729458

Whiteboard: WE:2700709 → WE:2700709,fixed-in-salt

William Maddox

Assignee

Comment 23

•

15 years ago

Attached patch Fail-fast for JIT when framesize too large for NanoJIT stack limit (for TR) — Details — Splinter Review

This patch adds fast-fail logic to TR. The semantics follow the following patch to Salt: https://bugzilla.mozilla.org/attachment.cgi?id=484923&action=edit Specifically, we normally fall back to the interpreter for large frame sizes, but -Djitordie is unaffected, allowing the JIT to execute and possibly fail.

Attachment #486502 - Flags: superreview?(edwsmith)

Attachment #486502 - Flags: review?(rreitmai)

William Maddox

Assignee

Comment 24

•

15 years ago

The underlying issue reported in WE:2700709 is present in TR since the Argo release, thus this fix should be applied to all subsequent TR branches to date. (excepting Salt, per comment #22)

Target Milestone: flash10.1.x-Salt → flash10.2.x-Spicy

Rick Reitmaier

Comment 25

•

15 years ago

Comment on attachment 486502 [details] [diff] [review] Fail-fast for JIT when framesize too large for NanoJIT stack limit (for TR) R+ since looks fine; couple of nits. As its now grown quite complex, splitting the shouldJIT() logic across multiple lines might make parsing easier. Also outside the scope of this bug but handy nonetheless, having a more verbose description of exec policy might be helpful, e.g. interp due 'to large method' , interp due 'static init method'

Attachment #486502 - Flags: review?(rreitmai) → review+

Edwin Smith

Comment 26

•

15 years ago

Comment on attachment 486502 [details] [diff] [review] Fail-fast for JIT when framesize too large for NanoJIT stack limit (for TR) Code looks fine. How is test coverage looking? This has enough subtlety to demand 100% branch coverage of shouldJit(), I think.

Attachment #486502 - Flags: superreview?(edwsmith) → superreview+

William Maddox

Assignee

Comment 27

•

15 years ago

http://hg.mozilla.org/tamarin-redux/rev/aa4fb24920d7

Andre Kruetzfeldt

Updated

•

15 years ago

Target Milestone: flash10.2.x-Spicy → flash10.x-Wasabi

William Maddox

Assignee

Comment 28

•

15 years ago

Possible future work item for improving vartracker performance moved to bug 631303 . Closing this bug.

Status: ASSIGNED → RESOLVED

Closed: 15 years ago

Resolution: --- → FIXED

Whiteboard: WE:2700709,fixed-in-salt → WE:2700709,fixed-in-salt,fixed-in-spicy,fixed-in-spicier

Chris Peyer

Comment 29

•

14 years ago

Fix submitted to salt with P4 CL: 729458, Verified fixed by player qe with player 10.1.102.57

Status: RESOLVED → VERIFIED

You need to log in before you can comment on or make changes to this bug.