Closed Bug 617938 Opened 14 years ago Closed 14 years ago

Clean-up tasks in exactgc.as

Categories

(Tamarin Graveyard :: Tools, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: lhansen, Assigned: lhansen)

References

Details

Attachments

(1 file, 2 obsolete files)

In bug #604701 a few clean-up tasks were identified but left unresolved:

- Break out the regular expressions as named strings that are then joined
  to create the regular expressions used in the various searches, as the
  in-line regular expressions are hard to read and maintain (they're duplicated
  an un-named)

- It wouldn't be very hard to support classes that derive from templated
  classes with a syntax like "GC_CPP_EXACT_T(myclass, baseclass, parameter)";
  at this time, tracers for this case must be hand-written (eg VectorObject).

- It wouldn't be very hard to support generation of tracers for in-line
  structures, like LookupCache in MethodEnv.h, with a structure along the
  lines of "GC_CPP_STRUCT(name)".

- We should pay attention to the performance of exactgc.as because it will run
  on many, many files (large input volume) in the Flash Player.
Depends on: 619922
Depends on: 619931
Attached patch Some performance tweaks (obsolete) — Splinter Review
These simple tweaks - compiling with -AS3 and adding a quick test in the main matching loop - improve script performance by 20% on my MacPro.  (I've not tried to do any serious profiling to see what other opportunities there are, but I figure these fixes are worthwhile and cheap in any case.)
Assignee: nobody → lhansen
Status: NEW → ASSIGNED
Attachment #501681 - Flags: review?(fklockii)
Attached patch Some performance tweaks, v2 (obsolete) — Splinter Review
Simple profiling found that the function that splits the text from the file into lines was taking most of the time.  A small amount of work fixes that.  Cumulatively we're down from approx 0.76s to 0.18s (I love optimizing unoptimized code) for the files in the core/ directory.

Also included here is the simple function-profiling framework I used to find this.
Attachment #501681 - Attachment is obsolete: true
Attachment #501689 - Flags: review?(fklockii)
Attachment #501681 - Flags: review?(fklockii)
Comment on attachment 501689 [details] [diff] [review]
Some performance tweaks, v2

...aaand of course then there's no need to use regular expressions for the splitting, and fixing that drops the time in that function by another factor of four.  Another patch is coming up.
Attachment #501689 - Flags: review?(fklockii)
In addition to the further improvement to the readLines function I've factored out and named regular expressions (to avoid recompilation) and introduced slightly smarter filtering of significant lines in the main matching loop.  Running time is down from approx 750ms to approx 95ms on the MacPro; this is probably as far as I'm going to take it without further evidence that it's too slow in practical use.  Even with 100 times as much data the running time would probably be OK, given what compilation times are going to be like.

(In my test the script processes 156 files, 43687 lines, and 1651928 characters; 645 lines are not filtered by the initial fast check.)
Attachment #501689 - Attachment is obsolete: true
Attachment #501707 - Flags: review?(fklockii)
Whiteboard: has-patch
Comment on attachment 501707 [details] [diff] [review]
Some performance tweaks, v3

+// Using single regular expressions in place of the disjunction gave us
+// a factor of five

That is a little surprising to me ... is it a sign of a separate performance problem that should be addressed within the regexp implementation?  Or is there something fundamental that I'm missing; e.g., some costly feature that a disjunctive as3 regexp provides and cannot be disabled that your specialized code gets to skip?
Attachment #501707 - Flags: review?(fklockii) → review+
(In reply to comment #4)
> Running time is down from approx 750ms to approx 95ms on the MacPro; this is
> probably as far as I'm going to take it without further evidence that it's too
> slow in practical use.  Even with 100 times as much data the running time
> would probably be OK, given what compilation times are going to be like.

And if we need to do even better: The next step would be to cache the information resulting from scanning a file (along with the mtime of the file), since file scanning still accounts for about 90% of the running time.
changeset: 5749:0516b5f274d0
user:      Lars T Hansen <lhansen@adobe.com>
summary:   For 617938 - Clean-up tasks in exactgc.as: optimize the script execution time (r=fklockii)

http://hg.mozilla.org/tamarin-redux/rev/0516b5f274d0
Whiteboard: has-patch
(In reply to comment #0)
> 
> - It wouldn't be very hard to support generation of tracers for in-line
>   structures, like LookupCache in MethodEnv.h, with a structure along the
>   lines of "GC_CPP_STRUCT(name)".

Moved to bug #624402.
(In reply to comment #0)
> 
> - It wouldn't be very hard to support classes that derive from templated
>   classes with a syntax like "GC_CPP_EXACT_T(myclass, baseclass, parameter)";
>   at this time, tracers for this case must be hand-written (eg VectorObject).

Moved to bug #624403.
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: