Bug 697116 Opened 8 years ago Closed 8 years ago

performance tests: all tests should target > 2 sig figs on slow devices

Categories

(Tamarin Graveyard :: Tools, defect)

x86
macOS
defect
Not set

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: pnkfelix, Assigned: pnkfelix)

Details

Attachments

(1 file, 1 obsolete file)

While working on Bug 688486, I am seeing entries like this in the report:

                                        avm            avm2
test                           best     avg    best     avg   %dBst   %dAvg
Metric: iterations/second 

  array-sort-1                  4.4     4.4     4.4     4.4     0.2     0.1   
  array-sort-2                  0.4     0.4     0.4     0.4    -1.2    -1.4 - 
  array-sort-3                  3.5     3.5     3.5     3.4    -1.0    -1.2   
  array-sort-4                  1.3     1.3     1.3     1.3     0.3     0.2   

  bytearray-read-utfbytes-2     0.4     0.4     0.4     0.4     4.1     5.6 + 
  bytearray-write-bool-1        2.3     2.3     2.3     2.3    -0.1    -0.4   
  bytearray-write-byte-1        2.5     2.5     2.5     2.5    -0.1    -0.1   

The cases with 0.4 have only one sig fig.  The others have two sig figs, but are still small values whose entries in the right-hand %dBst/%dAvg columns look more significant to me than the underlying measurements warrant.

The simplest fix is probably to revise the tests themselves to iterate fewer times in the inner loop, so that the number of reported outer iterations increases to some value > 10 but hopefully < 100 (and thus the number of reported significant figures will approach 3, which hopefully will yield more useful transcripts).

(A workaround is to use .csv generation and look at the raw data there.)
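
Back-of-the-envelope illustration of why (all numbers made up): with the report fixed at one decimal place, shrinking the inner loop raises the reported iterations/second and with it the number of usable significant figures.

  ops_per_second = 400000                 # assumed raw throughput of the device

  for inner_loop_ops in (1000000, 100000, 10000):
      reported = float(ops_per_second) / inner_loop_ops
      print("inner loop of %7d ops -> reported %6.1f iterations/second"
            % (inner_loop_ops, reported))

  # inner loop of 1000000 ops -> reported    0.4 iterations/second  (1 sig fig)
  # inner loop of  100000 ops -> reported    4.0 iterations/second  (2 sig figs)
  # inner loop of   10000 ops -> reported   40.0 iterations/second  (3 sig figs)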
Assignee: nobody → fklockii
Another approach to this bug would be to change the script's presentation code so that if the magnitude is in (0, 10) it prints down to the hundredths place, and if it is < 1 it prints down to the thousandths place as well (sketched below).

That would require less messing about with the tests in the short (and long?) term.  

I don't know whether there are other benefits to one approach or another in terms of the quality of the reports.
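
Roughly what I have in mind for that presentation change (a hedged sketch; formatValue and its signature are hypothetical, not the actual names in the script):

  def formatValue(value, width=8):
      # Pick the number of fractional digits from the magnitude of the value.
      if value < 1:
          decimals = 3      # e.g. 0.407
      elif value < 10:
          decimals = 2      # e.g. 4.37
      else:
          decimals = 1      # e.g. 23.3
      return "%*.*f" % (width, decimals, value)

  # formatValue(0.40718) -> '   0.407'; formatValue(23.3412) -> '    23.3'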
With this patch, we get output that looks like this:

test                           best     avg    best     avg   %dBst   %dAvg
Metric: iterations/second 
Dir: asmicro/
  array-read-int-1             23.3    23.3    73.9    73.8   217.6   217.4 ++
  array-read-int-2             5.85    5.84    5.71    5.70    -2.4    -2.4 - 
  array-read-int-5             20.0    19.9    48.6    48.5   142.6   143.7 ++
  array-read-int-6             19.0    18.9    48.6    48.5   155.8   156.8 ++
  array-read-int-7             19.9    19.6    48.1    48.1   141.2   145.2 ++
  array-sort-1                 4.37    4.37    4.37    4.36     0.1    -0.3   
  array-sort-2                0.407   0.403   0.402   0.402    -1.0    -0.3 - 
  array-sort-3                 3.50    3.49    3.46    3.45    -1.0    -1.2 - 

I briefly looked into trying to make all the decimal points in the output line up, but that seemed like a lot more effort than I wanted to expend.  Hopefully the result is still pretty clear (I think it helps that the %dBst and %dAvg columns are not affected).
Attachment #569726 - Flags: review?(edwsmith)
(I could easily be convinced that the x < 1 case where we emit 3 decimal places is not useful and should be folded into the 2 decimal place case.)
Comment on attachment 569726 [details] [diff] [review]
patch D: increase decimal places based on magnitude

Do callsites pass a boolean or a number for sigFigs?  The logic itself seems fine.
Attachment #569726 - Flags: review?(edwsmith) → review+
(In reply to Edwin Smith from comment #4)
> Comment on attachment 569726 [details] [diff] [review]
> patch D: increase decimal places based on magnitude
> 
> Do callsites pass a boolean or number for sigFigs?  the logic itself seems
> fine.

Number.  Heh, I forgot this isn't Scheme; I guess a 0 might be interpreted as False?  (I don't think anyone passes 0, but it would be good to make this more robust.)

Do any Python experts have a suggestion for a good sentinel value to use here to represent "no value was passed"?  Maybe use None and do "sigFigs == None"?
(In reply to Felix S Klock II from comment #5)
> Any Python experts have a suggestion for a good Sentinel value is to use
> here to represent "no value was passed"?  Maybe use None and do "sigFigs ==
> None"?

Answer (I think): use "if sigFigs is None: ..."

See also:
  http://boodebr.org/main/python/tourist/none-empty-nothing
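
For what it's worth, a minimal sketch of that sentinel pattern (formatResult and the default of 3 are made up for illustration, not the harness's actual code):

  def formatResult(value, sigFigs=None):
      # Use None as the "not passed" sentinel so an explicit 0 is not
      # silently treated as the default.
      if sigFigs is None:
          sigFigs = 3
      return ("%." + str(sigFigs) + "g") % value

  # formatResult(0.40718) -> '0.407'; formatResult(23.3412, 4) -> '23.34'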
Updated to use None to mark default rather than False.  Minor additional code cleanup; this is what I'll land.
Attachment #569726 - Attachment is obsolete: true
changeset:   6693:4bd85ff304e1
user:        Felix S Klock II <fklockii@adobe.com>
date:        Wed Oct 26 11:02:00 2011 +0200
summary:     Bug 697116: perf tests: increase fractional decimals when rendering small numbers (r=edwsmith).

http://hg.mozilla.org/tamarin-redux/rev/4bd85ff304e1
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED