Closed Bug 747488 Opened 13 years ago Closed 7 years ago

lzma performance testing

Tracking

(Not tracked)

Status:

RESOLVED WONTFIX

People

(Reporter: pnkfelix, Assigned: dschaffe)

References

Details

Attachments

(4 files)

AS3 shell code for generating random bytes (13 years ago, Felix S. Klock II [:pnkfelix, :fklock]; 1.15 KB, text/plain)
tar.gz for ad-hoc performance test suite (13 years ago, Felix S. Klock II [:pnkfelix, :fklock]; 1.12 MB, application/octet-stream)
lzma and zlib performance test cases (13 years ago, Ingo Richter; 95.94 KB, patch)
simplify performance testcases (13 years ago, Dan Schaffer; 8.13 KB, patch; pnkfelix: review+)

Felix S. Klock II [:pnkfelix, :fklock]

Reporter

Description

13 years ago

We need performance tests for lzma. In particular we need to double-check that the revised implementation I have suggested (patch S on Bug 729336) is competitive with the original implementation.

Felix S. Klock II [:pnkfelix, :fklock]

Reporter

Updated

13 years ago

Blocks: 729336

Felix S. Klock II [:pnkfelix, :fklock]

Reporter

Comment 1

13 years ago

irichter reports that there is a 2x slow-down when using a large .zip file as input to compress. I suspect this is as-designed for the implementation technique I used (since it goes through 2 passes rather than 1 pass for non-compressible input); I am working on confirming that claim now.
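
One way to get a feel for a report like this outside of Tamarin is to time a compressor on compressible and incompressible input of the same size. The sketch below uses Python's standard `lzma` module, not the Tamarin implementation, so it cannot confirm or refute the 2-pass explanation; it only shows the shape of the measurement. The names `time_compress` and `compare` are hypothetical, and `os.urandom` is a stand-in for a large .zip file.

```python
import lzma
import os
import time


def time_compress(data: bytes, repeats: int = 3) -> float:
    """Best-of-N wall-clock seconds for one lzma.compress call."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        lzma.compress(data)
        best = min(best, time.perf_counter() - start)
    return best


def compare(size: int = 1 << 20) -> dict:
    """Time compression of repetitive input and of random input."""
    compressible = b"tamarin " * (size // 8)   # highly repetitive text
    incompressible = os.urandom(size)          # assumption: behaves like .zip data
    return {
        "compressible": time_compress(compressible),
        "incompressible": time_compress(incompressible),
    }
```

Calling `compare()` returns the two timings in seconds. Taking the best of several repeats reduces noise from other processes, which matters when the effect being looked for is only a factor of two.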

Felix S. Klock II [:pnkfelix, :fklock]

Reporter

Comment 2

13 years ago

Attached file: AS3 shell code for generating random bytes

This is a helper script I hacked together. Its purpose is to generate random inputs of arbitrary length, allowing one to turn a knob to vary the amount of variability in the generated byte sequence.

You feed it three arguments: the output file name, the size of the domain to draw from for each byte (in range [1,256]), and the number of bytes to emit. (Choosing a smaller domain for the second argument means that the result is likely to be more compressible, especially as the length gets large.)

As an aesthetic nicety, the domain starts at byte 48 (== ASCII '0') and goes up, wrapping around once you request a domain larger than 208 elements. This is just to make it easy to constrain the generated sequences to just decimal digits, or an interesting and printable subset of ASCII codes.

Example runs:

% avmshell bigrand.abc -- /dev/stdout 1 20 ; echo
00000000000000000000
% avmshell bigrand.abc -- /dev/stdout 2 20 ; echo
10001001011010001000
% avmshell bigrand.abc -- /dev/stdout 2 20 ; echo
00110101100110010010
% avmshell bigrand.abc -- /dev/stdout 10 20 ; echo
24883493077522345450
% avmshell bigrand.abc -- /dev/stdout 78 20 ; echo
EUP4Tm4G`?qoQDEgBhDP
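
For readers without the attachment at hand, the generator described above can be approximated in a few lines of Python. This is a sketch under assumptions, not a port of the AS3 script: the function name `bigrand_bytes`, the `seed` parameter, and the use of Python's `random.Random` are all illustrative, and the bytes it produces will not match the example runs. Only the domain rule (start at byte 48, wrap modulo 256) is taken from the comment.

```python
import random

BASE = 48  # ASCII '0', the starting byte named in the comment above


def bigrand_bytes(domain_size: int, length: int, seed: int = 0) -> bytes:
    """Return `length` pseudo-random bytes drawn from `domain_size` values.

    Values start at BASE and wrap modulo 256, so a domain larger than
    208 elements wraps around to byte 0.
    """
    if not 1 <= domain_size <= 256:
        raise ValueError("domain_size must be in [1, 256]")
    rng = random.Random(seed)  # assumption: the AS3 script uses its own RNG
    return bytes((BASE + rng.randrange(domain_size)) % 256 for _ in range(length))
```

With `domain_size=1` every byte is `'0'`, with 2 the output is a string of `'0'` and `'1'`, and with 10 it is decimal digits, which mirrors the pattern in the example runs.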

Assignee: nobody → fklockii

Felix S. Klock II [:pnkfelix, :fklock]

Reporter

Comment 3

13 years ago

Or perhaps a better illustration of the end point being made via the utility (in terms of how compressibility varies with the domain size):

% FILE=twomill.txt DOM=2 LEN=1000000 ; rm -f $FILE $FILE.gz && avmshell bigrand.abc -- $FILE $DOM $LEN && gzip -c $FILE > $FILE.gz && ls -l $FILE $FILE.gz
-rw-r--r-- 1 fklockii staff 1000000 Apr 25 15:37 twomill.txt
-rw-r--r-- 1 fklockii staff 159016 Apr 25 15:37 twomill.txt.gz

% FILE=tenmill.txt DOM=10 LEN=1000000 ; rm -f $FILE $FILE.gz && avmshell bigrand.abc -- $FILE $DOM $LEN && gzip -c $FILE > $FILE.gz && ls -l $FILE $FILE.gz
-rw-r--r-- 1 fklockii staff 1000000 Apr 25 15:37 tenmill.txt
-rw-r--r-- 1 fklockii staff 470625 Apr 25 15:37 tenmill.txt.gz

% FILE=maxmill.txt DOM=256 LEN=1000000 ; rm -f $FILE $FILE.gz && avmshell bigrand.abc -- $FILE $DOM $LEN && gzip -c $FILE > $FILE.gz && ls -l $FILE $FILE.gz
-rw-r--r-- 1 fklockii staff 1000000 Apr 25 15:38 maxmill.txt
-rw-r--r-- 1 fklockii staff 1000185 Apr 25 15:38 maxmill.txt.gz
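
The same trend can be reproduced without avmshell. The following is a rough Python sketch, assuming a stand-in generator (`random_bytes`, hypothetical, using Python's own RNG rather than the AS3 script) and the standard `gzip` module; the exact ratios will differ from the `ls -l` figures above, but the ordering across domain sizes should be the same.

```python
import gzip
import random


def random_bytes(domain_size: int, length: int, seed: int = 0) -> bytes:
    """Stand-in generator: bytes from a domain starting at ASCII '0'."""
    rng = random.Random(seed)
    return bytes((48 + rng.randrange(domain_size)) % 256 for _ in range(length))


def gzip_ratio(data: bytes) -> float:
    """Compressed size divided by original size."""
    return len(gzip.compress(data)) / len(data)


def ratios_by_domain(domains=(2, 10, 256), length=100_000) -> dict:
    """Map each domain size to the gzip ratio of random data drawn from it."""
    return {d: gzip_ratio(random_bytes(d, length)) for d in domains}


for domain, ratio in ratios_by_domain().items():
    print(f"domain={domain:3d}  compressed/original={ratio:.3f}")
```

A smaller domain carries fewer bits of entropy per byte, so the ratio grows with the domain size, and at 256 the data is effectively incompressible and the output ends up slightly larger than the input.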

Felix S. Klock II [:pnkfelix, :fklock]

Reporter

Comment 4

13 years ago

Attached file: tar.gz for ad-hoc performance test suite

This is the ad-hoc suite I e-mailed to Ingo on April 23rd. (I had hoped to take the time to clean this up before posting it here. But at this point it is easiest to just post it here.)

Felix S. Klock II [:pnkfelix, :fklock]

Reporter

Comment 5

13 years ago

FYI Some additional background and dialogue is available to Adobe internally here: https://zerowing.corp.adobe.com/x/34LjJw

Ingo Richter

Comment 6

13 years ago

Attached patch: lzma and zlib performance test cases

Thanks for your feedback Felix. I used your implementation of the seeded random number generator to get rid of all test files. For the text files I used a different approach: I put some sample text into the bytearray-test-helper.as and generate text input from this sample. I hope that the generated text data will be as realistic as possible in terms of word distribution and that the lzma implementation shows a similar behavior when it comes to this artificial test data.
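
The idea of generating text input from a sample can be illustrated roughly as follows. This is not the code from bytearray-test-helper.as, which is not shown in this thread; `SAMPLE`, `generate_text`, and the seeding scheme are hypothetical, and drawing whole words with replacement is just one simple way to keep the sample's word frequencies.

```python
import random

SAMPLE = (  # hypothetical sample text; the patch's actual sample is not shown
    "the quick brown fox jumps over the lazy dog and the dog "
    "barks at the fox while the fox runs over the hill"
)


def generate_text(sample: str, n_bytes: int, seed: int = 0) -> bytes:
    """Build exactly n_bytes of text by drawing words from `sample`.

    Words are drawn with replacement from the full word list (duplicates
    included), so frequent words in the sample stay frequent in the output.
    """
    words = sample.split()
    rng = random.Random(seed)  # fixed seed keeps test input reproducible
    parts = []
    size = 0
    while size < n_bytes:
        word = rng.choice(words)
        parts.append(word)
        size += len(word) + 1  # +1 for the separating space
    # The trailing space makes the joined length equal `size`, so the
    # slice always yields exactly n_bytes (the last word may be cut short).
    return (" ".join(parts) + " ").encode("ascii")[:n_bytes]
```

Because the generator is seeded, the same input can be regenerated on every run instead of being checked in as a test file.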

Dan Schaffer

Assignee

Comment 7

13 years ago

Attached patch: simplify performance testcases

Attachment #630949 - Flags: review?(fklockii)

Felix S. Klock II [:pnkfelix, :fklock]

Reporter

Comment 8

13 years ago

Comment on attachment 630949: simplify performance testcases

Review of attachment 630949:

In its current form, the output from the code doesn't match the format of our other performance tests (because it has invented a completely new set of metric names to tag the output from the tests), and so I'm not sure what value we get from it.

I guess we might still be able to do run-to-run comparisons of different builds of the shell. So hey, if this is the way QE thinks it can support this, that is fine. That's the main reason I am R+'ing this.

(Maybe the reality is that our performance test infrastructure needs revision anyway, and as part of that work, we should support a more flexible encoding of performance tests than one-file : one-individual-test, e.g. perhaps support the many-tests-in-one-file layout that this code is clearly calling out for.)

Attachment #630949 - Flags: review?(fklockii) → review+

Tamarin Bot

Comment 9

13 years ago

changeset: 7418:d582d61e66fc
user: Brent Baker <brbaker@adobe.com>
summary: Bug 747488: performance test cases/media for LZMA feature (p=ingo.richter)
http://hg.mozilla.org/tamarin-redux/rev/d582d61e66fc

Felix S. Klock II [:pnkfelix, :fklock]

Reporter

Updated

13 years ago

Status: NEW → RESOLVED

Closed: 13 years ago

Resolution: --- → FIXED

Felix S. Klock II [:pnkfelix, :fklock]

Reporter

Comment 10

13 years ago

(wait; I'll wait until attachment 630949 is pushed before I close this. sorry.)

Status: RESOLVED → REOPENED

Resolution: FIXED → ---

Felix S. Klock II [:pnkfelix, :fklock]

Reporter

Comment 11

13 years ago

reassigning to Dan, so he can decide whether he wants to land attachment 630949 or just close as is.

Assignee: fklockii → dschaffe

Sylvestre Ledru [:Sylvestre]

Comment 12

7 years ago

Tamarin isn't maintained anymore. WONTFIX remaining bugs.

Status: REOPENED → RESOLVED

Closed: 13 years ago → 7 years ago

Resolution: --- → WONTFIX

Categories

(Tamarin Graveyard :: Library, defect)
