Benchmark xpi unzip-modify-zip

RESOLVED FIXED in 5.11.1

Status

Type: defect
Priority: P1
Severity: normal
Status: RESOLVED FIXED
Opened: 9 years ago
Last modified: 3 years ago

People

(Reporter: jbalogh, Assigned: basta)

Tracking

Details

(Whiteboard: [z][qa-])

Attachments

(1 attachment)

Reporter

Description

9 years ago
We want to unzip an add-on, change the update URL, and zip it back up. Please benchmark some functions that perform variations on this theme. Make sure you get add-ons of various sizes. I can help you find candidates if you need it.

1. Starting from an unzipped version
2. Starting zipped with install.rdf in some random place
3. Zipped with install.rdf at the end
4. ???

The timeit module will probably be useful here. http://docs.python.org/library/timeit.html
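
A minimal sketch of how such a benchmark harness might look with timeit, assuming a hypothetical repack(path) function that performs the unzip-modify-zip cycle (names here are illustrative, not the final script):

    import glob
    import timeit

    # Hypothetical: repack(path) unzips the add-on, rewrites the update URL,
    # and zips it back up; it stands in for whatever function is being benchmarked.
    from watermark import repack

    for path in glob.glob('addons/*.xpi'):  # add-ons of various sizes
        runs = 10
        total = timeit.timeit(lambda: repack(path), number=runs)
        print('%s: %.2f ms per run' % (path, total / runs * 1000.0))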
Assignee

Comment 1

9 years ago
I downloaded a couple thousand addons today. I'm going to write a script tomorrow to do the test.
Assignee

Comment 2

9 years ago
Dear friends,

I wrote a watermarking script today (in Python) and ran it on ~1000 addons. The whole process took 51 seconds, including disk read/write, unpacking, and repacking. All modification operations took place in-memory (no extraction to the disk or cleanup necessary).

Total Time: 51.433 seconds
Time per addon: 43.96 ms
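
For illustration, a minimal in-memory sketch of the approach described above, using only the standard zipfile and io modules; the placeholder token and function names are assumptions, and the actual attached script may differ:

    import io
    import zipfile

    def watermark(xpi_bytes, update_url):
        """Rewrite install.rdf inside an XPI entirely in memory; return new XPI bytes."""
        src = zipfile.ZipFile(io.BytesIO(xpi_bytes))
        out = io.BytesIO()
        dst = zipfile.ZipFile(out, 'w', zipfile.ZIP_DEFLATED)
        for info in src.infolist():
            data = src.read(info.filename)
            if info.filename == 'install.rdf':
                # Assumed placeholder token; the real substitution logic may differ.
                data = data.replace(b'__UPDATE_URL__', update_url.encode('utf-8'))
            dst.writestr(info, data)
        dst.close()
        src.close()
        return out.getvalue()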
Reporter

Comment 3

9 years ago
Can you attach the script here?
Assignee

Comment 4

9 years ago
Place this in a directory. In that directory, add another directory called "library". In library, create a directory called "output". Place a good thousand or so XPI and JAR files in the library directory and run this script.
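
The attachment itself isn't reproduced in this comment; the loop below is only a guess at its overall shape, assuming the layout described above (a library directory of XPI/JAR files, results written to library/output) and the watermark() helper sketched in comment 2:

    import os
    import time

    LIBRARY = 'library'
    OUTPUT = os.path.join(LIBRARY, 'output')
    NEW_URL = 'https://example.com/update.rdf'  # hypothetical update URL

    start = time.time()
    count = 0
    for name in os.listdir(LIBRARY):
        if not name.lower().endswith(('.xpi', '.jar')):
            continue
        with open(os.path.join(LIBRARY, name), 'rb') as f:
            repacked = watermark(f.read(), NEW_URL)
        with open(os.path.join(OUTPUT, name), 'wb') as f:
            f.write(repacked)
        count += 1
    elapsed = time.time() - start
    print('Total Time: %.3f seconds' % elapsed)
    print('Time per addon: %.2f ms' % (elapsed * 1000.0 / count))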
Comment 5

9 years ago
This is going to be happening over NFS, which is far slower than a local disk. Regardless, it sounds like this bug is done?
Comment 6

9 years ago
At our current rate, a little over 1000 add-ons are downloaded each minute. If we expect a maximum of 30% of those to need watermarking (300 per minute), are the above numbers feasible with NFS?
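
As a back-of-the-envelope check against the timing in comment 2 (ignoring NFS latency, which is the open question here):

    300 add-ons/minute x ~44 ms/add-on ~= 13.2 s of processing per minute,
    i.e. roughly 22% of a single core.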
Assignee

Comment 7

9 years ago
Disk latency should not be an issue if the "watermarker" is placed between the file and the user. If the user requests the addon from the watermarker, the script reads the addon entirely into memory before it is modified. It never writes temporary files to the disk. The output in the example is written to the disk for debug purposes; it could very easily be piped back out to the remote user as a file, thereby eliminating an extra write step. Unless there is some sort of caching going on that requires the addon to be a file (or something of that nature), the disk *shouldn't* cause any overhead.
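
A rough WSGI-style sketch of that arrangement, assuming the watermark() helper from comment 2; this is illustrative only, not the deployment actually proposed:

    import os

    LIBRARY = 'library'
    NEW_URL = 'https://example.com/update.rdf'  # hypothetical update URL

    def application(environ, start_response):
        # The add-on is still read from disk/NFS, but nothing is written back:
        # the modified XPI is streamed straight to the user from memory.
        name = os.path.basename(environ.get('PATH_INFO', ''))
        with open(os.path.join(LIBRARY, name), 'rb') as f:
            body = watermark(f.read(), NEW_URL)
        start_response('200 OK', [
            ('Content-Type', 'application/x-xpinstall'),
            ('Content-Length', str(len(body))),
        ])
        return [body]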
Comment 8

9 years ago
(In reply to comment #7)
> Disk latency should not be an issue if the "watermarker" is placed between the
> file and the user. If the user requests the addon from the watermarker, the
> script reads the addon entirely into memory before it is modified. It never
> writes temporary files to the disk.
It's still reading from NFS, which is slow.

Despite this being a relatively fast operation for our current load (do we have predictions for growth?), it's still only half the equation: checking for updates for all of these unique (and uncached) URLs is hard to scale and, with the solution I put in the spec, unnecessary. I'm not sure why we're pursuing what feels like a dirty and overly complex method.
Comment 9

9 years ago
Are the benchmarks done? Can we close this?
Assignee

Comment 10

9 years ago
1 and 3 would be incredibly time-consuming to set up (they'd require a few scripts to modify all of the XPI files I have), so I don't intend to pursue them. If someone specifically requests something, feel free to let me know; otherwise, I'm not chewing on this any more.
Reporter

Comment 11

9 years ago
Thanks Matt.
Status: NEW → RESOLVED
Last Resolved: 9 years ago
Resolution: --- → FIXED
Whiteboard: [z] → [z][qa-]
Product: addons.mozilla.org → addons.mozilla.org Graveyard