Closed Bug 569996 Opened 14 years ago Closed 14 years ago

Benchmark xpi unzip-modify-zip

Categories

(addons.mozilla.org Graveyard :: Maintenance Scripts, defect, P1)

defect

Tracking

(Not tracked)

RESOLVED FIXED
5.11.1

People

(Reporter: jbalogh, Assigned: basta)

Details

(Whiteboard: [z][qa-])

Attachments

(1 file)

We want to unzip an add-on, change the update URL, and zip it back up.  Please benchmark some functions that perform variations on this theme.  Make sure you get add-ons of various sizes.  I can help you find candidates if you need it.

1. Starting from an unzipped version
2. Starting zipped with install.rdf in some random place
3. Zipped with install.rdf at the end
4. ???

The timeit module will probably be useful here. http://docs.python.org/library/timeit.html
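
For illustration only (this is not part of the bug, and not the eventual benchmark script), here is a minimal sketch of timing variation 1 above with timeit: the add-on is already unzipped on disk, install.rdf is rewritten, and the directory is zipped back up. The directory name, output name, update URL, and the __UPDATE_URL__ token are all placeholders.

import os
import timeit
import zipfile

SRC_DIR = "sample_addon"     # placeholder: an already-unzipped add-on
OUT_XPI = "sample_addon.xpi"
NEW_URL = "https://versioncheck.example.com/update.rdf"  # placeholder URL

def repack_unzipped():
    # Read install.rdf and swap in the new update URL.
    with open(os.path.join(SRC_DIR, "install.rdf"), "rb") as fh:
        rdf = fh.read()
    # Naive stand-in for the real edit; __UPDATE_URL__ is a made-up token.
    rdf = rdf.replace(b"__UPDATE_URL__", NEW_URL.encode("utf-8"))
    # Zip the directory back up, substituting the modified install.rdf.
    with zipfile.ZipFile(OUT_XPI, "w", zipfile.ZIP_DEFLATED) as dst:
        for root, dirs, files in os.walk(SRC_DIR):
            for name in files:
                full = os.path.join(root, name)
                arcname = os.path.relpath(full, SRC_DIR)
                if arcname == "install.rdf":
                    dst.writestr(arcname, rdf)
                else:
                    dst.write(full, arcname)

if __name__ == "__main__":
    runs = 100
    total = timeit.timeit(repack_unzipped, number=runs)
    print("%.2f ms per repack" % (total / runs * 1000))

Variations 2 and 3 would time the same edit starting from a zipped XPI instead, which is roughly what the script discussed below measures.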
I downloaded a couple thousand addons today. I'm going to write a script tomorrow to do the test.
Dear friends,

I wrote a watermarking script today (in Python) and ran it on ~1000 addons. The whole process took 51 seconds, including disk read/write, unpacking, and repacking. All modification operations took place in memory (no extraction to disk or cleanup necessary).

Total Time: 51.433 seconds
Time per addon: 43.96 ms
Can you attach the script here?
Place this in a directory. In that directory, add another directory called "library". In library, create a directory called "output". Place a good thousand or so XPI and JAR files in the library directory and run this script.
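
The attachment itself isn't reproduced on this page, so for illustration here is a rough sketch of the general shape described: walk library/, rewrite install.rdf entirely in memory, write the result to library/output/, and report total and per-addon times. The update URL and the regex-based RDF edit are assumptions, not the attached script's actual logic.

import io
import os
import re
import time
import zipfile

LIBRARY = "library"
OUTPUT = os.path.join(LIBRARY, "output")
NEW_URL = b"https://versioncheck.example.com/update.rdf"  # placeholder URL

def watermark(path):
    # Rebuild the archive in memory, swapping out install.rdf as we go.
    out = io.BytesIO()
    with zipfile.ZipFile(path) as src, \
         zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            data = src.read(info.filename)
            if info.filename == "install.rdf":
                # Naive stand-in for whatever RDF edit the real script performs.
                data = re.sub(br"<em:updateURL>[^<]*</em:updateURL>",
                              b"<em:updateURL>" + NEW_URL + b"</em:updateURL>",
                              data)
            dst.writestr(info, data)
    return out.getvalue()

if __name__ == "__main__":
    names = [n for n in os.listdir(LIBRARY)
             if n.lower().endswith((".xpi", ".jar"))]
    start = time.time()
    for name in names:
        with open(os.path.join(OUTPUT, name), "wb") as fh:
            fh.write(watermark(os.path.join(LIBRARY, name)))
    elapsed = time.time() - start
    print("Total Time: %.3f seconds" % elapsed)
    print("Time per addon: %.2f ms" % (elapsed / len(names) * 1000))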
This is going to be happening over NFS, which is far slower than a local disk.  Regardless, though, it sounds like this bug is done?
At our current rate, a little over 1000 add-ons are downloaded each minute. If we expect a maximum of 30% of those to need watermarking (300 per minute), are the above numbers feasible with NFS?
Disk latency should not be an issue if the "watermarker" is placed between the file and the user. If the user requests the addon from the watermarker, the script reads the addon entirely into memory before it is modified. It never writes temporary files to the disk. The output in the example is written to the disk for debug purposes; it could very easily be piped back out to the remote user as a file, thereby eliminating an extra write step. Unless there is some sort of caching going on that requires the addon to be a file (or something of that nature), the disk *shouldn't* cause any overhead.
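
To make the "watermarker between the file and the user" idea concrete, here is a hedged sketch (not anything attached to this bug) of the same in-memory rewrite wrapped in a tiny WSGI handler: the XPI is read once from disk (e.g. over NFS), rewritten in memory, and streamed straight back in the response, with no temporary files. The paths, URL, and RDF edit are placeholder choices.

import io
import os
import re
import zipfile
from wsgiref.simple_server import make_server

ADDON_DIR = "library"                                      # placeholder
NEW_URL = b"https://versioncheck.example.com/update.rdf"   # placeholder URL

def watermark_bytes(xpi_bytes):
    # Bytes in, bytes out: nothing here ever touches the disk.
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(xpi_bytes)) as src, \
         zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            data = src.read(info.filename)
            if info.filename == "install.rdf":
                # Naive stand-in for the real update-URL edit.
                data = re.sub(br"<em:updateURL>[^<]*</em:updateURL>",
                              b"<em:updateURL>" + NEW_URL + b"</em:updateURL>",
                              data)
            dst.writestr(info, data)
    return out.getvalue()

def app(environ, start_response):
    # The only disk (or NFS) I/O is this single read of the original file.
    name = os.path.basename(environ.get("PATH_INFO", "").lstrip("/"))
    with open(os.path.join(ADDON_DIR, name), "rb") as fh:
        body = watermark_bytes(fh.read())
    start_response("200 OK",
                   [("Content-Type", "application/x-xpinstall"),
                    ("Content-Length", str(len(body)))])
    return [body]

if __name__ == "__main__":
    make_server("localhost", 8000, app).serve_forever()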
(In reply to comment #7)
> Disk latency should not be an issue if the "watermarker" is placed between the
> file and the user. If the user requests the addon from the watermarker, the
> script reads the addon entirely into memory before it is modified. It never
> writes temporary files to the disk.
It's still reading from NFS, which is slow.

Despite this being a relatively fast operation for our current load (do we have predictions for growth?), it's still only half the equation: checking for updates for all these unique (and uncached) URLs is hard to scale and, with the solution I put in the spec, unnecessary. I'm not sure why we're pursuing what feels like a dirty and overly complex method.
Are benchmarks done?  Can we close this?
1 and 3 would be incredibly time-consuming to set up (they'd require a few scripts to modify all of the XPI files I have). I don't intend to pursue them. If someone specifically requests something, feel free to let me know; otherwise, I'm not chewing on this any more.
Thanks Matt.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Whiteboard: [z] → [z][qa-]
Product: addons.mozilla.org → addons.mozilla.org Graveyard