Closed Bug 1155451 Opened 9 years ago Closed 9 years ago

Treeherder log parser blocks on completed download before parsing

Categories

(Tree Management :: Treeherder: Data Ingestion, defect)

x86_64
Linux
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: wlach, Assigned: wlach)

Details

Attachments

(2 files)

Currently the Treeherder log parser waits until it has downloaded the full log before decompressing and processing it. We could potentially make it slightly faster by starting to parse the log before the download is complete.

I got curious about how much this could help us, so I wrote something up. Unfortunately, my benchmarking suggests it's not particularly helpful (it usually shaves off between 0.1 and 0.4 seconds), but perhaps it's worth adding anyway.
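
For readers who want to see the idea in miniature, here is a sketch of incremental gzip decompression using Python's standard zlib module. It is not the code from the attached PR. The function name and the chunk size are made up for illustration, and it assumes a single-member gzip stream. Lines are handed out as soon as enough compressed bytes have arrived, instead of after the whole file is in memory.

```python
import gzip
import zlib


def iter_lines_from_gzip_chunks(chunks):
    """Yield text lines from an iterable of gzip-compressed byte chunks.

    Hypothetical helper for illustration; assumes one gzip member.
    """
    # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decompressor = zlib.decompressobj(16 + zlib.MAX_WBITS)
    pending = b""
    for chunk in chunks:
        pending += decompressor.decompress(chunk)
        # Everything before the last newline is a complete line;
        # the remainder is kept until more data arrives.
        *complete, pending = pending.split(b"\n")
        for line in complete:
            yield line.decode("utf-8", errors="replace")
    # Drain whatever the decompressor is still holding.
    pending += decompressor.flush()
    if pending:
        tail = pending.split(b"\n")
        if tail[-1] == b"":
            tail.pop()
        for line in tail:
            yield line.decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Simulate a download by slicing a compressed log into small chunks.
    log = b"".join(b"line %d\n" % i for i in range(1000))
    compressed = gzip.compress(log)
    chunks = (compressed[i:i + 64] for i in range(0, len(compressed), 64))
    lines = list(iter_lines_from_gzip_chunks(chunks))
    print(len(lines), lines[0], lines[-1])
```

In a real download the chunks would come from the HTTP response as it streams in, so decompression and parsing overlap with the network wait.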
Attached file Benchmark script
On my workstation, I get this set of 10 results on a largish log without the "optimization":

(eideticker)wlach@eideticker:~/src/treeherder-service$ python t.py http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-win64/1429215986/mozilla-central-win64-bm82-build1-build170.txt.gz 
[0.8731870651245117, 0.9529199600219727, 0.8774969577789307, 0.9509341716766357, 0.8769950866699219, 0.9473130702972412, 0.863955020904541, 0.9525530338287354, 1.4248991012573242, 1.0506041049957275]
0.977085757256

And this set of results with the optimization:

(eideticker)wlach@eideticker:~/src/treeherder-service$ python t.py http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-win64/1429215986/mozilla-central-win64-bm82-build1-build170.txt.gz 
[1.0908939838409424, 0.8961830139160156, 0.7244958877563477, 0.8918678760528564, 0.7274820804595947, 0.9782431125640869, 1.1817591190338135, 0.9811978340148926, 0.727877140045166, 0.9805409908294678]
0.918054103851
Attached file PR
This is about the least urgent thing ever, since it doesn't improve performance that much (see above for benchmarks). I'm on the fence about whether we should even commit it, as it makes the code slightly more complex. Mostly I just wanted to get it out there so people would know that I tried it. Anyway, I wouldn't mind a second opinion.
Attachment #8593670 - Flags: review?(mdoglio)
Testing again against a web server on my local machine (i.e., basically instantaneous download), I still get an average difference of about 0.1 seconds. I guess it partly depends on (1) how fast the machine is, (2) how saturated the CPU is already, and (3) how long the download takes.

This algorithm will show the most improvement on a slow machine with an unsaturated CPU and slow network performance (since we'll take advantage of the long time it takes to download the file to get a jump start on decompression). If the network is the dominating factor (which it seems to be, at least on my workstation), or the CPU is already saturated, expect little improvement from doing things this way.
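
One way to make the reasoning above concrete is an idealized model. It is an assumption, not something measured here. Without the change, the total is download time plus processing time. With perfect overlap, it is roughly the larger of the two. Under that model the saving can never exceed the smaller of the two times, so when processing is short relative to the download, only a fraction of a second is available to win back. The function names and example durations below are hypothetical.

```python
def sequential_time(download_s, process_s):
    """Total time when processing starts only after the download ends."""
    return download_s + process_s


def overlapped_time(download_s, process_s):
    """Idealized total when processing fully overlaps the download."""
    return max(download_s, process_s)


def saving(download_s, process_s):
    """Time saved by overlapping; equals min(download_s, process_s)."""
    return sequential_time(download_s, process_s) - overlapped_time(
        download_s, process_s
    )


if __name__ == "__main__":
    # Hypothetical (download, processing) durations in seconds.
    scenarios = {
        "network dominates": (1.5, 0.2),
        "balanced": (0.8, 0.8),
        "processing dominates": (0.1, 1.5),
    }
    for name, (download_s, process_s) in scenarios.items():
        print("%s: saves %.2f s" % (name, saving(download_s, process_s)))
```

A saturated CPU breaks the perfect-overlap assumption, which is why the real gain can be smaller still.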
I updated the PR to include a more realistic test program (which we can now run any time) that actually uses Treeherder's log parser artifact builder classes. In this case, the difference in speed tends to be greater (0.2 seconds on average):

Before:

(venv)vagrant@local:~/treeherder-service$ ./manage.py test_parse_log --profile 10 http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-linux/1429500689/mozilla-central-linux-bm74-build1-build125.txt.gz
Timings: [1.5258538722991943, 1.6284010410308838, 1.7903828620910645, 1.7481331825256348, 2.2356438636779785, 1.8331339359283447, 1.715242862701416, 2.031848907470703, 1.862015962600708, 1.8552160263061523]
Average: 1.82258725166
Total: 18.2258725166

After:

(venv)vagrant@local:~/treeherder-service$ ./manage.py test_parse_log --profile 10 http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-linux/1429500689/mozilla-central-linux-bm74-build1-build125.txt.gz
Timings: [1.6193771362304688, 1.614927053451538, 1.5895788669586182, 1.4775769710540771, 1.4487788677215576, 1.5538148880004883, 1.507270097732544, 2.427928924560547, 1.3921799659729004, 1.4321610927581787]
Average: 1.60635938644
Total: 16.0635938644
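
Since each run has one obvious outlier (2.24 s before, 2.43 s after), it may be worth looking at the median alongside the mean. The sketch below just recomputes summary statistics from the timings listed above. The `summarize` helper is a made-up name and is not part of the `test_parse_log` command.

```python
from statistics import mean, median

# Timings copied from the "Before" and "After" runs above (seconds).
before = [
    1.5258538722991943, 1.6284010410308838, 1.7903828620910645,
    1.7481331825256348, 2.2356438636779785, 1.8331339359283447,
    1.715242862701416, 2.031848907470703, 1.862015962600708,
    1.8552160263061523,
]
after = [
    1.6193771362304688, 1.614927053451538, 1.5895788669586182,
    1.4775769710540771, 1.4487788677215576, 1.5538148880004883,
    1.507270097732544, 2.427928924560547, 1.3921799659729004,
    1.4321610927581787,
]


def summarize(timings):
    """Return mean, median, and worst case for a list of timings."""
    return {
        "mean": mean(timings),
        "median": median(timings),
        "max": max(timings),
    }


if __name__ == "__main__":
    for label, timings in (("before", before), ("after", after)):
        stats = summarize(timings)
        print("%s: mean=%.3f median=%.3f max=%.3f" % (
            label, stats["mean"], stats["median"], stats["max"]))
    print("mean difference: %.3f s" % (mean(before) - mean(after)))
```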
Attachment #8593670 - Flags: review?(mdoglio) → review+
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED