1567724 - Indexing duration regression due to low output-file throughput on mozilla-releases indexer.

Assignee

Description

•

5 years ago

•

The mozilla-releases indexer for today is currently at 9.5 hours runtime. A cursory investigation of the indexer log shows it took about 54min for the output.sh stage to complete for mozilla-esr60, 2 hours for mozilla-release, 2h9min for mozilla-esr68, and mozilla-beta is currently about to hit the 2 hour mark.

The parallel batching job says that for our mozilla-central job (extracting an AVG value from near the end of the run; which may be the wrong average to grab), each parallel job took about 6 seconds to run on average, whereas we're looking at ~11.3s for mozilla-esr60, ~21.9s for mozilla-release, ~23.2s for mozilla-esr68, and ~26.8 for mozilla-beta.

We recently changed the configuration for this indexer because the indexer was filling up.

General data dump

CPU

/cpu/procinfo shows 8 cores like so:

processor	: 7
vendor_id	: GenuineIntel
cpu family	: 6
model		: 62
model name	: Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
stepping	: 4
microcode	: 0x42e
cpu MHz		: 2800.106
cache size	: 25600 KB
physical id	: 0
siblings	: 8
core id		: 3
cpu cores	: 4
apicid		: 7
initial apicid	: 7
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx rdtscp lm constant_tsc rep_good nopl xtopology eagerfpu pni pclmulqdq ssse3 cx16 pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm fsgsbase smep erms xsaveopt
bugs		:
bogomips	: 5600.21
clflush size	: 64
cache_alignment	: 64
address sizes	: 46 bits physical, 48 bits virtual
power management:

top matches up fairly well too.

7099 ubuntu    20   0 3030696 624604 329724 R  90.7  4.1   0:08.99 output-file                                                                                                               
 7122 ubuntu    20   0 3030704 624764 329804 R  90.7  4.1   0:09.01 output-file                                                                                                               
 7008 ubuntu    20   0 3030684 623288 328384 R  90.4  4.0   0:09.04 output-file                                                                                                               
 7021 ubuntu    20   0 3030696 623668 328636 R  90.4  4.1   0:09.06 output-file                                                                                                               
 7066 ubuntu    20   0 3030736 624716 329748 R  90.4  4.1   0:09.02 output-file                                                                                                               
 7176 ubuntu    20   0 3030792 624568 329720 R  90.4  4.1   0:08.85 output-file                                                                                                               
 7181 ubuntu    20   0 3030804 624672 329708 R  90.4  4.1   0:08.85 output-file                                                                                                               
 7163 ubuntu    20   0 3030764 624632 329736 R  90.0  4.1   0:08.87 output-file

Disk

ubuntu@ip-elided:~$ df -h
Filesystem      Size  Used Avail Use% Mounted on
udev            7.4G     0  7.4G   0% /dev
tmpfs           1.5G  8.7M  1.5G   1% /run
/dev/xvda1      7.8G  5.4G  2.0G  74% /
tmpfs           7.4G     0  7.4G   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs           7.4G     0  7.4G   0% /sys/fs/cgroup
/dev/xvdb        79G   56M   75G   1% /mnt
/dev/xvdf       296G  234G   47G  84% /index
tmpfs           1.5G     0  1.5G   0% /run/user/1000

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Comment 1

•

5 years ago

During my investigations I found two possible sources of perf problems that make the output-file stage slow. One is the generation of unused analysis data from the clang plugin (bug 1511025). The other is the blame-skipping code that I added. It would be worth removing that and seeing what the numbers are like. I kind of want to remove it entirely, because the results it produces are not very good. I have a different solution in the works although I'm not sure what kind of performance it will give.

However I'm surprised there's such a big discrepancy between m-c avg time and the other branches. Makes sense that esr60 is faster because it doesn't do rust/C++ indexing and so avoids the first problem.

Log from mozilla-releases indexer with blame-skipping disabled 5 years ago Kartikaya Gupta (email:kats@mozilla.staktrace.com) 4.21 MB, application/x-gzip		Details
5 minute perf profile of mozilla-beta indexing on c5d.2xlarge 16gig machine 5 years ago Andrew Sutherland [:asuth] (he/him) 161.39 KB, image/png		Details
5 minutes of perf profile from mozilla-beta output-file indexing 5 years ago Andrew Sutherland [:asuth] (he/him) 3.85 MB, application/gzip		Details
Landed PR saving off the GNU parallel outputs and adding perf logging to output-file.rs via println! 2 years ago Andrew Sutherland [:asuth] (he/him) 47 bytes, text/x-github-pull-request		Details \| Review
Landed PR: Only load the desired tree for perf reasons. 2 years ago Andrew Sutherland [:asuth] (he/him) 47 bytes, text/x-github-pull-request		Details \| Review
Landed PR: Clean up handle_error and add lsof invocation for improved diagnostics 2 years ago Andrew Sutherland [:asuth] (he/him) 47 bytes, text/x-github-pull-request		Details \| Review

repo (in config order)	order 1	time 1 (mins)	order 2/3	time 2 (mins)	time 3 (mins)
mozilla-beta	1	66.2	4	90.7	89
mozilla-release	2	77.3	2	63.9	63
mozilla-esr60	3	40.8	1	39.3	41
mozilla-esr68	4	89.23	3	71.6	69

repo (in config order)	order 1	time 1 (16G)	time 4 (32G)	order 2/3	time 2 (16G)	time 3 (16G)
mozilla-beta	1	66.2	64.0	4	90.7	89
mozilla-release	2	77.3	63.4	2	63.9	63
mozilla-esr60	3	40.8	42.0	1	39.3	41
mozilla-esr68	4	89.23	63.5	3	71.6	69

Indexing duration regression due to low output-file throughput on mozilla-releases indexer.

General data dump

CPU

Disk

release

mozilla-releases

mozilla-central

mozilla-esr68

config order:

python dict is the boss order: