Closed Bug 997388 Opened 11 years ago Closed 10 years ago

Downloading initial list of inbound builds is slow

Categories

(Testing :: mozregression, defect)

x86_64
Linux
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: wlach, Unassigned)

Details

wlach: Right now, we grab the directory of inbound builds on ftp.mozilla.org fresh every time, which is super slow. Ideas for speeding it up: 1. Keep a cached copy and only re-download if the contents of said page change. Unfortunately the contents of this page surely change quite often 2. Create an optimized copy of this index for mozregression to download, instead of using the ftp index. Really, all we need to know are the various timestamps of inbound builds that are available. Everything else is pretty superfluous. (1) is easy but as mentioned may not actually fix the problem in most cases, (2) is difficult if we actually have to do any work. It's possible though that this data is already being generated somewhere. Will have to ask RelEng about this. @nthomas-mozilla Hey Nick, do you know offhand any alternative to getting the list of inbound builds by requesting a URL like http://inbound-archive.pub.build.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-inbound-linux/ ? If not, could we implement one? A compressed file with just the timestamps would be ideal. nthomas: There's nothing right now, unfortunately. How slow are you seeing ? Typically I see about 3 seconds to generate the page from cold, plus some time to return all the html (which is admittedly large and a json option might be useful). Ideally, I'd like to extend buildapi/self-serve so we can query for file urls by revision. eg given some good & bad revs on a branch * look up revisions in between, probably from hg.m.o * binary search by querying buildapi for installer/dmg/tar.bz2 (instead of scraping) * if the regressing revision is a merge, repeat for the merged branch There's some mapping issues there about all the different names we call things (eg TBPL, buildbot name) but it should be a big improvement. AaronMT: I have a current process fetching an example mozilla-inbound pushlog running for half an hour now :( 3889 python 0.7 00:29.86 2 0 33 220 56M 56M 0B 0B 98M Edit: Oh ... this doesn't help :) traceroute to inbound-archive.pub.build.mozilla.org 20 ec2-50-112-0-173.us-west-2.compute.amazonaws.com (50.112.0.173) 92.191 ms ec2-50-112-0-203.us-west-2.compute.amazonaws.com (50.112.0.203) 101.847 ms ec2-50-112-0-173.us-west-2.compute.amazonaws.com (50.112.0.173) 115.840 ms 21 * * * 22 * * * 23 * * * wlach: I recently applied a patch by Chris Peterson which should help a bit. If you haven't already, you might want to try updating to the latest version. Unfortunately the major cause of the slowness (described above) is still to be addressed... whimboo: Does the slowness only apply to inbound-archive or also our inbound folder at ftp.m.o? wlach: mozregression only uses the inbound archive. whimboo: So that has been changed since your initial comment which talks about ftp.m.o? In that case which steps takes that long? Is it downloading the index of http://inbound-archive.pub.build.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-inbound-linux/? For me this takes 3s with curl. 100 1114k 100 1114k 0 0 287k 0 0:00:03 0:00:03 --:--:-- 287k Also it's only about 1MB in size so I wonder which step is taking that long for you.
Still an issue; usually takes about 12-15 minutes for me
I'm trying to do bisection of inbound (located in Finland), and for me it never succeeds. Instead the operation always times out and I get various different errors: https://bugzilla.mozilla.org/show_bug.cgi?id=1088020 .
I think it is related to bug 1096254. The patch is present in the new mozregression 0.25, and I'm pretty sure that will help a lot. Aaron, Jukka, could you try with the new version please ?
Oups sorry, I was too fast - mozregression 0.26 will have it, and it is not there yet (must be soon !).
Awesome! If the date range was all that was that's cute, ship it.
(In reply to Julien Pagès from comment #4) > Oups sorry, I was too fast - mozregression 0.26 will have it, and it is not > there yet (must be soon !). Yes, sorry about that, got caught up with something else yesterday. :) mozregression 0.26 is now out with Julien's great work, please give it a try and see if it helps.
It's faster, but it still takes quite some time from the last nightly to the first inbound-try: e.g. 0:06.32 LOG: MainThread Bisector INFO Last good revision: ced1402861b8 (2014-11-26) 0:06.32 LOG: MainThread Bisector INFO First bad revision: cef590a6f946 (2014-11-27) 0:06.32 LOG: MainThread Bisector INFO Pushlog: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=ced1402861b8&tochange=cef590a6f946 0:06.32 LOG: MainThread Bisector INFO ... attempting to bisect inbound builds (starting from previous week, to make sure no inbound revision is missed) 0:23.66 LOG: MainThread Regression Runner INFO Getting http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2014/11/2014-11-19-03-02-00-mozilla-central/firefox-36.0a1.en-US.win32.txt 0:24.06 LOG: MainThread Bisector INFO Getting inbound builds between b4fbeba78a7d and cef590a6f946 1:18.65 LOG: MainThread Bisector INFO Testing inbound build with timestamp 1416791230, revision 52f3f4a5 1:20.16 LOG: MainThread Regression Runner INFO Downloading build from: http://inbound-archive.pub.build.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-inbound-win32/1416791230/firefox-36.0a1.en-US.win32.zip 17 seconds for the seven-day-range and 54 seconds until the download of the first inbound-build. Is that the expected duration?
Well to make a long story short: yes, probably. Now a more detailed explanation. First, I don't think we can do a lot about the 54 seconds to download the build. It is the time required to download the file from the server. Let me explain a bit how inbound build metadata are downloaded now (the 17 seconds remaining): 1. We download the list of mercurial changesets from inbound (as json) within the two first known changesets; This usually take a few seconds, around 3 or 5 seconds for a week (but maybe more, depending on server load I suppose). This step is required, and I don't think we can do too much here to gain speed. 2. Now that we have the changesets and the associated timestamps, we look at the inbound build folders to find valid builds. Here is the improvement: instead of trying every build folder at the same time, we check for the lower and higher bounds of the range we got, then we find the center - with these information we can bissect. But some inbounds builds folders are invalid, and we have to fetch until we find the upper, lower and mid valid build. A build can be invalid for 2 reason I can think of: - The build has failed. - The other reason is more tricky: inbound build folders are organized with timestamp (eg, the last part of https://s3-us-west-2.amazonaws.com/firefox-mozilla-inbound-usw2/mozilla-inbound-linux64/1415964798). But timestamps of builds are not timestamp of changesets, so we have to check that in the corresponding txt file (eg https://s3-us-west-2.amazonaws.com/firefox-mozilla-inbound-usw2/mozilla-inbound-linux64/1415964798/firefox-36.0a1.en-US.linux-x86_64.txt) the changeset is interesting for us. That makes a lot of builds invalids for the first step (were higher and lower build timestamps may be far away from higher and lower changesets timestamps). But for every following steps, things became faster, because we already have at least one valid build (either lower or higher, depending on the good or bad response) and because we are sure now that the builds are interesting for us in term of changeset. You can see more in detail what is happenning when you run moz regression with the '--log-mach-level DEBUG' flag: 0:05.69 LOG: MainThread Bisector INFO ... attempting to bisect inbound builds (starting from previous week, to make sure no inbound revision is missed) 0:07.01 LOG: MainThread Regression Runner INFO Getting http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2014/11/2014-11-15-03-02-05-mozilla-central/firefox-36.0a1.en-US.linux-x86_64.txt 0:07.42 LOG: MainThread Bisector INFO Getting inbound builds between acbd7b68fa8c and 5ba06e4f49e8 0:25.17 LOG: MainThread Inbound Build Finder DEBUG We need to fetch the inbound lower limit 0:25.17 LOG: MainThread Inbound Build Finder DEBUG We need to fetch the inbound higher limit 0:25.17 LOG: MainThread Inbound Build Finder DEBUG We got 310 inbound folders, we need to fetch [0, 1, 2, 3, 306, 307, 308, 309] 0:28.83 LOG: MainThread Inbound Build Finder DEBUG Now we got 302 inbound folders - 8 were bad 0:28.83 LOG: MainThread Inbound Build Finder DEBUG We need to fetch the inbound lower limit 0:28.83 LOG: MainThread Inbound Build Finder DEBUG We need to fetch the inbound higher limit 0:28.83 LOG: MainThread Inbound Build Finder DEBUG We got 302 inbound folders, we need to fetch [0, 1, 2, 3, 298, 299, 300, 301] 0:33.46 LOG: MainThread Inbound Build Finder DEBUG Now we got 294 inbound folders - 8 were bad 0:33.46 LOG: MainThread Inbound Build Finder DEBUG We need to fetch the inbound lower limit 0:33.46 LOG: MainThread Inbound Build Finder DEBUG We need to fetch the inbound higher limit 0:33.46 LOG: MainThread Inbound Build Finder DEBUG We got 294 inbound folders, we need to fetch [0, 1, 2, 3, 290, 291, 292, 293] 0:42.75 LOG: MainThread Inbound Build Finder DEBUG Now we got 286 inbound folders - 8 were bad 0:42.75 LOG: MainThread Inbound Build Finder DEBUG We need to fetch the inbound lower limit 0:42.75 LOG: MainThread Inbound Build Finder DEBUG We need to fetch the inbound higher limit 0:42.75 LOG: MainThread Inbound Build Finder DEBUG We got 286 inbound folders, we need to fetch [0, 1, 2, 3, 282, 283, 284, 285] 0:48.85 LOG: MainThread Inbound Build Finder DEBUG Now we got 278 inbound folders - 8 were bad 0:48.85 LOG: MainThread Inbound Build Finder DEBUG We need to fetch the inbound lower limit 0:48.85 LOG: MainThread Inbound Build Finder DEBUG We need to fetch the inbound higher limit 0:48.86 LOG: MainThread Inbound Build Finder DEBUG We got 278 inbound folders, we need to fetch [0, 1, 2, 3, 274, 275, 276, 277] 0:53.42 LOG: MainThread Inbound Build Finder DEBUG Now we got 270 inbound folders - 8 were bad 0:53.42 LOG: MainThread Inbound Build Finder DEBUG We need to fetch the inbound lower limit 0:53.42 LOG: MainThread Inbound Build Finder DEBUG We need to fetch the inbound higher limit 0:53.42 LOG: MainThread Inbound Build Finder DEBUG We got 270 inbound folders, we need to fetch [0, 1, 2, 3, 266, 267, 268, 269] 0:56.83 LOG: MainThread Inbound Build Finder DEBUG Now we got 262 inbound folders - 8 were bad 0:56.83 LOG: MainThread Inbound Build Finder DEBUG We need to fetch the inbound lower limit 0:56.83 LOG: MainThread Inbound Build Finder DEBUG We need to fetch the inbound higher limit 0:56.83 LOG: MainThread Inbound Build Finder DEBUG We got 262 inbound folders, we need to fetch [0, 1, 2, 3, 258, 259, 260, 261] 1:00.67 LOG: MainThread Inbound Build Finder DEBUG Now we got 257 inbound folders - 5 were bad 1:00.67 LOG: MainThread Inbound Build Finder DEBUG We need to fetch the inbound higher limit 1:00.67 LOG: MainThread Inbound Build Finder DEBUG We got 257 inbound folders, we need to fetch [249, 250, 251, 252, 253, 254, 255, 256] 1:06.88 LOG: MainThread Inbound Build Finder DEBUG Now we got 257 inbound folders - 0 were bad 1:06.88 LOG: MainThread Inbound Build Finder DEBUG We need to fetch the inbound mid point 128 1:06.88 LOG: MainThread Inbound Build Finder DEBUG We got 257 inbound folders, we need to fetch [124, 125, 126, 127, 128, 129, 130, 131] 1:10.84 LOG: MainThread Inbound Build Finder DEBUG Now we got 256 inbound folders - 1 were bad 1:10.84 LOG: MainThread Bisector INFO Testing inbound build with timestamp 1416351206, revision c14cd1b9 And look at some following steps: 8:43.27 LOG: MainThread Bisector INFO Narrowed inbound regression window from [e2efedec, c69ae3b1] (256 revisions) to [803d228d, c69ae3b1] (127 revisions) (~6 steps left) 8:43.27 LOG: MainThread Inbound Build Finder DEBUG We need to fetch the inbound mid point 63 8:43.27 LOG: MainThread Inbound Build Finder DEBUG We got 127 inbound folders, we need to fetch [59, 60, 61, 62, 63, 64, 65, 66] 8:46.89 LOG: MainThread Inbound Build Finder DEBUG Now we got 125 inbound folders - 2 were bad 8:46.89 LOG: MainThread Bisector INFO Testing inbound build with timestamp 1416450964, revision 5cb767f5 ... 23:28.79 LOG: MainThread Bisector INFO Narrowed inbound regression window from [803d228d, c69ae3b1] (125 revisions) to [91f983c5, c69ae3b1] (62 revisions) (~5 steps left) 23:28.79 LOG: MainThread Inbound Build Finder DEBUG We need to fetch the inbound mid point 31 23:28.79 LOG: MainThread Inbound Build Finder DEBUG We got 62 inbound folders, we need to fetch [27, 28, 29, 30, 31, 32, 33, 34] 23:33.17 LOG: MainThread Inbound Build Finder DEBUG Now we got 61 inbound folders - 1 were bad 23:33.17 LOG: MainThread Bisector INFO Testing inbound build with timestamp 1416497347, revision e3cba62a As you can see, first step took about 1 minute with my wireless wifi (including 18 seconds for the changesets json download - network is currently bad, or server is busy) and other steps are only taking around three - five seconds. But as you can see, there is a lot of time between the steps (build downloading) so my network is currently really bad. But still, these numbers are useful for comparison. Hope this is clear enough. :) Maybe we can take another approach to speed this up. For now, I hope that inbound bissection will be more useable.
(In reply to Julien Pagès from comment #8) > Hope this is clear enough. :) Maybe we can take another approach to speed > this up. For now, I hope that inbound bissection will be more useable. Perfectly clear, thank you very much.
Resolving this bug as I think Julien's work has improved things enough for now (i.e. mozregression is now usable for bisecting inbound again). Fixing bug 1095756 should make things faster still, if we get around to it.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.