Closed
Bug 1088020
Opened 10 years ago
Closed 9 years ago
mozregression network operation hangs in getting mozilla-inbound builds.
Categories
(Testing :: mozregression, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: jujjyl, Unassigned)
References
Details
Attachments
(2 files)
On a Windows mozilla-build shell, type
mozregression --good=2014-09-22 --bad=2014-09-23 --bits=32
This prints:
$ mozregression --good=2014-09-22 --bad=2014-09-23 --bits=32
Got as far as we can go bisecting nightlies...
Ensuring we have enough metadata to get a pushlog...
Getting http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2014/09/2014-09-22-03-02-04-mozilla-central/firefox-35.0a1.en-US.win32.txt
Getting http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2014/09/2014-09-23-16-59-02-mozilla-central/firefox-35.0a1.en-US.win32.txt
Last good revision: 5bd6e09f074e (2014-09-22)
First bad revision: afc933adf723 (2014-09-23)
Pushlog:
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=5bd6e09f074e&tochange=afc933adf723
... attempting to bisect inbound builds (starting from previous week, to make sure no inbound revision is missed)
Getting http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2014/09/2014-09-15-03-02-04-mozilla-central/firefox-35.0a1.en-US.win32.txt
Getting inbound builds between bf5fcc0c4b27 and afc933adf723
Retrieving valid builds from u'http://inbound-archive.pub.build.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-inbound-win32/1411540853/' generated an exception: 504 Server Error: Gateway Timeout
Is the server http://inbound-archive.pub.build.mozilla.org/ experiencing network trouble? I see if I point my browser to that page, that it loads after a while, but with mozregression it never finishes. Occasionally I get this error instead:
... attempting to bisect inbound builds (starting from previous week, to make sure no inbound revision is missed)
Getting http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2014/09/2014-09-15-03-02-04-mozilla-central/firefox-35.0a1.en-US.win32.txt
Getting inbound builds between bf5fcc0c4b27 and afc933adf723
Interrupted.
Retrieving valid builds from u'http://inbound-archive.pub.build.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-inbound-win32/1410784509/' generated an exception: 503 Server Error: Service Unavailable: Back-end server is at capacity
Reporter | ||
Comment 1•10 years ago
|
||
Trying to bisect another regression, I'm getting
$ mozregression --good=2014-10-10 --bad=2014-10-11 --bits=32
Got as far as we can go bisecting nightlies...
Ensuring we have enough metadata to get a pushlog...
Getting http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2014/10/2014-10-10-03-02-01-mozilla-central/firefox-35.0a1.en-US.win32.txt
Getting http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2014/10/2014-10-11-03-02-03-mozilla-central/firefox-35.0a1.en-US.win32.txt
Last good revision: 50b689feab5f (2014-10-10)
First bad revision: f74ad36bb97b (2014-10-11)
Pushlog:
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=50b689feab5f&tochange=f74ad36bb97b
... attempting to bisect inbound builds (starting from previous week, to make sure no inbound revision is missed)
Getting http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2014/10/2014-10-03-03-02-04-mozilla-central/firefox-35.0a1.en-US.win32.txt
Getting inbound builds between b85c260821ab and f74ad36bb97b
Retrieving valid builds from u'http://inbound-archive.pub.build.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-inbound-win32/1412456373/' generated an exception: [Errno 8] _ssl.c:504: EOF occurred in violation of protocol
Comment 2•10 years ago
|
||
Yes, we should probably retry in this case. Or perhaps fix the problem properly, by creating some kind of web-service with this information pre-processed. See also bug 997388, which is essentially the same problem from a different angle (crawling so many directories in inbound-archive is *slow*).
Comment 3•10 years ago
|
||
I'm hoping that implementing the solution described in bug 1095756 will fix this.
Comment 4•10 years ago
|
||
I think that the patch introduced in bug 1096254 will fix this.
Reporter | ||
Comment 5•10 years ago
|
||
Tested this today in an attempt to bisect issue #1128163, but this still does not work. I am no longer getting failures due to slow operation, but on three consecutive attempts, a different error was generated each time. See the attached log.
Reporter | ||
Comment 6•10 years ago
|
||
Comment 7•10 years ago
|
||
Hm, I would say that the 1 and 3 errors are due to bad network (maybe the 2 also, see below). What hopefully will fix this is Bug 1095756. In the mean time, you could try to adjust the http timeout option to get rid of error 1:
mozregression -b 2014-11-29 -g 2014-11-28 --http-timeout 30
The second error you get, "requests.exceptions.HTTPError: 400 Client Error: no such method: pushes" do bother me. But I am unable to reproduce it, the same command you did works fine for me (with a higher http timeout value as here at work the proxy really slow down everything):
mozregression -b 2014-11-29 -g 2014-11-28 --http-timeout 30
0:01.76 LOG: MainThread Bisector INFO Got as far as we can go bisecting nightlies...
0:01.76 LOG: MainThread Bisector INFO Last good revision: 17de0f463944
0:01.76 LOG: MainThread Bisector INFO First bad revision: e90536aa55dd
0:01.76 LOG: MainThread Bisector INFO Pushlog:
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=17de0f463944&tochange=e90536aa55dd
0:01.76 LOG: MainThread Bisector INFO ... attempting to bisect inbound builds (starting from previous week, to make sure no inbound revision is missed)
0:02.45 LOG: MainThread Bisector INFO Getting inbound builds between 5ba06e4f49e8 and e90536aa55dd
0:39.05 LOG: MainThread Test Runner INFO Testing inbound build with timestamp 1416958654, revision 3c0bba40
0:39.05 LOG: MainThread Test Runner INFO Downloading build from: http://inbound-archive.pub.build.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-inbound-linux64/1416958654/firefox-36.0a1.en-US.linux-x86_64.tar.bz2
As that works for me, I suspect that every of these errors are network related. Not sure what to do here - I suspect the best option is to wait for bug 1095756.
Reporter | ||
Comment 8•10 years ago
|
||
I do not believe that I have a bad network, and the timeout occurred in well less than five seconds from issuing the command. However I'll retry the steps again at a later time.
Comment 9•10 years ago
|
||
Ok - well 'bad network' is probably a wrong term I used - it could be that one contacted server is slow to answer (https://hg.mozilla.org or http://inbound-archive.pub.build.mozilla.org in this case).
I did a few more tests and I am able to reproduce the 'no such method: pushes' error sometimes. In fact, I can reproduce it without any use of mozregression - look at the attached script. It appears that sometimes the server just answer 400 error for apparently no reason.
I am also pretty sure that the "Bad status line" appears randomly too.
Well, as said before, this will be corrected with bug Bug 1095756 (well - users won't see this happening but we will have to handle it on the server side :))
I am thinking that we could "fix" these issues by retrying (say 3 times max) on these errors:
- requests.exceptions.HTTPError
- requests.exceptions.ConnectionError
at different places in the code (when downloading push data and when looking into build folders as a start).
Maybe raise the default timeout to 20 instead of 10 may also help (or 30...) to get less network errors.
Also, I think we can fix all this without waiting for bug 1095756 - it will get the fixes later anyway.
@Will
William, what do you think ? I can mentor a good next bug on this if you're OK and that you agree on what I said. I think I would prefer to open a new bug for that.
Flags: needinfo?(wlachance)
Comment 10•10 years ago
|
||
Yes, I think this makes sense. Maybe we can try using Mozilla releng's redo module for handling some of the retry stuff, IIRC it was written exactly for situations like this:
https://pypi.python.org/pypi/redo
Flags: needinfo?(wlachance)
Comment 11•9 years ago
|
||
Hi Jukka,
I have released mozregression 0.37, and it changes the way inbound bisection is done (should be faster, and no more ftp crawling!), so I think this bug is resolved for you.
Could you try it please ?
Thanks!
Flags: needinfo?(jujjyl)
Reporter | ||
Comment 12•9 years ago
|
||
Thanks for the progress here!
I tried the newest as of today, 0.40, and tried to bisect bug 1150552. Bisecting days gave some noisy exceptions spam in the console, however the bisection process itself seemed to work ok:
Was this nightly build good, bad, or broken? (type 'good', 'bad', 'skip', 'retry
', 'back' or 'exit' and press Enter): move() failed for "(u'c:\\users\\clb\\appd
ata\\local\\temp\\tmpielb8d\\2015-05-23--mozilla-central--firefox-41.0a1.en-US.w
in64.zip.tmp', u'c:\\users\\clb\\appdata\\local\\temp\\tmpielb8d\\2015-05-23--mo
zilla-central--firefox-41.0a1.en-US.win64.zip')". Reason: The process cannot acc
ess the file because it is being used by another process (13). Retrying...
move() failed for "(u'c:\\users\\clb\\appdata\\local\\temp\\tmpielb8d\\2015-05-2
3--mozilla-central--firefox-41.0a1.en-US.win64.zip.tmp', u'c:\\users\\clb\\appda
ta\\local\\temp\\tmpielb8d\\2015-05-23--mozilla-central--firefox-41.0a1.en-US.wi
n64.zip')". Reason: The process cannot access the file because it is being used
by another process (13). Retrying...
move() failed for "(u'c:\\users\\clb\\appdata\\local\\temp\\tmpielb8d\\2015-05-2
3--mozilla-central--firefox-41.0a1.en-US.win64.zip.tmp', u'c:\\users\\clb\\appda
ta\\local\\temp\\tmpielb8d\\2015-05-23--mozilla-central--firefox-41.0a1.en-US.wi
n64.zip')". Reason: The process cannot access the file because it is being used
by another process (13). Retrying...
move() failed for "(u'c:\\users\\clb\\appdata\\local\\temp\\tmpielb8d\\2015-05-2
3--mozilla-central--firefox-41.0a1.en-US.win64.zip.tmp', u'c:\\users\\clb\\appda
ta\\local\\temp\\tmpielb8d\\2015-05-23--mozilla-central--firefox-41.0a1.en-US.wi
n64.zip')". Reason: The process cannot access the file because it is being used
by another process (13). Retrying...
move() failed for "(u'c:\\users\\clb\\appdata\\local\\temp\\tmpielb8d\\2015-05-2
3--mozilla-central--firefox-41.0a1.en-US.win64.zip.tmp', u'c:\\users\\clb\\appda
ta\\local\\temp\\tmpielb8d\\2015-05-23--mozilla-central--firefox-41.0a1.en-US.wi
n64.zip')". Reason: The process cannot access the file because it is being used
by another process (13). Retrying...
move() failed for "(u'c:\\users\\clb\\appdata\\local\\temp\\tmpielb8d\\2014-12-3
0--mozilla-central--firefox-37.0a1.en-US.win64-x86_64.zip.tmp', u'c:\\users\\clb
\\appdata\\local\\temp\\tmpielb8d\\2014-12-30--mozilla-central--firefox-37.0a1.e
n-US.win64-x86_64.zip')". Reason: The process cannot access the file because it
is being used by another process (13). Retrying...
Exception in thread Thread-20:
Traceback (most recent call last):
File "C:\Python27\lib\threading.py", line 810, in __bootstrap_inner
self.run()
File "C:\Python27\lib\threading.py", line 763, in run
self.__target(*self.__args, **self.__kwargs)
File "C:\Python27\lib\site-packages\mozregression\download_manager.py", line 1
67, in _download
mozfile.move(temp_dest, dest)
File "C:\Python27\lib\site-packages\mozfile\mozfile.py", line 242, in move
_call_windows_retry(shutil.move, (src, dst))
File "C:\Python27\lib\site-packages\mozfile\mozfile.py", line 147, in _call_wi
ndows_retry
func(*args)
File "C:\Python27\lib\shutil.py", line 303, in move
os.unlink(src)
WindowsError: [Error 32] The process cannot access the file because it is being
used by another process: u'c:\\users\\clb\\appdata\\local\\temp\\tmpielb8d\\2015
-05-23--mozilla-central--firefox-41.0a1.en-US.win64.zip.tmp'
move() failed for "(u'c:\\users\\clb\\appdata\\local\\temp\\tmpielb8d\\2014-12-3
0--mozilla-central--firefox-37.0a1.en-US.win64-x86_64.zip.tmp', u'c:\\users\\clb
\\appdata\\local\\temp\\tmpielb8d\\2014-12-30--mozilla-central--firefox-37.0a1.e
n-US.win64-x86_64.zip')". Reason: The process cannot access the file because it
is being used by another process (13). Retrying...
move() failed for "(u'c:\\users\\clb\\appdata\\local\\temp\\tmpielb8d\\2014-12-3
0--mozilla-central--firefox-37.0a1.en-US.win64-x86_64.zip.tmp', u'c:\\users\\clb
\\appdata\\local\\temp\\tmpielb8d\\2014-12-30--mozilla-central--firefox-37.0a1.e
n-US.win64-x86_64.zip')". Reason: The process cannot access the file because it
is being used by another process (13). Retrying...
move() failed for "(u'c:\\users\\clb\\appdata\\local\\temp\\tmpielb8d\\2014-12-3
0--mozilla-central--firefox-37.0a1.en-US.win64-x86_64.zip.tmp', u'c:\\users\\clb
\\appdata\\local\\temp\\tmpielb8d\\2014-12-30--mozilla-central--firefox-37.0a1.e
n-US.win64-x86_64.zip')". Reason: The process cannot access the file because it
is being used by another process (13). Retrying...
move() failed for "(u'c:\\users\\clb\\appdata\\local\\temp\\tmpielb8d\\2014-12-3
0--mozilla-central--firefox-37.0a1.en-US.win64-x86_64.zip.tmp', u'c:\\users\\clb
\\appdata\\local\\temp\\tmpielb8d\\2014-12-30--mozilla-central--firefox-37.0a1.e
n-US.win64-x86_64.zip')". Reason: The process cannot access the file because it
is being used by another process (13). Retrying...
Exception in thread Thread-23:
Traceback (most recent call last):
File "C:\Python27\lib\threading.py", line 810, in __bootstrap_inner
self.run()
File "C:\Python27\lib\threading.py", line 763, in run
self.__target(*self.__args, **self.__kwargs)
File "C:\Python27\lib\site-packages\mozregression\download_manager.py", line 1
67, in _download
mozfile.move(temp_dest, dest)
File "C:\Python27\lib\site-packages\mozfile\mozfile.py", line 242, in move
_call_windows_retry(shutil.move, (src, dst))
File "C:\Python27\lib\site-packages\mozfile\mozfile.py", line 147, in _call_wi
ndows_retry
func(*args)
File "C:\Python27\lib\shutil.py", line 303, in move
os.unlink(src)
WindowsError: [Error 32] The process cannot access the file because it is being
used by another process: u'c:\\users\\clb\\appdata\\local\\temp\\tmpielb8d\\2014
-12-30--mozilla-central--firefox-37.0a1.en-US.win64-x86_64.zip.tmp'
After bisecting through the dates, it printed out the following:
8:32.14 LOG: MainThread Bisector INFO Narrowed nightly regression window from [
2014-12-08, 2014-12-10] (2 days) to [2014-12-08, 2014-12-09] (1 days) (~0 steps
left)
8:32.14 LOG: MainThread Bisector INFO Got as far as we can go bisecting nightli
es...
8:32.14 LOG: MainThread Bisector INFO Last good revision: 035a951fc24a
8:32.14 LOG: MainThread Bisector INFO First bad revision: f1f48ccb2d4e
8:32.14 LOG: MainThread Bisector INFO Pushlog:
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=035a951fc24a&tocha
nge=f1f48ccb2d4e
8:32.14 LOG: MainThread Bisector INFO ... attempting to bisect inbound builds (
starting from 4 days prior, to make sure no inbound revision is missed)
8:34.31 LOG: MainThread Bisector INFO Getting inbound builds between a9fc463556
61 and f1f48ccb2d4e
8:58.96 LOG: MainThread Bisector INFO No inbound data found.
8:58.96 LOG: MainThread Bisector INFO There are no build artifacts on inbound f
or these changesets (they are probably too old).
I am not sure how to interpret this. Is 2014-12 too old for build artifacts and the error message is to be expected?
Flags: needinfo?(jujjyl)
Comment 13•9 years ago
|
||
Hi Jukka,
Thanks for the feedback!
mozregression will try to bisect inbound builds once there are no more nightly builds. But inbound builds are removed after something like six month currently, so the last error message you got is normal.
Could you try with newer builds, to see if inbound bisection works better for you now ?
Hum, I am a bit worried about these
WindowsError: [Error 32] The process cannot access the file
But it is for another bug - I'll give it a try later on a windows machine.
Comment 14•9 years ago
|
||
Oh, I reopened bug 1185756 for the windows error.
Reporter | ||
Comment 15•9 years ago
|
||
Thanks, bisecting inbounds from May 2015 worked ok!
Comment 16•9 years ago
|
||
Cool, thanks!
Let's close this bug then. :)
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•