Closed Bug 548320 Opened 15 years ago Closed 15 years ago

winnt 5.1/6.1 have trouble pinging the graph server

Categories

(Release Engineering :: General, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED INVALID

People

(Reporter: anodelman, Assigned: anodelman)

Attachments

(2 files, 3 obsolete files)

A lot of orange "failure to contact graph server" errors. Since this is isolated to Windows machines, I don't think it is a graph server failure but a Talos slave failure.
Turns out that catching URLError will also catch HTTPError, since HTTPError is a subclass of URLError. I'm hoping that by splitting these out I'll get a better idea of what's going wrong.
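For illustration, a minimal sketch of splitting the two cases apart. It is written against Python 3's urllib.error rather than the urllib2 module Talos used at the time, and it is not the checked-in patch: the function name `ping`, the `opener` parameter, and the message format are assumptions made for this example.

```python
import urllib.error
import urllib.request


def ping(url, opener=urllib.request.urlopen, timeout=10):
    """Try to open url and return a short status string.

    `opener` is a hypothetical injection point so the function can be
    exercised without a network; it defaults to urllib.request.urlopen.
    """
    try:
        response = opener(url, timeout=timeout)
    except urllib.error.HTTPError as e:
        # Must come first: HTTPError subclasses URLError, so a URLError
        # handler placed above this one would catch it as well.
        return "FAIL: graph server HTTPError %d" % e.code
    except urllib.error.URLError as e:
        # Reached only for non-HTTP failures (DNS, refused, timed out...).
        return "FAIL: graph server URLError %s" % e.reason
    response.close()
    return "OK"
```

With the handlers in this order, a server that answers with an error status is reported differently from a machine that never reaches the server at all, which is the distinction being looked for here.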
Assignee: nobody → anodelman
Attachment #428748 - Flags: review?(bhearsum)
Attachment #428748 - Flags: review?(bhearsum) → review+
Comment on attachment 428748 [checked in]better error messaging to diagnose the real failure
/cvsroot/mozilla/testing/performance/talos/post_file.py,v <-- post_file.py
new revision: 1.7; previous revision: 1.6
done
Attachment #428748 - Attachment description: better error messaging to diagnose the real failure → [checked in]better error messaging to diagnose the real failure
Attachment #428748 - Flags: checked-in+
Better error message now:
FAIL: graph server URLError
FAIL: <urlopen error (10060, 'Operation timed out')>
Depends on: 548371
Attachment #428770 - Flags: review?(bhearsum)
Comment on attachment 428770 three retries for pinging the graph server

This loop is pretty confusing: the "success" return is inside the while loop. What do you think about dropping the 'return' inside the loop and then doing:

    if messages:
        return 0
    else:
        return 1

You'll need a break in the 'try' part, I guess. Also, 'retries' seems like it would be a good thing to parameterize.
Currently running on staging. Loop should be a little clearer now.
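A rough sketch of the loop shape discussed in the review: no return inside the loop, a break on success, and the retry count as a parameter. This is not the attached patch. The names `ping_with_retries`, `ping`, and `delay` are hypothetical, and the sketch catches OSError because in Python 3 both URLError and socket timeouts derive from it.

```python
import time


def ping_with_retries(ping, retries=3, delay=0.0):
    """Call ping() up to `retries` times.

    Returns (ok, errors): ok is True if any attempt succeeded, and errors
    lists one message per failed attempt.
    """
    errors = []
    ok = False
    for attempt in range(retries):
        try:
            ping()
            ok = True
            break  # first success ends the loop; the single return is below
        except OSError as e:
            errors.append("attempt %d: %s" % (attempt + 1, e))
            if attempt < retries - 1:
                time.sleep(delay)  # pause only when another attempt follows
    return ok, errors
```

Keeping the return outside the loop means there is exactly one exit point, and the collected errors stay available for the failure message even when a later attempt succeeds.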
Attachment #428770 - Attachment is obsolete: true
Attachment #428850 - Flags: review?(bhearsum)
Attachment #428770 - Flags: review?(bhearsum)
Comment on attachment 428850 three retries for pinging the graph server (take 2)

Looks good to me.
Attachment #428850 - Flags: review?(bhearsum) → review+
This appears more stable on stage. I still plan on doing more investigation with tcpdump as I'm starting to think that network issues are involved.
Attachment #428850 - Attachment is obsolete: true
Attachment #429142 - Flags: review?(bhearsum)
Attachment #429142 - Attachment is obsolete: true
Attachment #429143 - Flags: review?(bhearsum)
Attachment #429142 - Flags: review?(bhearsum)
Comment on attachment 429143 [checked in]use HTTP instead of HTTPConnection (take 2)

> import httplib, mimetypes, urllib2
>+import socket
> from socket import error, herror, gaierror, timeout
>+socket.setdefaulttimeout(None)

Are you sure you don't want to just change the one in run_tests.py? It's a global change, so I think it makes sense to do it in a higher-level place.

>+ h = httplib.HTTP(host) ## Make HTTPConnection Object
>+ h.putrequest('HEAD', selector)
>+ h.putheader('content-type', "text/plain")
>+ h.putheader('content-length', str(len(msg)))
>+ h.putheader('Host', host)
>+ h.endheaders()
>+ h.send(msg)
>+
>+ errcode, errmsg, headers = h.getreply()
>+ if errcode == 200:
>+   found = 1

nit: errcode would be better named ret, httpcode, or status, since not all HTTP codes are errors. Can you fix that on checkin?
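For comparison, a sketch of the same HEAD check written against Python 3's http.client, with the result named `status` as the review suggests. It is not the patch above: the function name `head_ok` and the `connection_factory` parameter are hypothetical. It also passes a per-connection timeout instead of calling socket.setdefaulttimeout(), which is one way to avoid changing the process-wide default; whether that would have suited Talos is an assumption.

```python
import http.client


def head_ok(host, selector="/", timeout=10.0,
            connection_factory=http.client.HTTPConnection):
    """Send a HEAD request to host and report whether it answered 200.

    `connection_factory` is a hypothetical hook for testing; by default a
    real http.client.HTTPConnection is used.
    """
    conn = connection_factory(host, timeout=timeout)
    try:
        conn.request("HEAD", selector)
        response = conn.getresponse()
        status = response.status  # "status", since 200 is not an error
        response.read()  # drain the (empty) body before closing
        return status == 200
    finally:
        conn.close()
```

A call such as `head_ok("graphs.example", "/")` (the host name here is a placeholder) returns True only for a 200 response and lets connection errors propagate to the caller, where a retry loop can handle them.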
Attachment #429143 - Flags: review?(bhearsum) → review+
Comment on attachment 429143 [checked in]use HTTP instead of HTTPConnection (take 2)
Checking in run_tests.py;
/cvsroot/mozilla/testing/performance/talos/run_tests.py,v <-- run_tests.py
new revision: 1.60; previous revision: 1.59
done
Checking in post_file.py;
/cvsroot/mozilla/testing/performance/talos/post_file.py,v <-- post_file.py
new revision: 1.8; previous revision: 1.7
done
Attachment #429143 - Attachment description: use HTTP instead of HTTPConnection (take 2) → [checked in]use HTTP instead of HTTPConnection (take 2)
Attachment #429143 - Flags: checked-in+
Anything left to do here? (There are some intermittent graph server and network problems being tracked in bug 553590, but they are not specific to Win 5.1/6.1.)
This has been narrowed down to issues with the graph server itself, so this bug is no longer needed.
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → INVALID
Product: mozilla.org → Release Engineering