Closed Bug 970820 Opened 10 years ago Closed 10 years ago

Mozmill send_report fails with timeout

Categories

(Testing Graveyard :: Mozmill, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: andrei, Assigned: sayan_666)

References

()

Details

(Whiteboard: [mozmill-2.0.7][mentor=whimboo][lang=python][for a test see comment 22])

Attachments

(2 files, 6 obsolete files)

Attached file log.txt
We've had a batch of testruns that failed to send their report.

I've considered a network outage (still very likely) because all of the failures happened in the same 2 minute timeframe.

Still, 2 of the testruns I've retriggered also failed with the same error.
This bugs purpose is to track and fix this issue if it is not _only_ a network problem.

I've attached a complete log.

This is logged in the Mozmill component since the relevant code is part of Mozmill itself.

===

We should at least investigate why this wasn't caught by our exception handling.
We are handling HTTPError and URLError at the moment. We might want to see how this failed and handle this more gracefully.
Summary: Mozmill fails to send the report with a timeout → Mozmill send_report fails with timeout
I wonder which kind of exception we are hitting here then. It should be easy to fix then, so that we react accordingly.
Whiteboard: [mentor=whimboo][lang=python]
I would like to contribute to this bug but this would be my first so where should i start.?
Flags: needinfo?(hskupin)
Hi Sayan. It's great to hear that you have interests in helping us with this bug. Sorry that I missed to add more details in how to get started. I did this now by adding the link to the repo setup steps to the URL field.

The code in question can then be found at:
https://github.com/mozilla/mozmill/blob/master/mozmill/mozmill/report.py#L111

I hope you are ok when I assign the bug to you.
Assignee: nobody → sayan_666
Status: NEW → ASSIGNED
ok i'll look into and start working .
Flags: needinfo?(hskupin)
In order to check which type of exception(if any) is being raised inside the try block
we can add
except Exception,e:
	print str(e)
as one of the handler to explicitly catch the error and understand it(if not a simple network problem).
Sounds like a good idea. That will at least make that our send_report method doesn't hard-stop Mozmill when it cannot send the report.
How do i do reproduce the error to see what exception is being caught ?
Attachment #8375447 - Flags: review?(hskupin)
Comment on attachment 8375447 [details] [diff] [review]
added generic exception to the report to catch any  missed exception

Review of attachment 8375447 [details] [diff] [review]:
-----------------------------------------------------------------

Sayan, something went totally wrong with your patch here. It includes nearly every file from the mozmill source code, but not the changes. It might be that `git format-patch` was not run correctly?

(In reply to Sayan Paul from comment #7)
> How do i do reproduce the error to see what exception is being caught ?

You could try to find a server which doesn't give you a response, so that we timeout. You might be able by doing that via a local proxy which doesn't forward the request. Otherwise you could comment out the exceptions we already catch for HTTP failures, so that we always run into the else case. That way we can check that we behave correctly.
Attachment #8375447 - Flags: review?(hskupin) → review-
Attachment #8375447 - Attachment is obsolete: true
Attachment #8376269 - Flags: review?(hskupin)
Comment on attachment 8376269 [details] [diff] [review]
Corrected the patch , kept only generic exception and others commented

Review of attachment 8376269 [details] [diff] [review]:
-----------------------------------------------------------------

::: mozmill/mozmill/report.py
@@ +143,2 @@
>                print "Sending results to '%s' failed (%s)." % (report_url,
> +                                                               str(e))
\ No newline at end of file

We should still keep each individual except statement here. Otherwise we loose the information from data['reason'].

Also which exception gets thrown in detail when we hit the timeout? Were you able to reproduce this?
Attachment #8376269 - Flags: review?(hskupin) → review-
Attached patch Added back exceptions (obsolete) — Splinter Review
Added back those previously commented Exception .

I am not being figure how to run these tests and reproduce the error .
Attachment #8376269 - Attachment is obsolete: true
Attachment #8382841 - Flags: review?(hskupin)
Comment on attachment 8382841 [details] [diff] [review]
Added back exceptions

Review of attachment 8382841 [details] [diff] [review]:
-----------------------------------------------------------------

Something's wrong with this patch. I suspect it's a diff since the previous patch, rather than a combination of all changes. If you could take a look and see if you can come up with a final patch I'd be happy to review in Henrik's place (he's away until Monday).
Attachment #8382841 - Flags: review?(hskupin) → review-
Attached patch Corrected Patch (obsolete) — Splinter Review
Attachment #8382841 - Attachment is obsolete: true
Attachment #8383473 - Flags: review?(hskupin)
Attached patch Corrected Patch (obsolete) — Splinter Review
Attachment #8383473 - Attachment is obsolete: true
Attachment #8383473 - Flags: review?(hskupin)
Attachment #8383484 - Flags: review?(hskupin)
Comment on attachment 8383484 [details] [diff] [review]
Corrected Patch

Review of attachment 8383484 [details] [diff] [review]:
-----------------------------------------------------------------

Other than a small nit, this looks good to me. I think we could reduce the duplication by throwing a custom exception that takes the URL and reason as arguments. If you'd like to give that a go in this patch please feel free, otherwise we can take care of it in a separate bug.

::: mozmill/mozmill/report.py
@@ +139,4 @@
>          except urllib2.URLError as e:
>              print "Sending results to '%s' failed (%s)." % (report_url,
>                                                              e.reason)
> +	except Exception,e:

Please use `except Exception as e:`
Attachment #8383484 - Flags: review?(hskupin) → review-
Do u mean a single custom exception to handle all the Exception being caught here ?
Flags: needinfo?(dave.hunt)
(In reply to Sayan Paul from comment #16)
> Do u mean a single custom exception to handle all the Exception being caught
> here ?

Yes, all these exceptions are related in that the report fails to be sent. We could therefore have a MozmillSendReportError exception or similar.
Flags: needinfo?(dave.hunt)
I would say we should do the custom exception changes in a follow-up bug. We currently don't have an errors.py module which would contain all the possible mozmill related exceptions. So IMHO it wouldn't add that much for this bug.
Attached patch Final correction made (obsolete) — Splinter Review
Attachment #8383484 - Attachment is obsolete: true
Attachment #8385882 - Flags: review?(hskupin)
Then for this bug the above patch should be enough I guess
Attached patch Final patchSplinter Review
Attachment #8385882 - Attachment is obsolete: true
Attachment #8385882 - Flags: review?(hskupin)
Attachment #8385888 - Flags: review?(hskupin)
Comment on attachment 8385888 [details] [diff] [review]
Final patch

Review of attachment 8385888 [details] [diff] [review]:
-----------------------------------------------------------------

I have tested this patch with a code snippet as given here: http://stackoverflow.com/questions/19928779/node-js-delayed-response. It works fine with a delayed response >30s. So we do not early abort Mozmill anymore. Sadly we cannot run node tests via Mutt, so a real automated test might have to be added later. So I set the in-testsuite flag for now.

There is a missing empty line of that file, which I will add right before landing it.

::: mozmill/mozmill/report.py
@@ +140,5 @@
>              print "Sending results to '%s' failed (%s)." % (report_url,
>                                                              e.reason)
> +        except Exception as e:
> +            print "Sending results to '%s' failed (%s)." % (report_url,
> +                                                            str(e))
\ No newline at end of file

nit: Can you please add a final new line to that file? Git complains about it.
Attachment #8385888 - Flags: review?(hskupin) → review+
https://github.com/mozilla/mozmill/commit/655854f40906d191e50696831a2126cd44d43852 (master)
https://github.com/mozilla/mozmill/commit/15b06c7aa8b4a06560eecb9334888aa87316ee9e (hotfix-2.0)

In case we have a mozmill 2.0.7 release, I also pushed this change to the hotfix--2.0 branch.

Sayan, thanks a lot for your work on this bug! Let me know if you want to work on the follow-up bug, which I will file now so we can handle all those different things better.
Blocks: 980801
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Flags: in-testsuite?
Resolution: --- → FIXED
Whiteboard: [mentor=whimboo][lang=python] → [mozmill-2.0.7][mentor=whimboo][lang=python][for a test see comment 22]
Ya I would like to work on the follow-up bug .
Product: Testing → Testing Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: