Closed Bug 583834 Opened 14 years ago Closed 13 years ago

[report] Report should contain test related timestamps

Categories

(Testing Graveyard :: Mozmill, enhancement)

enhancement
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: whimboo, Assigned: k0scist)

References

Details

(Whiteboard: [mozmill-1.4.2-][mozmill-2.0+][mozmill-1.5.18+])

Attachments

(2 files, 1 obsolete file)

With the current report format we only have time information for the complete test-run. It contains no information we could use for performance-related analysis. I'm not sure which of the following options are easy to do and which are impossible.

1. Adding a time_start and time_end entry for each single test function inside a test module.

2. Adding time information for each single test result. I'm not sure if we should do this by default, because it will probably collect too much data. Eventually we could combine it with option 3.

3. Adding a hook that scripts can use to add that information on their own. I'm not sure if overriding an existing method of the Mozmill class is sufficient at this time. Clint, Jeff?

Option 3 would definitely be the best one. We could use it in our automation scripts to magically turn Mozmill testing into performance testing.

Geo, have I forgotten anything?
Nope, think that covers it.  Test-project-side hook scripts are useful in a number of ways, one of which is decoupling some of the implementation details of how the project reports from the mozmill dev process.  

That'll probably result in fewer "urgent, need change to support brasstacks reporting!" enhancement requests and more of a controlled migration path as good ideas get promoted from test project implementations to MozMill harness implementations.

It also may make MozMill more flexible for other test projects (other products internal to Mozilla and adoption by the outside community).
Jeff, would it already be possible for us to do this by overriding the endTest callback function?

http://github.com/mozautomation/mozmill/blob/master/mozmill/mozmill/__init__.py#L279
Assignee: nobody → jhammel
(In reply to comment #0)
> With the current report format we only have time information for the complete
> test-run. There is no information contained which we could use for performance
> related analysis. Not sure which of the following options are easy to do and
> which are impossible.
> 
> 1. Adding a time_start and time_end entry for each single test function inside
> a test module.

We currently have a startTest_listener and an endTest_listener.  Personally I prefer start_time and end_time as it's more semantic, but it's not a huge deal.  This should be easy to add in there.

Should we send seconds since epoch in the JSON?  Sadly, Python datetime objects are not JSON-serializable.
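To illustrate the serialization issue above, a minimal sketch (not Mozmill code):

```python
import datetime
import json
import time

# datetime objects cannot go straight into json.dumps
serializable = True
try:
    json.dumps({"time_start": datetime.datetime.utcnow()})
except TypeError:
    # raises "datetime.datetime(...) is not JSON serializable"
    serializable = False

# seconds since epoch is a plain float and serializes without complaint
payload = json.dumps({"time_start": time.time()})
```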

> 
> 2. Adding time information for each single test result. I'm not sure if we
> should do this by default, because it will collect probably too much data.
> Eventually we could combine it with option 3.

I also think that this is too fine a level to work at.

> 3. Adding a hook scripts can use to add that information on their own. I'm not
> sure if overriding an existent method of the Mozmill class is sufficient at
> this time. Clint, Jeff?

I'm not sure what is meant by overriding here.
 
> Option 3 would definitely be the best one. We could use it in our automation
> scripts to magically turn Mozmill testing into performance testing.
> 

I think that having pluggable events (presumably using setuptools entry points) is probably the way to go for the future.  However, after looking at the code today, I recommend against this for 1.4.2:

 * our event system is messy and needs cleaning up.  It's not documented (not a huge deal) and how it is used is implicit in the code.

 * if we're going to move towards a component architecture (read:  adding a single plugin point doesn't make a component architecture, exactly, but it does move us towards one), that is to say, a slotted system, then the first thing to do is to fix and probably formalize the event system a bit.  This is probably not a good thing to do at the last minute before a release.  I know, I know, it would be nice if 1.4.2 could contain *everything*! But the fact is, our listener system is, IMHO, not ready for it, and I would not want to tackle adding plugins without having the event system ready.  Please, we have enough hacks and workarounds already.

 * as I was talking about with :geo, a slotted system is only as good as the slots you choose.  We should do this, IMHO, but we should do it with some care and genius and not as a last minute addition.

 * we have a way of adding python module callbacks.  It's really hacky, but it is already there. Does it help with this bug?  No.  And that's exactly why I want to think about the right way of doing things and clean up our listener system (at least the python side) before adding something else that doesn't quite do what we want going forward.
Thanks for the analysis, Jeff! I feel the same. Let's not move it into 1.4.2 then. The risk is too high for this type of change.
Plussed for option 1
Whiteboard: [mozmill-1.4.2?] → [mozmill-1.4.2+]
Should we just record the time passed for a test? Is that sufficient?
Also, I'll say ahead of time the way I plan on doing this, just so that the scope is perfectly clear and people won't expect more than the nuts and bolts that I'm going to do:

 1. A test starts.  startTests_listener will record the time (keyed on a hash of the test name and the test file) on the harness instance (self).

 2. A test ends.  The harness will check the current time.  The test's starting time will be popped from the hash, if it exists.  The end time minus the start time (in seconds) will be recorded on the test if the hash entry is found.  If it is not found, a null will be recorded.
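A minimal sketch of the two steps above (names are illustrative, not the actual Mozmill listener signatures):

```python
import time

class DurationTracker:
    """Record per-test durations keyed on (test file, test name)."""

    def __init__(self):
        self.starts = {}  # (test_file, test_name) -> start time in seconds

    def start_test(self, test_file, test_name):
        # step 1: remember when this test started
        self.starts[(test_file, test_name)] = time.time()

    def end_test(self, test_file, test_name):
        # step 2: pop the start time if present; report null (None) otherwise
        start = self.starts.pop((test_file, test_name), None)
        if start is None:
            return None
        return time.time() - start
```

A lost start event simply yields None here rather than raising, which matches the "don't polish the edge cases" approach described below.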

Note:

 - this is not foolproof.  If bad things happen, it is entirely possible that the hash entry won't be found.  I don't care about this because ...

 - this is not the correct way of fixing this.  The listener system should be pluggable, for one.  For another, we should have a way of matching up tests.  The entire harness needs to be more robust.  So I don't think doing anything more complicated than this is a good idea for 1.4.2.

I was going to write a blog post about the perils of polished one-offs, but as a one-sentence summary, I don't have a problem with doing a one-off if it is quick and fast.  But fixing something the wrong way (as I will be doing here) on an unstable platform means that this can only be quick and fast if you don't polish the edge cases.  So let's either:

 A) do this and not polish the edge cases (e.g. when things go bad)
 - or - 
 B) punt on this for 1.4.2 and fix it correctly when we have the chance to make the platform more stable and maintainable.

I'm not interested in making this bug work for all possible cases.  Zilch. Zero. None. </rant>
I'd rather store primary data (start/end) and display derived data (duration).  It's somewhat more flexible.  The other problem with duration-only is that rounding errors stack, making the run duration != the sum of the result durations.

Re: option 2, I was the one who suggested that to Henrik.  My basic position is when you have to debug something primarily through logs after the fact, timestamps are awfully nice to have for forensic purposes.  If you don't generally have to do this, I can understand why you wouldn't see the point.

To be clear, though, I was talking about timestamping everything, including DEBUG/INFO.  My eventual vision for the results dashboard includes being able to get a verbose --showall-style view of each test run including the ability for the test to log runtime commentary via those mechanisms or otherwise.

That approach also gets us away from having to special case "this line has this information, this line has that information" and just store/display the test run in a row/col drilldown matrix with consistent columns.

With that in mind, not sure it'd be too much data to store.  It'd probably be a small percentage of the overall data.
I second that. Having the start and end time (UTC) logged would be a better way. If we are limited to the duration, we can't work with the real time. With the real time for a test we can simply calculate the duration if needed.

It would be great to get two entries under each test result in the report, which look like this:

time_start: 2010-08-02T22:27:21Z
time_end: 2010-08-02T22:27:25Z
Those timestamps use the format introduced by bug 584203. I will add the bug to the dependency list.
Depends on: 584203
It sounds like there's some discussion to be had here.  Since this is a new feature, we haven't pinned down what that feature is, and there is now an unresolved dependency, I recommend putting this off to past 1.4.2 when it can be done correctly.
Well, the dependency doesn't really affect the code for this particular feature. You simply have to use the same timestamp format.

I thought we agreed on option 1, as also given in comment 5?
No longer depends on: 584203
So there are a few issues here:

- there is the timestamp issue;  I'm not sure why the dependency was removed, since if we're going to have unified timestamps, that should probably be resolved first.  This means we have two additional last-minute features for 1.4.2.

- in comment 7, I proposed how I was planning on resolving this bug as a one-off, just making sure that that was sufficient;  given Geo's comments in comment 8, it is not clear to me that what I proposed is in fact sufficient to the need.

- given the above two, and the fact that more time has been taken to discuss the issue than it would take to implement option 1, it seems like this shouldn't be done unless it's really important.  I am much more interested in getting 1.4.2 out the door than trying to throw in some last-minute features when it seems like there is discussion that still needs to take place, especially since, following 1.4.2, refactorings will take place that will make it possible to implement these features in a robust manner.
I'll weigh in on the unified timestamps thing in that bug.  If we decide it's a blocker, yeah, we should replace the dependency.

I'd rather have timestamps per test func (option 1) than not at all.  I don't have a strong opinion on whether we need them for 1.4.2, but it's a "would be nice" definitely.

I threw out my motivations re: option 2 because it seemed like it was getting dismissed out of hand, and also because it might influence how we'd implement option 1.  But I think there's something to be said for adding the functionality incrementally to get max value soonest.
We absolutely can't do option 2 in 1.4.2.  

The only thing we can take here on 1.4.2 is the simplest option 1 fix that Jeff outlined in comment 7.  We can do it without the duration piece if you want, which actually makes it even easier.

So, do you want that ^ or not?

If not, then I'm very happy to fix this the right way post 1.4.2.
I've heard no justifications for taking the hacky one-off fix, and the more I think about it and the more I talk about it with Jeff, the less comfortable we feel in doing this.  We need to take a step back, design this feature from the ground up so that we can be sure we're getting Geo what he really needs here.  We need to do that in a release after 1.4.2 as I really don't think this is important enough to delay 1.4.2 for.  We need to go ahead and release 1.4.2 and get it out the door.  Minusing.
Depends on: 584524, 584203
Whiteboard: [mozmill-1.4.2+] → [mozmill-1.4.2-]
Assignee: jhammel → nobody
Not sure what the status/priority is here, but I'm taking myself off as assignee as I'm not currently working on it (nor have been for a while).
For master, this should be really easy to add to the TestResults class, added for bug 585106. It wouldn't be hard for the 1.5 series, but it would involve adding things to the test dict.  Not necessarily horrible.
Summary: Report should contain test related timestamps → [report] Report should contain test related timestamps
Severity: normal → enhancement
Whiteboard: [mozmill-1.4.2-] → [mozmill-1.4.2-][mozmill-2.0?]
Whiteboard: [mozmill-1.4.2-][mozmill-2.0?] → [mozmill-1.4.2-][mozmill-2.0+]
Assignee: nobody → jhammel
So, just making sure since this bug hasn't been touched in a while: we want per-test-function timestamps?
Nothing has changed on our side, and that information would be kinda helpful for us.
This records the test times python-side and sends them with the report.  However, testing with a fast test shows this probably isn't good enough, as the test times are identical down to the second. We will need to keep more precision than this.  Additionally, it's probably better to get the times JS-side, as there could be inaccuracy in the processing of events (I think?)

I'll look at this closer going forward.  In the meantime, feedback appreciated.
Attachment #521731 - Flags: feedback+
So, regardless of the fact that timestamps should probably be done JS-side, we have an issue.  Currently, we send the test timestamp as a string using strftime: "2011-03-25T11:28:25Z". `man date` on my system gives '%N' as nanoseconds; however, this doesn't seem to be supported in Python's datetime.datetime.strftime.  In short, I can see no way of using strftime to get sub-second precision.  So, what can we do?

- instead of a formatted date, send seconds since (UTC) epoch; brasstacks then becomes responsible for doing what it wants with the date
- instead of a formatted date, send a (UTC) time tuple; again, brasstacks can do what it wants with it
- deal with limit of 1s precision
- don't use strftime in report.py but format in our own special way; i'm fairly against this option, as currently the date format is stored as a user-controllable string.  It's well understood and involves a single function call.  If we go to special formatting, not only will the user have no control over how datetimes are serialized, but each change will be not only a programmatic change but will probably also require a server-side change :(

So, in short, I'm fine with the first three options and very leery of the fourth.  Since, however, this really has little to do with this bug, I'll put this up for review soon and we can figure out these details later.
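For what it's worth, datetime.datetime.strftime does accept a `%f` directive for zero-padded microseconds (though, as noted, there is no `%N` for nanoseconds); whether relying on it beats the options above is a separate question:

```python
import datetime

# a fixed datetime with a microsecond component, for demonstration
stamp = datetime.datetime(2011, 3, 25, 11, 28, 25, 447808)

# %f expands to the six-digit microsecond field
formatted = stamp.strftime("%Y-%m-%dT%H:%M:%S.%fZ")
# -> '2011-03-25T11:28:25.447808Z'
```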
What would this time tuple look like? Does it contain milliseconds?
Comment on attachment 521731 [details] [diff] [review]
get and report per-function timestamps on the python side

I'm going to go ahead and put this up for review.  Regardless of our solution to the sub-second precision problem (which existed before this bug) I think the general form is worth reviewing.
Attachment #521731 - Flags: review?(ahalberstadt)
Attachment #521731 - Flags: feedback?(ctalbert)
Attachment #521731 - Flags: feedback+
(In reply to comment #24)
> How would this time tuple look like? Does it contain milliseconds?

Well, I thought it contained microseconds, but it turns out it does not:

>>> datetime.datetime.now().timetuple()
time.struct_time(tm_year=2011, tm_mon=3, tm_mday=25, tm_hour=13, tm_min=41, tm_sec=52, tm_wday=4, tm_yday=84, tm_isdst=-1)

datetime objects do have a .microsecond attribute, so they do have more refined information, just not in the time tuple.

I'm coming more and more to thinking we should just use seconds since epoch as our internal data anyway.  It's simple.  It's easy to do math with. And it's easy to get in python or js:

python: import time; time.time()
JS: foo = new Date(); foo.getTime()/1000.; 

If no one disagrees, I'm happy to file a bug about this.  Reporting could still format it however we want.  There are kinda two joined discussions here:
- what do we use internally to store the date? (Currently a datetime object)
- how does report.py serialize this object? (Currently .strftime(date_format))
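A sketch of how those two pieces could fit together, assuming the epoch-seconds route (the names here are hypothetical, not report.py's actual API):

```python
import datetime
import time

# internal storage: float seconds since epoch, microsecond precision
time_start = time.time()
time_end = time.time()

# math on the raw floats stays trivial
duration = time_end - time_start

# report-side serialization can still use a user-controllable format string
date_format = "%Y-%m-%dT%H:%M:%SZ"
formatted = datetime.datetime.utcfromtimestamp(time_start).strftime(date_format)
```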
Comment on attachment 521731 [details] [diff] [review]
get and report per-function timestamps on the python side


(In reply to comment #26)
> (In reply to comment #24)
> > How would this time tuple look like? Does it contain milliseconds?
> 
> Well, I thought it contained microseconds, but it turns out it does not:
> 
> >>> datetime.datetime.now().timetuple()
> time.struct_time(tm_year=2011, tm_mon=3, tm_mday=25, tm_hour=13, tm_min=41,
> tm_sec=52, tm_wday=4, tm_yday=84, tm_isdst=-1)
> 
> datetime objects do have a .microsecond attribute , so they do have more
> refined information, just not in the time tuple.
> 
> I'm getting more and to thinking we should just use seconds since epoch as what
> we use as internal data anyway.  Its simple.  Its easy to do math with. And its
> easy to get in python or js:
> 
> python: import time; time.time()
> JS: foo = new Date(); foo.getTime()/1000.; 
> 
The only way to get the python time functionality out is to convert it to seconds from epoch, which means that we're never going to have very good precision.  I think we have to measure from javascript so that we can obtain a time in milliseconds since epoch.  And what we should report is milliseconds since epoch.  Any reporting system that uses the web can use javascript and easily deal with such a value.

So, f-, let's do this in JS cause python just isn't going to be sufficient.
Attachment #521731 - Flags: feedback?(ctalbert) → feedback-
(In reply to comment #26)
> > How would this time tuple look like? Does it contain milliseconds?
> 
> Well, I thought it contained microseconds, but it turns out it does not:

Hm, a lot of our tests are really fast, so we wouldn't have the right resolution to determine the duration of such a test. With isoformat() we would have a standardized string which gets transmitted and which includes milliseconds.

>>> datetime.datetime.utcnow()
datetime.datetime(2011, 3, 25, 21, 13, 53, 688010)


(In reply to comment #26)
> >>> datetime.datetime.now().timetuple()
> time.struct_time(tm_year=2011, tm_mon=3, tm_mday=25, tm_hour=13, tm_min=41,
> tm_sec=52, tm_wday=4, tm_yday=84, tm_isdst=-1)

I'm not sure if JS can handle that format. If we transformed it into a string in the report module, would the milliseconds be lost? I would assume so.
(In reply to comment #28)
> >>> datetime.datetime.utcnow()
> datetime.datetime(2011, 3, 25, 21, 13, 53, 688010)

Sorry, that was the wrong content. I intended to post the following:

>>> datetime.datetime.utcnow().isoformat()
'2011-03-25T21:14:16.447808'
(In reply to comment #27)

> The only way to get the python time functionality out is to convert it to
> seconds from epoc which means that we're never going to have very good
> precision.

I don't think so? AFAIK, python keeps stuff down to the microsecond:

>>> time.time()
1301088147.525948

Hence the "if we want seconds since epoch" examples.
(In reply to comment #28)
> (In reply to comment #26)
> > > How would this time tuple look like? Does it contain milliseconds?
> > 
> > Well, I thought it contained microseconds, but it turns out it does not:
> 
> Hm, given that a lot of our tests are really fast and we wouldn't have the
> right resolution to determine the duration of such a test. With isoformat() we
> would have a standardized string which gets transmitted and which includes
> milliseconds.

Microseconds, I believe you mean.

> >>> datetime.datetime.utcnow()
> datetime.datetime(2011, 3, 25, 21, 13, 53, 688010)
> 
> 
> (In reply to comment #26)
> > >>> datetime.datetime.now().timetuple()
> > time.struct_time(tm_year=2011, tm_mon=3, tm_mday=25, tm_hour=13, tm_min=41,
> > tm_sec=52, tm_wday=4, tm_yday=84, tm_isdst=-1)
> 
> I'm not sure if JS can handle that format. If we would transform it into a
> string in the report module, would the milliseconds be lost? I would assume so.

Milliseconds -- or microseconds -- are stupidly not available in the time tuple.  We would have to add them ourselves :/ They're on the datetime object but stupidly inaccessible in any reasonable way :(
(In reply to comment #29)
> (In reply to comment #28)
> > >>> datetime.datetime.utcnow()
> > datetime.datetime(2011, 3, 25, 21, 13, 53, 688010)
> 
> Sorry, that was the wrong content. I intended to post the following:
> 
> >>> datetime.datetime.utcnow().isoformat()
> '2011-03-25T21:14:16.447808'

To reiterate, I am fine going this route (serializing to strings this way) with the caveat that if we do this, I don't want to be futzing with formatting ever again.  As idiotic as strftime is, and as idiotic as python is for not including %N to display nanoseconds, it's at least somewhat malleable.  Once we go with the above... we're done and shouldn't change it again (and note that it is missing the infamous Z).
Another thing to think about: do we really care about start and stop times, or do we just care about durations?  If only the latter, this whole problem becomes much easier.  We would still have the date when the test run was uploaded.
(In reply to comment #32)
> (In reply to comment #29)
> > (In reply to comment #28)
> > > >>> datetime.datetime.utcnow()
> > > datetime.datetime(2011, 3, 25, 21, 13, 53, 688010)
> > 
> > Sorry, that was the wrong content. I intended to post the following:
> > 
> > >>> datetime.datetime.utcnow().isoformat()
> > '2011-03-25T21:14:16.447808'
> 
> To reiterate, I am fine going this route (serializing to strings this way) with
> the caveat that if we do this I don't want to be futzing with formatting ever
> again.  As idiotic as strftime is and as idiot as python is for not including
> the %N to display nanoseconds, its at least somewhat malleable.  Once we go
> with the above....we're done and shouldn't change it again (and note that it is
> missing the infamous Z).
Why the hell are we putting things in strings?  This is a framework, not a reporting engine.  The reporting plugins should be responsible for formatting the time any way you want.  The framework's responsibility is to give you the most detailed information it can in a format you can parse.

The most detailed information we can get is milliseconds from epoch, and it is readily available in JS.  That is what the framework should expose.  Then the report plugins can do whatever display mechanics they want on that time.

I think in order to have maximum versatility, the framework should report the following:
* overall test run start timestamp
* individual test start timestamp (per test)
* individual test end timestamp (per test)
* overall test run end timestamp

Calculating duration is a function of reporting and should be handled in the report plugins.
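The report shape this suggests can be sketched as follows, with milliseconds since epoch matching what JS's Date.now() returns (the field names here are illustrative, not the actual report schema):

```python
import time

def ms_since_epoch():
    """Milliseconds since epoch, the same value JS's Date.now() returns."""
    return int(time.time() * 1000)

# overall test run start timestamp
report = {"time_start": ms_since_epoch(), "tests": []}

# individual test start/end timestamps (per test)
test = {"name": "testFoo", "time_start": ms_since_epoch()}
test["time_end"] = ms_since_epoch()
report["tests"].append(test)

# overall test run end timestamp
report["time_end"] = ms_since_epoch()

# duration is derived data, computed by the report plugin, not the framework
duration_ms = test["time_end"] - test["time_start"]
```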
The conversion to string happens in the report module (report.py).  Sorry if that wasn't clear.

...
> * individual test start timestamp (per test)
> * individual test end timestamp (per test)

Per test (== file) or per function?  I was under the impression that we were doing per function, but...
Revisiting this bug.  I'm still not sure why we are doing timestamps for each event, vs. e.g. a single timestamp for run start and then the duration of each test and of the run.  If desired, I can do full timestamps (milliseconds since epoch) for everything, but I'm not sure I see the need.
Jeff:

I already addressed this.

https://bugzilla.mozilla.org/show_bug.cgi?id=583834#c8

It's information I've frequently found useful in the past. I miss having it here when I try to diagnose test breakage. 

If we have a hook that we could use to annotate our results ourselves with extra fields (note that I don't really consider doing a whole custom subclass "a hook") I might be willing to go that route, but it is something I'd like us to have.
Milliseconds since epoch for test and run start/end times it is. We can further consider adding more once these are in place (suggestion).
This gathers start and stop times per test function in the Results class. These are stored in ms since epoch. report.py was deliberately not updated to cast these to strings so that the full precision is kept.
Attachment #521731 - Attachment is obsolete: true
Attachment #541146 - Flags: review?(fayearthur+bugs)
Attachment #521731 - Flags: review?(halbersa)
Comment on attachment 541146 [details] [diff] [review]
get ms since epoch on the JS side

Works great; a small detail you don't really have to change:

>+  var d = new Date();
>+  test.__end__ = d.getTime();

Date.now() works the same and is a bit shorter.
Attachment #541146 - Flags: review?(fayearthur+bugs) → review+
pushed to master: https://github.com/mozautomation/mozmill/commit/e5d8f883747b092b43543b69d431143705d14812

Thanks for the heads-up about Date.now, :harth. I had completely forgotten about that.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Thanks, really appreciate the faith in the change. :) I agree, let's see how this works out, then make incremental changes later as needed.
Per the agreement in comment 15, I request a backport to Mozmill 1.5.x. It would be kinda useful for us. Thanks.
Whiteboard: [mozmill-1.4.2-][mozmill-2.0+] → [mozmill-1.4.2-][mozmill-2.0+][mozmill-1.5.5?]
Whiteboard: [mozmill-1.4.2-][mozmill-2.0+][mozmill-1.5.5?] → [mozmill-1.4.2-][mozmill-2.0+][mozmill-1.5.6?]
This is an easy backport which we have never done for 1.5. Lately it's getting more important for us to see how long individual tests take. So I would kinda like to see this included in 1.5.18.
Whiteboard: [mozmill-1.4.2-][mozmill-2.0+][mozmill-1.5.6?] → [mozmill-1.4.2-][mozmill-2.0+][mozmill-1.5.18+]
Pointer to Github pull-request
Comment on attachment 655051 [details]
Patch v1.0 (hotfix-1.5)

Simple backport which gives us the helpful feature on hotfix-1.5.
Attachment #655051 - Attachment description: Pointer to Github pull request: https://github.com/mozilla/mozmill/pull/90/files → Patch v1.0 (hotfix-1.5)
Attachment #655051 - Flags: review?(jhammel)
Blocks: 785442
Comment on attachment 655051 [details]
Patch v1.0 (hotfix-1.5)

looks good
Attachment #655051 - Flags: review?(jhammel) → review+
Jeff, shall we drop the change from mozrunner/__init__.py? That would require another mozrunner release, which will definitely cause troubles again that I kinda would like to avoid.
So the change in mozrunner is actually not related to the rest of the bug.  It is a text string fix.  

Strictly speaking, this change doesn't actually require a mozrunner release:  it is not an API change and it does not change functionality.  It is just a text string fix.

If this were mozbase/mozrunner I certainly wouldn't release.  That said, it's mozmill/mozrunner for 1.5, so I don't really care.

Personally, I'd push the change and not release.  Though I also probably wouldn't be fixing text strings for mozmill/mozrunner 1.5
(In reply to Jeff Hammel [:jhammel] from comment #49)
> Personally, I'd push the change and not release.  Though I also probably
> wouldn't be fixing text strings for mozmill/mozrunner 1.5

Yeah. That's what I've also been thinking about for the last two hours. Let's get it in and not release. If we need another mozrunner release, it will be included.

Landed on hotfix-1.5:
https://github.com/mozilla/mozmill/commit/8013c30fbb58333521e2cbdc4b80c9a8b1c6ac24
Product: Testing → Testing Graveyard