Closed Bug 797339 Opened 7 years ago Closed 6 years ago

Talos ts_shutdown numbers look bogus, test takes >80 minutes

Categories

(Firefox for Android :: General, defect)

ARM
Android
defect
Not set

Tracking

()

RESOLVED DUPLICATE of bug 918463

People

(Reporter: gcp, Assigned: jmaher)

References

Details

Attachments

(1 file, 1 obsolete file)

Android Tegra 250 mozilla-inbound talos remote-ts

    s: tegra-279
    ts: 3229.84 (details)
    ts_shutdown: 25204690.84 (details)

Two problems here:

1) The ts_shutdown number appears to be bogus
2) This test takes >80 minutes, and generally lags badly behind all others on all TBPL. Given that it's supposed to just start and stop the application (I think??), even with a few repetitions there seems to be no reason why it should take so long.

From a wild gues we're failing to shut down the app and something just times out eventually?
FWIW, tp4m's shutdown value is similar: tp4m_shutdown_nochrome: 25204832.0

ts runs 20 Talos cycles, whereas most (all?) other Android Talos tests use 1 Talos cycle. In contrast, tp4m has 1 Talos cycle, but then repeats tests internally; it seems to do 40 page loads in about 5 minutes.
the 80 minutes includes setup, reboots, cleanup/verify and testing.

The test itself takes 61 minutes.  There is about 4 minutes for test setup (profile, first run, calibration), then 55+ minutes for the 20 cycles.  That is <3 minutes/cycle, is that really necessary, it should be <1 minute/cycle.
the test itself takes about 15 seconds total to run including launching the process and terminating the process.  The rest would be related to gathering the logs, still this should be <1 minute!
for the ts_shutdown, we have a start timestamp of:
1349251342477 <- from the python script (although pulled from the device)
1349251345975 <- from the test case

this is off by an hour...odd, really odd.

I am unable to tell what the end timestamp is, but the difference is:
25204673.58

This difference is about 10 months.  ack.  Maybe we read a timestamp from the webserver instead of the host machine or the device.  If that is the case we should check what bm-remote has for a time.
verified the devices, host controllers (foopy) and webserver (bm-remote) are all on the same (very similar) timestamp.
this patch has been tested on try server and the only suite that sees significant improvement is ts.  The total time went from 80 minutes to 23 minutes.
Attachment #667621 - Flags: review?(jhammel)
Comment on attachment 667621 [details] [diff] [review]
remove browser_wait for remote testing (1.0)


+                if not browser_config['browser_wait']:
+                    time.sleep(browser_config['browser_wait']) #wait out the browser closing

Did you mean `if not browser_config['remote']` like appears elsewhere in the bug?
Assignee: nobody → jmaher
Attachment #667621 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #667621 - Flags: review?(jhammel)
Attachment #667953 - Flags: review?(jhammel)
Comment on attachment 667953 [details] [diff] [review]
remove browser_wait for remote testing (1.1)

wfm;  a comment as to why we don't wait on remote and/or link to this bug accepted for extra credit :)
Attachment #667953 - Flags: review?(jhammel) → review+
speed up ts:
http://hg.mozilla.org/build/talos/rev/97cbf16e9846

lets leave this bug open to figure out the large ts_shutdown numbers.
(In reply to Joel Maher (:jmaher) from comment #4)
> for the ts_shutdown, we have a start timestamp of:
> 1349251342477 <- from the python script (although pulled from the device)
> 1349251345975 <- from the test case
> 
> this is off by an hour...odd, really odd.
> 
> I am unable to tell what the end timestamp is, but the difference is:
> 25204673.58

Shouldn't all the time stamps be integers? How can the difference be non-integer (xxx.58)? Are we subtracting apples from oranges?
what is reported is an average of 20 values.  I need to do some debugging on try server to see more timestamps in the logfile to determine why we have such a large number here.
Depends on: 801633
I just had to have a look...

results.py generates the "shutdown" value from int(self.endTime - self.startTime)

self.startTime is pulled from the log result __startTimestamp
self.endTime is pulled from the log result __startAfterTerminationTimestamp

Values from a recent tbpl run: 
endTime:   1352325442738
startTime: 1352296636770
difference:     28805968 (reported to graphserver)

Both startTime and endTime look to be generated by int(time.time()*1000) -- or milliseconds since the epoch, likely Jan 1/1970.

Consider:
>>> print (time.asctime(time.gmtime(1352325442738/1000)))  # endTime
Wed Nov  7 21:57:22 2012
>>> print (time.asctime(time.gmtime(1352296636770/1000)))  # startTime
Wed Nov  7 13:57:16 2012

From the tbpl log:
...
Running test tp4m: 
		Started Wed, 07 Nov 2012 13:54:45
...
Completed test tp4m: 
		Stopped Wed, 07 Nov 2012 13:57:37

and from the end of the logcat:
11-07 13:57:16.770 I/GeckoDump( 2189): __startTimestamp1352296636770__endTimestamp

It looks to me like the startTime / __startTimestamp reflects the time the test ends. The endTime is about 8 hours in the future (hopefully a timezone issue!).
(In reply to Geoff Brown [:gbrown] from comment #13)
> It looks to me like the startTime / __startTimestamp reflects the time the
> test ends. The endTime is about 8 hours in the future (hopefully a timezone
> issue!).

I confused myself there. This test ended at approx 13:57 local time (PT) or about 21:57 GMT. The end time (__startAfterTerminationTimestamp) is approximately correct. 

The start time (__startTimestamp) is off by 8 hours, as interpreted here. And that is because it is not a python time.time()*1000, but a js Date.getTime():

https://hg.mozilla.org/build/talos/file/0316cd82e0a4/talos/pageloader/chrome/report.js#l114
we are going to feature parity with desktop where we don't look for shutdown numbers on ts
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 918463
You need to log in before you can comment on or make changes to this bug.