Closed Bug 554108 Opened 14 years ago Closed 14 years ago

increase n900 cycle counts

Categories

(Release Engineering :: General, defect, P4)

x86
macOS
defect

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: jhford, Assigned: jhford)

Details

Attachments

(1 file)

In order to get more stable numbers, we would like to bump the cycle count for talos tests.

These tests are from mozilla-central 4c99e481192d with mobile-browser fdb9b8bce411

Test times only include the run_tests.py total time.  I am not sure how the fixed costs of a test run compare to the incremental cost of a cycle.

tdhtml - 283s - 3 cycles
tgfx - 93s - 3 cycles
ts - 298s - 10 cycles
tsspider - 153s - 3 cycles
tsvg - 961s - 3 cycles
txul - 204s - ? cycles

talos-checkout - 156s
tools-clone - 48s
get build - 50s
untar build - 24s
TOTAL: 278s
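The per-run fixed overhead above is just the sum of the four setup steps.  A trivial sketch of that arithmetic, using the numbers reported in this bug:

```python
# Per-run fixed overhead from the measurements above (seconds).
# These are the values reported in this bug, not freshly measured.
overhead = {
    "talos-checkout": 156,
    "tools-clone": 48,
    "get build": 50,
    "untar build": 24,
}

total_overhead = sum(overhead.values())
print(total_overhead)  # 278
```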

Time to reboot:
Mostly either 3-5 minutes or 10-15 minutes, but I have seen some take as long as 50 minutes to go from rebooting to being connected back to buildbot.  The actual reboot takes 3-4 minutes.  The devices attempt to connect to the master as soon as they are on, which should trigger wlancond to find a network.  The long pole in rebooting is networking.

If we were to skip the reboot, we would likely get less stable numbers and have devices crash more frequently.  We also found that the filesystem got corrupted fairly easily on the n810s when we did not reboot, so rebooting will hopefully help with that.  If we wanted to do a filesystem format as part of the build process (which I'd highly recommend if we don't reboot), then we would need to allow at least 3-4 minutes and be prepared for devices to break often when they fail to umount the partition.

We also don't have a working Tp4 on n900s yet, so I cannot give numbers for that.  I am going to assume that it will be the longest-running test.

After speaking with Alice about number validity, it seems that 25 cycles is ideal for ts/txul-style tests and 10 cycles for tp-style tests.  I propose that we increase the cycle counts as follows (times are rough estimates):
tdhtml - 10 cycles - 943s approx
tgfx  - 10 cycles - 310s approx
ts - 25 cycles - 745s approx
tsspider - 10 cycles - 510s approx
tsvg - 5 cycles - 1601s approx
txul - ? cycles - ?s approx
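The rough estimates above appear to be straight linear extrapolations of the measured times, i.e. measured_seconds / measured_cycles * proposed_cycles.  A minimal sketch of that calculation (txul is omitted because its measured cycle count is unknown); note this ignores the fixed per-run costs discussed earlier, so real wall-clock times would be somewhat higher:

```python
# measured: (seconds, cycles) from the run against m-c 4c99e481192d.
measured = {
    "tdhtml": (283, 3),
    "tgfx": (93, 3),
    "ts": (298, 10),
    "tsspider": (153, 3),
    "tsvg": (961, 3),
}
# Proposed new cycle counts from this comment.
proposed = {"tdhtml": 10, "tgfx": 10, "ts": 25, "tsspider": 10, "tsvg": 5}

def projected_seconds(test):
    """Linearly scale the measured time to the proposed cycle count."""
    secs, cycles = measured[test]
    return int(secs / cycles * proposed[test])

for t in measured:
    print(t, projected_seconds(t))
```

Running this reproduces the approximate figures listed above (e.g. tdhtml 943s, ts 745s, tsvg 1601s).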
Assignee: nobody → jhford
Priority: -- → P3
Attached patch patch-v1 (Splinter Review)
I am not sure if this patch will increase the Txul cycle count.

This patch will increase the N810 cycle counts as well.  I am not sure if we should do this; only testing will show whether we get many more failures.  If we do end up with separate n810 and n900 config files, could I attach a full copy of the n900 config file for review?
Attachment #461413 - Flags: feedback?(anodelman)
Pretty sure I reduced these numbers to reduce a) the cycle times and b) crashing.

Have you verified that these run smoothly with these cycle counts?
Priority: P3 → P4
I agree with aki, I'd like to have a staging run that suggests that 

1 - we can run to completion with more cycles
2 - the results are statistically smoother
Attachment #461413 - Flags: feedback?(anodelman) → feedback-
Strongly suggest WONTFIX.
(In reply to comment #4)
> Strongly suggest WONTFIX.

well... let's get data from staging first, right?
(In reply to comment #6)
> In order to get more stable numbers, we would like to bump the cycle count for
> talos tests.
...
> After speaking with Alice about number validitiy, it seem that 25 cycles is
> ideal for ts/txul style tests and 10 cycles for tp style tests.  I propose that
> we increase the cycle counts to (times are rough estimate)
> tdhtml - 10 cycles - 943s approx
> tgfx  - 10 cycles - 310s approx
> ts - 25 cycles - 745s approx
> tsspider - 10 cycles - 510s approx
> tsvg - 5 cycles - 1601s approx
> txul - ? cycles - ?s approx


(In reply to comment #4)
> Strongly suggest WONTFIX.


(In reply to comment #5)
> (In reply to comment #4)
> > Strongly suggest WONTFIX.
> 
> well... let's get data from staging first, right?



alice/jhford: how do you want to proceed here?
I'm sticking with comment #3:

- need staging to prove that we can run that many cycles to completion
&
- need staging to show that the numbers would indeed be smoother
Looking at a couple of graphs, it seems like most test results are within 1% of each other.  I am going to mark this bug as WONTFIX.
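The "within 1%" check above could be made precise as the largest deviation from the mean, expressed as a fraction of the mean.  A minimal sketch (the sample values here are hypothetical placeholders, not real results from this bug):

```python
def max_relative_spread(values):
    """Largest absolute deviation from the mean, as a fraction of the mean."""
    mean = sum(values) / len(values)
    return max(abs(v - mean) for v in values) / mean

# Hypothetical per-cycle results for one test (illustrative only).
sample = [100.0, 100.5, 99.8, 100.2]
print(max_relative_spread(sample) <= 0.01)  # True: spread is under 1%
```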
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → WONTFIX
Product: mozilla.org → Release Engineering