Closed Bug 765850 Opened 12 years ago Closed 12 years ago

Compare startup(firstPaint, firstloaduri) for different values of STARTUP_USING_PRELOAD_TRIAL

Categories

(Mozilla Metrics :: Data/Backend Reports, defect)

x86_64
Windows 7
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
Unreviewed

People

(Reporter: taras.mozilla, Unassigned)

Details

(Whiteboard: [Telemetry:P1])

Attachments

(1 file)

We've set up a randomized trial of different Firefox loading mechanisms in bug 764905. Builds should be reporting STARTUP_USING_PRELOAD_TRIAL starting on 2012-06-17.
Any update on this? We're waiting on this data to decide whether to disable the feature on Aurora before the next migration. If we are going to disable it on Aurora, we should probably do so soon.
Is the channel to look at Aurora or Nightly?
Also:
1. What is firstloaduri?
2. What is STARTUP_USING_PRELOAD_TRIAL (and what do the values mean)?
3. What is the hypothesis?

It would be helpful if these were mentioned up front.
> is the channel to look at Aurora or Nightly?

Nightly

> 1. what is firstloaduri

I think this is the time it takes to load the first URI.

> what is  STARTUP_USING_PRELOAD_TRIAL (and what do the values mean?)

It's a randomly flipped value. There are 4 possible values:
0 - prefetch is enabled and preload is enabled
1 - prefetch is enabled and preload is disabled
2 - prefetch is disabled and preload is enabled
3 - prefetch is disabled and preload is disabled
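
As a minimal sketch of the encoding listed above (Python, not the code used in the trial; the function name decode_trial is hypothetical), the four values can be unpacked into two flags like this:

```python
def decode_trial(value):
    """Map a STARTUP_USING_PRELOAD_TRIAL value to (prefetch, preload) flags.

    Follows the list in this comment:
    0 = both enabled, 1 = prefetch only, 2 = preload only, 3 = both disabled.
    """
    if value not in (0, 1, 2, 3):
        raise ValueError(f"unexpected trial value: {value!r}")
    prefetch = value in (0, 1)  # prefetch is enabled for values 0 and 1
    preload = value in (0, 2)   # preload is enabled for values 0 and 2
    return prefetch, preload


if __name__ == "__main__":
    for v in range(4):
        prefetch, preload = decode_trial(v)
        print(v, "prefetch:", prefetch, "preload:", preload)
```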

> 3. What is the hypothesis?

We believe that preload is not helping in any case.  We also believe that prefetch being disabled is a net good.
Status: NEW → ASSIGNED
Hello,
(FALSE is disabled, TRUE is enabled)

                      preload
prefetch     FALSE      TRUE
   FALSE 0.2158907 0.2174653
   TRUE  0.2836624 0.2829815

The observation here is that (a) preload is split roughly 50/50 between enabled and disabled, but (b) prefetch is biased towards enabled:
prefetch
   FALSE     TRUE 
0.433356 0.566644 
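
For anyone wanting to check the arithmetic, here is a small Python sketch (not the original analysis code; the names joint and marginal are made up for illustration) that sums the four cell shares from the table above into the marginal shares for each flag:

```python
# Joint shares copied from the 2x2 table above, keyed by (prefetch, preload).
joint = {
    (False, False): 0.2158907,
    (False, True): 0.2174653,
    (True, False): 0.2836624,
    (True, True): 0.2829815,
}


def marginal(joint_shares, axis):
    """Sum the joint shares over the other flag.

    axis=0 gives the prefetch marginal, axis=1 gives the preload marginal.
    """
    out = {False: 0.0, True: 0.0}
    for key, share in joint_shares.items():
        out[key[axis]] += share
    return out


prefetch_share = marginal(joint, 0)
preload_share = marginal(joint, 1)

if __name__ == "__main__":
    print("prefetch:", prefetch_share)
    print("preload: ", preload_share)
```

The prefetch marginal should reproduce the two numbers quoted above, and the preload marginal comes out close to an even split.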

Q: Is this by design? It could well have been part of the design plan, but if not, you might want to ask why.

Also, if Z := STARTUP_USING_PRELOAD_TRIAL:
prefetch is 1 if Z is 0 or 1, else prefetch = 0
preload is 1 if Z is 0 or 2, else preload = 0
Are these numbers correlated with the startup time? Or are you just giving the distribution of the values? The distributions of the true/false values do not matter, and we can already see them on the histogram.
What we're looking for is a comparison of the startup time in each of the 4 groups (prefetch? x preload?) to see which group gives the best startup time.
Also, could you only consider values for builds after June 16, 2012?
I've completed this and will update the bug with the final results tomorrow morning (EDT).
Sounds great, thanks!
Attached file Summary of Bug
Data Used:
    if(json$info$appName == 'Firefox'
     && json$info$OS   == 'WINNT'
     && json$info$appUpdateChannel == "nightly")

where json is the Telemetry Data packet.

Dates: "20120617" ... "20120623"


uri := firstLoadURI
paint := firstPaint
(both in seconds)

prefetch:= STARTUP_USING_PRELOAD_TRIAL in (0,1)
preload:=  STARTUP_USING_PRELOAD_TRIAL in (0,2)

After dropping missing rows, all in all: 47,005 rows (1757 dropped because of missing variables).
See summary at end for a ... summary.

1. uri and paint are highly correlated (see page 1 of attached PDF), so I only looked at paint.

2. Percentiles of log(paint) are skewed

        0%         1%        10%        20%        30%        40%        50%        60%        70%        80%        90%        99%       100% 
-1.7092582 -0.7571525 -0.1098149  0.3625576  0.7309245  1.1079023  1.4747630  1.8528859  2.2452314  2.6762981  3.3448598  4.9689482 12.9452073 

To see the dependence of paint on preload and prefetch, we did a piecewise ANOVA, to avoid having to drop the top/bottom 1%.


3. There is little to no interaction between prefetch and preload (see page 2), hence the model used below. It also has the lowest BIC (Bayesian Information Criterion, used for choosing simpler models from a collection).

log(paint) ~ preload + prefetch + pwhere:preload + pwhere:prefetch

where pwhere = 0 if log(paint) is below its 1st percentile, 1 if log(paint) lies between its 1st and 99th percentiles, and 2 if log(paint) is above its 99th percentile.
and fpaint defined as above (2)
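
A rough Python sketch of the pwhere bucketing (not the original analysis code). It assumes log is the natural logarithm and takes the 1% and 99% cut points from the percentile table in (2); the names LOW_CUT, HIGH_CUT and pwhere_bucket are made up here:

```python
import math

# Cut points on the log(paint) scale, copied from the percentile table in (2).
LOW_CUT = -0.7571525   # 1st percentile of log(paint)
HIGH_CUT = 4.9689482   # 99th percentile of log(paint)


def pwhere_bucket(paint_seconds, low=LOW_CUT, high=HIGH_CUT):
    """Return 0, 1 or 2 for the left tail, the middle 1-99% region, or the right tail.

    Assumption: natural log, with paint given in seconds.
    """
    x = math.log(paint_seconds)
    if x < low:
        return 0
    if x > high:
        return 2
    return 1


if __name__ == "__main__":
    for seconds in (0.3, 4.0, 200.0):
        print(seconds, "->", pwhere_bucket(seconds))
```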

This method estimates the effect of prefetch and preload while controlling for the different behavior in the tails of log(paint).

Output of ANOVA (having removed 401 points with high influence on coefficients)

Coefficients (base is <1% for pwhere and preload=OFF and prefetch=OFF):

                                   Estimate Std. Error t value Pr(>|t|)    
    (Intercept)                   -0.952187   0.099186  -9.600   <2e-16 ***
(1)  preloadpreloadON               0.003097   0.096171   0.032   0.9743    
(2a) prefetchprefetchON            -0.033319   0.108449  -0.307   0.7587    
     pwhere1-99                     2.414770   0.099530  24.262   <2e-16 ***
     pwhere>99                      4.978823   0.161488  30.831   <2e-16 ***
(1)  preloadpreloadON:pwhere1-99   -0.024909   0.096600  -0.258   0.7965    
(1)  preloadpreloadON:pwhere>99    -0.021531   0.175782  -0.122   0.9025    
(2)  prefetchprefetchON:pwhere1-99 -0.223112   0.108836  -2.050   0.0404 *  
(2b) prefetchprefetchON:pwhere>99   0.071636   0.182834   0.392   0.6952  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 0.9757 on 46677 degrees of freedom
Multiple R-squared: 0.08732,	Adjusted R-squared: 0.08716 
(The fit is not very good)

* Interpretation
Here the base for comparison is the 1-99% region

(1) Preload=ON has little to no effect on log(paint), whatever the value of log(paint).
(2) Prefetch=ON has a small effect where log(paint) lies in the 1-99% region of its distribution.
The overall main effect (2a) is not significant because it is 'crossed': see (2b), where the effect is positive, i.e. when prefetch is ON, log(paint) is larger.

- Page 3 and 4 of attached PDF are boxplots of the data, each panel a combination of preload, prefetch and 1%,1-99% and >99% - they support (1) and (2)
- Page 5 and 6 have QQ plots supporting (1) and (2). Not seen in the above table, but visible on page 6 of the PDF, is that prefetch=ON causes paint to drop (leftmost panel 1 on page 6); this is not statistically significant.


* Summary
Prefetch decreases first paint time, but the decrease is statistically significant only when log(paint) is not in the left or right tails (bottom 1% or top 1%). The effect is to cause first paint to drop by 20%.

Also, inspecting the box-and-whisker plots (page 4): though prefetch causes paint to decrease, the variance remains unchanged.
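
For reference, a small Python sketch (not part of the original analysis) of how a coefficient on the log(paint) scale converts to a percentage change in paint. It assumes natural logs and uses only the prefetchON:pwhere1-99 estimate from the table above, which is what the 20% figure appears to correspond to; the helper name percent_change is made up:

```python
import math


def percent_change(log_coef):
    """Convert an additive effect on log(paint) into a percent change in paint.

    Assumption: natural log, so the multiplicative effect is exp(coefficient).
    """
    return (math.exp(log_coef) - 1.0) * 100.0


# Estimate for prefetchprefetchON:pwhere1-99, copied from the coefficient table.
PREFETCH_MID_COEF = -0.223112

effect = percent_change(PREFETCH_MID_COEF)

if __name__ == "__main__":
    print(f"prefetch ON in the 1-99% region: {effect:.1f}% change in first paint")
```

The non-significant main effect (2a) is left out of this calculation.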
Thanks Saptarshi!

So I think based on this data, we should stop doing preload always on Windows (since it makes no difference), and stop disabling the prefetch since having prefetch on is a benefit in most cases 1%-99%.  Can you confirm that I understood the data correctly Saptarshi?
And if the numbers do not change, let's do what Brian suggested.
Hi Brian,

Yes, the understanding is correct.

But do keep in mind that, though this does decrease the median, the variance appears to be unchanged (see page 4, middle 2 panels).

(In reply to Brian R. Bondy [:bbondy] from comment #11)
> Thanks Saptarshi!
> 
> So I think based on this data, we should stop doing preload always on
> Windows (since it makes no difference), and stop disabling the prefetch
> since having prefetch on is a benefit in most cases 1%-99%.  Can you confirm
> that I understood the data correctly Saptarshi?
(In reply to Taras Glek (:taras) from comment #12)
> I have one more theory, Saptashi, can you filter the data by
...
> 
Could you redo the analysis, but filter on
> simpleMeasures.start >=1second? I'm wondering if the fact that most startups
> are warm is skewing the data.
Sorry - Given my last comment, the further asks are actually requests for new analysis work.

Please file a new request for new work. Closing this one out.
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
(In reply to Annie Elliott from comment #16)
> Sorry - Given my last comment, the further asks are actually requests for
> new analysis work.
> 
> Please file a new request for new work. CLosing this one out.

No, we are asking to rerun the analysis.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Whiteboard: [Telemetry:P1]
As above, I am closing this out. 

Going forward, when the original ask is completed, new asks, even if based on an old ask, require a new ticket, as they require further resourcing.

This is metrics workflow - there are no open-ended tickets. 

The additional work is in https://metrics.mozilla.com/projects/browse/METRICS-826.
Assignee: sguha → nobody
Status: REOPENED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Blocks: 771745