Bug 765850 - Compare startup(firstPaint, firstloaduri) for different values of STARTUP_USING_PRELOAD_TRIAL
Status: RESOLVED FIXED
[Telemetry:P1]
Product: Mozilla Metrics
Classification: Other
Component: Data/Backend Reports
Version: unspecified
Platform: x86_64 Windows 7
Importance: -- normal
Target Milestone: Unreviewed
Assigned To: Nobody; OK to take it and work on it
Depends on: 764905
Blocks: 771745
Reported: 2012-06-18 11:46 PDT by (dormant account)
Modified: 2012-08-07 02:11 PDT


Attachments
Summary of Bug (319.74 KB, application/pdf)
2012-06-28 10:10 PDT, "Saptarshi Guha[:joy]"

Description (dormant account) 2012-06-18 11:46:47 PDT
We've set up a randomized trial of different Firefox loading mechanisms in bug 764905. Builds should be reporting STARTUP_USING_PRELOAD_TRIAL starting on 2012-06-17.
Comment 1 Brian R. Bondy [:bbondy] 2012-06-25 06:24:25 PDT
Any update on this? We're waiting on this data to decide whether to disable the feature on Aurora before the next migration. If we are going to disable it on Aurora, we should probably do so soon.
Comment 2 "Saptarshi Guha[:joy]" 2012-06-25 09:19:03 PDT
Is the channel to look at Aurora or Nightly?
Comment 3 "Saptarshi Guha[:joy]" 2012-06-25 09:41:45 PDT
Also,
1. What is firstloaduri?
2. What is STARTUP_USING_PRELOAD_TRIAL (and what do the values mean)?
3. What is the hypothesis?

It would be helpful if these were mentioned up front.
Comment 4 Brian R. Bondy [:bbondy] 2012-06-25 12:05:14 PDT
> is the channel to look at Aurora or Nightly?

Nightly

> 1. what is firstloaduri

I think this is the time it takes to load the first uri

> what is  STARTUP_USING_PRELOAD_TRIAL (and what do the values mean?)

It's a randomly flipped value. There are 4 possible values:
0 - prefetch is enabled and preload is enabled
1 - prefetch is enabled and preload is disabled
2 - prefetch is disabled and preload is enabled
3 - prefetch is disabled and preload is disabled

> 3. What is the hypothesis?

We believe that preload is not helping in any case.  We also believe that prefetch being disabled is a net good.
Comment 5 "Saptarshi Guha[:joy]" 2012-06-26 14:21:26 PDT
Hello,
(FALSE is disabled, TRUE is enabled)

                      preload
prefetch     FALSE      TRUE
   FALSE 0.2158907 0.2174653
   TRUE  0.2836624 0.2829815

The observation here is that (a) preload is split 50/50 between enabled and disabled, but (b) prefetch is biased towards enabled:
prefetch
   FALSE     TRUE 
0.433356 0.566644 

Q: Is this by design? It could well have been part of the design plan, but if not, you might want to ask why.
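
The prefetch split quoted above is just the row sums of the 2x2 table. A minimal Python sketch of that arithmetic, with the cell proportions copied from the table; the names joint and marginal are made up here for illustration:

```python
# Joint proportions copied from the 2x2 table above.
# Keys are (prefetch_enabled, preload_enabled).
joint = {
    (False, False): 0.2158907,
    (False, True): 0.2174653,
    (True, False): 0.2836624,
    (True, True): 0.2829815,
}


def marginal(joint_table, axis):
    """Sum the joint proportions over the other factor.

    axis=0 gives the prefetch marginal, axis=1 the preload marginal.
    """
    totals = {False: 0.0, True: 0.0}
    for key, proportion in joint_table.items():
        totals[key[axis]] += proportion
    return totals


prefetch_marginal = marginal(joint, 0)
preload_marginal = marginal(joint, 1)

print("prefetch:", prefetch_marginal)
print("preload: ", preload_marginal)
```

The same function applied to the other axis gives the preload split, which is the near 50/50 one.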

Also,
if Z := STARTUP_USING_PRELOAD_TRIAL
prefetch is 1 if Z is 0 or 1, else prefetch = 0
preload is 1 if Z is 0 or 2, else preload = 0
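
As a rough sketch, the mapping from the trial value to the two flags could be written as follows in Python. The function name decode_trial is hypothetical and is not part of Telemetry or of the analysis code:

```python
def decode_trial(z):
    """Map a STARTUP_USING_PRELOAD_TRIAL value to (prefetch, preload) booleans.

    Follows the rule stated in this bug: prefetch is on for values 0 and 1,
    preload is on for values 0 and 2.
    """
    if z not in (0, 1, 2, 3):
        raise ValueError("unexpected STARTUP_USING_PRELOAD_TRIAL value: %r" % (z,))
    prefetch = z in (0, 1)
    preload = z in (0, 2)
    return prefetch, preload


for value in range(4):
    print(value, decode_trial(value))
```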
Comment 6 Brian R. Bondy [:bbondy] 2012-06-26 16:22:53 PDT
Are these numbers correlated with the startup time? Or are you just giving the distribution of the values?  The distributions of the true/false values do not matter and we can already see that on the histogram.
Comment 7 Brian R. Bondy [:bbondy] 2012-06-26 16:39:59 PDT
What we're looking for is a comparison of the startup time in each of the 4 groups (prefetch? x preload?) to see which group gives the best startup time.
Also, could you consider only values for builds after June 16, 2012?
Comment 8 "Saptarshi Guha[:joy]" 2012-06-27 14:19:43 PDT
I've completed this and will update the bug with the final results tomorrow morning (EDT).
Comment 9 Brian R. Bondy [:bbondy] 2012-06-27 18:06:29 PDT
Sounds great, thanks!
Comment 10 "Saptarshi Guha[:joy]" 2012-06-28 10:10:15 PDT
Created attachment 637566
Summary of Bug


Data Used:
    if(json$info$appName == 'Firefox'
     && json$info$OS   == 'WINNT'
     && json$info$appUpdateChannel == "nightly")

where json is the Telemetry Data packet.
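
For readers who do not use R, here is a minimal Python sketch of the same filter, assuming each Telemetry packet has already been parsed into a dict. The field names come from the condition above; the function name in_sample and the example packets are invented for illustration:

```python
def in_sample(packet):
    """Return True for Firefox / WINNT / nightly packets, mirroring the filter above."""
    info = packet.get("info", {})
    return (
        info.get("appName") == "Firefox"
        and info.get("OS") == "WINNT"
        and info.get("appUpdateChannel") == "nightly"
    )


# Made-up packets, only to exercise the predicate.
packets = [
    {"info": {"appName": "Firefox", "OS": "WINNT", "appUpdateChannel": "nightly"}},
    {"info": {"appName": "Firefox", "OS": "Linux", "appUpdateChannel": "nightly"}},
    {"info": {"appName": "Firefox", "OS": "WINNT", "appUpdateChannel": "aurora"}},
]
kept = [p for p in packets if in_sample(p)]
print(len(kept), "of", len(packets), "packets kept")
```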

Dates: "20120617" ... "20120623"


uri := firstLoadURI
paint := firstPaint
(both in seconds), 

prefetch:= STARTUP_USING_PRELOAD_TRIAL in (0,1)
preload:=  STARTUP_USING_PRELOAD_TRIAL in (0,2)

After dropping rows with missing variables (1,757 rows dropped), 47,005 rows remain.
See the summary at the end.

1. uri and paint are highly correlated (see page 1 of the attached PDF), so I only looked at paint.

2. Percentiles of log(paint) are skewed

        0%         1%        10%        20%        30%        40%        50%        60%        70%        80%        90%        99%       100% 
-1.7092582 -0.7571525 -0.1098149  0.3625576  0.7309245  1.1079023  1.4747630  1.8528859  2.2452314  2.6762981  3.3448598  4.9689482 12.9452073 
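
To read these percentiles on the original scale, they can be exponentiated back to seconds. A small Python sketch, assuming log here is the natural logarithm (R's default for log()); the values are copied from the table above and the variable names are illustrative only:

```python
import math

# Percentiles of log(paint), copied from the table above (key = percentile).
log_paint_percentiles = {
    0: -1.7092582,
    1: -0.7571525,
    10: -0.1098149,
    20: 0.3625576,
    30: 0.7309245,
    40: 1.1079023,
    50: 1.4747630,
    60: 1.8528859,
    70: 2.2452314,
    80: 2.6762981,
    90: 3.3448598,
    99: 4.9689482,
    100: 12.9452073,
}

# Assumption: natural log, so exp() undoes the transform and yields seconds.
paint_seconds = {p: math.exp(v) for p, v in log_paint_percentiles.items()}

for p in sorted(paint_seconds):
    print("%3d%%  %.3f s" % (p, paint_seconds[p]))
```

Under that assumption the median first paint comes out at roughly 4.4 seconds, and the long right tail is easier to see in seconds than on the log scale.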

To see the dependence of paint on preload and prefetch, we did a piecewise ANOVA, to avoid having to drop the top/bottom 1%.


3. There is little to no interaction between prefetch and preload (see page 2), hence the model used below, which also has the lowest BIC (Bayesian Information Criterion, for choosing simpler models from a collection):

log(paint) ~ preload + prefetch + pwhere:preload + pwhere:prefetch

where pwhere = 0 if log(paint) is below the 1st percentile of log(paint), 1 if log(paint) is between the 1st and 99th percentiles, and 2 if log(paint) is above the 99th percentile.
and fpaint defined as above (2)

This method estimates the effect of prefetch and preload while controlling for the different behavior in the tails of log(paint).
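
A minimal Python sketch of how the pwhere indicator could be built from the data. This is not the code used for the analysis: the helper names are hypothetical, and the percentile rule (linear interpolation between order statistics) is an assumption, since the bug does not say which rule was used:

```python
def quantile(sorted_values, q):
    """Quantile by linear interpolation between order statistics (an assumed rule)."""
    position = (len(sorted_values) - 1) * q
    lower = int(position)  # floor, since position is non-negative
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] * (1 - fraction) + sorted_values[upper] * fraction


def pwhere_labels(log_paint):
    """Label each value 0 (below 1st percentile), 1 (middle) or 2 (above 99th)."""
    ordered = sorted(log_paint)
    p1 = quantile(ordered, 0.01)
    p99 = quantile(ordered, 0.99)
    return [0 if x < p1 else 2 if x > p99 else 1 for x in log_paint]


# Synthetic values, only to show the shape of the output.
values = [float(i) for i in range(1000)]
labels = pwhere_labels(values)
print(labels.count(0), labels.count(1), labels.count(2))
```

Keeping the tails as their own levels, rather than trimming them, is what lets the model keep all rows while still fitting separate effects for the bulk of the distribution.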

Output of ANOVA (having removed 401 points with high influence on coefficients)

Coefficients (base is <1% for pwhere and preload=OFF and prefetch=OFF):

                                    Estimate Std. Error t value Pr(>|t|)    
     (Intercept)                   -0.952187   0.099186  -9.600   <2e-16 ***
(1)  preloadpreloadON               0.003097   0.096171   0.032   0.9743    
(2a) prefetchprefetchON            -0.033319   0.108449  -0.307   0.7587    
     pwhere1-99                     2.414770   0.099530  24.262   <2e-16 ***
     pwhere>99                      4.978823   0.161488  30.831   <2e-16 ***
(1)  preloadpreloadON:pwhere1-99   -0.024909   0.096600  -0.258   0.7965    
(1)  preloadpreloadON:pwhere>99    -0.021531   0.175782  -0.122   0.9025    
(2)  prefetchprefetchON:pwhere1-99 -0.223112   0.108836  -2.050   0.0404 *  
(2b) prefetchprefetchON:pwhere>99   0.071636   0.182834   0.392   0.6952  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 0.9757 on 46677 degrees of freedom
Multiple R-squared: 0.08732,	Adjusted R-squared: 0.08716 
(The fit is not very good)

* Interpretation
Here the base for comparison is the 1-99% region

(1) Preload=ON has little to no effect on log(paint), whatever the value of log(paint).
(2) Prefetch=ON has a small effect where log(paint) lies in the 1-99% range of its distribution.
The overall main effect (2a) is not significant because it is 'crossed': see (2b), where the effect is positive, i.e. when prefetch is ON, log(paint) is larger.

- Pages 3 and 4 of the attached PDF are boxplots of the data, each panel a combination of preload, prefetch and <1%, 1-99% and >99%. They support (1) and (2).
- Pages 5 and 6 have QQ plots supporting (1) and (2). Not seen in the above table, but visible on page 6 of the PDF, is that prefetch=ON causes paint to drop (left-most panel 1 on page 6). This is not statistically significant.


* Summary
Prefetch decreases first-paint time, but the decrease is statistically significant only when log(paint) is not in the left or right tails (bottom 1% or top 1%). The effect is to cause first paint to drop by 20%.

Also, inspecting the box-and-whisker plots (page 4), though prefetch causes paint to decrease, the variance remains unchanged.
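
Because the model is fit on log(paint), a coefficient b corresponds to a multiplicative change of exp(b) in paint. The Python sketch below, assuming natural logs, converts the prefetch coefficients from the table above into percentage changes. The 20% figure is consistent with the 1-99% interaction coefficient (2) taken on its own; adding the non-significant main effect (2a) gives a somewhat larger drop. Which of the two was intended is not stated in this bug, so both are shown:

```python
import math

# Coefficients copied from the ANOVA table above.
main_effect = -0.033319        # (2a) prefetchprefetchON
interaction_1_99 = -0.223112   # (2)  prefetchprefetchON:pwhere1-99


def percent_change(log_effect):
    """Convert an additive effect on log(paint) into a percent change in paint."""
    return (math.exp(log_effect) - 1.0) * 100.0


# Assumption: natural log, so exp() gives the multiplicative factor.
interaction_only = percent_change(interaction_1_99)
combined = percent_change(main_effect + interaction_1_99)

print("interaction only: %.1f%%" % interaction_only)
print("main + interaction: %.1f%%" % combined)
```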
Comment 11 Brian R. Bondy [:bbondy] 2012-06-28 10:40:33 PDT
Thanks Saptarshi!

So I think based on this data, we should stop doing preload always on Windows (since it makes no difference), and stop disabling the prefetch since having prefetch on is a benefit in most cases 1%-99%.  Can you confirm that I understood the data correctly Saptarshi?
Comment 13 (dormant account) 2012-06-28 10:45:16 PDT
And if the numbers do not change, let's do what Brian suggested.
Comment 14 "Saptarshi Guha[:joy]" 2012-06-28 10:48:04 PDT
Hi Brian,

Yes, the understanding is correct.

But do keep in mind that, though this does decrease the median, the variance appears to be unchanged (see page 4, middle 2 panels).

(In reply to Brian R. Bondy [:bbondy] from comment #11)
> Thanks Saptarshi!
> 
> So I think based on this data, we should stop doing preload always on
> Windows (since it makes no difference), and stop disabling the prefetch
> since having prefetch on is a benefit in most cases 1%-99%.  Can you confirm
> that I understood the data correctly Saptarshi?
Comment 15 Annie Elliott 2012-06-28 10:58:21 PDT
(In reply to Taras Glek (:taras) from comment #12)
> I have one more theory, Saptarshi, can you filter the data by
...
> 
> Could you redo the analysis, but filter on
> simpleMeasures.start >=1second? I'm wondering if the fact that most startups
> are warm is skewing the data.
Comment 16 Annie Elliott 2012-06-28 10:59:23 PDT
Sorry - Given my last comment, the further asks are actually requests for new analysis work.

Please file a new request for new work. Closing this one out.
Comment 17 (dormant account) 2012-06-28 11:18:17 PDT
(In reply to Annie Elliott from comment #16)
> Sorry - Given my last comment, the further asks are actually requests for
> new analysis work.
> 
> Please file a new request for new work. Closing this one out.

No, we are asking to rerun the analysis.
Comment 18 Annie Elliott 2012-07-06 14:43:07 PDT
As above, I am closing this out. 

Going forward, when the original ask is completed, new asks, even if based on an old ask, require a new ticket, as they require further resourcing.

This is metrics workflow - there are no open-ended tickets. 

The additional work is in https://metrics.mozilla.com/projects/browse/METRICS-826.
