Closed Bug 896019 Opened 11 years ago Closed 7 years ago

Android startup tests should use a cold cache

Categories

(Testing Graveyard :: Eideticker, defect)

x86_64
Linux
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: wlach, Assigned: wlach)

Details

Taras Glek says:

"Turns out eideticker is testing warm startup. Warm startup means browser files are cached in RAM so there is no IO penalty. This is unrealistic as Android is very likely to completely push apps out of memory while they are in background due to app vs physical RAM constraints. Loading libraries from jars should allow us to start faster than other Android browsers due to requiring less IO. This also means we can't match their warm startup speeds.

Is there a reason we aren't flushing caches* before testing startup? I think that would make for a more realistic benchmark.

* Something like 'echo 3 > /proc/sys/vm/drop_caches' should flush caches on startup if one has root. There are lamer ways to flush caches if you don't."

This is all true. There's no reason we shouldn't be doing this, as cold startup is indeed a more realistic test case.

This applies to the following tests:

* startup-abouthome-dirty
* startup-abouthome-fresh
* nytimes-load

It does not apply to nytimes-load-poststartup, as the page is loaded after fennec starts up.
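For reference, a minimal sketch of what the flush step could look like when driven from a host script. It assumes a rooted device reachable through `adb` and an `su` binary that accepts `-c`; both are assumptions for illustration only, and the helper names are hypothetical rather than part of Eideticker.

```python
import subprocess

# Shell snippet run on the device. "sync" writes dirty pages back first so
# they can actually be dropped; writing 3 drops the page cache plus
# dentries and inodes.
DROP_CACHES_CMD = "sync; echo 3 > /proc/sys/vm/drop_caches"


def flush_caches_argv(serial=None):
    """Build the adb command line that drops caches on a rooted device.

    `serial` is an optional device serial passed to `adb -s`.
    """
    argv = ["adb"]
    if serial:
        argv += ["-s", serial]
    # The whole su invocation is passed as ONE argument so the device shell
    # does not split it at the ';' and run the echo as an unprivileged user.
    # Assumption: this su implementation accepts `-c '<command>'`.
    argv += ["shell", "su -c '%s'" % DROP_CACHES_CMD]
    return argv


def flush_caches(serial=None, run=subprocess.check_call):
    """Run the flush; `run` is injectable so the helper can be tested offline."""
    run(flush_caches_argv(serial))
```

Calling `flush_caches()` immediately before each app launch would cover the three tests listed above.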
While a warm cache is biased in the way described, a cold cache is biased in the opposite direction. It flushes everything, including things that would *not* be evicted from memory even on low-memory devices, such as Dalvik executables and libraries and low-level system libraries.

This includes the system webkit, which is very likely to be in memory considering the number of apps that use a webview. Heck, even Firefox for Android loads the webkit library (!) but doesn't use it, so it actually doesn't pull the whole file into memory.

So, yes, this means the Android native browser has a huge advantage over any other browser not using a webview. But this means a pure cold cache approach wouldn't show that.
So, in fact, i'd suggest flushing caches, and start some small random app that uses a webview and make it display a web page, before testing any browser.
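A rough sketch of that sequence, written as the list of device shell commands a harness might issue. The component names and URLs are hypothetical placeholders, the `su -c` form assumes a rooted device, and the warm-up step is optional so the same helper can also express a pure cold start.

```python
from shlex import quote

DROP_CACHES = "su -c 'sync; echo 3 > /proc/sys/vm/drop_caches'"


def am_start(component, url):
    """Build an `am start` command that opens `url` in `component`.

    -W waits for the launch to complete, -a sets the intent action,
    -n names the component and -d supplies the data URI.
    """
    return "am start -W -a android.intent.action.VIEW -n %s -d %s" % (
        quote(component), quote(url))


def cold_start_plan(browser_component, url,
                    warmup_component=None,
                    warmup_url="http://example.com/"):
    """Return the ordered device shell commands for one startup measurement."""
    steps = [DROP_CACHES]  # 1. start from a cold cache
    if warmup_component:
        # 2. optionally pull the system webview back into memory
        steps.append(am_start(warmup_component, warmup_url))
    # 3. launch the browser under test
    steps.append(am_start(browser_component, url))
    return steps


# Hypothetical component names, for illustration only.
for step in cold_start_plan("org.example.browser/.App",
                            "http://example.com/page",
                            warmup_component="org.example.webviewdemo/.Main"):
    print(step)
```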
(In reply to Mike Hommey [:glandium] from comment #2)
> So, in fact, i'd suggest flushing caches, and start some small random app
> that uses a webview and make it display a web page, before testing any
> browser.

A pure cold start would be a fair comparison against current revisions of Chrome...I agree about warming up cache with some trivial app, but I don't think we should warm it up with webview .so
(In reply to Taras Glek (:taras) from comment #3)
> (In reply to Mike Hommey [:glandium] from comment #2)
> > So, in fact, i'd suggest flushing caches, and start some small random app
> > that uses a webview and make it display a web page, before testing any
> > browser.
> 
> A pure cold start would be a fair comparison against current revisions of
> Chrome...I agree about warming up cache with some trivial app, but I don't
> think we should warm it up with webview .so

A test agent app is *always* running on the device to allow the device to be controlled remotely (without adb):

https://hg.mozilla.org/mozilla-central/file/5aa02ee02f4b/build/mobile/sutagent/android

Is this sufficient? If not, any suggestions on something else to load? Ideally this would be something present in both Android 2.2/2.3 and 4.0. Installing an app on the system would not be a problem.
The patch at:

https://github.com/mozilla/eideticker/pull/6

adds a --flush-caches option to ensure that apps start up with a cold cache. As currently structured, for a test like nytimes-load, which starts the app with about:home, shuts it down, and then starts the actual test, both launches of the app are done with a cold cache. I think this best fulfills the intent of cold-cache testing.
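To illustrate the ordering only (this is not the code from the pull request), here is a small sketch of how such a flag could gate a flush before each of the two launches. The function names and step labels are hypothetical.

```python
import argparse


def parse_args(argv=None):
    """Parse a hypothetical test-runner command line."""
    parser = argparse.ArgumentParser(
        description="Illustrative startup test runner")
    parser.add_argument("--flush-caches", action="store_true",
                        help="drop OS caches before every app launch")
    return parser.parse_args(argv)


def nytimes_load_steps(flush_caches):
    """Return the ordered steps for a nytimes-load style test.

    The app is launched twice: once with about:home, then again for the
    actual test. With flush_caches set, each launch is preceded by a flush.
    """
    launches = ["launch about:home", "launch nytimes-load test"]
    steps = []
    for index, launch in enumerate(launches):
        if flush_caches:
            steps.append("flush caches")
        steps.append(launch)
        if index == 0:
            # The first instance is shut down before the measured launch.
            steps.append("shut down app")
    return steps


options = parse_args(["--flush-caches"])
for step in nytimes_load_steps(options.flush_caches):
    print(step)
```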

The timings for each browser on nytimes-load, measured from the second app start to the "Test finished callback", are:

Stock browser: 5s
Chrome: 7-8s
Firefox: 15-16s (!)

The times from the first app startup to the second app startup are all about the same, roughly 12s. I'm not sure whether that's just happenstance due to sleeping in the test framework.

Still need to hack eideticker and determine what something like firstPaint looks like on this test.
That was on a Galaxy Nexus, BTW.  browser.js in Firefox starts running at 3321ms and first URI load happens at 3975ms.
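For a quick sense of scale, the figures reported above can be compared directly. The snippet below uses only those numbers; taking the midpoint of each reported range is an assumption made for illustration.

```python
# Reported nytimes-load timings in seconds, as (low, high) ranges.
timings_s = {
    "Stock browser": (5, 5),
    "Chrome": (7, 8),
    "Firefox": (15, 16),
}


def midpoint(time_range):
    """Midpoint of a (low, high) range, used as a single representative value."""
    low, high = time_range
    return (low + high) / 2.0


baseline = midpoint(timings_s["Stock browser"])

# How many times slower each browser is than the stock browser.
slowdown = {name: midpoint(r) / baseline for name, r in timings_s.items()}

# Gap between browser.js starting (3321 ms) and the first URI load (3975 ms).
js_to_first_load_ms = 3975 - 3321

for name in ("Stock browser", "Chrome", "Firefox"):
    print("%s: %.1fx" % (name, slowdown[name]))
print("browser.js start to first URI load: %d ms" % js_to_first_load_ms)
```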
Eideticker has been discontinued; see bug 1361056.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → WONTFIX
Product: Testing → Testing Graveyard