Closed Bug 793755 Opened 12 years ago Closed 11 years ago

Talos Regressions Trobopan, Tdhtml, tp4m_nochrome on Sept 22

Categories

(Firefox for Android Graveyard :: General, defect)

All
Android
defect
Not set
normal

Tracking

(fennec+)

RESOLVED FIXED
Firefox 21
Tracking Status
fennec + ---

People

(Reporter: gbrown, Assigned: blassey)

References

Details

(Keywords: regression)

Attachments

(2 files, 1 obsolete file)

From dev-tree-management Digest, Vol 45, Issue 147:

   1. Talos Regression Robocop Pan Benchmark increase 41.9% on
      Android        2.2 (Native) mobile (nobody@cruncher.build.mozilla.org)
   2. Talos Regression DHTML NoChrome increase 14.6% on Android
      2.2        (Native) mobile (nobody@cruncher.build.mozilla.org)
   3. Talos Regression Tp4 Mobile NoChrome increase 20.8% on
      Android 2.2        (Native) mobile (nobody@cruncher.build.mozilla.org)

(I haven't seen alerts for m-i yet.)

It looks to me like the offending check-in was either

https://hg.mozilla.org/integration/mozilla-inbound/rev/f753afbe0d85 - bug 781588, jwein@mozilla.com 

https://hg.mozilla.org/integration/mozilla-inbound/rev/9108fa673bfe, etc - bug 716575, mbrubeck@mozilla.com
I noticed bug 716575 has a try run with android Talos tests, and the regressions are evident in those tests: It appears that the patches for bug 716575 caused the regression.
Blocks: 716575
The code changed by bug 716575 is currently called only by nsLayoutUtils::FontSizeInflationEnabled.  Any regression must be because (a) the result changed, so font inflation is now enabled on pages where it was previously disabled or vice-versa, or (b) the cost of the check itself is higher even if the result is the same.

Bug 793419 may be related.  In short, we discovered that the Gecko screen size is set to 0x0 when running Android Fennec with the Talos pageloader extension.  To prevent this from causing crashes in bug 716575 I added a fallback to a default screen size.  So the viewport size check previously acted as if the screen size in Talos was 0x0, and now acts as if it is 980x980.

I think this should make font inflation in the Talos pageloader closer to font inflation in the real browser.  The Talos changes might just be because the more accurate layout makes the pages more expensive to render.  In this case, the new numbers are actually more accurate and we don't need to "fix" them.

To verify this, we could profile page load in the real browser chrome, and check whether the time spent in GetViewportInfo is higher than it was before bug 716575.  If it is higher now, then there is a real performance regression that we should fix.
tracking-fennec: --- → ?
tracking-fennec: ? → 18+
Assignee: nobody → bugmail.mozilla
Kats, have you looked at this?
Flags: needinfo?(bugmail.mozilla)
Not yet, no. The trail had already gone cold on this bug (i.e. the relevant inbound data is gone) so I'll have to rebuild those csets myself first to do the before/after checks.
Flags: needinfo?(bugmail.mozilla)
TL;DR: yes, there is a real regression introduced by bug 716575.

(In reply to Matt Brubeck (:mbrubeck) from comment #2)
> To verify this, we could profile page load in the real browser chrome, and
> check whether the time spent in GetViewportInfo is higher than it was before
> bug 716575.  If it is higher now, then there is a real performance
> regression that we should fix.

I built the csets e96de9481915 and 9108fa673bfe (i.e. before and after the patches from bug 716575 landed) with this applied:

-    ViewportInfo vInf =
-      nsContentUtils::GetViewportInfo(aPresContext->PresShell()->GetDocument(),
+    mozilla::TimeStamp ts = mozilla::TimeStamp::Now();
+    ViewportInfo vInf;
+    for (int i = 0; i < 10000; i++) {
+      vInf = nsContentUtils::GetViewportInfo(aPresContext->PresShell()->GetDocument(),
                                       screenWidth, screenHeight);
+    }
+    printf_stderr("staktrace: ts: %f\n", (mozilla::TimeStamp::Now() - ts).ToMicroseconds());

Basically instead of calling GetViewportInfo once, each call to FontSizeInflationEnabled called GetViewportInfo 10000 times and recorded the microseconds it took to do all that.

Then I ran the builds and grabbed the output from startup to completely loading the SUMO page (which has a meta viewport tag), and got the printed times. The results:

cset e96de9481915 (before bug 716575):

Samples: 4975
Average: 49498.2
Stddev: 4747.36
Max: 93135
Min: 47783

cset 9108fa673bfe (after bug 716575):

Samples: 4975
Average: 76271.7
Stddev: 7751.45
Max: 157601
Min: 32049

Also note the extremely high number of samples: GetViewportInfo was called 95 times during startup and ~5000 times while loading support.mozilla.org. This seems like something that could be cached.
Hardware: x86 → All
Also here is a profile from cset 9108fa673bfe + my patch that I grabbed while it was busy loading support.mozilla.org. It gives some idea of where the time is being spent. However we'll probably want to grab a profile from latest m-c code if we're going to optimize it since it may have changed since then.

http://people.mozilla.com/~bgirard/cleopatra/#report=a68206f160f10a029456281dd0b959786cf1432d
Kats do you have a plan for this?
Keywords: regression
I have two plans of attack:
1) Cache the results from GetViewportInfo so that it doesn't get called quite so much. I'm not sure what triggers we would need to invalidate the cache though, so this might turn out to be non-trivial.
2) Modify the code to invoke GetViewportInfo lots of times (similar to my patch in comment 5) on recent code and then profile it to see what hotspots can be optimized.

Scott, you were also working near this code recently, so if you want to grab this bug feel free to do so :)
The implementation of GetViewportInfo makes a lot of calls to nsDocument::GetHeaderData which walks a list to look up the data each time:
http://mxr.mozilla.org/mozilla-central/source/content/base/src/nsDocument.cpp#2977

This seems like a place we could optimize. Perhaps a switch statement for storing and retrieving frequently accessed headers, with the default case going to the list.
bumping to a tracking+ since 18 is ready to ship
tracking-fennec: 18+ → +
Attached patch patchSplinter Review
before:
I/Gecko   ( 6244): staktrace: ts: 56249.048000
I/Gecko   ( 6244): staktrace: ts: 55547.077000
I/Gecko   ( 6244): staktrace: ts: 65954.523000
I/Gecko   ( 6244): staktrace: ts: 55211.355000
I/Gecko   ( 6244): staktrace: ts: 67877.307000

after:
I/Gecko   ( 1701): staktrace: ts: 8118.419000
I/Gecko   ( 1701): staktrace: ts: 8026.857000
I/Gecko   ( 1701): staktrace: ts: 7935.297000
I/Gecko   ( 1701): staktrace: ts: 8789.867000
I/Gecko   ( 1701): staktrace: ts: 9339.234000
Attachment #697771 - Flags: review?(sjohnson)
I ran this patch through talos on try. I'm using tip of m-c for a comparison since I neglected to run the changeset I built on top of through try as well


                         w/o patch | w/patch
tcheckerboard:              0.02   |  0.0
tcheck2:                    4.36   |  8.96
trobopan:                 11131.5  |  8638.25
tprovider:                 374.75  |  373.75
tsvg_nochrome:            4384.45  |  4321.77
tp4m_nochrome:             833.05  |  781.33
tp4m_main_rss_nochrome:    86.0MB  |  85.5MB
tp4m_shutdown_nochrome: 28802138.0 |  28803098.0
ts:                       3517.89  |  5239.0
ts_shutdown:           28802908.11 |  28806785.16
Assignee: bugmail.mozilla → blassey.bugs
Try run for e3fb8ca6b220 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=e3fb8ca6b220
Results (out of 56 total builds):
    success: 55
    warnings: 1
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/blassey@mozilla.com-e3fb8ca6b220
(In reply to Brad Lassey [:blassey] from comment #9)
> The implementation of GetViewportInfo makes a lot of calls to
> nsDocument::GetHeaderData which walks a list to look up the data each time:
> http://mxr.mozilla.org/mozilla-central/source/content/base/src/nsDocument.
> cpp#2977
> 
> This seems like a place we could optimize. Perhaps a switch statement for
> storing and retrieving frequently accessed headers, with the default case
> going to the list.

It's too bad we can't send GetHeaderData() a list of items, along with a pointer to some type of dynamic storage (e.g. a map). Then, we could perform a single call to GetHeaderData() to get all the items we need (with some being null, since they might not exist).
Try run for e3fb8ca6b220 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=e3fb8ca6b220
Results (out of 57 total builds):
    success: 55
    warnings: 2
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/blassey@mozilla.com-e3fb8ca6b220
Try run for 6fc56310df75 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=6fc56310df75
Results (out of 57 total builds):
    success: 55
    warnings: 2
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/blassey@mozilla.com-6fc56310df75
(In reply to Scott Johnson (:jwir3) from comment #15)
> (In reply to Brad Lassey [:blassey] from comment #9)
> > The implementation of GetViewportInfo makes a lot of calls to
> > nsDocument::GetHeaderData which walks a list to look up the data each time:
> > http://mxr.mozilla.org/mozilla-central/source/content/base/src/nsDocument.
> > cpp#2977
> > 
> > This seems like a place we could optimize. Perhaps a switch statement for
> > storing and retrieving frequently accessed headers, with the default case
> > going to the list.
> 
> It's too bad we can't send GetHeaderData() a list of items, along with a
> pointer to some type of dynamic storage (e.g. a map). Then, we could perform
> a single call to GetHeaderData() to get all the items we need (with some
> being null, since they might not exist).

Turns out that just caching the values of the headers didn't help much.
Comment on attachment 697771 [details] [diff] [review]
patch

Review of attachment 697771 [details] [diff] [review]:
-----------------------------------------------------------------

Some very minor stuff, but otherwise looks good. r=jwir3

::: content/base/public/nsIDocument.h
@@ +1833,4 @@
>    already_AddRefed<nsINode>
>      ImportNode(nsINode& aNode, bool aDeep, mozilla::ErrorResult& rv) const;
>    nsINode* AdoptNode(nsINode& aNode, mozilla::ErrorResult& rv);
> +

Nit: Not sure if this newline was intentional or not

::: content/base/src/nsDocument.cpp
@@ +6452,5 @@
> +      mAutoSize = true;
> +    }
> +
> +    if (widthStr.IsEmpty() && heightStr.EqualsLiteral("device-height"))
> +    {

'{' Should probably go at the beginning of the previous line. (Not your bug... this is probably my mistake when I converted this stuff over from JS).

@@ +6496,5 @@
> +    mScaleStrEmpty = scaleStr.IsEmpty();
> +    mWidthStrEmpty = widthStr.IsEmpty();
> +    mValidScaleFloat = !scaleStr.IsEmpty() && NS_SUCCEEDED(scaleErrorCode);
> +    mValidMaxScale = !maxScaleStr.IsEmpty() && NS_SUCCEEDED(scaleMaxErrorCode);
> +  

Nit: Spacing error I think (or was this intentional?)

::: content/base/src/nsDocument.h
@@ +1362,5 @@
> +  ViewportType mViewportType;
> +
> +  float mScaleMinFloat, mScaleMaxFloat, mScaleFloat, mPixelRatio;
> +  bool mAutoSize, mAllowZoom, mValidScaleFloat, mValidMaxScale, mScaleStrEmpty, mWidthStrEmpty;
> +  uint32_t mViewportWidth, mViewportHeight;

So, rather than keeping these around, perhaps we should just keep a copy of the nsViewportInfo object that gets created at the end of the GetViewportInfo() function? (TBH, I'm not entirely sure why these need to be kept as member variables, rather than just function-local variables, since I don't see where they are used anywhere...)
Attachment #697771 - Flags: review?(sjohnson) → review+
(In reply to Brad Lassey [:blassey] from comment #18)
> (In reply to Scott Johnson (:jwir3) from comment #15)
> > (In reply to Brad Lassey [:blassey] from comment #9)
> Turns out that just caching the values of the headers didn't help much.

Maybe I'm missing what you're saying, but:
trobopan:                 11131.5  |  8638.25 <---- 22% decrease
tp4m_nochrome:             833.05  |  781.33  <---- 6% decrease

These seem like solid numbers, and worth pushing the patch for... unless I'm reading this incorrectly?
(In reply to Scott Johnson (:jwir3) from comment #20)
> So, rather than keeping these around, perhaps we should just keep a copy of
> the nsViewportInfo object that gets created at the end of the
> GetViewportInfo() function? (TBH, I'm not entirely sure why these need to be
> kept as member variables, rather than just function-local variables, since I
> don't see where they are used anywhere...)

These are used between runs of GetViewportInfo(). The values are established when the viewport type is Unknown and used subsequently when it is Specified. 

I don't think we can cache the nsViewportInfo object itself as the value depends on the width and height that are passed in
https://hg.mozilla.org/integration/mozilla-inbound/rev/ee16318ac67b
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
I don't think this is fixed anymore (based on comment 23)?
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
As luck would have it, M1 did run in the try push https://tbpl.mozilla.org/?tree=Try&rev=6fc56310df75
(In reply to Brad Lassey [:blassey] from comment #25)
> As luck would have it, M1 did run in the try push
> https://tbpl.mozilla.org/?tree=Try&rev=6fc56310df75

By 'did run', did you perhaps mean 'did not run'?   ;)
Also by 'luck', did you perhaps mean 'trychooser syntax'? :)
Try run for 6fc56310df75 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=6fc56310df75
Results (out of 58 total builds):
    success: 56
    warnings: 2
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/blassey@mozilla.com-6fc56310df75
Attached patch follow up patch (obsolete) — Splinter Review
https://tbpl.mozilla.org/?tree=Try&rev=6fea4336c92f

This fixes the test failures.

It also adds a check for the doctype changing dynamically, which our tests don't test. I'm not sure if that's a thing that can happen in the real world. Any opinions?
Attachment #699580 - Flags: review?(sjohnson)
Attached patch follow up patchSplinter Review
left a bit of debugging in the prior patch, cleaned up with this one and repushed to try https://tbpl.mozilla.org/?tree=Try&rev=5ad26b844973
Attachment #699580 - Attachment is obsolete: true
Attachment #699580 - Flags: review?(sjohnson)
Attachment #699583 - Flags: review?(sjohnson)
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #27)
> Also by 'luck', did you perhaps mean 'trychooser syntax'? :)

'luck' being that http://trychooser.pub.build.mozilla.org/ doesn't have a checkbox for M1-5. I wonder what else is missing.
Status: REOPENED → NEW
(In reply to Brad Lassey [:blassey] from comment #31)
> 'luck' being that http://trychooser.pub.build.mozilla.org/ doesn't have a
> checkbox for M1-5. I wonder what else is missing.

It does have those checkboxes, just not under the "android-only" section.
Comment on attachment 699583 [details] [diff] [review]
follow up patch

Review of attachment 699583 [details] [diff] [review]:
-----------------------------------------------------------------

r=jwir3 with the one small nit, and the changes in http://www.pastebin.mozilla.org/2047487 that you mentioned.

::: content/base/src/nsDocument.cpp
@@ +6376,5 @@
>  {
> +  nsCOMPtr<nsIDOMDocumentType> docType;
> +  nsresult rv = GetDoctype(getter_AddRefs(docType));
> +  if (mCachedDoctype != docType)
> +    mViewportType = Unknown;

{} please. :)
Attachment #699583 - Flags: review?(sjohnson) → review+
https://hg.mozilla.org/mozilla-central/rev/8d8f921899ae
Status: NEW → RESOLVED
Closed: 11 years ago11 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 21
Product: Firefox for Android → Firefox for Android Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: