Closed Bug 934710 Opened 11 years ago Closed 11 years ago

implement low-level text performance logging

Categories

(Core :: Graphics: Text, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla28

People

(Reporter: jtd, Assigned: jtd)

References

(Blocks 1 open bug)

Details

(Keywords: perf)

Attachments

(3 files, 6 obsolete files)

To be able to evaluate better how text system performance affects overall page load time, it would be helpful to log text performance metrics:

- number of textruns created, content vs. chrome
- max size of textrun with histogram of size
- word cache lookups/hit ratio
- number of times fallback occurs (system, pref)

It would be helpful to log this per-reflow and at load end, since text performance after the initial load is less important.
Create a text perf metrics object in the prescontext if logging is enabled.  Jam it into the font group and use it to record various text stats.  The struct has two sets of counts, "current" and "cumulative", so that per-reflow stats can be recorded.
Log the text perf stats at page load end, during reflows and during presshell teardown.
add in const/destr counts for textrun's
Attachment #827231 - Attachment is obsolete: true
log the doc url at loaddone, rearrange value order and dump const/destr stats
Attachment #827232 - Attachment is obsolete: true
Apparently nsPresShell::EndLoad isn't really equivalent to the onload handler firing.  This occurs in nsDocumentViewer::LoadComplete, so call a new LoadComplete method from there and dump out the text perf stats at that point.
Attachment #827884 - Attachment is obsolete: true
Attachment #829066 - Flags: review?(cam)
Comment on attachment 827883 [details] [diff] [review]
patch, part 1, add text perf metrics

This includes everything mentioned in the description with the exception of the histogram of textrun sizes.  For now, I only include the size of the largest one.  Raw counts of textrun ctor/dtor's are also included.  The 'chars' parameter is the number of characters shaped, not the size of the content.

When dumping out the output, I set it up to ignore reflows with very small amounts of text shaping.  This reduces the noise level in the logs.
Attachment #827883 - Flags: review?(jfkthame)
Comment on attachment 829066 [details] [diff] [review]
patch, part 2, log various text perf stats

Review of attachment 829066 [details] [diff] [review]:
-----------------------------------------------------------------

r=me with these mostly minor comments addressed.

::: layout/base/nsIPresShell.h
@@ +417,5 @@
>     */
>    virtual bool IsLayoutFlushObserver() = 0;
>  
>    /**
> +   * Called when document load completes

"." at end of sentence.

::: layout/base/nsPresShell.cpp
@@ +888,5 @@
> +static void
> +LogTextPerfStats(gfxTextPerfMetrics *aTextPerf,
> +                 PresShell *aPresShell,
> +                 const gfxTextPerfMetrics::TextCounts& aCounts,
> +                 float aTime, TextPerfLogType aLogType, char *aURL)

"*"s next to types.  Make aURL a "const char*" so that you don't need to cast away the const in PresShell::LoadComplete.

@@ +893,5 @@
> +{
> +  char prefix[256];
> +
> +  switch (aLogType) {
> +  case eLog_reflow:

Style guide says cases should take up an indentation level.

@@ +900,5 @@
> +  case eLog_loaddone:
> +    sprintf(prefix, "(textperf-loaddone) %p time-ms: %7.0f", aPresShell, aTime);
> +    break;
> +  default:
> +    sprintf(prefix, "(textperf-totals) %p", aPresShell);

MOZ_ASSERT in the "default" case that aLogType is eLog_totals.

@@ +919,5 @@
> +  }
> +
> +  if (aLogType == eLog_loaddone) {
> +    PR_LOG(tpLog, logLevel,
> +    		   ("%s reflow: %d chars: %d "

Get rid of tabs.

@@ +939,5 @@
> +            aCounts.textrunConst, aCounts.textrunDestr,
> +            aTextPerf->cumulative.textrunDestr));
> +  } else {
> +    PR_LOG(tpLog, logLevel,
> +    		   ("%s reflow: %d chars: %d "

Get rid of tabs.

@@ +967,5 @@
>      "destroy called on presshell while scripts not blocked");
>  
> +  // dump out cumulative text perf metrics
> +#ifdef PR_LOGGING
> +  gfxTextPerfMetrics *tp;

"*" next to type.

@@ +2491,5 @@
>  {  
>    mDocumentLoading = true;
>  
>  #ifdef PR_LOGGING
> +  gfxTextPerfMetrics *tp = nullptr;

"*" next to type.

@@ +2496,5 @@
> +  if (mPresContext) {
> +    tp = mPresContext->GetTextPerfMetrics();
> +  }
> +
> +  bool logPS = (gLog && PR_LOG_TEST(gLog, PR_LOG_DEBUG));

Not really obvious what "logPS" means.  "Log PresShell"?  Maybe "shouldLog" is a better name.  Also the outer parens aren't needed.

@@ +2531,2 @@
>  #ifdef PR_LOGGING
> +  gfxTextPerfMetrics *tp = nullptr;

"*" next to type.

@@ +2537,2 @@
>    // log load
> +  bool logPS = (gLog && PR_LOG_TEST(gLog, PR_LOG_DEBUG));

Same as above.

@@ +2551,5 @@
> +    if (tp) {
> +      tp->Accumulate();
> +      if (tp->cumulative.numChars > 0) {
> +        LogTextPerfStats(tp, this, tp->cumulative, loadTime.ToMilliseconds(),
> +                         eLog_loaddone, (char*)(spec.get()));

Remove the cast.

@@ +7994,5 @@
>    if (mIsZombie) {
>      return true;
>    }
>  
> +  gfxTextPerfMetrics *tp = mPresContext->GetTextPerfMetrics();

"*" next to type.
Attachment #829066 - Flags: review?(cam) → review+
Comment on attachment 827883 [details] [diff] [review]
patch, part 1, add text perf metrics

Review of attachment 827883 [details] [diff] [review]:
-----------------------------------------------------------------

In general, the code here looks like it'd work fine. But it troubles me to be adding code that's unconditionally part of release builds to such a hot part of the text inner-loop, when it's only going to be used by a few developers in a special logging situation. GetShapedWord, in particular, is a very hot function, and this patch adds an extra parameter to it, as well as an extra if-test within it.

Can we find some way to do this with lower overhead for the 99.9999% of users who'll never want to enable it? Or maybe put it behind an #ifndef RELEASE_MODE conditional, so that it's only present in pre-release channels?

::: gfx/thebes/gfxFont.cpp
@@ +3390,5 @@
>  
> +    gfxTextPerfMetrics *tp = aTextRun->GetFontGroup()->GetTextPerfMetrics();
> +    if (tp) {
> +        (mStyle.systemFont ?
> +             tp->current.numChromeTextRuns++ : tp->current.numContentTextRuns++);

Huh? Using a conditional expression here seems unnecessarily cryptic; just make it a normal if-statement.
Updated to make the use of the textperfs struct dependent on !RELEASE_BUILD.
Attachment #827883 - Attachment is obsolete: true
Attachment #827883 - Flags: review?(jfkthame)
Attachment #8333655 - Flags: review?(jfkthame)
fix missing initialization
Attachment #8333655 - Attachment is obsolete: true
Attachment #8333655 - Flags: review?(jfkthame)
Attachment #8333711 - Flags: review?(jfkthame)
Comment on attachment 8333711 [details] [diff] [review]
patch, part 1, add text perf metrics (revised)

Review of attachment 8333711 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/thebes/gfxFont.cpp
@@ +3091,5 @@
>                         uint32_t    aHash,
>                         int32_t     aRunScript,
>                         int32_t     aAppUnitsPerDevUnit,
> +                       uint32_t    aFlags,
> +                       gfxTextPerfMetrics *aTextPerf)

To avoid potential warnings (and maybe give the optimizer a hint?), we should annotate this with the 'unused' attribute when RELEASE_BUILD is true. (At least for gcc/clang. I'm not sure what the MSVC equivalent would be.)

I assumed we'd have a handy macro for this already, but I don't see one; so perhaps adding something like MOZ_UNUSED to mfbt/Attributes.h would be the best approach. With review from the compiler gurus over there, of course.

(I'd really prefer if the parameter didn't exist at all in RELEASE_BUILD builds, so that the compiler doesn't reserve a register to pass it in, or anything like that. But that would mean messy #ifdef'd code in additional places, so I guess we can live with this. Perhaps a good optimizer may be able to avoid any overhead?)

::: gfx/thebes/gfxFont.h
@@ +1042,5 @@
> +
> +    gfxTextPerfMetrics() {
> +        reflowCount = 0;
> +        memset(&current, 0, sizeof(current));
> +        memset(&cumulative, 0, sizeof(cumulative));

Might as well do this with a single memset() for the entire size of the object.

@@ +1700,5 @@
>                                   uint32_t aHash,
>                                   int32_t aRunScript,
>                                   int32_t aAppUnitsPerDevUnit,
> +                                 uint32_t aFlags,
> +                                 gfxTextPerfMetrics *aTextPerf = nullptr);

Let's not have a default here; AFAICS the only caller where you're not actually adding a textPerf parameter is in SetSpaceGlyph, so just pass an explicit nullptr there.
Updated based on review comments.  I used the MAYBE_UNUSED pattern that exists in cairo and graphite code.
Attachment #8333711 - Attachment is obsolete: true
Attachment #8333711 - Flags: review?(jfkthame)
Attachment #8334949 - Flags: review?(jfkthame)
Comment on attachment 8334949 [details] [diff] [review]
patch, part 1 v2c, add text perf metrics

Review of attachment 8334949 [details] [diff] [review]:
-----------------------------------------------------------------

OK, thanks. (I still feel that perhaps we should have a centrally-defined MOZ_MAYBE_UNUSED, but if there's nowhere else we need it at the moment aside from bulk-imported code that does its own version, then I guess it's fine for now.)

I'm kinda assuming that all the places we're passing that extra parameter around are not sufficiently perf-critical that it'll matter. Have you run some comparisons (tryserver/talos and/or local perf tests) with and without this patch, to check that it doesn't have a measurable impact at least when the logging isn't enabled?
Attachment #8334949 - Flags: review?(jfkthame) → review+
Attachment #827883 - Flags: review?(jfkthame)
Pushed to inbound:
https://hg.mozilla.org/integration/mozilla-inbound/rev/2ae10dc494d8
https://hg.mozilla.org/integration/mozilla-inbound/rev/fa28acc5960a


I pushed two tryserver builds, one with and one without the patch.  None of the local tests I ran show any difference, other than those well within the variance of the underlying tests themselves.
https://hg.mozilla.org/mozilla-central/rev/2ae10dc494d8
https://hg.mozilla.org/mozilla-central/rev/fa28acc5960a
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla28
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: