Closed Bug 801780 Opened 12 years ago Closed 12 years ago

Explicitly call out long strings in about:memory

Categories

(Toolkit :: about:memory, defect)

defect
Not set
normal

Tracking

()

VERIFIED FIXED
mozilla19
Tracking Status
firefox18 --- verified

People

(Reporter: justin.lebar+bug, Assigned: justin.lebar+bug)

References

Details

(Keywords: qawanted)

Attachments

(2 files, 1 obsolete file)

The patch I'll attach in a moment separates long strings from short strings in about:memory and explicitly lists the first few chars of each long string.

In about:memory, it looks like the following:

├──14.95 MB (22.57%) -- window-objects
│  ├──14.34 MB (21.64%) -- top(app://system.gaiamobile.org/index.html, id=6)/active
│  │  ├───9.76 MB (14.74%) -- window(app://browser.gaiamobile.org/index.html)
│  │  │   ├──9.40 MB (14.18%) -- js/compartment(app://browser.gaiamobile.org/index.html)
│  │  │   │  ├──8.47 MB (12.78%) -- string-chars/long
│  │  │   │  │  ├──7.24 MB (10.93%) ── string (length 631106):  [...] [6]
│  │  │   │  │  ├──1.21 MB (01.83%) ── string (length 633838):  [...]

31.52 MB (100.0%) -- js-main-runtime
├──28.57 MB (90.64%) -- compartments
│  ├──10.91 MB (34.63%) -- string-chars
│  │  ├──10.14 MB (32.19%) -- long
│  │  │  ├───7.24 MB (22.98%) ── string (length 631106):  [...] [6]
│  │  │  ├───2.42 MB (07.68%) ── string (length 633838):  [...] [2]
│  │  │  └───0.48 MB (01.52%) ++ (13 tiny)
│  │  └───0.77 MB (02.44%) ── short
Assignee: nobody → justin.lebar+bug
This is a bit ugly in places.

Ideally, I'd like LongStringInfo::MinSize to be a pref, but it seemed complicated to add that to the JS engine.  I don't see many strings longer than 4kb in normal usage, so perhaps this is OK as-is.

I had to add explicit initialization of all the fields in CompartmentStats because I didn't want to memset(0) a vector.  Similarly, I had to add an explicit copy-constructor to CompartmentStats because we have a vector of CompartmentStats objects, which AIUI requires either a copy- or a move-constructor.
Attachment #671530 - Flags: review?(n.nethercote)
I thought we did this earlier, but apparently not.
Attachment #671531 - Flags: review?(n.nethercote)
Attachment #671530 - Attachment description: Bug 801780 - Part 1: Explicitly call out long strings in about:memory. → Part 1: Explicitly call out long strings in about:memory.
I'm tempted to put a SHA1 hash of the full string in the description so we can tell for sure whether we have multiple copies of the same string or not.  But I guess it's not really needed at the moment, since I have to imagine that merely the length of a compressed screenshot is a pretty good indicator of uniqueness.
Attachment #671531 - Flags: review?(n.nethercote) → review+
Comment on attachment 671530 [details] [diff] [review]
Part 1: Explicitly call out long strings in about:memory.

Review of attachment 671530 [details] [diff] [review]:
-----------------------------------------------------------------

I really like this idea, and the patch is mostly good.  But there's a problem with CompartmentStats::add() that I describe below, and I'd like to see a revised patch with the suggested change before landing.  Thanks!

::: js/public/MemoryMetrics.h
@@ +58,5 @@
> +
> +    /*
> +     * A string needs to take up this many bytes of storage before we consider
> +     * it to be "long".
> +     */

Nit: C++-style comments are acceptable in the JS engine, and this file uses them in most places, so please do likewise in new code.  Bonus points if you change the existing C-style comments to C++-style :)

@@ +67,5 @@
> +     * because the allocator and the JS engine both may round up.
> +     */
> +    size_t length;
> +    size_t size;
> +    char buffer[32];

At this point I'm guessing this is going to be the first 32 chars of the string, but a comment would be nice :)

@@ +144,5 @@
> +      , regexpCompartment(0)
> +      , debuggeesSet(0)
> +    {}
> +
> +    CompartmentStats(const CompartmentStats &other)

I guess the copy constructor is needed because of the 
|longStrings.append(cStats.longStrings);| call below?

@@ +208,5 @@
>      size_t objectSlots;
>      size_t objectElements;
>      size_t objectMisc;
>      size_t objectPrivate;
> +    size_t shortStringChars;

The JS engine already has "short strings", which are strings with few enough chars that the chars can be stored inline in the JSString object.  (Come to think of it, I should measure and report those separately from normal strings.)

So calling these non-huge strings "short" is a bit confusing.  What about "huge" vs "non-huge"?

Finally, the names of these fields mirror the paths used in the reporters.  So if the path ends up being "string-chars/non-huge", the field name should be "stringCharsNonHuge".

@@ +265,5 @@
>  
>          #undef ADD
>  
>          typeInferenceSizes.add(cStats.typeInferenceSizes);
> +        longStrings.append(cStats.longStrings);

So this is interesting.  add() is used only to sum  all the compartment numbers to give the totals shown under "js-main-runtime/compartments" in "Other Measurements".  By appending this vector, all the huge strings will end up being shown in that tree.  I'm not sure if you think this is a bug or a feature, or if you even realized it, but I think it's a bug.

What I'd prefer is if this tree had an additional "string-chars/huge" entry that none of the individual compartments have.  This would require adding a stringCharsHuge field that would only be assigned to here -- instead of appending, you'd sum the sizes of all the long strings in |cStats.longStrings| and add that to |stringCharsHuge|.

The sundries thresholding will ensure that "string-chars/huge" won't show up for any of the individual compartments, because they'll all be zero.

Does that make sense?  There'll need to be a comment here explaining things, and also a brief comment on the |stringCharsHuge| field that points to this comment.

Finally, I think that will allow you to remove the copy constructor.

::: js/src/jsmemorymetrics.cpp
@@ +159,5 @@
> +
> +        size_t strSize = str->sizeOfExcludingThis(rtStats->mallocSizeOf);
> +
> +        if (strSize >= LongStringInfo::MinSize) {
> +            MOZ_ALWAYS_TRUE(cStats->longStrings.growBy(1));

This is misleading -- it looks like you're claiming it can never fail, when really you just don't want to have to think about what happens when it fails.

I suggest you put the growBy() call in the condition, so that if it fails, the string'll be counted among the non-huge strings (and the browser will likely crash soon after).  A comment too would be nice.

::: js/xpconnect/src/XPCJSRuntime.cpp
@@ +1642,5 @@
>                    "Memory used during type inference and compilation to hold "
>                    "transient analysis information.  Cleared on GC.");
>  
> +    // We can't use CREPORT_BYTES here because we want to build the description
> +    // at runtime.

Can you create a new macro that takes an nsCString description?  I'm not fussed about the name, CREPORT_BYTES_2 will do if you like.  I want that because it makes ReportCompartmentStats() much easier to read if all the numbers are reported in a similar fashion.

@@ +1645,5 @@
> +    // We can't use CREPORT_BYTES here because we want to build the description
> +    // at runtime.
> +    if (cStats.shortStringChars >= SUNDRIES_THRESHOLD) {
> +        nsPrintfCString desc(
> +            "Memory allocated to hold string characters strings which use fewer "

"string character strings"?  Hmm.

How about "Memory allocated to hold string characters of non-huge strings, i.e. those strings whose characters take up less than %d bytes of memory."

Although, in the description below you mention the |length| rather than the |size|, so perhaps  it should be "strings of length lower than %d", where the %d param is |MinSize / sizeof(jschar)|.

@@ +1673,5 @@
> +        nsCString longString;
> +        longString.AppendASCII(info.buffer);
> +
> +        nsCString escapedString(longString);
> +        escapedString.ReplaceSubstring("/", "\\/");

Comment this, please.  Look at GetCompartmentName() for an example.  Bonus points if you create a factored-out function for those two places.  (And note that ReplaceChar is probably faster than ReplaceSubstring.)

@@ +1679,5 @@
> +        nsAutoCString path = cJSPathPrefix +
> +            nsPrintfCString("string-chars/long/string (length %d): %s [...]",
> +                            info.length, escapedString.get());
> +
> +        nsPrintfCString desc("Memory allocated to hold string characters for "

s/for/of/

@@ +1689,5 @@
> +            info.length, longString.get());
> +
> +        // We can't use CREPORT_BYTES here because we want to build the
> +        // description at runtime and because we want to avoid sundries
> +        // thresholding.

Why do you want to avoid the sundries thresholding?  I'd prefer it if you didn't.  The two thresholds are 8KB and 4KB, I'd be ok if you changed them both to 4KB.  (Indeed, then SUNDRIES_THRESHOLD could be defined in terms of LongStringInfo::MinSize.)  Or I'd be ok with 8KB for both of them too.

Once you've done that, you'll be able to use the new CREPORT_BYTES_2 macro here.
Attachment #671530 - Flags: review?(n.nethercote) → review-
> I guess the copy constructor is needed because of the 
> |longStrings.append(cStats.longStrings);| call below?

That would be the copy constructor for LongStringInfo (which is implicit).

AIUI, the explicit copy constructor on CompartmentStats is needed because we
keep a JS::vector of them in RuntimeStats, and vector::growBy() requires a
copy-constructor.  We'd been using the implicit copy constructor on
CompartmentStats, but we need an explicit one now because JS::vector does not
have an implicit copy constructor.

> So calling these non-huge strings "short" is a bit confusing.  What about "huge" vs "non-huge"?

Sure.

> By appending this vector, all the huge strings will end up being shown in
> that tree.  I'm not sure if you think this is a bug or a feature, or if you
> even realized it, but I think it's a bug.

Hm.  I kind of liked the output the way it was.  There's something to be said
for keeping the runtime summary identical in form to the compartment reporters.

But if we did want to show the long strings only once (which I'm fine with),
can we keep CompartmentStats the same as it is and simply sum the strings when
we compute the summary reporter?  That way we don't have this field which is
sometimes a lie, and which we report but always expect to be 0, and so on.
Loose coupling is good.  And anyway, I'm pretty sure that what you suggested
doesn't let us get rid of the CompartmentStats copy constructor.

> This is misleading -- it looks like you're claiming it can never fail, when
> really you just don't want to have to think about what happens when it fails.

I was only cargo-culting (from this file, no less)!

> I suggest you put the growBy() call in the condition, so that if it fails,
> the string'll be counted among the non-huge strings (and the browser will
> likely crash soon after).

Can we just make this vector explicitly infallible?  Does that exist in JS
land?

> Although, in the description below you mention the |length| rather than the
> |size|, so perhaps  it should be "strings of length lower than %d", where the
> %d param is |MinSize / sizeof(jschar)|.

Ah, but the allocator may round up, so this isn't right!  (I mentioned this
explicitly in a comment on LongStringInfo.  :)

I mention the length in the description because the size is already reported
(the node size).  Although when we have multiple strings with the same prefix
and length collide, the size we report is N * malloc_usable_size(str).  That
doesn't bother me much.

> Why do you want to avoid the sundries thresholding? 

I don't think we should sundry-ify non-huge strings.  That is, the decision as
to whether to call out a string should be made in just one place
(LongStringStats::MinSize, or a pref), and not in two places
(LongStringStats::MinSize and sundry size).

Now, we could statically assert that MinSize >= sundry size.  But then that
precludes making either one dynamic.

But it seemed simpler to me just to say that MinSize takes precedence over the
sundry size.  This is also reasonable because sundries apply to aggregate
quantities (total amount of memory spent on X), while we're reporting here
individual objects.

Having "strings/long/sundries" does not make much sense to me.  Those are
called short strings.

(Another option would be to leave MinSize where it is, and, when iterating over
the long strings in the memory reporter, move any long strings smaller than the
sundry size into the small strings category.  This is kind of complicated, but
I'd be OK with that.)
> Hm.  I kind of liked the output the way it was.  There's something to be said
> for keeping the runtime summary identical in form to the compartment reporters.

Another (small) advantage of the way it is in the patch is that it makes it easier to see string prefix+length collisions across compartments.

(I think your way is fine too, though.)
> But if we did want to show the long strings only once (which I'm fine with),
> can we keep CompartmentStats the same as it is and simply sum the strings
> when
> we compute the summary reporter?  That way we don't have this field which is
> sometimes a lie, and which we report but always expect to be 0, and so on.
> Loose coupling is good.  And anyway, I'm pretty sure that what you suggested
> doesn't let us get rid of the CompartmentStats copy constructor.

ReportCompartmentStats() is used both for individual compartments and the totals tree at the end.  To do what you'd want it'd have to take an extra boolean arg to distinguish the two cases.

Ok, maybe I can live with the huge strings showing up in the totals.

> > This is misleading -- it looks like you're claiming it can never fail, when
> > really you just don't want to have to think about what happens when it fails.
> 
> I was only cargo-culting (from this file, no less)!

The other case involves a prior call to reserve(), so it really is infallible, as the comment above it explains.

> Can we just make this vector explicitly infallible?  Does that exist in JS
> land?

Nope :/

> > Why do you want to avoid the sundries thresholding? 
> 
> I don't think we should sundry-ify non-huge strings.  That is, the decision
> as
> to whether to call out a string should be made in just one place
> (LongStringStats::MinSize, or a pref), and not in two places
> (LongStringStats::MinSize and sundry size).

I agree.  My suggestion (defining one in terms of the other) did that.  It's a bit weird having two values but that seems unavoidable due to the whole JS/xpconnect split.

> Now, we could statically assert that MinSize >= sundry size.  But then that
> precludes making either one dynamic.

YAGNI :)

> Having "strings/long/sundries" does not make much sense to me.  Those are
> called short strings.

With my suggestion you won't have entries like that;  all huge strings will by definition not be sundry-able.
> (Indeed, then SUNDRIES_THRESHOLD could be defined in terms of LongStringInfo::MinSize.)

Oh, I see this now.

That seems bogus to me.  The sundries threshold shouldn't depend on the size above which we consider a string to be "huge".  That's tight coupling in an awkward way.  It would make sense to define LongStringInfo::MinSize in terms of the sundries size, but we can't go in that direction.  :(
You could add a JS::sundriesThreshold constant in jsmemorymetrics.cpp, and use it to replace both LongString::MinSize and SUNDRIES_THRESHOLD.
(In reply to Nicholas Nethercote [:njn] from comment #9)
> You could add a JS::sundriesThreshold constant in jsmemorymetrics.cpp, and
> use it to replace both LongString::MinSize and SUNDRIES_THRESHOLD.

That's in the wrong place -- the sundries threshold should be in the XPC code -- but maybe that's good enough.
> (And note that ReplaceChar is probably faster than ReplaceSubstring.)

>  void ReplaceChar( char_type aOldChar, char_type aNewChar );
>  void ReplaceChar( const char* aSet, char_type aNewChar );

Ignoring the fact that this is a premature optimization, ReplaceChar only lets me do a 1 -> 1 substitution, and I want to do a 1 -> 2 substitution.
> How about "Memory allocated to hold string characters of non-huge strings, i.e. those strings whose 
> characters take up less than %d bytes of memory."

I didn't use this because it's a description for string-chars/huge, not for string-chars/huge/my-particular-string.
(In reply to Justin Lebar [:jlebar] from comment #12)
> > How about "Memory allocated to hold string characters of non-huge strings, i.e. those strings whose 
> > characters take up less than %d bytes of memory."
> 
> I didn't use this because it's a description for string-chars/huge, not for
> string-chars/huge/my-particular-string.

Oh, sorry!  I attached this to the wrong MR description.
Comment on attachment 672068 [details] [diff] [review]
Part 2, v2: Explicitly call out long strings in about:memory.

Review of attachment 672068 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good.  Just a few nits.

::: js/public/MemoryMetrics.h
@@ +27,5 @@
> +// memoryReportingSundriesThreshold.
> +//
> +// We need to define this value here, rather than in the code which actually
> +// generates the memory reports, because HugeStringInfo uses this value.
> +static const size_t memoryReportingSundriesThreshold = 8 * 1024;

Ok, some JSAPI weirdness I failed to know/mention:  apparently static constants are problematic.  So please make this a "friend" function in the |js| namespace (not in the |JS| namespace).  I.e. do this in public/MemoryMetrics.h:

namespace js {
JS_FRIEND_API(size_t) MemoryReportingSundriesThreshold();
}

and then define it in jsmemorymetrics.cpp.  Note the upper-case 'M'.  A try run would be worthwhile to make sure it compiles on Windows.

@@ +73,5 @@
> +    size_t length;
> +    size_t size;
> +
> +    // We record the first 32 chars of the string here.
> +    char buffer[32];

jschar is usually used for strings in the JS engine, so please add a comment that this string contains escapes in order to make it char[] instead of jschar[].

::: js/xpconnect/src/XPCJSRuntime.cpp
@@ +1421,3 @@
>  // Nb: all non-GC compartment reports are currently KIND_HEAP, and this macro
>  // relies on that.
> +#define CREPORT_BYTES(_path, _amount, _descLiteral)                           \

Oh, I think you can now do this:

#define CREPORT_BYTES(_path, _amount, _descLiteral) \
  CREPORT_BYTES2(_path, _amount, NS_LITERAL_CSTRING(_descLiteral))

@@ +1690,5 @@
> +
> +        CREPORT_BYTES2(
> +            cJSPathPrefix +
> +            nsPrintfCString("string-chars/huge/string (length %d): %s [...]",
> +                            info.length, escapedString.get()),

An example:

  string(length 631106):  [...]

A few things about this look funny to me:

- The combination of spaces and colon is atypical for about:memory.

- The string isn't quoted, which is surprising.

- The " [...]" indicates "truncated", right?  It looks strange to me.

How about this?

  string("...", 631106)

And maybe add "length=" before the length?
Attachment #672068 - Flags: review?(n.nethercote) → review+
>  string("...", 631106)

I like this, but can I flip it so the length comes first?  That way we truncate the long string before we truncate the length.

  string(length=631106, "...")
> I like this, but can I flip it so the length comes first?

Sure!
Windows builds are coming at https://tbpl.mozilla.org/?tree=Try&rev=a280b3a73ff1.  I particularly want to test that my cleverness to avoid passing a non-literal to NS_LITERAL_CSTRING doesn't cause problems on Windows.

I should also fix that macro...
> I should also fix that macro...

Filed as bug 802469, if you're curious.
Attachment #671530 - Attachment is obsolete: true
Attachment #671531 - Attachment description: Part 2: Escape quotes in memory reporter paths when dumping to JSON. → Part 1: Escape quotes in memory reporter paths when dumping to JSON.
Try run for a280b3a73ff1 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=a280b3a73ff1
Results (out of 49 total builds):
    success: 45
    warnings: 4
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/jlebar@mozilla.com-a280b3a73ff1
Comment on attachment 671531 [details] [diff] [review]
Part 1: Escape quotes in memory reporter paths when dumping to JSON.

[Approval Request Comment]
Trivial fix to memory report dumping code.  No string changes.
Attachment #671531 - Flags: approval-mozilla-aurora?
Comment on attachment 672068 [details] [diff] [review]
Part 2, v2: Explicitly call out long strings in about:memory.

[Approval Request Comment]
Change to memory reporting that we need to understand where memory is going in B2G.  Affects desktop as well as mobile.  Most of this code should not be run except when you load about:memory.
Attachment #672068 - Flags: approval-mozilla-aurora?
Comment on attachment 671531 [details] [diff] [review]
Part 1: Escape quotes in memory reporter paths when dumping to JSON.

If you think of anything outside of about:memory that may regress, please call it out for QA.
Attachment #671531 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #672068 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
QA - please take a look at about:memory and make sure that there are no obvious regressions once this lands.
Keywords: qawanted, verifyme
https://hg.mozilla.org/mozilla-central/rev/6d4ffa88e7d3
https://hg.mozilla.org/mozilla-central/rev/b803ce91fc0e
Status: NEW → RESOLVED
Closed: 12 years ago
Flags: in-testsuite-
Resolution: --- → FIXED
Target Milestone: --- → mozilla19
This doesn't apply on Aurora because Aurora doesn't have bug 799019.  I've asked for approval of that bug.
This issue seems to be fixed. I can't see no obvious regressions.
Verified with Firefox 18 beta 1.

User Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:18.0) Gecko/18.0 Firefox/18.0
Build ID: 20121121075611
Status: RESOLVED → VERIFIED
Keywords: verifyme
QA Contact: manuela.muntean
Depends on: 893222
Depends on: 899256
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: