Bug 1044709 - DMD: make output machine-readable
Status: RESOLVED FIXED (Closed)
Opened 10 years ago • Closed 10 years ago
Categories: Core :: DMD, defect
Target Milestone: mozilla35
People: Reporter: n.nethercote; Assignee: n.nethercote
Whiteboard: [MemShrink:P2]
Attachments: 4 files, 3 obsolete files
DMD currently uses an ad hoc output that's not hard to parse, but does require a custom parser. It would probably be better to use a standard machine-readable form such as JSON or Google's protocol buffers. This would make the creation of new scripts easier. A version number should be part of the format.
Assignee
Comment 1 • 10 years ago
Another constraint: fix-linux-stack.pl and fix_macosx_stack.py currently require that stack trace entry lines have a particular format. It would probably make sense to factor out the stack trace conversion code into separate modules, rewrite the fix scripts in terms of those modules, and then the new DMD post-processing scripts could use those modules too. Rewriting fix-linux-stack.pl in Python would make sense at the same time.
Assignee
Comment 2 • 10 years ago
Also, DMD will soon be able to produce two or more kinds of outputs, depending on how it's invoked. So that needs to be handled somehow.
Comment 3 • 10 years ago
(In reply to Nicholas Nethercote [:njn] from comment #1)

> Rewriting fix-linux-stack.pl in Python would make sense at the same time.

FWIW, when I was ignorant of fix-linux-stack.pl, I (essentially) rewrote it in perl and python (partly to debug my perl). The perl was significantly faster out of the gate; I don't know if the python code could have been made faster or not.
Assignee
Comment 4 • 10 years ago
I've done quite a bit of profiling with fix-linux-stack.pl (e.g. I tried parallelizing it a while back), and my experience is that its running time is entirely dominated by the addr2line calls; in-script time is negligible.
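For context, here is a minimal sketch of the kind of symbolization call involved (hypothetical Python, not the actual fix-linux-stack.pl code; the library path and address are invented). Each lookup goes through addr2line, which is why the in-script time is negligible by comparison:

  import subprocess

  # Hypothetical helper: symbolize one address in one binary via addr2line.
  # -C demangles C++ names, -f prints the function name, -e selects the
  # binary. Each call forks a process and walks DWARF debug info.
  def symbolize(lib_path, addr):
      out = subprocess.check_output(
          ['addr2line', '-C', '-f', '-e', lib_path, addr])
      func, file_line = out.decode().splitlines()[:2]
      return '%s (%s)' % (func, file_line)

  # e.g. symbolize('libxul.so', '0x1a2b3c')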
Assignee
Updated • 10 years ago
Whiteboard: [MemShrink] → [MemShrink:P2]
Assignee
Comment 6 • 10 years ago
Here's a patch in good but not-quite-complete state. I will ask erahm to take a look, and then ask mccr8 to review the final patch once it's polished.

First of all, making the output machine readable is a great idea, so thanks for pushing me to do this. Doing post-processing is *so* much easier and more flexible in Python. It's also incredibly useful to be able to do multiple kinds of post-processing on a single output file.

Things that are in place:

- DMD now produces JSON output, using mfbt/JSONWriter. The format is described in DMD.h.

- DMD now only prints stats about its own data structures and memory usage if the (new) --show-dump-stats=yes option is given.

- DMD.cpp is 396 lines shorter than it was, yay.

- The dmd.py script does all the post-processing. It's 416 lines, but it can do numerous things that the old DMD could not -- just look at its options. (E.g. the --show-all-block-sizes option implements bug 855878.)

- The way test mode works has changed: instead of producing a single output file containing eight separate outputs, it now produces four files each with a single output, and each file is run through dmd.py in two different ways. The expected test file outputs have been split up similarly, but their contents are almost entirely unchanged.

- (Semi-related) I replaced the JSON schema describing the memory report format with inline comments in the example. It's much shorter, easier to read, and there's no point having a formal JSON schema if we're not using it as input to any programs.

Things still to do:

- dmd.py doesn't do stack-fixing; you have to do that yourself either before or after. I'm not sure what's the best way to handle this.

- I haven't implemented diffs. Shouldn't be hard. The plan is that if you give dmd.py two filenames it'll diff them.

- Need to resolve the situation with frame numbers in bug 1062709, which will affect things somewhat.

- Needs more testing, especially of the various dmd.py options. I will add some hand-crafted JSON files for this purpose, as opposed to the current testing which only works with real DMD output.
Attachment #8493543 - Flags: feedback?(erahm)
Comment 8 • 10 years ago
Comment on attachment 8493543 [details] [diff] [review]
DMD: emit JSON output and use Python for post-processing

Review of attachment 8493543 [details] [diff] [review]:
-----------------------------------------------------------------

This looks awesome! As far as the overall design goes, my main thought is we don't need the associative arrays. It'll save us some space on the output side, parsing might end up a bit easier too. OTOH it seems like it might make generation more complicated and possibly have more overhead, so I can go either way on this.

For mccr8's sake it might be helpful to break this up into a few patches when you're ready for full review (particularly splitting out any cleanup/reformatting into a rubber-stampable patch).

::: memory/replace/dmd/DMD.cpp
@@ +1343,5 @@
>    StatusMsg("  (prime numbers are recommended)\n");
>    StatusMsg("  --max-frames=<1..%d>       Max. depth of stack traces [%d]\n",
>              int(mMaxFrames.mMax),
>              int(mMaxFrames.mDefault));
> +  StatusMsg("  --show-dump-stats=<yes|no> Show stats about dumps? [no]\n");

We'll need to update |mach dmd| with changed params.

@@ +1658,5 @@
> +  aWriter.StartObjectProperty("stacks");
> +  {
> +    for (StackTraceSet::Enum e(usedStackTraces); !e.empty(); e.popFront()) {
> +      const StackTrace* const st = e.front();
> +      snprintf(buf, buflen, "0x%" PRIxPTR, uintptr_t(st));

Docs indicate no leading 0x, if we can I'd stick with that.

@@ +1682,5 @@
> +      char pcStr[32];
> +      snprintf(pcStr, 32, "0x%" PRIxPTR, uintptr_t(pc));
> +
> +      // Use 99 for the frame number. See the JSON format description comment
> +      // in DMD.h to understand why.

Can we make the frame number optional instead? I guess another workaround would be to do |StringProperty(pcStr, buf + strlen("#99: "))|.

::: memory/replace/dmd/DMD.h
@@ +46,5 @@
> +//
> +//  // Information about how DMD was invoked. A required object.
> +//  "invocation": {
> +//    // The contents of the $DMD environment variable. A required string.
> +//    "dmdEnvVar": "1",

Maybe we should just omit this and explicitly list all the config values.

@@ +66,5 @@
> +//
> +//      // The stack trace at which the block was allocated. A mandatory
> +//      // string (in both sampled and non-sampled blocks), which indexes into
> +//      // the "stacks" object.
> +//      "allocatedAt": "7f4ead5e9e60",

Are we going to include the heap address too? I seem to recall discussing that w/ mccr8.

@@ +81,5 @@
> +//    },
> +//  ],
> +//
> +//  // The stack traces referenced by the "blocks" array.
> +//  "stacks": {

This could just be an array, and |block.reportedAt| could point to the index.

@@ +93,5 @@
> +//  // The frames referenced by the "stacks" object. The descriptions can be
> +//  // quite long, so they are stored separately from the "stacks" object
> +//  // so that each one only has to be written once, which reduces file size
> +//  // significantly.
> +//  "frames": {

And this could just be an array too.

@@ +100,5 @@
> +//    // with #99 -- stack frame numbers are not meaningful when stack frames
> +//    // are written in isolation like this, so we use 99 which is an
> +//    // obviously bogus value. The post-processing scripts can easily replace
> +//    // #99 with a correct frame number when they reconstruct each stack
> +//    // trace.

Can't we just omit the "#99: " and let the scripts add that in?

::: memory/replace/dmd/check_test_output.py
@@ +52,4 @@
>      sys.exit(1)
>
> +  subprocess.call(fix, stdin=open(in_name, 'r'),
> +                  stdout=open(fixed_name, 'w'))

These should be callable as libraries now, right?

::: memory/replace/dmd/dmd.py
@@ +144,5 @@
> +                   default=sortByChoices.keys()[0],
> +                   help='which metric should records be sorted by?')
> +    p.add_argument('input1', nargs='?', type=argparse.FileType('r'),
> +                   default=sys.stdin)
> +    p.add_argument('input2', nargs='?', type=argparse.FileType('r'))

Another interesting follow up would be to hook into stack fixing here. Something like:

--fix-stacks => foreach frame in frames: stackfixer.fix(frame)
Attachment #8493543 - Flags: feedback?(erahm) → feedback+
Assignee
Comment 9 • 10 years ago
erahm and I discussed some of this via IRC, but not all of it, so here are some responses.

> This looks awesome! As far as the overall design goes, my main thought is we
> don't need the associative arrays. It'll save us some space on the output
> side, parsing might end up a bit easier too. OTOH it seems like it might
> make generation more complicated and possibly have more overhead, so I can
> go either way on this.

There are three moderate benefits to associative arrays:

- Easier to generate.

- When reading the JSON file, it's easier to search for unique addresses than for non-unique indices, and finding the 999th stack trace in an array of several thousand is a pain.

- I like having the PCs in the file. Not for any particular reason, it just feels nice.

As for disadvantages: parsing is no harder (the JSON parsing lib does it all for you), so the only real disadvantage is it increases the file size. With one sample file I tried changing all the addresses to three digit values and the file size decreased from 7 MB to 5 MB. Hmm.

> Docs indicate no leading 0x, if we can I'd stick with that.

I'll remove the 0x, mostly to save space.

> Can we make the frame number optional instead?

Maybe. The fix*.py scripts rely on it being there if we use them in a "run on this whole file" fashion. If we do targeted fixing of just the frame descriptions we may be able to do something else.

> Are we going to include the heap address too? I seem to recall discussing
> that w/ mccr8.

mccr8 needs it for his SCC stuff, but we don't need it for the existing reports output, so I've left it out for now. I think mccr8's stuff will have to be opt-in, because printing heap block contents will increase the size of the output greatly.

> > +  subprocess.call(fix, stdin=open(in_name, 'r'),
> > +                  stdout=open(fixed_name, 'w'))
>
> These should be callable as libraries now, right?

Yes. But I left this code unchanged because it works as is, and there are enough other changes going on here anyway...
Assignee
Comment 10 • 10 years ago
> As for disadvantages: parsing is no harder (the JSON parsing lib does it all
> for you), so the only real disadvantage is it increases the file size. With
> one sample file I tried changing all the addresses to three digit values and
> the file size decreased from 7 MB to 5 MB. Hmm.
Here's an alternative: I have code working which converts each address into a unique base-32 number. I'm still using associative arrays, which keeps the inspecting of the JSON code simple, but the numbers are typically 2 or 3 chars, which saves lots of space. PCs are no longer there, but that's probably not a great loss.
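For illustration, here is a rough Python equivalent of that conversion (a hedged sketch; DMD's real implementation is in C++, and the digit set below is the one the patch uses, quoted in comment 16):

  DIGITS = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdef'

  def to_id(n):
      # Convert a non-negative table index into a short base-32 string.
      if n == 0:
          return DIGITS[0]
      chars = []
      while n > 0:
          chars.append(DIGITS[n % 32])
          n //= 32
      return ''.join(reversed(chars))

  # Early table entries get IDs 'A', 'B', ..., so a typical file needs only
  # 2 or 3 chars per index instead of a 12-digit hex address:
  # to_id(0) == 'A', to_id(31) == 'f', to_id(1000) == 'fI'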
Assignee
Comment 11 • 10 years ago
Ok, I think this is ready! Significant new stuff since the previous patch:

- The output files are about half the size they used to be, due to (a) shorter property names (e.g. 'req' instead of 'reqSize'), (b) omitting some properties when unnecessary (e.g. no 'slop' property when it's zero, as it often is), and (c) using base-32 indices instead of addresses, as per comment 10.

- I updated |mach dmd|.

- I moved the nsIMemoryInfoDumper.idl comment change into a separate patch.

Stuff I'm postponing to follow-up bugs:

- A --fix-stacks option that will overwrite the input file with a stack-fixed copy. Postponing because it involves some non-trivial build system jiggery-pokery to get dmd.py to talk to fix*.py.

- Improved testing, esp. of the various flags in dmd.py. I'll do this as part of bug 1073312.

- Diffs. That's bug 1014343.
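To make the new format concrete, here is a hedged sketch of the v1 layout and the kind of consumption dmd.py does (all field values are invented; the property names and layout follow the DMD.h description quoted in the reviews below):

  import json

  sample = json.loads('''
  {
    "version": 1,
    "invocation": { "dmdEnvVar": "1", "sampleBelowSize": 4093 },
    "blockList": [
      { "req": 3584, "slop": 512, "alloc": "A" },
      { "alloc": "B", "reps": ["C"] }
    ],
    "traceTable": { "A": ["D", "E"], "B": ["E"], "C": ["D"] },
    "frameTable": { "D": "#99: foo (foo.cpp:10)",
                    "E": "#99: bar (bar.cpp:20)" }
  }
  ''')

  # Reconstruct a full stack trace, replacing the bogus '#99' frame numbers
  # with real two-digit ones, as the post-processing script must do.
  def stack(trace_key):
      frames = [sample['frameTable'][k]
                for k in sample['traceTable'][trace_key]]
      return ['#%02d: %s' % (i, f[len('#99: '):])
              for i, f in enumerate(frames)]

  # stack('A') == ['#00: foo (foo.cpp:10)', '#01: bar (bar.cpp:20)']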
Attachment #8495672 - Flags: review?(continuation)
Assignee
Updated • 10 years ago
Attachment #8493543 - Attachment is obsolete: true
Assignee
Comment 12 • 10 years ago
Attachment #8495673 - Flags: review?(continuation)
Comment 13 • 10 years ago
Could you attach a sample log please? Or at least a representative chunk of one.
Flags: needinfo?(n.nethercote)
Assignee
Comment 14 • 10 years ago
This is gzipped, stack-fixed output from a run with default options (i.e. sampling is on) after loading Gmail. It's 4.43 MB unzipped, and 0.67 MB zipped.
Flags: needinfo?(n.nethercote)
Assignee
Comment 15 • 10 years ago
BTW, I have follow-up plans to shrink the output some more by putting the contents of each block object and trace array on a single line. This will require some tweaking of JSONWriter.h.
Comment 16 • 10 years ago
Comment on attachment 8495672 [details] [diff] [review]
DMD: emit JSON output and use Python for post-processing

Review of attachment 8495672 [details] [diff] [review]:
-----------------------------------------------------------------

Do you think testing the actual intermediate JSON might be useful? I suppose it would be hard to get the final output right and have the intermediate stuff be right. I only lightly skimmed the tests.

I still need to look over dmd.py but I have reviewed the rest, and it looks fine to me modulo the review comments.

::: memory/replace/dmd/DMD.cpp
@@ +871,5 @@
>    {
>      return mAllocStackTrace_mSampled.Ptr();
>    }
>
> +  const StackTrace* ReportStackTrace1() const

I don't mind reviewing patches where you fix up a bunch of nits like this, but please split them out from patches containing substantive changes. Now I have to carefully check for each of these if you changed a return type, a function name or an argument at the same time.

@@ -1667,5 @@
>
>    StatusMsg("running stress mode...\n");
>    RunStressMode(fp);
>    StatusMsg("finished stress mode\n");
> -  fclose(fp);

Is there some reason you aren't closing this file any more? Should RunStressMode be converted to the UniquePointer stuff like RunTestMode?

@@ +1591,5 @@
> +// the terminating null char), which is the base-32 representation of
> +// 0xffffffff.
> +//
> +// We use base-32 values for indexing into the traceTable and the frameTable,
> +// for the following reasons.

: not . maybe?

@@ +1606,2 @@
>  {
> +  static const char digits[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdef";

I don't know if I agree with the particular aesthetics of this set of digits, but basically nobody is going to look at these so an arbitrary set of them is okay by me. (People have a lot of opinions on base 32: http://en.wikipedia.org/wiki/Base32 )

@@ +1610,5 @@
> +  *b = '\0';
> +  do {
> +    b--;
> +    if (b == aBuf) {
> +      MOZ_CRASH();  // buffer too small

nit: MOZ_CRASH() can take a string argument, so please use one instead of the comment.

@@ +1655,4 @@
>
> +  aWriter.Start();
> +  {
> +    aWriter.IntProperty("version", 1);

nit: Please define some kind of kVersionNumber constant near the top of this file and use that instead of just the literal 1.

@@ +1712,5 @@
>
> +  StatusMsg("  Constructing the stack trace table...\n");
> +
> +  static const size_t buflen = 1024;
> +  char buf[buflen];

nit: It looks like this isn't used until the frameTable, so maybe move it down closer to that and give it a clearer name like frameBuf.

@@ +1754,2 @@
>
> +  #undef ID

micronit: It seems somewhat odd to me to undef ID at a different level of indentation than you define it.

@@ +1795,5 @@
>    size_t hits = locService->NumCacheHits();
>    size_t misses = locService->NumCacheMisses();
>    size_t requests = hits + misses;
> +  StatusMsg("  Location service: %10s requests\n",
> +            Show(requests, buf1, kBufLen));

Do you want to add reports here for the various new data structures you've added, like idMap, usedStackTraces, usedPcs, etc.? It seems to me like they could add up.

::: memory/replace/dmd/DMD.h
@@ +36,5 @@
>
> +// Determines which heap blocks have been reported, and dumps JSON output
> +// (via |aWriter|) describing the heap.
> +//
> +// What follow is some sample output, with comments explaining the format and

nit: "What follows" maybe? The "some" is also not necessary.

@@ +43,5 @@
> +// omitting properties whenever possible.
> +//
> +// {
> +//   // The version number of the format, which will be incremented each time
> +//   // backwards-incompatible changes are made. A mandatory integer.

nit: "required" here instead of "mandatory" to be consistent? Unless that implies something else.

@@ +60,5 @@
> +//   "blockList": [
> +//     // An example of a non-sampled heap block.
> +//     {
> +//       // Requested size, in bytes. In non-sampled blocks this is a
> +//       // required integers. In sampled blocks this is not present, and the

nit: "integers" should be "integer"

nit: "is not present, and" maybe to "is not present:", to emphasize that the latter is okay because of the former.

@@ +63,5 @@
> +//       // Requested size, in bytes. In non-sampled blocks this is a
> +//       // required integers. In sampled blocks this is not present, and the
> +//       // requested size is equal to the "sampleBelowSize" value. Therefore,
> +//       // the presence/absence of this field indicates if the block is
> +//       // non-sampled/sampled.

nit: This parallel structure seems overly complex to me. Maybe "Therefore, the block is sampled if and only if the "req" field is absent."

@@ +68,5 @@
> +//       "req": 3584,
> +//
> +//       // Requested slop size, in bytes. This is only present if it is
> +//       // non-zero. Because sampled blocks never have slop, this is never
> +//       // present for non-sampled blocks.

Should that be "for sampled blocks"? I'd also prefer something more like "This is never present for sampled blocks, because they have no slop." (I'm not a fan of leading with because...)

@@ +72,5 @@
> +//       // present for non-sampled blocks.
> +//       "slop": 512,
> +//
> +//       // The stack trace at which the block was allocated. A mandatory
> +//       // string (in both sampled and non-sampled blocks), which indexes into

nit: "(in both sampled and non-sampled blocks)" feels unnecessary to me. I think it is easily inferred that if you don't specify that it is only mandatory for one or the other, then it isn't mandatory. Also, this should probably be "required" instead of "mandatory" for consistency.

@@ +74,5 @@
> +//
> +//       // The stack trace at which the block was allocated. A mandatory
> +//       // string (in both sampled and non-sampled blocks), which indexes into
> +//       // the "traceTable" object.
> +//       "alloc": "A",

It is a little odd that this one has a trailing comma, but some places above (like "sampleBelowSize") don't.

@@ +94,5 @@
> +//   // which stacks correspond to which references in the "blockList" array.
> +//   "traceTable": {
> +//     // Each property corresponds to a stack trace mentioned in the "blocks"
> +//     // object. Each element is an index into the "frameTable" object.
> +//     "A": ["D", "E"],

It might be kind of slick if stack traces and frame names were somehow distinguishable, like one is upper case and one is lower case. I guess they are obvious from context, so maybe not really.

@@ +102,5 @@
> +//
> +//   // The stack frames referenced by the "traceTable" object. The
> +//   // descriptions can be quite long, so they are stored separately from the
> +//   // "traceTable" object so that each one only has to be written once, which
> +//   // reduces file size significantly.

This could also be an array, but "which reduces file size significantly" feels a little redundant with "the descriptions can be quite long".

@@ +107,5 @@
> +//   // again, making it an object makes it easier to find which frames
> +//   // correspond to which references in the "traceTable" object.
> +//   "frameTable": {
> +//     // Each property is a frame key mentioned in the "traceTable" object.
> +//     // Each value is a required string containing a frame description,

"required" seems a little unnecessary, as you can't omit just a value.

@@ +111,5 @@
> +//     // Each value is a required string containing a frame description,
> +//     // starting with #99 -- stack frame numbers are not meaningful when
> +//     // stack frames are written in isolation like this, so we use 99 which
> +//     // is an obviously bogus value. The post-processing scripts must
> +//     // replace #99 with a correct frame number when they reconstruct each

Wait, every one of these starts with "#99: "? Is there some reason you can't just omit it?

::: memory/replace/dmd/check_test_output.py
@@ +31,3 @@
>      # Filenames
> +    i = str(i)
> +    tmpdir = tempfile.gettempdir()

tmpdir -> tmp_dir maybe?

@@ +31,5 @@
>      # Filenames
> +    i = str(i)
> +    tmpdir = tempfile.gettempdir()
> +    ijson = i + '.json'
> +    itxt = i + '.txt'

IMHO, ijson and itxt are more confusing than clarifying and don't save all that many characters. I think constructing these strings like this would be easier to understand: 'test-%s-fixed-%d.json' % (kind, i)

@@ +45,1 @@
>      sysname = platform.system()

sysname -> sys_name maybe?

@@ +45,3 @@
>      sysname = platform.system()
> +    if sysname == 'Linux':
> +        fix = os.path.join(srcdir, 'tools', 'rb', 'fix_linux_stack.py')

Since these are in the same directory, maybe just select the file name, then share the rest of this for both cases?

@@ +117,5 @@
> +    if (len(sys.argv) != 2):
> +        print('usage:', sys.argv[0], '<topsrcdir>')
> +        sys.exit(1)
> +
> +    srcdir = sys.argv[1]

srcdir -> src_dir?

::: memory/replace/dmd/dmd.py
@@ +1,1 @@
> +#! /usr/bin/python

This file should be under tools, probably in a new directory dmd. That seems like a more natural place for something that a random Gecko developer is going to use. I don't know if you should get some kind of rubberstamp from dbaron for doing that or not.

::: python/mozbuild/mozbuild/mach_commands.py
@@ -927,5 @@
>      @CommandArgument('--sample-below', default=None, type=str,
>          help='The sample size to use, [1..n]. Default is 4093.')
>      @CommandArgument('--max-frames', default=None, type=str,
>          help='The max number of stack frames to capture in allocation traces, [1..24] Default is 24.')
> -    @CommandArgument('--max-records', default=None, type=str,

The mach changes look fine to me, but they should get review from a mach person, maybe mshal.

::: xpcom/base/nsMemoryInfoDumper.cpp
@@ +773,3 @@
>    // if DMD is enabled.
>    nsCString dmdFilename;
> +  MakeFilename("dmd", aIdentifier, aPid, "json.gz", dmdFilename);

AFAICT, get_about_memory.py has the .txt hardcoded in it, so you need to do a pull request thing to update that too: https://github.com/mozilla-b2g/B2G/blob/master/tools/get_about_memory.py#L128

Relatedly, somebody should make sure that DMD still works end-to-end on B2G before this lands, as it is DMD's wonkiest and most important platform.
Assignee
Comment 17 • 10 years ago
> Do you think testing the actual intermediate JSON might be useful? I
> suppose it would be hard to get the final output right and have the
> intermediate stuff be right.

You mean test the contents of the JSON files? Hmm... doesn't seem like it would add much.

> Is there some reason you aren't closing this file any more?

I'll add a destructor to FpWriteFunc that closes the file. I actually need that for the xpcshell test I'm working on anyway.

> Should RunStressMode be converted to the UniquePointer stuff like RunTestMode?

Good idea, I'll do it.

> > +// We use base-32 values for indexing into the traceTable and the frameTable,
> > +// for the following reasons.
>
> : not . maybe?

I always use "for the following <X>." when I deliberately want to use a '.' in this kind of situation, which I do here because the bullet points are themselves each whole sentences.

> > +  static const char digits[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdef";
>
> I don't know if I agree with the particular aesthetics of this set of
> digits, but basically nobody is going to look at these so an arbitrary set
> of them is okay by me. (People have a lot of opinions on base 32:
> http://en.wikipedia.org/wiki/Base32 )

Interesting. Given that I have a comment explaining my choice, I'm happy with this and will leave it as is.

> Do you want to add reports here for the various new data structures you've
> added, like idMap, usedStackTraces, usedPcs, etc.? It seems to me like they
> could add up.

Done.

> nit: "required" here instead of "mandatory" to be consistent? Unless that
> implies something else.

I meant to use "mandatory" throughout. I've changed all the cases I missed.

> It is a little odd that this one has a trailing comma, but some places above
> (like "sampleBelowSize") don't.

Oh, that's a bug; JSON doesn't allow trailing commas. I've fixed this and a few other nits, and I checked that it parses correctly in Python.

> It might be kind of slick if stack traces and frame names were somehow
> distinguishable, like one is upper case and one is lower case. I guess they
> are obvious from context, so maybe not really.

Yeah... I can't see an obvious way to do it, and it doesn't seem worth the effort.

> Wait, every one of these starts with "#99: "? Is there some reason you
> can't just omit it?

Yeah, after bug 1062709 a frame number (preceded by '#') is required for the fix*.py scripts to parse it correctly. It might be possible to avoid it in a follow-up, but that would make this patch even more complicated, so this is the least-bad solution for now.

> ::: memory/replace/dmd/check_test_output.py

I'll address all your nits, but FWIW, this file is going to be rewritten as JS for use in an xpcshell test in bug 1073312.

> ::: memory/replace/dmd/dmd.py
> @@ +1,1 @@
> > +#! /usr/bin/python
>
> This file should be under tools, probably in a new directory dmd. That
> seems like a more natural place for something that a random Gecko developer
> is going to use. I don't know if you should get some kind of rubberstamp
> from dbaron for doing that or not.

I won't do this, for the following reason. Bug 1074008 will add a --fix-stacks option to dmd.py that will do stack fixing for you. For this to work, dmd.py will need to be installed (so it can find the relevant fix*.py script), and so the official way to run it will be from $OBJDIR/dist/bin/dmd.py. With this in mind, keeping the script next to DMD's C++ code and test files seems like a good idea.

> ::: python/mozbuild/mozbuild/mach_commands.py
>
> The mach changes look fine to me, but they should get review from a mach
> person, maybe mshal.

The changes are simple enough that I don't think this is necessary.

> AFAICT, get_about_memory.py has the .txt hardcoded in it, so you need to do
> a pull request thing to update that too:

Hmm, good catch. This'll be the second bug in a row (after bug 1062709) where I'll have to coordinate landings to b2g-inbound and B2G. Separate repos suck :(

I've addressed all your other comments.
Assignee
Comment 18 • 10 years ago
Changes in this version:

- It addresses mccr8's review comments.

- I reverted the conversion of all the " chars to ' in check_test_output.py. Unnecessary churn, esp. for a file that will soon be gone.

- I created memory/replace/dmd/test/ and moved all the test files inside it. This will save me doing it for bug 1073312.

- I added a -o option to dmd.py. This will also be used for bug 1073312.
Attachment #8496641 - Flags: review?(continuation)
Assignee
Updated • 10 years ago
Attachment #8495672 - Attachment is obsolete: true
Attachment #8495672 - Flags: review?(continuation)
Assignee
Comment 19 • 10 years ago
One more tweak.
Attachment #8496643 - Flags: review?(continuation)
Assignee
Updated • 10 years ago
Attachment #8496641 - Attachment is obsolete: true
Attachment #8496641 - Flags: review?(continuation)
Comment 20 • 10 years ago
> Yeah, after bug 1062709 a frame number (preceded by '#') is required for the fix*.py to parse it correctly
Ah, that makes sense. You could change it to #0 to save some bytes. ;)
Assignee
Comment 21 • 10 years ago
> You could change it to #0 to save some bytes. ;)
Nope! The scripts expect two digits after the '#' :)
Comment 22 • 10 years ago
Comment on attachment 8496643 [details] [diff] [review]
DMD: emit JSON output and use Python for post-processing

Review of attachment 8496643 [details] [diff] [review]:
-----------------------------------------------------------------

Haven't looked deeper than this, but:

::: memory/replace/dmd/dmd.py
@@ +14,5 @@
> +import json
> +import sys
> +
> +# Command line args.
> +args = None

Please don't use a global variable for that.

@@ +17,5 @@
> +# Command line args.
> +args = None
> +
> +# The DMD output version this script handles.
> +outputVersion = 1

global "consts" should have CAPITALIZED_NAMES.
Assignee
Comment 23 • 10 years ago
> > +# Command line args.
> > +args = None
>
> Please don't use a global variable for that.
Why not? I originally had it as a local and ended up passing it everywhere, so it was as good as a global anyway...
Comment 24 • 10 years ago
Comment on attachment 8496643 [details] [diff] [review]
DMD: emit JSON output and use Python for post-processing

Review of attachment 8496643 [details] [diff] [review]:
-----------------------------------------------------------------

r=me with the review comments addressed. Sorry it took me so long to review this, but I actually did just spend that much time on it...

> Given that I have a comment explaining my choice,
> I'm happy with this and will leave it as is.

That sounds reasonable. The set of chars you use avoids the pitfalls they talk about, in that there's no confusing O vs 0 or I vs 1.

> I won't do this [move dmd.py], for the following reason.

Sounds alright, then. Please be sure to update the DMD page after this lands to mention how to use dmd.py. You don't need to duplicate the --help information, but just something like "to get human-readable output from the DMD log, run blah/blah/dmd/dmd.py". The current content there probably needs to be left there, with some comment about "before version X" added.

We really need a comment in DMD.h explaining what sampling is and how it works, at a high level. There are a few remarks about it in AllocCallback() but that comment is low level. That isn't the fault of this patch, though, so it can just go in a followup.

::: memory/replace/dmd/DMD.cpp
@@ -335,5 @@
>    const char* DMDEnvVar() const { return mDMDEnvVar; }
>
>    size_t SampleBelowSize() const { return mSampleBelowSize.mActual; }
>    size_t MaxFrames() const { return mMaxFrames.mActual; }
> -  size_t MaxRecords() const { return mMaxRecords.mActual; }

What is the effect on log size of dumping all records by default? Is the new format just so much more space efficient it doesn't matter?

@@ +655,5 @@
>  public:
>    StackTrace() : mLength(0) {}
>
>    uint32_t Length() const { return mLength; }
> +  const void* Pc(uint32_t i) const { MOZ_ASSERT(i < mLength); return mPcs[i]; }

This is pre-existing, but would you mind splitting this into multiple lines?

@@ +1706,5 @@
> +  {
> +    if (b.ReportStackTrace1()) {
> +      aWriter.StringElement(ID(b.ReportStackTrace1()));
> +    }
> +    if (b.ReportStackTrace2()) {

This is just an aside, but does this mean that if a block is reported 3 times, we only show the first two stacks that report it?

@@ +1794,5 @@
>
> +  StatusMsg("  Location service:      %10s bytes\n",
> +            Show(locService->SizeOfIncludingThis(MallocSizeOf), buf1, kBufLen));
> +  StatusMsg("  Used stack traces set: %10s bytes\n",
> +            Show(usedStackTraces.sizeOfIncludingThis(MallocSizeOf), buf1, kBufLen));

This is a pedantic observation, but for this and the next two variables you use sizeOfIncludingThis(), and they are allocated on the stack, so you are reporting a mixture of stack and heap memory, whereas we usually only report heap memory. I don't know if that is the right or wrong thing to do, but it did stick out to me.

::: memory/replace/dmd/DMD.h
@@ +105,5 @@
> +//   // easier to see which frames correspond to which references in the
> +//   // "traceTable" object.
> +//   "frameTable": {
> +//     // Each property key is a frame key mentioned in the "traceTable" object.
> +//     // Each property value is a string containing a frame description,

Thanks for the explanation in bugzilla comments about why the #99 is here. I think the fact that the frames are in this particular format for the benefit of the fix_stack scripts should be explicitly documented here. Maybe something like "Each property value is a string containing a frame description. Each frame description must be in a format recognized by the stack fixing scripts (eg fix_linux_stack.py). This is why each frame description begins with "#99:". The stack frame number is bogus because a frame in this table may be shared by different stack traces. The proper frame number can be reconstructed by later tools that output stack traces in a conventional non-shared format."

::: memory/replace/dmd/dmd.py
@@ +4,5 @@
> +# License, v. 2.0. If a copy of the MPL was not distributed with this
> +# file, You can obtain one at http://mozilla.org/MPL/2.0/.
> +
> +'''This script takes a JSON file emitted by DMD, and analyzes the heap
> +contents in various ways.'''

nit: the "in various ways" feels unnecessary.

@@ +14,5 @@
> +import json
> +import sys
> +
> +# Command line args.
> +args = None

glandium's suggestion to just pass around args sounds reasonable to me. It is a little ugly up front, but it is the sort of thing that will pay off in the future.

@@ +19,5 @@
> +
> +# The DMD output version this script handles.
> +outputVersion = 1
> +
> +# If --ignore-alloc-fns is specified, stack frames containing functions that

Could the name of |--ignore-alloc-fns| and |allocatorFns| be made more general? Later, we may want to remove frames that don't explicitly relate to allocation. (This isn't a hypothetical, I think removing things like XRE_main and the various "oops fix_stack.py couldn't figure out anything useful!" can be nice.) Oh, I guess you only remove allocator frames from the start. I guess we could add some option later that is more comprehensive or something and implies ignore-alloc-fns.

@@ +34,5 @@
> +    'operator new(',
> +    'operator new[](',
> +    'g_malloc',
> +    'g_slice_alloc',
> +    'g_slice_alloc0',

nit: g_slice_alloc0 is redundant with g_slice_alloc

@@ +53,5 @@
> +separator = '#' + '-' * 65 + '\n'
> +
> +# Format a number, with comma as a separator and a '~' prefix if it's sampled.
> +def number(n, isSampled):
> +    return ('{:}{:,d}').format('~' if isSampled else '', n)

nit: the () around the '{:}{:,d}' looks weird to me, and seems inconsistent with uses elsewhere in the file.

@@ +87,5 @@
> +        self.reqSizes = collections.defaultdict(int)
> +
> +    @staticmethod
> +    def cmpByIsSampled(r1, r2):
> +        if not r1.isSampled and r2.isSampled: return -1

Can't you just call some kind of cmp directly on isSampled? Or is the ordering not what you want?

@@ +94,5 @@
> +
> +    @staticmethod
> +    def cmpByUsableSize(r1, r2):
> +        # Sort by usable size, then req size, then by isSampled.
> +        if r1.usableSize > r2.usableSize: return -1

It might be a little clearer here and below if you just called some kind of cmp on usableSize, and then check if the result is 0 to break ties with isSampled.

@@ +114,5 @@
> +        return Record.cmpByIsSampled(r1, r2)
> +
> +
> +sortByChoices = {
> +    'usable': Record.cmpByUsableSize,  # first one is the default

You could just say "# default" here as the position already indicates that you are referring to the first one.

@@ +121,5 @@
> +}
> +
> +
> +def parseCommandLine():
> +    def range_1_24(string):

Why is this limited to 24? It seems rather arbitrary. I'm trying to think of a situation where 0 would be useful, but I guess I can't. ;)

@@ +141,5 @@
> +
> +    p.add_argument('-m', '--max-frames', type=range_1_24,
> +                   help='maximum number of frames to consider in each trace')
> +
> +    p.add_argument('-r', '--ignore-reports', action='store_true',

The phrasing of this help is confusing to me. Maybe it could be rephrased in terms of the effect on the output file? "Don't categorize blocks based on how they have been reported to the memory reporters" or something?

@@ +147,5 @@
> +                   'reported by memory reporters')
> +
> +    p.add_argument('-s', '--sort-by', choices=sortByChoices.keys(),
> +                   default=sortByChoices.keys()[0],
> +                   help='which metric should records be sorted by?')

The rest of these helps are not questions, so this shouldn't be either. Maybe "sort the records by the specified metric" instead.

@@ +167,5 @@
> +# --max-frames.
> +def maybeTrimStackTraces(traceTable, frameTable):
> +    if args.ignore_alloc_fns:
> +        for traceKey, frameKeys in traceTable.items():
> +            n = 0

nit: please use a better variable name than n. numSkippedFrames maybe.

@@ +170,5 @@
> +        for traceKey, frameKeys in traceTable.items():
> +            n = 0
> +            for frameKey in frameKeys:
> +                matched = False
> +                for fn in allocatorFns:

This is probably premature optimization, but in my DMD log parser, to filter boring frames I generate a regexp: https://github.com/amccreight/heapgraph/blob/master/dmd/parse_report.py#L39 Though I did see an increase in runtime from about 560ms to 670ms on the log you sent me when I passed in -a. ;)

@@ +172,5 @@
> +            for frameKey in frameKeys:
> +                matched = False
> +                for fn in allocatorFns:
> +                    frameDesc = frameTable[frameKey]
> +                    if fn in frameDesc:

Matching anywhere in the frame seems a little footgun-y, but I guess the alternative is to have to deal with matching against various return types.

@@ +210,5 @@
> +
> +    if args.ignore_reports:
> +        liveRecords = collections.defaultdict(Record);
> +    else:
> +        unreportedRecords = collections.defaultdict(Record)

micronit: I don't care too much how you do it, but you aren't very consistent about aligning = or not. (Personally I prefer not doing it, because it makes later patches have to adjust a bunch of lines...)

@@ +220,5 @@
> +
> +    for block in blockList:
> +        # For each block we compute a |recordKey|, and all blocks with the same
> +        # |recordKey| are aggregated into a single record. The |recordKey| is
> +        # derived from the one or more stack traces ('allocatedAt' and

Maybe this could be made clearer by saying "The |recordKey| is derived from the block's allocation and (possibly) reporting stack traces." Forward referring some variables makes it harder to understand.

@@ +224,5 @@
> +        # derived from the one or more stack traces ('allocatedAt' and
> +        # 'reportedAts') of each block.
> +        #
> +        # We use each stack trace's *values* (i.e. the PCs themselves) rather
> +        # than the stack trace's *identity* (i.e. its key) because the frame

I found the first part of this sentence (everything before "because") to be very confusing. I think the problem is that when I think of a stack trace, I think of the actual list of stack frames, whereas here you are really referring to the stack trace record in the log file, including its name. Also you start out in this sentence using the word "identity" to refer to the name in the file, but then you use it to refer more to the recordKey sense of identity (which is how I was thinking of identity here).

@@ +236,5 @@
> +            records = liveRecords
> +        else:
> +            recordKey = str(traceTable[allocatedAt])
> +            if 'reps' in block:
> +                reportedAts = block['reps']

reportedAts needs to be sorted to ensure canonicity.

@@ +239,5 @@
> +            if 'reps' in block:
> +                reportedAts = block['reps']
> +                for reportedAt in reportedAts:
> +                    recordKey += str(traceTable[reportedAt])
> +                records = onceReportedRecords if len(reportedAts) == 1 else \

nit: maybe just use a regular if-then-else if you have to break this into multiple lines anyways?

@@ +249,5 @@
> +
> +        if 'req' in block:
> +            # not sampled
> +            reqSize = block['req']
> +            slopSize = block['slop'] if 'slop' in block else 0

nit: I think you could also just do block.get('slop', 0) for the RHS.

@@ +254,5 @@
> +            isSampled = False
> +        else:
> +            # sampled
> +            reqSize = sampleBelowSize
> +            slopSize = 0

It might be nice to add an assert here that validates your assumption about there being no slop for sampled blocks.

@@ +269,5 @@
> +        record.isSampled = record.isSampled or isSampled
> +        record.allocatedAt = block['alloc']
> +        if args.ignore_reports:
> +            pass
> +        else:

nit: please turn this into an elif 'reps' in block:

@@ +285,5 @@
> +    cmpRecords = sortByChoices[args.sort_by]
> +    sortedRecords = sorted(records.values(), cmp=cmpRecords)
> +    kindBlocks = 0
> +    kindUsableSize = 0
> +    maxRecord = 1000

aside: Eventually it would be nice to have an option for this (conservative heap scanning will need it), but for now it is totally no big deal for me to just edit this if I need to increase it. Oh, it would also be cool to have a size-based threshold, because if you print out every record you end up with this long tail of tiny allocations.

@@ +289,5 @@
> +    maxRecord = 1000
> +
> +    # First iteration: get totals, etc.
> +    for record in sortedRecords:
> +        kindBlocks += record.numBlocks

nit: again, you are kind of inconsistent about aligning = or not

@@ +299,5 @@
> +
> +    kindCumulativeUsableSize = 0
> +    for i, record in enumerate(sortedRecords, start=1):
> +        # Stop printing at the |maxRecord|th record.
> +        if (i == maxRecord):

nit: no () around the test in Python.

@@ +302,5 @@
> +        # Stop printing at the |maxRecord|th record.
> +        if (i == maxRecord):
> +            out('# {:}: stopping after {:,d} heap block records\n'.
> +                format(RecordKind, i))
> +            break;

nit: I'm not sure what this trailing semicolon does, but I'm pretty sure it does not need to be here. :)

@@ +308,5 @@
> +        kindCumulativeUsableSize += record.usableSize
> +
> +        isSampled = record.isSampled
> +
> +        out(RecordKind + ' {')

aside: It seems weird to use the brackets for this output that isn't intended for machines. But prettier output can be left for a followup, as I have no desire to bikeshed that out here.

@@ +312,5 @@
> +        out(RecordKind + ' {')
> +        out('  {:} block{:} in heap block record {:,d} of {:,d}'.
> +            format(number(record.numBlocks, isSampled),
> +                   plural(record.numBlocks), i, numRecords))
> +        out('  {:} bytes ({:} requested / {:} slop)'.

Could you please make this "total bytes" instead of "bytes"? As you recall, the phrasing confused me before. ;)

@@ +330,5 @@
> +
> +        if args.show_all_block_sizes:
> +            reqSizes = sorted(record.reqSizes.items(), reverse=True)
> +
> +            out('  Individual block sizes: ', end='')

As discussed in IRC, please change to something like "Individual requested block sizes".

@@ +347,5 @@
> +        out('  }')
> +    if args.ignore_reports:
> +        pass
> +    else:
> +        if (hasattr(record, 'reportedAts')):

nit: no () in the test in Python
Attachment #8496643 - Flags: review?(continuation) → review+
Assignee
Comment 25 • 10 years ago
> > -  size_t MaxRecords() const { return mMaxRecords.mActual; }
>
> What is the effect on log size of dumping all records by default? Is the
> new format just so much more space efficient it doesn't matter?

The old output was entirely record-oriented, where each record is an aggregation of one or more blocks with the same stack traces. The new JSON output is block-oriented, and DMD.cpp has no notion of records -- that's now entirely done within dmd.py.

I guess an alternative question is "can we limit the JSON file size by ignoring small things?" Which is exactly what sampling does.

> This is just an aside, but does this mean that if a block is reported 3
> times, we only show the first two stacks that report it?

Correct. This restriction will be lifted in bug 1046431, though, because we'll need to be able to record N (or more) partial reports each covering 1/Nth of a block.

> @@ +1794,5 @@
> >
> > +  StatusMsg("  Location service:      %10s bytes\n",
> > +            Show(locService->SizeOfIncludingThis(MallocSizeOf), buf1, kBufLen));
> > +  StatusMsg("  Used stack traces set: %10s bytes\n",
> > +            Show(usedStackTraces.sizeOfIncludingThis(MallocSizeOf), buf1, kBufLen));
>
> This is a pedantic observation, but for this and the next two variables you
> use sizeOfIncludingThis(), and they are allocated on the stack, so you are
> reporting a mixture of stack and heap memory

locService is heap-allocated (the |->| gives it away) so it's fine. usedStackTraces, usedPcs and idMap should all definitely use sizeOfExcludingThis(), though. Surprising I haven't seen crashes because of this.

> glandium's suggestion to just pass around args sounds reasonable to me. It
> is a little ugly up front, but it is the sort of thing that will pay off in
> the future.

As I said earlier, it was a local and I switched to a global because it was simpler than passing it everywhere and returning it from parseCommandLine(). But fine, I'll change it back.

> Could the name of |--ignore-alloc-fns| and |allocatorFns| be made more
> general? [...] I guess we could add some option later that is more
> comprehensive or something and implies ignore-alloc-fns.

If we were to add a way for the user to specify additional functions to skip, then we could give it a more general name. But I'll leave it unchanged for now.

> > +def parseCommandLine():
> > +    def range_1_24(string):
>
> Why is this limited to 24? It seems rather arbitrary.

24 is the maximum stack depth that DMD produces. The implicit coupling with DMD.cpp isn't great but I don't know a good way to share constants between C++ and Python. I'll add a comment.

> This is probably premature optimization, but in my DMD log parser, to filter
> boring frames I generate a regexp:
> https://github.com/amccreight/heapgraph/blob/master/dmd/parse_report.py#L39

Good idea! I'll steal it.

> Matching anywhere in the frame seems a little footgun-y, but I guess the
> alternative is to have to deal with matching against various return types.

Yeah, you have to deal with lots of stuff like |unsigned int* js::LifoAlloc::pod_malloc|. It's conceivable that you might have a frame with a name like |this_is_not_pod_malloc_ha_ha| just after one or more real allocator function frames, in which case you'll have a slightly truncated stack trace. But it's unlikely, and stack traces omit frames all the time due to inlining, so... meh.

> micronit: I don't care too much how you do it, but you aren't very
> consistent about aligning = or not. (Personally I prefer not doing it,
> because it makes later patches have to adjust a bunch of lines...)

I do it when I feel it helps readability, which is usually when there's significant overlap between the lines, and thus the alignment makes it clearer which parts differ.

> reportedAts needs to be sorted to ensure canonicity.

DMD.cpp will order them temporally, i.e. later (in time) reports come later (in the array). Sorting them here would destroy that ordering.

> > +        if args.ignore_reports:
> > +            pass
> > +        else:
>
> nit: please turn this into an elif 'reps' in block:

This was deliberate. Throughout the file this structure appears repeatedly:

> if args.ignore_reports:
>     do_stuff_1
> else:
>     do_stuff_2

That's why the |pass| is there instead of it just being:

> if not args.ignore_reports and 'reps' in block:
>     ...

I can use the latter if you feel strongly, though.

> aside: It seems weird to use the brackets for this output that isn't
> intended for machines.

I just replicated the existing output exactly to avoid making any decisions about this!

> Could you please make this "total bytes" instead of "bytes"? As you recall,
> the phrasing confused me before. ;)

I generally use "usable" for (requested + slop), but where it's not specified it's always "usable" (e.g. in the Summary as well as here)... except for the individual block sizes. So I will switch the individual sizes to usable instead of requested. That has the advantage of also making those lists shorter, because the rounding up will create more duplicates. The important thing with that list is to be able to see which allocations are big and which are small, not the exact request size. (It'll also ensure that we never see individual block sizes below the sampling threshold.)
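For illustration, a hedged sketch of that single-regexp approach (the function list is abridged from the patch, with a few obvious allocator names added for the example; the final dmd.py code may differ):

  import re

  allocatorFns = ['malloc', 'calloc', 'realloc',
                  'operator new(', 'operator new[](',
                  'g_malloc', 'g_slice_alloc']

  # One compiled alternation replaces the nested loop over allocatorFns.
  allocRe = re.compile('|'.join(re.escape(fn) for fn in allocatorFns))

  def numAllocFramesToSkip(frameDescs):
      # Count the leading frames whose description mentions an allocator.
      n = 0
      for desc in frameDescs:
          if not allocRe.search(desc):
              break
          n += 1
      return n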
Assignee
Comment 26 • 10 years ago
mccr8: not sure if you missed the r? request on the first patch, or if you're just taking a rest before tackling it. Don't worry, it's a lot smaller :)
Updated • 10 years ago
Attachment #8495673 - Flags: review?(continuation) → review+
Comment 27 • 10 years ago
(In reply to Nicholas Nethercote [:njn] from comment #25)

> I guess an alternative question is "can we limit the JSON file size by
> ignoring small things?" Which is exactly what sampling does.

Good point!

I only sort of obliquely asked this, but what is the general effect of this patch on the size of the log output? Is it vaguely the same size or smaller or what?

> locService is heap-allocated (the |->| gives it away) so it's fine.

Yes, I know, I was referring to the three after it. ;)

> usedStackTraces, usedPcs and idMap should all definitely use
> sizeOfExcludingThis(), though. Surprising I haven't seen crashes because of
> this.

Ah, good to know that is actually a problem. I guess calling malloc_usable_size_of on a non-malloc thing would be bad!

> As I said earlier, it was a local and I switched to a global because it was
> simpler than passing it everywhere and returning it from parseCommandLine().
> But fine, I'll change it back.

Thanks!

> 24 is the maximum stack depth that DMD produces. The implicit coupling with
> DMD.cpp isn't great but I don't know a good way to share constants between
> C++ and Python. I'll add a comment.

Thanks, a comment sounds good. It is good to know where 24 comes from, but it still seems like in the dmd.py script people should be able to specify, say, 100 max frames, and you'll just get less if there are less.

> Yeah, you have to deal with lots of stuff like |unsigned int*
> js::LifoAlloc::pod_malloc|.

I wonder if there's some kind of general Python library we could use for parsing C++ function names. We need it here, we need it in the test harness for both Valgrind and LSan, and right now we have a bunch of hacky implementations (no slight meant to your Valgrind log parser, I'm just assuming it is hacky because that's all that is needed ;).

> > reportedAts needs to be sorted to ensure canonicity.
>
> DMD.cpp will order them temporally, i.e. later (in time) reports come later
> (in the array). Sorting them here would destroy that ordering.

Why do you need to preserve that ordering? Though I suppose in practice that means that canonicalizing the ordering doesn't matter, because odds are that if you have two reporting stacks A and B, the code to generate A will either always run before B or always after B.

> Throughout the file this structure appears repeatedly:
>
> > if args.ignore_reports:
> >     do_stuff_1
> > else:
> >     do_stuff_2
>
> That's why the |pass| is there instead of it just being:
>
> > if not args.ignore_reports and 'reps' in block:
> >     ...
>
> I can use the latter if you feel strongly, though.

Nah, that's fine. The sprinkling of args.ignore_reports is kind of ugly, though. If we add a third kind of report then this will get even worse. At some point it would be better to change this to some kind of object that gets passed around that we call various callbacks on for the various points here.

> > aside: It seems weird to use the brackets for this output that isn't
> > intended for machines.
>
> I just replicated the existing output exactly to avoid making any decisions
> about this!

Yeah, that seems reasonable given how big this is already. But we should spend some time at some point to make it nicer!

> > Could you please make this "total bytes" instead of "bytes"? As you recall,
> > the phrasing confused me before. ;)
>
> I generally use "usable" for (requested + slop), but where it's not
> specified it's always "usable" (e.g. in the Summary as well as here)...
> except for the individual block sizes. So I will switch the individual sizes
> to usable instead of requested. That has the advantage of also making those
> lists shorter, because the rounding up will create more duplicates. The
> important thing with that list is to be able to see which allocations are
> big and which are small, not the exact request size. (It'll also ensure that
> we never see individual block sizes below the sampling threshold.)

Well, the "total" I'm referring to is "total across all blocks". So maybe "total usable bytes"? I agree that usable is more useful. That's certainly what I want for conservative heap scanning, though either should work.

The rest sounds fine to me. Thanks for working on this patch!
Assignee
Comment 28 • 10 years ago
> I only sort of obliquely asked this, but what is the general effect of this
> patch on the size of the log output? Is it vaguely the same size or smaller
> or what?

Here are measurements after starting the browser and opening Gmail (all files have had their stacks fixed):

- Old text output, with sampling: 11.0 MB
- Old text output, no sampling: 10.9 MB

- New JSON output, with sampling: 3.5 MB
- New JSON output, no sampling: 39.7 MB

- New text output, with sampling: 10.9 MB
- New text output, no sampling: 11.9 MB

So the text output is basically unchanged, which makes sense because it's the same information. Though with dmd.py you have many additional options, the use of which will affect the text output size. (In particular, if we added an option that let you choose how many records to show, that could have a huge effect; the non-sampled file has roughly 10x as many records.)

The JSON output is new and its size depends greatly on whether sampling is used or not.

> > 24 is the maximum stack depth that DMD produces. The implicit coupling with
> > DMD.cpp isn't great but I don't know a good way to share constants between
> > C++ and Python. I'll add a comment.
>
> Thanks, a comment sounds good. It is good to know where 24 comes from, but
> it still seems like in the dmd.py script people should be able to specify,
> say, 100 max frames, and you'll just get less if there are less.

My counter-argument is that if you ask for 100 and only get 24 you might be left scratching your head. No perfect solution exists, seemingly.

> > DMD.cpp will order them temporally, i.e. later (in time) reports come later
> > (in the array). Sorting them here would destroy that ordering.
>
> Why do you need to preserve that ordering? Though I suppose in practice
> that means that canonicalizing the ordering doesn't matter, because odds are
> that if you have two reporting stacks A and B, the code to generate A will
> either always run before B or always after B.

Ultimately, the ordering probably doesn't matter for normal use. It would only be an issue if it made the tests non-deterministic, but the current behaviour avoids that because the report ordering is deterministic in the tests.

> The sprinkling of args.ignore_reports is kind of ugly, though. If we add a
> third kind of report then this will get even worse. At some point it would
> be better to change this to some kind of object that gets passed around
> that we call various callbacks on for the various points here.

Yep. I considered that and concluded that for two alternatives the current if/else structure was clearer.
Comment 29 • 10 years ago
(In reply to Nicholas Nethercote [:njn] from comment #28)

> - Old text output, with sampling: 11.0 MB
> - Old text output, no sampling: 10.9 MB
>
> - New JSON output, with sampling: 3.5 MB
> - New JSON output, no sampling: 39.7 MB

Thanks for checking. Good to see that in the common case (sampling) the log is so much smaller. I suppose the no-sampling new output is larger because it has a lot more data, as it contains everything!

> My counter-argument is that if you ask for 100 and only get 24 you might be
> left scratching your head. No perfect solution exists, seemingly.

It does say "max frames" not "number of frames". :) But I see your point of view, too.

> Yep. I considered that and concluded that for two alternatives the current
> if/else structure was clearer.

Cool. I can believe that.
Assignee
Comment 30 • 10 years ago
I am reminded that the thing that tipped me over to making |args| a global was that out(), which is called a *lot*, needs access to it. Sigh.
Assignee
Comment 31 • 10 years ago
I'm testing on B2G and something's going wrong. I run get_about_memory.py and see this:
> Warning: Child 9666 exited during memory reporting
> Warning: Child 9828 exited during memory reporting
> Warning: Child 9738 exited during memory reporting
and the file collection fails. Maybe the processes are OOMing when running memory reports or DMD?
Comment 32 • 10 years ago
(In reply to Nicholas Nethercote [:njn] from comment #31)

> and the file collection fails. Maybe the processes are OOMing when running
> memory reports or DMD?

Yeah, I think that's what it means. One process started running DMD, its memory usage goes up, then the OS kills another process to get memory back.

I'm not too surprised. You are adding a number of additional tables. You could try scoping each of the data structures more tightly to where they are used. For instance, don't create locService until you are dumping the frameTable. usedStackTraces can be destroyed after you don't the traceTable, it looks like. usedPcs is only needed for the traceTable and frameTable. (You'll have to cache stats for data structures that are destroyed early.)

That isn't a huge win, but maybe it will be enough. Particularly, the location service is gigantic.
Comment 33 • 10 years ago
> after you don't the traceTable
"don't" should be "dumped"
Assignee
Comment 34 • 10 years ago
> I'm not too surprised. You are adding a number of additional tables.
Interestingly, it's not the tables that's the problem -- it's something about the JSON writing. If I remove all the writing but keep the table creation, it's fine. One thing that's different about the JSON code is that it causes gzFileWriter::Write to be called many times with small strings, whereas the old code had fewer calls with longer strings. That *seems* to be the problem, though I'm still confirming it, and I'm not sure why that would make a difference.
But scoping the table creations is a good idea, I'll try that too.
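To illustrate the hypothesis, here is a hedged sketch (not JSONWriter's or the gz writer's actual code) of why call granularity can matter: batching many tiny writes into one buffer before compressing avoids paying per-call overhead in the gzip layer.

  import gzip
  import io

  class BufferedGzWriter:
      # Hypothetical: accumulate small writes, hand big chunks to gzip.
      def __init__(self, path, chunk_size=64 * 1024):
          self.f = gzip.open(path, 'wb')
          self.buf = io.BytesIO()
          self.chunk_size = chunk_size

      def write(self, data):
          self.buf.write(data)
          if self.buf.tell() >= self.chunk_size:
              self.flush()

      def flush(self):
          self.f.write(self.buf.getvalue())
          self.buf = io.BytesIO()

      def close(self):
          self.flush()
          self.f.close()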
Assignee
Comment 35 • 10 years ago
It turned out to be a bad bug in JSONWriter. See https://bugzilla.mozilla.org/show_bug.cgi?id=1074591#c5.
Assignee
Comment 36 • 10 years ago
Attachment #8499276 - Flags: review?(erahm)
Assignee
Comment 37 • 10 years ago
https://github.com/mozilla-b2g/B2G/commit/f7802c72350a13d18578fd11380f0a1d65d8fb57
Updated • 10 years ago
Attachment #8499276 - Flags: review?(erahm) → review+
Assignee
Comment 38 • 10 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/155f170fa5ea
https://hg.mozilla.org/integration/mozilla-inbound/rev/d73853c26a1a
Comment 39 • 10 years ago
https://hg.mozilla.org/mozilla-central/rev/155f170fa5ea
https://hg.mozilla.org/mozilla-central/rev/d73853c26a1a
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla35