Closed
Bug 1363883
Opened 7 years ago
Closed 7 years ago
Share memory maps between BHR native stacks
Categories
(Toolkit :: Telemetry, enhancement, P2)
Toolkit
Telemetry
Tracking
()
RESOLVED
FIXED
mozilla55
Tracking | Status | |
---|---|---|
firefox55 | --- | fixed |
People
(Reporter: nika, Assigned: nika)
Details
Attachments
(1 file, 1 obsolete file)
16.79 KB,
patch
|
froydnj
:
review+
|
Details | Diff | Splinter Review |
BHR native stacks can take up a lot of space in memory, and approximately 30% of this space is used to hold the memory map. This memory map should be shared between all of the native stacks which are captured by a single ping. This patch changes the way that threadHangStats' native stacks are stored, in order to reduce the size of the threadHangStats object.
Assignee | ||
Comment 1•7 years ago
|
||
MozReview-Commit-ID: IQ1fMXUD0ch
Assignee | ||
Comment 2•7 years ago
|
||
Doug, this patch would change the layout of the BHR ping data, but could save a lot of space (From some back-of-the-napkin calculations with ehsan, this would reduce the amount of space which native stacks take up in the ping by 30% - meaning that we could collect 30+% more frames from each native hang, which could really help with the perf-html fork). How do you feel about this new layout (see the changes to main-ping.rst to see what it would look like specifically). I can look into a different layout if you think it would be better as well.
Flags: needinfo?(dothayer)
Comment 3•7 years ago
|
||
Looks good to me, and it should be pretty trivial to adapt the the processing! :) However, I think we need about 100-200% more frames in order to be in a good place such that most of our stacks still have their roots. I don't have the numbers on hand, but somewhere around 50% of stacks seem to be longer than 50 frames.
Flags: needinfo?(dothayer)
Assignee | ||
Comment 4•7 years ago
|
||
(In reply to Doug Thayer [:dthayer] from comment #3) > Looks good to me, and it should be pretty trivial to adapt the the > processing! :) > > However, I think we need about 100-200% more frames in order to be in a good > place such that most of our stacks still have their roots. I don't have the > numbers on hand, but somewhere around 50% of stacks seem to be longer than > 50 frames. I still want to do that, but I'm viewing this as a first step, so we can start collecting more data before we move threadHangStats outside of main-ping.
Assignee | ||
Updated•7 years ago
|
Attachment #8866542 -
Flags: review?(benjamin)
Comment 5•7 years ago
|
||
Comment on attachment 8866542 [details] [diff] [review] Share memory maps between BHR native stacks I'm going to mark data-review+ on this. I cannot spend the time to do a proper and thorough code review, and so I'm asking froydnj to help do a good code review.
Attachment #8866542 -
Flags: review?(nfroyd)
Attachment #8866542 -
Flags: review?(benjamin)
Attachment #8866542 -
Flags: review+
Updated•7 years ago
|
Priority: -- → P2
Comment 6•7 years ago
|
||
Comment on attachment 8866542 [details] [diff] [review] Share memory maps between BHR native stacks Review of attachment 8866542 [details] [diff] [review]: ----------------------------------------------------------------- Particularly curious about the bit in CreateJSThreadHangStats. ::: toolkit/components/telemetry/Telemetry.cpp @@ +2084,5 @@ > for (size_t i = 0; i < thread.mHangs.length(); i++) { > JS::RootedObject obj(cx, CreateJSHangHistogram(cx, thread.mHangs[i])); > + > + // Check if we have a native stack, and if we do add it to CombinedStacks, > + // and store its index in the 'nativeStack' member of the hang object. This comment might read a little more clearly if it said "...and if we do, add it to combinedStacks and store its index..." so that you're not inclined to read "if we do add it to CombinedStacks" as some sort of success condition. @@ +2087,5 @@ > + // Check if we have a native stack, and if we do add it to CombinedStacks, > + // and store its index in the 'nativeStack' member of the hang object. > + const Telemetry::NativeHangStack& stack = thread.mHangs[i].GetNativeStack(); > + if (!stack.empty() && > + combinedStacks.GetStackCount() < Telemetry::kMaximumNativeHangStacks) { Do you just want to write this as: // Don't add empty stacks. if (stack.empty()) { continue; } // Don't wrap around in combinedStacks, as that would botch previously-computed indices. if (...) { continue; } so that you're not adding an empty `obj` to `hangs` after this block? @@ +2090,5 @@ > + if (!stack.empty() && > + combinedStacks.GetStackCount() < Telemetry::kMaximumNativeHangStacks) { > + Telemetry::ProcessedStack processed = Telemetry::GetStackAndModules(stack); > + uint32_t index = combinedStacks.AddStack(processed); > + JS_DefineProperty(cx, obj, "nativeStack", index, JSPROP_ENUMERATE); You need to check for success here. ::: toolkit/components/telemetry/docs/data/main-ping.rst @@ +275,5 @@ > + "stacks": [ > + [ > + [ > + 0, // the module index or -1 for invalid module indices > + 190649 // the offset of this program counter in its module or an absolute pc For my own edification, but also for documentation purposes, the entries in a stack are basically: enum StackEntry { // module_index is always valid. Offset(module_index, pc_offset), UnknownModule(absolute_pc), } and those are the only two options? Maybe we should slightly edit the comments here to indicate that--in a followup bug? @@ +292,5 @@ > "(content script)", > "IPDL::PPluginScriptableObject::SendGetChildProperty", > ... up to 11 frames ... > ], > + "nativeStack": 0, // index into nativeStacks.stacks array Has the necessary handling been added on the telemetry server side to deal with this format change? Do we need to bump a version on the client side?
Attachment #8866542 -
Flags: review?(nfroyd)
Assignee | ||
Comment 7•7 years ago
|
||
(In reply to Nathan Froyd [:froydnj] from comment #6) > Do you just want to write this as: > > // Don't add empty stacks. > if (stack.empty()) { > continue; > } > // Don't wrap around in combinedStacks, as that would botch > previously-computed indices. > if (...) { > continue; > } > > so that you're not adding an empty `obj` to `hangs` after this block? The object isn't empty. We still attach the non-native stack created from CreateJSHangHistogram. We can collect pseudostacks which don't have any native stack associated with them still. Your code would discard entries from the ping completely which don't have a native stack, while currently we don't, we just include the native stack and no pseudostack. > ::: toolkit/components/telemetry/docs/data/main-ping.rst > @@ +275,5 @@ > > + "stacks": [ > > + [ > > + [ > > + 0, // the module index or -1 for invalid module indices > > + 190649 // the offset of this program counter in its module or an absolute pc > > For my own edification, but also for documentation purposes, the entries in > a stack are basically: > > enum StackEntry { > // module_index is always valid. > Offset(module_index, pc_offset), > UnknownModule(absolute_pc), > } Yes, these are formatted as Offset(module_index, pc_offset) => [module_index, pc_offset] UnknownModule(absolute_pc) => [-1, absolute_pc] > and those are the only two options? Maybe we should slightly edit the > comments here to indicate that--in a followup bug? Yeah, I think we should do that in a followup. > Has the necessary handling been added on the telemetry server side to deal > with this format change? Do we need to bump a version on the client side? Our only consumers of this data right now are my python script which I run on aws (https://people-mozilla.org/~mlayzell/bhr), and the new BHR dashboard in bug 1344003. Both of these should be easy to fix to use the new format, and I will needinfo dthayer to make sure that he can support it in his code before I land this.
Assignee | ||
Comment 8•7 years ago
|
||
MozReview-Commit-ID: IQ1fMXUD0ch
Attachment #8868692 -
Flags: review?(nfroyd)
Assignee | ||
Updated•7 years ago
|
Attachment #8866542 -
Attachment is obsolete: true
Comment 9•7 years ago
|
||
Comment on attachment 8868692 [details] [diff] [review] Share memory maps between BHR native stacks, dr=bsmedberg Review of attachment 8868692 [details] [diff] [review]: ----------------------------------------------------------------- OK. Sorry for dropping this one on the floor for so long, it should have been a faster review. ::: toolkit/components/telemetry/Telemetry.cpp @@ +2081,5 @@ > if (!hangs) { > return nullptr; > } > for (size_t i = 0; i < thread.mHangs.length(); i++) { > JS::RootedObject obj(cx, CreateJSHangHistogram(cx, thread.mHangs[i])); CreateJSHangHistogram can fail, so we need to null-check `obj`, here, correct? It's not obvious to me what the correct behavior is: 1) Skip native stack processing and stick a null entry into `hangs` (this is what we do currently); 2) Skip adding `obj` in any way to `hangs`, which would require a little more logic in what element we're adding to `hangs`.
Attachment #8868692 -
Flags: review?(nfroyd) → review+
Assignee | ||
Comment 10•7 years ago
|
||
dthayer, are you OK with me landing this today? I don't want to break your analysis.
Flags: needinfo?(dothayer)
Comment 11•7 years ago
|
||
Yeah, that should be fine - I should be able to get the switch in tomorrow hopefully.
Flags: needinfo?(dothayer)
Comment 12•7 years ago
|
||
Pushed by michael@thelayzells.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/875b676f6c83 Share memory maps between BHR native stacks, dr=bsmedberg, r=froydnj
Comment 13•7 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/875b676f6c83
Status: NEW → RESOLVED
Closed: 7 years ago
status-firefox55:
--- → fixed
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
You need to log in
before you can comment on or make changes to this bug.
Description
•