Closed Bug 1024774 Opened 10 years ago Closed 9 years ago

[jsdbg2] Implement heap snapshots

Categories

(Core :: JavaScript Engine, defect)

Platform: x86 macOS
Type: defect
Priority: Not set
Severity: normal

Tracking

Status: RESOLVED FIXED
Target Milestone: mozilla41
Tracking Status
firefox40 --- fixed
firefox41 --- fixed

People

(Reporter: fitzgen, Assigned: fitzgen)

References

(Blocks 2 open bugs)

Details

Attachments

(16 files, 102 obsolete files)

3.21 MB, patch, fitzgen: review+
3.93 KB, patch, fitzgen: review+
6.04 KB, patch, fitzgen: review+
88.29 KB, patch, fitzgen: review+
5.96 KB, patch, fitzgen: review+
19.52 KB, patch, fitzgen: review+
1.98 KB, patch, fitzgen: review+
3.43 KB, patch, fitzgen: review+
2.37 KB, patch, fitzgen: review+
22.47 KB, patch, jimb: review+
1.84 KB, patch, jimb: review+
8.03 KB, patch, jimb: review+
6.07 KB, patch, jimb: review+
46.23 KB, patch, fitzgen: review+
3.49 KB, patch, sfink: review+
4.25 MB, patch
1) We need to specify a format and create a library for making compact core dumps.

2) Next, we need to be able to deserialize core dumps into ubi::Node instances and provide a set of Debugger.Memory APIs on top of these deserialized ubi::Nodes.

3) Finally, we need to glue everything together: traverse the heap with ubi::Node and save the core dump to temporary storage on the debuggee device, then stream the core dump across the RDP to the debugger machine, where we can deserialize it and call the APIs we made above.

------------------------------------------------------

Skeleton of a core dump format:

# Attributes

* An attribute is the name of a field in a structure.
* Examples: node type, edge, compartment

# Form

* A specific representation of a value.
* Examples: null-terminated string, 32-bit word, 64-bit word, variable-length integer, block of bytes preceded by a count

# Entry

* An actual thing in the heap.
* Consists of:
  a) Shape number
  b) n times:
     * If it's the first time this shape number has occurred in the core dump:
       - attr code
       - form code
       - value in the given form
     * Else:
       - value in the given form
  c) END sentinel value

The shape number could have a "has children" bit, and then any entries following an entry that has this bit set are considered children of that entry, until we reach the END_CHILDREN sentinel value. This could be useful for describing a node's edges, for example.
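To make the shape-number trick concrete, here is a throwaway sketch of a writer for this format. All of the names and sentinel values here are made up for illustration; they aren't part of any patch. Only the first entry with a given shape number spells out its attr and form codes; later entries with the same shape carry values only.

#include <cstdint>
#include <set>
#include <vector>

const uint8_t END = 0xff; // made-up sentinel value

struct Field { uint8_t attr; uint8_t form; uint64_t value; };

// Emit one entry. Values are written as single bytes here for brevity; a
// real format would write each value in its declared form (varint, counted
// block of bytes, etc.).
void writeEntry(std::vector<uint8_t> &out, std::set<uint8_t> &shapesSeen,
                uint8_t shape, const std::vector<Field> &fields)
{
  out.push_back(shape);
  bool firstTime = shapesSeen.insert(shape).second;
  for (const Field &f : fields) {
    if (firstTime) {
      out.push_back(f.attr);
      out.push_back(f.form);
    }
    out.push_back(uint8_t(f.value)); // stand-in for "value in its form"
  }
  out.push_back(END); // END sentinel
}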
(In reply to Nick Fitzgerald [:fitzgen] from comment #0)
> c) END sentinel value

Note that this only needs to appear the first time the shape number is used. After that, we know when the values end. (All forms must be self-terminating: a fixed width, an explicit count, or a null terminating byte.)
Would also be nice if we could implement a nice little serialization DSL like boost's: http://www.boost.org/doc/libs/1_55_0/libs/serialization/doc/index.html
Working on docs, at least for now, so I will take this in the meantime.
Assignee: nobody → nfitzgerald
Attached patch heap-snapshot-docs.patch (obsolete) — Splinter Review
Here is a work in progress, but I think enough is there to get the idea. I'm starting to lean towards just using protobufs (which are already in the tree), and I'm going to investigate that now.
Attachment #8451955 - Flags: feedback?(jimb)
> TODO how many bits for these things?

The "right" length for integers is always a tough question. Any fixed length is always wrong: the size has to accommodate the largest value you think might occur, which guarantees that the more significant bytes will usually be zero, and hence wasted. Both DWARF and protobufs use a variable-length integer encoding: https://developers.google.com/protocol-buffers/docs/encoding#varints

(Amazingly, I believe they use exactly the same encoding, although I haven't read closely enough to be sure.)
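Concretely, the scheme is base-128 with a continuation bit: seven payload bits per byte, least-significant group first, and the high bit of each byte says "more bytes follow". A sketch of encode/decode (my names, just for illustration):

#include <cstdint>
#include <vector>

// Encode |value| as a little-endian base-128 varint.
void encodeVarint(std::vector<uint8_t> &out, uint64_t value)
{
  while (value >= 0x80) {
    out.push_back(uint8_t(value & 0x7f) | 0x80); // low 7 bits, continuation bit set
    value >>= 7;
  }
  out.push_back(uint8_t(value)); // final byte: continuation bit clear
}

// Decode a varint starting at |p|; advances |p| past the encoded bytes.
// (No malformed-input guard here; a real decoder would cap the shift.)
uint64_t decodeVarint(const uint8_t *&p)
{
  uint64_t result = 0;
  for (unsigned shift = 0; ; shift += 7) {
    uint8_t byte = *p++;
    result |= uint64_t(byte & 0x7f) << shift;
    if (!(byte & 0x80))
      return result;
  }
}

For example, 300 (binary 100101100) encodes as the two bytes 0xAC 0x02. Note that the encoding is self-terminating and explicitly little-endian.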
So, the big difference between protobufs and our encoding is that protobufs repeat the shape metadata in every message, instead of just citing a previously established shape. The first version of DWARF did the same thing, and they found that the metadata took up more space than was acceptable. (I heard this from Michael Eager, but DWARF's Wikipedia page says the same thing.) That's clearly a function of the kind of data DWARF was storing (the number of shapes was very small relative to the number of nodes), and perhaps even of the specific predilections of the producers.

Tradeoffs have changed since then, though: storage is a lot cheaper than it was, and bandwidth is a lot better. I think the cost of implementing custom serialization (it's actually getting it tested properly that gives me pause) is high enough that it would make sense to give protobufs a chance, and then revise if the size (or, specifically, the metadata size) turns out to be relevant to developer experience.

But I'm definitely still interested in your impressions after having read over the protobufs spec.
I wonder whether it's necessary to include the NodeId at the head of each node. Can't we just number them implicitly?
One simplification we could make would be to drop the node / edge distinction, calling everything an 'entry', and letting entries have children. Then, edges would simply be the children of a node, and the referent would simply be the child of an edge. A sketch of what that might look like follows below.

I think the type vocabulary could be simplified a lot. It's not important these days to have values the processor can just pluck out of the stream directly. We need signed and unsigned varints, bools, UTF-8 strings, and node references; that's it.

Yes, we should specify the byte order. The nice thing about varints is that they are explicitly little-endian.
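As a sketch, the in-memory shape of such a unified entry might be the following (illustrative names only; not proposing actual code here). A heap node is an entry whose children are its edges, and each edge entry's child (or node-ref field) is its referent:

#include <cstdint>
#include <memory>
#include <string>
#include <vector>

struct Entry {
  uint64_t shape;                                // which shape this entry uses
  std::vector<int64_t> ints;                     // signed/unsigned varint fields
  std::vector<bool> bools;                       // bool fields
  std::vector<std::string> strings;              // UTF-8 string fields
  std::vector<uint64_t> nodeRefs;                // references to other entries
  std::vector<std::unique_ptr<Entry>> children;  // e.g. a node's edges
};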
Attachment #8451955 - Flags: feedback?(jimb) → feedback+
Attached patch heap-snapshots.patch (obsolete) — Splinter Review
Ok this adds the IDL interface and skeleton of the class, renames the jsinspector xpcom module to devtools_server, and does parameter validation. Just want to get some early feedback, mostly on the module changes and interface glue stuff.
Attachment #8451955 - Attachment is obsolete: true
Attachment #8496266 - Flags: feedback?(jimb)
Depends on: 1073124
Comment on attachment 8496266 [details] [diff] [review]
heap-snapshots.patch

I've switched from XPIDL to WebIDL, and a bunch of this patch is obsolete now.
Attachment #8496266 - Attachment is obsolete: true
Attachment #8496266 - Flags: feedback?(jimb)
Depends on: 1083456
Attached patch WIP (obsolete) — Splinter Review
Ok this does the heap graph traversal and converts everything into protobuf messages (although incomplete ones), but doesn't write the core dump to disk yet.
Depends on: 1099440
Attached patch WIP (obsolete) — Splinter Review
Attachment #8523129 - Attachment is obsolete: true
Attached patch WIP (obsolete) — Splinter Review
Ok this can write a heap snapshot to a core dump, but I don't have a concrete specialization that is backed by the core dump yet.
Attachment #8523342 - Attachment is obsolete: true
This is just the interface.
Attachment #8524044 - Attachment is obsolete: true
Attachment #8526236 - Flags: review?(bobbyholley)
:mmc, can you give feedback on the use of protobufs in CoreDump.* and HeapSnapshotManager.*? Thanks!
Attachment #8526240 - Flags: review?(jimb)
Attachment #8526240 - Flags: feedback?(mmc)
Attached patch Part 3: xpcshell tests (obsolete) — Splinter Review
Attachment #8526242 - Flags: review?(jimb)
Attached patch Part 4: GTests (obsolete) — Splinter Review
Attachment #8526244 - Flags: review?(jimb)
Attachment #8526240 - Attachment description: heap-snapshots-2.patch → Part 2: Serialize heap snapshots to core dumps
Comment on attachment 8526236 [details] [diff] [review]
Part 1: HeapSnapshotManager.webidl

Review of attachment 8526236 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/bindings/Bindings.conf
@@ +461,5 @@
>    'headerFile': 'nsGeolocation.h'
>  },
> 
> +'HeapSnapshotManager': {
> +  'concrete': False,

I don't think this is right.

::: dom/webidl/HeapSnapshotManager.webidl
@@ +6,5 @@
> +
> +/**
> + * A class for saving heap snapshots.
> + */
> +[ChromeOnly, Exposed=(Window,System,Worker)]

We need test coverage for each type of global this is exposed on, including window and workers.

@@ +35,5 @@
> + *
> + * * debuggees: Debugger object
> + *   Like "globals", but use the Debugger's debuggees as the globals.
> + *
> + * * runtime: true

What do you mean by "true" here? It will default to false I think.

@@ +48,5 @@
> + *   incoming JS reference to the selected Zones.
> + */
> +dictionary HeapSnapshotBoundaries {
> +  any globals;
> +  object debuggees;

Why are these not sequences?
Attachment #8526236 - Flags: review?(bobbyholley) → review-
Comment on attachment 8526240 [details] [diff] [review]
Part 2: Serialize heap snapshots to core dumps

Review of attachment 8526240 [details] [diff] [review]:
-----------------------------------------------------------------

I looked at HeapSnapshotManager and the protobuf stuff. The protobuf stuff looks fine, and so does the approach of flushing each node during traversal.

::: toolkit/devtools/server/CoreDump.proto
@@ +1,1 @@
> +option optimize_for = LITE_RUNTIME;

I could do with more comments in this file. It should also probably have the usual preamble and license header.

@@ +5,5 @@
> +message Metadata {
> +  optional uint64 timeStamp = 1;
> +}
> +
> +message Node {

At some point you may regret having multiple classes named "Node".

@@ +8,5 @@
> +
> +message Node {
> +  optional uint64 id = 1;
> +  // char16_t *
> +  optional bytes typeName = 2;

Why Node.typeName() but Edge.name()? I see, it's consistent with the ubi::Node stuff.

@@ +14,5 @@
> +  repeated Edge edges = 4;
> +}
> +
> +message Edge {
> +  optional uint64 referent = 1;

Is this an id? If so, please include a comment or rename it to referentId.

::: toolkit/devtools/server/HeapSnapshotManager.cpp
@@ +132,5 @@
> + * globals, find the set of zones the globals allocate within. Returns false on
> + * failure.
> + */
> +static bool
> +PopulateZonesWithGlobals(JSContext *cx,

cx is unused.

@@ +151,5 @@
> +{
> +  const char16_t *ss = s;
> +  while (*ss)
> +    ss++;
> +  return ss - s;

This gives you 0 if s is null. Is that a bug?

@@ +277,5 @@
> +  if (!boundaries.mRuntime.WasPassed()) {
> +    if (!PopulateGlobalsWithBoundaries(cx, boundaries, globals) ||
> +        !zones.init() ||
> +        !PopulateZonesWithGlobals(cx, globals, zones))
> +    {

I don't understand the cuddled-bracket heuristic here...

@@ +303,5 @@
> +  }
> +
> +  protobuf::Metadata metadata;
> +  // TODO FITZGEN: figure out how the hell to get the time stamp.
> +  // metadata.set_timestamp();

gettimeofday? Maybe not Windows friendly. https://mxr.mozilla.org/mozilla-central/source/security/pkix/lib/pkixtime.cpp#36

::: toolkit/devtools/server/HeapSnapshotManager.h
@@ +23,5 @@
> + * Populate a `protobuf::Node` protobuf message with the given
> + * `JS::ubi::Node`. Returns false on failure.
> + */
> +extern bool PopulateProtobufNode(JSContext *cx,
> +                                 JS::ubi::Node ubiNode,

Pass by reference, unless it's going to change out from under you?

@@ +36,5 @@
> +class MOZ_STACK_CLASS HeapSnapshotHandler {
> +  JSContext *cx;
> +  bool wantNames;
> +  CoreDump &coreDump;
> +  JS::ZoneSet *zones;

I wonder if you want a maxNodes or maxSize here, in case things go horribly wrong. Or maybe the traversal code handles that?

@@ +97,5 @@
> + * A `CoreDump` class serializes and accumulates protobuf messages to disk, or
> + * memory, or network, etc. It class must implement the following public
> + * method:
> + *
> + *     bool serialize(::google::protobuf::MessageLite &message);

It may be more useful to return the size of the serialized string.
Attachment #8526240 - Flags: feedback?(mmc) → feedback+
Comment on attachment 8526240 [details] [diff] [review]
Part 2: Serialize heap snapshots to core dumps

Review of attachment 8526240 [details] [diff] [review]:
-----------------------------------------------------------------

Review in progress...

::: js/public/UbiNode.h
@@ +477,5 @@
>  typedef mozilla::Vector<SimpleEdge, 8, js::TempAllocPolicy> SimpleEdgeVector;
> 
> +// An EdgeRange concrete class that holds a pre-existing vector of SimpleEdges.
> +class PreComputedEdgeRange : public EdgeRange {
> +    SimpleEdgeVector &edges;

Who owns this vector? This class won't free it: it only owns a reference to the vector, not the vector itself.

::: js/public/UbiNodeTraverse.h
@@ +120,5 @@
>      MOZ_ASSERT(!traversalBegun);
>      traversalBegun = true;
> 
> +    // While there are pending nodes, visit them, until we've found a path
> +    // to the target.

Thanks.

::: toolkit/devtools/server/CoreDump.proto
@@ +2,5 @@
> +
> +package mozilla.devtools.protobuf;
> +
> +message Metadata {
> +  optional uint64 timeStamp = 1;

You've got to specify units and epoch here. Seconds since 1970? Milliseconds? Pity the poor hacker trying to read this file to understand the format.
Comment on attachment 8526240 [details] [diff] [review]
Part 2: Serialize heap snapshots to core dumps

Review of attachment 8526240 [details] [diff] [review]:
-----------------------------------------------------------------

More in-progress comments:

::: js/public/Debug.h
@@ +303,5 @@
> 
> 
>  // Return true if the given value is a Debugger object, false otherwise.
>  JS_PUBLIC_API(bool)
> +IsDebugger(const JSObject *obj);

nit: For pointers that aren't allowed to be null, SpiderMonkey encourages the use of references, as a convention. (Exceptions: JSRuntime *, the ubiquitous JSContext *, and pointers that represent the start of an array.)

@@ +308,5 @@
> +
> +// Append each of the debuggee global objects observed by the Debugger object
> +// |dbgObj| to |vector|. Returns true on success, false on failure.
> +JS_PUBLIC_API(bool)
> +GetDebuggeeGlobals(JSContext *cx, const JSObject *dbgObj, AutoObjectVector &vector);

Same here.

::: js/src/vm/Debugger.cpp
@@ +7261,2 @@
>  {
> +    return obj && js::Debugger::fromJSObject(obj) != nullptr;

Doesn't this assert if obj isn't a Debugger object, instead of returning false?

@@ +7269,5 @@
> +    js::Debugger *dbg = js::Debugger::fromJSObject(dbgObj);
> +
> +    if (!vector.reserve(vector.length() + dbg->debuggees.count()))
> +        // TODO FITZGEN: do I need to report the OOM on cx here? If so, how do I
> +        // avoid overwriting that in HeapSnapshotManager?

In reporting failures, SpiderMonkey makes a distinction between simple containers (Vector; HashMap; etc.) that return false on OOM but don't report anything on the context; versus, um, Proper SpiderMonkey Fallible Functions, which take a JSContext * and from which a false return means that an error *has been reported on the cx*, if appropriate.

In the present case:

1) The vector.reserve call is in the first category, so it's not going to report the OOM.

2) You need to decide which category GetDebuggeeGlobals falls in; I recommend the second. If that's the case, then it must report the OOM, and its callers should assume that a false return means that *cx has been set properly. In that case, it would almost always be a bug for a caller to overwrite cx's exception when GetDebuggeeGlobals returns false.
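Sketching (illustrative only, not the actual patch code) what the category-2 version of GetDebuggeeGlobals might look like:

JS_PUBLIC_API(bool)
GetDebuggeeGlobals(JSContext *cx, JSObject &dbgObj, JS::AutoObjectVector &vector)
{
    js::Debugger *dbg = js::Debugger::fromJSObject(&dbgObj);

    // Category 1: Vector::reserve returns false on OOM but reports nothing
    // on cx. Since this function is category 2, reporting is our job.
    if (!vector.reserve(vector.length() + dbg->debuggees.count())) {
        JS_ReportOutOfMemory(cx);
        return false;
    }

    // Having reserved the space, the appends below cannot fail. (A later
    // review notes this loop also needs a GC read barrier (bug 1137478);
    // that's elided from this sketch.)
    for (js::GlobalObjectSet::Range r = dbg->debuggees.all(); !r.empty(); r.popFront())
        vector.infallibleAppend(static_cast<JSObject *>(r.front()));

    return true;
}

Callers then know that a false return means an error is already pending on cx, and must not overwrite it.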
Comment on attachment 8526240 [details] [diff] [review]
Part 2: Serialize heap snapshots to core dumps

Review of attachment 8526240 [details] [diff] [review]:
-----------------------------------------------------------------

This is a good start, but I think we want to do things a bit differently.

The code to handle the boundaries could be simplified a lot. We walk the array of globals once to check it, then walk it again to get globals, then walk those globals to get zones, and then possibly pass that to the RootList init; all this could be a single loop, eliminating the need for any temporary list of globals. mRuntime.WasPassed() is checked in four different places. Instead, let's have a single function:

/**
 * Choose roots and limits for a traversal, given |boundaries|.
 * Set |roots| to the set of nodes within the boundaries that are referred to by
 * nodes outside. If |boundaries| does not include all JS zones, initialize
 * |zones| to the set of included zones; otherwise, leave |zones| uninitialized.
 * (You can use zones.initialized() to check.)
 *
 * If |boundaries| is incoherent, or we encounter an error while trying to
 * handle it, or we run out of memory, set |rv| appropriately.
 */
bool EstablishBoundaries(JSContext *cx,
                         ErrorResult &rv,
                         const HeapSnapshotBoundaries &boundaries,
                         RootList &roots,
                         ZoneSet &zones);

I think this is going to end up being not a short function, but in the end much simpler than the series we go through now.

I'm concerned about the out-of-memory handling, too. If the call to edges.append in RootList::init(JSContext *, ZoneSet &) fails, for example, it looks to me like that shows up as an NS_ERROR_UNEXPECTED result from SaveSnapshot. Shouldn't we return NS_ERROR_OUT_OF_MEMORY?

My other concern is that I don't think we should use template parameters the way HeapSnapshotHandler and SerializeToCoreDump do now. I've put more specific comments by the code.

::: toolkit/devtools/server/CoreDump.proto
@@ +6,5 @@
> +  optional uint64 timeStamp = 1;
> +}
> +
> +message Node {
> +  optional uint64 id = 1;

The choice of id is interesting. If we assigned nodes sequentially increasing ids as we visited them (we'd just make HeapSnapshotHandler<T>::NodeData hold a size_t), then the variable-length integer encoding would save us a lot of space, compared to addresses, which will often use 48 bits. But if we could recognize object relocation and address re-use (a big 'if'), then using ubi::Node::identifier here would let us compare heaps.

I'm inclined to say, let's assign sequential ids for now, and worry about providing ids that persist across snapshots once we have consumers for such, and solutions to the problems I mentioned above.

@@ +7,5 @@
> +}
> +
> +message Node {
> +  optional uint64 id = 1;
> +  // char16_t *

I would write |char16_t[]| here and in Edge, as there's no indirection involved.

::: toolkit/devtools/server/HeapSnapshotManager.cpp
@@ +27,5 @@
> +
> +/**
> + * Validate the boundaries object. Returns false on unexpected
> + * failure. Otherwise, `*validp` is true if the boundaries object is valid, and
> + * false if it is invalid.

Bobby suggested making the globals a .webidl sequence. How does that show up in the C++? If it's no longer a JavaScript array object, then perhaps this function wouldn't have any more "unexpected failure" conditions, and it could just return true iff the boundary is valid.

@@ +80,5 @@
> +  *validp = numPropsFound == 1;
> +  return true;
> +}
> +
> +typedef mozilla::Maybe<AutoCheckCannotGC> MaybeAutoCheckCannotGC;

If every user of RootList is going to wish they had this typedef, it seems like this probably belongs with the RootList declaration in the .h file.

@@ +146,5 @@
> +  return true;
> +}
> +
> +size_t
> +strlen16(const char16_t *s)

We should use NS_strlen here.

@@ +241,5 @@
> +  bool result = message.SerializeToString(&str);
> +  NS_ENSURE_TRUE(result, false);
> +
> +  int32_t nbytes = str.length();
> +  int32_t amount = PR_Write(fd, str.data(), nbytes);

PR_Write is an unbuffered 'write' system call. This is going to be very system-call-heavy. We should be writing to an nsIOutputStream, I would think.

@@ +303,5 @@
> +  }
> +
> +  protobuf::Metadata metadata;
> +  // TODO FITZGEN: figure out how the hell to get the time stamp.
> +  // metadata.set_timestamp();

I think PR_Now is probably good.

::: toolkit/devtools/server/HeapSnapshotManager.h
@@ +23,5 @@
> + * Populate a `protobuf::Node` protobuf message with the given
> + * `JS::ubi::Node`. Returns false on failure.
> + */
> +extern bool PopulateProtobufNode(JSContext *cx,
> +                                 JS::ubi::Node ubiNode,

ubi::Node is meant to be usable as a value type, but it is two words long, so passing by const reference here is probably not a bad idea.

@@ +32,5 @@
> + * A JS::ubi::BreadthFirst handler that serializes a snapshot of the heap into a
> + * core dump.
> + */
> +template<typename CoreDump>
> +class MOZ_STACK_CLASS HeapSnapshotHandler {

Rather than making this a template parameter, we should have:

struct NodeWriter {
    virtual bool write(const JS::ubi::Node &node) = 0;
};

and then have everyone take a pointer to that. We get better type checking and error messages; we can keep internal details in the .cpp where they belong; the mocking in the unit tests will be easier; and compared to the serialization, the virtual call overhead is insignificant.

@@ +58,5 @@
> +
> +  // JS::ubi::BreadthFirst handler interface.
> +
> +  class NodeData { };
> +  bool operator() (JS::ubi::BreadthFirst<HeapSnapshotHandler> &traversal,

Let's do:

typedef JS::ubi::BreadthFirst<HeapSnapshotHandler> Traversal

up towards the top. Then use that here, and say CoreDumpHandler::Traversal instead of the typedef down in SerializeToCoreDump.

@@ +75,5 @@
> +    if (zones) {
> +      // Don't serialize nodes that belong to a zone that is not an element of our
> +      // zone set. Do serialize nodes that are not associated with a zone, or are
> +      // associated with the atoms zone, on the assumption that they are shared
> +      // resources that our target zones are using.

This is an interesting choice. It means that we'll get edges to nodes we didn't serialize, which is going to be a pain. And once we have better ubi::Node support for non-SpiderMonkey types, it means that we'll wander off into areas of the graph that aren't associated with any zone --- because we do serialize any node reachable through an arbitrary path of non-zone nodes.

I think it would be better to serialize every node referenced by any edge we serialize; but omit outgoing edges of nodes that are not in a target zone. That way we have some data about everything our nodes of interest refer to; but when a referent is not of interest, we just don't have as much detail about it.

@@ +115,5 @@
> +
> +  protobuf::Node protobufNode;
> +  NS_ENSURE_TRUE(PopulateProtobufNode(cx, node, wantNames, protobufNode),
> +                 false);
> +  NS_ENSURE_TRUE(coreDump.serialize(protobufNode), false);

We're not supposed to add new instances of NS_ENSURE_TRUE; see https://bugzilla.mozilla.org/show_bug.cgi?id=971673#c88

That also suggests an alternative. With NodeWriter, of course, this will turn into a single call to writer.write(...).

@@ +132,5 @@
> +}
> +
> +// TODO fitzgen: this has to exist somewhere re-usable, already... can't use
> +// js_strlen because it isn't exported...
> +extern size_t strlen16(const char16_t *s);

How about NS_strlen?

::: toolkit/devtools/server/generate-core-dump-sources.sh
@@ +14,5 @@
> +fi
> +
> +if [ ! -e $PROTOC_PATH ]; then
> +  echo You must install the protocol compiler from
> +  echo https://code.google.com/p/protobuf/downloads/list

If we require a very old version of protoc (as you've said on IRC), then this should document the version necessary, or point to where one can find the version of the protobuf-lite library that protoc must match.
Attachment #8526240 - Flags: review?(jimb)
Nick, since you're revising the main patch, I'm going to treat the other reviews as still needed, but lower priority for now. If that's not okay, let me know.
Flags: needinfo?(nfitzgerald)
Flags: needinfo?(nfitzgerald)
Attachment #8526236 - Attachment is obsolete: true
Attachment #8526240 - Attachment is obsolete: true
Attachment #8526242 - Attachment is obsolete: true
Attachment #8526244 - Attachment is obsolete: true
Attachment #8526242 - Flags: review?(jimb)
Attachment #8526244 - Flags: review?(jimb)
Attachment #8549932 - Attachment is obsolete: true
Attachment #8549933 - Attachment is obsolete: true
Attachment #8549934 - Attachment is obsolete: true
Attachment #8549935 - Attachment is obsolete: true
Attachment #8549936 - Attachment is obsolete: true
Attachment #8549937 - Attachment is obsolete: true
Attachment #8549938 - Attachment is obsolete: true
Attachment #8549939 - Attachment is obsolete: true
Attachment #8549940 - Attachment is obsolete: true
Comment on attachment 8555561 [details] [diff] [review]
WIP: GZip serialized core dumps

I'm getting linker errors with this patch: it claims it can't find the GzipOutputStream ctor's symbols. I'm looking into how the protobuf library is built in patch 0, but I haven't seen anything out of the ordinary yet.
Attachment #8555549 - Attachment is obsolete: true
Attachment #8555550 - Attachment is obsolete: true
Attachment #8555551 - Attachment is obsolete: true
Attachment #8555552 - Attachment is obsolete: true
Attachment #8555554 - Attachment is obsolete: true
Attachment #8555555 - Attachment is obsolete: true
Attachment #8555556 - Attachment is obsolete: true
Attachment #8555557 - Attachment is obsolete: true
Attachment #8555558 - Attachment is obsolete: true
Attachment #8555560 - Attachment is obsolete: true
Attachment #8555561 - Attachment is obsolete: true
Attachment #8556598 - Attachment is obsolete: true
Comment on attachment 8556601 [details] [diff] [review]
Part 1: Add HeapSnapshotManager WebIDL interface.

Review of attachment 8556601 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/bindings/Bindings.conf
@@ +446,5 @@
>  },
> 
> +'HeapSnapshotManager': {
> +  # The codegen is dumb, and doesn't understand that this interface is only a
> +  # collection of static methods, so we have this `concrete: False` hack.

Is there a bug on file for this?
Attachment #8556601 - Flags: review?(bobbyholley) → review+
Though actually, do we really want to be introducing entire WebIDL interface objects for trivial stuff like this? Wouldn't it make more sense to just have a single utils class or something? Smaug, what do you think?
Flags: needinfo?(bugs)
Attachment #8556607 - Flags: review?(bobbyholley) → review+
Yeah, Part 1 doesn't look performance critical or anything. And the dictionary is rather simple, so saveSnapshot could just take all those dictionary members as parameters.

sequence<object> globals; could be an nsIVariant... but sequence<> as nsIVariant is annoying to handle, so just because of that, using .webidl would be OK with me. There is also the option of using .idl for saveSnapshot and keeping the dictionary, and then initializing the dictionary on the C++ side using a jsval.

Do we really not have any helper interface for debugger stuff where saveSnapshot could be added?
Flags: needinfo?(bugs)
The existing Debugger API is implemented in and exposed by SpiderMonkey, so there isn't a good existing WebIDL interface to expose this stuff on. I was following the pattern of the `PromiseDebugging` WebIDL interface. I suppose I could rename this interface to `DevTools` or something? Note that I am adding another method in later patches (patch 7), and potentially even more in the future.
(In reply to Nick Fitzgerald [:fitzgen] from comment #59)
> The existing Debugger API is implemented in and exposed by SpiderMonkey, so
> there isn't a good existing WebIDL interface to expose this stuff on. I was
> following the pattern of the `PromiseDebugging` WebIDL interface.

It seems like we should avoid proliferating these, and create something like ChromeUtils as a dumping ground for this stuff. We pay a per-interface cost in various places, so we should avoid adding new interfaces willy-nilly.
Attachment #8556608 - Attachment is obsolete: true
Attachment #8556609 - Attachment is obsolete: true
Attachment #8556610 - Attachment is obsolete: true
Depends on: 1129559
Attachment #8558252 - Attachment is obsolete: true
Attachment #8558253 - Attachment is obsolete: true
Attachment #8558254 - Attachment is obsolete: true
Attachment #8560159 - Flags: feedback?(bobbyholley)
Attachment #8560161 - Flags: feedback?(jimb)
Attachment #8560162 - Flags: feedback?(jimb)
Comment on attachment 8560159 [details] [diff] [review]
Part 7: Add HeapSnapshot WebIDL interface

Review of attachment 8560159 [details] [diff] [review]:
-----------------------------------------------------------------

This doesn't seem to address comment 60.
Attachment #8560159 - Flags: feedback?(bobbyholley) → feedback-
(In reply to Bobby Holley (Busy with media, don't ask for DOM/JS/XPConnect things) from comment #71)
> This doesn't seem to address comment 60.

I was going to rename "HeapSnapshotManager" to "DevToolsPlatformUtils", which could be shared by all devtools static functions (since we already have a DevToolsUtils js module), but I haven't done that yet.

I don't see a way to avoid having a HeapSnapshot interface though, as this is going to be where we have methods for computing dominator trees, finding shortest paths from a root to an object, etc. What do you suggest in this case?
Flags: needinfo?(bobbyholley)
(In reply to Nick Fitzgerald [:fitzgen] from comment #72)
> I was going to rename "HeapSnapshotManager" to "DevToolsPlatformUtils", which
> could be shared by all devtools static functions (since we already have a
> DevToolsUtils js module), but I haven't done that yet.

I think it should be even more general - is there any reason we can't just make something called ChromeUtils? It's not like these static methods share state, so I don't see any reason why we need to separate them into separate interfaces. If we do DevToolsPlatformUtils, what happens when the Loop team needs to add a one-off method?

> I don't see a way to avoid having a HeapSnapshot interface though, as this
> is going to be where we have methods for computing dominator trees, finding
> shortest paths from a root to an object, etc.

Right - if this is for an actual concrete object, having an interface is fine.
Flags: needinfo?(bobbyholley)
Attachment #8556712 - Attachment is obsolete: true
Attachment #8564356 - Flags: review?(mmc)
Attachment #8564356 - Flags: review?(mh+mozilla)
Comment on attachment 8564356 [details] [diff] [review]
Part 0: Upgrade the protobuf library. r=mmc

Review of attachment 8564356 [details] [diff] [review]:
-----------------------------------------------------------------

I'm not sure how or if configure is called from moz.build, but it doesn't seem to be. In that case we should not include configure or any of its friends, either.

::: configure.in
@@ +9136,5 @@
>      AC_OUTPUT_SUBDIRS(modules/freetype2)
>    ) || exit 1
>  fi
> 
> +  if test -z "$direct_nspr_config"; then

This formatting change looks bogus.

::: toolkit/components/downloads/generate_csd.sh
@@ +26,2 @@
>  # Get the protocol buffer and compile it
> +CMD='wget https://src.chromium.org/chrome/trunk/src/chrome/common/safe_browsing/csd.proto -O csd.proto'

I think this link is now wrong since Chrome switched to gitiles. See https://bugzilla.mozilla.org/show_bug.cgi?id=1083085.

What are the consequences of omitting the csd changes from this patch? If they don't include breaking backwards compatibility, the easiest thing to do is omit it for now and file a bug to take care of it later.

::: toolkit/components/protobuf/README.txt
@@ +23,5 @@
> +   `$MOZILLA_CENTRAL/toolkit/components/protobuf/src/google/**` except for test
> +   files and `$PROTOBUF/src/google/protobuf/compiler/**` files.
> +
> +3. Update the moz.build to export the new set of headers and add any new .cc
> +   files to the unified sources.

If there aren't any manual steps required in 1. and 2., can we change the update script to do this automatically?

::: toolkit/components/protobuf/autom4te.cache/output.0
@@ +1,1 @@
> +@%:@! /bin/sh

Is this file used in our build system?

::: toolkit/components/protobuf/autom4te.cache/requests
@@ +1,3 @@
> +# This file was generated.
> +# It contains the lists of macros which have been traced.
> +# It can be safely removed.

We should remove it if it's not being used.

::: toolkit/components/protobuf/autom4te.cache/traces.0
@@ +1,1 @@
> +m4trace:configure.ac:15: -1- AC_INIT([Protocol Buffers], [2.6.1], [protobuf@googlegroups.com], [protobuf])

Are we using this?

::: toolkit/components/protobuf/configure
@@ +1,1 @@
> +#! /bin/sh

Same question.

::: toolkit/components/protobuf/configure.ac
@@ +1,1 @@
> +## Process this file with autoconf to produce configure.

Same question.

::: toolkit/components/protobuf/src/google/protobuf/SEBS
@@ +5,5 @@
> +# This is an experimental build definition file using the SEBS build system.
> +# I (Kenton Varda, maintainer of Protocol Buffers) happen to be the author of
> +# SEBS, though SEBS is not a Google project. I'm sticking this file in
> +# protobuf's SVN because that's the easiest place for me to put it, and it
> +# shouldn't harm anyone. This file is not included in the distribution.

We shouldn't include this.

::: toolkit/components/protobuf/vs2013.patch
@@ -1,1 @@
> -diff --git a/toolkit/components/protobuf/google/protobuf/wire_format_lite_inl.h b/toolkit/components/protobuf/google/protobuf/wire_format_lite_inl.h

Did you check that this works with VS2013?
Attachment #8564356 - Flags: review?(mmc)
Comment on attachment 8564356 [details] [diff] [review]
Part 0: Upgrade the protobuf library. r=mmc

Review of attachment 8564356 [details] [diff] [review]:
-----------------------------------------------------------------

I have nothing to add to Monica's comments. I'll just note that autom4te.cache files shouldn't even be in the upstream releases. If they are, they should fix their release scripts.
Attachment #8564356 - Flags: review?(mh+mozilla)
Attachment #8556601 - Attachment is obsolete: true
Attachment #8566736 - Flags: review+
Attachment #8556603 - Attachment is obsolete: true
Attachment #8556603 - Flags: review?(jimb)
Attachment #8566741 - Flags: review?(jimb)
Attachment #8556605 - Attachment is obsolete: true
Attachment #8556605 - Flags: review?(jimb)
Attachment #8566743 - Flags: review?(jimb)
Attachment #8556606 - Attachment is obsolete: true
Attachment #8556606 - Flags: review?(jimb)
Attachment #8566744 - Flags: review?(jimb)
Attachment #8556607 - Attachment is obsolete: true
Attachment #8566745 - Flags: review+
Attachment #8560159 - Attachment is obsolete: true
Attachment #8566746 - Flags: review?(bobbyholley)
Attachment #8560161 - Attachment is obsolete: true
Attachment #8560161 - Flags: feedback?(jimb)
Attachment #8566747 - Flags: review?(jimb)
Attachment #8560162 - Attachment is obsolete: true
Attachment #8560162 - Flags: feedback?(jimb)
Attachment #8566748 - Flags: review?(jimb)
Attachment #8560163 - Attachment is obsolete: true
Attachment #8560163 - Flags: feedback?(jimb)
Attachment #8566749 - Flags: review?(jimb)
Attachment #8566749 - Attachment is obsolete: true
Attachment #8566749 - Flags: review?(jimb)
Attachment #8566751 - Flags: review?(jimb)
Attachment #8560164 - Attachment is obsolete: true
Attachment #8566752 - Flags: review?(jimb)
Attachment #8560165 - Attachment is obsolete: true
Attachment #8560165 - Flags: feedback?(jimb)
Attachment #8566754 - Flags: review?(jimb)
Attachment #8566746 - Flags: review?(bobbyholley) → review+
Comment on attachment 8566738 [details] [diff] [review]
Part 2: Implement a google::protobuf::ZeroCopyOutputStream wrapper around nsIOutputStream

Review of attachment 8566738 [details] [diff] [review]:
-----------------------------------------------------------------

I get that protobuf just returns false when something goes wrong, but I think we can do a better job here propagating error codes out from the underlying nsIOutputStream to our callers. Let's change 'failed' from a bool to an nsresult, and add a public accessor function for it. writeBuffer would save the result from nsIOutputStream::write there. When a protobuf operation fails, we can check the underlying ZeroCopyNSIOutputStream's accessor to see if it can provide any more detail on the cause. 'flush' and 'writeBuffer' would return nsresult, and it would be in the 'Next' implementation that we throw away detail, as required by the interface it's implementing.

::: toolkit/devtools/server/ZeroCopyNSIOutputStream.cpp
@@ +10,5 @@
> +namespace mozilla {
> +namespace devtools {
> +
> +ZeroCopyNSIOutputStream::ZeroCopyNSIOutputStream(nsIOutputStream &out)
> +  : ZeroCopyOutputStream()

I think you can leave this out when it has no arguments.

@@ +23,5 @@
> +}
> +
> +ZeroCopyNSIOutputStream::~ZeroCopyNSIOutputStream()
> +{
> +  (void) writeBuffer();

I have suggestions elsewhere for how to avoid throwing away this error result; but if we don't go with that, this should at least NS_WARN_IF.

@@ +44,5 @@
> +    return true;
> +
> +  uint32_t amountWritten = 0;
> +  while (amountWritten < amountUsed) {
> +    uint32_t justWritten = 0;

size_t is generally the right type for any count of bytes in memory, but I see that nsIOutputStream::Write requires uint32_t. Oh well.

@@ +46,5 @@
> +  uint32_t amountWritten = 0;
> +  while (amountWritten < amountUsed) {
> +    uint32_t justWritten = 0;
> +
> +    if (NS_WARN_IF(NS_FAILED(out.Write(reinterpret_cast<char *>(buffer) + amountWritten,

If you make buffer a char[], as recommended below, then this can just be |buffer + amountWritten|, because arrays are converted to pointers to their first element.

@@ +52,5 @@
> +                             &justWritten))))
> +      return fail();
> +
> +    if (NS_WARN_IF(justWritten == 0))
> +      return fail();

I can't find anything that says that a zero written count is an error. I think it is just an extreme case of "partial write". Is there a reason we shouldn't simply retry?

@@ +58,5 @@
> +    amountWritten += justWritten;
> +  }
> +
> +  if (NS_WARN_IF(NS_FAILED(out.Flush())))
> +    return fail();

We should only flush the underlying stream if we ourselves are flushed, it seems to me.

@@ +83,5 @@
> +  }
> +
> +  *data = &buffer;
> +  *size = BUFFER_SIZE;
> +  amountUsed = BUFFER_SIZE;

I think this isn't correct. If anyone calls BackUp, and then calls Next, then amountUsed will be less than BUFFER_SIZE on entry to this function; we won't call writeBuffer, and we'll hand out the same section of the buffer we did before. I think *data and *size need to reflect the portion of buffer after the first amountUsed bytes.

::: toolkit/devtools/server/ZeroCopyNSIOutputStream.h
@@ +18,5 @@
> +// `nsIOutputStream` under the covers.
> +//
> +// This class will automatically write and flush its data to the
> +// `nsIOutputStream` in its destructor, but if you care whether that call
> +// succeeds or fails, then you should call the `flush` method yourself.

I really dislike eating errors like this. It always wastes some poor schmuck's time.

I considered recommending that the destructor assert that there is no unflushed data; but then abandoning a ZeroCopyNSIOutputStream because of some other error would cause assertions. Would it be feasible to have that assert, and add an "abandon" method that is called when one is deliberately dropping a ZeroCopyNSIOutputStream because of some other error? Are the streams' lifetimes controlled enough to make that practical?

@@ +25,5 @@
> +{
> +  static const int BUFFER_SIZE = 8192;
> +
> +  // The nsIOutputStream we are streaming to.
> +  nsIOutputStream &out;

This should probably be nsCOMPtr, right?

@@ +28,5 @@
> +  // The nsIOutputStream we are streaming to.
> +  nsIOutputStream &out;
> +
> +  // The buffer we write data to before passing it to the output stream.
> +  uint8_t buffer[BUFFER_SIZE];

If you make this just |char buffer[BUFFER_SIZE]|, then you can avoid a bunch of casts everywhere.

@@ +57,5 @@
> +  // ZeroCopyOutputStream Interface
> +  virtual ~ZeroCopyNSIOutputStream() MOZ_OVERRIDE;
> +  virtual bool Next(void **data, int *size) MOZ_OVERRIDE;
> +  virtual void BackUp(int count) MOZ_OVERRIDE;
> +  virtual ::google::protobuf::int64 ByteCount() const MOZ_OVERRIDE;

Spacing is weird here.
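Concretely, the buffer bookkeeping I have in mind for Next/BackUp looks something like this sketch; it assumes the |buffer|, |amountUsed|, and |writeBuffer()| members discussed above, and is illustrative rather than the patch itself. Next hands out only the unused tail of the buffer, so a BackUp followed by another Next never hands out the same bytes twice:

bool
ZeroCopyNSIOutputStream::Next(void **data, int *size)
{
  MOZ_ASSERT(data && size);

  // If the buffer is full, flush it to the underlying nsIOutputStream.
  if (amountUsed == BUFFER_SIZE) {
    if (NS_FAILED(writeBuffer()))
      return false;
  }

  // Hand out only the portion of the buffer after the first amountUsed
  // bytes; after a BackUp, that is the part the caller hasn't consumed.
  *data = buffer + amountUsed;
  *size = BUFFER_SIZE - amountUsed;
  amountUsed = BUFFER_SIZE;
  return true;
}

void
ZeroCopyNSIOutputStream::BackUp(int count)
{
  MOZ_ASSERT(count >= 0, "Cannot back up a negative number of bytes.");
  MOZ_ASSERT(uint32_t(count) <= amountUsed,
             "Can't back up further than the last Next handed out.");
  amountUsed -= count;
}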
Attachment #8566738 - Flags: review?(jimb) → feedback+
RootList's dependence on Maybe<AutoCheckCannotGC> (which is not understood by the static checker) made me sad, so I filed bug 1137389.
Depends on: 1137478
Comment on attachment 8566741 [details] [diff] [review]
Part 3: Serialize heap snapshots.

Review of attachment 8566741 [details] [diff] [review]:
-----------------------------------------------------------------

Partial review; will actually look at the meat of the patch next.

::: js/public/UbiNode.h
@@ +492,5 @@
> +  {
> +    settle();
> +  }
> +
> +  void popFront() MOZ_OVERRIDE { i++; settle(); }

Might want to assert !empty here.

@@ +496,5 @@
> +  void popFront() MOZ_OVERRIDE { i++; settle(); }
> +};
> +
> +
> +typedef mozilla::Maybe<AutoCheckCannotGC> MaybeAutoCheckCannotGC;

You probably wanted to use this typedef in the RootList definition, right? (I don't object! but I find these typedefs that save two characters and introduce a shadow of doubt as to their definition kind of annoying.)

::: js/public/UbiNodeTraverse.h
@@ +120,5 @@
>      MOZ_ASSERT(!traversalBegun);
>      traversalBegun = true;
> 
> +    // While there are pending nodes, visit them, until we've found a path
> +    // to the target.

Since we're here, could you just drop the "until we've" clause altogether? There's no 'target' in BreadthFirst at all, so I think that comment is a vestige of its origins in the implementation of findPath.

::: js/src/vm/Debugger.cpp
@@ +7686,5 @@
>          return false;
> +    }
> +
> +    for (GlobalObjectSet::Range r = dbg->debuggees.all(); !r.empty(); r.popFront())
> +        vector.append(static_cast<JSObject *>(r.front()));

This needs a read barrier. Filed as bug 1137478; full explanation there.

Also, you should use Vector::infallibleAppend here.

::: toolkit/devtools/server/CoreDump.proto
@@ +41,5 @@
> +// In practice, certain message fields have a lot of duplication (such as type
> +// or edge name strings). Rather than try and de-duplicate this information at
> +// the protobuf message and field level, core dumps should be written with
> +// `google::protobuf::io::GzipOutputStream` and read from
> +// `google::protobuf::io::GZipInputStream`.

"GzipInputStream"

(Oh, Jim, thank you so much for catching all these subtle bugs!)

@@ +43,5 @@
> +// the protobuf message and field level, core dumps should be written with
> +// `google::protobuf::io::GzipOutputStream` and read from
> +// `google::protobuf::io::GZipInputStream`.
> +
> +// We only have the lite protobuf runtime in mozilla-central.

Is this still true? I thought we had to drop the lite runtime to get the GzipOutputStream.

::: toolkit/devtools/server/generate-core-dump-sources.sh
@@ +2,5 @@
> +
> +# A script to generate toolkit/devtools/server/CoreDump.pb.{h,cc} from
> +# toolkit/devtools/server/CoreDump.proto. This script assumes you have
> +# downloaded and installed the protocol buffer compiler, and that it is either
> +# on your $PATH or located at $PROTOC_PATH.

When I ran toolkit/components/downloads/generate_csd.sh, I got confused about which directory I was supposed to run it in. It might be friendly to just "cd $(dirname $0)" at the top.
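For reference, the writing side I'd expect looks something like this sketch (not the patch's code): keep a single GzipOutputStream alive for the whole dump, wrapping the raw ZeroCopyOutputStream (e.g. the nsIOutputStream wrapper from part 2), and size-prefix each message so the reader can find its end.

#include <google/protobuf/io/coded_stream.h>
#include <google/protobuf/io/gzip_stream.h>

// Sketch only: append one size-prefixed message to an already-open gzip
// stream. CodedOutputStream is cheap; the usual pattern is one per message.
bool writeMessage(google::protobuf::io::GzipOutputStream &gzipStream,
                  const google::protobuf::MessageLite &message)
{
  google::protobuf::io::CodedOutputStream codedStream(&gzipStream);

  // Prefix the message with its size, so the reader knows where it ends.
  codedStream.WriteVarint32(message.ByteSize());
  message.SerializeWithCachedSizes(&codedStream);
  return !codedStream.HadError();
}

When the dump is complete, the caller must Close() the GzipOutputStream so the trailing gzip data actually gets written.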
Comment on attachment 8566741 [details] [diff] [review]
Part 3: Serialize heap snapshots.

Review of attachment 8566741 [details] [diff] [review]:
-----------------------------------------------------------------

This looks totally great. Some necessary changes noted, but r=me with those taken care of.

::: toolkit/devtools/server/ChromeUtils.cpp
@@ +25,5 @@
> +
> +namespace mozilla {
> +namespace devtools {
> +
> +#define HEAP_SNAPSHOT_ERROR_MESSAGE_LENGTH 1024

This isn't used.

@@ +32,5 @@
> +using namespace dom;
> +
> +
> +// If we are only taking a snapshot of the heap affected by the given set of
> +// globals, find the set of zones the globals allocate within. Returns false on

"are allocated within"

@@ +199,5 @@
> +    , wantNames(wantNames)
> +    , stream(stream)
> +  { }
> +
> +  virtual ~StreamWriter() MOZ_OVERRIDE { }

We seem to be following SpiderMonkey style in this file; there, one includes either 'virtual' or 'MOZ_OVERRIDE', but not both.

@@ +221,5 @@
> +    mozilla::MallocSizeOf mallocSizeOf = dbg::GetDebuggerMallocSizeOf(rt);
> +    MOZ_ASSERT(mallocSizeOf);
> +    protobufNode.set_size(ubiNode.size(mallocSizeOf));
> +
> +    UniquePtr<ubi::EdgeRange, JS::FreePolicy> edges(ubiNode.edges(cx, wantNames));

Shouldn't this be inside the 'if (includeEdges)' too?

@@ +222,5 @@
> +    MOZ_ASSERT(mallocSizeOf);
> +    protobufNode.set_size(ubiNode.size(mallocSizeOf));
> +
> +    UniquePtr<ubi::EdgeRange, JS::FreePolicy> edges(ubiNode.edges(cx, wantNames));
> +    if (!edges)

NS_WARN_IF, like below?

@@ +279,5 @@
> +  {
> +    // We're only interested in the first time we reach edge.referent, not
> +    // in every edge arriving at that node.
> +    if (!first)
> +      return true;

This isn't right, is it? We need all the edges, or else we can't do findPath or dominator tree analyses on the graph.

@@ +300,5 @@
> +    JS::Zone *zone = referent.zone();
> +
> +    if (zones->has(zone)) {
> +      return writer.writeNode(referent, CoreDumpWriter::INCLUDE_EDGES);
> +    }

SpiderMonkey style says no braces when both 'if' and consequent are single lines.

@@ +360,5 @@
> +
> +  nsCOMPtr<nsIOutputStream> outputStream;
> +  rv = NS_NewLocalFileOutputStream(getter_AddRefs(outputStream), file,
> +                                   PR_WRONLY | PR_CREATE_FILE | PR_TRUNCATE,
> +                                   -1, 0);

Those last two are the default values for those arguments, aren't they?

@@ +381,5 @@
> +                      zones.initialized() ? &zones : nullptr,
> +                      maybeNoGC.ref()))
> +  {
> +    rv.Throw(NS_ERROR_UNEXPECTED);
> +    return;

Here's where we would check zeroCopyStream for any saved error code, if we make the changes I suggest in comment 91.

::: toolkit/devtools/server/ChromeUtils.h
@@ +20,5 @@
> +namespace devtools {
> +
> +// A `CoreDumpWriter` is given the data we wish to save in a core dump and
> +// serializes it to disk, or memory, or a socket, etc.
> +class CoreDumpWriter

This is really nice.
Attachment #8566741 - Flags: review?(jimb)
Attachment #8566741 - Flags: review+
Attachment #8566741 - Flags: feedback+
Attachment #8566747 - Flags: review?(jimb) → review+
Comment on attachment 8566743 [details] [diff] [review]
Part 4: Add an xpcshell test for saving heap snapshots.

Review of attachment 8566743 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/devtools/server/tests/unit/test_SaveHeapSnapshot.js
@@ +16,5 @@
> +  testGoodParameters(filePath);
> +  do_test_finished();
> +}
> +
> +function testBadParameters(filePath) {

How about:

- boundaries.globals is not an object
- boundaries.globals is not an array
- boundaries.globals is an array containing a non-object
- boundaries.debugger is Debugger.prototype
- some boundaries property is a getter (what does WebIDL do with that?)

@@ +54,5 @@
> +function testGoodParameters(filePath) {
> +  let sandbox = Cu.Sandbox(CC('@mozilla.org/systemprincipal;1', 'nsIPrincipal')());
> +  const dbg = new Debugger(sandbox);
> +  ChromeUtils.saveHeapSnapshot(filePath, { debugger: dbg });
> +  ok(true, "Should be able to save a snapshot for debuggee globals.");

Debugger with multiple debuggees? Debugger with no debuggees?
Attachment #8566743 - Flags: review?(jimb) → review+
Comment on attachment 8566748 [details] [diff] [review]
Part 9: Add JS::ubi::TypeNameRegistry

Review of attachment 8566748 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/vm/UbiNode.cpp
@@ +238,5 @@
> +    TypeNameRegistry::Register(MOZ_UTF16("js::Shape"));
> +template<> const char16_t *TracerConcrete<js::BaseShape>::concreteTypeName =
> +    TypeNameRegistry::Register(MOZ_UTF16("js::BaseShape"));
> +template<> const char16_t *TracerConcrete<js::ObjectGroup>::concreteTypeName =
> +    TypeNameRegistry::Register(MOZ_UTF16("js::ObjectGroup"));

We are not allowed to use static constructors in Firefox. Let's talk on IRC about how to accomplish what you're trying to do here.
Attachment #8566748 - Flags: review?(jimb) → review-
(In reply to Jim Blandy :jimb from comment #96)
> We are not allowed to use static constructors in Firefox. Let's talk on IRC
> about how to accomplish what you're trying to do here.

Ok, so Jim and I talked on IRC and came to the conclusion that we don't yet need DeserializedNodes to answer is<T>() for the T they are a deserialized version of. Every use of is<T>() is a guard before an as<T>(), which you can't do with a non-live ubi::Node anyway. Instead, we should add ubi::Node methods/accessors for the various bits of info we as<T>() for right now, and then serialize, deserialize, and re-expose that data with heap snapshots.
Attachment #8564356 - Attachment is obsolete: true
Attachment #8570809 - Flags: review?(mmc)
Attachment #8570809 - Flags: review?(mh+mozilla)
(In reply to [:mmc] Monica Chew (please use needinfo) from comment #75)
> ::: toolkit/components/downloads/generate_csd.sh
> @@ +26,2 @@
> >  # Get the protocol buffer and compile it
> > +CMD='wget https://src.chromium.org/chrome/trunk/src/chrome/common/safe_browsing/csd.proto -O csd.proto'
>
> I think this link is now wrong since Chrome switched to gitiles. See
> https://bugzilla.mozilla.org/show_bug.cgi?id=1083085.

Updated. (Wow, gitiles is lame.)

> What are the consequences of omitting the csd changes from this patch? If
> they don't include breaking backwards compatibility, the easiest thing to do
> is omit it for now and file a bug to take care of it later.

Unfortunately, the old generated files won't work with this newer runtime, so we have to regenerate them.

> ::: toolkit/components/protobuf/README.txt
> @@ +23,5 @@
> > +   `$MOZILLA_CENTRAL/toolkit/components/protobuf/src/google/**` except for test
> > +   files and `$PROTOBUF/src/google/protobuf/compiler/**` files.
> > +
> > +3. Update the moz.build to export the new set of headers and add any new .cc
> > +   files to the unified sources.
>
> If there aren't any manual steps required in 1. and 2., can we change the
> update script to do this automatically?

Added a fairly comprehensive upgrade script.

> ::: toolkit/components/protobuf/vs2013.patch
> @@ -1,1 @@
> > -diff --git a/toolkit/components/protobuf/google/protobuf/wire_format_lite_inl.h b/toolkit/components/protobuf/google/protobuf/wire_format_lite_inl.h
>
> Did you check that this works with VS2013?

For some reason I thought that this got fixed upstream, but it appears it didn't. Folded this patch into m-c-changes.patch.
Comment on attachment 8570809 [details] [diff] [review]
Part 0: Upgrade the protobuf library. r=mmc

Review of attachment 8570809 [details] [diff] [review]:
-----------------------------------------------------------------

rs=me

::: toolkit/components/protobuf/README.txt
@@ +21,5 @@
> +
> +2. Export the `$PROTOBUF_LIB_PATH` variable as the absolute path to that release
> +   checkout.
> +
> +3. Run `./toolkit/components/protobuf/upgrade_protobuf.sh`.

It would probably be simpler to pass the path of the release checkout as an argument to the script, instead of having to set an environment variable.
Attachment #8570809 - Flags: review?(mh+mozilla) → review+
Attachment #8570809 - Attachment is obsolete: true
Attachment #8570809 - Flags: review?(mmc)
Attachment #8571605 - Flags: review?(mmc)
(In reply to Nick Fitzgerald [:fitzgen] from comment #101)
> Created attachment 8571605 [details] [diff] [review]
> Part 0: Upgrade the protobuf library.

Carrying over glandium's r+. This version passes the protobuf path as an argument, and additionally has better usage error reporting.
Comment on attachment 8571605 [details] [diff] [review]
Part 0: Upgrade the protobuf library.

Review of attachment 8571605 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks! LG pending green try.
Attachment #8571605 - Flags: review?(mmc) → review+
Attachment #8571605 - Attachment is obsolete: true
Attachment #8572045 - Flags: review+
Trying to fix the compiler warnings that are causing build failures. https://treeherder.mozilla.org/#/jobs?repo=try&revision=1fda3a090506
Attachment #8566741 - Attachment is obsolete: true
Attachment #8572081 - Flags: review+
Depends on: 1139217
Comment on attachment 8572080 [details] [diff] [review]
Part 2: Implement a google::protobuf::ZeroCopyOutputStream wrapper around nsIOutputStream

Review of attachment 8572080 [details] [diff] [review]:
-----------------------------------------------------------------

Looks great. r=me with minor comments addressed.

::: toolkit/devtools/server/ZeroCopyNSIOutputStream.cpp
@@ +22,5 @@
> +}
> +
> +ZeroCopyNSIOutputStream::~ZeroCopyNSIOutputStream()
> +{
> +  NS_WARN_IF(NS_FAILED(writeBuffer()));

Let's only try to writeBuffer if !failed(). Otherwise we're going to warn twice if a write to the underlying nsIOutputStream fails: once when the operation actually fails, and then again when we run the destructor and writeBuffer returns early.

@@ +81,5 @@
> +  MOZ_ASSERT(count > 0,
> +             "Must back up a positive number of bytes.");
> +  MOZ_ASSERT(amountUsed == BUFFER_SIZE,
> +             "Can only call BackUp directly after calling Next.");
> +  MOZ_ASSERT(uint32_t(count) < amountUsed,

Not '<='?
Attachment #8572080 - Flags: review?(jimb) → review+
Attachment #8572045 - Attachment is obsolete: true
Attachment #8572693 - Flags: review+
Attachment #8572693 - Attachment is obsolete: true
Attachment #8574905 - Flags: review+
Alright this latest version should hopefully fix the static analysis complaints about isNaN and the LOGLEVEL_0 compiler errors on windows. New try push: https://treeherder.mozilla.org/#/jobs?repo=try&revision=c564e626976f
Attachment #8566736 - Attachment is obsolete: true
Attachment #8574914 - Flags: review+
Attachment #8572081 - Attachment is obsolete: true
Attachment #8574916 - Flags: review+
Attachment #8566744 - Attachment is obsolete: true
Attachment #8566744 - Flags: review?(jimb)
Attachment #8574918 - Flags: review?(jimb)
Attachment #8566745 - Attachment is obsolete: true
Attachment #8574919 - Flags: review+
Attachment #8566746 - Attachment is obsolete: true
Attachment #8574920 - Flags: review+
Attachment #8566747 - Attachment is obsolete: true
Attachment #8574921 - Flags: review+
Attachment #8566748 - Attachment is obsolete: true
Attachment #8566751 - Attachment is obsolete: true
Attachment #8566751 - Flags: review?(jimb)
Attachment #8574922 - Flags: review?(jimb)
Attachment #8566752 - Attachment is obsolete: true
Attachment #8566752 - Flags: review?(jimb)
Attachment #8574923 - Flags: review?(jimb)
Attachment #8566754 - Attachment is obsolete: true
Attachment #8566754 - Flags: review?(jimb)
Attachment #8574924 - Flags: review?(jimb)
Attachment #8566756 - Attachment is obsolete: true
Attachment #8566756 - Flags: review?(jimb)
Attachment #8574925 - Flags: review?(jimb)
(In reply to Nick Fitzgerald [:fitzgen] from comment #115)
> Alright this latest version should hopefully fix the static analysis
> complaints about isNaN and the LOGLEVEL_0 compiler errors on windows.
>
> New try push:
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=c564e626976f

Alright, down to just the Windows builds failing! Inching closer...
Attachment #8574905 - Attachment is obsolete: true
Attachment #8576238 - Flags: review+
(In reply to Nick Fitzgerald [:fitzgen] from comment #129)
> Created attachment 8576238 [details] [diff] [review]
> Part 0: Upgrade the protobuf library.

And I'm hoping this fixes those Windows builds: https://treeherder.mozilla.org/#/jobs?repo=try&revision=7d71b71549d2
Pushing to try from m-c rather than inbound for hopefully less noise: https://treeherder.mozilla.org/#/jobs?repo=try&revision=fa0eaae2835f
Comment on attachment 8574918 [details] [diff] [review]
Part 5: Add GTests for core dump serialization.

Review of attachment 8574918 [details] [diff] [review]:
-----------------------------------------------------------------

I've asked for a few additions, but with those this is probably a good enough start for now.

We should test behavior crossing zone boundaries, where we serialize the referent node, but not its outgoing edges. We should test nodes with two incoming edges.

::: toolkit/devtools/server/tests/gtest/DevTools.h
@@ +35,5 @@
> +  JSCompartment *compartment;
> +  JS::Zone *zone;
> +  JS::PersistentRootedObject global;
> +
> +  static const JSClass *getGlobalClass() {

Can't this just be a static member of the class?

@@ +52,5 @@
> +    rt(nullptr),
> +    cx(nullptr)
> +  { }
> +
> +  virtual void SetUp() {

Perhaps assert(!initialized_) here?

@@ +208,5 @@
> +  return false;
> +}
> +
> +// Ensures that two char16_t* strings are equal.
> +MATCHER_P(UTF16StrEq, str, "") {

Couldn't you use NS_strcmp here?

@@ +224,5 @@
> +// A mock `Writer` class to be used with testing `WriteHeapGraph`.
> +class MockWriter : public CoreDumpWriter
> +{
> +public:
> +  virtual ~MockWriter() MOZ_OVERRIDE { }

We're using plain 'override' now; MOZ_OVERRIDE is no more.

::: toolkit/devtools/server/tests/gtest/DoesntCrossZoneBoundaries.cpp
@@ +6,5 @@
> +// Test that heap snapshots walk the zone boundaries correctly.
> +
> +#include "DevTools.h"
> +
> +DEF_TEST(DoesntCrossZoneBoundaries, {

Let's have a pair to this, DoesCrossZoneBoundaries. It can be mostly copy-paste of this test. Let's also test with an empty zone set, and with a null zone set.

::: toolkit/devtools/server/tests/gtest/SerializesEdgeNames.cpp
@@ +10,5 @@
> +using testing::Field;
> +using testing::Property;
> +using testing::Return;
> +
> +DEF_TEST(SerializesEdgeNames, {

It would be easy to extend this test: how about a nameless edge? How about an edge whose name is the empty string?

@@ +16,5 @@
> +  FakeNode node(cx);
> +  FakeNode referent(cx);
> +  const char16_t edgeName[] = MOZ_UTF16("edge name");
> +
> +  size_t size = (strlen("edge name") + 1) * sizeof(char16_t);

This is just sizeof(edgeName). sizeof doesn't convert array arguments to a pointer to the first element, so you get the size of the whole array.

@@ +30,5 @@
> +  EXPECT_CALL(writer, writeNode(AllOf(
> +    EdgesLength(cx, 1),
> +    Edge(cx, 0, Field(&JS::ubi::Edge::name,
> +                      UTF16StrEq(edgeName)))
> +  ), _))

Is this indentation weird in the original?

@@ +35,5 @@
> +    .Times(1)
> +    .WillOnce(Return(true));
> +
> +  // Should get the referent node that doesn't have any edges once.
> +  EXPECT_CALL(writer, writeNode(EdgesLength(cx, 0), _))

This would be better as a call to ExpectWriteNode, it seems to me.
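For instance, the nameless-edge case could look something like this sketch (AddEdgeToFakeNode is hypothetical; the matchers are the DevTools.h helpers quoted above, with their signatures assumed from the quoted snippets):

DEF_TEST(SerializesNamelessEdges, {
  FakeNode node(cx);
  FakeNode referent(cx);
  // Hypothetical helper: give |node| one edge to |referent| whose name is
  // nullptr, even though wantNames is true.
  AddEdgeToFakeNode(node, referent, nullptr);

  ::testing::NiceMock<MockWriter> writer;

  // Expect the originating node to be serialized with exactly one edge,
  // whose name field is null.
  EXPECT_CALL(writer, writeNode(AllOf(
                        EdgesLength(cx, 1),
                        Edge(cx, 0, Field(&JS::ubi::Edge::name,
                                          ::testing::IsNull()))),
                      _))
    .Times(1)
    .WillOnce(Return(true));

  // The referent has no outgoing edges; then drive WriteHeapGraph over
  // |node| exactly as SerializesEdgeNames does.
  ExpectWriteNode(writer, referent);
});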
Attachment #8574918 - Flags: review?(jimb) → review+
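For reference, a minimal sketch of the two smaller suggestions from that review (not the patch's final code; NS_strcmp and MOZ_UTF16 are real Mozilla APIs, the rest is illustrative):

#include "gmock/gmock.h"
#include "nsCRTGlue.h"      // NS_strcmp for char16_t* comparison
#include "mozilla/Char16.h" // MOZ_UTF16

// The matcher can simply delegate to NS_strcmp instead of hand-rolling
// a char16_t comparison loop:
MATCHER_P(UTF16StrEq, str, "") {
  return NS_strcmp(arg, str) == 0;
}

// And sizeof applied to an array (unlike a pointer) covers the whole
// array, terminator included, so the strlen arithmetic is unnecessary:
static const char16_t edgeName[] = MOZ_UTF16("edge name");
static_assert(sizeof(edgeName) == (9 + 1) * sizeof(char16_t),
              "ten char16_t elements, including the trailing null");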
Comment on attachment 8574922 [details] [diff] [review]
Part 9: Deserialize heap snapshots

Review of attachment 8574922 [details] [diff] [review]:
-----------------------------------------------------------------

This is great code. It has that "couldn't be otherwise" quality to it. Just some concerns:

::: toolkit/devtools/server/ChromeUtils.cpp
@@ +404,5 @@
> +    return nullptr;
> +  }
> +
> +  PRFileInfo fileInfo;
> +  if (PR_GetFileInfo(path.get(), &fileInfo) != PR_SUCCESS) {

For the future: consider using PR_GetFileInfo64 (and dealing with issues of conversion to size_t on 32-bit machines).

@@ +425,5 @@
> +
> +  uint32_t bytesRead = 0;
> +  while (bytesRead < size) {
> +    uint32_t bytesLeft = size - bytesRead;
> +    int32_t bytesReadThisTime = PR_Read(fd, buffer.get() + bytesRead, bytesLeft);

For the future: there is a PR_MemMap.

::: toolkit/devtools/server/DeserializedNode.cpp
@@ +71,5 @@
> +  for (decltype(edgesLength) i = 0; i < edgesLength; i++) {
> +    DeserializedEdge edge;
> +    if (!edge.init(node.edges(i), *owner))
> +      return nullptr;
> +    edges.append(Move(edge));

You want to use infallibleAppend here, since you don't expect to outgrow what you've reserved above.

::: toolkit/devtools/server/DeserializedNode.h
@@ +27,5 @@
> +namespace devtools {
> +
> +class HeapSnapshot;
> +
> +using NodeId = uint64_t;

Get off my lawn.

@@ +34,5 @@
> +// node with id equal to `DeserializedEdge::referent` that we deserialized from
> +// a core dump.
> +struct DeserializedEdge {
> +  NodeId referent;
> +  const char16_t *name;

This is always a string borrowed from the HeapSnapshot's table, right? If so, this must be documented, just as for DeserializedNode::typeName.

@@ +51,5 @@
> +
> +// A `DeserializedNode` is a node in the heap graph that we deserialized from a
> +// core dump.
> +struct DeserializedNode {
> +  using EdgeVector = Vector<DeserializedEdge>;

Here is a case where choosing a good MinInlineCapacity (second template arg) could make a big difference. But let's get data first.

@@ +65,5 @@
> +  HeapSnapshot *owner;
> +
> +  // Create a new `DeserializedNode` from the given `protobuf::Node` message.
> +  static UniquePtr<DeserializedNode> Create(const protobuf::Node &node,
> +                                            HeapSnapshot *owner);

We're following SpiderMonkey style in this directory, apparently, so let's make this |HeapSnapshot &owner|.

@@ +77,5 @@
> +
> +  // Get a borrowed reference to the given edge's referent. This method is
> +  // virtual to provide a hook for gmock and gtest.
> +  virtual const UniquePtr<DeserializedNode> &
> +  getEdgeReferent(const DeserializedEdge &edge);

Wouldn't simply DeserializedNode & be the natural type for a borrowed reference? The operator * of a UniquePtr will produce such a reference.

@@ +80,5 @@
> +  virtual const UniquePtr<DeserializedNode> &
> +  getEdgeReferent(const DeserializedEdge &edge);
> +
> +private:
> +  DeserializedNode() = delete;

Do you have to delete this? I think simply declaring any constructor yourself causes the default constructor not to be supplied.

::: toolkit/devtools/server/HeapSnapshot.cpp
@@ +137,5 @@
> +    return false;
> +
> +  // Finally, the rest of the nodes in the core dump.
> +
> +  while (stream.ByteCount() < size) {

Unless there's something about the relationships of the streams here that I'm missing (quite possible), this doesn't seem right: it assumes that |gzipStream| won't consume up to the end of |stream| yet have more data readable in its own buffer. I don't see why a decompressor mustn't read ahead. I think you have to use gzipStream.Next(...) here, check its return value to detect end-of-stream, and then BackUp completely before handing the stream to parseMessage.

@@ +152,5 @@
> +const char16_t *
> +HeapSnapshot::borrowUniqueString(const char16_t *duplicateString)
> +{
> +  MOZ_ASSERT(duplicateString);
> +  auto ptr = strings.lookupForAdd(duplicateString);

So inference. (Wow.)

::: toolkit/devtools/server/HeapSnapshot.h
@@ +62,5 @@
> +  };
> +
> +  // Initialize this HeapSnapshot from the given buffer that contains a
> +  // serialized core dump. Do NOT take ownership of the buffer, only borrow
> +  // it. Return false on failure.

"only borrow it for the duration of the call", right?

@@ +75,5 @@
> +
> +  // The id of the root node for this deserialized heap graph.
> +  NodeId rootId;
> +
> +  // The set of nodes in this deserialized heap graph, keyed by id.

Suggestion for the future: I think this could save a lot of memory by holding the DeserializedNodes directly:

  js::HashMap<NodeId, DeserializedNode>

because the hash table is frozen once 'init' is over, so ubi::Nodes pointing directly into the table won't be invalidated by table resizing.

@@ +95,5 @@
> +  virtual ~HeapSnapshot() { }
> +
> +public:
> +  // Create a `HeapSnapshot` from the given buffer that contains a serialized
> +  // core dump. Do NOT take ownership of the buffer, only borrow it.

So, again, this doesn't hold a reference to buffer after it returns, right? "... only borrow it for the duration of the call."
Attachment #8574922 - Flags: review?(jimb) → review+
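A minimal sketch of the PR_GetFileInfo64 suggestion from the review above (the surrounding function shape is assumed, not the patch's actual code):

#include "prio.h"    // PR_GetFileInfo64, PRFileInfo64
#include <stdint.h>

// The 64-bit variant avoids truncating the size of a large snapshot
// file; on 32-bit builds the value still has to be range-checked
// before it is narrowed to size_t.
static bool getSnapshotSize(const char* path, size_t* outSize) {
  PRFileInfo64 fileInfo;
  if (PR_GetFileInfo64(path, &fileInfo) != PR_SUCCESS)
    return false;
  if (uint64_t(fileInfo.size) > uint64_t(SIZE_MAX))
    return false;  // Too big for this process's size_t.
  *outSize = size_t(fileInfo.size);
  return true;
}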
Comment on attachment 8574922 [details] [diff] [review]
Part 9: Deserialize heap snapshots

Sorry: meant f+, not r+; concerned about the check for EOF on gzipStream, as noted in the review.
Attachment #8574922 - Flags: review+ → feedback+
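To spell out that EOF concern, a sketch (under assumed stream names, not the patch's actual code) of the Next/BackUp pattern the review asks for before each parseMessage call:

#include <google/protobuf/io/zero_copy_stream.h>

// Next() returning false is the only reliable end-of-stream signal on
// a ZeroCopyInputStream; ByteCount() on the underlying raw stream says
// nothing about what the decompressor has buffered ahead. BackUp()
// returns the peeked bytes so the next parse still sees them.
static bool streamHasMoreData(google::protobuf::io::ZeroCopyInputStream& gzipStream) {
  const void* buffer;
  int size = 0;
  // Next() may legally hand back an empty buffer, so loop until we see
  // real data or a definitive end-of-stream.
  while (gzipStream.Next(&buffer, &size)) {
    if (size > 0) {
      gzipStream.BackUp(size);  // Un-read the peeked bytes.
      return true;
    }
  }
  return false;
}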
Attachment #8576238 - Attachment is obsolete: true
Attachment #8583476 - Flags: review+
Attachment #8574914 - Attachment is obsolete: true
Attachment #8583477 - Flags: review+
Attachment #8574916 - Attachment is obsolete: true
Attachment #8583479 - Flags: review+
Attachment #8574918 - Attachment is obsolete: true
Attachment #8583481 - Flags: review+
Attachment #8574919 - Attachment is obsolete: true
Attachment #8583482 - Flags: review+
Attachment #8574920 - Attachment is obsolete: true
Attachment #8583483 - Flags: review+
Attachment #8574921 - Attachment is obsolete: true
Attachment #8583484 - Flags: review+
Attachment #8574922 - Attachment is obsolete: true
Attachment #8583485 - Flags: review?(jimb)
Attachment #8574923 - Attachment is obsolete: true
Attachment #8574923 - Flags: review?(jimb)
Attachment #8583486 - Flags: review?(jimb)
Attachment #8574924 - Attachment is obsolete: true
Attachment #8574924 - Flags: review?(jimb)
Attachment #8583487 - Flags: review?(jimb)
Attachment #8574925 - Attachment is obsolete: true
Attachment #8574925 - Flags: review?(jimb)
Attachment #8583488 - Flags: review?(jimb)
Attachment #8583485 - Attachment is obsolete: true
Attachment #8583485 - Flags: review?(jimb)
Attachment #8583491 - Flags: review?(jimb)
Attachment #8583476 - Attachment is obsolete: true
Attachment #8584023 - Flags: review+
Attachment #8583479 - Attachment is obsolete: true
Attachment #8584027 - Flags: review+
Attachment #8583484 - Attachment is obsolete: true
Attachment #8584032 - Flags: review+
Attachment #8583491 - Attachment is obsolete: true
Attachment #8583491 - Flags: review?(jimb)
Attachment #8584033 - Flags: review?(jimb)
Attachment #8583486 - Attachment is obsolete: true
Attachment #8583486 - Flags: review?(jimb)
Attachment #8584034 - Flags: review?(jimb)
Alright, now everything is rebased on the latest m-c and is using {override,final} instead of MOZ_{OVERRIDE,FINAL}.
Comment on attachment 8584033 [details] [diff] [review]
Part 9: Deserialize heap snapshots

Review of attachment 8584033 [details] [diff] [review]:
-----------------------------------------------------------------

Looks great, thanks for the revisions.
Attachment #8584033 - Flags: review?(jimb) → review+
Comment on attachment 8584033 [details] [diff] [review]
Part 9: Deserialize heap snapshots

Review of attachment 8584033 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/devtools/server/HeapSnapshot.h
@@ +77,5 @@
> +  NodeId rootId;
> +
> +  // The set of nodes in this deserialized heap graph, keyed by id.
> +  using NodeMap = js::HashMap<NodeId, UniquePtr<DeserializedNode>>;
> +  NodeMap nodes;

Oh, one thought here, if you have time: since DeserializedNode itself has an 'id' field, this HashMap actually ends up storing the id twice in each hash entry: once as the key, and then again in the DeserializedNode pointed to by the value. The hack to avoid this is to make this a HashSet, use Id for the Lookup type in the hash policy, and make the policy's match method look inside its first argument to get the Id.
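A sketch of that deduplication (the policy and alias names here are illustrative, not the patch's final code):

// Store DeserializedNode values directly in a HashSet and look them up
// by NodeId, so the id is kept once per entry instead of twice.
struct DeserializedNodeHashPolicy {
  using Lookup = NodeId;
  static js::HashNumber hash(const Lookup& id) {
    // Fold the 64-bit id down to a 32-bit hash number.
    return js::HashNumber(id) ^ js::HashNumber(id >> 32);
  }
  static bool match(const DeserializedNode& node, const Lookup& id) {
    return node.id == id;  // The key lives inside the stored value.
  }
};

using NodeSet = js::HashSet<DeserializedNode, DeserializedNodeHashPolicy>;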
Comment on attachment 8584035 [details] [diff] [review]
Part 11: Implement a JS::ubi::Node specialization for DeserializedNode

Review of attachment 8584035 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good.

::: js/public/UbiNode.h
@@ +369,5 @@
>      return base()->edges(cx, wantNames);
>    }
>
> +  typedef Base::Id Id;
> +  Id identifier() const { return base()->identifier(); }

Hmm. This is necessary because a ubi::Node backed by a DeserializedNode wants to report the saved node's id, and not synthesize a fresh one based on the DeserializedNode's address. That's too bad; it would have been nice to keep 'identifier' fast. If this becomes a problem, we can name the non-virtual method 'fastIdentifier', and use that when anything that preserves equality will do. At least the HashPolicy is still fast.

::: toolkit/devtools/server/DeserializedNode.cpp
@@ +148,5 @@
> +    settle();
> +  }
> +
> +  bool
> +  init(DeserializedNode &node)

For in-class inline definitions, SpiderMonkey style puts the return type on the same line as the function name. Please fix here and elsewhere.

@@ +165,5 @@
> +      return false;
> +    }
> +
> +    DeserializedNode &referent = node.getEdgeReferent(*edgep);
> +    edges.append(mozilla::Move(SimpleEdge(name, Node(&referent))));

As before, use infallibleAppend here, since you reserved above. It's a pity we have to copy the edge vector like this. Could you file a follow-up bug to make this iterate over the DeserializedNode's underlying vector, and have 'settle' build a proper JS::ubi::Edge from the current element?

@@ +184,5 @@
> +EdgeRange *
> +Concrete<DeserializedNode>::edges(JSContext *cx, bool) const
> +{
> +  mozilla::UniquePtr<DeserializedEdgeRange> range(
> +    js_new<DeserializedEdgeRange>(cx));

Could we use MakeUnique here, to get our alloc/free matching automatically? (I hate that the types don't take care of this for us. I tried to make myself go back over the patches and check that all UniquePtrs were being used with the right deleters, but I just couldn't.)
Attachment #8584035 - Flags: review?(jimb) → review+
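Both this review and the Part 9 review ask for the same reserve/infallibleAppend pairing; a minimal sketch of the pattern with mozilla::Vector (the element type is assumed from the patch):

#include "mozilla/Vector.h"

// After a successful reserve(n), up to n appends cannot fail, so
// infallibleAppend skips the fallible growth path entirely.
static bool copyEdges(const mozilla::Vector<DeserializedEdge>& src,
                      mozilla::Vector<DeserializedEdge>& dst) {
  if (!dst.reserve(src.length()))
    return false;                    // Report OOM to the caller.
  for (size_t i = 0; i < src.length(); i++)
    dst.infallibleAppend(src[i]);    // Guaranteed to fit.
  return true;
}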
Comment on attachment 8584034 [details] [diff] [review]
Part 10: Add an xpcshell test for reading heap snapshots.

Review of attachment 8584034 [details] [diff] [review]:
-----------------------------------------------------------------

We're going to be very unhappy with such meager test coverage, but there's just not much time. Let's file a followup for full branch coverage by the test suite of the .cpp and .h files in toolkit/devtools/server.
Attachment #8584034 - Flags: review?(jimb) → review+
Comment on attachment 8584036 [details] [diff] [review]
Part 12: Add a GTest for the JS::ubi::Node specialization for DeserializedNode

Review of attachment 8584036 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/devtools/server/tests/gtest/DeserializedNodeUbiNodes.cpp
@@ +40,5 @@
> +DEF_TEST(DeserializedNodeUbiNodes, {
> +  const char16_t *typeName = MOZ_UTF16("TestTypeName");
> +
> +  NodeId id = 1337;
> +  uint64_t size = 1024;

Perhaps it's dumb, but I always used full-width random hex constants for these things, to make sure we weren't getting silently truncated down to 32 bits somewhere along the path.

@@ +63,5 @@
> +  edge1.referent = referent1->id;
> +  mocked.addEdge(Move(edge1));
> +  EXPECT_CALL(mocked,
> +              getEdgeReferent(Field(&DeserializedEdge::referent,
> +                                    referent1->id)))

For the future, it'd be nice to add a DeserializedEdge matcher to DevTools.h. Then we could just say exactly what we mean.
Attachment #8584036 - Flags: review?(jimb) → review+
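Sketches of both suggestions (the hex values and the matcher name are made up for illustration):

// Full-width constants: bits set in the high 32 make any silent
// truncation to 32 bits fail the test instead of passing by accident.
const NodeId   id   = 0xF13F576C672B1537ULL;
const uint64_t size = 0xA2441F0B5C24FE31ULL;

// A DeserializedEdge matcher for DevTools.h, so expectations can say
// exactly what they mean:
MATCHER_P(EdgeReferringTo, referentId, "") {
  return arg.referent == referentId;
}

// Usage: EXPECT_CALL(mocked, getEdgeReferent(EdgeReferringTo(referent1->id)));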
Depends on: 1149399
Comment on attachment 8584033 [details] [diff] [review]
Part 9: Deserialize heap snapshots

Review of attachment 8584033 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/devtools/server/ChromeUtils.cpp
@@ +397,5 @@
> +{
> +  // TODO FITZGEN: it sure would be nice to actually return a
> +  // Promise<HeapSnapshot> and do all this IO off main thread...
> +
> +  UniquePtr<char[]> path(ToNewCString(filePath));

Does this pair the alloc and free functions correctly? I think this should be:

  UniquePtr<char[], nsMemory::free>

per: https://dxr.mozilla.org/mozilla-central/source/xpcom/string/nsReadableUtils.cpp#278
Comment on attachment 8584033 [details] [diff] [review]
Part 9: Deserialize heap snapshots

Review of attachment 8584033 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/devtools/server/DeserializedNode.cpp
@@ +95,5 @@
> +  , edges()
> +  , owner(&owner)
> +{
> +  this->edges = Move(edges);
> +}

Is there a reason we can't just say:

  , edges(Move(edges))

in the initializer list, and leave the constructor body empty? (When it appears within the parens, the argument name shadows the member name, so it works.)
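The suggestion spelled out, with the full signature inferred from the quoted fragment (so treat it as a sketch):

DeserializedNode::DeserializedNode(NodeId id,
                                   EdgeVector&& edges,
                                   HeapSnapshot& owner)
  : id(id)
  , edges(Move(edges))  // Inside the parens, the parameter shadows the
                        // member, so the member is move-constructed here.
  , owner(&owner)
{ }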
Depends on: 1149397
Here is my latest try push:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=499bdf6b592b

Looks like the GTests all pass, but then we get some warnings and a segfault (assert?), causing the builds to come up red. Kind of confused about this.

15:12:14 INFO - TEST-PASS | GTest unit test: passed
15:12:14 INFO - [16229] WARNING: An event was posted to a thread that will never run it (rejected): file /builds/slave/try-l64-asan-d-000000000000000/build/src/xpcom/threads/nsThread.cpp, line 513
15:12:14 INFO - [16229] WARNING: 'NS_FAILED(rv)', file /builds/slave/try-l64-asan-d-000000000000000/build/src/security/manager/boot/src/DataStorage.cpp, line 700
15:12:14 INFO - [16229] WARNING: 'NS_FAILED(rv)', file /builds/slave/try-l64-asan-d-000000000000000/build/src/security/manager/boot/src/DataStorage.cpp, line 742
15:12:14 INFO - Program /builds/slave/try-l64-asan-d-000000000000000/build/src/obj-firefox/dist/bin/firefox (pid = 16229) received signal 11.
15:12:14 INFO - Stack:
15:12:14 INFO - #01: ???[/builds/slave/try-l64-asan-d-000000000000000/build/src/obj-firefox/dist/bin/gtest/libxul.so +0x885d2db]
15:12:14 INFO - #02: ???[/builds/slave/try-l64-asan-d-000000000000000/build/src/obj-firefox/dist/bin/gtest/libxul.so +0x9145ec9]
15:12:14 INFO - #03: ???[/builds/slave/try-l64-asan-d-000000000000000/build/src/obj-firefox/dist/bin/gtest/libxul.so +0x91d619b]
15:12:14 INFO - #04: ???[/builds/slave/try-l64-asan-d-000000000000000/build/src/obj-firefox/dist/bin/gtest/libxul.so +0x91bd074]
15:12:14 INFO - #05: ???[/builds/slave/try-l64-asan-d-000000000000000/build/src/obj-firefox/dist/bin/gtest/libxul.so +0x22164a9]
15:12:14 INFO - #06: ???[/builds/slave/try-l64-asan-d-000000000000000/build/src/obj-firefox/dist/bin/gtest/libxul.so +0x2215ee8]
15:12:14 INFO - #07: ???[/builds/slave/try-l64-asan-d-000000000000000/build/src/obj-firefox/dist/bin/gtest/libxul.so +0x2215cc5]
15:12:14 INFO - #08: ???[/builds/slave/try-l64-asan-d-000000000000000/build/src/obj-firefox/dist/bin/gtest/libxul.so +0x2219395]
15:12:14 INFO - #09: ???[/builds/slave/try-l64-asan-d-000000000000000/build/src/obj-firefox/dist/bin/gtest/libxul.so +0x237c8df]
15:12:14 INFO - #10: ???[/builds/slave/try-l64-asan-d-000000000000000/build/src/obj-firefox/dist/bin/gtest/libxul.so +0x1ff260f]
15:12:14 INFO - #11: ???[/builds/slave/try-l64-asan-d-000000000000000/build/src/obj-firefox/dist/bin/gtest/libxul.so +0x1ff20e5]
15:12:14 INFO - #12: ???[/builds/slave/try-l64-asan-d-000000000000000/build/src/obj-firefox/dist/bin/gtest/libxul.so +0x76ea7d1]
15:12:14 INFO - #13: ???[/builds/slave/try-l64-asan-d-000000000000000/build/src/obj-firefox/dist/bin/gtest/libxul.so +0x76f2a8a]
15:12:14 INFO - #14: XRE_main[/builds/slave/try-l64-asan-d-000000000000000/build/src/obj-firefox/dist/bin/gtest/libxul.so +0x76f35b3]
15:12:14 INFO - #15: ???[/builds/slave/try-l64-asan-d-000000000000000/build/src/obj-firefox/dist/bin/firefox +0x8c711]
15:12:14 INFO - #16: ???[/builds/slave/try-l64-asan-d-000000000000000/build/src/obj-firefox/dist/bin/firefox +0x8bc92]
15:12:14 INFO - #17: __libc_start_main[/lib64/libc.so.6 +0x1ecdd]
15:12:14 INFO - #18: ???[/builds/slave/try-l64-asan-d-000000000000000/build/src/obj-firefox/dist/bin/firefox +0x8b97d]

Additionally, the hazard analysis thinks that WriteHeapGraph can GC, which shouldn't be the case, and we end up with 6 more hazards than expected:
https://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/nfitzgerald@mozilla.com-499bdf6b592b/try-linux64-br-haz/hazards.txt.gz

Looking good otherwise! Getting closer.
No longer blocks: 1138721
Ok, I figured out what's going on with those GTest failures:

* Our gtest runner infrastructure uses ScopedXPCOM.
* XPCOM uses its own JSRuntime.
* JSRuntime uses thread local storage.
* Therefore, there can only be one JSRuntime per thread.
* My new gtests manage their own JSRuntime,
* which overwrites the ScopedXPCOM JSRuntime's thread local storage,
* and causes failures when the ScopedXPCOM JSRuntime shuts down after all the tests run.

So I need to figure out how to get access to the ScopedXPCOM's JSRuntime in these tests rather than managing my own.
CycleCollectedJSRuntime::Get()?
Or rather, CycleCollectedJSRuntime::Get()->Runtime(). This is used in a few places to pull the JSRuntime out of TLS.
http://mxr.mozilla.org/mozilla-central/source/dom/base/nsGlobalWindow.cpp#12277
(In reply to Bobby Holley (:bholley) from comment #175)

nsIJSRuntimeService::GetRuntime did the trick locally -- any reason I should prefer this over CycleCollectedJSRuntime::Get()->Runtime()?
(In reply to Nick Fitzgerald [:fitzgen] from comment #176)
> (In reply to Bobby Holley (:bholley) from comment #175)
>
> nsIJSRuntimeService::GetRuntime did the trick locally -- any reason I should
> prefer this or CycleCollectedJSRuntime::Get()->Runtime() ?

It depends whether you want the main-thread JSRuntime (your suggestion) or the JSRuntime for the thread you're on (my suggestion).
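A sketch of what that fix might look like in the gtest fixture (the fixture name and shape are assumed here, not the landed code):

#include "gtest/gtest.h"
#include "mozilla/CycleCollectedJSRuntime.h"

class DevToolsTest : public ::testing::Test {
 protected:
  JSRuntime* rt = nullptr;

  void SetUp() override {
    // Borrow the JSRuntime that ScopedXPCOM already installed for this
    // thread instead of creating a second one; two runtimes on one
    // thread fight over the same thread-local storage slot.
    rt = mozilla::CycleCollectedJSRuntime::Get()->Runtime();
    ASSERT_TRUE(rt);
  }

  // No TearDown for rt: we don't own it, ScopedXPCOM does.
};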
For the hazards, if the protobuf library is function pointer-happy, we can always do a blanket whitelist of mozilla::devtools::protobuf::*. I don't know how protobufs work, though -- if you are expected to pass in functions to do stuff, then that would seem very unsafe.

Generally speaking, you don't have to cut these off at the bottom frame. For example, with

uint8 mozilla::devtools::StreamWriter::writeNode(JS::ubi::Node*, uint8)
void mozilla::devtools::protobuf::Node::Node()
void mozilla::devtools::protobuf::Node::Node()
void mozilla::devtools::protobuf::Node::SharedCtor()
const class std::basic_string<char>* google::protobuf::internal::GetEmptyString()
void google::protobuf::GoogleOnceInit(int64*, (void)()*)())
void google::protobuf::GoogleOnceInitImpl(int64*, google::protobuf::Closure*)
void google::protobuf::internal::FunctionClosure0::Run()
FieldCall: google::protobuf::internal::FunctionClosure0.function_

would it help to annotate GetEmptyString(), or do other things do similar stuff? If necessary, we can do a special push that captures the whole callgraph, but there aren't really good tools to, say, "find me all of the function pointers called anywhere under here".

As for the nsIRunnable.Run, I have to agree with the analysis -- that can definitely GC in general. Perhaps the best thing to annotate there would be

mozilla::devtools::DeserializedNode* MockDeserializedNode::getEdgeReferent(mozilla::devtools::DeserializedEdge*)>::Argument1)

or perhaps the whole MockDeserializedNode class. (Though if that *can* GC, even just in tests, then you're risking intermittent oranges with the annotation.)
(In reply to Nick Fitzgerald [:fitzgen] from comment #179)
> Created attachment 8591746 [details] [diff] [review]
> Part 14: Ignore protobuf indirect calls in the GC hazard analysis

So the problem here is that our infra uses a custom gtest logger which runs a sync runnable that we have to assume might GC, and ASSERT/EXPECT gtest assertions call into that for each assertion. This means that gtest assertions are hazards.

The semi-good news is that we only run the sync runnable if an assertion has already failed. In that light, it doesn't seem super horrible to me to suppress the hazard warning there, since the test already failed even if we accidentally trigger a GC.

However, my #ifdef doesn't seem to actually suppress the hazard analysis, and it is still warning about the same method. Should I just add that method to annotations.js?
Flags: needinfo?(sphink)
AutoSuppressGCAnalysis is intended for places where you know it won't GC in the (dynamic) path being considered. That is not true in this case, which means that (1) if it *does* GC, you'll have a MOZ_CRASH; and (2) even if it doesn't, we sometimes add checks in places that *might* GC, so you could get a MOZ_CRASH anyway.

We don't currently have an analysis-only RAII guard, and I'm hesitant to add one. Your situation is crazy enough that it might be warranted, but even if so, I'd probably rather do it via an annotation to prevent it from being used for other purposes.

Not only that, but you already have the MOZ_CRASH problem -- you're within the scope of an AutoCheckCannotGC, which will also crash if there is a GC or even a potential GC.

Still thinking about the principled way to fix all this.
Flags: needinfo?(sphink)
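For context, a sketch of the guard in question (the callee here is hypothetical):

#include "js/GCAPI.h"

void somethingTheAnalysisThinksMightGC();  // hypothetical callee

// AutoSuppressGCAnalysis silences the static analysis for its scope,
// but also plants a runtime MOZ_CRASH if a GC (or even a potential GC
// checkpoint) is reached -- which is exactly the problem described
// above: it only fits paths that dynamically cannot GC.
void example() {
  JS::AutoSuppressGCAnalysis suppress;
  somethingTheAnalysisThinksMightGC();
}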
Depends on: 1154341
We had talked on IRC and agreed to whitelist all MockWhatever gmock'd classes. Even so, it seems that we are still going through gtest assertions to code that calls function pointers passed in with potentially user-provided interfaces, which could theoretically GC (although we are 99% sure they don't in our specific case).

AFAICT, this is the same situation as with the mocked classes, but now we don't have anything as easy to point at and whitelist as "when given some thought, this is OK".

Maybe just whitelist AssertHelper? Continue whitelisting one thing at a time until the hazard analysis is happy? Rewrite the tests with some other test framework?

https://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/nfitzgerald@mozilla.com-32ebdf26750a/try-linux64-br-haz/hazards.txt.gz

Function '_ZN37DevTools_DoesCrossZoneBoundaries_Test8TestBodyEv|void DevTools_DoesCrossZoneBoundaries_Test::TestBody()' has unrooted 'noGC' of type 'JS::AutoCheckCannotGC' live across GC call '_ZNK7testing8internal12AssertHelperaSERKNS_7MessageE|void testing::internal::AssertHelper::operator=(testing::Message*) const' at toolkit/devtools/server/tests/gtest/DoesCrossZoneBoundaries.cpp:10

toolkit/devtools/server/tests/gtest/DoesCrossZoneBoundaries.cpp:10: Call(173,174, __temp_60*.AssertionResult(0,__temp_61*))
toolkit/devtools/server/tests/gtest/DoesCrossZoneBoundaries.cpp:10: Assign(174,175, gtest_ar_:7 := __temp_60*)
toolkit/devtools/server/tests/gtest/DoesCrossZoneBoundaries.cpp:10: Call(175,176, __temp_60.~AssertionResult())
toolkit/devtools/server/tests/gtest/DoesCrossZoneBoundaries.cpp:10: Call(176,177, __temp_63 := gtest_ar_:7.operator 1())
toolkit/devtools/server/tests/gtest/DoesCrossZoneBoundaries.cpp:10: Assume(177,179, __temp_63*, false)
toolkit/devtools/server/tests/gtest/DoesCrossZoneBoundaries.cpp:10: Call(179,181, __temp_66 := GetBoolAssertionFailureMessage(gtest_ar_:7,"WriteHeapGraph(cx, JS::ubi::Node(&nodeA), writer, false, &targetZones, noGC)","false","true"))
toolkit/devtools/server/tests/gtest/DoesCrossZoneBoundaries.cpp:10: Call(181,183, __temp_65 := __temp_66.c_str())
toolkit/devtools/server/tests/gtest/DoesCrossZoneBoundaries.cpp:10: Call(183,185, __temp_64*.AssertHelper(0,2,"/builds/slave/l64-br-haz_try_dep-00000000000/build/source/toolkit/devtools/server/tests/gtest/DoesCrossZoneBoundaries.cpp",70,__temp_65*))
toolkit/devtools/server/tests/gtest/DoesCrossZoneBoundaries.cpp:10: Call(185,187, __temp_67*.Message(0))
toolkit/devtools/server/tests/gtest/DoesCrossZoneBoundaries.cpp:10: Call(187,189, __temp_64.operator=(__temp_67)) [[GC call]]
toolkit/devtools/server/tests/gtest/DoesCrossZoneBoundaries.cpp:10: Call(189,191, __temp_66.~String())
toolkit/devtools/server/tests/gtest/DoesCrossZoneBoundaries.cpp:10: Call(191,193, __temp_64.~AssertHelper())
toolkit/devtools/server/tests/gtest/DoesCrossZoneBoundaries.cpp:10: Call(193,195, __temp_67.~Message())
toolkit/devtools/server/tests/gtest/DoesCrossZoneBoundaries.cpp:10: Call(195,196, gtest_ar_:7.~AssertionResult())
toolkit/devtools/server/tests/gtest/DoesCrossZoneBoundaries.cpp:10: Call(196,197, noGC.~AutoCheckCannotGC())

GC Function: _ZNK7testing8internal12AssertHelperaSERKNS_7MessageE|void testing::internal::AssertHelper::operator=(testing::Message*) const
testing::UnitTest* testing::UnitTest::GetInstance()
void testing::UnitTest::UnitTest()
void testing::UnitTest::UnitTest()
void testing::internal::UnitTestImpl::UnitTestImpl(testing::UnitTest*)
void testing::internal::UnitTestImpl::UnitTestImpl(testing::UnitTest*)
void testing::TestEventListeners::SetDefaultResultPrinter(testing::TestEventListener*)
void testing::internal::TestEventRepeater::~TestEventRepeater()
void testing::internal::TestEventRepeater::~TestEventRepeater(int32)
void testing::internal::TestEventRepeater::~TestEventRepeater(int32)
void testing::internal::ForEach(class std::vector<testing::TestEventListener*>*, (void)(testing::TestEventListener*)*) [with Container = std::vector<testing::TestEventListener*>; Functor = void (*)(testing::TestEventListener*)]
_Funct std::for_each(_IIter, _IIter, _Funct) [with _IIter = __gnu_cxx::__normal_iterator<testing::TestEventListener* const*, std::vector<testing::TestEventListener*> >; _Funct = void (*)(testing::TestEventListener*)]
IndirectCall: __f
Flags: needinfo?(sphink)
Attachment #8591746 - Attachment is obsolete: true
Attachment #8591746 - Flags: feedback?(sfink)
Attachment #8595016 - Flags: review?(sphink)
(In reply to Nick Fitzgerald [:fitzgen] from comment #185)
> Latest try push:
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=c3b66566179e

Green!
Attached patch combined.patch (obsolete) — Splinter Review
Combined patch of my whole queue for sfink.
Attached patch combined.patchSplinter Review
For real this time
Attachment #8595475 - Attachment is obsolete: true
Comment on attachment 8595016 [details] [diff] [review]
Part 14: Ignore protobuf indirect calls in the GC hazard analysis

Review of attachment 8595016 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/devtools/rootAnalysis/annotations.js
@@ +174,5 @@
>      // And these are workarounds to avoid even more analysis work,
>      // which would sadly still be needed even with bug 898815.
>      "void js::AutoCompartment::AutoCompartment(js::ExclusiveContext*, JSCompartment*)": true,
> +
> +    "void test::RingbufferDumper::OnTestPartResult(testing::TestPartResult*)" : true,

Can you comment this one? The rest seem adequately commented.
Attachment #8595016 - Flags: review?(sphink) → review+
Follow-up to fix the xpcshell tests that got broken by some unrelated change jorendorff snuck in before me on inbound ;)

remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/53058615e7a3
Flags: needinfo?(sphink)
Sorry, I had to back out these changes in https://hg.mozilla.org/mozilla-central/rev/0b202671c9e2 since one of them seems to have caused a Linux PGO bustage like:
https://treeherder.mozilla.org/logviewer.html#?job_id=1380632&repo=mozilla-central
Status: RESOLVED → REOPENED
Flags: needinfo?(nfitzgerald)
Resolution: FIXED → ---
needinfo :bsmedberg for ideas on why toolkit/crashreporter/test/unit/test_crashreporter_crash.js might be breaking on Linux PGO (see comment 193).

I saw a .proto protobuf message description file in toolkit/crashreporter/google-breakpad and suspected that this was related to upgrading the protobuf lib, but it turns out that file is for the google-breakpad server only, and we shouldn't be compiling it to .cc/.h files for client Firefox builds.
Flags: needinfo?(benjamin)
That test does what it says on the tin -- spawns an xpcshell subprocess, enables crash reporting in it, and then intentionally crashes it to ensure that we get a minidump out. The failure mode there indicates that we're probably crashing (again) during minidump writing. I'm not sure why your patch set would cause that.

If you can spin another Linux build with debug symbols enabled (https://wiki.mozilla.org/ReleaseEngineering/TryServer#Getting_debug_symbols) you might be able to run the test locally (by downloading the build + test packages, etc.) and get a core dump.
Flags: needinfo?(benjamin)
Status update: I got a VM set up and made a PGO build, but haven't been able to reproduce the crash. I'm using this as my mozconfig:

> mk_add_options MOZ_PGO=1
> mk_add_options PROFILE_GEN_SCRIPT='$(PYTHON) $(MOZ_OBJDIR)/_profile/pgo/profileserver.py'
Like I said in comment 197, if you use that patch to enable uploading debug symbols from your try build it'd be easier to debug the binaries on another machine after-the-fact.
Thanks, Ted. Somehow I missed that the first time around. Here is a try push that fixes the rebase issue that broke the last try push's build, with debug symbols enabled:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=65da11dacd10
Alright, I was able to run the test locally with the build from try after following the directions here[0], and I can't reproduce the test failure from try.

> 15:32:36 INFO - TEST-PASS | toolkit/crashreporter/test/unit/test_crashreporter_crash.js | took 1016ms

I guess it is time to start print debugging via try push :-/

[0] https://wiki.mozilla.org/ReleaseEngineering/Mozharness/How_to_run_tests_as_a_developer
Try push that puts the minidumps in the upload directory instead of a temp directory: https://treeherder.mozilla.org/#/jobs?repo=try&revision=5f3ae6c78bd3
Got caught by bug 1165184. Here is a new try push w/ the patch from that bug: https://treeherder.mozilla.org/#/jobs?repo=try&revision=a1332222a000
Ok, so I overlooked that more than one place tries to use the temp dir to find the minidump. This new push should handle that, and also not explicitly delete the dump files after the tests. I also disabled the other crashreporter tests that weren't failing.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=9b9351dfe35d
The generated minidump file *does* have the "MDMP" header signature that the lib was complaining was missing when I inspect it manually, so things are going wrong somewhere after generating the dump file -- somewhere around when we read it back in and start parsing it.

It's strange that it would only be a problem on Linux PGO and not on normal Linux or Linux64, etc. This makes me think something funkier than a minidump parser bug is going on; perhaps a (PGO-related?) compiler bug? This is also code I didn't touch in these patches, nor does the crashing stack for these tests go through code I touched in these patches, which I think lends more weight to this hypothesis.

Adding more logging to determine that we are trying to read the correct minidump file, logging the header of the minidump file from JS before passing it into the minidump C++ lib, and a few other random things:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=7bcacd1bc30b
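For reference, that manual inspection boils down to checking the 4-byte magic at the start of the file; a standalone sketch:

#include <cstdio>
#include <cstring>

// A valid minidump begins with the 4-byte signature "MDMP"
// (MINIDUMP_HEADER::Signature == 0x504d444d, little-endian).
static bool hasMinidumpSignature(const char* path) {
  FILE* f = fopen(path, "rb");
  if (!f)
    return false;
  char magic[4];
  size_t n = fread(magic, 1, sizeof(magic), f);
  fclose(f);
  return n == sizeof(magic) && memcmp(magic, "MDMP", sizeof(magic)) == 0;
}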
Alright, so I got my logging added and now the test passes(!):
https://treeherder.mozilla.org/#/jobs?repo=try&revision=a7269851275e

Race condition in crash reporter code? Compiler bug w/ PGO?
Gross. I'd try adding `NO_PGO = True` to https://dxr.mozilla.org/mozilla-central/source/toolkit/crashreporter/test/moz.build, to see if building the utility library that reads the minidump without PGO makes a difference. If so, let's just do that.
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #208) > Gross. I'd try adding `NO_PGO = True` to > https://dxr.mozilla.org/mozilla-central/source/toolkit/crashreporter/test/ > moz.build, to see if building the utility library that reads the minidump > without PGO makes a difference. If so let's just do that. Splendid idea! Thank you!
Ok, so that try push that passed had 3 changes:

1. New logging in the Minidump lib (C++)
2. New logging in the failing test (JS)
3. Disabled sibling tests in the failing test's directory

So I'm going to try three new try pushes, each with one of those things undone.

* Running all tests (but keep all logging):
  https://treeherder.mozilla.org/#/jobs?repo=try&revision=ee7aaa3cb69e
* Disable C++ logging:
  https://treeherder.mozilla.org/#/jobs?repo=try&revision=b21b72517311
* Disable JS logging:
  https://treeherder.mozilla.org/#/jobs?repo=try&revision=5ad0e11986c7

And for good measure, here is what Ted suggested in comment 208 (disabling PGO for the Minidump lib, none of the logging or test changes from above):
https://treeherder.mozilla.org/#/jobs?repo=try&revision=9af9b9106fad
(In reply to Nick Fitzgerald [:fitzgen][:nf] from comment #210)

Alright, so all 4 try pushes are green. Maybe whatever was triggering this behavior changed on the latest m-c? Here is a try push with only my original commits:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=8784fb6f1f9e

If that isn't green, then I will go forward with disabling PGO for minidump.
(In reply to Nick Fitzgerald [:fitzgen][:nf] from comment #211)
> (In reply to Nick Fitzgerald [:fitzgen][:nf] from comment #210)
>
> Alright, so all 4 try pushes are green. Maybe whatever was triggering this
> behavior changed on latest m-c? Here is a try push with only my original
> commits: https://treeherder.mozilla.org/#/jobs?repo=try&revision=8784fb6f1f9e
>
> If that isn't green, then I will go forward with disabling PGO for minidump.

Ok, so that push is green. I guess the conditions that triggered the PGO failure in crashreporter no longer exist on newer m-c? Yay?

Here is a sanity try push for everything this series touches; it looks optimistic, but unfortunately there are infra issues with the Windows builds at the moment:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=b03de5952e12

Assuming those infra issues get fixed, I can retrigger, and if the Windows builds are green, I'm going to (re)land this!
Depends on: 1169974
Blocks: heapprof