Closed Bug 558385 Opened 14 years ago Closed 6 years ago

Memory profiler should do a better job of accounting for internal non-ActionScript allocations

Categories

(Tamarin Graveyard :: Profiler, defect, P2)

defect

Tracking

(Not tracked)

RESOLVED WONTFIX
Q2 12 - Cyril

People

(Reporter: mike, Assigned: jbahuley)

References

Details

(Whiteboard: MustFixFlex)

The problem:

Currently, the memory profiler primarily exposes only those memory allocations that correlate directly to user-visible ActionScript objects.  It does not account for internal allocations.  (There is the new flash.sampler.sampleInternalAllocs(), which is a good step in the right direction, but doesn't do everything we need -- more on this below.)

For example, if the user writes this:

    var xml:XML = <root><child id="myid"/></root>

we will expose the memory taken by the XMLObject to which the 'xml' variable points, but we do not expose the memory taken by numerous underlying C++ objects:

- The E4XNode representing the <root> element
- The E4XNode representing the <child> element
- The E4XNode representing the "id" attribute

Furthermore, although all the Strings associated with this XML object -- "root", "child", "id", "myid", etc. -- do show up in the profiler, the profiler does *not* show the path to gc root of those strings -- the profiler only shows relationships between objects when one ActionScript-visible object points to another ActionScript-visible object.  In this case, since there is no AS property such as "xml.name" point to the "root" (there is xml.name(), but that's a function call), the tree of objects is not exposed to the user.

Also, there is no way for the profiler to represent the fact that some ActionScript objects might share pointers to underlying C++ objects.  This happens in the XML example: If I write "var xml2 = xml.child", xml2 is pointing to the same <child> E4XNode that xml is pointing to, but the profiler doesn't have a way of expressing that.

And finally, we currently have a pretty fragile mechanism for accounting for the memory taken by each object.  AvmPlusScriptableObject has a bytesUsed() function that each subclass must implement.  String, for example, takes the memory of the StringObject plus the memory of the buffer that holds the actual characters of the string.  But the problem with bytesUsed() is that there is no programmatic way of enforcing that it is returning the correct value.  For example, if a Tamarin developer modifies the implementation of a built-in class, he will probably neglect to modify bytesUsed() to reflect his changes.  This has happened numerous times in the past; see all the bugs I have logged recently against the bytesUsed() functions for String, Object, Array, ByteArray, Vector, XML/XMLList, and so on.

As mentioned above, flash.sampler.sampleInternalAllocs() is a step in the right direction -- it tells the user about all GC allocs that are not correlated with an ActionScript object, and how much memory each of them takes.  However, the user isn't given any clues what to do with that information -- all he knows is that some mysterious object took 10,000 bytes, but he doesn't know what it is or even what other objects point to it.

--

Proposed solution:

The profiler is already capable of exposing internal allocations; it would be quite easy to modify the profiler to also expose the graph of links (pointers) into and out of those internal allocations.  This alone would go a long ways toward addressing this problem.

For example, there could be a new flash.sampler.getInternalMembers() or something.  You pass it one of the "id" values that came back from a NewObjectSample, and it returns a list of id's of other allocations to which that one points.

The implementation of this would be based on refactoring the code in GC::MarkItem() -- in other words, the code used by the GC would be the exact same code used by the profiler to find pointers.  This would help guarantee that if the GC algorithm changed, the profiler's behavior would change as well.

For example, if you turned this on, then the size the profiler sees for an instance of XML would always be a small number (the number of bytes used by the XMLObject); but getInternalMembers() would return an id which corresponds to the E4XNode that the XMLObjects points to.

An optional enhancement would be that there could be artificially manufactured "member names" for some of those pointers internal allocations, and/or artificially manufactured "type names" for some of the internally allocated blocks of memory.  For example, instead of just telling the user, "Your XML variable points to an internal allocation that is 24 bytes in size," we could tell him, "Your XML variable has a pseudo-member called '[node]' that points to an internal allocation of pseudo-type '[E4XNode]' that is 24 bytes in size."

The main disadvantage I can see of doing this is that LOTS of internal allocations will be exposed, and they will routinely be encountered by the user, and they could potentially make the profiling results more confusing.  E.g. take something as simple as an Object -- this could have several underlying allocations in addition to the main ScriptObject:

- a pointer to the delegate
- a pointer to the hashtable of dynamic members
- a pointer to the vtable?  (not sure if that is on the heap)

Almost every object is going to have those.  We'll have to be careful about how we expose all this additional information.
Flags: flashplayer-qrb+
Priority: -- → P2
Target Milestone: --- → flash10.2
Blocks: 564161
Component: Virtual Machine → Profiler
QA Contact: vm → profiler
Adding "MustFixFlex" to whiteboard on behalf of Mayank Kumar <mayankk@adobe.com>
Whiteboard: MustFixFlex
Flags: flashplayer-bug-
Whiteboard: MustFixFlex → MustFixFlex, must-fix-candidate
This should be split into a couple bugs:

1) Wire up internal allocs to their parent AvmPlusScriptableObject with getInteralAllocations or some such

2) Provide meaningful names derived from the c++ names for these edges

3) Can bytesUsed by reengineered to be more fool proof?   If all internal allocs are exposed can we get rid of bytesUsed and rely on GC::Size?

An assumption here is that we can provide all this information and that the consumer of these API's will be able to filter the information but if generated the samples is too expensive we can entertain adding an filtering mechanism to the API.
This bug should become #1 above
Target Milestone: Q3 11 - Serrano → Q1 12 - Brannan
Assignee: nobody → jbahuley
Flags: flashplayer-injection-
Target Milestone: Q1 12 - Brannan → Q2 12 - Cyril
Whiteboard: MustFixFlex, must-fix-candidate → MustFixFlex
Tamarin is a dead project now. Mass WONTFIX.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → WONTFIX
Tamarin isn't maintained anymore. WONTFIX remaining bugs.
You need to log in before you can comment on or make changes to this bug.