Closed Bug 552307 Opened 14 years ago Closed 14 years ago

Profiler incorrectly reports memory taken by XML and XMLList

Categories

(Tamarin Graveyard :: Virtual Machine, defect, P2)

defect

Tracking

(Not tracked)

VERIFIED FIXED
flash10.1

People

(Reporter: mike, Assigned: mike)

References

Details

(Whiteboard: Has patch)

Attachments

(1 file)

Steps to reproduce:
1. Compile and run this progam, which creates large XML and XMLList objects and then asks the profiler how much memory they take:

    import flash.sampler.*;

    var s = "";
    for (var i=1; i<=10000; ++i)
        s += "<child id='" + i + "'>child " + i + "</child>";

    var x = XML("<root>" + s + "</root>");
    trace("XML: " + getSize(x) + " bytes");

    x = XMLList(s);
    trace("XMLList: " + getSize(x) + " bytes");

Actual result:
    XML: 24 bytes
    XMLList: 48 bytes

Expected result:
Both should be much larger than that.
This one is going to take more work to fix than some of the other profiler bugs (but it is still important that we fix it).  What makes it hard is that internally, an XMLObject -- which is a ScriptObject, and therefore something the end user knows about -- keeps its contents as a bunch of E4XNodes, and it will share E4XNodes between different XMLObjects.  For example, if you have this (I think I have this right, but I'm writing it on the fly so I may have a mistake here):

    var x = <root><child/></root>;
    var firstChild = x.*[0];

x is an XMLObject, and firstChild is an XMLObject; but internally, it looks to me like they point to the same E4XNode.

The problem is, E4XNode is a C++-only class that has no ActionScript representation that we can show the end user.  The profiler currently exposes only ActionScript objects.

If E4XNodes were not shared, this wouldn't be a problem; but since they are shared, there is no reasonable way to decide which XMLObject should "own" the bytes of memory taken by that E4XNode.

We will be running into this sort of problem in other places too.  We'll have to come up with a way of dealing with this sort of thing.  Possibly the right approach would be to somehow expose E4XNode to the profiler, but with a special name such as "[E4XNode]" or something; but even doing that will be hard.
I think this one should wait until flash10.2.  As described above, the fix is quite complicated.
Flags: flashplayer-qrb+
Priority: -- → P2
Target Milestone: --- → flash10.2
This is a weird suggestion, but ... I was wondering if, for flash10.1, we should change the behavior so that even though it will still be wrong, it will be wrong in a different and possibly more useful way.

The current behavior is, all XML objects are reported as being 24 bytes in size.  This is greatly underestimating the amount of memory used.

Fixing this is not straightforward, because, as explained above, different XML objects can point into the same underlying C++ structures.

But for the short term (flash10.1), would it be better to ignore that problem, and thus potentially overestimate the amount of memory taken, rather than underestimating it?  For each XML variable, calculate the total memory consumption including all underlying C++ structures.

So if the user has this code:

    var x:XML = /* a large xml object */ ;

... we would correctly report the memory consumption.  But if the user also has this:

    var sublist:XMLList = x.firstlevel.secondlevel;

... then we would report the full recursive size for x, and also the full recursive size for sublist.  The result is that we would be reporting *more* memory consumption than is actually being taken.

The reason this might be the right move is that even though the report is essentially wrong, it might be "less wrong," in that if these XML structures are taking a lot of memory, we will at least be pointing that out to the user.
And for those of you who are now cringing: I want to be clear that this is only a short-term plan.  There is definitely a plan for flash10.2 to fix this, which is to expose all memory allocations to the user, not just those allocations that directly correspond to an ActionScript object.  This will build on the sampleInternalAllocs() call that is already there.
Changing from flash10.2 to flash10.1 after discussion with Dan.
Assignee: nobody → mmoreart
Target Milestone: flash10.2 → flash10.1
Status: NEW → ASSIGNED
Attached patch Proposed patchSplinter Review
Proposed patch.  As described above, this takes a best shot at giving the user meaningful data to work with, even though, strictly speaking, the results are somewhat misleading, because it is possible for XML and XMLList objects to point into the same underlying C++ data structures.

XMLObject::bytesUsed() starts with the E4XNode it is looking at, walks up to the topmost parent of that node's tree, and then adds up the memory taken by the whole tree.

XMLListObject::bytesUsed() does the same thing, for all the nodes in the list; but in the case where some nodes in the list are actually pointing into the same tree of nodes, that tree's memory will only be counted once.
Attachment #436341 - Flags: superreview?(edwsmith)
Attachment #436341 - Flags: review?(treilly)
Alternate thought: What if we continued reporting XML/XMLList as 24/48, then lumping *all* the E4XNode space into a single bin we report to the user separately? A bit ugly, but reinforces the idea that XML storage is potentially a shared resource.
I like that, it exposes the complexity that exists and is unavoidable, and it doesn't lie.
(In reply to comment #7)
> Alternate thought: What if we continued reporting XML/XMLList as 24/48, then
> lumping *all* the E4XNode space into a single bin we report to the user
> separately?

Dan and I talked this over, and I don't think this is going to work, for a couple of reasons:

- The main reason is that although this does alert the user if E4XNodes are eating up a lot of his memory, it doesn't tell him which XML objects those E4XNodes are associated with.  For a small simple app with only a few well-understood XML variables, it will probably be easy for the user to figure out which XML variables are causing the memory consumption; but for a large app, the user won't know what to do to his program to reduce memory consumption.

- There are implementation issues.  I can't see how we would implement this in the 10.1 API.  The 10.1 API is that the profiler calls getSamples(), which returns a stream of all samples that have taken place since beginSampling() was called.  When and where in that list would the block of E4XNodes show up?  The first time the user takes a memory snapshot, getSamples() might have gotten called, and we might be able to construct a fake event representing the E4XNodes; but then later, when a second memory snapshot is taken, how do we do it?  Maybe construct an DeleteObjectSample to pretend the original block of E4XNodes got deleted, and then a NewObjectSample to represent all memory now taken by E4XNodes -- not sure, that might work.  But another issue is, since there would not be a backing ActionScript object for this, we can't construct a fake NewObjectSample unless the profiler has called sampleInternalAllocs(), allowing us to return NewObjectSamples for which 'object' is null -- but then again, if the profiler has called that, then all the E4XNodes are accounted for anyway.
(In reply to comment #9)
> Dan and I talked this over, and I don't think this is going to work, for a
> couple of reasons:

Fair enough. Sounds like there isn't an ideal solution, so we have to choose the approach that is least bad.
The ideal solution should be on the short list for post-argo:  let the memory profiler report about non-as3 objects.  then you [can] get proper reporting for internal and as3 objects, all of which form a full directed/cyclic graph in the general case.  this includes [shared] string buffers, internal VM objects, xml nodes, and so on.  From Mike's description, its beyond Argo, but not a huge effort.
Agreed.  I have not yet logged that as a post-argo enhancement, but I will be doing so soon, and when I do, I'll put a comment here.
Comment on attachment 436341 [details] [diff] [review]
Proposed patch

I don't see anything obviously wrong but you should probably get Werner to look this over too, he's the original e4x author
Attachment #436341 - Flags: review?(treilly) → review+
Attachment #436341 - Flags: review?(wsharp)
Attachment #436341 - Flags: review?(wsharp) → review+
Comment on attachment 436341 [details] [diff] [review]
Proposed patch

nits:
* habitual use of:  for (int i=0; i < someLimitFunction(); i++) should be replaced by for (int i=0, n=someLimitFunction(); i < n; i++), unless you really want to call that function in every iteration.
* needs an overview comment explaining how size is calcuated, and a FIXME pointing to the bug that will make all this better, later, by exposing internal objects to the memory profiler.
Attachment #436341 - Flags: superreview?(edwsmith) → superreview+
Whiteboard: Has patch
(In reply to comment #14)
> (From update of attachment 436341 [details] [diff] [review])
> nits:
> * habitual use of:  for (int i=0; i < someLimitFunction(); i++) should be
> replaced by for (int i=0, n=someLimitFunction(); i < n; i++), unless you really
> want to call that function in every iteration.

You're right of course.  I almost always do that, but in this case I had been attempting to follow the prevailing coding style in those files (plus, all the limit functions I was calling are inline functions); but that's lame.  I'll change my code.

> * needs an overview comment explaining how size is calcuated, and a FIXME
> pointing to the bug that will make all this better, later, by exposing internal
> objects to the memory profiler.

Will add this.  I have logged bug 558385 for the enhancement to expose internal objects to the memory profiler.
Pushed to tamarin-redux-argo 3960:79a8d9d575d0 and tamarin-redux 4370:79a8d9d575d0
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
QA Contact: vm → cpeyer
Verified fixed tr 4709:

XML: 2639408 bytes
XMLList: 2688464 bytes
Status: RESOLVED → VERIFIED
Flags: in-testsuite?
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: