Closed Bug 717861 Opened 8 years ago Closed 8 years ago

GetVsize and GetResident underestimate memory size on Solaris

Categories

(Toolkit :: about:memory, defect)

All
Solaris
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla12

People

(Reporter: ginnchen+exoracle, Assigned: ginnchen+exoracle)

References

Details

Attachments

(1 file, 2 obsolete files)

2.55 KB, patch
justin.lebar+bug
: review+
Details | Diff | Splinter Review
It seems fstat() returns a smaller value than actual size for /proc/self/xmap.
We can use a loop and double nmap for each time to get the real size.
Attached patch patch (obsolete) — Splinter Review
Attachment #588318 - Flags: review?(khuey)
Comment on attachment 588318 [details] [diff] [review]
patch

I'm going to have Justin take this one.
Attachment #588318 - Flags: review?(khuey) → review?(justin.lebar+bug)
Assignee: ginn.chen → nobody
Component: XPCOM → about:memory
Product: Core → Toolkit
QA Contact: xpcom → about.memory
Assignee: nobody → ginn.chen
FYI, with the changes in bug 715453, the 'explicit' tree in about:memory will become useless on platforms without malloc_usable_size.  I believe this includes Solaris.
This code loops to find the length of xmap by reading in increasingly large chunks of the file.  Why not instead just read the file in chunks, summing vsize and RSS as you go?  Then you wouldn't have to malloc anything. 

(The only tricky part is if read() doesn't return a full prxmap_t, so you end up reading half a map.  But the kernel probably guarantees this won't happen.)
r- because I'd prefer the approach in comment 4, and anyway if we stayed with this approach, it would need a comment explaining why fstat and summing as you go doesn't work.
Attachment #588318 - Flags: review?(justin.lebar+bug) → review-
(In reply to Justin Lebar [:jlebar] from comment #3)
> FYI, with the changes in bug 715453, the 'explicit' tree in about:memory
> will become useless on platforms without malloc_usable_size.  I believe this
> includes Solaris.

With --enable-jemalloc, it should be fine on Solaris.

(In reply to Justin Lebar [:jlebar] from comment #4)
> This code loops to find the length of xmap by reading in increasingly large
> chunks of the file.  Why not instead just read the file in chunks, summing
> vsize and RSS as you go?  Then you wouldn't have to malloc anything. 
> 
> (The only tricky part is if read() doesn't return a full prxmap_t, so you
> end up reading half a map.  But the kernel probably guarantees this won't
> happen.)

The problem is we don't know if /proc/self/xmap changed during several reading of prxmap_t structures chunks.
So we should try to read the whole file in one time.

The time cost of the XMappingIter() is determined by the number of calls to read().
read() to xmap costs about 30ms on my machine, no matter the length you read.
So from the aspect of time consuming, the two implementations have no difference.
> With --enable-jemalloc, it should be fine on Solaris.

Cool; I didn't know that worked there.

> The time cost of the XMappingIter() is determined by the number of calls to read().
> So from the aspect of time consuming, the two implementations have no difference.

How do you figure?  The doubling implementation takes log2(n) reads, while the read-in-chunks implementation takes n/cunksize reads.  Depending on the chunksize and the size of xmaps, log2(n) might be larger or smaller than n/chunksize.

> The problem is we don't know if /proc/self/xmap changed during several reading of prxmap_t 
> structures chunks.

Well, calling malloc can certainly change the structure!  So we're already introducing an inaccuracy there.  It seems possible if not likely that this inaccuracy will be larger than the inaccuracy from a race reading xmap.

But more generally, I'm concerned about possible OOMs here.  This code can malloc an unbounded amount of memory, and we've seen code which does this cause crashes before (e.g.  bug 702217).
I'd r+ this if you caught and handled the malloc OOM case.
Attached patch patch v2 (obsolete) — Splinter Review
handle OOM
Attachment #588318 - Attachment is obsolete: true
Attachment #592647 - Flags: review?(justin.lebar+bug)
> +            if (prmapp) {
> +                free(prmapp);
> +            }

free can safely take a null pointer.

Can you please initialize prmapp to NULL, so one doesn't have to trace through the code to see that we'll never free uninitialized memory?

We canonically return -1 on an error from a memory reporter; can you please change the patch so it does this, rather than returning whatever values it happened to have?

> +                // stat(2) on /proc/<pid>/xmap returns an incorrect value,
> +                // prior to the release of Solaris 11.
> +                // Here is a workaround for it.
> +                nmap *= 2;

If the value returned by stat is correct on Solaris 11, don't you want to nmap *= 2 at the end of the loop?
Attachment #592647 - Flags: review?(justin.lebar+bug) → review-
Attached patch patch v3Splinter Review
> We canonically return -1 on an error from a memory reporter; can you please
> change the patch so it does this, rather than returning whatever values it
> happened to have?

Actually XMappingIter() won't touch Vsize, Resident if something is wrong, and they're set to -1 in GetVsize() and GetResident().
I agree it would be better if I did this at head of XMappingIter().
So I moved it there.

> If the value returned by stat is correct on Solaris 11, don't you want to
> nmap *= 2 at the end of the loop?

Although fstat() returns the correct value on Solaris 11, nmap may not be enough in some cases because we have malloc between fstat() and pread() and other threads may also do malloc/mmap.
So nmap * 2 would save us a pread() in these cases.
Attachment #592647 - Attachment is obsolete: true
Attachment #592968 - Flags: review?(justin.lebar+bug)
Comment on attachment 592968 [details] [diff] [review]
patch v3

I'd not optimize for the case in Solaris 11 where we have a race condition and the value returned by stat is too small.  But this is fine.

Thanks for your patience!
Attachment #592968 - Flags: review?(justin.lebar+bug) → review+
https://hg.mozilla.org/mozilla-central/rev/3e14b079b230
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla12
You need to log in before you can comment on or make changes to this bug.