Closed Bug 1477512 Opened Last year Closed Last year

Add memory reporter for non-stack thread overhead

Categories

(Core :: XPCOM, enhancement)

enhancement
Not set

Tracking

()

RESOLVED FIXED
mozilla63
Tracking Status
firefox63 --- fixed

People

(Reporter: kmag, Assigned: kmag)

References

(Blocks 1 open bug)

Details

Attachments

(6 files)

Bug 1475899 added a memory reporter for the committed size of thread stacks. That should be the bulk of memory overhead for threads, but we also have heap overhead from nsThread/PRThread wrappers and thread event queues, and kernel overhead for the actual OS threads.

The former is pretty easy to report in a cross-platform way. The latter is variable, but at least on Windows is a known, fixed, per-thread value.
Comment on attachment 8993973 [details]
Bug 1477512: Part 0 - Remove unused nsThread::mEventObservers array.

https://reviewboard.mozilla.org/r/258568/#review265808

::: commit-message-03ed4:2
(Diff revision 1)
> +Bug 1477512: Part 0 - Remove unused nsThread::mEventObservers array. r?erahm
> +

Can you add a note that this was moved in bug 1350432.
Attachment #8993973 - Flags: review?(erahm) → review+
Comment on attachment 8993974 [details]
Bug 1477512: Part 1 - Add memory reporter functions to thread event queues.

https://reviewboard.mozilla.org/r/258570/#review265816

A few minor comments.

::: xpcom/threads/EventQueue.h:45
(Diff revision 1)
>    void SuspendInputEventPrioritization(const MutexAutoLock& aProofOfLock) final {}
>    void ResumeInputEventPrioritization(const MutexAutoLock& aProofOfLock) final {}
>  
> +  size_t SizeOfExcludingThis(mozilla::MallocSizeOf aMallocSizeOf) const override
> +  {
> +    return mQueue.ShallowSizeOfExcludingThis(aMallocSizeOf);

Should we `aMallocSizeof` the entries themselves?

::: xpcom/threads/LabeledEventQueue.h:54
(Diff revision 1)
>    void SuspendInputEventPrioritization(const MutexAutoLock& aProofOfLock) final {}
>    void ResumeInputEventPrioritization(const MutexAutoLock& aProofOfLock) final {}
>  
> +  size_t SizeOfExcludingThis(mozilla::MallocSizeOf aMallocSizeOf) const override
> +  {
> +    return (mEpochs.ShallowSizeOfExcludingThis(aMallocSizeOf) +

Nit: unnecessary parentheses.

::: xpcom/threads/PrioritizedEventQueue.h:88
(Diff revision 1)
> +    n += mHighQueue->SizeOfIncludingThis(aMallocSizeOf);
> +    n += mInputQueue->SizeOfIncludingThis(aMallocSizeOf);
> +    n += mNormalQueue->SizeOfIncludingThis(aMallocSizeOf);
> +    n += mIdleQueue->SizeOfIncludingThis(aMallocSizeOf);
> +
> +    if (mMutex) {

`mMutex` is not owned

::: xpcom/threads/Scheduler.cpp:75
(Diff revision 1)
>    Mutex& MutexRef() { return mLock; }
>  
> +  size_t SizeOfExcludingThis(mozilla::MallocSizeOf aMallocSizeOf) const override
> +  {
> +    return (SynchronizedEventQueue::SizeOfExcludingThis(aMallocSizeOf) +
> +            mQueue->SizeOfIncludingThis(aMallocSizeOf));

nit: parentheses
Attachment #8993974 - Flags: review?(erahm) → review+
Comment on attachment 8993975 [details]
Bug 1477512: Part 2 - Add memory reporting functions to nsThread.

https://reviewboard.mozilla.org/r/258572/#review265838

::: xpcom/threads/nsThread.h:153
(Diff revision 1)
> +
> +  // Returns the size of this object, its PRThread, and its shutdown contexts,
> +  // but excluding its event queues.
> +  size_t ShallowSizeOfIncludingThis(mozilla::MallocSizeOf aMallocSizeOf) const;
> +
> +  size_t SizeOfEventQueues(mozilla::MallocSizeOf aMallocSizeOf) const;

I'm not following the need to split this out, but I'm guessing the following patches will explain it.

::: xpcom/threads/nsThread.cpp:1069
(Diff revision 1)
> +
> +size_t
> +nsThread::SizeOfEventQueues(mozilla::MallocSizeOf aMallocSizeOf) const
> +{
> +  size_t n = 0;
> +  if (mCurrentPerformanceCounter) {

Should this be in `ShallowSizeOfIncludingThis`?
Attachment #8993975 - Flags: review?(erahm) → review+
Comment on attachment 8993976 [details]
Bug 1477512: Part 3 - Add memory reporter for thread heap overhead.

https://reviewboard.mozilla.org/r/258574/#review265848

::: xpcom/base/nsMemoryReporterManager.cpp:1439
(Diff revision 1)
> +    size_t eventQueueSizes = 0;
> +    size_t wrapperSizes = 0;
> +
>      for (auto* thread : nsThread::Enumerate()) {
> +      eventQueueSizes += thread->SizeOfEventQueues(MallocSizeOf);
> +      wrapperSizes += thread->ShallowSizeOfIncludingThis(MallocSizeOf);

Ah it makes sense now, so just the stuff that the nsThread wrappers use.
Attachment #8993976 - Flags: review?(erahm) → review+
Comment on attachment 8993977 [details]
Bug 1477512: Part 4 - Add memory reporter for thread kernel stack sizes.

https://reviewboard.mozilla.org/r/258576/#review265850

::: xpcom/base/nsMemoryReporterManager.cpp:1536
(Diff revision 1)
>        wrapperSizes,
>        "The sizes of nsThread/PRThread wrappers.");
>  
> +#if defined(XP_WIN)
> +    // Each thread on Windows has a fixed kernel overhead. For 32 bit Windows,
> +    // that's 12K. For 64 bit, it's 24K.

We should probably link to our source

::: xpcom/base/nsMemoryReporterManager.cpp:1548
(Diff revision 1)
> +    constexpr size_t kKernelSize = 4 * 1024;
> +#else
> +    constexpr size_t kKernelSize = 8 * 1024;
> +#endif
> +#else
> +    // Elsewhere, just assume that kernel stacks require at least 8K.

I believe OSX is going to be 16K, my source is a little dated though (and it's talking about tasks):

https://books.google.com/books?id=K8vUkpOXhN4C&pg=PA513&lpg=PA513&dq=mach+kernel+thread+stack+size&source=bl&ots=OMkhR-_r-x&sig=SPAq-yiqsVNzNI5rPJUVo26p4zA&hl=en&sa=X&ved=0ahUKEwiM4I7P47PcAhUuHzQIHWP_A0EQ6AEIKTAA#v=onepage&q=mach%20kernel%20thread%20stack%20size&f=false
Attachment #8993977 - Flags: review?(erahm) → review+
Comment on attachment 8993976 [details]
Bug 1477512: Part 3 - Add memory reporter for thread heap overhead.

https://reviewboard.mozilla.org/r/258574/#review265848

> Ah it makes sense now, so just the stuff that the nsThread wrappers use.

Yeah. So, basically, there were two questions I was trying to answer:

1) Should we bother trying to avoid allocating event queues for nsThread wrappers that don't have an XPCOM event loop?
2) If we decide to start creating virtual threads that can dispatch to shared thread pools, are we going to have to worry much about the alloction size of event queues?

So far, I think the answers look like "maybe" and "probably not".
Comment on attachment 8993978 [details]
Bug 1477512: Part 5 - Rearrange the fields of nsThread for better packing.

https://reviewboard.mozilla.org/r/258578/#review265858

::: xpcom/threads/nsThread.h:204
(Diff revision 1)
>    mozilla::Atomic<bool> mShutdownRequired;
> -  MainThreadFlag mIsMainThread;
>  
> +  int8_t   mPriority;
> +
> +  uint8_t  mIsMainThread;

We should really update this to be an `enum class` but that's a follow up at most.
Attachment #8993978 - Flags: review?(erahm) → review+
Comment on attachment 8993974 [details]
Bug 1477512: Part 1 - Add memory reporter functions to thread event queues.

https://reviewboard.mozilla.org/r/258570/#review265816

> Should we `aMallocSizeof` the entries themselves?

There isn't actually any way to iterate over them externally, and they should ideally be pretty transient anyway, so I think it's probably not worth the trouble. Might be worth a follow-up at some point, though.
Comment on attachment 8993975 [details]
Bug 1477512: Part 2 - Add memory reporting functions to nsThread.

https://reviewboard.mozilla.org/r/258572/#review265838

> Should this be in `ShallowSizeOfIncludingThis`?

Maybe. I put it here because it's used by the event processing code, and is more closely related to the event queue than the wrapper. I don't have that strong an opinion on it, though.
Note to performance sheriffs: Like bug 1475899, these patches will also increase the reported Base Content Explicit memory values because of better reporting. This isn't a real regression, just better reporting of the kernel memory associated with threads.
You need to log in before you can comment on or make changes to this bug.