Add the ability to grow (and shrink) the profile buffer's capacity dynamically

RESOLVED WONTFIX

Status

()

enhancement
P1
normal
RESOLVED WONTFIX
Last year
8 months ago

People

(Reporter: mstange, Assigned: canaltinova)

Tracking

(Blocks 1 bug)

Trunk
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox63 wontfix, firefox64 wontfix, firefox65 affected)

Details

Attachments

(3 attachments, 2 obsolete attachments)

Sometimes you want to be able to capture the entire duration of some workload / benchmark / performance test in the profiler buffer, but you don't know the necessary buffer size upfront. And you don't want to pick a gigantic buffer size because doing so will consume a lot more memory than necessary.

For these use cases, it would be nice if the profiler treated the buffer size as a "maximum size", and allocated just as much as it needed for the current buffer contents.

This means that the profiler will need to re-allocate and memcpy the buffer between samples, whenever necessary. The time needed to do this allocation + memcpy may cause slight gaps between samples, but I'm hoping that this isn't going to be much of a problem: The buffer is usually smaller than 50MB, so the time to do this allocation should be very short.

Another advantage of being able to dynamically change the profile buffer size is the fact that it'll allow us to add a fixed-timeduration-window sampling mode, rather than a fixed-buffersize-window sampling mode.
Blocks: 1476775
Assignee: mstange → nobody
Status: ASSIGNED → NEW
Assignee: nobody → canaltinova
Status: NEW → ASSIGNED
Comment on attachment 8993123 [details]
Bug 1476757 - Add methods to change the capacity of the ProfileBuffer.

https://reviewboard.mozilla.org/r/257946/#review265216

::: tools/profiler/core/ProfileBuffer.cpp:57
(Diff revision 1)
> +
> +bool
> +ProfileBuffer::SetCapacityPow2(uint32_t aCapacity)
> +{
> +  MOZ_RELEASE_ASSERT(aCapacity != 0, "can't set ProfileBuffer capacity to zero");
> +  MOZ_RELEASE_ASSERT((aCapacity & (aCapacity - 1)) == 0, "aCapacity needs to be a power of two");

This assert could use IsPowerOfTwo from MathAlgorithms.h.
Comment on attachment 8993124 [details]
Bug 1476757 - WIP: Grow and shrink the profiler buffer dynamically so as to not waste memory when a large buffer size limit is picked.

https://reviewboard.mozilla.org/r/257948/#review265220

::: tools/profiler/core/platform.cpp:320
(Diff revision 1)
>  #endif
>  };
>  
>  CorePS* CorePS::sInstance = nullptr;
>  
> +static const uint32_t kInitialProfileBufferCapacity = 1024;

1024 is probably too low - If the first sample uses more than 1024 entries, we'll miss part of it, because we only start increasing the buffer size once we the buffer is 90% full after a sample.
We should take some measurements of how many entries are taken up by a single sample if you profile many threads, and then set the minimum size to something that definitely provides enough space for the first sample.

::: tools/profiler/core/platform.cpp:2228
(Diff revision 1)
>          // SuspendAndSampleAndResumeThread, which is why it is done here.
>          CorePS::Lul(lock)->MaybeShowStats();
>  #endif
>        }
> +
> +      ActivePS::EnsureAdequateBufferCapacity(lock);

There should be a comment here that says that this needs to be done outside the profiler's "critical section". The buffer entries are added just after stackwalking *while the target thread is suspended*, and during that time we can't allocate, so we can't grow the buffer as we add the entries. Instead, we need to grow the buffer ahead of time, by using some heuristic to predict whether growing the buffer is necessary.
Blocks: 1378796
Priority: -- → P1
Attachment #8993123 - Attachment is obsolete: true
Attachment #8993124 - Attachment is obsolete: true
I spent quite some time looking at these patches, and I've left some comments. A lot of them are nitpicking variable names, because I find this kind of buffer-resize code is much clearer if the names are good.

But more generally, I'm left scratching my head about the notion of a resizeable circular buffer. If you are willing to reallocate the buffer any time it fills up, does it need to be circular? I guess the idea is that there is a maximum, but you can set it large and hopefully you won't hit it? But I wonder if it would be better to just not impose an upper limit, drop the circularity, and be prepared to handle OOM if necessary.
To put it another way: a resizeable circular buffer seems like the worst of all worlds. You have the complexity of the circularity, you have the complexity of resizing (including OOM handling), and you still have to choose some kind of upper memory limit.
That's a good point. I think this is mostly a problem with the justification that I wrote in comment 0. My *real* motivation for this change is bug 1476775. In that bug, the patches implement the "fixed duration" by evicting too-old buffer entries, simply by moving mRangeStart past them. With the circular buffer, both eviction and addition of entries is very efficient because most of the time, the buffer stays at the same size, and the buffer contents aren't moved in memory.
What I didn't realize was that e.g. std::dequeue<ProfileEntry> can do this just as well... with the exception that its indexed access would be a little slower because it can't take advantage of the "& mEntryIndexMask" trick and because "typical implementations use a sequence of individually allocated fixed-size arrays, with additional bookkeeping, which means indexed access to deque must perform two pointer dereferences, compared to vector's indexed access which performs only one." [1]

So I think there's still value in going the route that I chose in these patches.
Put another way: The patches in this bug make it look like the circular buffer is used much like an append-only std::vector, but the eviction piece in bug 1476775 makes it so that we use it in a circular fashion even when we're not at the maximum capacity.

[1] https://en.cppreference.com/w/cpp/container/deque
I fixed the nits and other comments and added another patch that changes {a,m}EntrySize to {a,m}Capacity in ProfileBuffer.

(In reply to Nicholas Nethercote [:njn] from comment #8)
> To put it another way: a resizeable circular buffer seems like the worst of
> all worlds. You have the complexity of the circularity, you have the
> complexity of resizing (including OOM handling), and you still have to
> choose some kind of upper memory limit.

I didn't change the `resizeable circular buffer` bits because of the Markus' comment above. We need it to behave like that because of the bug 1476775.
Pushed by canaltinova@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/1924a400dfa4
Change {a,m}EntrySize to {a,m}Capacity in ProfileBuffer r=njn
https://hg.mozilla.org/integration/autoland/rev/ef82ba4b7f22
Add methods to change the capacity of the ProfileBuffer. r=njn
https://hg.mozilla.org/integration/autoland/rev/3428510869a9
Grow and shrink the profiler buffer dynamically so as to not waste memory when a large buffer size limit is picked. r=njn
https://hg.mozilla.org/mozilla-central/rev/1924a400dfa4
https://hg.mozilla.org/mozilla-central/rev/ef82ba4b7f22
https://hg.mozilla.org/mozilla-central/rev/3428510869a9
Status: ASSIGNED → RESOLVED
Closed: 10 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla65
Backout by canaltinova@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/4ee93e21a764
Backed out changeset 3428510869a9 r=mstange
https://hg.mozilla.org/integration/mozilla-inbound/rev/d31c758d9cb4
Backed out changeset ef82ba4b7f22 r=mstange
Depends on: 1503411
(Looks like this should be reopened, given the backout in comment 14.)
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Target Milestone: mozilla65 → ---
We have no intentions to land that again since resizable circular buffer wasn't a good idea. Closing that now.
Status: REOPENED → RESOLVED
Closed: 10 months ago8 months ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.