Open Bug 1806054 Opened 2 years ago Updated 4 months ago

Memory profiling uses the expensive malloc_usable_size function.

Categories

(Core :: Gecko Profiler, defect, P2)

defect

Tracking

()

ASSIGNED

People

(Reporter: pbone, Assigned: pbone)

References

Details

(Whiteboard: [sp3])

Attachments

(1 file)

Memory profiling uses malloc_usable_size to determine a cell's actual size. This can be slow and distorts profiles.

https://searchfox.org/mozilla-central/source/tools/profiler/core/memory_hooks.cpp#385

Hey Paul, do you have a suggestion to avoid doing this?

Yeah, I do, I'm working on a patch for it now. The first call in the allocator can be replaced with the malloc_good_size call which is faster. And I'll truy to write a faster version of the 2nd call in the free hook. It won't be perfect, to do the testing I need (comapring memory allocators) I'll have to turn off allocation profiling. But it'll be better for other users of the profiler.

Hi Julien,

The call to MallocUsableSize here can't be made faster right now: https://searchfox.org/mozilla-central/source/tools/profiler/core/memory_hooks.cpp#429 Because the pointer might belong to PHC, so our code to ask jemalloc, but let jemalloc fail if it doesn't own the memory works (and returns a size of 0 if PHC owns the memory).

We could perhaps speed it up if we could free the memory before running FreeCallback(). I guess FreeCallback() never de-references the pointer, why would it? But it could run in parallel with another thread that gets given that pointer for a new allocation?

I'm not sure how much effort is worth-while. Especially since to move on with my profiling I can disable memory profiling and get the results I need.

Flags: needinfo?(felash)

Currently we take for granted that enabling memory profiling skews all the other results, so we advise our users to choose one or the other. It would be great if that wouldn't be the case, but to tell you the truth we don't have anybody in our team to take on this work currently.

This means that I'd be glad if you want to do more work, but it's also completely OK that you want to move on and work on your stuff instead :-)

Flags: needinfo?(felash)

Florian told me that this code runs by default, I was missing that this is used for the counters in addition of the memory profiling using stack captures.

I think that the overhead isn't usually a problem because it's more or less constant. But feel free to disagree with me.

This makes me a bit concerned about the overhead from some of the profiling I did for allocation heavy code where the allocations were small but frequent. Most of the CPU time was spent in allocations and de-allocations, and this may have been an artificial reading due to memory counters. If I were to repeat the results I would want the ability to turn memory hooks off (which we can't do). The profiling I was doing was for ICU, which is used to back all of the Intl calls in Firefox.

These results justified investment in code that would be zero-alloc in order to mitigate these issues.

See Also: → 1806752

(In reply to Julien Wajsberg [:julienw] from comment #5)

This means that I'd be glad if you want to do more work, but it's also completely OK that you want to move on and work on your stuff instead :-)

No worries, I'll keep it on my list for now.

The following patch is waiting for review from a reviewer who resigned from the review:

ID Title Author Reviewer Status
D165111 Bug 1806054 - Use the faster malloc_good_size call r=julienw pbone julienw: Resigned from review

:pbone, could you please find another reviewer?

For more information, please visit auto_nag documentation.

Flags: needinfo?(pbone)
Whiteboard: [sp3]
Severity: -- → S3
Priority: -- → P2
Flags: needinfo?(pbone)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: