run a CI memory test
Categories
(Core :: Machine Learning, enhancement)
Tracking
 | Tracking | Status |
---|---|---|
firefox137 | --- | fixed |
People
(Reporter: tarek, Assigned: tarek)
Details
(Whiteboard: [genai])
Attachments
(1 file)
The memory snapshot we currently collect does not provide the peak RSS for the inference process.
To get it, we could run the performance test with --gecko-profile
and collect the generated profile to extract the value.
In practice, we need a way to hook in something that reads the profile JSON, extracts the data, and publishes it as a performance metric.
Comment 1•28 days ago (Assignee)
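This is the script to extract peak usage:

import json
import sys

# Load the profile JSON given as the last command-line argument.
with open(sys.argv[-1]) as f:
    data = json.load(f)


def getPeakMem(process):
    peak = 0
    current = 0
    pid = 0
    processName = ""

    # The GeckoMain thread carries the pid and name of its process.
    for thread in process["threads"]:
        if thread["name"] == "GeckoMain":
            pid = thread["pid"]
            processName = thread.get("processName", "unknown")
            break

    # Sum the "malloc" counter samples into a running total and track its maximum.
    for counter in process["counters"]:
        if counter["name"] == "malloc":
            for sample in counter["samples"]["data"]:
                current += sample[1]
                if current > peak:
                    peak = current

    # Divide by 1024 * 1024 to report MiB (assuming the counter is in bytes).
    print(
        f"[{pid}][{processName}] Peak memory allocation {peak / (1024 * 1024)}"
    )


# Report on the top-level process first, then on each entry under "processes".
getPeakMem(data)
for process in data["processes"]:
    getPeakMem(process)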
Example on a perf test:
➜ python3 peak.py profile.json
[51064][Parent Process] Peak memory allocation 298.32491302490234
[51065][unknown] Peak memory allocation 35.444618225097656
[51070][Utility Process] Peak memory allocation 35.41194152832031
[51067][unknown] Peak memory allocation 35.460235595703125
[51084][Web Content] Peak memory allocation 37.041725158691406
[51073][Inference] Peak memory allocation 479.8895950317383
[51083][Web Content] Peak memory allocation 37.04314422607422
[51074][Web Content] Peak memory allocation 37.16900634765625
[51072][Web Content] Peak memory allocation 39.958473205566406
[51066][WebExtensions] Peak memory allocation 52.40960693359375
[51068][Web Content] Peak memory allocation 40.31774139404297
[51071][Privileged Content] Peak memory allocation 56.09490203857422
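The script treats each malloc counter sample as a delta and reports the highest running total. As a minimal sketch of just that step, on made-up numbers: the helper name `peak_from_deltas` and the sample values are hypothetical, and the `(time, delta)` tuple layout is only an assumption that mirrors the script's use of `sample[1]`.

```python
from itertools import accumulate


def peak_from_deltas(deltas):
    """Return the highest running total reached by a sequence of byte deltas.

    The running total starts at 0, so the result is never negative,
    which matches the script initialising its peak to 0.
    """
    return max(accumulate(deltas, initial=0))


# Hypothetical counter samples as (time, delta_bytes); the values are made up.
samples = [(0, 100), (1, 400), (2, -300), (3, 250), (4, -450)]

# Running totals are 100, 500, 200, 450, 0, so the peak is 500.
peak = peak_from_deltas(delta for _, delta in samples)
print(peak)
```

Because only allocations that pass through the instrumented counter appear in the deltas, this peak says nothing about memory obtained some other way.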
Comment 2•24 days ago
This is not a great source of truth for anything running in Wasm. I ran into this issue earlier while doing memory analysis on the translations engine. This data comes from the malloc hooked by the profiler. Unfortunately, most Wasm memory does not go through this malloc site and instead uses another syscall that is not instrumented. On top of that, Wasm reserves memory, but it doesn't become real memory until it is committed (at least, this is my understanding).
Because of this, it's important to query the actual memory usage using some other OS-level utility. In my manual tests on macOS, I record the memory usage shown in Activity Monitor. Erik implemented this check using: https://searchfox.org/mozilla-central/rev/c5432a86ece2ce8671e7aefbe43fed9a10151227/browser/components/translations/tests/browser/head.js#704-737
I'm wrapping up my day, but we should probably audit that it's reporting the correct thing. That said, the numbers in our perf alerts seem similar to what I've been seeing.
Also see Bug 1811927 for better hooks into Wasm memory.
Comment 3•23 days ago (Assignee)
I've implemented something very similar here: https://searchfox.org/mozilla-central/source/toolkit/components/ml/tests/browser/head.js#410-445
Another option that would work on all platforms would be to use psutil in the background, inside a perftest hook, to watch the inference process and grab values every second or so.
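A rough sketch of what such a background watcher could look like, not the code that landed. The class name `PeakRssSampler`, the helper `psutil_rss`, and the injectable `read_rss` parameter are all hypothetical; how the inference process pid is found is left out and assumed to be known. The only psutil call used is `psutil.Process(pid).memory_info().rss`.

```python
import threading


def psutil_rss(pid):
    """Read the resident set size of `pid` in bytes via psutil (third-party)."""
    import psutil  # imported lazily so the sampler has no hard dependency

    return psutil.Process(pid).memory_info().rss


class PeakRssSampler:
    """Poll a process's RSS in a background thread and remember the peak."""

    def __init__(self, pid, interval=1.0, read_rss=psutil_rss):
        self.pid = pid
        self.interval = interval  # seconds between samples
        self.read_rss = read_rss  # hypothetical injection point, eases testing
        self.peak = 0
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        while not self._stop.is_set():
            try:
                self.peak = max(self.peak, self.read_rss(self.pid))
            except Exception:
                # The process is gone or unreadable (for example
                # psutil.NoSuchProcess), so stop polling.
                break
            # Sleep for one interval, waking early if stop() is called.
            self._stop.wait(self.interval)

    def start(self):
        self._thread.start()
        return self

    def stop(self):
        """Stop polling and return the peak RSS seen, in bytes."""
        self._stop.set()
        self._thread.join()
        return self.peak
```

A hook could call `PeakRssSampler(pid).start()` before the run and `stop()` afterwards. Polling can miss spikes shorter than the interval, so the result is a lower bound on the true peak.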
Comment 4•23 days ago (Assignee)
I will try a different approach based on hooks, by running a psutil loop in https://searchfox.org/mozilla-central/source/toolkit/components/ml/tests/tools/hooks_local_hub.py
Comment 5•23 days ago
(In reply to Tarek Ziadé (:tarek) from comment #3)
I've implemented something very similar here: https://searchfox.org/mozilla-central/source/toolkit/components/ml/tests/browser/head.js#410-445
I based mine off of yours, so that makes sense.
Please keep me up to date on your findings here and if we should measure memory in more than one way in Translations as well.
I'll follow along on this bug.
Comment 8•16 days ago
bugherder