Closed Bug 1944790 Opened 28 days ago Closed 16 days ago

run a CI memory test

Categories

(Core :: Machine Learning, enhancement)

Tracking

RESOLVED FIXED
137 Branch
Tracking Status
firefox137 --- fixed

People

(Reporter: tarek, Assigned: tarek)

References

Details

(Whiteboard: [genai])

Attachments

(1 file)

The memory snapshot we're currently collecting will not provide the peak RSS for the inference process.

To get it, we could run the performance test with --gecko-profile and collect the generated profile to extract the value.

In practice, we need a hook that gets the profile JSON, extracts the data, and publishes it as a performance metric.

This is the script to extract peak usage:

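    import json
    import sys

    # Load the profile JSON passed as the last command-line argument.
    with open(sys.argv[-1]) as f:
        data = json.load(f)


    def getPeakMem(process):
        """Print the peak of the cumulative "malloc" counter for one process."""
        peak = 0  # renamed from "max" to avoid shadowing the builtin
        current = 0
        pid = 0
        processName = ""

        # The GeckoMain thread carries the pid and the process name.
        for thread in process["threads"]:
            if thread["name"] == "GeckoMain":
                pid = thread["pid"]
                processName = thread.get("processName", "unknown")
                break

        for counter in process["counters"]:
            if counter["name"] == "malloc":
                # sample[1] is summed as a delta; track the running maximum.
                for sample in counter["samples"]["data"]:
                    current += sample[1]
                    if current > peak:
                        peak = current

                print(
                    f"[{pid}][{processName}] Peak memory allocation {peak / (1024 * 1024)}"
                )


    # The top-level object is handled first, then each entry in "processes".
    getPeakMem(data)

    for process in data["processes"]:
        getPeakMem(process)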

Example on a perf test:

    ➜  python3 peak.py profile.json
    [51064][Parent Process] Peak memory allocation 298.32491302490234
    [51065][unknown] Peak memory allocation 35.444618225097656
    [51070][Utility Process] Peak memory allocation 35.41194152832031
    [51067][unknown] Peak memory allocation 35.460235595703125
    [51084][Web Content] Peak memory allocation 37.041725158691406
    [51073][Inference] Peak memory allocation 479.8895950317383
    [51083][Web Content] Peak memory allocation 37.04314422607422
    [51074][Web Content] Peak memory allocation 37.16900634765625
    [51072][Web Content] Peak memory allocation 39.958473205566406
    [51066][WebExtensions] Peak memory allocation 52.40960693359375
    [51068][Web Content] Peak memory allocation 40.31774139404297
    [51071][Privileged Content] Peak memory allocation 56.09490203857422
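
For the "publish it as a performance metric" step, a hook would probably want values back rather than printed lines. Below is a minimal sketch of the same computation returning a list of dicts. The function names and the dict keys (`pid`, `process`, `peak_mib`) are hypothetical and not part of any existing perftest API; the profile layout is assumed to be the one the script above reads.

```python
import json


def peak_malloc_bytes(process):
    """Return the peak of the cumulative "malloc" counter, or None if absent."""
    for counter in process.get("counters", []):
        if counter["name"] != "malloc":
            continue
        current = peak = 0
        for sample in counter["samples"]["data"]:
            current += sample[1]  # same column the script above accumulates
            peak = max(peak, current)
        return peak
    return None


def process_label(process):
    """Return (pid, processName) taken from the GeckoMain thread."""
    for thread in process.get("threads", []):
        if thread["name"] == "GeckoMain":
            return thread["pid"], thread.get("processName", "unknown")
    return 0, "unknown"


def collect_peaks(profile):
    """Walk the top-level process and its children; return metric dicts."""
    metrics = []
    for process in [profile, *profile.get("processes", [])]:
        peak = peak_malloc_bytes(process)
        if peak is None:
            continue  # no malloc counter recorded for this process
        pid, name = process_label(process)
        metrics.append(
            {"pid": pid, "process": name, "peak_mib": peak / (1024 * 1024)}
        )
    return metrics


def load_and_collect(path):
    """Convenience wrapper: read a profile JSON file and collect the peaks."""
    with open(path) as f:
        return collect_peaks(json.load(f))
```

A hook could then filter the result for the process named "Inference" and hand `peak_mib` to whatever reporting layer the test uses.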
Depends on: 1944913

So this is not a great source of truth for anything running in Wasm. I ran into this issue earlier while doing some memory analysis on the translations engine. This data comes from the malloc hooked by the profiler. Unfortunately, most of the Wasm memory does not go through this malloc site and uses another syscall that is not instrumented. On top of that, Wasm will reserve the memory, but it doesn't become real memory until it is committed (at least this is my understanding).

Because of this, it's important to query the actual memory usage using some other OS-level system utility. In my manual tests on macOS I use Activity Monitor and record the memory usage there. Erik implemented this check using: https://searchfox.org/mozilla-central/rev/c5432a86ece2ce8671e7aefbe43fed9a10151227/browser/components/translations/tests/browser/head.js#704-737

I'm wrapping up my day, but we should probably audit that it's reporting the correct thing. That said, the numbers in our perf alerts seem similar to what I've been seeing.

Also see Bug 1811927 for better hooks into Wasm memory.
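
To illustrate the reserve-versus-commit point outside of Firefox, here is a small standalone sketch. It is not how the Wasm engine allocates memory; it only shows that an anonymous mapping typically does not count toward RSS until its pages are touched, which is why an OS-level reading is needed. It assumes a Unix-like system (the `resource` module is not available on Windows), and the 64 MiB size is arbitrary.

```python
import mmap
import resource

SIZE = 64 * 1024 * 1024  # arbitrary demo size


def peak_rss():
    """Peak RSS of this process as reported by the OS.

    Units are platform dependent: kilobytes on Linux, bytes on macOS.
    """
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def demo(size=SIZE):
    """Return peak RSS before mapping, after mapping, and after touching."""
    before = peak_rss()
    region = mmap.mmap(-1, size)  # anonymous mapping, not yet touched
    after_map = peak_rss()
    for offset in range(0, size, mmap.PAGESIZE):
        region[offset] = 1  # writing to a page makes the OS back it
    after_touch = peak_rss()
    region.close()
    return before, after_map, after_touch


if __name__ == "__main__":
    before, after_map, after_touch = demo()
    print(f"before={before} after_map={after_map} after_touch={after_touch}")
```

On typical Linux and macOS setups the jump should show up between `after_map` and `after_touch`, not between `before` and `after_map`.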

I've implemented something very similar here: https://searchfox.org/mozilla-central/source/toolkit/components/ml/tests/browser/head.js#410-445

Another option that would work on all platforms would be to use psutil in the background inside a perftest hook to watch the inference process and grab values every second or so.

I will try a different approach based on hooks, by running a psutil loop in https://searchfox.org/mozilla-central/source/toolkit/components/ml/tests/tools/hooks_local_hub.py
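
A rough sketch of what such a polling loop could look like, assuming the hook already knows the PID of the inference process (how it finds that PID is left out here). `PeakSampler` and `psutil_rss_reader` are hypothetical names, not code from hooks_local_hub.py; the only psutil calls used are `psutil.Process(pid)` and `memory_info().rss`.

```python
import threading


class PeakSampler:
    """Poll a reader callable in a background thread and keep the maximum."""

    def __init__(self, read_value, interval=1.0):
        self._read_value = read_value
        self._interval = interval
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self.peak = 0
        self.samples = 0

    def _run(self):
        while True:
            try:
                value = self._read_value()
            except Exception:
                break  # e.g. the watched process has exited
            self.peak = max(self.peak, value)
            self.samples += 1
            # Event.wait doubles as an interruptible sleep.
            if self._stop.wait(self._interval):
                break

    def start(self):
        self._thread.start()
        return self

    def stop(self):
        """Stop polling and return the highest value seen."""
        self._stop.set()
        self._thread.join()
        return self.peak


def psutil_rss_reader(pid):
    """Build a reader returning the RSS in bytes of `pid` (needs psutil)."""
    import psutil  # third-party; imported lazily so the sampler has no hard dependency

    proc = psutil.Process(pid)
    return lambda: proc.memory_info().rss
```

Usage would be along the lines of `sampler = PeakSampler(psutil_rss_reader(pid)).start()` before the test runs and `peak_bytes = sampler.stop()` afterwards. With a one-second interval, short spikes between two samples can be missed, so the reported peak is a lower bound.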

(In reply to Tarek Ziadé (:tarek) from comment #3)

> I've implemented something very similar here: https://searchfox.org/mozilla-central/source/toolkit/components/ml/tests/browser/head.js#410-445

I based mine off of yours, so that makes sense.

Please keep me up to date on your findings here and if we should measure memory in more than one way in Translations as well.

I'll follow along on this bug.

Whiteboard: [genai]
Assignee: nobody → tziade
Attachment #9464648 - Attachment description: WIP: Bug 1944790 - run a CI memory test → Bug 1944790 - run a CI memory test
Attachment #9464648 - Attachment description: Bug 1944790 - run a CI memory test → Bug 1944790 - run a CI memory test r?nordzilla,atossou
Pushed by tziade@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/e9c59acf56eb run a CI memory test r=afinder,atossou,nordzilla,perftest-reviewers
Status: NEW → RESOLVED
Closed: 16 days ago
Resolution: --- → FIXED
Target Milestone: --- → 137 Branch
Blocks: 1947840
Regressions: 1949171
See Also: → 1950558