Closed Bug 1758643 Opened 3 years ago Closed 3 years ago

MOZ_RELEASE_ASSERT(aLength <= kMax) (string is too large) when capturing a profile

Categories

(Core :: Gecko Profiler, defect, P2)

Tracking

RESOLVED FIXED
104 Branch
Tracking Status
firefox104 --- fixed

People

(Reporter: florian, Assigned: mozbugz)

References

(Depends on 1 open bug)

Attachments

(4 files)

I crashed when trying to capture a long profile yesterday (https://crash-stats.mozilla.org/report/index/ea53d488-6637-4447-86e5-aa33e0220308) and today (https://crash-stats.mozilla.org/report/index/86471d1b-6c2d-415d-84c8-7d4df0220309).

Is it possible that something changed recently that makes us attempt to generate JSON profiles that are larger than 2GB? https://searchfox.org/mozilla-central/rev/131f3af9a49d2203adb7b7ef30dcc37c9f1aa10b/xpcom/string/nsTStringRepr.h#65-69,84,87
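For reference, here's a minimal stand-alone sketch of that cap; the real constant and assertion live in nsTStringRepr.h at the link above, and the exact value here is an approximation:

```cpp
#include <cassert>
#include <cstddef>

// Approximate stand-in for the check in nsTStringRepr.h: nsCString lengths
// are stored in 32 bits, and a release assertion rejects anything above
// roughly 2^31 bytes, which is what fires in the crash reports above.
constexpr size_t kMaxStringLength = (size_t(1) << 31) - 1;  // ~2GB, approximate

void CheckLength(size_t aLength) {
  // Stand-in for MOZ_RELEASE_ASSERT(aLength <= kMax, "string is too large").
  assert(aLength <= kMaxStringLength && "string is too large");
}
```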

I can't think of obvious recent changes that could explain this. We keep adding new markers and recording longer profiles; maybe it's getting out of control? 😛

I see that this happens when we have received all expected child profiles and we're stitching everything into one big buffer, which is then given for adoption to an nsCString, at which point the >2GB length is a problem.
Assuming we can't avoid using nsCString, nsProfiler::FinishGathering() could detect that the sum of all per-process profiles is too big, and we could drop one or more to fit into the 2GB limit.
(I don't think we could remove the oldest data from each profile, though that would be the ideal solution.)
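A minimal sketch of that guard, with hypothetical names (MaybeAddChildProfile and the constant are illustrative, not actual nsProfiler code):

```cpp
#include "nsString.h"

// Hypothetical guard in the spirit of the suggestion above: skip a child
// profile if appending it would push the combined profile past the cap.
constexpr size_t kMaxProfileLength = (size_t(1) << 31) - 1;  // illustrative

bool MaybeAddChildProfile(nsCString& aCombined, const nsACString& aChild) {
  if (size_t(aCombined.Length()) + aChild.Length() > kMaxProfileLength) {
    return false;  // drop this child profile, keep the rest
  }
  aCombined.Append(aChild);
  return true;
}
```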

Also, other things like bug 1756535 (combining unique strings & stacks from all threads) could help, by reducing the size of each per-process profile.

Locally, it could be useful to dump the giant string to disk before crashing (just after this line), to analyze the JSON profile and see if there are real issues.
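Something like this, hacked in locally, would do (the function name and output path are arbitrary):

```cpp
#include <cstdio>

// Local debugging hack, as suggested above: write the combined JSON to disk
// just before the length check, so the profile can be inspected even when
// the assertion would have fired.
void DumpProfileForDebugging(const char* aData, size_t aLength) {
  if (FILE* f = std::fopen("/tmp/huge-profile.json", "wb")) {
    std::fwrite(aData, 1, aLength, f);
    std::fclose(f);
  }
}
```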

Severity: -- → S4
Priority: -- → P3

Hey Gerald,
this seems to happen a lot more now that we've enabled IPC markers by default.
Should we therefore bump the severity and the priority?

> Assuming we can't avoid using nsCString

Why can't we use a structure that can hold more than 2GB?

This is trivial to reproduce when doing a media profile of more than a few seconds on a reasonably active Web Audio API program such as https://jackschaedler.github.io/karplus-stress-tester/ (click Start audio, then the button 600, then profile, wait 10s, capture). This crashes Firefox after a few seconds on capture. I've also seen 140GB (!) of memory in use in the child process when it doesn't crash.

This is likely caused by IPC markers being collected automatically these days; that's quite a steep increase in the number of markers collected.

(In reply to Julien Wajsberg [:julienw] from comment #2)

> > Assuming we can't avoid using nsCString
>
> Why can't we use a structure that can hold more than 2GB?

Although in that case we might also reach the WebChannel limit, as in https://github.com/firefox-devtools/profiler/issues/4021.

I believe we need to discuss another model to transfer the profile data to the webpage.

Thank you both for the extra info. This certainly needs to be investigated sooner.

Severity: S4 → S3
Priority: P3 → P2

I believe bug 1777890 (just landed) fixes the main cause of this issue: the buffer size-limiting algorithm had not been running at all since April, which would explain the big buffers in processes, and then the too-big strings when collecting profiles...

Please let me know if you still experience this issue or not.
(But I should work on it in any case.)

This will be used in the following patches to more easily reject the promise
with a specific code AND reset gathering, in one call.

Depends on D151902
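Roughly the shape of the change, sketched with stand-ins for the nsProfiler internals (the member names and promise type here approximate, rather than quote, the real code):

```cpp
#include "mozilla/MozPromise.h"
#include "nsString.h"
#include "nsTArray.h"

// Stand-in promise type; the real one lives in nsProfiler.
using GatheringPromise = mozilla::MozPromise<nsCString, nsresult, false>;

class GathererSketch {
 public:
  void ResetGathering(nsresult aRejectionResult) {
    // Reject any pending gathering promise with the given code...
    mPromiseHolder.RejectIfExists(aRejectionResult, __func__);
    // ...and clear the rest of the gathering state in the same call.
    mPendingProfiles.Clear();
  }

 private:
  mozilla::MozPromiseHolder<GatheringPromise> mPromiseHolder;
  nsTArray<nsCString> mPendingProfiles;
};
```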

Assignee: nobody → gsquelart
Status: NEW → ASSIGNED

The promise to be resolved may end up in JavaScript, so we check for the
maximum JS string length, which is below the nsCString max length; this way we
won't fail the too-big-string assertion there.

Depends on D152023
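For illustration, the kind of check involved; the 2^30 figure is my assumption about SpiderMonkey's limit (JSString::MAX_LENGTH in the real code), and the patch uses the actual constant:

```cpp
#include <cstddef>

// Illustrative only: SpiderMonkey caps JS string lengths well below the
// nsCString cap, so checking against the smaller JS limit first means we
// never reach the nsCString too-big-string assertion either.
constexpr size_t kAssumedMaxJSStringLength = size_t(1) << 30;  // assumption

bool FitsInJSString(size_t aProfileLength) {
  return aProfileLength <= kAssumedMaxJSStringLength;
}
```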

Instead of allocating a buffer with the profile and then copying it into an
nsCString (at which point there are two full copies in memory), we resize the
nsCString as needed and directly output the profile data into it.

Depends on D152024
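In other words, sketched with std::string standing in for nsCString (the real patch streams the JSON writer's chunks straight into the target string):

```cpp
#include <cstring>
#include <string>

// Sketch of the single-allocation approach: grow the destination string and
// write each chunk of profile JSON directly into it, instead of building a
// separate full-size buffer and then copying it into the string.
void AppendChunkInPlace(std::string& aOut, const char* aChunk, size_t aLen) {
  const size_t oldLen = aOut.size();
  aOut.resize(oldLen + aLen);                // grow in place, no temp buffer
  std::memcpy(&aOut[oldLen], aChunk, aLen);  // write directly into position
}
```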

Instead of throwing away the whole profile at the end if it's too big, we try
to keep as much data as possible, by only throwing away incoming child profiles
that would make the resulting combined full profile too big.

Depends on D152025

Pushed by gsquelart@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/207232098956
ResetGathering now take the nsresult to use in the promise rejection - r=florian
https://hg.mozilla.org/integration/autoland/rev/0fef9e23cd85
Check for maximum string length before outputting profile JSON - r=florian
https://hg.mozilla.org/integration/autoland/rev/643d16db6266
Avoid double allocation when outputting profile JSON - r=florian
https://hg.mozilla.org/integration/autoland/rev/2ae1019d8ff7
Preemptively discard child profiles that would add too much data - r=florian

I think I just got a profile that would previously have crashed: https://share.firefox.dev/3J90uyQ

I started the profiler with the 'power' preset and started watching a video. After about half an hour, I tried to capture the profile. It took a while, and when I finally got the profile, it was missing the content process that was playing the video, as well as the GPU process.

The profile gathering log contains 2 error messages:
Could not create shmem for profile from pid 28660 (535024313 B)
Could not create shmem for profile from pid 26856 (504297331 B)

Thanks. The system works! 😁

Wow, 500MB...
I guess possible next steps could be:

  • Reduce JSON output for the same data (like in bug 1774329, and more to come).
  • Split profiles into segments that fit in smaller shmems (see the sketch after this list).
  • Remove old data to shrink profiles. This would be difficult and slow!
  • Until then, lower the maximum buffer size?
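For the segmenting bullet, a minimal sketch of the idea (the segment size and types are illustrative, not the actual IPC transport):

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Illustrative only: slice a large profile into fixed-size segments that
// would each fit in a smaller shmem.
std::vector<std::string> SplitIntoSegments(const std::string& aProfile,
                                           size_t aSegmentSize) {
  std::vector<std::string> segments;
  if (aSegmentSize == 0) {
    return segments;
  }
  for (size_t pos = 0; pos < aProfile.size(); pos += aSegmentSize) {
    segments.push_back(
        aProfile.substr(pos, std::min(aSegmentSize, aProfile.size() - pos)));
  }
  return segments;
}
```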