Closed Bug 1796094 Opened 3 years ago Closed 2 years ago

Flag JS out-of-memory crashes as OOM

Categories

(Socorro :: Signature, task, P2)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: gsvelto, Assigned: willkg)

References

Details

Attachments

(2 files)

I noticed that some crashes are caused by the JS engine running out of memory, these often aren't reported as OOM but maybe they should. In particular crashes that have the JSLargeAllocationFailure annotation set, and the ones with the JSOutOfMemory annotation set to "Reporting" or "Reported".

Nicholas, am I interpreting those crash annotations correctly?

Flags: needinfo?(nicolas.b.pierron)

This crash is an example.

Looking at the code, we seems to have multiple OOMState:

    // We are currently reporting the given condition.
    //
    // Suppose a crash report contains "JSLargeAllocationFailure:
    // Reporting". This means we crashed while executing memory-pressure
    // observers, trying to shake loose some memory. The large allocation in
    // question did not return null: it is still on the stack. Had we not
    // crashed, it would have been retried.
    Reporting,

    // The condition has been reported since the last GC.
    //
    // If a crash report contains "JSOutOfMemory: Reported", that means a small
    // allocation failed, and then we crashed, probably due to buggy
    // error-handling code that ran after allocation returned null.
    //
    // This contrasts with "Reporting" which means that no error-handling code
    // had executed yet.
    Reported,

The Reported case seems to be a case where failure from the JS engine are not processed properly.

(In reply to Gabriele Svelto [:gsvelto] from comment #1)

This crash is an example.

In this particular case, the problems comes from RemoteObjectProxyBase::GetOrCreateProxyObject which does not check the error reporting from the JSContext, and simply return without setting aNewObjectCreated, which does not carry the failure condition.

I do not know much about the DOM to answer more on the error handling.

One thing we could consider in the JS engine is to make more API based on mozilla::Result, to carry the OOM errors this way.

Flags: needinfo?(nicolas.b.pierron)

Thanks for the detailed explanation, so the JSLargeAllocationFailure: Reporting case sounds like a genuine OOM, while JSLargeAllocationFailure: Reported is a different kind of bug that should be fixed first, right?

This is what I understand from the comment. Yes.

I can't tell if this is something we want to fix in Socorro or somewhere else. If it's in Socorro, what do we want to do about it?

The idea is to add the OOM | large prefix to crash reports that have the JSLargeAllocationFailure annotation set to Reporting. Those should be out-of-memory crashes that haven't been flagged as such correctly (i.e. they don't also have the OOMAllocationSize annotation populated). That being said since that annotation is not indexed I'm not sure if this is a common occurrence or not.

Grabbing this to do this week if I can.

Assignee: nobody → willkg
Status: NEW → ASSIGNED
Priority: -- → P2
Depends on: 1804095

This was deployed to prod just now in bug #1804294, but I'm not sure what's going on since it didn't seem to change the signature.

Oh, I did "Reported" and not "Reporting". That's a problem. There are probably others. I'll look int oit.

I looked through a few weeks of reports and can't find any examples where JSLargeAllocationFailure=Reporting. I added it to the search index, so I can fix the code now and verify it when we get a crash report that meets the criteria.

You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: