Open Bug 1757802 Opened 11 months ago Updated 9 days ago

Consider no longer directly associating Shmem instances with actors

Categories

(Core :: IPC, enhancement, P2)

enhancement

Tracking

()

REOPENED

People

(Reporter: nika, Assigned: nika)

References

(Regressed 1 open bug)

Details

Attachments

(1 file)

With this new approach, Shmem instances will now have their handles transferred inline within messages as attachments, rather than being associated with their actors and sent in separate messages.

This has a few advantages:

  • The implementation is much simpler
  • Releasing all references to a Shmem will automatically destroy it by RAII, rather than leaking the shared memory region until the toplevel actor is destroyed, removing the need for types like RaiiShmem.
  • This allows re-transmitting Shmem instances to another process, as we don't close the shared memory region handle upon receiving it.

But also has a disadvantage that because we keep alive the shared memory region's handle until the shmem is destroyed, so that it can be re-transmitted, we may end up using more FDs or HANDLEs while running.

This bug intentionally doesn't change or simplify callsites, removing APIs like RaiiShmem, in order to make it easier to revert if this causes issues on platforms like Linux due to FD exhaustion. If we don't run into increased resource exhaustion problems, we can make these changes in a follow-up.

With this new approach, Shmem instances will now have their handles
transferred inline within messages as attachments, rather than being
associated with their actors and sent in separate messages.

This has a few advantages:

  • The implementation is much simpler
  • Releasing all references to a Shmem will automatically destroy it by
    RAII, rather than leaking the shared memory region until the toplevel
    actor is destroyed, removing the need for types like RaiiShmem.
  • This allows re-transmitting Shmem instances to another process, as we
    don't close the shared memory region handle upon receiving it.

But also has a disadvantage that because we keep alive the shared memory
region's handle until the shmem is destroyed, so that it can be
re-transmitted, we may end up using more FDs or HANDLEs while running.

This patch intentionally doesn't change or simplify callsites, removing
APIs like RaiiShmem, in order to make it easier to revert if this causes
issues on platforms like Linux due to FD exhaustion. If we don't run
into increased resource exhaustion problems, we can make these changes
in a follow-up.

Depends on D140009

Pushed by nlayzell@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/0d23532d6a49
Don't keep alive Shmem shared memory regions on IProtocol, r=ipc-reviewers,jld
Status: NEW → RESOLVED
Closed: 9 months ago
Resolution: --- → FIXED
Target Milestone: --- → 101 Branch
Regressions: 1765299
Regressions: 1765776
Regressions: 1766003
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Target Milestone: 101 Branch → ---
Flags: needinfo?(nika)
No longer regressions: 1765393

There's a r+ patch which didn't land and no activity in this bug for 2 weeks.
:nika, could you have a look please?
If you still have some work to do, you can add an action "Plan Changes" in Phabricator.
For more information, please visit auto_nag documentation.

Flags: needinfo?(nika)
Flags: needinfo?(jld)

There was a regression from the change and we still don't have any idea as to what caused it, so there's not a ton of immediate action we can take.

Flags: needinfo?(nika)

Err actually nevermind, wrong patch. This is blocked on bug 1767514 (and potentially other changes) for now.

Depends on: 1767514
No longer regressions: 1765480
See Also: → 1765480

We still need the ffmpeg update to land something like this, and we may want to also be a bit more clever around the media code etc.

Flags: needinfo?(jld)

(also around graphics potentially)

Depends on: 1765480
See Also: 1765480

Nika, I've updated our vendored ffmepg copy to a recent version, containing the bug fix that was needed for this to not crash, maybe we can reland this?

Flags: needinfo?(nika)
Severity: -- → S2

We're slightly concerned about potential performance issues which might occur due to GMP code expecting the existing behaviour's performance characteristics, so we've been holding off on landing something like this directly. Instead, we've been adding some new Shmem types like BigBuffer or the flexible API for unsafe shred memory from bug 1795311 which are hopefully better suited for the existing uses of Shmem. We might want to then refactor the code out of the core IPC layer, and make it something which GMP deals with directly instead.

Flags: needinfo?(nika)
You need to log in before you can comment on or make changes to this bug.