Closed Bug 1452657 Opened 6 years ago Closed 6 years ago

replace ...IAmADoodyhead marker with something like PrivateIPDLInterface

Categories

(Core :: IPC, enhancement, P3)

enhancement

Tracking

()

RESOLVED FIXED
mozilla61
Tracking Status
firefox61 --- fixed

People

(Reporter: froydnj, Assigned: froydnj)

Details

Attachments

(1 file)

The current version is incredibly long (yes, yes, editor auto-complete, blah blah) and the fun-ness of it is IMHO not funny anymore.
That sounds like a good idea.
This finally annoyed me enough today while working on other things.
Attachment #8968222 - Flags: review?(continuation)
Assignee: nobody → nfroyd
Comment on attachment 8968222 [details] [diff] [review]
rename IPDL-private Shmem token struct to something more professional

Review of attachment 8968222 [details] [diff] [review]:
-----------------------------------------------------------------

::: ipc/glue/Shmem.h
@@ +72,5 @@
>    typedef int32_t id_t;
>    // Low-level wrapper around platform shmem primitives.
>    typedef mozilla::ipc::SharedMemory SharedMemory;
>    typedef SharedMemory::SharedMemoryType SharedMemoryType;
> +  struct PrivateIPDLCaller {};

Maybe add a comment here about when it is okay to use this? Maybe just "Don't use this outside of ipc/glue" or ProtocolUtils.cpp or something?
Attachment #8968222 - Flags: review?(continuation) → review+
Pushed by nfroyd@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/0d65829a60e7
rename IPDL-private Shmem token struct to something more professional; r=mccr8
(In reply to Cosmin Sabou [:CosminS] from comment #5)
> I think this push
> https://hg.mozilla.org/integration/mozilla-inbound/rev/
> 0d65829a60e72992181f7536d970f05b6245687b has now generated this new fail
> https://treeherder.mozilla.org/logviewer.html#?job_id=174116886&repo=mozilla-
> inbound&lineNumber=2562
> 
> https://treeherder.mozilla.org/#/jobs?repo=mozilla-
> inbound&fromchange=0d65829a60e72992181f7536d970f05b6245687b&tochange=2eb8596e
> 5602eeed82d6e7033eed210c45ce6136&filter-
> searchStr=b06cd3208b805f1f199f8ce2ada9d44efbb75b73&selectedJob=174116886
> 
> Nathan, do you mind taking a look? Thank you.

I have a very difficult time seeing how a rename of a structure could cause new failures to appear, but I suppose it is possible?  One thought is that this somehow made us start getting symbols for bug 1440545, perhaps because the symbol is now shorter and doesn't cause issues when symbolicating.  My (admittedly selfish) request is that we don't backout, but see if we start getting symbols more consistently for such crashes--and see whether the test still only fails intermittently.

If the test starts falling over permanently or if the intermittent rate goes up, I guess we might have to back this out?
Flags: needinfo?(nfroyd)
(In reply to Nathan Froyd [:froydnj] from comment #6)
> I have a very difficult time seeing how a rename of a structure could cause
> new failures to appear, but I suppose it is possible?  One thought is that
> this somehow made us start getting symbols for bug 1440545, perhaps because
> the symbol is now shorter and doesn't cause issues when symbolicating.

Yeah, that really looks like it is the case. The crash is in the same test and everything.
(In reply to Andrew McCreight [:mccr8] from comment #7)
> (In reply to Nathan Froyd [:froydnj] from comment #6)
> > I have a very difficult time seeing how a rename of a structure could cause
> > new failures to appear, but I suppose it is possible?  One thought is that
> > this somehow made us start getting symbols for bug 1440545, perhaps because
> > the symbol is now shorter and doesn't cause issues when symbolicating.
> 
> Yeah, that really looks like it is the case. The crash is in the same test
> and everything.

My only hesitation (ok, beyond the "this is a longshot" hesitation) is that the crash reason is not identical (SIGBUS vs. SIGSEGV in the recorded intermittents), and the stack in my inbound push is much deeper than, say, the stacks from:

https://treeherder.mozilla.org/intermittent-failures.html#/bugdetails?startday=2018-04-09&endday=2018-04-15&tree=trunk&bug=1440545
https://treeherder.mozilla.org/intermittent-failures.html#/bugdetails?startday=2018-04-02&endday=2018-04-08&tree=trunk&bug=1440545

But maybe that is just the nature of the intermittent.
So, this test is actually permanently failing, AFAICT. (Unless there is a filter for failures I'm not seeing?) But if you expand out the range, it looks like this test was failing very frequently before, too. I think that is tracked in bug 1454112. Maybe this patch just changed one of the signatures for that failure?
SIGBUS on Linux generally means an I/O error or out-of-range access on a memory-mapped file.  So, a SIGBUS somewhere random in graphics is probably from being out of space in /dev/shm.

A SIGBUS when Shmem::Alloc writes the length to the end of the mapping is more suspicious, because in theory it could be ENOSPC, but it seems more likely the length got mixed up somewhere and it mapped (and poked the end of) a larger area than the length of the actual file.
…however, a SIGBUS in Shmem::Alloc immediately after a SIGBUS by another process somewhere else probably means that the first process exhausted /dev/shm and crashed, and then the second “OOM”ed on the first page of a new shmem because /dev/shm was completely full at that point.  We've seen this before: bug 1446618 comment #2
https://hg.mozilla.org/mozilla-central/rev/0d65829a60e7
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla61
You need to log in before you can comment on or make changes to this bug.