Closed Bug 1245679 Opened 8 years ago Closed 8 years ago

TreeHerder tab is crashing (aborting) in mozalloc_abort | NS_DebugBreak | mozilla::layers::PLayerTransactionChild::SendPCompositableConstructor

Categories

(Core :: Graphics: Layers, defect)

Unspecified
Linux
defect
Not set
critical

RESOLVED DUPLICATE of bug 1245241
Tracking Status
e10s ? ---
firefox47 --- affected

People

(Reporter: dholbert, Unassigned)

Details

(Keywords: crash)

Attachments

(1 file)

This bug was filed from the Socorro interface and is 
report bp-14cf9342-b794-441e-a9f2-c68502160203.
=============================================================
Right now, TreeHerder is pretty much insta-crashing for me (within seconds).

Crash report linked above.

Note that stack-level 2 in my crash is in obj-firefox/ipc/ipdl/PLayerTransactionChild.cpp (generated code), line 177. Crash-stats' link-to-source-in-HG functionality clearly doesn't work for generated code, but from looking at my objdir locally, I can see that line 177 in this file is the NS_RUNTIMEABORT below:

> auto PLayerTransactionChild::SendPTextureConstructor(
>         PTextureChild* actor,
>         const SurfaceDescriptor& aSharedData,
>         const LayersBackend& aBackend,
>         const TextureFlags& aTextureFlags) -> PTextureChild*
> {
> [...]
>     bool sendok__ = (mChannel)->Send(msg__);
>     if ((!(sendok__))) {
>         NS_RUNTIMEABORT("constructor for actor failed");
>         return nullptr;
>     }

(This explains why we're invoking NS_DebugBreak, I think.)
I'm using Nightly with e10s and with dom.ipc.processCount = 5 (and a bunch of add-ons).
I forgot to mention -- most of my treeherder content-process crashes here did *not* trigger crashreporter forms on the "tab crashed" page, for some reason.

The crash report linked in comment 0 is from the one time that I did get a crashreporter form.
We're running out of fd's.

With the fd limit at 1024, I insta-crash loading beta.  Looking at /proc/xxxxx/fd, I see >975 fd's in use when we MOZ_CRASH.  Almost all are /dev/shm/org.chromium.xxxxxx (often marked (DELETED) in lsof).

With the limit raised to 32768, it's happy, and at idle, 30 fd's are in use.
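
For anyone who wants to repeat the check from inside a process instead of eyeballing /proc/xxxxx/fd, here is a minimal sketch, assuming Linux with procfs mounted. The helper names (CountOpenFds, SoftFdLimit, NearFdLimit) and the headroom parameter are hypothetical and are not Gecko code; only opendir/readdir and getrlimit are real POSIX calls.

```cpp
#include <dirent.h>
#include <sys/resource.h>
#include <cassert>
#include <climits>

// Count the entries in /proc/self/fd (Linux-only assumption).
// Returns -1 if the directory cannot be opened. The result includes the
// descriptor that opendir() itself holds while scanning.
long CountOpenFds() {
    DIR* dir = opendir("/proc/self/fd");
    if (!dir) return -1;
    long count = 0;
    while (dirent* entry = readdir(dir)) {
        if (entry->d_name[0] == '.') continue;  // skip "." and ".."
        ++count;
    }
    closedir(dir);
    return count;
}

// Soft RLIMIT_NOFILE for this process: -1 on error, LONG_MAX if unlimited.
long SoftFdLimit() {
    struct rlimit lim{};
    if (getrlimit(RLIMIT_NOFILE, &lim) != 0) return -1;
    if (lim.rlim_cur == RLIM_INFINITY) return LONG_MAX;
    return static_cast<long>(lim.rlim_cur);
}

// Hypothetical helper: true when fewer than `headroom` descriptors remain
// before the limit is reached. Invalid (negative) inputs yield false.
bool NearFdLimit(long used, long limit, long headroom) {
    assert(headroom >= 0);
    if (used < 0 || limit < 0) return false;
    return limit - used <= headroom;
}
```

Calling NearFdLimit(CountOpenFds(), SoftFdLimit(), 64) right before creating a shmem would flag the state described above (more than 975 fd's in use against a limit of 1024), while 30 in use against 32768 would not. The headroom of 64 is an arbitrary illustrative value.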

Perhaps the child is getting ahead of the master (or vice versa) to the point where these fd's build up, due to some buffering of drawing commands?

In any case, attached is a sampling of shmem-creation stacks from an rr trace in the content process.
Flags: needinfo?(wmccloskey)
Flags: needinfo?(nical.bugzilla)
These stacks are interesting.

We're using ContentClientRemoteBuffer which synchronously swaps buffers with the compositor, so it's hard to see how we'd fall behind.

Messaging for freeing shmem might be running on a different channel though?

It's possible that we churn shmems a lot, and whatever is supposed to be freeing them falls behind.
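
To make the "falls behind" hypothesis concrete, here is a toy model, not a measurement: it assumes shmems are created and freed at fixed per-tick rates, with each live shmem holding one fd. The function name TicksUntilFdExhaustion and all rates are hypothetical; only the 1024 limit and the idle count of 30 come from the comments above.

```cpp
#include <algorithm>
#include <cassert>

// Toy model: each tick, `createdPerTick` shmem fds are opened, then
// `freedPerTick` are released. Returns the first tick at which the number
// of fds in use exceeds `limit`, or -1 if that never happens.
long TicksUntilFdExhaustion(long limit, long baseline,
                            long createdPerTick, long freedPerTick) {
    assert(createdPerTick >= 0 && freedPerTick >= 0);
    if (createdPerTick <= freedPerTick) {
        // Usage never grows past its first-tick peak.
        return (baseline + createdPerTick > limit) ? 1 : -1;
    }
    long inUse = baseline;
    for (long tick = 1;; ++tick) {
        inUse += createdPerTick;          // creator runs first
        if (inUse > limit) return tick;   // next open would fail
        inUse = std::max(0L, inUse - freedPerTick);  // freer catches up partially
    }
}
```

Under these assumptions, even a small gap is enough: TicksUntilFdExhaustion(1024, 30, 10, 8) returns 494, whereas matching rates (10 created, 10 freed) never hit the limit. Raising the limit to 32768 only delays the failure in the first case, which would be consistent with the higher limit appearing "happy" for a while.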
Going to dupe this over since the other bug has a bit more traffic.
Status: NEW → RESOLVED
Closed: 8 years ago
Flags: needinfo?(wmccloskey)
Resolution: --- → DUPLICATE
Flags: needinfo?(nical.bugzilla)
(In reply to Daniel Holbert [:dholbert] (offline 2/18-2/21) from comment #3)
> I forgot to mention -- most of my treeherder content-process crashes here
> did *not* trigger crashreporter forms on the "tab crashed" page, for some
> reason.

Bug 1249995 is filed on this lack-of-crashreporter-UI issue, btw.
See Also: 1245241