Closed Bug 946407 Opened 6 years ago Closed 6 years ago

Electrolyze memory report file writing (but not GC logs, yet) and work around DMD/sandbox conflict

Categories

(Toolkit :: about:memory, defect)

ARM
Gonk (Firefox OS)
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla30

People

(Reporter: jld, Assigned: jld)

References

Details

Attachments

(4 files, 14 obsolete files)

33.79 KB, patch
jld
: review-
Details | Diff | Splinter Review
43 bytes, text/x-github-pull-request
jld
: review+
jld
: checkin+
Details | Review
44.05 KB, patch
njn
: review+
Details | Diff | Splinter Review
1.01 KB, patch
jld
: review+
Details | Diff | Splinter Review
Currently, the memory reporter writes information to files under TmpD from the content process.  Specifically, it creates a subdirectory, changes its permissions (for Fennec on non-rooted phones, where the browser and the adb shell run under different UIDs), creates and writes a file, and uses rename(2) to indicate completion.

Eventually this will all be blocked by the system call sandbox; as it stands, everything except opening the file is currently blocked, so B2G tools/get_about_memory.py will cause a crash if sandboxing is enabled.

So, we'll need to remote this over IPC and do it in the parent process.  In particular, we probably want writes to be remoted instead of passing the child a writable file descriptor to a regular file, so that it's possible to prevent a compromised child from consuming all disk space.

Also relevant: we'll probably want to make the profiler (SPS) use the same backend with different directory/file names.
So, we already have IPC for collecting the basic memory reports (not including the GC/CC logs or DMD info) and assembling them in the parent process.  So if we use that instead of calling DumpMemoryInfoToTempDir in child processes, and modify tools/get_about_memory.py to deal with getting the data either per-process or unified, and then figure out something for DMD and the GC/CC logs, then that should do it.
Assignee: nobody → jld
DMD is going to be annoying — we're holding locks on malloc stuff when we log, so we can't take locks that might be held across a malloc in another thread (which sending messages does), or otherwise establish a blocking relationship in that direction, and I assume it's similarly bad if the writer itself allocates (which the IPC code probably would have done at some point if it hadn't hit the other deadlock first).

If we have a sufficiently platform-independent abstraction for pipe() then one of those could be passed to the child to transport the text back instead of an IPC actor (and would also let me remove some of the callback hell I've had to add to nsMemoryInfoDumper.cpp).
DMD isn't quite as bad as that, for two reasons:

1. We actually can allocate — DMD has a system for disabling its malloc interception per-thread, for use within the dynamic extent of its own reporting.  The problem is that sending the IPC message means posting a task to the I/O thread, which means contending for some kind of task queue lock with that thread, and that lock can be held over allocations that aren't exempt from DMD because they're on another thread.  And we could fix that, at the risk of having allocation discrepancies for any other messages send/received concurrently.

2. It doesn't matter yet, because DMD just opens its output files; it doesn't use any system calls we're currently forbidding.

3. Bonus reason: DMD builds are uncommon enough, and (as I understand it) typically used only for debugging a specific memory issue where the regular reporters fail, so we could probably get away with disabling sandboxing (or adding whitelist entries) conditionally.
Attached patch bug946407-p0-gc-cc-e10s-hg0.diff (obsolete) — Splinter Review
e10s for the GC/CC logs, by transporting the collector listener interface over IPC.
Reworks nsMemoryReporterManager::GetReports and DumpMemoryInfoToTempDir so that the latter can use the former's IPC integration.  Changes the report from one file per process to one unified file.  DMD logs are still written by the child, however (see above about deadlocks).
B2G patch; adapts the scripts behind get_about_memory to deal with the unified-memory-report file from the corresponding Gecko patch.
Disables the seccomp sandbox if DMD is enabled.  See also bug 956961.
Attachment #8356388 - Flags: review?(n.nethercote)
Attachment #8356389 - Flags: review?(n.nethercote)
Attachment #8356390 - Flags: review?(n.nethercote)
Attachment #8356391 - Flags: review?(n.nethercote)
Comment on attachment 8356388 [details] [diff] [review]
bug946407-p0-gc-cc-e10s-hg0.diff

Review of attachment 8356388 [details] [diff] [review]:
-----------------------------------------------------------------

Hmm, I think khuey (or mccr8?) is better to review this one.

::: dom/ipc/PContent.ipdl
@@ +270,5 @@
>      async DumpGCAndCCLogsToFile(nsString identifier,
>                                  bool dumpAllTraces,
>                                  bool dumpChildProcesses);
> +    // Similar, but with file I/O in the parent process.
> +    async PCycleCollectWithLogs(nsString identifier, bool dumpAllTraces);

Is DumpGCAndCCLogsToFile still needed?
Attachment #8356388 - Flags: review?(n.nethercote)
Attachment #8356388 - Flags: review?(khuey)
Attachment #8356388 - Flags: feedback?(continuation)
Comment on attachment 8356389 [details] [diff] [review]
bug946407-p1-memreport-e10s-hg0.diff

Review of attachment 8356389 [details] [diff] [review]:
-----------------------------------------------------------------

This looks pretty good! I'd like another look before landing, though. I'm pretty sure the various aDumpChildren parameters are no longer needed, which should simplify a few things.

::: dom/ipc/ContentChild.cpp
@@ +176,5 @@
>  
> +NS_IMPL_ISUPPORTS1(MemoryReportRequestChild, nsIRunnable)
> +
> +MemoryReportRequestChild::MemoryReportRequestChild(uint32_t aGeneration, const nsAString& aDMDDumpIdent)
> +: mGeneration(aGeneration), mDMDDumpIdent(aDMDDumpIdent)

Feel free to move this ctor and the corresponding dtor into the class; they're short enough that outlining isn't worthwhile.

::: dom/ipc/ContentParent.cpp
@@ +1989,5 @@
> +        unsigned generation;
> +        int minimize;
> +        nsDependentString msg(aData);
> +
> +        if (sscanf(NS_ConvertUTF16toUTF8(Substring(msg, 0, 11)).get(),

Please put 11 in a constant instead of hard-coding it.

@@ +1994,5 @@
> +                   "%x %d", &generation, &minimize) != 2) {
> +            return NS_ERROR_INVALID_ARG;
> +        }
> +        unused << SendPMemoryReportRequestConstructor(
> +          generation, minimize, nsString(Substring(msg, 11)));

Ditto.

::: xpcom/base/nsIMemoryReporter.idl
@@ +242,5 @@
>                    in nsISupports handleReportData,
>                    in nsIFinishReportingCallback finishReporting,
>                    in nsISupports finishReportingData);
> +  [noscript] void
> +    getReportsExtended(in nsIMemoryReporterCallback handleReport,

Can the Extended version just replace the old version? If not, the Extended version should get its own comment.

@@ +246,5 @@
> +    getReportsExtended(in nsIMemoryReporterCallback handleReport,
> +                       in nsISupports handleReportData,
> +                       in nsIFinishReportingCallback finishReporting,
> +                       in nsISupports finishReportingData,
> +                       in boolean dumpChildProcesses,

I don't think you need |dumpChildProcesses| -- it's always true.

@@ +257,5 @@
>     */
>    void getReportsForThisProcess(in nsIMemoryReporterCallback handleReport,
>                                  in nsISupports handleReportData);
> +  [noscript] void
> +    getReportsForThisProcessExtended(in nsIMemoryReporterCallback handleReport,

Again, can this replace the non-Extended version? And if not, a comment is needed.

::: xpcom/base/nsMemoryInfoDumper.cpp
@@ +852,5 @@
> +
> +NS_IMETHODIMP
> +nsMemoryInfoDumper::DumpMemoryInfoToTempDir(const nsAString& aIdentifier,
> +                                            bool aMinimizeMemoryUsage,
> +                                            bool aDumpChildProcesses)

Is aDumpChildProcesses ever false? Can we get rid of it?

::: xpcom/base/nsMemoryReporterManager.cpp
@@ +1032,5 @@
> +    nsIHandleReportCallback* aHandleReport,
> +    nsISupports* aHandleReportData,
> +    nsIFinishReportingCallback* aFinishReporting,
> +    nsISupports* aFinishReportingData,
> +    bool aDumpChildren,

Is aDumpChildren ever false? Can we get rid of it?

@@ +1065,5 @@
>          NS_ENSURE_STATE(obs);
>  
> +        nsPrintfCString genStr("%08x %d ", generation, aMinimize ? 1 : 0);
> +        nsAutoString msg = NS_ConvertUTF8toUTF16(genStr);
> +        msg += aDMDDumpIdent;

Please add a short comment describing the comment, and maybe giving an example. And a mention to the name of constant for the length would be helpful.

Actually, it would be even better if the string was self-documenting. E.g.:

 "generation=00000001 minimize=1 DMDident="

BTW, how will the DMD ident be used in the future? I see it's always empty in this patch.

@@ +1088,5 @@
> +                                               aDMDDumpIdent);
> +    } else {
> +        mGetReportsState = new GetReportsState(generation,
> +                                               nullptr,
> +                                               0,

This would be nicer:

  /* numChildProcesses = */ 0,

@@ +1099,5 @@
>  
> +    if (aMinimize) {
> +        nsCOMPtr<nsICancelableRunnable> r;
> +        return MinimizeMemoryUsage(NS_NewRunnableMethod(this, &nsMemoryReporterManager::StartGettingReports), getter_AddRefs(r));
> +    } else {

No |else| after |return|.
Attachment #8356389 - Flags: review?(n.nethercote) → feedback+
Comment on attachment 8356390 [details] [diff] [review]
bug946407-p1-pull-unified-hg0.diff

Review of attachment 8356390 [details] [diff] [review]:
-----------------------------------------------------------------

I don't know this script at all... rs=me, assuming you've tested it still works on Geckos that don't have the other patches applied.

::: tools/include/device_utils.py
@@ +276,3 @@
>      '''Wait for files to appear on the remote device.
>  
> +    The conditions argument is a list of (prefixes, num_expected) pairs.

Judging from the call above, this list always has two elements. In which case it should just be two arguments.
Attachment #8356390 - Flags: review?(n.nethercote) → review+
Attachment #8356391 - Flags: review?(n.nethercote) → review+
(In reply to Nicholas Nethercote [:njn] from comment #10)
> ::: dom/ipc/ContentParent.cpp
> @@ +1989,5 @@
> > +        unsigned generation;
> > +        int minimize;
> > +        nsDependentString msg(aData);
> > +
> > +        if (sscanf(NS_ConvertUTF16toUTF8(Substring(msg, 0, 11)).get(),
> 
> Please put 11 in a constant instead of hard-coding it.

Alternately, I could just convert the entire string to UTF-8 and obtain the offset dynamically with %n.  This isn't exactly a fast path.

> ::: xpcom/base/nsIMemoryReporter.idl
> @@ +242,5 @@
> >                    in nsISupports handleReportData,
> >                    in nsIFinishReportingCallback finishReporting,
> >                    in nsISupports finishReportingData);
> > +  [noscript] void
> > +    getReportsExtended(in nsIMemoryReporterCallback handleReport,
> 
> Can the Extended version just replace the old version? If not, the Extended
> version should get its own comment.

I wasn't sure if changing the existing interface like that would break compatibility in a way we care about.  But, yes, I was going to do something about the comments and forgot.

> @@ +246,5 @@
> > +    getReportsExtended(in nsIMemoryReporterCallback handleReport,
> > +                       in nsISupports handleReportData,
> > +                       in nsIFinishReportingCallback finishReporting,
> > +                       in nsISupports finishReportingData,
> > +                       in boolean dumpChildProcesses,
> 
> I don't think you need |dumpChildProcesses| -- it's always true.

Good point.  I think at some point I was going to have this interface cover the single-process case as well, but that wound up not making sense.  There are a lot of "moving parts" here....

> @@ +257,5 @@
> >     */
> >    void getReportsForThisProcess(in nsIMemoryReporterCallback handleReport,
> >                                  in nsISupports handleReportData);
> > +  [noscript] void
> > +    getReportsForThisProcessExtended(in nsIMemoryReporterCallback handleReport,
> 
> Again, can this replace the non-Extended version? And if not, a comment is
> needed.

As above.

> ::: xpcom/base/nsMemoryInfoDumper.cpp
> @@ +852,5 @@
> > +
> > +NS_IMETHODIMP
> > +nsMemoryInfoDumper::DumpMemoryInfoToTempDir(const nsAString& aIdentifier,
> > +                                            bool aMinimizeMemoryUsage,
> > +                                            bool aDumpChildProcesses)
> 
> Is aDumpChildProcesses ever false? Can we get rid of it?

Not anymore, and apparently so.  Also, I think I got rid of the only call to SendDumpMemoryInfoToTempDir, so that could be removed from the IPDL.

> ::: xpcom/base/nsMemoryReporterManager.cpp
> @@ +1032,5 @@
> > +    nsIHandleReportCallback* aHandleReport,
> > +    nsISupports* aHandleReportData,
> > +    nsIFinishReportingCallback* aFinishReporting,
> > +    nsISupports* aFinishReportingData,
> > +    bool aDumpChildren,
> 
> Is aDumpChildren ever false? Can we get rid of it?

As above, for the IDL, respectively no and yes.

> @@ +1065,5 @@
> >          NS_ENSURE_STATE(obs);
> >  
> > +        nsPrintfCString genStr("%08x %d ", generation, aMinimize ? 1 : 0);
> > +        nsAutoString msg = NS_ConvertUTF8toUTF16(genStr);
> > +        msg += aDMDDumpIdent;
> 
> Please add a short comment describing the comment, and maybe giving an
> example. And a mention to the name of constant for the length would be
> helpful.
> 
> Actually, it would be even better if the string was self-documenting. E.g.:
> 
>  "generation=00000001 minimize=1 DMDident="

Good point.

> BTW, how will the DMD ident be used in the future? I see it's always empty
> in this patch.

It's empty if we're not writing DMD files... which I think I forgot to document; sorry.  It's nonempty when called from nsMemoryInfoDumper::DumpMemoryInfoToTempDir.

> @@ +1088,5 @@
> > +                                               aDMDDumpIdent);
> > +    } else {
> > +        mGetReportsState = new GetReportsState(generation,
> > +                                               nullptr,
> > +                                               0,
> 
> This would be nicer:
> 
>   /* numChildProcesses = */ 0,

Noted.

> @@ +1099,5 @@
> >  
> > +    if (aMinimize) {
> > +        nsCOMPtr<nsICancelableRunnable> r;
> > +        return MinimizeMemoryUsage(NS_NewRunnableMethod(this, &nsMemoryReporterManager::StartGettingReports), getter_AddRefs(r));
> > +    } else {
> 
> No |else| after |return|.

Will fix.

Thanks for the feedback.
(In reply to Nicholas Nethercote [:njn] from comment #11)
> I don't know this script at all... rs=me, assuming you've tested it still
> works on Geckos that don't have the other patches applied.

It does — in fact, I just used it for bug 925416 (applied to m-c without these patches).

> ::: tools/include/device_utils.py
> @@ +276,3 @@
> >      '''Wait for files to appear on the remote device.
> >  
> > +    The conditions argument is a list of (prefixes, num_expected) pairs.
> 
> Judging from the call above, this list always has two elements. In which
> case it should just be two arguments.

That could be done?  I made it a list because that seemed easier, and more future-proof if/when we have yet another possible format.
Attachment #8356391 - Flags: review?(gdestuynder)
Comment on attachment 8356391 [details] [diff] [review]
bug946407-p2-dmd-unsandbox-hg0.diff

Review of attachment 8356391 [details] [diff] [review]:
-----------------------------------------------------------------

As DMD is only used when requested while debugging I think that's fine (and thus r+)
Note: I wonder if it's nicer or not to have DMD fail to build with ENABLE_CONTENT_SANDBOX and just manually disable the later when using the former. I'm not sure with either, so it's just a thought.
Attachment #8356391 - Flags: review?(gdestuynder) → review+
Comment on attachment 8356388 [details] [diff] [review]
bug946407-p0-gc-cc-e10s-hg0.diff

Review of attachment 8356388 [details] [diff] [review]:
-----------------------------------------------------------------

The nsCycleCollector.cpp and nsICycleCollectorListener.idl changes look fine to me.

What happens if something new is added to the .idl without updating the .ipdl?  Does it fail to build, or will it just silently stop logging some stuff?  If it is the latter, it would be nice to have a comment in nsICCListener.idl about the ipdl stuff that needs to be updated.

::: xpcom/base/nsICycleCollectorListener.idl
@@ +48,5 @@
>      attribute boolean wantAfterProcessing;
>  
>      // This string will appear somewhere in the log's filename.
>      attribute AString filenameIdentifier;
> +    // This number (default: current process ID) will also appear

nit: blank line before the comment, please.
Attachment #8356388 - Flags: feedback?(continuation) → feedback+
(In reply to Andrew McCreight [:mccr8] from comment #15)
> The nsCycleCollector.cpp and nsICycleCollectorListener.idl changes look fine
> to me.
> 
> What happens if something new is added to the .idl without updating the
> .ipdl?  Does it fail to build, or will it just silently stop logging some
> stuff?  If it is the latter, it would be nice to have a comment in
> nsICCListener.idl about the ipdl stuff that needs to be updated.

Wouldn't that cause CycleCollectWithLogsChild to inherit pure virtual methods from nsICycleCollectorListener and break the build until they're implemented?
So there's a bunch of stuff that can be deleted as a result of these patches, some of which was pointed out earlier during review/feedback.  This patch does that; I think it might be clearer to apply it as a separate patch.

(Updated version of the original memory reporting patch to follow, once I'm done with the comment fixes for it.)
Attachment #8362043 - Flags: review?(n.nethercote)
Comment on attachment 8362043 [details] [diff] [review]
bug946407-p3-manual-sccp-hg0.diff

Review of attachment 8362043 [details] [diff] [review]:
-----------------------------------------------------------------

Lovely!

::: mobile/android/chrome/content/MemoryObserver.js
@@ +59,5 @@
>    },
>  
>    dumpMemoryStats: function(aLabel) {
>      let memDumper = Cc["@mozilla.org/memory-info-dumper;1"].getService(Ci.nsIMemoryInfoDumper);
> +    memDumper.dumpMemoryInfoToTempDir(aLabel, false);

Can you add a comment like |/* minimize = */ false|?
Attachment #8362043 - Flags: review?(n.nethercote) → review+
(In reply to Guillaume Destuynder [:kang] (use NEEDINFO!) from comment #14)
> Note: I wonder if it's nicer or not to have DMD fail to build with
> ENABLE_CONTENT_SANDBOX and just manually disable the later when using the
> former. I'm not sure with either, so it's just a thought.

My thought here was that, someday, there might be a Gecko change that allows using DMD while sandboxed, so it makes sense for the part of this that disables sandboxing to live in Gecko rather than in people's .userconfigs or gonk-misc or somewhere else, so it can be changed at the same time.
I think I got all of the suggested changes here that aren't in the other patch.

Note in particular the change to nsMemoryReporterManager::kTimeoutLengthMS, which is something I tripped over while testing.  I observed 20s by starting a bunch of apps and having one consuming CPU during the report and using DMD (but the build was non-debug and optimized), so 50s looks like a lot but it should give some safety margin.
Attachment #8356389 - Attachment is obsolete: true
Attachment #8364124 - Flags: review?(n.nethercote)
In addition to suggested minor change, I also bumped the UUIDs of the XPIDL interfaces being changed, as I forgot to do the first time.  Carrying over r=njn.
Attachment #8362043 - Attachment is obsolete: true
Attachment #8364127 - Flags: review+
Comment on attachment 8364124 [details] [diff] [review]
bug946407-p1-memreport-e10s-hg1.diff

Review of attachment 8364124 [details] [diff] [review]:
-----------------------------------------------------------------

Lovely. Thanks for doing this.

::: dom/ipc/ContentParent.cpp
@@ +2020,5 @@
> +                || identOffset < 0) {
> +                return NS_ERROR_INVALID_ARG;
> +            }
> +            // The pre-%n part of the string should be all ASCII, so the byte
> +            // offset in identOffset should be correct as a char offset.

Can we assert this somehow? Maybe |cmsg.get()[identOffset-1] == '='|?

::: xpcom/base/nsMemoryReporterManager.cpp
@@ +1023,5 @@
>      nsISupports* aFinishReportingData)
>  {
> +    return GetReportsExtended(aHandleReport, aHandleReportData,
> +                              aFinishReporting, aFinishReportingData,
> +                              true, false, nsString());

Comments on the |true| and |false| args, please.

@@ +1057,5 @@
>  
>      MEMORY_REPORTING_LOG("GetReports (gen=%u, %d child(ren) present)\n",
>                           generation, mNumChildProcesses);
>  
> +    if (aDumpChildren && mNumChildProcesses > 0) {

This patch adds aDumpChildren to this condition, and then the follow-up patch removes it again. It might be better to just combine this patch and that patch before landing.
Attachment #8364124 - Flags: review?(n.nethercote) → review+
Applied suggested fixes and squashed in the relevant parts of patch 3.  Carrying over r=njn.
Attachment #8364124 - Attachment is obsolete: true
Attachment #8364127 - Attachment is obsolete: true
Attachment #8364767 - Flags: review+
Squashed the rest of the cleanup patch into this, and rebased.  (Not carrying over f=mccr8.)
Attachment #8356388 - Attachment is obsolete: true
Attachment #8356388 - Flags: review?(khuey)
Attachment #8364768 - Flags: review?(khuey)
I'd like this to land soon. There might be some other nsIMemoryInfoDumper changes coming soon and I don't want to cause big conflicts. Are we just waiting on khuey's review for one patch? Could mmcr8 do that review instead?
It's on my todo list ... just very busy.  I would like to test it on a device.
> It's on my todo list ... just very busy.

Yep... I'm trying to lighten your load :) Will you just generally test that it works, or are you looking for something specific that would be hard for someone else to do?
Kyle, what do you think about Billm and I splitting up the review here instead of you?  He knows stuff about IPDL and IPC, and the interface itself looks fairly simple.
Flags: needinfo?(khuey)
If billm can do the review this week go for it.
Flags: needinfo?(khuey)
Bill, would you mind reviewing the IPC and IPDL stuff in the r? patch here?  Just assume we're passing the right things around, I'll check the cycle-collector-ness part of it.  Thanks!
Flags: needinfo?(wmccloskey)
Sure, that's fine. I won't be able to test anything though.
Flags: needinfo?(wmccloskey)
Comment on attachment 8364768 [details] [diff] [review]
bug946407-p0-gc-cc-e10s-hg2.diff

Thanks, Bill!

Will you be able to test this on a device, Jed?
Attachment #8364768 - Flags: review?(wmccloskey)
Attachment #8364768 - Flags: review?(khuey)
Attachment #8364768 - Flags: review?(continuation)
The other thing we need to review is whether we're triggering lots of allocations/copying, which will make getting about:memory more likely to OOM-kill things on the phone.
(In reply to Andrew McCreight [:mccr8] from comment #32)
> Will you be able to test this on a device, Jed?

I tested this on a device back when I was developing it; at this point I should probably rebase and re-test.  It might also be worth testing with a desktop build and a sufficiently large website, since basically the entire object graph is streamed over IPC.

(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #33)
> The other thing we need to review is whether we're triggering lots of
> allocations/copying, which will make getting about:memory more likely to
> OOM-kill things on the phone.

Since I'm sending each observation as a single message, it has the potential to be less bad than things that pass a single huge array (like the main memory reporting protocol — which the *other* half of this bug enables on b2g), but I don't know offhand how much the IPC layer actually tries to keep its queues bounded.
Comment on attachment 8364768 [details] [diff] [review]
bug946407-p0-gc-cc-e10s-hg2.diff

Review of attachment 8364768 [details] [diff] [review]:
-----------------------------------------------------------------

I did my review, but then I realized that this code doesn't handle the JS heap correctly. See here:
  http://mxr.mozilla.org/mozilla-central/source/xpcom/base/nsCycleCollector.cpp#1440
We're going to get a GC dump of the parent rather than the child. That will be a little hard to fix. We could add some code to the JS engine to dump to a memory buffer and then we could send that to the parent I guess.

Here are my other review comments.

::: dom/ipc/ContentChild.cpp
@@ +177,5 @@
>      MOZ_COUNT_DTOR(MemoryReportRequestChild);
>  }
>  
> +// IPC sender for remote GC/CC logging.
> +class CycleCollectWithLogsChild : public PCycleCollectWithLogsChild,

I think you need to implement ActorDestroy for this class. It should set a flag (mDestroyed or something). Then each time one of the callbacks is called, we need to check mDestroyed and avoid calling the Send function in that case.

::: dom/ipc/ContentParent.cpp
@@ +198,5 @@
>      MOZ_COUNT_DTOR(MemoryReportRequestParent);
>  }
>  
> +// IPC receiver for remote GC/CC logging.
> +class CycleCollectWithLogsParent : public PCycleCollectWithLogsParent

You should probably implement an empty ActorDestroy method here just so that people know that it doesn't need to do anything (since there are no Send calls).

@@ +209,5 @@
> +
> +    virtual bool RecvBegin()
> +    {
> +        nsresult rv = mLogger->Begin();
> +        return !NS_WARN_IF(NS_FAILED(rv));

Keep in mind that returning false from any of these Recv methods will cause the child to be terminated. I think it would be better to have a success/failure value here as an outparam and always return true. Only Begin() can fail; the other methods could just print a warning on failure, perhaps.

@@ +276,5 @@
> +    }
> +
> +    virtual bool Recv__delete__()
> +    {
> +        nsresult rv = mLogger->End();

This can fail, but I think we should just issue a warning in that case and return true.

::: xpcom/base/nsICycleCollectorListener.idl
@@ +51,5 @@
>      attribute AString filenameIdentifier;
>  
> +    // This number (default: current process ID) will also appear
> +    // in the filename.
> +    [noscript] attribute unsigned long processIdentifier;

Don't we have to change the IID for this?
Attachment #8364768 - Flags: review?(wmccloskey)
> Don't we have to change the IID for this?
Oops, yeah, I always forget to mention that in reviews...
Comment on attachment 8364768 [details] [diff] [review]
bug946407-p0-gc-cc-e10s-hg2.diff

Oh, right, good catch Bill.  This is going to be kind of a pain, because of course we want this to work on phones, so we can't really just dump the entire heap into memory, then push it all over to the main process, and expect it to not OOM.
Attachment #8364768 - Flags: review?(continuation)
Summary: Electrolyze about:memory file writing → Electrolyze memory report file writing (but not GC logs, yet) and work around DMD/sandbox conflict
Comment on attachment 8364768 [details] [diff] [review]
bug946407-p0-gc-cc-e10s-hg2.diff

As previously discussed this approach doesn't work, but I'll leave the patch here in case it's useful.
Attachment #8364768 - Flags: review-
Rebased onto m-c, without the GC/CC log patch.  Carrying over r=njn (but maybe this calls for f? from someone relevant).
Attachment #8364767 - Attachment is obsolete: true
Attachment #8376555 - Flags: review+
Rebase.  Carrying over r=njn r=kang.
Attachment #8356391 - Attachment is obsolete: true
Attachment #8376556 - Flags: review+
This bit-rotted enough that I had to more or less rewrite it.  Re-tested both with and without the Gecko patches.
Attachment #8356390 - Attachment is obsolete: true
Attachment #8376627 - Flags: review?(n.nethercote)
Comment on attachment 8376627 [details] [diff] [review]
bug946407-p1-pull-unified-hg1.diff

Review of attachment 8376627 [details] [diff] [review]:
-----------------------------------------------------------------

::: tools/include/device_utils.py
@@ +231,5 @@
> +        new_unified_files = _list_remote_temp_files(unified_outfiles_prefixes) - old_files
> +        if new_unified_files:
> +            files_gotten = (len(new_unified_files), num_unified_expected)
> +        else:
> +            files_gotten = (len(new_files), num_expected_files)

I'm not a fan of (ab)using arrays as structs. I'd just have two variables, |files_expected| and |files_gotten|, or something like that.
Attachment #8376627 - Flags: review?(n.nethercote) → review+
(In reply to Nicholas Nethercote [:njn] from comment #42)
> ::: tools/include/device_utils.py
> @@ +231,5 @@
> > +        new_unified_files = _list_remote_temp_files(unified_outfiles_prefixes) - old_files
> > +        if new_unified_files:
> > +            files_gotten = (len(new_unified_files), num_unified_expected)
> > +        else:
> > +            files_gotten = (len(new_files), num_expected_files)
> 
> I'm not a fan of (ab)using arrays as structs. I'd just have two variables,
> |files_expected| and |files_gotten|, or something like that.

Yeah, that started out as an (ab)use of Python's conditional operator for the argument tuple of a format string, and then I realized I needed the same values in other places.  It's definitely nicer as separate variables; thanks.
Carrying over r=njn.
Attachment #8376627 - Attachment is obsolete: true
Attachment #8378429 - Flags: review+
…and convert to a pull request.
Attachment #8378429 - Attachment is obsolete: true
Attachment #8378432 - Flags: review+
Comment on attachment 8378432 [details] [review]
Link to Github pull-request: https://github.com/mozilla-b2g/B2G/pull/326

…and land this one first, because the Gecko patch depends on it.  (There will still be people out of sync, of course.  I'll warn dev-b2g.)

https://github.com/mozilla-b2g/B2G/commit/55db3594c28927187851495eb13ee7098dfbd585
Attachment #8378432 - Flags: checkin+
Rebased.  Carrying over r+, but a quick look to make sure the change to the JS OOM hook (which effectively turns on dumpChildProcesses, since it's now always on, as per previous review comments) is okay would be nice.
Attachment #8376555 - Attachment is obsolete: true
Attachment #8385019 - Flags: review+
Attachment #8385019 - Flags: feedback?(n.nethercote)
Attachment #8385019 - Flags: feedback?(n.nethercote) → feedback+
Small but important update to this piece of the patch.  Re-?-ing the previous r+(kang); assuming that njn's opinion of the patch won't change based on internal details of the sandbox implementation.
Attachment #8376556 - Attachment is obsolete: true
Attachment #8385055 - Flags: review?(gdestuynder)
Comment on attachment 8385055 [details] [diff] [review]
bug946407-p2-dmd-unsandbox-hg2.diff

Review of attachment 8385055 [details] [diff] [review]:
-----------------------------------------------------------------

seems ok -i added some comments that i didn't previously, but they're nitpicks.
also, previously we planned to die (exit_group)  if this function would return failure. while the comment is gone, this *could* come back in the future, i suppose.
we can always deal with it when/if we're in a position where we can do that in the future, tho.

::: security/sandbox/linux/Sandbox.cpp
@@ +221,5 @@
>  static int
>  InstallSyscallFilter(void)
>  {
> +#ifdef MOZ_DMD
> +  char* e = getenv("DMD");

not PR_GetEnv() ?

@@ +222,5 @@
>  InstallSyscallFilter(void)
>  {
> +#ifdef MOZ_DMD
> +  char* e = getenv("DMD");
> +  if (e && strcmp(e, "") != 0 && strcmp(e, "0") != 0) {

seems to me we could just check if its non-null
Attachment #8385055 - Flags: review?(gdestuynder) → review+
(In reply to Guillaume Destuynder [:kang] (use NEEDINFO!) from comment #49)
> also, previously we planned to die (exit_group)  if this function would
> return failure. while the comment is gone, this *could* come back in the
> future, i suppose.

The comment about bug 880797 is still there; it's in SetThreadSandbox now.

> ::: security/sandbox/linux/Sandbox.cpp
> @@ +221,5 @@
> >  static int
> >  InstallSyscallFilter(void)
> >  {
> > +#ifdef MOZ_DMD
> > +  char* e = getenv("DMD");
> 
> not PR_GetEnv() ?

I was copying the existing DMD code, but that's a good point, since we're using PR_GetEnv elsewhere in the file.

> @@ +222,5 @@
> >  InstallSyscallFilter(void)
> >  {
> > +#ifdef MOZ_DMD
> > +  char* e = getenv("DMD");
> > +  if (e && strcmp(e, "") != 0 && strcmp(e, "0") != 0) {
> 
> seems to me we could just check if its non-null

Also copied from the code that decides whether to enable DMD, since the idea is to disable the sandbox only if DMD is enabled.
Comment on attachment 8385019 [details] [diff] [review]
bug946407-p1-memreport-e10s-hg4.diff

Review of attachment 8385019 [details] [diff] [review]:
-----------------------------------------------------------------

https://hg.mozilla.org/integration/b2g-inbound/rev/9f9ee7372552
Once more, with feeling.  We need to actually call FinishReporting() in the zero-children case (which includes in-process desktop about:memory, which is the mochitest I broke) to reset the state so that future reports will work, not just call the mFinishReporting callback directly.

Hopefully the Bugzilla interdiff will do something useful.
Attachment #8385019 - Attachment is obsolete: true
Attachment #8385724 - Flags: review?(n.nethercote)
Comment on attachment 8385724 [details] [diff] [review]
bug946407-p1-memreport-e10s-hg5.diff

Review of attachment 8385724 [details] [diff] [review]:
-----------------------------------------------------------------

The interdiff was good. Yay!

I guess the |delete mGetReportsState| is ok because it'll be null in the no-children case?
Attachment #8385724 - Flags: review?(n.nethercote) → review+
(In reply to Nicholas Nethercote [:njn] from comment #54)
> I guess the |delete mGetReportsState| is ok because it'll be null in the
> no-children case?

It's no longer null in that case, with this patch.  (And if it somehow is, we'll already have crashed by that point.)  If memory serves, this is because the no-children case can include a MinimizeMemoryUsage, which is async, so I needed the mGetReportsState for its continuation.

Also, while I'm commenting, a try run: https://tbpl.mozilla.org/?tree=Try&rev=3e1c314bfb9b
Adjusted as commented earlier.  Carrying over r+.
Attachment #8385055 - Attachment is obsolete: true
Attachment #8385788 - Flags: review+
https://hg.mozilla.org/mozilla-central/rev/8e3a1fa0569c
https://hg.mozilla.org/mozilla-central/rev/e304d624f1f7
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla30
You need to log in before you can comment on or make changes to this bug.