Closed Bug 992454 Opened 10 years ago Closed 10 years ago

[B2G] Integrate Task tracer with GeckoProfiler

Categories

(Core :: Gecko Profiler, defect)

All
Gonk (Firefox OS)
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla36

People

(Reporter: BenWa, Assigned: shelly)

References

()

Details

Attachments

(2 files, 17 obsolete files)

15.12 KB, patch
shelly
: review+
Details | Diff | Splinter Review
7.60 KB, patch
shelly
: review+
Details | Diff | Splinter Review
+++ This bug was initially created as a clone of Bug #908995 +++

We talked during the graphics work week. The outcome is:

I will provide a header for TaskTracer to implement with a simple Start/Stop/Save function. The Save function will build a string representation of the data. This data will be included in the profile. If the string is present when the profile is opened in cleopatra then a tab with an iframe to Isis will be provided.

When profiling TaskTracer can be used via a new checkbox (desktop) or using:
profile.sh start -p PID -f tasktracer using the profiler script.

I'll provide the header+GeckoProfiler patch when I'm done traveling.
Flags: needinfo?(bgirard)
Flags: needinfo?(bgirard)
Already here's the binding. TaskTracer just needs to implement start/stop/save.
Flags: needinfo?(slin)
Thank you very much :). Will start to implement the integration ASAP.
Flags: needinfo?(slin)
No longer depends on: 908995
Blocks: 995058
Attachment #8403468 - Attachment is obsolete: true
Cleopatra now has 'tabs' part of another feature. Once the gecko changes here are ready we can just put Isis into its own tab in cleopatra.
The way how TaskTracer work is slightly different then our GeckoProfiler.
If gecko is built with TaskTracer, all tasks/runnables should be wrapped with traced-tasks/traced-runnables from system start up. It is the matter of recording control, that is when to start logging and stop logging.

Here is my proposal:

To start the profiler with TaskTracer, use
./profiler.sh start -f tasktracer
This starts the profiler with feature of TaskTracer, this will kill the previous b2g and restart, monitoring all processes. 

To start logging TaskTracer data, use
./profiler.sh signal

And to stop logging, just use the same command above
./profiler.sh signal
This should save out the logged data to files.

To stop, use
./profiler.sh stop
Attached patch WIP - The revised profile.sh (obsolete) — Splinter Review
Sorry for the late response. We should be using ./profiler.sh capture to get the data, not profiler.sh signal.
Oh, sure, I was just testing with the signalling on tasktracer. 
I think it would be something like:
Start tasktracer with "./profile.sh capture", save the flag TASKTRACER_ON somewhere, only signal tasktracer to start logging; Stop logging tasktracer with the next "./profile.sh capture" and save out the data to files.

Thanks for your advise!
However, the approach of using a static nsTArray<nsCString> to store log data is not thread-safe.
Hey Kyle, this part does not involve with GeckoProfiler too much, mostly bugs fixes and implementation of logging TaskTracer data. 

Changes are:
1. (Re)Initialize TaskTracer for content processes after they are forked by Nuwa, and shutdown TaskTracer at content processes quit themselves.

This is because if a content process is forked by Nuwa, it carries over the TraceInfo data stored in the TLS of Nuwa, which we actually need a clean new TraceInfos.

2. Use LinkList to store TraceInfo of threads instead of nsHashTable. LinkList is easier on enumeration, and we only look up this TraceInfos at deletion.

3. Other follow-up issues are left in comments, thank you :)
Attachment #8415588 - Attachment is obsolete: true
Attachment #8454383 - Attachment is obsolete: true
Attachment #8472049 - Attachment is obsolete: true
Attachment #8486290 - Flags: review?(khuey)
Attachment #8455856 - Attachment is obsolete: true
Hi BenWa, I was trying to eliminating the output of profiler data since the file size grows up fast as time goes by, also it buy us a lot of time on saving out log files (profile.sh needs some small alternation for sure).

The reason of placing |StreamTaskTracer(b)| inside the original scope of |b.BeginObject(); ... b.EndObject();| is that it makes it easy on turning over the above idea.
Attachment #8486352 - Attachment is obsolete: true
Attachment #8486945 - Flags: review?(bgirard)
Comment on attachment 8486945 [details] [diff] [review]
Part2: Integrate with GeckoProfiler

Review of attachment 8486945 [details] [diff] [review]:
-----------------------------------------------------------------

::: tools/profiler/TableTicker.cpp
@@ +307,5 @@
> +    if (TaskTracer()) {
> +      b.Name("tasktracer");
> +      StreamTaskTracer(b);
> +
> +      // For now, outputting data of GeckoProfiler is skipped if feature

I don't think we should do that.

Can you elaborate more on the problem you're trying to solve? Are we running out of memory? Is the profile save taking too much time? The overhead from the normal profile saving is very small. Saving the profiler data shouldn't make it worse. Perhaps we can find a better solution here.

If you want to use less data then I suggest using the following profiler settings:
./profile.sh -i 50 -e 10000 <other options>. This will store less samples and make them 50ms apart. You'll get less data.

::: tools/profiler/platform.cpp
@@ +669,5 @@
>      // Add main thread I/O to the profile
>      "mainthreadio",
>      // Add RSS collection
>      "memory",
> +    // Toggle the state of the task tracer external data provider

This should be MOZ_WIDGET_GONK

@@ +884,5 @@
>      return false;
>    }
>  
>  #if defined(MOZ_WIDGET_GONK) && !defined(MOZ_PROFILING)
> +#ifndef MOZ_TASK_TRACER

Why is this needed?
Comment on attachment 8486945 [details] [diff] [review]
Part2: Integrate with GeckoProfiler

Review of attachment 8486945 [details] [diff] [review]:
-----------------------------------------------------------------

::: tools/profiler/TableTicker.cpp
@@ +307,5 @@
> +    if (TaskTracer()) {
> +      b.Name("tasktracer");
> +      StreamTaskTracer(b);
> +
> +      // For now, outputting data of GeckoProfiler is skipped if feature

I didn't see any oom problem, just that for example, the size of result file of b2g process, with and without profiler data are 3MB and 163kB.

But, I think that's because I was trying to register sampler on every threads (please see my other comments), otherwise the size of result files and time of saving are quite small.

::: tools/profiler/platform.cpp
@@ +884,5 @@
>      return false;
>    }
>  
>  #if defined(MOZ_WIDGET_GONK) && !defined(MOZ_PROFILING)
> +#ifndef MOZ_TASK_TRACER

(I finally recall what this is for)
Profiler has a better solution on monitoring threads, and all we want is a map of thread ids and thread names. Without bypassing the check below, I can only fetch 3 results(GeckoMain, Storage I/O, BufMgrParent#4797) from |sRegisteredThreads|. But I think this is why the files are big. Any suggestions on doing it lightly?
(In reply to Shelly Lin [:shelly] from comment #16)
> Comment on attachment 8486945 [details] [diff] [review]
> Part2: Integrate with GeckoProfiler
> 
> Review of attachment 8486945 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: tools/profiler/TableTicker.cpp
> @@ +307,5 @@
> > +    if (TaskTracer()) {
> > +      b.Name("tasktracer");
> > +      StreamTaskTracer(b);
> > +
> > +      // For now, outputting data of GeckoProfiler is skipped if feature
> 
> I didn't see any oom problem, just that for example, the size of result file
> of b2g process, with and without profiler data are 3MB and 163kB.
> 
> But, I think that's because I was trying to register sampler on every
> threads (please see my other comments), otherwise the size of result files
> and time of saving are quite small.

is 3MB really too big? It's never been too big for us. There's useful data in there. We can show combined profiler and Isis data.

> 
> ::: tools/profiler/platform.cpp
> @@ +884,5 @@
> >      return false;
> >    }
> >  
> >  #if defined(MOZ_WIDGET_GONK) && !defined(MOZ_PROFILING)
> > +#ifndef MOZ_TASK_TRACER
> 
> (I finally recall what this is for)
> Profiler has a better solution on monitoring threads, and all we want is a
> map of thread ids and thread names. Without bypassing the check below, I can
> only fetch 3 results(GeckoMain, Storage I/O, BufMgrParent#4797) from
> |sRegisteredThreads|. But I think this is why the files are big. Any
> suggestions on doing it lightly?

Can you just use Isis with a profiling build? This is the default for engineering builds now.

Otherwise you can restart b2g with the profiling option so that profiler_active() is true. Use MOZ_PROFILER_STARTUP=1
Attachment #8486945 - Attachment is obsolete: true
Attachment #8486945 - Flags: review?(bgirard)
Attachment #8488426 - Flags: review?(bgirard)
Attachment #8488429 - Flags: review?(bgirard)
(In reply to Benoit Girard (:BenWa) from comment #17)
> (In reply to Shelly Lin [:shelly] from comment #16)

> 
> Can you just use Isis with a profiling build? This is the default for
> engineering builds now.
> 
Sure, this is even better.
Comment on attachment 8488426 [details] [diff] [review]
Part2: Integrate with GeckoProfiler (v2)

Review of attachment 8488426 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good. Thanks for doing this :D

::: tools/profiler/TableTicker.h
@@ +156,5 @@
>      mSaveRequested = true;
> +#ifdef MOZ_TASK_TRACER
> +    if (mTaskTracer) {
> +      // Save when we stop logging TaskTracer data.
> +      mSaveRequested = !(mozilla::tasktracer::ToggleLogging());

This isn't great. Can't we just save the data we've collected so far without stopping TaskTracer? If not that's fine but it's an unfortunate limitation but we can live without it.
Attachment #8488426 - Flags: review?(bgirard) → review+
Assignee: nobody → slin
Comment on attachment 8488429 [details] [diff] [review]
Part3: Enable TaskTracer with MOZ_PROFILING

Review of attachment 8488429 [details] [diff] [review]:
-----------------------------------------------------------------

Maybe instead you should modify gonk-misc/default-gecko-config and where --enable-profiling is turned on turn on tasktracer there.
Comment on attachment 8488429 [details] [diff] [review]
Part3: Enable TaskTracer with MOZ_PROFILING

re ping me for review if you don't agree with modifying gonk-misc
Attachment #8488429 - Flags: review?(bgirard)
Blocks: 1074122
Blocks: 1064672
Comment on attachment 8486290 [details] [diff] [review]
Part1: Bugs fix and write TaskTracer logs into string buffers

Review of attachment 8486290 [details] [diff] [review]:
-----------------------------------------------------------------

I'm concerned about the ordering after forking from Nuwa.  Does something guarantee that InitOnContentProcessCreated runs before other threads start running again?  If not, what stops us from throwing away data from this process when we re-initialize things?  What happens if another thread tries to get something out TLS when we are in the middle of destroying and recreating the TLS variable?

Changing in the hashtable for a linked list means we're moving from O(1) to O(n) performance.  Are we ok with that?

::: dom/ipc/ContentChild.cpp
@@ +519,5 @@
> +    using namespace mozilla::tasktracer;
> +    if (aAfterNuwaFork) {
> +      InitTaskTracer(FORKED_AFTER_NUWA);
> +    }
> +#endif

This needs to be rebased over bug 1064890.

@@ +1556,5 @@
>  ContentChild::QuickExit()
>  {
> +#ifdef MOZ_TASK_TRACER
> +    mozilla::tasktracer::ShutdownTaskTracer();
> +#endif

We don't always QuickExit().  In debug builds we go through the normal shutdown process.

::: tools/profiler/GeckoTaskTracer.cpp
@@ +17,5 @@
> +#ifdef MOZILLA_INTERNAL_API
> +#include "nsString.h"
> +#else
> +#include "nsStringAPI.h"
> +#endif

Uh, you shouldn't need both of these.  Either this is in libxul or it's not, unless I'm missing something.

@@ +36,5 @@
>  namespace tasktracer {
>  
> +static mozilla::ThreadLocal<TraceInfo*>* sTraceInfoTLS = nullptr;
> +static mozilla::StaticMutex sMutex;
> +static mozilla::LinkedList<TraceInfo> sTraceInfos;

Have I ever told you that linked lists are my least favorite data structure? :P

@@ +156,4 @@
>  
> +  if (sTraceInfoTLS) {
> +    return;
> +  }

Is this ever possible?  Can we assert this instead of early returning?
Attachment #8486290 - Flags: review?(khuey) → review-
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #24)
> Comment on attachment 8486290 [details] [diff] [review]
> Part1: Bugs fix and write TaskTracer logs into string buffers
> 
> Review of attachment 8486290 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I'm concerned about the ordering after forking from Nuwa.  Does something
> guarantee that InitOnContentProcessCreated runs before other threads start
> running again?  If not, what stops us from throwing away data from this
> process when we re-initialize things?  What happens if another thread tries
> to get something out TLS when we are in the middle of destroying and
> recreating the TLS variable?
> 
1. The answer is no, but I don't think we care about those throw-away data.
2. As for the racing problem with deleting TLS...hmmm....we don't have a solution yet, and are trying to work that out.

> Changing in the hashtable for a linked list means we're moving from O(1) to
> O(n) performance.  Are we ok with that?
> 
I think it's okay to go O(n), we're not doing any search operation, but iteration is needed (to process signal interrupt), and yes hashtable also does iteration, it's just easier to implement with a list structure, I'm also fine with using Array type.
 
> @@ +1556,5 @@
> >  ContentChild::QuickExit()
> >  {
> > +#ifdef MOZ_TASK_TRACER
> > +    mozilla::tasktracer::ShutdownTaskTracer();
> > +#endif
> 
> We don't always QuickExit().  In debug builds we go through the normal
> shutdown process.
> 
So maybe the place where we do normal shutdown as well?

> @@ +36,5 @@
> >  namespace tasktracer {
> >  
> > +static mozilla::ThreadLocal<TraceInfo*>* sTraceInfoTLS = nullptr;
> > +static mozilla::StaticMutex sMutex;
> > +static mozilla::LinkedList<TraceInfo> sTraceInfos;
> 
> Have I ever told you that linked lists are my least favorite data structure?
> :P
> 
Why lol?

> @@ +156,4 @@
> >  
> > +  if (sTraceInfoTLS) {
> > +    return;
> > +  }
> 
> Is this ever possible?  Can we assert this instead of early returning?
Sure thing.
(In reply to Benoit Girard (:BenWa) (PTO Until Oct. 6) from comment #23)
> Comment on attachment 8488429 [details] [diff] [review]
> Part3: Enable TaskTracer with MOZ_PROFILING
> 
> re ping me for review if you don't agree with modifying gonk-misc

No problem.
Although tasktracer relies on the feature of profiling, but I think I can live with that.
Attachment #8486290 - Attachment is obsolete: true
Attachment #8488426 - Attachment is obsolete: true
Attachment #8488429 - Attachment is obsolete: true
Attachment #8498693 - Attachment is obsolete: true
Comment on attachment 8498736 [details] [diff] [review]
Part1: Bugs fix and write TaskTracer logs into string buffers (v2)

Review of attachment 8498736 [details] [diff] [review]:
-----------------------------------------------------------------

::: tools/profiler/GeckoTaskTracer.cpp
@@ +248,5 @@
>              SourceEventType aSourceEventType)
>  {
> +  TraceInfo* info = GetOrCreateTraceInfo();
> +  if (!(info && info->mStartLogging)) {
> +    return;

No need to log the warning message here.
Comment on attachment 8498737 [details] [diff] [review]
Part2: Integrate with GeckoProfiler (v3)

Review of attachment 8498737 [details] [diff] [review]:
-----------------------------------------------------------------

Instead of "toggling" tasktracer with |./profile.sh capture|, we've changed the behaviour to "Save out log files at every capture, the recording period of data is from the time of last capture till the time of current capture", for the first capture, it would be "from the time of start till the time of first capture".

For example:
(t1) ./profile.sh start -f tasktracer
(t2) ./profile.sh capture
-> will generate files with period from t1 to t2
(t3) ./profile.sh capture
-> will generate files with period from t2 to t3
(t4) ./profile.sh stop
-> will discard the rest data

Other changes are pretty much the same.
Attachment #8498737 - Flags: review?(bgirard)
(In reply to Shelly Lin [:shelly] from comment #25)
> (In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #24)
> > Comment on attachment 8486290 [details] [diff] [review]
> > Part1: Bugs fix and write TaskTracer logs into string buffers
> > 
> > Review of attachment 8486290 [details] [diff] [review]:
> > -----------------------------------------------------------------
> > 
> > I'm concerned about the ordering after forking from Nuwa.  Does something
> > guarantee that InitOnContentProcessCreated runs before other threads start
> > running again?  If not, what stops us from throwing away data from this
> > process when we re-initialize things?  What happens if another thread tries
> > to get something out TLS when we are in the middle of destroying and
> > recreating the TLS variable?
> > 
> 1. The answer is no, but I don't think we care about those throw-away data.
> 2. As for the racing problem with deleting TLS...hmmm....we don't have a
> solution yet, and are trying to work that out.
> 

We don't need to reinit TaskTracer in ContentChild.cpp::InitOnContentProcessCreated(). We can do it using NuwaAddConstructor(), which runs right after fork and before recreating threads.
Comment on attachment 8498737 [details] [diff] [review]
Part2: Integrate with GeckoProfiler (v3)

Review of attachment 8498737 [details] [diff] [review]:
-----------------------------------------------------------------

::: tools/profiler/TableTicker.cpp
@@ +123,5 @@
> +    b.EndArray();
> +
> +    b.Name("threads");
> +    b.BeginArray();
> +      SetPaused(true);

No need to pause here since we're not reading the circular buffer.
Attachment #8498737 - Flags: review?(bgirard) → review+
Attached patch Addition to part 1 (obsolete) — Splinter Review
Addition to part 1: reinitialization of TaskTracer is called after fork and before recreating threads. This addresses the race and possibly crashes when some thread is using the buffer and we are reinitializing it.
Hey Kyle, please find Cervantes's comments about the problem of re-init tasktracer in comment 32 and comment 34, and this patch has also addressed issues per last review comment.
Attachment #8498736 - Attachment is obsolete: true
Attachment #8498737 - Attachment is obsolete: true
Attachment #8499469 - Attachment is obsolete: true
Attachment #8500295 - Flags: review?(khuey)
Remove |SetPaused(true)| and |SetPaused(false)| per comment 33.

Carry r+ from BenWa.
Attachment #8500297 - Flags: review+
Fix a release-on-null pointer issue in |FreeTraceInfo()|.
Attachment #8500295 - Attachment is obsolete: true
Attachment #8500295 - Flags: review?(khuey)
Attachment #8503979 - Flags: review?(khuey)
Comment on attachment 8503979 [details] [diff] [review]
Part1: Bugs fix and write TaskTracer logs into string buffers (v3)

Review of attachment 8503979 [details] [diff] [review]:
-----------------------------------------------------------------

::: tools/profiler/GeckoTaskTracer.cpp
@@ +50,5 @@
>  
>  static bool
>  IsInitialized()
>  {
> +  return (sTraceInfoTLS ? sTraceInfoTLS->initialized() : false);

You don't need parentheses here.

@@ +147,5 @@
> +static bool
> +IsStartLogging(TraceInfo* aInfo)
> +{
> +  StaticMutexAutoLock lock(sMutex);
> +  return (aInfo ? aInfo->mStartLogging : false);

or here
Attachment #8503979 - Flags: review?(khuey) → review+
With nit fixed and carry r+ from Kyle.
Attachment #8500297 - Attachment is obsolete: true
Attachment #8503979 - Attachment is obsolete: true
Attachment #8504463 - Flags: review+
Rebase and carry r+ from BenWa.
Attachment #8504464 - Flags: review+
Keywords: checkin-needed
Are there docs on how to use Task Tracer with Cleopatra?  Or is more work still needed?

The Task Tracer wiki seems to only describe Isis for now.
Flags: needinfo?(slin)
Our front-end tool Isis is still independent to Cleopatra, but the back-end TaskTracer is very close to integrate into GeckoProfiler, once Bug 1084226 is checked-in, we are able to use ./profile.sh to control TaskTracer, and hopefully Bug 956620 will integrate needed tools and steps into ./profile.sh as well.
Flags: needinfo?(slin)
Summary: [B2G] Integrate Task tracer&Isis with GeckoProfiler&Cleopatra → [B2G] Integrate Task tracer&Nephthys with GeckoProfiler&Cleopatra
Remove the part of integrating Nephthys with Cleopatra from bug title to avoid confusion.
Summary: [B2G] Integrate Task tracer&Nephthys with GeckoProfiler&Cleopatra → [B2G] Integrate Task tracer with GeckoProfiler
You need to log in before you can comment on or make changes to this bug.