Label users of pipes (nsPipeInputStream)

NEW
Assigned to

Status

()

Core
XPCOM
P3
normal
6 months ago
2 months ago

People

(Reporter: billm, Assigned: bevis)

Tracking

(Blocks: 1 bug)

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment, 15 obsolete attachments)

34.59 KB, application/zip
Details
(Reporter)

Description

6 months ago
nsPipeInputStream::AsyncWait is in the top 5 unlabeled runnables. We need to figure out where it's mainly used from an label those places.

Comment 1

6 months ago
Are you suggesting changing nsIAsyncInputStream API to support labeling?  I'm concerned with getting this to scale since we do lots of streams stuff all over, including using helpers like NS_AsyncCopy(), etc.  I guess I'd like to see the patch.
(Reporter)

Comment 2

6 months ago
First I'd like to understand what the most common uses are (by frequency with which they occur at runtime).

Comment 3

6 months ago
Our IPCStream stuff probably uses it a lot now that we have it integrated in necko:

http://searchfox.org/mozilla-central/source/ipc/glue/IPCStreamSource.cpp#279

Comment 4

6 months ago
We're also about to start using it with ReadableStream.
(Assignee)

Updated

6 months ago
See Also: → bug 1359706
(Assignee)

Comment 5

6 months ago
API change makes not much sense just for the naming things.
I'll print some logs on all the callsite of nsIAsyncInputStream::Wait() locally first instead to figure out who is the high frequent one on the main thread of the content process.
Assignee: nobody → btseng
(Assignee)

Comment 6

5 months ago
Created attachment 8882324 [details] [diff] [review]
Patch for Logging
(Assignee)

Comment 7

5 months ago
Created attachment 8882328 [details]
Logs for analysis
(Assignee)

Comment 8

5 months ago
After further breaking down to the log in comment 7(FB, GDoc, YouTube, NYTimes, Wiki, Yahoo are visited), 
1. All the unlabeled nsIAsyncInputStream::Wait() requests are found in nsPipeInputStream. 
   - Search the keywords of "could be unlabeled", you'll see that nsInputStreamPump::EnsureWaiting() is the entry point to call
     nsIAsyncInputStream::Wait() without nsIEventTarget or with nsIEventTarget which is not a SchedulerEventTarget instance. 
     This is fine if the eventTarget is actually aother background thread.
2. nsBaseChannel::BeginPumpingData() and mozilla::dom::workers::serviceWorkerScriptCache::(anonymous namespace)::CompareCache::ManageValueResult() are the 2 entry points to call nsInputStreamPump::AsyncRead without valid SchedulerEventTarget which could be when calling NS_NewInputStreamPump:
   - Search the keywords of "has no mLabeledMainThreadTarget" you can see only these 2 entry points printed in the attached log.
   - The frequency is pretty high in nsBaseChannel::BeginPumpingData() when loading web page in a new tab even though the 
     labeling was done in nsContentUtils::GetEventTargetByLoadInfo() which tries to return a SchedulerEventTarget when possible.
     However, we failed to get the nsGlobalWindow with the outerWindowId retrieved from nsILoadInfo.
   - The serviceWorkerScriptCache is obvious not labeled because the nsIEventTarget is not specified when calling NS_NewInputStreamPump().

:billm, is it reasonable that we failed to get the nsGlobalWindow from nsContentUtils::GetEventTargetByLoadInfo() in this frequency especially when loading from a new tab?

:bkelly, nsInputStreamPump::AsyncRead() in CompareCache::ManageValueResult() is called on the main thread. Do you have any suggestion to associate this nsInputStreamPump to a SchedulerEventTarget of the window?

BTW, you could check the change in nsDebugImpl.cpp and nsInputStreamPump::AsyncRead() and the log to know how I narrow down the unlabeled caller to nsBaseChannel::BeginPumpingData() and CompareCache::ManageValueResult().
Flags: needinfo?(wmccloskey)
Flags: needinfo?(bkelly)
(Assignee)

Comment 9

5 months ago
(In reply to Bevis Tseng [:bevis][:btseng] from comment #8)
> BTW, you could check the change in nsDebugImpl.cpp and
> nsInputStreamPump::AsyncRead() and the log to know how I narrow down the
> unlabeled caller to nsBaseChannel::BeginPumpingData() and
> CompareCache::ManageValueResult().
1. Add logs to all callers and callees of nsIAsyncInputStream::Wait() and find out that nsInputStreamPump::EnsureWaiting() is the only one that could be unlabeled.
2. Add logs to all the callers and calles of nsInputStreamPump::AsyncRead() to figure out which caller is not labeled.

Comment 10

5 months ago
(In reply to Bevis Tseng [:bevis][:btseng] from comment #8)
> :bkelly, nsInputStreamPump::AsyncRead() in CompareCache::ManageValueResult()
> is called on the main thread. Do you have any suggestion to associate this
> nsInputStreamPump to a SchedulerEventTarget of the window?

This code is explicitly not associated with a window.  It should be in system group.
Flags: needinfo?(bkelly)
(Reporter)

Comment 11

5 months ago
> :billm, is it reasonable that we failed to get the nsGlobalWindow from nsContentUtils::GetEventTargetByLoadInfo() in this
> frequency especially when loading from a new tab?

I don't know. It looks like the printing you added for GetEventTargetByLoadInfo doesn't check whether we're in the content process, so some of the logging is probably coming from chrome stuff that we don't care about.

Anyway, it seems like passing SystemGroup to NS_NewInputStreamPump in CompareCache::ManageValueResult would be a good start here. Once that's done, we can see if telemetry shows any improvement.
Flags: needinfo?(wmccloskey)
(Assignee)

Comment 12

5 months ago
(In reply to Bill McCloskey (:billm) from comment #11)
> > :billm, is it reasonable that we failed to get the nsGlobalWindow from nsContentUtils::GetEventTargetByLoadInfo() in this
> > frequency especially when loading from a new tab?
> 
> I don't know. It looks like the printing you added for
> GetEventTargetByLoadInfo doesn't check whether we're in the content process,
> so some of the logging is probably coming from chrome stuff that we don't
> care about.
Oh, but the unlabeled used of nsBaseChannel::BeginPumpingData() accompanied with invalid eventTarget from nsContentUtils::GetEventTargetByLoadInfo() is relatively higher than the unlabeled NS_NewInputStreamPump() in CompareCache::ManageValueResult()
(Reporter)

Comment 13

5 months ago
I wasn't able to reproduce the calls to BeginPumpingData. Are there particular pages that trigger it? I also wonder if it has to do with extensions you have installed. It's possible that content scripts could be triggering that code. Did you have any extensions installed when you tested?
(Assignee)

Comment 14

5 months ago
(In reply to Bill McCloskey (:billm) from comment #13)
> I wasn't able to reproduce the calls to BeginPumpingData. Are there
> particular pages that trigger it? I also wonder if it has to do with
> extensions you have installed. It's possible that content scripts could be
> triggering that code. Did you have any extensions installed when you tested?

No, there is no extension installed.
Besides, I saw nsBaseChannel::BeginPumpingData() unlabeled in 2 scencarios:
1. Type any url and load manually in a new tab. (I've tried FB, GDoc, YouTube, NYTimes, Wiki and Yahoo.)
2. Right click a link -> Open in new Tab.

The unlabeled CompareCache::ManageValueResult() can be found when watching video in Youtube.

1. Load Youtube in 1st Tab:
AsyncRead from nsresult nsBaseChannel::BeginPumpingData(): 283 has no mLabeledMainThreadTarget for AsyncRead.
AsyncRead from nsresult nsBaseChannel::BeginPumpingData(): 283 has no mLabeledMainThreadTarget for AsyncRead.
AsyncRead from void mozilla::dom::workers::serviceWorkerScriptCache::(anonymous namespace)::CompareCache::ManageValueResult(JSContext *, JS::Handle<JS::Value>): 1152 has no mLabeledMainThreadTarget for AsyncRead.
AsyncRead from void mozilla::dom::workers::serviceWorkerScriptCache::(anonymous namespace)::CompareCache::ManageValueResult(JSContext *, JS::Handle<JS::Value>): 1152 has no mLabeledMainThreadTarget for AsyncRead.

2. Open a new Tab to load Wikipedia:
AsyncRead from nsresult nsBaseChannel::BeginPumpingData(): 283 has no mLabeledMainThreadTarget for AsyncRead.
AsyncRead from nsresult nsBaseChannel::BeginPumpingData(): 283 has no mLabeledMainThreadTarget for AsyncRead.
AsyncRead from nsresult nsBaseChannel::BeginPumpingData(): 283 has no mLabeledMainThreadTarget for AsyncRead.

3. Right click a video in YouTube and select Open in a new Tab:
AsyncRead from nsresult nsBaseChannel::BeginPumpingData(): 283 has no mLabeledMainThreadTarget for AsyncRead.
AsyncRead from nsresult nsBaseChannel::BeginPumpingData(): 283 has no mLabeledMainThreadTarget for AsyncRead.
AsyncRead from nsresult nsBaseChannel::BeginPumpingData(): 283 has no mLabeledMainThreadTarget for AsyncRead.
AsyncRead from void mozilla::dom::workers::serviceWorkerScriptCache::(anonymous namespace)::CompareCache::ManageValueResult(JSContext *, JS::Handle<JS::Value>): 1152 has no mLabeledMainThreadTarget for AsyncRead.
AsyncRead from void mozilla::dom::workers::serviceWorkerScriptCache::(anonymous namespace)::CompareCache::ManageValueResult(JSContext *, JS::Handle<JS::Value>): 1152 has no mLabeledMainThreadTarget for AsyncRead.
(Assignee)

Comment 15

5 months ago
Created attachment 8882435 [details] [diff] [review]
Patch for Logging

Print error in nsContentUtils::GetEventTargetByLoadInfo only if we are on the main thread of a content process.
(Assignee)

Comment 16

5 months ago
BTW, my patch is based on the revision of https://hg.mozilla.org/mozilla-central/rev/9af23c413a1f
(Reporter)

Comment 17

5 months ago
(In reply to Bevis Tseng [:bevis][:btseng] from comment #14)
> Besides, I saw nsBaseChannel::BeginPumpingData() unlabeled in 2 scencarios:
> 1. Type any url and load manually in a new tab. (I've tried FB, GDoc,
> YouTube, NYTimes, Wiki and Yahoo.)
> 2. Right click a link -> Open in new Tab.

I added a printf here:
http://searchfox.org/mozilla-central/rev/17ebac68112bd635b458e831670c1e506ebc17ad/netwerk/base/nsInputStreamPump.cpp#386
like you have in your patch. It fired twice on content process startup, when it loaded these two images:
http://searchfox.org/mozilla-central/rev/17ebac68112bd635b458e831670c1e506ebc17ad/layout/generic/nsImageFrame.cpp#2295-2296

I couldn't get it to fire anymore after that. Can you try setting your content process count to 1 and then see if you still see this a lot? It's possible you're seeing a lot of content processes starting, which hopefully is not very common in the real world.


> The unlabeled CompareCache::ManageValueResult() can be found when watching
> video in Youtube.

Yeah, I can reproduce that one. We should label fix it.
(Assignee)

Comment 18

5 months ago
(In reply to Bill McCloskey (:billm) from comment #17)
> I added a printf here:
> http://searchfox.org/mozilla-central/rev/
> 17ebac68112bd635b458e831670c1e506ebc17ad/netwerk/base/nsInputStreamPump.
> cpp#386
> like you have in your patch. It fired twice on content process startup, when
> it loaded these two images:
> http://searchfox.org/mozilla-central/rev/
> 17ebac68112bd635b458e831670c1e506ebc17ad/layout/generic/nsImageFrame.
> cpp#2295-2296
> 
> I couldn't get it to fire anymore after that. Can you try setting your
> content process count to 1 and then see if you still see this a lot? It's
> possible you're seeing a lot of content processes starting, which hopefully
> is not very common in the real world.
Thanks for pointing out this. The unlabeled BeginPumpingData() is related to the number of content processes and these 2 images to be loaded!
> 
> > The unlabeled CompareCache::ManageValueResult() can be found when watching
> > video in Youtube.
> 
> Yeah, I can reproduce that one. We should label fix it.

Sure, bug 1377275 has been filed to address this.

I'll keep using this build in these days to see if there is any other unlabeled nsPipeInputStream::AsyncWait. :-)
(Assignee)

Updated

5 months ago
Attachment #8882324 - Attachment is obsolete: true
(Assignee)

Comment 19

5 months ago
There are several call sites of NS_NewInputStreamPump():
http://searchfox.org/mozilla-central/search?q=NS_NewInputStreamPump&case=false&regexp=false&path=
1. Unlabeled:
   - (anonymous namespace)::CacheScriptLoader::ResolvedCallback in dom/workers/ScriptLoader.cpp
   - nsJARChannel::OpenLocalFile(), nsJARChannel::OnDownloadComplete
   - nsZipWriter::BeginProcessingAddition()
   - nsZipWriter::BeginProcessingRemoval()
   - mozilla::net::ExtensionStreamGetter::OnStream() in netwerk/protocol/res/ExtensionProtocolHandler.cpp
   - nsWyciwygChannel::ReadFromCache()
2. Labeled:
   - serviceWorkerScriptCache::(anonymous namespace)::CompareCache::ManageValueResult() in bug 1377275.
   - FetchBodyConsumer<Derived>::BeginConsumeBodyMainThread() in bug 1363318.

I haven't hit the remained unlabeled ones so far.
Not sure in which scenario will these call sites be triggered.
(Assignee)

Updated

5 months ago
Depends on: 1377275
(Assignee)

Updated

5 months ago
Depends on: 1377484
(Assignee)

Comment 20

5 months ago
I've seen 
"AsyncWait called from nsresult mozilla::dom::FileReader::DoAsyncWait(): 272 could be unlabeled"
several times in today's trial.

It's a little bit surprising me because the FileReader::mTarget for the ::AsyncWait() at[1] is supposed to be a SchedulerEventTarget retrieved at [2] when FileReader is created on the main thread.

I'll do more investigation to see if this is a false alarm.

[1] http://searchfox.org/mozilla-central/rev/dde5c480358718607cc40d752656c968a0e6eabd/dom/file/FileReader.cpp#274
[2] http://searchfox.org/mozilla-central/rev/dde5c480358718607cc40d752656c968a0e6eabd/dom/file/FileReader.cpp#120
(Assignee)

Comment 21

5 months ago
(In reply to Bevis Tseng [:bevis][:btseng] from comment #19)
>    - mozilla::net::ExtensionStreamGetter::OnStream() in netwerk/protocol/res/ExtensionProtocolHandler.cpp
This has been addressed in bug 1377544.
>    - nsWyciwygChannel::ReadFromCache()
WyciwygChannel is run in parent process. So we can skip this.
(Assignee)

Updated

5 months ago
Depends on: 1377544
(Assignee)

Comment 22

5 months ago
(In reply to Bevis Tseng [:bevis][:btseng] from comment #19)
> There are several call sites of NS_NewInputStreamPump():
> http://searchfox.org/mozilla-central/
> search?q=NS_NewInputStreamPump&case=false&regexp=false&path=
> 1. Unlabeled:
>    - nsJARChannel::OpenLocalFile(), nsJARChannel::OnDownloadComplete
>    - nsZipWriter::BeginProcessingAddition()
>    - nsZipWriter::BeginProcessingRemoval()

Hi, :aklotz

We are trying to specify proper nsIEventTarget for the use of NS_NewInputStreamPump() on the main thread of content processes to get more benefit from Quantum-DOM scheduling. [1]
There are several unlabeled use cases in modules/libjar as mentioned above.

Can you help to check if this are the candidates that need to be labeled? [2]
(If yes, it would be great if someone familiar with modules/libjar could help us or give us suggestion to label it.)

Thanks!

[1] https://hacks.mozilla.org/2017/06/an-inside-look-at-quantum-dom-scheduling/
[2] https://wiki.mozilla.org/Quantum/DOM#Q3:_What_runnables_shall_be_labeled.3F
Flags: needinfo?(aklotz)
(Assignee)

Comment 23

5 months ago
(In reply to Bevis Tseng [:bevis][:btseng] from comment #20)
> I've seen 
> "AsyncWait called from nsresult mozilla::dom::FileReader::DoAsyncWait(): 272
> could be unlabeled"
> several times in today's trial.
> 
> It's a little bit surprising me because the FileReader::mTarget for the
> ::AsyncWait() at[1] is supposed to be a SchedulerEventTarget retrieved at
> [2] when FileReader is created on the main thread.
> 
> I'll do more investigation to see if this is a false alarm.
> 
> [1]
> http://searchfox.org/mozilla-central/rev/
> dde5c480358718607cc40d752656c968a0e6eabd/dom/file/FileReader.cpp#274
> [2]
> http://searchfox.org/mozilla-central/rev/
> dde5c480358718607cc40d752656c968a0e6eabd/dom/file/FileReader.cpp#120

After further breaking down, I found that the unlabeled use case of the ::AsyncWait() in FileReader::DoAsyncWait() was triggered from backgroundPageThumbsContent.js instead of a web content(The web content one is fine after testing, btw):
http://searchfox.org/mozilla-central/rev/a3a739de04ee6134c11546568a33dbb6a6a29907/toolkit/components/thumbnails/content/backgroundPageThumbsContent.js#165,177

As explained in comment 20, the FileReader::mTarget is supposed to be a SchedulerEventTarget.
However, when creating FileReader from backgroundPageThumbsContent.js, because the pass-in nsIGlobalObject is not a nsGlobalWindow, it falls back to the one in DispatcherTrait::EventTargetFor() when calling nsIGlobalObject::EventTargetFor().

I haven't figured out the STR of this use case yet but it's the only unlabeled ::AsyncWait() call I have seen so far after bug 1377275 and bug 1377484 are fixed and the nsBaseChannel::BeginPumpingData() is clarified.

:billm, do you have any suggestion to label this use case?
Flags: needinfo?(wmccloskey)
(In reply to Bevis Tseng [:bevis][:btseng] from comment #22)
> Can you help to check if this are the candidates that need to be labeled? [2]
> (If yes, it would be great if someone familiar with modules/libjar could
> help us or give us suggestion to label it.)

Unfortunately the original authors of libjar have left Mozilla, so we have lost a lot of expertise. As a peer I do libjar reviews but I do not admit to being very knowledgeable about all aspects of libjar's code.

Sorry but I don't think that I know enough about the code in question to have anything to add. It might be a good idea to check with the Necko people to see if any of them have any advice.

(I am happy to review any libjar patches, however!)
Flags: needinfo?(aklotz)
(Assignee)

Comment 25

5 months ago
(In reply to Aaron Klotz [:aklotz] (a11y work receiving priority right now, please send interceptor reviews to dmajor or handyman) from comment #24)
> (In reply to Bevis Tseng [:bevis][:btseng] from comment #22)
> > Can you help to check if this are the candidates that need to be labeled? [2]
> > (If yes, it would be great if someone familiar with modules/libjar could
> > help us or give us suggestion to label it.)
> 
> Unfortunately the original authors of libjar have left Mozilla, so we have
> lost a lot of expertise. As a peer I do libjar reviews but I do not admit to
> being very knowledgeable about all aspects of libjar's code.
> 
> Sorry but I don't think that I know enough about the code in question to
> have anything to add. It might be a good idea to check with the Necko people
> to see if any of them have any advice.
> 
> (I am happy to review any libjar patches, however!)

No problem! Thanks for your feedback! :)

BTW, after further study, I found that we don't have to care about the use of nsJarChannel in content process because it has been disabled after FF55:
https://developer.mozilla.org/en-US/docs/Mozilla/Security/Security_and_the_jar_protocol

I'll keep reviewing the use of nsZipWriter to see if labeling on it is necessary.
(Assignee)

Comment 26

5 months ago
(In reply to Bevis Tseng [:bevis][:btseng] from comment #19)
> 1. Unlabeled:
>    - nsZipWriter::BeginProcessingAddition()
>    - nsZipWriter::BeginProcessingRemoval()
The use case of ZipWriter can only be found in the WebIDE of devtools which is only used for the App developmement in Firefox OS. Hence, we could ingore this one for now. :)
http://searchfox.org/mozilla-central/source/devtools/shared/apps/app-actor-front.js#82-93
(Assignee)

Comment 27

5 months ago
(In reply to Bevis Tseng [:bevis][:btseng] from comment #23)
> I haven't figured out the STR of this use case yet but it's the only
> unlabeled ::AsyncWait() call I have seen so far after bug 1377275 and bug
> 1377484 are fixed and the nsBaseChannel::BeginPumpingData() is clarified.
> 
> :billm, do you have any suggestion to label this use case?

After further investigation, I found that the nsIGlboalObject is a TabChildGlobal when creating FileReader from backgroundPageThumbsContent.js.
Hence, we could override the methods defined in DispatcherTrait class using TabChildGlobal::mTabChild::TabGroup().
(Maybe we could move the declaration of ::TabGroup() from TabChild to TabChildBase for the access from TabChildGlobal.)
(Assignee)

Updated

5 months ago
Depends on: 1378298
(Assignee)

Comment 28

5 months ago
The problem has been addressed in bug 1378298. Cancel the NI flag.
Flags: needinfo?(wmccloskey)
(Assignee)

Comment 29

5 months ago
The labeling of nsPipeInputStream looks fine in today's trial after bug 1378298 is fixed.
There is only one false alarm found in nsInputStreamPump::EnsureWaiting() because the nsInputStreamPump::mTargetThread could be retargeted by nsInputStreamPump::RetargetDeliveryTo() with nsStreamTransportService specified which is fine.

Hope the labeling of this runnable could be improved significantly in next telemetry result after bug 1378298 is landed.
(Assignee)

Comment 30

5 months ago
Unfortunately, we didn't label too much use case of nsPipeInputStream::AsyncWait according to the telemetry analysis where(appBuildId=lambda x: x.startswith('20170709') or x.startswith('20170710').where(appUpdateChannel="nightly"):
1. (u'nsPipeInputStream::AsyncWait', 1486),
2. (u'nsPipeInputStream::AsyncWait(labeled)', 19),

This indicates that my daily trial didn't cover most use cases of nsPipeInputStream::AsyncWait in nightly build. :\

BTW, the unlabeled runnables higher than nsPipeInputStream::Async are:
 (u'PVsync::Msg_Notify', 7226),
 (u'IdleRunnable', 5457),
 (u'CollectorRunner', 5060),
 (u'PCompositorBridge::Msg_DidComposite', 4184)
(Assignee)

Comment 31

5 months ago
(In reply to Ben Kelly [reviewing, but slowly][:bkelly] from comment #1)
> Are you suggesting changing nsIAsyncInputStream API to support labeling? 
> I'm concerned with getting this to scale since we do lots of streams stuff
> all over, including using helpers like NS_AsyncCopy(), etc.  I guess I'd
> like to see the patch.

It seems unavoidable to have some API change to identify the users of nsPipeInputStream after several trials to identify the callers from local build.

However, instead of changing nsIAsyncInputStream API, I think we could at least name the caller of NS_NewPipe2() from native and nsPipeConstructor() from JS and append the caller's name at [1] to identify the callers from telemetry as what we have done to name the callers of "AbstractThead::Runner" and "ProxyReleaseEvent".

[1] http://searchfox.org/mozilla-central/rev/5dadcbe55b4ddd1e448c06c77390ff6483aa009b/xpcom/io/nsPipe3.cpp#1496
(Assignee)

Comment 32

5 months ago
(In reply to Bevis Tseng [:bevis][:btseng] from comment #31)
> However, instead of changing nsIAsyncInputStream API, I think we could at
> least name the caller of NS_NewPipe2() from native and nsPipeConstructor()
> from JS and append the caller's name at [1] to identify the callers from
> telemetry as what we have done to name the callers of
> "AbstractThead::Runner" and "ProxyReleaseEvent".
> 
> [1]
> http://searchfox.org/mozilla-central/rev/
> 5dadcbe55b4ddd1e448c06c77390ff6483aa009b/xpcom/io/nsPipe3.cpp#1496

The only concern is that if the relationship between the caller of NS_NewPipe2 and the caller of nsIAsyncInputStream::Await is too weak, then we'll still have to do some API change on nsIAsyncInputStream eventually to identify the use of "nsPipeInputStream::AsyncWait".

Comment 33

5 months ago
So after looking at this stuff a bit more I have a new question.

AsyncWait() already supports passing an event target.  Is this bug just about making sure the right TabGroup event target is passed there instead of NS_GetCurrentThread()/NS_GetMainThread()?

And yea, I don't think you can fix this by changing the pipe constructor methods.  There can be any number of listeners attached to the same pipe and its really those call sites that determine how these runnables are created.
(Assignee)

Comment 34

5 months ago
(In reply to Ben Kelly [reviewing, but slowly][:bkelly] from comment #33)
> So after looking at this stuff a bit more I have a new question.
> 
> AsyncWait() already supports passing an event target.  Is this bug just
> about making sure the right TabGroup event target is passed there instead of
> NS_GetCurrentThread()/NS_GetMainThread()?
Correct, but it's time-consuming to label all the call sites of nsIAsyncInputStream::AsyncWait:
http://searchfox.org/mozilla-central/search?q=symbol:_ZN19nsIAsyncInputStream9AsyncWaitEP22nsIInputStreamCallbackjjP14nsIEventTarget%2C%23asyncWait&redirect=false
> And yea, I don't think you can fix this by changing the pipe constructor
> methods.  There can be any number of listeners attached to the same pipe and
> its really those call sites that determine how these runnables are created.
The runnable of "nsPipeInputStream::AsyncWait" is only created when calling nsIAsyncInputStream::AsyncWait on a nsPipeInputStream instance only if a valid nsIEventTarget is also specified:
http://searchfox.org/mozilla-central/source/xpcom/io/nsPipe3.cpp#1496
and the nsPipeInputStream is only created in nsPipe's ctor.
Changing the signature of nsIAsyncInputStream::AsyncWait with call site's name could help us to identify the users directly but the places to be changed are a lot in both native code and JS:
http://searchfox.org/mozilla-central/search?q=symbol:_ZN19nsIAsyncInputStream9AsyncWaitEP22nsIInputStreamCallbackjjP14nsIEventTarget%2C%23asyncWait&redirect=false
Similar issues if we change the signature of nsPipe creators:
http://searchfox.org/mozilla-central/search?q=NewPipe&case=false&regexp=false&path=
http://searchfox.org/mozilla-central/search?q=%40mozilla.org%2Fpipe%3B1&case=false&regexp=false&path=

If changing the API signaure from 
    void asyncWait(in nsIInputStreamCallback aCallback,
                   in unsigned long aFlags,
                   in unsigned long aRequestedCount,
                   in nsIEventTarget aEventTarget);
to
    void asyncWait(in ACString aCallSite,
                   in nsIInputStreamCallback aCallback,
                   in unsigned long aFlags,
                   in unsigned long aRequestedCount,
                   in nsIEventTarget aEventTarget);
is acceptable, then I'd like to add this change instead to identify the users from the telemetry easily.

If the |aCallSite| is annoying, we could remove it later once we identified all the users of "nsPipeInputStream::AsyncWait".

How do you think?
Flags: needinfo?(bkelly)
(Assignee)

Comment 35

5 months ago
(In reply to Bevis Tseng [:bevis][:btseng] from comment #34)
> If changing the API signaure from 
>     void asyncWait(in nsIInputStreamCallback aCallback,
>                    in unsigned long aFlags,
>                    in unsigned long aRequestedCount,
>                    in nsIEventTarget aEventTarget);
> to
>     void asyncWait(in ACString aCallSite,
>                    in nsIInputStreamCallback aCallback,
>                    in unsigned long aFlags,
>                    in unsigned long aRequestedCount,
>                    in nsIEventTarget aEventTarget);
> is acceptable, then I'd like to add this change instead to identify the
> users from the telemetry easily.
> 
> If the |aCallSite| is annoying, we could remove it later once we identified
> all the users of "nsPipeInputStream::AsyncWait".
> 
> How do you think?

After further study, I think we could just make nsIInputStreamCallback inherit from nsINamed instead and have GetName method implemented in each nsIInputStreamCallback.
(Assignee)

Comment 36

5 months ago
The only concern is that there is no syntax sugar to implement nsIInputStreamCallback as a JS function.
(Assignee)

Comment 37

5 months ago
Created attachment 8885668 [details] [diff] [review]
(WIP) Have the nsINamed interface implemented by all the instances of nsIInputStreamCallback.

It shall be easier to identify the caller of nsPipeInputStream::AsyncWait() by querying the name of the nsIInputStreamCallback instance by overriding GetName() in nsInputStreamReadyEvent class without any API signature change.

TODO:
1. Finish the nsINamed implementation in all nsIInputStreamCallback instance.
2. Override GetName() in nsInputStreamReadyEvent to append the name of nsIInputStreamCallback.
3. Try to concatenate the nsIInputStreamCallback names because some AsyncInputStream wrappers override the nsIInputStreamCallback to itself in ::AsyncWait() implementation. This prevents us identifying the real caller.
4. Name the caller of nsInputStreamPump::AsyncRead with similar solution.
Attachment #8882328 - Attachment is obsolete: true
Attachment #8882435 - Attachment is obsolete: true
Flags: needinfo?(bkelly)

Comment 38

5 months ago
AIUI naming stuff is just one step towards the ultimate goal of getting the right event target.  Right?  Changing a lot of callsites to pass a name seams a similar amount of work to changing a lot of callsites to pass a different event target.

For example, all the js code doing some form of:

  Services.tm.currentThread

Don't all of these call sites have an nsIGlobalObject in place?  (I guess I assume they have a global since they are running js, but maybe there is some weird legacy mode...) Can we bulk convert them to calling nsIGlobalObject::EventTargetFor() in some way?

The c++ call sites I would change in this way:

Convert to SystemGroup::EventTargetFor():
* http://searchfox.org/mozilla-central/source/dom/file/ipc/IPCBlobInputStream.cpp#308
* http://searchfox.org/mozilla-central/source/dom/flyweb/HttpServer.cpp
* http://searchfox.org/mozilla-central/source/dom/presentation/PresentationTCPSessionTransport.cpp
* http://searchfox.org/mozilla-central/source/gfx/layers/LayerScope.cpp

Convert to nsIGlobalObject::EventTargetFor():
* http://searchfox.org/mozilla-central/source/dom/network/TCPSocket.cpp#204

These are already labeled on main thread:
* http://searchfox.org/mozilla-central/source/dom/file/FileReader.cpp#271

Not called on main thread:
* http://searchfox.org/mozilla-central/source/dom/indexedDB/ActorsChild.cpp#3526 (STS)
* http://searchfox.org/mozilla-central/source/dom/indexedDB/ActorsParent.cpp#29610 (STS)
* http://searchfox.org/mozilla-central/source/dom/workers/FileReaderSync.cpp#455 (STS)
* http://searchfox.org/mozilla-central/source/netwerk/protocol/http/nsHttpTransaction.cpp#764
* http://searchfox.org/mozilla-central/source/netwerk/protocol/websocket/WebSocketChannel.cpp

I skipped call sites that were just pass-through due to wrapping nsIAsyncInputStream or were passing nullptr for that target.

Give the c++ call sites I would guess the large volume of runnables is coming from one of the js call sites.  So getting a browser chrome exposed self.eventTargetFor(TaskCategory.other) would be really helpful there.
(Reporter)

Updated

5 months ago
Flags: needinfo?(wmccloskey)
(Reporter)

Comment 39

5 months ago
Hi Bevis. Today I started an official Nightly in a fresh profile with processCount=1, browsed around a bit, and then looked at about:telemetry. I saw over a thousand unlabeled nsPipeInputStream::AsyncWait calls. I wonder if local testing might find more stuff. One thing I wonder is whether things work differently when you do |make package|. Perhaps we do a lot of work reading from the OmniJAR using pipes?
Flags: needinfo?(wmccloskey)
(Assignee)

Comment 40

5 months ago
(In reply to Ben Kelly [reviewing, but slowly][:bkelly] from comment #38)
Thanks for breaking this down.
This analysis helps a lot if we have to label anyone of them.
To prevent any regression and longer debugging session after labeling these at once, I'd like to try to name them first and then label the necessary one according to the telemetry result.

The worse case is the JS one that need to be labeled which I don't have too much knowhow about exposing API from a internal nsIGlobalObject to the internal js.
(Assignee)

Comment 41

5 months ago
(In reply to Bill McCloskey (:billm) from comment #39)
> Hi Bevis. Today I started an official Nightly in a fresh profile with
> processCount=1, browsed around a bit, and then looked at about:telemetry. I
> saw over a thousand unlabeled nsPipeInputStream::AsyncWait calls. I wonder
> if local testing might find more stuff. One thing I wonder is whether things
> work differently when you do |make package|. Perhaps we do a lot of work
> reading from the OmniJAR using pipes?

Wow! I was not aware of the difference of the official build and my local build.
I just checked the telemetry of my nightly and saw over 7k unlabeled nsPipeInputStream::AsyncWait!
I'll try |make package| later to see if I could reproduce this locally.
Thanks for these information!
(Reporter)

Comment 42

5 months ago
Regarding the JS stuff, it's not as easy as labeling based on the global. All the code will be using a system global, which isn't much help. We don't know if the chrome code is going to touch content at some point.

Most of the JS stuff doesn't look that important to me except for the devtools stuff. I don't understand what that's for. It might be used a lot when devtools are open. I think we might be able to use the SystemGroup there. It probably makes sense to have something like Services.tm.systemGroupEventTarget (which would use TaskCategory::Other).
(Assignee)

Comment 43

5 months ago
Created attachment 8886108 [details] [diff] [review]
(v1) Part 1: Append the CallSite name to nsInputStreamReadyEvent::GetName().

Make this change to allow the nsINamed-implemented nsIInputStreamCallback in part 2 and part 3 to be displayed in the telemetry.
Note: We reuse nsIRequest::GetName() is available.
Attachment #8885668 - Attachment is obsolete: true
Attachment #8886108 - Flags: review?(wmccloskey)
(Assignee)

Comment 44

5 months ago
Created attachment 8886109 [details] [diff] [review]
(v1) Part 2: Have nsINamed interface implemented by all C++ instances of nsIInputStreamCallback.
Attachment #8886109 - Flags: review?(wmccloskey)
(Assignee)

Comment 45

5 months ago
Created attachment 8886110 [details] [diff] [review]
(v1) Part 3: Have nsINamed interface implemented by all JS instances of nsIInputStreamCallback.
Attachment #8886110 - Flags: review?(wmccloskey)
(Assignee)

Comment 46

5 months ago
Comment on attachment 8886108 [details] [diff] [review]
(v1) Part 1: Append the CallSite name to nsInputStreamReadyEvent::GetName().

treeherder looks NG.
Attachment #8886108 - Flags: review?(wmccloskey)
(Assignee)

Comment 47

5 months ago
Comment on attachment 8886109 [details] [diff] [review]
(v1) Part 2: Have nsINamed interface implemented by all C++ instances of nsIInputStreamCallback.

treeherder looks NG.
Attachment #8886109 - Flags: review?(wmccloskey)
(Assignee)

Comment 48

5 months ago
Comment on attachment 8886110 [details] [diff] [review]
(v1) Part 3: Have nsINamed interface implemented by all JS instances of nsIInputStreamCallback.

treeherder looks NG.
Attachment #8886110 - Flags: review?(wmccloskey)
(Assignee)

Comment 49

5 months ago
Created attachment 8886203 [details] [diff] [review]
(v1) Part 1: Append the CallSite name to nsInputStreamReadyEvent::GetName()
Attachment #8886108 - Attachment is obsolete: true
(Assignee)

Comment 50

5 months ago
Created attachment 8886204 [details] [diff] [review]
(v2) Part 2: Have nsINamed interface implemented by all C++ instances of nsIInputStreamCallback
Attachment #8886109 - Attachment is obsolete: true
(Assignee)

Comment 51

5 months ago
Created attachment 8886205 [details] [diff] [review]
(v3) Part 3: Have nsINamed interface implemented by all JS instances of nsIInputStreamCallback
Attachment #8886110 - Attachment is obsolete: true
(Assignee)

Comment 52

5 months ago
Created attachment 8886206 [details] [diff] [review]
(v1) Part 4: Append the CallSite name to nsInputStreamPump::GetName()
(Assignee)

Comment 53

5 months ago
Created attachment 8886207 [details] [diff] [review]
(v1) Part 5: Make nsStreamLoader namable
(Assignee)

Comment 54

5 months ago
Created attachment 8886208 [details] [diff] [review]
(v1) Part 6: Name call sites of nsIInputStreamPump::AsyncWait
(Assignee)

Comment 55

5 months ago
Created attachment 8886209 [details] [diff] [review]
(v1) Part 7: Define nsIAlteredNamed.idl to append the call sites of AsyncRead if method signature is conflicted
(Assignee)

Comment 56

5 months ago
Created attachment 8886210 [details] [diff] [review]
(v1) Part 8: Name nsIStreamListener using nsIAlteredNamed if conflicted
(Assignee)

Comment 57

5 months ago
Hi Bill, I've tried to make the call sites of nsIAsyncInputStream::AsyncWait() visible from the telemetry with these changes.
Although there are still some bugs to be fixed, I'd like to have your earlier feedback about the patches noted below, thanks!

Part 1:
 - To append the call site name in nsInputStreamReadyEvent::GetName() by QI(nsINamed)
Part 4:
 - Because there are too many call sites of nsIputStreamPump::AsyncRead(), this patch to to append the additional call site in nsIputStreamPump::GetName()
Part 5:
 - Some call site of nsIputStreamPump::AsyncRead() use NS_NewStreamLoader to create a listener.
 - This patch is to allow the call site to call ::SetName of this created listener.
Part 7:
 - Some of the listeners override GetName() from nsIRequest, so I add nsIAlteredNamed idl interface to provide the class name properly.
Flags: needinfo?(wmccloskey)
(Reporter)

Comment 58

5 months ago
I spent pretty much all day debugging the problem I described in comment 39. It turns out there is a really major bug in telemetry that I suspect will have a major influence on our telemetry data, including this bug. Let's wait to do anything here until bug 1380880 is fixed and we get new data.
Flags: needinfo?(wmccloskey)
(Assignee)

Comment 59

5 months ago
(In reply to Bill McCloskey (:billm) from comment #58)
> I spent pretty much all day debugging the problem I described in comment 39.
> It turns out there is a really major bug in telemetry that I suspect will
> have a major influence on our telemetry data, including this bug. Let's wait
> to do anything here until bug 1380880 is fixed and we get new data.
OMG, great to know that!
(Assignee)

Comment 60

5 months ago
Created attachment 8887426 [details]
1376840_patches_for_naming_is_ready.zip

The unlabeled count of this runnable is much lower down in the ranking now after bug 1380880 was fixed.
Backup the patches that is ready for the review here for revisiting later if needed.
Attachment #8886203 - Attachment is obsolete: true
Attachment #8886204 - Attachment is obsolete: true
Attachment #8886205 - Attachment is obsolete: true
Attachment #8886206 - Attachment is obsolete: true
Attachment #8886207 - Attachment is obsolete: true
Attachment #8886208 - Attachment is obsolete: true
Attachment #8886209 - Attachment is obsolete: true
Attachment #8886210 - Attachment is obsolete: true
(Assignee)

Updated

2 months ago
Priority: -- → P3
You need to log in before you can comment on or make changes to this bug.