Closed Bug 1132861 Opened 9 years ago Closed 9 years ago

StreamCopier does not always run in parallel with the main thread on multi-core device

Categories

(Core :: XPCOM, defect)

defect
Not set
normal

Tracking

()

RESOLVED WONTFIX

People

(Reporter: ting, Unassigned)

References

Details

Attachments

(3 files)

When launch an app, content process main thread could be blocked in JARChannel::OpenLocalFile() until StreamCopier finish Process() on StreamTransportService thread pool. It's because NS_AsyncCopy() can block. This makes main thread running interleaved with the thread pool, and inflation is done sequentially.

I expect the main thread is never blocked and StreamCopier runs on thread pool in parallel to have better cpu utilization.

Here's the sequence how NS_AsyncCopy() is blocked, the stacks are from the pound signs (#), where context switch happens:

Main Thread        StreamTransportService
-----------        ----------------------
  |
  #----------------->+  nsJARChannel::OpenLocalFile()
                     |     nsInputStreamPump::AsyncRead()
                     |       nsInputStreamTransport::OpenInputStream()
                     |         NS_AsyncCopy()
                     |           nsAStreamCopier::Start()
                     |             nsAStreamCopier::PostContinuationEvent()  // nsAStreamCopier.mLock.Lock()
                     |               nsAStreamCopier::PostContinuationEvent_Locked()
                     |                 nsStreamTransportService::Dispatch()
                     |                   nsThreadPool::Dispatch()
                     |                     nsThreadPool::PutEvent() [1]      // nsThreadPool.mEvents.GetReentrantMonitor().Exit()
                     |
  +<-----------------#  nsThreadPool::Run()
  |                       nsAStreamCopier::Run()
  |                         nsAStreamCopier::Process() [2]                   // nsAStreamCopier.mLock.Lock()
  |
  #----------------->+  nsJARChannel::OpenLocalFile()
                     |    nsInputStreamPump::AsyncRead()
                     |        nsInputStreamTransport::OpenInputStream()
                     |          NS_AsyncCopy()
                     |            nsAStreamCopier::Start()
                     |              nsAStreamCopier::PostContinuationEvent() [3] // nsAStreamCopier.mLock.Unlock()
                     |
                     #  nsThreadPool::Run()
                     |    nsAStreamCopier::Run()
                     |      nsAStreamCopier::Process()
                     |        ...
                     |          nsJARInputStream::Read()
                     |            nsJARInputStream::ContinueInflate()
                     |
  +<-----------------#  nsThreadPool::Run() [4]                              // nsThreadPool.mEvents.GetReentrantMonitor().Wait()
  |
  #                     nsJARChannel::OpenLocalFile()
  |                       nsInputStreamPump::AsyncRead()
                            nsInputStreamTransport::OpenInputStream()
                              NS_AsyncCopy()

For the record, the profile at bug 1110624 comment 6 shows ~60ms [1720,1780] on Nexus4 to inflate the resource files of SMS app index.html.

[1] https://dxr.mozilla.org/mozilla-central/source/xpcom/threads/nsThreadPool.cpp?from=nsThreadPool::PutEvent&case=true#93
[2] https://dxr.mozilla.org/mozilla-central/source/xpcom/io/nsStreamUtils.cpp?from=nsAStreamCopier&case=true#278
[3] https://dxr.mozilla.org/mozilla-central/source/xpcom/io/nsStreamUtils.cpp?from=PostContinuationEvent&case=true#437
[4] https://dxr.mozilla.org/mozilla-central/source/xpcom/threads/nsThreadPool.cpp#213
OS: Gonk (Firefox OS) → All
Hardware: ARM → All
Attached patch dbglog.diffSplinter Review
Added some logs to show the main thread and StreamTransportService thread pool run on the same cpu core.
Attached file dbglog.txt
This is the log I captured from a Nexus4 (quad-core) when launch SMS app. I expect main thread and StreamTransoportService thread pool run on different core:

02-15 23:47:30.323 11853 11853 D x       : [0] 0xb33b04e0 nsJARChannel::OpenLocalFile > AsyncRead app://sms.gaiamobile.org/shared/js/settings_listener.js
02-15 23:47:30.323 11853 11853 D x       : [0] 0xb33b04e0 nsJARChannel::OpenLocalFile < AsyncRead
02-15 23:47:30.323 11853 11975 D x       : [2] 0xb2340a00 nsJARInputStream::ContinueInflate > inflate

But you can see:

02-15 23:47:29.432 11853 11853 D x       : [1] 0xb30d94d0 nsJARChannel::OpenLocalFile > AsyncRead app://sms.gaiamobile.org/shared/style/action_menu.css
02-15 23:47:29.432 11853 11960 D x       : [1] 0xb33df680 nsJARInputStream::ContinueInflate > inflate
02-15 23:47:29.432 11853 11853 D x       : [1] 0xb30d94d0 nsJARChannel::OpenLocalFile < AsyncRead
Summary: NS_AsyncCopy() can block and make nsInputStreamPump::AsyncRead() synchronized → StreamCopier does not always run in parallel with the main thread and can make NS_AsyncCopy() synchronized
I wonder is it about the number of online cores (/sys/devices/system/cpu/online)...
I will try sched_setaffinity() with online cores after back from holiday.
(In reply to Ting-Yu Chou [:ting] from comment #3)
> I wonder is it about the number of online cores
> (/sys/devices/system/cpu/online)...

I've tried to stop mpdecision and set 1 to /sys/devices/system/cpu/cpu*/online, but it still happens.
Attached file test: setaffinity.diff
(In reply to Ting-Yu Chou [:ting] from comment #4)
> I will try sched_setaffinity() with online cores after back from holiday.

It works as expected by setting affinity explicitly for B2G process main thread (cpu0) and StreamTransportService thread (cpu1). I used clock_gettime(CLOCK_MONOTONIC) to measure AsyncRead() in nsJARChannel::OpenLocalFile() on a Nexus4, not in parallel usually takes >0.5ms, and in parallel usually takes ~0.1ms.

There're still things to consider:

- How to decide which core to run StreamTransportService threads?
- When to set affinity of B2G main thread back to default state, e.g., from cpu 0,1,2 back to cpu 0,1,2,3?
- Is it the mutex that causes the scheduler works like that?
(In reply to Ting-Yu Chou [:ting] from comment #6)
> - Is it the mutex that causes the scheduler works like that?

I have treid to 1) add one more mutex for accessing mCancel* variables, and 2) get rid of all the MutexAutoLock in nsAStreamCopier but both do not work.
See Also: → 669367
When NS_AsyncCopy() works synchronously, both the main thread and StreamTransportService thread run on the same core, and it is kernel's decision. Besides, there could be only one or two online cores since the loading is not high enough. That's why I tried sched_setaffinity() to force scheduler runs the threads on different cores.

But setting affinity explicitly probably not a good idea for power constrained device, because kernel will have less capability on arranging CPU resource. When launch an application, not only content process but also chrome process is busy working (e.g., system app, compositor), kernel has clearer picture than us on how to utilize CPU and knows when to turn cores off to lower power consumption, which is what we can't determine from StreamCopier.

As such, I'd like to close this as won't fix.
Summary: StreamCopier does not always run in parallel with the main thread and can make NS_AsyncCopy() synchronized → StreamCopier does not always run in parallel with the main thread on multi-core device
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → WONTFIX
Blocks: 1199539
No longer blocks: AppStartup
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: