data race in StartupCache.cpp

RESOLVED FIXED in mozilla5

Status

()

defect
RESOLVED FIXED
9 years ago
8 years ago

People

(Reporter: taras.mozilla, Assigned: taras.mozilla)

Tracking

unspecified
mozilla5
x86
Linux
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment, 1 obsolete attachment)

==22869== Possible data race during write of size 8 at 0x162f6d10 by thread #1
==22869==    at 0x5739380: mozilla::scache::StartupCache::WriteTimeout(nsITimer*, void*) (StartupCache.cpp:415)
==22869==    by 0x63A37C4: nsTimerImpl::Fire() (nsTimerImpl.cpp:425)
==22869==    by 0x63A38A8: nsTimerEvent::Run() (nsTimerImpl.cpp:517)
==22869==    by 0x639FFFC: nsThread::ProcessNextEvent(int, int*) (nsThread.cpp:633)
==22869==    by 0x63604F9: NS_ProcessNextEvent_P(nsIThread*, int) (nsThreadUtils.cpp:250)
==22869==    by 0x6259882: mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) (MessagePump.cpp:110)
==22869==    by 0x63E0ECA: MessageLoop::RunInternal() (message_loop.cc:219)
==22869==    by 0x63E0ED6: MessageLoop::RunHandler() (message_loop.cc:202)
==22869==    by 0x63E0F45: MessageLoop::Run() (message_loop.cc:176)
==22869==    by 0x6178787: nsBaseAppShell::Run() (nsBaseAppShell.cpp:192)
==22869==    by 0x5FE21D4: nsAppStartup::Run() (nsAppStartup.cpp:220)
==22869==    by 0x563AC11: XRE_main (nsAppRunner.cpp:3781)
==22869==    by 0x400F7E: main (nsBrowserApp.cpp:158)
 
==22869==  This conflicts with a previous read of size 8 by thread #26
==22869==    at 0x573939B: mozilla::scache::StartupCache::WaitOnWriteThread() (StartupCache.cpp:385)
==22869==    by 0x573960C: mozilla::scache::StartupCache::WriteToDisk() (StartupCache.cpp:326)
==22869==    by 0x57397DF: mozilla::scache::StartupCache::ThreadedWrite(void*) (StartupCache.cpp:398)
==22869==    by 0x7A43A1C: _pt_root (ptthread.c:189)
==22869==    by 0x4C2CBE7: mythread_wrapper (hg_intercepts.c:221)
==22869==    by 0x4E369C9: start_thread (pthread_create.c:300)
==22869==    by 0xB2DA70C: clone (clone.S:112)
This is a race on StartupCache::mWriteThread, which is accessed MT
without locking.  Am tracking it as HGANN 030 in the patch on bug
551155.
Posted patch locking (obsolete) — Splinter Review
Assignee: nobody → tglek
Attachment #516435 - Flags: review?(mwu)
Why not just return immediately from WaitOnWriteThread if NS_IsMainThread is false?

If we could avoid the write thread from calling WaitOnWriteThread completely, we could assert/abort on !NS_IsMainThread and ensure callers only use StartupCache on the main thread.
(In reply to comment #3)
> Why not just return immediately from WaitOnWriteThread if NS_IsMainThread is
> false?

Good idea, but I don't think we can rely on cache being used solely from main thread.

Afaik cache accesses are not happening frequently enough to worry about locking perf.
 
> If we could avoid the write thread from calling WaitOnWriteThread completely,
> we could assert/abort on !NS_IsMainThread and ensure callers only use
> StartupCache on the main thread.

It's convenient to be able to reopen the cache after writing on a separate thread.
(In reply to comment #4)
> (In reply to comment #3)
> > Why not just return immediately from WaitOnWriteThread if NS_IsMainThread is
> > false?
> 
> Good idea, but I don't think we can rely on cache being used solely from main
> thread.
> 
Currently, we can, as its only user will only use it on the main thread. As far as I can tell, StartupCache does not go out of its way to be thread safe. I suspect there are going to be far more severe bugs if you try to use StartupCache from another thread.

An alternate fix would be to not call WaitOnWriteThread on the write thread. There's no reason to call it in WriteToDisk, and LoadArchive doesn't need it because either its caller already called it (InvalidateCache) or its caller knows for sure that the thread isn't running (Init).
(In reply to comment #5)
> (In reply to comment #4)
> > (In reply to comment #3)
> > > Why not just return immediately from WaitOnWriteThread if NS_IsMainThread is
> > > false?
> > 
> > Good idea, but I don't think we can rely on cache being used solely from main
> > thread.
> > 
> Currently, we can, as its only user will only use it on the main thread. As far
> as I can tell, StartupCache does not go out of its way to be thread safe. I
> suspect there are going to be far more severe bugs if you try to use
> StartupCache from another thread.
> 
> An alternate fix would be to not call WaitOnWriteThread on the write thread.
> There's no reason to call it in WriteToDisk, and LoadArchive doesn't need it
> because either its caller already called it (InvalidateCache) or its caller
> knows for sure that the thread isn't running (Init).

WriteToDisk is called from the destructor too. In that case it should wait in case there is a write thread going on(but this can be fixed by waiting in destructor).

I'll see about changing code according to your suggestion, i'm just worried about causing fragility down the road when new features are added to the code.
At this point I think we should assert (or runtimeabort) that the startup cache clients only operate on the main thread. Trying to make the implementation completely threadsafe doesn't sound like a fruitful activity.
Posted patch no lockingSplinter Review
This adds assertions for running on non-main thread/process. no more WaitOnWrite thread on write thread
Attachment #516435 - Attachment is obsolete: true
Attachment #516435 - Flags: review?(mwu)
Attachment #516986 - Flags: review?(mwu)
Comment on attachment 516986 [details] [diff] [review]
no locking

Nice.
Attachment #516986 - Flags: review?(mwu) → review+
(In reply to comment #8)
> Created attachment 516986 [details] [diff] [review]

With this in place I can no longer detect the race.  +1 therefore.
http://hg.mozilla.org/mozilla-central/rev/2bee01c51ed6
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Whiteboard: fixed-in-cedar
Target Milestone: --- → mozilla2.2
You need to log in before you can comment on or make changes to this bug.