Closed
Bug 961860
Opened 11 years ago
Closed 11 years ago
Intermittent null-deref in Seer::Observe()
Categories
(Core :: Networking, defect)
Tracking
()
RESOLVED
FIXED
mozilla29
| Tracking | Status | |
|---|---|---|
| firefox27 | --- | unaffected |
| firefox28 | --- | unaffected |
| firefox29 | --- | fixed |
| firefox-esr24 | --- | unaffected |
People
(Reporter: n.nethercote, Assigned: u408661)
References
()
Details
(Keywords: intermittent-failure)
Attachments
(1 file)
|
1.13 KB,
patch
|
mcmanus
:
review+
|
Details | Diff | Splinter Review |
Valgrind catches this.
https://tbpl.mozilla.org/php/getParsedLog.php?id=33224002&tree=Mozilla-Inbound
> ==20525== Invalid read of size 8
> ==20525== at 0x80FC00F: mozilla::net::Seer::Observe(nsISupports*, char const*, char16_t const*) (Seer.cpp:355)
> ==20525== by 0x80AC7EF: nsTimerImpl::Fire() (nsTimerImpl.cpp:559)
> ==20525== by 0x80AC8C9: nsTimerEvent::Run() (nsTimerImpl.cpp:635)
> ==20525== by 0x80A880C: nsThread::ProcessNextEvent(bool, bool*) (nsThread.cpp:637)
> ==20525== by 0x80580F4: NS_ProcessNextEvent(nsIThread*, bool) (nsThreadUtils.cpp:263)
> ==20525== by 0x80AD128: nsThread::Shutdown() (nsThread.cpp:488)
> ==20525== by 0x8E79DE3: mozilla::dom::workers::RuntimeService::NoteIdleThread(mozilla::dom::workers::RuntimeService::WorkerThread*) (RuntimeService.cpp:2198)
> ==20525== by 0x8E79F21: (anonymous namespace)::WorkerThreadPrimaryRunnable::FinishedRunnable::Run() (RuntimeService.cpp:2597)
> ==20525== by 0x80A880C: nsThread::ProcessNextEvent(bool, bool*) (nsThread.cpp:637)
> ==20525== by 0x80580F4: NS_ProcessNextEvent(nsIThread*, bool) (nsThreadUtils.cpp:263)
> ==20525== by 0x80AD128: nsThread::Shutdown() (nsThread.cpp:488)
> ==20525== by 0x8CC2E86: nsHtml5ParserThreadTerminator::Observe(nsISupports*, char const*, char16_t const*) (nsHtml5Module.cpp:100)
> ==20525== by 0x808726B: nsObserverList::NotifyObservers(nsISupports*, char const*, char16_t const*) (nsObserverList.cpp:96)
> ==20525== by 0x8087327: nsObserverService::NotifyObservers(nsISupports*, char const*, char16_t const*) (nsObserverService.cpp:302)
> ==20525== by 0x805B69F: mozilla::ShutdownXPCOM(nsIServiceManager*) (nsXPComInit.cpp:708)
> ==20525== by 0x805BB87: NS_ShutdownXPCOM (nsXPComInit.cpp:656)
> ==20525== by 0x97B4512: ScopedXPCOMStartup::~ScopedXPCOMStartup() (nsAppRunner.cpp:1163)
> ==20525== by 0x97BAB55: XREMain::XRE_main(int, char**, nsXREAppData const*) (nsAppRunner.cpp:4116)
> ==20525== by 0x97BAD9E: XRE_main (nsAppRunner.cpp:4331)
> ==20525== by 0x404933: do_main(int, char**, nsIFile*) (nsBrowserApp.cpp:280)
> ==20525== Address 0x0 is not stack'd, malloc'd or (recently) free'd
> ==20525==
This is a simple NULL-deref:
gSeer->mIOThread->Dispatch(event, NS_DISPATCH_NORMAL);
Either |gSeer| or |gSeer->mIOThread| is NULL. This causes a segfault, and we enter Breakpad, and Breakpad does a clone() call that Valgrind can't handle.
This would also cause a crash if it happened when not running under Valgrind. It's likely that it manifested under Valgrind because thread scheduling is different to native runs.
| Reporter | ||
Comment 1•11 years ago
|
||
Happened again on TBPL: https://bugzilla.mozilla.org/show_bug.cgi?id=961614#c6
I don't disappear to the mountains until Friday :)
It's almost certainly mIOThread that's null - gSeer only gets set to null when the dtor runs, so unless there's something incredibly crazy going on, there's no way gSeer will be null (since the seer is a singleton). That said, there's no need to be referencing gSeer in Seer::Observe, anyway (it's likely an artifact of when I had a separate observer object), so it's probably worth cleaning that up when I add the check to make sure we don't do anything stupid like deref a null pointer :)
Here's a thing that should do what we want (along with an extra sanity check for threading in Seer::Observe)
Running through try for valgrind (will retrigger a few times for reasonable certainty that it's fixed): https://tbpl.mozilla.org/?tree=Try&rev=3484a34b04aa
Attachment #8363839 -
Flags: review?(mcmanus)
| Reporter | ||
Comment 5•11 years ago
|
||
> Running through try for valgrind (will retrigger a few times for reasonable
> certainty that it's fixed):
It's sufficiently rare -- it has happened only two or three times in the past few days -- that you'd need a *lot* of retriggers to be confident. But looking at the patch it seems pretty likely that you've fixed it.
Updated•11 years ago
|
Attachment #8363839 -
Flags: review?(mcmanus) → review+
Comment 7•11 years ago
|
||
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla29
Comment 8•11 years ago
|
||
AFAICT, this affects 27/28 as well. Probably too late for an uplift to 27, but can we nominate it for 28 please?
status-firefox27:
--- → wontfix
status-firefox28:
--- → affected
status-firefox29:
--- → fixed
status-firefox-esr24:
--- → unaffected
Actually doesn't affect 27 or 28 - the code that included the null deref is new in 29. And even if it weren't, the seer is disabled on 27/28.
You need to log in
before you can comment on or make changes to this bug.
Description
•