Closed
Bug 961860
Opened 8 years ago
Closed 8 years ago
Intermittent null-deref in Seer::Observe()
Categories
(Core :: Networking, defect)
Tracking
()
RESOLVED
FIXED
mozilla29
Tracking | Status | |
---|---|---|
firefox27 | --- | unaffected |
firefox28 | --- | unaffected |
firefox29 | --- | fixed |
firefox-esr24 | --- | unaffected |
People
(Reporter: n.nethercote, Assigned: u408661)
References
()
Details
(Keywords: intermittent-failure)
Attachments
(1 file)
1.13 KB,
patch
|
mcmanus
:
review+
|
Details | Diff | Splinter Review |
Valgrind catches this. https://tbpl.mozilla.org/php/getParsedLog.php?id=33224002&tree=Mozilla-Inbound > ==20525== Invalid read of size 8 > ==20525== at 0x80FC00F: mozilla::net::Seer::Observe(nsISupports*, char const*, char16_t const*) (Seer.cpp:355) > ==20525== by 0x80AC7EF: nsTimerImpl::Fire() (nsTimerImpl.cpp:559) > ==20525== by 0x80AC8C9: nsTimerEvent::Run() (nsTimerImpl.cpp:635) > ==20525== by 0x80A880C: nsThread::ProcessNextEvent(bool, bool*) (nsThread.cpp:637) > ==20525== by 0x80580F4: NS_ProcessNextEvent(nsIThread*, bool) (nsThreadUtils.cpp:263) > ==20525== by 0x80AD128: nsThread::Shutdown() (nsThread.cpp:488) > ==20525== by 0x8E79DE3: mozilla::dom::workers::RuntimeService::NoteIdleThread(mozilla::dom::workers::RuntimeService::WorkerThread*) (RuntimeService.cpp:2198) > ==20525== by 0x8E79F21: (anonymous namespace)::WorkerThreadPrimaryRunnable::FinishedRunnable::Run() (RuntimeService.cpp:2597) > ==20525== by 0x80A880C: nsThread::ProcessNextEvent(bool, bool*) (nsThread.cpp:637) > ==20525== by 0x80580F4: NS_ProcessNextEvent(nsIThread*, bool) (nsThreadUtils.cpp:263) > ==20525== by 0x80AD128: nsThread::Shutdown() (nsThread.cpp:488) > ==20525== by 0x8CC2E86: nsHtml5ParserThreadTerminator::Observe(nsISupports*, char const*, char16_t const*) (nsHtml5Module.cpp:100) > ==20525== by 0x808726B: nsObserverList::NotifyObservers(nsISupports*, char const*, char16_t const*) (nsObserverList.cpp:96) > ==20525== by 0x8087327: nsObserverService::NotifyObservers(nsISupports*, char const*, char16_t const*) (nsObserverService.cpp:302) > ==20525== by 0x805B69F: mozilla::ShutdownXPCOM(nsIServiceManager*) (nsXPComInit.cpp:708) > ==20525== by 0x805BB87: NS_ShutdownXPCOM (nsXPComInit.cpp:656) > ==20525== by 0x97B4512: ScopedXPCOMStartup::~ScopedXPCOMStartup() (nsAppRunner.cpp:1163) > ==20525== by 0x97BAB55: XREMain::XRE_main(int, char**, nsXREAppData const*) (nsAppRunner.cpp:4116) > ==20525== by 0x97BAD9E: XRE_main (nsAppRunner.cpp:4331) > ==20525== by 0x404933: do_main(int, char**, nsIFile*) (nsBrowserApp.cpp:280) > ==20525== Address 0x0 is not stack'd, malloc'd or (recently) free'd > ==20525== This is a simple NULL-deref: gSeer->mIOThread->Dispatch(event, NS_DISPATCH_NORMAL); Either |gSeer| or |gSeer->mIOThread| is NULL. This causes a segfault, and we enter Breakpad, and Breakpad does a clone() call that Valgrind can't handle. This would also cause a crash if it happened when not running under Valgrind. It's likely that it manifested under Valgrind because thread scheduling is different to native runs.
![]() |
Reporter | |
Comment 1•8 years ago
|
||
Happened again on TBPL: https://bugzilla.mozilla.org/show_bug.cgi?id=961614#c6
I don't disappear to the mountains until Friday :) It's almost certainly mIOThread that's null - gSeer only gets set to null when the dtor runs, so unless there's something incredibly crazy going on, there's no way gSeer will be null (since the seer is a singleton). That said, there's no need to be referencing gSeer in Seer::Observe, anyway (it's likely an artifact of when I had a separate observer object), so it's probably worth cleaning that up when I add the check to make sure we don't do anything stupid like deref a null pointer :)
Here's a thing that should do what we want (along with an extra sanity check for threading in Seer::Observe) Running through try for valgrind (will retrigger a few times for reasonable certainty that it's fixed): https://tbpl.mozilla.org/?tree=Try&rev=3484a34b04aa
Attachment #8363839 -
Flags: review?(mcmanus)
![]() |
Reporter | |
Comment 5•8 years ago
|
||
> Running through try for valgrind (will retrigger a few times for reasonable
> certainty that it's fixed):
It's sufficiently rare -- it has happened only two or three times in the past few days -- that you'd need a *lot* of retriggers to be confident. But looking at the patch it seems pretty likely that you've fixed it.
Updated•8 years ago
|
Attachment #8363839 -
Flags: review?(mcmanus) → review+
Comment 7•8 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/b89f28d7f0f9
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla29
Comment 8•8 years ago
|
||
AFAICT, this affects 27/28 as well. Probably too late for an uplift to 27, but can we nominate it for 28 please?
status-firefox27:
--- → wontfix
status-firefox28:
--- → affected
status-firefox29:
--- → fixed
status-firefox-esr24:
--- → unaffected
Actually doesn't affect 27 or 28 - the code that included the null deref is new in 29. And even if it weren't, the seer is disabled on 27/28.
You need to log in
before you can comment on or make changes to this bug.
Description
•