Closed Bug 961860 Opened 11 years ago Closed 11 years ago

Intermittent null-deref in Seer::Observe()

Categories

(Core :: Networking, defect)

x86
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla29
Tracking Status
firefox27 --- unaffected
firefox28 --- unaffected
firefox29 --- fixed
firefox-esr24 --- unaffected

People

(Reporter: n.nethercote, Assigned: u408661)

References

()

Details

(Keywords: intermittent-failure)

Attachments

(1 file)

Valgrind catches this. https://tbpl.mozilla.org/php/getParsedLog.php?id=33224002&tree=Mozilla-Inbound > ==20525== Invalid read of size 8 > ==20525== at 0x80FC00F: mozilla::net::Seer::Observe(nsISupports*, char const*, char16_t const*) (Seer.cpp:355) > ==20525== by 0x80AC7EF: nsTimerImpl::Fire() (nsTimerImpl.cpp:559) > ==20525== by 0x80AC8C9: nsTimerEvent::Run() (nsTimerImpl.cpp:635) > ==20525== by 0x80A880C: nsThread::ProcessNextEvent(bool, bool*) (nsThread.cpp:637) > ==20525== by 0x80580F4: NS_ProcessNextEvent(nsIThread*, bool) (nsThreadUtils.cpp:263) > ==20525== by 0x80AD128: nsThread::Shutdown() (nsThread.cpp:488) > ==20525== by 0x8E79DE3: mozilla::dom::workers::RuntimeService::NoteIdleThread(mozilla::dom::workers::RuntimeService::WorkerThread*) (RuntimeService.cpp:2198) > ==20525== by 0x8E79F21: (anonymous namespace)::WorkerThreadPrimaryRunnable::FinishedRunnable::Run() (RuntimeService.cpp:2597) > ==20525== by 0x80A880C: nsThread::ProcessNextEvent(bool, bool*) (nsThread.cpp:637) > ==20525== by 0x80580F4: NS_ProcessNextEvent(nsIThread*, bool) (nsThreadUtils.cpp:263) > ==20525== by 0x80AD128: nsThread::Shutdown() (nsThread.cpp:488) > ==20525== by 0x8CC2E86: nsHtml5ParserThreadTerminator::Observe(nsISupports*, char const*, char16_t const*) (nsHtml5Module.cpp:100) > ==20525== by 0x808726B: nsObserverList::NotifyObservers(nsISupports*, char const*, char16_t const*) (nsObserverList.cpp:96) > ==20525== by 0x8087327: nsObserverService::NotifyObservers(nsISupports*, char const*, char16_t const*) (nsObserverService.cpp:302) > ==20525== by 0x805B69F: mozilla::ShutdownXPCOM(nsIServiceManager*) (nsXPComInit.cpp:708) > ==20525== by 0x805BB87: NS_ShutdownXPCOM (nsXPComInit.cpp:656) > ==20525== by 0x97B4512: ScopedXPCOMStartup::~ScopedXPCOMStartup() (nsAppRunner.cpp:1163) > ==20525== by 0x97BAB55: XREMain::XRE_main(int, char**, nsXREAppData const*) (nsAppRunner.cpp:4116) > ==20525== by 0x97BAD9E: XRE_main (nsAppRunner.cpp:4331) > ==20525== by 0x404933: do_main(int, char**, nsIFile*) (nsBrowserApp.cpp:280) > ==20525== Address 0x0 is not stack'd, malloc'd or (recently) free'd > ==20525== This is a simple NULL-deref: gSeer->mIOThread->Dispatch(event, NS_DISPATCH_NORMAL); Either |gSeer| or |gSeer->mIOThread| is NULL. This causes a segfault, and we enter Breakpad, and Breakpad does a clone() call that Valgrind can't handle. This would also cause a crash if it happened when not running under Valgrind. It's likely that it manifested under Valgrind because thread scheduling is different to native runs.
Blocks: 961614
i believe nick is away for a few days
Assignee: nobody → hurley
I don't disappear to the mountains until Friday :) It's almost certainly mIOThread that's null - gSeer only gets set to null when the dtor runs, so unless there's something incredibly crazy going on, there's no way gSeer will be null (since the seer is a singleton). That said, there's no need to be referencing gSeer in Seer::Observe, anyway (it's likely an artifact of when I had a separate observer object), so it's probably worth cleaning that up when I add the check to make sure we don't do anything stupid like deref a null pointer :)
Attached patch patchSplinter Review
Here's a thing that should do what we want (along with an extra sanity check for threading in Seer::Observe) Running through try for valgrind (will retrigger a few times for reasonable certainty that it's fixed): https://tbpl.mozilla.org/?tree=Try&rev=3484a34b04aa
Attachment #8363839 - Flags: review?(mcmanus)
> Running through try for valgrind (will retrigger a few times for reasonable > certainty that it's fixed): It's sufficiently rare -- it has happened only two or three times in the past few days -- that you'd need a *lot* of retriggers to be confident. But looking at the patch it seems pretty likely that you've fixed it.
Attachment #8363839 - Flags: review?(mcmanus) → review+
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla29
AFAICT, this affects 27/28 as well. Probably too late for an uplift to 27, but can we nominate it for 28 please?
Actually doesn't affect 27 or 28 - the code that included the null deref is new in 29. And even if it weren't, the seer is disabled on 27/28.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: