Closed Bug 1699298 Opened 3 months ago Closed 2 months ago

crash near null in [@ mozilla::extensions::MatchPattern::Init]

Categories

(WebExtensions :: General, defect, P5)

defect

Tracking

(firefox-esr78 wontfix, firefox87 wontfix, firefox88 wontfix, firefox89 wontfix, firefox90 fixed)

RESOLVED FIXED
90 Branch
Tracking Status
firefox-esr78 --- wontfix
firefox87 --- wontfix
firefox88 --- wontfix
firefox89 --- wontfix
firefox90 --- fixed

People

(Reporter: tsmith, Assigned: rpl)

References

(Blocks 1 open bug)

Details

(Keywords: crash)

Crash Data

Attachments

(1 file)

This happened once randomly while trying to reproduce another issue. Found with m-c 20210316-0d51fdccaa96.

==22201==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000008 (pc 0x7f5ebaeb4c25 bp 0x7fff20410e20 sp 0x7fff20410e10 T0)
==22201==The signal is caused by a READ memory access.
==22201==Hint: address points to the zero page.
    #0 0x7f5ebaeb4c25 in nsTArray_base<nsTArrayInfallibleAllocator, nsTArray_RelocateUsingMemutils>::Hdr() const /home/twsmith/code/mozilla-central/objdir-ff-ubsan/dist/include/nsTArray.h:578:51
    #1 0x7f5ebf9c62e4 in nsTArray_Impl<RefPtr<nsAtom>, nsTArrayInfallibleAllocator>::Elements() const /home/twsmith/code/mozilla-central/objdir-ff-ubsan/dist/include/nsTArray.h:1191:47
    #2 0x7f5ebf9c5e3d in unsigned long nsTArray_Impl<RefPtr<nsAtom>, nsTArrayInfallibleAllocator>::BinaryIndexOf<nsAtom const*, nsDefaultComparator<RefPtr<nsAtom>, nsAtom const*> >(nsAtom const* const&, nsDefaultComparator<RefPtr<nsAtom>, nsAtom const*> const&) const /home/twsmith/code/mozilla-central/objdir-ff-ubsan/dist/include/nsTArray.h:1420:9
    #3 0x7f5ebf9c5c52 in unsigned long nsTArray_Impl<RefPtr<nsAtom>, nsTArrayInfallibleAllocator>::BinaryIndexOf<nsAtom const*>(nsAtom const* const&) const /home/twsmith/code/mozilla-central/objdir-ff-ubsan/dist/include/nsTArray.h:1442:12
    #4 0x7f5ebf9c5b5c in bool nsTArray_Impl<RefPtr<nsAtom>, nsTArrayInfallibleAllocator>::ContainsSorted<nsAtom const*>(nsAtom const* const&) const /home/twsmith/code/mozilla-central/objdir-ff-ubsan/dist/include/nsTArray.h:1333:12
    #5 0x7f5ebf9c5aa6 in mozilla::extensions::AtomSet::Contains(nsAtom const*) const /home/twsmith/code/mozilla-central/objdir-ff-ubsan/dist/include/mozilla/extensions/MatchPattern.h:60:19
    #6 0x7f5ed32f45b9 in mozilla::extensions::MatchPattern::Init(JSContext*, nsTSubstring<char16_t> const&, bool, bool, mozilla::ErrorResult&) /home/twsmith/code/mozilla-central/toolkit/components/extensions/MatchPattern.cpp:280:53
    #7 0x7f5ed32f3dea in mozilla::extensions::MatchPattern::Constructor(mozilla::dom::GlobalObject&, nsTSubstring<char16_t> const&, mozilla::dom::MatchPatternOptions const&, mozilla::ErrorResult&) /home/twsmith/code/mozilla-central/toolkit/components/extensions/MatchPattern.cpp:242:12
    #8 0x7f5ed32f845a in mozilla::extensions::MatchPatternSet::Constructor(mozilla::dom::GlobalObject&, nsTArray<mozilla::dom::OwningStringOrMatchPattern> const&, mozilla::dom::MatchPatternOptions const&, mozilla::ErrorResult&) /home/twsmith/code/mozilla-central/toolkit/components/extensions/MatchPattern.cpp:497:11
    #9 0x7f5ec2d690c8 in mozilla::dom::MatchPatternSet_Binding::_constructor(JSContext*, unsigned int, JS::Value*) /home/twsmith/code/mozilla-central/objdir-ff-ubsan/dom/bindings/MatchPatternBinding.cpp:1849:68
    #10 0x7f5ed42c8b28 in CallJSNative(JSContext*, bool (*)(JSContext*, unsigned int, JS::Value*), js::CallReason, JS::CallArgs const&) /home/twsmith/code/mozilla-central/js/src/vm/Interpreter.cpp:435:13
    #11 0x7f5ed42c8b28 in CallJSNativeConstructor(JSContext*, bool (*)(JSContext*, unsigned int, JS::Value*), JS::CallArgs const&) /home/twsmith/code/mozilla-central/js/src/vm/Interpreter.cpp:451:8
    #12 0x7f5ed42c8b28 in InternalConstruct(JSContext*, js::AnyConstructArgs const&) /home/twsmith/code/mozilla-central/js/src/vm/Interpreter.cpp:643:10
    #13 0x7f5ed42c7299 in js::ConstructFromStack(JSContext*, JS::CallArgs const&) /home/twsmith/code/mozilla-central/js/src/vm/Interpreter.cpp:670:10
    #14 0x7f5ed42997f3 in Interpret(JSContext*, js::RunState&) /home/twsmith/code/mozilla-central/js/src/vm/Interpreter.cpp:3234:16
    #15 0x7f5ed4256f20 in js::RunScript(JSContext*, js::RunState&) /home/twsmith/code/mozilla-central/js/src/vm/Interpreter.cpp:405:13
    #16 0x7f5ed42c5d62 in js::InternalCallOrConstruct(JSContext*, JS::CallArgs const&, js::MaybeConstruct, js::CallReason) /home/twsmith/code/mozilla-central/js/src/vm/Interpreter.cpp:552:13
    #17 0x7f5ed42c6ca3 in InternalCall(JSContext*, js::AnyInvokeArgs const&, js::CallReason) /home/twsmith/code/mozilla-central/js/src/vm/Interpreter.cpp:580:10
    #18 0x7f5ed42c6f5a in js::Call(JSContext*, JS::Handle<JS::Value>, JS::Handle<JS::Value>, js::AnyInvokeArgs const&, JS::MutableHandle<JS::Value>, js::CallReason) /home/twsmith/code/mozilla-central/js/src/vm/Interpreter.cpp:597:8
    #19 0x7f5ed4fba4d6 in js::CallSelfHostedFunction(JSContext*, JS::Handle<js::PropertyName*>, JS::Handle<JS::Value>, js::AnyInvokeArgs const&, JS::MutableHandle<JS::Value>) /home/twsmith/code/mozilla-central/js/src/vm/SelfHosting.cpp:1642:10
    #20 0x7f5ed678f483 in js::jit::InterpretResume(JSContext*, JS::Handle<JSObject*>, JS::Value*, JS::MutableHandle<JS::Value>) /home/twsmith/code/mozilla-central/js/src/jit/VMFunctions.cpp:1306:10
    #21 0x2ac0e7b5a9ff  (<unknown module>)

A Pernosco session is available here: https://pernos.co/debug/2KcvHyKOR-YkYtLrvo3Rhw/index.html

(In reply to Tyson Smith [:tsmith] from comment #1)

A Pernosco session is available here: https://pernos.co/debug/2KcvHyKOR-YkYtLrvo3Rhw/index.html

Thanks for the link to the Pernosco session, I was going to ask you if you had one that we could use to more easily dig into this :-)

Based on a quick look to the Pernosco session:

(pernosco) call DumpJSStack() 
0 loadManifest() ["resource://gre/modules/Extension.jsm":1258:26]
1 InterpretGeneratorResume(gen = "[object Object]", val = "[object Object],", kind = ""next"") ["self-hosted":1480:33]
2 AsyncFunctionNext(val = "[object Object],") ["self-hosted":690:26]
    this = [object Object]

This seems like an issue with the XPCShell test setup, not happening in the real firefox app.

Severity: -- → S4
Priority: -- → P5

(In reply to Tomislav Jovanovic :zombie from comment #3)

This seems like an issue with the XPCShell test setup, not happening in the real firefox app.

This was found running Firefox.

(In reply to Tyson Smith [:tsmith] from comment #4)

(In reply to Tomislav Jovanovic :zombie from comment #3)

This seems like an issue with the XPCShell test setup, not happening in the real firefox app.

This was found running Firefox.

My apologies, I think that there was a misunderstanding while we discussed about this during the triaging meeting:
I think that I may likely have said that this wasn't a crash report from a user but a crash triggered with fuzzy testing, but I may have said that while Tomislav was asking if this was an xpcshell test and we ended up misunderstanding each other (my fault, I should have double-checked the issue).

As I did mention in the meeting, I'm going to come back to this issue to take another look to the pernosco session, it is odd that we are still loading extension manifest after we have called the xpcom shutdown handlers and I wanted to double-check why was it happening (e.g. I was wondering if this was a very short living session and we started to shutdown while we were still in the process of loading the extensions).

Flags: needinfo?(lgreco)

By looking a bit more into the pernosco session attached to this bug, I did notice that
MatchPattern::Init is being called late in the xpcom shutdown (seems to be triggered by
some pending Promise jobs related to loading the builtin search extension, which in that
run got scheduled during the late shutdown).

Under these conditions, AtomSet::Get does allocate again the static RefPtr (because the
previous one was already destroyed as part of the xpcom shutdown), but it gets destroyed
by ClearOnShutdown before the RefPtr gets to the caller:

This patch does:

  • check explicitly in MatchPattern::Init if we are get past the XPCOMShutdownFinal phase,
    and throws an explicit NS_ERROR_ILLEGAL_DURING_SHUTDOWN error if that is the case

  • change the signature for AtomSet::Get static method, to have an explicit non discardable
    nsresult as its result value, to track down other similar AtomSet::Get calls that may
    have to take this scenario into account (and to make it a bit more clear from the method
    signature that it can actually fail and the returned nsresult should be checked before
    actually using the refptr).

As an additional side note:

  • I've looked into existing crash reports with this signature but I didn't find any
    (but I won't exclude that I may have not looked enough)

  • I've tried to reproduce conditions similar to the ones that I observed in the pernosco session
    (by cheating a bit and trying to force similar conditions with some temporary changes applied locally)
    but I wasn't able to trigger it in the exact same way

Nevertheless by looking into some past bugzilla issues it looks that

  • it did already happen in the past that we would be still loading search extensions late in shutdown,
    (e.g. Bug 1620227) and so it doesn't seem totally surprising that there are other ways that we may
    be still in the progress.

  • we did fix some other crashes due to some C++ code being triggered by the js runtime late in
    shutdown (e.g. Bug 1663315 is one that I did notice in the middle of the bugs I looked), and
    so it doesn't seem that unexpected that we may still have a call to MatchPattern::Init that
    late in the shutdown.

And so this seems to be a legit scenario to account for, even if I wasn't able to trigger it
exactly like Grizzly triggered it during a fuzzy test run.

Assignee: nobody → lgreco
Status: NEW → ASSIGNED

I looked again into this last Friday and I did notice some more details that I did miss the first time I was investigating the scenario of the crash from the pernosco session:

AtomSet::Get is creating a new AtomSet but it gets destroyed before it gets to the caller because MatchPattern::Init is being called late in the shutdown (I did put some details about what I did notice in the comment included in the patch attached in comment 6).

It looked a legit scenario that we may have to account for, even if I have not been able to "force trigger it" locally, and so I opted to prepare a patch to deal with it, and eventually mark the patch as obsolete if (based on a feedback/review from the reviewer) it turns out that I did misread some of the details of the triggering scenario from the pernosco session and we should use a completely different approach to handle it.

Flags: needinfo?(lgreco)
Pushed by luca.greco@alcacoop.it:
https://hg.mozilla.org/integration/autoland/rev/8a0a264d20c5
Ensure MatchPattern::Init throws and AtomSet::Get returns NS_ERROR_ILLEGAL_DURING_SHUTDOWN if called late on shutdown. r=mccr8
Status: ASSIGNED → RESOLVED
Closed: 2 months ago
Resolution: --- → FIXED
Target Milestone: --- → 90 Branch

Since the status are different for nightly and release, what's the status for beta?
For more information, please visit auto_nag documentation.

Looking at Socorro, I'm only seeing reports with this signature from ESR78. Some look like writes at random addresses. I'm wondering if this is worth uplifting to Beta/ESR? The patch grafts cleanly.

Crash Signature: [@ mozilla::extensions::MatchPattern::Init]
Flags: needinfo?(lgreco)

(In reply to Ryan VanderMeulen [:RyanVM] from comment #11)

Looking at Socorro, I'm only seeing reports with this signature from ESR78. Some look like writes at random addresses. I'm wondering if this is worth uplifting to Beta/ESR? The patch grafts cleanly.

I discussed about this with Ryan over slack and looked into a bunch of the crash signature from ESR78 butm if I'm not mistaken, it looks those were happening on the startup path instead of the shutdown path (e.g. in the ones I looked to there is nsWindowWatcher::OpenWindow and mozilla::AppWindow::ShowModal which I doubt we hit over the shutdown path, but I don't exclude that I may be wrong and there could be a chance to hit them during the shutdown anyway).

If the crash signature on ESR78 are all happening in the startup path as it seems, then the patch we landed would not have any effect (because what the patch landed on central is doing is: checking explicitly inside AtomSet::Get if we are late in the shutdown and return an explicit failure).

If the crash reports on ESR78 do increase in occurrence, then it may be worth investigating them separately (because they may have a different kind of root cause from the one originally reported in this issue).

(I'm also updating the tracking flags accordingly as Ryan suggested me over slack).

You need to log in before you can comment on or make changes to this bug.