Closed Bug 1669698 Opened 4 years ago Closed 4 years ago

Crash with Marionette and Remote Agent enabled in [@ remote::startup::handler::RemoteAgentHandler::allocate::Observe]

Categories

(Remote Protocol :: Agent, defect, P1)

Unspecified
macOS
defect

Tracking

(firefox83 fixed)

RESOLVED FIXED
83 Branch
Tracking Status
firefox83 --- fixed

People

(Reporter: whimboo, Assigned: whimboo)

References

Details

(Keywords: crash)

Crash Data

Attachments

(1 file)

I can permanently reproduce this crash during shutdown of Firefox with the following steps:

  1. Start Firefox with the --remote-debugger argument
  2. When started quit Firefox
  3. Wait for the crash reporter dialog to pop-up.

Crash report: https://crash-stats.mozilla.org/report/index/ac8a24bf-40e4-4fd7-bcd1-f30bf0201007

Reason: EXC_BAD_ACCESS / KERN_INVALID_ADDRESS

Top 10 frames of crashing thread:

0 XUL remote::startup::handler::RemoteAgentHandler::allocate::Observe remote/startup/handler.rs:39
1 XUL nsObserverList::NotifyObservers xpcom/ds/nsObserverList.cpp:70
2 XUL nsObserverService::NotifyObservers xpcom/ds/nsObserverService.cpp:287
3 XUL NS_InvokeByIndex 
4 XUL XPCWrappedNative::CallMethod js/xpconnect/src/XPCWrappedNative.cpp:1142
5 XUL XPC_WN_CallMethod js/xpconnect/src/XPCWrappedNativeJSOps.cpp:947
6 XUL js::InternalCallOrConstruct js/src/vm/Interpreter.cpp:600
7 XUL Interpret js/src/vm/Interpreter.cpp:3337
8 XUL js::InternalCallOrConstruct js/src/vm/Interpreter.cpp:637
9 XUL JS_CallFunctionValue js/src/jsapi.cpp:2768

Could this just be a moved stack signature now that bug 1660893 is fixed?

Flags: needinfo?(nika)
Blocks: 1669746

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] from comment #0)

Could this just be a moved stack signature now that bug 1660893 is fixed?

Clearly not the case. I can even see it for a Firefox Nightly build from April this year:
https://crash-stats.mozilla.org/report/index/210878f4-8b42-4821-903c-a85ad0201007

I wonder if that is related to MacOS 10.15.x, or some other system changes.

Interestingly this is broken with a specific profile only. When I use that Firefox Nightly build from April with a new profile it all works, and the same for the most recent Nightly. I will check which profile setting actually causes that.

Btw when I get this crash there is also this strange second entry printed to stderr:

DevTools listening on ws://localhost:9222/devtools/browser/c77db8cb-635e-dc4a-856e-f2409e4dcc5d
DevTools listening on true

This code can be found at:
https://searchfox.org/mozilla-central/rev/f82d5c549f046cb64ce5602bfd894b7ae807c8f8/remote/startup/handler.rs#145-146

So I assume something causes the remote agent to initialize twice, so that the remote-listening observer notification is fired twice:

https://searchfox.org/mozilla-central/source/remote/RemoteAgent.jsm#85-103

But no idea how wsDebuggerURL could end-up as true:
https://searchfox.org/mozilla-central/rev/f82d5c549f046cb64ce5602bfd894b7ae807c8f8/remote/targets/MainProcessTarget.jsm#40

I'm moving this bug to Remote Agent for now given that it seems to be a bug in our implementation.

Component: XPCOM → Agent
Flags: needinfo?(nika)
Product: Core → Remote Protocol

Actually this crash only happens when both Marionette and the Remote Agent are enabled. Here updated steps:

  1. Start Firefox with the --remote-debugging-port and --marionette argument set
  2. When Firefox has been started quit it right away
  3. Wait for the crash reporter dialog to pop-up.
Summary: Crash in [@ remote::startup::handler::RemoteAgentHandler::allocate::Observe] → Crash with Marionette and Remote Agent enabled in [@ remote::startup::handler::RemoteAgentHandler::allocate::Observe]
Blocks: 1669218

The failing line here is clearly:

https://searchfox.org/mozilla-central/rev/f82d5c549f046cb64ce5602bfd894b7ae807c8f8/remote/startup/handler.rs#145

let url = unsafe { wstring_to_cstring(data) }.map_err(|_| NS_ERROR_FAILURE)?;

In case data is a boolean as in the above case this will fail all the time when starting Firefox directly from the command line, but not when started via mach run:

/Users/henrik/code/gecko/obj/full/dist/Nightly.app/Contents/MacOS/firefox --marionette --remote-debugging-port=0 --profilemanager -no-remote -foreground -profile /Users/henrik/code/gecko/obj/full/tmp/profile-default

vs.

mach run -- --marionette --remote-debugging-port=0 -profile /Users/henrik/code/gecko/obj/full/tmp/profile-default

Maybe mach disables some crash reporting bits by default? That wouldn't be ideal. Nika or Andrew, does one of you have an idea?

Flags: needinfo?(nika)
Flags: needinfo?(ahal)

remote-listening will have the data field being the string "true" when notified from this callsite: https://searchfox.org/mozilla-central/rev/f82d5c549f046cb64ce5602bfd894b7ae807c8f8/testing/marionette/components/marionette.js#521

It seems like this issue is being caused by an observer notification name conflict, where marionette is using the remote-listening notification for one purpose, and the remote debugger service is using it for a different purpose (https://searchfox.org/mozilla-central/rev/f82d5c549f046cb64ce5602bfd894b7ae807c8f8/remote/RemoteAgent.jsm#94-98). These two notifications probably shouldn't use the same name.

Flags: needinfo?(nika) → needinfo?(hskupin)

Yes, I know but this is on purpose. Both use the same observer topic, and I will have a fix for that soon.

The bigger problem I have is failing to understand why starting Firefox via mach run doesn't cause the crash. That was actually the main reason why I set needinfo for you. Not sure which side-effects this could have for CI.

Flags: needinfo?(hskupin) → needinfo?(nika)

Unfortunately I'm not super familiar with the behaviour of "mach run", compared to starting the browser normally, so I have no clue why it might only be happening in that scenario. I could believe that it configures things differently when starting up though.

Flags: needinfo?(nika)

I'm not super familiar with mach run either.. though I do know it generates a new clean profile by default. So I'd guess it works with mach run for the same reason that you found in comment 2. You can pass -P if you want to use an existing profile.

Flags: needinfo?(ahal)
Crash Signature: [@ remote::startup::handler::RemoteAgentHandler::allocate::Observe] → [@ remote::startup::handler::RemoteAgentHandler::allocate::Observe ] [@ remote::startup::handler::RemoteAgentHandler::allocate::Observe]

So as it turns out the safest bet here is to really use different observer topics for Marionette and Remote Protocol. This would be necessary anyway given that when both remote services are enabled, and only one gets disabled, the browser should still be in remote control mode and as such needs flags for each of those.

I will work on that by next week.

Assignee: nobody → hskupin
Severity: -- → S3
Status: NEW → ASSIGNED
Priority: -- → P1
Pushed by hskupin@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/8375c9d83d20 [marionette] Use dedicated "marionette-listening" notification to inform the browser when marionette is active. r=marionette-reviewers,jdescottes,Gijs
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → 83 Branch
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: