Closed Bug 1544522 Opened 6 years ago Closed 5 years ago

Intermittent runner.py | application crashed [@ mozilla::dom::ClientSource::MaybeCreateInitialDocument()] [@ mozilla::dom::ClientSource::SnapshotState(mozilla::dom::ClientState*) ]

Categories

(Core :: DOM: Service Workers, defect, P3)

RESOLVED FIXED
84 Branch
Tracking Status
firefox-esr78 --- unaffected
firefox82 --- wontfix
firefox83 --- wontfix
firefox84 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: jstutte)

References

(Regression)

Details

(Keywords: crash, intermittent-failure, regression, Whiteboard: [retriggered][stockwell disable-recommended])

Attachments

(1 file)

Filed by: ccoroiu [at] mozilla.com

https://treeherder.mozilla.org/logviewer.html#?job_id=240389254&repo=mozilla-central

https://queue.taskcluster.net/v1/task/EBRqC5uCRv6MsXTSaiaxYw/runs/0/artifacts/public/logs/live_backing.log

17:18:36 INFO - PROCESS-CRASH | runner.py | application crashed [@ mozilla::dom::ClientSource::MaybeCreateInitialDocument()]
17:18:36 INFO - Crash dump filename: c:\users\task_1555341419\appdata\local\temp\tmp73zu0s.mozrunner\minidumps\f3e9d38e-90ee-4ee2-8140-d43abfdd9be5.dmp
17:18:36 INFO - Operating system: Windows NT
17:18:36 INFO - 10.0.15063
17:18:36 INFO - CPU: amd64
17:18:36 INFO - family 6 model 142 stepping 9
17:18:36 INFO - 4 CPUs
17:18:36 INFO - GPU: UNKNOWN
17:18:36 INFO - Crash reason: EXCEPTION_BREAKPOINT
17:18:36 INFO - Crash address: 0x7fff24f5c3b4
17:18:36 INFO - Assertion: Unknown assertion type 0x00000000
17:18:36 INFO - Process uptime: 47 seconds
17:18:36 INFO - Thread 0 (crashed)
17:18:36 INFO - 0 xul.dll!void mozilla::dom::ClientSource::MaybeCreateInitialDocument() [ClientSource.cpp:16d953cca41483b114d70a3132fbcfe60755708f : 141 + 0x0]
17:18:36 INFO - rax = 0x00007fff27895b2b rdx = 0x0000000000000000
17:18:36 INFO - rcx = 0x00007fff470a3ba0 rbx = 0x000001a366266200
17:18:36 INFO - rsi = 0x000001a366266200 rdi = 0x000001a366266200
17:18:36 INFO - rbp = 0x00000000000005c8 rsp = 0x0000003fe09fe9d0
17:18:36 INFO - r8 = 0x0000000000000000 r9 = 0x000001a3665288b0
17:18:36 INFO - r10 = 0x000001a3665020cc r11 = 0x000001a366502300
17:18:36 INFO - r12 = 0x000001a366507318 r13 = 0x00000000ffffffff
17:18:36 INFO - r14 = 0x000001a35cf89880 r15 = 0x000001a366507310
17:18:36 INFO - rip = 0x00007fff24f5c3b4
17:18:36 INFO - Found by: given as instruction pointer in context
17:18:36 INFO - 1 xul.dll!mozilla::dom::ClientSource::SnapshotState(mozilla::dom::ClientState *) [ClientSource.cpp:16d953cca41483b114d70a3132fbcfe60755708f : 641 + 0x8]
17:18:36 INFO - rbx = 0x000001a366266200 rbp = 0x00000000000005c8
17:18:36 INFO - rsp = 0x0000003fe09fea00 r12 = 0x000001a366507318
17:18:36 INFO - r13 = 0x00000000ffffffff r14 = 0x000001a35cf89880
17:18:36 INFO - r15 = 0x000001a366507310 rip = 0x00007fff24f5d200
17:18:36 INFO - Found by: call frame info
17:18:36 INFO - 2 xul.dll!class RefPtr<mozilla::MozPromise<mozilla::dom::ClientOpResult,nsresult,0> > mozilla::dom::ClientSource::GetInfoAndState(const class mozilla::dom::ClientGetInfoAndStateArgs & const) [ClientSource.cpp:16d953cca41483b114d70a3132fbcfe60755708f : 627 + 0x5]
17:18:36 INFO - rbx = 0x000001a366266200 rbp = 0x00000000000005c8
17:18:36 INFO - rsp = 0x0000003fe09fea70 r12 = 0x000001a366507318
17:18:36 INFO - r13 = 0x00000000ffffffff r14 = 0x000001a35cf89880
17:18:36 INFO - r15 = 0x000001a366507310 rip = 0x00007fff24f5d947
17:18:36 INFO - Found by: call frame info
17:18:36 INFO - 3 xul.dll!static void mozilla::dom::ClientSourceOpChild::DoSourceOp<RefPtr<mozilla::MozPromise<mozilla::dom::ClientOpResult,nsresult,0> > (mozilla::dom::ClientSource::*)(const mozilla::dom::ClientGetInfoAndStateArgs &),mozilla::dom::ClientGetInfoAndStateArgs>( *, const class mozilla::dom::ClientGetInfoAndStateArgs & const) [ClientSourceOpChild.cpp:16d953cca41483b114d70a3132fbcfe60755708f : 42 + 0xb]
17:18:36 INFO - rbx = 0x000001a366266200 rbp = 0x00000000000005c8
17:18:36 INFO - rsp = 0x0000003fe09fec10 r12 = 0x000001a366507318
17:18:36 INFO - r13 = 0x00000000ffffffff r14 = 0x000001a35cf89880
17:18:36 INFO - r15 = 0x000001a366507310 rip = 0x00007fff24f5e64b
17:18:36 INFO - Found by: call frame info
17:18:36 INFO - 4 xul.dll!class mozilla::ipc::IPCResult mozilla::dom::ClientSourceChild::RecvPClientSourceOpConstructor(class mozilla::dom::PClientSourceOpChild *, const class mozilla::dom::ClientOpConstructorArgs & const) [ClientSourceChild.cpp:16d953cca41483b114d70a3132fbcfe60755708f : 43 + 0x8]
17:18:36 INFO - rbx = 0x000001a366266200 rbp = 0x00000000000005c8
17:18:36 INFO - rsp = 0x0000003fe09fed90 r12 = 0x000001a366507318
17:18:36 INFO - r13 = 0x00000000ffffffff r14 = 0x000001a35cf89880
17:18:36 INFO - r15 = 0x000001a366507310 rip = 0x00007fff24f5de33
17:18:36 INFO - Found by: call frame info
17:18:36 INFO - 5 xul.dll!mozilla::dom::PClientSourceChild::OnMessageReceived(IPC::Message const &) [PClientSourceChild.cpp: : 324 + 0x16]
17:18:36 INFO - rbx = 0x000001a366266200 rbp = 0x00000000000005c8
17:18:36 INFO - rsp = 0x0000003fe09fedc0 r12 = 0x000001a366507318
17:18:36 INFO - r13 = 0x00000000ffffffff r14 = 0x000001a35cf89880
17:18:36 INFO - r15 = 0x000001a366507310 rip = 0x00007fff2282289f
17:18:36 INFO - Found by: call frame info
17:18:36 INFO - 6 xul.dll!void mozilla::ipc::MessageChannel::DispatchMessage(class IPC::Message *) [MessageChannel.cpp:16d953cca41483b114d70a3132fbcfe60755708f : 2078 + 0x4f]
17:18:36 INFO - rbx = 0x000001a366266200 rbp = 0x00000000000005c8
17:18:36 INFO - rsp = 0x0000003fe09feff0 r12 = 0x000001a366507318
17:18:36 INFO - r13 = 0x00000000ffffffff r14 = 0x000001a35cf89880
17:18:36 INFO - r15 = 0x000001a366507310 rip = 0x00007fff226d626a
17:18:36 INFO - Found by: call frame info
17:18:36 INFO - 7 xul.dll!nsresult mozilla::ipc::MessageChannel::MessageTask::Run() [MessageChannel.cpp:16d953cca41483b114d70a3132fbcfe60755708f : 1968 + 0xb1]
17:18:36 INFO - rbx = 0x000001a366266200 rbp = 0x00000000000005c8
17:18:36 INFO - rsp = 0x0000003fe09ff100 r12 = 0x000001a366507318
17:18:36 INFO - r13 = 0x00000000ffffffff r14 = 0x000001a35cf89880
17:18:36 INFO - r15 = 0x000001a366507310 rip = 0x00007fff226d5d94
17:18:36 INFO - Found by: call frame info
17:18:36 INFO - 8 xul.dll!nsThread::ProcessNextEvent(bool,bool *) [nsThread.cpp:16d953cca41483b114d70a3132fbcfe60755708f : 1180 + 0xb]
17:18:36 INFO - rbx = 0x000001a366266200 rbp = 0x00000000000005c8
17:18:36 INFO - rsp = 0x0000003fe09ff170 r12 = 0x000001a366507318
17:18:36 INFO - r13 = 0x00000000ffffffff r14 = 0x000001a35cf89880
17:18:36 INFO - r15 = 0x000001a366507310 rip = 0x00007fff2256dde4
17:18:36 INFO - Found by: call frame info
17:18:36 INFO - 9 xul.dll!NS_ProcessNextEvent(nsIThread *,bool) [nsThreadUtils.cpp:16d953cca41483b114d70a3132fbcfe60755708f : 486 + 0xd]
17:18:36 INFO - rbx = 0x000001a366266200 rbp = 0x00000000000005c8
17:18:36 INFO - rsp = 0x0000003fe09ff6a0 r12 = 0x000001a366507318
17:18:36 INFO - r13 = 0x00000000ffffffff r14 = 0x000001a35cf89880
17:18:36 INFO - r15 = 0x000001a366507310 rip = 0x00007fff2256d879
17:18:36 INFO - Found by: call frame info
17:18:36 INFO - 10 xul.dll!void mozilla::ipc::MessagePump::Run(class base::MessagePump::Delegate *) [MessagePump.cpp:16d953cca41483b114d70a3132fbcfe60755708f : 88 + 0xa]
17:18:36 INFO - rbx = 0x000001a366266200 rbp = 0x00000000000005c8
17:18:36 INFO - rsp = 0x0000003fe09ff6f0 r12 = 0x000001a366507318
17:18:36 INFO - r13 = 0x00000000ffffffff r14 = 0x000001a35cf89880
17:18:36 INFO - r15 = 0x000001a366507310 rip = 0x00007fff2274d63b
17:18:36 INFO - Found by: call frame info
17:18:36 INFO - 11 xul.dll!MessageLoop::RunHandler() [message_loop.cc:16d953cca41483b114d70a3132fbcfe60755708f : 308 + 0xf]
17:18:36 INFO - rbx = 0x000001a366266200 rbp = 0x00000000000005c8
17:18:36 INFO - rsp = 0x0000003fe09ff760 r12 = 0x000001a366507318
17:18:36 INFO - r13 = 0x00000000ffffffff r14 = 0x000001a35cf89880
17:18:36 INFO - r15 = 0x000001a366507310 rip = 0x00007fff22545918
17:18:36 INFO - Found by: call frame info
17:18:36 INFO - 12 xul.dll!MessageLoop::Run() [message_loop.cc:16d953cca41483b114d70a3132fbcfe60755708f : 290 + 0x5]
17:18:36 INFO - rbx = 0x000001a366266200 rbp = 0x00000000000005c8
17:18:36 INFO - rsp = 0x0000003fe09ff7b0 r12 = 0x000001a366507318
17:18:36 INFO - r13 = 0x00000000ffffffff r14 = 0x000001a35cf89880
17:18:36 INFO - r15 = 0x000001a366507310 rip = 0x00007fff2256d571
17:18:36 INFO - Found by: call frame info
17:18:36 INFO - 13 xul.dll!nsBaseAppShell::Run() [nsBaseAppShell.cpp:16d953cca41483b114d70a3132fbcfe60755708f : 137 + 0xd]
17:18:36 INFO - rbx = 0x000001a366266200 rbp = 0x00000000000005c8
17:18:36 INFO - rsp = 0x0000003fe09ff800 r12 = 0x000001a366507318
17:18:36 INFO - r13 = 0x00000000ffffffff r14 = 0x000001a35cf89880
17:18:36 INFO - r15 = 0x000001a366507310 rip = 0x00007fff2274d518
17:18:36 INFO - Found by: call frame info
17:18:36 INFO - 14 xul.dll!nsAppShell::Run() [nsAppShell.cpp:16d953cca41483b114d70a3132fbcfe60755708f : 412 + 0x8]
17:18:36 INFO - rbx = 0x000001a366266200 rbp = 0x00000000000005c8
17:18:36 INFO - rsp = 0x0000003fe09ff840 r12 = 0x000001a366507318
17:18:36 INFO - r13 = 0x00000000ffffffff r14 = 0x000001a35cf89880
17:18:36 INFO - r15 = 0x000001a366507310 rip = 0x00007fff2274ae23
17:18:36 INFO - Found by: call frame info
17:18:36 INFO - 15 xul.dll!XRE_RunAppShell() [nsEmbedFunctions.cpp:16d953cca41483b114d70a3132fbcfe60755708f : 919 + 0x6]
17:18:36 INFO - rbx = 0x000001a366266200 rbp = 0x00000000000005c8
17:18:36 INFO - rsp = 0x0000003fe09ff870 r12 = 0x000001a366507318
17:18:36 INFO - r13 = 0x00000000ffffffff r14 = 0x000001a35cf89880
17:18:36 INFO - r15 = 0x000001a366507310 rip = 0x00007fff2664ef05
17:18:36 INFO - Found by: call frame info
17:18:36 INFO - 16 xul.dll!MessageLoop::RunHandler() [message_loop.cc:16d953cca41483b114d70a3132fbcfe60755708f : 308 + 0xf]
17:18:36 INFO - rbx = 0x000001a366266200 rbp = 0x00000000000005c8
17:18:36 INFO - rsp = 0x0000003fe09ff8b0 r12 = 0x000001a366507318
17:18:36 INFO - r13 = 0x00000000ffffffff r14 = 0x000001a35cf89880
17:18:36 INFO - r15 = 0x000001a366507310 rip = 0x00007fff22545918
17:18:36 INFO - Found by: call frame info
17:18:36 INFO - 17 xul.dll!MessageLoop::Run() [message_loop.cc:16d953cca41483b114d70a3132fbcfe60755708f : 290 + 0x5]
17:18:36 INFO - rbx = 0x000001a366266200 rbp = 0x00000000000005c8
17:18:36 INFO - rsp = 0x0000003fe09ff900 r12 = 0x000001a366507318
17:18:36 INFO - r13 = 0x00000000ffffffff r14 = 0x000001a35cf89880
17:18:36 INFO - r15 = 0x000001a366507310 rip = 0x00007fff2256d571
17:18:36 INFO - Found by: call frame info
17:18:36 INFO - 18 xul.dll!XRE_InitChildProcess(int,char * * const,XREChildData const *) [nsEmbedFunctions.cpp:16d953cca41483b114d70a3132fbcfe60755708f : 757 + 0x5]
17:18:36 INFO - rbx = 0x000001a366266200 rbp = 0x00000000000005c8
17:18:36 INFO - rsp = 0x0000003fe09ff950 r12 = 0x000001a366507318
17:18:36 INFO - r13 = 0x00000000ffffffff r14 = 0x000001a35cf89880
17:18:36 INFO - r15 = 0x000001a366507310 rip = 0x00007fff2664eb69
17:18:36 INFO - Found by: call frame info
17:18:36 INFO - 19 firefox.exe!static int content_process_main(class mozilla::Bootstrap *, int, char * *) [plugin-container.cpp:16d953cca41483b114d70a3132fbcfe60755708f : 56 + 0x13]
17:18:36 INFO - rbx = 0x000001a366266200 rbp = 0x00000000000005c8
17:18:36 INFO - rsp = 0x0000003fe09ffb90 r12 = 0x000001a366507318
17:18:36 INFO - r13 = 0x00000000ffffffff r14 = 0x000001a35cf89880
17:18:36 INFO - r15 = 0x000001a366507310 rip = 0x00007ff7b509151a
17:18:36 INFO - Found by: call frame info
17:18:36 INFO - 20 firefox.exe!static int NS_internal_main(int, char * *, char * *) [nsBrowserApp.cpp:16d953cca41483b114d70a3132fbcfe60755708f : 263 + 0xa]
17:18:36 INFO - rbx = 0x000001a366266200 rbp = 0x00000000000005c8
17:18:36 INFO - rsp = 0x0000003fe09ffbf0 r12 = 0x000001a366507318
17:18:36 INFO - r13 = 0x00000000ffffffff r14 = 0x000001a35cf89880
17:18:36 INFO - r15 = 0x000001a366507310 rip = 0x00007ff7b5091452
17:18:36 INFO - Found by: call frame info
17:18:36 INFO - 21 firefox.exe!wmain [nsWindowsWMain.cpp:16d953cca41483b114d70a3132fbcfe60755708f : 131 + 0x15]
17:18:36 INFO - rbx = 0x000001a366266200 rbp = 0x00000000000005c8
17:18:36 INFO - rsp = 0x0000003fe09ffc70 r12 = 0x000001a366507318
17:18:36 INFO - r13 = 0x00000000ffffffff r14 = 0x000001a35cf89880
17:18:36 INFO - r15 = 0x000001a366507310 rip = 0x00007ff7b5091113
17:18:36 INFO - Found by: call frame info
17:18:36 INFO - 22 firefox.exe!static int __scrt_common_main_seh() [exe_common.inl : 288 + 0x22]
17:18:36 INFO - rbx = 0x000001a366266200 rbp = 0x00000000000005c8
17:18:36 INFO - rsp = 0x0000003fe09ffcd0 r12 = 0x000001a366507318
17:18:36 INFO - r13 = 0x00000000ffffffff r14 = 0x000001a35cf89880
17:18:36 INFO - r15 = 0x000001a366507310 rip = 0x00007ff7b50d94e8
17:18:36 INFO - Found by: call frame info
17:18:36 INFO - 23 kernel32.dll!BaseThreadInitThunk + 0x14
17:18:36 INFO - rbx = 0x000001a366266200 rbp = 0x00000000000005c8
17:18:36 INFO - rsp = 0x0000003fe09ffd10 r12 = 0x000001a366507318
17:18:36 INFO - r13 = 0x00000000ffffffff r14 = 0x000001a35cf89880
17:18:36 INFO - r15 = 0x000001a366507310 rip = 0x00007fff56bd2774
17:18:36 INFO - Found by: call frame info
17:18:36 INFO - 24 ntdll.dll!SdbpCheckMatchingRegistryEntry + 0x29d
17:18:36 INFO - rbx = 0x000001a366266200 rbp = 0x00000000000005c8
17:18:36 INFO - rsp = 0x0000003fe09ffd40 r12 = 0x000001a366507318
17:18:36 INFO - r13 = 0x00000000ffffffff r14 = 0x000001a35cf89880
17:18:36 INFO - r15 = 0x000001a366507310 rip = 0x00007fff56e40d61
17:18:36 INFO - Found by: call frame info
17:18:36 INFO - 25 KERNELBASE.dll + 0x67c0
17:18:36 INFO - rsp = 0x0000003fe09ffd70 rip = 0x00007fff53c867c0
17:18:36 INFO - Found by: stack scanning

Don't mark intermittent crashes as P5s. We want them to go to triage owners.

Priority: P5 → --
Priority: -- → P5

That's a crash of Firefox and doesn't belong to Raptor.

Component: Raptor → DOM: Core & HTML
Priority: P5 → --
Product: Testing → Core
Version: Version 3 → unspecified
Component: DOM: Core & HTML → DOM: Service Workers
Crash Signature: [@ mozilla::dom::ClientSource::MaybeCreateInitialDocument()] → [@ mozilla::dom::ClientSource::MaybeCreateInitialDocument()]. [@ mozilla::dom::ClientSource::SnapshotState(mozilla::dom::ClientState*)]

We have a new spike here and we have a MOZ_DIAGNOSTIC_ASSERT(GetInnerWindow()); that fails. This means:

nsPIDOMWindowInner* ClientSource::GetInnerWindow() const {
  NS_ASSERT_OWNINGTHREAD(ClientSource);
  if (!mOwner.is<RefPtr<nsPIDOMWindowInner>>()) {
    return nullptr;
  }
  return mOwner.as<RefPtr<nsPIDOMWindowInner>>();
}

returns nullptr. I assume (but am not able to read it easily in the code of Variant.h) that mOwner.is<RefPtr<nsPIDOMWindowInner>>() returns true only if mOwner is not null and of the desired type.
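The assumption above can be checked against a small model. The following is a minimal sketch using std::variant and std::shared_ptr as stand-ins for mozilla::Variant and RefPtr; the types InnerWindow, WorkerPrivate, Nothing and the alias Owner are hypothetical and only illustrate the shape of the check. With std::variant, the type test reports only which alternative is active, not whether the stored pointer is null. If mozilla::Variant::is<T>() behaves the same way (an assumption that should be verified in Variant.h), GetInnerWindow() could return nullptr on either path.

```cpp
#include <memory>
#include <variant>

// Hypothetical stand-ins for the real Gecko types; not the actual classes.
struct InnerWindow {};
struct WorkerPrivate {};
struct Nothing {};

// Assumed shape of the owner variant, for illustration only.
using Owner =
    std::variant<Nothing, std::shared_ptr<InnerWindow>, WorkerPrivate*>;

// Same control flow as the GetInnerWindow() quoted above, on the stand-ins.
InnerWindow* GetInnerWindow(const Owner& owner) {
  if (!std::holds_alternative<std::shared_ptr<InnerWindow>>(owner)) {
    return nullptr;  // The variant currently holds a different alternative.
  }
  // The alternative matches, but the pointer it stores may still be null.
  return std::get<std::shared_ptr<InnerWindow>>(owner).get();
}
```

In this model, an Owner holding an empty std::shared_ptr<InnerWindow> passes the type check and still yields nullptr, so a null result alone does not tell which of the two cases occurred.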

As this ClientSource message seems to come from a different thread or even process, rather than crashing the main thread we should probably improve failure handling here and leave it to the caller to work out what it did wrong?
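One way to picture the suggested failure handling is sketched below. This is not the Gecko code and not necessarily the fix that landed; Source, ClientState, OpStatus and HandleGetInfoAndState are hypothetical names, and the hasInnerWindow flag stands in for the GetInnerWindow() check. The idea is only that the snapshot step reports "no window" to its caller, which then answers the request with an error instead of asserting.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical stand-ins; names are illustrative, not the Gecko API.
struct ClientState {
  std::string visibility;
};

struct Source {
  bool hasInnerWindow = false;  // Stand-in for GetInnerWindow() != nullptr.

  // Returns std::nullopt when no inner window is available,
  // instead of asserting and crashing the process.
  std::optional<ClientState> SnapshotState() const {
    if (!hasInnerWindow) {
      return std::nullopt;
    }
    return ClientState{"visible"};
  }
};

enum class OpStatus { Ok, InvalidState };

// The message handler maps a missing snapshot to an error reply,
// leaving it to the requesting side to decide what to do next.
OpStatus HandleGetInfoAndState(const Source& source, ClientState* out) {
  assert(out != nullptr);
  std::optional<ClientState> state = source.SnapshotState();
  if (!state) {
    return OpStatus::InvalidState;
  }
  *out = *state;
  return OpStatus::Ok;
}
```

The trade-off in such a design is that a real invariant violation becomes a quiet error path, so it would likely need logging or telemetry to stay visible.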

Oh, and it might be related to bug 1583859?

Flags: needinfo?(bugmail)

In the last 7 days there have been 24 occurrences on linux64-shippable-qr, macosx1014-64-shippable, windows10-64-ref-hw-2017 all on build type opt.

Recent failure: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=296266621&repo=mozilla-central&lineNumber=3514

Hi Perry, can you have a look here?

Flags: needinfo?(perry)

Note that this mainly affects the Raptor job: raptor-tp6-9-firefox, and here specifically opt builds across desktop platforms. We haven't seen this on mobile builds yet.

To run these jobs, use the following command:

./mach raptor-test --test raptor-tp6-9

Maybe that helps to reproduce the problem locally.

Summary: Intermittent runner.py | application crashed [@ mozilla::dom::ClientSource::MaybeCreateInitialDocument()] → Intermittent runner.py | application crashed [@ mozilla::dom::ClientSource::MaybeCreateInitialDocument()] [@ mozilla::dom::ClientSource::SnapshotState(mozilla::dom::ClientState*) ]
Crash Signature: [@ mozilla::dom::ClientSource::MaybeCreateInitialDocument()]. [@ mozilla::dom::ClientSource::SnapshotState(mozilla::dom::ClientState*)] → [@ mozilla::dom::ClientSource::MaybeCreateInitialDocument()]. [@ mozilla::dom::ClientSource::SnapshotState(mozilla::dom::ClientState*)] [@ mozilla::dom::ClientSource::SnapshotState()]

Jens, any chance we can get some updates for this bug? Thanks.

Crash Signature: [@ mozilla::dom::ClientSource::MaybeCreateInitialDocument()]. [@ mozilla::dom::ClientSource::SnapshotState(mozilla::dom::ClientState*)] [@ mozilla::dom::ClientSource::SnapshotState()] → [@ mozilla::dom::ClientSource::MaybeCreateInitialDocument()]. [@ mozilla::dom::ClientSource::SnapshotState(mozilla::dom::ClientState*)] [@ mozilla::dom::ClientSource::SnapshotState()]
Flags: needinfo?(jstutte)

Note that all the crashes show the following crash reason:

Mozilla crash reason: MOZ_DIAGNOSTIC_ASSERT(GetInnerWindow())

Oh and these crashes only happen for warm Raptor test jobs, which don't restart Firefox in between each page cycle:
https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&tier=1%2C2%2C3&searchStr=tp6-9

So I assume this is related to closing and opening tabs in between each page cycle, and that the inner window is not accessible yet.

This has now reached the disable-recommended queue with 101 total failures in the last 7 days on

  • linux64-shippable opt
  • macosx1014-64-shippable opt
  • windows10-64-shippable opt
  • windows7-32-shippable opt

Recent failure log: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=297960006&repo=mozilla-central&lineNumber=3521

[task 2020-04-16T17:07:35.882Z] 17:07:35 INFO - raptor-webext Info: installing webext /Users/cltbld/tasks/task_1587055897/build/tests/raptor/raptor/webextension/../../webext/raptor
[task 2020-04-16T17:07:35.882Z] 17:07:35 INFO - raptor-webext-desktop Info: starting firefox
[task 2020-04-16T17:07:35.882Z] 17:07:35 INFO - Application command: /Users/cltbld/tasks/task_1587055897/build/application/Firefox Nightly.app/Contents/MacOS/firefox -foreground -profile /var/folders/zx/5dr6twgx7sndzbkm0qp77bnh000017/T/tmpQJaAz3
[task 2020-04-16T17:07:38.006Z] 17:07:38 INFO - raptor-control-server Info: received webext_loaded: raptor runner.js is loaded!
[task 2020-04-16T17:07:38.053Z] 17:07:38 INFO - raptor-control-server Info: received webext_status: test name is: raptor-tp6-outlook-firefox
[task 2020-04-16T17:07:38.069Z] 17:07:38 INFO - raptor-control-server Info: received webext_status: test settings url is: http://127.0.0.1:49978/json/raptor-tp6-outlook-firefox.json
[task 2020-04-16T17:07:38.102Z] 17:07:38 INFO - raptor-control-server Info: received webext_status: starting raptorRunner
[task 2020-04-16T17:07:38.111Z] 17:07:38 INFO - PID 1658 | console.info: "[raptor-runnerjs] testing on Firefox 77.0a1 20200416150559"
[task 2020-04-16T17:07:38.111Z] 17:07:38 INFO - PID 1658 | console.info: "[raptor-runnerjs] getting test settings from control server"
[task 2020-04-16T17:07:38.122Z] 17:07:38 INFO - raptor-control-server Info: reading test settings from json/raptor-tp6-outlook-firefox.json
[task 2020-04-16T17:07:38.124Z] 17:07:38 INFO - raptor-control-server Info: sent test settings to webext runner
[task 2020-04-16T17:07:38.166Z] 17:07:38 INFO - PID 1658 | console.info: "[raptor-runnerjs] test settings received: {"raptor-options": {"expected_browser_cycles": 1, "subtest_unit": "ms", "alert_threshold": 2.0, "type": "pageload", "page_cycles": 25, "subtest_lower_is_better": true, "alert_on": ["loadtime"], "test_url": "https://outlook.live.com/mail/inbox\", "page_timeout": 30000, "host": "127.0.0.1", "measure": {"dcf": true, "fnbpaint": true, "loadtime": true}, "cold": false, "lower_is_better": true, "unit": "ms"}}"
[task 2020-04-16T17:07:38.166Z] 17:07:38 INFO - PID 1658 | console.info: "[raptor-runnerjs] test URL: https://outlook.live.com/mail/inbox"
[task 2020-04-16T17:07:38.167Z] 17:07:38 INFO - PID 1658 | console.info: "[raptor-runnerjs] using page timeout: 30000ms"
[task 2020-04-16T17:07:38.245Z] 17:07:38 INFO - PID 1658 | console.info: "[raptor-runnerjs] wrote settings to ext local storage"

[task 2020-04-16T17:26:25.319Z] 17:26:25 ERROR - PROCESS-CRASH | runner.py | application crashed [@ mozilla::dom::ClientSource::MaybeCreateInitialDocument()]
[task 2020-04-16T17:26:25.319Z] 17:26:25 INFO - Mozilla crash reason: MOZ_DIAGNOSTIC_ASSERT(GetInnerWindow())
[task 2020-04-16T17:26:25.319Z] 17:26:25 INFO - Crash dump filename: /var/folders/zx/5dr6twgx7sndzbkm0qp77bnh000017/T/tmpQJaAz3/minidumps/CAA4A7A3-60E1-4CC9-96FC-221F639D43A7.dmp
[task 2020-04-16T17:26:25.319Z] 17:26:25 INFO - Operating system: Mac OS X
[task 2020-04-16T17:26:25.319Z] 17:26:25 INFO - 10.14.5 18F132
[task 2020-04-16T17:26:25.320Z] 17:26:25 INFO - CPU: amd64
[task 2020-04-16T17:26:25.320Z] 17:26:25 INFO - family 6 model 69 stepping 1
[task 2020-04-16T17:26:25.320Z] 17:26:25 INFO - 4 CPUs
[task 2020-04-16T17:26:25.320Z] 17:26:25 INFO - GPU: UNKNOWN
[task 2020-04-16T17:26:25.320Z] 17:26:25 INFO - Crash reason: EXC_BAD_ACCESS / KERN_INVALID_ADDRESS
[task 2020-04-16T17:26:25.320Z] 17:26:25 INFO - Crash address: 0x0
[task 2020-04-16T17:26:25.320Z] 17:26:25 INFO - Process uptime: 58 seconds
[task 2020-04-16T17:26:25.320Z] 17:26:25 INFO - Thread 0 (crashed)
[task 2020-04-16T17:26:25.321Z] 17:26:25 INFO - 0 XUL!mozilla::dom::ClientSource::MaybeCreateInitialDocument() [ClientSource.cpp:98d1b98c71776984e1209b452aa986d3e8d49a09 : 147 + 0x11]
[task 2020-04-16T17:26:25.321Z] 17:26:25 INFO - rax = 0x000000011028adf2 rdx = 0x0000000109f774a8
[task 2020-04-16T17:26:25.321Z] 17:26:25 INFO - rcx = 0x000000010561a068 rbx = 0x00000001248fe480
[task 2020-04-16T17:26:25.321Z] 17:26:25 INFO - rsi = 0x0000000000000003 rdi = 0x000000015692e560
[task 2020-04-16T17:26:25.321Z] 17:26:25 INFO - rbp = 0x00007ffeea957a70 rsp = 0x00007ffeea957a60
[task 2020-04-16T17:26:25.321Z] 17:26:25 INFO - r8 = 0xffffffff00000000 r9 = 0x0000000000002ce4
[task 2020-04-16T17:26:25.321Z] 17:26:25 INFO - r10 = 0x000000010ac54cf0 r11 = 0x0000000000000246
[task 2020-04-16T17:26:25.322Z] 17:26:25 INFO - r12 = 0x0000000105804d00 r13 = 0x00007ffeea957e20
[task 2020-04-16T17:26:25.322Z] 17:26:25 INFO - r14 = 0x00007ffeea958140 r15 = 0x00007ffeea957d90
[task 2020-04-16T17:26:25.322Z] 17:26:25 INFO - rip = 0x000000010c6d5b26
[task 2020-04-16T17:26:25.322Z] 17:26:25 INFO - Found by: given as instruction pointer in context
[task 2020-04-16T17:26:25.322Z] 17:26:25 INFO - 1 XUL!mozilla::dom::ClientSource::SnapshotState() [ClientSource.cpp:98d1b98c71776984e1209b452aa986d3e8d49a09 : 679 + 0x8]
[task 2020-04-16T17:26:25.322Z] 17:26:25 INFO - rbp = 0x00007ffeea957ad0 rsp = 0x00007ffeea957a80
[task 2020-04-16T17:26:25.322Z] 17:26:25 INFO - rip = 0x000000010c6d77ba
[task 2020-04-16T17:26:25.322Z] 17:26:25 INFO - Found by: previous frame's frame pointer
[task 2020-04-16T17:26:25.323Z] 17:26:25 INFO - 2 XUL!mozilla::dom::ClientSource::GetInfoAndState(mozilla::dom::ClientGetInfoAndStateArgs const&) [ClientSource.cpp:98d1b98c71776984e1209b452aa986d3e8d49a09 : 664 + 0x5]
[task 2020-04-16T17:26:25.323Z] 17:26:25 INFO - rbp = 0x00007ffeea957de0 rsp = 0x00007ffeea957ae0
[task 2020-04-16T17:26:25.323Z] 17:26:25 INFO - rip = 0x000000010c6d7ea1
[task 2020-04-16T17:26:25.323Z] 17:26:25 INFO - Found by: previous frame's frame pointer
[task 2020-04-16T17:26:25.323Z] 17:26:25 INFO - 3 XUL!void mozilla::dom::ClientSourceOpChild::DoSourceOp<RefPtr<mozilla::MozPromise<mozilla::dom::ClientOpResult, mozilla::CopyableErrorResult, false> > (mozilla::dom::ClientSource::*)(mozilla::dom::ClientGetInfoAndStateArgs const&), mozilla::dom::ClientGetInfoAndStateArgs>(RefPtr<mozilla::MozPromise<mozilla::dom::ClientOpResult, mozilla::CopyableErrorResult, false> > (mozilla::dom::ClientSource::*)(mozilla::dom::ClientGetInfoAndStateArgs const&), mozilla::dom::ClientGetInfoAndStateArgs const&) [ClientSourceOpChild.cpp:98d1b98c71776984e1209b452aa986d3e8d49a09 : 44 + 0xb]
[task 2020-04-16T17:26:25.323Z] 17:26:25 INFO - rbp = 0x00007ffeea9580d0 rsp = 0x00007ffeea957df0
[task 2020-04-16T17:26:25.323Z] 17:26:25 INFO - rip = 0x000000010c6d94a4
[task 2020-04-16T17:26:25.323Z] 17:26:25 INFO - Found by: previous frame's frame pointer
[task 2020-04-16T17:26:25.324Z] 17:26:25 INFO - 4 XUL!mozilla::dom::ClientSourceOpChild::Init(mozilla::dom::ClientOpConstructorArgs const&) [ClientSourceOpChild.cpp:98d1b98c71776984e1209b452aa986d3e8d49a09 : 97 + 0xb]
[task 2020-04-16T17:26:25.324Z] 17:26:25 INFO - rbp = 0x00007ffeea9580f0 rsp = 0x00007ffeea9580e0
[task 2020-04-16T17:26:25.324Z] 17:26:25 INFO - rip = 0x000000010c6d88a9
[task 2020-04-16T17:26:25.324Z] 17:26:25 INFO - Found by: previous frame's frame pointer
[task 2020-04-16T17:26:25.324Z] 17:26:25 INFO - 5 XUL!mozilla::dom::ClientSourceChild::RecvPClientSourceOpConstructor(mozilla::dom::PClientSourceOpChild*, mozilla::dom::ClientOpConstructorArgs const&) [ClientSourceChild.cpp:98d1b98c71776984e1209b452aa986d3e8d49a09 : 43 + 0x8]
[task 2020-04-16T17:26:25.324Z] 17:26:25 INFO - rbp = 0x00007ffeea958100 rsp = 0x00007ffeea958100
[task 2020-04-16T17:26:25.324Z] 17:26:25 INFO - rip = 0x000000010c6d876f
[task 2020-04-16T17:26:25.324Z] 17:26:25 INFO - Found by: previous frame's frame pointer
[task 2020-04-16T17:26:25.325Z] 17:26:25 INFO - 6 XUL!mozilla::dom::PClientSourceChild::OnMessageReceived(IPC::Message const&) [PClientSourceChild.cpp: : 365 + 0x13]
[task 2020-04-16T17:26:25.325Z] 17:26:25 INFO - rbp = 0x00007ffeea9584c0 rsp = 0x00007ffeea958110
[task 2020-04-16T17:26:25.325Z] 17:26:25 INFO - rip = 0x000000010ad2a78b
[task 2020-04-16T17:26:25.325Z] 17:26:25 INFO - Found by: previous frame's frame pointer
[task 2020-04-16T17:26:25.325Z] 17:26:25 INFO - 7 XUL!mozilla::ipc::PBackgroundChild::OnMessageReceived(IPC::Message const&) [PBackgroundChild.cpp: : 5970 + 0xd]
[task 2020-04-16T17:26:25.325Z] 17:26:25 INFO - rbp = 0x00007ffeea958ec0 rsp = 0x00007ffeea9584d0
[task 2020-04-16T17:26:25.325Z] 17:26:25 INFO - rip = 0x000000010ae22774
[task 2020-04-16T17:26:25.326Z] 17:26:25 INFO - Found by: previous frame's frame pointer
[task 2020-04-16T17:26:25.326Z] 17:26:25 INFO - 8 XUL!mozilla::ipc::MessageChannel::DispatchMessage(IPC::Message&&) [MessageChannel.cpp:98d1b98c71776984e1209b452aa986d3e8d49a09 : 2187 + 0xd]
[task 2020-04-16T17:26:25.326Z] 17:26:25 INFO - rbp = 0x00007ffeea959190 rsp = 0x00007ffeea958ed0
[task 2020-04-16T17:26:25.326Z] 17:26:25 INFO - rip = 0x000000010ac99dff
[task 2020-04-16T17:26:25.326Z] 17:26:25 INFO - Found by: previous frame's frame pointer
[task 2020-04-16T17:26:25.326Z] 17:26:25 INFO - 9 XUL!mozilla::ipc::MessageChannel::MessageTask::Run() [MessageChannel.cpp:98d1b98c71776984e1209b452aa986d3e8d49a09 : 1990 + 0xd0]
[task 2020-04-16T17:26:25.326Z] 17:26:25 INFO - rbp = 0x00007ffeea9591e0 rsp = 0x00007ffeea9591a0
[task 2020-04-16T17:26:25.326Z] 17:26:25 INFO - rip = 0x000000010ac9b3c6
[task 2020-04-16T17:26:25.326Z] 17:26:25 INFO - Found by: previous frame's frame pointer
[task 2020-04-16T17:26:25.327Z] 17:26:25 INFO - 10 XUL!nsThread::ProcessNextEvent(bool, bool*) [nsThread.cpp:98d1b98c71776984e1209b452aa986d3e8d49a09 : 1200 + 0x6]
[task 2020-04-16T17:26:25.327Z] 17:26:25 INFO - rbp = 0x00007ffeea959700 rsp = 0x00007ffeea9591f0
[task 2020-04-16T17:26:25.327Z] 17:26:25 INFO - rip = 0x000000010a6d8ec0
[task 2020-04-16T17:26:25.327Z] 17:26:25 INFO - Found by: previous frame's frame pointer
[task 2020-04-16T17:26:25.329Z] 17:26:25 INFO - 11 XUL!mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) [MessagePump.cpp:98d1b98c71776984e1209b452aa986d3e8d49a09 : 87 + 0x2b]
[task 2020-04-16T17:26:25.329Z] 17:26:25 INFO - rbp = 0x00007ffeea959760 rsp = 0x00007ffeea959710
[task 2020-04-16T17:26:25.330Z] 17:26:25 INFO - rip = 0x000000010ac9db31
[task 2020-04-16T17:26:25.330Z] 17:26:25 INFO - Found by: previous frame's frame pointer
[task 2020-04-16T17:26:25.330Z] 17:26:25 INFO - 12 XUL!MessageLoop::Run() [message_loop.cc:98d1b98c71776984e1209b452aa986d3e8d49a09 : 290 + 0xc]
[task 2020-04-16T17:26:25.330Z] 17:26:25 INFO - rbp = 0x00007ffeea959790 rsp = 0x00007ffeea959770
[task 2020-04-16T17:26:25.330Z] 17:26:25 INFO - rip = 0x000000010ac521c0
[task 2020-04-16T17:26:25.330Z] 17:26:25 INFO - Found by: previous frame's frame pointer
[task 2020-04-16T17:26:25.330Z] 17:26:25 INFO - 13 XUL!nsBaseAppShell::Run() [nsBaseAppShell.cpp:98d1b98c71776984e1209b452aa986d3e8d49a09 : 137 + 0x19]
[task 2020-04-16T17:26:25.330Z] 17:26:25 INFO - rbp = 0x00007ffeea9597b0 rsp = 0x00007ffeea9597a0
[task 2020-04-16T17:26:25.330Z] 17:26:25 INFO - rip = 0x000000010d0f03d5
[task 2020-04-16T17:26:25.331Z] 17:26:25 INFO - Found by: previous frame's frame pointer
[task 2020-04-16T17:26:25.331Z] 17:26:25 INFO - 14 XUL!nsAppShell::Run() [nsAppShell.mm:98d1b98c71776984e1209b452aa986d3e8d49a09 : 692 + 0x8]
[task 2020-04-16T17:26:25.331Z] 17:26:25 INFO - rbp = 0x00007ffeea9597e0 rsp = 0x00007ffeea9597c0
[task 2020-04-16T17:26:25.331Z] 17:26:25 INFO - rip = 0x000000010d1579a8
[task 2020-04-16T17:26:25.331Z] 17:26:25 INFO - Found by: previous frame's frame pointer
[task 2020-04-16T17:26:25.331Z] 17:26:25 INFO - 15 XUL!XRE_RunAppShell() [nsEmbedFunctions.cpp:98d1b98c71776984e1209b452aa986d3e8d49a09 : 909 + 0x6]
[task 2020-04-16T17:26:25.331Z] 17:26:25 INFO - rbp = 0x00007ffeea959810 rsp = 0x00007ffeea9597f0
[task 2020-04-16T17:26:25.331Z] 17:26:25 INFO - rip = 0x000000010e37a390
[task 2020-04-16T17:26:25.332Z] 17:26:25 INFO - Found by: previous frame's frame pointer
[task 2020-04-16T17:26:25.332Z] 17:26:25 INFO - 16 XUL!MessageLoop::Run() [message_loop.cc:98d1b98c71776984e1209b452aa986d3e8d49a09 : 290 + 0xc]
[task 2020-04-16T17:26:25.332Z] 17:26:25 INFO - rbp = 0x00007ffeea959840 rsp = 0x00007ffeea959820
[task 2020-04-16T17:26:25.332Z] 17:26:25 INFO - rip = 0x000000010ac521c0
[task 2020-04-16T17:26:25.332Z] 17:26:25 INFO - Found by: previous frame's frame pointer
[task 2020-04-16T17:26:25.332Z] 17:26:25 INFO - 17 XUL!XRE_InitChildProcess(int, char**, XREChildData const*) [nsEmbedFunctions.cpp:98d1b98c71776984e1209b452aa986d3e8d49a09 : 740 + 0x5]
[task 2020-04-16T17:26:25.332Z] 17:26:25 INFO - rbp = 0x00007ffeea959b50 rsp = 0x00007ffeea959850
[task 2020-04-16T17:26:25.332Z] 17:26:25 INFO - rip = 0x000000010e379e89
[task 2020-04-16T17:26:25.332Z] 17:26:25 INFO - Found by: previous frame's frame pointer
[task 2020-04-16T17:26:25.332Z] 17:26:25 INFO - 18 plugin-container + 0xf0b
[task 2020-04-16T17:26:25.333Z] 17:26:25 INFO - rbp = 0x00007ffeea959b90 rsp = 0x00007ffeea959b60
[task 2020-04-16T17:26:25.333Z] 17:26:25 INFO - rip = 0x00000001052a5f0b
[task 2020-04-16T17:26:25.333Z] 17:26:25 INFO - Found by: previous frame's frame pointer
[task 2020-04-16T17:26:25.333Z] 17:26:25 INFO - 19 libdyld.dylib!start + 0x1
[task 2020-04-16T17:26:25.333Z] 17:26:25 INFO - rbp = 0x00007ffeea959ba0 rsp = 0x00007ffeea959ba0
[task 2020-04-16T17:26:25.333Z] 17:26:25 INFO - rip = 0x00007fff6156e3d5
[task 2020-04-16T17:26:25.333Z] 17:26:25 INFO - Found by: previous frame's frame pointer

Whiteboard: [stockwell disable-recommended] → [stockwell needswork:owner]

I checked a couple of crashes and it's always happening when running the Raptor page load tests for https://outlook.live.com/mail/inbox. Maybe there is something special with this page.

To reproduce it faster, run the following command:

mach raptor --test raptor-tp6-outlook-firefox --post-startup-delay 0

Jens, since this has been in the disable-recommended queue for a while and the needinfo requests have had no replies yet, can you please assign this to someone else?
Henrik also provided some useful info above.

Flags: needinfo?(perry)
Flags: needinfo?(jstutte)
Flags: needinfo?(bugmail)
Flags: needinfo?(jstutte)
Whiteboard: [stockwell disable-recommended] → [stockwell needswork:owner]

We will not be able to get to this very soon, I fear. Keeping it on the radar, though.

Flags: needinfo?(jstutte)
Priority: -- → P3
Depends on: 1631795

Given that we are not expecting a fix soon, we are going to re-record the affected page set, which may get us around this crash.

Florin, can you please attach the current page set as reference and explain how to use it with Raptor locally? Thanks.

Flags: needinfo?(fstrugariu)

Not sure if Florin already updated the pageset but with my latest try build I do not see a single crash for the tp6-9 job:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=f308307ba17bb37ad491104d99ec91322de0c6c8

Same actually for the last merge from autoland to mozilla-central earlier today:

https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&tier=1%2C2%2C3&searchStr=rap%2Ctp6-9&revision=2fd61eb5c69ce9ac806048a35c7a7a88bf4b9652

Jens, is there something in this changeset that might have fixed the crash?

https://hg.mozilla.org/mozilla-central/pushloghtml?changeset=2fd61eb5c69ce9ac806048a35c7a7a88bf4b9652

Florin, for now please do not re-record the pageset. Thanks.

Flags: needinfo?(fstrugariu) → needinfo?(jstutte)

The only change that might be related AFAICS could be bug 1602318 (fiddling with DocChannels, it seems), but it has been backed out anyway?

Flags: needinfo?(jstutte) → needinfo?(matt.woodrow)

All my changes in there are also backed out in the same range.

Bug 1618546 is in that range and touches the clients code.

Flags: needinfo?(matt.woodrow)

(In reply to Matt Woodrow (:mattwoodrow) from comment #43)

All my changes in there are also backed out in the same range.

Bug 1618546 is in that range and touches the clients code.

Interesting, I thought it was just relevant for devtools. Perry, can you confirm that this might also have helped here?

Flags: needinfo?(perry)

I can do some backfills on autoland to figure out what caused it to not crash anymore.

Based on the title of the other bug I would assume that there might be an <iframe mozbrowser> element in the webpage under test?

Flags: needinfo?(perry)

Jens, is this bug still something which would need further investigation for a fix, or shall we just close it? It looks like we only hit this crash with the Raptor test, and for us it's not a concern anymore.

Flags: needinfo?(jstutte)

Closed as duplicate of bug 1618546. Let's see, if it resurrects...

Status: NEW → RESOLVED
Closed: 5 years ago
Flags: needinfo?(jstutte)
Resolution: --- → DUPLICATE

(In reply to Jens Stutte [:jstutte] from comment #49)

Closed as duplicate of bug 1618546. Let's see, if it resurrects...

*** This bug has been marked as a duplicate of bug 1618546 ***

Hm, why a dupe of bug 1618546? The fix on bug 1614462 made it actually go away.

Flags: needinfo?(jstutte)

OK, right, we cannot really mark a fix as a "dupe", only a bug. I'll go for WORKSFORME then, apparently thanks to bug 1618546.

Flags: needinfo?(jstutte)
Resolution: DUPLICATE → WORKSFORME

No failures since the 22nd.

Whiteboard: [stockwell disable-recommended] → [stockwell fixed:other]
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---

Hi Andrew, this was one of our most frequent intermittents last week. Can you see something obvious in the pushlog?

Flags: needinfo?(jstutte) → needinfo?(bugmail)
Crash Signature: [@ mozilla::dom::ClientSource::MaybeCreateInitialDocument()]. [@ mozilla::dom::ClientSource::SnapshotState(mozilla::dom::ClientState*)] [@ mozilla::dom::ClientSource::SnapshotState()] → [@ mozilla::dom::ClientSource::MaybeCreateInitialDocument()]. [@ mozilla::dom::ClientSource::SnapshotState(mozilla::dom::ClientState*)] [@ mozilla::dom::ClientSource::SnapshotState()] [@ mozilla::dom::ClientSource::MaybeCreateInitialDocument]

Kris, can you take a look? This looks like it comes from bug 1650257.

Crash Signature: [@ mozilla::dom::ClientSource::MaybeCreateInitialDocument()]. [@ mozilla::dom::ClientSource::SnapshotState(mozilla::dom::ClientState*)] [@ mozilla::dom::ClientSource::SnapshotState()] [@ mozilla::dom::ClientSource::MaybeCreateInitialDocument] → [@ mozilla::dom::ClientSource::MaybeCreateInitialDocument()]. [@ mozilla::dom::ClientSource::SnapshotState(mozilla::dom::ClientState*)] [@ mozilla::dom::ClientSource::SnapshotState()] [@ mozilla::dom::ClientSource::MaybeCreateInitialDocument]
Flags: needinfo?(kmaglione+bmo)
Regressed by: 1650257
Crash Signature: [@ mozilla::dom::ClientSource::MaybeCreateInitialDocument()]. [@ mozilla::dom::ClientSource::SnapshotState(mozilla::dom::ClientState*)] [@ mozilla::dom::ClientSource::SnapshotState()] [@ mozilla::dom::ClientSource::MaybeCreateInitialDocument] → [@ mozilla::dom::ClientSource::MaybeCreateInitialDocument()] [@ mozilla::dom::ClientSource::SnapshotState(mozilla::dom::ClientState*)] [@ mozilla::dom::ClientSource::SnapshotState()] [@ mozilla::dom::ClientSource::MaybeCreateInitialDocument]
Flags: needinfo?(bugmail)

There are 69 total failures in the last 7 days on linux64-shippable, windows7-32-shippable and windows10-64-shippable opt.
Recent failure log: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=316153184&repo=mozilla-central&lineNumber=1156

Kris, did you get a chance to take a look at this? This is already on our disable-recommended list. Thank you.

Whiteboard: [retriggered][stockwell disable-recommended] → [retriggered][stockwell needswork:owner]
Flags: needinfo?(kmaglione+bmo)
Flags: needinfo?(kmaglione+bmo)

The function void ClientSource::MaybeCreateInitialDocument() is used only in Result<ClientState, ErrorResult> ClientSource::SnapshotState(), which seems to have "real" error handling already. Couldn't we just promote the MOZ_DIAGNOSTIC_ASSERT(GetInnerWindow()); to a real if and return an error here?
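To make the suggestion concrete, here is a minimal, self-contained sketch of what promoting the assertion to a checked error could look like. It does not use the real Gecko classes: FakeClientSource, InnerWindow and SnapshotResult are hypothetical stand-ins, and the real code would return Result<ClientState, ErrorResult> rather than this simplified struct.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the inner window; not the real Gecko type.
struct InnerWindow {};

// Simplified stand-in for Result<ClientState, ErrorResult>.
struct SnapshotResult {
  bool ok;            // false means "an error was returned"
  std::string state;  // only meaningful when ok is true
};

struct FakeClientSource {
  InnerWindow* mInnerWindow = nullptr;

  InnerWindow* GetInnerWindow() const { return mInnerWindow; }

  // Previously the missing inner window was only caught by
  // MOZ_DIAGNOSTIC_ASSERT(GetInnerWindow()), which crashes the process.
  // Here the same condition is reported to the caller instead.
  bool MaybeCreateInitialDocument() const {
    if (!GetInnerWindow()) {
      return false;
    }
    return true;
  }

  // The only caller propagates the failure through its existing
  // error path instead of crashing.
  SnapshotResult SnapshotState() const {
    if (!MaybeCreateInitialDocument()) {
      return SnapshotResult{false, ""};
    }
    return SnapshotResult{true, "window-state"};
  }
};
```

With this shape, a client without an inner window yields an error result from SnapshotState() that the caller can handle, rather than an EXCEPTION_BREAKPOINT crash.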

In the intermittents list at https://treeherder.mozilla.org/intermittent-failures.html#/bugdetails?startday=2020-09-15&endday=2020-09-22&tree=trunk&bug=1544522, it seems like this reliably reproduces on raptor-tp6-firefox-outlook-e10s and maybe raptor-tp6-firefox-outlook-fis-e10s, including on Linux, so maybe we can get an rr/pernosco reproduction to help shed more light on exactly what's happening.

It looks like this may actually be the code that was causing bug 1650257 to manifest in the wild. That means the service worker code is trying to create an inner window here after a BrowsingContext (BC) or one of its ancestors has been discarded or become inactive. We can't support that, so the service worker code needs to be able to deal with the failure.

Flags: needinfo?(kmaglione+bmo)

So I played around a bit with explicit errors here. Basically void ClientSource::MaybeCreateInitialDocument() in the current state does three different things:

  1. Do we have a docshell? If not, do nothing.
  2. Silently call docshell->GetDocument();
  3. Assert we have an inner window (as diagnostic assert)

All three steps can potentially fail. A strict interpretation would be to return an error:

  • if we have no docshell
  • if GetDocument() failed
  • if GetInnerWindow() failed

Here is a try run with these conditions. It seems that assuming we need to have a docshell here breaks many tests.

The current patch thus reduces strictness to the following:

  1. If we have no docshell, do not fail with an error but return false
  2. If we have a docshell but no document, fail with an error
  3. If we have a document but no inner window, fail with an error
  4. return true

This gives the following try run.

Still, I am wondering which conditions should really lead to an error here.

Looking at test coverage data, we can see that in normal execution we never have a docshell, which explains the many breakages if we promote this to an error condition.
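The relaxed rules of the current patch, as listed above, can be sketched roughly as follows. This is only an illustration under stated assumptions: FakeDocShell, FakeDocument and the InitialDocResult enum are hypothetical stand-ins, and the actual patch may express the "false / error / true" outcomes differently.

```cpp
#include <cassert>

// Hypothetical stand-ins for the document and docshell; not Gecko types.
struct FakeDocument {
  bool hasInnerWindow;
};

struct FakeDocShell {
  FakeDocument* document;
  FakeDocument* GetDocument() const { return document; }
};

// Three possible outcomes of the relaxed rules.
enum class InitialDocResult {
  NoDocShell,  // rule 1: not an error, corresponds to "return false"
  Error,       // rules 2 and 3: fail with an error
  Created      // rule 4: corresponds to "return true"
};

InitialDocResult MaybeCreateInitialDocument(const FakeDocShell* docShell) {
  // 1. No docshell: nothing to do, but not an error.
  if (!docShell) {
    return InitialDocResult::NoDocShell;
  }
  // 2. Docshell but no document: fail with an error.
  FakeDocument* doc = docShell->GetDocument();
  if (!doc) {
    return InitialDocResult::Error;
  }
  // 3. Document but no inner window: fail with an error.
  if (!doc->hasInnerWindow) {
    return InitialDocResult::Error;
  }
  // 4. Everything is in place.
  return InitialDocResult::Created;
}
```

Keeping the missing-docshell case separate from the error cases matches the coverage observation above: if a missing docshell is the normal situation, treating it as an error would break the common path.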

Whiteboard: [retriggered][stockwell disable-recommended] → [retriggered][stockwell needswork:owner]
Flags: needinfo?(bugmail)
Whiteboard: [retriggered][stockwell disable-recommended] → [retriggered][stockwell needswork:owner]

Hi Kris, Andrew, any ideas how to proceed here? Should we give the explicit error handling another try? Thanks!

Flags: needinfo?(kmaglione+bmo)

(In reply to Jens Stutte [:jstutte] from comment #77)

Hi Kris, Andrew, any ideas how to proceed here? Should we give the explicit error handling another try? Thanks!

We need to be able to handle not being able to create a window for a docshell, yes. There will always be cases where that won't be possible, such as when the parent BrowsingContext has been destroyed, or when the BrowsingContext of the docshell has changed process (though I'm not entirely sure that's possible without having first created an inner window), or when we've reached a recursion limit. It would be nice to try to deal with some of those situations before calling MaybeCreateInitialDocument, but you shouldn't expect to be able to deal with all of them, so we need to be able to deal with a failure.

Flags: needinfo?(kmaglione+bmo)
Whiteboard: [retriggered][stockwell disable-recommended] → [retriggered][stockwell needswork:owner]

(In reply to Kris Maglione [:kmag] from comment #80)

We need to be able to handle not being able to create a window for a docshell, yes. There will always be cases where that won't be possible, such as when the parent BrowsingContext has been destroyed, or when the BrowsingContext of the docshell has changed process (though I'm not entirely sure that's possible without having first created an inner window), or when we've reached a recursion limit. It would be nice to try to deal with some of those situations before calling MaybeCreateInitialDocument, but you shouldn't expect to be able to deal with all of them, so we need to be able to deal with a failure.

So would

  1. If we have no docshell, do not fail with an error but do nothing
  2. If we have a docshell but no document, fail with an error
  3. If we have a document but no inner window, fail with an error
  4. Go ahead

look like a reasonable set of rules in general (whether we should have all of them inside MaybeCreateInitialDocument is another matter)?

Flags: needinfo?(kmaglione+bmo)

(In reply to Jens Stutte [:jstutte] from comment #84)

(In reply to Kris Maglione [:kmag] from comment #80)

We need to be able to handle not being able to create a window for a docshell, yes. There will always be cases where that won't be possible, such as when the parent BrowsingContext has been destroyed, or when the BrowsingContext of the docshell has changed process (though I'm not entirely sure that's possible without having first created an inner window), or when we've reached a recursion limit. It would be nice to try to deal with some of those situations before calling MaybeCreateInitialDocument, but you shouldn't expect to be able to deal with all of them, so we need to be able to deal with a failure.

So would

  1. If we have no docshell, do not fail with an error but do nothing
  2. If we have a docshell but no document, fail with an error
  3. If we have a document but no inner window, fail with an error
  4. Go ahead

look like a reasonable set of rules in general (whether we should have all of them inside MaybeCreateInitialDocument is another matter)?

I think the first two are probably fine, especially since SnapshotWindowState already uses GetExtantDoc() and fails if it returns null.

I don't think you need to check whether we have a document without an inner window. That should basically never happen when we've just gotten the document from the DocShell.

Flags: needinfo?(kmaglione+bmo)
Whiteboard: [retriggered][stockwell disable-recommended] → [retriggered][stockwell needswork:owner]
Assignee: nobody → jstutte
Attachment #9177346 - Attachment description: Bug 1544522: Add explicit error handling to SnapshotState → Bug 1544522: Add explicit error handling to SnapshotState r=kmag,#dom-workers-and-storage-reviewers
Pushed by btara@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/da9fae8854d3 Add explicit error handling to SnapshotState r=kmag
Flags: needinfo?(bugmail)
Status: REOPENED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → 84 Branch
Has Regression Range: --- → yes