Intermittent PID 4676 | Assertion failure: XRE_IsParentProcess(), at z:/build/build/src/toolkit/components/windowwatcher/nsWindowWatcher.cpp:703
Categories
(Core :: Networking: HTTP, defect, P1)
Tracking
()
People
(Reporter: intermittent-bug-filer, Assigned: kershaw)
References
Details
(Keywords: assertion, intermittent-failure, testcase-wanted, Whiteboard: [necko-triaged])
Crash Data
Attachments
(1 file)
Filed by: ncsoregi [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer.html#?job_id=270581674&repo=mozilla-inbound
Full log: https://queue.taskcluster.net/v1/task/Wzyu9dkMTb68LEJkFb5CDg/runs/0/artifacts/public/logs/live_backing.log
[task 2019-10-09T23:41:39.324Z] 23:41:39 INFO - PID 4676 | Assertion failure: XRE_IsParentProcess(), at z:/build/build/src/toolkit/components/windowwatcher/nsWindowWatcher.cpp:703
[task 2019-10-09T23:41:40.231Z] 23:41:40 INFO - Browser not responding, setting status to CRASH
[task 2019-10-09T23:41:40.235Z] 23:41:40 INFO - mozcrash Copy/paste: Z:\task_1570658318\build\win32-minidump_stackwalk.exe c:\users\task_1570658318\appdata\local\temp\tmpm3hjye.mozrunner\minidumps\ccf65631-17bf-4f43-8b16-50e256332f85.dmp Z:\task_1570658318\build\symbols
[task 2019-10-09T23:42:01.779Z] 23:42:01 INFO - mozcrash Saved minidump as Z:\task_1570658318\build\blobber_upload_dir\ccf65631-17bf-4f43-8b16-50e256332f85.dmp
[task 2019-10-09T23:42:01.782Z] 23:42:01 INFO - mozcrash Saved app info as Z:\task_1570658318\build\blobber_upload_dir\ccf65631-17bf-4f43-8b16-50e256332f85.extra
[task 2019-10-09T23:42:01.874Z] 23:42:01 INFO - PROCESS-CRASH | /service-workers/service-worker/clients-get.https.html | application crashed [@ nsWindowWatcher::OpenWindowInternal(mozIDOMWindowProxy *,char const *,char const *,char const *,bool,bool,bool,nsIArray *,bool,bool,bool,nsDocShellLoadState *,mozilla::dom::BrowsingContext * *)]
[task 2019-10-09T23:42:01.874Z] 23:42:01 INFO - Crash dump filename: c:\users\task_1570658318\appdata\local\temp\tmpm3hjye.mozrunner\minidumps\ccf65631-17bf-4f43-8b16-50e256332f85.dmp
[task 2019-10-09T23:42:01.874Z] 23:42:01 INFO - Operating system: Windows NT
[task 2019-10-09T23:42:01.874Z] 23:42:01 INFO - 6.1.7601 Service Pack 1
[task 2019-10-09T23:42:01.874Z] 23:42:01 INFO - CPU: x86
[task 2019-10-09T23:42:01.874Z] 23:42:01 INFO - GenuineIntel family 6 model 63 stepping 2
[task 2019-10-09T23:42:01.874Z] 23:42:01 INFO - 8 CPUs
[task 2019-10-09T23:42:01.874Z] 23:42:01 INFO -
[task 2019-10-09T23:42:01.875Z] 23:42:01 INFO - GPU: UNKNOWN
[task 2019-10-09T23:42:01.875Z] 23:42:01 INFO -
[task 2019-10-09T23:42:01.875Z] 23:42:01 INFO - Crash reason: EXCEPTION_BREAKPOINT
[task 2019-10-09T23:42:01.875Z] 23:42:01 INFO - Crash address: 0x5a9c4b9b
[task 2019-10-09T23:42:01.875Z] 23:42:01 INFO - Assertion: Unknown assertion type 0x00000000
[task 2019-10-09T23:42:01.875Z] 23:42:01 INFO - Process uptime: 2 seconds
[task 2019-10-09T23:42:01.875Z] 23:42:01 INFO -
[task 2019-10-09T23:42:01.875Z] 23:42:01 INFO - Thread 0 (crashed)
[task 2019-10-09T23:42:01.875Z] 23:42:01 INFO - 0 xul.dll!nsWindowWatcher::OpenWindowInternal(mozIDOMWindowProxy *,char const *,char const *,char const *,bool,bool,bool,nsIArray *,bool,bool,bool,nsDocShellLoadState *,mozilla::dom::BrowsingContext * *) [nsWindowWatcher.cpp:cd1f46e85c51f6fac89a5add35537581db9a6e80 : 703 + 0x0]
[task 2019-10-09T23:42:01.875Z] 23:42:01 INFO - eip = 0x5a9c4b9b esp = 0x002acd80 ebp = 0x002acf7c ebx = 0x08cd0100
[task 2019-10-09T23:42:01.875Z] 23:42:01 INFO - esi = 0x002acd80 edi = 0x000006ce eax = 0x6a55a89c ecx = 0x000002bf
[task 2019-10-09T23:42:01.875Z] 23:42:01 INFO - edx = 0x00000049 efl = 0x00000212
[task 2019-10-09T23:42:01.875Z] 23:42:01 INFO - Found by: given as instruction pointer in context
[task 2019-10-09T23:42:01.875Z] 23:42:01 INFO - 1 xul.dll!nsWindowWatcher::OpenWindow(mozIDOMWindowProxy *,char const *,char const *,char const *,nsISupports *,mozIDOMWindowProxy * *) [nsWindowWatcher.cpp:cd1f46e85c51f6fac89a5add35537581db9a6e80 : 292 + 0x1f]
[task 2019-10-09T23:42:01.875Z] 23:42:01 INFO - eip = 0x5a9c1a3a esp = 0x002acf84 ebp = 0x002acfdc ebx = 0x0c5f7460
[task 2019-10-09T23:42:01.875Z] 23:42:01 INFO - esi = 0x0c56ff20 edi = 0x08cd50f8
[task 2019-10-09T23:42:01.875Z] 23:42:01 INFO - Found by: call frame info
[task 2019-10-09T23:42:01.875Z] 23:42:01 INFO - 2 xul.dll!NS_InvokeByIndex + 0x27
[task 2019-10-09T23:42:01.876Z] 23:42:01 INFO - eip = 0x5c7242b7 esp = 0x002acfe4 ebp = 0x002ad018 ebx = 0x00000005
[task 2019-10-09T23:42:01.876Z] 23:42:01 INFO - esi = 0x002ad058 edi = 0x00000005
[task 2019-10-09T23:42:01.876Z] 23:42:01 INFO - Found by: call frame info
[task 2019-10-09T23:42:01.876Z] 23:42:01 INFO - 3 xul.dll!CallMethodHelper::Call() [XPCWrappedNative.cpp:cd1f46e85c51f6fac89a5add35537581db9a6e80 : 1183 + 0x16]
[task 2019-10-09T23:42:01.876Z] 23:42:01 INFO - eip = 0x567dd9a6 esp = 0x002ad020 ebp = 0x002ad040
[task 2019-10-09T23:42:01.876Z] 23:42:01 INFO - Found by: previous frame's frame pointer
[task 2019-10-09T23:42:01.876Z] 23:42:01 INFO - 4 xul.dll!XPCWrappedNative::CallMethod(XPCCallContext &,XPCWrappedNative::CallMode) [XPCWrappedNative.cpp:cd1f46e85c51f6fac89a5add35537581db9a6e80 : 1149 + 0x5]
[task 2019-10-09T23:42:01.876Z] 23:42:01 INFO - eip = 0x567dd786 esp = 0x002ad048 ebp = 0x002ad170 ebx = 0x002ad058
[task 2019-10-09T23:42:01.876Z] 23:42:01 INFO - esi = 0x002ad058 edi = 0x002ad048
[task 2019-10-09T23:42:01.876Z] 23:42:01 INFO - Found by: call frame info
[task 2019-10-09T23:42:01.876Z] 23:42:01 INFO - 5 xul.dll!XPC_WN_CallMethod(JSContext *,unsigned int,JS::Value *) [XPCWrappedNativeJSOps.cpp:cd1f46e85c51f6fac89a5add35537581db9a6e80 : 946 + 0x9]
[task 2019-10-09T23:42:01.876Z] 23:42:01 INFO - eip = 0x567def86 esp = 0x002ad178 ebp = 0x002ad20c ebx = 0x002ad180
[task 2019-10-09T23:42:01.876Z] 23:42:01 INFO - esi = 0x0c5f2f1c edi = 0x0c5f2f00
[task 2019-10-09T23:42:01.876Z] 23:42:01 INFO - Found by: call frame info
[task 2019-10-09T23:42:01.876Z] 23:42:01 INFO - 6 xul.dll!CallJSNative(JSContext ,bool ()(JSContext *,unsigned int,JS::Value *),js::CallReason,JS::CallArgs const &) [Interpreter.cpp:cd1f46e85c51f6fac89a5add35537581db9a6e80 : 457 + 0x7]
[task 2019-10-09T23:42:01.876Z] 23:42:01 INFO - eip = 0x5ab50aa0 esp = 0x002ad214 ebp = 0x002ad268 ebx = 0x0660c800
[task 2019-10-09T23:42:01.876Z] 23:42:01 INFO - esi = 0x567dec60 edi = 0x002ad4c0
[task 2019-10-09T23:42:01.876Z] 23:42:01 INFO - Found by: call frame info
[task 2019-10-09T23:42:01.877Z] 23:42:01 INFO - 7 xul.dll!js::InternalCallOrConstruct(JSContext *,JS::CallArgs const &,js::MaybeConstruct,js::CallReason) [Interpreter.cpp:cd1f46e85c51f6fac89a5add35537581db9a6e80 : 549 + 0x9]
[task 2019-10-09T23:42:01.877Z] 23:42:01 INFO - eip = 0x5ab50422 esp = 0x002ad270 ebp = 0x002ad2d0 ebx = 0x00000000
[task 2019-10-09T23:42:01.877Z] 23:42:01 INFO - esi = 0x0660c800 edi = 0x002ad4c0
[task 2019-10-09T23:42:01.877Z] 23:42:01 INFO - Found by: call frame info
[task 2019-10-09T23:42:01.877Z] 23:42:01 INFO - 8 xul.dll!static bool InternalCall(struct JSContext *, const class js::AnyInvokeArgs & const, js::CallReason) [Interpreter.cpp:cd1f46e85c51f6fac89a5add35537581db9a6e80 : 618 + 0xc]
[task 2019-10-09T23:42:01.877Z] 23:42:01 INFO - eip = 0x5ab51363 esp = 0x002ad2d8 ebp = 0x002ad314 ebx = 0x002ad2f0
[task 2019-10-09T23:42:01.877Z] 23:42:01 INFO - esi = 0x0660c800 edi = 0x002ad4c0
[task 2019-10-09T23:42:01.877Z] 23:42:01 INFO - Found by: call frame info
[task 2019-10-09T23:42:01.877Z] 23:42:01 INFO - 9 xul.dll!static bool Interpret(struct JSContext *, class js::RunState & const) [Interpreter.cpp:cd1f46e85c51f6fac89a5add35537581db9a6e80 : 3111 + 0x15]
[task 2019-10-09T23:42:01.877Z] 23:42:01 INFO - eip = 0x5ab49a97 esp = 0x002ad31c ebp = 0x002ad5a4 ebx = 0x00000000
[task 2019-10-09T23:42:01.877Z] 23:42:01 INFO - esi = 0xffffff8c edi = 0x00000000
[task 2019-10-09T23:42:01.877Z] 23:42:01 INFO - Found by: call frame info
[task 2019-10-09T23:42:01.877Z] 23:42:01 INFO - 10 xul.dll!js::RunScript(JSContext *,js::RunState &) [Interpreter.cpp:cd1f46e85c51f6fac89a5add35537581db9a6e80 : 424 + 0x9]
[task 2019-10-09T23:42:01.877Z] 23:42:01 INFO - eip = 0x5ab40a38 esp = 0x002ad5ac ebp = 0x002ad5d0 ebx = 0x080a0790
[task 2019-10-09T23:42:01.877Z] 23:42:01 INFO - esi = 0x0660c800 edi = 0x002ad5f0
[task 2019-10-09T23:42:01.877Z] 23:42:01 INFO - Found by: call frame info
[task 2019-10-09T23:42:01.877Z] 23:42:01 INFO - 11 xul.dll!js::InternalCallOrConstruct(JSContext *,JS::CallArgs const &,js::MaybeConstruct,js::CallReason) [Interpreter.cpp:cd1f46e85c51f6fac89a5add35537581db9a6e80 : 590 + 0xb]
[task 2019-10-09T23:42:01.877Z] 23:42:01 INFO - eip = 0x5ab50437 esp = 0x002ad5d8 ebp = 0x002ad630 ebx = 0x00000000
[task 2019-10-09T23:42:01.877Z] 23:42:01 INFO - esi = 0x0660c800 edi = 0x002ad6c0
[task 2019-10-09T23:42:01.877Z] 23:42:01 INFO - Found by: call frame info
[task 2019-10-09T23:42:01.878Z] 23:42:01 INFO - 12 xul.dll!static bool InternalCall(struct JSContext *, const class js::AnyInvokeArgs & const, js::CallReason) [Interpreter.cpp:cd1f46e85c51f6fac89a5add35537581db9a6e80 : 618 + 0xc]
[task 2019-10-09T23:42:01.878Z] 23:42:01 INFO - eip = 0x5ab51363 esp = 0x002ad638 ebp = 0x002ad678 ebx = 0x002ad650
[task 2019-10-09T23:42:01.878Z] 23:42:01 INFO - esi = 0x0660c800 edi = 0x002ad6c0
[task 2019-10-09T23:42:01.878Z] 23:42:01 INFO - Found by: call frame info
[task 2019-10-09T23:42:01.878Z] 23:42:01 INFO - 13 xul.dll!js::Call(JSContext *,JS::Handle<JS::Value>,JS::Handle<JS::Value>,js::AnyInvokeArgs const &,JS::MutableHandle<JS::Value>,js::CallReason) [Interpreter.cpp:cd1f46e85c51f6fac89a5add35537581db9a6e80 : 635 + 0x6]
[task 2019-10-09T23:42:01.878Z] 23:42:01 INFO - eip = 0x5ab51481 esp = 0x002ad680 ebp = 0x002ad690 ebx = 0x002ad858
[task 2019-10-09T23:42:01.878Z] 23:42:01 INFO - esi = 0x002ad6c0 edi = 0x002ad878
[task 2019-10-09T23:42:01.878Z] 23:42:01 INFO - Found by: call frame info
[task 2019-10-09T23:42:01.878Z] 23:42:01 INFO - 14 xul.dll!JS_CallFunctionValue(JSContext *,JS::Handle<JSObject *>,JS::Handle<JS::Value>,JS::HandleValueArray const &,JS::MutableHandle<JS::Value>) [jsapi.cpp:cd1f46e85c51f6fac89a5add35537581db9a6e80 : 2659 + 0x12]
[task 2019-10-09T23:42:01.878Z] 23:42:01 INFO - eip = 0x5afbec88 esp = 0x002ad698 ebp = 0x002ad7c4 ebx = 0x002ad858
[task 2019-10-09T23:42:01.878Z] 23:42:01 INFO - esi = 0x002ad740
[task 2019-10-09T23:42:01.878Z] 23:42:01 INFO - Found by: call frame info
[task 2019-10-09T23:42:01.878Z] 23:42:01 INFO - 15 xul.dll!nsXPCWrappedJS::CallMethod(unsigned short,nsXPTMethodInfo const *,nsXPTCMiniVariant *) [XPCWrappedJSClass.cpp:cd1f46e85c51f6fac89a5add35537581db9a6e80 : 956 + 0x14]
[task 2019-10-09T23:42:01.878Z] 23:42:01 INFO - eip = 0x567d6dd9 esp = 0x002ad7cc ebp = 0x002ada78 ebx = 0x00000003
[task 2019-10-09T23:42:01.878Z] 23:42:01 INFO - esi = 0x80004005 edi = 0x002ad90c
[task 2019-10-09T23:42:01.878Z] 23:42:01 INFO - Found by: call frame info
[task 2019-10-09T23:42:01.878Z] 23:42:01 INFO - 16 xul.dll!static nsresult PrepareAndDispatch(class nsXPTCStubBase *, unsigned int, unsigned int *, unsigned int *) [xptcstubs.cpp:cd1f46e85c51f6fac89a5add35537581db9a6e80 : 88 + 0xf]
[task 2019-10-09T23:42:01.878Z] 23:42:01 INFO - eip = 0x55abc41c esp = 0x002ada80 ebp = 0x002adb4c ebx = 0x0c56f320
[task 2019-10-09T23:42:01.879Z] 23:42:01 INFO - esi = 0x002adab0 edi = 0x0000000c
[task 2019-10-09T23:42:01.879Z] 23:42:01 INFO - Found by: call frame info
[task 2019-10-09T23:42:01.879Z] 23:42:01 INFO - 17 xul.dll!static void SharedStub() [xptcstubs.cpp:cd1f46e85c51f6fac89a5add35537581db9a6e80 : 110 + 0x16]
[task 2019-10-09T23:42:01.879Z] 23:42:01 INFO - eip = 0x55abb166 esp = 0x002adb54 ebp = 0x002adb68 ebx = 0x002adba4
[task 2019-10-09T23:42:01.879Z] 23:42:01 INFO - esi = 0x0c5d6a40 edi = 0x002adbd0
[task 2019-10-09T23:42:01.879Z] 23:42:01 INFO - Found by: call frame info
[task 2019-10-09T23:42:01.879Z] 23:42:01 INFO - 18 xul.dll!nsDocumentOpenInfo::DispatchContent(nsIRequest *,nsISupports *) [nsURILoader.cpp:cd1f46e85c51f6fac89a5add35537581db9a6e80 : 477 + 0xf]
[task 2019-10-09T23:42:01.879Z] 23:42:01 INFO - eip = 0x569dcbcc esp = 0x002adb70 ebp = 0x002adb68
[task 2019-10-09T23:42:01.879Z] 23:42:01 INFO - Found by: call frame info
[task 2019-10-09T23:42:01.879Z] 23:42:01 INFO -
entering test: /service-workers/service-worker/clients-get.https.html
Comment 1•6 years ago
|
||
Started retriggers and backfill:
https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&searchStr=Windows%2C7%2Cdebug%2CWeb%2Cplatform%2Ctests%2Ctest-windows7-32%2Fdebug-web-platform-tests-e10s-11%2CW%28wpt11%29&tochange=cd1f46e85c51f6fac89a5add35537581db9a6e80&fromchange=068d8731f1c4003a567c94c8b76e3803e8e50345
and
| Comment hidden (Intermittent Failures Robot) |
Updated•6 years ago
|
Comment 3•6 years ago
|
||
FYI, This is still reproducible on Bughunter. I've resubmitted 75 urls where this has happened in the past on both Linux and Windows and was able to reproduce on 44 of them. http://russnov.ru/vecher-s-vladimirom-solovevym-03-07-2019/ is an example.
Updated•6 years ago
|
Comment 4•6 years ago
|
||
Bob, I tried that link from comment 3 in a debug browser built from revision a7ca5ad33f3d on Linux and I am not getting an assertion...
Any idea what I should do to be able to reproduce the assertion? Is it happening for you reliably on that site?
Comment 5•6 years ago
|
||
I was using a Fedora and a build from yesterday this morning when I picked that url and it was pretty reproducible just by loading the url and perhaps refreshing once or twice but I can't get it or the others top reproducible urls to fire the assert again. I've resubmitted the 75 urls again to see there is another that I can confirm still reproduces.
Comment 6•6 years ago
|
||
I've also gone back to the entire history and tested the full 140+ urls where this has occurred in the past. Bughunter can reliably reproduce it but I am unable to do so locally. The two most frequently reproduced urls are http://aops..uscg.mil/ and http://.google.hu/ which are invalid urls and only load when I use the bughunter proxy.
The next most frequently reproduced urls are
http://russnov.ru/vecher-s-vladimirom-solovevym-03-07-2019/
http://tadslgbtmusiccharts.blogspot.com/2019/09/chart-week-38-week-of-22nd-september.html#comment-form
http://tix-iyo-tiraab.blogspot.com/2018/01/ifb-compilation-197080s_15.html
http://evasbackparty.de/choco-fresh-torte/
https://achigawacell.blogspot.com/2018/03/cara-flash-samsung-galaxy-s4-gt-i9500.html
I wonder if the proxy isn't serving me stale content and the current versions of the sites no longer have whatever advertisement caused the assertion.
cknowles: Can we make the proxy always serve fresh content? What do you think is up with the malformed urls like http://aops..uscg.mil/ and http://.google.hu/ ? When I load them with an opt build, the proxy returns a 400 Bad Request error with X-Squid-Error
ERR_INVALID_URL 0
bz: Perhaps you can proxy through a squid instance somewhere and try one of the invalid urls http://aops..uscg.mil/ and http://.google.hu/ ?
Comment 7•6 years ago
|
||
Sadly, we already do this:
cache deny all
Is on the config for the proxy.
Per https://wiki.squid-cache.org/SquidFaq/ConfiguringSquid#Can_I_make_Squid_proxy_only.2C_without_caching_anything.3F and other squid docs, that's the correct approach for the version we're on.
Looking at the logs, I don't see any cache hits.
I think the issue is that the squid box is trying to resolve the DNS name - and rightly failing due to the malformation. Of course, I can't even open those URLs on a test VM I have.
Let me know if I can look at anything else for you.
Comment 8•6 years ago
|
||
I'm not quite sure where to find a random squid instance...
Comment 9•6 years ago
|
||
I did a quick dnf install and just ran squid from the command line. Weirdly I get a different assertion failure than the one I got with the bughunter proxy when loading the invalid urls.
Assertion failure: cache->GetWrapperMaybeDead() == obj || (js::RuntimeIsBeingDestroyed() && !cache->GetWrapperMaybeDead()), at /builds/worker/workspace/build/src/obj-firefox/dist/include/mozilla/dom/BindingUtils.h:1378
I'll keep looking for reproducible urls and will report back if I find them.
Comment 12•6 years ago
|
||
bz: Just reproduced twice in a row on Ubuntu 19.10 Nightly with https://www.baiscopelk.com/
Get it while it's hot.
Assertion failure: XRE_IsParentProcess(), at /builds/worker/workspace/build/src/toolkit/components/windowwatcher/nsWindowWatcher.cpp:700
#01: nsWindowWatcher::OpenWindow(mozIDOMWindowProxy*, char const*, char const*, char const*, nsISupports*, mozIDOMWindowProxy**) (/mozilla/builds/nightly/mozilla/firefox-debug/dist/bin/libxul.so)
#02: NS_InvokeByIndex (/mozilla/builds/nightly/mozilla/firefox-debug/dist/bin/libxul.so)
Comment 13•6 years ago
|
||
I tried that URL a few times on Linux (under rr) now, starting 5 minutes after comment 12, and haven't been able to reproduce. :(
Comment 14•6 years ago
|
||
I can still reliably reproduce on the ubuntu 19.10 vm but I can run rr on the vm so I can't check it there. I can't reproduce on my local laptop running fedora 30 at all. What distro are you using?
Comment 15•6 years ago
|
||
cknowles got the vm configured temporarily to run rr and I was able to get the assertion. If there is a file or directory I can share with you or if you want to hit me on irc and drive me like the bot I am, let me know.
Comment 16•6 years ago
|
||
Thank you! I'm on Fedora 30 as well.
If you can rr pack the trace directory involved and then zip it up and get it to me, that would be really helpful. Also the version of rr that was used to record.
Comment 17•6 years ago
|
||
Ok, Bob and I stepped through this in rr and we think we know what's going on:
- An iframe is created in a web page (from the parser). It starts a load of its src attribute.
- That load comes back as a redirect and necko starts its various redirect processing. In particular, it calls
HttpChannelChild::Redirect1Beginwhich callsHttpChannelChild::SetupRedirectwhich creates the new channel and callsSetupReplacementChannelto set up its loadgroup, etc. After this the redirection process goes async. - While that async work is going on, the iframe element is removed from the web page. That tears down the docshell, which calls
Cancelon the docshell's loadgroup. This cancels the pre-redirect channel, but not the post-redirect one, since the latter is not in the loadgroup yet! Docshell teardown nulls out themDocShellmember of thensDSURIContentListener. - The async redirect work finishes up and necko calls into
HttpChannelChild::CompleteRedirectSetupwhich adds the post-redirect channel to the loadgroup. There are comments in there about how "No need to check for cancel", but those really don't seem right to me... At this point the pre-redirect channel hasNS_BINDING_ABORTEDas its status, as expected. - The post-redirect channel sends an
OnStartRequestto its listener, and its status is NOT a failure status at this point, contrary to the comments inCompleteRedirectSetup. So the URILoader tries to handle the load, thensDSURIContentListenerdoesn't want to handle it (because it has a nullmDocShell, per item 3), and we end up trying to hand it off to the helper app service, triggering the assert.
I would have thought that HttpChannelChild::Cancel would call Cancel on mRedirectChannelChild if it's non-null... Or that something would prevent the redirect from completing successfully if Cancel is called on the pre-redirect channel before the post-redirect channel is added to the loadgroup, at least.
Honza, could you take a look? I think you know this stuff best... If it would be useful, Bob still has the live rr replay and you could poke around in there.
Comment 18•6 years ago
|
||
(In reply to Boris Zbarsky [:bzbarsky, bz on IRC] from comment #17)
Ok, Bob and I stepped through this in rr and we think we know what's going on:
- An iframe is created in a web page (from the parser). It starts a load of its src attribute.
- That load comes back as a redirect and necko starts its various redirect processing. In particular, it calls
HttpChannelChild::Redirect1Beginwhich callsHttpChannelChild::SetupRedirectwhich creates the new channel and callsSetupReplacementChannelto set up its loadgroup, etc. After this the redirection process goes async.- While that async work is going on, the iframe element is removed from the web page. That tears down the docshell, which calls
Cancelon the docshell's loadgroup. This cancels the pre-redirect channel, but not the post-redirect one, since the latter is not in the loadgroup yet! Docshell teardown nulls out themDocShellmember of thensDSURIContentListener.- The async redirect work finishes up and necko calls into
HttpChannelChild::CompleteRedirectSetup
The cancellation of the pre-redirect channel on the child process happens between the async redirect decision for the post-redirect child channel has been made and Redirect3Complete message has been received back on the child (what happens after the post-redirect nsHttpChannel on the parent process has been succesfully asyncopened in reaction to accepting the redirect also on the child).
which adds the post-redirect channel to the loadgroup. There are comments in there about how "No need to check for cancel", but those really don't seem right to me... At this point the pre-redirect channel has
NS_BINDING_ABORTEDas its status, as expected.
You are right! Very good catch. The comment is here: https://searchfox.org/mozilla-central/rev/088e2cf29c59d733d57af43903eb0267dbf72e2a/netwerk/protocol/http/HttpChannelChild.cpp#2249-2254 and should be removed or altered to reflect the changes I suggest below.
- The post-redirect channel sends an
OnStartRequestto its listener, and its status is NOT a failure status at this point, contrary to the comments inCompleteRedirectSetup. So the URILoader tries to handle the load, thensDSURIContentListenerdoesn't want to handle it (because it has a nullmDocShell, per item 3), and we end up trying to hand it off to the helper app service, triggering the assert.I would have thought that
HttpChannelChild::Cancelwould callCancelonmRedirectChannelChildif it's non-null... Or that something would prevent the redirect from completing successfully ifCancelis called on the pre-redirect channel before the post-redirect channel is added to the loadgroup, at least.Honza, could you take a look? I think you know this stuff best... If it would be useful, Bob still has the live rr replay and you could poke around in there.
I believe the right fix here is to simply throw the post-redirect child channel away (it's not in the load group, doesn't refer the listener and has never been asyncopened) when the pre-redirect child channel is found cancelled in HttpChannelChild::Redirect3Complete; more or less copy what happens here for succeeded == false: https://searchfox.org/mozilla-central/rev/088e2cf29c59d733d57af43903eb0267dbf72e2a/netwerk/protocol/http/HttpChannelParent.cpp#2145, never send the event.
I don't think this is a security issue.
Comment 20•6 years ago
•
|
||
(In reply to Bob Clary [:bc:] from comment #19)
Honza: Do you still need the rr recording?
I believe you can delete it. It's possible to write a test for the scenario, so we will be able to verify the cause and the fix.
And thanks!
Updated•6 years ago
|
Updated•6 years ago
|
Comment 22•6 years ago
|
||
I don't think this is a security issue.
Are you sure? This is violating assertions in the WindowWatcher; do you understand what the implications of that are? I know I don't...
Comment 23•6 years ago
|
||
(In reply to Boris Zbarsky [:bzbarsky, bz on IRC] from comment #22)
I don't think this is a security issue.
Are you sure? This is violating assertions in the WindowWatcher; do you understand what the implications of that are? I know I don't...
Thanks for stepping in, Boris. I was mostly reading only the findings from your comment 17 and not much the rest of the bug. Thinking about it more, I am not able to assess if this is a security bug or not, to be honest. The assertion was added in bug 1261842, which was not a security bug and thus a comment for the assertion could have been added, but was not. Hard to say why this is expected only on the parent process and what the implications of violating it are.
This has been prioritized as a P2 in Necko. IMO, if there is a security bug then it's more likely in the response handling (doc nav) than in Necko itself. The necko bug mostly just uncovers that the paths above necko are vulnerable.
I think this should move back to window management component, and try to find a fix in ww, and we should also file a new bug for the necko issue to fix what you have found in comment 17.
What do you think?
Comment 24•6 years ago
|
||
IMO, if there is a security bug then it's more likely in the response handling
I mean, necko is violating its contract, which makes it really hard to reason about anything in the higher layers. There are lots of places where we assume that once we cancel a channel it is in fact canceled!
I think we just need to fix this on the necko side; it's really not clear to me what the higher layers could do here without breaking all sorts of other cases where we really do want to process a response from necko.
Comment 25•6 years ago
|
||
Got it. If this can only trigger because of necko contract violation and there is nothing else for the consumer but keeping some additional information about the channel cancellation (which is actually pretty redundant), then this is solely a necko bug, yes. Please feel free to nominate as a P1 so we can find an assignee and fix for this release.
Comment 26•6 years ago
|
||
Right, on the consumer side what this looks like is that we cancel a channel and then later we still get an OnStartRequest with a non-error status. How do I nominate as P1?
Comment 27•6 years ago
|
||
Ah, I mean to simply change the priority to P1. Done and ni? Nhi to find an assignee.
Updated•6 years ago
|
| Assignee | ||
Comment 28•6 years ago
|
||
| Assignee | ||
Comment 29•6 years ago
|
||
(In reply to Honza Bambas (:mayhemer) from comment #20)
(In reply to Bob Clary [:bc:] from comment #19)
Honza: Do you still need the rr recording?
I believe you can delete it. It's possible to write a test for the scenario, so we will be able to verify the cause and the fix.
It looks like there is no easy way to write a test for this. To reproduce this, the pre-redirect channel has to be canceled after SendRedirect2Verify.
Comment 30•6 years ago
|
||
(In reply to Kershaw Chang [:kershaw] from comment #29)
(In reply to Honza Bambas (:mayhemer) from comment #20)
(In reply to Bob Clary [:bc:] from comment #19)
Honza: Do you still need the rr recording?
I believe you can delete it. It's possible to write a test for the scenario, so we will be able to verify the cause and the fix.
It looks like there is no easy way to write a test for this. To reproduce this, the pre-redirect channel has to be canceled after
SendRedirect2Verify.
OK, if this is too hard, just leave it. Although, I was thinking about having a custom redirect veto handler added that would sync somehow between the two processes. It's easy now to do messaging in xpcshell tests for e10s. But that was just a thought and maybe it's really not that easy.
Comment 31•6 years ago
|
||
Comment 32•6 years ago
|
||
| bugherder | ||
Updated•6 years ago
|
| Comment hidden (Intermittent Failures Robot) |
Description
•