Closed
Bug 1006695
Opened 11 years ago
Closed 11 years ago
Preserving system sources fails if source compression is used
Categories
(Core :: JavaScript Engine, defect, P1)
Core
JavaScript Engine
Tracking
()
VERIFIED
FIXED
mozilla32
People
(Reporter: jryans, Assigned: laszio.bugzilla)
References
Details
(Keywords: regression)
Attachments
(2 files, 5 obsolete files)
1.15 KB,
patch
|
luke
:
review+
|
Details | Diff | Splinter Review |
1.10 KB,
patch
|
luke
:
review+
|
Details | Diff | Splinter Review |
In bug 990353, a feature was added to stop saving system sources, which is enabled by default for B2G.
For B2G developers, we'd like to return to the previous behavior so that certified apps can be debugged again, and so bug 1001348 attempted go back to previous behavior when we're already debugging certified apps by setting "javascript.options.discardSystemSource" to "false" at runtime.
Setting this pref to false (preserving system sources once again) works correctly on single core devices, like the Keon, but causes issues (breaks the Gaia UI) on multi-core devices, like the Nexus 4.
Using the discardSystemSource = false setting causes issues when set at both runtime or boot time (in the profile).
:mccr8 hypothesized that it could be related to source compression (which is only done on multi-core). I manually disabled source compression in a local Nexus 4 build, and now preserving system sources works correctly.
So, there appears to be strange interaction between using the discardSystemSource = false and source compression.
Comment 1•11 years ago
|
||
Is anyone looking into this? We really need the debugger back for certified bugs
Comment 2•11 years ago
|
||
Is this something you could look at Brian? It seems related to things you've worked on. Thanks.
Flags: needinfo?(bhackett1024)
Comment 3•11 years ago
|
||
Brian - As context, this is something that's blocking a lot of Gaia people from getting work done, and that we should fix quickly. I would really like to look at it, but I just have too many things on my plate right now (which, incidentally, is why I'm behind on that review I owe you).
Comment 4•11 years ago
|
||
Naveed/Brian,
This bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1001348 is blocking our team for making progress on madai related features (camera) that only work on nexus4. The root of which is tracked here. Please help address this issue.
Thanks a bunch!
Hema
Flags: needinfo?(nihsanullah)
Comment 5•11 years ago
|
||
I really don't know this code well. The only thing I've done with source compression was changing it to happen on JS helper threads rather than on its own thread.
Flags: needinfo?(bhackett1024)
Comment 6•11 years ago
|
||
(In reply to Brian Hackett (:bhackett) from comment #5)
> I really don't know this code well. The only thing I've done with source
> compression was changing it to happen on JS helper threads rather than on
> its own thread.
FWIW, nobody knows it all that well. The person who wrote it doesn't work at Mozilla IIUC. We just need somebody to jump in and figure it out.
Comment 7•11 years ago
|
||
OK but I don't understand the urgency here. If turning off source compression works then why not just do that until a better fix is in place?
Is the problem in this bug with setting discardSystemSource to false dynamically, or with it ever being set to false? If the latter, does function source accessing work correctly for content scripts (where presumably off thread compression will be used)? Also, are there any STR for desktop b2g?
Comment 8•11 years ago
|
||
A custom-prefs.js file with the two lines below causes gaia not to boot on Nexus 4. No homescreen, just a black screen:
user_pref("devtools.debugger.forbid-certified-apps", false);
user_pref("javascript.options.discardSystemSource", false);
Reporter | ||
Comment 9•11 years ago
|
||
(In reply to Brian Hackett (:bhackett) from comment #7)
> OK but I don't understand the urgency here. If turning off source
> compression works then why not just do that until a better fix is in place?
Maybe this would be okay if there was an "easy" way to turn it off for Gaia developers, such as a pref or something... Currently it requires editing C++ source, unless you want to disable JS_THREADSAFE entirely, which is more like a sledgehammer.
Also, doing this would make the developer setup diverge further from a real production device, which is not great, but I suppose it's workable if everyone commits to it being temporary.
> Is the problem in this bug with setting discardSystemSource to false
> dynamically, or with it ever being set to false?
The problem seems to manifest if it is ever set to false. As in, if it is set to false in the profile during b2g startup, we get to the failure scenario. So, somehow setting it to false at boot time is not the same as reverting the patch from bug 990353 entirely (when source compression is in effect).
> If the latter, does function source accessing work correctly for content scripts (where
> presumably off thread compression will be used)?
I am not sure of the answer here. I can attempt to find out.
> Also, are there any STR for desktop b2g?
I originally tested setting discardSystemSource to false in the b2g simulator (which uses b2g desktop), and it seemed to work fine there. However, I don't know at the moment whether source compression is active on b2g desktop. I can try to determine this too.
Comment 10•11 years ago
|
||
Should bug 990353 just be reverted until this bug is fixed? Does source compression work for non-system sources?
Related to B2G source compression, bug 837110 suggests replacing zlib compression with a faster algorithm like lz or snappy might make source compression viable on single-core devices.
Comment 11•11 years ago
|
||
The issue is not exclusive to Nexus 4. Sources don't show up either on my hamachi:
Gaia 870a5c518742665d36b17e7e88c2ab07d440b94c
Gecko https://hg.mozilla.org/mozilla-central/rev/417acde736e7
BuildID 20140507040203
Version 32.0a1
ro.build.version.incremental=324
ro.build.date=Thu Dec 19 14:04:55 CST 2013
Reporter | ||
Comment 12•11 years ago
|
||
(In reply to Diego Marcos [:dmarcos] from comment #11)
> The issue is not exclusive to Nexus 4. Sources don't show up either on my
> hamachi:
>
> Gaia 870a5c518742665d36b17e7e88c2ab07d440b94c
> Gecko https://hg.mozilla.org/mozilla-central/rev/417acde736e7
> BuildID 20140507040203
> Version 32.0a1
> ro.build.version.incremental=324
> ro.build.date=Thu Dec 19 14:04:55 CST 2013
To clarify this slightly, Diego was running in the default configuration of discardSystemSource = true in this case, so it's expected that the sources are missing here.
Comment 13•11 years ago
|
||
(In reply to Chris Peterson (:cpeterson) from comment #10)
> Should bug 990353 just be reverted until this bug is fixed? Does source
> compression work for non-system sources?
We're basically committed to that bug for Tarako at this point. It's critical for our entire FFOS strategy.
> Related to B2G source compression, bug 837110 suggests replacing zlib
> compression with a faster algorithm like lz or snappy might make source
> compression viable on single-core devices.
I don't think we need to turn on source compression. It's almost certainly just some dumb logic bug in the JS engine that somebody with a few cycles needs to go in and fix.
Reporter | ||
Comment 14•11 years ago
|
||
(In reply to J. Ryan Stinnett [:jryans] from comment #9)
> (In reply to Brian Hackett (:bhackett) from comment #7)
> > Also, are there any STR for desktop b2g?
>
> I originally tested setting discardSystemSource to false in the b2g
> simulator (which uses b2g desktop), and it seemed to work fine there.
> However, I don't know at the moment whether source compression is active on
> b2g desktop. I can try to determine this too.
I've confirmed that b2g desktop does appear to use source compression, but I cannot get it to fail when discardSystemSource = false like I can with a multi-core device build.
Comment 15•11 years ago
|
||
Luke: bhackett thought you might have some insight into this B2G source compression bug because you reviewed (some of) the discardSystemSource patches in bug 990353.
Flags: needinfo?(luke)
Priority: -- → P1
![]() |
||
Comment 16•11 years ago
|
||
I rescanned bug 990353 and the only thing that looks fishy is:
http://hg.mozilla.org/mozilla-central/diff/95391b25bc11/js/xpconnect/loader/mozJSComponentLoader.cpp
since the logic in the comment would seem to be invalidated by setting setDiscardSource(false) on b2g.
Flags: needinfo?(luke) → needinfo?(bobbyholley)
Comment 17•11 years ago
|
||
(In reply to Luke Wagner [:luke] (on partial PTO) from comment #16)
> I rescanned bug 990353 and the only thing that looks fishy is:
>
> http://hg.mozilla.org/mozilla-central/diff/95391b25bc11/js/xpconnect/loader/
> mozJSComponentLoader.cpp
> since the logic in the comment would seem to be invalidated by setting
> setDiscardSource(false) on b2g.
Yeah, but that just means we lose an optimization. It shouldn't impact correctness.
This code is just saying that we can't enable lazy source retrieval for functions in the b2g Frankenstein compartment, because the source hook doesn't know how to deal with that. Ordinarily, turning off lazy source retrieval would be a memory regression, because we'd have to store the source. But in this case, it doesn't matter in production, because we just throw away all the source on b2g.
Flags: needinfo?(bobbyholley)
Assignee | ||
Comment 18•11 years ago
|
||
It looks like that the worker thread for compression is not known by Nuwa so Nuwa blocked (by chances?) when starting. This is not a bug of discarding sources; It just hides the real problem.
@jryans, would you mind to try the attached patch on Nexus 4? I don't have a Nexus 4 and can only test by unconditionally enabling source compression on a single core device.
Attachment #8419248 -
Flags: feedback?(cyu)
Flags: needinfo?(jryans)
Reporter | ||
Comment 19•11 years ago
|
||
Comment on attachment 8419248 [details] [diff] [review]
jsworkers.nuwa.diff
This works for me! After adding this patch, I am able to use the Nexus 4 with source compression (as it is by default since it's multi-core) and discardSystemSource = false.
Great work! :D
Attachment #8419248 -
Flags: feedback+
Flags: needinfo?(jryans)
Comment 20•11 years ago
|
||
(In reply to Ting-Yuan Huang from comment #18)
> It looks like that the worker thread for compression is not known by Nuwa so
> Nuwa blocked (by chances?) when starting. This is not a bug of discarding
> sources; It just hides the real problem.
Woah, I'm surprised that doesn't mess more stuff up! Maybe we usually don't start up many JS helpers that early.
Comment 21•11 years ago
|
||
We should've catch bugs like this in mochitest, but this case escaped from our safety net because our emulator runs only with once virtual core.
Assignee | ||
Comment 22•11 years ago
|
||
This patch makes js worker threads, only in the Nuwa process when booting, wait for Nuwa to be ready.
NuwaFreezeCurrentThread() is required because anything, including IPC, can happen in the threads. This can be suboptimal that worker stuffs won't be shared among process but is much safer.
I'll update try results once they are finished.
Attachment #8419248 -
Attachment is obsolete: true
Attachment #8419248 -
Flags: feedback?(cyu)
Attachment #8419961 -
Flags: review?(luke)
Attachment #8419961 -
Flags: review?(khuey)
Assignee | ||
Comment 23•11 years ago
|
||
Same as part.1 but for workers in thread pool.
Attachment #8419965 -
Flags: review?(luke)
Attachment #8419965 -
Flags: review?(khuey)
![]() |
||
Comment 24•11 years ago
|
||
Comment on attachment 8419961 [details] [diff] [review]
Bug 1006695 - Mark JS worker threads to be known by Nuwa.
Review of attachment 8419961 [details] [diff] [review]:
-----------------------------------------------------------------
::: js/src/jsworkers.cpp
@@ +23,5 @@
> #include "jsobjinlines.h"
> #include "jsscriptinlines.h"
>
> +#ifdef MOZ_NUWA_PROCESS
> +#include "ipc/Nuwa.h"
Style nit: can you indent 'include' one space (# include "ipc/Nuwa.h") here and in the other patch?
Attachment #8419961 -
Flags: review?(khuey) → review+
![]() |
||
Updated•11 years ago
|
Attachment #8419965 -
Flags: review?(luke) → review+
![]() |
||
Comment 25•11 years ago
|
||
Comment on attachment 8419961 [details] [diff] [review]
Bug 1006695 - Mark JS worker threads to be known by Nuwa.
Oops, wrong box.
Attachment #8419961 -
Flags: review?(luke) → review?(khuey)
Comment 26•11 years ago
|
||
Nice catch Ting. Assigning it to you since you have the patches up.
Assignee: nobody → tchou
Flags: needinfo?(nihsanullah)
Assignee | ||
Comment 28•11 years ago
|
||
NuwaFreezeCurrentThread() freezed Nuwa in the tests...sigh. Some functions Nuwa preloads/calls require the worker threads to work before Nuwa forks.
@Kyle and @Jason, do you think it's safe to go without freezing those workers in Nuwa? There are similar things in gc helper threads and no NuwaFreezeCurrentThread() called. I think, if those workers get no jobs/IPCs after Nuwa stabilized, then it should be OK.
The following tries showed me some confidence:
(The patch without NuwaFreeCurrentThread())
https://tbpl.mozilla.org/?tree=Try&rev=a28f8cbaea1b
(The patch without NuwaFreeCurrentThread() + don't discard script source + force compression)
https://tbpl.mozilla.org/?tree=Try&rev=57108aff0a86
Flags: needinfo?(khuey)
Flags: needinfo?(jorendorff)
Comment 29•11 years ago
|
||
(In reply to Ting-Yuan Huang from comment #28)
> NuwaFreezeCurrentThread() freezed Nuwa in the tests...sigh. Some functions
Hmm, I made the suggestion to freeze the js worker in its entry point. It turns out that this freezes the Nuwa process and effectively breaks the whole device except things residing in the chrome process. It freezes the Nuwa process because the js worker is used during XPCOM initialization, and the main thread is waiting for a frozen thread:
#0 __futex_syscall3 () at bionic/libc/arch-arm/bionic/atomics_arm.S:182
#1 0x400c955c in __pthread_cond_timedwait_relative (cond=0x40304514, mutex=0x40315c90, reltime=0x0) at bionic/libc/bionic/pthread.c:1477
#2 0x400c9610 in __pthread_cond_timedwait (cond=0x40304514, mutex=0x40315c90, abstime=0x0, clock=0) at bionic/libc/bionic/pthread.c:1500
#3 0x40084616 in __wrap_pthread_cond_wait (cond=<value optimized out>, mtx=<value optimized out>) at mozglue/build/Nuwa.cpp:1011
#4 0x4227db60 in PR_WaitCondVar (cvar=0x40304510, timeout=4294967295) at nsprpub/pr/src/pthreads/ptsynch.c:385
#5 0x417070ee in js::GlobalWorkerThreadState::wait (this=0xbee917d4) at js/src/jsworkers.cpp:517
#6 js::SourceCompressionTask::complete (this=0xbee917d4) at js/src/jsworkers.cpp:960
#7 0x417d2568 in CompileFunctionBody (cx=0x40310f20, fun=..., options=<value optimized out>, formals=..., srcBuf=...) at js/src/frontend/BytecodeCompiler.cpp:657
#8 js::frontend::CompileFunctionBody (cx=0x40310f20, fun=..., options=<value optimized out>, formals=..., srcBuf=...) at js/src/frontend/BytecodeCompiler.cpp:668
#9 0x41680394 in JS::CompileFunction (cx=0x40310f20, obj=..., options=<value optimized out>, name=0x0, nargs=0, argnames=0x0, srcBuf=...) at js/src/jsapi.cpp:4822
#10 0x41680534 in CompileFunction (cx=0x40310f20, obj=..., options=..., name=0x0, nargs=0, argnames=0x0, bytes=0x43191000 "/* This Source Code Form is subject to the terms of the Mozilla Public\n * License, v. 2.0. If a copy of the MPL was not distributed with this file,\n * You can obtain one at http://mozilla.org/MPL/2.0/"..., length=8445) at js/src/jsapi.cpp:4841
#11 CompileFunction (cx=0x40310f20, obj=..., options=..., name=0x0, nargs=0, argnames=0x0, bytes=0x43191000 "/* This Source Code Form is subject to the terms of the Mozilla Public\n * License, v. 2.0. If a copy of the MPL was not distributed with this file,\n * You can obtain one at http://mozilla.org/MPL/2.0/"..., length=8445) at js/src/jsapi.cpp:4857
#12 0x40d2c350 in mozJSComponentLoader::ObjectForLocation (this=0x43034690, aComponentFile=<value optimized out>, aURI=<value optimized out>, aObject=..., aTableScript=..., aLocation=0x4320c8c0, aPropagateExceptions=false, aException=...) at js/xpconnect/loader/mozJSComponentLoader.cpp:952
#13 0x40d2cf88 in mozJSComponentLoader::LoadModule (this=0x43034690, aFile=<value optimized out>) at js/xpconnect/loader/mozJSComponentLoader.cpp:424
#14 0x406f2894 in nsComponentManagerImpl::KnownModule::Load (this=0x403f5860) at xpcom/components/nsComponentManager.cpp:741
#15 0x406f28ec in nsFactoryEntry::GetFactory (this=0x403afd60) at xpcom/components/nsComponentManager.cpp:1774
#16 0x406f29f8 in nsComponentManagerImpl::CreateInstanceByContractID (this=<value optimized out>, aContractID=0x403ad460 "@mozilla.org/browser/directory-provider;1", aDelegate=0x0, aIID=..., aResult=0xbee91cc4) at xpcom/components/nsComponentManager.cpp:1075
#17 0x406f3d84 in nsComponentManagerImpl::GetServiceByContractID (this=0x40375400, aContractID=0x403ad460 "@mozilla.org/browser/directory-provider;1", aIID=..., result=<value optimized out>) at xpcom/components/nsComponentManager.cpp:1435
#18 0x406c3b9a in CallGetService (aContractID=0x403ad460 "@mozilla.org/browser/directory-provider;1", aIID=..., aResult=0x0) at xpcom/glue/nsComponentManagerUtils.cpp:64
#19 0x406c3bca in nsGetServiceByContractID::operator() (this=<value optimized out>, aIID=..., aInstancePtr=0x0) at xpcom/glue/nsComponentManagerUtils.cpp:252
#20 0x406c315a in nsCOMPtr_base::assign_from_gs_contractid (this=0xbee91d94, gs=..., iid=<value optimized out>) at xpcom/glue/nsCOMPtr.cpp:92
#21 0x406e93d2 in nsCOMPtr (this=0x403143c0) at ../../dist/include/nsCOMPtr.h:658
#22 nsDirectoryService::RegisterCategoryProviders (this=0x403143c0) at xpcom/io/nsDirectoryService.cpp:477
#23 0x406ca9bc in NS_InitXPCOM2 (result=<value optimized out>, binDirectory=0x40324110, appFileLocationProvider=0x40301bb0) at xpcom/build/nsXPComInit.cpp:655
#24 0x4132d038 in XRE_InitEmbedding2 (aLibXULDirectory=<value optimized out>, aAppDirectory=0x40324110, aAppDirProvider=0x0) at toolkit/xre/nsEmbedFunctions.cpp:150
#25 0x40841550 in mozilla::ipc::ScopedXREEmbed::Start (this=0x4033f268) at ipc/glue/ScopedXREEmbed.cpp:99
#26 0x40cc0c86 in mozilla::dom::ContentProcess::Init (this=0x4033f000) at dom/ipc/ContentProcess.cpp:28
#27 0x4132ce96 in XRE_InitChildProcess (aArgc=-1092015912, aArgv=0xbee9284c, aProcess=1077145600) at toolkit/xre/nsEmbedFunctions.cpp:509
#28 0x000087a0 in main (argc=8, argv=0xbee928d4) at ipc/app/MozillaRuntimeMain.cpp:149
I think we need to remove the NuwaFreezeCurrentThread() and let the worker freeze after we have finished XPCOM initialization.
Comment 30•11 years ago
|
||
Retest with NuwaFreezeCurrentThread() call removed from both patches:
https://tbpl.mozilla.org/?tree=Try&rev=774b10720b22
Assignee | ||
Comment 31•11 years ago
|
||
Luke, do you think removing the freeze is OK? The situation of js workers should be similar to gc helpers.
I don't worry about web workers; They should not run in such early stages.
Attachment #8419961 -
Attachment is obsolete: true
Attachment #8419965 -
Attachment is obsolete: true
Attachment #8419961 -
Flags: review?(khuey)
Attachment #8419965 -
Flags: review?(khuey)
Attachment #8422241 -
Flags: review?(luke)
Attachment #8422241 -
Flags: review?(khuey)
Assignee | ||
Comment 32•11 years ago
|
||
Attachment #8422244 -
Flags: review?(luke)
Attachment #8422244 -
Flags: review?(khuey)
![]() |
||
Comment 33•11 years ago
|
||
Comment on attachment 8422241 [details] [diff] [review]
Bug 1006695 - Mark JS worker threads to be known by Nuwa. v2
I can't think of a problem (but I'm not that familiar with Nuwa).
Attachment #8422241 -
Flags: review?(luke) → review+
![]() |
||
Updated•11 years ago
|
Attachment #8422244 -
Flags: review?(luke) → review+
Comment 34•11 years ago
|
||
What's the ETA for this?
Comment 35•11 years ago
|
||
If we are not waiting for kyle's review, we should land be able to land this.
There will be a problem only when we freeze a worker thread and then uses it in the Nuwa process. Since we don't run PreloadSlowThings() in the Nuwa process, I can't think of any situation where we might need to use it.
Attachment #8422241 -
Flags: review?(khuey) → review+
Attachment #8422244 -
Flags: review?(khuey) → review+
Flags: needinfo?(khuey)
Assignee | ||
Updated•11 years ago
|
Flags: needinfo?(jorendorff)
Keywords: checkin-needed
Comment 36•11 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/70bc18f7b6d5
https://hg.mozilla.org/integration/mozilla-inbound/rev/3da78e2f3ac8
Keywords: checkin-needed
Comment 37•11 years ago
|
||
sorry had to back out https://hg.mozilla.org/integration/mozilla-inbound/rev/70bc18f7b6d5 for test failures like https://tbpl.mozilla.org/php/getParsedLog.php?id=39994876&tree=Mozilla-Inbound
Comment 38•11 years ago
|
||
also had to backout the other cset for test failures like https://tbpl.mozilla.org/php/getParsedLog.php?id=39997995&tree=Mozilla-Inbound
Assignee | ||
Comment 39•11 years ago
|
||
Hi Luke, would you mind to review again? Sorry I only tried on the B2G platforms last time, now I pushed full tries:
https://tbpl.mozilla.org/?tree=Try&rev=2d81378204ef
https://tbpl.mozilla.org/?tree=Try&rev=e0b4aa70c756
The changes to the previous patches (v2) are replacing header inclusions with prototype declarations. This consists with what exists in jsgc.c and seems to be an acceptable compromise (to introducing new or refactoring inclusion paths).
Attachment #8422241 -
Attachment is obsolete: true
Attachment #8422244 -
Attachment is obsolete: true
Attachment #8425482 -
Flags: review?(luke)
Assignee | ||
Comment 40•11 years ago
|
||
Attachment #8425484 -
Flags: review?(luke)
![]() |
||
Updated•11 years ago
|
Attachment #8425482 -
Flags: review?(luke) → review+
![]() |
||
Updated•11 years ago
|
Attachment #8425484 -
Flags: review?(luke) → review+
Assignee | ||
Updated•11 years ago
|
Keywords: checkin-needed
Comment 41•11 years ago
|
||
remote: https://hg.mozilla.org/integration/b2g-inbound/rev/4e0ddc525371
remote: https://hg.mozilla.org/integration/b2g-inbound/rev/53af735f5916
Keywords: checkin-needed
Comment 42•11 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/4e0ddc525371
https://hg.mozilla.org/mozilla-central/rev/53af735f5916
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla32
Reporter | ||
Comment 43•11 years ago
|
||
I can confirm that this resolves the issue on my Nexus 4.
Status: RESOLVED → VERIFIED
Comment 44•11 years ago
|
||
Do you think we can uplift this to 31?
Bug 990353 regressed devtools so that we can't see sources without crashing gaia on Fx31.
Blocks: 1018385
Assignee | ||
Comment 45•11 years ago
|
||
Yes, the patch only depends on Nuwa which is landed in 28 or 29.
You need to log in
before you can comment on or make changes to this bug.
Description
•