Closed Bug 1017490 Opened 10 years ago Closed 10 years ago

rewrite clock build script in javascript

Categories

(Firefox OS Graveyard :: Gaia::Build, defect)

x86_64
Linux
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: yurenju, Assigned: yurenju)

References

Details

Attachments

(3 files, 1 obsolete file)

we use Makefile just for r.js and I found a way to execute r.js in javascript, so let's rewrite it.
Assignee: nobody → yurenju.mozilla
James,

I'm trying to load r.js in gaia build system environment instead of executing it as a xpcshell script, now it can be required and optimize but as this configuration (apps/clock/build/require_config.jslike) I expect file "build_stage/clock/startup.js" should be concated with other files (about ~5000 lines) but actually not.

I'm not r.js expert and it's not easy to find the root cause of it for me. Can you point some ways which can solve this problem or should I check something first?

thank you!

https://github.com/yurenju/gaia/commit/a400f30fd3fccfb7a6901b24341cc62fb8e3655e

BTW, I found |requirejsAsLib = true| can load r.js as a library but how do I optimize app by r.js as a library?
Flags: needinfo?(jrburke)
I'm still a bit new to how script loading works with a subscript loader, but here is a script I can run in xpcshell that loads r.js, then uses requirejs.optimize to run the build, passing the build options as a JS object:

// Let r.js know it will be used as a library.
var requirejsAsLib = true;

// Load the script, this creates a requirejs global.
load('tools/r.js');

// Pass the require_config.jslike build options as the first argument,
// as a JS object. Second arg is success callback, third is error
// callback. Example build args from another project shown, just for
// illustration.
requirejs.optimize({
    // Pass logLevel 0 to see build output in console.
    logLevel: 0,
    "appDir": "www",
    "baseUrl": "lib",
    "paths": {
        "app": "../app"
    },
    "dir": "www-built",
    "modules": [
        {
            "name": "app"
        }
    ]
}, function(buildText) {
  dump('build OK\n');
}, function (err) {
  dump('ERROR: ' + err + '\n');
});
Flags: needinfo?(jrburke)
That's great! I'll try this way to optimize with r.js, thanks!
it works, thanks!
Alex, could you review this pr? I found a way to optimize by r.js without makefile and we can use the same way to other apps.

let's remove them :-)

BTW, I removed make_gaia_shared.js because it looks not necessary for now.
Attachment #8431413 - Flags: review?(poirot.alex)
now we have some issue for building clock app on Linux, I'll take a look on tuesday.
As bug 968661 comment 59 describes, the make_gaia_shared.js was broken in that bug's changeset, and is less than ideal due to the duplication results. 

If you are hoping to convert the other r.js-optimized apps to the approach in this bug, the email app does other work in its make_gaia_shared.js to generate the hash digest used for its cookie cache. So in that case, there will be other work than just deleting the make_gaia_shared.js. 

Although, ideally, restoring the feature that make_gaia_shared.js provided instead of needing the comments in the index.thml could be considered too.
Comment on attachment 8431413 [details] [review]
github PR: https://github.com/mozilla-b2g/gaia/pull/19804

Really great idea! Could you tell us where, in the reflactoring plan, this bug is? I don't see it on the wiki page, nor does it block any other bug. Are you trying to hunt all app Makefiles down? (+1 for yes)

Two things:
1) It ends up being significantly slower. For APP=clock make, on master it takes 9s and 11s on your branch. On master shared.js takes 0.6s and the call to r.js takes only 1.6s whereas the call to the new build.js file takes 4.1.

2) And I think it is related. I don't think we are doing the exact same thing in r.js. The output is different. The call to `requirejs.optimize()` take 4s.

The files ends up being different in stage.
* Good point: we no longer copy gaia_shared.json
* JS files have 'use strict'; being inserted almost everywhere
* CSS files have comments being kept
* build_stage/clock/style/clock.css is completely different, on your patch it has many @import url('../shared/style/*.css'); whereas it hasn't on master. On master all shared css seems to be merged into clock.css.
Attachment #8431413 - Flags: review?(poirot.alex) → review-
yes, I would like to remove all Makefiles and this is an experiment result for this purpose, bit I haven't had a plan yet for this changes, but the goal is building gaia in firefox, let's discuss it today.
'use strict' and css issues has been resolved, now I'm investigating speed issue.
we spend double time on those two methods when using |require()| for r.js instead of using xpcshell command line:
- parse.traverse();
- parse();

both of two way use Reflect parser, and I also dump stack but looks the same.

James, any clue for it?


last pr: https://github.com/mozilla-b2g/gaia/pull/19804
Flags: needinfo?(jrburke)
Attached patch rjs-time.patch (obsolete) — Splinter Review
this patch is used for build time measurement in r.js
more information, this is experiment result for loading shared/js/l10n.js

executing r.js by command line:
- definesRequire: 34ms
- parse: 45ms

require r.js for optimization:
- definesRequire: 102ms
- parse: 122ms
I would not expect much difference in times once the code is started up. I wonder if it is how the subscript loader is used to loader r.js? As in, maybe it sets an environment setting, either jit settings or something else that are different than a command line load?

For instance, if a plain script that just calls load() to evaluate r.js, as in Comment 2, that might be closer to how r.js operates when called through the command line. If the timings between that plain load() call are better than the subscript pathway, then I would suspect something with the subscript pathway. Nothing that r.js would explicitly detect, but something with how the environment runs in that subscript approach.
Flags: needinfo?(jrburke)
thanks James, I'll try to use load() for getting same environment as command line.
BTW, there is my last experiment result, I found the root cause of performance issue is in parse.traverse() and parse.recurse(), so it's not an issue for Reflect parser.

executing r.js by command line:
parse.traverse : 38ms
parse.recurse : 38ms

require r.js for optimization:
parse.traverse : 107ms
parse.recurse : 113ms

interesting thing is those two functions are pure js without any xpcom components.

And I also found it's not easy to use load() since it is implemented in C++[1].

Alex, do you have any idea?

[1] http://dxr.mozilla.org/mozilla-central/source/js/xpconnect/src/XPCShellImpl.cpp#318
Flags: needinfo?(poirot.alex)
try to export |load()| from Loader.Loader with globals but still not work :-/
tried another way -- remove requirejsAsLib and added ["-o", configPath]

but still spend two more seconds :(

https://github.com/yurenju/gaia/commit/92177fe0f4bf270426924ebd88327aeb18572db6
push another try since I got |console| not defined error on try server.

https://tbpl.mozilla.org/?tree=Gaia-Try&rev=8d071eca63b2
I haven't heard about any particular slowness regarding loadSubScript.

(In reply to Yuren [:yurenju] from comment #18)
> interesting thing is those two functions are pure js without any xpcom
> components.

Are you sure? It looks like traverse and recurse are using Reflect.jsm:
http://mxr.mozilla.org/mozilla-central/source/toolkit/components/reflect/reflect.jsm

But both command line and xpcshell are using it, so I'm clueless.

(In reply to Yuren [:yurenju] from comment #19)
> try to export |load()| from Loader.Loader with globals but still not work :-/

You would first have to expose load from xpcshell-common.js:
+++ b/build/xpcshell-commonjs.js
@@ -10,6 +10,8 @@ let { Loader } = Cu.import(loaderURI, {});
 Cu.import('resource://gre/modules/Services.jsm');
 Cu.import("resource://gre/modules/FileUtils.jsm");
 
+let xpcshellScope = this;
+
 var CommonjsRunner = function(module) {
   const GAIA_DIR = env.get('GAIA_DIR');
   const APP_DIR = env.get('APP_DIR');
@@ -39,7 +41,8 @@ var CommonjsRunner = function(module) {
   let loader = Loader.Loader({
     paths: paths,
     modules: {
-      'toolkit/loader': Loader
+      'toolkit/loader': Loader,
+      'xpcshell-scope': xpcshellScope

And then do: require("xcpshell-scope").load(...)
But load doesn't return any object nor accept any object as input and just uses and pollute the original xpcshell scope.

Have you tried only replacing direct r.js loading with loadSubScript, with something like this:
  $(XULRUNNERSDK) $(XPCSHELLSDK) ../../build/r.js -o build/require_config.jslike

With something minimal that:
  $(XULRUNNERSDK) $(XPCSHELLSDK) test.js -o build/require_config.jslike

test.js:
  Services.scriptloader.loadSubScript("file:///home/gaia/r.js");
Also, even if files are no longer different in stage, I'm still wondering if we are really executing the same methods/optimization/code in r.js.
I looked at config object between master and your patch and see differences:
* fileExclusionRegExp: is an object that looks empty (via JSON.stringify, so it can be a regexp...) in your branch whereas it is a string "^\\.|^test$|^build$" on master.
* logLevel is 0 on master and not defined on your branch
* buildFile is set only in your branch to "build/require_config.jslike"
* mainConfigFile is an absolute path on master whereas it is "../js/require_config.js" on your branch

Also I've traced call to readFile and I see more call on master than your branch (good sign). On master we are reading files from source tree AND build stage, whereas on your branch we are only reading files from build stage!

r.js looks like a black box, which takes a lot of time, now it is the slower step of the build system (even without this patch)!
We have to get a clear picture of each of its features: what it does and how.
Flags: needinfo?(poirot.alex)
(In reply to Alexandre Poirot (:ochameau) from comment #24)
> r.js looks like a black box, which takes a lot of time, now it is the slower
> step of the build system (even without this patch)!
> We have to get a clear picture of each of its features: what it does and how.

Keep in mind that r.js as stored in-tree is a concatenated distribution build.  If you want to look at the source components of it (including the optimizer which is likely where the time is actually being spent), it comes from: https://github.com/jrburke/r.js

Documentation on the optimizer features can be found at http://requirejs.org/docs/optimization.html
try to debug before soccer game but it's too late here :S

I will check config between previous way and this pr, hope I can find some clue tomorrow.
(In reply to Alexandre Poirot (:ochameau) from comment #23)
> With something minimal that:
>   $(XULRUNNERSDK) $(XPCSHELLSDK) test.js -o build/require_config.jslike
> 
> test.js:
>   Services.scriptloader.loadSubScript("file:///home/gaia/r.js");

that performance is good even I put |global| in second parameter.
Attached patch rjs-time2.patchSplinter Review
I still believe the problem is in parse.traverse()[1] and attachment is patch for mesure speed for parse.traverse().

[1] https://github.com/mozilla-b2g/gaia/blob/master/build/r.js#L20971-L20991
Attachment #8437572 - Attachment is obsolete: true
checked fileExclusionRegExp, mainConfigFile they are same, I tried using relative path for buildFile and result is same.

for current pr, count for calling readFile also is the same :-(

https://github.com/mozilla-b2g/gaia/pull/19804
Attached patch Reflect hackSplinter Review
Yuren, you were right, the biggest problem is Reflect. (Thanks for having narrowed down to that!)
I got this script running way faster with this hack.
It looks like there is some wrapper issues around Reflect.jsm implementation...

And good news, with this patch, your patch is now faster than master by 10% (3.2s to 2.9s on my laptop) \o/
OH MY GOD, thank you!!! I spent a lot of time to trace this issue.
and we still have a problem of Segmentation fault when executing xpcshell, I have requested a loaner machine for it on bug 1024948
I can't reproduce Segmentation fault on loaner machine, I will push another pr to try server.
same result.

Hi John Ford,

I don't know who should I ask, so feel free to redirect this needinfo?. I got Segmentation fault on gaia try server[1] but it can't be reproduced on loaner machine even I use the same xulrunner-sdk.

at first, I suspected that is customized xulrunner-sdk issue because Gb (build test) is passed but G (unit test) is failed, and we both need to build "clock" app which is the build process with Segmentation fault for Gb and G, and I found we use USE_LOCAL_XUNRUNNER_SDK=1 for G but doesn't use it for Gb.

after trying to reproduce it on loaner machine and I think that is not root cause since I can't reproduce it with USE_LOCAL_XUNRUNNER_SDK=1.

any idea for that? or do you know who added G (unit test) and Gb (build test) for gaia try server?

[1] https://tbpl.mozilla.org/php/getParsedLog.php?id=41937485&tree=Gaia-Try&full=1#error0
Flags: needinfo?(jhford)
also needinfo? Jonathan Griffi since I feel Jonathan may know about gaia try server environment
Flags: needinfo?(jgriffin)
Yuren,

I wonder if the copy of xulrunner-sdk that's on the test machines is busted or outdated?  Try applying this patch which ignores localxulrunner and submitting to try:

diff --git a/Makefile b/Makefile
index 83daad5..7b4826e 100644
--- a/Makefile
+++ b/Makefile
@@ -624,7 +624,6 @@ print-xulrunner-sdk:

 $(XULRUNNER_BASE_DIRECTORY):
        @echo "XULrunner directory: $(XULRUNNER_DIRECTORY)"
-ifndef USE_LOCAL_XULRUNNER_SDK
 ifneq ($(XULRUNNER_SDK_DOWNLOAD),$(shell test -d $(XULRUNNER_DIRECTORY) && cat $(XULRUNNER_URL_FILE) 2> /dev/null))
 # must download the xulrunner sdk
        rm -rf $(XULRUNNER_BASE_DIRECTORY)
@@ -643,7 +642,6 @@ else
 endif # MINGW32
        @echo $(XULRUNNER_SDK_DOWNLOAD) > $(XULRUNNER_URL_FILE)
 endif # XULRUNNER_SDK_DOWNLOAD
-endif # USE_LOCAL_XULRUNNER_SDK

 # Optional files that may be provided to extend the set of default
 # preferences installed for gaia.  If the preferences in these files
Flags: needinfo?(jhford)
Also, it looks like the infrastructure is pulling down the wrong Xulrunner bundle?

The logs show that it's pulling a xulrunner that has a sha512 of d4a0da54e75c27cd2f535e66b586f119ef08b3bde4a9eee03662d296b3434189c542c0a7e7a75954030c04396a9823e22e1f884f5d87c0f4017944cd50ff38de

which doesn't seem to match the xulrunner that the makefile specifies:

jhford:~ $ cd junk
jhford:~/junk $ curl -LO http://ftp.mozilla.org/pub/mozilla.org/xulrunner/nightly/2014/03/2014-03-08-03-02-03-mozilla-central/xulrunner-30.0a1.en-US.linux-x86_64.tar.bz2
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 25.2M  100 25.2M    0     0  2349k      0  0:00:10  0:00:10 --:--:-- 2421k
jhford:~/junk $ openssl dgst -sha512 xulrunner-30.0a1.en-US.linux-x86_64.tar.bz2
SHA512(xulrunner-30.0a1.en-US.linux-x86_64.tar.bz2)= 5b18be7f5502395ec0c13cdfdf37cc8cc52fc62ec8414aa263bad0643107e853bb120ba6057f8366c3f7ff82481b8408f662259d652fec54d1dd3cfc20d0d048
jhford:~/junk $ curl -LO http://ftp.mozilla.org/pub/mozilla.org/xulrunner/nightly/2014/03/2014-03-08-03-02-03-mozilla-central/xulrunner-30.0a1.en-US.linux-i686.tar.bz2
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 26.0M  100 26.0M    0     0  2322k      0  0:00:11  0:00:11 --:--:-- 2301k
jhford:~/junk $ openssl dgst -sha512 xulrunner-30.0a1.en-US.linux-i686.tar.bz2
SHA512(xulrunner-30.0a1.en-US.linux-i686.tar.bz2)= 63a2929c13e3ec73e90be204e5446731e03d23330b5a243cfd3cf4bbef2b43d326a1c8b4a350ff71b8c1ee3e0ce00899c622ab8ca23e7332f1f5eb56446e2a7d


It could always be that the differences are because Releng is repacking xulrunner, but I bet it's that we're using an out of date/wrong xulrunner.
I agree this is the issue.  The xre.zip we're using in buildbot is currently v26.
Flags: needinfo?(jgriffin)
I can't reproduce Segmentation fault with xulrunner-sdk-26, now I'm trying to remove USE_LOCAL_XUNRUNNER_SDK and test on try server.
Surprised me, looks like a python issue.
I can reproduce on my box and this looks like a mozharness issue, I used a python script[1] which using pdb to get same arguments from mozharness/base/script.py@run_command[2] on linux64 box.

John Ford, any thought for this?

[1] https://gist.github.com/yurenju/4abf50117c48478288d4
[2] http://hg.mozilla.org/build/mozharness/file/3347b848256c/mozharness/base/script.py#l688
Flags: needinfo?(jhford)
Comment on attachment 8431413 [details] [review]
github PR: https://github.com/mozilla-b2g/gaia/pull/19804

since Segmentation fault looks like a mozharness issue, I would like to review this pr first and file another mozharness issue if needed.

Alex, could you review this pr? thanks!
Attachment #8431413 - Flags: review- → review?(poirot.alex)
Comment on attachment 8431413 [details] [review]
github PR: https://github.com/mozilla-b2g/gaia/pull/19804

\o/ for an even faster build system!
We would still need to figure out what is being done in r.js and why it takes so long.

r+ as long as you find a way to keep the tree green while landing.
Attachment #8431413 - Flags: review?(poirot.alex) → review+
(In reply to Yuren [:yurenju] from comment #44)
> I can reproduce on my box and this looks like a mozharness issue, I used a
> python script[1] which using pdb to get same arguments from
> mozharness/base/script.py@run_command[2] on linux64 box.
> 
> John Ford, any thought for this?
> 
> [1] https://gist.github.com/yurenju/4abf50117c48478288d4
> [2]
> http://hg.mozilla.org/build/mozharness/file/3347b848256c/mozharness/base/
> script.py#l688

That's not a python or mozharness segfault, that's a segfault in whatever make is invoking.  Also, using your script, I am not able to reproduce the segfault using the the xulrunner specified in the makefiles.

This is the output of the script on my machine:

jhford:~/b2g/gaia $ python test.py
Makefile:135: NO_FTU_PING=1
/bin/bash: --version: command not found
run-js-command gaia/preferences
run-js-command gaia/pre-app
[svoperapps.js] PROFILE_DIR, GAIA_DISTRIBUTION_DIR, VARIANT_PATH are all required
copy bluetooth to build_stage/
....
Flags: needinfo?(jhford)
Sadly, I can reproduce this problem on my machine; it goes away if I swap subprocess.call() for subprocess.Popen().  Which OS are you using jhford?  I'm on Ubuntu 12.04.

It's hard to say exactly where the bug is.  Python is invoking a make file which invokes xpcshell with some JS files.
I am on OS X 10.9 running python 2.7.5.  I'll try this out on my Fedora 20 machine in a bit when I get home.
John Ford, this issue only appears on linux64, my mac is also fine.
filed bug 1028816 for mozharness.
I seperate it to two pull request:

[part 1] add r-wrapper.js in build/ directory
https://github.com/mozilla-b2g/gaia/pull/20949

[part 2] rewrite clock build script in javascript
https://github.com/mozilla-b2g/gaia/pull/19804

and I will land part 1 today and it won't cause bug 1028816, part 2 will be land until bug 1028816 is fixed. that can help us to use r-wrapper for other makefiles.
Comment on attachment 8431413 [details] [review]
github PR: https://github.com/mozilla-b2g/gaia/pull/19804

Hi :mcav, I'm working on rewrite clock Makefile in javascript to get faster build process and have the compatibility to build gaia in firefox exntension.

could you take a look to this pr? thanks!
Attachment #8431413 - Flags: feedback?(m)
:jrburke told me we should fix the regression for make_gaia_shared.js on bug 1029967, so I filed bug 1032681 to fix it and restored make_gaia_shared.js back on my pull request.
merged.

https://github.com/mozilla-b2g/gaia/commit/aca5b50cec6012621cc0be765d41f971b485b0b0
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: