Closed Bug 1053703 Opened 10 years ago Closed 10 years ago

Merge pre-app.js, app.js and post-app.js to one javascript file

Categories

(Firefox OS Graveyard :: Gaia::Build, defect)

x86
macOS
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: yurenju, Assigned: rickychien)

References

Details

Attachments

(1 file)

After landing bug 1029385, we can merge pre-app.js, app.js and post-app.js into one JavaScript file to get a faster build time. This change may also fix some intermittent issues like bug 1022192. In this bug we can get the settings object from build/settings.js and pass it to build/copy-build-stage-data.js. That may solve the intermittent issue, since we would no longer write a temporary file to build_stage/settings_stage.json and read it back in build/copy-build-stage-data.js.
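To illustrate the idea of handing the settings over in memory instead of through build_stage/settings_stage.json, here is a minimal sketch. It is not the actual Gaia code: `generateSettings`, `copyBuildStageData`, `execute`, and the option and setting names are hypothetical stand-ins, and the real modules run under xpcshell with their own file APIs.

```javascript
// Hypothetical stand-in for build/settings.js: returns the settings object
// instead of only writing it to a temporary JSON file.
// The setting keys below are illustrative, not the real Gaia settings list.
function generateSettings(options) {
  return {
    'language.current': options.locale || 'en-US',
    'debug.enabled': Boolean(options.debug),
  };
}

// Hypothetical stand-in for build/copy-build-stage-data.js: receives the
// settings as an argument, so it never has to re-read a file that another
// step may not have finished writing.
function copyBuildStageData(options, settings) {
  return {
    profileDir: options.profileDir,
    settingsJson: JSON.stringify(settings),
  };
}

// Single entry point: both steps run in one script and share the object.
function execute(options) {
  const settings = generateSettings(options);
  return copyBuildStageData(options, settings);
}
```

With a single script, the dependency between the two steps is an ordinary function argument rather than a file on disk, which removes one place where timing could matter.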
Assignee: nobody → yurenju.mozilla
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Hi Tomcat, I ran the same build on the try server and the results are all green, so I think this is an intermittent issue with the build machine infrastructure, since none of the l10n files for the manifests were found.
I will land it again and keep an eye on try server for b2g-inbound.
After getting a loaner machine to find the root cause, I found this is a xulrunner-sdk issue: I don't see any problem when using the latest master with the latest b2g-sdk (we switched from xulrunner-sdk to b2g-sdk in bug 1002545), and I can reproduce the issue when using the old xulrunner-sdk.
Status: REOPENED → RESOLVED
Closed: 10 years ago → 10 years ago
Resolution: --- → FIXED
Still have some issues with the emulator build, reverting.
Flags: needinfo?(yurenju.mozilla)
Reverted this for emulator and device bustages like https://tbpl.mozilla.org/php/getParsedLog.php?id=47866875&tree=B2g-Inbound
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Thanks, Tomcat. I was trying to revert it from my home PC, but it took a long time to fetch the code.
I made a mistake when rebasing the branch; the last commit in the pull request should be okay. It will land tomorrow morning and I will keep an eye on b2g-inbound.
Status: REOPENED → RESOLVED
Closed: 10 years ago → 10 years ago
Resolution: --- → FIXED
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Ricky will help on this issue.
Flags: needinfo?(ricky060709)
I will take over this issue as soon as possible.
Assignee: yurenju → ricky060709
Flags: needinfo?(ricky060709)
Rebased yuren's PR onto master and fixed some conflicts. Pushed to try and waiting for it to go all green. https://tbpl.mozilla.org/?tree=Try&rev=a8542134db56
1.) Comment in the bug when you push a patch. Leaving no indication in the bug that anything happened isn't acceptable.
2.) Backed out for breaking tests on B2G desktop builds (again). I tried clobbering, but the failures persisted.
https://github.com/mozilla-b2g/gaia/commit/af5703b967e45816b30546faf0753a655878070e
https://treeherder.mozilla.org/ui/#/jobs?repo=b2g-inbound&revision=9fef147b2da0
Comparing the test logs of "b2g_ubuntu64_vm try opt test mochitest-1" between try and inbound, I found the differences below.
try, result: PASS
06:27:52 INFO - JavaScript error: file:///builds/slave/test/build/application/b2g/components/nsHandlerService.js, line 120: NS_ERROR_FAILURE: Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIProperties.get]
06:27:52 INFO - JavaScript error: file:///builds/slave/test/build/application/b2g/components/nsHandlerService.js, line 120: NS_ERROR_FAILURE: Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIProperties.get]
06:27:52 INFO - JavaScript error: file:///builds/slave/test/build/application/b2g/components/nsHandlerService.js, line 120: NS_ERROR_FAILURE: Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIProperties.get]
06:27:52 INFO - JavaScript error: file:///builds/slave/test/build/application/b2g/components/nsHandlerService.js, line 120: NS_ERROR_FAILURE: Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIProperties.get]
06:27:52 INFO - JavaScript error: file:///builds/slave/test/build/application/b2g/components/nsHandlerService.js, line 120: NS_ERROR_FAILURE: Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIProperties.get]
06:27:52 INFO - JavaScript error: file:///builds/slave/test/build/application/b2g/components/nsHandlerService.js, line 120: NS_ERROR_FAILURE: Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIProperties.get]
06:27:53 INFO - JavaScript error: file:///builds/slave/test/build/application/b2g/components/nsHandlerService.js, line 120: NS_ERROR_FAILURE: Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIProperties.get]
06:27:53 INFO - JavaScript error: file:///builds/slave/test/build/application/b2g/components/nsHandlerService.js, line 120: NS_ERROR_FAILURE: Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIProperties.get]
in-bound, result: failed
15b741e1ba10}
10:45:21 INFO - Exception in thread Thread-5:
10:45:21 INFO - Traceback (most recent call last):
10:45:21 INFO - File "/usr/lib/python2.7/threading.py", line 551, in __bootstrap_inner
10:45:21 INFO - self.run()
10:45:21 INFO - File "/usr/lib/python2.7/threading.py", line 504, in run
10:45:21 INFO - self.__target(*self.__args, **self.__kwargs)
10:45:21 INFO - File "runtestsb2g.py", line 303, in runMarionetteScript
10:45:21 INFO - script_args=test_script_args)
10:45:21 INFO - File "/builds/slave/test/build/venv/local/lib/python2.7/site-packages/marionette/marionette.py", line 1221, in execute_script
10:45:21 INFO - filename=os.path.basename(frame[0]))
10:45:21 INFO - File "/builds/slave/test/build/venv/local/lib/python2.7/site-packages/marionette/decorators.py", line 35, in _
10:45:21 INFO - return func(*args, **kwargs)
10:45:21 INFO - File "/builds/slave/test/build/venv/local/lib/python2.7/site-packages/marionette/marionette.py", line 638, in _send_message
10:45:21 INFO - self._handle_error(response)
10:45:21 INFO - File "/builds/slave/test/build/venv/local/lib/python2.7/site-packages/marionette/marionette.py", line 686, in _handle_error
10:45:21 ERROR - raise errors.JavascriptException(message=message, status=status, stacktrace=stacktrace)
10:45:21 ERROR - JavascriptException: JavascriptException: TypeError: container is null
10:45:21 INFO - stacktrace:
10:45:21 INFO - execute_script @runtestsb2g.py, line 303
10:45:21 INFO - inline javascript, line 66
I can reproduce the failure below with a local "make":
System JS : ERROR file:///Users/tuangeorge/code/gaia/b2g_sdk/34.0a1-2014-08-12-04-02-01/B2G.app/Contents/MacOS/components/nsHandlerService.js:120 - NS_ERROR_FAILURE: Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIProperties.get]
System JS : ERROR file:///Users/tuangeorge/code/gaia/b2g_sdk/34.0a1-2014-08-12-04-02-01/B2G.app/Contents/MacOS/components/nsHandlerService.js:120 - NS_ERROR_FAILURE: Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIProperties.get]
System JS : ERROR file:///Users/tuangeorge/code/gaia/b2g_sdk/34.0a1-2014-08-12-04-02-01/B2G.app/Contents/MacOS/components/nsHandlerService.js:120 - NS_ERROR_FAILURE: Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIProperties.get]
Sorry, comment 26 should be a different case; it also happens on master.
Status: REOPENED → RESOLVED
Closed: 10 years ago → 10 years ago
Resolution: --- → FIXED
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Hey guys, since this change has now been backed out several times, would it be possible to have some try runs before we attempt to land this change again? Thanks!
And please do not land this again before figuring out the failure highlighted by the various backouts! One way to figure this out would be to first manage to get an orange try, without modifying the existing patch. Then once we have it failing, modify the patch to make it green.
I just put $(STAGE_DIR) back into "app:", since I guess the logic of this patch is almost the same as before except that $(STAGE_DIR) had been removed. It is too hard to reproduce this bug, even though I borrowed a machine to try to reproduce it. A release engineer told me we have the same instance and environment on the loaner server, Try and B2g-inbound, so I cannot figure out why it passes on the loaner server and Try but fails on B2g-inbound. Alexandre, do you have any idea?
Flags: needinfo?(poirot.alex)
Just pushed to try, rebased on the latest gaia and against the latest gecko: https://tbpl.mozilla.org/?tree=Try&rev=f7c2ff6be0e8 If that run is green I would like to hear from releng what could possibly be different between this run and b2g-inbound. I don't have any idea. Having a green try doesn't mean we can move forward; we have to figure out what is going on there.
Flags: needinfo?(poirot.alex)
OK, so a quick debrief on what I found so far. Like the previous try runs, this one is also green. But looking at the inbound failure, I found some issues:
1) The b2g package has a broken gaia profile with just the settings and pref files. No webapps folder!! But there isn't any explicit error/exception suggesting the profile is broken:
http://ftp.mozilla.org/pub/mozilla.org/b2g/tinderbox-builds/b2g-inbound-linux64_gecko/1412764697/b2g-inbound-linux64_gecko-bm71-build1-build1482.txt.gz
http://ftp.mozilla.org/pub/mozilla.org/b2g/tinderbox-builds/b2g-inbound-linux64_gecko/1412764697/b2g-35.0a1.multi.linux-x86_64.tar.bz2
2) It seems to be related to multilocale; I was able to reproduce such a broken profile by running a multilocale build.
Here is the exception I got when running multilocale, which also appears in the inbound logs:
Invalid JSON file : /builds/slave/b2g-in-l64_g-00000000000000000/build/gaia/build_stage/email/manifest.webapp
Exception: Error: -*- build/utils.js: [Exception... "Component returned failure code: 0x80520012 (NS_ERROR_FILE_NOT_FOUND) [nsIFileInputStream.init]" nsresult: "0x80520012 (NS_ERROR_FILE_NOT_FOUND)" location: "JS frame :: resource://gre/modules/commonjs/toolkit/loader.js -> file:///builds/slave/b2g-in-l64_g-00000000000000000/build/gaia/build/utils-xpc.js :: getFileContent :: line 88" data: no] file not found: /builds/slave/b2g-in-l64_g-00000000000000000/build/gaia/build_stage/email/manifest.webapp
getFileContent@resource://gre/modules/commonjs/toolkit/loader.js -> file:///builds/slave/b2g-in-l64_g-00000000000000000/build/gaia/build/utils-xpc.js:104:11
getJSON@resource://gre/modules/commonjs/toolkit/loader.js -> file:///builds/slave/b2g-in-l64_g-00000000000000000/build/gaia/build/utils-xpc.js:211:15
localizeManifest@resource://gre/modules/commonjs/toolkit/loader.js -> file:///builds/slave/b2g-in-l64_g-00000000000000000/build/gaia/build/multilocale.js:244:20
localize@resource://gre/modules/commonjs/toolkit/loader.js -> file:///builds/slave/b2g-in-l64_g-00000000000000000/build/gaia/build/multilocale.js:225:5
execute/<@resource://gre/modules/commonjs/toolkit/loader.js -> file:///builds/slave/b2g-in-l64_g-00000000000000000/build/gaia/build/multilocale.js:492:1
execute@resource://gre/modules/commonjs/toolkit/loader.js -> file:///builds/slave/b2g-in-l64_g-00000000000000000/build/gaia/build/multilocale.js:480:3
execute@resource://gre/modules/commonjs/toolkit/loader.js -> file:///builds/slave/b2g-in-l64_g-00000000000000000/build/gaia/build/post-app.js:14:5
exports.execute@resource://gre/modules/commonjs/toolkit/loader.js -> file:///builds/slave/b2g-in-l64_g-00000000000000000/build/gaia/build/app.js:44:3
CommonjsRunner.prototype.run@/builds/slave/b2g-in-l64_g-00000000000000000/build/gaia/build/xpcshell-commonjs.js:109:5
run@/builds/slave/b2g-in-l64_g-00000000000000000/build/gaia/build/xpcshell-commonjs.js:124:3
@-e:1:1
System JS : ERROR resource://gre/modules/commonjs/toolkit/loader.js -> file:///builds/slave/b2g-in-l64_g-00000000000000000/build/gaia/build/utils-xpc.js:104 - Error: -*- build/utils.js: [Exception... "Component returned failure code: 0x80520012 (NS_ERROR_FILE_NOT_FOUND) [nsIFileInputStream.init]" nsresult: "0x80520012 (NS_ERROR_FILE_NOT_FOUND)" location: "JS frame :: resource://gre/modules/commonjs/toolkit/loader.js -> file:///builds/slave/b2g-in-l64_g-00000000000000000/build/gaia/build/utils-xpc.js :: getFileContent :: line 88" data: no] file not found: /builds/slave/b2g-in-l64_g-00000000000000000/build/gaia/build_stage/email/manifest.webapp
I have some idea why such an error would happen, but it's still not clear why it would happen only on inbound. I think this patch introduces some races, but that is nothing particular to the environment... Could it be that inbound uses slaves that are significantly slower/faster than the try ones?!!
The try server appears to allocate VM resources arbitrarily. For example: tst-linux64-spot-298 tst-linux64-spot-402 ... So I guess it's quite possible that all Try VMs were allocated slower/faster resources than B2g-Inbound, since Try passed but B2g-Inbound failed every time. We can contact releng to understand the differences in VM resources and further request a B2g-Inbound-like VM to reproduce this issue.
Here is a patch to fix this race: https://github.com/ochameau/gaia/commit/7115cb962510234367c8a8e0dc52610edc25cf23 But I would really like to see an orange try without this patch and a green one with it...
Hey Chris Cooper, could you explain in more detail the differences in VM resources mentioned in comment 37 and comment 38?
Flags: needinfo?(coop)
At least this patch fixes multilocale builds locally:
LOCALES_FILE="locales/languages_dev.json" LOCALE_BASEDIR="~/gaia-l10n" make
Without this additional patch, the profile was broken, without any webapps directory. The good news is that bug 1074200 is going to make the build system quit with an error code (I verified: it stops the build during the multilocale step), so that we would immediately and explicitly know that something went wrong!
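The "quit with an error code" behaviour can be pictured with a small sketch. This is only an illustration of the fail-fast idea, not the change made in bug 1074200: `runSteps` and the step names are hypothetical, and the commented wiring assumes Node.js, whereas the real Gaia build ran under xpcshell.

```javascript
// Hypothetical runner: executes build steps in order and turns the first
// uncaught exception into a non-zero exit code instead of carrying on.
function runSteps(steps) {
  for (const [name, step] of steps) {
    try {
      step();
    } catch (err) {
      console.error(`Build step "${name}" failed: ${err.message}`);
      return 1; // stop here; later steps must not run on a broken stage
    }
  }
  return 0;
}

// Example wiring (assumes Node.js; localize and copyProfile are placeholders):
// process.exitCode = runSteps([['multilocale', localize], ['copy', copyProfile]]);
```

The point of returning early is that a failure in the multilocale step surfaces as a failed build, rather than as a profile that silently lacks its webapps directory.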
(In reply to Ricky Chien [:rickychien] from comment #40)
> Hey Chris Cooper,
>
> Could you explain more detail about the differences in VM resources as we
> mentioned on comment 37 and comment 38?

IIRC we decided that the fault was in the builds, since there were uncaught exceptions in the b2g-inbound builds that were not present in the try build.

We don't exclusively use AWS slaves for building. We also maintain a handful of in-house hardware machines to provide basic baseline capacity in the event of a catastrophic AWS failure.

Looking at the selection of build jobs you posted this morning, namely:
https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=f7c2ff6be0e8
https://treeherder.mozilla.org/ui/#/jobs?repo=b2g-inbound&revision=d77a66154bbc

Your try build job ended up on an in-house hardware machine - b-linux64-hp-0019:
http://ftp.mozilla.org/pub/mozilla.org/b2g/try-builds/apoirot@mozilla.com-f7c2ff6be0e8/try-linux64_gecko/try-linux64_gecko-bm87-try1-build2847.txt.gz

All three of the b2g-inbound build jobs ended up on AWS instances - bld-linux64-spot-107, bld-linux64-spot-055, bld-linux64-spot-018:
http://ftp.mozilla.org/pub/mozilla.org/b2g/tinderbox-builds/b2g-inbound-linux64_gecko/1412764697/b2g-inbound-linux64_gecko-bm94-build1-build1317.txt.gz
http://ftp.mozilla.org/pub/mozilla.org/b2g/tinderbox-builds/b2g-inbound-linux64_gecko/1412764697/b2g-inbound-linux64_gecko-bm71-build1-build1481.txt.gz
http://ftp.mozilla.org/pub/mozilla.org/b2g/tinderbox-builds/b2g-inbound-linux64_gecko/1412764697/b2g-inbound-linux64_gecko-bm71-build1-build1482.txt.gz

Comment #39 indicates that there is a race condition that needs to be fixed. The hardware machines will always be faster than the VMs. If there are timing-sensitive tests involved, they should be removed or fixed.
Flags: needinfo?(coop)
Thanks! IIRC my loaner machine is an AWS instance, so why can't I reproduce this problem? Can we request an AWS instance (bug 1074655) running a b2g-inbound build to track down this issue? Otherwise, the only thing we can do is land it again and back it out again. :(
Flags: needinfo?(coop)
(In reply to Ricky Chien [:rickychien] from comment #43)
> Can we request an AWS instance (bug 1074655) running b2g-inbould build for
> catching up this issue. Otherwise, the only way we can do is land it again
> and back out again. :(

I'm getting a build instance setup in bug 1074655.
Flags: needinfo?(coop)
Wow! Treeherder is green. Thanks Alex and George!
Status: REOPENED → RESOLVED
Closed: 10 years ago → 10 years ago
Resolution: --- → FIXED
Great job! I still think this bug is quite weird. We checked the failing test and found that there's no webapp folder in the build dir. The profile looks like the app building process never ran, which is why we cannot find manifest.webapp in the build_stage folder. The fix in the latest patch makes sure that each of pre-app, apps and post-app runs to completion in the current thread and in order. As far as I know, all the code in xpcshell should run in sequence. What kind of usage might break this rule?
(In reply to George Duan [:gduan] [:喬智] from comment #48)
> Great job!
>
> I still think this bug is quite wierd. We check the failure test and found
> out there's no webapp folder in build-dir. The profile looks like the app
> building process has never run, so that we cannot find manifest.webapp in
> build_stage folder. The fix in latest patch would make sure each of pre-app,
> apps and post-app has run completely in current thread and by order.
>
> As I know, all the code in xpcshell should run in sequence. what kinda usage
> might break this rule?

First thing is that bug 1074200 is going to make the build fail explicitly if any such exception happens again. Otherwise, the precise issue here was that the apps rule was still executing while post-app was running. It ended up throwing an exception in multilocale because of a missing file. That in turn prevented copying the stage to the profile (see post-app.js). xpcshell code runs in sequence like any JS file; there is nothing specific to xpcshell. But some code is asynchronous, like the email build.js file. See apps/email/build/build.js and its execute method with many asynchronous promises. We could fix that issue without the magic of processEvents by waiting for completion of the promise returned by execute from app.js.
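The ordering problem described above can be sketched in plain JavaScript. This is a minimal illustration under stated assumptions, not the Gaia build code: `preApp`, `buildApps`, `postApp`, and the in-memory `stage` object are hypothetical stand-ins for pre-app.js, the per-app build scripts, and post-app.js, and a timer stands in for the asynchronous work in an app's build script.

```javascript
// Hypothetical in-memory stand-in for the build_stage directory.
const stage = { files: new Set() };

// Stand-in for pre-app: start from a clean stage.
function preApp() {
  stage.files.clear();
}

// Stand-in for an app build script with asynchronous work: the manifest
// only appears in the stage once the returned promise resolves.
function buildApps() {
  return new Promise((resolve) => {
    setTimeout(() => {
      stage.files.add('email/manifest.webapp');
      resolve();
    }, 10);
  });
}

// Stand-in for post-app (multilocale, then copying the stage to the
// profile): it throws if the staged manifest is missing.
function postApp() {
  if (!stage.files.has('email/manifest.webapp')) {
    throw new Error('file not found: email/manifest.webapp');
  }
  return 'profile copied';
}

// Racy version: postApp runs while buildApps is still pending.
function runWithoutWaiting() {
  preApp();
  buildApps(); // returned promise is ignored
  return postApp();
}

// Ordered version: each step completes before the next one starts.
async function runInOrder() {
  preApp();
  await buildApps();
  return postApp();
}
```

In this sketch `runWithoutWaiting()` throws because the manifest has not been staged yet, while `runInOrder()` waits for the promise and then proceeds, which is the "wait for promise completion" alternative mentioned above.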
In reply to comment 49: I knew there might be something wrong in apps.js, but I never thought of promises... So promises might behave differently depending on the machine, and we should probably implement the promise pattern in most of our build scripts just in case.
It's not that promises behave differently; it's more about asynchronous code being slower/faster on some machines. Promises are just one possible abstraction out of many to handle async code.
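A small sketch of why identical code can pass on one machine and fail on another: the two delays below are hypothetical stand-ins for how long an asynchronous app build and the following step take on a given machine. `simulateBuild` is an illustrative name, not part of the Gaia build.

```javascript
// Hypothetical simulation: writeDelayMs models how long the asynchronous
// app build takes, readDelayMs models when the next step reads the result.
function simulateBuild(writeDelayMs, readDelayMs) {
  return new Promise((resolve) => {
    let manifestWritten = false;
    // Asynchronous app build finishes after writeDelayMs.
    setTimeout(() => { manifestWritten = true; }, writeDelayMs);
    // The later step reads the manifest without waiting for the write.
    setTimeout(() => {
      resolve(manifestWritten ? 'ok' : 'file not found');
    }, readDelayMs);
  });
}
```

Here `simulateBuild(5, 50)` resolves to 'ok' and `simulateBuild(50, 5)` resolves to 'file not found': the code is the same in both calls, only the relative timing changes.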
I was reading MDN https://developer.mozilla.org/en-US/Firefox_OS/Developing_Gaia/Build_System_Primer#Build_process and noticed this bug. Is anyone planning to update the build process explanation there? That would be a great help. Thanks!