Closed Bug 1221510 Opened 9 years ago Closed 9 years ago

Unable to download builds when the link contains a bewit

Categories

(Firefox OS Graveyard :: Bitbar, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: jlorenzo, Unassigned)

References

Details

Attachments

(4 files)

Attached file flash-421838.log
In order to use the builds coming from TaskCluster (bug 1219290), we have a new URL to provide to the flash script. For example:
> https://queue.taskcluster.net/v1/task/TU4XRN8FQz2v0JRtzoYYCw/artifacts/private/build/flame-kk.zip?bewit=X3JVeVhDTHRUMFNTRHczN1BYRklGd1wxNDQ2NjM4NTEzXHBUWlI5c3hYdWFGaU5xZmpEeEJRNjJWbjVuUlBsbXZwb05FZ3haVWI4ZGM9XGUzMD0
The bewit [1] part is required in order to authenticate to TaskCluster. The flash logs and the Jenkins logs are attached. An interesting part comes from the flash logs:
> Downloading https://queue.taskcluster.net/v1/task/TU4XRN8FQz2v0JRtzoYYCw/artifacts/private/
> build/flamekk.zip?bewit=X3JVeVhDTHRUMFNTRHczN1BYRklGd1wxNDQ2NjM4NTEzXHBUWlI5c3hYdWFGaU5
> xZmpEeEJRNjJWbjVuUlBsbXZwb05FZ3haVWI4ZGM9XGUzMD0 failed!
> Build step 'Download file from file server' marked build as failure
> Publishing results for test run 421843, device session 198004 to http://10.1.2.9/cloud
I'm not too sure what is behind the download step.
[1] https://github.com/hapijs/hapi-auth-hawk#bewit-authentication
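For anyone debugging a failed download, it can help to look inside the bewit and check whether it has already expired. The following is only a sketch, assuming the usual Hawk bewit layout (a base64url string wrapping four backslash-separated fields: key id, expiry timestamp, MAC, ext). The names `Bewit`, `decode_bewit` and `is_expired` are made up for this illustration and are not part of the flash script or of TaskCluster.

```python
import base64
import time
from typing import NamedTuple, Optional


class Bewit(NamedTuple):
    key_id: str
    expires: int  # Unix timestamp in seconds
    mac: str
    ext: str


def decode_bewit(token: str) -> Bewit:
    """Decode a base64url bewit into its four backslash-separated fields.

    Assumes the layout id\\exp\\mac\\ext; raises ValueError otherwise.
    """
    # The trailing '=' padding is normally stripped from the URL, so restore it.
    padded = token + "=" * (-len(token) % 4)
    raw = base64.urlsafe_b64decode(padded).decode("utf-8")
    key_id, expires, mac, ext = raw.split("\\", 3)
    return Bewit(key_id, int(expires), mac, ext)


def is_expired(bewit: Bewit, now: Optional[float] = None) -> bool:
    """Return True if the expiry timestamp is in the past."""
    current = time.time() if now is None else now
    return current >= bewit.expires
```

This only inspects the token. It does not verify the MAC, so it cannot tell whether the server would accept the URL, only whether the expiry time has passed.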
Attached file Jenkins logs
Runs which are launched from Taskcluster are already using a new type of URL. If you wish to use it in all builds, let me know and I will change the configuration so it uses this type of URL (the current URL needs credentials, and therefore downloading is different). If you use the new type of URL for only some builds, then I will configure a new project for it.
Our jobs will use the new type of URL only, so we can change the current configuration. Please tell me when you plan to perform the change, so I can modify our jobs right after. Thanks!
Flags: needinfo?(sakari.rautiainen)
OK, I have added a new project there: flash-fxos-task-cluster
It needs a small modification to the marionette/testdroid plugin. The current plugin uses the following parameters when it launches a new flash run: FLAME_ZIP_URL and MEM_TOTAL. The plugin used FLAME_ZIP_URL as the identifier for the build, but now that we have the extra bewit parameter, which works like a token, we need to give a BUILD_LABEL when launching a test run.
In total, three parameters are given to the project when it is started:
BUILD_LABEL = (for example https://queue.taskcluster.net/v1/task/LlkMEZocRq2GOOR7_5Qi-Q/artifacts/private/build/flame-kk.zip)
FLAME_ZIP_URL = (https://queue.taskcluster.net/v1/task/LlkMEZocRq2GOOR7_5Qi-Q/artifacts/private/build/flame-kk.zip?bewit=abcd1244)
MEM_TOTAL = (319)
When flashing completes successfully, testdroid adds BUILD_LABEL to the device, and it can then be found when querying the device.
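As an illustration of the relationship between the two URL parameters, here is a minimal sketch that derives a BUILD_LABEL from a FLAME_ZIP_URL by dropping the bewit query parameter. It is written in Python for brevity and is not the plugin's actual code; the function name `build_label` is an assumption made for this example.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def build_label(flame_zip_url: str) -> str:
    """Return the URL without its 'bewit' query parameter.

    Any other query parameters are kept, so the result stays a stable
    identifier for the build even though the token changes per request.
    """
    parts = urlsplit(flame_zip_url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "bewit"
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

Parsing the query string, rather than cutting the URL at the first `?`, keeps the label correct if the URL ever carries parameters other than the bewit.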
Flags: needinfo?(sakari.rautiainen)
I started to work on the modifications for the plugin. I still need to narrow down the RegExp.
Comment on attachment 8686678 [details] [review] mozilla/testdroid-marionette-plugin PR
I made this change in the plugin [1]. Does it look like the way to go? Is there a way to test the integration with the new project you made?
[1] https://github.com/mozilla/testdroid-marionette-plugin/pull/17/files#diff-8237f9fb7924f5dbe18302c3f06962fc
Attachment #8686678 - Flags: feedback?(sakari.rautiainen)
Looks good to me. To test it, you can run the plugin locally and set "Flash project" to "flash-fxos-task-cluster" on the Jenkins job configuration page (it's under the advanced settings of the build step).
Comment on attachment 8686678 [details] [review] mozilla/testdroid-marionette-plugin PR Based on comment 7.
Attachment #8686678 - Flags: feedback?(sakari.rautiainen) → feedback+
I ran the plugin (that contains the patch) on my own machine. The plugin execution doesn't fail. However, I wanted to see if the version displayed was the correct one, but I got this error:
> [Testdroid] - Connecting to https://fxos.testdroid.com/testdroid-cloud as admin@localhost.com
> [Testdroid] - Searching for devices...
> [Testdroid] - [Codename: Flame]
> [Testdroid] - [SIMs: 0]
> [Testdroid] - Found 11 devices
> [Testdroid] - Selected device 1d9a5649 (176)
> [Testdroid] - Flashing device with https://queue.taskcluster.net/v1/task/GV60qa7JRombVfdbsisb2A/artifacts/private/build/flame-kk.zip?bewit=A_VALID_BEWIT and memory throttled at 319MB
> [Testdroid] - Started session 202575
> [Testdroid] - ADB port: 1098
> [Testdroid] - ADB host: fxos.testdroid.com
> [Testdroid] - Android serial: 1d9a5649
> [Testdroid] - Marionette port: 1099
> [Testdroid] - Marionette host: fxos.testdroid.com
> [Testdroid] - Marionette forwarding host: 172.27.240.12
> [Testdroid] - Marionette forwarding port: 30009
> [workspace] $ /bin/sh -xe /tmp/hudson5044686452375847266.sh
>
> + adb -H 54.67.13.230 -P 1098 logcat -c
> ** Cannot start server on remote host
> - waiting for device -
> ** Cannot start server on remote host
> error: cannot connect to daemon
> error: cannot connect to daemon
> ** Cannot start server on remote host
It seems like the firewall is blocking my connection. Sakari, is there a whitelist that allows the adb connections? You can email me the details if you wish.
Flags: needinfo?(sakari.rautiainen)
(In reply to Johan Lorenzo [:jlorenzo] (QA) from comment #9) > However, I wanted to see if the version displayed was the correct one In order to do so, I set up the job to run 1 test, so it'll display the version. I couldn't get a job running because of the timeout on the adb connection.
I tried again today. I'm still getting the same issue:
> [Testdroid] - Connecting to https://fxos.testdroid.com/testdroid-cloud as admin@localhost.com
> [Testdroid] - Searching for devices...
> [Testdroid] - [Codename: Flame]
> [Testdroid] - [SIMs: 0]
> [Testdroid] - Found 13 devices
> [Testdroid] - Selected device f04341d5 (208)
> [Testdroid] - Flashing device with A_VALID_URL and memory throttled at 319MB
> [Testdroid] - Started session 204724
> [Testdroid] - ADB port: 1084
> [Testdroid] - ADB host: fxos.testdroid.com
> [Testdroid] - Android serial: f04341d5
> [Testdroid] - Marionette port: 1085
> [Testdroid] - Marionette host: fxos.testdroid.com
> [Testdroid] - Marionette forwarding host: 172.27.240.17
> [Testdroid] - Marionette forwarding port: 30009
>
> + adb -H 54.67.13.230 -P 1084 logcat -c
> ** Cannot start server on remote host
> - waiting for device -
> ** Cannot start server on remote host
Can you please try again? The build was correctly flashed on the device, but opening the adb/marionette connection failed. I had to change the configuration a bit, and now it should work.
Flags: needinfo?(sakari.rautiainen)
I tried 3 more times. I got the same results each time :S Last try was with:
> [Testdroid] - Selected device 1d9e5723 (188)
> [Testdroid] - Started session 204819
> [Testdroid] - ADB port: 1036
> [Testdroid] - Android serial: 1d9e5723
> [Testdroid] - Marionette port: 1037
Still?! Hmm... I am running out of ideas. And the test runs were from your Jenkins cluster. This is in the logs:
(bootloader) Device adjusted mem: 319m
OKAY [ 0.005s]
finished. total time: 0.005s
target reported max download size of 301989888 bytes
sending 'boot' (7486 KB)...
OKAY [ 0.985s]
writing 'boot'...
OKAY [ 0.748s]
finished. total time: 1.733s
target reported max download size of 301989888 bytes
erasing 'system'...
OKAY [ 0.682s]
sending sparse 'system' (292094 KB)...
OKAY [ 38.766s]
writing 'system'...
OKAY [ 46.552s]
sending sparse 'system' (35165 KB)...
OKAY [ 4.616s]
writing 'system'...
OKAY [ 5.198s]
finished. total time: 95.815s
target reported max download size of 301989888 bytes
erasing 'userdata'...
OKAY [ 2.925s]
sending 'userdata' (36916 KB)...
OKAY [ 4.761s]
writing 'userdata'...
OKAY [ 1.395s]
finished. total time: 9.081s
target reported max download size of 301989888 bytes
sending 'recovery' (8224 KB)...
OKAY [ 1.059s]
writing 'recovery'...
OKAY [ 0.304s]
finished. total time: 1.363s
rebooting...
finished. total time: 0.001s
Running script returned status code 0
So at least flashing works fine. However, there is no input from adb or marionette after that. Device build label has been set correctly to https://queue.taskcluster.net/v1/task/QUgi1vPqQC2szkvmkIbC4w/artifacts/private/build/flame-kk.zip
Is this also a debug build, so adb is enabled, etc.?
(In reply to Sakari Rautiainen from comment #14)
> Device build label has been set correctly to
> https://queue.taskcluster.net/v1/task/QUgi1vPqQC2szkvmkIbC4w/artifacts/private/build/flame-kk.zip
> Is this also a debug build, so adb is enabled, etc.?
That's a good guess. I double-checked it: I flashed that build locally and ran one test locally => no problem on this side.
> So at least flashing works fine. However, there is no input from adb or
> marionette after that.
Did you try to connect to the device (using adb) from your side? If not, that's probably because the adb connection never gets established, as the logs in comments 11 and 13 suggest. So it seems like something is blocking the connection between the device and my machine.
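One way to narrow down whether the network path is the problem is a plain TCP connection test against the ADB host and port printed by the plugin. The snippet below is only a sketch using Python's standard library; the function name `can_connect` and the default timeout are assumptions for this example, and a successful connect only shows that the port is reachable, not that adb itself is working.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    A refused connection, an unreachable host or a timeout (for example
    when a firewall silently drops packets) all yield False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For instance, `can_connect("fxos.testdroid.com", 1098)` would probe the ADB port reported in the log from comment 9. The port changes with each session, so the value has to be taken from the current run.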
Comment on attachment 8695227 [details] [review] Second PR (to apply on top of the first)
Thank you for this patch, Sakari. I tried the plugin locally. I got an error (label not found), and then I couldn't connect to ADB. My assumption is that the firewall is blocking my packets.
> [Testdroid] - Connecting to https://fxos.testdroid.com/testdroid-cloud as admin@localhost.com
> [Testdroid] - Flash device and open connection immediately
> [Testdroid] - Searching for devices...
> [Testdroid] - [Codename: Flame]
> [Testdroid] - [SIMs: 0]
> [Testdroid] - [Build Identifier: 319_https://queue.taskcluster.net/v1/task/NYaLOctDQQ-qCaNVNWYwzQ/artifacts/private/build/flame-kk.zip]
> [Testdroid] - [ERROR] - Label '319_https://queue.taskcluster.net/v1/task/NYaLOctDQQ-qCaNVNWYwzQ/artifacts/private/build/flame-kk.zip' not found
> [Testdroid] - Searching for devices...
> [Testdroid] - [Codename: Flame]
> [Testdroid] - [SIMs: 0]
> [Testdroid] - Found 13 devices
> [Testdroid] - Selected device 1b31c288 (177)
> [Testdroid] - Flashing device with https://queue.taskcluster.net/v1/task/NYaLOctDQQ-qCaNVNWYwzQ/artifacts/private/build/flame-kk.zip?bewit=X3JVeVhDTHRUMFNTRHczN1BYRklGd1wxNDQ5MTU0NzM1XE9PMnl0UlIrRW5DRGJFb2xCcnlwVjBhdzR0UmZwSkRuYU52a2Jaajl0Q289XGUzMD0 and memory throttled at 319MB
> [Testdroid] - Started session 209458
> [Testdroid] - ADB port: 1036
> [Testdroid] - ADB host: fxos.testdroid.com
> [Testdroid] - Android serial: 1b31c288
>
> [Testdroid] - Marionette port: 1037
> [Testdroid] - Marionette host: fxos.testdroid.com
> [Testdroid] - Marionette forwarding host: 172.27.240.17
> [Testdroid] - Marionette forwarding port: 30007
> ...
> + adb -H 54.67.13.230 -P 1036 logcat -c
> ** Cannot start server on remote host
> - waiting for device -
> ** Cannot start server on remote host
> error: cannot connect to daemon
> error: cannot connect to daemon
On another note, I'm not sure I understand the changes made. I left some questions in the pull request. You can reply there.
Flags: needinfo?(sakari.rautiainen)
Yes, it is the firewall that is blocking the connection. I added a few comments in the pull request. There is way too much logic in that plugin now, as the code is supposed to handle both types of project ("flash-fxos" and "flash-fxos-task-cluster"). As the project configurations are quite different, the way the adb/marionette connection and the flashing are done differs a lot. After rechecking the changes, I'm thinking of creating a new wrapper (DevicesessionWrapper), or removing all the code related to the current type of project and leaving only the code which uses this "new way": flashing and opening the adb/marionette connection in the project run. This would clean up the code a lot.
Flags: needinfo?(sakari.rautiainen)
(In reply to Sakari Rautiainen from comment #18)
> I'm thinking of creating a new wrapper (DevicesessionWrapper)
That sounds good to me. Do you plan to make the changes in this pull request? I don't feel confident about merging the current patch: I can't test it from my side, no new automated test has been added, and the complexity introduced makes the patch error-prone.
Would it also be possible to get something that lets my requests pass the firewall? A VPN access sounds like a solution given that my IP often changes, and some other people might also want to test patches locally.
Flags: needinfo?(sakari.rautiainen)
(In reply to Johan Lorenzo [:jlorenzo] (QA) from comment #19) > A VPN access sounds like a solution given that my IP often changes, > and some other people might also want to test patches locally. For the record, a VPN connection is being set up.
I came across: https://github.com/taskcluster/testdroid-proxy/blob/7263d8a5ca1dd228ebde79f66054363ee68c03d9/src/handlers/device.js#L11-L14
This made me realize there was a misunderstanding between:
(In reply to Sakari Rautiainen from comment #2)
> Runs which are launched from Taskcluster are already using a new type of
> URL. If you wish to use it in all builds, let me know and I will change the
> configuration so it uses this type of URL (the current URL needs
> credentials, and therefore downloading is different). If you use the new type
> of URL for only some builds, then I will configure a new project for it.
and
(In reply to Johan Lorenzo [:jlorenzo] (QA) from comment #3)
> Our jobs will use the new type of URL only, so we can change the current
> configuration. Please tell me when you plan to perform the change, so I can
> modify our jobs right after. Thanks!
I'm sorry, let me rephrase comment 3: the Jenkins jobs are planned to use the new type of URL only, exactly like the runs launched from Taskcluster. At some point, we'll drop the old type of URL.
So I'm thinking: should the fix simply be to use the project called 'flash-fxos-new-url' and put the URL with the bewit in 'FLAME_ZIP_URL'?
The Firefox OS labs are being wound down. This issue won't be solved. Closing.
Status: NEW → RESOLVED
Closed: 9 years ago
Flags: needinfo?(sakari.rautiainen)
Resolution: --- → FIXED