Unable to download builds when the link contains a bewit

RESOLVED FIXED

Status

Firefox OS
Bitbar

People

(Reporter: jlorenzo, Unassigned)

Tracking

(Blocks: 1 bug)

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(4 attachments)

(Reporter)

Description

2 years ago
Created attachment 8683053 [details]
flash-421838.log

In order to use the builds coming from TaskCluster (bug 1219290), we have a new URL to provide to the flash script. For example:
> https://queue.taskcluster.net/v1/task/TU4XRN8FQz2v0JRtzoYYCw/artifacts/private/build/flame-kk.zip?bewit=X3JVeVhDTHRUMFNTRHczN1BYRklGd1wxNDQ2NjM4NTEzXHBUWlI5c3hYdWFGaU5xZmpEeEJRNjJWbjVuUlBsbXZwb05FZ3haVWI4ZGM9XGUzMD0

The bewit [1] part is required in order to authenticate to TaskCluster. The flash logs and the Jenkins logs are attached.
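
For reference, a minimal sketch of how a bewit can be inspected. It assumes the standard Hawk layout, where the bewit is an unpadded base64url string wrapping four backslash-separated fields (id, expiry, MAC, ext). The function name `decode_bewit` is hypothetical and is not part of the flash script or the plugin.

```python
import base64
from datetime import datetime, timezone
from urllib.parse import urlsplit, parse_qs


def decode_bewit(url):
    """Return the fields packed into the `bewit` query parameter of `url`."""
    bewit = parse_qs(urlsplit(url).query)["bewit"][0]
    # Bewits are base64url-encoded without padding; restore it before decoding.
    padded = bewit + "=" * (-len(bewit) % 4)
    raw = base64.urlsafe_b64decode(padded).decode("utf-8")
    # Assumed Hawk layout: id \ expiry (Unix seconds) \ MAC \ ext.
    key_id, expiry, mac, ext = raw.split("\\", 3)
    return {
        "id": key_id,
        "expires": datetime.fromtimestamp(int(expiry), tz=timezone.utc),
        "mac": mac,
        "ext": ext,
    }
```

Since the expiry is part of the token, the same URL may stop working after that time, so a decoded `expires` value can be worth checking when a download fails.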

An interesting part comes from the flash logs:
> Downloading https://queue.taskcluster.net/v1/task/TU4XRN8FQz2v0JRtzoYYCw/artifacts/private/
> build/flamekk.zip?bewit=X3JVeVhDTHRUMFNTRHczN1BYRklGd1wxNDQ2NjM4NTEzXHBUWlI5c3hYdWFGaU5
> xZmpEeEJRNjJWbjVuUlBsbXZwb05FZ3haVWI4ZGM9XGUzMD0 failed!
> Build step 'Download file from file server' marked build as failure
> Publishing results for test run 421843, device session 198004 to http://10.1.2.9/cloud

I'm not too sure what is behind the download step.

[1] https://github.com/hapijs/hapi-auth-hawk#bewit-authentication
(Reporter)

Comment 1

2 years ago
Created attachment 8683054 [details]
Jenkins logs

Comment 2

2 years ago
Runs launched from Taskcluster already use a new type of URL. If you wish to use it for all builds, let me know and I will change the configuration so that it uses this type of URL (the current URL needs credentials, so downloading works differently). If you will use the new type of URL for only some builds, I will configure a new project for it.
(Reporter)

Comment 3

2 years ago
Our jobs will use the new type of URL only, so we can change the current configuration. Please tell me when you plan to perform the change, so I can modify our jobs right after. Thanks!
Flags: needinfo?(sakari.rautiainen)

Comment 4

2 years ago
OK, so I have added a new project there: flash-fxos-task-cluster
It actually needs a small modification to the marionette/testdroid plugin. The current plugin passes the following parameters when it launches a new flash run: FLAME_ZIP_URL and MEM_TOTAL. The plugin used FLAME_ZIP_URL as the identifier for the build, but now that the URL carries the extra bewit parameter, which acts like a token, we also need to pass a BUILD_LABEL when launching a test run.

In total, three parameters are then given to the project when it is started:
BUILD_LABEL = (for example https://queue.taskcluster.net/v1/task/LlkMEZocRq2GOOR7_5Qi-Q/artifacts/private/build/flame-kk.zip)
FLAME_ZIP_URL = (https://queue.taskcluster.net/v1/task/LlkMEZocRq2GOOR7_5Qi-Q/artifacts/private/build/flame-kk.zip?bewit=abcd1244)
MEM_TOTAL = (319)


When flashing completes successfully, Testdroid sets BUILD_LABEL on the device, and the label can then be found when querying the device.
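
As an illustration of the relationship between the two URL parameters above, here is a minimal Python sketch that derives a BUILD_LABEL from a FLAME_ZIP_URL by dropping the bewit query parameter. The function name `build_label` is hypothetical; the actual plugin is a Jenkins plugin and may derive the label differently (comment 5 mentions a RegExp).

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def build_label(flame_zip_url):
    """Return `flame_zip_url` without its `bewit` query parameter."""
    parts = urlsplit(flame_zip_url)
    # Keep every query parameter except the authentication token.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "bewit"
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), "")
    )
```

Parsing the query string rather than cutting at the first `?` keeps any other parameters intact, which matters only if such parameters ever appear in these URLs.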
Flags: needinfo?(sakari.rautiainen)
(Reporter)

Comment 5

2 years ago
Created attachment 8686678 [details] [review]
mozilla/testdroid-marionette-plugin PR

I started to work on the modifications for the plugin. I still need to narrow down the RegExp.
(Reporter)

Comment 6

2 years ago
Comment on attachment 8686678 [details] [review]
mozilla/testdroid-marionette-plugin PR

I made this change in the plugin [1]. Does it look like the way to go? Is there a way to test the integration with the new project you made?

[1] https://github.com/mozilla/testdroid-marionette-plugin/pull/17/files#diff-8237f9fb7924f5dbe18302c3f06962fc
Attachment #8686678 - Flags: feedback?(sakari.rautiainen)

Comment 7

2 years ago
Looks good to me. To test it, you can run the plugin locally and set "Flash project" to "flash-fxos-task-cluster" in the Jenkins job configuration page (it's under the advanced configs of the build step).
(Reporter)

Comment 8

2 years ago
Comment on attachment 8686678 [details] [review]
mozilla/testdroid-marionette-plugin PR

Based on comment 7.
Attachment #8686678 - Flags: feedback?(sakari.rautiainen) → feedback+
(Reporter)

Comment 9

2 years ago
I ran the plugin (that contains the patch) on my own machine. The plugin execution doesn't fail. However, I wanted to see if the version displayed was the correct one, but I got this error:

> [Testdroid] - Connecting to https://fxos.testdroid.com/testdroid-cloud as admin@localhost.com
> [Testdroid] - Searching for devices...
> [Testdroid] - [Codename: Flame]
> [Testdroid] - [SIMs: 0]
> [Testdroid] - Found 11 devices
> [Testdroid] - Selected device 1d9a5649 (176)
> [Testdroid] - Flashing device with https://queue.taskcluster.net/v1/task/GV60qa7JRombVfdbsisb2A/artifacts/private/build/flame-kk.zip?bewit=A_VALID_BEWIT and  memory throttled at 319MB
> [Testdroid] - Started session 202575
> [Testdroid] - ADB port: 1098
> [Testdroid] - ADB host: fxos.testdroid.com
> [Testdroid] - Android serial: 1d9a5649
> [Testdroid] - Marionette port: 1099
> [Testdroid] - Marionette host: fxos.testdroid.com
> [Testdroid] - Marionette forwarding host: 172.27.240.12
> [Testdroid] - Marionette forwarding port: 30009
> [workspace] $ /bin/sh -xe /tmp/hudson5044686452375847266.sh
> 
> + adb -H 54.67.13.230 -P 1098 logcat -c
> ** Cannot start server on remote host
> - waiting for device -
> ** Cannot start server on remote host
> error: cannot connect to daemon
> error: cannot connect to daemon
> ** Cannot start server on remote host

It seems like the firewall is blocking my connection. Sakari, is there a whitelist that allows the adb connections? You can email me the details if you wish.
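
One way to narrow this down is a plain TCP reachability check against the ADB host and port printed in the log. The sketch below is hypothetical (the helper name `can_reach` is not part of the plugin). A successful connection only shows that the port is reachable from the local machine, not that adb itself will work.

```python
import socket


def can_reach(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

For example, `can_reach("54.67.13.230", 1098)` uses the address and port from the log above. A timeout, as opposed to an immediate refusal, would be consistent with packets being dropped by a firewall.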
Flags: needinfo?(sakari.rautiainen)
(Reporter)

Comment 10

2 years ago
(In reply to Johan Lorenzo [:jlorenzo] (QA) from comment #9)
> However, I wanted to see if the version displayed was the correct one

In order to do so, I set up the job to run 1 test, so it'll display the version. I couldn't get a job running because of the timeout on the adb connection.
(Reporter)

Comment 11

2 years ago
I tried again today. I'm still getting the same issue:
> [Testdroid] - Connecting to https://fxos.testdroid.com/testdroid-cloud as admin@localhost.com
> [Testdroid] - Searching for devices...
> [Testdroid] - [Codename: Flame]
> [Testdroid] - [SIMs: 0]
> [Testdroid] - Found 13 devices
> [Testdroid] - Selected device f04341d5 (208)
> [Testdroid] - Flashing device with A_VALID_URL and memory throttled at 319MB
> [Testdroid] - Started session 204724
> [Testdroid] - ADB port: 1084
> [Testdroid] - ADB host: fxos.testdroid.com
> [Testdroid] - Android serial: f04341d5
> [Testdroid] - Marionette port: 1085
> [Testdroid] - Marionette host: fxos.testdroid.com
> [Testdroid] - Marionette forwarding host: 172.27.240.17
> [Testdroid] - Marionette forwarding port: 30009
> 
> + adb -H 54.67.13.230 -P 1084 logcat -c
> ** Cannot start server on remote host
> - waiting for device -
> ** Cannot start server on remote host

Comment 12

2 years ago
Can you please try again?
The build was correctly flashed on the device, but opening the adb/marionette connection failed. I had to change the configuration a bit, and it should work now.
Flags: needinfo?(sakari.rautiainen)
(Reporter)

Comment 13

2 years ago
I tried 3 more times. I got the same results each time :S 
Last try was with:

> [Testdroid] - Selected device 1d9e5723 (188)
> [Testdroid] - Started session 204819
> [Testdroid] - ADB port: 1036
> [Testdroid] - Android serial: 1d9e5723
> [Testdroid] - Marionette port: 1037

Comment 14

2 years ago
Still?! Hmm... I am running out of ideas. And the test runs were from your Jenkins cluster.
This is in the logs:
(bootloader) 	Device adjusted mem: 319m
OKAY [  0.005s]
finished. total time: 0.005s
target reported max download size of 301989888 bytes
sending 'boot' (7486 KB)...
OKAY [  0.985s]
writing 'boot'...
OKAY [  0.748s]
finished. total time: 1.733s
target reported max download size of 301989888 bytes
erasing 'system'...
OKAY [  0.682s]
sending sparse 'system' (292094 KB)...
OKAY [ 38.766s]
writing 'system'...
OKAY [ 46.552s]
sending sparse 'system' (35165 KB)...
OKAY [  4.616s]
writing 'system'...
OKAY [  5.198s]
finished. total time: 95.815s
target reported max download size of 301989888 bytes
erasing 'userdata'...
OKAY [  2.925s]
sending 'userdata' (36916 KB)...
OKAY [  4.761s]
writing 'userdata'...
OKAY [  1.395s]
finished. total time: 9.081s
target reported max download size of 301989888 bytes
sending 'recovery' (8224 KB)...
OKAY [  1.059s]
writing 'recovery'...
OKAY [  0.304s]
finished. total time: 1.363s
rebooting...

finished. total time: 0.001s
Running script returned status code 0

So at least flashing works fine. However, there is no input from adb or marionette after that. 


Device build label has been set correctly to https://queue.taskcluster.net/v1/task/QUgi1vPqQC2szkvmkIbC4w/artifacts/private/build/flame-kk.zip

Is this also a debug build, so that adb is enabled, etc.?
(Reporter)

Comment 15

2 years ago
(In reply to Sakari Rautiainen from comment #14)
> Device build label has been set correctly to
> https://queue.taskcluster.net/v1/task/QUgi1vPqQC2szkvmkIbC4w/artifacts/private/build/flame-kk.zip
> If this also a debug build, so adb is enabled etc?

That's a good guess. I double-checked it. I flashed that build locally, and ran one test locally => no problem on this side.

> So at least flashing works fine. However, there is no input from adb or
> marionette after that. 
Did you try to connect to the device (via adb) from your side? If not, that's probably because the adb connection doesn't occur, as the logs in comments 11 and 13 suggest. In that case, it seems like something is blocking the connection between the device and my machine.
(Reporter)

Comment 16

2 years ago
Created attachment 8695227 [details] [review]
Second PR (to apply on top of the first)
(Reporter)

Comment 17

2 years ago
Comment on attachment 8695227 [details] [review]
Second PR (to apply on top of the first)

Thank you for this patch, Sakari. I tried the plugin locally. I got an error (label not found), and then I couldn't connect to ADB. My assumption is that the firewall is blocking my packets.

> [Testdroid] - Connecting to https://fxos.testdroid.com/testdroid-cloud as admin@localhost.com
> [Testdroid] - Flash device and open connection immediately
> [Testdroid] - Searching for devices...
> [Testdroid] - [Codename: Flame]
> [Testdroid] - [SIMs: 0]
> [Testdroid] - [Build Identifier: 319_https://queue.taskcluster.net/v1/task/NYaLOctDQQ-qCaNVNWYwzQ/artifacts/private/build/flame-kk.zip]
> [Testdroid] - [ERROR] - Label '319_https://queue.taskcluster.net/v1/task/NYaLOctDQQ-qCaNVNWYwzQ/artifacts/private/build/flame-kk.zip' not found
> [Testdroid] - Searching for devices...
> [Testdroid] - [Codename: Flame]
> [Testdroid] - [SIMs: 0]
> [Testdroid] - Found 13 devices
> [Testdroid] - Selected device 1b31c288 (177)
> [Testdroid] - Flashing device with https://queue.taskcluster.net/v1/task/NYaLOctDQQ-qCaNVNWYwzQ/artifacts/private/build/flame-kk.zip?bewit=X3JVeVhDTHRUMFNTRHczN1BYRklGd1wxNDQ5MTU0NzM1XE9PMnl0UlIrRW5DRGJFb2xCcnlwVjBhdzR0UmZwSkRuYU52a2Jaajl0Q289XGUzMD0 and memory throttled at 319MB
> [Testdroid] - Started session 209458
> [Testdroid] - ADB port: 1036
> [Testdroid] - ADB host: fxos.testdroid.com
> [Testdroid] - Android serial: 1b31c288
> 
> [Testdroid] - Marionette port: 1037
> [Testdroid] - Marionette host: fxos.testdroid.com
> [Testdroid] - Marionette forwarding host: 172.27.240.17
> [Testdroid] - Marionette forwarding port: 30007
> ...
> + adb -H 54.67.13.230 -P 1036 logcat -c
> ** Cannot start server on remote host
> - waiting for device -
> ** Cannot start server on remote host
> error: cannot connect to daemon
> error: cannot connect to daemon
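
Judging only from the log above, the build identifier appears to be the memory value and the build label (the URL without the bewit) joined by an underscore. A minimal sketch under that assumption, with a hypothetical function name:

```python
def build_identifier(mem_total, build_label):
    """Compose the identifier format seen in the log: '<MEM_TOTAL>_<BUILD_LABEL>'."""
    # Assumption: the plugin joins the two values with a single underscore.
    return f"{mem_total}_{build_label}"
```

If that reading is correct, a "Label ... not found" error would be expected whenever no device has been flashed with that exact combination yet.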

On another note, I'm not sure I understand the changes made. I left some questions in the pull request. You can reply there.
Flags: needinfo?(sakari.rautiainen)

Comment 18

2 years ago
Yes, it is the firewall that is blocking the connection.
I added a few comments in the pull request.
There is way too much logic in that plugin now, as the code is supposed to handle both types of project ("flash-fxos" and "flash-fxos-task-cluster"). Because the project configurations are quite different, the way the adb/marionette connection and the flashing are done differs a lot. After rechecking the changes, I'm thinking of either creating a new wrapper (DevicesessionWrapper) or removing all the code related to the current type of project and keeping only the code that uses this "new way", so basically flashing and opening the adb/marionette connection in the project run. This would clean up the code a lot.
Flags: needinfo?(sakari.rautiainen)
(Reporter)

Comment 19

2 years ago
(In reply to Sakari Rautiainen from comment #18)
> I'm thinking to create a new wrapper(DevicesessionWrapper)
That sounds good to me. Do you plan to make the changes in this pull request? I'm not feeling confident with merging the current patch: I can't test the patch from my side, there is no new automated test added and the complexity introduced makes the patch error-prone.

Would it also be possible to get something that makes my requests pass the firewall? A VPN access sounds like a solution given that my IP often changes, and some other people might also want to test patches locally.
Flags: needinfo?(sakari.rautiainen)
(Reporter)

Comment 20

2 years ago
(In reply to Johan Lorenzo [:jlorenzo] (QA) from comment #19)
> A VPN access sounds like a solution given that my IP often changes,
> and some other people might also want to test patches locally.
For the record, a VPN connection is being set up.
(Reporter)

Comment 21

2 years ago
I came across: https://github.com/taskcluster/testdroid-proxy/blob/7263d8a5ca1dd228ebde79f66054363ee68c03d9/src/handlers/device.js#L11-L14

This made me realize there was a misunderstanding between:

(In reply to Sakari Rautiainen from comment #2)
> Runs which are launched from Taskcluster are already using a new type of
> URL. If you wish to use in the all build let me know so I will change the
> configuration so it will use this type of URL (the current URL needs
> credentials and therefore downloading is different). If you use new type of
> URL only some builds then I will configure a new project for it.

and

(In reply to Johan Lorenzo [:jlorenzo] (QA) from comment #3)
> Our jobs will use the new type of URL only, so we can change the current
> configuration. Please tell me when you plan to perform the change, so I can
> modify our jobs right after. Thanks!

I'm sorry, let me rephrase comment 3: The Jenkins jobs are planned to use the new type of URL only, exactly like the runs launched from Taskcluster. At some point, we'll drop the old type of URL.

So I'm wondering: should the fix simply be to use the project called 'flash-fxos-new-url' and put the URL with the bewit in 'FLAME_ZIP_URL'?
(Reporter)

Comment 22

2 years ago
The Firefox OS labs are being wound down. This issue won't be solved. Closing.
Status: NEW → RESOLVED
Last Resolved: 2 years ago
Flags: needinfo?(sakari.rautiainen)
Resolution: --- → FIXED