1058724 - [Meta] Explore monkey testing framework

Reporter

Description

11 years ago

If I run the ./run-monkey.sh script with a debug gecko build I usually hit assertions within an hour. We should automate this and make sure we are not crashing with opt builds and not hitting assertions with debug gecko builds.

The priority should be:
1) crashes
2) situations that require a reboot

This includes:
1) Creating a testing profile that disables emergency calls and maybe other terrible things like bluetooth.
2) Being able to debug crashes in gdb: either run the main process in b2g or make it stop and wait for the debugger to attach.
3) In a future version we should also detect when our phone is in an unusable state, such as when pressing the home button doesn't bring us back to the homescreen or when the homescreen doesn't load at all.

Gregor Wagner [:gwagner]

Reporter

Updated

11 years ago

Depends on: 1056958

Zac C (:zac)

Comment 1

11 years ago

Emergency calls can be disabled by running adb shell setprop "ril.ecclist" "dummy" (I already asked the RIL team about this once in the past). If you wrap this around an existing Python runner stack like mozdevice then you would get 2 and 3 with much less effort, as it's done or partially done there already.
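The setprop call above could be wrapped in a small Python helper. This is only a sketch: the function names are made up for illustration, it shells out to adb directly instead of going through mozdevice, and it assumes adb is on PATH with a device attached.

```python
import subprocess


def ecclist_command(value="dummy", serial=None):
    """Build the adb argv that overrides the ril.ecclist property.

    `serial` is optional and only needed when several devices are attached.
    """
    cmd = ["adb"]
    if serial:
        cmd += ["-s", serial]
    cmd += ["shell", "setprop", "ril.ecclist", value]
    return cmd


def disable_emergency_calls(serial=None, run=subprocess.check_call):
    """Run the command. `run` is injectable so the helper can be dry-run."""
    return run(ecclist_command(serial=serial))
```

Passing run=print gives a dry run that only shows the command that would be executed.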

Dave Hunt [:davehunt] [he/him] ⌚BST

Comment 2

11 years ago

James has already asked me to work on a way to run these scripts for every b2g-inbound build so that they report to Treeherder. I have something almost complete, which will generate the Orangutan script (with a few tweaks), and then run it. I'm planning on having crash detection and reporting to Treeherder (with logcat, script, and minidumps as artifacts) in the initial version. Things like disabling the emergency call feature can be added to the tool, which already leverages other tools such as mozdevice, mozrunner, gaiatest, etc. What else should be disabled? If we allow calls to be made then there's a chance we'll still dial inappropriate numbers from these devices.

Dave Hunt [:davehunt] [he/him] ⌚BST

Comment 3

11 years ago

Here's my initial working version: https://github.com/davehunt/b2gmonkey You'll see there are a number of TODOs in the code, but this should at least allow us to get monkey scripts running for each b2g-inbound build.

Dave Hunt [:davehunt] [he/him] ⌚BST

Comment 4

11 years ago

Note that on Flame I rarely see tap events happening. I suspect this is related to bug 1026527.

Dave Hunt [:davehunt] [he/him] ⌚BST

Comment 5

11 years ago

Gregor: Could you try b2gmonkey and let me know if you see tap events happening? I've even tried flashing a KK base build and still I only see swiping.

Flags: needinfo?(anygregor)

Gregor Wagner [:gwagner]

Reporter

Comment 6

11 years ago

(In reply to Dave Hunt (:davehunt) from comment #5)
> Gregor: Could you try b2gmonkey and let me know if you see tap events
> happening? I've even tried flashing a KK base build and still I only see
> swiping.

I see it launching apps on the homescreen so I guess the events are happening!

Flags: needinfo?(anygregor)

Gregor Wagner [:gwagner]

Reporter

Comment 7

11 years ago

It works great :) The good thing about my ./run-monkey.sh was that I could run b2g in gdb on the phone. Is this also possible with b2gmonkey?

Dave Hunt [:davehunt] [he/him] ⌚BST

Comment 8

11 years ago

(In reply to Gregor Wagner [:gwagner] from comment #7)
> It works great :)

Great! I managed to get it working on my device too - looks like my Orangutan binary was out of date.

> The good thing about my ./run-monkey.sh was that I could run b2g in gdb on
> the phone. Is this also possible with b2gmonkey?

I can't imagine why not, how do you do this currently?

Flags: needinfo?(anygregor)

Gregor Wagner [:gwagner]

Reporter

Comment 9

11 years ago

(In reply to Dave Hunt (:davehunt) from comment #8)
> (In reply to Gregor Wagner [:gwagner] from comment #7)
> > It works great :)
>
> Great! I managed to get it working on my device too - looks like my
> Orangutan binary was out of date.
>
> > The good thing about my ./run-monkey.sh was that I could run b2g in gdb on
> > the phone. Is this also possible with b2gmonkey?
>
> I can't imagine why not, how do you do this currently?

I noticed that you restart the b2g process when the monkey starts. ./run-gdb.sh does the same. So you can either run the monkey or run in gdb. It probably works when you attach gdb after you start the monkey but that's not ideal.

It works with my script since it doesn't restart the b2g process and just starts simulating the touch events.

Flags: needinfo?(anygregor)

Dave Hunt [:davehunt] [he/him] ⌚BST

Comment 10

11 years ago

(In reply to Gregor Wagner [:gwagner] from comment #9)
> (In reply to Dave Hunt (:davehunt) from comment #8)
> > (In reply to Gregor Wagner [:gwagner] from comment #7)
> > > The good thing about my ./run-monkey.sh was that I could run b2g in gdb on
> > > the phone. Is this also possible with b2gmonkey?
> >
> > I can't imagine why not, how do you do this currently?
>
> I noticed that you restart the b2g process when the monkey starts.
> ./run-gdb.sh does the same. So you can either run the monkey or run in gdb.
> It probably works when you attach gdb after you start the monkey but thats
> not ideal.
>
> It works with my script since it doesn't restart the b2g process and just
> starts simulating the touch events.

I see. We could change the remote binary in mozrunner [1], possibly based on an optional argument. Where can I find run-gdb.sh?

Another option would be to avoid the restart, however this is currently present so we can do things like set the crash reporter up [2] and create a clean profile each time.

[1] http://hg.mozilla.org/mozilla-central/file/9ee9e193fc48/testing/mozbase/mozrunner/mozrunner/application.py#l52
[2] http://hg.mozilla.org/mozilla-central/file/9ee9e193fc48/testing/mozbase/mozrunner/mozrunner/base/device.py#l24
[3] http://hg.mozilla.org/mozilla-central/file/9ee9e193fc48/testing/mozbase/mozrunner/mozrunner/base/device.py#l68

Flags: needinfo?(anygregor)

Gregor Wagner [:gwagner]

Reporter

Comment 11

11 years ago

(In reply to Dave Hunt (:davehunt) from comment #10)
> (In reply to Gregor Wagner [:gwagner] from comment #9)
> > (In reply to Dave Hunt (:davehunt) from comment #8)
> > > (In reply to Gregor Wagner [:gwagner] from comment #7)
> > > > The good thing about my ./run-monkey.sh was that I could run b2g in gdb on
> > > > the phone. Is this also possible with b2gmonkey?
> > >
> > > I can't imagine why not, how do you do this currently?
> >
> > I noticed that you restart the b2g process when the monkey starts.
> > ./run-gdb.sh does the same. So you can either run the monkey or run in gdb.
> > It probably works when you attach gdb after you start the monkey but thats
> > not ideal.
> >
> > It works with my script since it doesn't restart the b2g process and just
> > starts simulating the touch events.
>
> I see. We could change the remote binary in mozrunner [1], possibly based on
> an optional argument. Where can I find run-gdb.sh? Another option would be
> to avoid the restart, however this is currently present so we can do things
> like set the crash reporter up [2] and create a clean profile each time.

run-gdb.sh can be found here: https://github.com/mozilla-b2g/B2G/blob/master/run-gdb.sh

I think we can live without a fresh profile but crash-reporting is a nice thing to have.

Flags: needinfo?(anygregor)

Dave Hunt [:davehunt] [he/him] ⌚BST

Comment 12

11 years ago

I've made restarting the device an optional argument, which we will always set when running in the CI. Can you try updating and running this again to see if it works while running b2g in gdb?
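As a rough illustration of the optional restart, the sketch below uses argparse with a hypothetical --restart flag. The flag name, the program name, and the step labels are assumptions made for this example and are not taken from b2gmonkey itself; the setup steps listed are the ones comment 10 gives as the reason for restarting.

```python
import argparse


def build_parser():
    """Parser with a single opt-in restart flag (hypothetical name)."""
    parser = argparse.ArgumentParser(prog="monkey-sketch")
    parser.add_argument(
        "--restart",
        action="store_true",
        help="restart b2g before the run (CI would always pass this)",
    )
    return parser


def plan(args):
    """Return the ordered steps for a run, as plain labels."""
    steps = []
    if args.restart:
        # A restart is what allows crash reporter setup and a clean profile.
        steps += ["restart b2g", "set up crash reporter", "create clean profile"]
    # Without a restart, an already running b2g (for example under gdb)
    # is left alone and only the touch events are replayed.
    steps.append("run monkey script")
    return steps
```

With no flag, plan() returns only the monkey step, which corresponds to the gdb workflow described in comment 9.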

Flags: needinfo?(anygregor)

Dave Hunt [:davehunt] [he/him] ⌚BST

Comment 13

11 years ago

In CI we're currently running 100000 steps, which is taking 10 minutes to run. Do we have an idea of how many steps or how long we'd like this to run for?

Flags: needinfo?(jlal)

James Lal [:lightsofapollo]

Comment 14

11 years ago

Hrm, IMO my non-scientific response would be to run one 10 min check per commit and a longer N (we can start at 2 hours?) run once we have capacity. For this and other fuzzing tests I think our best bet is probably shorter 10-30 min runs (but run 1-10 of those) + one or two longer running tests (maybe the longer ones we run on nightly then bisect).
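For sizing the longer runs, the figure from comment 13 (100000 steps in about 10 minutes) can be scaled to a target duration. The helper below is a back-of-the-envelope sketch with a made-up function name, and it assumes the step rate stays constant over longer runs, which has not been measured here.

```python
def steps_for_duration(target_minutes, observed_steps=100_000, observed_minutes=10):
    """Scale the observed step count linearly to a target run length."""
    steps_per_minute = observed_steps / observed_minutes
    return round(steps_per_minute * target_minutes)


# A 30 minute run and a 2 hour run, using the observed rate.
short_run = steps_for_duration(30)
long_run = steps_for_duration(120)
```

Under that assumption a 30 minute run needs about 300000 steps and a 2 hour run about 1200000.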

Flags: needinfo?(jlal)

Dave Hunt [:davehunt] [he/him] ⌚BST

Comment 15

11 years ago

(In reply to James Lal [:lightsofapollo] from comment #14)
> Hrm IMO my non-scientific response would be to run one 10 min check per
> commit and a longer N (we can start at 2 hours?) run once we have capacity.
> For this an other fuzzing tests I think our best bet is probably shorter
> 10-30 min runs (but run 1-10 of those) + one or two longer running tests
> (maybe the longer ones we run on nightly then bisect).

If I've understood correctly then we should just leave this as a 10 minute run? When you say more capacity are you referring to adding more devices to the current pool (of two) or something else?

Geo Mealer [:geo] -- This account is inactive after 2015-07-07

Updated

11 years ago

QA Whiteboard: [fxosqa-auto-backlog-]

Dave Hunt [:davehunt] [he/him] ⌚BST

Comment 16

10 years ago

The monkey tests for B2G are no longer maintained or running.

Status: NEW → RESOLVED

Closed: 10 years ago

Flags: needinfo?(anygregor)

Resolution: --- → WONTFIX

Categories

(Firefox OS Graveyard :: Gaia::UI Tests, defect)

Tracking

(Not tracked)

People

(Reporter: gwagner, Unassigned)

Security

(public)
