Closed Bug 1165814 Opened 9 years ago Closed 6 years ago

[raptor] Add an option to let user determinate when to flush the log, and add a Marionette phase for combing them together

Categories

(Firefox OS Graveyard :: Gaia::PerformanceTest, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: gweng, Unassigned)

References

Details

Attachments

(1 file, 1 obsolete file)

I'm still struggling with Marionette + Raptor, so I don't know if this is the best solution. However from my experiment, if we want to allow user measuring any Marionette behavior, we need to allow user to claim when to flush the log. This can be done by add an option and pass it to the specific phase. I've mimic the 'reboot' phase to create a 'marionette' phase, which could combine Marionette + Raptor together, and print the logs as usual. In such example, since the last mark is hard coded as 'osLogoEnd', user is actually unable to measure arbitrary behaviors freely. So I submit this bug and attach the example first. I would send the patch later and we may have some discussion on this, to see if it's a good solution.
Assignee: nobody → gweng
Summary: [raptor] Add an option to let user determinate when to flush the log → [raptor] Add an option to let user determinate when to flush the log, and add a Marionette phase for combing them together
Oh sorry, I just forgot that these two modifications should be submitted at the same patch, or the meaning would miss. Since the option is for the new phase, so add the option without phase would make a patch without any use case, which looks weird.
Attached file Proposal
Hi Eli, I want to know what's your opinion about to run Marionette + Raptor with this modification. This proposal would allow user to specific which mark should be the ending mark of the one's Marionette behaviors (ex: screen lock handler-drag-unlock-unlocking-unlocked), and then to flush the log out at the end of the test. 

And since I'm very unfamiliar to the inner design of the phase, I just copy the 'reboot' phase and modified it to be available for test. If there are more proper way to do that please tell me, so that I can modify the patch. Thanks!
Attachment #8606896 - Attachment is obsolete: true
Attachment #8606908 - Flags: feedback?(eperelman)
And if you want to take a look of how I prepare to let Gaia guys can write the Marionette + Raptor test, here is my helper's example:

https://github.com/snowmantw/gaia-marionette-raptor/blob/master/sample-test.js

With this helper, Gaia developers only need to modify their existing Marionette code with another callback:

    RaptorRunner.capture(function(detach) {
      // The original Marionette code goes here.
      detach();
    }, 'osLogoEnd')

In the existing Marionette test structure.
So I think the way the Marionette phase is currently structured goes against the architecture of Raptor, let me explain:

The point of phases is to remove boilerplate code, and prepare the device for a state of testing. In Raptor thus far, one of the main phases is "cold launch". We put the device in a state where an application has been cold launch, then any kind of test can be created based off of that state. The "app launch" test uses the cold launch phase for starting the app, but since nothing else needs to be tested, the test can complete immediately. The phase defines at what point execution passes from the phase to the test, and it can be confusing because some tests are tough to distinguish from phases.

Now to be a little more concrete. I don't believe the Marionette phase is a legitimate phase by itself. You don't put the device into a Marionette phase, the Marionette phase should be considered the Orangutan alternative to the Phase.js file. It is a base class that other phases should inherit from, but not used directly. Creating child "classes" should determine at which event we pass execution back to the test.

Does this make sense?
Attachment #8606908 - Flags: feedback?(eperelman)
I try to catch what you explained: since you said the phase.js should not be used directly, there must be a new child phase if we want to run the test couldn't reply on the current phases. We could call it X phase to avoid the naming issue.

Now the X phase should "prepare the device for a state of testing". For Marionette test, which would execute some code to emulate user behavior, the prepared state should be a "booted device" and nothing more. It may be similar to "reboot" phase, but they're different: the "reboot" phase is hard-coded to capture only to 'osLoglEnd' mark, not arbitrary mark that the behavior may trigger. Also, to reboot both by Marionette & Raptor would break the socket, and thus we can't just use the "reboot" phase as X phase.

So if this holds true for your, the X phase should:

1. make sure the device is booted
2. be able to flush the log according to the customized mark

The 1. is always be true. Since if it's off we would encountered error even before run the phase & test ("device is not connected"). So X phase doesn't need to ensure that. Now, the 2. looks like the only thing the phase should make it happen, which is currently in the patch.

However, I think you're right for how to name the X phase. I call it "marionette", is just for "I want to do some Raptor test after <reboot/first-time-use/code-launch>", so I think it's OK to substitute any of them with "marionette". I don't have a better name for that, and maybe I'm wrong with these assumptions and conclusions, so your opinions are appreciated.

(By the way, since Marionette tests may be crossed app, so I think the current first-time-launch and cold-launch phases are improper. Although they're closed to what you said to "prepare the device for a state of testing").
Flags: needinfo?(eperelman)
Depends on: 1166339
Flags: needinfo?(eperelman)
I don't have the exact answer to this yet, so I'm leaving the needinfo. I'm currently investigating extracting Raptor from Gaia to be a global npm module and I'll look into consuming Marionette instead of Orangutan to overcome the socket breakage issue. I think you raise a good point that I hadn't thought of before:

Runs are delineated by TWO marks, one to denote when the phase has been met, and then other denotes when the run is complete. For the app launch test, these marks are the same: fullyLoaded. For other tests though, the first mark is fullyLoaded and the second mark could be user-provided. Using this knowledge, we may be able to build a generic phase like you have suggested. Let me do some thinking on this and I will get back to you.
Flags: needinfo?(eperelman)
Hi Eli, since I've heard you're now busying on test app startup and closing time, would you mind I use the forked Raptor with my patch in Gaia, to start some early Marionette-Raptor tests for Gaia developers? If we write only few tests in some critical points (ex: keyboard slidingup & lockscreen unlocking), we may easily switch to the new and complete solution in the future. For the present, we could have some concrete things to evaluate how many unfinished things exist in the current path, especially if this approach is too confused for the most of Gaia developers.

If it's OK for you, I may add one more node repo line in Gaia build to fetch my repo, and run Marionette + Raptor test on that. I would consult others to see if it's OK to land such hacks in master, or we should test it in a totally different Gaia repo or branches.
Flags: needinfo?(eperelman)
I probably wouldn't land those hacks in Gaia itself, but maybe in a fork until I can get some changes landed. I am going through a refactor of Raptor and mozdevice now that should give us the Marionette support we need. I am also moving Raptor and mozdevice into the Gaia tree to alleviate some of these issues and get better integration and support for third-party testing. See bug 1168432.
Depends on: 1168432
OK thanks. Then I'll wait that.
Depends on: 1169788
Hi Eli, is it good to you if I continue my work to let the Raptor could capture logs between arbitrary two marks? Since I've checked the latest version of your refactored Raptor, but I didn't see the related changes, or at least from MDN or raptor's code I couldn't find one. I've found it makes capturing sessions like rebooting to unlocking is impossible, since the 'reboot' phase only listen to the 'homescreenFullyLoaded', and it looks like to discard all the logs after that, and unfortunately 'unlocking' mark is one of them. I think there are still lots of test cases will rely on that.
Flags: needinfo?(eperelman)
Hey Greg,

So in Raptor v2.0, we did away with capturing *any* performance markers that don't exist along the phase path of a test; we assume that marks along the phase path are de facto performance *measures*. For example, during an application launch we capture all performance markers from appLaunch to fullyLoaded and convert them to performance measures. They are assumed to be measures as they are the difference between the mark and appLaunch. After fullyLoaded, this is no longer the case as the phase has been completed.

That is all to say: if you need to capture performance metrics after a phase has completed, e.g. fullyLoaded, you may create marks but the data that will be captured must be a measure. For example:

```js
// Won't be captured
performance.mark('spotA');

...

// Won't be captured
performance.mark('spotB');

...

// Will be captured
performance.measure('metric', 'spotA', 'spotB');
```

Also remember that in order to do more work after a phase, to not use the base tests like cold launch or reboot. You can derive tests from these, but they won't be useful for capturing data after the phase, which is the point of those tests. To take your example, if you wanted to capture from rebooting to unlocking, you could write a test that uses the reboot phase:

```js
// pseudocode
setup(function(options) {
  options.phase = 'reboot';
  options.test = 'unlock-device'; // just an arbitrary test name
});

afterEach(function(phase) {
  return marionette
    .startSession()
    .then(function(client) {
      client.unlockDevice();
      client.deleteSession();
    });
});
```

And then in the lockscreen you would create your measurements. Does that work?
Flags: needinfo?(eperelman)
Thanks for the guide! I will try that and to see if we can get more tests.
Assignee: gweng → nobody
Firefox OS is not being worked on
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: