Closed Bug 1159639 Opened 9 years ago Closed 6 years ago

Implement API to add performance marker

Tracking

(tracking-b2g:+)

Status:

RESOLVED WONTFIX

Project Flags:

tracking-b2g

People

(Reporter: jhao, Unassigned)

References

Details

(Keywords: perf, Whiteboard: [perf-wanted])

Attachments

(1 file, 11 obsolete files)

(WIP) Designed as an XPCOM service. Not implemented yet. 9 years ago Jonathan Hao (inactive) [:jhao] 8.38 KB, patch		Details \| Diff \| Splinter Review
WIP patch 9 years ago Jonathan Hao (inactive) [:jhao] 9.04 KB, patch		Details \| Diff \| Splinter Review
Performance Mark API - V1 9 years ago Jonathan Hao (inactive) [:jhao] 9.77 KB, patch		Details \| Diff \| Splinter Review
Sample usage of this API 9 years ago Jonathan Hao (inactive) [:jhao] 3.91 KB, patch		Details \| Diff \| Splinter Review
Performance Mark API - V1 9 years ago Jonathan Hao (inactive) [:jhao] 9.15 KB, patch		Details \| Diff \| Splinter Review
Reuse PerformanceBase 9 years ago Jonathan Hao (inactive) [:jhao] 10.49 KB, patch		Details \| Diff \| Splinter Review
Testing on MediaDecoderStateMachine 9 years ago Jonathan Hao (inactive) [:jhao] 12.34 KB, patch		Details \| Diff \| Splinter Review
Presentation Mark API V2 (reuse PerformanceBase) 9 years ago Jonathan Hao (inactive) [:jhao] 10.51 KB, patch		Details \| Diff \| Splinter Review
Performance mark in Gecko 9 years ago Jonathan Hao (inactive) [:jhao] 14.50 KB, patch	baku : review-	Details \| Diff \| Splinter Review
Performance Mark API (V3) 9 years ago Jonathan Hao (inactive) [:jhao] 21.70 KB, patch		Details \| Diff \| Splinter Review
Chrome test 9 years ago Jonathan Hao (inactive) [:jhao] 4.55 KB, patch		Details \| Diff \| Splinter Review
Performance Mark API (V4) 9 years ago Jonathan Hao (inactive) [:jhao] 23.32 KB, patch		Details \| Diff \| Splinter Review

Jonathan Hao (inactive) [:jhao]

Reporter

Description

•

9 years ago

Raptor let us add performance markers in Gaia, but we also have the need to add performance markers in Gecko code.

The interface could be very similar to the Gaia counterpart, e.g. mark(), measure().

We should connect our interface to the implementation of Raptor, so that these entries can be retrieved together with the markers in Gaia.

Ting-Yu Chou [:ting] (away)

Comment 1

•

9 years ago

Can we reuse PROFILER_LABEL or PROFILER_MARKER for the mark(), and add another macro for measure()?

Jonathan Hao (inactive) [:jhao]

Reporter

Comment 2

•

9 years ago

Patrick said that there may be some cases where we want a marker, but do not want it to be exposed to profiler. Also, the marker could be denser than the profiler markers.

I'll talk to each functional team, asking their use case and what their desired API look like.

Chiajung said it's good to combine PROFILER_LABEL or PROFILER_MARKER and our marker, because it can save duplicated work. That's also exactly what Ting-Yu told me. There are already several of those in the critical parts of graphics code.

Also, the API should be able to work well in both the b2g process and content processes. The markers could also be placed and measured in different threads.

Jonathan Hao (inactive) [:jhao]

Reporter

Comment 3

•

9 years ago

I've talked to John (media team) and Szu-Yu (RIL team) respectively. They were OK with combining performance markers with PROFILER_LABEL/MARKER.

As for Patrick's concern about exposing too much to the profiler, Chiajung and John both mentioned that we could use a special tag to avoid this.

Jonathan Hao (inactive) [:jhao]

Reporter

Comment 4

•

9 years ago

I was planning about the performance mark interface in Gecko, and has collected some use cases and suggestions these days.

At first, I planned to use the corresponding C++ implementation of performance.mark[1]. It was not easy because in order to get a nsPerformance object, we need to get a window object.

Then, Thinker reminded me that the purpose of this interface is to generate logcat for Raptor to parse. I looked at where [1] generates logcat, it's

    PERFLOG("Performance Entry: %s|%s|%s|%f|%f|%" PRIu64 "\n",
            uri.get(),
            NS_ConvertUTF16toUTF8(aEntry->GetEntryType()).get(),
            NS_ConvertUTF16toUTF8(aEntry->GetName()).get(),
            aEntry->StartTime(),
            aEntry->Duration(),
            static_cast<uint64_t>(PR_Now() / PR_USEC_PER_MSEC));

which would generate something like:

    Performance Entry: email.gaiamobile.org|mark|navigationLoaded|1754.750363|0.000000|1430794848953
    Performance Entry: email.gaiamobile.org|mark|navigationInteractive|1754.869270|0.000000|1430794848953
    Performance Entry: email.gaiamobile.org|mark|visuallyLoaded|1755.240624|0.000000|1430794848953
    Performance Entry: email.gaiamobile.org|mark|contentInteractive|1755.449478|0.000000|1430794848953
    Performance Entry: email.gaiamobile.org|mark|fullyLoaded|1755.719374|0.000000|1430794848954

The first column is URI, I think we could come up with some convention such as "network.protocol.rtsp" for each component.
The second column is the entry type, which are all "mark" if it's generated by performance.mark().
For all marks, the duration is 0 by definition, which explains the fourth column. Non-zero duration occurs when entries are produced by performance.measure().
The fifth column, as you can see, is the absolute time of the moment.

The above could be done without using nsPerformance. As for the third column, it represents the elapsed time counting from the start of navigation.
In Gecko, we may have to set a reference point of time by ourselves, but I'm not sure when it should be.

Or perhaps we should focus on the difference between the marks? (which is what performance.measure() does)

[1] https://dxr.mozilla.org/mozilla-central/source/dom/base/nsPerformance.h#341
[2] https://dxr.mozilla.org/mozilla-central/source/dom/base/nsPerformance.cpp#678

:Eli Perelman

Comment 5

•

9 years ago

Hey Jonathan,

You are correct about reference points with regards to measurements. In order for User Timing/performance marks to determine how long something takes, there is always a start mark and an end mark. In your standard web page, the start mark is implicit: performance.timing.navigationStart. Any mark created is essentially an implicit *performance.measure* between the time the mark was created and navigationStart. It's strange; all marks are technically just measures with a hard-coded starting point.

Raptor uses this same principle, but changes the start mark away from navigationStart to a different performance mark generated by the homescreen: appStart. Instead of calculating the delta between navigationStart and a marker to generated the time, we generate the delta between appStart and the marker. It's very similar, it just makes times a little longer than before to account for application startup time done by the System and homescreen.

In order for marks/measures to mean anything, they need a reference point to base their duration on. Without an application start or navigationStart (since these belong to the window or app), you need a different reference point. I think the usage of a performance interface for generating marks and measures is fine, but how you go about it depends on this reference point. Without a user-defined point, you couldn't rely on High-Resolution time, instead you would need to use absolute times, i.e. epochs, to generate marks, and have a measure method that would do the delta between those marks. I can't see any way of using a high-resolution timer without first having a defined reference point.

I'm sure you know all this, but I guess I just wanted to clear up what needs to happen here:

1. Generate an absolute timestamp for every "mark", and then "measure" the delta between the timestamp. Going this route would not require a reference marker.

Or:

2. Generate a high-resolution time for every "mark", again using "measure" to calculate the difference. This would require that every mark have some reference point for which to generate the initial delta.

Jonathan Hao (inactive) [:jhao]

Reporter

Comment 6

•

9 years ago

Thanks Eli for the detailed explanation.

I think I'll use absolute timestamp for now, until we ever come up with a reasonable reference point.

I'm planning to create an XPCOM object for this API.

Jonathan Hao (inactive) [:jhao]

Reporter

Comment 7

•

9 years ago

Attached patch (WIP) Designed as an XPCOM service. Not implemented yet. (obsolete) — Details — Splinter Review

Jonathan Hao (inactive) [:jhao]

Reporter

Comment 8

•

9 years ago

Attached patch WIP patch (obsolete) — Details — Splinter Review

We can now measure the time difference between two markers.

Attachment #8604031 - Attachment is obsolete: true

Jonathan Hao (inactive) [:jhao]

Reporter

Comment 9

•

9 years ago

Attached patch Performance Mark API - V1 (obsolete) — Details — Splinter Review

I set the reference point as the time when the XPCOM object is created.
It now produces logs compatible with Raptor.

Fernando Jiménez Moreno [:ferjm]

Updated

•

9 years ago

Blocks: 1158848

Jonathan Hao (inactive) [:jhao]

Reporter

Comment 10

•

9 years ago

Attached patch Sample usage of this API (obsolete) — Details — Splinter Review

To add a performance mark in Gecko:

nsCOMPtr mPerformance = do_GetService(NS_PERFORMANCE_CONTRACTID);
mPerformance->Mark("sample_mark");

For example, in this sample patch I add two marks in MediaDecoderStateMachine, when it starts to decode metadata and first frame.
These code will be run when the booting animation appears.

In gaia folder, run |make raptor| and
|RUNS=1 APP=settings node tests/raptor/restart_b2g_test|
we'll see the results

Metric                                        Mean       Median     Min        Max        StdDev  p95
--------------------------------------------  ---------  ---------  ---------  ---------  ------  ---
restartb2g.loadEnd                            10476.290  10476.290  10476.290  10476.290  0.000   n/a
restartb2g.DECODER_STATE_DECODING_METADATA    19077.290  19077.290  19077.290  19077.290  0.000   n/a
restartb2g.DECODER_STATE_DECODING_FIRSTFRAME  19079.290  19079.290  19079.290  19079.290  0.000   n/a
restartb2g.homescreenStart                    24578.290  24578.290  24578.290  24578.290  0.000   n/a
restartb2g.navigationLoaded                   24913.290  24913.290  24913.290  24913.290  0.000   n/a
restartb2g.navigationInteractive              24914.290  24914.290  24914.290  24914.290  0.000   n/a
restartb2g.osLogoEnd                          25083.290  25083.290  25083.290  25083.290  0.000   n/a

Jonathan Hao (inactive) [:jhao]

Reporter

Updated

•

9 years ago

Attachment #8606155 - Attachment is obsolete: true

Jonathan Hao (inactive) [:jhao]

Reporter

Comment 11

•

9 years ago

Attached patch Performance Mark API - V1 (obsolete) — Details — Splinter Review

Hi Eli,

I wanted to reuse the code of the User Timing API, but I found it hard to do so.
Currently, I didn't reuse any of them. I just tried to make it produce similar logs so that Raptor can parse them.

Could you recommend someone that I could ask feedback or review from?
I've talked to Thinker's team, but they weren't sure who's in charge with this either.

I'm also not really sure which folder this piece of code should be in.

P.S. This patch is essentially the same as attachment 8607420 [details] [diff] [review]. Some redundant lines are removed.

Thanks!

Attachment #8607420 - Attachment is obsolete: true

Flags: needinfo?(eperelman)

:Eli Perelman

Comment 12

•

9 years ago

Hey :baku, do you have any thoughts on this?

Flags: needinfo?(eperelman) → needinfo?(amarchesini)

Andrea Marchesini [:baku]

Comment 13

•

9 years ago

:jhao, I can review your code, but, going quickly through your patch, I see that you have done sometime similar to what I did for bug 1155761. Basically I implemented a PerformanceBase class that can be used in workers and main-thread in order to share the basic methods for the Performance Timing API.

Hopefully my patch should land to m-i this week.
Wondering if my patch can be useful and we can merge your code with mine somehow.

Flags: needinfo?(amarchesini) → needinfo?(jhao)

Jonathan Hao (inactive) [:jhao]

Reporter

Comment 14

•

9 years ago

Hi Andrea,

It looks very useful. I think it would be great to merge our code.

Thanks a lot!

Depends on: 1155761

Flags: needinfo?(jhao)

Bobby Chien

Updated

•

9 years ago

tracking-b2g: --- → +

Keywords: perf

Whiteboard: [perf-wanted]

Jonathan Hao (inactive) [:jhao]

Reporter

Comment 15

•

9 years ago

Hi Andrea,

Now that bug 1155761 is resolved, I want to make my class inherit PerformanceBase to avoid duplicated code, but there's one problem.

This assertion occured in PerformanceEntry and PerformanceMeasure:
MOZ_ASSERT(mParent || !NS_IsMainThread());

It forbids us to use Performance.Mark() in main thread, since we don't have mParent.
We wish to be able to use Performance.Mark() in every thread.

How do you suggest we deal with this problem?
Thanks very much!

Flags: needinfo?(amarchesini)

Andrea Marchesini [:baku]

Comment 16

•

9 years ago

> This assertion occured in PerformanceEntry and PerformanceMeasure:
> MOZ_ASSERT(mParent || !NS_IsMainThread());

mParent is just use for GetParentObject(). Change this assertion to something that covers your usecase.

Flags: needinfo?(amarchesini)

Jonathan Hao (inactive) [:jhao]

Reporter

Comment 17

•

9 years ago

Attached patch Reuse PerformanceBase (obsolete) — Details — Splinter Review

In this patch I tried to reuse PerformanceBase. It works fine most of the time.
But, when I try to test it by adding a mark in MediaDecoderStateMachine, it will crash and keeps reopening b2g process.
It crashes in NS_CycleCollectorSuspect3. The crash reason is that the decoder thread didn't start any cycle collector.

Jonathan Hao (inactive) [:jhao]

Reporter

Comment 18

•

9 years ago

Attached patch Testing on MediaDecoderStateMachine (obsolete) — Details — Splinter Review

This patch is to test on MediaDecoderStateMachine.

Jonathan Hao (inactive) [:jhao]

Reporter

Comment 19

•

9 years ago

Attached patch Presentation Mark API V2 (reuse PerformanceBase) (obsolete) — Details — Splinter Review

fixed a problem, but the situation is still the same

Attachment #8617188 - Attachment is obsolete: true

Jonathan Hao (inactive) [:jhao]

Reporter

Comment 20

•

9 years ago

I have confirmed with JW. When we called the function |Mark()|, it creates a PerformanceMark. JW said that anything with cycle collection must be created in the main thread (or perhaps worker thread).

So, if we still want to reuse code by inheriting PerformanceBase and be able to use it out of main thread, we'll need two kinds of PerformanceMarks, one with cycle collection and one without.

Hi :baku, do you think we should do that?

Flags: needinfo?(amarchesini)

Andrea Marchesini [:baku]

Comment 21

•

9 years ago

Yes, you are right. But you don't care about PerformanceEntry/PerformanceMark. Probably you can change the code so that we create PerformanceEntry only when needed and we store some non CCed data in the arrays of entries.

Flags: needinfo?(amarchesini)

Jonathan Hao (inactive) [:jhao]

Reporter

Comment 22

•

9 years ago

Attached patch Performance mark in Gecko (obsolete) — Details — Splinter Review

Hi :baku,
Could you review my patch?

Attachment #8607429 - Attachment is obsolete: true

Attachment #8607445 - Attachment is obsolete: true

Attachment #8617189 - Attachment is obsolete: true

Attachment #8617199 - Attachment is obsolete: true

Attachment #8624153 - Flags: review?(amarchesini)

Andrea Marchesini [:baku]

Comment 23

•

9 years ago

Comment on attachment 8624153 [details] [diff] [review]
Performance mark in Gecko

Review of attachment 8624153 [details] [diff] [review]:
-----------------------------------------------------------------

You cannot remove the CC from PerformanceEntry.
What I meant was something like:

1. instead having mUserEntries create something like:

struct UserEntry {
  nsRefPtr<PerformanceEntry> mEntry; // Initially null
  EntryType mType; // This can be just Mark and Meausre
  nsString mName;
  DOMHighResTimeStamp mStartTime;
  DOMHighResTimeStamp mDuration;
}

nsTArray<UserEntry> mUserEntries;

So, basically InsertUserEntry() adds a new data struct without creating a UserEntry.mEntry object.
Then in GetEntry* methods (not exposed in your nsIPerformance idl) you call:

PerformanceEntry* GetOrCreatePerformanceEntry(UserEntry& aEntry);

that creates mEntry in case it's null using mType, mName, mStartTime and mDuration.
How does it sound?

Attachment #8624153 - Flags: review?(amarchesini) → review-

Jonathan Hao (inactive) [:jhao]

Reporter

Comment 24

•

9 years ago

Attached patch Performance Mark API (V3) (obsolete) — Details — Splinter Review

This patch can be compiled and pass the test "test_performance_user_timing_html" in my computer.

I am writing a chrome test as :baku suggested in Whistler.

Attachment #8624153 - Attachment is obsolete: true

Jonathan Hao (inactive) [:jhao]

Reporter

Comment 25

•

9 years ago

Attached patch Chrome test (obsolete) — Details — Splinter Review

Hi :baku,

I wrote this test, but when I ran it, it produced this error:

7 INFO TEST-UNEXPECTED-FAIL | dom/base/test/chrome/test_bug1159639.xul | uncaught exception - NS_ERROR_XPC_CANT_CREATE_WN: Component returned failure code: 0x80570019 (NS_ERROR_XPC_CANT_CREATE_WN) [nsIJSCID.getService] at chrome://mochitests/content/chrome/dom/base/test/chrome/test_bug1159639.xul:30

NS_ERROR_XPC_CANT_CREATE_WN seems to come from https://dxr.mozilla.org/mozilla-central/source/js/xpconnect/src/XPCJSID.cpp?from=XPCJSID.cpp&case=true#696 but I don't really understand why. And I can getService without problems in C++ code.

Do you have any idea what the reason is?

Thanks!

Attachment #8628287 - Flags: feedback?(amarchesini)

Andrea Marchesini [:baku]

Comment 26

•

9 years ago

Jonathan, sorry for the delay. I hope to give you an answer for this week!

Jonathan Hao (inactive) [:jhao]

Reporter

Comment 27

•

9 years ago

Attached patch Performance Mark API (V4) — Details — Splinter Review

The previous patch is out-of-date. This one is rebased.
I tried xpcshell-test, and it still crashed when I tried to getService.

I wonder if I exposed the XPCOM module wrong.

Since baku is away until 11/8, I'll have to find someone familiar with XPCOM to help.

Attachment #8628102 - Attachment is obsolete: true

Attachment #8628287 - Attachment is obsolete: true

Attachment #8628287 - Flags: feedback?(amarchesini)

Bobby Chien

Updated

•

9 years ago

Status: NEW → ASSIGNED

Bobby Chien

Updated

•

9 years ago

Priority: -- → P1

Jonathan Hao (inactive) [:jhao]

Reporter

Comment 28

•

8 years ago

Unassign myself since I'm not actively working on Firefox OS anymore.

Assignee: jhao → nobody

Status: ASSIGNED → NEW

BMO Automation

Comment 29

•

6 years ago

Firefox OS is not being worked on

Status: NEW → RESOLVED

Closed: 6 years ago

Resolution: --- → WONTFIX

You need to log in before you can comment on or make changes to this bug.