sending NotifyIMEFocus IPC messages can jank the browser

NEW
Assigned to

Status

()

Core
DOM
P2
normal
5 months ago
8 hours ago

People

(Reporter: froydnj, Assigned: freesamael)

Tracking

(Blocks: 4 bugs)

Trunk
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox55 affected)

Details

(Whiteboard: [qf:p1])

MozReview Requests

()

Submitter Diff Changes Open Issues Last Updated
Loading...
Error loading review requests:

Attachments

(4 attachments)

(Reporter)

Description

5 months ago
https://perfht.ml/2n457Ak shows a profile which has multiple Msg_NotifyIMEFocus hangs in it (~3s, ~1.5s, ~600ms).  Unfortunately, the stacks are complete garbage, for reasons that I do not understand--I think it has something to do with pending updates, though.  Is this a known sore spot for IPC messages?
Flags: needinfo?(ehsan)
It's in the telemetry data but I don't think we have a bug on file for it, thanks for filing it.

The stacks are really surprising.  Markus, have you seen something like this?  We may wanna spin off the stack walking issue int a separate bug...
Blocks: 1331674
Flags: needinfo?(ehsan) → needinfo?(mstange)
I haven't seen this particular variant before, and I'm puzzled by it. Nathan, if you see this again, please file a bug.
Flags: needinfo?(mstange)
Masayuki, please take a look at this or delegate it when you have time. Thank you.
Flags: needinfo?(masayuki)
Yeah, I should do this. But I have an urgent bugs, one is a security bug and the others are improving performance issue of IMEContentObserver. I could take this after them.

However, I'm not sure making it async works without any regressions because user may type soon after moving focus, then, IME state may mismatch with the visual result.
Flags: needinfo?(masayuki)

Updated

5 months ago
Priority: -- → P2
Is this profile helpful? https://perfht.ml/2oIdimx
It seems to have correct stacks.
Whiteboard: [qf]
Whiteboard: [qf] → [qf:p1]
Blocks: 1355519

Comment 6

4 months ago
(In reply to Masayuki Nakano [:masayuki] (JST, +0900) (PTO: 5/3-5/5) from comment #4)
> Yeah, I should do this. But I have an urgent bugs, one is a security bug and
> the others are improving performance issue of IMEContentObserver. I could
> take this after them.
> 
> However, I'm not sure making it async works without any regressions because
> user may type soon after moving focus, then, IME state may mismatch with the
> visual result.

Hi Masayuki,

This bug seems to be clear enough for me to help you out. 

Regarding the mismatch issue, do you think if it would help to have a flag
to indicate if the result of "SendNotifyIMEFocus" has asynchronously
come out? If the result hasn't out, we then (unfortunately) call a 
sync function [1] to get the result and ignore the one on the way.

[1] The sync function may have to be something like "idempotent" (?).
    That is, calling SendNotifyIMEFocusAsync followed by SendNotifyIMEFocusSync
    should be equivalent to calling only SendNotifyIMEFocusAsync only.
Flags: needinfo?(masayuki)

Updated

4 months ago
Assignee: nobody → hchang
Comment hidden (mozreview-request)

Comment 8

4 months ago
I submitted a patch which addresses this issue in two aspects:

1) Add PBrowser.AsyncNotifyIMEFocus, which is totally identical to 
   PBrowser.NotifyIMEFocus except that it returns the result
   asynchronously.

2) In order to avoid the mismatch issue, the result (mIMENotificationRequestsOfParent)
   will be obtained with PBrowser.NotifyIMEFocus again if the previous
   PBrowser.AsyncNotifyIMEFocus hasn't return the result.

Masayuki, 

Could you give me some feedback about my patch on the reviewboard?
I tested my patch basically and it seemed working fine. I've quickly
changed the editable focus and there seemed be no mismatch. But, I
am not sure if there is still anything that I wasn't aware since I am
not very familiar with this function.

Thanks :)
Blocks: 1362068
I'm still thinking about bug, though, perhaps, the fix of bug 1217700 must help this bug.

After bug 1217700, IMEContentObserver started to observe any changes in focused editor. When it receives an change, it checks if the widget needs which notifications at that timing. So, perhaps, now, the IPC message doesn't need to be sync anymore.

When PuppetWidget::NotifyIMEOfFocusChange() is called, PuppetWidget should request all notifications until it receives actual notification requests of parent process. If parent process receives unnecessary notifications, it should just ignore them.
Flags: needinfo?(masayuki)
Comment hidden (mozreview-request)

Updated

3 months ago
Attachment #8864779 - Flags: review?(masayuki)
(In reply to Masayuki Nakano [:masayuki] (JST, +0900) (PTO: 5/3-5/5) from comment #9)
> I'm still thinking about bug, though, perhaps, the fix of bug 1217700 must
> help this bug.
> 
> After bug 1217700, IMEContentObserver started to observe any changes in
> focused editor. When it receives an change, it checks if the widget needs
> which notifications at that timing. So, perhaps, now, the IPC message
> doesn't need to be sync anymore.
> 
> When PuppetWidget::NotifyIMEOfFocusChange() is called, PuppetWidget should
> request all notifications until it receives actual notification requests of
> parent process. If parent process receives unnecessary notifications, it
> should just ignore them.

Hi Masayuki-san,

If I am not misunderstanding your explanation, the patch I just asked for your 
review is supposed to work perfectly without the mismatch issue you mentioned.
Could you please help review that patch? Do you think if I should add any test
case to ensure the change doesn't screw up anything?

Thanks!

Comment 12

3 months ago
mozreview-review
Comment on attachment 8864779 [details]
Bug 1349255 - Make PBrowser.NotifyIMEFocus async to avoid UI jank.

https://reviewboard.mozilla.org/r/136444/#review140458

I'd like to check next version, r- until then.

::: dom/ipc/PBrowser.ipdl:228
(Diff revision 2)
> -    nested(inside_cpow) sync NotifyIMEFocus(ContentCache contentCache,
> +    async NotifyIMEFocus(ContentCache contentCache,
> -                                            IMENotification notification)
> +                              IMENotification notification)

nit: fix the indent if it's not a bug of reviewboard.

::: widget/PuppetWidget.cpp:847
(Diff revision 2)
>    } else {
>      // When IME loses focus, we don't need to store anything.
>      mContentCache.Clear();
>    }
>  
>    mIMENotificationRequestsOfParent = IMENotificationRequests();

mIMENotificationRequestsOfParent should request all notifications.

I think that IMENotificationRequests::Notifications should have NOTIFY_ALL whose value is (NOTIFY_TEXT_CHANGE | NOTIFY_POSITION_CHANGE | NOTIFY_MOUSE_BUTTON_EVENT_ON_CHAR). (NOTIFY_DURING_DEACTIVE isn't necessary even on Windows, perhaps.) And it should set to mIMENotificationRequestsOfParent before calling SendNotifyIMEFocus(). Then, it's guaranteed that all notifications are sent to parent process until receiving actual requests.
Attachment #8864779 - Flags: review?(masayuki) → review-
Comment hidden (mozreview-request)

Comment 14

3 months ago
mozreview-review
Comment on attachment 8864779 [details]
Bug 1349255 - Make PBrowser.NotifyIMEFocus async to avoid UI jank.

https://reviewboard.mozilla.org/r/136444/#review140576

::: widget/PuppetWidget.cpp:847
(Diff revision 3)
> -  mIMENotificationRequestsOfParent = IMENotificationRequests();
> -  if (!mTabChild->SendNotifyIMEFocus(mContentCache, aIMENotification,
> -                                     &mIMENotificationRequestsOfParent)) {
> +  auto NOTIFY_ALL = IMENotificationRequests::NOTIFY_TEXT_CHANGE |
> +                    IMENotificationRequests::NOTIFY_POSITION_CHANGE |
> +                    IMENotificationRequests::NOTIFY_MOUSE_BUTTON_EVENT_ON_CHAR;

No, I meant that you should add NOTIFY_ALL to the enum (declared in IMEData.h). Then, when the others add new notifications, they will modify NOTIFY_ALL too (perhaps).
Attachment #8864779 - Flags: review?(masayuki) → review-
(In reply to Masayuki Nakano [:masayuki] (JST, +0900) (PTO: 5/3-5/5) from comment #14)
> Comment on attachment 8864779 [details]
> Bug 1349255 - Make PBrowser.NotifyIMEFocus async to avoid UI jank.
> 
> https://reviewboard.mozilla.org/r/136444/#review140576
> 
> ::: widget/PuppetWidget.cpp:847
> (Diff revision 3)
> > -  mIMENotificationRequestsOfParent = IMENotificationRequests();
> > -  if (!mTabChild->SendNotifyIMEFocus(mContentCache, aIMENotification,
> > -                                     &mIMENotificationRequestsOfParent)) {
> > +  auto NOTIFY_ALL = IMENotificationRequests::NOTIFY_TEXT_CHANGE |
> > +                    IMENotificationRequests::NOTIFY_POSITION_CHANGE |
> > +                    IMENotificationRequests::NOTIFY_MOUSE_BUTTON_EVENT_ON_CHAR;
> 
> No, I meant that you should add NOTIFY_ALL to the enum (declared in
> IMEData.h). Then, when the others add new notifications, they will modify
> NOTIFY_ALL too (perhaps).

Got it! I'll do it tomorrow.

BTW, I just saw a bunch of testing failures from the try result 
and about to drop the review :p

Thanks :)
(In reply to Masayuki Nakano [:masayuki] (JST, +0900) (PTO: 5/3-5/5) from comment #14)
> Comment on attachment 8864779 [details]
> Bug 1349255 - Make PBrowser.NotifyIMEFocus async to avoid UI jank.
> 
> https://reviewboard.mozilla.org/r/136444/#review140576
> 
> ::: widget/PuppetWidget.cpp:847
> (Diff revision 3)
> > -  mIMENotificationRequestsOfParent = IMENotificationRequests();
> > -  if (!mTabChild->SendNotifyIMEFocus(mContentCache, aIMENotification,
> > -                                     &mIMENotificationRequestsOfParent)) {
> > +  auto NOTIFY_ALL = IMENotificationRequests::NOTIFY_TEXT_CHANGE |
> > +                    IMENotificationRequests::NOTIFY_POSITION_CHANGE |
> > +                    IMENotificationRequests::NOTIFY_MOUSE_BUTTON_EVENT_ON_CHAR;
> 
> No, I meant that you should add NOTIFY_ALL to the enum (declared in
> IMEData.h). Then, when the others add new notifications, they will modify
> NOTIFY_ALL too (perhaps).

By the way, do you think if we can just set NOTIFY_ALL to 0xFF so that
we would never miss any newly added notifications?
Flags: needinfo?(masayuki)
I don't think so. Like NOTIFY_DURING_DEACTIVE, new flag could have different meaning. (It requests notifications even during the window deactivated, that means that composition should not be committed at inactivating the window, but on some platforms, it should be committed due to native IME behavior's limitation.)
Flags: needinfo?(masayuki)
Comment hidden (mozreview-request)
(In reply to Masayuki Nakano [:masayuki] (JST, +0900) (PTO: 5/3-5/5) from comment #17)
> I don't think so. Like NOTIFY_DURING_DEACTIVE, new flag could have different
> meaning. (It requests notifications even during the window deactivated, that
> means that composition should not be committed at inactivating the window,
> but on some platforms, it should be committed due to native IME behavior's
> limitation.)

Does that mean NOTIFY_ALL may end up being platform specific?

Sorry I am super unfamiliar with IME stuff and am reading [1].
Besides, do you think if we should update that document when
this bug is done?

Thanks :)

[1] https://developer.mozilla.org/en-US/docs/Mozilla/IME_handling_guide
(In reply to Henry Chang [:hchang] from comment #19)
> (In reply to Masayuki Nakano [:masayuki] (JST, +0900) (PTO: 5/3-5/5) from
> comment #17)
> > I don't think so. Like NOTIFY_DURING_DEACTIVE, new flag could have different
> > meaning. (It requests notifications even during the window deactivated, that
> > means that composition should not be committed at inactivating the window,
> > but on some platforms, it should be committed due to native IME behavior's
> > limitation.)
> 
> Does that mean NOTIFY_ALL may end up being platform specific?

Could be, but perhaps, it's not necessary. So, I think that just ignoring NOTIFY_DURING_DEACTIVE is enough for now. (I'll test it after landing your patch.)

> Sorry I am super unfamiliar with IME stuff and am reading [1].
> Besides, do you think if we should update that document when
> this bug is done?
> 
> Thanks :)
> 
> [1] https://developer.mozilla.org/en-US/docs/Mozilla/IME_handling_guide

Ah, the document is already old. I'll update it when I have much time, don't worry.

Comment 21

3 months ago
mozreview-review
Comment on attachment 8864779 [details]
Bug 1349255 - Make PBrowser.NotifyIMEFocus async to avoid UI jank.

https://reviewboard.mozilla.org/r/136444/#review141932

Oops, sorry, I forgot to check your review request. Thank you for your work!
Attachment #8864779 - Flags: review?(masayuki) → review+

Updated

3 months ago
Depends on: 1364857
The patch itself and the test cases look good but they suffer from
the bug where the PuppetWidget might receive the async IPDL Promise
reply even if the actor (mTabChild) has been destroyed. This is the 
fundamental bug of IPDL async returned Promise and I will try to solve
that in bug 1364857.
No longer depends on: 1364857

Updated

3 months ago
Depends on: 1364857
Comment hidden (mozreview-request)
Even thought bug 1364857 has been fixed, I am still stuck in 

http://searchfox.org/mozilla-central/rev/24c443a440104dabd9447608bd41b8766e8fc2f5/xpcom/threads/MozPromise.h#444

where the runnable for Reject() (on the sender side) will never be executed 
in the shutdown phase for unknown reason. If for some reason the runnable 
cannot be executed, the Promise will never be released. Need to investigate 
what's actually going on behind the scenes. Probably mRespondTarget has
been destroyed to reject new dispatched runnable or something like that.
I am guessing this might happen in the shut down phase.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=450bb9593f4a
https://treeherder.mozilla.org/#/jobs?repo=try&revision=44c40c9182c0
https://treeherder.mozilla.org/#/jobs?repo=try&revision=3df6f0b2d0ee
It turns out I ended up failing to solve the following issue:

"GECKO(45210) | ###!!! [Child][DispatchAsyncMessage] Error: PBrowser::Reply_NotifyIMEFocus Route error: message sent to unknown actor ID"

What I expected is

TabParent                           | TabChild
----------------------------------------------------------------------
- RecvNotifyIMEFocus                  |
    NotifyIMEFocusPromise->Resolve    | 
      Reply_NotifyIMEFocus()          |
                                      | Received "Reply_NotifyIMEFocus"
- Destroy()                           |  
    SendDestroy()                     |
                                      | RecvSendDestroy()
                                      |


What it actually is

TabParent                             | TabChild
----------------------------------------------------------------------
- RecvNotifyIMEFocus                  |
    NotifyIMEFocusPromise->Resolve    | 
                                      |
                                      | 
- Destroy()                           |  
    SendDestroy()                     |
                                      | RecvSendDestroy()
  !!!Reply_NotifyIMEFocus()           | 
                                      | Received "Reply_NotifyIMEFocus"

As you can see, even though we didn't *asynchronously* resolve the promise,
it's resolver is still be run asynchronously (because of [1])

I can either 

1) Give up using async IPDL promise
2) Instead of handing over the promise to the handler, we ask the sender
   to set a resolver/rejecter callback, which will be *synchronously* 
   executed. This approach has to be carefully designed to not leave us
   unable to reply the result asynchronously.

Kanru, what do you think of option 2? That might require the IPDL layer
on the receiver side to maintain a list of pending Promise as we do on
the sender side.


[1] http://searchfox.org/mozilla-central/rev/35b37316149108d53a02fb8498e250ea8a22ab5d/xpcom/threads/MozPromise.h#444
Flags: needinfo?(kchen)
(In reply to Henry Chang [:hchang] from comment #28)
> It turns out I ended up failing to solve the following issue:
> 
> "GECKO(45210) | ###!!! [Child][DispatchAsyncMessage] Error:
> PBrowser::Reply_NotifyIMEFocus Route error: message sent to unknown actor ID"
> 
> What I expected is
> 
> TabParent                           | TabChild
> ----------------------------------------------------------------------
> - RecvNotifyIMEFocus                  |
>     NotifyIMEFocusPromise->Resolve    | 
>       Reply_NotifyIMEFocus()          |
>                                       | Received "Reply_NotifyIMEFocus"
> - Destroy()                           |  
>     SendDestroy()                     |
>                                       | RecvSendDestroy()
>                                       |
> 
> 
> What it actually is
> 
> TabParent                             | TabChild
> ----------------------------------------------------------------------
> - RecvNotifyIMEFocus                  |
>     NotifyIMEFocusPromise->Resolve    | 
>                                       |
>                                       | 
> - Destroy()                           |  
>     SendDestroy()                     |
>                                       | RecvSendDestroy()
>   !!!Reply_NotifyIMEFocus()           | 
>                                       | Received "Reply_NotifyIMEFocus"
> 
> As you can see, even though we didn't *asynchronously* resolve the promise,
> it's resolver is still be run asynchronously (because of [1])
> 
> I can either 
> 
> 1) Give up using async IPDL promise
> 2) Instead of handing over the promise to the handler, we ask the sender
>    to set a resolver/rejecter callback, which will be *synchronously* 
>    executed. This approach has to be carefully designed to not leave us
>    unable to reply the result asynchronously.
> 
> Kanru, what do you think of option 2? That might require the IPDL layer
> on the receiver side to maintain a list of pending Promise as we do on
> the sender side.
> 
> 
> [1]
> http://searchfox.org/mozilla-central/rev/
> 35b37316149108d53a02fb8498e250ea8a22ab5d/xpcom/threads/MozPromise.h#444

It's expected that resolving a MozPromise will need a trip to the event queue. If MozPromise had microtask semantic then this should work as expected.

I think for now for safety we can require the Actor to inherit from SupportsWeakPtr so we can check if the Actor is still alive.
Flags: needinfo?(kchen)
Blocks: 1339557
(This was showing up in one of the inferno speedometer v2 profiles I was looking at, for example see https://perfht.ml/2qkezAq)
Created attachment 8871388 [details] [diff] [review]
0001-WIP-Do-not-send-reply-if-actor-is-dead.patch

Henry, could you try this patch?

https://treeherder.mozilla.org/#/jobs?repo=try&revision=4516f86851eee4c5449ce5ea0b17fc66e95dd9ec
Attachment #8871388 - Flags: feedback?(hchang)
(In reply to Kan-Ru Chen [:kanru] (UTC+8) from comment #31)
> Created attachment 8871388 [details] [diff] [review]
> 0001-WIP-Do-not-send-reply-if-actor-is-dead.patch
> 
> Henry, could you try this patch?
> 
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=4516f86851eee4c5449ce5ea0b17fc66e95dd9ec

Hmm, this is not useful because SendDestroy does not actually destroy the Actor.
I'm writing a patch to make the resolver promise a resolver function and remove the ability to reject. I think it actually makes the recv function looks better.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=844785c986679443d0ae04ec5477cbc71e08983f
https://treeherder.mozilla.org/#/jobs?repo=try&revision=60c5f14b6395
https://treeherder.mozilla.org/#/jobs?repo=try&revision=12acd78ed6e7
Depends on: 1368031
https://treeherder.mozilla.org/#/jobs?repo=try&revision=2e1522d0aad4
Comment hidden (mozreview-request)

Updated

3 months ago
Blocks: 1366960
Some updates lately:

1) After I landed bug 1364857, I still got a bunch of try failures due to
   what I mentioned in comment 23. So, I decided not to use Promise but
   a pair of async Notify/Reply IPC messages.

2) However, there's at least one test case (browser_master_password_autocomplete.js)
   which permanently fails. The error message is like: [1]

[task 2017-06-07T06:29:02.292567Z] 06:29:02     INFO - Console message: [JavaScript Error: "1496816612127	Browser.Experiments.Experiments	ERROR	Experiments #0::httpGetRequest::onLoad() - Request to http://127.0.0.1:8888/experiments-dummy/manifest returned status 404" {file: "resource://gre/modules/Log.jsm" line: 748}]
[task 2017-06-07T06:29:02.295813Z] 06:29:02     INFO - App_append@resource://gre/modules/Log.jsm:748:9
[task 2017-06-07T06:29:02.297729Z] 06:29:02     INFO - log@resource://gre/modules/Log.jsm:386:7
[task 2017-06-07T06:29:02.300138Z] 06:29:02     INFO - getLoggerWithMessagePrefix/proxy.log@resource://gre/modules/Log.jsm:501:44
[task 2017-06-07T06:29:02.302777Z] 06:29:02     INFO - Experiments.Experiments/this._log.log@resource:///modules/experiments/Experiments.jsm:320:5
[task 2017-06-07T06:29:02.304637Z] 06:29:02     INFO - error@resource://gre/modules/Log.jsm:394:5
[task 2017-06-07T06:29:02.307061Z] 06:29:02     INFO - _httpGetRequest/</xhr.onload@resource:///modules/experiments/Experiments.jsm:951:11
[task 2017-06-07T06:29:02.310178Z] 06:29:02     INFO - openModalWindow@jar:file:///home/worker/workspace/build/application/firefox/omni.ja!/components/nsPrompter.js:363:5
[task 2017-06-07T06:29:02.312132Z] 06:29:02     INFO - openPrompt@jar:file:///home/worker/workspace/build/application/firefox/omni.ja!/components/nsPrompter.js:553:9
[task 2017-06-07T06:29:02.314393Z] 06:29:02     INFO - nsIPrompt_promptPassword@jar:file:///home/worker/workspace/build/application/firefox/omni.ja!/components/nsPrompter.js:771:9
[task 2017-06-07T06:29:02.317562Z] 06:29:02     INFO - promptPassword@jar:file:///home/worker/workspace/build/application/firefox/omni.ja!/components/nsPrompter.js:584:20
[task 2017-06-07T06:29:02.319712Z] 06:29:02     INFO - decrypt@jar:file:///home/worker/workspace/build/application/firefox/omni.ja!/components/crypto-SDR.js:119:20
[task 2017-06-07T06:29:02.322513Z] 06:29:02     INFO - _decryptLogins@jar:file:///home/worker/workspace/build/application/firefox/omni.ja!/components/storage-json.js:492:26
[task 2017-06-07T06:29:02.326262Z] 06:29:02     INFO - searchLogins@jar:file:///home/worker/workspace/build/application/firefox/omni.ja!/components/storage-json.js:267:14
[task 2017-06-07T06:29:02.328568Z] 06:29:02     INFO - searchLogins@jar:file:///home/worker/workspace/build/application/firefox/omni.ja!/components/nsLoginManager.js:411:12
[task 2017-06-07T06:29:02.331091Z] 06:29:02     INFO - searchLoginsWithObject@resource://gre/modules/LoginHelper.jsm:171:12
[task 2017-06-07T06:29:02.333214Z] 06:29:02     INFO - _searchAndDedupeLogins@resource://gre/modules/LoginManagerParent.jsm:45:16
[task 2017-06-07T06:29:02.336535Z] 06:29:02     INFO - doAutocompleteSearch@resource://gre/modules/LoginManagerParent.jsm:282:16
[task 2017-06-07T06:29:02.338526Z] 06:29:02     INFO - receiveMessage@resource://gre/modules/LoginManagerParent.jsm:123:9
[task 2017-06-07T06:29:02.342133Z] 06:29:02     INFO - receiveMessage@jar:file:///home/worker/workspace/build/application/firefox/browser/omni.ja!/components/nsBrowserGlue.js:194:15

which is a "404 Not Found" error while loading http://127.0.0.1:8888/experiments-dummy/manifest
I wonder if it's because the child sent unexpected event to parent
in the state. (e.g. after Destroy()'ed)

Hi Masayuki san,

Do you have any suggestion for me to debug this kind of error?
I have been stumbled by this for a couple of days and still
have no clue :( The new approach (the Notify/Reply) has been
submitted to the reviewboard. 

Thanks!


[1] https://public-artifacts.taskcluster.net/Y5lZrzoXQhKN5Sgt3DDEYg/0/public/logs/live_backing.log
Flags: needinfo?(masayuki)
(In reply to Henry Chang [:hchang] from comment #39)
> Some updates lately:
> 
> 1) After I landed bug 1364857, I still got a bunch of try failures due to
>    what I mentioned in comment 23. So, I decided not to use Promise but
>    a pair of async Notify/Reply IPC messages.

I'm interested to know the difference between a manual Notify/Reply message and the IPC promise approach. In theory they should behave the same because the IPC promise is implemented by a pair of async messages. Did bug 1368031 change anything?
I also think they makes no difference :p I was just trying to isolate any potential
issues first. But sadly I have still been stuck in the try test case failure for
about a week...
Sorry for the delay to reply. But I have no idea. I'm not familiar with it.

Let's ask it to the original author of the test.

johannh:

Do you have any ideas about the 404 error mentioned in comment 39?
Flags: needinfo?(masayuki) → needinfo?(jhofmann)
I don't think the 404 error is the source of failure here. That seems like random noise to me (someone should look into this). 

This test is a bit flaky with timing (it's hard to get right, it's testing a timing issue). The test uses .focus() in a time-sensitive way (has to occur faster than 3s). Could that be a problem? You could try increasing these two values to find out if this is about the focus not happening fast enough https://searchfox.org/mozilla-central/source/toolkit/components/passwordmgr/test/browser/browser_master_password_autocomplete.js#34,49

Feel free to hit me up on IRC to figure this out.
Flags: needinfo?(jhofmann)
https://treeherder.mozilla.org/#/jobs?repo=try&revision=dc5739852bf3
(In reply to Johann Hofmann [:johannh] from comment #43)
> I don't think the 404 error is the source of failure here. That seems like
> random noise to me (someone should look into this). 
> 
> This test is a bit flaky with timing (it's hard to get right, it's testing a
> timing issue). The test uses .focus() in a time-sensitive way (has to occur
> faster than 3s). Could that be a problem? You could try increasing these two
> values to find out if this is about the focus not happening fast enough
> https://searchfox.org/mozilla-central/source/toolkit/components/passwordmgr/
> test/browser/browser_master_password_autocomplete.js#34,49
> 
> Feel free to hit me up on IRC to figure this out.

Hi Johnna,

Thanks for your information  :)

I tried have longer timeouts but they are still permanent failures. So I would
think it might be some other fundamental bugs. I am trying to find clues from
other test failures. (yes there is not just this very test failure ^^")

Masayuki san,

I found in TabParent we might need to ignore unexpected events from child [1].
Not sure if they are crucial enough to lead to such many testing failures [2].
What do you think? I have kicked off a try build to see if my findings matters.

Thanks!


[1] http://searchfox.org/mozilla-central/source/dom/ipc/TabParent.cpp#1711,1727
[2] https://treeherder.mozilla.org/#/jobs?repo=try&revision=dc5739852bf3

Updated

2 months ago
Flags: needinfo?(masayuki)
> I found in TabParent we might need to ignore unexpected events from child [1].

Ah, yes. With your patch, actual request and notifications may mismatch. So, please remove the check.

> Not sure if they are crucial enough to lead to such many testing failures [2].

Looks like that these oranges are high-frequent oranges (I also have some trouble with these oranges now). How about to rebase and retry?
Flags: needinfo?(masayuki)

Comment 47

2 months ago
Hmm, does bug 1375491 end up changing the timing here too?
https://treeherder.mozilla.org/#/jobs?repo=try&revision=00313bee6ff0
While I'm investigating bug 1377672, I realized that IMEStateManager in the main process doesn't handle delayed notifications and requests to IME properly. So, it might cause the odd oranges.
(In reply to Masayuki Nakano [:masayuki] (JST, +0900) from comment #49)
> While I'm investigating bug 1377672, I realized that IMEStateManager in the
> main process doesn't handle delayed notifications and requests to IME
> properly. So, it might cause the odd oranges.

Hi Masayuki san,

So do have any suggestion for me? Have that issue been fixed by bug 1377672 or
still a thing? Thanks!
Flags: needinfo?(masayuki)
I meant that those issues which I found were already fixed by bug 1377672.
Flags: needinfo?(masayuki)
(In reply to Masayuki Nakano [:masayuki] (JST, +0900) from comment #51)
> I meant that those issues which I found were already fixed by bug 1377672.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=8c0a2adef6683000d98e4d5a68f9317754903248

Have triggered a new try run but those failures are still present.
(plus new failures ^^") 

BTW, according to https://bugzilla.mozilla.org/show_bug.cgi?id=1377672#c41
the intermittent failure that bug 1377672 is trying to resolve is
still very frequent. Am I missing something?
Hi Henry, any updates here?  This is still showing up in Speedometer profiles, it would be nice if we can fix it soon.  Thanks!
(In reply to :Ehsan Akhgari (needinfo please, extremely long backlog) from comment #53)
> Hi Henry, any updates here?  This is still showing up in Speedometer
> profiles, it would be nice if we can fix it soon.  Thanks!

I still cannot solve the try-only failures :(
(even on top of Masayuki san's fix in bug 1377672)

I might have to investigate IME handling in e10s mode much deeper.
(maybe start from https://developer.mozilla.org/en-US/docs/Mozilla/IME_handling_guide)
(In reply to Henry Chang [:hchang] from comment #54)
> I might have to investigate IME handling in e10s mode much deeper.
> (maybe start from
> https://developer.mozilla.org/en-US/docs/Mozilla/IME_handling_guide)

That is perhaps not updated for the latest design, but it must help you to understand the idea of e10s support.
(Assignee)

Comment 56

9 days ago
I'll try to continue the investigation on top of Henry's effort.
Assignee: hchang → sawang
(In reply to Samael Wang [:freesamael] from comment #56)
> I'll try to continue the investigation on top of Henry's effort.

Thank you for taking this, Samael!
(Assignee)

Comment 58

a day ago
Some updates about the failed test case browser_master_password_autocomplete.js:

While I haven't figured out why the visible symptom is that 2 prompts left on screen [1], I've found the main reason the test case fails is that click [2] doesn't make the prompt closed immediately. nsGlobalWindow::ReallyCloseWindow of the 1st prompt can happen after the focus ContentTask [3] gets resolved, and it's because FinalClose [4] would make a runnable scheduled twice before execution, and Henry's patch has made focus() on input elements not being blocked by NotifyIMEFocus. 

The point of master password canceling time being recorded is after the prompt actually closed [5], and since the 4s waiting can start before previous prompt being closed, it easily gets failed on slow machines or on try server (I can easily reproduce it when using --debugger=rr).

I tried a simple fix by waiting for "DOMModalDialogClosed" event, and that runs much better on try.
5 runs with original test case:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=c167ec8ce6be321e18f8ff1a48cd8c158aaef08e&selectedJob=123488572
20 runs with "DOMModalDialogClosed" fix:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=b145f79b915221415500f541b3c7b741ec5c7b02&group_state=expanded

I'm inclined to make the test case even less time-sensitive and hopefully prevent intermittent failures. I'm still verifying the patch:
https://hg.mozilla.org/try/comparison/278ca9dad347/toolkit/components/passwordmgr/test/browser/browser_master_password_autocomplete.js

[1] https://public-artifacts.taskcluster.net/F_5bfXqiQDynIn62aOyl5g/0/public/test_info//mozilla-test-fail-screenshot_c2NfDM.png
[2] http://searchfox.org/mozilla-central/rev/5ce320a16f6998dc2b36e30458131d029d107077/toolkit/components/passwordmgr/test/browser/browser_master_password_autocomplete.js#12
[3] http://searchfox.org/mozilla-central/rev/5ce320a16f6998dc2b36e30458131d029d107077/toolkit/components/passwordmgr/test/browser/browser_master_password_autocomplete.js#43-46
[4] http://searchfox.org/mozilla-central/rev/13148faaa91a1c823a7d68563d9995480e714979/dom/base/nsGlobalWindow.cpp#9444-9447
[5] http://searchfox.org/mozilla-central/rev/5ce320a16f6998dc2b36e30458131d029d107077/toolkit/components/passwordmgr/LoginManagerParent.jsm#55
(Assignee)

Comment 59

16 hours ago
(In reply to Samael Wang [:freesamael] from comment #58)
> Some updates about the failed test case
> browser_master_password_autocomplete.js:
> 
> While I haven't figured out why the visible symptom is that 2 prompts left
> on screen

FWIW, this is what rr told me:

1. The 1st `await dialogShown` clicked cancel button, which eventually called nsCloseEvent::PostCloseEvent.
2. nsCloseEvent::Run executed, which PostCloseEvent again.
3. TabParent::RecvNotifyIMEFocus got called. Probably because the prompt still exists, it fell to !IsSameProcess(aTabParent, sActiveTabParent) [1] and got ignored.
4. Received message "content-task:complete".
5. Received message "RemoteLogins:autoCompleteLogins", which triggered another promt.

The eventual js stack:

(rr) call js::DumpBacktrace((JSContext*)0x7f1db41c8000)
#0   7f1d7d558190 i   file:///home/freesamael/Repos/build/obj-firefox-icecc-disable-optimize/dist/bin/components/nsPrompter.js:363 (7f1da7f9d848 @ 186)
#1   7f1d7d5580b8 i   file:///home/freesamael/Repos/build/obj-firefox-icecc-disable-optimize/dist/bin/components/nsPrompter.js:573 (7f1da7f9de38 @ 525)
#2   7f1d7d558020 i   file:///home/freesamael/Repos/build/obj-firefox-icecc-disable-optimize/dist/bin/components/nsPrompter.js:791 (7f1da7fa2550 @ 142)
#3   7f1dad7ebf38 i   file:///home/freesamael/Repos/build/obj-firefox-icecc-disable-optimize/dist/bin/components/nsPrompter.js:604 (7f1da7fa2090 @ 56)
#4   7f1dad7ebe78 i   file:///home/freesamael/Repos/build/obj-firefox-icecc-disable-optimize/dist/bin/components/crypto-SDR.js:119 (7f1d908a7d08 @ 89)
#5   7f1dad7ebdb0 i   file:///home/freesamael/Repos/build/obj-firefox-icecc-disable-optimize/dist/bin/components/storage-json.js:492 (7f1d908a7718 @ 130)
#6   7f1dad7ebce8 i   file:///home/freesamael/Repos/build/obj-firefox-icecc-disable-optimize/dist/bin/components/storage-json.js:267 (7f1d9089ff68 @ 435)
#7   7f1dad7ebc58 i   file:///home/freesamael/Repos/build/obj-firefox-icecc-disable-optimize/dist/bin/components/nsLoginManager.js:409 (7f1d90885a10 @ 219)
#8   7f1dad7ebbb8 i   resource://gre/modules/LoginHelper.jsm:171 (7f1d90892aa8 @ 45)
#9   7f1dad7ebb00 i   resource://gre/modules/LoginManagerParent.jsm:45 (7f1d9c9e8680 @ 71)
#10   7f1dad7eb9f8 i   resource://gre/modules/LoginManagerParent.jsm:282 (7f1d9c9e8a10 @ 475)
#11   7f1dad7eb930 i   resource://gre/modules/LoginManagerParent.jsm:123 (7f1d9c9e87b0 @ 396)
#12   7ffcf7eb94c0 b   file:///home/freesamael/Repos/build/obj-firefox-icecc-disable-optimize/dist/bin/browser/components/nsBrowserGlue.js:182 (7f1dab21e420 @ 120)
#13   7f1dad7eb868 i   self-hosted:995 (7f1da3797a10 @ 407)
#14   7f1dad7eb7c0 i   file:///home/freesamael/Repos/build/obj-firefox-icecc-disable-optimize/dist/bin/components/nsPrompter.js:363 (7f1da7f9d848 @ 186)
#15   7f1dad7eb6e8 i   file:///home/freesamael/Repos/build/obj-firefox-icecc-disable-optimize/dist/bin/components/nsPrompter.js:573 (7f1da7f9de38 @ 525)
#16   7f1dad7eb650 i   file:///home/freesamael/Repos/build/obj-firefox-icecc-disable-optimize/dist/bin/components/nsPrompter.js:791 (7f1da7fa2550 @ 142)
#17   7f1dad7eb5b8 i   file:///home/freesamael/Repos/build/obj-firefox-icecc-disable-optimize/dist/bin/components/nsPrompter.js:604 (7f1da7fa2090 @ 56)
#18   7f1dad7eb4f8 i   file:///home/freesamael/Repos/build/obj-firefox-icecc-disable-optimize/dist/bin/components/crypto-SDR.js:119 (7f1d908a7d08 @ 89)
#19   7f1dad7eb430 i   file:///home/freesamael/Repos/build/obj-firefox-icecc-disable-optimize/dist/bin/components/storage-json.js:492 (7f1d908a7718 @ 130)
#20   7f1dad7eb368 i   file:///home/freesamael/Repos/build/obj-firefox-icecc-disable-optimize/dist/bin/components/storage-json.js:267 (7f1d9089ff68 @ 435)
#21   7f1dad7eb2d8 i   file:///home/freesamael/Repos/build/obj-firefox-icecc-disable-optimize/dist/bin/components/nsLoginManager.js:409 (7f1d90885a10 @ 219)
#22   7f1dad7eb238 i   resource://gre/modules/LoginHelper.jsm:171 (7f1d90892aa8 @ 45)
#23   7f1dad7eb180 i   resource://gre/modules/LoginManagerParent.jsm:45 (7f1d9c9e8680 @ 71)
#24   7f1dad7eb0f0 i   resource://gre/modules/LoginManagerParent.jsm:234 (7f1d9c9e88e0 @ 564)
#25   7f1dad7eb020 i   self-hosted:1187 (7f1d909e02f0 @ 173)

[1] http://searchfox.org/mozilla-central/rev/e8c36327cd8c9432c69e5e1383156a74330f11f2/dom/events/IMEStateManager.cpp#1567
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
(Assignee)

Updated

16 hours ago
Attachment #8898722 - Flags: review?(masayuki)
Attachment #8898723 - Flags: review?(MattN+bmo)
(Assignee)

Comment 62

16 hours ago
Hi Masayuki,

I'm taking over the bug from Henry, but since MozReview wouldn't let me just "inherit" the review, I have to ask for your review again. The patch is exactly the same as Henry's one.
Comment on attachment 8898723 [details]
Bug 1349255 - Wait for DOMModalDialogClosed after clicked cancel button.

https://reviewboard.mozilla.org/r/170138/#review175344

Thanks
Attachment #8898723 - Flags: review?(MattN+bmo) → review+
You need to log in before you can comment on or make changes to this bug.