Closed Bug 843309 Opened 11 years ago Closed 11 years ago

paymentSuccess sometimes does not appear in scope (payment window does not close)

Categories

(Firefox OS Graveyard :: Gaia::System, defect)

defect
Not set
normal

Tracking

(blocking-b2g:tef+, firefox20 wontfix, firefox21 wontfix, firefox22 fixed, b2g18 fixed, b2g18-v1.0.0 wontfix, b2g18-v1.0.1 fixed)

VERIFIED FIXED
B2G C4 (2jan on)
blocking-b2g tef+
Tracking Status
firefox20 --- wontfix
firefox21 --- wontfix
firefox22 --- fixed
b2g18 --- fixed
b2g18-v1.0.0 --- wontfix
b2g18-v1.0.1 --- fixed

People

(Reporter: kumar, Assigned: ferjm)

References

Details

(Whiteboard: [qa-] QARegressExclude)

Attachments

(3 files, 1 obsolete file)

See https://wiki.mozilla.org/WebAPI/WebPaymentProvider#Completion for info

Somtimes paymentSuccess is undefined at the end of the webpay flow. This doesn't happen all the time but I'm filing this bug to track it.
This is similar to bug 841521 in that this behavior only happens the first time you make a purchase. The second time it works.
Component: Payments/Refunds → Gaia::System
Product: Marketplace → Boot2Gecko
Version: 1.2 → unspecified
Blocks: PayId-v1next
No longer blocks: marketplace-payments
No idea how to take this bug forward - this is too vague. Can you clarify?

In some cases paymentSuccess isn't defined because the app process was killed during the execution of the payment process. So you can't always guarantee it'll exist.
Flags: needinfo?(kumar.mcmillan)
The STR:
- Open Marketplace (dev/stage)
- Pay for Private Yacht
- Sign in / create/enter PIN
- Confirm phone number on Bango screen
- Click Buy Private Yacht

Expected: at this point, the buy flow can be closed so the WebPay HTML page can call paymentSuccess()

Actual: paymentSuccess() is an undefined global. The code polls, waiting for it, but it never appears.
Flags: needinfo?(kumar.mcmillan)
Summary: paymentSuccess sometimes does not appear in scope → paymentSuccess sometimes does not appear in scope (payment window does not close)
Fernando - Any thoughts?
Flags: needinfo?(ferjmoreno)
Kumar can you include a logcat here?
The only potential idea I have that this could happen is the fact that trusted UI runs in the system app process, while the originating app runs in a separate app process. So there's a definite possibility that we could end up killing the originating app while the trusted UI was open, which would cause this issue.

Adding Justin for input to see what he thinks.
btw, this is how paymentSuccess becomes a global: https://mxr.mozilla.org/mozilla-central/source/b2g/chrome/content/payment.js#96

I don't understand what the global 'content' is in this exact context. It seems to work most of the time ;)
If you can include logcat and adb shell dmesg, that will make it clear whether something is crashing.

> I don't understand what the global 'content' is in this exact context. It seems to work most of the 
> time ;)

This is a "frame script": It's a script we inject to run "inside" a particular iframe, which may or may not be cross-process with respect to the injector.  |content| is the content window inside that iframe -- that is, |content| is what iframe.contentWindow would return if the iframe was not remote.  (If the iframe is cross-process, contentWindow is null.)
Attached file b2g desktop log
Here is a log from b2g desktop where I was able to reproduce it. I'll get one from a phone. 

The last log message is when the HTML loads within the Trusted UI iframe. The HTML page polls, waiting for paymentSuccess to be defined, but it never happens.
If you can reproduce this on desktop b2g, I'm almost positive multiprocess is not involved, because I'm pretty sure desktop b2g runs in one process.  Also, you're probably not OOMing on your desktop.
I should ask - can you reproduce this on device?
Logcat analysis shows a lot of:

XXX FIXME : Got a mozContentEvent: undefined

I've seen the Got mozContentEvent FIXME thing before as a benign message, but it usually includes something defined. The undefined piece seems off.
tef? for similar reasons as bug 841521
blocking-b2g: --- → tef?
Actually, I'm going to wait on the nom until I get more info here.
blocking-b2g: tef? → ---
Fernando, are you able to look into this?  It looks like an issue in this code.
(In reply to Jason Smith [:jsmith] from comment #11)
> I should ask - can you reproduce this on device?

Yeah, I've seen this happen on device before but since I use desktop more for day-to-day dev I see it there more. On yesterday's build I tried a couple times on device but couldn't catch it in action.

One thing worth noting: the code *always* does not see paymentSuccess defined when it first loads. It waits for it to appear in a hacky polling loop: https://github.com/mozilla/webpay/blob/master/media/js/pay/pay.js#L80 

For example, I can see on my latest device logs that it polled twice then found it defined.

Maybe that is a clue that the timing is off somewhere?
Alright, sounds like this now appropriate to nom now that we know this reproduces on device.
blocking-b2g: --- → tef?
(In reply to Kumar McMillan [:kumar] from comment #16)

> One thing worth noting: the code *always* does not see paymentSuccess
> defined when it first loads. It waits for it to appear in a hacky polling
> loop: https://github.com/mozilla/webpay/blob/master/media/js/pay/pay.js#L80 
> 
> For example, I can see on my latest device logs that it polled twice then
> found it defined.
> 
> Maybe that is a clue that the timing is off somewhere?

We wait for the DOMContentLoaded event before injecting the functions in the payment flow [1]. So that might be the reason why you are not seeing 'paymentSuccess' defined when you first check. Anyway, it should be injected at some point... I'll take a look at this.

[1] https://mxr.mozilla.org/mozilla-central/source/b2g/chrome/content/payment.js#95
Assignee: nobody → ferjmoreno
Flags: needinfo?(ferjmoreno)
> We wait for the DOMContentLoaded event before injecting the functions in the payment flow 

Wouldn't it be better to inject these functions much earlier, so that pages cannot observe the lack of these methods?  That would match how every other DOM API works.

There's probably a way to stick them on the outer window so you never have to re-inject them.  (In fact, that may be what the code is doing already.)
Attached patch Potential fixSplinter Review
Unfortunately I am not able to reproduce this issue either. I'm testing in the device with the latest b2g18 and the latest gaia master.

However, I am attaching a potential fix based on Justin's suggestion.

Kumar, would it be possible to check if you can reproduce this issue with this patches applied? Thanks!
Comment on attachment 722322 [details] [diff] [review]
Potential fix

This is fine for testing, but before we check it in, we should ask smaug whether it's necessary to listen to this event using the system event listener service (as in BrowserElementChildPreload.js), or whether what you have is fine.  I'm not sure.

I don't think the difference between the system event listener service and this should make a difference for testing.
Attachment #722326 - Attachment is obsolete: true
The attached Gaia PR avoids repeatedly sending a mozContentEvent requesting the re-injection of the frameScript in the payment flow.

Note that even if these two patches don't fix this bug we should probably merge them anyway in a separate bug since this seems to be an improvement of the way that we inject the functions in the payment flow.
OS: Mac OS X → All
Hardware: x86 → All
Olli, can we have your opinion about Justin's comment above (comment 21), please? Thanks! :)
(In reply to Jason Smith [:jsmith] from comment #13)
> tef? for similar reasons as bug 841521

bug 841521 is tracking-b2g18? so setting this bug to match
blocking-b2g: tef? → ---
tracking-b2g18: --- → ?
See https://bugzilla.mozilla.org/show_bug.cgi?id=841521#c48 for the justification of re-noming for tef? 

We probably don't want a person's first payment experience to be broken.
blocking-b2g: --- → tef?
tracking-b2g18: ? → ---
(In reply to Lukas Blakk [:lsblakk] from comment #27)
> See https://bugzilla.mozilla.org/show_bug.cgi?id=841521#c48 for the
> justification of re-noming for tef? 
> 
> We probably don't want a person's first payment experience to be broken.

This one is a tad worse than bug 841521 mainly cause the payment flow gets stuck at the end. There's also enough analysis to be able to prove the bug is happening with a potential fix. So the tef nom I think is okay on this bug, but I'm not sure the rationale in the other bug argues for it to block.
(In reply to Jason Smith [:jsmith] from comment #28)
> (In reply to Lukas Blakk [:lsblakk] from comment #27)
> > See https://bugzilla.mozilla.org/show_bug.cgi?id=841521#c48 for the
> > justification of re-noming for tef? 
> > 
> > We probably don't want a person's first payment experience to be broken.
> 
> This one is a tad worse than bug 841521 mainly cause the payment flow gets
> stuck at the end. There's also enough analysis to be able to prove the bug
> is happening with a potential fix. So the tef nom I think is okay on this
> bug, but I'm not sure the rationale in the other bug argues for it to block.

Were you able to reproduce this issue? I couldn't. Not in desktop and, most important, not in the device.

Also the attached logcat does not show anything interesting. I am pretty sure that the "XXX FIXME : Got a mozContentEvent: undefined" is not related to this issue.

The attached patch is an improvement of the way we inject the functions in the payment flow but it doesn't mean that it fix this issue. Since I can't reproduce it, I can't check any fix.
(In reply to Fernando Jiménez Moreno [:ferjm] from comment #29)
> (In reply to Jason Smith [:jsmith] from comment #28)
> > (In reply to Lukas Blakk [:lsblakk] from comment #27)
> > > See https://bugzilla.mozilla.org/show_bug.cgi?id=841521#c48 for the
> > > justification of re-noming for tef? 
> > > 
> > > We probably don't want a person's first payment experience to be broken.
> > 
> > This one is a tad worse than bug 841521 mainly cause the payment flow gets
> > stuck at the end. There's also enough analysis to be able to prove the bug
> > is happening with a potential fix. So the tef nom I think is okay on this
> > bug, but I'm not sure the rationale in the other bug argues for it to block.
> 
> Were you able to reproduce this issue? I couldn't. Not in desktop and, most
> important, not in the device.
> 
> Also the attached logcat does not show anything interesting. I am pretty
> sure that the "XXX FIXME : Got a mozContentEvent: undefined" is not related
> to this issue.
> 
> The attached patch is an improvement of the way we inject the functions in
> the payment flow but it doesn't mean that it fix this issue. Since I can't
> reproduce it, I can't check any fix.

I haven't managed to reproduce this bug myself. I'll talk with Krupa and see if I can mess around with the real payment flow to get a reproduction.
(In reply to Fernando Jiménez Moreno [:ferjm] from comment #29)

> Also the attached logcat does not show anything interesting. I am pretty
> sure that the "XXX FIXME : Got a mozContentEvent: undefined" is not related
> to this issue.

These messages come from identity API that doesn't set a type on its mozContentEvents (it's bad in general, we should fix that).
(In reply to Fabrice Desré [:fabrice] from comment #31)
> (In reply to Fernando Jiménez Moreno [:ferjm] from comment #29)
> 
> > Also the attached logcat does not show anything interesting. I am pretty
> > sure that the "XXX FIXME : Got a mozContentEvent: undefined" is not related
> > to this issue.
> 
> These messages come from identity API that doesn't set a type on its
> mozContentEvents (it's bad in general, we should fix that).

Sorry, I didn't know we were doing anything wrong there.  I've filed Bug 849877 to fix the identity mozContentEvents.
(In reply to Fabrice Desré [:fabrice] from comment #31)
> (In reply to Fernando Jiménez Moreno [:ferjm] from comment #29)
> 
> > Also the attached logcat does not show anything interesting. I am pretty
> > sure that the "XXX FIXME : Got a mozContentEvent: undefined" is not related
> > to this issue.
> 
> These messages come from identity API that doesn't set a type on its
> mozContentEvents (it's bad in general, we should fix that).

If I am not wrong, in this case the mozContentEvents are responses to specific mozChromeEvents that are identified by an id generated in the moment that the mozChromeEvents are sent, so the type could be avoided in this case, as the chrome side expects mozContentEvents with an specific id.

With the patches proposed for Bug 794999 we will force every mozContentEvent (as far as we use mozChromeEvent<->mozContentEvent communication) involved in a trusted UI flow to have a type anyway.

In any case, again, I really think that this is not related to the issue of this bug.
(In reply to Jed Parsons [:jparsons] from comment #32)

> Sorry, I didn't know we were doing anything wrong there.  I've filed Bug
> 849877 to fix the identity mozContentEvents.

No problem, and blame your reviewer ;)

(In reply to Fernando Jiménez Moreno [:ferjm] from comment #33)

> If I am not wrong, in this case the mozContentEvents are responses to
> specific mozChromeEvents that are identified by an id generated in the
> moment that the mozChromeEvents are sent, so the type could be avoided in
> this case, as the chrome side expects mozContentEvents with an specific id.

I'm not saying that this doesn't work (well, you could have id conflicts if several apis were doing that). All the mozChromeEvent <-> mozContentEvent mechanism is a poor replacement for a proper embedding API and I'd like these events to be more easily recognizable when we move to something else.

> 
> In any case, again, I really think that this is not related to the issue of
> this bug.

I agree.
(tef+ cause it's a 1 line patch and Moz folks want it in)
blocking-b2g: tef? → tef+
Kumar - Can you test the patch ferjm has here to see if it works for you?
Flags: needinfo?(kumar.mcmillan)
Yeah! This seems to be working. 

First, I removed the JS polling in the success screen to test. If paymentSuccess is ever undefined, this makes it fail immediately. I could reproduce this 'undefined' failure with b2g18/gaia. 

Next, I applied the chrome patch from Comment #20 and the gaia patch from Comment #22. I tried many many times and never hit the 'undefined' failure. This seems like it fixes the timing issue and guarantees that paymentSuccess is always defined. Thanks ferjm!
Flags: needinfo?(kumar.mcmillan)
Great! Thanks for your help Kumar!
Attachment #722325 - Flags: review?(fabrice)
Comment on attachment 722322 [details] [diff] [review]
Potential fix

Olli, can we have your opinion about Justin's comment above (comment 21), please? Thanks!
Attachment #722322 - Flags: review?(fabrice)
Attachment #722322 - Flags: feedback?(bugs)
Comment on attachment 722322 [details] [diff] [review]
Potential fix

Review of attachment 722322 [details] [diff] [review]:
-----------------------------------------------------------------

r=me conditionnal on smaug feedback
Attachment #722322 - Flags: review?(fabrice) → review+
Comment on attachment 722322 [details] [diff] [review]
Potential fix

This should be ok. DOMWindowCreated is dispatched to chrome only, so
content can't call stopPropagation()
Attachment #722322 - Flags: feedback?(bugs) → feedback+
Attachment #722325 - Flags: review?(fabrice) → review+
https://hg.mozilla.org/mozilla-central/rev/96af92fa87fd
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Whiteboard: [qa-]
https://hg.mozilla.org/releases/mozilla-b2g18/rev/0cd01841d22d
https://hg.mozilla.org/releases/mozilla-b2g18_v1_0_1/rev/b267b5ed7aea

Leaving the status flags set to affected for the v1-train and v1.0.1 uplifts.
Target Milestone: --- → B2G C4 (2jan on)
Ferjm, I'd love that you uplift the gaia part yourself here, as I don't really know how to test this works...
Flags: needinfo?(ferjmoreno)
v1-train: 046b5f94a3440f6f670867aae553eee3eca48e6b
Flags: needinfo?(ferjmoreno)
v1.0.1: 72af49be465f357d57977c948607ad93bdc4bcf0
please check that the bug is fixed on both branch, and the feature still works.

Thanks.
Keywords: verifyme
(In reply to Julien Wajsberg [:julienw] from comment #48)
> please check that the bug is fixed on both branch, and the feature still
> works.
> 
> Thanks.

I put qa- on this one originally (which means a decline on verification) because only Kumar can reproduce this bug consistently. I however could not. I can regression test around this, along with Krupa doing her regular payment testing.

What I think we should do here is just indicate to Kumar & Krupa to keep a look out for potential issues on the client side as they test payments with this patch. If something regresses, then we'll find out with new bugs filed. As for making sure this bug works, regular marketplace payment testing will end up by caveat watching out for this bug on a regular basis.

Make sense?
Keywords: verifyme
(In reply to Julien Wajsberg [:julienw] from comment #45)
> Ferjm, I'd love that you uplift the gaia part yourself here, as I don't
> really know how to test this works...

Sorry Julien! I didn't see your comment until now. Thanks for landing the patch! It works for me in both branches.
Cannot verify as this requires access to Payment Tools
Whiteboard: [qa-] → [qa-] QARegressExclude
Kumar - Just double check, is this working for you after the patch landed to fix this bug?
Flags: needinfo?(kumar.mcmillan)
I haven't seen it since the patch landed so it seems good. However, it was never 100% reproducible.
Flags: needinfo?(kumar.mcmillan)
(In reply to Kumar McMillan [:kumar] from comment #54)
> I haven't seen it since the patch landed so it seems good. However, it was
> never 100% reproducible.

Okay, I'll mark as verified then.
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: