Closed Bug 1175527 Opened 6 years ago Closed 4 years ago

Intermittent test_SpecialPowersPushPermissions.html | expired permission should be removed! - Result logged after SimpleTest.finish()

Categories

(Testing :: Mochitest, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: cbook, Assigned: chunmin)

References

()

Details

(Keywords: intermittent-failure)

Attachments

(1 file, 5 obsolete files)

mozilla-inbound_ubuntu64_vm_armv7_mobile_test-mochitest-16

https://treeherder.mozilla.org/logviewer.html#?job_id=10818645&repo=mozilla-inbound

23:51:10 INFO - 256 INFO TEST-UNEXPECTED-FAIL | testing/mochitest/tests/Harness_sanity/test_SpecialPowersPushPermissions.html | expired permission should be removed! - Result logged after SimpleTest.finish()
See Also: → 1175263
Assignee: nobody → cchang
Well, I can't reproduce it on tryserver now
[1] https://treeherder.mozilla.org/#/jobs?repo=try&revision=a05e46f5dfa0
Any news on this and bug 1175263? I'm likely to skip the test otherwise soon given the lack of activity.
Flags: needinfo?(cchang)
Martijn might be able to help here too.
Hi, Ryan,
I am working on other bugs now. If others want to take these bugs, it will be a good news to me.
Flags: needinfo?(cchang)
Joel, can you please help find an owner for these failures?
Flags: needinfo?(jmaher)
:chunmin, we have a lot of failure instances here which need to be addressed.  Luckily this appears to be on android 4.3 only- that is something we should investigate to see why this is failing intermittently on android and not elsewhere.  Could you take a bit of time and look through some logs or push to try and see if you can come up with more information?
Flags: needinfo?(jmaher) → needinfo?(cchang)
Hi, Joel, 
I'd like to take this bug. However, I might not fix it quickly because my prime task now is red-tai project.
Flags: needinfo?(cchang)
Disabled on Android.

https://hg.mozilla.org/integration/fx-team/rev/9fa21d1f6e5c
Whiteboard: [test disabled on Android][leave open]
Hi, Joel,
I found that the PR_Now() used in nsPermissionManager is bigger than the time value got by Date.now() on Android 4.3 API11(200~600ms actually). The problem is very similar to the situation
bug 1141415 comment 44 mentioned: "PR_Now() used in nsPermissionManager is 15~17ms less than the time value got by Date.now() on Windows".

We can simply adjust the check time point for some specific platforms like what we do in bug 1141415[1]. However, if we take this approach, we need to modify this test file every time when there is another platform whose PR_Now() is out of sync from Date.now().

I hope we can have a reliable solution to fix this timing issue. I reconsider the approach that mentioned in bug 1141415 comment 47: "To make sure the time base is same, one way is to add a readonly attribute named systemTimestamp and use it to get system time instead of Date.now(). Nevertheless, it's unnatural to use systemTimestamp instead of Date.now()." What do you think?


[1] https://dxr.mozilla.org/mozilla-central/source/testing/mochitest/tests/Harness_sanity/test_SpecialPowersPushPermissions.html#197
Flags: needinfo?(jmaher)
I am fine adjusting the time for android- your point about new platforms is interesting and valid.  This got me thinking about what other odd test issues/product issues we might be hitting while depending on Date.now().

Maybe one thing here is to fix this for android in the same way we did for Windows, then file a bug to investigate other approaches for timestamps and determine what features and tests might be impacted (there are 1688 instances according to dxr: https://dxr.mozilla.org/mozilla-central/search?q=Date.now%28%29&redirect=true&case=false&limit=51&offset=451)
Flags: needinfo?(jmaher)
Hi, Joel,
I add some time buffer to the test file for android and win 32 platform in this patch. From the previous test, The PR_now() called by nsPermissionManager on android-api-11 platforms is sometimes 100ms~600ms larger than Date().now() called by the test file itself. However, I couldn't re-produce this error again recently, so I am not sure that this issue is caused by a long-delay-function-call or others.

In this patch, I use two ways to avoid this timing issues.
1) Make sure permission-expiration is larger than the time-diff between Date.now() and PR_NOW() on Android-API-11 platform. Thus, I change permission-expiration from 500ms into 750ms(750 > 600).
2) Skip the test on Android-API-11 when the error "unexpired permission should be allowed!"(The js test file expects the permission is still in its expiration but the nsPermissonManager think the permission is out-of-expiration.) happens.

Which method would you prefer?
Attachment #8652147 - Flags: feedback?(jmaher)
I would prefer to increase the expiration time, maybe 900ms just to be safe?
Attached patch extend permission expiration (obsolete) — Splinter Review
Hi, Joel,
Thx for your feedback! Could you review this patch when you're available?
Attachment #8652147 - Attachment is obsolete: true
Attachment #8652147 - Flags: feedback?(jmaher)
Attachment #8652211 - Flags: review?(jmaher)
Comment on attachment 8652211 [details] [diff] [review]
extend permission expiration

Review of attachment 8652211 [details] [diff] [review]:
-----------------------------------------------------------------

I would like to figure out if we can simplify this given the longer expiration period.

::: testing/mochitest/tests/Harness_sanity/test_SpecialPowersPushPermissions.html
@@ +202,3 @@
>    if (SpecialPowers.testPermission('pEXPIRE', ALLOW_ACTION, document) &&
> +      (getPlatformInfo().platform == "Win32") &&
> +      (now < start + PERIOD + getPlatformInfo().timeCompensation)) {

is this still necessary, we are not compensating for android specifically, quite possibly we don't need to compensate for windows now as well.
Attachment #8652211 - Flags: review?(jmaher) → review-
(In reply to Joel Maher (:jmaher) from comment #80)
Well, it's still necessary to add time compensation for win32 in this case. 

The PR_Now() called by nsPermissionManager in win32 is 15ms slower than Date.now(). On the contrary, PR_Now() called by nsPermissionManager in Android-api-11 sometimes 600ms faster than Date.now(). 

For example, if we require permission X at Data.now()=100 with period 500, then the expired time will be 600 and the PR_NOW() in different platforms will be:

platform       | PR_Now()    | expired | permission period
---------------+-------------+---------+-------------------
Linux          | 100         | 600     | 600-100=500
---------------+-------------+---------+-------------------
Win32          | 100-15=85   | 600     | 600-85=515
---------------+-------------+---------+-------------------
Android-api-11 | 100+600=700 | 600     | 600-700=-100
---------------+-------------+---------+-------------------

In the current test file, we will test permission upon the permission we request is approved(the permission should be valid) and then test it again after the expiration passed(the permission should be invalid). Suppose we spent 1ms to do it, then the PR_Now() will be 101, 86, 701 respectively in Linux, Win32, Android-api-11. When we test the permission at this moment:

Linux          : 100+1     =101 < 600 -> valid
Win32          : 100-15+1  =86  < 600 -> valid
Andoird-api-11 : 100+600+1 =701 > 600 -> invalid => That's why we need to change period to a longer one.


On the other hand, if permission period is 900 and we require it at Data.now()=100, then the expired time will be 1000 and the PR_NOW() in different platforms will be:

platform       | PR_Now()    | expired | permission period
---------------+-------------+---------+-------------------
Linux          | 100         | 1000    | 1000-100=900
---------------+-------------+---------+-------------------
Win32          | 100-15=85   | 1000    | 1000-85=915
---------------+-------------+---------+-------------------
Android-api-11 | 100+600=700 | 1000    | 1000-700=300
---------------+-------------+---------+-------------------

When we test the permission in the first round, the results will be
Linux          : 101 < 1000 -> valid
Win32          : 86  < 1000 -> valid
Andoird-api-11 : 701 > 1000 -> valid

In the second round, we will test permission at 1000ms(after the expiration passed),
Linux          : 1000+1     = 1001 > 1000 -> invalid
Win32          : 1000-15+1  = 986  < 1000 -> valid => That's why we still need to compensate the time
Andoird-api-11 : 1000+600+1 = 1601 > 1000 -> invalid
(In reply to Chun-Min Chang[:chunmin] from comment #81)
or...I can adjust the permission period depends on platform we use and get rid of adding time compensation in the second round test
Attachment #8652211 - Attachment is obsolete: true
Attachment #8653312 - Flags: review?(jmaher)
Comment on attachment 8653312 [details] [diff] [review]
adjust permission period depends on used platform

Review of attachment 8653312 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for the explanation in detail.  Lets get this patch in!
Attachment #8653312 - Flags: review?(jmaher) → review+
Format to hg patch and carry r+
Try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=9e264e531d87
Attachment #8653312 - Attachment is obsolete: true
Attachment #8653864 - Flags: review+
Whiteboard: [test disabled on Android][leave open] → [test disabled on Android][leave open][checkin-needed]
Windows XP is permafail in your run.
https://treeherder.mozilla.org/logviewer.html#?job_id=10850840&repo=try

Also, checkin-needed is a bug keyword, please use that instead of the whiteboard.
Whiteboard: [test disabled on Android][leave open][checkin-needed] → [test disabled on Android][leave open]
Delay the check time point after permission is out of period
Attachment #8653864 - Attachment is obsolete: true
Attachment #8656410 - Flags: review+
(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #86)
Hi, I can't reproduce this error but I try avoiding this error by adjusting the check time. Could you help me to test it again?
Keywords: checkin-needed
(In reply to Chun-Min Chang[:chunmin] from comment #93)
> (In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #86)
> Hi, I can't reproduce this error but I try avoiding this error by adjusting
> the check time. Could you help me to test it again?

i guess you meant need-info instead of checkin-neeed, so changing
Flags: needinfo?(ryanvm)
Keywords: checkin-needed
(In reply to Carsten Book [:Tomcat] from comment #94)
> (In reply to Chun-Min Chang[:chunmin] from comment #93)
> > (In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #86)
> > Hi, I can't reproduce this error but I try avoiding this error by adjusting
> > the check time. Could you help me to test it again?
> 
> i guess you meant need-info instead of checkin-neeed, so changing
Thx you for your help. Originally, I want to land this patch if it pass the test.
(In reply to Chun-Min Chang[:chunmin] from comment #93)
> (In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #86)
> Hi, I can't reproduce this error but I try avoiding this error by adjusting
> the check time. Could you help me to test it again?

Push it to Try again?
Flags: needinfo?(ryanvm)
(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #96)
I pushed the previous patch to try and run Mochitest Mochitest M(5) on Windows XP opt multiple times but this error didn't happen, so I can only try adjusting some part that may cause the problem.
I don't understand what you're saying. The Try push in comment 85 clearly shows XP M5 being permafail. You've pushed later patches and the failures have gone away? That's a good thing :). Can you provide a link to those runs? All I have to go off is what you've put in the bug.
(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #98)
> I don't understand what you're saying. The Try push in comment 85 clearly
> shows XP M5 being permafail. You've pushed later patches and the failures
> have gone away? That's a good thing :). Can you provide a link to those
> runs? All I have to go off is what you've put in the bug.
no. Instead of the new one, I pushed the old patches to try. However, this error didn't happen. This is a timing issue and it doesn't happen every time or on every device. OK, I can just push the old patches to try again.
Add a delay to make sure that the permission expired check will be triggered after permission is out-of-period
try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=7c1ad4edbcb9
Attachment #8656410 - Attachment is obsolete: true
Attachment #8658524 - Flags: review+
(In reply to Chun-Min Chang[:chunmin] from comment #100)
> Add a delay to make sure that the permission expired check will be triggered
> after permission is out-of-period
> try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=7c1ad4edbcb9
Hi, Ryan,
I updated the patch to avoid the permafail on Windows XP which was mentioned at comment 86. Could you help me to land it?
Keywords: checkin-needed
No longer blocks: conn_priority
Can't accurately set the target milestone since this was fixed for 43.
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Whiteboard: [test disabled on Android][leave open]
You need to log in before you can comment on or make changes to this bug.