Closed Bug 1009977 Opened 6 years ago Closed 6 years ago

[B2G][Clock] The seconds portion of the timer inconsistently has a different starting point

Categories

(Firefox OS Graveyard :: Gaia::Clock, defect)

ARM
Gonk (Firefox OS)
defect
Not set

Tracking

(blocking-b2g:1.4+, b2g-v1.4 verified, b2g-v2.0 verified)

VERIFIED FIXED
2.0 S2 (23may)
blocking-b2g 1.4+
Tracking Status
b2g-v1.4 --- verified
b2g-v2.0 --- verified

People

(Reporter: dharris, Assigned: mcav)

References

()

Details

(Keywords: regression, Whiteboard: [openc-1.4-exploratory])

Attachments

(3 files)

Attached file Logcat
Description:
The user will see the timer start at different times when having the same amount of time set for the timer.

Repro Steps:
1) Update a Open_C to BuildID: 20140513000208
2) Open Clock app > Tap on Timer
3) Set the timer to at least 1 minute
4) Tap start

Actual:
Timer starts at inconsitent times. Sometimes the seconds portion starts at 00, and sometimes starts at 59.

Expected: Timer always starts with the seconds at 00, as the user is unable to change the seconds

1.4 Environmental Variables:
Device: Open_C 1.4
BuildID: 20140513000208
Gaia: b40103dec34a147c9018a1af76eb21c3184f2f93
Gecko: c140bcd02b9b
Version: 30.0
Firmware Version: P821A10V1.0.0B06_LOG_DL

Keywords: Countdown, Intermittently, Inconsistency, Incorrect, Same, Hours, Seconds

Repro frequency: 75% 
Started with seconds at 00 11/50 times
Started with sceonds at 59 39/50 times

See attached: Logcat, Firewatch, Video - http://youtu.be/0J6z08vNNl4
This isue DOES occur on Buri 1.4

1.4 Environmental Variables:
Device: Buri 1.4 MOZ
BuildID: 20140508000201
Gaia: 4ce973ef0732b0d52cb043210db598aa176b2ce9
Gecko: 16ab7f6b18f8
Version: 30.0
Firmware Version: v1.2-device.cfg
I have also seen that if the timer starts with the seconds at 00, such as 1:00 vs 0:59, and then the user does + 1 Min the timer will only increase by 59 seconds. If however, the timer starts at 0:59 seconds, the timer will go up by a full minute.
Does this reproduce on 1.3?
Keywords: qawanted
This issue does NOT occur on Open C 1.3

1.3 Environmental Variables:
Device: Open_C 1.3
BuildID: 20140505052400
Gaia: Unknown Git commit; build date shown here.
Gecko:
Version: 28.0
Firmware Version: P821A10V1.0.0B06_LOG_DL
Keywords: qawanted
Can you be more specific about what exactly is seen on 1.3 right now?
Keywords: qawanted
On Open C 1.3 the timer will always start with the seconds at :00, such as 1:00, 5:00 etc. Which is the expected result.
Keywords: qawanted
I can reproduce this on Flame. I tried other timer settings with minutes and was able to reproduce this as well. Looks like we're randomly off by a second or two if minutes are included in the timer.
blocking-b2g: --- → 1.4?
Tested and issue occurred on Buri with today's master. Buri master channel will be used to find the regression window.
QA Contact: pcheng
The following regression window was found using regular RIL builds on Buri central. We don't have builds available for a narrower window. (I do see a Timer related change in the Gaia pushlog.)

b2g-inbound Regression Window:

Last Working Environmental Variables:
Device: Buri MOZ
BuildID: 20140116040206
Gaia: 82878ba16172213cd00ba3e8b377564b290e59c1
Gecko: 324e2cba1029
Version: 29.0a1
Firmware Version: v1.2-device.cfg

First Broken Environmental Variables:
Device: Buri MOZ
BuildID: 20140117041037
Gaia: ef8bb31b462f364b57432a0724c78034d3f4f303
Gecko: b53589696cf8
Version: 29.0a1
Firmware Version: v1.2-device.cfg

Last Working Gaia / First Broken Gecko: Issue Does NOT reproduce
Gaia: 82878ba16172213cd00ba3e8b377564b290e59c1
Gecko: b53589696cf8

Last Working Gecko / First Broken Gaia: Issue DOES reproduce
Gaia: 324e2cba1029
Gecko: ef8bb31b462f364b57432a0724c78034d3f4f303

Gaia Pushlog:
https://github.com/mozilla-b2g/gaia/compare/82878ba16172213cd00ba3e8b377564b290e59c1...ef8bb31b462f364b57432a0724c78034d3f4f303
Caused by bug 898354.

Marcus - Can you take a look?
Blocks: 898354
Flags: needinfo?(m)
Interesting in that when the timer does initially start at 1:00, it hangs there for an extra beat and the next update you see is for 0:58 rather than 0:59. Regardless of whether the initial display shows :59 or 1:00, it looks like the actual time elapsed before sounding the alarm is the same and is correct. 

I tested with 2 phones side by side -- Flame & Peak both running master -- and when one initially displays 1:00 and the other initially displays 0:59, they both tick to 0:58 at the same time.

I am unable to reproduce the +1 error reported in comment #3. 

Bumping forward to consider for 2.0, but we wouldn't block on this.
blocking-b2g: 1.4? → 2.0?
I don't agree with that. This makes the timer functionality appear broken & it's not minor, so I actually do think this is a blocker.
blocking-b2g: 2.0? → 1.4?
For the UX side, the expected behavior is that the clock should begin at 1:00, not :59. In addition, the countdown should NOT skip seconds and should tick down at the same rate (an actual second, presumably). This is basic stuff that we can make work, and I would block on this. We are under too much pressure for quality on basic things and I'd rather this not slip.
In addition: when you hit start, it should only show 1:00 for a split second, and after that every second = 1 second.
please retriage.  QA and UX both agree that this is a clear regression from 1.3 that is user-facing.  People have expectations of when they hit start, and what they see.   starting at 1:00 means 1:00, not :59.

Please retriage this for 1.4+.
I'm going to make a brief digression here and explain why there are at
least three "correct" ways to display the timer's countdown value.
While it may seem like elementary math to display a timer countdown,
there are multiple, equally valid ways to display the same countdown.

All of these scenarios represent a one minute timer. Remember that
this isn't just seconds counting down, there are milliseconds in play too.

1. ROUND UP: If you consider the time displayed to be "seconds rounded
   up", you'd see START=1:00 for a full second before you see 0:59.

2. ROUND DOWN: If you consider the time to be "seconds rounded down",
   you'd see "0:59" _immediately_ for a full second.

3. ROUND TO THE CLOSEST NUMBER: This is what iOS does. It displays
   "1:00" for half a second, followed by 0:59 for half a second, etc.
   -- in effect, the timer's display is a half-second off for the
   duration of the timer. But since you're rounding, at the end you
   see "0:01" for a full second and 0:00 for half a second.

EXAMPLE:

Real Time |   Up  | Down | Closest
----------+-------+------+---------
 0:59.999   1:00    0:59     1:00
 0:59.499   1:00    0:59     0:59
 0:58.999   0:59    0:58     0:59
    ...
 0:01.999   0:02    0:01     0:02
 0:01.499   0:02    0:01     0:01
 0:00.999   0:01    0:00     0:01
 0:00.499   0:01    0:00     0:00
 0:00.000   0:00    0:00     0:00

Obviously the behavior displayed is inconsistent, and I agree it
should be corrected. That said, there are multiple _correct_ ways to do it.

Per the UX comments in this thread, it appears that the track that
would cause most people to say "yeah that seems right" is #3, what iOS
does. You'll see 1:00 very briefly, followed by a stable countdown,
followed by 0:00 very briefly, before it finally fires.

When I go through and fix this, I'll take approach #3.

The prevailing wind seems to be in favor of giving out blocking flags
like candy. I personally wouldn't block on this -- what good is a
blocking flag if we just flag all the bugs? -- but that's not my
concern at the moment.

As long as you all understand and agree with the timing selection I've
described above, you can add whatever flags you want. It'll get
fixed imminently.
Assignee: nobody → m
Status: NEW → ASSIGNED
Flags: needinfo?(m)
Target Milestone: --- → 2.0 S2 (23may)
Correction: On #3 above, the second line should read "followed by 0:59 for a full second".
Attached file Pull Request
Patch attached. Previously, the timer was only rendered occasionally, presumably in an effort to save battery life. However, it did so unreliably (using setTimeout) and without rounding, which could cause the timer to appear to skip a number.

This patch uses requestAnimationFrame, to ensure the timer's display is precisely up to date, but maintains battery life savings by only reading/updating the DOM when necessary.
Attachment #8424214 - Flags: review?(gaye)
(In reply to Marcus Cavanaugh [:mcav] <mcav@mozilla.com> from comment #17)
> The prevailing wind seems to be in favor of giving out blocking flags
> like candy. I personally wouldn't block on this -- what good is a
> blocking flag if we just flag all the bugs? -- but that's not my
> concern at the moment.
> 
> As long as you all understand and agree with the timing selection I've
> described above, you can add whatever flags you want. It'll get
> fixed imminently.

The reason why there's push back going in the bug here to block is because:

1. This is broken user trust [a]
2. It's a confirmed regression.
3. High reproduction rate.
4. Front-facing feature of the clock in a basic use case.

[a] - a user sees a starting time they didn't expect & can possibly see the 2 second jump occur, which would lead someone to question whether they reliably can trust the count down of the timer.
(In reply to Jason Smith [:jsmith] from comment #20)

I agree that this breaks users' trust, and that it should be fixed, and under the wide definition of "blocking" we've been using as a project recently, it should block.

My disagreement stands on (irrelevant) philosophical grounds: I doubt we'd actually block a release for this. The term "blocking" is inconsistent with its actual usage in practice for B2G. Here, it really means "priority for a fix in this release". If it gets down to the wire, people will unblock small issues like this in order to ship. Sadly (to me), that's irrelevant, which is why I couldn't strongly argue against blocking in this case.

So, in our reality, where "blocking-b2g" means "priority-b2g-fix-asap", yes, it should block.
https://wiki.mozilla.org/B2G/Triage#Issues_that_Should_Block

Based on these guidelines this issue seems a blocker. Regression of a basic use case.
Although, according to those guidelines, "Polish and other minor issues" should not block, and I'd definitely consider this a case of polish rather than broken functionality, considering that the timer still fires correctly. It's not black and white.
(In reply to Marcus Cavanaugh [:mcav] <mcav@mozilla.com> from comment #23)
> Although, according to those guidelines, "Polish and other minor issues"
> should not block, and I'd definitely consider this a case of polish rather
> than broken functionality, considering that the timer still fires correctly.
> It's not black and white.

This is not polish. Polish is when you have existing functionality that already works as expected that with a small enhancement can improve the UX a bit. A minor issue is something like the following:

* Awkward transition
* Awkward button illumination
* Issue hidden off in the corner of the phone (e.g. screen reader regression)

This issue is a functional regression on a basic use case, as you are showing wrong information when the timer starts off any time minutes is used with the timer (i.e. bustage in smoketest of the timer). A clock is only useful if 1) it represents the true state of time 2) it actually shows the right time. [2] is being broken here as a regression, which makes this a clear cut and dry blocker.
This is incorrect feature behavior and a regression. It is not polish: the clock is plainly not working as expected. We need the basics to work as expected in order to deliver a quality product.
(In reply to Preeti Raghunath(:Preeti) from comment #22)
> https://wiki.mozilla.org/B2G/Triage#Issues_that_Should_Block
> 
> Based on these guidelines this issue seems a blocker. Regression of a basic
> use case.

This bug fulfills at least 2 of the following from the wiki page:

Issues that Should Block
* Major issue in new feature - especially those in which a large number of users will be impacted, or a smaller number of users will be significantly impacted
* Major identifiable regressions (perf or otherwise)

Lets please get this patch blocked and approved.  Happy to test it first if you have any risk concerns.
We'll have to agree to disagree. Yes, it is a regression, yes, it is misleading, but it is also minor, and not something that would in practice stop us from shipping a release.

Consider this: The timer in v1.3, for which this is regressing, _does not fire_ if the app terminates. (Tarako, anyone?) That's a horrible, horrible failure of a basic use case which makes Timer in 1.3 completely useless.

And yet we shipped it -- a horribly broken feature. If anyone migrates from 1.3 to 1.4, the fact that the timer fires _at all_ is a tremendous improvement.

Shipping a quality product is about more than just fixing every little glitch, it's about prioritizing features and making the tough compromises that come with shipping a complex product on time.

If we had actually blocked Firefox OS releases for bugs as minor as this one, we would have never shipped any release. Yes, this bug should be a priority. Yes, this bug is annoying and should be prioritized, fixed and approved for the next release.

But calling it "blocking"? The blocking flag is b2g's hammer, for which everything looks like a nail. You're all correct -- it's the only tool we have -- but let's not be blind to the fact that our tool is suboptimal at best, and wasteful of our limited resources at worst.
Attachment #8424214 - Flags: review?(gaye) → review?(mmedeiros)
(In reply to Marcus Cavanaugh [:mcav] <mcav@mozilla.com> from comment #27)
> We'll have to agree to disagree. Yes, it is a regression, yes, it is
> misleading, but it is also minor, and not something that would in practice
> stop us from shipping a release.

I don't think any other phone vendor in the world would ever ship with a timer that starts off wrong. If this was a timer was in a manufacturing process for certification, would they certify the hardware? I certainly don't think so - they would claim the hardware was defective & could not be shipped out. On that regard, I'm pretty sure none of our partners would be okay with shipping with this bug.

> 
> Consider this: The timer in v1.3, for which this is regressing, _does not
> fire_ if the app terminates. (Tarako, anyone?) That's a horrible, horrible
> failure of a basic use case which makes Timer in 1.3 completely useless.
> 
> And yet we shipped it -- a horribly broken feature. If anyone migrates from
> 1.3 to 1.4, the fact that the timer fires _at all_ is a tremendous
> improvement.

This is different. That's a feature, not a bug. This is a case where we're taking something that worked before & broke it in a followup release.

> 
> Shipping a quality product is about more than just fixing every little
> glitch, it's about prioritizing features and making the tough compromises
> that come with shipping a complex product on time.

Shipping a quality phone requires shipping a high quality device actually. We're not feature driven primarily, we're date driven to hit a quality bar in order to allow our partners to allow the device to be shipped. Features are always dropped if they impact the schedule & quality bar requirements.

> 
> If we had actually blocked Firefox OS releases for bugs as minor as this
> one, we would have never shipped any release. Yes, this bug should be a
> priority. Yes, this bug is annoying and should be prioritized, fixed and
> approved for the next release.

I think multiple people have clarified above why this isn't minor, so I don't think argument applies.

> 
> But calling it "blocking"? The blocking flag is b2g's hammer, for which
> everything looks like a nail. You're all correct -- it's the only tool we
> have -- but let's not be blind to the fact that our tool is suboptimal at
> best, and wasteful of our limited resources at worst.

I don't think we're overusing the blocking flag here. We have criteria & we're sticking to it.
Lets table the debate on the blocking or not, and move this to 1.4 triage team.   It may seem "minor", and yes there are other bugs that have not been caught or prioritized, but lets treat other bugs independently of this.   There's enough evidence in here to justify to the triage team why this is warranted, and vice versa.   We realize there are lots of uncaught bugs that can lead to the larger picture of improvement, but lets focus on improving quality one at a time, and start with this bug.   

thanks!
Shipping a feature as buggy as the timer in 1.3 was a mistake, feature or not, and this bug is minuscule in relation to the "feature" we shipped. If quality is our goal, it doesn't matter that this was a regression or "not a feature" -- we should just block on anything that causes poor quality. But we don't, because that would be prohibitive. 

My point stands: if we truly upheld "quality" as the overriding feature, then we wouldn't have shipped a timer in 1.3 in the first place. Why did we? Because we prioritized a "feature" over quality, and because our procedures for landing features failed to prevent a poor feature from landing. 

I'm all for quality and for following our procedures. Our procedures are failing us, hence my disagreement with our triage criteria. I made it clear throughout this thread that I do not care whether we block on this issue or not, and that I was merely commenting on the inadequacy of our current policies to ensure a quality release.
Comment on attachment 8424214 [details] [review]
Pull Request

r+ from me. implementation is very clear and unit tests matches the rounding behavior (which should avoid regressions).
Attachment #8424214 - Flags: review?(mmedeiros) → review+
Landed on master.

https://github.com/mozilla-b2g/gaia/commit/8e68b1de900986d10014f7008083ffbc87838b1a
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
(In reply to Marcus Cavanaugh [:mcav] <mcav@mozilla.com> from comment #30)
> Shipping a feature as buggy as the timer in 1.3 was a mistake, feature or
> not, and this bug is minuscule in relation to the "feature" we shipped. If
> quality is our goal, it doesn't matter that this was a regression or "not a
> feature" -- we should just block on anything that causes poor quality. But
> we don't, because that would be prohibitive. 
> 
> My point stands: if we truly upheld "quality" as the overriding feature,
> then we wouldn't have shipped a timer in 1.3 in the first place. Why did we?
> Because we prioritized a "feature" over quality, and because our procedures
> for landing features failed to prevent a poor feature from landing. 
> 
> I'm all for quality and for following our procedures. Our procedures are
> failing us, hence my disagreement with our triage criteria. I made it clear
> throughout this thread that I do not care whether we block on this issue or
> not, and that I was merely commenting on the inadequacy of our current
> policies to ensure a quality release.

Hey Marcus,  you are more than welcomed to discuss this within b2g-internal thread, and cc b2g-release-drivers on this.   If there are others that share similar concerns, we should discuss it openly with the main group so others have visibility and comments.   That sounds like the right avenue.

Anyway, thank you marcus and miller for the patch!    Much appreciated.
blocking-b2g: 1.4? → 1.4+
This issue has been verified fixed on v1.4 on the following devices:

Environmental Variables:
Device: Buri v1.4 MOZ ril
BuildID: 20140522000200
Gaia: 233dd43b3b976e66a619dbc1b4044ed1e3ca3e34
Gecko: c3c0c57daef8
Version: 30.0
RIL Version:
Firmware Version: v1.2-device.cfg

Environmental Variables:
Device: Flame v1.4 MOZ ril
BuildID: 20140522000200
Gaia: 233dd43b3b976e66a619dbc1b4044ed1e3ca3e34
Gecko: c3c0c57daef8
Version: 30.0
Firmware Version: v10F-3

Environmental Variables:
Device: Open_C 1.4
BuildID: 20140522000200
Gaia: 233dd43b3b976e66a619dbc1b4044ed1e3ca3e34
Gecko: c3c0c57daef8
Version: 30.0
Firmware Version: P821A10V1.0.0B06_LOG_DL
This issue is verified fixed on the Open_C v2.0 master:

2.0 Environmental Variables:
Device: Open_C 2.0 MOZ
BuildID: 20140522040230
Gaia: ef66efa34ed8a559c8998bde688fae88eb943a7a
Gecko: b40296602083
Version: 32.0a1
Firmware Version: P821A10V1.0.0B06_LOG_DL
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.