Closed Bug 828222 Opened 11 years ago Closed 11 years ago

[Sound] Sound playback is delayed giving the impression the device has poor performance.

Categories

(Firefox OS Graveyard :: Gaia::System, defect, P2)

All
Other
defect

Tracking

(b2g18+)

RESOLVED FIXED
Tracking Status
b2g18 + ---

People

(Reporter: padamczyk, Assigned: djf)

Details

(Keywords: perf, Whiteboard: [c= p= s=2013.11.08 u=], visual design sound yedo PRODUCT-PERF, [TEF UX Critical], ux-tracking)

Attachments

(1 file, 2 obsolete files)

The sound playback for ALL notifications / UI is delayed making the device feel slower than it really is.

Ie.
+ unlock the device
+ enable sound feedback on keyboard click

Playback should be instant.
Component: Gaia → Gaia::System
Whiteboard: visual design, sound, interaction, UX-P1 → visual design sound interaction UX-P1 yedo
Whiteboard: visual design sound interaction UX-P1 yedo → visual design sound interaction UX-P1 yedo PRODUCT-PERF
Whiteboard: visual design sound interaction UX-P1 yedo PRODUCT-PERF → visual design sound interaction UX-P1 yedo PRODUCT-PERF, [TEF UX Critical]
blocking-b2g: --- → tef?
Arun has been working on performance and should have numbers to work towards.
Flags: needinfo?(aganesan)
Who can take this?
"Designing with the mind in mind" says:

Short version
Our hearing is more sensitive to short events and gaps than our vision is. The maximum tolerable delay or drop-out time for audio feedback (e.g., tones, “earcons,” music) is 1 millisecond.

Long version
The chapter dedicated to brain cognitive science, explains in detail for those interested in this kind of stuff  "the human auditory system is sensitive to very brief intervals between sounds. Audio feedback and content should be provided by well-timed processes running with high priority and sufficient resources. Our hearing is much more sensitive to short events and small differences than our vision is. Our ears operate using mechanical sound transducers, not electrochemical neural circuitry. The eardrum transmits vibration to the ossicles (middle-ear bones), which in turn transmit vibration to the cochlea’s hair cells, which, when vibrated, trigger electrical pulses that go to the brain. Because the connection is mechanical, our ears respond to sound much faster than the rods and cones in our retina respond to light. This speed allows our auditory system to detect very small differences in the time when a sound arrives at our two different ears, from which our brain calculates the direction of the sound’s source."
Flags: needinfo?(aganesan)
1ms for audio latency considering what we are doing is completely unachievable.

Also, the source is wrong, a latency in the region of 10ms is not perceivable (this is what my very expensive audio rig does, and this fact is confirmed all over the internet and by my ears).
Also, fwiw, high-end devices like a Galaxy Nexus or an iPhone 5 are in the 10-30ms range for specialized audio applications.
Paul:

Thanks for sharing this info. I will try and verify with other sources and post an update.
(In reply to Paul Adenot (:padenot) from comment #5)
> 1ms for audio latency considering what we are doing is completely
> unachievable.
> 
> Also, the source is wrong, a latency in the region of 10ms is not
> perceivable (this is what my very expensive audio rig does, and this fact is
> confirmed all over the internet and by my ears).

1 millisecond is the shortest gap of silence the human brain can detect in a sound. However the tolerance for cause-and-effect (which is what we're concerned with for the sake of UI responsiveness) may be more forgiving. Paul, is there any chance you can share sources for those competitor figures?
- For Android: [1], [2].
- For iPhone: RemoteIO [3]. 

Basically, we need to run 4.1 Android kernel and libraries to get low latency. But our stack adds a lot of latency as well. We are at the moment looking into reducing the latency we add in Gecko, mainly for WebRTC and games [4].

[1]: http://www.youtube.com/watch?v=UGJbPPjANKA&feature=player_de%20tailpage#t=3552s (Galaxy Nexus + Jelly Beans: 12ms, they want sub-10ms).

[2]: http://createdigitalmusic.com/2012/07/android-high-performance-audio-in-4-1-and-what-it-means-plus-libpd-goodness-today/

[3]: http://developer.apple.com/library/ios/#documentation/MusicAudio/Conceptual/AudioUnitHostingGuide_iOS/Introduction/Introduction.html#//apple_ref/doc/uid/TP40009492-CH1-SW1

[4]: Bug 785584
blocking-b2g: tef? → -
tracking-b2g18: --- → +
Assignee: nobody → waldron.rick
I don't have a device to test my work on and I tried b2g desktop, but ran into an issue where the keyboard won't appear. I tried forcing it with an input.focus(); "hack", but no luck. I filed this: https://bugzilla.mozilla.org/show_bug.cgi?id=840704 but it hasn't been confirmed.
Rick, if you want someone to review this, you need to attach either the full patch or a simple html page redirecting to you PR to this bug, and ask someone to review it. If you are not sure who to ask the review, just look at the blame for the code you changed. If this is still not enough, just put the review on any of the module peers from here: https://wiki.mozilla.org/Modules/Boot2Gecko
Attached patch Fix for 828222 (obsolete) — Splinter Review
Fabrice thanks—I wanted to squash the commit first
Attachment #715191 - Flags: review?(dflanagan)
Comment on attachment 715191 [details] [diff] [review]
Fix for 828222

Setting the patch bug so I can splinter review this.
Attachment #715191 - Attachment is patch: true
Comment on attachment 715191 [details] [diff] [review]
Fix for 828222

Review of attachment 715191 [details] [diff] [review]:
-----------------------------------------------------------------

Rick,

I'm excited to see the change where you do the aural feedback before the visual feedback. I'm hoping that just switching the order of those two lines is enough to make a difference. As you'll see in the comments that follow, though, I'm worried about two of the other changes you've made.

We need to get this tested on an actual device. I'm swamped with reviews and can't do it right now. And if you still don't have a device, we'll have to find someone else to try it out.

And here's a meta-comment:

For gaia code it is more common to do patches like this in the form of pull requests to the github repo. When I do that, the attachment I create for the bugzilla bug is just a one line HTML file with a link to the github PR.  Others do a fancy meta tag that automatically redirects to github. (Its absurd that we have two different systems, here, agreed.)  If you do your patches directly on bugzilla, remember to click the "patch" check box, so that the patches are reviewable.

::: apps/keyboard/js/keyboard.js
@@ +255,5 @@
>    'B4kD3QKPBcAHhgaHBDAEngO6BBcFbwJ2/qD7rPtG/voBwQGU/pn9Lv3T/g==';
>  
> +// The audio element used to play the click sound. This only needs to be
> +// created once
> +var clicksound = new Audio(CLICK_SOUND);

I suspect it is better to keep the initialization code where it was so that if clicks are off, the audio element never even needs to be created. I fear that creating the element with a src specified will do something to the audio hardware to get it ready, and we should only do that if necessary.

I agree that clicksoundEnabled and clicksound are better variable names than the ones I had chosen. But changing them here makes the patch harder to review because it is harder to tell what the substantive changes are.

@@ +306,4 @@
>  
>      // Copy the keyboard group settings too
>      enabledKeyboardGroups = {};
> +

nit: remove the blank line, because the comment above applies to the code below, and adding a blank line makes it look like a separate paragraph.

@@ +1044,5 @@
>      return;
>  
> +  // Feedback in this order is intentional:
> +  //  1. Call the vibration/audible feedback
> +  //  2. Call the on-screen key highlight

Thank you for the explicit comment!

What if this is the only change you make? Is that enough to fix the latency issue.

@@ +1511,5 @@
>      } catch (e) {}
>    }
>  
> +  if (clicksoundEnabled) {
> +    clicksound.play();

I claim no credit for the cloneNode() bit that was here. But when I worked on this code, I assumed that it was there so that when keys were pressed in rapid succession there would be two overlapping clicks.

I would think that with this change, if you're typing really fast (the keyboard is multitouch on devices, so you can hit keys in very rapid succesion), you would only get one click rather than two.  And I think that incorrect aural feedback is worse than laggy aural feedback.

My review queue is overflowing at the moment, so I haven't tested this out myself though.
Attachment #715191 - Flags: review?(dflanagan) → review-
Rick,

Note that in the initial bug report Patryk also wants the lockscreen unlock sound fixed. Are you planning to investigate that, too?  Its fine to fix the two issues with two separate patches. I think you put [leaveopen] in the whiteboard line when you land the first patch so that the bug doesn't get closed. Though that probably matters more for gecko patches than for gaia.
David, thanks for the review. I'll make the relevant updates in the AM. In the meantime, this is the fiddle I used to test the behaviour of click sound without cloneNode: http://jsfiddle.net/rwaldron/sedKv/ (open it at http://jsfiddle.net/rwaldron/sedKv/show/light/ from the browser app)
Just a quick note to say that the |cloneNode| is fast-pathed in the media code, and I believe it is actually the correct way to achieve what we want here [1]. HTML5 video game developers are using this technique for sound effects, and it is working great for them.

[1]: http://robert.ocallahan.org/2011/11/latency-of-html5-sounds.html
Attached file Updates per djf review notes (obsolete) —
Attachment #715191 - Attachment is obsolete: true
Attachment #716008 - Flags: review?(dflanagan)
Previous patch did not redirect—sorry for the noise.
Attachment #716008 - Attachment is obsolete: true
Attachment #716008 - Flags: review?(dflanagan)
Attachment #716009 - Flags: review?(dflanagan)
I've asked padenot to comment on whether there is any gecko overhead introduced by creating the Audio() element when the setting is turned off an it is never going to be used.  His response will basically be my review...
So, because we are on B2G, the audio element has preload="none" by default, so we don't go and create the decoder and all the threads and stuff.

But we are effectively using some memory by doing that (by copying the URL in Gecko, and because this is a data url, this is not negligible). You could probably save some memory by lazily setting the |src| attribute. Since you have a observer on the setting, this is trivial.
Thanks Paul.

Rick:

I'm not too concerned about the memory used by the data url, but would worry more about what happens if the preload default ever changes. Please update the patch to create or at least set the src of the audio element only if the setting is true (and when it becomes true). You might want to override the preload=none default so that the first click is fast, too.

Note that the current code just sets the audio element to null if the setting changes to false. Probably better would be to do audio.removeAttribute('src'); audio.load().  (I've found that if I just set audio.src = null I get a warning in the console.)
Comment on attachment 716009 [details]
Updates per djf review notes

r- again just to get this bug back on your radar. See my comments above.
Attachment #716009 - Flags: review?(dflanagan) → review-
Right, audio.src = null throws because null is not a valid URL.

Part of the strategy of re-using the node was to avoid any need to set or reset |src| at any later time, which ultimately means less operational overhead for the lifetime of the app's use. 

Compare: 

- initialize the node with a src
- let the state of clicksoundEnabled be the arbiter of playback

vs. 

- initialize an empty node
- set the src if keyboard.clicksound is true
- if keyboard.clicksound is false, at some arbitrary point in the future keyboard.clicksound may become true, in which case, set the src.

On the one hand I agree that there might be benefit to setting the src only if keyboard.clicksound is true, but then the app has to take on the role of managing the state of the node throughout the lifetime of the app.

...Reading through the Keyboard spec doc and I'm unable to determine what the lifecycle of the Keyboard app is—is it created once and reused forever or does it get initialized and torn down with each use?
Tested on unagi device, including enable/disable of click sound, all work as expected.

https://github.com/mozilla-b2g/gaia/pull/8176/files
Attachment #716009 - Flags: review- → review?(dflanagan)
(In reply to Rick Waldron from comment #25)
> Right, audio.src = null throws because null is not a valid URL.
> 
> Part of the strategy of re-using the node was to avoid any need to set or
> reset |src| at any later time, which ultimately means less operational
> overhead for the lifetime of the app's use. 

I don't understand your point at all. Many users will have keyboard sounds off, and never enable them. With your version of the patch, the system will always be holding on to this audio element and the decoded version of the audio stream that will never ever be used.

> Compare: 
> 
> - initialize the node with a src
> - let the state of clicksoundEnabled be the arbiter of playback
> 
> vs. 
> 
> - initialize an empty node

No, you wouldn't create a node at all. You'd just have var clicksound = null.

> - set the src if keyboard.clicksound is true

if it is enabled at startup, then you create the audio element.

> - if keyboard.clicksound is false, at some arbitrary point in the future
> keyboard.clicksound may become true, in which case, set the src.
> 

it it becomes enabled after startup you create the audio elment then.

If it becomes disabled after startup, you can discard the audio element if you want to, but I don't care much about that because settings changes are likely to be uncommon. 

> On the one hand I agree that there might be benefit to setting the src only
> if keyboard.clicksound is true, but then the app has to take on the role of
> managing the state of the node throughout the lifetime of the app.

I don't see why that is so hard. But really it manages the existance of the element, not its state. Either it exists and has a src, or it doesn't exist.

My point in a previous commently was simply that I had code in there where I just set it to null. Better would have been to explicitly free the src and then set it to null.

> ...Reading through the Keyboard spec doc and I'm unable to determine what
> the lifecycle of the Keyboard app is—is it created once and reused forever
> or does it get initialized and torn down with each use?

It looks like a separate app, but is part of the system app, so it only gets created once.
Comment on attachment 716009 [details]
Updates per djf review notes

r- because you didn't make the change I asked for (on Paul's advice) the last time around, and you didn't explain why you thought it would be okay to create and set the src of an Audio element that you know you would never be using. (i.e. in the case where the setting is turned off).
Attachment #716009 - Flags: review?(dflanagan) → review-
(In reply to David Flanagan [:djf] from comment #27)

> It looks like a separate app, but is part of the system app, so it only gets
> created once.

That's true for now, but may change in a the future, see bug 816874.
I did explain my approach and I stand by it, but I don't want to argue, the next patch will be at your command. I'll even set your name and email as the commit author.
(In reply to Rick Waldron from comment #30)
> I did explain my approach and I stand by it, but I don't want to argue, the
> next patch will be at your command. I'll even set your name and email as the
> commit author.

Nope, you are the author and David the reviewer. There's no problem here.
Hi guys, just a quick note to say: don't over-rotate on sound-playback performance of keyboard clicks. Silent is fine for v1. We'd much rather ship a 100ms-response-time silent keyboard than a 200ms "clicking" keyboard (which is why we disabled audio and vibration feedback in the first place).
Whiteboard: visual design sound interaction UX-P1 yedo PRODUCT-PERF, [TEF UX Critical] → u=user c=sound s=ux-most-wanted, visual design sound yedo PRODUCT-PERF, [TEF UX Critical]
Just to add: per my application of "ux-most-wanted" tag, the rest of sound playback (eg: on unlock, on new notification, etc) is important, and ideally should fall under 100ms of the parent event (eg: finger tapping a button).
Whiteboard: u=user c=sound s=ux-most-wanted, visual design sound yedo PRODUCT-PERF, [TEF UX Critical] → visual design sound yedo PRODUCT-PERF, [TEF UX Critical], u=user c=sound s=ux-most-wanted
blocking-b2g: - → leo?
blocking-b2g: leo? → ---
Mass edit to set tracking-b2g18+ for these UX bugs that were called out for v1.1
Hi Rick, haven't noticed activity on this bug for a while. What's the status of this bug?
Flags: needinfo?(waldron.rick)
The patch is at https://github.com/mozilla-b2g/gaia/pull/8176.patch

Not sure what else I'm supposed to do.
Flags: needinfo?(waldron.rick)
(In reply to Rick Waldron from comment #36)
> The patch is at https://github.com/mozilla-b2g/gaia/pull/8176.patch
> 
> Not sure what else I'm supposed to do.

Does it address https://bugzilla.mozilla.org/show_bug.cgi?id=828222#c28 ?
(In reply to Fabrice Desré [:fabrice] from comment #37)
> (In reply to Rick Waldron from comment #36)
> > The patch is at https://github.com/mozilla-b2g/gaia/pull/8176.patch
> > 
> > Not sure what else I'm supposed to do.
> 
> Does it address https://bugzilla.mozilla.org/show_bug.cgi?id=828222#c28 ?

I disagree with the approach that I was being forced into, so I've reduced the patch to the simplest possible modification that I believe will have an impact on the perceived performance of the audio feedback.


https://github.com/mozilla-b2g/gaia/pull/8176.patch
Is it possible to quantify (subjective ballpark is fine) the improvement with this patch?
I guess that depends on how you perceive it ;)

From a purely technical perspective, the program is now invoking the sound playback before changing the display via DOM operations, which _should_ result in a "snappier" experience.
(In reply to Rick Waldron from comment #36)
> The patch is at https://github.com/mozilla-b2g/gaia/pull/8176.patch
> 
> Not sure what else I'm supposed to do.

In this case what you forgot to do was request review again, so no one ever looked at your patch to review or land it.
Comment on attachment 716009 [details]
Updates per djf review notes

This is a simple and safe fix that might be all we need to do for the keyboard click sounds.

It only addresses keyboard sounds, so I'm going to land it and keep the bug open, and reassign it.
Attachment #716009 - Flags: review- → review+
Assignee: waldron.rick → dflanagan
Keyboard sounds patch landed on master: https://github.com/mozilla-b2g/gaia/commit/12e49e3b831e7f60e9bb4977a86ad7e6cf5c1469

Leaving this bug open for other patches that address other audio performance issues.
Josh and Patryk: what are your highest priority sounds to fix here?  The lockscreen unlock sound has been mentioned explicitly. Are there others you want me to look at?

I know that the camera has a shutter sound and now has a nice sound for starting and stopping video recording.  What other sounds play directly in response to a user action?

I'm not certain that Gaia fixes will be enough, but at this point, we probably can't count on getting any gecko fixes, so I'll see what I can do.

I'm also working on keyboard autocorrect and am taking Friday off, so I won't realistically get to this bug until next week.  So feel free to steal or re-assign it.
Flags: needinfo?(padamczyk)
Flags: needinfo?(jcarpenter)
Hi David, keyboard click sounds were the biggie, so it's great that we may have cracked that nut. Let me consult w/ Patryk today and follow up on responsiveness of other sounds. We're in London for the week as part of a full UX team summit.
Flags: needinfo?(jcarpenter)
Yes it would be ideal to add add keyboard feedback. We have 2 tones, one for glyph keys and the other for functional keys.
Flags: needinfo?(padamczyk)
Patryk,

Let's open a new bug for adding new sounds to the keyboard, and keep this one for sound latency issues.
Whiteboard: visual design sound yedo PRODUCT-PERF, [TEF UX Critical], u=user c=sound s=ux-most-wanted → visual design sound yedo PRODUCT-PERF, [TEF UX Critical], ux-tracking
Whiteboard: visual design sound yedo PRODUCT-PERF, [TEF UX Critical], ux-tracking → c= , visual design sound yedo PRODUCT-PERF, [TEF UX Critical], ux-tracking
Priority: -- → P2
Attachment mime type: text/plain → text/x-github-pull-request
Hi David, is this still a valid bug?
Flags: needinfo?(dflanagan)
Mason,

Comment 0 mentions the unlock sound in addition to the keyboard sound.  The only patch that has been landed was for the keyboard click sound.  I don't know if anything has changed elsewhere.

Patryk: are there still audio responsiveness issues that you'd like to be addressed in 1.3?  Do you want to keep this bug open for that, or file a new bug and close this one?
Flags: needinfo?(dflanagan) → needinfo?(padamczyk)
Playing with a recent (Oct 28) unagi 1.2 build there are still issues when it comes to sound response.
+ unlock sound ... works well actually
+ keypad in dialer ... works well


keyboard & camera are a different story.
keyboard ... touch key, sound plays, devices thinks for .25-.5 sec, glyph is rendered on the screen. 
camera ... touch shutter, devices thinks for .5 sec, sound plays, device thinks for .5 sec picture is taken.

both instances are far too slow, but I think it has to do with device / code for those apps not the sound play back response.

I would make one adjustment in the camera, and that is the sound should playback as soon as the user touches the shutter. To me that would fix this bug. Since it would make: TOUCH = INSTANT PLAYBACK
Other bugs could address the other delay before command execution (glyph being rendered, picture being saved)
Flags: needinfo?(padamczyk)
Hi,

In dialer app, it used audio.mozWriteAudio() (will be replaced by Web Audio) to pass the raw PCM data directly into audio output device. This way can avoid the time for creating decoder and loading data from file I/O.

If we want to look more detail, the raw PCM data used by mozWriteAudio should use the sample rate which is the same with audio output device. ex: 48k Hz (not applied yet). This can avoid the time for re-sampling.
(In reply to Patryk Adamczyk [:patryk] UX from comment #50)
> Playing with a recent (Oct 28) unagi 1.2 build there are still issues when
> it comes to sound response.
> + unlock sound ... works well actually
> + keypad in dialer ... works well
> 
> 
> keyboard & camera are a different story.
> keyboard ... touch key, sound plays, devices thinks for .25-.5 sec, glyph is
> rendered on the screen. 
> camera ... touch shutter, devices thinks for .5 sec, sound plays, device
> thinks for .5 sec picture is taken.
> 
> both instances are far too slow, but I think it has to do with device / code
> for those apps not the sound play back response.

Neither of those are sound responsiveness issues.
 
> I would make one adjustment in the camera, and that is the sound should
> playback as soon as the user touches the shutter. To me that would fix this
> bug. Since it would make: TOUCH = INSTANT PLAYBACK
> Other bugs could address the other delay before command execution (glyph
> being rendered, picture being saved)

Once we have continuous focus we can do that. It doesn't make sense to play the shutter sound before we focus currently.  But we'll work that out when when we change the camera app.

I'll close this bug since the two specific issues it was opened for have been addressed.  Marco: I agree that using the audio API may be the right approach in other cases where responsiveness matters.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Whiteboard: c= , visual design sound yedo PRODUCT-PERF, [TEF UX Critical], ux-tracking → [c= p= s=2013.11.08 u=], visual design sound yedo PRODUCT-PERF, [TEF UX Critical], ux-tracking
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: