MozReview Request: Bug 1256038: Remove special NotifyDataArrived handling in the DirectShow reader. r?cpearce
58 bytes, text/x-review-board-request
User Agent: Mozilla/5.0 (Windows NT 5.1; rv:44.0) Gecko/20100101 Firefox/44.0 Build ID: 20160114165817 Steps to reproduce: Going to allmusic.com, selecting any kind of album, pressing the play button. Actual results: The playback stutters up to three times in the 30 seconds pre-listening. Pressing the play-button again does without stuttering then. Expected results: No stuttering. This works fine in v. 44 - I went back to that.
3 years ago
Component: Untriaged → Audio/Video: Playback
Product: Firefox → Core
WFM with FF45 32b Win 7 64b. Could you test with a fresh profile? https://support.mozilla.org/en-US/kb/profile-manager-create-and-remove-firefox-profiles
As said, I already changed back to rv. 44 where everything works fine again. Which means something with rv. 45 must be wrong.
So maybe there is a regression in v45. Could you install the tool mozregression to find a regression range, please. See http://mozilla.github.io/mozregression/ for details. Run the command "mozregression --bits=32 --good=44" then copy here the final pushlog provided in the console output.
If you ask me to do this, does it mean you can't produce the stuttering with your configuration?
Yes, I'm not able to reproduce the issue on my machine, that's why maybe there is an issue with Firefox only on specific platforms. And a regression range would help for that.
Sorry for my stupid question ;-) OK, I did the regression. It pointed to the difference between 13.8.2015 (good) and 14.8.2015 (bad). (Actually I don't understand what the difference between these (long ago) versions has todo with the difference between v.44.0b9 (good) and v.45 (bad).) After the bisection process the mozregression tools begins to try more versions. And always starting the second one of this I get (reproducable = 3 times) a blue screen. Which happens very, very seldom (I nearly forgot how it looked like ;-) However that's why no final pushlog from me. Hope this helps you to understand what happens with the html5-thing.
As good/bad limits, you can set release or date or buildid. Release gap is 6 weeks, dates are more accurate. According to your results: Last good revision: 7649ffe28b67aa2dad0f67ea01500c0ff91b2bac (2015-08-13) First bad revision: 4e883591bb5dff021c108d3e30198a99547eed1e (2015-08-14) the pushlog should be: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=7649ffe28b67aa2dad0f67ea01500c0ff91b2bac&tochange=4e883591bb5dff021c108d3e30198a99547eed1e Too bad you got a BSOD with mozilla-inbound builds during bisection, because we woulg have got a reduced pushlog.
SO MUCH changes within just one day?! Well, if there are not many (no one at all?) to confirm the bug, then the reason for the effect might be the combination of the changes with my OS? (Win XP SP3) Maybe there will be someone with XP to check this? If the effect remains in v45++, I will have to stop updating at v.44 ... which doesn't let me feel good. Am I right with my explanation that the change effecting me has been inserted first in 2015-08-14 and then (7 month later) into v.45?
I suspect the issue about the BSOD is due to Bug 1160295. There is a possible workaround to reduce the changelog from comment #7 by disabling hardware acceleration in the testing profile used during the bisection, it should be enough to avoid BSOD. So could you run the command: mozregression --bits=32 --good=2015-08-13 --bad=2015-08-14 --profile-persistence=reuse [The last argument will create a new profile in /TEMP (just used for the bisection and deleted after that) but this profile will be reused for each build, so you can modify the settings as you want for the next builds downloaded by mozreg. You can even test different settings for a build by entering "Retry" in mozreg, it will reload the build with the new modifications in the profile.] For the 1t build downloaded (which is 2015-08-13), go to Options > Advanced > General > Disable hardware acceleration then type "Retry" (and verify HWA is disabled). After that, continue your bisection.
... and so I did. I used the mozreg GUI, but with option "profile reuse". So I had to test always new audio medias, as the effect is only by the first playing (cache problem?). I disabled HWA, and moreover disabled my antivir avast. After a dozen of bisections: In the window "bisection information" - pushlog url: https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=d4ef1bba0bbc2f8b26c0a8aaae9041f63594f643&tochange=34ab0b8678b25a78620c957c9bdd5de0b4c77728 (I hope this is what you need). Just in case ... the last lines of the log view window: 2016-03-16T19:34:02: DEBUG : Using url: https://hg.mozilla.org/integration/mozilla-inbound/json-pushes?changeset=66eba222c7677e81c4dd67c3af483e10c45ef852&full=1 2016-03-16T19:34:03: DEBUG : Found commit message: Bug 1190238: P3. Do not loop calling MediaResource::Read or ReadAt, let MediaResourceIndex do it for us. r=cpearce This allows to remove a fair amount of duplicated logic. Most of it is in obsoleted code though. 2016-03-16T19:34:03: INFO : The bisection is done. 2016-03-16T19:34:03: INFO : Stopped
Thanks for the bisection.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Keywords: regressionwindow-wanted → regression
Version: 45 Branch → 43 Branch
Can you reproduce in Nightly? (Or beta or dev edition for that matter) Don't see how the regression range provided could have any effects as described.
Sorry, I'm just a simple user. So I don't understand your question. Having done the regression it shows the effect at that very point. I'm no expert ... maybe the code changes caused the use of different Win API calls, that may be faulty in XP or something like that? Another hint could be that playing a song a second time doesn't stutter then.
(In reply to nji9 from comment #13) > Sorry, I'm just a simple user. that's okay. my only question for you was to retry with Nightly, beta and/or Developer Edition
... and that's just what I don't understand. I thought the regression process made the finest possible search? What other kind of editions should be tested? Where do I get these special versions? How do I avoid the mentioned effect with the BSOD because of activated hardware accelleration? etc.
... or do you mean this newest branches (nightly, beta etc.)? And still there remains my question why a change between versions 7 month ago that caused the effect now caused the effect from v.44 (good) to v.45 (bad). Is it explainable in that these changes 7 month ago just now (in v.45) went in? Otherwise I wouldn't understand the last version (at least 5 or so) went fine with me ...
(In reply to nji9 from comment #16) > ... or do you mean this newest branches (nightly, beta etc.)? yes... my suspicion is on a bug we have with audio device with long latency. Are you using a BT headset by any chance? The dev are working on Firefox 48 right now.. 45 is released and it won't receive any more fixes that aren't deemed critical or security related. Hence why testing 46 (beta) makes more sense, because that one *can* receive fixes and hopefully we can find what the problem is and fix it for the next release. > > And still there remains my question why a change between versions 7 month > ago that caused the effect now caused the effect from v.44 (good) to v.45 > (bad). We have 6 weeks release cycle and it goes from central -> aurora -> beta -> release. So you could have 18 weeks between the time a commit is made in central and the time it makes it to release. So for this it can't be, 7 months is too much. If you look at the bug you pointed at however (bug 1190238), you'll see that in the tracking flags, it went into Firefox 41 (at the time it was beta). So it can't be what's causing the regression you are seeing. Regression tools don't always get it right, it happens if there's a merge, of if there are other intermittent failure: so when a step of the regression search could have gone in one direction (it's a binary search) it went the wrong way. What's certain in my mind, is that bug 1190238 is a false positive. It's been in since 41. > Is it explainable in that these changes 7 month ago just now (in v.45) went > in? my guess is that the regression search too a wrong branch and ended on an irrelevant commit > Otherwise I wouldn't understand the last version (at least 5 or so) went > fine with me ...
there: https://www.mozilla.org/en-US/firefox/channel/ you can get beta (46) or developer edition (47) for nightly you go: https://nightly.mozilla.org/
Well then ... I tried beta v.46.0b2 and it has the bug in it. Reinstalled release v.44.0.2 and everything is OK again. But - as I said - the bug came in with the update from release v.44.0.2 to release v.45. Concerning bug 1190238: If you have a closer look at Comment 9 you will find that it's introduced by Loic as a supposition concerning the BSOD while regression. As far as I understand it has nothing to do with the bug we're talking about (and should be removed?). Concerning regression: I only can say that exactly this stuttering effect arose at the bisection point. Tested two times. That's why I asked (three times) how it can be possible that an effect in a nighty change aroses not until 7 months later.
Sorry, forgot to answer: No, I don't use a BT headset (Not even by chance ;-)
(In reply to nji9 from comment #19) > That's why I asked (three times) how it can be possible that an effect in a > nighty change aroses not until 7 months later. and I answered your question in comment 17
It would be helpful if there would be anyone else to confirm this bug. Someone with Win XP SP3. To rule out that the effect is due to my (special?) hw/sw config.
Why did you change in the "version" field to 43? (The effect came in from release 44.02 to 45)
Because the pushlog shows the regression has been introduced in FF43 (and you are able to reproduce in nightlies 43 when using mozregression)
(In reply to Loic from comment #24) > Because the pushlog shows the regression has been introduced in FF43 (and > you are able to reproduce in nightlies 43 when using mozregression) But there is the contradiction between the result of the mozregression and - as Jean-Yves Avenard pointed out - that that effect should have shown not until 7 months later (from release v.44.0.2 to v.45.0). It's a question where to search. And again: Is there a way to make a "needinfo?" to Win XP users to test it? By this we could distinguish between an effect that exists only at my special config and an important error (html5-support) that concerns all XP users - that perhaps would lead to the conclusion to let XP go. (Or, if you have to stay for XP for a while as me, the stop updating FF with v.44.0.2).
I can't even test this anymore. I no longer reproduce playback, there's no more play button showing like a couple of days ago I went there: http://www.allmusic.com/artist/jeff-buckley-mn0000240665 upon refreshing the page a few times, I could see a play button in the top left corner, but this is inconsistent. same in chrome. do I need to open an account?
never mind, works now
Perhaps I didn't described specific enough. I mean the play buttons within an album. For example: http://www.allmusic.com/album/grace-mw0000624110
... and - as already said - the stuttering is only on first time pushing the track play button. After that it plays without stuttering (points to cache problem?). And within all my testing it also happened that the effect seems to turn on at the second or third tracked tried. And one more specification to the "stutterin": pushing play, it begins to play about a second, then sometimes repeats the first second, then plays on, then every few (about 10 seconds) a stutter. Hope this helps.
can reproduce in a win XP VM.
Assignee: nobody → jyavenard
"...for he’s a jolly good fellow ..." ;-) Hurray, relief, first step taken. Let me know if I shall help/ test anything.
Another STR is to open a preferences tabs, going into advanced and pressing the clear cache button continuously. This causes lots of cuts for me, and often you can hear the audio going slightly backward. There's no significant difference between 44 and 45 related to the DirectShow player (which is what is used on XP) A bit at loss on what is happening. I can't reproduce it on win 7 or win 10 (after disabling the new MP3 player to use DirectShow); which makes it difficult to debug as we can't compile or debug on xp due to the lack of tools.
(In reply to Jean-Yves Avenard [:jya] from comment #32) > > There's no significant difference between 44 and 45 related to the > DirectShow player (which is what is used on XP) I've ask to get you right: No "significant difference between 44 und 45" in the code, but a difference in the effect between these versions (within XP VM), right?
(In reply to nji9 from comment #33) > I've ask to get you right: > No "significant difference between 44 und 45" in the code, > but a difference in the effect between these versions (within XP VM), right? not so sure about this. I've reproduced the issue in 44 (using the clearing of the buffer repeatedly and also without doing that); just not as much as with 45. So no, none of the changes made between 44 and 45 can explain the change in behaviour at a glance. Please understand that the MP3 player used in XP is old, and has received almost no change in several years. yet, something did happen ... but what I can't tell. I'm certain it's not the regression range you pointed at though. Because I made a 45 without those changes, and it still happens
If I get you right then, you can reproduce the stuttering effect. But you cannot reproduce (as I do) that stuttering happens in v.45 and not in v.44? (All in XP VM of course). I don't know anything about coding etc., but as far as I understand, it would be essential to have two versions that differ in behaviour? Two version that are build as close as possible. So couldn't you try the mozregression (at your XP VM)? Concerning your test scenario with clearing the cache while playing: That's a different test than mine with just pushing the play button, isn't it? I mean, perhaps your effect happened in a quite earlier release? You pointed out that there are no mediaplay code changes that could explain the effect. But maybe the reason is not playing the media, but something about accessing the (sw-) cache? Perhaps these thoughts are just nonsense (?), as I'm just a user. I would like to help, but don't know how.
So I restarted the regression search. I quickly gave up trying to bisect release. I assumed that the regression was in 45. Which was in central between 2015-10-29 and 2015-12-15. One of the issue regressing on XP is that I could hear audio stutter even with 44, albeit less frequent. I ended up blaming the slowness on my XP VM (which is on a RAID5 array). Luckily I could reproduce it on a real window 7 machines once media.wmf.enabled was set to false. https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=ada1afb12a16333d26974328d0f340712f354bf2&tochange=5ddb8631dbccb9cca635eeb2c9252fe2ccaff4bf Bug 1223599 - Remove the throttling argument from AbstractMediaDecoder::NotifyDataArrived(). r=jya. The effect I can hear is that it goes backward slightly. We had this issue before, and last time it was due to rounding of duration which caused a seek to occur. My guess is that's what's happening now. Removing the throttling argument makes the problem more obvious because we may round more often than we used to as NotifyDataArrived is called very often (ever 32kB)
[Tracking Requested - why for this release]: it's a regression affecting winXP users
status-firefox45: --- → affected
status-firefox46: --- → affected
status-firefox47: --- → affected
status-firefox48: --- → affected
tracking-firefox46: --- → ?
Just to be safe. Could you please confirm that this build: https://queue.taskcluster.net/v1/task/iBI5Qk6uR6yoBKO1wbwCtw/runs/0/artifacts/public%2Fbuild%2Ffirefox-45.0a1.en-US.win32.zip works fine for you while this one: https://queue.taskcluster.net/v1/task/qcP5u9kOSxmtijCHxgNriQ/runs/0/artifacts/public%2Fbuild%2Ffirefox-45.0a1.en-US.win32.zip doesn't? you may want to install them in different folder than the default (using custom) installation. thank you
I tried, but ... I didn't know how to install "using custom installation" (in a different folder), as I didn't find any installer or so. So I just unzipped the first of your archives into a new folder and started "firefox.exe". I don't know if this can't work, as a proper installation routine perhaps would build fresh dependencies ... However: Crash with blue screen! Now I don't know if it's because of my wrong way of "installation", or is it about bug 1190238 (see Comment 9) ...?
Flags: needinfo?(nji9) → needinfo?(jyavenard)
(In reply to nji9 from comment #39) > I tried, but ... > > I didn't know how to install "using custom installation" (in a different > folder), as I didn't find any installer or so. So I just unzipped the first > of your archives into a new folder and started "firefox.exe". I don't know > if this can't work, as a proper installation routine perhaps would build > fresh dependencies ... > However: Crash with blue screen! > Now I don't know if it's because of my wrong way of "installation", or is it > about bug 1190238 (see Comment 9) ...? oh actually, just unzipping and running firefox is good enough. I had forgotten those weren't installers. in any case, none of this has *ANYTHING* to do with bug 1190238. Forget about that bug, it wasn't the cause of this regression. Nothing in Firefox code can cause a BSOD ; only a kernel drivers can cause that : typically with firefox those are graphic drivers. I'm not familiar with the graphics side of things to tell you which pref could be disabled. I'll craft a custom 45 version for you with what I believe is the fix and you can test it from there.
Sorry wrong citation - Bug 1160295 was meant. If you have a look at Comment 9 and Comment 10 you'll see we had the problem with BSOD at that point already (nowhere else). Loic said it was something about Bug 1160295, and Hardware Accelleration. After disabling this, and also my antivir ... BSOD was gone. I don't know what happens if unzipping firefox into a new folder and than starting (HWA same as the "main" (= installed) firefox? Or does the new one build a new profile on its own (with HWA enabled)?). I'm a little concerned about killing my system with this BSODs, and I need this XP system (else I had switched to Win7 long time ago).
the builds above include 1160295... it's really a central/nightly build as of November 12th 2015. so it's something in between 44 and 45.
But UPDATING from v.44.0.2 to v.45.0 I had no BSOD. As updating reuses the profiles, and I had de-activated HWA, the latter well be the reason for the BSOD. IF - as former mentioned - the unzipping and starting does create a NEW profile (With HWA activated). However, PLEASE no test versions with BSOD for me.
... one more info: I just detected that as a different user (on my XP) the HWA acc still is enabled. And no BSOD starting that ... v. 44.0.2. (the version with no stuttering bug). I'm quite sure this is another issue. However to summarize it: Doing the regression there was BSOD in a version about 7 months ago. Disabling HWA and antivirus prevented BSOD. Now starting your first of the two new made versions (second I didn't try), after unzipping the complete unzipping into a new folder, I get BSOD with it.
BSODs are due to hardware (like bad sectors on HDD or faulty RAM) or drivers (sometimes filter-driver of antivirus).
Well, I'm no expert. I just made the experience not having a BSOD for a long time. Then starting many versions of FF during regression testing there are BSODs with always the same versions. Doing as you recomment in Comment 9 the BSODs vanished. Now again BSODs with FF (the version made from Jean-Yves Avenard) Still no BSODs running all my other programs. If it would be "due to hardware" than these firefox version will trigger/ use something of the hardware/ of the drivers that all the other programs do not. If my thinking is right. Still it's hard for me not think as FF as the cause. However, all this BSOD seems a different problem.
What I thought would fix the problem ended up not fixing it (I thought it was because we called MediaDecoder::DispatchUpdateEstimatedMediaDuration too often which would result in durationchange fired too often and maybe causing to seek backward as currentTime could end up being greater than the duration. However, DispatchUpdateEstimatedMediaDuration does nothing if the variation is < 0.5s which never happens (the duration does change consistently but only by a few ms after each call to DirectShow::NotifyDataArrivedInternal I'm a bit at a loss on why removing the throttling would have the effect reported. Could it be that we end up pinning/unpinning the MediaResource much more often which could result in a blockage when DirectShow is making a read? The behaviour I can hear however is that from time to time, playback appears to play the same thing for a very short period of time (it's very brief, like a few ms only). Upon debugging, I found that seeking however never occurs, not even due to the duration being past currentTime JW, if you have some ideas on what the problem could be, I would appreciate your analysis on the matter. Thank you
To reproduce on any windows version (can reproduce it on a realy win7 machine, a Dell with an intel i7-4790 with 32GB of RAM) set media.wmf.enabled to false, and open http://www.allmusic.com/album/this-is-what-the-truth-feels-like-mw0002917797 press the play button at the top. Playback will stutter / go backward from time to time. Make sure the media cache is emptied.
Created attachment 8733326 [details] MozReview Request: Bug 1256038: Remove special NotifyDataArrived handling in the DirectShow reader. r?cpearce Following bug 1223599, NotifyDataArrived is called extremely often (every 32kB block downloaded). The combination of pinning the MediaResource with small cache reads while reading on another thread can causes incorrect read. While this issue needs to be addressed, the special handling of NotifyDataArrived with DirectShow ultimately serves no purpose. By the time we have read the metadata, we have already identified if the MP3 has a known duration. If not, the duration will be amended as data is being returned to the MDSM. Continually feeding new data to the MP3FrameParser only allow to account for the new data downloaded ahead of the last decoded sample and to get a slightly more accurate buffered range and duration. The new MP3Demuxer doesn't do it, and ultimately this makes the DirectShowReader behaves like the new MP3Demuxer and gives us consistency across the readers. Review commit: https://reviewboard.mozilla.org/r/41727/diff/#index_header See other reviews: https://reviewboard.mozilla.org/r/41727/
Attachment #8733326 - Flags: review?(cpearce)
I'll try. But is it possible to remove/ not show my mail-adress public in the comment above please? There's no need for even more spam ... ;-)
I tested that nightly as you wanted me to. No stuttering anymo-more ;-) playback works fine with several audios tested. And then: no reason to thank me, as I'm glad to be able at this (very small) contribution. THANK YOU (all)! And there's even more profit for me, as I will be able to update FF (after the bugfix is included) at my XP. I would have bet that the bug is due an old XP system library. And last: It looks like my BSOD are due to avast antivir, not FF.
your email appears on every single post you make as well as any emails that was sent out whenever you added a comment.
(In reply to Jean-Yves Avenard [:jya] from comment #53) > your email appears on every single post you make as well as any emails that > was sent out whenever you added a comment. That's true if you are logged in. Members see all the mail addresses and that's OK for me too. But what I mean is different: One doesn't have to be logged in to see my mail address in your Comment 50 . Which means that it can (and will) be read out by spam bots. You can test this by yourself: Log out and than call this page - you won't see ANY mail address but mine. So, please ...
Unfortunately, you can't delete a post. But I've hidden it.
I don't understand how the patch fixes the problem. Can you catch logs with "NSPR_LOG_MODULES=timestamp:1,MediaCache:5,MediaResource:5" so I can check if there are any abnormal network/cache activities?
(In reply to JW Wang [:jwwang] from comment #56) > I don't understand how the patch fixes the problem. it no longer calls MediaResource::ReadAt and MediaResource::Pin/UnPin
(In reply to JW Wang [:jwwang] from comment #56) > Can you catch logs with > "NSPR_LOG_MODULES=timestamp:1,MediaCache:5,MediaResource:5" so I can check > if there are any abnormal network/cache activities? Unfortunately, it's difficult to reproduce in a debug build. The effects are nowhere near as severe and can take much longer to reproduce. In an opt build, the issue will be seen in the first 10s of playback
NSPR logs are always printed in both debug and opt builds.
Comment on attachment 8733326 [details] MozReview Request: Bug 1256038: Remove special NotifyDataArrived handling in the DirectShow reader. r?cpearce https://reviewboard.mozilla.org/r/41727/#review38375
Attachment #8733326 - Flags: review?(cpearce) → review+
that's news to me ! (In reply to JW Wang [:jwwang] from comment #59) > NSPR logs are always printed in both debug and opt builds. I didn't know that ! :) always thought you had to use debug build !! awesome I've put the results in bug 1258673
Status: NEW → RESOLVED
Last Resolved: 3 years ago
status-firefox48: affected → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla48
As a newbie I'm not sure about the flags: Can somebody tell in which release version this bug will be fixed please? Am I right to have to wait with my next update until v.48?
the "proper" fix is in bug 1258673 request to uplift to 46. so hopefully it will be.
This is great, nij9, thank you for all the testing and QE work! Amazing. This should be fixed in 46 beta 6 or 7, by the end of this week.
tracking-firefox46: ? → -
Updating flags to reflect that the fix for this has landed in Fx46 via bug 1258673.
status-firefox45: affected → wontfix
status-firefox46: affected → fixed
status-firefox47: affected → fixed
You need to log in before you can comment on or make changes to this bug.