Closed Bug 859640 Opened 11 years ago Closed 8 years ago

HTMLImageElement.src should be [TreatNullAs=EmptyString]

Categories

(Core :: DOM: Core & HTML, defect)

20 Branch
x86
Windows XP
defect
Not set
normal

Tracking

()

RESOLVED WONTFIX
Tracking Status
firefox20 --- affected
firefox21 - affected
firefox22 - affected
firefox23 - affected

People

(Reporter: juwagn, Unassigned)

References

Details

(Keywords: regression)

User Agent: Mozilla/5.0 (Windows NT 5.1; rv:20.0) Gecko/20100101 Firefox/20.0
Build ID: 20130326150557

Steps to reproduce:

Used xhr-polling for data transfers.


Actual results:

After updating from version 19 to 20 FireFox begun to produce delays up to 3-4 seconds in creating new request (mixed POST/GET) with XMLHttpRequest.
This delay affects rapidly the usability of software based on xhr-polling, because the delay is produced in time frames: ok 3-4 seconds, delay 3-4 seconds, ok 3-4 seconds, delay 3-4 seconds and so on..



Expected results:

It should not produce delays and work as version FF 19 and all before.
This delay is not existing on Chrome PC, Opera PC/Opera Mobile, Maxthon, Android native browser, iOS Safari, only FireFox 20.0 has this not acceptable delay.
Why?
Do you have a testcase?
Flags: needinfo?(juwagn)
Hello, yes, I have, i have sent you the information per email.
Giving the access to server with test case!
Flags: needinfo?(juwagn)
According to testcase sent by the reporter, there is a regression.
His web app (remote Windows desktop) shows a slowness, typing (or switching) in username/password fields of the Windows user login menu is enough to reproduce the issue.

m-c
good=2013-01-04
bad=2013-01-05
Changeset: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=801ba75ac563&tochange=d8ca3e1c469e
Status: UNCONFIRMED → NEW
Ever confirmed: true
NB: I can provide the testcase by email (URL is not public) to narrow down the changeset.
May I ask about the reference to the bug 825527, also discussed bug hangs together with convering HTMLImageElement? Just would like to take into account, the issue does not happen, when Websockets are used as transport protocol (remove :4433/foo/ from the link to check)
Or when the bug with converting HTMLImageElement comes, the creation of new XMLHttpRequest connection is blocked and does not happen with Websockets due it's persistent nature!? (hardly proven logic)
PS: I have noticed, that the problem comes inside GET connection, when at same time the POST connections are created as usual even at time of the block, so from the transport layer view it makes no difference, only the downloaded amount of data, that is evidenlty much bigger by GET request.
I have checked yesterday FF 21.0b2 and Aurora 22.0a2, both had same problem.
Would like to check ff23, if I would get it :)
Last working version was FF19.0.2, checked it yesterday too.
(In reply to juwagn from comment #6)
> Would like to check ff23, if I would get it :)

Here: http://nightly.mozilla.org/
juwagn, do you have a minimal testcase to reproduce the issue?
It would help instaead of a huge web app showing the issue.
Flags: needinfo?(juwagn)
I have checked now ff23 nightly build, the same problem 1:1
Loic, I have of course no other test case, where i could reduce the code so, that the problem would be still visible, i have actually no idea which version would be minimalistic enough to see the difference. As said already, there are no errors, nothing to catch except the visible difference when blocking.
I have no special checks, that would check the FF version, also nothing in fact.
So there is somewhere small difference between 19.0.2 and >=20.0 which makes XMLHhttpRequest blocking for 3-4 seconds time frames.
PS: is that possible to revert back the module, which processes XMLHttpRequests to see, if it could fix the problem or are the changes to big to make that so easy?
Flags: needinfo?(juwagn)
I'm going to CC the dev Ehsan who has been assigned to bug 825527 and send him the testcase by email (because the URL is not public).
Flags: needinfo?(ehsan)
Hmm, how did we blame bug 825527?  That has nothing to do with XHR.  I also tried the test case but did not know what to expect, and things seemed to be working fine...
Flags: needinfo?(ehsan)
2 Ehsan Akhgari
Hm, fine? May I test the version, which you meant, is working fine?!

In fact you will not notice the issue immediately, you see always everything works fine except one thing. And you can see the difference when you type some characters, or do anything, that produces download traffic, but best visible when typing.
1. Open version 20.0 start the session
2. Open version 19.0.2 start the session
3. Type characters very fast as much as possible to initiate image traffic
(on this place I can ensure you, the post requests are arriving right at time, so no problem with)
4. On version 20.0 you will notice, there is blocking delay up to 3-4 seconds when downloading data with GET per XMLHttpRequest - visible on changes. This delay comes not always, but in time chunks as described, 3-4 seconds ok, than 3-4 seconds blocking, than again ok, than again blocking, and so on.
I can insert for you the counter, which will count up the GET and POST Requests to make it more useful and visible for you, so that on these changes you would feel the problem. Wait a little bit.
As said, on version 19.0.2 and below there is everything fine, as fine, as on Chrome, Opera, Safari Mobile etc..
I added the counter, which makes the problem very good visible.
Especially between FF20.0 and FF19.0.2(or Chrome, Opera, Safari etc.)
When POST request are running well, the GET request stop for a while, this is good visible on the counter amount correlation between POST and GET requests.
So I can assume, when GET request is in progress, it waits for something before creating new XMLHttpRequest, but for what does it wait, for data?
Than the question is, why this data is not arriving immediately? On server side I set correct data amount being sent for GET requests.
When used Websockets as transport protocol, everything is fine, but that is not the discussed problem.
PS: possibly you have to refresh/delete the cache to update the script.
XHRPolling.prototype._get = function(){
   var self = this;
   this._xhr = this._request(+ new Date, 'GET');
   this._xhr.onreadystatechange = function(){
   var status;
      if(self._xhr.readyState == 4) {
          self._xhr.onreadystatechange = empty;
          try { status = self._xhr.status; } catch(e) { }
          if(status == 200) { 
            try { self._onData(self._xhr.responseText); } catch(e) { }
            self._get();
          }
      }
  };

  this._xhr.onerror = function(e) { alert("error"); }; 
  try { this._xhr.send(null); } catch(e) { }
};

The onreadystatechange seems not to fire always at correct time when used with FF20.0, especially when many mixed GET/POST requests are sent. (that is part from socket.io)
The onerror handler does not fire too, so no error thrown in fact, where i could assume, the connection could be aborted or interrupted.
I am not FF programmer, but I can assume, the function has to wait for something before to fire. First assuming, it waits for arriving GET data, so I would firstly check that place. May be some strange bug inside network communication part, however POST requests are not involved. May be some ordering problem.
There are also very many possible pitfalls, well, hopefully it will be fixed, you know your product much better than me :)
For the record, regression range found by Alice:

Regression window(m-i):
Good:
http://hg.mozilla.org/integration/mozilla-inbound/rev/62fca43436ce
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:20.0) Gecko/20130103 Firefox/20.0 ID:20130103053146
Bad:
http://hg.mozilla.org/integration/mozilla-inbound/rev/8df6f376fdfb
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:20.0) Gecko/20130103 Firefox/20.0 ID:20130103064243
Pushlog:
http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=62fca43436ce&tochange=8df6f376fdfb


In local build:
Last Good: e3a0ca11961e
First Bad: 8df6f376fdfb

Regressed by:Bug 825527
(In reply to :Ehsan Akhgari (needinfo? me!) from comment #11)
> Hmm, how did we blame bug 825527?  That has nothing to do with XHR.  I also
> tried the test case but did not know what to expect, and things seemed to be
> working fine...

Just type the login username/password in "another user" session menu. After that, when you see the remote Win 7 desktop, open a .txt with Notepad and type some words. You'll see the delay between your typing and the screen rendering.
Ok, also, fixing one bug creates the bug inside absolutely different place :)
I see, possibly the images part creates this delay inside GET requests.
As example, when that is not image drawing, but just cursor image change, which produces GET traffic too, than this problem does not happen. So I assume, after drawing the images into canvas the GET request gets big delays while retrieving data even when POST requests are ok at same time. And this bug does not affect Websocket, but only XMLHttpRequest GET
:) looks promising.
I don't think that the regression range is valid.  Trying mozregression myself now.
mozregression gives me http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=801ba75ac563&tochange=d8ca3e1c469e.  Somebody should bisect this range by building.  Lots of different stuff in there...
Hello, possibly i have found the deeper problem.
Usually I push the images into array
-------------------just an example
function myfunc(dx, dy, data) {
 var img = new Image();
 window.image.push(new Array(img, dx, dy));
 img.src = .....
-------------------and so on
on the image onload I call context.drawImage(img, dx, dy) etc..
Now comes the main part of deleting the images
------
window.image[0][0].src = null; //the image source
window.image[0][1] = null; 
window.image[0][2] = null; 
window.image[0] = null;
window.image.shift();
------
As you see, I try to set all array references to null to mark it for garbage collecting. By some browsers that really helps to avoid memory leaks or fast increasing of memory usage.
Now when I comment the line with .src
//window.image[0][0].src = null; //the image source
window.image[0][1] = null; 
window.image[0][2] = null; 
window.image[0] = null;
window.image.shift();
The delay disappears and even FF20.0 begins to work correctly now :)
window.imageDoingCacheDIM[0][0] = null; -> ok
window.imageDoingCacheDIM[0].src = null; -> 3-4 sec. delay in xmlhttp request :)
so in fact the setting of image src to null is dangerous when used with GET XMLHttpRequests, but same code works fine with Websockets despite of fact, that I set there image src to null..
However, no one browser had problem with this code, only FF20, this fix for me is enough to say, I can live with :)
Thank you all for your attention!
PS: on my test server i have changed this part too, instead to set src to null, i set the image reference inside array to null
before -> window.image[0][0].src = null;
after  -> window.image[0][0] = null;
So when you access the server for tests, than you will not notice the bug anymore. However it will not ensure you, that there will be no other idiot like me who sets src to null and uses xml request, and wonders, why there is a delay :)
Hmm, that is... interesting.  Do you have a smaller test case which reproduces this?

Boris, does any of this ring any bells?
WebIDL bindings for HTMLImageElement changed the behavior of the src setter.  It used to treat null as "", but now it treats it as "null" (which of course starts a new image load, etc).

Looks like Opera, Chrome, Safari all treat null as "" here.

IE9, like us, follows the current spec and treats it as "null".

Anne filed https://www.w3.org/Bugs/Public/show_bug.cgi?id=21668 on the spec here if that's what we want to change.

That said, I have no idea why this last would affect XHR, since we should be coalescing the image loads into a single HTTP request.  Unless the XHR is also being done to this "null" url?
Flags: needinfo?(juwagn)
Summary: XMLHttpRequest FF>=20.0 delays up to 4 seconds → null and HTMLImageElement.src
So the question is does IE also have the delay? What's the difference between us and IE?
Did the reporter even test with IE?
Well, even if my html5 client supports IE, usually the Websockets and Canvas are emulated via Flash for versions from IE5.5 up to IE9, and the image handling differs completely.
So the test case I have provided runs internally via Apache's reverse proxy, that does not support Websockets and I have to switch to xhr-polling instead (however the direct xhr-polling without reverse proxy between shows same behavior on FF20)
Internally the Apache's reverse proxy uses rewriting rules and downgrades the connection for MSIE browsers (I do not know for which reason). However, I do not care about IE, the support for it was not mandatory, only IE10 has needed support, but unchecked with Apache's reverse proxy.
By using direct xhr-polling the IE shows good behavior and by usage via Apache's reverse proxy due the rewriting rules it is very slow. So I can not directly compare IE5.5-IE9 with FF, Chrome or Opera.
And Masatoshi Kimura has right, I did not report IE as tested browser due predictable limits.
2 Boris, the setting the .src to "" is good working way too instead null, thank you! :)
Flags: needinfo?(juwagn)
But yes, it's worth figuring out where the delay is coming from, if the XHR is not in fact loading the same url.
Boris, I can revert the "null issue" back extra for you if you want to check, where it happens. However, what ever I tried to reduce the code so, that I could reproduce the delay issue with lesser code without canvas etc., I did not success (everything worked fine). Therefore this issue happens exactly under the actual code. So if you want to spend hours/days on finding that difference inside FF versions, just let me know.
What I'd actually appreciate is a clear description of how the XHR is related to these images, if at all.  Specifically, whether there's only one XHR and what URI(s) it goes to.  Also, what triggers it and how the delay is being computed.
Tracking for now, but if we don't find a low risk fix, we may untrack for upcoming releases given the minimal user impact (only 1 report).
Flags: needinfo?(juwagn)
(In reply to Boris Zbarsky (:bz) from comment #30)
> What I'd actually appreciate is a clear description of how the XHR is
> related to these images, if at all.  Specifically, whether there's only one
> XHR and what URI(s) it goes to.  Also, what triggers it and how the delay is
> being computed.

Hey bz, I have added needsinfo on the reporter to get help on your comment here.Any thing else we might be able to do here for Fx21 keeping in mind we are close to release ?
Short of reverting 21 to [TreatNullAs=EmptyString] here, no.

Not sure whether we want to do that, honestly.
Hello, to reproduce the error I would need to set again src to null instead = ""
Of course I can do that again, but as long the workaround was that easy there is just not so high interest on fixing it.
To invest so much time into digging such bugs is not worth of it, imho.
Flags: needinfo?(juwagn)
(In reply to Boris Zbarsky (:bz) from comment #33)
> Short of reverting 21 to [TreatNullAs=EmptyString] here, no.
> 
> Not sure whether we want to do that, honestly.

Thanks :bz. Given there was only one report and based on the work around here in comment# 34 I am going to untrack for future releases.
OK.  The real questions are:

1)  Whether other sites are affected (maybe not).
2)  Whether there is some other bug here that the workaround is just covering up...  That part is hard to tell.  :(
hixie is being unresponsive on the spec end, but I think we should assume the spec is just wrong here and switch the behavior back to [TreatNullAs=EmptyString].

Peter, Jonas, any objections?
Component: Untriaged → DOM: Core & HTML
Flags: needinfo?(peterv)
Flags: needinfo?(jonas)
Product: Firefox → Core
Summary: null and HTMLImageElement.src → HTMLImageElement.src should be [TreatNullAs=EmptyString]
I don't feel strongly either way. This isn't the first place where the fact that null stringifies to "null" is causing problems. Stringifying to "" generally seems like a better idea.

On the other hand, being consistent is generally good too.

I would say adding [TreatNullAs=EmptyString] here seems like the simpler way to deal with the problem. And it seems unlikely to have any negative real-world impact.
Flags: needinfo?(jonas)
OK.  Compat testing now shows that Chrome does the same thing we do (maps null to "null").  Safari maps null to "remove the attribute", but webkit nightly maps it to "null".  Given that, wontfix.
Status: NEW → RESOLVED
Closed: 8 years ago
Flags: needinfo?(peterv)
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.