Closed Bug 565938 Opened 14 years ago Closed 14 years ago

[HTML5] LinkedIn recruiter loads, then disappears, then starts looking for google.com

Categories

(Tech Evangelism Graveyard :: English US, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: breckard, Unassigned)

References

Details

(Keywords: regression)

Attachments

(1 file)

https://www.linkedin.com/cap/

Sadly, you need an account, but if you email me I can provide you with the credentials to use.

When you hit the site, it loads, then disappears, and then we seem to go off looking for google.com.
> but if you email me I can provide you with the credentials to use.

Please.
Incoming!
OK, definitely due to the way HTML5 parser treats document.write.  What happens here is that the script https://www.google.com/jsapi/?key=notsupplied is running after parsing is done, and this script happens to do a document.write of something like <script src="https://www.google.com/uds/?file=feeds&v=1"> in this case.

This script is triggered from a JS call in https://www.linkedin.com/cap/js/cap-blog.js on line 1, which is:

  google.load("feeds", "1");

cap-blog.js is loaded like so:

  getJS("/cap/js/cap-blog.js", true);

where getJS is defined in https://www.linkedin.com/cap/js/loader.js like so, trimming out the part irrelevant to the call for cap-blog.js:

/**
 * Prevent JS blocking ( || downloads)
 * @param {String} path     Absolute path to the JS file
 * @param {Boolean} defer   Should in be BODY or HEAD? inserts "document.getElementsByTagName( node )[0]"
 * @param {String} method   Choose between "scriptDOM" or "XHRInject"
 */
function getJS( path, defer, method ){
  var isIE = /*@cc_on!@*/false;
  var isSafari = document.childNodes && !document.all &&
                !navigator.taintEnabled;
  if ( !isIE && !isSafari ) {
    var node;
    (defer !== true) ? node = 'head' : node = 'body';
    method = method || 'scriptDOM';
    node = document.getElementsByTagName(node.toUpperCase())[0];

    if (node) {
      var s = document.createElement('SCRIPT');
      s.src = path;
      document.getElementsByTagName(node.appendChild(s));
    }
  } else if ( isSafari || isIE ) {
      document.write('<scr'+'ipt src="' + path + '"><\/scr'+'ipt>');
  }  // END isIE/isSafari
}

Now what confuses me is that the getJS call above is happening directly off the parser:

Breakpoint 2, nsHTMLDocument::GetElementsByTagName (this=0x7fffda524800, aTagname=..., aReturn=0x7fffffffbbb0)
    at ../../../../../mozilla/content/html/document/src/nsHTMLDocument.cpp:1393
1393      nsAutoString tmp(aTagname);
(gdb) jsstack 
0 getJS(method = "scriptDOM", defer = true, path = "/cap/js/cap-blog.js") ["https://www.linkedin.com/cap/js/loader.js":71]
    s = undefined
    xhr = undefined
    node = "body"
    isSafari = false
    isIE = false
    this = [object Window @ 0x7fffe3c94d30 (native @ 0x7fffe3c86c60)]
1 <TOP LEVEL> ["https://www.linkedin.com/cap/dashboard/home":885]
    this = [object Window @ 0x7fffe3c94d30 (native @ 0x7fffe3c86c60)]
(gdb) bt
#0  nsHTMLDocument::GetElementsByTagName (this=0x7fffda524800, aTagname=..., aReturn=0x7fffffffbbb0) at ../../../../../mozilla/content/html/document/src/nsHTMLDocument.cpp:1393
#1  0x00007ffff6932476 in nsIDOMDocument_GetElementsByTagName (cx=0x7fffe3c5c000, argc=1, vp=0x7fffdbd97138) at dom_quickstubs.cpp:3692
#2  0x00007ffff5190e95 in js_Interpret (cx=0x7fffe3c5c000) at ../../../mozilla/js/src/jsops.cpp:2199
#3  0x00007ffff51a59fa in js_Execute (cx=0x7fffe3c5c000, chain=0x7fffe98e2680, script=0x7fffdc437400, down=0x0, flags=0, result=0x0)
    at ../../../mozilla/js/src/jsinterp.cpp:1073
#4  0x00007ffff5116b22 in JS_EvaluateUCScriptForPrincipals (cx=0x7fffe3c5c000, obj=0x7fffe98e2680, principals=0x7fffdbd5b9d8, chars=0x7fffffffc870, length=49, filename=
    0x7fffdbb180a8 "https://www.linkedin.com/cap/dashboard/home", lineno=881, rval=0x0) at ../../../mozilla/js/src/jsapi.cpp:4880
#5  0x00007ffff6443a48 in nsJSContext::EvaluateString (this=0x7fffe3c94550, aScript=..., aScopeObject=0x7fffe98e2680, aPrincipal=0x7fffdbd5b9d0, aURL=
    0x7fffdbb180a8 "https://www.linkedin.com/cap/dashboard/home", aLineNo=881, aVersion=0, aRetValue=0x0, aIsUndefined=0x7fffffffc78c)
    at ../../../mozilla/dom/base/nsJSEnvironment.cpp:1763
#6  0x00007ffff61fe361 in nsScriptLoader::EvaluateScript (this=0x7fffdbd2e9a0, aRequest=0x7fffdbd11d00, aScript=...)
    at ../../../../mozilla/content/base/src/nsScriptLoader.cpp:760
#7  0x00007ffff61fdd64 in nsScriptLoader::ProcessRequest (this=0x7fffdbd2e9a0, aRequest=0x7fffdbd11d00) at ../../../../mozilla/content/base/src/nsScriptLoader.cpp:673
#8  0x00007ffff61fd9fa in nsScriptLoader::ProcessScriptElement (this=0x7fffdbd2e9a0, aElement=0x7fffdc492ba0) at ../../../../mozilla/content/base/src/nsScriptLoader.cpp:624
#9  0x00007ffff61fa799 in nsScriptElement::MaybeProcessScript (this=0x7fffdc492ba0) at ../../../../mozilla/content/base/src/nsScriptElement.cpp:195
#10 0x00007ffff62ea7f6 in nsHTMLScriptElement::MaybeProcessScript (this=0x7fffdc492b30) at ../../../../../mozilla/content/html/content/src/nsHTMLScriptElement.cpp:552
#11 0x00007ffff62ea4b4 in nsHTMLScriptElement::DoneAddingChildren (this=0x7fffdc492b30, aHaveNotified=1)
    at ../../../../../mozilla/content/html/content/src/nsHTMLScriptElement.cpp:480
#12 0x00007ffff65c452e in nsHtml5TreeOpExecutor::RunScript (this=0x7fffda5426f0, aScriptElement=0x7fffdc492b30) at ../../../mozilla/parser/html/nsHtml5TreeOpExecutor.cpp:725
#13 0x00007ffff65c3b3c in nsHtml5TreeOpExecutor::RunFlushLoop (this=0x7fffda5426f0) at ../../../mozilla/parser/html/nsHtml5TreeOpExecutor.cpp:521
#14 0x00007ffff65c513e in nsHtml5ExecutorReflusher::Run (this=0x7fffdbd6cea0) at ../../../mozilla/parser/html/nsHtml5TreeOpExecutor.cpp:90

Why is that script added by getJS not blocking the parser?   I would think it would.  Henri?
blocking2.0: --- → ?
Keywords: regression
Hmm.  I guess it did in the old parser but might not in the new one...  if it were document.written, it would, of course.

Which makes me want to say this should be tech evang.
I have a similar problem like this in bug 560256, which was marked invalid.
Right.  The question here is only why the document load finishes before the cap-log.js script loads.
(In reply to comment #3)
> /**
>  * Prevent JS blocking ( || downloads)
>  * @param {String} path     Absolute path to the JS file
>  * @param {Boolean} defer   Should in be BODY or HEAD? inserts
> "document.getElementsByTagName( node )[0]"
>  * @param {String} method   Choose between "scriptDOM" or "XHRInject"
>  */
> function getJS( path, defer, method ){
>   var isIE = /*@cc_on!@*/false;
>   var isSafari = document.childNodes && !document.all &&
>                 !navigator.taintEnabled;
>   if ( !isIE && !isSafari ) {
>     var node;
>     (defer !== true) ? node = 'head' : node = 'body';
>     method = method || 'scriptDOM';
>     node = document.getElementsByTagName(node.toUpperCase())[0];
> 
>     if (node) {
>       var s = document.createElement('SCRIPT');
>       s.src = path;
>       document.getElementsByTagName(node.appendChild(s));
>     }
>   } else if ( isSafari || isIE ) {
>       document.write('<scr'+'ipt src="' + path + '"><\/scr'+'ipt>');
>   }  // END isIE/isSafari
> }

Aargh. I very much dislike this pattern. We converge on a common behavior by implementing WebKit and IE traits but sites assume old Gecko traits and put us on the unsafe code path. But having different code paths is pointless in the first place!

> Why is that script added by getJS not blocking the parser?   I would think it
> would.  Henri?

The spec says it shouldn't block the parser:
See the second to last case under step 8 under http://www.whatwg.org/specs/web-apps/current-work/#running-a-script

The historical blocking behavior here is different between browsers. The problem is that the site assumes the blocking behavior will stay constant in each browsers engine and they will never converge on a common behavior. Worse, the site clearly tries to get unblocking behavior where possible ("Prevent JS blocking") but actually relies on getting blocked everywhere...

I suggest contacting the site asking them to rewrite the method as:
/**
 * Load a script during the HTML parse. Only to be called from 
 * script elements that appear in the document's source or that have
 * themselves been created using document.write(). Do not call 
 * from event handlers or timeouts!
 * @param {String} path     Absolute path to the JS file
 * @param {Boolean} defer   Ignored
 * @param {String} method   Ignored
 */
function getJS( path, defer, method ){
  document.write('<scr'+'ipt src="' + path + '"><\/scr'+'ipt>');
}

And then introducing a new method:
/**
 * Load a script avoiding blocking. The script designated by 
 * path MUST NOT call document.write()!
 * @param {String} path     Absolute path to the JS file
 */
function getJSAvoidBlocking( path ){
  var s = document.createElement("script");
  s.src = path;
  document.getElementsByTagName("head")[0].appendChild(s);
}

And then using the new method only for scripts that have been reviewed not to call document.write().
Assignee: nobody → english-us
Component: HTML: Parser → English US
OS: Mac OS X → All
Product: Core → Tech Evangelism
QA Contact: parser → english-us
Hardware: x86 → All
Bret, do you want to drop the linkedin folks an e-mail?
I'm also seeing this kind of problem on http://www.weer.nl . No problem in Firefox3.6.
BZ, Aakash knows someone on the LinkedIn Frontend team.  We'll get him CC'd on the bug.  Thank you for looking into this.
Attached file unminimized testcase
Works fine in Firefox3.6.3, IE8 and Google Chrome, but fails on Firefox trunk.
Ok, he's cc'ed to the bug.
Wait, I'm confused, what exactly changed when we turned on the HTML5 parser?

Comment 3 says:

"What happens here is that the script https://www.google.com/jsapi/?key=notsupplied is running after parsing is done".

Was this always running after parsing is done? If so, why didn't we break before turning on the HTML5 parser? getJS is obviously a enormously broken mess, but I can't see why it would have changed behavior by turning on the new parser?

We always nuked the current document if document.write happens after parsing is done.
> Was this always running after parsing is done?

No.  In the old parser, inserting a script like that blocked the parser.
Inserting a script how? Using createElement/appendChild?

I couldn't find a reference in comment 3 for how the initial script was inserted.
Which initial script?  You have an inline script in the page calling getJS.  getJS inserts a script via appendChild; I believe we blocked on the script getJS inserts in the old parser.
Ok, I guess I don't understand the first two paragraphs of comment 3 at all.

What it boils down to I guess is: Is this difference because a createElement/appendChild created script used to block the parser but no longer does?

If yes: Why isn't this biting safari? Is it due to the isSafari check making safari use document.write("<script>...") which makes safari block?

If no: Can someone explain the steps in comment 3 more clearly. (Defining all "this script" references would help)
> Is this difference because a createElement/appendChild created script used to
> block the parser but no longer does?

Yes.

> Is it due to the isSafari check making safari use document.write("<script>...")
> which makes safari block?

Yes.
Cool, thank you for explaining. Then I definitely agree this is Evangelism material and that there is nothing we should do on our end. Other than asking linkedin to change their code of course.
Note that it's not just about parser done vs. parser not done. As of HTML5, roughly like in IE, the document is blown away if document.write() is called when 'insertion point' (in the sense the spec defines the term) is undefined. When a script-inserted external script runs, 'insertion point' is undefined.

The old parser allowed asynchronously-running scripts insert stuff into wherever nsScanner was at the time. This seemed to "work" as long as the script happened fired before EOF and nsScanner happened to be at a token boundary. Some things "work" in Safari for the same reason (on other sites that don't have an isSafari check, that is): if the script happens to fire before EOF, Safari lets the script write into a timing-dependent point in the input stream.
weer.nl is bug 553795.
Why shouldn't Mozilla be doing as what IE is doing?
We are. Changing to behaving like IE is what broke us here. Likely because the page is using browser detection and using different code paths for us and IE.
That is not so the case from what I understood from bug 560256.
Hmm.. so IE blocks on dynamically inserted <script> elements? Do they block the parser or do they do something else whacky?

In any case, given how few sites have broken so far, I'd like to try to stick to the HTML5 spec for now.
Based on further testing, I've reopened bug 560256 to track changing Gecko & the spec so that in some cases (which cases exactly?) document.write() with an undefined insertion points gets ignored.

If we start ignoring document.write() in some cases, I'd expect LinkedIn (and weer.nl) still to be better off using their "IE" script loading mechanism for all browsers in order not to have their document.written stuff ignored.
Summary: LinkedIn recruiter loads, then disappears, then starts looking for google.com → [HTML5] LinkedIn recruiter loads, then disappears, then starts looking for google.com
breckard, can you please retest in a fresh nightly now that bug 560256 has landed?
Hi Henri, everything looks to be functioning normally!  Thank you for looking into this everyone!
This is now working, so I guess this can be closed (even though Henri thinks they should fix their script, as mentioned in comment 27).
Resolving as WFM, since the user-perceived experience works even though it's not clear if the site actually FIXED in the evangelism component sense.

Note to LinkedIn developers: If you want your analytics to actually work, comment 7 still applies after the landing of bug 560256.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → WORKSFORME
Assignee: english-us → nobody
Component: English US → HTML: Parser
Product: Tech Evangelism → Core
QA Contact: english-us → parser
Assignee: nobody → english-us
blocking2.0: ? → ---
Component: HTML: Parser → English US
Product: Core → Tech Evangelism
QA Contact: parser → english-us
Product: Tech Evangelism → Tech Evangelism Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: