Closed Bug 12813 Opened 25 years ago Closed 24 years ago

<SCRIPT SRC="xx.js" CHARSET="xxx"> does not work

Categories

(Core :: DOM: Core & HTML, defect, P3)

x86
Windows NT
defect

Tracking

()

VERIFIED INVALID

People

(Reporter: teruko, Assigned: ftang)

References

()

Details

(Keywords: helpwanted, testcase, Whiteboard: nsbeta3+, wait for QA to correct test cases)

The URL above has different test cases for <SCRIPT SRC="xx.js" CHARSET="xxx">.

Only the iso-8859-1 page works; the rest of the test cases do not.

Steps to reproduce
1. Click on sjis.html, euc.html, and so on.
   The Japanese text is displayed as garbage.

Tested with the 8-30-10-M10 Win32 build.
Priority: P3 → P2
QA Contact: cbegle → teruko
Assignee: mccabe → vidur
Component: Javascript Engine → DOM Level 0
Not JavaScript engine. Vidur, your world or elsewhere?
The problem might be in CNavDTD. The CNavDTD::AddLeaf function doesn't preserve
the original result code which is NS_ERROR_HTMLPARSER_BLOCK, thus causing
SCRIPT blocks to be evaluated out of order. Note the addition of "nsresult rv =
NS_OK".

Try this substitute:
nsresult CNavDTD::AddLeaf(const nsIParserNode& aNode){
  nsresult result=NS_OK;

  if(mSink){
    eHTMLTags theTag=(eHTMLTags)aNode.GetNodeType();
    OpenTransientStyles(theTag);

    STOP_TIMER();

    result=mSink->AddLeaf(aNode);

#if 1
    PRBool done=PR_FALSE;
    nsCParserNode*  theNode=CreateNode();
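    // Separate result code for the loop below, so the original 'result'
    // (e.g. NS_ERROR_HTMLPARSER_BLOCK) is preserved and returned.
    nsresult rv = NS_OK;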
    while(!done) {
      CToken*   theToken=mTokenizer->PeekToken();
      if(theToken) {
        eHTMLTags theTag=(eHTMLTags)theToken->GetTypeID();
        switch(theTag) {
          case eHTMLTag_newline:
            mLineNumber++;
            // fall through: a newline is also added to the sink as a leaf
          case eHTMLTag_text:
          case eHTMLTag_whitespace:
            {
              theToken=mTokenizer->PopToken();
              theNode->Init(theToken,mLineNumber,GetTokenRecycler());
              rv=mSink->AddLeaf(*theNode);
            }
            break;
          default:
            done=PR_TRUE;
        } //switch
      }//if
      else done=PR_TRUE;
    } //while
    RecycleNode(theNode);

#endif

    START_TIMER();

  }
  return result;
}
Assignee: vidur → brendan
Status: NEW → ASSIGNED
Target Milestone: M15
I can't find any code that implements SCRIPT SRC= CHARSET= in MozillaClassic
(http://lxr.mozilla.org/mozilla/source/lib/layout/layscrip.c#1069 or below), and
I recall we decided against that attribute, because the server knows best.  Did
it make it into HTML 4.0?  D'oh!

Oh well, this can't be high priority, because Nav4.x doesn't do it.

/be
Priority: P2 → P4
Adjusting priority.

/be
Target Milestone: M15 → M16
Adding erik for advice -- is this a priority?  Does IE do it?  Any tips on
implementation will be gratefully accepted.  At this point it's likely to miss
M16 and M17, due to my other bugs.

/be
Adding Juraj (jbetak) to Cc list. Juraj, you did something like this recently,
right? Would you please help me answer Brendan's question in this bug report?
Erik, Brendan,

I'm marking this as a duplicate of 32604; that's the bug we discussed with Frank
some 3 weeks ago. I finalized the code changes on Friday and put them in; they
are in today's build (04/17/2000).

Essentially, we now look at the charset info in the HTTP header content type,
then at the charset attribute in the <SCRIPT> tag, and if neither is present,
we default to the document charset. I still might need to do some cleanup,
since Nisheeth and RickG would like to have some profiling done first and
possibly push for some optimization.
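
The lookup order described above could be sketched roughly as follows. This is only an illustration, not the code that was checked in: the function name ResolveScriptCharset and its parameters are hypothetical, and an empty string stands for "not present".

```cpp
#include <string>

// Hypothetical helper, not Mozilla code: picks the charset used to decode an
// external script, following the order described in the comment above.
// An empty string means "not specified".
std::string ResolveScriptCharset(const std::string& httpHeaderCharset,
                                 const std::string& scriptTagCharset,
                                 const std::string& documentCharset) {
  if (!httpHeaderCharset.empty()) {
    return httpHeaderCharset;  // 1. charset from the HTTP Content-Type header
  }
  if (!scriptTagCharset.empty()) {
    return scriptTagCharset;   // 2. CHARSET attribute on the <SCRIPT> tag
  }
  return documentCharset;      // 3. fall back to the document charset
}
```

For example, with no HTTP charset and CHARSET="Shift_JIS" on the tag, the sketch returns "Shift_JIS"; with neither present, it returns whatever the document charset is.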

IE's current support of external JS files is very bad; essentially, they expect
the content to be just ASCII. Please see
http://www.arukikata.co.jp for a good real-world example.

I can confirm from my observation that Nav4.x only supports the fallback to the
document charset; it doesn't observe charset information
in the <SCRIPT> tag.

*** This bug has been marked as a duplicate of 32604 ***
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → DUPLICATE
I tested this in the 2000-05-16-09 Win32 build.
In http://babel/javascript_i18n/jsfiles/JavaScript_enc.htm
sjis.html, euc.html, iso-8859-1.html, iso-8859-2.html, and utf8.html do
not work.

The source of sjis.html, euc.html, iso-8859-1.html, and iso-8859-2.html is

<SCRIPT SRC = "xx.js CHARSET=Shift_JIS"..>  

They do not work.

sjis-nochar.html, euc-nochar.html, iso-8859-1_nochar.html, and
iso-8859-2_nochar.html work.

Their source is <SCRIPT SRC = "xx.js"..>

I am reopening this.
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
I didn't claim to fix a DUP of this, jbetak did.  Reassigning...

/be
Assignee: brendan → jbetak
Status: REOPENED → NEW
Status: NEW → ASSIGNED
M16 has been out for a while now; these bugs' target milestones need to be
updated.
Keywords: testcase
Assignee: jbetak → ftang
Status: ASSIGNED → NEW
OK, this has to do with the fact that the "Charset" keyword is not an attribute 
of the JavaScript tag. We currently parse all JS tag attributes in 
HTMLContentSink::ProcessSCRIPTTag. For the cases Teruko is pointing out, we 
actually need to parse the string containing the name of the external JS file. 

It might be a good idea to check whether this procedure is really required.
If it is, all the changes should be pretty much contained in
nsHTMLContentSink.cpp.

Marking it as assigned.
Status: NEW → ASSIGNED
Keywords: nsbeta3
nsbeta3+ per bug meeting with P3. We need more real world cases. 
Keywords: 4xp, helpwanted
Priority: P4 → P3
Whiteboard: nsbeta3+
There should be no question about the validity of the charset
attribute within the SCRIPT tag. It is defined in HTML 4.0.
Web designers can and will use this attribute as needed.
This is also a regression from 4.x, so we need to change the priority of this
bug.
teruko, your test cases are wrong. When I look at them, they look like
<SCRIPT SRC="xx.js  CHARSET=xxx" LANGUAGE="JavaScript">
but not
<SCRIPT SRC="xx.js"  CHARSET="xxx" LANGUAGE="JavaScript">

It should be
<SCRIPT SRC="sjis.js"  CHARSET="Shift_JIS" LANGUAGE="JavaScript">
but not
<SCRIPT SRC="sjis.js  CHARSET=Shift_JIS" LANGUAGE="JavaScript">

You have this kind of error in all your test cases. Please correct them.
Thanks.

Notice that there should be a '"' after xx.js and a '"' between CHARSET= and xxx.

Can you correct your test cases and retest this problem?
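
To illustrate why the misplaced quote matters, here is a simplified sketch of an attribute scanner. It is an assumption-laden toy, not the Mozilla parser: the name ParseAttributes is hypothetical, it takes only the text between "<SCRIPT" and ">", and it does not handle whitespace around '='. The point is that a quoted value runs to the closing quote, so CHARSET=xxx inside the SRC quotes becomes part of the file name instead of a separate attribute.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <string>

// Simplified, hypothetical attribute scanner (not the Mozilla parser).
// Returns name -> value pairs, with attribute names upper-cased.
std::map<std::string, std::string> ParseAttributes(const std::string& s) {
  std::map<std::string, std::string> attrs;
  std::size_t i = 0;
  const std::size_t n = s.size();
  while (i < n) {
    // Skip whitespace between attributes.
    while (i < n && std::isspace(static_cast<unsigned char>(s[i]))) ++i;
    // Read the attribute name up to '=' or whitespace.
    std::string name;
    while (i < n && s[i] != '=' &&
           !std::isspace(static_cast<unsigned char>(s[i]))) {
      name += static_cast<char>(std::toupper(static_cast<unsigned char>(s[i])));
      ++i;
    }
    if (name.empty()) break;  // nothing left to parse, or malformed input
    std::string value;
    if (i < n && s[i] == '=') {
      ++i;  // skip '='
      if (i < n && s[i] == '"') {
        ++i;  // skip the opening quote
        // A quoted value runs to the closing quote, spaces included.
        while (i < n && s[i] != '"') value += s[i++];
        if (i < n) ++i;  // skip the closing quote
      } else {
        // An unquoted value ends at the next whitespace.
        while (i < n && !std::isspace(static_cast<unsigned char>(s[i]))) {
          value += s[i++];
        }
      }
    }
    attrs[name] = value;
  }
  return attrs;
}
```

With the faulty test-case markup, this sketch yields a single SRC attribute whose value still contains "CHARSET=Shift_JIS" and no CHARSET attribute at all; with the corrected markup, SRC and CHARSET come out as two separate attributes.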

The HTML 4.01 specification says the following:
http://www.w3.org/TR/html4/interact/scripts.html#h-18.2.1

<!ELEMENT SCRIPT - - %Script;          -- script statements -->
<!ATTLIST SCRIPT
 charset     %Charset;      #IMPLIED  -- char encoding of linked resource --
 type        %ContentType;  #REQUIRED -- content type of script language --
 src         %URI;          #IMPLIED  -- URI for an external script --
 defer       (defer)        #IMPLIED  -- UA may defer execution of script --
>

charset is an attribute here. This is very different from <META
HTTP-EQUIV="Content-Type">, where charset is a sub-parameter of the CONTENT
attribute.
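
For contrast, pulling the charset out of a CONTENT value such as "text/html; charset=Shift_JIS" means parsing inside the attribute value. The sketch below is a hypothetical helper (CharsetFromContentType is not an existing function), and it assumes an unquoted charset value.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: extracts the charset sub-parameter from a
// Content-Type value such as "text/html; charset=Shift_JIS".
// Returns an empty string if no charset parameter is found.
// Assumes the charset value is not quoted.
std::string CharsetFromContentType(const std::string& content) {
  // Lower-case a copy so the search for "charset=" is case-insensitive.
  std::string lower(content);
  std::transform(lower.begin(), lower.end(), lower.begin(),
                 [](unsigned char c) {
                   return static_cast<char>(std::tolower(c));
                 });
  const std::string key = "charset=";
  std::size_t pos = lower.find(key);
  if (pos == std::string::npos) return "";
  pos += key.size();
  // The value runs until ';', whitespace, or the end of the string.
  std::size_t end = pos;
  while (end < content.size() && content[end] != ';' &&
         !std::isspace(static_cast<unsigned char>(content[end]))) {
    ++end;
  }
  return content.substr(pos, end - pos);
}
```

On the SCRIPT tag no such sub-parsing is needed, because the charset is the whole value of its own attribute.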
Summary: <SCRIPT SRC="xx.js" CHARSET=xxx"> does not work → <SCRIPT SRC="xx.js" CHARSET="xxx"> does not work
It could simply be that the test cases are wrong. We should retest after teruko
corrects the test cases and see whether this is INVALID or not.
Whiteboard: nsbeta3+ → nsbeta3+, wait for QA to correct test cases
I will correct my test cases next Monday since I cannot access babel server 
from home.
I corrected my test cases. They work fine. I am marking this bug as INVALID.
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → INVALID
Verified.
Status: RESOLVED → VERIFIED