Closed Bug 186884 Opened 22 years ago Closed 22 years ago

[FIX]unknown decoder should detect XML files

Categories

(Core :: Networking, enhancement, P1)

Tracking

RESOLVED FIXED
mozilla1.3beta

People

(Reporter: guy.marty, Assigned: bzbarsky)

Details

Attachments

(1 file, 1 obsolete file)

User-Agent:       Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.2.1) Gecko/20021130

I used to work on a Win98 computer. I gave my transformation files the 
extension .xslt, and it worked wonderfully with Mozilla (same version).

I tried the transformation on Windows 2000 and Mozilla displayed nothing. I 
changed the file extension from .xslt to .xsl (and corrected the calling XML 
file), and then it worked!

This is a very annoying bug because it makes it impossible to differentiate XSL 
Transformation files from XSL Formatting files.

Reproducible: Always

Steps to Reproduce:
1. Copy http://www.mozilla.org/projects/xslt/test.xml and 
http://www.mozilla.org/projects/xslt/test.xsl to your disk.
2. Open test.xml in Mozilla; it displays something.
3. Change the extension of test.xsl from .xsl to .xslt and correct the processing 
instruction in the XML file (change .xsl to .xslt).
4. Go to any other web page (mozilla.org).
5. Open test.xml in Mozilla (DO NOT GO 'BACK'). The display is not changed (i.e. 
nothing is displayed), but the page source is the right page.
6. Revert the changes (rename test.xslt back to test.xsl and correct test.xml).
7. Refresh; the file is displayed correctly.

Actual Results:  
Everything is explained in "Steps to Reproduce"

Expected Results:  
Everything is explained in "Steps to Reproduce"

I can't evaluate whether it is a 'Normal' bug or a 'Major' bug.

XML/XSLT/CSS work very well; thank you for providing these tools in Mozilla. I 
gave a demonstration of these tools and people were very impressed. For now, MSIE 
can't do that!
Mozilla needs the XSLT stylesheet to have "text/xml" as its MIME type. Windows maps
file->mimetype by looking at the extension, so you need to configure Windows to
associate .xslt with the "text/xml" MIME type. I don't have Windows 2000 here, so
I'm not sure about the exact steps. In Win98 the steps are:

Open file-explorer
Select  tools->folder options  in the menu
Go to the "file types" tab
Click "New Type"
Enter "XSLT Stylesheet" as description, ".xslt" as extension and "text/xml" as
  MIME type
Status: UNCONFIRMED → RESOLVED
Closed: 22 years ago
Resolution: --- → INVALID
I have made the modification in W2K and it works.

BUT the obligation to have an association between content type and file
extension at the file manager level is a VERY BIG problem.

In Windows 2000, to associate a content type with a file extension you have to
edit the registry!

This means that every XSL Transformation file must have an extension, and this
extension is .xsl!

Even if you respect this constraint, nobody (XML/XSLT application developers)
can be sure that Mozilla (1.2.1 on W2K) is able to perform the transformation,
because the .xsl association may have been deleted and it is difficult to recreate
it correctly.

Why don't you use the type declared in the processing instruction
(<?xml:stylesheet type="text/xsl"...?>) and/or the declarations at the beginning
of the XSL stylesheet file (<?XML... ?> and the namespace declaration) instead
of the file extension associated with a content type at the file manager level?

Thus, I am asking for a feature to be corrected rather than reporting a bug.

Thank you for your attention.
Status: RESOLVED → UNCONFIRMED
Resolution: INVALID → ---
Mozilla is very strict on MIME types, and this will not change. When loading files
off the web this is not a problem, since all that is needed is a properly
configured webserver. When loading files from the local filesystem we are at
the mercy of the OS to properly supply us with the MIME type. If it doesn't, then
what can we do?

You don't expect naming a file mypage.hello and have mozilla open as an HTML
file, do you? So why should mystylesheet.xslt be understood as an xml file?

If you don't want to reconfigure your OS you need to either put the files on a
webserver and configure it to send text/xml as MIME type, or use a filename that
your OS will map to text/xml, such as mystylesheet.xml


I do see your argument that the mimetype is specified in the <?xml-stylesheet?>
PI. It is a question of how strict we want to be wrt MIME handling, the spec
does say that xslt stylesheets should use a MIME type of "text/xml" or
"application/xml".

Peterv: with stylesheet-compilation we might be able to ignore the mime-type and
always parse as xml, do we want to?
> You don't expect naming a file mypage.hello and have mozilla open as an HTML
> file, do you? 

For a local file, I do indeed (and we do, by the way, if it looks anything like 
HTML).

We should only be strict with MIME types when the content comes from a source 
that sends MIME information (eg http).  This is the same problem that we have 
in the CSS loader -- see bug 120789.  Unfortunately, the existing necko apis 
are really not designed to give the caller that level of control easily... :(
In W2K, the simplest way to associate a MIME content type with a
file extension is to do it within Mozilla. Recipe:
* Choose the menu item: Edit/Preferences...
* Choose the category: Navigator/Helper Applications
* Click the button: New Type
* Fill in the fields: "Description of type", "File extension" (for me
'.xslt'), "MIME type" (in this case 'text/xml').

This works fine locally; I don't know whether it works with a web server
over HTTP. But it is enough for testing purposes.

Could you add this recipe to the XSLT page on Mozilla's web site?
ok, file:// mime type sniffing has nothing to do with XSLT.
I'm still suggesting to WONTFIX this, or INVALID.
Basically because webservers don't do content type sniffing to set the mimetype,
and if we do from file://, we just get another bunch of those "works from file
[not], works from server [not]" bugs.
Moving this over to network, putting some of the Keith watchers on CC.

(I don't think that OS or server administering should be part of the XSLT pages,
btw.)
Assignee: peterv → dougt
Severity: normal → enhancement
Component: XSLT → Networking: File
QA Contact: keith → benc
Summary: In windows 2000, transformations (XSLT) are file extension dependants → file:// protocol should have rich content sniffing
As things currently stand, this is a dup of the bug on having the unknown 
content decoder detect XML documents, no?
Whiteboard: DUPEME
-> file handling
Assignee: dougt → law
Component: Networking: File → File Handling
QA Contact: benc → petersen
Ok.. We have no bugs on detecting XML documents, it seems.  May as well use 
this bug.
Assignee: law → dougt
Status: UNCONFIRMED → NEW
Component: File Handling → Networking
Ever confirmed: true
OS: Windows 2000 → All
Priority: -- → P1
QA Contact: petersen → benc
Hardware: PC → All
Summary: file:// protocol should have rich content sniffing → unknown decoder should detect XML files
Whiteboard: DUPEME
Target Milestone: --- → mozilla1.3beta
taking
Assignee: dougt → bzbarsky
Here is the issue.... we want to detect the following things correctly:

     name   |            contents              |  detect as
   ----------------------------------------------------------
1) foo.dll  |             HTML                 |  text/html (eg ebay pages)
2) foo.xsl  |     XSLT including HTML tags     |  text/xml
3) foo.xul  |       XUL (looks like XML)       |  application/vnd.moz.xul+xml
4) foo.foo  |       XML data for some app      |  application/foo (or something)
5) foo.html |             XHTML                |  text/html?
6) foo.xhtml|             XHTML                |  application/xhtml+xml

To get item 2 right, I have to look for the XML decl before scanning for HTML
tags.  To get item 3 and item 4 right I have to look at extensions before I look
at the XML decl.  To get item 1 right, I have to scan for HTML tags before I
look at the extension.

Possible approach:
1)  Look for the XML decl.  If this is found, look at the extension; if that
    gives a useful type use that, otherwise use text/xml
2)  If we didn't find an XML decl, scan for HTML tags
3)  If we didn't find any, look at extension as we do now after looking for HTML
    tags.

Thoughts?
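
For illustration, the ordering proposed above can be sketched as a small standalone function. This is not the patch: `DetectType`, `TypeFromExtension`, `HasXMLDecl`, and `HasHTMLTags` are hypothetical names, the extension table is a made-up stand-in for the OS lookup, and the final `application/octet-stream` fallback is an assumption (the comment above only says "look at extension as we do now").

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <string>

// Hypothetical stand-in for the OS/extension lookup; returns "" if unknown.
static std::string TypeFromExtension(const std::string& name) {
  static const std::map<std::string, std::string> kTypes = {
      {"xsl", "text/xml"},
      {"xul", "application/vnd.mozilla.xul+xml"},
      {"html", "text/html"},
      {"xhtml", "application/xhtml+xml"},
  };
  const std::string::size_type dot = name.rfind('.');
  if (dot == std::string::npos) return "";
  const auto it = kTypes.find(name.substr(dot + 1));
  return it == kTypes.end() ? "" : it->second;
}

// True if the data starts with an XML declaration.
static bool HasXMLDecl(const std::string& data) {
  return data.compare(0, 5, "<?xml") == 0;
}

// Crude, case-insensitive scan for a few common HTML tags.
static bool HasHTMLTags(const std::string& data) {
  std::string lower(data);
  std::transform(lower.begin(), lower.end(), lower.begin(), [](unsigned char c) {
    return static_cast<char>(std::tolower(c));
  });
  static const char* const kTags[] = {"<html", "<head", "<body", "<title"};
  for (const char* tag : kTags) {
    if (lower.find(tag) != std::string::npos) return true;
  }
  return false;
}

std::string DetectType(const std::string& name, const std::string& data) {
  const std::string byExtension = TypeFromExtension(name);
  // 1) XML declaration found: prefer the extension's type, else text/xml.
  if (HasXMLDecl(data)) return byExtension.empty() ? "text/xml" : byExtension;
  // 2) No XML declaration: scan for HTML tags.
  if (HasHTMLTags(data)) return "text/html";
  // 3) Otherwise fall back to the extension (final default is an assumption).
  return byExtension.empty() ? "application/octet-stream" : byExtension;
}
```

With this ordering, a file named foo.xslt that starts with an XML declaration comes out as text/xml even though the extension is unknown, which is the case the reporter originally hit.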
Attached patch Patch v 1.0 (obsolete)
I finally did some cleanup of this stuff too....
Attachment #111260 - Flags: superreview?(darin)
Attachment #111260 - Flags: review?(bbaetz)
Summary: unknown decoder should detect XML files → [FIX]unknown decoder should detect XML files
Comment on attachment 111260
Patch v 1.0

r=bbaetz with the couple of changes we spoke about on irc (assert for both a
type _and_ a func, more comments, and changing the isLocalFile test to test the
result in that block, and return at the end, for consistency/readability/etc.),
plus a few others I've probably forgotten :)
Attachment #111260 - Flags: review?(bbaetz) → review+
Comment on attachment 111260
Patch v 1.0

>Index: netwerk/streamconv/converters/nsUnknownDecoder.cpp

>+void nsUnknownDecoder::DetermineContentType(nsIRequest* aRequest)
...
>+    if (mBufferLen >= sSnifferEntries[i].mByteLen &&  // enough data
>+        strncmp(mBuffer, sSnifferEntries[i].mBytes, sSnifferEntries[i].mByteLen) == 0) {  // and type matches

seems like memcmp would be slightly better here since you know that
the length of mBuffer is >= the length of the sniffer entry.
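
A minimal standalone sketch of the two comparisons, for illustration only. `MatchesWithMemcmp` and `MatchesWithStrncmp` are hypothetical helpers, not functions from the patch. The review comment above only cites the known length; a separate, general property of the C library is that strncmp stops at the first NUL byte while memcmp compares all the bytes it is asked to, which matters if a signature contains NUL bytes.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: compares exactly sigLen bytes, NUL bytes included.
bool MatchesWithMemcmp(const char* buf, std::size_t bufLen,
                       const char* sig, std::size_t sigLen) {
  return bufLen >= sigLen && std::memcmp(buf, sig, sigLen) == 0;
}

// Hypothetical helper: same check with strncmp, which stops comparing
// as soon as it has compared a NUL byte in both strings.
bool MatchesWithStrncmp(const char* buf, std::size_t bufLen,
                        const char* sig, std::size_t sigLen) {
  return bufLen >= sigLen && std::strncmp(buf, sig, sigLen) == 0;
}
```

For example, with the 4-byte signature {0, 0, 1, 0} and a buffer starting {0, 0, 2, 0}, the memcmp version reports no match, while the strncmp version reports a match because it stops at the leading NUL.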



>+      else if ((this->*(sSnifferEntries[i].mCheckType))(aRequest)) {
>+        return;
>+      }        

what is this doing?  why the return without setting mContentType?  ah,
because the "mCheckType" function actually sets mContentType.  ok, how
about a more descriptive name, like mContentTypeSniffer or something?
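
To make the pattern being discussed concrete, here is a self-contained sketch of a sniffer table whose entries carry either a fixed type or a pointer to a member function that sets the type itself. Only `mBytes`, `mByteLen`, and the suggested name `mContentTypeSniffer` come from this thread; the class `Sniffer`, the `mMimeType` field, `SniffForXML`, the table contents, and the fallback type are assumptions made for the example. The assertion encodes one possible invariant (exactly one of type and function is set); the assertion in the actual patch is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

class Sniffer {
 public:
  // A sniffer returns true if it recognized the data AND set mContentType.
  typedef bool (Sniffer::*TypeSniffFunc)();

  struct Entry {
    const char* mBytes;                 // leading bytes to look for
    std::size_t mByteLen;               // number of bytes in mBytes
    const char* mMimeType;              // fixed type, or nullptr (assumed field)
    TypeSniffFunc mContentTypeSniffer;  // sniffer function, or nullptr
  };

  explicit Sniffer(const std::string& data) : mBuffer(data) {}

  void DetermineContentType();
  const std::string& ContentType() const { return mContentType; }

 private:
  // Hypothetical sniffer: refuses data containing NUL bytes.
  bool SniffForXML() {
    if (mBuffer.find('\0') != std::string::npos) return false;
    mContentType = "text/xml";
    return true;
  }

  static const Entry sEntries[];
  static const std::size_t sEntryCount;
  std::string mBuffer;
  std::string mContentType;
};

const Sniffer::Entry Sniffer::sEntries[] = {
    {"%PDF-", 5, "application/pdf", nullptr},
    {"<?xml", 5, nullptr, &Sniffer::SniffForXML},
};
const std::size_t Sniffer::sEntryCount =
    sizeof(Sniffer::sEntries) / sizeof(Sniffer::sEntries[0]);

void Sniffer::DetermineContentType() {
  for (std::size_t i = 0; i < sEntryCount; ++i) {
    const Entry& e = sEntries[i];
    // Exactly one of the fixed type and the sniffer function must be set.
    assert((e.mMimeType != nullptr) != (e.mContentTypeSniffer != nullptr));
    if (mBuffer.size() < e.mByteLen ||
        std::memcmp(mBuffer.data(), e.mBytes, e.mByteLen) != 0) {
      continue;  // not enough data, or leading bytes differ
    }
    if (e.mMimeType) {
      mContentType = e.mMimeType;
      return;
    }
    if ((this->*(e.mContentTypeSniffer))()) {
      return;  // the sniffer already set mContentType
    }
  }
  mContentType = "application/octet-stream";  // placeholder fallback
}
```

The early return after the member-function call is what prompted the question above: it is only correct because the called function sets the content type as a side effect, which a name like mContentTypeSniffer makes easier to guess.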


other than these two nits, this patch looks good to me.  sr=darin
Attachment #111260 - Flags: superreview?(darin) → superreview+
Attachment #111260 - Attachment is obsolete: true
fixed.
Status: NEW → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED