Closed
Bug 110767
Opened 23 years ago
Closed 23 years ago
Incorrect interpretation of "#!" in nsUnknownDecoder::DetermineContentType
Categories
(Core :: Networking, defect, P3)
Tracking
()
RESOLVED
FIXED
mozilla0.9.7
People
(Reporter: tenthumbs, Assigned: bzbarsky)
Details
Attachments
(1 file)
1.30 KB,
patch
|
bbaetz
:
review+
rpotts
:
superreview+
|
Details | Diff | Splinter Review |
In nsUnknownDecoder::DetermineContentType, the code says: // If the buffer begins with "#!" or "%!" then it is a script of some // sort... but that's incorrect. What "#!" really means is that the text immediately following is the name of some interpreter (big emphasis on interpreter). The file content can be anything meaningful to the interpreter, including binary data. All unix kernels read a small amount of an executable file. If the first two characters are "#!" the the kernel parses up to the first newline or the end of the buffer (whichever comes first). If it finds something that looks like an interpreter, it rewrites the argv array and calls the interpreter it found. Mozilla needs to look past this. If it finds a "#!", it must check beyond the first newline (or some fixed amount) before it decides an object is text.
Assignee | ||
Comment 2•23 years ago
|
||
So what you're saying is that we should dump the check for "#!" completely and just look for null chars like we do later in the code?
Comment 3•23 years ago
|
||
The code in question was simply moved from the original 'unknown content sniffer' in the 4x codebase :-) Personally, i think you are correct and we should just remove this check all together... if it turns out to be text/plain then great. otherwise, we should treat it as application/octet-stream and save it ... -- rick
Assignee | ||
Comment 4•23 years ago
|
||
Actually, looking at the comment before that check I see what the issue is. If I have a per script that emits html, we want it to be text/plain, not text/html.... This patch should fix things to be happy
Assignee | ||
Comment 5•23 years ago
|
||
bbaetz, would you review?
Status: NEW → ASSIGNED
Priority: -- → P3
Target Milestone: --- → mozilla0.9.7
Comment 6•23 years ago
|
||
Why aren't we removing the #! check altogether? Also, how many btyes could we be scanning?
Assignee | ||
Comment 7•23 years ago
|
||
------------------------------------------ #! /usr/bin/perl print "Content-type: text/html\n\n"; print "<html><body>This is some HTML</body></html>" ------------------------------------------ We want to detect this snippet as text/plain, not text/html We would be scanning at most 1024 bytes (that's the buffer size). Note that if I were to remove this check completely we would first search for various html tags (not to quick) and then _still_ scan the 1024 bytes if no html tags are found. So perf-wise removal and my change are probably equivalent.
I don't see anything wrong with a special test in "#!" handling as long as mBufferLen is always big enough. You need 128 bytes for Linux. However, it does seem inefficient. I also think you want to check for more control characters than just nulls.
Make that you need to check more than 128 bytes on Linux if the first 128 are all text.
Comment 10•23 years ago
|
||
Comment on attachment 59329 [details] [diff] [review] Patch to fix Ah, ok. r=bbaetz
Attachment #59329 -
Flags: review+
Assignee | ||
Comment 11•23 years ago
|
||
> I also think you want to check for more control characters than just
> nulls.
Not really... those control characters could just be normal text characters in
plain text that's not in the ASCII encoding. Even checking for nulls is a hack
-- two-byte encodings will test as binary.
We want to flag as text as many text documents as we can, in my opinion. If
this causes a few binary documents to be treated as text, this is an acceptable
alternative... the user can still save them after they've displayed.
Reporter | ||
Comment 12•23 years ago
|
||
> We want to flag as text as many text documents as we can, in my
> opinion.
An awful lot of users freak if garbage appears on their screens. Even
more do not realize they can save a file that way. Besides, has mozilla
in the past, or does it now, or may it in the future, do newline
conversions on text files. If so, your binary data is screwed. I think
you're wrong about text files.
Comment 13•23 years ago
|
||
Comment on attachment 59329 [details] [diff] [review] Patch to fix sr=rpotts@netscape.com
Attachment #59329 -
Flags: superreview+
Assignee | ||
Comment 14•23 years ago
|
||
tenthumbs, if we fix text file stuff we should fix it not only here but also later in the mime decoder... Please file a separate bug on that?
Assignee | ||
Comment 15•23 years ago
|
||
checked in
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•