Closed Bug 39042 Opened 24 years ago Closed 8 years ago

Display TXT using mozTXTToHTMLConv (linkify links, uris, urls in text/plain documents)

Categories

(Core :: Networking, defect)

defect
Not set
normal

RESOLVED WONTFIX

People

(Reporter: BenB, Unassigned)

References

Details

(Keywords: helpwanted)

We have a* TXT->HTML stream converter in Necko. Please use it in the browser.

Jud <valeski@netscape.com> says you 'have to ask for a conversion for it to
happen'. You can find the interface at
mozilla/netwerk/streamconv/public/nsStreamConverter.idl. Maybe see
mozilla/netwerk/streamconv/test or bryner's finger protocol handler, if you need
examples. It shouldn't be difficult, if you know layout (I don't :( ).

*Currently, there are even 2 unfortunately. nsTXTToHTMLConv is registered as
TXTtoHTML converter and will be used. It doesn't escape "<>&" (interestingly, it
works) and only has basic URL recognition facilities. mozTXTToHTMLConv is
technically not a real stream converter currently, but it will be soon. It has
sophisticated conversion routines used in Mailnews.
Basically: Don't worry, it will all be fine and cool.
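
To illustrate the two jobs mentioned above (escaping "<>&" and basic URL recognition), here is a minimal sketch in Python. It is not the code of either converter; the function name txt_to_html and the URL pattern are assumptions made up for this example, and real recognizers handle many more cases.

```python
import html
import re

# Hypothetical, deliberately simple URL pattern (an assumption for this sketch).
URL_RE = re.compile(r"(?:https?|ftp)://[^\s<>\"]+")


def txt_to_html(text: str) -> str:
    """Escape plain text for HTML and wrap recognized URLs in anchors."""
    out = []
    pos = 0
    for match in URL_RE.finditer(text):
        # Escape the plain text between the previous URL and this one.
        out.append(html.escape(text[pos:match.start()], quote=False))
        # Do not treat trailing sentence punctuation as part of the URL.
        url = match.group(0).rstrip(".,;:!?)")
        out.append(
            '<a href="{}">{}</a>'.format(
                html.escape(url, quote=True), html.escape(url, quote=False)
            )
        )
        pos = match.start() + len(url)
    # Escape whatever follows the last URL.
    out.append(html.escape(text[pos:], quote=False))
    return "".join(out)
```

The output is meant to be placed inside a <pre> element, so whitespace and line breaks are left untouched.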
Summary: Display TXT using stream comverters → Display TXT using stream converters
this is similar to bug 33334 (wontfix)
*** Bug 39580 has been marked as a duplicate of this bug. ***
Note that this requires the prefs controlling the stream converters -- currently 
in the Mail & Newsgroups category -- to be moved into a more general prefs 
category (such as the `Display' category in Tardis).
mpt, yes, that's bug 23582.
The conversion of the .txt stream to an HTML stream will probably need to happen 
in the parser.  Re-assigning to harish.  I would say that this is a pretty low 
priority RFE.  Harish, please schedule it with that in mind.
Assignee: clayton → harishd
I don't think this should be low priority, because
- the parser shouldn't hardcode TXT->HTML conversion
- linkifying URLs in TXT is definitely desirable
- it is easy to implement (you just need to call the converter - the
  functionality is there)
Note, that *all* txt display is broken currently. Unlimited (well, many) donuts
to the guy who fixes it by fixing this (39042) bug.
*** Bug 40995 has been marked as a duplicate of this bug. ***
Wait, stop, don't go forward until someone explains the point of this request to 
me. Typically text documents get rendered by gecko as *text*, and are explicitly 
not converted to html. This would be inappropriate for the parsing engine to 
unilaterally decide. If you want a menu (somewhere) that says "load url with 
given mimetype" then you may get what you want. 
Rick, this is so that when I open (or navigate to) a text/plain file in 
Navigator, links are highlighted, *structs* are stylified, etc, just as they 
are in Messenger. So (for example) I don't have to copy URLs from plain-text
e-zines and paste them into the location bar.
> someone explains the point of this request to me.

1. use existing recognition functions for displaying text/plain documents in the
browser.
2. remove code redundancy (see below).

> Typically text documents get rendered by gecko as *text*, and are explicitly
> not converted to html.

Oh, I thought NGLayout was an HTML/XML rendering engine and wouldn't know about
text/plain.

I thought the parser converts text/plain to HTML anyhow (by adding <pre> and
escaping "<>&"). It might have been better to move this task to a stream
converter, then. Not sure about dependencies, however.

Please explain how things work (or refer to a doc) and how they are (/will be) used.

> This would be inappropriate for the parsing engine to unilaterally decide.

I don't understand that. Converting to HTML could be only one (small) step to
rendering. It should not decide about the recognition, however.
If 2. turns out to be invalid, please explain how 1. could be achieved.
> Please explain how things work (or refer to a doc) and how they are (/will be)
> used.

Well, never mind.

Can you at least tell me what path text/plain documents take when I load them
in the browser, and where to hook up the converter, so we can move on with this
bug, please?
gecko cannot do what you're asking, unless directed to do so under some command. 
By default, we DO display text/plain documents (and text/rtf), so we can't 
automagically do text as HTML. Marking WONTFIX.
Status: NEW → RESOLVED
Closed: 24 years ago
Resolution: --- → WONTFIX
For the browser, we don't have to use Gecko's feature to display text/plain, but
can convert it to text/html before it reaches Gecko. And, judging from comments
in this and other bugs, it is desirable as well. REOPENing.

*What* should do it then? Where are the other conversions done? Moving to
COMPONENT Browser-General for triaging :-(.
Status: RESOLVED → REOPENED
Component: Layout → Browser-General
Resolution: WONTFIX → ---
Assignee: harishd → asa
Status: REOPENED → NEW
QA Contact: petersen → doronr
You're right -- it can be done before it hits gecko. To do so, you would need to 
run it through a converter and change the mimetype (probably to text/html).
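
A minimal sketch of that idea in Python, assuming a hypothetical hook convert_before_layout that sees the declared content type and the already-decoded body before the document is handed to the rendering engine. The name and the signature are invented for illustration and do not correspond to any Necko interface.

```python
import html
from typing import Tuple


def convert_before_layout(content_type: str, body: str) -> Tuple[str, str]:
    """Return (content_type, body), turning text/plain into text/html."""
    # Compare only the media type, ignoring parameters such as charset.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type != "text/plain":
        # Anything else passes through untouched.
        return content_type, body
    # The body is already decoded text here, so the charset parameter is dropped.
    escaped = html.escape(body, quote=False)
    return "text/html", "<pre>" + escaped + "</pre>"
```

In this sketch the converter only escapes and wraps the text; link recognition would run at the same point.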
Ben, want to help me figure out who should get this bug?  Browser General can be 
a bit of a black hole.  If no one wants to take the bug we can mark it 
helpwanted and assign it to nobody@mozilla.org until someone picks it up.
> Ben, want to help me figure out who should get this bug.

That's exactly my problem: I have no clue where this should be hooked up. My
guess was Necko, but valeski sent me to the parser. If I knew where exactly to
hook it up, I might even have fixed it by myself already.

REASSIGNing to Network, which is my best bet.
Assignee: asa → gagan
Component: Browser-General → Networking
QA Contact: doronr → tever
As mentioned above by nisheeth, the parser is the best place for this to happen. 
Harish, your call on setting the target. 
Assignee: gagan → harishd
harishd,
please tell me where/how to hook it up. If it is not too hard, I might fix it
myself.
As rickg mentioned, it's simply not possible for the parser to make this 
decision. ".txt" documents will get rendered as plain text and the parser cannot 
make a random decision of rendering them as html ( unless instructed ). This 
ought to be done before entering the parser land.

Hate to bounce bugs... but I've got to in this case.

Back to you gagan.

Assignee: harishd → gagan
Gagan, my understanding is that the text to html stream converter in necko 
should convert a "text/plain" stream into a "text/html" stream and pass that to 
the parser.  Please ignore my earlier comment.
Reassigning to myself.

Gagan, how exactly do I do that, and especially where?
Assignee: gagan → mozilla
Status: NEW → ASSIGNED
Target Milestone: --- → M18
Oops, accidentally removed gagan.

gagan, I have no way to fix this without hints. I want to (try to) fix this for
M18.
gagan: ping?
This will most likely not make it for M18 :-( , because nobody has yet replied
to me and told me where to hook it up, despite all my attempts.
Target Milestone: M18 → M19
*** Bug 33334 has been marked as a duplicate of this bug. ***
Target Milestone: M19 → mozilla0.9.1
Changing personal priorities. Giving away most of my bugs :-( (reassigning to
default owner).

I will still track these bugs closely. If you need my input, feel free to ask me.

New owner: Please do *not* close these bugs (as WONTFIX or whatever you may
find) unless they are fixed. Rather, reassign to <nobody@mozilla.org>, if you
don't want to work on them.
Assignee: mozilla → mscott
Status: ASSIGNED → NEW
moving to future milestone.
Target Milestone: mozilla0.9.1 → Future
mass move, v2.
qa to me.
QA Contact: tever → benc
*** Bug 122714 has been marked as a duplicate of this bug. ***
Blocks: 10080
No longer blocks: 10080
Blocks: 10080
This shouldn't be done at the networking layer, since we don't want text/plain
files to have a complex DOM. (Some sites depend on our current DOM, even. Go
figure.) IMHO we should either not have a DOM or have an extremely simple one
(e.g. a Document with no documentElement and a single text node child), but
that's another story.

In fact, this is similar to text/xml pretty printing. Is there a way to call the
text stream convertor from an xbl binding? cc'ing sicking who may be interested
in doing this. It would be a lot easier than the pretty printing case. :-)
Assignee: mscott → other
No longer blocks: 10080
Component: Networking → Layout
Priority: P3 → --
QA Contact: benc → ian
Summary: Display TXT using stream converters → Display TXT using stream converters (linkify links, uris, urls in text/plain documents)
Target Milestone: Future → ---
> This shouldn't be done at the networking layer

I'm not sure what you mean by that. The text converter / stream converter is
on the networking layer.

I hope you're not suggesting to re-implement the link recognizing functions etc.
somewhere else. We should definitely have that functionality only once in the
tree (lots of tweaking, debates, complex, you name it). That's what this bug is
about - to let the browser use the same routines that Mailnews uses. In fact, I
wrote those routines with that usage in mind.
> The text converter / stream converter is on the networking layer.

Well, assuming it's well designed, it's just a service that accepts a text
stream as input and returns either a Document, DocumentFragment, or
differently-encoded text stream back, right?

So it shouldn't matter what layer actually calls into it.

The point is that we do _not_ want to lie to Gecko about the MIME type, etc, of
text/plain documents, as Bad Things happen when you do that. As I said in my
last comment, this should IMHO be handled just like the XML pretty printing.


> I hope you're not suggesting to re-implement the link recognizing functions 

No, not at all. I totally agree that code duplication is a very bad thing.
> it's just a service that accepts a text [...] and returns [...] a 
> differently-encoded text stream back, right?

It takes unicode plaintext and returns HTML (using <pre>, IIRC). The
recognitions can be disabled individually (link, *struct stuff*, smilies).

> what layer actually calls into it. The point is that we do _not_ want to lie
> to Gecko about the MIME type, etc

ah, OK, you're just talking about how to call the code.
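
To make "the recognitions can be disabled individually" concrete, here is a minimal sketch in Python using a bit-flag argument. The flag names (LINKS, STRUCT, SMILIES), the patterns, and the emitted markup are all assumptions for this example; they are not the constants or output of mozTXTToHTMLConv.

```python
import html
import re
from enum import IntFlag


class Recognize(IntFlag):
    """Hypothetical switches, one bit per recognition."""
    NONE = 0
    LINKS = 1
    STRUCT = 2
    SMILIES = 4
    ALL = 7


# One combined pattern; each alternative is a candidate recognition.
TOKEN_RE = re.compile(
    r"(?P<link>https?://[^\s<>\"]+)"
    r"|\*(?P<struct>\w+)\*"
    r"|(?P<smiley>:-\))"
)


def convert(text: str, flags: Recognize) -> str:
    """Escape text and apply only the recognitions enabled in flags."""
    out = []
    pos = 0
    for m in TOKEN_RE.finditer(text):
        out.append(html.escape(text[pos:m.start()], quote=False))
        raw = m.group(0)
        shown = html.escape(raw, quote=False)
        if m.group("link") is not None and flags & Recognize.LINKS:
            out.append('<a href="%s">%s</a>' % (html.escape(raw), shown))
        elif m.group("struct") is not None and flags & Recognize.STRUCT:
            # Keep the asterisks so the original text stays visible.
            out.append("<b>" + shown + "</b>")
        elif m.group("smiley") is not None and flags & Recognize.SMILIES:
            out.append('<span class="smiley">' + shown + "</span>")
        else:
            # Recognition disabled: emit the token as escaped plain text.
            out.append(shown)
        pos = m.end()
    out.append(html.escape(text[pos:], quote=False))
    return "".join(out)
```

For example, convert(text, Recognize.LINKS) linkifies URLs but leaves *struct* markers and smilies as plain text, while Recognize.NONE only escapes.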
> It takes unicode plaintext and returns HTML (using <pre>, IIRC). The
> recognitions can be disabled individually (link, *struct stuff*, smilies).

If it can be set to return a string containing just the <pre> block and the
contents, that would be ideal, because then the XBL would basically consist of a
single call into this service followed by setting output.innerHTML to the text.
(An ideal XBL binding would also have mutation event handlers watching the text
node and if it changed, it would re-set the text.)

Is the stream convertor scriptable?

How would this scale for large (500MB+) text files? Or do we not really want to
worry about that? If there is a practical limit to this, we could just not
pretty print for large files. That's probably wise anyway, since for large files
the user is almost certainly not particularly worried about pretty printing but
more concerned with searching.
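
The "just not pretty print for large files" fallback could be sketched like this in Python. The cutoff value and the names MAX_PRETTY_PRINT_BYTES and should_pretty_print are assumptions for illustration; no specific limit is proposed in this bug.

```python
from typing import Optional

# Hypothetical cutoff, chosen arbitrarily for this sketch.
MAX_PRETTY_PRINT_BYTES = 10 * 1024 * 1024


def should_pretty_print(
    size_bytes: Optional[int], limit: int = MAX_PRETTY_PRINT_BYTES
) -> bool:
    """Decide whether to run the converter or fall back to plain display."""
    # Unknown or invalid size: be conservative and skip the conversion.
    if size_bytes is None or size_bytes < 0:
        return False
    return size_bytes <= limit
```

A caller would check this before invoking the converter and display the raw text unchanged when it returns False.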
> How would this scale for large (500MB+) text files?

eh? It's a browser. On a broadband connection, that typically takes 2-3 hours to
download. For local files, the user can use another, more suited app. In other
words, I wouldn't worry about that.

> Is the stream convertor scriptable?

Should be, although I never tried it.

<http://lxr.mozilla.org/seamonkey/source/netwerk/streamconv/public/mozITXTToHTMLConv.idl>

> If it can be set to return a string containing just the <pre> block and the
> contents, that would be ideal

I don't remember, but looking at the code, it seems it doesn't insert any <html>
or <pre> at all, it just assumes that it will be wrapped in a <pre>, including
the linebreak. This is untested, however - Mailnews just feeds it one line at a
time atm (which is *not* good, because links then cannot cross linebreaks, see
bug 5351).
> In other words, I wouldn't worry about [large files].

I frequently use Mozilla with large files (locally, obviously). Sometimes, I
open large files (e.g. multi-gigabyte video files) in Mozilla by mistake, and
occasionally, they get handled as text files. I wouldn't want Mozilla to die on
me because I made that mistake.

I'm not saying we should worry about making it work perfectly, but we should
gracefully handle that situation, or at least not make it worse.


> [scriptable]

Excellent.


> I don't remember, but looking at the code, it seems it doesn't insert any 
> <html> or <pre> at all, it just assumes that it will be wrapped in a <pre>, 
> including the linebreak. 

Ok, that should be fine then.

sicking: This should be really easy to do. First, make plain text files create
an XML DOM instead of a pseudo-HTML DOM. This DOM would have exactly one
element, in a Mozilla-specific namespace. Then, using either dynamic binding
addition, or, better IMHO, a rule in ua.css, bind that to a binding that has the
following anonymous content:

   <content>
     <hidden xmlns=""><children/></hidden>
     <pre xmlns="...xhtml..."/>
   </content>

...then create a method which, using the aforementioned stream convertor and the
pre element's innerHTML property, sets the contents of the pre element to the
pretty printed text. This method would be called from the constructor and from a
mutation event handler on the root element.

The binding stylesheet should then hide the <hidden> element and make the root
element block-level.

Or something. You are probably much more familiar with how to do this. Let me
know if you have any questions. If you don't want to do it, let me know as well,
and I'll find some other unsuspecting victim. ;-)
Assignee: other → cbiesinger
Component: Layout → Networking
Summary: Display TXT using stream converters (linkify links, uris, urls in text/plain documents) → Display TXT using mozTXTToHTMLConv (linkify links, uris, urls in text/plain documents)
Depends on: 20212
Assignee: cbiesinger → darin
Keywords: helpwanted
QA Contact: ian → benc
*** Bug 307336 has been marked as a duplicate of this bug. ***
Workaround: Use the "autolink" greasemonkey script from
http://www.squarefree.com/2005/05/22/autolink/.
Assignee: darin → nobody
QA Contact: benc → networking
This isn't on anyone's work list and realistically is an abandoned idea. I will close as wontfix - if someone has a patch or is actively going to work on it please reopen. (but please, only then.)
Status: NEW → RESOLVED
Closed: 24 years ago → 8 years ago
Resolution: --- → WONTFIX