Open Bug 92250 Opened 23 years ago Updated 2 years ago

Form-Feed characters in text/plain documents are ignored

Categories

(Core :: Printing: Output, enhancement, P3)

x86
Windows NT
enhancement

Tracking

()

Future

People

(Reporter: gordm, Unassigned)

Details

Attachments

(1 file)

ASCII FF (^L/12/0C) chars are ignored when printing text documents.

FF should cause a page break

This is particularly relevant when printing RFCs (for example from rfc-editor.org).

Example URL: ftp://ftp.isi.edu/in-notes/rfc1945.txt
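
The three notations in the report (^L, decimal 12, hex 0C) all name the same character. A minimal Python sketch of that equivalence, added purely for illustration; the variable names are not from the bug:

```python
# Form feed (FF) written with Python's escape sequence.
FF = "\f"

# Decimal 12 and hexadecimal 0C are the same code point.
decimal_value = ord(FF)
same_code_point = decimal_value == 12 == 0x0C

# Caret notation: ^L is the letter "L" (0x4C) with bit 0x40 cleared,
# which gives 0x0C.
caret_l = chr(ord("L") ^ 0x40)

print(same_code_point, caret_l == FF)
```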
Status: UNCONFIRMED → NEW
Ever confirmed: true
Is this a request for enhancement? And how do we get this form feed from the 
HTML to the printer?  I have not heard of this and don't really know how to 
proceed with such a thing.
until this bug is updated with the exact problem.
Status: NEW → RESOLVED
Closed: 23 years ago
Resolution: --- → WONTFIX
verified.
Status: RESOLVED → VERIFIED
FormFeed is just another way of saying "End of Page",
meaning that when printing a TXT with such characters, the printer should stop
the page at that point and start a new one.

How do you get that to a printer? Well, I believe that ever since the start of
computer printing, all printers have understood the form feed command... dunno
how modern printing works, though.

I don't think this is an RFE, since it is a requirement when working with TXT
files, up there with dealing with NewLine and CarriageReturn.
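
To make the "End of Page" reading concrete, here is a minimal Python sketch that treats every FF as a page boundary. It only illustrates the idea; it is not how Gecko or any printer driver handles the character, and `split_pages` and `sample` are hypothetical names.

```python
from __future__ import annotations

FORM_FEED = "\f"


def split_pages(text: str) -> list[str]:
    """Split plain text into pages, ending a page at every form feed."""
    return text.split(FORM_FEED)


# Two pages separated by a single FF character.
sample = "Page one, line one\nPage one, line two\n\fPage two, line one\n"
pages = split_pages(sample)

for number, page in enumerate(pages, start=1):
    print(f"--- page {number} ---")
    print(page, end="")
```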
Like I said before: "how do you get the FF to the printer when the source is in 
HTML or XML?"  This request, to me, is basically a request to have a tag or 
something that can get the FF to the printer.  So if it were just a matter of 
the printers understanding the FF, it would not be our bug, would it?
rods:
Can you take a look at this one, please ? IMHO we should implement this
(assuming the mimetype is text/plain or similar...)...
ewwwwww, a tough one, this is non-trivial.
Severity: minor → enhancement
Status: VERIFIED → REOPENED
Priority: -- → P1
Resolution: WONTFIX → ---
Target Milestone: --- → Future
Status: REOPENED → ASSIGNED
taking
Assignee: dcone → rods
Status: ASSIGNED → NEW
Status: NEW → ASSIGNED
I don't know much about programming. That said, how hard can this be? I mean, 
it's just a text file. Almost no extra formatting needs to be done on this page. 
Would modern printers print this out correctly if you just sent the text down 
the pipe? I mean, we already do page breaks for webpages, so instead of 
calculating the page break (like for all webpages longer than one page) just 
use the breaks where the TXT file has them.

I apologize if I just don't get it, but I remember playing around with QBasic and 
printers back in the day and putting page breaks in there was not hard at all.
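
The suggestion above, combining computed page breaks with the breaks already present in the file, can be sketched as follows. This is a simplified illustration in Python, not Gecko's pagination: it counts lines instead of measuring layout, and `paginate` and the default of 60 lines per page are assumptions made for the example.

```python
from __future__ import annotations

from typing import Iterator

FORM_FEED = "\f"


def paginate(text: str, lines_per_page: int = 60) -> Iterator[list[str]]:
    """Yield pages as lists of lines.

    A page ends either when it is full (a computed break) or where the
    source contains a form feed (a break supplied by the file itself).
    """
    for chunk in text.split(FORM_FEED):
        # Split on "\n" only: str.splitlines() would also break on "\f".
        lines = chunk.split("\n")
        if lines and lines[-1] == "":
            lines.pop()  # drop the empty entry left by a trailing newline
        if not lines:
            yield []  # an empty chunk becomes a blank page
            continue
        for start in range(0, len(lines), lines_per_page):
            yield lines[start:start + lines_per_page]


for page in paginate("a\nb\nc\n\fd\n", lines_per_page=2):
    print(page)
```

With two lines per page, the first three lines overflow onto a second page, and the FF then forces the last line onto a page of its own.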
Priority: P1 → P3
Summary: Form-Feed characters in text documents are ignored → Form-Feed characters in text/plain documents are ignored
*** Bug 174521 has been marked as a duplicate of this bug. ***
Assignee: rods → nobody
Status: ASSIGNED → NEW
QA Contact: sujay → printing
I don't think form feeds in HTML are an issue, since white space in HTML is not significant.  The original bug report here deals with the formatting of form feeds in plain text documents.
Confirming problem still exists in Mozilla/5.0 (X11; Linux x86_64; rv:10.0) Gecko/20120129 Firefox/10.0 SeaMonkey/2.7
(In reply to Julian Reschke from comment #13)
> The relevant code may be close to
> http://mxr.mozilla.org/mozilla-central/source/netwerk/streamconv/converters/
> nsTXTToHTMLConv.cpp

Some more investigation suggests that the actual code is in the HTML 5 parser, so the code to change is probably in here somewhere:
<http://dxr.lanedo.com/mozilla-central/parser/html/nsHtml5StreamParser.cpp.html>
(In reply to Julian Reschke from comment #13)
> The relevant code may be close to
> http://mxr.mozilla.org/mozilla-central/source/netwerk/streamconv/converters/
> nsTXTToHTMLConv.cpp

That's for converting text/plain e-mail messages into text/html with quotations and stuff.

The code relevant for handling text/plain on the Web is near
http://mxr.mozilla.org/mozilla-central/source/content/html/document/src/nsHTMLDocument.cpp#555
and
http://mxr.mozilla.org/mozilla-central/source/parser/html/nsHtml5StreamParser.cpp#904

However, changing that code to enhance RFC printing would be in violation of the HTML specification. I think that text/plain load path in Gecko should not be patched to enhance RFC printing unless the HTML specification changes accordingly.

The appropriate fix would be introducing a pseudo-element in CSS that'd allow attaching CSS page-break-* properties to form feeds. But even that might be an overkill considering how narrow the use case is. The more appropriate fix would be to get the IETF to publish RFCs in HTML without pre-pagination.
In other words, I think this is WONTFIX as far as the HTML parser goes and if I was a layout module owner I would WONTFIX the layout aspect as well on the grounds of a very narrow use case that should be addressed in IETF policy.
(In reply to Henri Sivonen (:hsivonen) from comment #15)
> However, changing that code to enhance RFC printing would be in violation of
> the HTML specification. I think that text/plain load path in Gecko should
> not be patched to enhance RFC printing unless the HTML specification changes
> accordingly.

I believe the HTML spec over-specifies things here. UAs should have the freedom to augment the presentation of text/plain. (Yes, I'll open a bug).

> The appropriate fix would be introducing a pseudo-element in CSS that'd
> allow attaching CSS page-break-* properties to form feeds. But even that

That's an interesting idea.

> might be an overkill considering how narrow the use case is. The more
> appropriate fix would be to get the IETF to publish RFCs in HTML without
> pre-pagination.

That's a separate issue; and the IETF is actively discussing it.

Even if these documents go away there may still be reasons for allowing more advanced handling of text/plain.
(In reply to Julian Reschke from comment #17)
> I believe the HTML spec over-specifies things here. UAs should have the
> freedom to augment the presentation of text/plain. (Yes, I'll open a bug).

-> <https://www.w3.org/Bugs/Public/show_bug.cgi?id=17304>
This proof-of-concept patch changes the HTML5 tokenizer to generate a <span> element with page break CSS hints for each FF in text/plain.

See <https://bugzilla.mozilla.org/show_bug.cgi?id=92250#c15> however for concerns that this violates the current HTML5 spec, and also <https://www.w3.org/Bugs/Public/show_bug.cgi?id=17304> for the related bug in W3C HTML WG space.
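
The idea behind the proof-of-concept can be sketched outside the parser. The Python below is only an illustration of the transformation described, not the attached patch: the function name, the `<pre>` wrapper, and the exact `style` value are assumptions, since the patch's real output is not shown in this bug.

```python
import html

FORM_FEED = "\f"

# Hypothetical markup for the page break hint.
PAGE_BREAK = '<span style="page-break-after: always"></span>'


def plain_text_to_html(text: str) -> str:
    """Escape plain text for HTML and replace each FF with a break hint."""
    parts = (html.escape(part) for part in text.split(FORM_FEED))
    return "<pre>" + PAGE_BREAK.join(parts) + "</pre>"


print(plain_text_to_html("End of page 1\n\fStart of page 2\n"))
```

CSS 2.1 defines the page-break-* properties for block-level elements, so a real implementation might need more than an inline `<span>` for the hint to take effect.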
On Monday 04 June 2012, Henri Sivonen wrote:
> The more appropriate fix would be to get the IETF to publish RFCs 
> in HTML without pre-pagination.

This problem is not unique to IETF RFCs.  The use of form feed characters to mark page breaks in plain text files goes back to the original ASCII of 1963, if not earlier, and so there must be a large body of FF-containing documents out there.  Control characters in plain text are becoming rarer, but AFAIK their intended meanings are still specified in modern character encodings.
Severity: normal → S3