92250 - Form-Feed characters in text/plain documents are ignored

Reporter

Description

•

24 years ago

ASCII FF (^L/12/0C) chars are ignored when printing text documents. FF should cause a page break. This is particularly relevant when printing RFCs (for example from rfc-editor.org). Example URL: ftp://ftp.isi.edu/in-notes/rfc1945.txt
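
For illustration only, here is a minimal Python sketch of the behavior being requested: each FF (U+000C) in a text/plain document ends the current page. The function name `split_pages` is hypothetical and is not part of any browser code; the sketch only shows the intended semantics.

```python
FORM_FEED = "\f"  # ASCII 12 / 0x0C, the character this bug is about


def split_pages(text: str) -> list[str]:
    """Split plain text into pages, treating each form feed as a page break.

    Hypothetical helper for illustration; the form feed itself is dropped.
    """
    return text.split(FORM_FEED)


if __name__ == "__main__":
    sample = "Page one\n\fPage two\n\fPage three\n"
    for number, page in enumerate(split_pages(sample), start=1):
        print(f"--- page {number} ---")
        print(page, end="")
```

A pre-paginated document such as an RFC would then print one list entry per sheet instead of letting the layout engine choose the breaks.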

Boris Zbarsky [:bzbarsky]

Updated

•

24 years ago

Status: UNCONFIRMED → NEW

Ever confirmed: true

dcone (gone)

Comment 1

•

24 years ago

Is this a request for enhancement? And how do we get this form feed from the HTML to the printer? I have not heard of this and don't really know how to proceed with such a thing.

dcone (gone)

Comment 2

•

24 years ago

Resolving as WONTFIX until this bug is updated with the exact problem.

Status: NEW → RESOLVED

Closed: 24 years ago

Resolution: --- → WONTFIX

sujay

Comment 3

•

24 years ago

verified.

Status: RESOLVED → VERIFIED

Ariel Gonzalez

Comment 4

•

24 years ago

FormFeed is just another way of saying "End of Page", meaning that when printing a TXT file with such characters, the printer should stop the page at that point and start a new one.

How do you get that to a printer? Well, I believe that ever since the start of computer printing, all printers should understand the form feed command... dunno how modern printing works, though.

I don't think this is an RFE, since it is a requirement when working with TXT files, up there with dealing with NewLine and CarriageReturn.

dcone (gone)

Comment 5

•

24 years ago

Like I said before: "how do you get the FF to the printer when the source is in HTML or XML?" This request, to me, is basically a request to have a tag or something that can get the FF to the printer. So if it were a matter of just the printers understanding the FF, it would not be our bug, would it?

Roland Mainz

Comment 6

•

24 years ago

rods: Can you take a look at this one, please? IMHO we should implement this (assuming the mimetype is text/plain or similar)...

rods (gone)

Comment 7

•

24 years ago

ewwwwww, a tough one, this is non-trivial.

Severity: minor → enhancement

Status: VERIFIED → REOPENED

Priority: -- → P1

Resolution: WONTFIX → ---

Target Milestone: --- → Future

rods (gone)

Updated

•

24 years ago

Status: REOPENED → ASSIGNED

rods (gone)

Comment 8

•

24 years ago

taking

Assignee: dcone → rods

Status: ASSIGNED → NEW

rods (gone)

Updated

•

24 years ago

Status: NEW → ASSIGNED

Ariel Gonzalez

Comment 9

•

24 years ago

I don't know much about programming. That said, how hard can this be? I mean, it's just a text file. Almost no extra formatting needs to be done on this page. Would modern printers print this out correctly if you just sent the text down the pipe? I mean, we already do page breaks for webpages, so instead of calculating the page break (like for all webpages longer than one page), just use the breaks where the TXT file has them.

I apologize if I just don't get it, but I remember playing around with QBasic and printers back in the day, and putting page breaks in there was not hard at all.

rods (gone)

Updated

•

24 years ago

Priority: P1 → P3

Hixie (not reading bugmail)

Updated

•

23 years ago

Summary: Form-Feed characters in text documents are ignored → Form-Feed characters in text/plain documents are ignored

Jonathan Buschmann

Comment 10

•

23 years ago

*** Bug 174521 has been marked as a duplicate of this bug. ***

Phil Ringnalda (:philor)

Updated

•

16 years ago

Assignee: rods → nobody

Status: ASSIGNED → NEW

QA Contact: sujay → printing

Tristan Miller

Comment 11

•

15 years ago

I don't think form feeds in HTML are an issue, since white space in HTML is not significant. The original bug report here deals with the formatting of form feeds in plain text documents.

Tristan Miller

Comment 12

•

14 years ago

Confirming problem still exists in Mozilla/5.0 (X11; Linux x86_64; rv:10.0) Gecko/20120129 Firefox/10.0 SeaMonkey/2.7

Julian Reschke

Comment 13

•

13 years ago

The relevant code may be close to http://mxr.mozilla.org/mozilla-central/source/netwerk/streamconv/converters/nsTXTToHTMLConv.cpp

Joshua Cranmer [:jcranmer]

Comment 14

•

13 years ago

(In reply to Julian Reschke from comment #13)
> The relevant code may be close to
> http://mxr.mozilla.org/mozilla-central/source/netwerk/streamconv/converters/nsTXTToHTMLConv.cpp

Some more investigation suggests that the actual code is in the HTML 5 parser, so the code to change is probably in here somewhere: <http://dxr.lanedo.com/mozilla-central/parser/html/nsHtml5StreamParser.cpp.html>

Henri Sivonen (:hsivonen)

Comment 15

•

13 years ago

(In reply to Julian Reschke from comment #13)
> The relevant code may be close to
> http://mxr.mozilla.org/mozilla-central/source/netwerk/streamconv/converters/nsTXTToHTMLConv.cpp

That's for converting text/plain e-mail messages into text/html with quotations and stuff. The code relevant for handling text/plain on the Web is near http://mxr.mozilla.org/mozilla-central/source/content/html/document/src/nsHTMLDocument.cpp#555 and http://mxr.mozilla.org/mozilla-central/source/parser/html/nsHtml5StreamParser.cpp#904

However, changing that code to enhance RFC printing would be in violation of the HTML specification. I think that text/plain load path in Gecko should not be patched to enhance RFC printing unless the HTML specification changes accordingly.

The appropriate fix would be introducing a pseudo-element in CSS that'd allow attaching CSS page-break-* properties to form feeds. But even that might be an overkill considering how narrow the use case is. The more appropriate fix would be to get the IETF to publish RFCs in HTML without pre-pagination.

Henri Sivonen (:hsivonen)

Comment 16

•

13 years ago

In other words, I think this is WONTFIX as far as the HTML parser goes and if I was a layout module owner I would WONTFIX the layout aspect as well on the grounds of a very narrow use case that should be addressed in IETF policy.

Julian Reschke

Comment 17

•

13 years ago

(In reply to Henri Sivonen (:hsivonen) from comment #15)
> However, changing that code to enhance RFC printing would be in violation of
> the HTML specification. I think that text/plain load path in Gecko should
> not be patched to enhance RFC printing unless the HTML specification changes
> accordingly.

I believe the HTML spec over-specifies things here. UAs should have the freedom to augment the presentation of text/plain. (Yes, I'll open a bug).

> The appropriate fix would be introducing a pseudo-element in CSS that'd
> allow attaching CSS page-break-* properties to form feeds. But even that

That's an interesting idea.

> might be an overkill considering how narrow the use case is. The more
> appropriate fix would be to get the IETF to publish RFCs in HTML without
> pre-pagination.

That's a separate issue; and the IETF is actively discussing it. Even if these documents go away there may still be reasons for allowing more advanced handling of text/plain.

Julian Reschke

Comment 18

•

13 years ago

(In reply to Julian Reschke from comment #17)
> I believe the HTML spec over-specifies things here. UAs should have the
> freedom to augment the presentation of text/plain. (Yes, I'll open a bug).

-> <https://www.w3.org/Bugs/Public/show_bug.cgi?id=17304>

Julian Reschke

Comment 19

•

13 years ago

Attached file: proof-of-concept, modifying the HTML5 tokenizer

This proof-of-concept patch changes the HTML5 tokenizer to generate a <span> element with page break CSS hints for each FF in text/plain. See <https://bugzilla.mozilla.org/show_bug.cgi?id=92250#c15> however for concerns that this violates the current HTML5 spec, and also <https://www.w3.org/Bugs/Public/show_bug.cgi?id=17304> for the related bug in W3C HTML WG space.
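
The attached patch itself is not reproduced in this thread. As a rough illustration of the idea only, here is a Python sketch (not the Gecko tokenizer code) that turns text/plain into markup and emits a `<span>` carrying a page-break CSS hint for each FF. The function name `plain_text_to_html`, the `<pre>` wrapper, and the exact `style` value are assumptions made for this example.

```python
import html

# Assumed markup for the page-break hint; the real patch may use a different
# element, attribute, or CSS property.
PAGE_BREAK_SPAN = '<span style="page-break-after: always"></span>'


def plain_text_to_html(text: str) -> str:
    """Convert plain text to HTML, replacing each form feed with a hint span.

    Hypothetical helper: everything except the form feeds is HTML-escaped
    so that the text content is otherwise unchanged.
    """
    pages = text.split("\f")
    escaped = [html.escape(page, quote=False) for page in pages]
    return "<pre>" + PAGE_BREAK_SPAN.join(escaped) + "</pre>"


if __name__ == "__main__":
    print(plain_text_to_html("Page 1 <end>\n\fPage 2\n"))
```

Whether a layout engine honours such a hint when printing is a separate question that this sketch does not address.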

Tristan Miller

Comment 20

•

13 years ago

On Monday 04 June 2012, Henri Sivonen wrote:
> The more appropriate fix would be to get the IETF to publish RFCs
> in HTML without pre-pagination.

This problem is not unique to IETF RFCs. The use of form feed characters to mark page breaks in plain text files goes back to the original ASCII of 1963, if not earlier, and so there must be a large body of FF-containing documents out there. Control characters in plain text are becoming rarer, but AFAIK their intended meanings are still specified in modern character encodings.

BMO Automation

Updated

•

3 years ago

Severity: normal → S3