Closed Bug 18427 Opened 20 years ago Closed 18 years ago

Non-presentational HTML mail

Categories

(SeaMonkey :: MailNews: Message Display, enhancement, P3)

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: CodeMachine, Assigned: BenB)

References

(Blocks 1 open bug)

Details

(Keywords: helpwanted)

Attachments

(1 file)

HTML mail is a great idea - block quoting, lists, hyperlinks, emphasis, etc are
all useful things which aren't as easy to do in ASCII mail.  But HTML mail that
allows presentational features just gives everyone a license to make their
message unreadable.

It would be great to have a mode where all HTML presentational markup and all
inline and external CSS were ignored, leaving just the structural markup, ma'am.  Of course,
you could still have a user stylesheet for HTML mail, but at least I wouldn't
have to put up with the ridiculous mail and news messages that other people
were responsible for.
Assignee: phil → nobody
Summary: Non-presentational HTML mail → [HELP WANTED] Non-presentational HTML mail
Whiteboard: [HELP WANTED]
Seems like we'd either need to write our own HTML parser, or teach Gecko's
parser about presentational vs. structural HTML. cc'ing rickg just for fun, and
putting this on the [help wanted] list.
Yeah, this should probably be an option in NGLayout too anyway.  You would want
to allow no stylesheet for xxML, and it would be nice to disable HTML
presentational markup (possibly the same option or in the stylesheet UI).
There is a conceptually simple way that this might be do-able, as a post-fix
(i.e., after opening a message and seeing a wonderful mess of colour and style),
without needing to overhaul the parser.

What about adding a menu item (and possibly a toolbar button) called
"Restore default formatting" that would do this and no more: apply html.css
at a higher cascade level than whatever the presentational formatting of
the html message is (just below the user stylesheets). Since CSS presentational
properties always trump HTML presentational formatting (level zero on the
cascade), this would undo the worst nonsense formatting.

This would not be a complete fix for bad presentational markup and stylesheets,
but it would at least restore all of the text content to readable sizes and
colours (although at the same time it may destroy unconventional (or just
differently-conventional than base html) visual formatting that someone
carefully crafted to present some sort of structure that made sense to them
at the time).

It would also be easy to do, once there is any support for manually-applied
user stylesheets - it would be essentially the same thing, applied just one
level lower on the cascade.

Remember, *conventional* visual formatting is all that lets any of the
sighted recognize structure, e.g., something like an <UL> list as a list,
without reading the source markup, so in that sense, for the sighted,
presentational formatting is a *must* - without html.css or some replacement
mechanism, straight HTML 4 code with no stylesheets would be about as readable
as XML with no stylesheets. I hear that auditory browsers use conventions, too,
to present structure without forcing the blind to listen to the raw HTML.
There really is no such thing as structural-only markup unless you want to see,
for example, blockquotes represented on the screen as <BLOCKQUOTE>quote goes
here</BLOCKQUOTE> - and even that's a convention (one built into HTML).
Phil,
I don't think we have to change the parser. As I see it, it is very flexible in
this area. The presentation of HTML tags is defined in html.css via ordinary
CSS. So, using another stylesheet without visible formatting for presentational
tags is all we would have to do. An easier way (from the coding perspective) is
just to apply a stylesheet to all displayed msgs, which overrides (nulls) the
html.css formatting for these tags. Such a stylesheet is already planned,
because it gives the user nearly full control over how to display msgs (plain
text or HTML). One thing I'm not sure of is how to disable styles defined by
the msgs themselves.

Sidr,
"Presentational tags" usually means tags that define how the msg is
displayed (e.g. <b>, <font> and styles). The opposite are "structural markup
tags", which show the meaning of entities (e.g. <h1> or <ul>).
I'm wondering here whether this would be possible with a user stylesheet.  Would
it be possible to have a standard mail stylesheet with some CSS like "* {
!important}" or some such?
Actually, I think an implementation of bug #6782 (alternate stylesheets) for
mail would achieve this, since I am interpreting that bug to include the ability
to ignore all style.  Adding dbaron.
Some discussion:
1. The sense that I can make of "without visual formatting for presentational
tags" would involve a stylesheet that defines the formatting of almost all
non-structural tags identically. This would indeed be a very good basis for
making a "plaintext" version out of HTML mail if it was followed up by
sensible conversions for structural elements. Perhaps ALL CAPS for <H1>, etc.,
which could also mostly be handled by a stylesheet. To make sure that this
over-rides author stylesheets as well as html.css it would only need to be
introduced higher in the cascade than any author stylesheets.
All of this makes sense as one very useful way to handle incoming HTML mail.
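
A minimal sketch of the kind of stylesheet point 1 describes, assuming it is
loaded above author stylesheets in the cascade (for instance as a user
stylesheet with !important rules). The selector list and the choice of
uppercase for <H1> are only illustrative, not something specified in this bug:

```css
/* Illustrative only: flatten common presentational elements so they
   render like the surrounding text. */
b, i, u, s, tt, big, small, font, center {
  font: inherit !important;
  color: inherit !important;
  background: transparent !important;
  text-decoration: none !important;
  text-align: inherit !important;
}

/* One possible plaintext-style convention for a structural element. */
h1 {
  text-transform: uppercase !important;
}
```

Because only the presentational elements are listed, structural elements keep
whatever html.css gives them unless they get their own rule.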

2. There is room for a middle ground between conversion to a plaintext
presentation and leaving the incoming HTML-formatted message alone.
People make up ad hoc structural uses for presentational tags all the time
- <I> for <BOOKTITLE>, for example, or, more properly, CLASS=BOOKTITLE with
italics specified for .BOOKTITLE in css. Why throw that away to get rid of an
annoying green-on-orange or grey on black colour scheme? Applying html.css at
a higher level in the cascade would leave that alone while changing stupid
formatting choices made in HTML, and most made in css, back to something
readable.
This would be *another* sensible way to treat incoming HTML mail, in some
circumstances, which is why it was suggested as a user-triggered action.

3. The point I made about structural elements having a presentational aspect,
at least for the sighted, that through convention makes it possible to
understand the structure by size, weight, positioning, etc., becomes important
if someone makes serious changes to a stylesheet that affects the visible
formatting of structural elements. There are dozens of ways, for instance,
to visually format a heading and still have it recognized as such by most
people most of the time, but too much change and many will not recognize
it as a heading anymore.

If <H1> elements are made inline, for instance, which is perfectly possible,
then a facility that restores a sensible default stylesheet *only* for elements
that are normally considered presentational would leave the <H1> inline,
leaving the structure obscured.

I see no reason not to restore a sensible default presentational aspect
to *all* elements, when poorly chosen presentational properties can
cripple understanding of structural elements.

Both conversion to a "plaintext mode" and applying html.css at a higher
cascade level *would* undo poor visible formatting choices for structural
elements, while the latter would keep the structural elements more easily
recognizable by being able to use the default font sizing, etc., in html.css.

4. It would also be possible to define a "semi-plaintext" mode, where
all text in non-structural elements is the same font and size, possibly
fixed-width, but the heading tags would appear as usual, as would lists
and tables and <HR> etc, through yet another stylesheet. This may also
have its uses.

Implementing (2) as well as (1) would allow users to undo the worst of
visual stylistic formatting without having to go all the way to plaintext.
All that would be needed to do that would be a way to introduce html.css
at a higher cascade level almost as if it were a user stylesheet.

This could indeed be done with alternate stylesheets as described in
bug 6782, but the stylesheets for (2) and (4) would be neither
User, exactly, nor Author.

(1) I suspect would be triggered by a preference.
As mentioned several times above, this could be easily done by allowing user
stylesheets for news and mail. Similar functionality will appear in the browser
when user stylesheets are enabled there. (Is there a bug on that???)
Keywords: helpwanted
Summary: [HELP WANTED] Non-presentational HTML mail → Non-presentational HTML mail
Whiteboard: [HELP WANTED]
Blocks: 31907
Assignee: nobody → mozilla
Taking bug.

Most of it is very easy - we just need to output a class for the wrapping <div>,
so a stylesheet can match it specifically, and the user can override certain
presentational tags per stylesheet, e.g.
.text-html font {
  font-size: inherit !important;
  font-family: inherit !important;
  color: inherit !important;
}

But I'm not sure what to do with the style attribute. Luckily, you can override
these specs with the user stylesheet, too. But the author can add any style to
any tag, so you'd have to specify an inherit for all properties *and* tags.
Easier, but more hackish, is to just replace "style=" with
"class=\"html-author-style\" nil=" while the HTML source passes libmime, and
similar tricks with <meta link=...>.

I have an experimental version working, with the replace hack described above. I
even added more hacks for removing external images and scripts. My plan is to
use regexps or a JS-function for that, so the user can adjust the altering of
the HTML source, but unfortunately, we have no infrastructure ready for that.
s/<meta link=...>/"<link...", "<meta...", "<style" etc.


What works (in my tree):

You can add the rule below to your global user stylesheet. This should remove most
style threats. Perf should also be OK.


Outstanding problems:

> But the author can add any style to
> any tag, so you'd have to specify an inherit for all properties *and* tags.

1:
Well, there is the universal selector "*" of CSS2 (is implemented in Mozilla),
but it is *very* slow, so you don't want to specify it in your user stylesheet,
which applies to all of Mozilla, including the UI. You can make the rule apply
only to msgs (".text-html *"), but it will be evaluated for each and every
element. Adding dependency on bug 41637, a stylesheet only for msgs. Would be
nice, if we could add hooks to a user stylesheet from there, e.g. add "@import
profile://chrome/mailnews-msg.css" or so to the chrome msg stylesheet, because
it won't be easy to hack the chrome stylesheet once it is in a jar archive. Is
there a way to reference a file in the user's profile dir per URL?

2:
You would still have to override each and every property CSS Mozilla knows. Is
there some shorthand property for "all properties"? If yes, you could just say
".text-html * {all: inherit !important}" and would have overridden all rules, no
matter where included (style attribute, style element or external stylesheet).

3:
Once you overwrote all rules (including the user-agent stylesheet), HTML
elements have no meaning anymore. You'd have to copy (most of) html.css into
your msg stylesheet :-(. Is there some way to override all author rules, but not
the UA ones?



.text-html font, .text-html div, .text-html body {
  font-size: inherit !important;
  font-family: inherit !important;
  color: inherit !important;
  background-color: inherit !important;
  background-image: inherit !important;
  text-align: inherit !important;
  text-indent: inherit !important;
}
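
On outstanding problems 2 and 3 above: later CSS specifications (not available
in Mozilla at the time of this bug) define an `all` shorthand and a `revert`
keyword that together come close to what is asked for. A hedged sketch,
assuming a user stylesheet and the .text-html wrapper class from step 1;
whether a given mail client honours this for message display is an assumption:

```css
/* Sketch for a user stylesheet; assumes the .text-html wrapper class. */
.text-html * {
  /* `all` covers every property except `direction` and `unicode-bidi`.
     `revert` in a user stylesheet rolls back to the user-agent value,
     so the html.css defaults survive. `!important` lets this user rule
     win over author rules, including style attributes. */
  all: revert !important;
}
```

This still relies on the universal selector, so the performance concern from
problem 1 applies.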
Status: NEW → ASSIGNED
Depends on: 41637
Or is there some other way to tell NGLayout not to use author styles of *some*
documents (i.e. msgs only)?
> Or is there some other way to tell NGLayout not to use
> author styles of *some* documents (i.e. msgs only)?

Not sure, but "No Author-Specified Style" (ie not even non-alternate
stylesheets) should be an option for both mail and the browser.  Is it currently
planned for the Browser, Ian?
NGLayout seems to allow that, but I need to figure out how to make it do what I
want. This would be step 2 on this bug.

I think being able to apply certain style rules only to html msgs makes sense
nevertheless, and is IMO a good short-term solution. So, adding the <div> would
be step 1. Will attach patch. rhp, can you review, please?
This seems reasonable to me.

- rhp
Step 1 checked in.
relnoteRTM: We should at least consider release noting this. Many users complain
about formatting in HTML msgs, and the "solution" we have now (adding certain
rules to user stylesheet) is completely non-obvious.
Keywords: relnoteRTM
This probably shouldn't be mixed in with other more serious release notes, but a
"tips" section in the release notes would be useful that included this.

I suspect though this is going to become a knowledgebase sort of thing, like
most other tips.
Whiteboard: relnote-devel
I really don't think this fits in with the other developer release notes I'm
writing since this is actually a user tip on how to achieve a desired display
mode. Suggest that someone post a technical tip explaining this technique on the
mozilla.org web site as user documentation of "hints and tips."
OK, removing relnoteRTM keyword, adding "tip" to status whiteboard.
Keywords: relnoteRTM
Whiteboard: relnote-devel → tip
Until we get a more detailed tips page describing this, I've added it to the
userContent.css suggestions under http://www.mozilla.org/unix/customizing.html
(along with a comment pointing back to this bug for more info).
hey, akk, that's a cool doc! Thanks.
I don't see the addition you mentioned, but I guess the site just didn't sync yet.
Whiteboard: tip
For the record: Bug 30888 is related.
*** Bug 30896 has been marked as a duplicate of this bug. ***
Changing personal priorities. Giving away most of my bugs :-( (reassigning to
default owner).

I will still track these bugs closely. If you need my input, feel free to ask me.

New owner: Please do *not* close these bugs (as WONTFIX or whatever you may
find) unless they are fixed. Rather, reassign to <nobody@mozilla.org>, if you
don't want to work on them.
Assignee: mozilla → sspitzer
Status: ASSIGNED → NEW
QA Contact: lchiang → esther
Keywords: mozilla1.2
Blocks: 108153
No longer blocks: 108153
Depends on: 108153
For the mailing lists and newsgroups that I get, almost all HTML on them is
SPAM, with only a little bit from people with mis-configured mailers.

1. I would like to be able to set the default not to display the HTML portion of
the message automatically.

2. If HTML is to be displayed, then I do not want external links to be opened,
or pop-ups to be enabled.  Some of the spammers are now sending HTML that causes
all sorts of external sites to be pulled in.

IMHO: Having a mail or news client be forced to automatically display HTML
always is a bug.

Having a mail message execute scripts or open an external web page, where the
browser can give out cookies or accept cookies is a security bug.

It allows SPAMMERs to confirm that their message was delivered.

3. All of this should be easily selected by the user.
John Malmberg, you are offtopic. Please read the bug and the bugs referenced
etc. You will find bugs that exactly cover your requests.
HTML mail is even worse than this bug suggests.

I receive HTML email that contains a link to a GIF on a remote server.
The URL for the GIF is through CGI and the path contains my email
address! This means that if I view this spam in Mozilla it will kindly
report to the spammer that they have successfully hit a valid email
account. I really thought that Mozilla would have given us the ability
to switch off HTML viewing instead of following Microsoft's brain dead
adherence to allowing us to be spammed, virused and warez'd.
Now that blocking bug 108153 is fixed, is there anything left to be done here? I
don't think so.

Liam Parker: Please read the 2 comments just above yours.
Assignee: sspitzer → ben.bucksch
> Now that blocking bug 108153 is fixed, is there anything left to be done here? I
> don't think so.

No comment, marking FIXED.
Status: NEW → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED
verified
Status: RESOLVED → VERIFIED
Product: Browser → Seamonkey