Closed Bug 1663947 Opened 4 years ago Closed 3 years ago

HTML entities like   appear in calendar event and task descriptions

Categories

(Calendar :: Dialogs, defect, P3)

Tracking

(thunderbird_esr78 unaffected, thunderbird85 affected)

RESOLVED FIXED
86 Branch
Tracking Status
thunderbird_esr78 --- unaffected
thunderbird85 --- affected

People

(Reporter: pmorris, Assigned: mkmelin)

References

Details

Attachments

(1 file)

In bug 1659363 we started rendering calendar event/task descriptions as HTML. But some HTML entities like   are showing up in the descriptions when they shouldn't be (i.e. users sees   instead of a space).

mkmelin noticed this and thinks it is likely an html vs xhtml issue. For example,   is the xhtml version of  . Ideally we fix it at the source of the issue, but otherwise a global regex replace would be another possibility.

I would support mkmelin's theory. Also I read that I could necessary to add the explicit XHTML doctype to a document to have the full range of entities available (i.e. <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">). Maybe it is worth a try.

I tried the following in calendar-summary-dialog.xhtml:

 19 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"                                                                                                  
 20   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd" [                                                                                                   
 21   <!ENTITY % globalDTD SYSTEM "chrome://calendar/locale/global.dtd" >                                                                                     
 22   <!ENTITY % calendarDTD SYSTEM "chrome://calendar/locale/calendar.dtd" >                                                                                 
 23   <!ENTITY % dialogDTD SYSTEM "chrome://calendar/locale/calendar-event-dialog.dtd" >                                                                      
 24   <!ENTITY % brandDTD SYSTEM "chrome://branding/locale/brand.dtd" >                                                                                       
 25   %globalDTD;                                                                                                                                             
 26   %calendarDTD;                                                                                                                                           
 27   %dialogDTD;                                                                                                                                             
 28   %brandDTD;                                                                                                                                              
 29 ]>    

Does not seem to work. When I try to add a &nbsp; manually to the document I get:
XML Parsing Error: undefined entity

I don't understand what doesn't seem to work.
Yes,   can't be used in XML (and so XHTML). Instead use the NCR for it:  

Assignee: paul → nobody

(In reply to Magnus Melin [:mkmelin] from comment #3)

I don't understand what doesn't seem to work.

The goal is to to support character entities like   when they show up in the invitation description. As hinted before, I tried adding the xhtml doctype but that does not work.

Yes,   can't be used in XML (and so XHTML).

From what I can tell from searching online , xhtml is supposed to support these entities.

document.doctype returns null. Are these xhtml files really xhtml files or xml with xhtml file extensions?

Note: &nbsp; is not the only entity not being recognized.

If we can't get the doctype to work, I'll write a function to turn them into the number based equivalent.

Flags: needinfo?(mkmelin+mozilla)

(In reply to Lasana Murray from comment #4)

From what I can tell from searching online , xhtml is supposed to support these entities.

document.doctype returns null. Are these xhtml files really xhtml files or xml with xhtml file extensions?

They are XHTML files, but they do not have the XHTML doctype declaration, so entities won't work. We can't really add the "proper" doctype either, since the we can't add entities we need for localization.

Flags: needinfo?(mkmelin+mozilla)
Attached patch bug1663947_cal_nbsp.patch — — Splinter Review

Let's just fix the most common cases. Things don't fail badly without it, but it doesn't look too nice.

BTW, this test is failing rather badly on my machine since the local time settings are picked up (despite the pref being set in the test). I just disabled those subtests locally to be able to test the subtest I'm adding.

Assignee: nobody → mkmelin+mozilla
Status: NEW → ASSIGNED
Attachment #9194944 - Flags: review?(geoff)
Attachment #9194944 - Flags: feedback?(lasana)
Comment on attachment 9194944 [details] [diff] [review]
bug1663947_cal_nbsp.patch

Review of attachment 9194944 [details] [diff] [review]:
-----------------------------------------------------------------

::: calendar/base/modules/utils/calViewUtils.jsm
@@ +370,5 @@
>      // Convert plain text to HTML. The main motivation here is to convert plain
>      // text URLs into <a> tags (to linkify them).
> +    text = text.replace(/\r?\n/g, "<br/>");
> +    // Resolve some of the most common entities.
> +    text = text.replace(/&nbsp;/g, "\u00A0");

Just a suggestion:
Maybe combine the regexes into one and use a replace function to provide the replacement? 
Something like:

```
let replacements = {
 '&nbsp;': '\u00A0', 
 '&copy;': '\u00A9'
};

let replacementRegex = new RegExp(Object.keys(replacements).join('|'), 'g');

function replacer(match) {

 if(replacements.hasOwnProperty(match)) {
   return replacements[match];
 }
  return '';
}
```

That way it's one replace and it's easier to update if this becomes the permanent solution.
Attachment #9194944 - Flags: feedback?(lasana) → feedback+

I think it's easier to stick with what I have in the patch.

Comment on attachment 9194944 [details] [diff] [review]
bug1663947_cal_nbsp.patch

If you insist but is this seriously the best option we have? Our code's based on a browser and we're doing RegExp replacement of HTML entities? :-(

Attachment #9194944 - Flags: review?(geoff) → review+

Pushed by mkmelin@iki.fi:
https://hg.mozilla.org/comm-central/rev/8e20b58b866a
prevent the most common entities from appearing in event/task descriptions. r=darktrojan

Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED

Hmm, and just as I committed this I think I thought of a better solution.

Target Milestone: --- → 86 Branch

We may need defer the correct fix a bit though, until after top level html in all relevant contexts.
Then it's apparently possible to fix it as:
<!DOCTYPE html [ <!ENTITY % html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"> %html; ]>

I tried (In reply to Magnus Melin [:mkmelin] from comment #12)

We may need defer the correct fix a bit though, until after top level html in all relevant contexts.
Then it's apparently possible to fix it as:
<!DOCTYPE html [ <!ENTITY % html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"> %html; ]>

I tried something like this a couple ways and it did not work. Would this have to be in messenger.html or something?

I only confirmed that works for a simple XHTML test case so far.
It would have to be in all the files where descriptions can be in, which is probably tens of files at least. I think later we could just add it everywhere...

You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: