User-Agent: Mozilla/5.0 (Windows; U; Win 9x 4.90; en-US; rv:1.4b) Gecko/20030507 Build Identifier: Mozilla/5.0 (Windows; U; Win 9x 4.90; en-US; rv:1.4b) Gecko/20030507 Composer changes the copyright character from © to © Reproducible: Always Steps to Reproduce: 1. Open a new composer page 2. Switch to HTML Source view and enter "© 2003" in the body 3. switch to Normal and back to HTML Source view, it changed to "© 2003" 4. save the file and validate it at http://validator.w3.org/file-upload.html 5. Observe the errors Actual Results: Composer changes © to © which generates 4 validation errors Expected Results: Composer should not induce errors. Tested only with <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> and charset=ISO-8859-1
This a regression; can someone pinpoint when this started? -->DOM to Text Conversion
It regressed somewhere between Build 2003040105 and Build 2003041208.
This wouldn't explain a recent regression, but it seems interesting that in htmlparser/src/nsHTMLEntityList.h COPY is one of a handful of entities that are specified in capitals; everything else in the file is in lower case. That seems like it's probably an error, and I wonder if fixing it might fix the bug too? The serializer uses NS_ENTITYCONVERTER_CONTRACTID to output entities, which lives in intl/unicharutil. But the actual entities don't live there (maybe they get them from the parser). Naoki, any ideas if anything changed recently in the entity converter?
Whoops, if I'm going to ask Naoki questions it would help if I cc'ed him. :-)
cc alecf since he changed nsHTMLEntities.cpp in the timeframe
http://lxr.mozilla.org/seamonkey/source/htmlparser/src/nsHTMLEntityList.h#319 319 // Navigator entity extensions; apos is from XML 320 HTML_ENTITY(apos, 39) 321 HTML_ENTITY(AMP, 38) 322 HTML_ENTITY(COPY, 169) So, "apos" is there since it is not in HTML 4 but in XML. Other upper case entities may be there for compatibilites for old clients as the comment mentions.
Comment on attachment 123517 [details] [diff] [review] patch that fixes Composer's output seek reviews for 1.4final alecf--is there some way to avoid this and fix the implementation to find the entities based on insertion order (finding the first item, not the last item) Heikki--if we go with this fix for 1.4final, is there something that might break by changing the order?
So returning the last match is a new behavior (since it used to return the lowercase one) and the table need to match that? Is the table also used for decoding entites?
IMO, the change should not cause any problem. However, I would like to know when the logic changed ( first-match to last-match ) and why.
Created attachment 123526 [details] [diff] [review] patch that should change the table behavior I haven't tested this yet, but this should revert to preferring earlier entries in the list. This also fixes a few other bugs: * AddRefTable and ReleaseTable didn't work as (implicitly) documented. * The initial capacity given in PL_DHashTableInit was too small -- it failed to account for the alpha bounds, leading to a change of having to do a table resize (I'm not sure whether it happened, since it depends where the size is relative to powers of 2.)
Is there any chance that as part of this fix, someone could check in a comment explaining why we have duplicate entries (except for case), but only for a handful of tags? Are we sure that we don't need mixed-case versions? "Navigator entity extensions: apos is from XML" doesn't explain why that list of tags, but no others, need duplicate entries.
Comment on attachment 123517 [details] [diff] [review] patch that fixes Composer's output obsoleting this patch; but it'd be nice if some form of the comment in this patch were checked in.
Comment on attachment 123526 [details] [diff] [review] patch that should change the table behavior seeking reviews on dbaron's patch
Comment on attachment 123526 [details] [diff] [review] patch that should change the table behavior r=alecf
15 years ago
Comment on attachment 123526 [details] [diff] [review] patch that should change the table behavior seeking to fix this regression for 1.4final (html entities were being grabbed in the wrong order so we were outputting invalid html).
reassign to dbaron since this is his patch
Comment on attachment 123526 [details] [diff] [review] patch that should change the table behavior a=asa (on behalf of drivers) for checkin to 1.4
Comment on attachment 123526 [details] [diff] [review] patch that should change the table behavior - Nit: test if (++gTableRefCnt != 1) instead? - Don't add 1 to PL_DHashTableInit capacity parameters computed by dividing NS_HTML_ENTITY_COUNT by default max-alpha (.75). firstname.lastname@example.org if you fix the second and consider the first. /be
Fix checked in to trunk (with comment added to nsHTMLEntityList.h), 2003-05-20 13:53 -0700.