Bugzilla

Comment 2

•

17 years ago

http://www.worldski.com/ski-specialoffers.aspx - I don't see any diamonds.

http://cheap4holidays.com/faqs.aspx - one diamond, "Cheap4Holidays� is a trading name". There is a null byte in a UTF-8 stream, and I guess our UTF-8 decoder turns this into the standard replacement character.

http://cheap4holidays.com/terms.aspx - same as faqs.aspx

http://www.cheap4carhire.com/terms.aspx - "Cancellation & Amendments: <�br/>". Another null byte in a weird place in UTF-8. Strangely, I only see it if I save the file using Firefox, not using wget/curl.

Tools used: wget, curl, hexdump -C, Firefox trunk (not 3.0.x)

Dunno if this is a bug in Firefox, but if you're the owner of the sites, you can make the problem go away by removing those null bytes from the pages.

Assignee: nobody → smontagu

Component: General → Internationalization

Product: Firefox → Core

QA Contact: general → i18n

Comment 3

•

17 years ago

Wikipedia says 0x00 is valid UTF-8 for a null character. Seems like a bug in Firefox to me that it's being turned into a replacement character.
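For anyone who wants to check the "0x00 is valid UTF-8" claim independently, here is a minimal sketch. It only exercises Python's built-in UTF-8 codec (an assumption chosen for illustration, not Firefox's decoder), so it says nothing about where the replacement character is introduced in the browser.

```python
# Assumption: this uses Python's built-in UTF-8 codec, not Firefox's decoder.
raw = b"a\x00b"  # the same three bytes as the a%00b test case

# errors="strict" raises UnicodeDecodeError on any invalid byte sequence,
# so a successful decode means 0x00 was accepted as valid input.
text = raw.decode("utf-8", errors="strict")

nul_preserved = "\x00" in text        # U+0000 survives decoding
replacement_added = "\ufffd" in text  # U+FFFD would indicate a substitution

print("code points:", [hex(ord(ch)) for ch in text])
print("NUL preserved:", nul_preserved, "| replacement added:", replacement_added)
```

A strict decoder that accepts the byte and yields U+0000 is consistent with the view that the substitution happens somewhere after decoding.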

Comment 4

•

17 years ago

Testcase: data:text/html;charset=UTF-8,a%00b

Summary: Diamond like � Character → Null byte in UTF-8 page is shown as a replacement character (Diamond-like, �)

Comment 5

•

17 years ago

I don't think the UTF-8 decoder is inserting the replacement character: any other encoding I tried does the same. data:text/html;charset=iso-8859-1,a

Summary: Null byte in UTF-8 page is shown as a replacement character (Diamond-like, �) → Null byte in page source is shown as a replacement character (Diamond-like, �)

Comment 6

•

17 years ago

The last comment got cut off; it should have ended:

data:text/html;charset=iso-8859-1,a%00b

data:text/html;charset=windows-1250,a%00b

data:text/html;charset=Shift_JIS,a%00b

data:text/html;charset=Big5,a%00b

etc., etc.
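The test cases above all follow one pattern, so they can be generated rather than typed by hand. This is a small sketch; the helper name `nul_testcase` is made up for illustration, and the charset list is just the one quoted in this thread.

```python
from urllib.parse import quote


def nul_testcase(charset: str, payload: bytes = b"a\x00b") -> str:
    """Build a data: URL whose body contains a percent-encoded null byte.

    Hypothetical helper: the name and default payload are illustrative only.
    """
    # quote() percent-encodes the NUL byte as %00 and leaves ASCII letters alone.
    return f"data:text/html;charset={charset},{quote(payload, safe='')}"


charsets = ["UTF-8", "iso-8859-1", "windows-1250", "Shift_JIS", "Big5"]
urls = [nul_testcase(cs) for cs in charsets]

for url in urls:
    print(url)
```

Pasting each printed URL into the address bar reproduces the comparison across encodings.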

Comment 7

•

17 years ago

If I'm not mistaken, the parser is replacing all null bytes with the replacement character; see bug 315473.

Assignee: smontagu → nobody

Component: Internationalization → HTML: Parser

OS: Windows XP → All

QA Contact: i18n → parser

Hardware: PC → All

Reporter

Comment 8

•

17 years ago

(In reply to comment #1)
> can not be critical because this is no crash/dataloss.
> A screenshot and attached page source etc are needed.
>

I have lost a lot of my data: when we send a request to third-party services, the response is lost and the users can view it. Data and bandwidth usage have increased a thousand times. Can you please tell me how I can provide screenshots on this forum?

Reporter

Comment 9

•

17 years ago

(In reply to comment #2)
> http://www.worldski.com/ski-specialoffers.aspx - I don't see any diamonds.
>
> http://cheap4holidays.com/faqs.aspx - one diamond, "Cheap4Holidays� is a
> trading name". There is a null byte in a UTF-8 stream, and I guess our UTF-8
> decoder turns this into the standard replacement character.
>
> http://cheap4holidays.com/terms.aspx - same as faqs.aspx
>
> http://www.cheap4carhire.com/terms.aspx - "Cancellation & Amendments:
> <�br/>". Another null byte in a weird place in UTF-8. Strangely, I only see
> it if I save the file using Firefox, not using wget/curl.
>
> Tools used: wget, curl, hexdump -C, Firefox trunk (not 3.0.x)
>
> Dunno if this is a bug in Firefox, but if you're the owner of the sites, you
> can make the problem go away by removing those null bytes from the pages.
>

Thanks for your comments!

1- http://www.worldski.com/ski-specialoffers.aspx... This character does not always appear at one particular position. It appears in different parts of the page on different accesses. If you view the source code of the page, you will find it somewhere in the data, tags, attribute values, and URLs. I have screenshots, but I am not sure whether I can attach them.

Reporter

Comment 10

•

17 years ago

(In reply to comment #2)
> http://www.worldski.com/ski-specialoffers.aspx - I don't see any diamonds.
>
> http://cheap4holidays.com/faqs.aspx - one diamond, "Cheap4Holidays� is a
> trading name". There is a null byte in a UTF-8 stream, and I guess our UTF-8
> decoder turns this into the standard replacement character.
>
> http://cheap4holidays.com/terms.aspx - same as faqs.aspx
>
> http://www.cheap4carhire.com/terms.aspx - "Cancellation & Amendments:
> <�br/>". Another null byte in a weird place in UTF-8. Strangely, I only see
> it if I save the file using Firefox, not using wget/curl.
>
> Tools used: wget, curl, hexdump -C, Firefox trunk (not 3.0.x)
>
> Dunno if this is a bug in Firefox, but if you're the owner of the sites, you
> can make the problem go away by removing those null bytes from the pages.
>

http://cheap4holidays.com/faqs.aspx - The diamond-like character after the name is not part of the trade name. It appears where my aspx statement is written as follows: <%=session("Sitename")%>... There is nothing wrong with this statement. As an experiment, I have cast it to String, trimmed it, used a cleanHTML function, etc. Can you please explain a little how Firefox handles null bytes, and what a possible solution is for removing them from the rendered HTML?

Comment 11

•

17 years ago

If you do a 'hexdump -C' on the source code for your aspx program and don't see null bytes there, they're probably in places like the "Sitename" value. I can't give you detailed help with fixing that because I don't know aspx.

The problem in http://www.cheap4carhire.com/terms.aspx shows up in Safari and Opera too; those browsers just don't show the replacement character between "<" and "br/>".

If *random* (???) null bytes are getting into ski-specialoffers.aspx, you'll have similar problems there occasionally until you fix the site.
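If 'hexdump -C' is not available (for example on Windows), the same check can be scripted. The following is a rough sketch; the function names are made up for illustration, and the sample bytes are a stand-in for a saved copy of the page.

```python
from __future__ import annotations


def find_nul_offsets(data: bytes) -> list[int]:
    """Return the byte offsets of every 0x00 in data."""
    return [i for i, byte in enumerate(data) if byte == 0]


def show_context(data: bytes, offset: int, width: int = 8) -> str:
    """Hex-dump up to `width` bytes on each side of `offset`, like a tiny hexdump."""
    start = max(0, offset - width)
    return data[start:offset + width + 1].hex(" ")


# Stand-in for the bytes of a saved page (assumption: read yours with open(path, "rb")).
sample = b"Cancellation & Amendments: <\x00br/>"

for off in find_nul_offsets(sample):
    print(f"NUL at byte {off}: {show_context(sample, off)}")
```

Running this against both the saved page and the template source should show whether the null bytes are in the template itself or only in the generated output.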

Blocks: 315473

Comment 12

•

17 years ago

The handling of null bytes in text nodes is rather inconsistent: if a null byte is at the beginning of the text it just gets omitted, but anywhere else it is replaced by the � character:

data:text/html,<p>%00abc%00def</p>
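To make the inconsistency concrete, here is a toy model of the behaviour described above. It is not the parser's code, only a restatement of the observation. What happens with several leading null bytes is not described in this thread, so the handling of that case (drop only the first one) is an assumption.

```python
REPLACEMENT = "\ufffd"  # U+FFFD, the diamond-like replacement character


def observed_text_node_rendering(text: str) -> str:
    """Toy model of the observed behaviour; NOT the actual parser logic.

    Assumption: only a single leading NUL is dropped; the thread does not
    say what happens when several NULs start the text.
    """
    if text.startswith("\x00"):
        text = text[1:]  # a leading NUL is silently omitted
    return text.replace("\x00", REPLACEMENT)  # any other NUL becomes U+FFFD


rendered = observed_text_node_rendering("\x00abc\x00def")
print(rendered.encode("unicode_escape").decode("ascii"))
```

For the test case above, the model drops the first null byte and turns the second into the replacement character, which matches what the comment reports.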

Reporter

Comment 13

•

17 years ago

Any solution?

Comment 14

•

17 years ago

Even after we fix this bug (which is specific to text nodes), you'll need to stop putting random null bytes in your page (see comment 11), so you might as well do that now...
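As a language-neutral illustration of "stop putting null bytes in your page", the sketch below strips them from a value just before it is written into the HTML and reports how many were found. It is written in Python only for illustration; the site in question uses aspx, and the helper name `strip_nul` is hypothetical.

```python
from __future__ import annotations


def strip_nul(value: str) -> tuple[str, int]:
    """Remove U+0000 from value and return (cleaned value, number removed).

    Hypothetical helper: the count is returned so the caller can log which
    values were affected and trace where the null bytes come from.
    """
    removed = value.count("\x00")
    return value.replace("\x00", ""), removed


# Stand-in for a session value such as "Sitename" that carries a trailing NUL.
site_name, removed = strip_nul("Cheap4Holidays\x00")
print(repr(site_name), "null bytes removed:", removed)
```

Stripping at output time hides the symptom; logging the count helps find the component that introduces the null bytes in the first place.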