User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US) AppleWebKit/533.1 (KHTML, like Gecko) Chrome/5.0.323.0 Safari/533.1
Build Identifier: version 3.4.5

I have just tried to file a bug on Thunderbird concerning a problem with its Unicode support. I included an example of the text in question; Bugzilla failed spectacularly, truncating my bug report at the first astral plane code point. The bug report in question is https://bugzilla.mozilla.org/show_bug.cgi?id=545478.

Reproducible: Always

Steps to Reproduce:
1. Attempt to file a bug report using some text containing characters in the Unicode astral plane range --- there's an example at http://twitter.com/hjalfi/statuses/8690602802.

Actual Results:
The bug report is truncated at the first astral plane code point.

Expected Results:
I see the text in the bug report.

This *may* have security implications --- I don't know why it's truncating the bug report, but it smells like it's misparsing the text, and that sort of thing needs looking at closely in case it's doing something unexpected.
I shall now attempt to add the text sample in question to this bug report to see whether 'Additional Comments' fails too.
Dodgy text start: [
Yes, that truncated it as well.
My guess would be that MySQL is doing it, actually, not Bugzilla. Where in the Unicode space do these characters start?
I've heard rumours MySQL doesn't support astral plane characters properly. These specific characters are MATHEMATICAL BOLD FRAKTUR, from U+1D56C to U+1D59F, but I suspect the truncation would happen with anything above U+FFFF.
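For anyone trying to reproduce this, here is a rough Python sketch of what seems to set these characters apart. Every code point above U+FFFF takes four bytes in UTF-8, while everything in the Basic Multilingual Plane fits in three or fewer. The helper `truncate_at_first_astral` is hypothetical: it only imitates the symptom seen in this report and is not taken from Bugzilla or MySQL, whose actual behavior is still a guess at this point.

```python
# Sketch only: the truncation rule below is an assumption made for
# illustration, not confirmed behavior of Bugzilla or MySQL.

BMP_MAX = 0xFFFF  # last code point of the Basic Multilingual Plane


def is_astral(ch: str) -> bool:
    """Return True if ch lies outside the Basic Multilingual Plane."""
    return ord(ch) > BMP_MAX


def truncate_at_first_astral(text: str) -> str:
    """Hypothetical helper: cut text just before its first astral code point."""
    for i, ch in enumerate(text):
        if is_astral(ch):
            return text[:i]
    return text


# Four characters from the MATHEMATICAL BOLD FRAKTUR range, wrapped in brackets.
fraktur = "".join(chr(cp) for cp in range(0x1D56C, 0x1D570))
sample = "Dodgy text start: [" + fraktur + "]"

# Show how many UTF-8 bytes each astral character needs.
for ch in fraktur:
    print(f"U+{ord(ch):04X} -> {len(ch.encode('utf-8'))} UTF-8 bytes")

# Imitate the symptom: everything from the first astral character on is lost.
print(repr(truncate_at_first_astral(sample)))
```

If the storage layer really does stop at three-byte sequences, any character from U+10000 upwards should trigger the same truncation, not just this range.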
Ahh, do you know what version of Unicode those were added in? It may be that either Perl or MySQL isn't using that version of Unicode.
3.1.0, apparently: http://www.fileformat.info/info/unicode/char/1d56c/index.htm (Which came out in 2001...)
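One way to check whether a given runtime's Unicode tables are new enough to know these characters is to ask its character database directly. The sketch below uses Python's standard `unicodedata` module purely as an illustration; it says nothing about which Unicode version the Perl or MySQL installations behind Bugzilla are using.

```python
import unicodedata

# Version of the Unicode Character Database bundled with this Python build.
print("Unicode database version:", unicodedata.unidata_version)

# First character of the range mentioned in this report.
first = chr(0x1D56C)

# name() returns the fallback string if the database does not know the character.
print(f"U+{ord(first):04X}:", unicodedata.name(first, "<not in this Unicode database>"))
```

A runtime whose tables predate Unicode 3.1 would report the fallback string here instead of a character name.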