
Comment 4

•

23 years ago

I think there *is* a bug about it, but I cannot find it. This bug occurs because the code uses |nsCRT::IsAsciiAlpha()| to determine whether a character is alphabetic. Tell me a function that works internationally and I can change it.
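To illustrate the limitation described in this comment, here is a minimal Python sketch (not the Mozilla C++ code). `is_ascii_alpha` is a hypothetical stand-in for an ASCII-only check such as |nsCRT::IsAsciiAlpha()|; Python's built-in `str.isalpha()` is shown only for contrast.

```python
def is_ascii_alpha(ch: str) -> bool:
    """Hypothetical stand-in: True only for the ASCII letters A-Z and a-z."""
    return len(ch) == 1 and ("a" <= ch <= "z" or "A" <= ch <= "Z")


# Compare the ASCII-only check with Python's Unicode-aware str.isalpha().
for ch in ["f", "é", "ü", "Ι"]:  # the last entry is Greek capital iota
    print(repr(ch), is_ascii_alpha(ch), ch.isalpha())
```

Under this assumption, any struct whose first or last character is an accented or non-Latin letter fails the ASCII-only test even though it is clearly a letter.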

Assignee: sspitzer → ben.bucksch

Severity: trivial → minor

Component: Mail Window Front End → Networking

Product: MailNews → Browser

Summary: Message text highlighting broken with 8bit chars → [mozTXTToHTMLConv] structs with leading/trailing 8bit chars not recognized

Target Milestone: --- → Future

Boris 'pi' Piwinger

Comment 5

•

23 years ago

*** Bug 142507 has been marked as a duplicate of this bug. ***

Boris 'pi' Piwinger

Comment 6

•

23 years ago

*** Bug 123326 has been marked as a duplicate of this bug. ***

Christian :Biesinger (don't email me, ping me on IRC)

Comment 7

•

23 years ago

*** Bug 149245 has been marked as a duplicate of this bug. ***

Comment 8

•

22 years ago

> Tell me a function that works internationally and I can change it.

Well, if it's a char*, isascii() might be enough, but it works only for the current locale. As there are more than 64000 characters, I don't think there's a function that tells you for each one whether it's an alphanumeric char... at least none I know of.

Daniel Küstner

Comment 9

•

22 years ago

Sure? isascii() only tells you whether a certain character belongs to 7-bit ASCII. What Ben asked for is a function which tells him whether a certain character is alphanumeric.

R.K.Aa.

Comment 10

•

22 years ago

*** Bug 172800 has been marked as a duplicate of this bug. ***

benc

Comment 11

•

22 years ago

mass assignment of text->HTML bugs to MailNews w/ esther as QA.

Component: Networking → Mail Back End

Product: Browser → MailNews

Target Milestone: Future → ---

Version: Trunk → other

Jo Hermans

Comment 12

•

22 years ago

*** Bug 194032 has been marked as a duplicate of this bug. ***

Frank Wein [:mcsmurf]

Comment 13

•

22 years ago

We have some IsUTF8 function, but an IsUTF8Alpha function needs to be written first.

Myk Melez [:myk] [@mykmelez]

Comment 14

•

21 years ago

*** Bug 206298 has been marked as a duplicate of this bug. ***

Updated

•

20 years ago

Product: MailNews → Core

Comment 15

•

20 years ago

*** Bug 272981 has been marked as a duplicate of this bug. ***

Jungshik Shin

Comment 16

•

20 years ago

(In reply to comment #8)
> >Tell me a function that works internationally and I can change it.
>
> as there are more than 64000 characters, I don't think there's a function that
> tells you for each one if it's an alphanumeric char... at least none I know of.

Actually, there are |nsIUGenCategory| and |nsIUGenDetailCategory|.
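The idea of classifying characters by Unicode general category can be sketched in Python with the standard `unicodedata` module. This only illustrates the concept; it is not the |nsIUGenCategory| interface, and `is_unicode_letter` is a hypothetical helper name.

```python
import unicodedata


def is_unicode_letter(ch: str) -> bool:
    """True if the character's Unicode general category is a Letter (L*)."""
    return unicodedata.category(ch).startswith("L")


# Print the two-letter general category next to the letter test.
for ch in ["a", "é", "α", "中", "5", "$", "_"]:
    print(repr(ch), unicodedata.category(ch), is_unicode_letter(ch))
```

The general category is a property defined for every code point, so a single table lookup answers the "is it a letter" question without enumerating characters by hand.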

Keywords: intl

Comment 17

•

20 years ago

*** Bug 280298 has been marked as a duplicate of this bug. ***

Comment 18

•

20 years ago

*** Bug 292558 has been marked as a duplicate of this bug. ***

earlpiggot

Comment 19

•

17 years ago

It occurs also with UTF-8, ISO-8859-7, Windows-1253. And it is for all three formatting characters: *, /, _

Comment 20

•

17 years ago

Currently using isAsciiAlpha() to tell whether it's a letter. Need generic isUnicodeAlpha() which works for all Unicode chars/languages. Filed bug 415209 for this.

Depends on: 415209

Whiteboard: Currently using isAsciiAlpha() to tell whether it's a letter. Need generic isUnicodeAlpha() which works for all international chars/languages. See dependencies.

Updated

•

17 years ago

Whiteboard: Currently using isAsciiAlpha() to tell whether it's a letter. Need generic isUnicodeAlpha() which works for all international chars/languages. See dependencies. → See dependencies.

earlpiggot

Comment 21

•

17 years ago

I'm not sure whether replacing isAsciiAlpha() with some isUnicodeAlpha()-like is sufficient, as Tb should format phrases like *13 Ιαν*, or _νοκ-άουτ_, i.e., words and phrases that combine any letters with digits, punctuation marks, etc.

Nobody; OK to take it and work on it

Comment 22

•

17 years ago

> should format phrases like *13 Ιαν*, or _νοκ-άουτ_, i.e., words
> and phrases that combine any letters with digits, punctuation marks

No, that was an intentional decision not to do that. It's too likely to go wrong for math (simple), ASCII art, etc. It may look a bit arbitrary, but the converter is written with the goal of minimal false positives, even if that means false negatives, esp. given that this is just niceness and nothing depends on this feature.

earlpiggot

Comment 23

•

17 years ago

IMHO, statistics might show you that the majority of the Tb users don't have English as their mother tongue. So perhaps you ought to put in a little extra effort for them.

Magnus Melin [:mkmelin]

Updated

•

17 years ago

QA Contact: esther → backend

Updated

•

16 years ago

Product: Core → MailNews Core

Comment 32

•

11 years ago

The summary of this bug, while certainly correct and concise, is too technical for both general users and even QA to find it, which contributes to unnecessary inflation of duplicates (currently 16), which is highly undesirable for bug workflow and management as we are wasting time to analyse, discuss, and add testcases for the same problem all over again in each bug. More so given our current scarce manpower in QA.

For exactly and only those reasons, I will add frequent search words that users associate with this bug to the bug summary, with the unfortunate but inevitable side effect that the summary will be longer and less concise than now. However, that appears to me to be clearly the lesser evil compared to wasting more time on duplicates.

Feel free to improve the summary, but pls refrain from removing any of the popular search words (and you have no idea what people search for when they search, it's creative language use for sure). What I usually try is to blend relevant search words into a human-readable summary of the bug, which is also helpful for a better understanding of the bug itself and from search results.

Btw, it's a major shortcoming of Bugzilla that we don't have a separate field for adding freetext searchwords, which would be a simple and much superior solution over stuffing summaries. Lacking that, there's currently no other way than using summary for that purpose.

Summary: [mozTXTToHTMLConv] structs with leading/trailing 8bit chars not recognized → [mozTXTToHTMLConv] structs with leading/trailing 8bit chars not recognized (structured plain text like */_éfoobar$_/* with special or accented characters at beginning or end of string is not displayed as formatted bold, italics, or underlined)

Comment 33

•

11 years ago

I fully appreciate that when Ben assigned himself to this bug in 2002 (comment 4), he was willing to fix it, but was hindered by bug 415209 on which this bug depends, where bug 415209 is obviously a lot harder than this one. So this bug and its assignee are apparently waiting for bug 415209 to be done first, but the different assignee of bug 415209 unfortunately hasn't touched that one for several years either (I've just invited him to continue, in support of Ben's bug). The net effect is that while this bug appears "assigned", nobody is working on it.

Again, I'm looking at this from a bug workflow and management perspective. Such bugs create a false sense of progress and security where there is neither. We could probably assign the whole database that way asserting that somebody will fix it IF all those other blocking bugs were fixed, and we'd end up with an assignment quota like from a book of fairy tales. I don't see much benefit in such resting assignments. On the contrary, I'd suspect that an inactive assignment might narrow down that little chance of somebody actually coming along to pick this up or contribute new ideas. Why should somebody try a bug that is already assigned? How can I as a QA volunteer credibly invite active coders to try their creativity and ideas on bugs like this (or their blockers) if it's already preoccupied by an inactive assignment where they will be shy to interfere?

In conclusion, I'd recommend that we unassign bugs like this one to keep the door open for others who might wish to work on this or even just add alternative ideas, and to have a more truthful reflection of the bug status in our database. Instead of assignment-on-hold, we could just keep a comment from Ben for the record that he's willing to work on this after somebody else has fixed blocking bug 415209.

How do others think about this? Comments welcome.

Flags: needinfo?

Assignee

Comment 35

•

11 years ago

Attached patch 106028.diff (obsolete) — Details — Splinter Review

This builds, but I'm not sure if it's correct and I don't know how to test it.

Attachment #796909 - Flags: feedback?(ben.bucksch)

Assignee

Comment 36

•

11 years ago

Attached patch Patch (obsolete) — Details — Splinter Review

hg export did something strange in the previous attachment

Attachment #796909 - Attachment is obsolete: true

Attachment #796909 - Flags: feedback?(ben.bucksch)

Attachment #796915 - Flags: feedback?

Tony Mechelynck [:tonymec]

Assignee

Updated

•

11 years ago

Attachment #796915 - Flags: feedback? → feedback?(ben.bucksch)

Updated

•

11 years ago

Flags: needinfo?

Comment 37

•

11 years ago

Comment on attachment 796915 [details] [diff] [review]
Patch

Approach looks good to me.

- Consider speed. This must process large texts. I don't know how fast the functions are that you call in comparison to the previous ones.
- I don't know where IsAlpha() comes from, nor why the change from PRUnichar to uint32_t (that doesn't look right to me).

Attachment #796915 - Flags: feedback?(ben.bucksch) → feedback+

Comment 38

•

11 years ago

> - I don't know where IsAlpha() comes from, nor

Ah, from bug 415209

Summary: [mozTXTToHTMLConv] structs with leading/trailing 8bit chars not recognized (structured plain text like */_éfoobar$_/* with special or accented characters at beginning or end of string is not displayed as formatted bold, italics, or underlined) → [mozTXTToHTMLConv] structs with leading/trailing 8bit chars not recognized

Comment 39

•

11 years ago

(In reply to Ben Bucksch (:BenB) from comment #38)
> > - I don't know where IsAlpha() comes from, nor
>
> Ah, from bug 415209

Fixing Ben's accidental truncation of summary from comment 38, sorry for spam.

Longer summary, while not ideal, is required for QA workflow because of design shortcoming of Bugzilla, as explained with several clear reasons in my comment 32. In view of that comment, deliberate truncating of summary without refuting those arguments would be offensive, nonsensical and an open violation of cooperative spirit in Bugzilla, especially if you're not pushing large chunks of QA work as I do. So I assume it would be very unfortunate and I don't want to believe that Ben would deliberately insist on offending my work like that by annihilating my changes to the bug, while - all differences aside - I've actually succeeded to get some traction on a bug which was assigned to him since 2002 and hasn't seen any activity except piling up duplicates since 2008.

Summary: [mozTXTToHTMLConv] structs with leading/trailing 8bit chars not recognized → [mozTXTToHTMLConv] structs with leading/trailing 8bit chars not recognized (structured plain text like */_éfoobar$_/* with special or accented characters at beginning or end of string is not displayed as formatted bold, italics, or underlined)

Magnus Melin [:mkmelin]

Comment 40

•

11 years ago

For testing, see http://mxr.mozilla.org/comm-central/source/mozilla/netwerk/test/unit/test_mozTXTToHTMLConv.js

Assignee: ben.bucksch → smontagu

Assignee

Comment 41

•

11 years ago

(In reply to Ben Bucksch (:BenB) from comment #37)
> Comment on attachment 796915 [details] [diff] [review]
> - Consider speed. This must process large texts. I don't know how
> fast the functions are that you call in comparison to the previous ones.

I believe the functions in nsUnicodeProperties.cpp are well enough optimized for speed that this won't be a problem.

> - I don't know where IsAlpha() comes from, nor
> why the change from PRUnichar to uint32_t (that doesn't look right to me).

Ideally we should support any Unicode character up to U+10FFFF, not just the range up to U+FFFF that can fit in a PRUnichar. However, doing this properly will require disproportionately more work, and I've decided to postpone it to a follow-up bug.
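The size issue mentioned here can be shown with a short Python sketch. It assumes, as this comment implies, that a PRUnichar holds one 16-bit UTF-16 code unit; `utf16_code_units` is a hypothetical helper written for this illustration.

```python
from typing import List


def utf16_code_units(ch: str) -> List[int]:
    """Return the 16-bit UTF-16 code units needed to encode one character."""
    data = ch.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


# A character up to U+FFFF fits in one 16-bit unit.
print([hex(u) for u in utf16_code_units("\u00e9")])
# A supplementary character (above U+FFFF) needs two units, a surrogate pair.
print([hex(u) for u in utf16_code_units("\U00020000")])
```

A 32-bit type such as uint32_t can hold any code point up to U+10FFFF in a single value, which is the motivation given for the type change.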

Comment 42

•

11 years ago

(In reply to Thomas D. from comment #39)
> Fixing Ben's accidental truncation of summary from comment 38, sorry for
> spam.
>
> Longer summary, while not ideal, is required for QA workflow because of
> design shortcoming of Bugzilla, as explained with several clear reasons in
> my comment 32. In view of that comment, deliberate truncating of summary
> without refuting those arguments would be offensive, nonsensical and an open
> violation of cooperative spirit in Bugzilla, especially if you're not
> pushing large chunks of QA work as I do. So I assume it would be very
> unfortunate and I don't want to believe that Ben would deliberately insist
> on offending my work like that by annihilating my changes to the bug, while
> - all differences aside - I've actually succeeded to get some traction on a
> bug which was assigned to him since 2002 and hasn't seen any activity except
> piling up duplicates since 2008.

Don't CC me to any more bugs, Thomas. I can't stand reading your self-important pronouncements.

David Bourguignon

Comment 43

•

11 years ago

@Thomas @Mike By the way, there are users facing this bug everyday... Time for change? :-) Thanks in advance to you guys for your precious help!

Assignee

Comment 44

•

11 years ago

Attached patch Patch v.2 — Details — Splinter Review

I removed the attempt to support supplementary characters and added a unit test. I also tested manually in a thunderbird build and everything seems to be working. Try run: https://tbpl.mozilla.org/?tree=Try&rev=39d247f29958

Attachment #796915 - Attachment is obsolete: true

Attachment #798299 - Flags: review?(ben.bucksch)

Magnus Melin [:mkmelin]

Comment 45

•

11 years ago

Comment on attachment 798299 [details] [diff] [review]
Patch v.2

Review of attachment 798299 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/test/unit/xpcshell.ini
@@ +15,5 @@
> [test_auth_proxy.js]
> [test_authentication.js]
> [test_authpromptwrapper.js]
> [test_backgroundfilesaver.js]
> +[test_bug106028.js]

Please use a more descriptive file name.

Updated

•

11 years ago

Summary: [mozTXTToHTMLConv] structs with leading/trailing 8bit chars not recognized (structured plain text like */_éfoobar$_/* with special or accented characters at beginning or end of string is not displayed as formatted bold, italics, or underlined) → [mozTXTToHTMLConv] structs with leading/trailing international chars not recognized

Tony Mechelynck [:tonymec]

Comment 46

•

11 years ago

I wrote in comment 37:
> - Consider speed. This must process large texts.

I retract that, because I remember that we go through this code only when we found one of these */_ marker characters, so it's not a big deal.

smontagu wrote in comment 41:
> Ideally we should support any Unicode character up to U+10FFFF

I don't think there's a need to it. Can you leave this with PRUnichar? r=BenB with this change.

Remember that we never claimed we'd recognize *everything*. Let's not get overboard with supporting Maya language. (In fact, more important would be to recognize the emphasis in the last sentence, before sentence markers like .,;"' , without triggering any other false positives.)

Code, testcase:
> "\u03C5\u03C0\u03BF\u03B3\u03C1\u03AC\u03BC\u03BC\u03B9\u03C3\u03B7",
> // Greek υπογράμμιση

Minor NIT: If you can add the Greek characters as a comment, can you add them directly to the string literal instead of the escaped \u? That would require that your HTML doc is parsed as UTF8. If that poses any problem or costs significant time, please ignore this comment.

Comment 47

•

11 years ago

(In reply to Ben Bucksch (:BenB) from comment #46)
[...]
> smontagu wrote in comment 41:
> > Ideally we should support any Unicode character up to U+10FFFF
>
> I don't think there's a need to it. Can you leave this with PRUnichar?
> r=BenB with this change.
>
> Remember that we never claimed we'd recognize *everything*. Let's not get
> overboard with supporting Maya language.
[...]

Maybe not Maya, but what about CJK Extensions B, C and D in plane 2 (U+20000 to U+2B81F)? My Chinese friends tell me they use some of them.

Assignee

Comment 48

•

11 years ago

(In reply to Ben Bucksch (:BenB) from comment #46)
> smontagu wrote in comment 41:
> > Ideally we should support any Unicode character up to U+10FFFF
>
> I don't think there's a need to it. Can you leave this with PRUnichar?
> r=BenB with this change.

Well we have to cast to uint32_t at *some* point before passing to mozilla::unicode::GetGenCategory. If it isn't in this patch it will need to be in the patch for bug 415209, so adding needinfo jfkthame.

Flags: needinfo?(jfkthame)

Jonathan Kew [:jfkthame]

Comment 49

•

11 years ago

I think we should bite the bullet (it's not *that* hard!) and handle surrogate pairs properly here. Otherwise we risk passing isolated surrogates to the Unicode character-category functions, which is basically meaningless.
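A minimal sketch of what "handling surrogate pairs properly" could look like, written in Python rather than the C++ of the actual patch. The helper name `code_points` and the choice to map isolated surrogates to U+FFFD are assumptions made for this illustration only.

```python
import unicodedata
from typing import Iterator, List


def code_points(units: List[int]) -> Iterator[int]:
    """Yield code points from UTF-16 code units, combining surrogate pairs.

    Isolated surrogates are yielded as U+FFFD (an assumption of this sketch).
    """
    i = 0
    while i < len(units):
        u = units[i]
        is_high = 0xD800 <= u <= 0xDBFF
        if is_high and i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF:
            # High surrogate followed by low surrogate: one supplementary code point.
            yield 0x10000 + ((u - 0xD800) << 10) + (units[i + 1] - 0xDC00)
            i += 2
        elif 0xD800 <= u <= 0xDFFF:
            # Unpaired surrogate: it has no meaningful character category.
            yield 0xFFFD
            i += 1
        else:
            yield u
            i += 1


combined = next(code_points([0xD840, 0xDC00]))
print(hex(combined), unicodedata.category(chr(combined)))
# Category reported for a lone surrogate code unit, for comparison.
print(unicodedata.category(chr(0xD840)))
```

Only the combined code point carries a letter category; a lone surrogate is classified as a surrogate, which is why passing isolated halves to a category function says nothing about the character.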

Flags: needinfo?(jfkthame)

Wayne Mery (:wsmwk)

Comment 50

•

11 years ago

unless it can be demonstrated that this bug does not cause "(structured plain text like */_éfoobar$_/* with special or accented characters at beginning or end of string is not displayed as formatted bold, italics, or underlined)", then supplementary phrase of this type should remain in the bug summary - for all the reasons that other people have stated. On behalf of intl users, thanks everyone for working on this.

Severity: minor → normal

Summary: [mozTXTToHTMLConv] structs with leading/trailing international chars not recognized → [mozTXTToHTMLConv] structs with leading/trailing international chars not recognized. For example structured plain text */_éfoobar$_/* not displayed as bold, italic, or underline when there are trailing or leading special or accented characters.

David Bourguignon

Comment 51

•

11 years ago

Thanks a lot to you guys for your support. By the way, I recently bumped into a new side effect caused by the / / markup, which could also hide another bug... See the bug 913768 report for more. Thanks in advance for your help!

Assignee

Comment 52

•

11 years ago

On second thoughts, perhaps the test should include "mark" characters as well as "letter" characters, to cover a case like "_souligné_" (using ASCII e and U+0301 COMBINING ACUTE ACCENT rather than U+00E9 LATIN SMALL LETTER E WITH ACUTE)
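The difference between the two spellings of "souligné" can be checked with Python's `unicodedata` module. This is an illustrative sketch; `is_letter_or_mark` is a hypothetical predicate, not a function from the patch.

```python
import unicodedata

precomposed = "soulign\u00e9"        # ends in U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "soulign" + "e\u0301"   # ends in ASCII e + U+0301 COMBINING ACUTE ACCENT

# The last code point is a letter in one spelling and a combining mark in the other.
print(unicodedata.category(precomposed[-1]))
print(unicodedata.category(decomposed[-1]))


def is_letter_or_mark(ch: str) -> bool:
    """Accept Unicode letters (L*) and combining marks (M*)."""
    return unicodedata.category(ch)[0] in ("L", "M")


print(is_letter_or_mark(precomposed[-1]), is_letter_or_mark(decomposed[-1]))
```

A check that accepts only letter categories would reject the decomposed form at its last code point, even though both strings render identically.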

Severity: normal → minor

Comment 53

•

11 years ago

This bug actually applies only when either the first or last character of the string is not a "Western" alphabetical character. Thus, while the problem is seen for the string überhaupt, it is not seen for the string xüberhaupt.

The problem also appears when the first or last character is numeric or a "special" character. Thus, the problem appears with the following character strings:

12345
1b2c3d4e
a1b2c3d4
$1b2c3d4e
a1b2c3d4#

but not with a1b2c3d4e.

It was argued in a comment to bug #949066 that the handling of numeric characters -- not applying the markup -- is not a problem, that it is intentional so as not to affect mathematical equations. That argument is invalid. My degree is in mathematics. There are many equations and formulae that have alphabetic terms without any numeric characters.
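The observations in this comment are consistent with a rule that tests only the first and last character of the struct content with an ASCII-only letter check. The following Python sketch models that reading; it is a simplified assumption, not the actual mozTXTToHTMLConv logic, and both function names are hypothetical.

```python
def is_ascii_alpha(ch: str) -> bool:
    """ASCII-only letter test (hypothetical stand-in for the current check)."""
    return ch.isascii() and ch.isalpha()


def content_accepted(text: str) -> bool:
    """Simplified model: only the first and last characters are inspected."""
    return bool(text) and is_ascii_alpha(text[0]) and is_ascii_alpha(text[-1])


samples = [
    "überhaupt", "xüberhaupt", "12345", "1b2c3d4e",
    "a1b2c3d4", "$1b2c3d4e", "a1b2c3d4#", "a1b2c3d4e",
]
for s in samples:
    print(s, content_accepted(s))
```

In this model the characters between the first and last one are never examined, which matches the report that xüberhaupt and a1b2c3d4e are formatted while the other strings are not.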

Updated

•

11 years ago

Summary: [mozTXTToHTMLConv] structs with leading/trailing international chars not recognized. For example structured plain text */_éfoobar$_/* not displayed as bold, italic, or underline when there are trailing or leading special or accented characters. → [mozTXTToHTMLConv] structs with leading/trailing international or numeric chars not recognized. For example structured plain text */_éfoobar$_/* not displayed as bold, italic, or underline when there are trailing or leading special or accented characters.

Comment 55

•

11 years ago

Attached file Testcase1.eml: Structs vs. Maths (showing that numbers should just be formatted like any other structs) — Details

Per Bug 949066 Comment 1, plain numbers in structs (more precisely, struct text content where 1st character is numeric) are intentionally not formatted because
> math equations would be messed up.

(In reply to David E. Ross from comment #53)
> That argument is invalid.

+1

Testcase1.eml tries this hypothesis, and proves it wrong:

The only valid, numerical struct...

123 *345* 678

...can never be correct mathematical syntax (wrong spacing), so it should just be formatted like any other alphabetical character struct (this bug).

Otoh, correct mathematical syntax will never be formatted as a struct (wrong spacing again, correctly not recognized as a struct):

123 * 345 * 678
123*345*678

That's valid mathematical syntax, but not a valid struct - no problem again.
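The spacing argument can be made concrete with a small regular-expression sketch in Python. The pattern below is a hypothetical model of the delimiter rule as described in this comment (whitespace outside the markers, no whitespace directly inside them); it is not taken from the converter's source.

```python
import re

# Hypothetical rule: the opening '*' is preceded by whitespace or start of text,
# the content begins and ends with a non-space character, and the closing '*'
# is followed by whitespace or end of text.
BOLD_STRUCT = re.compile(r"(?<!\S)\*(\S(?:.*?\S)?)\*(?!\S)")


def find_bold_structs(text: str) -> list:
    """Return the inner text of every '*...*' span that satisfies the rule."""
    return BOLD_STRUCT.findall(text)


for line in ["123 *345* 678", "123 * 345 * 678", "123*345*678"]:
    print(repr(line), find_bold_structs(line))
```

Under this model the two mathematical spellings never match, whatever characters they contain, so a check on the type of the inner characters adds nothing for these particular cases.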

Comment 56

•

11 years ago

(In reply to Thomas D. from comment #55)

I've already shown in attachment 8347713 [details] that "structs vs. maths" is a myth, so e.g. *123* should just be formatted bold the same way we format *foo*. This has major implications on how we fix this bug, so let me add some more arguments before conclusions:

Arguments supporting that numerical structs *123* should be formatted like alphabetical structs *foo*

1) Valid struct syntax like *123* around numbers will always be invalid maths syntax (wrong spacing, see attachment 8347713 [details]), so formatting that is perfectly ok.

2) Valid maths syntax like 123 * 456 * 789 is always invalid struct syntax, so it will not be formatted anyway (because it's not recognized as a struct, regardless of alphabetic or numerical content). So again, no special rules for numbers required.

3) Suppose there would be valid maths syntax that is also valid struct syntax, and we'd actually format the number to be bold or italics - still no big deal: When TB message reader displays a struct like 123 *456* 789 (invalid maths syntax), the struct characters (*,/) are always preserved, so any potential mathematical equation would still be correctly printed, albeit with a little formatting - where's the problem? You can even copy that and paste without formatting, or paste into text editor, or paste into word and just remove formatting. No big deal. But anyway, unless shown otherwise, this is hypothetical and cannot occur.

4) How likely is it for users to send mathematical equations in a plaintext message, given that there are dozens of tools out there to produce nice graphical equations with proper display of fractions etc.? Sorry, but imho the scenario of mathematical equations in plaintext emails belongs to the realm of plaintext myths, of which there are far too many around TB, and mostly told by the very same few plaintext lovers against better evidence. Times have changed. Mathematical equations are much more likely to be found in appropriate file attachments.

5) On the other hand, given 1), there's actually a pretty high chance that numerical structs like *123* are intentionally used as structs by those users who actually use structs. So for that much more plausible scenario, we currently fail (this bug), which is also ux-inconsistent because there's just no convincing reason why number structs should not be formatted.

q.e.d.

Comment 57

•

11 years ago

(In reply to Thomas D. from comment #56)
> Arguments supporting that numerical structs *123* should be formatted like alphabetical structs *foo*

(In reply to David E. Ross from comment #53)
> There are many equations and formulae that have alphabetic terms without any numeric characters.

6) Given that many mathematical equations and formulae use alphabetic terms (even without any numeric characters), current algorithm (if it were applicable to valid maths, see 1) would only exclude a small and rather random subset of mathematical expressions from formatting by structs (why numbers only?). That's also ux-inconsistent. Assuming we'd rather keep structs than cater for *invalid* maths syntax (see 1), again, structs win because excluding only one half (or less) of mathematical expressions would not make much sense.

Comment 58

•

11 years ago

This bug violates ux-consistency as explained in comment 53 to comment 57 for numeric structs like *123*, and by analogy, for structs with leading/trailing international or special characters, too - no reason to exclude such structs from formatting.

Keywords: ux-consistency

Comment 59

•

11 years ago

(In reply to Thomas D. from comment #56)
> I've already shown in attachment 8347713 [details] that "structs vs.
> maths" is a myth, so e.g. *123* should just be formatted bold the same way
> we format *foo*. This has major implications on how we fix this bug

As shown in testcase1.eml, attachment 8347713 [details], comment 56 and comment 57, it's wrong and unnecessary to ignore structs with leading/trailing numeric character (e.g. *123* or *10EUR* or *EUR 10*) and treat them differently from alphabetical structs (e.g. *foo*).

Per this bug, it's also wrong to ignore structs with leading/trailing international/special characters (e.g. *écriture*, *$1 US*, *13 Ιαν*) and treat them differently from simple ASCII alphabetical structs (e.g. *foo*).

*** Conclusion (wrt fixing this bug): ***

For formatted rendering of structs, the type of the leading/trailing character of the inner text is irrelevant. We can just remove the entire special-casing of alphabetical characters vs. numeric characters, and render all structs correctly formatted as they occur, regardless of the character type of the first or last character. (I can't think of any leading/trailing character that should cause the user's struct formatting intention to be ignored, can you?)

So we no longer need complicated functions like isUnicodeAlpha() to verify the nature of leading/trailing characters inside structs; iow, this bug no longer depends on Bug 415209.

As a nice side-effect, removing the special-casing of numeric characters, currently realized as incomplete special-casing of alphabetical characters, will also simplify the code and improve performance. And we'll allow our international users to enjoy Ben's struct algorithms. Looks like a win-win for everyone. :)

No longer depends on: 415209

Whiteboard: See dependencies.

Comment 60

•

11 years ago

I have also determined that at least some special characters (e.g., $, #) appearing first or last in a string cause the markup to be ignored. Thus, the problem is seen with the following despite the fact that the first and last alphanumeric characters are alphabetic and not numeric:

*a1b2c3d4e$*
*a1b2c3d4e#*
*$a1b2c3d4e*
*#a1b2c3d4e*

I decline to test other special characters.

I must concur with the conclusion stated in comment #59, especially the sentence:
> For formatted rendering of structs, the type of the
> leading/trailing character of the inner text is irrelevant.

Assignee

Comment 61

•

11 years ago

(In reply to Thomas D. from comment #59)
> For formatted rendering of structs, the type of the leading/trailing
> character of the inner text is irrelevant. We can just remove the entire
> special-casing of alphabetical characters vs. numeric characters, and render
> all structs correctly formatted as they occur, regardless of the character
> type of the first or last character. (I can't think of any leading/trailing
> character that should cause the user's struct formatting intention to be
> ignored, can you?)
>
> So we no longer need complicated functions like isUnicodeAlpha() to verify
> the nature of leading/trailing characters inside structs; iow, this bug no
> longer depends on Bug 415209.

Wait a minute. This bug has now morphed too far from its original description. Bug 949066 was duped to here, even though its scope was rather different and this bug has now been changed into a dupe of that bug. Better to keep bug 949066 as a separate RFE. If that is fixed, this will become WONTFIX.

Summary: [mozTXTToHTMLConv] structs with leading/trailing international or numeric chars not recognized. For example structured plain text */_éfoobar$_/* not displayed as bold, italic, or underline when there are trailing or leading special or accented characters. → [mozTXTToHTMLConv] structs with leading/trailing international chars not recognized. For example structured plain text */_éfoobar$_/* not displayed as bold, italic, or underline when there are trailing or leading special or accented characters.

Assignee

Updated

•

11 years ago

Depends on: 949066

Comment 62

•

11 years ago

Thomas D. in comment 59:
> No longer depends on: 415209
> Whiteboard: See dependencies.

Don't mess with the dependencies that I added. This is an actual, hard code-level dependency and the whole reason why this bug here exists.

Depends on: 415209
No longer depends on: 949066

Updated

•

11 years ago

Whiteboard: See dependency bug 415209

Comment 63

•

11 years ago

To translate: I would *like* to have any accented or non-English alphabetic characters recognized. I *do not* want numbers or other special symbols after * to be recognized as structs. The risk of false positives is too high, e.g. in ASCII art or other strings of special characters.
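The policy stated in this comment could be sketched as follows in Python, assuming the check still looks only at the first and last character of the struct content and that "alphabetic" means a Unicode Letter category. Both function names are hypothetical and the sketch is not the patch under review.

```python
import unicodedata


def is_unicode_letter(ch: str) -> bool:
    """True for any code point whose general category is a Letter (L*)."""
    return unicodedata.category(ch).startswith("L")


def content_accepted(text: str) -> bool:
    """Accept content that starts and ends with a letter from any script."""
    return bool(text) and is_unicode_letter(text[0]) and is_unicode_letter(text[-1])


# International letters pass; leading digits and symbols are still rejected.
for s in ["foo", "écriture", "υπογράμμιση", "123", "$1 US", "13 Ιαν"]:
    print(s, content_accepted(s))
```

This keeps the conservative stance toward digits and symbols while removing the dependence on ASCII, which is the distinction this comment draws.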

Comment 64

•

11 years ago

xref bug 950605 for numbers/digits.

Updated

•

11 years ago

Depends on: 950606

Comment 65

•

11 years ago

To everyone affected by this bug 106028: Please have a look at xref bug 950606.

Ben has just filed xref bug 950606 in which he seeks to ignore even more of what he considers "special characters" in the structs parsing algorithm. Even everyday punctuation like dots, commas, round brackets or $ signs will no longer be possible inside structs. So if that bug succeeds, we'll see even more struct variants fail or continue to fail as they do now:

Ben's Bug 950606 wants structs like the following to *fail* (*not* get formatted *bold* in msg viewer):

a) *(wth!)*
leading and trailing brackets ("special chars")

b) *I really hate inconsistent design!*
trailing exclamation mark ("special char")

c) *Good design needs user input, design principles, good reasons and cooperation!*
More than 4 words (Ben has determined that structs should not have more than "up to 3 or 4 words", see bug 950606, comment 0); inner comma, and exclamation mark ("punctuation")

d) M$ really make *XXL$$$*
trailing $ sign ("special char")

e) *$ 20*
leading $ sign ("special char")

f) *(!foo.bar)*
leading round bracket ("special char")

g) *I lv u 4ever. Really!*
inner full stop; trailing exclamation mark ("punctuation")

Fwiw, I think all of these should be recognized and rendered with formatting just like any other struct. That's what I proposed in comment 59. I am interested what others think.

Comment 66

•

11 years ago

(In reply to Ben Bucksch (:BenB) from comment #62)
> Thomas D. in comment 59:
> > No longer depends on: 415209
> > Whiteboard: See dependencies.
>
> Don't mess with the dependencies that I added.

Ben, I never "mess with the dependencies". Pls refrain from such personally abusive language which deliberately tries to discredit my painstaking work of bug triaging. After more than 6 years of being an active contributor and doing bug triage on thousands of bugs (see my BMO profile and activity log), I definitely know what I'm doing, and strangely there have almost never been any such problems with my 12700+ activities on TB bugs except on what you consider "your" bugs. Pls be advised that such aggressive comments violate not only basic standards of mutual respect and cooperation among fellow contributors, but they can easily be read as a personal attack against me, in violation of the rules here on bmo:

https://bugzilla.mozilla.org/page.cgi?id=etiquette.html
> 3. No abusing people.

On a more factual level, it's inappropriate to complain about "messing with dependencies" while I've provided extensive reason for this particular *change* in dependencies (and it's easy to undo, so where's the problem?). Instead of railing at your fellow contributors, what about *answering my comments in detail*, starting from my comment 55 and testcase attachment 8347713 [details]?

> This is an actual, hard
> code-level dependency and the whole reason why this bug here exists.

Fwiw, that's just not true like that, and I think you know that (if not, please re-read my comment 59). It all depends on the solution which the TB community(!) prefers for this bug:

- For /your/ personally preferred solution, yes, bug 415209 is required to create more special cases of accepting international characters in structs and continue to fail for structs containing other characters that /you/ consider "special" characters, like dots(.), commas(,), $-signs and even round brackets (()).

- For the more comprehensive solution which /I/ proposed in comment 59 for solving this bug and other bugs and return ux-consistency to structs parsing, bug 415209 is *not* required, that's why I correctly removed the dependency.

So it's just that we are heading in different directions: Ben wants to add *even more* special cases, including more special cases which will be *ignored* by structs parsing (see xref bug 950606), while users and I are requesting the exact opposite: There's more than enough evidence on this bug and its 18 duplicates (including bug 949066 which unfortunately got moved out from here) that users are not happy with the current special-casing in structs algorithm due to its failure in terms of ux-consistency. (I encourage such users to vote for this bug and bug 949066).

The solution I proposed in comment 59 tries to address these real-life problems faced by users. Afaics it's also in line with advice from the UX lead, Blake Winton, who just commented against special-casing on bug 949066:

(In reply to Blake Winton (:bwinton) from bug 949066, comment #15)
> Also, I strongly suspect that the "maths" argument is already kind of messed
> up due to things like "5*a*b", and perhaps we want to figure out something
> better to do there, instead of adding special cases to the struct parsing.

Comment 67

•

11 years ago

(In reply to Ben Bucksch (:BenB) from comment #63)
> To translate: I would *like* to have any accented or non-English alphabetic
> characters to be recognized.
> I *do not* want numbers or other special symbols after * to be recognized as
> structs. There is a too high risk of false positives, e.g. in ASCII art or
> other strings of special characters.

So that's what /Ben/ wants and claims. I've provided detailed arguments why the use case of ASCII art is another non-argument, in Bug 949066 Comment 12. Swap "maths" for "ASCII art" and imo :bwinton's bug 949066, comment #15 applies seamlessly, again:

(In reply to Blake Winton (:bwinton) from bug 949066, comment #15)
> Also, I strongly suspect that the "maths" [and "ASCII art", T.D.] argument is already kind of messed
> up due to things like "5*a*b", and perhaps we want to figure out something
> better to do there, instead of adding special cases to the struct parsing.

In short, because ASCII art can include any alphabetical character (not just "special" characters), in theory it would/could already fail now for those default structs like *foo* which we render formatted.

Comment 68

•

11 years ago

Please do NOT add me to the CC list of any bug report. I have a list of bugs that I am tracking and a query that uses that list.

Comment 69

•

10 years ago

I would like to propose the following solution. Start applying the markup when there is a blank before the *, /, or _ but not after; and end applying the markup when there is a blank after the character but not before.

This would mean a*b*c or a * b * c as mathematical expressions would be preserved with the asterisks visible and the expressions not bold. On the other hand *a*b*c* would have the internal asterisks visible but the expression bold.

This would mean 3/4 would be visible as a fraction not italic, but /3/4/ would be visible as a fraction that is italic.

This would mean show_bug (from the URI of this bug report) would have the underline visible as part of the path (not underlined), but _show_bug_ would have the internal underline visible with the phrase underlined. Depending upon the user's fonts and monitor settings, the internal underline might be hidden by the overall underlining, but that is no different from how a Web page might appear with <span style="text-decoration:underline">show_bug</span>. While underlining is generally discouraged in Internet communications for this reason, this is a problem for the user and not relevant to fixing this bug report.
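
For illustration, the rule proposed in this comment could be sketched roughly as follows. This is a minimal Python sketch, not the mozTXTToHTMLConv code: the function names (`apply_structs`, `_is_blank`, `_find_closer`) and the tag mapping are hypothetical, and it makes several assumptions the comment does not state. The start and end of the text count as a blank, the first valid closing delimiter wins, the outer delimiters are dropped from the output, and nested structs are not handled.

```python
import html

# Hypothetical mapping from struct delimiter to HTML tag (an assumption).
TAGS = {"*": "b", "/": "i", "_": "u"}


def _is_blank(text, i):
    """True if position i is whitespace; positions outside the text
    count as blank (an assumption, the comment does not say)."""
    return i < 0 or i >= len(text) or text[i].isspace()


def _find_closer(text, start, delim):
    """Return the index of the first `delim` after `start` that has a
    non-blank before it and a blank after it, or -1 if there is none."""
    for j in range(start + 2, len(text)):
        if (text[j] == delim
                and not _is_blank(text, j - 1)
                and _is_blank(text, j + 1)):
            return j
    return -1


def apply_structs(text):
    """Wrap *bold*, /italic/ and _underline_ spans in HTML tags using
    only the blank-before / blank-after rule from the comment."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        # Opening delimiter: blank before it, non-blank after it.
        if ch in TAGS and _is_blank(text, i - 1) and not _is_blank(text, i + 1):
            j = _find_closer(text, i, ch)
            if j != -1:
                tag = TAGS[ch]
                inner = html.escape(text[i + 1:j])
                out.append(f"<{tag}>{inner}</{tag}>")
                i = j + 1
                continue
        out.append(html.escape(ch))
        i += 1
    return "".join(out)


for sample in ["a*b*c", "a * b * c", "*a*b*c*", "3/4", "/3/4/",
               "show_bug", "_show_bug_"]:
    print(f"{sample!r} -> {apply_structs(sample)!r}")
```

Because the rule only looks at blanks around the delimiter and never classifies the characters inside the struct, a word such as *Küstner* would be handled the same way as *foo* in this sketch. It does not try to address the false-positive concerns raised earlier in this bug.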