Closed Bug 62189 Opened 21 years ago Closed 21 years ago
Clean up plain text converter
Known "issues": We set mCacheLine everywhere. Set it at one place or maybe remove it completely. (Do we need the code that handles mCacheLine == false or can the normal code be used after some tweak? Would reduce code size.) What's the relationship between mPreformatted and the preformatted flag in the flag field? Only use mPreformatted instead of looking at both locations. Fix comment "wcwidth() and wcswidth()" and maybe rename one of the methods. mColPos is used as a bool (0 or not 0). Do it a bool. Move the wrap code to a routine of it's own.
And remove the old converter nsHTMLToTxtSinkStream if it's not used anymore.
The evolution of the HTML->TXT converter has been quite organic and some things in it make it really hard to understand. There is also things that are done differently at different places in the code and things that are done unnessicarily and too complicated. That makes the code hard to maintain and makes it harder for new people to understand it. I've made a pass over it last week and would like a review now. It's mostly name changes and very little real programmatic changes to make the patch easier to review. There shouldn't be any deep analysis necessary. So what have I done: * Convert all runtime conversions to unicode SPACE with a constant. It can't be made const static because that causes problems with OpenBSD and module unloading. (performance, clarity) * Only handle SELECT once in DoAddLeaf (performance, clarity) * Only create a linewrapper if we are going to do line wraps (performance) * Only get data from preferences if we are going to use it. (performance) * Reformulated some comments, and added function comments. (clarity) * Changed test for -1 to test for kNotFound. (clarity) * Renamed the unichar width functions to GetUnicharWidth and GetUnicharStringWidth. (clarity) * Removed the mCacheLine member variable and all code that was used when mCacheLine was false. The code that is used when mCacheLine is true handles that data too now. (code size, clarity, maintainability) * Renamed |WriteSimple| |Output| which is a descriptive name. (clarity) * Renamed |WriteQuotesAndIndent| |OutputQuotesAndIndent| which is a more descriptive name. (clarity) * Made the two almost identical code paths in EndLine one codepath. (code size, maintainability) * Added check for preformatted text before stripping spaces in EndLine (correctness) * Made two small functions inlined. One because it's used so much and the other because it is so small. * Unified use of string functions by replacing explicit conversions with NS_LITERAL_STRING * Renamed arguments to functions so that they starts with an 'a'. * mColPos was used as a bool. Made it a bool with the name mAtFirstColumn.
Status: NEW → ASSIGNED
Sorry for the delay. Here's the patch, Anthony. I hope you can look at it soon, because it will make yours and everyone else's future contacts with nsPlainTextSerializer so much less painful. For the content of the patch see my comment of 2000-12-18 10:59.
adding jfrancis to the cc list in the off-chance this would effect his code, but it looks good to me, and it runs well, so... sr=anthonyd
For the linebreak assignment code in the Init method (and in a few other places): its mildly cheaper to do a PRUnichar() cast of single characters instead of doing a NS_LITERAL_STRING wrapping when doing Assigns and Appends to strings. Other than that nit, sr=vidur.
This is checked in with single chars using a PRUnichar instead of a literal string as mentioned by vidur.
Status: ASSIGNED → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED
Bratell, please verify and mark VERIFIED-FIXED...thanks!
Status: RESOLVED → VERIFIED
The code is in. There hasn't been any reports of regressions, but since the bug is already VERIFIED, I won't do anything more.
You need to log in before you can comment on or make changes to this bug.