Closed Bug 300313 Opened 20 years ago Closed 19 years ago

Line/word breaking service should be deCOMtaminated and simplified

Categories

(Core :: Internationalization, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: roc, Assigned: pythonesque+bugzilla)

Details

(Whiteboard: [good first bug])

Attachments

(1 file, 6 obsolete files)

nsILineBreakerFactory and nsIWordBreakerFactory are XPCOM services that let callers obtain an nsILineBreaker/nsIWordBreaker. There are some problems with this. 1) nsIBreakState, nsILinearIterator and nsIBinarySearchIterator are public interfaces that are never really used. They should be removed. 2) The factories offer GetBreaker methods that take a string parameter. All callers pass the empty string as the parameter. This parameter should be removed and the code should be simplified to handle only the empty string case. 3) nsDocument keeps a pointer to the line breaker and word breaker for the document. There's only one implementation of each interface so this doesn't need to be per-document. Also, no-one ever calls SetLineBreaker/SetWordBreaker. So remove all of Get/SetLine/WordBreaker and have the callers to GetLine/WordBreaker just get the breaker directly. 4) Having two factories to return two objects is a waste. Creating new objects for each GetBreaker call is a waste also --- the objects are stateless. It could all be reduced to at most two services, one implementing nsIWordBreaker and one implementing nsILineBreaker. nsIWordBreakerFactory and nsILineBreakerFactory can be removed. 5) They're mostly used from things that link into gklayout. nsContentUtils can cache these services the way it caches other services, and gklayout users (content and layout) can get the breakers that way. 6) While we're at it, the nsIWordBreaker and nsILineBreaker interfaces could be deCOMtaminated, primarily to return results as real results instead of via out pointers.
Whiteboard: [good first bug]
Attached patch Rough fix for 1-5 (obsolete) — Splinter Review
An EXTREMELY rough draftish fix that is in serious need of review. I'm still fairly new to the Mozilla codebase, so it undoubtedly contains errors (probably leaks some memory, too). I also wasn't able to use nsContentUtils in the editor because it wasn't compiling for some reason (please let me know how to fix this, if it's possible). And I haven't even begun trying to deCOMtaminate, though I'm curious as to how to do it (there are actually multiple 'return' values I would have to deal with, for one thing). So, once again, I urge someone to review it with all deliberate speed.
Attachment #192525 - Flags: review?(smontagu)
I also just noticed that I made at least one error that the compiler should have caught in TestLineBreak.cpp, but I don't build with tests enabled because I'm using the free MSVC compiler. So if there are any other errors in the test (or in extensions like Spellcheck), please alert me.
I also just noticed that I made at least one error that the compiler should have caught in TestLineBreak.cpp, but I don't build with tests enabled because I'm using the free MSVC compiler. So if there are any other errors in the test (or in extensions like Spellcheck), please alert me.
Attached patch Cleaner Fix for 1-5 (obsolete) — Splinter Review
Still not deCOMtaminated, but I removed some of the more glaring allocation errors. There are likely still other errors; please tell me if there are. (Sorry for the double-post).
Attachment #192525 - Attachment is obsolete: true
Attachment #192618 - Flags: review?(smontagu)
Attachment #192525 - Flags: review?(smontagu) → review?
Attachment #192525 - Flags: review?
Comment on attachment 192618 [details] [diff] [review] Cleaner Fix for 1-5 Thanks for taking this on. I'll be your reviewer.
Attachment #192618 - Flags: superreview?(roc)
Attachment #192618 - Flags: review?(smontagu)
Attachment #192618 - Flags: review?(roc)
> I don't build with tests enabled because I'm using the free MSVC compiler. Why can't the MSVC compiler build tests? intl/lwbrk/tests/TestLineBreak.cpp +#include "nsContentUtils.h" You can't do this. nsContentUtils can only be used by things that link into gklayout and intl doesn't. Do you actually need it though? + nsCOMPtr<nsILineBreaker> mLineBreaker = nsContentUtils::GetLineBreaker(); Our convention is that m... names refer to field names, so could you please rename the variable? + if (mLineBreaker == nsnull) return NS_ERROR_FAILURE; We usually just write (!mLineBreaker) editor/libeditor/text/nsInternetCiter.cpp + nsILineBreaker *lineBreaker; You want to just use nsCOMPtr and do_GetService here. Ditto in editor/libeditor/text/nsWrapUtils.cpp. extensions/spellcheck/src/nsSpellCheckUtils.cpp +#include "nsContentUtils.h" You can't do this here either, extensions don't link into gklayout. Ditto in extensions/spellcheck/src/nsSpellCheckController.cpp
It looks really great, by the way. lovely cleanup.
Attached patch Clean COMtaminated Fix (obsolete) — Splinter Review
I incorporated the changes you requested. Hopefully, this one should work. About building tests... I'm not using the normal MSVC compiler, but the free one, as per the instructions on Waldo's website. Last time I tried building with tests enabled, it failed. Maybe that's no longer the case... I'll try it some time. DeCOMtamination is still necessary, I know. I'm also fairly sure that most of the functions generally modify several values it is, and the nsresult is actually used in some cases. Should I just return the most obviously pertinent variable and keep the nsresult as a reference? Thanks for the help.
Attachment #192714 - Flags: superreview?(roc)
Attachment #192714 - Flags: review?(roc)
Attachment #192618 - Attachment is obsolete: true
Attachment #192618 - Flags: superreview?(roc)
Attachment #192618 - Flags: review?(roc)
(In reply to comment #8) > DeCOMtamination is still necessary, I know. I'm also fairly sure that most of > the functions generally modify several values it is, and the nsresult is > actually used in some cases. Should I just return the most obviously pertinent > variable and keep the nsresult as a reference? Thanks for the help. I don't anything that really needs to return a failure code. nsJISx4051LineBreaker::BreakInBetween() should return oCanBreak. nsJISx4051LineBreaker::Next() and nsJISx4051LineBreaker::Prev() can return oNext/oPrev as signed PRInt32s and return -1 when oNeedMoreText would be true. nsSampleWordBreaker::NextWord() and nsSampleWordBreaker::PrevWord() are similar. nsSampleWordBreaker::FindWord() could return a custom struct with two PRUint32s in it.
Before further work is done here, I have to point out that some points raised in comment #0 need to be reexamined. Currently, there's only one implementation of linebreaker, but afaict, a factory mechanism was put in place because it's planned (although no additional implementation is in sight at the moment) to add more (language-dependent) implementations down the road. See bug 203016.
I still think most of the points were valid, though. Why not just use a similar solution to the one that's already being used with charsets and have a type modifier for the services (e.g. @mozilla.org/intl/lbrk;1?type=JISx4051)? That way we could still eliminate the factories, because the objects created by the factories really are fairly useless without states. I could just be misinterpreting the way services work, but this does seem logical to me...
Jungshik, I think any new linebreaking functionality should not be added as a new service. Instead we should change the API so we can pass in a language atom to each call, and add the new functionality directly to the existing service. That way we can use the "lang" attribute to influence breaking. The only reason to use a service is if we want to be able to dynamically load new line/word breakers and/or have a frozen interface so that third parties can provide their own line/word breaking components. I don't think either of those are very important goals.
Attached patch DeCOMtaminated Fix (obsolete) — Splinter Review
It took awhile, but I eventually managed to get the word/line breaker services deCOMtaminated. If I made any mistakes or didn't optimize properly, I apologize and will try to rectify the situation. Otherwise, methinks this patch is complete (I ran the intl smoketests and verified that they were, in fact, working, to the best of my ability). There do seem to be some stability issues, but I'd put that down more to the current state of the branch than to this patch.
Attachment #192714 - Attachment is obsolete: true
Attachment #192857 - Flags: superreview?(roc)
Attachment #192857 - Flags: review?(roc)
Attachment #192714 - Flags: superreview?(roc)
Attachment #192714 - Flags: review?(roc)
Comment on attachment 192857 [details] [diff] [review] DeCOMtaminated Fix This is really good. Just a couple more things to fix. +} wb_pair; This name isn't consistent with our naming conventions. I suggest nsWordRange. Don't return -2 for NS_ERROR_NULL_POINTER, it's not worth it since that's only a programmer error. Just NS_ASSERTION() it and move on. Create #define NS_BREAKER_NEED_MORE_TEXT -1 and use it.
Attached patch Final DeCOMtaminated Fix (obsolete) — Splinter Review
Incorporating the suggestions you made. This should be the final version.
Attachment #192857 - Attachment is obsolete: true
Attachment #192873 - Flags: superreview?(roc)
Attachment #192873 - Flags: review?(roc)
Attachment #192857 - Flags: superreview?(roc)
Attachment #192857 - Flags: review?(roc)
Nearly there! Thanks for bearing with me. +#ifndef NS_BREAKER_NEED_MORE_TEXT +#define NS_BREAKER_NEED_MORE_TEXT -1 +#endif Don't do this, just #define them. We already have protection against including this file twice. +typedef struct { + PRUint32 begin; + PRUint32 end; +} nsWordRange; mBegin, mEnd + if(aPos > aLen) + return NS_BREAKER_ILLEGAL_VALUE; // NS_ERROR_ILLEGAL_VALUE I think this should just be an assertion, and we should eliminate NS_BREAKER_ILLEGAL_VALUE. It should be easy for callers to ensure they never call with aPos > aLen.
> Don't do this, just #define them. We already have protection against including > this file twice. But it's defined in both nsIWordBreaker.h and nsILineBreaker.h; otherwise I wouldn't have done it. Anyway, hopefully I'll be done with the (final?) changes soon.
Attachment #192873 - Attachment is obsolete: true
Attachment #192873 - Flags: superreview?(roc)
Attachment #192873 - Flags: review?(roc)
(In reply to comment #17) > > Don't do this, just #define them. We already have protection against including > > this file twice. > But it's defined in both nsIWordBreaker.h and nsILineBreaker.h; otherwise I > wouldn't have done it. Alright, then #define NS_LINE_BREAKER_NEED_MORE_TEXT *and* NS_WORD_BREAKER_NEED_MORE_TEXT, one in each file.
Wary though I am at this point of actually declaring this version 'final', I really hope it is.
Attachment #193404 - Flags: superreview?(roc)
Attachment #193404 - Flags: review?(roc)
Comment on attachment 193404 [details] [diff] [review] Final Fix (Really) Good as gold! I'll land this for you.
Attachment #193404 - Flags: superreview?(roc)
Attachment #193404 - Flags: superreview+
Attachment #193404 - Flags: review?(roc)
Attachment #193404 - Flags: review+
I had to fix some compilation errors in TestLineBreaker, and it doesn't pass the tests, but I checked that the existing code doesn't pass either so the tests are probably just rotten. We do get basically the same results after this patch as we did before. Checked in. Thanks very much!
You're welcome! Also, just to see if I was right about building tests with my compiler, I tried it. It failed in nsprpub, so I think the answer is 'no'. Nonetheless, thanks for correcting the errors, and thanks for checking it in!
This reduced Tp on btek and luna by about 5ms (a 0.5% reduction in average page load time). That's great work!
Attached patch Fix Bustage on Linux (obsolete) — Splinter Review
Yeah, thanks, Roc. Unfortunately, it's all gone to waste on Linux; I can verify that the line that it's busting on is a result of my patch. This patch should (hopefully) fix the damage (stupid me for not building a debug build, which also doesn't work with my compiler).
Attachment #193413 - Flags: superreview?(roc)
Attachment #193413 - Flags: review?(roc)
Comment on attachment 193413 [details] [diff] [review] Fix Bustage on Linux Never mind, I see it's already done.
Attachment #193413 - Attachment is obsolete: true
Attachment #193413 - Flags: superreview?(roc)
Attachment #193413 - Flags: review?(roc)
Shouldn't this be marked as resolved fixed?
Assignee: smontagu → pythonesque+bugzilla
And now it is so marked.
Status: NEW → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: