Closed Bug 300313 Opened 19 years ago Closed 19 years ago

Line/word breaking service should be deCOMtaminated and simplified

Categories

(Core :: Internationalization, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: roc, Assigned: pythonesque+bugzilla)

Details

(Whiteboard: [good first bug])

Attachments

(1 file, 6 obsolete files)

nsILineBreakerFactory and nsIWordBreakerFactory are XPCOM services that let
callers obtain an nsILineBreaker/nsIWordBreaker. There are some problems with this.

1) nsIBreakState, nsILinearIterator and nsIBinarySearchIterator are public
interfaces that are never really used. They should be removed.
2) The factories offer GetBreaker methods that take a string parameter. All
callers pass the empty string as the parameter. This parameter should be removed
and the code should be simplified to handle only the empty string case.
3) nsDocument keeps a pointer to the line breaker and word breaker for the
document. There's only one implementation of each interface so this doesn't need
to be per-document. Also, no-one ever calls SetLineBreaker/SetWordBreaker. So
remove all of Get/SetLine/WordBreaker and have the callers to
GetLine/WordBreaker just get the breaker directly.
4) Having two factories to return two objects is a waste. Creating new objects
for each GetBreaker call is a waste also --- the objects are stateless. It could
all be reduced to at most two services, one implementing nsIWordBreaker and one
implementing nsILineBreaker. nsIWordBreakerFactory and nsILineBreakerFactory can
be removed.
5) They're mostly used from things that link into gklayout. nsContentUtils can
cache these services the way it caches other services, and gklayout users
(content and layout) can get the breakers that way.
6) While we're at it, the nsIWordBreaker and nsILineBreaker interfaces could be
deCOMtaminated, primarily to return results as real results instead of via out
pointers.
Whiteboard: [good first bug]
Attached patch Rough fix for 1-5 (obsolete) — Splinter Review
An EXTREMELY rough draftish fix that is in serious need of review.  I'm still
fairly new to the Mozilla codebase, so it undoubtedly contains errors (probably
leaks some memory, too).  I also wasn't able to use nsContentUtils in the
editor because it wasn't compiling for some reason (please let me know how to
fix this, if it's possible).  And I haven't even begun trying to deCOMtaminate,
though I'm curious as to how to do it (there are actually multiple 'return'
values I would have to deal with, for one thing).

So, once again, I urge someone to review it with all deliberate speed.
Attachment #192525 - Flags: review?(smontagu)
I also just noticed that I made at least one error that the compiler should have
caught in TestLineBreak.cpp, but I don't build with tests enabled because I'm
using the free MSVC compiler.  So if there are any other errors in the test (or
in extensions like Spellcheck), please alert me.
I also just noticed that I made at least one error that the compiler should have
caught in TestLineBreak.cpp, but I don't build with tests enabled because I'm
using the free MSVC compiler.  So if there are any other errors in the test (or
in extensions like Spellcheck), please alert me.
Attached patch Cleaner Fix for 1-5 (obsolete) — Splinter Review
Still not deCOMtaminated, but I removed some of the more glaring allocation
errors.  There are likely still other errors; please tell me if there are.

(Sorry for the double-post).
Attachment #192525 - Attachment is obsolete: true
Attachment #192618 - Flags: review?(smontagu)
Attachment #192525 - Flags: review?(smontagu) → review?
Attachment #192525 - Flags: review?
Comment on attachment 192618 [details] [diff] [review]
Cleaner Fix for 1-5

Thanks for taking this on. I'll be your reviewer.
Attachment #192618 - Flags: superreview?(roc)
Attachment #192618 - Flags: review?(smontagu)
Attachment #192618 - Flags: review?(roc)
> I don't build with tests enabled because I'm using the free MSVC compiler.

Why can't the MSVC compiler build tests?

intl/lwbrk/tests/TestLineBreak.cpp
+#include "nsContentUtils.h"

You can't do this. nsContentUtils can only be used by things that link into
gklayout and intl doesn't. Do you actually need it though?

+        nsCOMPtr<nsILineBreaker> mLineBreaker = nsContentUtils::GetLineBreaker();

Our convention is that m... names refer to field names, so could you please
rename the variable?

+    if (mLineBreaker == nsnull) return NS_ERROR_FAILURE;

We usually just write (!mLineBreaker)

editor/libeditor/text/nsInternetCiter.cpp
+  nsILineBreaker *lineBreaker;

You want to just use nsCOMPtr and do_GetService here. Ditto in
editor/libeditor/text/nsWrapUtils.cpp.

extensions/spellcheck/src/nsSpellCheckUtils.cpp
+#include "nsContentUtils.h"

You can't do this here either, extensions don't link into gklayout. Ditto in
extensions/spellcheck/src/nsSpellCheckController.cpp
It looks really great, by the way. lovely cleanup.
Attached patch Clean COMtaminated Fix (obsolete) — Splinter Review
I incorporated the changes you requested.  Hopefully, this one should work.

About building tests... I'm not using the normal MSVC compiler, but the free
one, as per the instructions on Waldo's website.  Last time I tried building
with tests enabled, it failed.	Maybe that's no longer the case... I'll try it
some time.

DeCOMtamination is still necessary, I know.  I'm also fairly sure that most of
the functions generally modify several values it is, and the nsresult is
actually used in some cases.  Should I just return the most obviously pertinent
variable and keep the nsresult as a reference?	Thanks for the help.
Attachment #192714 - Flags: superreview?(roc)
Attachment #192714 - Flags: review?(roc)
Attachment #192618 - Attachment is obsolete: true
Attachment #192618 - Flags: superreview?(roc)
Attachment #192618 - Flags: review?(roc)
(In reply to comment #8)
> DeCOMtamination is still necessary, I know.  I'm also fairly sure that most of
> the functions generally modify several values it is, and the nsresult is
> actually used in some cases.  Should I just return the most obviously pertinent
> variable and keep the nsresult as a reference?	Thanks for the help.

I don't anything that really needs to return a failure code.

nsJISx4051LineBreaker::BreakInBetween() should return oCanBreak.
nsJISx4051LineBreaker::Next() and nsJISx4051LineBreaker::Prev() can return
oNext/oPrev as signed PRInt32s and return -1 when oNeedMoreText would be true.
nsSampleWordBreaker::NextWord() and nsSampleWordBreaker::PrevWord() are similar.
nsSampleWordBreaker::FindWord() could return a custom struct with two PRUint32s
in it.
Before further work is done here, I have to point out that some points raised in
comment #0 need to be reexamined. Currently, there's only one implementation of
linebreaker, but afaict, a factory mechanism was put in place because it's
planned (although no additional implementation is in sight at the moment) to add
more (language-dependent) implementations down the road. See bug 203016.


I still think most of the points were valid, though.  Why not just use a similar
solution to the one that's already being used with charsets and have a type
modifier for the services (e.g. @mozilla.org/intl/lbrk;1?type=JISx4051)?  That
way we could still eliminate the factories, because the objects created by the
factories really are fairly useless without states.

I could just be misinterpreting the way services work, but this does seem
logical to me...
Jungshik, I think any new linebreaking functionality should not be added as a
new service. Instead we should change the API so we can pass in a language atom
to each call, and add the new functionality directly to the existing service.
That way we can use the "lang" attribute to influence breaking.

The only reason to use a service is if we want to be able to dynamically load
new line/word breakers and/or have a frozen interface so that third parties can
provide their own line/word breaking components. I don't think either of those
are very important goals.
Attached patch DeCOMtaminated Fix (obsolete) — Splinter Review
It took awhile, but I eventually managed to get the word/line breaker services
deCOMtaminated.  If I made any mistakes or didn't optimize properly, I
apologize and will try to rectify the situation.  Otherwise, methinks this
patch is complete (I ran the intl smoketests and verified that they were, in
fact, working, to the best of my ability).

There do seem to be some stability issues, but I'd put that down more to the
current state of the branch than to this patch.
Attachment #192714 - Attachment is obsolete: true
Attachment #192857 - Flags: superreview?(roc)
Attachment #192857 - Flags: review?(roc)
Attachment #192714 - Flags: superreview?(roc)
Attachment #192714 - Flags: review?(roc)
Comment on attachment 192857 [details] [diff] [review]
DeCOMtaminated Fix

This is really good. Just a couple more things to fix.

+} wb_pair;

This name isn't consistent with our naming conventions. I suggest nsWordRange.

Don't return -2 for NS_ERROR_NULL_POINTER, it's not worth it since that's only
a programmer error. Just NS_ASSERTION() it and move on.

Create
#define NS_BREAKER_NEED_MORE_TEXT -1
and use it.
Attached patch Final DeCOMtaminated Fix (obsolete) — Splinter Review
Incorporating the suggestions you made.  This should be the final version.
Attachment #192857 - Attachment is obsolete: true
Attachment #192873 - Flags: superreview?(roc)
Attachment #192873 - Flags: review?(roc)
Attachment #192857 - Flags: superreview?(roc)
Attachment #192857 - Flags: review?(roc)
Nearly there! Thanks for bearing with me.

+#ifndef NS_BREAKER_NEED_MORE_TEXT
+#define NS_BREAKER_NEED_MORE_TEXT -1
+#endif

Don't do this, just #define them. We already have protection against including
this file twice.

+typedef struct {
+  PRUint32 begin;
+  PRUint32 end;
+} nsWordRange;

mBegin, mEnd

+  if(aPos > aLen)
+    return NS_BREAKER_ILLEGAL_VALUE; // NS_ERROR_ILLEGAL_VALUE

I think this should just be an assertion, and we should eliminate
NS_BREAKER_ILLEGAL_VALUE. It should be easy for callers to ensure they never
call with aPos > aLen.
> Don't do this, just #define them. We already have protection against including
> this file twice.
But it's defined in both nsIWordBreaker.h and nsILineBreaker.h; otherwise I
wouldn't have done it.

Anyway, hopefully I'll be done with the (final?) changes soon.
Attachment #192873 - Attachment is obsolete: true
Attachment #192873 - Flags: superreview?(roc)
Attachment #192873 - Flags: review?(roc)
(In reply to comment #17)
> > Don't do this, just #define them. We already have protection against including
> > this file twice.
> But it's defined in both nsIWordBreaker.h and nsILineBreaker.h; otherwise I
> wouldn't have done it.

Alright, then #define NS_LINE_BREAKER_NEED_MORE_TEXT *and*
NS_WORD_BREAKER_NEED_MORE_TEXT, one in each file.
Wary though I am at this point of actually declaring this version 'final', I
really hope it is.
Attachment #193404 - Flags: superreview?(roc)
Attachment #193404 - Flags: review?(roc)
Comment on attachment 193404 [details] [diff] [review]
Final Fix (Really)

Good as gold! I'll land this for you.
Attachment #193404 - Flags: superreview?(roc)
Attachment #193404 - Flags: superreview+
Attachment #193404 - Flags: review?(roc)
Attachment #193404 - Flags: review+
I had to fix some compilation errors in TestLineBreaker, and it doesn't pass the
tests, but I checked that the existing code doesn't pass either so the tests are
probably just rotten. We do get basically the same results after this patch as
we did before.

Checked in. Thanks very much!
You're welcome!

Also, just to see if I was right about building tests with my compiler, I tried
it.  It failed in nsprpub, so I think the answer is 'no'.  Nonetheless, thanks
for correcting the errors, and thanks for checking it in!
This reduced Tp on btek and luna by about 5ms (a 0.5% reduction in average page
load time). That's great work!
Attached patch Fix Bustage on Linux (obsolete) — Splinter Review
Yeah, thanks, Roc.  Unfortunately, it's all gone to waste on Linux; I can
verify that the line that it's busting on is a result of my patch.  This patch
should (hopefully) fix the damage (stupid me for not building a debug build,
which also doesn't work with my compiler).
Attachment #193413 - Flags: superreview?(roc)
Attachment #193413 - Flags: review?(roc)
Comment on attachment 193413 [details] [diff] [review]
Fix Bustage on Linux

Never mind, I see it's already done.
Attachment #193413 - Attachment is obsolete: true
Attachment #193413 - Flags: superreview?(roc)
Attachment #193413 - Flags: review?(roc)
Shouldn't this be marked as resolved fixed?
Assignee: smontagu → pythonesque+bugzilla
And now it is so marked.
Status: NEW → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: