The default bug view has changed. See this FAQ.

Using machine learning to order autocomplete results

RESOLVED FIXED

Status

SeaMonkey
Autocomplete
RESOLVED FIXED
15 years ago
3 years ago

People

(Reporter: Nisheeth Ranjan, Assigned: Nisheeth Ranjan)

Tracking

Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(4 attachments, 9 obsolete attachments)

4.71 KB, text/plain
Details
5.42 KB, text/plain
Details
50.96 KB, patch
Details | Diff | Splinter Review
3.49 KB, text/plain
Details
(Assignee)

Description

15 years ago
The attached patch is a first cut at using machine learning techniques to better
order urls in the urlbar autocomplete dropdown window.  The current
implementation uses a simple linear classifier (perceptron) which is trained
implicitly each time the user selects a url from the autocomplete dropdown.  A
bunch of numeric features (X1, X2, ... Xn) are calculated for each url and input
to the perceptron along with the boolean value (B) of whether the url was
selected by the user.  The perceptron updates an array of numeric weights (W1,
W2, ... Wn), one for each input feature, as it sees the training data and
attempts to arrive at a set of weights such that:

sigmoid( sigma(i = 1 to n, Wi * Xi) ) = B

where sigmoid(x) = 1 / 1 + exp(-x)
and sigma represents the summation operator over variable i from 1 to n.

The goal is to train the perceptron with enough instances of training tuples,
<X1, X2, ... Xn, B> so that it converges on a set of weights such that it can
correctly output B from any new data instance, <X1, X2, ... Xn>.

Once such a set of weights is reached, the perceptron can be used to predict
whether a given url will be selected by the user or not.

This is a very simple implementation and there are many ideas that have been
left out in the above short description.

I'll add more information here as questions are raised...
(Assignee)

Comment 1

15 years ago
Created attachment 107650 [details] [diff] [review]
Patch v0.1

Version 0.1 of patch.
(Assignee)

Comment 2

15 years ago
Comment on attachment 107650 [details] [diff] [review]
Patch v0.1

Heikki, please review the attached patch.  Thanks!
Attachment #107650 - Flags: review?(heikki)
(Assignee)

Comment 3

15 years ago
Adding Heikki to the cc list.
(Assignee)

Comment 4

15 years ago
Joe hewitt is the owner of urlbar autocomplete.  CCing him as well.  Joe, I'll
get in touch with you to go over this patch interactively...
There is a patch in bug 180336 that should give you some pointers on how to
replace the standard C++ file handling with Mozilla versions.
(Assignee)

Comment 6

15 years ago
Heikki, I'm about to attach a revision of the patch that addresses the following
comments from your past review:

- In FillInputFeatures, use NS_NewURL more for parsing the url.
- Class names, Member variables, function arguments should follow Mozilla naming
conventions
- File saving and loading should use nsIFile
- Since output format is so close to XML, make it XML
- Fix tabbing
Status: NEW → ASSIGNED
(Assignee)

Comment 7

15 years ago
Created attachment 108111 [details] [diff] [review]
Patch v0.2

Fixes to first review from Heikki.
Attachment #107650 - Attachment is obsolete: true
(Assignee)

Comment 8

15 years ago
Created attachment 108254 [details] [diff] [review]
Patch v0.3

IDs are now output to the data file to identify a URL in case the user doesn't
want to output the url path.
(Assignee)

Updated

15 years ago
Attachment #108111 - Attachment is obsolete: true
(Assignee)

Updated

15 years ago
Attachment #108254 - Flags: review?(heikki)
Attachment #107650 - Flags: review?(heikki)
Comment on attachment 108254 [details] [diff] [review]
Patch v0.3

You should try to test build these changes on all the major platforms at least,
because you use more math etc. stuff than most of the code I've seen. Some math
stuff we have in NSPR actually, I'll try to see if I can spot any which could
be changed to NSPR.

>Index: autocomplete/resources/content/autocomplete.xml
>===================================================================
>Index: bookmarks/public/nsIBookmarksService.idl
>===================================================================
>+    void getAddDate(in string aBookmarkURL, out long long aAddDate);

Is there some Date data type that you could use instead, |long long| does not
look like a date to me (although I know how it works, but still:). Also I'd
like to see the javadoc style comments for members in IDL files.

>Index: bookmarks/src/nsBookmarksService.cpp
>===================================================================

Will continue...

Comment 10

15 years ago
Comment on attachment 108254 [details] [diff] [review]
Patch v0.3

sr=me
Attachment #108254 - Flags: superreview+
(Assignee)

Comment 11

15 years ago
> You should try to test build these changes on all the major platforms at least

I've started a checkout on Linux and I'm building on Windows.

>Is there some Date data type that you could use instead, |long long| does not
>look like a date to me

I agree it looks strange.  I looked for a nsDate in xpcom/ds and the closest
thing they have there is class nsTime that uses the same PRTime (#defined to
PRInt64) parameters that I am using.  I couldn't find any usage of nsTime as a
parameter in IDL files.  I could use it in nsBookMarksService.idl but then I'd
have to use the native qualifier and make the method unscriptable.

I did find usage of "long long" in IDL files to denote time (see LXR snippet
below).  One way to fix the strangeness might be to rename the method from
getAddDate() to getAddTime().  That's actually closer to what is really going on
anyway...

What do you think?

/xpfe/appshell/public/nsITimingService.idl, line 52 -- long long
getElapsedTime(in string timername);
/xpfe/appshell/public/nsITimingService.idl, line 60 -- void setTimer(in string
timername, in long long starttime);
/xpfe/components/download-manager/public/nsIDownloadManager.idl, line 80 -- in
long long startTime,
/mailnews/base/public/nsIMsgGroupRecord.idl, line 49 -- in long long time,
/mailnews/base/public/nsIMsgGroupRecord.idl, line 61 -- in long long time,
/mailnews/base/public/nsIMsgGroupRecord.idl, line 85 -- readonly attribute long
long addTime;

> Also I'd like to see the javadoc style comments for members in IDL files

Consider it done!  :-)
(Assignee)

Comment 12

15 years ago
Comment on attachment 108254 [details] [diff] [review]
Patch v0.3

Putting into approval queue as well...	I've sent email out about this
separately to drivers@mozilla.org.
Attachment #108254 - Flags: approval1.3a?
(Assignee)

Comment 13

15 years ago
OK, the patch builds fine on Linux...
(Assignee)

Comment 14

15 years ago
And it builds fine on the Mac OSX build as well...  So, build wise we are all
set for all the platforms built on the main tinderbox page...
Comment on attachment 108254 [details] [diff] [review]
Patch v0.3

>Index: history/src/nsGlobalHistory.cpp
>===================================================================
>+#define AC_NUM_URL_FEATURES 44

It would be nice to comment what these mean in practice:

>+const PRFloat64 LEARN_RATE = 0.5;
>+const PRFloat64 HISTORY_FAST_DECAY_CONSTANT = 0.2;
>+const PRFloat64 HISTORY_SLOW_DECAY_CONSTANT = 0.8;
>+const PRFloat64 BOOKMARK_FAST_DECAY_CONSTANT = 0.2;
>+const PRFloat64 BOOKMARK_SLOW_DECAY_CONSTANT = 0.8;


>+//----------------------------------------------------------------------
>+// Sigmoid perceptron definitions and implementation
>+// XXX The class definitions need to go in a separate header file
>+
>+class nsPerceptron 

The comment talks only about sigmoid perceptron but you have the plain one here
as well, and it's first.

Also you may want to change the name to tell which/what perceptron, but since
it is the only one for now it doesn't matter that much.

>+  nsPerceptron(int aNumFeatures);

You might want to put the default constructor (without args) into private
members without implementation, which would guarantee that this constructor
will always be called or it won't compile.

>+      if (mNumWeights > 0 && mWeights != NULL)

We tend to group things explicitly, so add some parens.

>+  virtual Train(PRFloat64* aInputs, int aNumInputs, PRFloat64 aTargetOutput);
>+  virtual Test(PRFloat64* aInputs, int aNumInputs, PRFloat64* aOutput);
>+  int mNumWeights;

|PRInt32| instead of |int|? Same in nsSigmoidPerceptron.

>+static nsSigmoidPerceptron sAutoCompleteLearner(AC_NUM_URL_FEATURES);

Static objects are bad. Make this into a service or normal member variable of
the history/autocomplete service.

>+nsPerceptron::nsPerceptron(int aNumFeatures)
>+{
>+  mNumWeights = aNumFeatures;
>+  mWeights = NULL;
>+  if (mNumWeights > 0)
>+  {
>+    mWeights = new PRFloat64[aNumFeatures];
>+  }

Maybe you should have Init method or something, because that |new| could fail
and you can't inform the caller if you do this in the constructor.

Also, set the mNumWeights after the |new| succeeded, otherwise those two member
variables are out of sync and you might cause crash. Do not use |NULL|, use
|nsnull| or |0|.

>+  double output = 0.0;

Strange, I don't seem to be able to find any PRDouble types. Maybe double is
good then.
Comment on attachment 108254 [details] [diff] [review]
Patch v0.3

>Index: history/src/nsGlobalHistory.cpp
>===================================================================
>+#define AC_NUM_URL_FEATURES 44

The way the features & this define are connected is quite loose, but couldn't
come up with a nice way at the moment to do this.

>+  for (int i = 0; i < mNumWeights; i++)

|int| to |PRInt32|, there seem to be more of these elsewhere as well.

Also, you should be careful with initializing variables like that in the for
loop; some compilers don't get the scope right so if you use |i| later in the
function it might have different values than you expect.

>+  nsresult rv = NS_GetSpecialDirectory(NS_APP_USER_PROFILE_50_DIR, getter_AddRefs(file));

Is that the correct profile dir nowadays? Didn't it change to
".../Profiles/..."?

>+    nsCOMPtr<nsILocalFile> local_file(do_QueryInterface(file));

We tend to use interCaps -> localFile.

>+      fscanf(from, "%lf", mWeights[i]);

I'm really suspicuous of all file reads. Maybe this is good enough for now,
though...

>+    local_file->AppendNative(NS_LITERAL_CSTRING("ac-weights.txt"));

Could you make that string a define or const or something, since you are using
it in more than one location?

>+PRFloat64 nsSigmoidPerceptron::Sigmoid(PRFloat64 d)

We usually have 'a' prepend the interCap parameter name. There are also other
functions were you don't follow this convention.

>+    PRFloat64 m;  

Btw, it seems PRFloat64 is a typedef for double, just you know...

>+    nsAutoString url;
>+    CopyASCIItoUCS2(nsDependentCString(aURL), url);

URL is UTF8. Use NS_ConvertUTF8toUCS2? This pattern is repeated later as well.

>+nsGlobalHistory::SetRowValue(nsIMdbRow *aRow, mdb_column aCol, const PRFloat64 aValue)

The last param doesn't really need to be const.

>+  if (err != 0) return NS_ERROR_FAILURE;

The convention is:

if (err)
  return NS_ERROR_FAILURE;

>+    if (id == 0)

Either that or !id.

>+    mACFeatures = new PRFloat64[AC_NUM_URL_FEATURES];

Check if new failed and do something if it did.
>+  (*aResult)->AddRef();

Don't call addref directly, because that will confuse the refcount debugging
tools that rely on the macros. Use NS_ADDREF().

>+//  Feature 41: Is this a google search url?
>+//  Feature 42: Is this a netscape search url?
>+//  Feature 43: Is this a yahoo search url?

These URLs might obviously change in the future, so if these things were to be
turned on indefinitely they would need to come from prefs.

>+      do_GetService("@mozilla.org/browser/bookmarks-service;1", &rv);  

If there is a define for this progid you should use it.

>+  if (HasCell(mEnv, row, kToken_TypedColumn))  
>+    aFeatures[2] = 1;
>+  else
>+    aFeatures[2] = 0;

For boolean settings you could simply do:

  aFeatures[2] = HasCell(mEnv, row, kToken_TypedColumn);  

>+  nsCAutoString curl;
>+  NS_NewURI(getter_AddRefs(uri), aUrl);  
>+  uri->GetSpec(curl);

Check if you got the URI object before using it.

Also, now that you have the URI object, don't use the spec for everything!

>+  nsAutoString url(NS_ConvertASCIItoUCS2(curl).get());
>+  ToLowerCase(url);

URL is not ASCII, it is UTF8.

>+  if (FindInReadable(NS_LITERAL_STRING(".htm"), start, end))

uri->GetPath() and then look for that.

>+  if (FindInReadable(NS_LITERAL_STRING(".com"), start, end))
>+  if (FindInReadable(NS_LITERAL_STRING(".edu"), start, end))
>+  if (FindInReadable(NS_LITERAL_STRING(".org"), start, end))
>+  if (FindInReadable(NS_LITERAL_STRING(".net"), start, end))
>+  if (FindInReadable(NS_LITERAL_STRING(".gov"), start, end))

uri->GetHost() and then look for those.

>+  if (FindInReadable(NS_LITERAL_STRING("~"), start, end))

Look at path.

>+  // Feature 18: Does the URL end in a two letter country code?
>+  if (url.RFindChar('.') == ((url.Length() - 1) - 2))

Look at host.

>+  PRUint32 size, i;
>+  for ( ; start != end; start.advance(size)) 
>+  {
>+    const PRUnichar* buf = start.get();
>+    size = start.size_forward();

Uh, that looks needlessly complicated. Wouldn't you be able to do that in a
single loop, starting with:

for ( ; start != end; ++start) 

>+    fprintf(mURLDataFile, " path='%s'", NS_LossyConvertUCS2toASCII(aURL).get());

Use NS_ConvertUCS2toUTF8?

> nsGlobalHistory::OnAutoComplete(const PRUnichar *searchString,
>                                 nsIAutoCompleteResults *previousSearchResult,
>                                 nsIAutoCompleteListener *listener)
> {
>-  return NS_OK;
>+  nsCOMPtr<nsISupportsArray> results;
>+  nsCOMPtr<nsIAutoCompleteItem> item;  
>+  PRUint32 count;
>+  nsAutoString value;
>+  PRBool found = PR_FALSE;
>+  PRUint32 i;
>+  nsresult rv = NS_OK;
>+  
>+  if (mLearningMode == ACL_NONE && mDataCaptureMode == UDC_NONE)
>+    return rv;

Declare the variables as close to the place where you use them.

>+  previousSearchResult->GetItems(getter_AddRefs(results));
>+  results->Count(&count);

Shouldn't you check if results exists before using? There are other times with
this same function call.

NS_LossyConvertUCS2toASCII(searchString).get());

UTF8.

>+  PRUnichar *comment;
...
>+  aItem->GetComment(&comment);
...
>+  nsMemory::Free(comment);

You might want to use nsXPIDLString which would take care of the memory
management.

>Index: history/src/nsGlobalHistory.h
>===================================================================

Done.

Fix those and r=heikki. Please note that many of the things I commented on are
fairly generic, so check for that same pattern elsewhere.
Attachment #108254 - Flags: review?(heikki) → review+
(Assignee)

Comment 17

15 years ago
Created attachment 108415 [details] [diff] [review]
Patch v0.4 incorporates Heikki's comments

Here's a new patch with the following changes:

- Javadoc comments for new method in the bookmark service IDL file.
- Renamed the method to getAddTime and parameter to aAddTime
- sAutoCompleteLearner isn't a global static.  Its now mAutoCompleteLearner, a
member of nsGlobalHistory.
- Commented the constants in nsGlobalHistory.cpp
- Added private default constructor to nsPerceptron, nsSigmoidPerceptron
- Replaced int with PRInt32

>Maybe you should have Init method or something, because that |new| could fail
>and you can't inform the caller if you do this in the constructor.

This is something I'd like to do once and for all when I move nsPerceptron out
into a separate idl, .h, and .cpp file.  Then, I'll have return error values
for each function.

Right now, since the perceptron will be off by default anyway, please let me
check in without handling the edge case of running out of memory.

>Also, set the mNumWeights after the |new| succeeded, otherwise those two
member
>variables are out of sync and you might cause crash. Do not use |NULL|, use
>|nsnull| or |0|.

Done!
Attachment #108254 - Attachment is obsolete: true
(Assignee)

Updated

15 years ago
Attachment #108415 - Flags: superreview?(hewitt)
Attachment #108415 - Flags: review?(heikki)
Attachment #108415 - Flags: approval1.3a?
(Assignee)

Updated

15 years ago
Attachment #108415 - Flags: superreview?(hewitt)
Attachment #108415 - Flags: review?(heikki)
Attachment #108415 - Flags: approval1.3a?
>Right now, since the perceptron will be off by default anyway, please let me
>check in without handling the edge case of running out of memory.

Ok, but at least add a XXX comment saying need to check out of mem and move to
Init() method.
(Assignee)

Comment 19

15 years ago
I'm about to attach a patch that addresses everything in your comments above
except for the following points discussed below:

> Is that the correct profile dir nowadays? Didn't it change to
> ".../Profiles/..."?

I've seen the same usage in other places, and I've tested it and made sure that
the file is created in the right directory...

>>+  PRUint32 size, i;
>>+  for ( ; start != end; start.advance(size)) 
>>+  {
>>+    const PRUnichar* buf = start.get();
>>+    size = start.size_forward();
>
>
>Uh, that looks needlessly complicated. Wouldn't you be able to do that in a
>single loop, starting with:
>
>for ( ; start != end; ++start) 

Actually, the Mozilla string guide recommends the more complicated code because
it is more performant.  See
http://www.mozilla.org/projects/xpcom/string-guide.html#Looping_Iterators.
(Assignee)

Comment 20

15 years ago
Created attachment 108426 [details] [diff] [review]
Patch v.05 incorporates 2nd batch of comments from Heikki

This patch is what I will check into 1.3 alpha...
Attachment #108415 - Attachment is obsolete: true

Updated

15 years ago
Attachment #108254 - Flags: approval1.3a?

Comment 21

15 years ago
Comment on attachment 108426 [details] [diff] [review]
Patch v.05 incorporates 2nd batch of comments from Heikki

moving review and superreview forward and adding a=asa for checkin to 1.3a.
Attachment #108426 - Flags: superreview+
Attachment #108426 - Flags: review+
Attachment #108426 - Flags: approval1.3a+
(Assignee)

Comment 22

15 years ago
I just checked in Patch 108426 (v 0.5).  Keeping this bug open because more work
needs to happen here...

Updated

15 years ago
OS: Windows 2000 → All
Hardware: PC → All
So... is this actually enabled?  Based on the checkins that were required to get
this compiling (including rev 1.67 of nsGlobalHistory.cpp especially), I have to
wonder how much testing this received... (the code rev 1.67 changed could not
possibly have worked properly).
I'm sorry, but I backed out the checkin (after fruitless attempts to locate
nisheeth last night and earlier today).  This checkin was responsible for the
crash on shutdown on the tinderboxes running the bloat and leak tests.  In a
debug build on Linux, I saw the following on shutdown with this patch in my tree
(prior to the crash):

###!!! ASSERTION: can't get xpt search path!: 'Error', file
/home/bzbarsky/mozilla/profile/mozilla/xpcom/reflect/xptinfo/src/xptiInterfaceInfoManager.cpp,
line 67
###!!! ASSERTION: null monitor: 'mMonitor', file
../../dist/include/xpcom/nsAutoLock.h, line 240
###!!! ASSERTION: null monitor: 'mMonitor', file
../../dist/include/xpcom/nsAutoLock.h, line 248
###!!! ASSERTION: Expected only one call!: '!gCallCount++', file
/home/bzbarsky/mozilla/profile/mozilla/xpcom/reflect/xptinfo/src/xptiInterfaceInfoManager.cpp,
line 232

These were repeated 5 or 6 times before the actual crash.

Oh, and there was a startup assertion:

###!!! ASSERTION: nsDependentCString must wrap only null-terminated strings:
'!*aEndPtr', file ../../../../dist/include/string/nsDependentString.h, line 160
Break: at file ../../../../dist/include/string/nsDependentString.h, line 160

Finally, please fix the perceptron constructor to _always_ initialize
mNumWeights... the current code will do hideous things if '0' is passed to the
constructor and then Train is called.

Comment 25

15 years ago
Comment on attachment 108426 [details] [diff] [review]
Patch v.05 incorporates 2nd batch of comments from Heikki

please bear with me, this is the first time i've paid any attention to this
bug, first comments on comments.


>> +  nsresult rv = NS_GetSpecialDirectory(NS_APP_USER_PROFILE_50_DIR, getter_AddRefs(file));
> Is that the correct profile dir nowadays?
no. that class set is going away (for more info see bug 156733) the correct api
is to use to directoryservice. sample code can be found all over the place.
(there's currently a broken consumer in the xp filepicker, but there will be a
correct patch in bug 125324 which you could work from).

--

nit#1, please don't check in trailing whitespace


>+// Autocomplete learning modes
>+#define ACL_NONE                0
please rename the ACL things to anything but ACL. acl should mean Access
Control List.
AUTOCOMPLETE_NO_LEARNING, AUTOCOMPLETE_ENABLE_TRAINING,
AUTOCOMPLETE_AFFECT_URL_LIST seem perfectly reasonable.
>+#define ACL_ENABLE_TRAINING     1
>+#define ACL_AFFECT_URL_LIST     2

>+#define UDC_NONE                     0
>+#define UDC_WITHOUT_URL_INFO  1
>+#define UDC_WITH_URL_INFO     2
I can't remember what UDC stands for, please replace it with a longer string
that is meaningful.

why not use consts, or perhaps idl consts?

>+const PRInt32 AC_NUM_URL_FEATURES = 44;
yuck. const's are kInitialCaps. 44 appears to be a magic number, if it is then
please decompose it or comment.

>+// XXX The implementations need to be moved out to their own .cpp file
this stuff should have been checked in *very* early in the alpha cycle, not
during the freeze at the end.

>+// XXX This should move to an Init method so that error handling can happen.
This isn't acceptable.
>+nsPerceptron::nsPerceptron(PRInt32 aNumFeatures)
>+{  
why take a signed int?
>+  mWeights = nsnull;
>+  if (aNumFeatures > 0)
>+  {
>+    mWeights = new PRFloat64[aNumFeatures];
>+  }

>+nsPerceptron::Train(PRFloat64* aInputs, PRInt32 aNumInputs, PRFloat64 aTargetOutput)
>+{
>+  double output = 0.0;
>+  double delta = 0.0;
you just switched from nspr types to compiler types...

>+nsPerceptron::LoadWeights()
>+{
>+  nsCOMPtr<nsIFile> file;
>+  FILE* from = 0;
>+  nsresult rv = NS_GetSpecialDirectory(NS_APP_USER_PROFILE_50_DIR, getter_AddRefs(file));
again, this should be replaced with calls to the directoryservice.

>+    localFile->OpenANSIFileDesc("r", &from);
I think we prefer for people to use OpenNSPRFileDesc (and i'll note that the
example patch heikki pointed to *did* use the NSPRFileDesc method) and the
prfile apis

>-    return PR_TRUE;
>+    return PR_TRUE;    
you added trailing whitespace here -- i won't repeat the nit but i do want to
point out that it means blame was transfered a few times for something that
never should have changed.

>+    fflush(mURLDataFile);
flushing on the main thread is really mean when it's possible the file might
live on an AFS volume (although i suppose a decent AFS would try to cache i'm
not certain NFS can since the idea of flush is to guarantee success).

>+  nsresult rv = NS_OK;    
>+  mURLID = PR_Now();
this should be a do {}while() loop.
>+  while (NS_SUCCEEDED(rv))
>+  {
>+    rv = FindRow(kToken_URLIDColumn, ++mURLID, getter_AddRefs(oldRow));
>+  }

>+    mACFeatures = NULL;
heikki's comments said not to use NULL.


>+  nsresult rv;
>+  rv = FindRow(aCol, aValue, aResult);
early return is preferred, i.e. if (NS_FAILED(rv)) \n return rv;
>+  if (NS_SUCCEEDED(rv))
>+  {
>+    rv = GetRowValue(*aResult, kToken_URLIDColumn, aRowID);
>+  }
>+  return rv;
>+}

this stuff would make a great javadoc comment
/**
 * javadoc comments
 * ...
 */
>+// The input features into the autocomplete perceptron are as follows:
>+//
>+//   Features 1  = Frequency and recency metric for page in history 
>+//                 (domain = positive real numbers)
>+//                 Value decays fast with age of page
>+//                 Uses HISTORY_FAST_DECAY_CONSTANT


>+  // Now, calculate a bunch of url related features
>+  nsCOMPtr<nsIURI> uri;
>+  nsCAutoString curl, chost, cpath;
>+  rv = NS_NewURI(getter_AddRefs(uri), aUrl);
>+  if (NS_SUCCEEDED(rv) && uri)
>+  {
>+    uri->GetSpec(curl);    
>+    uri->GetHost(chost);
>+    uri->GetPath(cpath);
>+  }
this is very very very wasteful and silly.
>+  nsAutoString url(NS_ConvertUTF8toUCS2(curl).get());
>+  nsAutoString path(NS_ConvertUTF8toUCS2(cpath).get());
>+  nsAutoString host(NS_ConvertUTF8toUCS2(chost).get());
you just converted things from char* to PRUnichar*
>+  ToLowerCase(url);  
>+  ToLowerCase(host);
>+  ToLowerCase(path);
and now you're lowercasing them
>+  // Feature 6: Whether url ends in .htm or .html    
>+  nsAString::const_iterator start, end;
>+
>+  path.BeginReading(start);
>+  path.EndReading(end);
and now you're comapinrg them with things which by all rights should be char*
>+  aFeatures[5] = FindInReadable(NS_LITERAL_STRING(".htm"), start, end);
please please please save on conversions. don't promote to ucs2 unless you have
a reason.

in fact, you shouldn't be using findinreadable. you should be comparing the
last n chars with the strings.

>+  // Feature 7: Is it a .com URL?
>+  host.BeginReading(start);
>+  host.EndReading(end);
>+  aFeatures[6] = FindInReadable(NS_LITERAL_STRING(".com"), start, end);
>+
>+  // Feature 8: Is it a .edu URL?
>+  host.BeginReading(start);
>+  host.EndReading(end);
>+  aFeatures[7] = FindInReadable(NS_LITERAL_STRING(".edu"), start, end);
>+
>+  // Feature 9: Is it a .org URL?
>+  host.BeginReading(start);
>+  host.EndReading(end);
>+  aFeatures[8] = FindInReadable(NS_LITERAL_STRING(".org"), start, end);
>+
>+  // Feature 10: Is it a .net URL?
>+  host.BeginReading(start);
>+  host.EndReading(end);
>+  aFeatures[9] = FindInReadable(NS_LITERAL_STRING(".net"), start, end);
>+    
>+  // Feature 11: Is it a .gov URL?
>+  host.BeginReading(start);
>+  host.EndReading(end);
>+  aFeatures[10] = FindInReadable(NS_LITERAL_STRING(".gov"), start, end);
>+  
this is the first place that actually cares about a non terminal:
>+  // Feature 12: Does the URL contain a ~ ?
>+  path.BeginReading(start);
>+  path.EndReading(end);
>+  aFeatures[11] = FindInReadable(NS_LITERAL_STRING("~"), start, end);
the next few things obviously care about beginning chars only. however i wonder
if there's some other way to do it, this doesn't scale well at all.
>+  // Feature 13: Does the URL start with http:// ?
>+  PRBool isScheme;
>+  aFeatures[12] = aFeatures[13] = aFeatures[14] = aFeatures[15] = aFeatures[16] = 0;
>+  if (NS_SUCCEEDED(uri->SchemeIs("http", &isScheme)))
>+  {
>+    aFeatures[12] = isScheme;
>+  }
...
>+
>+  // Feature 18: Does the URL end in a two letter country code?
your comment's wrong. it's not the URL it's the host.
>+  aFeatures[17] = (host.RFindChar('.') == ((host.Length() - 1) - 2));
instead of doing a search which could cost O(N) for something like:
http://myververyverydshjfhsdkjfhsdakjlfhsalkjfhlsdjkahflkjsadhflkjsdhfljaksdfhl
sdkjfhaslkdfhlonglocaldomain

also consider that the following url should be considered valid:
http://www.state.ca.us./state/portal/myca_homepage.jsp

Ignoring that fact (because it isn't my job to right your code) the pseudo code
to efficiently check for the case you cared about would be something like:
x=host.length();
(x>3 && host[x-3] == '.' && host[x-2] != '.' && host[x-1] != '.')
(there's a very simple way to handle the trailing .)

i'm not sure if it makes sense to do the next counting using PRUnichar ...
(said counting snipped)

>+  // Calculate a bunch of hostname related features.  

>+  // Feature 33: Number of .s in the hostname if we omit initial "www." or "ftp." (if any)  
>+  // Feature 35: Number of .s in the hostname if we omit ending initial "www." or "ftp"
this comment looks wrong

>+  if (chost.Find("www.") == 0 || chost.Find("ftp.") == 0)
again *please* don't use find when you want to know if a specific set of
characters match.

use substring+equals or strcmp

>+  if (chost.RFindChar('.') == ((chost.Length() - 1) - 2))
again, the same complaint about the tail... and isn't there a boolean lying
around that you could check instead of recalculating? if not then this stuff
just isn't local to similar stuff... you should group related comparisons and
share conditions.


>+  // Feature 41: Is this a google search url?
http://www.google.com/mac?q=is+a+google+search+url and I use it. XXX fix your
code :)
>+  aFeatures[40] = FindInReadable(NS_LITERAL_STRING("http://www.google.com/search"), start, end);
XXX Find is bad.

>+  // Feature 43: Is this a yahoo search url?
there have to be other yahoo search urls...
>+  aFeatures[42] = FindInReadable(NS_LITERAL_STRING("http://search.yahoo.com/bin/search"), start, end);

>+  // If searchString found in the previous search results, assume 
>+  // that the user selected that url (searchString) from the previous 
>+  // list of autocomplete results.  
i wonder if you could ask the search service instead of relying on the evil
hard coding you have ... 

>+        fprintf(mURLDataFile, " url='%s'", NS_ConvertUCS2toUTF8(searchString));
you just converted to char*, eek. no you just converted to char* once
>+      if (NS_SUCCEEDED(FindRowAndID(kToken_URLColumn, 
>+                         NS_ConvertUCS2toUTF8(searchString).get(), 
>+                         getter_AddRefs(row), &rowID)))
and you just did it again. 1. i don't know if you ever needed PRUnichar* but if
you did, please save a comparison by not doing it again.

--SX MARK i'll be back
>+      {
>+        if (rowID == 0)
>+        {
>+          AssignUniqueURLID(row);
>+          rowID = mURLID;
Attachment #108426 - Flags: review-
Attachment #108426 - Flags: review+
Attachment #108426 - Flags: approval1.3a+

Updated

15 years ago
Blocks: 183901
(Assignee)

Comment 26

14 years ago
timeless, thanks for your comments.  I'm about to attach a patch that address
most of your concerns.

First some general replies related to your comments:

Your points about optimizing FillInputFeatures() make sense.  But, all the code
in FillInputFeatures is only executed when a pref is turned on by the users that
will participate in data collection.  Also, all the code in FillInputFeatures
will be discarded  once we've collected the data.  So, I don't want to spend
more time optimizing that code.  Even though a lot of extra work is being done,
there is no perceptible delay on my 750 Mhz laptop.  I'm hoping that users who
participate in the data collection will have equal or better machines and will
also not feel a lag in performance.

I promise that if we decide to keep the code around or enable it by default, I
will performance optimize it to everyone's satisfaction.

I know that the perceptron code also needs more work.  As with the data capture
code, a perceptron is only created and used if a pref is set by the user. 
Please ignore the XXX comments as they will be removed when the perceptron
becomes its own component.

More specific replies to your comments follow:

>>> +  nsresult rv = NS_GetSpecialDirectory(NS_APP_USER_PROFILE_50_DIR,
>getter_AddRefs(file));
>> Is that the correct profile dir nowadays?
>no. that class set is going away (for more info see bug 156733) the correct api
>is to use to directoryservice. sample code can be found all over the place.
>(there's currently a broken consumer in the xp filepicker, but there will be a
>correct patch in bug 125324 which you could work from).

I double checked with Conrad Carlen about the above usage and he was fine with
both NS_GetSpecialDirectory and NS_APP_USER_PROFILE_50_DIR.  This code is
temporary anyway, so it doesn't really matter.

>nit#1, please don't check in trailing whitespace

Agreed!  I will exorcise these before checking in.

>>+// Autocomplete learning modes
>>+#define ACL_NONE                0
>please rename the ACL things to anything but ACL. acl should mean Access
>Control List.
>AUTOCOMPLETE_NO_LEARNING, AUTOCOMPLETE_ENABLE_TRAINING,
>AUTOCOMPLETE_AFFECT_URL_LIST seem perfectly reasonable.
>>+#define ACL_ENABLE_TRAINING     1
>>+#define ACL_AFFECT_URL_LIST     2
>
>>+#define UDC_NONE                     0
>>+#define UDC_WITHOUT_URL_INFO  1
>>+#define UDC_WITH_URL_INFO     2
>I can't remember what UDC stands for, please replace it with a longer string
>that is meaningful.

Done! I've replaced them with more descriptive names.

>why not use consts, or perhaps idl consts?

I prefer #defines mostly because I want to use all caps names without prefixing
a "k".  I don't want to use IDL constants because these aren't public constants
intended for consumers other than nsGlobalHistory.

>>+  nsresult rv = NS_OK;    
>>+  mURLID = PR_Now();
>>+  while (NS_SUCCEEDED(rv))
>>+  {
>>+    rv = FindRow(kToken_URLIDColumn, ++mURLID, getter_AddRefs(oldRow));
>>+  }
>
>this should be a do {}while() loop.

Not necessarily, because rv is initialized to NS_OK in the beginning, so
NS_SUCCEEDED(rv) will always return TRUE and the loop will get executed at least
once.  do {} while() has the same semantics.

I've changed this to a do {} while nevertheless because we do save a condition
test with the do-while...

>>+nsPerceptron::Train(PRFloat64* aInputs, PRInt32 aNumInputs, PRFloat64
>aTargetOutput)
>>+{
>>+  double output = 0.0;
>>+  double delta = 0.0;
>you just switched from nspr types to compiler types...

I tried to use PRFloat64 but ran into weird behavior (don't remember exactly,
calculations weren't happening properly or something) in my debugger.  Switching
to native types fixed the problem and I didn't explore it further.  I'll fix
this when I make the perceptron a proper component.

>>+    localFile->OpenANSIFileDesc("r", &from);
>I think we prefer for people to use OpenNSPRFileDesc (and i'll note that the
>example patch heikki pointed to *did* use the NSPRFileDesc method) and the
>prfile apis

The OpenANSIFileDesc method fit better with the code that I'd already written. 
All the data capture code is going to get thrown away once we've gotten data
back from the users.  Please ignore this nit for now.

>>+    fflush(mURLDataFile);
>flushing on the main thread is really mean when it's possible the file might
>live on an AFS volume (although i suppose a decent AFS would try to cache i'm
>not certain NFS can since the idea of flush is to guarantee success).

This flushing only happens when the user enables a pref for data capture.  Users
who participate in data collection can turn off the pref if they experience a
significant slowdown.  This code will go away after we've gotten data back from
some users.

>>+    mACFeatures = NULL;
>heikki's comments said not to use NULL.

Fixed.  I missed this one in my last patch.
(Assignee)

Comment 27

14 years ago
Boris, thanks a lot for your comments.

I've fixed the shutdown crash.  It wasn't a problem with my code but was
unmasked by it.  I've also verified that my code doesn't cause new debug
assertions to fire.

>Finally, please fix the perceptron constructor to _always_ initialize
>mNumWeights... the current code will do hideous things if '0' is passed to the
>constructor and then Train is called.

Done!

Comment 28

14 years ago
I've been biting my tongue on this comment, but I can't hold myself back any longer.

I have to ask: is this amazing over-engineering worth the trouble?  Can it
really offer appreciably better results than the existing frequency-weighted
order?  (I find the existing autocomplete dropdown ordering to be thoroughly
excellent.)

There are probably plenty of places in Mozilla where machine-learning can be
applied fruitfully.  If the goal of this bug is to prove the mechanisms before
applying them to bigger matters then that sounds quite fair to me.  Otherwise
this seems like a large effort targetting a nonexistant problem.

(Assignee)

Comment 29

14 years ago
Adam, you are exactly right.  The long term goal is to expand the use of machine
learning to other parts of Mozilla.  Autocomplete is a small baby step.  I don't
know whether we'll end up with better results than the existing mechanism.  We
might not.  But, I'm treating this as a research project.  Once we have
collected the data, we'll see if we can come up with a learning algorithm that
performs better than the current algorithm.  

Another goal is to get folks in the machine learning community to take notice of
Mozilla.  Hopefully, as we spread the word about small machine learning projects
like this one, we will spark more ambitious ideas in the heads of machine
learning gurus in academia.  If even a few university professors start working
on implementing their ideas on top of Mozilla, we will all benefit.

If you have ideas about where we can apply machine learning within Mozilla, lets
start discussing them in the Mozilla newsgroups.  Some ideas we've had till now are:

- Currently, Mozilla does LINK attribute based prefetching of web pages.  Can we
use a user's past surfing behavior to predict what to prefetch?
- Can we algorithmically identify ad images on a web page and block them out?
- How do the activities of email, instant messaging, newsgroup reading, and web
browsing intersect and overlap with each other?  Can we apply machine learning
techniques to learn a user’s habits and use that knowledge to filter out
unnecessary information and reduce information overload as they conduct these
activities?  The Bayesian Spam filter recently added to Mozilla is a big step in
this direction.
- We run security hole scanners on the Mozilla codebase today and get a lot of
false positives.  Can we apply machine learning to automatically flag these
mistaken errors and save developer time?
- Is it possible to identify security holes in the Mozilla codebase more
intelligently than grep like scanning by using artificial intelligence techniques?

Some of these ideas are hard research problems that are unlikely to have clear
solutions.  Exploring these ideas might not lead to shippable code.  But, we
should still take a shot at it for the sake of learning more about difficult
problems.  If we come up with an interesting insight, we will be able to apply
it not only to Mozilla but to other applications as well.
(Assignee)

Comment 30

14 years ago
Some more replies to comments from timeless:

>>+  if (chost.RFindChar('.') == ((chost.Length() - 1) - 2))
>again, the same complaint about the tail... and isn't there a boolean lying
>around that you could check instead of recalculating? if not then this stuff
>just isn't local to similar stuff... you should group related comparisons and
>share conditions.

Good point.  I'm now re-using the earlier boolean value.

>also consider that the following url should be considered valid:
>http://www.state.ca.us./state/portal/myca_homepage.jsp

I've added a check for this case in FillInputFeatures although I doubt if there
are many urls on the web with a trailing period in their hostname.
(Assignee)

Comment 31

14 years ago
Created attachment 110621 [details] [diff] [review]
Patch v0.6:  Incorporates timeless and boris' comments

This patch contains the stuff I said I'd do in comments 26, 27 and 30.	It also
includes the following:

- We now get one instead of three autocomplete calls for each autocomplete
selection.
- Fixed debug asserts that fired on shutdown in nsXPTIInterfaceManager
- We now output the referrer url for each visited url to the data file
- We now output Mozilla startup and exit times to the data file.
- Removed mURLID from nsGlobalHistory.	It was unnecessary.
- nsGlobalHistory::HidePage outputs to the data file.  We need this to weed out
(i)frame loads within a page.
- Made all the changes needed to build patch on all platforms when it was first
checked in.
- Made the comment before FillInputFeatures a javadoc comment.
- With rjc@netscape.com's help, reworked code so that we no longer need to add
the GetAddTime method to nsIBookmarksService.  Now, pretty much all the code in
this patch is in nsGlobalHistory.h/.cpp and a few lines in nsDocShell.h/.cpp
Attachment #108426 - Attachment is obsolete: true
(Assignee)

Updated

14 years ago
Attachment #110621 - Flags: review?(heikki)
Comment on attachment 110621 [details] [diff] [review]
Patch v0.6:  Incorporates timeless and boris' comments

Sorry to be anal here, but in nsGlobalHistory.cpp, could you use this style:

  if (foo) {
    ...
  }
  else {
    ...
  }

rather than

  if (foo)
  {
     ...
  }
  else
  {
     ...
  }

I know the latter is prevalent in docshell (I really wish that mozilla.org
would adopt a standard style that every one should follow.  Having source with
multiple styles is worse IMO than reading a file that is not your preferred
style) but nsGlobalHistory.cpp uses the former exclusively (except for
one-liners which have no braces) and "when-in-rome" should apply here.
(Assignee)

Comment 33

14 years ago
Created attachment 110803 [details] [diff] [review]
Patch v0.7: Created after compile and test cycle on Linux.

- Got rid of gcc warnings.
- Numeric url features are now output with 2 decimal places rather than 6.
Attachment #110621 - Attachment is obsolete: true
(Assignee)

Updated

14 years ago
Attachment #110803 - Flags: review?(heikki)
(Assignee)

Comment 34

14 years ago
Chris, I will fix the if-else syntax to match the rest of the file when I update
the patch after heikki's review.  I agree that the when-in-rome rule applies here.

I've also compiled and run with my patch on Mac Classic without any problems.
Comment on attachment 110803 [details] [diff] [review]
Patch v0.7: Created after compile and test cycle on Linux.

>Index: history/public/nsIBrowserHistory.idl
>===================================================================
>+    /**
>+     * setReferrerURL
>+     * Sets the referrer url for a url in global history

I am concerned about this, even though our implementation does not do what I am
really concerned about (namely, sending the referrer on URLs that the user goes
to from global history, because that would be a privacy/security issue). It
might be sufficient to add notes/warnings to the IDL and implementation that
the referrer will/must not be sent when traversing to the URL. 

(But in that light it begs the question: why is there a method for referrer
when it must not used for its "intended" purpose? Maybe you need to add some
verbiage to the IDL comment saying something about used for internal purposes
or something like that. Am I being too picky?)

>Index: history/src/nsGlobalHistory.cpp
>===================================================================
>+    if (id == 0)

Why not:

if (!id)

This is a general pattern. If you find yourself comparing something to 0/null,
you should change it to |if(foo)| or |if(!foo)|.

>+    // Sanity check the URL
>+    PRInt32 len = PL_strlen(aURL);
>+    NS_ASSERTION(len != 0, "no URL");
>+    if (! len)
>+      return NS_ERROR_INVALID_ARG;

That could simply be written as:

if (!*aURL)
  return NS_ERROR_INVALID_ARG;

>+    // XXX Handle out of memory condition
>+    mAutoCompleteLearner = new nsSigmoidPerceptron(AC_NUM_URL_FEATURES);
>+    mACFeatures = new PRFloat64[AC_NUM_URL_FEATURES];
>+
>+    if (!mAutoCompleteLearner || !mACFeatures)
>+    {
>+      // Disable learning
>+      mLearningMode = AUTOCOMPLETE_NO_LEARNING;
>+      mAutoCompleteLearner = nsnull;
>+      mACFeatures = nsnull;
>+    }

This is sorta handling OOM conditions already, but doing it in a buggy way. If
mACFeatures failed, then we will leak mAutoCompleteLearner, and vice versa.
Please fix this.

>-  (*aResult)->AddRef();
>+  NS_ADDREF(*aResult);

If you see anyone calling Release() directly, fix it with NS_RELEASE at the
same time. Otherwise this change will show up as new "leak".

>+nsGlobalHistory::FindRow(mdb_column aCol,
>+  err = mStore->FindRow(mEnv, kToken_HistoryRowScope,
>+                        aCol, &yarn,
>+                        &rowId, getter_AddRefs(row));

You are not doing anything with |err| after this. Please fix.

>+  nsCAutoString curl, chost, cpath;
>+  rv = NS_NewURI(getter_AddRefs(uri), aUrl);
>+  if (NS_SUCCEEDED(rv) && uri)
>+  {
>+    uri->GetSpec(curl);
>+    uri->GetHost(chost);
>+    uri->GetPath(cpath);
>+  }
>+  nsAutoString url(NS_ConvertUTF8toUCS2(curl).get());
>+  nsAutoString path(NS_ConvertUTF8toUCS2(cpath).get());
>+  nsAutoString host(NS_ConvertUTF8toUCS2(chost).get());
>+  // Feature 6: Whether url ends in .htm or .html
>+  nsAString::const_iterator start, end;
>+
>+  path.BeginReading(start);
>+  path.EndReading(end);
>+  aFeatures[5] = FindInReadable(NS_LITERAL_STRING(".htm"), start, end);

Rather than go through all this trouble, don't we have nsACString equivalents?
If not, I am fine with it.

>+  // Feature 12: Does the URL contain a ~ ?
>+  path.BeginReading(start);
>+  path.EndReading(end);
>+  aFeatures[11] = FindInReadable(NS_LITERAL_STRING("~"), start, end);

I just realized that a path would start with tilde, wouldn't it? Tilde anywhere
else in the path shouldn't probably be anything special. But a minor point,
fine either way (I don't expect to see tildes anywhere else in paths).

>Index: history/src/nsGlobalHistory.h
>===================================================================
>+      if ((mNumWeights > 0) && (mWeights != NULL))
>+      {
>+        delete [] mWeights;
>+      }

NULL again. Besides, mNumWeights shouldn't matter. Suppose we allocated
mWeights but mNumWeights is, or becomes 0; we'd leak. Write the if like this:

>+      if (mWeights)

Done.

After those changes, r=heikki
Attachment #110803 - Flags: review?(heikki) → review+
(Assignee)

Comment 36

14 years ago
About to attach a patch that addresses Heikki's review.

Heikky, here are my responses to your comments:

>Index: history/public/nsIBrowserHistory.idl
>===================================================================
>+    /**
>+     * setReferrerURL
>+     * Sets the referrer url for a url in global history
>
>I am concerned about this, even though our implementation does not do what I am
>really concerned about (namely, sending the referrer on URLs that the user goes
>to from global history, because that would be a privacy/security issue). It
>might be sufficient to add notes/warnings to the IDL and implementation that
>the referrer will/must not be sent when traversing to the URL. 
>
>(But in that light it begs the question: why is there a method for referrer
>when it must not used for its "intended" purpose? Maybe you need to add some
>verbiage to the IDL comment saying something about used for internal purposes
>or something like that. Am I being too picky?)

No, you aren't being too picky.  This method is a big hack.  I should really use
a standalone service for logging data as needed and called that service from
places where I need to output data.  But, this method will go away after the
data collection phase, so I'm taking a short cut.  I've added a comment to this
effect in the IDL file and renamed the method to OutputReferrerURL.

>Index: history/src/nsGlobalHistory.cpp
>===================================================================
>+    if (id == 0)
>
>Why not:
>
>if (!id)
>
>This is a general pattern. If you find yourself comparing something to 0/null,
>you should change it to |if(foo)| or |if(!foo)|.

Done.

>+    // Sanity check the URL
>+    PRInt32 len = PL_strlen(aURL);
>+    NS_ASSERTION(len != 0, "no URL");
>+    if (! len)
>+	return NS_ERROR_INVALID_ARG;
>
>That could simply be written as:
>
>if (!*aURL)
>  return NS_ERROR_INVALID_ARG;

Agreed.  I had cut and pasted this from existing code in AddPageToDatabase(). 
I'm changing my function but leaving AddPageToDatabase() alone.

>+    // XXX Handle out of memory condition
>+    mAutoCompleteLearner = new nsSigmoidPerceptron(AC_NUM_URL_FEATURES);
>+    mACFeatures = new PRFloat64[AC_NUM_URL_FEATURES];
>+
>+    if (!mAutoCompleteLearner || !mACFeatures)
>+    {
>+	// Disable learning
>+	mLearningMode = AUTOCOMPLETE_NO_LEARNING;
>+	mAutoCompleteLearner = nsnull;
>+	mACFeatures = nsnull;
>+    }
>
>This is sorta handling OOM conditions already, but doing it in a buggy way. If
>mACFeatures failed, then we will leak mAutoCompleteLearner, and vice versa.
>Please fix this.

Good catch.  Fixed.

>-  (*aResult)->AddRef();
>+  NS_ADDREF(*aResult);
>
>If you see anyone calling Release() directly, fix it with NS_RELEASE at the
>same time. Otherwise this change will show up as new "leak".

I didn't find any place in nsGlobalHistory.cpp.  I'll check other consumers of
FindRow() and fix if necessary.

>+nsGlobalHistory::FindRow(mdb_column aCol,
>+  err = mStore->FindRow(mEnv, kToken_HistoryRowScope,
>+			  aCol, &yarn,
>+			  &rowId, getter_AddRefs(row));
>
>You are not doing anything with |err| after this. Please fix.

This code is cut and pasted from the other FindRow methods in
nsGlobalHistory.cpp.  I don't know enough about the error values returned from
the mork database store to make this change.  If the other FindRows are ok with
ignoring err (which is of type mdb_err, not nsresult, so I can't use the
NS_SUCCEEDED macro on it), then I am too.  What do you think?

>+  path.BeginReading(start);
>+  path.EndReading(end);
>+  aFeatures[5] = FindInReadable(NS_LITERAL_STRING(".htm"), start, end);
>
>Rather than go through all this trouble, don't we have nsACString equivalents?
>If not, I am fine with it.

I remember trying to look for them but couldn't find them easily, so I resorted
to the nsAString methods.  I'll leave this as is for now but fix it up if we
decide to enable this code path by default.

>Index: history/src/nsGlobalHistory.h
>===================================================================
>+	if ((mNumWeights > 0) && (mWeights != NULL))
>+	{
>+	  delete [] mWeights;
>+	}


NULL again. Besides, mNumWeights shouldn't matter. Suppose we allocated
mWeights but mNumWeights is, or becomes 0; we'd leak. Write the if like this:

+	if (mWeights)


Oops, sorry for the NULL. I found that another one had creeped in as well and
changed it too.  This habit is gonna die hard!  I've also changed the if
condition like you suggested.
(Assignee)

Comment 37

14 years ago
>I didn't find any place in nsGlobalHistory.cpp.  I'll check other consumers of
>FindRow() and fix if necessary.

Duh, the only possible consumers of nsGlobalHistory::FindRow() are in
nsGlobalHistory.cpp because FindRow() is a protected method.  So, we are ok.
(Assignee)

Comment 38

14 years ago
Created attachment 110901 [details] [diff] [review]
Patch v0.8: Patch that addresses Heikki and Chris Aillon's comments

Fixed the if-else syntax and made the changes I mentioned in comment 36.
Attachment #110803 - Attachment is obsolete: true
(Assignee)

Updated

14 years ago
Attachment #110901 - Flags: superreview?(hewitt)
Attachment #110901 - Flags: review?(heikki)
Attachment #110901 - Flags: approval1.0.x?

Comment 39

14 years ago
While it's great that you're experimenting with how learning approaches could
improve the user experience, mozilla needs to keep the core stable.  Perhaps
this work would be better suited to a branch, extensions/ (not in default build),
or mozdev?  This would also allow you to avoid running the r/sr gauntlet while
doing the fine-tuning.  Once you've got a technique working nicely and have
enough testing to have confidence in it we'd of course welcome it in the mozilla
tree.

This need for testing/confidence especially applies to the 1.0.x branch, which
I notice you've flagged this patch for.
(Assignee)

Comment 40

14 years ago
Oops, sorry, tor.  I am new to the Bugzilla review/approval process.  I thought
that I needed to get approval for getting this checked into the trunk.  Heikki
tells me that I don't.  I'll go remove the approval request.
(Assignee)

Updated

14 years ago
Attachment #110901 - Flags: approval1.0.x?
(Assignee)

Comment 41

14 years ago
tor, I just realized that I didn't address the second part of your comment.

The reason I'm not doing this work on a branch is that I want this to be a part
of the 1.3 beta release which will get into the hands of lots of Mozilla
community members.  I then hope to convince some fraction of the downloaders to
turn on the data capture pref, run with it on for a couple of weeks, and then
send me back the data file.

The data collection phase of this project needs wide distribution and an easy
turn on/off mechanism for data capture.  I won't be able to meet these
requirements if I don't ship this in the default mozilla builds.

I remember that I sent out an email to drivers about this.  I'm going to paste
that email on this bug because it provides relevant info for people cc'd here.
(Assignee)

Comment 42

14 years ago
<snippet>
The long term work plan is to use machine learning techniques to better order
the autocomplete results in the url bar.  Right now, the goal is to capture data
(when the user chooses to turn on a pref) about urls visited, urls seen in the
autocomplete dropdown, and urls selected in the autocomplete session by the
user.  When we get the data back, Prof. Andrew Ng (the CS professor I'm working
with at Stanford) and I will try to make sense of it and identify the set of url
features and the learning algorithm that best predict which url the user will
visit next.

There are obvious privacy concerns about this.  To deal with them, there are two
data capture modes.  In one, url data is captured.  In the other, only numeric
features about the url are captured so that it is impossible to determine which
url the person visited.  The url data files are human readable so people who
participate in the data gathering phase will be able to see exactly what they
are sending back to us. 
</snippet>

I hope to weed out bugs over the next couple of weeks and then include a blurb
in the 1.3 beta release notes requesting Mozilla volunteers to help in data
collection.

Any suggestions on this plan are very welcome!
(Assignee)

Comment 43

14 years ago
Created attachment 110916 [details] [diff] [review]
Patch v0.9: Does better out of memory checking.

Heikki, the only thing that I want you to take a look at in this new patch is
the following code snippet that handles the out of memory condition for the
heap allocated variables:

+  if (mLearningMode > AUTOCOMPLETE_NO_LEARNING || 
+    mDataCaptureMode > URLDATACAPTURE_NONE) {
+    // Create perceptron and feature array.
+    mAutoCompleteLearner = new nsSigmoidPerceptron(AC_NUM_URL_FEATURES);
+	  if (mAutoCompleteLearner) {
+      mACFeatures = new PRFloat64[AC_NUM_URL_FEATURES];
+      if (!mACFeatures) {	  
+	 delete mAutoCompleteLearner;
+	 mAutoCompleteLearner = nsnull;
+	 mLearningMode = AUTOCOMPLETE_NO_LEARNING;
+      }
+    }
+    else {
+      mLearningMode = AUTOCOMPLETE_NO_LEARNING;
+    }
+  }
+  else {
+    mAutoCompleteLearner = nsnull;
+    mACFeatures = nsnull; 
+  }

Apart from this, your review+ of the previous patch carries forward to this
one.

Thanks!
Attachment #110901 - Attachment is obsolete: true
(Assignee)

Updated

14 years ago
Attachment #110916 - Flags: superreview?(hewitt)
(Assignee)

Comment 44

14 years ago
The above code snippet is making the assumption that mAutoCompleteLearner and
mACFeatures are initialized to nsnull in the constructor.
Comment on attachment 110916 [details] [diff] [review]
Patch v0.9: Does better out of memory checking.

More drive-by comments:

operators |delete| and |delete[]| are both null-safe so there is no need to do:

if (foo)
  delete foo;

Rather, just do

delete foo;


Also, you missed switching some braces in 

nsresult nsGlobalHistory::GetRowValue(nsIMdbRow *aRow, mdb_column aCol,
PRFloat64 *aResult)



In the middle of +nsresult
+nsGlobalHistory::FillInputFeatures(nsAString &aUrl,
+				    PRFloat64 *aFeatures)

Around this code:

+	 nsCOMPtr<nsIRDFDataSource> bookmarkDS = do_QueryInterface(bs, &rv);
+		    if (NS_SUCCEEDED(rv) && bookmarkDS) {
   ^^^^ Evil tab characters, please remove in favor of spaces.

+	   nsCOMPtr<nsIRDFNode> nodeType;


- In  NS_IMETHODIMP
 nsGlobalHistory::OnAutoComplete(const PRUnichar *searchString,
				 nsIAutoCompleteResults *previousSearchResult,
				 nsIAutoCompleteListener *listener)

+  nsCOMPtr<nsISupportsArray> results;

nsISupportsArray is considered deprecated and efforts to remove it from the
tree have already been started.  Looking at your usage, you want to use
nsCOMArray, instead.

Still in the same method,

+  // If searchString found in the previous search results, assume 
+  // that the user selected that url (searchString) from the previous 
+  // list of autocomplete results.  
+  if (found)
+  {	    

Please fix the brace.

And towards the end of that method,

+    if (mLearningMode >= AUTOCOMPLETE_ENABLE_TRAINING)
+    {
+      mAutoCompleteLearner->SaveWeights();
+    }
+  }
+
+  return rv;

Again the brace.


Also on a separate note, you may wish to consider using ++foo rather than foo++
as it is more efficient, and you use those quite frequently.

Comment 46

14 years ago
Comment on attachment 110916 [details] [diff] [review]
Patch v0.9: Does better out of memory checking.

sr=me
Attachment #110916 - Flags: superreview?(hewitt) → superreview+
Nisheeth, that looks ok once you remove the tabs.

Comment 48

14 years ago
I would expect that the same group of people who would be willing to mail
back a data collection file would also be willing to download a special
testing build.
Attachment #110621 - Flags: review?(heikki)
Attachment #110901 - Flags: review?(heikki)
(Assignee)

Comment 49

14 years ago
Created attachment 111095 [details]
Sample data file without url information
(Assignee)

Comment 50

14 years ago
Created attachment 111096 [details]
Sample data file with url information
(Assignee)

Comment 51

14 years ago
Created attachment 111259 [details] [diff] [review]
Patch v1.0: Final patch

This patch contains the following changes:
- Fixed tab characters and if-else braces pointed out by Chris and Heikki.
- Did not replace nsISupportsArray because it is a parameter in a method in
nsIAutoCompleteResults, an interface that existed before my patch.
- Replaced post-increments with pre-increments in FillInputFeatures().
(Assignee)

Updated

14 years ago
Attachment #110916 - Attachment is obsolete: true
(Assignee)

Comment 52

14 years ago
Patch v1.0 checked in.

Comment 53

14 years ago
ugh. this is way cool but it bloated us by a whopping 22k of static footprint. I
really like the concept but I'm really disappointed in the bloat.

I think this one of those features where we need to decide if its worth the
feature for the footprint hit? how about runtime footprint, do we have any idea
what the impact was there?
(Assignee)

Comment 54

14 years ago
Alec, runtime footprint isn't affected for the user unless some prefs are
enabled that switch on data capture or learning.

The 21k static footprint increase is temporary.  Once we've received data back
from enough mozilla users who participate in the data collection phase, we'll
remove the data capture code.  My hope is to take the data capture code out
after the 1.3 beta release and before the 1.3 final release.

The learning code might also be taken out if data analysis indicates that the
same set of feature weights can be used for all users.  Then, we will simply use
the hard coded weights rather than dynamically adjust them for each user.

So, the question before us is if we can accept a temporary 21 K increase in
static footprint for this research experiment.

What do you think?

Comment 55

14 years ago
Thanks for the clarification - I'm fine with the temporary code inclusion (and
in fact, I'm now interested in participating!) but there are only a few days
left to mozilla1.3 and I'd hate to see experimental/temporary code bloating a
release - can we make sure this gets backed by the 1.3 beta release? Is a week
of data gathering enough? (hopefully!)

Comment 56

14 years ago
If this needs community participation in a hurry, better write up a web page on
mozilla.org that describes how to turn this on in its various modes (I can't
figure it out) and push it to mozillazine.org etc.
People are waiting for Your sign Nisheeth :) Just tell us what to do...

Comment 58

14 years ago
Alex Flett:
> can we make sure this gets backed by the 1.3 beta release? Is a week
> of data gathering enough? (hopefully!)

I was under the impression that the plan was to include this in 1.3 Beta (see
comment 42).

Adam D. Moss:
> If this needs community participation in a hurry, better write up a web page
> on mozilla.org that describes how to turn this on in its various modes (I
> can't figure it out) and push it to mozillazine.org etc.

A MozillaZine article is an excellent idea and should get a lot of people
participating.

Comment 59

14 years ago
I stumbled across some of this code in Purify. Specifically mDataCaptureMode and
mLearningMode are not initialized in nsGlobalHistory::Init (I don't have these
pref's) and they are used later on. Not sure what the state of this bug is, but
I thought I'd post here since it's still open. Let me know if I need to file a
separate bug.
(Assignee)

Comment 60

14 years ago
David, thanks for the heads up.  I'm looking into the bug.  Bug 184167 is
probably a result of this...
(Assignee)

Comment 61

14 years ago
David, I've checked in a fix.  Thanks for finding the bug!

I hope to have a document describing this stuff written up for mozillazine and
mozilla.org in time for the 1.3 beta release.  Till then, if you can run the
nightly builds with the "browser.history.url.datacapture.mode" pref set to 2,
you will start seeing a file called "url-data.txt" in your profile directory. 
If some of you can try this out and report problems back to me, that would be
great!  I'd really like to ferret out lurking bugs before the 1.3 beta release.

Thanks!

Comment 62

14 years ago
Alright, I've been running CVS-HEAD (with mode set to 1, not 2) for a few days
now.  Haven't seen any problems.  I'm turning it off for the moment because
url-data.txt is starting to uncomfortably fill up my /home partition. :)

Is there a place where I can usefully send my url-data.txt or should I start
over when the data-collection 'officially' beings?
(Assignee)

Comment 63

14 years ago
Adam, I'd love to take a peek at the data.  Would you zip it up and email it to
me, please?  Thanks a lot!

Not sure if I'll be able to get to this, but, ideally there should be a web page
so that the data file can be submitted anonymously.  Anyone up for creating and
hosting such a thing?  It would probably be a simple CGI that accepts the data
file and emails it to me.  If it can double-check that the data file is zipped
and, if not, zip it up before emailing it, even better!

Comment 64

14 years ago
> Not sure if I'll be able to get to this, but, ideally there should be a web
> page so that the data file can be submitted anonymously.  Anyone up for
> creating and hosting such a thing?  It would probably be a simple CGI that
> accepts the data file and emails it to me.  If it can double-check that the
> data file is zipped and, if not, zip it up before emailing it, even better!

You mean like this?

http://www.mozillazine.org/misc/bug-182366-submit-anon-url-data.html

It'll need more testing before going into production and I'd have to check with
Kerz that it's okay for MozillaZine to host it. The wording, of course, can be
updated (and probably should be).

The script (it's PHP, not CGI) just grabs the zip file and mails it to
nisheeth@netscape.com as an attachment named url-data.zip.

The script checks that the data is zipped (i.e. that it's receiving a file of
the type application/x-zip-compressed) but it doesn't zip up the data if it
isn't. It also doesn't check that the zip file contains a url-data.txt file.

But essentially, it should work (though, as I said, it will need testing).

How big are these (zipped) url-data files going to be? I think the form's limit
is 2MB (though I believe this can be changed).

Comment 65

14 years ago
Created attachment 111782 [details]
The PHP script behind the anonymous form

Here's the PHP script so you can all suggest any necessary changes (or just
mock my coding skills).

Updated

14 years ago
Attachment #111782 - Attachment description: THe PHP script behind the anonymous form → The PHP script behind the anonymous form
(Assignee)

Comment 66

14 years ago
Alex, this is really cool!  Thanks a lot for whipping this up!

Adam, please try and submit your data file from Alex's page.  Let the testing begin!

Comment 67

14 years ago
I can't use the form to submit the data, because I deleted the data after I
emailed it to you. :)

Comment 68

14 years ago
patch v0.9 caused an embedding smoketest blocker today (see bug 189222).  not
sure why it took so long to see the regression.
Alex, there is an error in Your script. Probably You should change line

} elseif ($_FILES['data']['type'] != "application/x-zip-compressed") {

to accept "application/zip" type too.

I used to send my data today and recived error:
"You forgot to zip up your data before sending it! We told you that it wouldn't
work.

The type of file you tried to send was application/zip. A zip file is
application/x-zip-compressed."

Comment 70

14 years ago
> Alex, there is an error in Your script. Probably You should change line
>
> } elseif ($_FILES['data']['type'] != "application/x-zip-compressed") {
>
> to accept "application/zip" type too.

Whoops... should be fixed now. Try sending your data again.

I don't suppose there's any other MIME types that zip files can be?
well, if you're unlucky, they can be application/octet-stream, I suppose

Comment 72

14 years ago
Will this bug better sort the autocomplete results when starting to type in a
recipient's name? (I read most of the bug and did a "Find" for "mail" but found
nothing conclusive.)
Peter, you should consider looking at the patch.  This bug has nothing to do
with mail so far (though the results could perhaps be applied there sometime in
the future if the experiment is a success).
(Assignee)

Comment 74

14 years ago
As promised, here's the link to the project page titled "Applying machine
learning to autocomplete" on mozilla.org.  Any feedback on the page is very
welcome!  Please participate in the data gathering process!  Thanks!

http://www.mozilla.org/projects/ml/autocomplete/index.html

Comment 75

14 years ago
OK, I've updated the anon form to link to the page. I've checked with Kerz and
he's okay about MozillaZine hosting the form as long as it doesn't spike the
traffic too much so I'll be keeping an eye on that over the next few days.

As the form hasn't really been tested much, it would be great if anyone here who
gets it to work

a) successfully; or
b) not at all

could post a comment or email me, at least for the first few days. Sorry, should
have sorted this earlier.

Nisheeth, do you want me to modify the form so that it also emails the data to
ang@cs.stanford.edu? Also, if there's any other changes you want made, just yell.
(Assignee)

Comment 76

14 years ago
Thanks for updating the text on the page, Alex!  I think it looks great!

Yes, please add ang@cs.stanford.edu to the recipient list for submitted data files.

Comment 77

14 years ago
> Yes, please add ang@cs.stanford.edu to the recipient list for submitted data
> files.

OK, the script should now Cc all the data files to him. I've also updated all
the mailto: links to include ang@cs.stanford.edu.

> Any feedback on the page is very welcome!

A couple of things:

* You variously use "I" and "us" on the page. Probably best if you spell out who
you mean, as nobody reads the authors line at the top.

* In point 3 of the 'Data Collection' section, you could tell people to use
about:config rather than manually altering prefs.js (which is much easier to
screw up):

===

Type about:config in Mozilla's Location Bar to bring up a list of all your
preferences. Right-click on any of them (Cmd + Click if you're using a Mac) to
bring up the context menu and then select 'Integer' from the 'New' submenu.
Enter the following as the preference name:

<code>browser.history.url.datacapture.mode</code>

When asked for the value, enter <code>2</code> or <code>1</code>. Both settings
will dump data into a file called url-data.txt in your profile directory. The
different data capture modes are explained below.

===

* If you really wanted, you could include the anonymous submission form directly
on the page:

<form
action="http://www.mozillazine.org/misc/bug-182366-submit-anon-url-data.html"
enctype="multipart/form-data" method="POST">
<p><input type="file" name="data"> <input type="submit" name="send"
value="Send"></p>
</form>

That should work, assuming I'm not forgetting about any security measures
surrounding uploaded data from a form on one domain being processed by a script
on another.

* You might want to include a brief explanation of exactly what machine learning is.

Comment 78

14 years ago
so are we backing this out, now that 1.3beta has shipped?

Comment 79

14 years ago
Oh.  My.  God.

Before you try to make autocomplete be all DWIMmy AI, how about first making it
be  something (anything) less than completely incomprehensible?

See very simple and obvious proposal in bug
http://bugzilla.mozilla.org/show_bug.cgi?id=175725

Comment 80

14 years ago
Alex, I just got this message from your script:
> The type of file you tried to send was application/zip. 
> A zip file is application/x-zip-compressed or application/zip.

Nice, huh?
(Assignee)

Comment 81

14 years ago
alex, i updated http://www.mozilla.org/projects/ml/autocomplete/index.html with
your prose about "about:config".

alecf, yes the plan is still to back this out for 1.3 final.  Bug 193347 is a
1.3 final blocker and will take care of the code bloat.

Comment 82

14 years ago
> Alex, I just got this message from your script:
> > The type of file you tried to send was application/zip. 
> > A zip file is application/x-zip-compressed or application/zip.
>
> Nice, huh?

Sorry, looking into it.

Comment 83

14 years ago
Andreas, do you want to try sending your file again? I think the problem's fixed
now.

Comment 84

14 years ago
Alex: ok, same file worked now.

Comment 85

14 years ago
> Alex: ok, same file worked now.

Cool. It was a bracket thing. I didn't have enough brackets so the NOT
operator's scope was restricted to just the left-hand side of the... wait,
nobody cares!
What's the status of backing this out?  At this point it needs to come out of
both 1.3 branch and 1.4a trunk.....

Comment 87

14 years ago
> What's the status of backing this out?  At this point it needs to come out of
> both 1.3 branch and 1.4a trunk.....

See bug 193347. I believe the fix was landed on the trunk beofe 1.3 was branched.

Comment 88

14 years ago
I have enabled the ("browser.urlbar.autocomplete.learning.mode", 2) and the most
evident change is a delay when typing in the location bar. Especially the first
character of any url freezes the browser (ID:2003021008, linux) several seconds. 

Perhaps the search tree is much smaller, if the proposed url is calculated when
at least two or three characters are already known (apart from http(s)://, www,
etc.). For these cases, the classical algorithm should be applied.
(Assignee)

Comment 89

14 years ago
Gunnar, I've received the performance complaint from a couple of other people as
well who've had large history files.  The learning code wasn't streamlined for
performance at all and meant only as a fun prototype for people to play around
with.  Once we get the data back from Mozilla volunteers and if we find that a
learning algorithm works better than the current algorithm, we'll implement it
with performance in mind.  It would really help us to know what the processor
speed and memory of your Linux box.  It would also help us to get our hands on
your history file so that we can test our learning algo against it...

Comment 90

14 years ago
Is the autocomplete code still in the Mozilla Builds? I downloaded the RC1 and
tried to activate it, but the feature seems to be broken. Or is the code removed
due to the 1.3 release?

Comment 91

14 years ago
no, it was removed from all builds after the 1.3beta release.

Comment 92

14 years ago
Does the project still need additional learning data?  If so, is it possible to
get a post-1.3 version with this turned on?
(Assignee)

Comment 93

14 years ago
Gordon, the project has received enough data (around 400 data files).  So, there
isn't any need to put the code back in.  I guess more data is always good so in
case you want to participate in this for a couple of weeks, please use the 1.3
beta builds for that time to collect data.  We should still be able to use
whatever data you submit to us a couple of weeks from now.  Thanks!

Comment 94

14 years ago
This project is very important, in fact, it is first and solid step towards 
creation really intelligent user interface for Mozilla applications.

It is a strategic project, so, code shouldn't be removed, even if there were 
some bugs.

As to training data: I think those data should mostly come from user, as a 
result as user-specific experience: Mozilla application(s) can collect data from 
user and use them for autocompletion.

Thanks Nisheeth!

Updated

14 years ago
Blocks: 168902
Comment on attachment 110901 [details] [diff] [review]
Patch v0.8: Patch that addresses Heikki and Chris Aillon's comments

removing obsolete review request
Attachment #110901 - Flags: superreview?(hewitt)
What's going on with this? Is it planned for 1.4? 1.5?
(Assignee)

Comment 97

14 years ago
We're in the middle of data analysis on the data that we got back.  A quick 
update is that we have written up a perl script to convert the data file into a 
stream of url feature vectors.  We've also written up a Matlab script to run a 
softmax regression probabilistic model on it.  The short term plan is to 
compare how this model does against the current Mozilla algorithm on the data.  
If anyone is interested in more detail about what we are doing, please contact 
me.

I think results from this work will start showing up in the 1.5 timeframe.

Updated

14 years ago
Blocks: 205158

Comment 98

14 years ago
Came across this from bug 177360. I've heard a couple people ask about it in the
mozillazine forums, and 1.5a is out. Inquiring minds want to know: what's up
with this project?

Comment 99

14 years ago
Is this working now in moz 1.5?

I didn't get any url-data.txt file or any effect by enabling
user_pref("browser.urlbar.autocomplete.learning.mode", 2);
>Is this working now in moz 1.5?
no

Comment 101

13 years ago
How about setting a target milestone for this project?

Prog.

Comment 102

13 years ago
the author/driver behind this project is gone now. I suggest we mark this WONTFIX.  

Actually, he just needs to update his bugmail address I think.  I talked to
nisheeth recently and he still is working on this at stanford, though I'm not
sure at what pace or timeframe he is looking at, or if he is doing it for purely
academic reasons.  I'll see if I can get him to comment here.

Comment 104

13 years ago
I would still be interested to see his results, although I am quite happy with
how the address bar works at the moment. I also fear that it would make mozilla
a bit slower again. 
caillon:  Any progress in getting nisheeth's status?  Is this to still be
considered an 'assigned' bug?

(Assignee)

Comment 106

13 years ago
Sorry for the long silence.  I was back in school and had very little time away from course work.  I 
worked on this off and on last year and then continued this work as part of my class project for the 
machine learning class I took earlier this year.  Happily, now we are at a stage where we can make a call 
on whether to use the machine learning approach or not.

The final iteration of the machine learning approach saves the user an average of one keystroke per 
autocomplete event (on a data set of 237 users) as compared to the current Mozilla 
algorithm.  I am about to upload the project report and some slides on mozilla.org that describe the 
different ideas we tried and the results we achieved.

The final result is good but not great.  The question we need to discuss now is whether a one key per 
event advantage justifies changing the current Mozilla autocomplete code...  I'd love some input on this 
question from all of you who've been following the progress of this bug.

I will post links to the project report and the slides (which basically summarize the report) once I've 
uploaded them.  Please read them and post your comments to this bug.  Thanks a lot!

Comment 107

13 years ago
Thanks for the update.

In my experience, Mozilla already has a feature that provides about one key per 
event advantage compared to its default autocomplete, it's called "Autocomplete
best match as you type" (browser.urlbar.autoFill=true).

The only problem I find with this feature is that it's disabled by default, and
this brings me to where machine learning could really help Mozilla: Better
default prefs. I have a firm belive that gathering information from users (about
their browsing pattern) can help Mozilla choose better defaults. I won't tire
anyone with this off topic issue, as it probably requires a new bug.

Prog.
Thanks for the update.  I look forward to seeing what you have.

Comment 109

13 years ago
One key isn't much, but it may still feel to work better than the default
behaviour or the autoFill behaviour. I would still like to experiment with it.
Maybe it should be an installable addon?
(Assignee)

Comment 110

13 years ago
Just uploaded the report at http://www.mozilla.org/projects/ml/autocomplete/ac-report.html.  Also, a 
PDF version of the slides that summarize the report is at http://www.mozilla.org/projects/ml/
autocomplete/ac-preso.pdf.

Comment 111

13 years ago
i tried to configure mozilla web browser for the data capture mode '2' to
participate in this autocomplete project but the file "url-data.txt" was not
created.
Product: Core → Mozilla Application Suite
Nisheeth,
Are you still working on this ?
QA Contact: claudius → nobody

Comment 113

8 years ago
Now that we have the smart locationbar (T3h Awesomebar) with the places freency algorithm, I'm marking this bug FIXED and closing.
Status: ASSIGNED → RESOLVED
Last Resolved: 8 years ago
Resolution: --- → FIXED

Updated

8 years ago
No longer blocks: 205158
You need to log in before you can comment on or make changes to this bug.