Closed Bug 317472 Opened 19 years ago Closed 17 years ago

Save custom encoding (last charset) in annotation service

Categories

(Firefox :: Bookmarks & History, defect, P2)

defect

Tracking

()

RESOLVED FIXED
Firefox 3

People

(Reporter: brettw, Assigned: mak)

References

(Blocks 1 open bug)

Details

(Keywords: addon-compat, intl, regression)

Attachments

(2 files, 10 obsolete files)

When the user manually specifies an encoding for a web page, it should be saved in the annotation service and re-used when the page is later loaded. Currently, this information appears to be stored in bookmarks, so at least those are consistent when you go back to them. This should happen for all web pages, and for the new bookmark service.
Priority: -- → P3
*** Bug 327328 has been marked as a duplicate of this bug. ***
Flags: blocking1.9a2?
Assignee: brettw → nobody
Also when I browse FTP server I wan't select charset manually for each directory- firefox must remember selected charset for all domain in current session at least.
Keywords: intl
notes:  there's a chunk of code in browser.js that refers to this not being implemented in places yet.

#ifndef MOZ_PLACES_BOOKMARKS
          // FIXME: Bug #317472, we don't have last charset in places yet.
          else if (/%s/.test(shortcutURL) ||
                   (aPostDataRef && /%s/.test(aPostDataRef.value))) {
            try {
              charset = BMSVC.getLastCharset(shortcutURL);
            } catch (ex) {
            }
          }
#endif

see http://lxr.mozilla.org/seamonkey/source/browser/base/content/browser.js#1994
Flags: blocking1.9a2?
Attached patch fix v1 (obsolete) — Splinter Review
save charset for bookmarked URIs, and use it to properly encode bookmark keyword terms (the only consumer, atm).
Assignee: nobody → dietrich
Status: NEW → ASSIGNED
Attachment #262054 - Flags: review?(mano)
Isn't it just the right time to start using places features?

I think we should silently annotate any url with the charset annotation whenever the charset is set manually (i.e. from the View->Character Encoding menu).
Attachment #262054 - Flags: review?(mano)
Attached patch fix v2 (obsolete) — Splinter Review
stores charset whenever it's manually selected.
Attachment #262054 - Attachment is obsolete: true
Attachment #262470 - Flags: review?(mano)
Comment on attachment 262470 [details] [diff] [review]
fix v2

>Index: browser/base/content/browser-places.js
>===================================================================

>+    var charSet;
>     try {
>       title = webNav.document.title;
>       description = PlacesUtils.getDescriptionFromDocument(webNav.document);
>+      charSet = webNav.document.characterSet;
>     }
>     catch (e) { }
>-    PlacesUtils.showMinimalAddBookmarkUI(url, title, description);
>+    PlacesUtils.showMinimalAddBookmarkUI(url, title, description, null, null, null, null, charSet);
>   },
> 
Please add a comment here explaining that this is done for the case of %s (smart keywords?) URI, otherwise I might optimize that later ;)

>Index: browser/base/content/browser.js
>===================================================================
> 
> function SetForcedCharset(charset)
> {
>     BrowserSetForcedCharacterSet(charset);
> }
> 
> function BrowserSetForcedCharacterSet(aCharset)
> {
>-  var docCharset = getBrowser().docShell.QueryInterface(
>-                            Components.interfaces.nsIDocCharset);
>+  var browser = getBrowser();
>+  var docCharset = browser.docShell.QueryInterface(
>+                           Components.interfaces.nsIDocCharset);

why is this change?

>   docCharset.charset = aCharset;
>+#ifdef MOZ_PLACES

please keep all UE-changes under MOZ_PLACES_BOOKMARKS.

>Index: browser/components/places/content/controller.js
>===================================================================

>+function PlacesEditItemCharSetTransaction(aURI, aCharSet) {

places-terminology-wise, "item" is kinda misleading; this transaction applies to URIs, not moz_bookmarks items.
Attachment #262470 - Flags: review?(mano)
Blocks: 334408
Thinking about this more, we don't need a transaction here at all once we:
  1) Make sure page-annotations are removed from the db when the page itself is removed from moz_places.
  2) Make sure that removing a bookmark removes its moz_places entry if the later has no visits.
No longer blocks: 334408
Target Milestone: --- → Firefox 3 beta1
Blocks: 333645
Target Milestone: Firefox 3 M7 → Firefox 3 M8
Target Milestone: Firefox 3 M8 → Firefox 3 M9
Target Milestone: Firefox 3 M9 → Firefox 3 M10
Target Milestone: Firefox 3 M10 → Firefox 3 M11
Bug 409945 changed the way charset annos are saved in import/export, that is because if we have them as a page annotation on a bookmark, then the page will never get deleted from moz_places (its an epire_never anno), and the anno is not removed elsewhere.
So it is now a itemAnno, that is removed with bookmark.
Target Milestone: Firefox 3 beta3 → ---
Wups, should have nominated this months ago: since it's apparently the one true bug for getLastCharset, without it, "Add a keyword for this search" is useless-UI for an awful lot of people whose searches aren't US-ASCII only.

And isn't there something that gavin had just finally fixed for 2.0, where docShell asks for a LAST_CHARSET so that you don't have to manually change it every time you go back to something where we don't sniff it correctly?
Severity: enhancement → normal
Flags: blocking-firefox3?
Keywords: regression
Definitely blocking for l10n functionality regressions.
Flags: blocking-firefox3? → blocking-firefox3+
Priority: P4 → P2
Assignee: dietrich → mak77
Status: ASSIGNED → NEW
Status: NEW → ASSIGNED
Attached patch wip (obsolete) — Splinter Review
first pass at this.

Imho we should do this:

- if an url is bookmarked and the user chooses a charset from View / CharSet we set an itemAnno EXPIRE_NEVER, this anno will be deleted if the bookmark is deleted

- if an url is not bookmarked and the user chooses a charset from View / CharSet we set a pageAnno EXPIRE_WITH_HISTORY. We should not set an EXPIRE_NEVER here or we will end in a loop (page is not deleted by moz_places because it has an EXPIRE_NEVER anno, and we don't delete the anno because the page is still in moz_places, argh!)

- charset itemAnno is served before charSet pageAnno, when a user deletes a bookmark but the page is still in history we can still serve pageAnno

- if a page gets bookmarked? TBD, we should probably copy the pageAnno to an itemAnno so it's not get deleted with history

- if a page gets unbookmarked? TBD, we could probably copy the itemAnno to a pageAnno, but that appear to be a minor useful thing (we could most probably still have the old pageAnno setup before bookmarking)


IMHO, we don't need a specific interface in bookmark page dialogs, worst if that's a text field (should be a dropdown at least, to avoid entering bogus charsets). We already have an UI to choose the charset in the menus, and that could be enough: i set a charset there and is the backend that mantains it for page or bookmark, based on situation (and move it accordingly).
Attachment #262470 - Attachment is obsolete: true
Attached patch wip v2 (obsolete) — Splinter Review
Attachment #305514 - Attachment is obsolete: true
Attached patch wip v3 (obsolete) — Splinter Review
unbitrot of old dietrich's code merged-in
Attachment #305548 - Attachment is obsolete: true
Target Milestone: --- → Firefox 3
Whiteboard: [has wip patch]
Whiteboard: [has wip patch] → [has wip patch][swag: 1d]
Attached patch wip v4 (obsolete) — Splinter Review
- nsNavHistory implements nsICharsetResolver
- added new methods getCharsetForURI and setCharsetForURI with unit test

Thank you Gavin for pointing me to old FX2 implementation, that was really useful
Attachment #305725 - Attachment is obsolete: true
Attached patch patch (obsolete) — Splinter Review
first working and reviewable version
Attachment #310492 - Attachment is obsolete: true
Attachment #310543 - Flags: review?(dietrich)
Whiteboard: [has wip patch][swag: 1d] → [has patch] [needs review Dietrich]
Summary: Save custom encoding in annotation service → Save custom encoding (last charset) in annotation service
Comment on attachment 310543 [details] [diff] [review]
patch

>+
>+NS_IMETHODIMP
>+nsNavHistory::SetCharsetForURI(nsIURI* aURI,
>+                               const nsACString& aCharset)
>+{
>+  // Get services
>+  nsAnnotationService* annosvc = nsAnnotationService::GetAnnotationService();
>+  NS_ENSURE_TRUE(annosvc, NS_ERROR_OUT_OF_MEMORY);
>+  nsNavBookmarks* bmsvc = nsNavBookmarks::GetBookmarksService();
>+  NS_ENSURE_TRUE(bmsvc, NS_ERROR_OUT_OF_MEMORY);
>+
>+  // Detect if aURI is bookmarked
>+  PRBool isBookmarked = PR_FALSE;
>+  nsresult rv = bmsvc->IsBookmarked(aURI, &isBookmarked);
>+  NS_ENSURE_SUCCESS(rv, rv);
>+
>+  if (isBookmarked) {
>+    // Try setting an item annotation
>+    nsTArray<PRInt64> bookmarks;
>+    rv = bmsvc->GetBookmarkIdsForURITArray(aURI, &bookmarks);
>+    NS_ENSURE_SUCCESS(rv, rv);
>+
>+    // Charset annotations for the same bookmarked uri should be
>+    // maintained in sync, so we must set it for all bookmark ids
>+    if (aCharset.IsEmpty()) {
>+      // remove the current item charset annotation
>+      // don't bother on errors
>+      for (PRInt32 i = 0; i < bookmarks.Length(); i++)
>+        (void)annosvc->RemoveItemAnnotation(bookmarks[i], CHARSET_ANNO);
>+    }
>+    else {
>+      // Set item charset annotation (silently overwrite if already exists)
>+      nsAutoString annoValue;
>+      CopyASCIItoUTF16(aCharset, annoValue);
>+      for (PRInt32 i = 0; i < bookmarks.Length(); i++) {
>+        rv = annosvc->SetItemAnnotationString(bookmarks[i], CHARSET_ANNO,
>+                                              annoValue, 0,
>+                                              nsAnnotationService::EXPIRE_NEVER);
>+        NS_ENSURE_SUCCESS(rv, rv);
>+      }
>+    }

hrm, i don't like this. it's redundant data *and* it's not accurate. what if the user bookmarks the uri again after this? that bookmark would not have the anno set.

i think i'd prefer to only set the annotation on the URI in this api, not on the bookmark. when saving a bookmark w/ a forced charset, set that anno on that specific item from the front-end code, so that it retains affinity w/ the bookmark created when the action occurred.

in GetCharsetForURI(), you could iterate over the bookmarks, if no anno is found for the URI. (should always check URI first, since that'll be set already anyway, iiuc)

>+// nsNavHistory::GetCharsetForURI
>+//
>+// Get the last saved charset for an URI.
>+// If the URI is bookmarked read the charset from an item annotation
>+// else read it from a page annotation.
>+
>+NS_IMETHODIMP
>+nsNavHistory::GetCharsetForURI(nsIURI* aURI, 
>+                               nsACString& aCharset)
>+{
>+  // Get services
>+  nsAnnotationService* annosvc = nsAnnotationService::GetAnnotationService();
>+  NS_ENSURE_TRUE(annosvc, NS_ERROR_OUT_OF_MEMORY);
>+  nsNavBookmarks* bmsvc = nsNavBookmarks::GetBookmarksService();
>+  NS_ENSURE_TRUE(bmsvc, NS_ERROR_OUT_OF_MEMORY);
>+
>+  // Detect if aURI is bookmarked
>+  PRBool isBookmarked = PR_FALSE;
>+  nsresult rv = bmsvc->IsBookmarked(aURI, &isBookmarked);
>+  NS_ENSURE_SUCCESS(rv, rv);
>+
>+  if (isBookmarked) {
>+    // Try getting an item annotation
>+    nsTArray<PRInt64> bookmarks;
>+    rv = bmsvc->GetBookmarkIdsForURITArray(aURI, &bookmarks);
>+    NS_ENSURE_SUCCESS(rv, rv);

in the bookmarked case, could save a db query by just calling this first and checking length, instead of calling IsBookmarked.

>+
>+    // charSet annotations for the same bookmarked uri are
>+    // maintained in sync, so we can use any bookmark id
>+    nsAutoString charset;
>+    rv = annosvc->GetItemAnnotationString(bookmarks[0], CHARSET_ANNO, charset);
>+    if (NS_SUCCEEDED(rv)) {
>+      LossyCopyUTF16toASCII(charset, aCharset);

use CopyUTF16toUTF8()

if a bookmark is added after the charset is set by the user (ie, it won't have the anno) and is the first in the array returned by getBookmarkIdsForURITArray(), then no charset will be found.

>+      return NS_OK;
>+    }
>+  }
>+  else {
>+    // aURI is not bookmarked, try getting a page annotation
>+    // getPageAnnotation throws if annotation is not found
>+    nsAutoString charset;
>+    rv = annosvc->GetPageAnnotationString(aURI, CHARSET_ANNO, charset);
>+    if (NS_SUCCEEDED(rv)) {
>+      LossyCopyUTF16toASCII(charset, aCharset);
>+      return NS_OK;
>+    }

in the case where a URI is bookmarked after the charset is chosen by the user, this will never get called and a charset will not be found.

should probably:

1. check for the URI anno first (since that's set when the charset is forced)
2. if doesn't exist, iterate over the bookmarks, checking each until found (or not)

>@@ -6285,8 +6412,40 @@ nsNavHistory::RecalculateFrecenciesInter
>     NS_ENSURE_SUCCESS(rv, rv);
> 
>     rv = mDBUpdateFrecencyAndHidden->Execute();
>     NS_ENSURE_SUCCESS(rv, rv);
>   }
> 
>   return NS_OK;
> }
>+
>+// nsICharsetResolver **********************************************************
>+
>+NS_IMETHODIMP
>+nsNavHistory::RequestCharset(nsIWebNavigation* aWebNavigation,

this should be testable from chrome mochitest

>+ *
>+ * The Original Code is mozilla.org code.

Places test code?

>+ *
>+ * The Initial Developer of the Original Code is Google Inc.
>+ * Portions created by the Initial Developer are Copyright (C) 2005
>+ * the Initial Developer. All Rights Reserved.

not google, not 2005?

>+    
>+    if (this._charSet) {
>+      PlacesUtils.history.setCharsetForURI(this._bookmarkURI, this._charSet);
>+    }

nit: no brackets for single lines
Attachment #310543 - Flags: review?(dietrich) → review-
Whiteboard: [has patch] [needs review Dietrich] → [needs new patch]
> hrm, i don't like this. it's redundant data *and* it's not accurate. what if
> the user bookmarks the uri again after this? that bookmark would not have the
> anno set.

in placesTransaction we copy the existing annotation to the new item.
While when you delete a bookmark, the annotation, if present, is copied to the page.
Having both could be reduntant but while bookmark charset is retained until bookmark is deleted, page charset is retained only until the page has some visit.

> i think i'd prefer to only set the annotation on the URI in this api, not on
> the bookmark. when saving a bookmark w/ a forced charset, set that anno on that
> specific item from the front-end code, so that it retains affinity w/ the
> bookmark created when the action occurred.

only when forcing charset? So if i have 2 bookmarks pointing to the same url (in menu and toolbar) i'll have to set the charset twice, one for each of them...

> in GetCharsetForURI(), you could iterate over the bookmarks, if no anno is
> found for the URI. (should always check URI first, since that'll be set already anyway, iiuc)

Not if the bookmark exists but there are no visits, we are sure that a created charset for a bookmark exist, not that a created charset for a page exist (could have been expired)

> >+  if (isBookmarked) {
> >+    // Try getting an item annotation
> >+    nsTArray<PRInt64> bookmarks;
> >+    rv = bmsvc->GetBookmarkIdsForURITArray(aURI, &bookmarks);
> >+    NS_ENSURE_SUCCESS(rv, rv);
> 
> in the bookmarked case, could save a db query by just calling this first and
> checking length, instead of calling IsBookmarked.

yes, still i have to check how much is GetBookmarkidsForURI expensive against isBookmarked... since this code is called every time a page without a channel charset is loaded.

> if a bookmark is added after the charset is set by the user (ie, it won't have
> the anno) and is the first in the array returned by
> getBookmarkIdsForURITArray(), then no charset will be found.

but when the user creates a new bookmark we copy the charset to it in placesTransactionManager.

> >+      return NS_OK;
> >+    }
> >+  }
> >+  else {
> >+    // aURI is not bookmarked, try getting a page annotation
> >+    // getPageAnnotation throws if annotation is not found
> >+    nsAutoString charset;
> >+    rv = annosvc->GetPageAnnotationString(aURI, CHARSET_ANNO, charset);
> >+    if (NS_SUCCEEDED(rv)) {
> >+      LossyCopyUTF16toASCII(charset, aCharset);
> >+      return NS_OK;
> >+    }
> 
> in the case where a URI is bookmarked after the charset is chosen by the user,
> this will never get called and a charset will not be found.

see above

> 
> should probably:
> 
> 1. check for the URI anno first (since that's set when the charset is forced)

if the visits have been expired will always fail

> 2. if doesn't exist, iterate over the bookmarks, checking each until found (or
> not)

imo they should be in sync
Blocks: 422548
Attached patch patch v2 (obsolete) — Splinter Review
what this patch does:

- fix expiration:
we actually have EXPIRE_WITH_HISTORY and EXPIRE_NEVER, on a page annotation they are both bogus, the first should expire when a page has no more visits (and it does not do that), the second should expire when the page is no more in the db. Actually if you set an EXPIRE_NEVER page anno you end up in a loop where you don't remove anno because you have the page, and you don't delete the page because you have an anno. This is particularly bad for us, if an extension starts setting expire_never annos on pages, we will have major privacy issues!

- implement nsICharserResolver:
while in Firefox 2 we only save charset for bookmarks, in places we can do that for both bookmarks and history, so until an uri is in the db we can save and retain the charset. Charset is saved in an EXPIRE_NEVER page annotation, will go 
away only if the page has no visits and is not a bookmark. 

- add charset methods:
added 2 new methods to nsINavHistoryService: setCharsetForURI and getCharsetForURI. They are used everywhere when dealing with the charset so we can change the underlying impl if needed. A big code simplification from the previous patch has come from using a page anno (and it is faster).

- save charset from user interaction
when a user force a charset through the browser UI, we save it and we use that in future and for keyword encoding

- fix charset import/export
fixed import/export of charset from html and json

- move old saved charset
we had moved charset from pageAnno to itemAnno, need to move them back again :(

- fix tests
expiration and import/export tests have been updated, plus added a new test specific for this bug (test for new charset methods)
Attachment #310543 - Attachment is obsolete: true
Attachment #312033 - Flags: review?(dietrich)
Whiteboard: [needs new patch]
Whiteboard: [has patch][needs review dietrich]
Comment on attachment 312033 [details] [diff] [review]
patch v2

yay, much much cleaner, and solves a few other problems. r=me w/ comments below fixed, and please get second-r from mano for the front-end bits.

>   /**
>+   * Get the stored charset for an uri.
>+   *
>+   * @param aURI             uri to retrieve charset for
>+   * @returns charset, empty string if not found
>+   */
>+  AUTF8String getCharsetForURI(in nsIURI aURI);
>+
>+  /**
>+   * Set a charset for an uri.

nit: s/a/the/ since you can only set one.

>+   *
>+   * @param aURI             uri to set charset to

nit: s/to/for/

>@@ -3995,16 +3999,76 @@ nsNavHistory::MarkPageAsTyped(nsIURI *aU
> 
>   if (mRecentTyped.Count() > RECENT_EVENT_QUEUE_MAX_LENGTH)
>     ExpireNonrecentEvents(&mRecentTyped);
> 
>   mRecentTyped.Put(uriString, GetNow());
>   return NS_OK;
> }
> 
>+
>+// nsNavHistory::SetCharsetForURI
>+//
>+// Set the charset for an URI.
>+// If aCharset is empty remove charset annotation for aURI.
>+
>+NS_IMETHODIMP
>+nsNavHistory::SetCharsetForURI(nsIURI* aURI,
>+                               const nsACString& aCharset)
>+{
>+  // Get annotation service
>+  nsAnnotationService* annosvc = nsAnnotationService::GetAnnotationService();
>+  NS_ENSURE_TRUE(annosvc, NS_ERROR_OUT_OF_MEMORY);
>+
>+  if (aCharset.IsEmpty()) {
>+    // remove the current page charset annotation
>+    // don't bother on errors
>+    (void)annosvc->RemovePageAnnotation(aURI, CHARSET_ANNO);
>+  }
>+  else {
>+  // Set page charset annotation (silently overwrite if already exists)

indent comment

>+    nsAutoString annoValue;
>+    CopyUTF8toUTF16(aCharset, annoValue);

you can instead use NS_ConvertUTF8toUTF16() inline.

>+    nsresult rv = annosvc->SetPageAnnotationString(aURI, CHARSET_ANNO,
>+                                                   annoValue, 0,
>+                                                   nsAnnotationService::EXPIRE_NEVER);
>+    NS_ENSURE_SUCCESS(rv, rv);
>+  }
>+
>+  return NS_OK;
>+}
>+
>+
>+// nsNavHistory::GetCharsetForURI
>+//
>+// Get the last saved charset for an URI.
>+// If the URI is bookmarked read the charset from an item annotation

fix comment

>+// else read it from a page annotation.
>+
>+NS_IMETHODIMP
>+nsNavHistory::GetCharsetForURI(nsIURI* aURI, 
>+                               nsACString& aCharset)
>+{
>+  // Get services

nit: s/services/service/

or just remove this, since it's pretty clear what this is doing ;)

>@@ -630,22 +630,23 @@ nsNavHistoryExpire::EraseHistory(mozISto
> 
>   if (deletedPlaceIds.IsEmpty())
>     return NS_OK;
> 
>   return aConnection->ExecuteSimpleSQL(
>     NS_LITERAL_CSTRING("DELETE FROM moz_places WHERE id IN( "
>       "SELECT h.id "
>       "FROM moz_places h "
>-      "LEFT OUTER JOIN moz_historyvisits v ON v.place_id = h.id "
>-      "LEFT OUTER JOIN moz_annos a ON a.place_id = h.id "
>-      "WHERE h.id IN(") + deletedPlaceIds +
>-      NS_LITERAL_CSTRING(") AND v.place_id IS NULL AND (a.expiration <> ") +
>-      nsPrintfCString("%d", nsIAnnotationService::EXPIRE_NEVER) +
>-      NS_LITERAL_CSTRING(" OR a.expiration IS NULL))"));
>+      "WHERE h.id IN(") +
>+        deletedPlaceIds +
>+        NS_LITERAL_CSTRING(") AND NOT EXISTS "
>+          "(SELECT id FROM moz_historyvisits WHERE place_id = h.id) "
>+          "AND NOT EXISTS "
>+          "(SELECT id FROM moz_bookmarks WHERE fk = h.id) "
>+          "AND SUBSTR(h.url,0,6) <> 'place:')"));

LIMIT 1, for both sub-selects?

also, since the query approach changed, can you please profile this to confirm you're not regressing perf?


>+  nsCOMPtr<mozIStorageStatement> migrateCharsetStatement;
>   rv = aConnection->CreateStatement(NS_LITERAL_CSTRING(
>+        "INSERT OR REPLACE INTO moz_annos "
>+        "SELECT null, b.fk, t.anno_attribute_id, t.mime_type, t.content, "
>+          "t.flags, t.expiration, t.type, t.dateAdded, t.lastModified "
>+        "FROM moz_items_annos t "
>+          "JOIN moz_anno_attributes n ON t.anno_attribute_id = n.id "
>+          "JOIN moz_bookmarks b ON b.id = t.item_id "
>+        "WHERE n.name = ?1 AND t.expiration = ") +
>         nsPrintfCString("%d", nsIAnnotationService::EXPIRE_NEVER) +

is anno type necessary? seems like name should be enough.

>@@ -931,16 +934,20 @@ var BookmarkPropertiesPanel = {
>         PlacesUIUtils.ptm.editBookmarkMicrosummary(-1, microsummary));
>     }
> 
>     if (this._postData) {
>       childTransactions.push(
>         PlacesUIUtils.ptm.editBookmarkPostData(-1, this._postData));
>     }
> 
>+    if (this._charSet) {
>+      PlacesUtils.history.setCharsetForURI(this._bookmarkURI, this._charSet);
>+    }
>+

nit: no brackets
Attachment #312033 - Flags: review?(dietrich) → review+
Whiteboard: [has patch][needs review dietrich] → [has patch]
adding late-compat for the EXPIRE_NEVER behavior change.
Keywords: late-compat
Comment on attachment 312033 [details] [diff] [review]
patch v2

>Index: browser/base/content/browser.js

> function BrowserSetForcedCharacterSet(aCharset)
> {
>   var docCharset = getBrowser().docShell.QueryInterface(
>                             Components.interfaces.nsIDocCharset);
>   docCharset.charset = aCharset;
>+  // Save the forced charset in Places
>+  let uri = getWebNavigation().currentURI;
>+  PlacesUtils.history.setCharsetForURI(uri, aCharset);

Why do we need this here? This seems to do more than what the previous bookmarks service did, and I'm worried that it might interfere with other ways the charset is persisted. When was charset information stored in bookmarks in Firefox 2? Was it only when we added a bookmark?
(In reply to comment #26)
> (From update of attachment 312033 [details] [diff] [review])
> >Index: browser/base/content/browser.js
> 
> > function BrowserSetForcedCharacterSet(aCharset)
> > {
> >   var docCharset = getBrowser().docShell.QueryInterface(
> >                             Components.interfaces.nsIDocCharset);
> >   docCharset.charset = aCharset;
> >+  // Save the forced charset in Places
> >+  let uri = getWebNavigation().currentURI;
> >+  PlacesUtils.history.setCharsetForURI(uri, aCharset);
> 
> Why do we need this here? This seems to do more than what the previous
> bookmarks service did, and I'm worried that it might interfere with other ways
> the charset is persisted. When was charset information stored in bookmarks in
> Firefox 2? Was it only when we added a bookmark?

IIRC it was only saved for bookmarks on creation. This is saving the charset in places when the user forces it from the UI,  from my tests usually the charset is retained from the channel, if not preset is then asked to places. 
Yes it does more then before, but this is used only when expressively required from the user, i force a charset for an uri, i'd expect it to be reused the next time without having to reset it.

Blocks: 425611
Attached patch patch v2.1 (obsolete) — Splinter Review
- fixed dietrich's review comments
- added better comments in annotations idl for expire_never and expire_with_history (see bug 425611)
- tested query perf: it's about the same, this is an id-filtered query, so the speed does not change. On a not id-filtered query the timing would be about *2 (on my db with 130.000 visits and 5.000 bookmarks)

passing review request to Mano

Gavin, what are your final thoughts on this? What could be actual regressions from saving the forced charset?
Attachment #312033 - Attachment is obsolete: true
Attachment #312245 - Flags: review?(mano)
Comment on attachment 312245 [details] [diff] [review]
patch v2.1

I don't understand why is this added to the backend-services. I would rather use annotations.

Anyway, if you keep it this way, here are some drive-by-comments.

>Index: browser/base/content/browser.js
>===================================================================
>RCS file: /cvsroot/mozilla/browser/base/content/browser.js,v
>retrieving revision 1.1012
>diff -u -8 -p -r1.1012 browser.js
>--- browser/base/content/browser.js	24 Mar 2008 23:56:20 -0000	1.1012
>+++ browser/base/content/browser.js	28 Mar 2008 10:24:35 -0000
>@@ -1743,17 +1743,22 @@ function getShortcutOrURI(aURL, aPostDat
> 
>   if (/%s/i.test(shortcutURL) || /%s/i.test(postData)) {
>     var charset = "";
>     const re = /^(.*)\&mozcharset=([a-zA-Z][_\-a-zA-Z0-9]+)\s*$/;
>     var matches = shortcutURL.match(re);
>     if (matches)
>       [, shortcutURL, charset] = matches;
>     else {
>-      //XXX Bug 317472 will add lastCharset support to places.
>+      // Get the charset from Places
>+      try {
>+        // PlacesUtils._uri throws if uri is invalid
>+        let uri = PlacesUtils._uri(shortcutURL);

nit: use makeURI here.

>+        charset = PlacesUtils.history.getCharsetForURI(uri);

I would mention here that |charset| would be an empty string if no character-set is specified 

>@@ -5050,16 +5055,19 @@ function SetForcedCharset(charset)
>     BrowserSetForcedCharacterSet(charset);
> }
> 
> function BrowserSetForcedCharacterSet(aCharset)
> {
>   var docCharset = getBrowser().docShell.QueryInterface(
>                             Components.interfaces.nsIDocCharset);
>   docCharset.charset = aCharset;
>+  // Save the forced charset in Places
>+  let uri = getWebNavigation().currentURI;
>+  PlacesUtils.history.setCharsetForURI(uri, aCharset);

Just inline it, don't mix let/vars this way please.


>Index: toolkit/components/places/public/nsINavHistoryService.idl
>===================================================================
>RCS file: /cvsroot/mozilla/toolkit/components/places/public/nsINavHistoryService.idl,v
>retrieving revision 1.80
>diff -u -8 -p -r1.80 nsINavHistoryService.idl
>--- toolkit/components/places/public/nsINavHistoryService.idl	12 Mar 2008 23:36:24 -0000	1.80
>+++ toolkit/components/places/public/nsINavHistoryService.idl	28 Mar 2008 10:24:45 -0000
>@@ -1040,17 +1040,17 @@ interface nsINavHistoryQueryOptions : ns
>   attribute unsigned short queryType;
> 
>   /**
>    * Creates a new options item with the same parameters of this one.
>    */
>   nsINavHistoryQueryOptions clone();
> };
> 
>-[scriptable, uuid(1972ac40-d270-40ee-995a-c97d02849596)]
>+[scriptable, uuid(8b258ff6-b591-4da5-8b7c-0665b27d09cc)]
> interface nsINavHistoryService : nsISupports
> {
>   /**
>    * This transition type means the user followed a link and got a new toplevel
>    * window.
>    */
>   const unsigned long TRANSITION_LINK = 1;
> 
>@@ -1099,27 +1099,42 @@ interface nsINavHistoryService : nsISupp
>    */
>   readonly attribute boolean hasHistoryEntries;
> 
>   /**
>    * Gets the original title of the page.
>    */
>   AString getPageTitle(in nsIURI aURI);
> 
>-
>   /**
>    * This is just like markPageAsTyped (in nsIBrowserHistory, also implemented
>    * by the history service), but for bookmarks. It declares that the given URI
>    * is being opened as a result of following a bookmark. If this URI is loaded
>    * soon after this message has been received, that transition will be marked
>    * as following a bookmark.
>    */
>   void markPageAsFollowedBookmark(in nsIURI aURI);
> 
>   /**
>+   * Get the stored charset for an URI.

nits: Gets, character-set.

>+   *
>+   * @param aURI             URI to retrieve charset for
>+   * @returns charset, empty string if not found
>+   */
>+  AUTF8String getCharsetForURI(in nsIURI aURI);
>+

make it ACString for consistency with the rest of this api (in the setting method too).

>+  /**
>+   * Set the charset for an URI.
>+   *
>+   * @param aURI             URI to set charset for

Please use
@param foo
       Descrtipion

comments style.


> 
> static nsresult GetReversedHostname(nsIURI* aURI, nsAString& host);
>@@ -3639,27 +3643,27 @@ nsNavHistory::RemovePagesInternal(const 
>         aPlaceIdsQueryString +
>         NS_LITERAL_CSTRING(")"));
>   NS_ENSURE_SUCCESS(rv, rv);
> 
>   // now that visits have been removed, run annotation expiration.
>   // this will remove all expire-able annotations for these URIs.
>   (void)mExpire.OnDeleteURI();
> 
>-  // if there are no more annotations, and the entry is not bookmarked
>-  // then we can remove the moz_places entry.
>+  // if the entry is not bookmarked and is not a place: uri
>+  // then we can remove it from moz_places.
>   // Note that we do NOT delete favicons. Any unreferenced favicons will be
>   // deleted next time the browser is shut down.
>   rv = mDBConn->ExecuteSimpleSQL(NS_LITERAL_CSTRING(
>       "DELETE FROM moz_places WHERE id IN ("
>         "SELECT h.id FROM moz_places h WHERE h.id IN (") +
>         aPlaceIdsQueryString +
>         NS_LITERAL_CSTRING(") AND "
>-        "NOT EXISTS (SELECT b.id FROM moz_bookmarks b WHERE b.fk = h.id) AND "
>-        "NOT EXISTS (SELECT a.id FROM moz_annos a WHERE a.place_id = h.id))"));
>+        "NOT EXISTS (SELECT b.id FROM moz_bookmarks b WHERE b.fk = h.id LIMIT 1) "
>+        "AND SUBSTR(h.url,0,6) <> 'place:')"));
>   NS_ENSURE_SUCCESS(rv, rv);
> 
>   // placeId could have a livemark item, so setting the frecency to -1
>   // would cause it to show up in the url bar autocomplete
>   // call FixInvalidFrecenciesForExcludedPlaces() to handle that scenario
>   // XXX this might be dog slow, further degrading delete perf.
>   rv = FixInvalidFrecenciesForExcludedPlaces();
>   NS_ENSURE_SUCCESS(rv, rv);
>@@ -3995,16 +3999,70 @@ nsNavHistory::MarkPageAsTyped(nsIURI *aU
> 
>   if (mRecentTyped.Count() > RECENT_EVENT_QUEUE_MAX_LENGTH)
>     ExpireNonrecentEvents(&mRecentTyped);
> 
>   mRecentTyped.Put(uriString, GetNow());
>   return NS_OK;
> }
> 
>+
>+// nsNavHistory::SetCharsetForURI
>+//
>+// Set the charset for an URI.
>+// If aCharset is empty remove charset annotation for aURI.
>+
>+NS_IMETHODIMP
>+nsNavHistory::SetCharsetForURI(nsIURI* aURI,
>+                               const nsACString& aCharset)
>+{
>+  nsAnnotationService* annosvc = nsAnnotationService::GetAnnotationService();
>+  NS_ENSURE_TRUE(annosvc, NS_ERROR_OUT_OF_MEMORY);
>+
>+  if (aCharset.IsEmpty()) {
>+    // remove the current page charset annotation
>+    // don't bother on errors
>+    (void)annosvc->RemovePageAnnotation(aURI, CHARSET_ANNO);

NS_ESNURE_SUCCESS it and return NS_OK.


>+
>+
>+// nsNavHistory::GetCharsetForURI
>+//
>+// Get the last saved charset for an URI.
>+
>+NS_IMETHODIMP
>+nsNavHistory::GetCharsetForURI(nsIURI* aURI, 
>+                               nsACString& aCharset)
>+{
>+  nsAnnotationService* annosvc = nsAnnotationService::GetAnnotationService();
>+  NS_ENSURE_TRUE(annosvc, NS_ERROR_OUT_OF_MEMORY);
>+
>+  nsAutoString charset;
>+  nsresult rv = annosvc->GetPageAnnotationString(aURI, CHARSET_ANNO, charset);
>+  if (NS_SUCCEEDED(rv)) {
>+    CopyUTF16toUTF8(charset, aCharset);

Why is it ACString anyway, if it was an AString, you could do
return annosvc->GetPageAnnotationString(aURI, CHARSET_ANNO, aCharset);

> 
>+  { "Browser History Charset Resolver",
>+    NS_NAVHISTORYSERVICE_CID,
>+    "@mozilla.org/embeddor.implemented/bookmark-charset-resolver;1",
>+    nsNavHistoryConstructor },
>+

What's this for? where's this used?
Attachment #312245 - Flags: review?(mano) → review-
(In reply to comment #29)
> (From update of attachment 312245 [details] [diff] [review])
> I don't understand why is this added to the backend-services. I would rather
> use annotations.

Mainly to implement nsICharsetResolver, first version was frontend code only, but since we have a set up way of getting the charset from the service... this makes things work like in FX2 (apart that we are using annos)

> >+        charset = PlacesUtils.history.getCharsetForURI(uri);
> 
> I would mention here that |charset| would be an empty string if no
> character-set is specified 

and that should be, if it's empty we don't use it

    if (charset)
       encodedParam = escape(convertFromUnicode(charset, param));
     else // Default charset is UTF-8
       encodedParam = encodeURIComponent(param);

> >+
> >+// nsNavHistory::SetCharsetForURI
> >+//
> >+// Set the charset for an URI.
> >+// If aCharset is empty remove charset annotation for aURI.
> >+
> >+NS_IMETHODIMP
> >+nsNavHistory::SetCharsetForURI(nsIURI* aURI,
> >+                               const nsACString& aCharset)
> >+{
> >+  nsAnnotationService* annosvc = nsAnnotationService::GetAnnotationService();
> >+  NS_ENSURE_TRUE(annosvc, NS_ERROR_OUT_OF_MEMORY);
> >+
> >+  if (aCharset.IsEmpty()) {
> >+    // remove the current page charset annotation
> >+    // don't bother on errors
> >+    (void)annosvc->RemovePageAnnotation(aURI, CHARSET_ANNO);
> 
> NS_ESNURE_SUCCESS it and return NS_OK.

if i NS_ENSURE_SUCCESS i have to check if the anno exists before removing it, while removePageAnno silently fail if anno does not exists.

> > 
> >+  { "Browser History Charset Resolver",
> >+    NS_NAVHISTORYSERVICE_CID,
> >+    "@mozilla.org/embeddor.implemented/bookmark-charset-resolver;1",
> >+    nsNavHistoryConstructor },
> >+
> 
> What's this for? where's this used?

This is part of the nsICharsetResolver bits, we have an analogue in FX2 binded to the bookmarks service
> This is part of the nsICharsetResolver bits, we have an analogue in FX2 binded
> to the bookmarks service

exactly here:
http://lxr.mozilla.org/seamonkey/source/content/html/document/src/nsHTMLDocument.cpp#637

Why cannot you implement nsICharsetResolver in the front-end?
(In reply to comment #32)
> Why cannot you implement nsICharsetResolver in the front-end?

i could create the service in a separated js file like tagging service, still since it's a code that could be potentially called before loading a page, i feel it faster like it is, do you think it would be better that way?
Whiteboard: [has patch] → [has patch][needs new patch]
Attached patch patch v2.2 (obsolete) — Splinter Review
fixed comments. 
charset resolver still in backend (for perf reasons and because it's cleaner linked into history service imo)
Attachment #312245 - Attachment is obsolete: true
Attachment #313798 - Flags: review?
Whiteboard: [has patch][needs new patch] → [has patch][needs review mano]
Attachment #313798 - Flags: review? → review?(mano)
Comment on attachment 313798 [details] [diff] [review]
patch v2.2

At this point, I don't understand what would we get from the changes to the bookmarkProperties dialog (and the showAdd* methods), please remove those.

>Index: toolkit/components/places/public/nsINavHistoryService.idl
>===================================================================
>+   * Gets the stored character-set for an URI.
>+   *
>+   * @param aURI
>+   *        URI to retrieve character-set for
>+   * @returns character-set, empty string if not found
>+   */
>+  AString getCharsetForURI(in nsIURI aURI);
>+
>+  /**
>+   * Sets the character-set for an URI.
>+   *
>+   * @param aURI
>+   *        URI to set character-set for

mega nit: the character-set.

>+   * @param aCharset
>+   *        character-set to be saved

mega nit #2: s/saved/set/

>Index: toolkit/components/places/src/nsNavHistory.h
>===================================================================
>RCS file: /cvsroot/mozilla/toolkit/components/places/src/nsNavHistory.h,v
>retrieving revision 1.154
>diff -u -8 -p -r1.154 nsNavHistory.h
>--- toolkit/components/places/src/nsNavHistory.h	29 Mar 2008 06:48:28 -0000	1.154
>+++ toolkit/components/places/src/nsNavHistory.h	5 Apr 2008 11:42:43 -0000
>@@ -80,16 +80,18 @@
> #include "nsTArray.h"
> #include "nsINavBookmarksService.h"
> #include "nsMaybeWeakPtr.h"
> 
> #include "nsNavHistoryExpire.h"
> #include "nsNavHistoryResult.h"
> #include "nsNavHistoryQuery.h"
> 
>+#include "chardet/nsICharsetResolver.h"
>+

you should rather add chardet to the dependencies list in toolkit/components/places/src/Makefile.in, and then make that:
#include "nsICharsetResolver.h"

r=mano otherwise.
Attachment #313798 - Flags: review?(mano) → review+
Whiteboard: [has patch][needs review mano] → [has patch][has reviews]
(In reply to comment #35)
> (From update of attachment 313798 [details] [diff] [review])
> At this point, I don't understand what would we get from the changes to the
> bookmarkProperties dialog (and the showAdd* methods), please remove those.

shouldn't we save the charset for bookmarks to be reused for encoding keywords? the only other case where we save charset is when user explicitely choose a forced charset from menu...
I'd sure hope so: if we don't save charset to use for keyword searches, I'll have to reopen the things I duped, and request blocking again on one of them, since that was what I requested it for in the first place. Saving it on forced charset eliminates an annoyance from a broken page, and the workaround is just to keep forcing it; saving it for keywords makes the feature work, when the workaround is to find a totally obscure bug (it generally takes me four or five searches to find it) to determine how to manually hack the bookmarked URL.
OK.
Attached patch patch v2.3Splinter Review
i forgot a couple bits to save charset for bookmarks, so i've added them back, and reasking review for these changes, will post an interdiff to make it easier
Attachment #313798 - Attachment is obsolete: true
Attachment #314088 - Flags: review?(mano)
Comment on attachment 314088 [details] [diff] [review]
patch v2.3

r=mano, though I couldn't tell which concrete cases would the browser-places.js change handle...
Attachment #314088 - Flags: review?(mano) → review+
Keywords: checkin-needed
Checking in browser/base/content/browser-places.js;
/cvsroot/mozilla/browser/base/content/browser-places.js,v  <--  browser-places.js
new revision: 1.125; previous revision: 1.124
done
Checking in browser/base/content/browser.js;
/cvsroot/mozilla/browser/base/content/browser.js,v  <--  browser.js
new revision: 1.1023; previous revision: 1.1022
done
Checking in toolkit/components/places/public/nsIAnnotationService.idl;
/cvsroot/mozilla/toolkit/components/places/public/nsIAnnotationService.idl,v  <--  nsIAnnotationService.idl
new revision: 1.18; previous revision: 1.17
done
Checking in toolkit/components/places/public/nsINavHistoryService.idl;
/cvsroot/mozilla/toolkit/components/places/public/nsINavHistoryService.idl,v  <--  nsINavHistoryService.idl
new revision: 1.81; previous revision: 1.80
done
Checking in toolkit/components/places/src/Makefile.in;
/cvsroot/mozilla/toolkit/components/places/src/Makefile.in,v  <--  Makefile.in
new revision: 1.45; previous revision: 1.44
done
Checking in toolkit/components/places/src/nsNavHistory.cpp;
/cvsroot/mozilla/toolkit/components/places/src/nsNavHistory.cpp,v  <--  nsNavHistory.cpp
new revision: 1.288; previous revision: 1.287
done
Checking in toolkit/components/places/src/nsNavHistory.h;
/cvsroot/mozilla/toolkit/components/places/src/nsNavHistory.h,v  <--  nsNavHistory.h
new revision: 1.156; previous revision: 1.155
done
Checking in toolkit/components/places/src/nsNavHistoryExpire.cpp;
/cvsroot/mozilla/toolkit/components/places/src/nsNavHistoryExpire.cpp,v  <--  nsNavHistoryExpire.cpp
new revision: 1.45; previous revision: 1.44
done
Checking in toolkit/components/places/src/nsPlacesModule.cpp;
/cvsroot/mozilla/toolkit/components/places/src/nsPlacesModule.cpp,v  <--  nsPlacesModule.cpp
new revision: 1.7; previous revision: 1.6
done
Checking in toolkit/components/places/src/utils.js;
/cvsroot/mozilla/toolkit/components/places/src/utils.js,v  <--  utils.js
new revision: 1.14; previous revision: 1.13
done
RCS file: /cvsroot/mozilla/toolkit/components/places/tests/unit/test_317472.js,v
done
Checking in toolkit/components/places/tests/unit/test_317472.js;
/cvsroot/mozilla/toolkit/components/places/tests/unit/test_317472.js,v  <--  test_317472.js
initial revision: 1.1
done
Checking in toolkit/components/places/tests/unit/test_browserhistory.js;
/cvsroot/mozilla/toolkit/components/places/tests/unit/test_browserhistory.js,v  <--  test_browserhistory.js
new revision: 1.6; previous revision: 1.5
done
Checking in toolkit/components/places/tests/unit/test_expiration.js;
/cvsroot/mozilla/toolkit/components/places/tests/unit/test_expiration.js,v  <--  test_expiration.js
new revision: 1.19; previous revision: 1.18
done
Checking in browser/components/places/content/bookmarkProperties.js;
/cvsroot/mozilla/browser/components/places/content/bookmarkProperties.js,v  <--  bookmarkProperties.js
new revision: 1.83; previous revision: 1.82
done
Checking in browser/components/places/content/utils.js;
/cvsroot/mozilla/browser/components/places/content/utils.js,v  <--  utils.js
new revision: 1.133; previous revision: 1.132
done
Checking in browser/components/places/src/nsPlacesImportExportService.cpp;
/cvsroot/mozilla/browser/components/places/src/nsPlacesImportExportService.cpp,v  <--  nsPlacesImportExportService.cpp
new revision: 1.65; previous revision: 1.64
done
Checking in browser/components/places/tests/unit/test_384370.js;
/cvsroot/mozilla/browser/components/places/tests/unit/test_384370.js,v  <--  test_384370.js
new revision: 1.17; previous revision: 1.16
done
Checking in browser/components/places/tests/unit/test_bookmarks_html.js;
/cvsroot/mozilla/browser/components/places/tests/unit/test_bookmarks_html.js,v  <--  test_bookmarks_html.js
new revision: 1.36; previous revision: 1.35
done
Status: ASSIGNED → RESOLVED
Closed: 17 years ago
Keywords: checkin-needed
Resolution: --- → FIXED
Whiteboard: [has patch][has reviews]
Version: unspecified → Trunk
Flags: in-testsuite+
Blocks: 412556
Bug 451915 - move Firefox/Places bugs to Firefox/Bookmarks and History. Remove all bugspam from this move by filtering for the string "places-to-b-and-h".

In Thunderbird 3.0b, you do that as follows:
Tools | Message Filters
Make sure the correct account is selected. Click "New"
Conditions: Body   contains   places-to-b-and-h
Change the action to "Delete Message".
Select "Manually Run" from the dropdown at the top.
Click OK.

Select the filter in the list, make sure "Inbox" is selected at the bottom, and click "Run Now". This should delete all the bugspam. You can then delete the filter.

Gerv
Component: Places → Bookmarks & History
QA Contact: places → bookmarks
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: