Exporting HTML bookmarks fails to amp-escape URIs



8 years ago
8 years ago


(Reporter: phiw2, Assigned: alqahira)




(2 attachments)



8 years ago
1. bookmark the 'generated source' bookmarklet
   (from https://www.squarefree.com/bookmarklets/webdevel.html)
2. export the bookmark file
3. import that bookmark file

AR: notice the bookmarklet stops functioning correctly
    (see bug 632774 for details)

A couple of characters got changed in the import process, that bookmarklet
has '&', '>' and '<' in the source, this is changed in the import process to '&', '%3E' and '%3C' respectively.

The same happens with URI that contain '&' in query strings - example.com/file.html?string&something. This is less destructive, though, as the uri continues to be resolved.

Both Camino 2.0.x and 2.1a1 are affected.

Comment 1

8 years ago
On IRC, Smokey ventured that bug 439024 eventually caused this. That is not the case, though. Builds older than 20080905 exhibit the same problem.
It looks like right now we're just doing this:

138   NSString* href = [[[linkElement attributeForName:@"href"] stringValue] cleanedStringFromHTMLBookmarkImport];

and |cleanedStringFromHTMLBookmarkImport| is just replacing control characters with spaces.

Which makes it seem like our current behavior is coming straight from NSXMLDocument (even if it wasn't before, where, once we got to the end of the rat's nest, we did |NSString* url = [tempItem stringByRemovingAmpEscapes];|: http://mxr.mozilla.org/mozilla1.8/source/camino/src/bookmarks/BookmarkManager.mm#1705).

I suspect the JS in the file in bug 261474 probably isn't getting imported correctly now, either (but that's better than it killing the entire import); see bug 261474 comment 13.
Created attachment 511544 [details]
URL comparisons

Well, the good news is that we're a lot closer to round-tripping that bookmark successfully than we were before the new importer/exporter :P  We're also a lot better than Firefox :P

Only Safari successfully round-trips this bookmark via HTML export/import.  Safari is also the only browser (of Camino, Safari, Fx3.6) that can import our export and reproduce what we had in Camino pre-export.

Note that all browsers transform what Jesse has encoded in his link; Camino and Safari do so identically, while Firefox randomly decodes a bunch of other things.

I sort-of think that to fix this, we'd have to re-encode some of the things we decoded when making the bookmarklet from the web page link when we export HTML, but I'm not sure the ramifications of that on other sorts of URLs we may store or on the interoperability of this un-spec'ed non-interoperable universal interoperable bookmarks format.

(I guess the other thing to check would be how this URI is stored in the Firefox 2 native HTML bookmarks file, and aim to replicate that on export, if we can still successfully import however that's stored.)

Note that we do successfully round-trip plist-export-plist-import (and Safari successfully imports our export there again); in that case, we &-escape any &s that are left in the string (we don't re-transform %escaped &s like the </&rt; that we %escaped when going from link->bookmark, just literal &s).  This might be a safe strategy for HTML, too, though, again, more testing would be needed and I'd be a bit afraid to change anything without a comprehensive test suite.

As it stands (due also to other things, not just this bug), the only 100% safe backup for Camino bookmarks is quitting Camino and making a copy of bookmarks.plist :P
[6:44pm] ardissone: ok, not going to touch that bug again with a 10 ft pole
[6:46pm] cl: haha
[6:46pm] cl: nice comment
[6:46pm] ardissone: if that's a sane fix, it's entirely trivial
[6:47pm] ardissone: [[bookmark url] stringByAddingAmpEscapes] instead of just [bookmark url]
[6:47pm] cl: see what stuart says, i guess

That's http://mxr.mozilla.org/camino/source/camino/src/bookmarks/HTMLBookmarkConverter.m#372; we do that already for title: http://mxr.mozilla.org/camino/source/camino/src/bookmarks/HTMLBookmarkConverter.m#378

Comment 5

8 years ago
Yes, that seems like the right fix.
Created attachment 512894 [details] [diff] [review]
amp-escape bookmark URLs in HTML export

Here's the patch that does that.

I won't have time for a while to do any extensive testing of round-tripping a bookmarks file in Camino, and of importing a Camino HTML export into Safari or Firefox, but if philippe has time before then, his testing should probably suffice.
Assignee: nobody → alqahira
Attachment #512894 - Flags: feedback?(phiw)
Summary: Importing bookmarks changes '&' in URI and bookmarklets to '&' → Exporting HTML bookmarks fails to amp-escape URIs

Comment 7

8 years ago
Comment on attachment 512894 [details] [diff] [review]
amp-escape bookmark URLs in HTML export

All in all, an improvement :-)
Round tripping in Camino works fine, Chrome and Fx can now import the Camino BM-file without problems. Only Safari seems to have regressed a little (bm file in html format; the plist format is OK).

1. export from Camino, import in Camino --> OK
2a. export from Camino (html), import in Safari --> fail
2b. export from Camino (html - old), import in Safari --> OK
2c. export from Camino (Safari plist), import in Safari --> OK
3a. import from Safari (plist) --> OK
3b. export from Safari (html), import --> fail
4a. export from Camino (html), import in Chrome --> OK
4b. export from Camino (html - old), import in Chrome --> fail
4c. export from Chrome, import in Camino --> fail
5a. export from Camino (html), import in Fx 3.6.x --> OK
5b. export from Camino (html - old), import in Fx 3.6.x --> fail
5c. export from Fx 3.6.x, import in Camino --> fail
6a. export from Camino (html), import in Fx 4 --> OK
6b. export from Fx 4b, import in Camino --> fail
7. import from Opera 11 --> OK

I didn't manage to have Opera co-operate nicely: no way to import a bm-file in html format that I could find, but it did not import the file when exporting from camino in Safari plist format
Attachment #512894 - Flags: feedback?(phiw) → feedback+
Comment on attachment 512894 [details] [diff] [review]
amp-escape bookmark URLs in HTML export

[9:58pm] sauron: i think Safari's html export is broken in the same manner ours was before the patch
[9:58pm] sauron: which is what "3b. export from Safari (html), import --> fail" is, right?
[9:59pm] sauron: oh, but "2a. export from Camino (html), import in Safari --> fail" is the new regression
[9:59pm] phiw: yes. 'fail' means an encoded ampersand is changed to '&'
[10:00pm] phiw: 2a is the only regression I found.
[10:00pm] sauron: because 2a worked before since Safari relies on that bug
[10:01pm] sauron: given we have a special "Safari" format that Safari does handle correctly, I'm not concerned
[10:01pm] sauron: (also because Safari is broken :P )
[10:01pm] sauron: as is Firefox :P
[10:03pm] sauron: did you test regular URLs that have & in them?
[10:04pm] phiw: my default Minefield bookmark file has like 6 entries maybe. all 'about' urls
[10:04pm] sauron: (e.g. bugzilla/crash-stats queries ?)
[10:05pm] sauron: i wanted to be sure we didn't regress regular http:// URLs (which maybe were broken before? dunno)
[10:05pm] phiw: my test url on my dev.server worked correctly
[10:05pm] sauron: ok
Attachment #512894 - Flags: superreview?(stuart.morgan+bugzilla)

Comment 9

8 years ago
Comment on attachment 512894 [details] [diff] [review]
amp-escape bookmark URLs in HTML export

Attachment #512894 - Flags: superreview?(stuart.morgan+bugzilla) → superreview+
Last Resolved: 8 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.