Closed Bug 464496 Opened 17 years ago Closed 2 years ago

Copy Link Location in page contextual menu needs to unescape/decode UTF-8 URIs

Categories

(Camino Graveyard :: Toolbars & Menus, defect, P3)

All
macOS
defect

Tracking

(Not tracked)

RESOLVED INCOMPLETE

People

(Reporter: bugzilla-graveyard, Unassigned)

References

Details

Attachments

(2 files, 3 obsolete files)

Follow-up to bug 387312 comment 16: "The Copy Link Location et al functions talk to Gecko directly, so they will be harder to fix. They are beyond the scope of this bug anyway." Hendy, feel free to assign this back to nobody if you aren't going to fix this soon.
Attached patch Fix v1.0 (obsolete) — Splinter Review
Finds the image or link location for the element under the context menu and sends it to our NSPasteboard category. That way, we can modify the category to unescape the types and have that apply to all url copying. I intend to do that in Bug 464602.
Attachment #351617 - Flags: review?(stuart.morgan+bugzilla)
What all has this been tested against? We need to make sure it works on things like image maps, in addition to more typical links.
GeckoUtils::GetEnclosingLinkElementAndHref works specifically with a, link, and area elements, so image maps work.
Comment on attachment 351617 [details] [diff] [review] Fix v1.0 >+ Remove the whitespace from the blank lines. >+ NSString* urlString = [NSString stringWith_nsAString: url]; No space after : >+ NSString* hrefString = [NSString stringWith_nsAString: href]; Same here. r=me with those changes (although "Fix" is perhaps not the best name for this patch, since by itself it doesn't actually do anything ;)
Attachment #351617 - Flags: superreview?(mikepinkerton)
Attachment #351617 - Flags: review?(stuart.morgan+bugzilla)
Attachment #351617 - Flags: review+
Attachment #351617 - Flags: superreview?(mikepinkerton) → superreview+
Comment on attachment 351617 [details] [diff] [review] Fix v1.0 sr=pink though i don't really understand why we shouldn't fix gecko to do the right thing. won't it be broken in firefox as well?
Rogue spaces ousted.
Attachment #351617 - Attachment is obsolete: true
Attachment #358271 - Attachment description: Pasteboard redirection, for checkin → Pasteboard redirection, for checkin [checked in]
(In reply to comment #6) > Created an attachment (id=358271) [details] > Pasteboard redirection, for checkin Checked this in; hendy notes the actual unescaping will happen in bug 464602, so this now depends on that.
Depends on: 464602
We should fix the regression in bug 491448 before b4, either by finishing up the UTF-8 feature or carefully excising the patch here (note that bug 485126 or bug 480797 or both later touched that new code, so we need to be sure not to reintroduce the potential crashes when excising this code). Flagging b4 to keep better track of this as the cause of/potential fix for bug 491448.
Flags: camino2.0b4?
Since Bug 491448 has been fixed, and the unescaping bug will have to wait till 2.1, setting this - for 2.0b4.
Flags: camino2.0b4? → camino2.0b4-
These are P3 on the 2.1 list, but I think they're at the very bottom of hendy's to-do list.
Priority: -- → P3
Target Milestone: --- → Camino2.1
clean-setDataForURL-string-v1.patch consolidates title string cleaning into NSPasteboard setDataForURL: function from its callers. This patch is a prerequisite for my next patch.
Assignee: trendyhendy2000 → cpeterso
Status: NEW → ASSIGNED
Attachment #517364 - Flags: review?(stuart.morgan+bugzilla)
Attached patch unescape-copied-URLs-v1.patch (obsolete) — Splinter Review
unescape-copied-URLs-v1.patch: * Unescapes URL for "Copy Image Location". * Unescapes URL for "Copy Link Location". * Unescapes URL for "Copy Source URL". * Unescapes URL dragged from web page to Location Bar.
Attachment #517365 - Flags: review?(stuart.morgan+bugzilla)
My patch seems to break the copying or dragging of some Pinboard.in links (e.g. containing "c++" or "c#" tags): https://www.pinboard.in/t:c%252B%252B vs https://www.pinboard.in/t:c%2B%2B Pinboard.in already had unescaping problems with "c++" and "c#" tags, but I will continue investigating. These "c++" links on other sites still work correctly: http://www.delicious.com/tag/c%2B%2B https://secure.wikimedia.org/wikipedia/en/wiki/C%2B%2B
Does this patch still provide one flavor unescaped as a fallback for apps that can't understand/handle unescaped UTF-8 (bug 464602 comment 3)?
Comment on attachment 517364 [details] [diff] [review] clean-setDataForURL-string-v1.patch Unrequesting review until I can address the problems with Pinboard's "c++" tags.
Attachment #517364 - Flags: review?(stuart.morgan+bugzilla)
Comment on attachment 517365 [details] [diff] [review] unescape-copied-URLs-v1.patch Unrequesting review until I can address the problems with Pinboard's "c++" tags.
Attachment #517365 - Flags: review?(stuart.morgan+bugzilla)
(In reply to comment #14) > Does this patch still provide one flavor unescaped as a fallback for apps that > can't understand/handle unescaped UTF-8 (bug 464602 comment 3)? Do you know if there is a convention of _which_ pasteboard type should provide a safe, escaped URL string? I don't really see any recommendations in Apple's docs.
I seem to recall doing some dragging and/or pasting from various sources in Safari to PasteboardInspector[1] and seeing which ones weren't unescaped, but I don't recall the specifics right now. [1] http://wiki.caminobrowser.org/User:Julian/Pasteboard_Inspector
In comment 18, I think I was remembering bug 309226 comment 11, which was about punycoding IDN domains, not about unescaping/decoding URL fragments (although they're somewhat related). Safari puts the following URL flavors on the pasteboard, in order, for "Copy Link" on one of the URLs in bug 387312 comment 5: WebURLsWithTitlesPboardType Apple URL pasteboard type - encoded public.url CorePasteboardFlavorType 0x75726C20 'url ' public.url-name CorePasteboardFlavorType 0x75726C6E 'urln' public.utf8-plain-text NSStringPboardType NeXT plain ascii pasteboard type Of those, all are unescaped/decoded except for "Apple URL pasteboard type". I think it makes sense to provide %-encoded URLs for "NeXT plain ascii pasteboard type", too, since otherwise you get broken text there (? in place of non-ASCII characters).
Related pasteboard types are the "public.text" and "public.plain-text" UTIs. Camino does not set those pasteboard types today, but they sound like reasonable choices for escaped URLs too.
Patch v2 1. Unescape URLs (for select pasteboard types) copied via: * "Copy Image Location" * "Copy Link Location" * "Copy Source URL" * "Copy Location to Clipboard" (from Bookmark Bar context menu) * Dragged from Location Bar to another app (aka bug 464602) 2. Change "Copy Source URL" pasteboard type from NSStringPboardType to all URL types. 3. Some minor code cleanups. When unescaping pasteboard URLs, this patch leaves the following string representations of "real" URLs ESCAPED: * WebURLsWithTitlesPboardType (URL values are escaped, but title values are unescaped) * public.url * CorePasteboardFlavorType 'url ' * Apple URL pasteboard type This patch UNESCAPES the following string representations of "pretty" URLs (which, AFAIK, may not be valid URLs if they contain "unsafe", non-ISO-8859-1 characters): * public.url-name * CorePasteboardFlavorType 'urln' * NSStringPboardType (and public.utf8-plain-text)
Attachment #517364 - Attachment is obsolete: true
Attachment #517365 - Attachment is obsolete: true
Attachment #520150 - Flags: review?(stuart.morgan+bugzilla)
Caveats: 1. I have not yet deduced how to unescape URLs dragged from a web page link to the Location Bar (before the user presses RETURN). 2. Smokey mentioned "NeXT plain ascii pasteboard type" in comment 19, but I am not able to reproduce it on my Mac (Safari 5.0.4 on OS X 10.6.6).
Comment on attachment 520150 [details] [diff] [review] unescape-URL-v2.patch >+static NSArray* UnescapedStringsFrom(NSArray* escapedStrings) UnescapedURLStringsFrom? UnescapedStrings on its own is ambiguous. Also, needs a method-level comment. >+ unsigned int count = [escapedStrings count]; >+ NSMutableArray* unescapedStrings = [NSMutableArray arrayWithCapacity:count]; >+ for (unsigned int i = 0; i < count; ++i) { In general we prefer enumerators for case where you don't actually need the index, like this one. (I really wish we could use GTM, which has an NSArray utility method for doing exactly this kind of loop...) >+ NSString* url = [inUrls objectAtIndex:0]; >+ NSString* title = [inTitles objectAtIndex:0]; >+ >+ NSString* URLString; > if (urlCount == 1) { >- NSString* url = [inUrls objectAtIndex:0]; >- NSString* title = [inTitles objectAtIndex:0]; How do you know there's something at index 0 before you check the count? Also, URLString -> urlString >- if (inTitles) Whoops, I should have caught that useless check in a earlier patch that added the fallback title code earlier. Thanks. So beyond the small stuff above, my main question is: are we really, really confident that all the sources that run through this code have escaped URLs? Because if not, we'll corrupt them--unescaping a second time is *not* a no-op (see the extensive discussion in bugs around escaping/unescaping in other places, like bookmark UI).
Attachment #520150 - Flags: review?(stuart.morgan+bugzilla) → review-
Assignee: cpeterso → nobody
Pushing all the UTF-8 stuff out, since it's somewhat risky for this stage in the process.
Target Milestone: Camino2.1 → ---
No assignee, updating the status.
Status: ASSIGNED → NEW
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → INCOMPLETE
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: