Last Comment Bug 17612 - view-source link-browsing
: view-source link-browsing
Status: VERIFIED FIXED
:
Product: Core
Classification: Components
Component: HTML: Parser (show other bugs)
: Trunk
: All All
: P3 enhancement with 2 votes (vote)
: mozilla1.9.1b2
Assigned To: Curtis Bartley [:cbartley]
:
Mentors:
: 11519 155077 188694 214112 224471 429618 453882 455945 (view as bug list)
Depends on: 469256 472846 500921 503136 520397 685110 799408 15372 309724 463176 464314 464339 464388 464727 464857 467477 467852 469302 470357 475542 475870 502644
Blocks: 79518 455888 469526 464352 464361 469535
  Show dependency treegraph
 
Reported: 1999-10-29 20:15 PDT by David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch)
Modified: 2012-10-09 06:45 PDT (History)
37 users (show)
chung808: in‑testsuite?
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
patch, as I currently have it (work in progress!!) (9.68 KB, patch)
1999-10-29 20:16 PDT, David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch)
no flags Details | Diff | Review
The values of SRC and HREF attributes are turned into clickable links (29.26 KB, patch)
2008-10-28 11:20 PDT, Curtis Bartley [:cbartley]
benjamin: review-
Details | Diff | Review
The values of SRC and HREF attributes are turned into clickable links (20.68 KB, application/octet-stream)
2008-10-28 14:39 PDT, Curtis Bartley [:cbartley]
no flags Details
The values of SRC and HREF attributes are turned into clickable links (20.68 KB, patch)
2008-10-28 14:49 PDT, Curtis Bartley [:cbartley]
no flags Details | Diff | Review
The view source window now supports BACK and FORWARD via keyboard shortcuts and the context menu. (8.33 KB, patch)
2008-10-28 15:01 PDT, Curtis Bartley [:cbartley]
enndeakin: review+
Details | Diff | Review
The view source window now supports BACK and FORWARD via keyboard shortcuts and the context menu. (8.33 KB, patch)
2008-10-29 13:11 PDT, Curtis Bartley [:cbartley]
no flags Details | Diff | Review
The view source window now supports BACK and FORWARD via keyboard shortcuts and the context menu. (8.36 KB, patch)
2008-10-30 20:27 PDT, Curtis Bartley [:cbartley]
gavin.sharp: review+
Details | Diff | Review
The values of SRC and HREF attributes are turned into clickable links (31.10 KB, patch)
2008-11-04 18:32 PST, Curtis Bartley [:cbartley]
mrbkap: review+
Details | Diff | Review
The values of SRC and HREF attributes are turned into clickable links (31.26 KB, patch)
2008-11-05 17:44 PST, Curtis Bartley [:cbartley]
mrbkap: superreview+
mbeltzner: approval1.9.1b2+
Details | Diff | Review
Two bug fixes (1.87 KB, patch)
2008-11-12 05:02 PST, neil@parkwaycc.co.uk
mrbkap: review+
mrbkap: superreview+
mbeltzner: approval1.9.1b2+
Details | Diff | Review

Description David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 1999-10-29 20:15:10 PDT
Being able to navigate through links in view-source would be a nice feature.  If
you clicked a link that could be handled by view source, then you would get the
source (stylesheet, frame, etc.).  If not, then you'd get the object (image,
etc.), if it could be viewed.

I think this would be easiest if "view-source:" URLs (see bug 15372) were
implemented in a way that:
 * if it's not text, redirects to non "view-source:" URL.
 * the view-source window were implemented to use internally the "view-source:"
URLs.
(Actually, if it did this, there's one distinction in my patch that wouldn't be
necessary.  This is probably a better way to go then distinguishing whether to
do a view-source or not based on the attribute that has the URL.  Not that my
code that does that actually has any effect...)

I have a patch that does a bit of this.  The main problem remaining is the link
resolution.  (Right now, if you click a link, you get the page, not the
source.)  However, there are certainly some rough spots in the patch!  (There
was some stuff with attribute tokens that was crashing, so I did it a bad way to
get it to work.  I'm not sure exactly what the right way is.)

There could also be some interesting enhancements to the chrome in the
view-source window to go along with this.
Comment 1 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 1999-10-29 20:16:59 PDT
Created attachment 2502 [details] [diff] [review]
patch, as I currently have it (work in progress!!)
Comment 2 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 1999-10-29 20:20:59 PDT
Also, it would be nice if this supported:
 * URLs in DOCTYPEs
 * URLs in PIs
 * URLs in xlinks (this would be easier if one doesn't have to decide whether
they are to be seen as source or content)

I may or may not have some time to do more work on this this weekend...
Comment 3 rickg 1999-11-01 14:03:59 PST
David -- as usual in your case, this is really good idea. I'll mark it as
remind, and try to do it post dog-food.
Comment 4 Blake Ross 2000-09-29 13:08:47 PDT
reopening and marking Future...
Comment 5 bsharma 2001-03-02 15:45:58 PST
updated qa contact.
Comment 6 sairuh (rarely reading bugmail) 2001-04-09 15:33:38 PDT
*** Bug 11519 has been marked as a duplicate of this bug. ***
Comment 7 rbs 2001-04-17 19:39:16 PDT
Related info: see erik's http://webtools.mozilla.org/web-sniffer/
It provides HTTP header info, clickable links (e.g., to linked CSS styles, etc)
Comment 8 Tim Powell 2001-04-24 07:04:54 PDT
That web sniffer is awesome! (Is the code available?) I'd love to see view-
source have those types of capabilities. (I'd make showing CR/LF optional and 
not default, though.) Showing the HTTP headers would also be very helpful. I 
guess some new RFE's are in order.
Comment 9 Erik van der Poel 2001-04-24 12:17:00 PDT
Yes, the Web Sniffer sources are available on mozilla.org. You can view the
sources directly here:

  http://lxr.mozilla.org/mozilla/source/webtools/web-sniffer/

You can also download the sources using CVS. They're under
mozilla/webtools/web-sniffer.

  http://www.mozilla.org/cvs.html
Comment 10 Christopher Hoess (gone) 2002-01-17 17:21:17 PST
->harishd
Comment 11 Alexey Chernyak 2002-06-30 07:37:47 PDT
*** Bug 155077 has been marked as a duplicate of this bug. ***
Comment 12 Weston Ruter 2003-02-16 16:17:33 PST
I think every link in a document should appear in the viewsource window as a
link, (i.e. underlined). Viewsource should be able to handle relative URLs of
all links, (e.g. src="", href="", etc.). This would greatly assist
developers/learners in browsing the source of a website. Back and Forward
buttons in the viewsource chrome must be added.
Comment 13 Boris Zbarsky [:bz] 2003-02-16 16:53:05 PST
There is nothing that "must" happen (except maybe removing internal view-source
completely if no one steps up to maintain it, which is what it's looking like
right now).
Comment 14 Boris Zbarsky [:bz] 2003-07-27 17:02:59 PDT
*** Bug 214112 has been marked as a duplicate of this bug. ***
Comment 15 Brian 'netdragon' Bober 2003-07-27 21:09:22 PDT
My bug 214112 was talking more about inline expansion of the links on demand. Is
that covered by this bug?
Comment 16 Bill Mason 2003-11-02 10:42:08 PST
*** Bug 224471 has been marked as a duplicate of this bug. ***
Comment 17 Alfonso Martinez 2003-11-14 12:50:41 PST
*** Bug 188694 has been marked as a duplicate of this bug. ***
Comment 18 neil@parkwaycc.co.uk 2004-01-04 11:57:12 PST
Currently view-source: URLs don't go in history, at time of writing by code in
nsDocShell.cpp but soon to be moved to disablehistory="true" in viewSource.xul
(see bug 229995).
Comment 19 Brian 'netdragon' Bober 2005-08-05 07:19:00 PDT
Also, <script src="foo.js"> sections should be viewable in view source. I had an
alternate idea for how this could work, which I wrote in bug 60426 (but should
have in this one instead):
that we be able to click (or double-click) on lines that
include other content such as ( <script src="foo.js" ) and <link> tags and
the external file would expand in place of that line. Clicking again would close
it. Obviously, we wouldn't want to do this for <a> tags.
Comment 20 Boris Zbarsky [:bz] 2008-09-05 15:04:11 PDT
*** Bug 453882 has been marked as a duplicate of this bug. ***
Comment 21 Matthias Versen [:Matti] 2008-09-18 18:55:09 PDT
*** Bug 455945 has been marked as a duplicate of this bug. ***
Comment 22 Curtis Bartley [:cbartley] 2008-10-28 11:20:21 PDT
Created attachment 345121 [details] [diff] [review]
The values of SRC and HREF attributes are turned into clickable links

This patch only partially addresses bug 17612.  The values for SRC and HREF attributes are turned into links which link to their targets.  These links are themselves view-source URLs.

The patch also implements BACK and FORWARD support in the view-source window, since the user will want that functionality just as soon as they click on their first linkified URL.  This is keyboard and context menu support.  The context menu support required copying four ENTITY definitions from browser.dtd to viewSource.dtd, so there are localization implications.

One notable deficiency of the current implementation is that URLs in IMG SRC attributes attempt to view-source images, which renders the bytes that make up the image as text.  I think there is probably a simple solution to this problem, but I don't want to delay this patch to wait for it since we're pushing up against a string freeze deadline.
Comment 23 Benjamin Smedberg [:bsmedberg] 2008-10-28 12:15:11 PDT
Comment on attachment 345121 [details] [diff] [review]
The values of SRC and HREF attributes are turned into clickable links

Because the view-source code lives in toolkit, it cannot use DTDs from Firefox (browser.dtd).
Comment 24 Brendan Eich [:brendan] 2008-10-28 12:40:42 PDT
Curtis, great to have a patch -- thanks, and keep 'em coming if you are game.

Benjamin, how about some guidance?

/be
Comment 25 Curtis Bartley [:cbartley] 2008-10-28 14:39:46 PDT
Created attachment 345161 [details]
The values of SRC and HREF attributes are turned into clickable links
Comment 26 Curtis Bartley [:cbartley] 2008-10-28 14:49:46 PDT
Created attachment 345164 [details] [diff] [review]
The values of SRC and HREF attributes are turned into clickable links

[Note that this is the C++ portion of the previous, now superceded patch 345121.  The patch is stand-alone, the JS/XUL/DTD changes aren't needed for it to work.  I've pulled the C++ changes out to a separate patch so they won't be subject to the upcoming string freeze.]

This patch only partially addresses bug 17612.  The values for SRC and HREF
attributes are turned into links which link to their targets.  These links are
themselves view-source URLs.

One notable deficiency of the current implementation is that URLs in IMG SRC
attributes attempt to view-source images, which renders the bytes that make up
the image as text.  I think there is probably a simple solution to this
problem, but I haven't had time to research it yet.
Comment 27 Curtis Bartley [:cbartley] 2008-10-28 15:01:34 PDT
Created attachment 345170 [details] [diff] [review]
The view source window now supports BACK and FORWARD via keyboard shortcuts and the context menu.

This patch implements BACK and FORWARD functionality in the view-source window.  It is not strictly dependent upon patch 345164 (The values of SRC and HREF attributes are turned into clickable links), although it can't be tested without that patch since otherwise links don't ever show up in the view source window.  This patch has some localizable string changes -- six entity definitions were copied from browser.dtd to viewSource.dtd.  (Note that patch includes the changes necessary to address bsmedberg's comment #23)

I split this patch out from the C++ changes since the entity additions make it subject to the upcoming string freeze, and I didn't want the core functionality changes in the other patch to be subjected to the tighter string freeze deadline.  I also need different reviewers for the two patches, so splitting them up this way seems to make sense.
Comment 28 Neil Deakin 2008-10-29 12:18:40 PDT
Comment on attachment 345170 [details] [diff] [review]
The view source window now supports BACK and FORWARD via keyboard shortcuts and the context menu.

AY_AS_SOURCE);
>+
>+          // Record the page load in the session history so <back> will work.
>+          var ccShEntry = Cc["@mozilla.org/browser/session-history-entry;1"];
>+          var ccIoService = Cc["@mozilla.org/network/io-service;1"];
>+          var shEntry = ccShEntry.createInstance(Ci.nsISHEntry);
>+          var ioService = ccIoService.getService(Ci.nsIIOService);
>+          

This is a bit hard to read. Just do:

ccShEntry = Cc["@mozilla.org/browser/session-history-entry;1"].createInstance(Ci.nsISHEntry);

Or if you want mutliple lines, use the same variable name and put the two ishentry and ioservice service lines together rather than separated.

The patch looks OK, but I think a browser reviewer would be a better choice for this (gavin, mano or perhaps the other Neil)
Comment 29 Benjamin Smedberg [:bsmedberg] 2008-10-29 12:48:11 PDT
Curtis, can you make the change Neil suggested and upload a new patch with r?gavin?
Comment 30 Curtis Bartley [:cbartley] 2008-10-29 13:11:04 PDT
Created attachment 345338 [details] [diff] [review]
 The view source window now supports BACK and FORWARD via keyboard shortcuts and the context menu.

Addresses Neil's request in his review (comment #28) and added Gavin as requested reviewer.
Comment 31 :Gavin Sharp [email: gavin@gavinsharp.com] 2008-10-30 14:34:01 PDT
Comment on attachment 345338 [details] [diff] [review]
 The view source window now supports BACK and FORWARD via keyboard shortcuts and the context menu.

>diff -r 5833ad22f1a7 toolkit/components/viewsource/content/viewSource.js

> function onLoadViewSource() 
> {

>+  // Attach the progress listener.
>+  try {
>+    getBrowser().addProgressListener(gViewSourceProgressListener, 
>+        Components.interfaces.nsIWebProgress.NOTIFY_ALL);
>+  } catch (ex) {
>+  }  

Why the try/catch? This shouldn't be failing, and if it does it's better off not being silent, I think.

>+  // Create a session history.
>+  var webNavigation = getBrowser().webNavigation;
>+  var ccSessionHistory = Cc["@mozilla.org/browser/shistory;1"];
>+  webNavigation.sessionHistory = ccSessionHistory.createInstance();

Just chain the Cc[].createInstance() and remove ccSessionHistory.

>+          // Record the page load in the session history so <back> will work.

I'm not sure I understand why we have to do this manually. Shouldn't we just revert the patch for bug 229995 and let docshell take care of it?

>+          var ccShEntry = Cc["@mozilla.org/browser/session-history-entry;1"];
>+          var shEntry = ccShEntry.createInstance(Ci.nsISHEntry);
>+          var ccIoService = Cc["@mozilla.org/network/io-service;1"];
>+          var ioService = ccIoService.getService(Ci.nsIIOService);

Again, remove the cc* variables and just change the Cc[].getService/createInstance calls.

>+          shEntry.setURI(ioService.newURI(viewSrcUrl, null, null));

You should be able to use makeURI from contentAreaUtils.js, rather than invoking the ioService directly.

>+          shEntry.setTitle(viewSrcUrl);
>+          shEntry.loadType = Ci.nsIDocShellLoadInfo.loadHistory;
>+          webNavigation.sessionHistory.QueryInterface(Ci.nsISHistoryInternal);

You can pass Ci.nsISHistoryInternal to the createInstance() call above to avoid having to explicitly call QI here.

I noticed some tabs slipped into the XUL changes, and some trailing whitespace on some of the lines you're adding - please remove those.

Otherwise this looks good. Want an answer to whether the addEntry call is needed before r+ing, though.
Comment 32 Curtis Bartley [:cbartley] 2008-10-30 20:27:48 PDT
Created attachment 345657 [details] [diff] [review]
The view source window now supports BACK and FORWARD via keyboard shortcuts and the context menu.

* Removed exception handlers from addProgressListener() and removeProgressListener() calls.
* Removed the code which explicitly creates a session history object (including the ccSessionHistory temporary) -- I removed the "disablehistory" on the browser tag in the XUL, and as a result, a session history is created automatically.
* Removed other ccXxxx temporary variables
* Called makeURI() instead of ioService.newURI().
* Removed the call to QueryInterface() on the session history -- the one that is now automatically created seems to expose the desired interface by default.
* Converted tabs to spaces in viewSource.xul.  I couldn't find any cases of trailing spaces in my changes, but note that there were some random trailing spaces in the prior version of these files that got automatically trimmed out, maybe that's what you were seeing?

I investigated bug 229995 -- the only change in the patch for that bug related to my changes was simply the addition of the 'disablehistory="true"' attribute to the <browser> element.  I removed that attribute and discovered that I no longer needed to explicitly instantiate the session history.  On the other hand, the first file displayed in the view source window was still not recorded in the session history, although subsequent files (viewed by clicking links in the v-s window) were recorded.  It looks like the code to explicitly add that first file to the history is still needed.  There might be some way to get the first file automatically recorded, but bug 229995 at least sheds no light on what that way might be.

As an added note, it's probably desirable to add BACK and FORWARD handling to the viewPartialSource.xul as well.  (Or better yet, merge them into a single window if possible).
Comment 33 Curtis Bartley [:cbartley] 2008-11-03 09:46:16 PST
> >+          // Record the page load in the session history so <back> will work.
>
> I'm not sure I understand why we have to do this manually. Shouldn't we just
> revert the patch for bug 229995 and let docshell take care of it?

This turns out to be hard to answer.  The short answer, as best I understand it:

We must pass a session history entry object to LoadPage so that we can load from cache if at all possible (since the source might change, possibly substantially, if we request it from the server again).  However, it looks like the presence of a session history entry is also used internally to indicate what state the nsDocShell is in -- see the declarations of mOSHE and mLSHE in nsDocShell.h.  It may be hard to un-conflate the two different interpretations of what the presence of a session history entry means.

Some more background, possibly more detail than anyone really wants to know, follows.

Inside nsDocShell::OnNewURI:

>    if (updateHistory || && shAvailable) {
>        // Update session history if necessary...
>        if (!mLSHE && (mItemType == typeContent) && mURIResultedInDocument) {
>            /* This is  a fresh page getting loaded for the first time
>             *.Create a Entry for it and add it to SH, if this is the
>             * rootDocShell
>             */
>            (void) AddToSessionHistory(aURI, aChannel, getter_AddRefs(mLSHE));
>        }

On a normal URL load, nsDocShell::LoadURI is called.  When the code above is reached, AddToSessionHistory is called, and we have goodness.

On a view-source load, nsDocShell::LoadPage is called.  When the code above is reached, AddToSessionHistory is *not* called.  It is not called for two reasons.
* updateHistory is false
* mLSHE is not NULL, making "!mLSHE" false on the inner if-statement.

"updateHistory" is false because LoadPage specified a load type of LOAD_HISTORY which by definition includes the LOAD_CMD_HISTORY flag, as checked by this code which appears earlier in OnNewURI:

>    // Determine if this type of load should update history.
>    if (aLoadType == LOAD_BYPASS_HISTORY ||
>        aLoadType == LOAD_ERROR_PAGE ||
>        aLoadType & LOAD_CMD_HISTORY ||
>        aLoadType & LOAD_CMD_RELOAD)
>        updateHistory = PR_FALSE;

Now the second question.  Why is mLSHE not NULL?

LoadPage takes an nsISHEntry object, indeed that's it's whole point: the nsISHEntry object contains a cache-id allowing us to (if at all possible) load the page source from the cache rather than contacting the server and possibly getting source that's different than what the user was viewing.  LoadPage calls LoadHistoryEntry with the nsISHEntry, and LoadHistoryEntry in turn passes it to InternalLoad.

Inside InternalLoad we have the following code:

>    if (mLoadType != LOAD_ERROR_PAGE)
>        SetHistoryEntry(&mLSHE, aSHEntry);

If aSHEntry (the nsISHEntry argument) is not NULL, then mLSHE will be set to this value.  This causes the !mLSHE test above to fail in the LoadPage case.

On the other hand, on a normal load originating from LoadURI, the aSHEntry argument will be NULL, and mLSHE will also be NULL after this statement executes.

So we have a dilemma.  If we pass a non NULL nsISHEntry pointer, then the AddToSessionHistory call will be skipped, and we want it to be called.  On the other hand, if we have to pass an nsISHEntry because that's the only way to specify a cache-id allowing us to load the file for view-source from the cache, which we also want to do.

We have three alternatives as I see it:

1. Keep the current solution -- LoadPage does not add the view-source URL to the session history, and we do that manually inside viewSource.js.
2. Add some extra conditions to the if-statements around the call to AddToSessionHistory, something like:

    PRBool isViewSource = PR_FALSE;
    aURI->SchemeIs("view-source", &isViewSource);
    if ((updateHistory || isViewSource) && shAvailable) {
        // Update session history if necessary...
        printf("................. isViewSource: %d\n", isViewSource);
        if ((!mLSHE || isViewSource) && (mItemType == typeContent) && mURIResultedInDocument) {
            /* This is  a fresh page getting loaded for the first time
             *.Create a Entry for it and add it to SH, if this is the
             * rootDocShell
             */
            (void) AddToSessionHistory(aURI, aChannel, getter_AddRefs(mLSHE));
        }

This is ugly but should be low risk outside of the view-source code path.

3. Make view-source load-from cache-handling more explicit inside nsDocShell.cpp.  For example, it might make sense to introduce a LOAD_VIEW_SOURCE load type to replace the LOAD_HISTORY that LOAD_PAGE can use.  Or maybe an additional LoadType flag can be provided to specify indicate that can be added (ored) into the LOAD_HISTORY flag in LoadPage to indicate that it really does want the page load recorded in the session history.  The problem with any changes likes this is that the necessary code changes might have effects outside the view-source code path, and given the complexity of the code, might be hard to verify and test.
Comment 34 Boris Zbarsky [:bz] 2008-11-03 10:07:47 PST
Ideally we'd fix docshell to behave sanely on LoadPage calls, but I agree that a change like that would be risky, and shouldn't block this patch.  It would be nice to do as a followup, though.
Comment 35 Curtis Bartley [:cbartley] 2008-11-03 10:33:52 PST
> Ideally we'd fix docshell to behave sanely on LoadPage calls, but I agree that
a change like that would be risky, and shouldn't block this patch.  It would be
nice to do as a followup, though.

I'm just thinking out loud, but might it make sense to extend the view-source: scheme to include a cache-id, something like view-source:cache-id:xxxxx:http://xxxx?  The treatment would be: Load it from the cache if at all possible, otherwise, use the URL.  Then maybe we could just use LoadURI and handle view-source URLs through largely the same code-path as regular URLs.

This might also present a way to make hyperlinks displayed in the view-source window (URLs in SCRIPT and STYLE elements, SRCs in IFRAMES) to also specify a cache-id.  For example, if you view source of a page you are currently looking at, you want to see the original source, not whatever the server hands you back next time.  The same argument might apply if that source contains an IFRAME and you want to click on the link on the IFRAME's SRC attribute in the view-source window: You've already seen this content rendered as part of the original page, and you'd prefer to see the source to that, rather than contacting the server again and possibly getting back something different.  (How to get the cache-id for that IFRAME document is left as an exercise for the reader.)
Comment 36 Mike Shaver (:shaver -- probably not reading bugmail closely) 2008-11-03 11:02:11 PST
son of wyciwyg?
Comment 37 Curtis Bartley [:cbartley] 2008-11-03 11:08:01 PST
> son of wyciwyg?

I'd probably have said "bastard child", but yeah.

As a side note, it turns out there's a Wikipedia page for WYCIWYG: http://en.wikipedia.org/wiki/WYCIWYG...
Comment 38 Brendan Eich [:brendan] 2008-11-03 11:22:30 PST
As creator of wysiwyg: (Netscape 3, ol' dirty bastard ur-father of all these schemes) I endorse this cacheid idea. ;-)

But expedient fixes are good too, especially (if possible) for 3.1b2, whose deadline IIRC is tomorrow midnight. Any chance?

/be
Comment 39 :Gavin Sharp [email: gavin@gavinsharp.com] 2008-11-03 17:40:04 PST
Comment on attachment 345657 [details] [diff] [review]
The view source window now supports BACK and FORWARD via keyboard shortcuts and the context menu.

Sounds like we should go with this for now, and file a followup to investigate the potential enhancements to docshell.

r=me
Comment 40 Curtis Bartley [:cbartley] 2008-11-03 17:53:18 PST
> son of wyciwyg?

I'd probably have said "bastard child", but yeah.

As a side note, it turns out there's a Wikipedia page for WYCIWYG: http://en.wikipedia.org/wiki/WYCIWYG...
Comment 41 Blake Kaplan (:mrbkap) (please use needinfo!) 2008-11-03 18:13:00 PST
Comment on attachment 345164 [details] [diff] [review]
The values of SRC and HREF attributes are turned into clickable links

I don't mind the whitespace cleanup, though I did have to generate a diff -w to review. Splitting the whitespace cleanup into a different patch would have been nice.

Also, please use the options -pU8 when generating diffs for review!

>diff -r 5833ad22f1a7 parser/htmlparser/src/nsViewSourceHTML.cpp
>@@ -85,6 +86,8 @@
> #include "prio.h"
> #include "plstr.h"
> #include "prmem.h"
>+
>+

Nit: Don't add these empty lines.

>-nsresult CViewSourceHTML::WriteAttributes(PRInt32 attrCount, PRBool aOwnerInError) {
>+nsresult CViewSourceHTML::WriteAttributes(const nsAString& tagName, 
>+      nsTokenAllocator* theAllocator, PRInt32 attrCount, PRBool aOwnerInError) {

I'd rather see this written as:

nsresult CViewSourceHTML::WriteAttributes(const nsAString& tagName,
                                          nsTokenAllocator* theAllocator,
                                          PRInt32 attrCount,
                                          PRBool aOwnerInError)

Also, if you can avoid it, I'd prefer to move away from the "the*" naming convention. Calling that parameter just 'allocator' reads much better to me.

>@@ -699,18 +703,22 @@
> 
>           CAttributeToken* theAttrToken = (CAttributeToken*)theToken;
>           const nsSubstring& theKey = theAttrToken->GetKey();
>-          
>+
>           // The attribute is only in error if its owner is NOT in error.
>           const PRBool attributeInError =
>             !aOwnerInError && theAttrToken->IsInError();
> 
>           result = WriteTag(kAttributeName,theKey,0,attributeInError);
>           const nsSubstring& theValue = theAttrToken->GetValue();
>-
>           if(!theValue.IsEmpty() || theAttrToken->mHasEqualWithoutValue){
>-            result = WriteTag(kAttributeValue,theValue,0,attributeInError);
>+            if (IsUrlAttribute(tagName, theKey, theValue)) {
>+              WriteHrefAttribute(theAllocator, theValue);
>+              result = NS_OK;
>+            } else {
>+              result = WriteTag(kAttributeValue,theValue,0,attributeInError);
>+            }
>           }

A little analysis shows that WriteTag doesn't ever return an error code (and if it did, we'd throw it away). So let's get rid of the unsightly 'result = NS_OK' and pretend that WriteTag returns void (file a followup on cleaning up (the error reporting/handling in) this file, please).

>+PRBool CViewSourceHTML::IsUrlAttribute(const nsAString& tagName,
>+                        const nsAString& attrName, const nsAString& attrValue) {

Same nit about this function declaration. I won't point it out every time.

>+  nsAutoString trimmedAttrName(attrName);
>+  TrimTokenValue(trimmedAttrName, PR_TRUE, PR_TRUE);
>+
>+  PRBool isHref = trimmedAttrName.LowerCaseEqualsLiteral("href");
>+  PRBool isSrc = trimmedAttrName.LowerCaseEqualsLiteral("src");

Not sure if it matters, but you could avoid the second string compare for href attributes by making the second declaration: |PRBool isSrc = !isHref && ...|.

>+  // If this is the HREF attribute of a BASE element, then update mBaseHref.
>+  // This doesn't feel like the ideal place for this, but the alternatives don't
>+  // seem all that nice either.

Yeah, this sucks. Further, you could end up with the wrong base attribute if the given href here is a relative URL, malformed or not safe to load. We should add an API to nsIHTMLContentSink that tells you if the given attribute value is a valid base href target. I think it's OK for now, but we should definitely file a followup bug on it.

>+void CViewSourceHTML::WriteHrefAttribute(nsTokenAllocator* theAllocator,
>+                                                        const nsAString& href) {
>+  // Trim away any surrounding whitespace and the quotes to get the HREF proper.
>+  nsAutoString hrefProper(href);
>+  PRInt32 untrimmedLength = href.Length();
>+  TrimTokenValue(hrefProper, PR_TRUE, PR_FALSE);
>+  PRInt32 trimmedLeftLength = hrefProper.Length();
>+  TrimTokenValue(hrefProper, PR_FALSE, PR_TRUE);
>+  PRInt32 trimmedLength = hrefProper.Length();
>+
>+  // Pull out the text preceding and succeeding the HREF proper.
>+  PRInt32 leftTextLength = untrimmedLength - trimmedLeftLength;
>+  PRInt32 rightTextStart = leftTextLength + trimmedLength;
>+  PRInt32 rightTextLength = untrimmedLength - rightTextStart;
>+  const nsAString& precedingText = Substring(href, 0, leftTextLength);
>+  const nsAString& succeedingText = Substring(href, rightTextStart, rightTextLength);

Ugh. Could you define a version of TrimTokenValue that takes and returns iterators and operate on those instead? It seems like it'd get rid of a lot of these local variables and cut down on the complication of just getting the central piece separated from the quotes and whitespace.

>+  nsAutoString fullPrecedingText;
>+  fullPrecedingText.AssignWithConversion(kBeforeText[kAttributeValue]);

I'd prefer seeing this turn into:
  nsAutoString fullPrecedingText(kEquals);
since we know what that value is, though it doesn't really matter.

>+  // PNG rendered as text.  A smarter view-source handler might be able to deal
>+  // with the latter problem.

File a bug on this?

>+  // Construct the HTML that will represent the HREF.
>+  const nsAutoString EMPTY;

We have EmptyString() for this.

>+  nsAutoString HREF;
>+  HREF.AssignWithConversion("href");

Please use |NS_NAMED_LITERAL_STRING(HREF, "href")| here.

>+  viewSourceUrl = EmptyString();

This is better as |viewSourceUrl.Truncate()|.

>+  // Build a URI for the page's base URL.
>+  const nsAString& baseUrl = mBaseHref.IsEmpty() ? mFilename : mBaseHref;
>+  rv = NS_NewURI(getter_AddRefs(baseURI), baseUrl);

Is it worth keeping a URI around for the base href instead of creating it every time we need it? Also, do these two calls work correctly in the face of internationalized URIs with a charset? The mParser member has the charset that we could pass to these functions, which seems like a good idea to me.

>+void CViewSourceHTML::WriteTextInSpan(const nsAString &text, nsTokenAllocator*
>+          allocator, const nsAString& attrName, const nsAString &attrValue) {
>+  nsAutoString span;
>+  span.AssignWithConversion("SPAN");

NS_NAMED_LITERAL_STRING here (also, when you have a literal like that, you can use AssignLiteral).

>+void CViewSourceHTML::WriteTextInAnchor(const nsAString &text, nsTokenAllocator*
>+          allocator, const nsAString& attrName, const nsAString &attrValue) {
>+  nsAutoString anchor;
>+  anchor.AssignWithConversion("A");

Ditto here.

I'll stamp your next patch with the comments addressed.
Comment 42 Curtis Bartley [:cbartley] 2008-11-04 18:32:02 PST
Created attachment 346379 [details] [diff] [review]
The values of SRC and HREF attributes are turned into clickable links

[Note: TODOs (generally filing bugs) have not been done yet, everything else is contained in the patch.]

1. Turfed bogus empty lines.
2. Changed style when linewrapping function arguments.
3. Ignored useless return value from WriteTag. [TODO: File a follow-up bug to clean up error handling/reporting in this file.]
4. Fixed function declaration style.
5. [TODO: File a follow-up bug on the handling of malformed URLs in BASE elements.]
6. Introduced iterator-based TrimTokenValue function.
7. Used literal equals when initializing fullPrecedingText.  Comment suggested using a constant, but that seems like overkill.
8. [TODO: File a bug about smarter view-source handling of images]
9. Used EmptyString().
10. Stored a URI object for the BASE, not just a spec, so we don't have to rebuild it every time.  Also now use the charset reported by the parser when constructing URIs.
11. Used NS_NAMED_LITERAL_STRING.
12. Ditto.
Comment 43 :Gavin Sharp [email: gavin@gavinsharp.com] 2008-11-04 21:53:26 PST
Frontend patch landed:
http://hg.mozilla.org/mozilla-central/rev/f3e79a7ed0e7
Comment 44 Mike Beltzner [:beltzner, not reading bugmail] 2008-11-05 00:23:18 PST
Blake - once you've done the review can you add the patch to the metering list so that we can get this in during the slushy freeze?
Comment 45 Blake Kaplan (:mrbkap) (please use needinfo!) 2008-11-05 15:14:57 PST
Comment on attachment 346379 [details] [diff] [review]
The values of SRC and HREF attributes are turned into clickable links

>-nsresult CViewSourceHTML::WriteAttributes(PRInt32 attrCount, PRBool aOwnerInError) {
>+nsresult CViewSourceHTML::WriteAttributes(const nsAString& tagName, 
>+      nsTokenAllocator* allocator, PRInt32 attrCount, PRBool aOwnerInError) {

Oops, missed this header.

>+  nsAutoString fullPrecedingText;
>+  fullPrecedingText.AssignWithConversion("=");

This could be AssignLiteral, but I wasn't joking about kEqual. It lives in nsIParser.h as a handy PRUnichar for handing to nsStrings for this very reason.

>+  // Get the BaseURI.
>+  rv = GetBaseURI(getter_AddRefs(baseURI));
>+  if (NS_FAILED(rv)) return rv;

Might as well use NS_ENSURE_SUCCESS here and for the NS_NewURI below.

>+  // Prepend "view-source:" onto the absolute URL and store it in the out param.
>+  viewSourceUrl.AssignWithConversion("view-source:");

This should be AssignLiteral.

>+  // Create a new base URI and store it in mBaseURI.
>+  return NS_NewURI(getter_AddRefs(mBaseURI), baseSpec, charset.get());

It seems like you should do this into a temporary URI so if it fails, we still have the old base URI around.

r=mrbkap with that.
Comment 46 Curtis Bartley [:cbartley] 2008-11-05 17:44:10 PST
Created attachment 346585 [details] [diff] [review]
The values of SRC and HREF attributes are turned into clickable links

    (From update of attachment 346379 [details] [diff] [review])
    >-nsresult CViewSourceHTML::WriteAttributes(PRInt32 attrCount, PRBool aOwnerInError) {
    >+nsresult CViewSourceHTML::WriteAttributes(const nsAString& tagName, 
    >+      nsTokenAllocator* allocator, PRInt32 attrCount, PRBool aOwnerInError) {
    
    Oops, missed this header.

Done.
    
    >+  nsAutoString fullPrecedingText;
    >+  fullPrecedingText.AssignWithConversion("=");
    
    This could be AssignLiteral, but I wasn't joking about kEqual. It lives in
    nsIParser.h as a handy PRUnichar for handing to nsStrings for this very reason.

Done.
    
    >+  // Get the BaseURI.
    >+  rv = GetBaseURI(getter_AddRefs(baseURI));
    >+  if (NS_FAILED(rv)) return rv;
    
    Might as well use NS_ENSURE_SUCCESS here and for the NS_NewURI below.

Done.
    
    >+  // Prepend "view-source:" onto the absolute URL and store it in the out param.
    >+  viewSourceUrl.AssignWithConversion("view-source:");
    
    This should be AssignLiteral.

Done.
    
    >+  // Create a new base URI and store it in mBaseURI.
    >+  return NS_NewURI(getter_AddRefs(mBaseURI), baseSpec, charset.get());
    
    It seems like you should do this into a temporary URI so if it fails, we still
    have the old base URI around.

Done.
    
    r=mrbkap with that.
Comment 47 Blake Kaplan (:mrbkap) (please use needinfo!) 2008-11-05 18:02:22 PST
Comment on attachment 346585 [details] [diff] [review]
The values of SRC and HREF attributes are turned into clickable links

Looks good to me. r+sr=mrbkap
Comment 48 Mike Beltzner [:beltzner, not reading bugmail] 2008-11-05 18:10:46 PST
Comment on attachment 346585 [details] [diff] [review]
The values of SRC and HREF attributes are turned into clickable links

a=beltzner for 1.9.1b2
Comment 49 Johnathan Nightingale [:johnath] 2008-11-10 13:26:19 PST
http://hg.mozilla.org/mozilla-central/rev/a33287a649fe
Comment 50 PikeUK 2008-11-11 03:23:02 PST
Filed bug 464222 about linkification in DOM source.
Comment 51 Boris Zbarsky [:bz] 2008-11-11 22:02:31 PST
So did we ever do any performance impact testing here?
Comment 52 Curtis Bartley [:cbartley] 2008-11-11 22:10:29 PST
We didn't.  In fact, we need automated tests in general.
Comment 53 Boris Zbarsky [:bz] 2008-11-11 22:45:18 PST
It doesn't even have to be automated as a first cut.  Just knowing how much of an impact (if any) this has on view-source performance would be good.  There are old view-source performance bugs that should be able to provide nice pathological testcases if those are desired.
Comment 54 neil@parkwaycc.co.uk 2008-11-12 05:02:12 PST
Created attachment 347755 [details] [diff] [review]
Two bug fixes

These two bugs were very easy to find. The compiler conveniently warned me about the IsTokenValueTrimmableCharacter bug while the other bug was detected by an assertion - you can't add attributes to a stack-allocated token.
Comment 55 Boris Zbarsky [:bz] 2008-11-12 05:28:34 PST
Can we add a test for the stack token thing?  For that matter, what about the other?
Comment 56 Boris Zbarsky [:bz] 2008-11-13 10:12:27 PST
Can we please get comment 54 spun into a separate, release-blocking, bug?
Comment 57 Boris Zbarsky [:bz] 2008-11-13 10:14:00 PST
Note that we linkify the wrong thing if the attribute is the last non-whitespace on the line.  See bug 464727.
Comment 58 Blake Kaplan (:mrbkap) (please use needinfo!) 2008-11-13 11:22:41 PST
Comment on attachment 347755 [details] [diff] [review]
Two bug fixes

Might as well mark this here... Looks good.
Comment 59 Boris Zbarsky [:bz] 2008-11-13 11:49:38 PST
Comment on attachment 347755 [details] [diff] [review]
Two bug fixes

The problem with that is there's no way to request blocking....
Comment 60 Biju 2008-11-13 20:39:00 PST
In some cases quote is appended to the generated URL
I think it we are seeing it if a new-line comes after quoted URL.
it does not matter whether there is space between quoted URL and the new-line
this also affects if we use single quote instead of double quote
 

See
view-source:https://bugzilla.mozilla.org/show_bug.cgi?id=17612


case 1.
  <link href="skins/standard/global.css"

url #1 
   view-source:https://bugzilla.mozilla.org/skins/standard/global.css%22


Case 2.

            <a name="a1" href="attachment.cgi?id=2502"
               title="View the content of the attachment">


url #2 
view-source:https://bugzilla.mozilla.org/attachment.cgi?id=2502%22


Case 3.

    <span class="last_comment_link">
      <a href="#c59" 
         accesskey="l"><b>L</b>ast Comment</a>
    </span>

url #3 
https://bugzilla.mozilla.org/show_bug.cgi?id=17612#c59%22">view-source:https://bugzilla.mozilla.org/show_bug.cgi?id=17612#c59%22






Case 4.

            <a href="#attach_2502"
               title="Go to the comment associated with the attachment">1999-10-29 20:16 PST</a>,

            <a href="mailto:dbaron&#64;mozilla.com"
               title="David Baron [:dbaron] &lt;dbaron&#64;mozilla.com&gt;">David Baron [:dbaron]
            </a>


urls #4 

https://bugzilla.mozilla.org/show_bug.cgi?id=17612#attach_2502%22">view-source:https://bugzilla.mozilla.org/show_bug.cgi?id=17612#attach_2502%22
view-source:mailto:dbaron&#64;mozilla.com"
Comment 61 Mike Beltzner [:beltzner, not reading bugmail] 2008-11-13 20:40:11 PST
This bug is FIXED. That means that the checkin is part of the code. That means that new problems are new bugs; please file them as such.
Comment 62 Mike Beltzner [:beltzner, not reading bugmail] 2008-11-13 20:41:15 PST
Comment on attachment 347755 [details] [diff] [review]
Two bug fixes

a=beltzner, grumble, grumble
Comment 63 Biju 2008-11-13 20:42:10 PST
bugzilla is eating up my URLs
see view-source: of https://bugzilla.mozilla.org/show_bug.cgi?id=17612
Comment 64 Blake Kaplan (:mrbkap) (please use needinfo!) 2008-11-13 20:45:02 PST
Biju, the bug you're describing is bug 464727.
Comment 65 neil@parkwaycc.co.uk 2008-11-14 03:21:01 PST
Pushed changeset 0ec6c4bc82c1 to mozilla-central.
Comment 66 Neil Cronin 2008-11-14 18:35:58 PST
(In reply to comment #22)
> Created an attachment (id=345121) [details]
> One notable deficiency of the current implementation is that URLs in IMG 
> SRC attributes attempt to view-source images

Is there a bug for this?  I can't find it, but I have some ideas for handling this, and I'd like to take a stab at implementing it.
Comment 67 Curtis Bartley [:cbartley] 2008-11-14 18:43:04 PST
(In reply to comment #66)
> (In reply to comment #22)
> > Created an attachment (id=345121) [details] [details]
> > One notable deficiency of the current implementation is that URLs in IMG 
> > SRC attributes attempt to view-source images
> 
> Is there a bug for this?  I can't find it, but I have some ideas for handling
> this, and I'd like to take a stab at implementing it.

There is, it's Bug 464339.  Note that there's a patch attached which has a working hack, but it's not anywhere near production quality.
Comment 68 Curtis Bartley [:cbartley] 2008-11-19 10:26:34 PST
There is now a test plan at https://wiki.mozilla.org/QA/Firefox3.1/ViewSource_Testplan.  Ostensibly it's about view source in general but at the moment most of the focus is actually on linkification.
Comment 69 neil@parkwaycc.co.uk 2008-11-29 15:54:19 PST
*** Bug 429618 has been marked as a duplicate of this bug. ***
Comment 70 baffclan 2008-12-05 13:53:16 PST
not work on frame page.
bug 309724

view-source:https://bugzilla.mozilla.org/attachment.cgi?id=197153
Comment 71 Brendan Eich [:brendan] 2008-12-05 14:05:46 PST
(In reply to comment #70)
> not work on frame page.
> bug 309724
> 
> view-source:https://bugzilla.mozilla.org/attachment.cgi?id=197153

File a new bug, please.

/be
Comment 72 Boris Zbarsky [:bz] 2008-12-05 17:32:41 PST
There is already a bug on it: the one cited in comment 70.  So I'm not sure what the point of commenting here about it is...
Comment 73 Biju 2008-12-10 19:59:47 PST
just noticed html form action is not linkyfied.
Is it a bug or is it intentionally not linkyfied for security reasons?
Comment 74 Phil Ringnalda (:philor) 2008-12-11 15:26:36 PST
*** Bug 455945 has been marked as a duplicate of this bug. ***
Comment 75 Boris Zbarsky [:bz] 2008-12-11 15:40:09 PST
Curtis, what's the state of the performance testing?  It needs to happen before we ship this....  Ideally it should have happened before we shipped this in a beta.
Comment 76 Curtis Bartley [:cbartley] 2008-12-11 20:36:42 PST
(In reply to comment #75)
> Curtis, what's the state of the performance testing?  It needs to happen before
> we ship this....  Ideally it should have happened before we shipped this in a
> beta.

Indeed it should have.  There are also a couple of bugs that ideally should have been fixed (by me) before beta.  Anyway, I've opened bug 469256 to track performance testing so it doesn't slip off my radar screen again.
Comment 77 Ori Avtalion (salty-horse) 2008-12-12 16:34:53 PST
I'm experiencing usability issues with this feature:

1) Right clicking on links should have a "Copy link location" item in the context menu, just like the "regular" browser window.

Since it's missing, copying links is now very hard. I have to be careful to click and drag from one edge of the link to the other, while being careful not to trigger the link.

2) Back/forward should be accessible via a menu item, and not just as keyboard controls. I can imagine less-savvy web developers closing the window and choosing "View source" again after accidentally following a link.

3) Ctrl+click should open a new "View source" window, to follow the convention.
Comment 78 Martijn Wargers [:mwargers] (gone per 2016-05-31 :-( ) 2008-12-12 16:58:01 PST
Ori, please file new bugs for these issues you mentioned, if they aren't filed already.
Comment 79 Ori Avtalion (salty-horse) 2008-12-12 17:44:19 PST
I filed bug 469434 and bug 469434.

The back/forward are already present in the context menu which I did not notice at first.
Comment 80 Boris Zbarsky [:bz] 2008-12-18 18:30:10 PST
The back/forward changes in this bug caused bug 470357.
Comment 81 kitchin 2008-12-19 19:57:56 PST
Correction to Comment #79, the new bugs are Bug 469434 and Bug 469435.

Note You need to log in before you can comment on or make changes to this bug.