Closed Bug 97677 Opened 23 years ago Closed 23 years ago

Need API to check whether a URI exists

Categories

(Core :: Networking, defect, P2)

defect

Tracking

()

RESOLVED FIXED
mozilla0.9.5

People

(Reporter: akkzilla, Assigned: akkzilla)

References

Details

(Whiteboard: needs sr)

Attachments

(3 files, 6 obsolete files)

For editor link checking, we need an API to check whether a URI is valid or not
(which usually means actually trying to open the file, do an http head request,
etc as appropriate for the scheme).  Darin and bbaetz have helped me with JS
code to open URIs asynchronously and check, but it's error prone and we all
agree that this logic really belongs in netlib instead of the editor, since
netlib will be a lot smarter about interpreting the various http status codes.
Blocks: 91388
Darin suggests nsILinkChecker in netwerk/base/public, implemented by
nsLinkChecker in netwerk/base/source.

I have JS code in bug 91388 which does a basic implementation of this (no proxy
handling or other advanced conditions), so I'll translate it into C++ and put it
there.
Assignee: neeti → akkana
Target Milestone: --- → mozilla0.9.5
Suggestions/hints from bbaetz via IRC:
*bbaetz*
well, we should differenciate between user cancelled events
*bbaetz*
and non-user-cancelled events
*bbaetz*
and have three states: OK, not OK, and unchecked
*bbaetz*
you'll need that anyway
*»bbaetz*
User cancelled means something like we had to go through a proxy interactively
and the user chose not to?
*bbaetz*
yes
*akk*
Should "timed out" be different from "definitely not there"?
*bbaetz*
and if their net connection is down, then just flag all the links as dead, I guess
*bbaetz*
um
*bbaetz*
hmm
*akk*
(As long as I have to return something other than boolean)
*bbaetz*
thats up to you, I guess
*bbaetz*
you should disable this when the user is offline, while you;re at it
*akk*
How do I find out if we're offline?
*akk*
(I was just going to figure that the user shouldn't call this if we're offline. :-)
*bbaetz*
its an attribute on the ioservice
*bbaetz*
but you can also register as an observer, and be told when this state changes
*bbaetz*
so you can enable/disable the menu item
Status: NEW → ASSIGNED
Oh, he also pointed out earlier in the conversation (and it's worth recording)
that it would be nice if knowledge of particular protocols were handled at the
protocol level, rather than this base class knowing all about the details of
http and whatever other protocols might have special cases.  I'm not going to
worry about that in the initial nsILinkChecker implementation (unless netlib
people think I should), but that does suggest that adding something to
nsIChannel might possibly be a better solution in the long run.
yes, and the link checker code could simply use that API in the future.
Attached file nsIURIChecker.idl (obsolete) —
Attached file nsURIChecker.h (obsolete) —
Attached file nsURIChecker.cpp (obsolete) —
akkana:

1- please use NS_GENERIC_FACTORY_CONSTRUCTOR instead of nsURIChecker::Create

2- please do not override the meaning of NS_BINDING_XXX... instead, you should
invent new status codes, declared in nsIURIChecker.
actually, i'm only concerned about NS_BINDING_REDIRECTED... it doesn't apply to
URI checking.  otherwise, using NS_BINDING_SUCCEEDED, NS_BINDING_FAILED, and
NS_BINDING_ABORTED is ok.
also, if you want to generate a single diff, you can 'cvs add' your new files to
your local tree, and then use 'cvs diff -uN' to generate a diff containing the
new files.
I've attached the new files and the diffs to build and create the new class.  I
tried to follow the guidelines Darin and Bradley offered (thanks for all the
help!) -- please let me know if I misunderstood or things need changing.  If the
API looks okay, then I'm seeking review/sr.

I've also attached the patch to editor/ui/composer/content/ComposerCommands.js
which I'm using to test this.  Apply, rebuild in editor, run mozilla -editor
[url] on a url which has a few links in it (ideally, some bad, some good), then
do a Debug->Check Links and watch stdout (there's no UI as yet).  I'll be
getting review from editor folks for that code, but if one of you has time to
review the composer code as well as the netwerk code to make sure I'm not doing
something silly, the double-check would be much appreciated.
Keywords: patch
Priority: -- → P2
Whiteboard: needs r, sr
(Working on the changes Darin suggested ...)
Attachment #51318 - Attachment is obsolete: true
Attachment #51319 - Attachment is obsolete: true
Attachment #51321 - Attachment is obsolete: true
Attachment #51327 - Attachment is obsolete: true
Unfortunately, there's a new pattern for "ToUnicode" that appeared in today's 
tree update. E.g.:
name.ToNewUnicode()
should be changed to:
ToNewUnicode(name);
and you include
#include "nsReadableUtils.h"                                                    
in files that need this.
See bug 100476.

Attachment #51350 - Attachment is obsolete: true
Attachment #51744 - Attachment is obsolete: true
Comment on attachment 51751 [details] [diff] [review]
Patch with new string APIs AND new license.

r=cmanske
Attachment #51751 - Flags: review+
Whiteboard: needs r, sr → needs sr
Comment on attachment 51744 [details] [diff] [review]
Patch incorporating new string APIs

>Index: base/public/nsIIOService.idl

>+interface nsILinkChecker;
>+interface nsIObserver;

these lines can be deleted.



>Index: base/public/nsIURIChecker.idl

>+   *   NS_BINDING_IN_PROGRESS: in progress, don't know yet

nsIRequest::isPending() should be used to convey this state.

>+%{C++
>+
>+#define NS_URICHECKER_CONTRACT_ID "@mozilla.org/network/urichecker;1"
>+#define NS_URICHECKER_CID                     \
>+{ /* cf3a0e06-1dd1-11b2-a904-ac1d6da77a02 */         \
>+    0xcf3a0e06,                                      \
>+    0x1dd1,                                          \
>+    0x11b2,                                          \
>+    {0xa9, 0x04, 0xac, 0x1d, 0x6d, 0xa7, 0x7a, 0x02} \
>+}

please add this to nsNetCID.h



>+#define NS_BINDING_IN_PROGRESS NS_ERROR_GENERATE_FAILURE(NS_ERROR_MODULE_NETWORK, 5)

remove.. see comments above.



>Index: base/src/nsURIChecker.cpp

>+static NS_DEFINE_CID(kIOServiceCID, NS_IOSERVICE_CID);

use nsNetUtil.h:do_GetIOService to eliminate yet another static kIOServiceCID


>+//Interfaces for addref and release and queryinterface

minor nit... how about:

//Implementations for addref, release, and queryinterface



>+nsURIChecker::nsURIChecker()
>+{
>+    NS_INIT_REFCNT();
>+    mStatus = NS_BINDING_REDIRECTED;
>+    mIsPending = PR_TRUE;
>+}

you probably want to just initialize mStatus to NS_OK and mIsPending to FALSE.
mIsPending is TRUE once AsyncCheckURI has been called.


>+nsURIChecker::~nsURIChecker()
>+{
>+    Cancel(NS_BINDING_ABORTED);
>+}

calling cancel is meaningless since nsURIChecker implements nsIStreamListener.
ie., if the destructor of nsURIChecker has been called, then mChannel must
not be holding a reference back to |this|, and if it is not holding a reference
to |this|, then the channel must already have finished.


>+nsresult
>+nsURIChecker::SetStatusAndCallBack(PRUint16 aStatus)
>+{
>+    mStatus = aStatus;
>+    mIsPending = PR_FALSE;
>+
>+    mObserver->OnStartRequest((nsIRequest*)this, 0);
>+    mObserver->OnStopRequest((nsIRequest*)this, 0, mStatus);
>+
>+    return NS_OK;
>+}

why is the argument to this function not a nsresult?
also, why not make the function return void?
and, it is better style to use NS_STATIC_CAST instead of C style casts.

this function has made me realize the following:
AsyncCheckURI should take a third argument... an nsISupports context parameter
similar to the context parameter of nsIChannel::AsyncOpen... this is an opaque
parameter that should be passed to the nsIRequestObserver::OnSt{art,op}Request
methods.  please add this parameter to AsyncCheckURI.



>+/////////////////////////////////////////////////////
>+// nsIURI methods
>+//

nsIURIChecker methods, right?


>+NS_IMETHODIMP
>+nsURIChecker::AsyncCheckURI(const char* aURI, nsIRequestObserver *aObserver,
>+                            nsIRequest** aRequestRet)
>+{
>+    nsresult rv;
>+
>+    mIsPending = PR_TRUE;
>+    mStatus = NS_BINDING_REDIRECTED;

initialize mStatus to NS_OK


>+    mObserver = aObserver;
>+    if (aRequestRet)
>+        *aRequestRet = this;

you need to AddRef *aRequestRet



>+    // Get the IO Service:
>+    if (!mIOS) {
>+        mIOS = do_GetService(kIOServiceCID, &rv);
>+        if (NS_FAILED(rv)) return rv;
>+    }
>+    if (!mIOS) return NS_ERROR_UNEXPECTED;

you don't need to cache the IO service as a member variable... there is no need...
you don't use it anywhere else.  just use a local nsCOMPtr, and use do_GetIOService
instead of do_GetService (see comments above).


>+    // Make the URI
>+    rv = mIOS->NewURI(aURI, 0, getter_AddRefs(mURI));
>+    if (NS_FAILED(rv)) return rv;
>+    else if (!mURI) return NS_ERROR_UNEXPECTED;

you do not need this extra level of error checking... just check for either
NS_FAILED(rv) or !mURI... but not both.

also, i don't think that you need to keep the URI as a member variable.


>+    // Make a new channel:
>+    rv = mIOS->NewChannelFromURI(mURI, getter_AddRefs(mChannel));
>+    if (NS_FAILED(rv)) return rv;
>+    if (!mChannel) return NS_ERROR_UNEXPECTED;

again, no need for the extra error checking... just use if (NS_FAILED(rv)) ...


>+NS_IMETHODIMP
>+nsURIChecker::GetName(PRUnichar** aName)
>+{
>+    char* nameStr;
>+    nsresult rv = mURI->GetSpec(&nameStr);
>+    if (NS_FAILED(rv)) return rv;
>+
>+    nsString name;
>+    name.AssignWithConversion(nameStr);
>+    Recycle(nameStr);
>+    *aName = ToNewUnicode(name);
>+    return *aName ? NS_OK : NS_ERROR_OUT_OF_MEMORY;
>+}

just call mChannel->GetName()




>+NS_IMETHODIMP nsURIChecker::Cancel(nsresult status)
>+{
>+    nsCOMPtr<nsIRequest> request = do_QueryInterface(mChannel);
>+    if (request)
>+        return request->Cancel(status);
>+    else return NS_ERROR_NOT_IMPLEMENTED;
>+}

this QueryInterface is not necessary... mChannel is a nsIRequest in the C++ sense.


>+NS_IMETHODIMP nsURIChecker::Suspend()
>+{
>+    nsCOMPtr<nsIRequest> request = do_QueryInterface(mChannel);
>+    if (request)
>+        return request->Suspend();
>+    else return NS_ERROR_NOT_IMPLEMENTED;
>+}

again, no QI needed.


>+NS_IMETHODIMP nsURIChecker::Resume()
>+{
>+    nsCOMPtr<nsIRequest> request = do_QueryInterface(mChannel);
>+    if (request)
>+        return request->Resume();
>+    else return NS_ERROR_NOT_IMPLEMENTED;
>+}

again, no QI needed.


>+NS_IMETHODIMP nsURIChecker::GetLoadGroup(nsILoadGroup **aLoadGroup)
>+{
>+    nsCOMPtr<nsIRequest> request = do_QueryInterface(mChannel);
>+    if (request)
>+        return request->GetLoadGroup(aLoadGroup);
>+    else return NS_ERROR_NOT_IMPLEMENTED;
>+}

again, no QI needed.


>+NS_IMETHODIMP nsURIChecker::SetLoadGroup(nsILoadGroup *aLoadGroup)
>+{
>+    nsCOMPtr<nsIRequest> request = do_QueryInterface(mChannel);
>+    if (request)
>+        return request->SetLoadGroup(aLoadGroup);
>+    else return NS_ERROR_NOT_IMPLEMENTED;
>+}

again, no QI needed.


>+NS_IMETHODIMP nsURIChecker::GetLoadFlags(nsLoadFlags *aLoadFlags)
>+{
>+    nsCOMPtr<nsIRequest> request = do_QueryInterface(mChannel);
>+    if (request)
>+        return request->GetLoadFlags(aLoadFlags);
>+    else return NS_ERROR_NOT_IMPLEMENTED;
>+}

again, no QI needed.


>+NS_IMETHODIMP nsURIChecker::SetLoadFlags(nsLoadFlags aLoadFlags)
>+{
>+    nsCOMPtr<nsIRequest> request = do_QueryInterface(mChannel);
>+    if (request)
>+        return request->SetLoadFlags(aLoadFlags);
>+    else return NS_ERROR_NOT_IMPLEMENTED;
>+}

again, no QI needed.


>+NS_IMETHODIMP
>+nsURIChecker::OnStartRequest(nsIRequest *aRequest, nsISupports *aCtxt)
>+{
>+    //printf("OnStartRequest\n");

please either use PR_LOG, wrap this line w/ #ifdef DEBUG_akkana, or delete it.


>+    // DNS errors and other obvious problems will return failure status
>+    nsresult status;
>+    nsresult rv = aRequest->GetStatus(&status);
>+    if (NS_FAILED(rv) || status != 0) {

do not test status against 0... you want NS_FAILED(status)



>+    // We don't want to read the actual data, so cancel now:
>+    aRequest->Cancel(NS_BINDING_ABORTED);

returning any error code from OnStartRequest will also cancel the request.



>+    // If status is zero, it might still be an error if it's http:
>+    // http has data even when there's an error like a 404.
>+    nsCOMPtr<nsIHttpChannel> httpChannel = do_QueryInterface(aRequest);
>+    if (!httpChannel) {
>+        return SetStatusAndCallBack(NS_BINDING_SUCCEEDED);
>+    }
>+    PRUint32 responseStatus;
>+    rv = httpChannel->GetResponseStatus(&responseStatus);
>+    if (NS_FAILED(rv)) {
>+        return SetStatusAndCallBack(NS_BINDING_FAILED);
>+    }
>+    // If it's below 100 or between 200-299, it's valid:
>+    if (responseStatus < 100 ||
>+        (responseStatus >= 200 && responseStatus < 300)) {
>+        return SetStatusAndCallBack(NS_BINDING_SUCCEEDED);
>+    }

there are no http response codes below 100... this might be a better check:

      if (responseStatus / 100 == 2) {
          // link is valid
      }


>+    // If we got a 404 (not found), we need some extra checking:
>+    // toplevel urls from Netscape Enterprise Server 3.6,
>+    // like http://www.mozilla.org, generate a 404
>+    // and will have to be retried without the head.
>+    if (responseStatus == 404)
>+    {
>+        char* server = 0;
>+        rv = httpChannel->GetResponseHeader("Server", &server);
>+        if (NS_SUCCEEDED(rv)) {
>+            if (!PL_strcmp(server, "Netscape-Enterprise/3.6")) {

would PL_strcasecmp be better?  just in case the value somehow gets lowercased.


>+                mStatus = NS_BINDING_REDIRECTED;

mStatus = NS_OK ???


>+                // Open a new channel for a real (not head) request:
>+                nsCOMPtr<nsIChannel> channel;

remember: this is an edge case.. there is no problem calling do_GetIOService
again for this edge case.  also, you can get the URI from mChannel->GetOriginalURI().


>+                rv = mIOS->NewChannelFromURI(mURI, getter_AddRefs(channel));
>+                if (NS_FAILED(rv)) return rv;
>+                if (!channel) return NS_ERROR_UNEXPECTED;

again, no need for the extra error checking.. if (NS_FAILED(rv)) should do.

>+                return channel->AsyncOpen(this, nsnull);
>+            }
>+        }
>+
>+        // Else it was a normal 404, so return as expected
>+        return SetStatusAndCallBack(NS_BINDING_FAILED);
>+    }
>+
>+    // If we get here, then it's an http channel, not a 100, 200 or 404.
>+    // Treat it as an error.
>+    mStatus = NS_BINDING_FAILED;

don't you want call SetStatusAndCallback in this case?  401 and 407 may require
special treatment since checking the link would require the user to enter their
username and password.



>+NS_IMETHODIMP
>+nsURIChecker::OnStopRequest(nsIRequest *request, nsISupports *ctxt,
>+                             nsresult statusCode)
>+{
>+    //printf("OnStopRequest\n");

please clean up this line as above.




>+
>+// OnDataAvailable shouldn't generally be called,
>+// since we use head requests whenever possible,
>+// but in practice we're seeing it every time.
>+NS_IMETHODIMP
>+nsURIChecker::OnDataAvailable(nsIRequest *aRequest, nsISupports *aCtxt,
>+                               nsIInputStream *aInput, PRUint32 aOffset,
>+                               PRUint32 aCount)
>+{
>+#ifdef DEBUG_akkana
>+    PRUnichar* uri;
>+    GetName(&uri);
>+    nsString uristr(uri);
>+    char* cstr = ToNewCString(uristr);
>+    printf("OnDataAvailable: %s\n", cstr);
>+    Recycle(cstr);
>+#endif
>+    // If we've gotten here, something went wrong with the previous cancel,
>+    // so try again:
>+    aRequest->Cancel(NS_BINDING_ABORTED);
>+    return NS_OK;
>+}

returning a failure code also cancels the request.



>Index: base/src/nsURIChecker.h

>+ * Contributor(s): 
       akkana ...


>Index: build/nsNetModule.cpp

looks good
Attachment #51744 - Attachment is obsolete: false
Attachment #51751 - Attachment is obsolete: true
Comment on attachment 51789 [details] [diff] [review]
Patch revised per Darin's suggestions

sr=darin w/ the 2 changes i mentioned over IRC

1- fix comment to remove less than 100

2- remove changes to nsIIOservice.idl
Attachment #51789 - Flags: superreview+
Fixed.
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: