Closed Bug 73845 Opened 24 years ago Closed 23 years ago

stdURL parsing is broken.

Categories

(Core :: Networking, defect, P1)

x86
Linux
defect

Tracking

()

VERIFIED FIXED
mozilla0.9.2

People

(Reporter: jud, Assigned: dougt)

References

()

Details

(Keywords: crash, Whiteboard: PDT+, critical for 0.9.2; ready for checkin)

Attachments

(3 files)

There are various crashers cropping up because parsing a url sometimes produces a null mScheme. The said URL points to the place where we can wind up w/ a null mScheme. The crashers manifest themselves in ::IsScheme() when we go to dereference mScheme.
Blocks: 64833
Keywords: mozilla0.9
Perhaps we should do the following for now? Index: nsStdURL.cpp =================================================================== RCS file: /cvsroot/mozilla/netwerk/base/src/nsStdURL.cpp,v retrieving revision 1.69 diff -u -r1.69 nsStdURL.cpp --- nsStdURL.cpp 2001/03/13 02:00:54 1.69 +++ nsStdURL.cpp 2001/03/28 22:24:11 @@ -347,6 +347,11 @@ nsresult rv = mURLParser->ParseAtScheme(i_Spec, &mScheme, &mUsername, &mPassword, &mHost, &mPort, &ePath); + NS_ASSERTION(mScheme, "we should have a scheme"); + if (!mScheme) { + CRTFREEIF(ePath); + return NS_ERROR_FAILURE; + } if (NS_SUCCEEDED(rv)) { // Now parse the path rv = mURLParser->ParseAtDirectory(ePath, &mDirectory, &mFileBaseName,
Am going to investigate this tonight. We could avoid the crash as jud suggests but I am keen to find out how we got here. If anyone has a reproducible case to lead us to this state I'd appreciate it.
rick: jud tells me that you may have a reproducible case for this?
I've forwarded gagan a mail msg that causes this crash.
Keywords: crash
Here's what Rick found during investigation. There's code that creates *raw* uri/url objects (nsSimpleURI, or nsStdURL) using the component manager. That code then turns around and does a uri->SetSpec("some uri string"); . Then it turns around and asks for the scheme, host, username, pwd, whatever. Point being that people are creating urls just to get string parsing functionality out of them. This is bad for several reasons: 1. if the string they pass in isn't recognized by our parser, we fail. 2. building up a url object for parsing is overkill. 3. it allows a URL to be created w/ out a scheme 4. the model circumvents our entirely flexible necko protocol handler model because it doesn't go through necko for uri creation. Suggested fixes: 1. hang general uri string parsing off of nsIIOService so all these parsing usages can use that. nsIOService's impl would *just* do *simple* specification based heuristic parsing on the string. and hand out strings to the caller (much like the old NET_ParseURL() did. 2. For those really needing a URL object, they would need to be fixed up to go through necko as originally intended.
Target Milestone: --- → mozilla1.0
due to high interest from the very important embedding customer I'm setting this to 0.9.1 (for renewed consideration)
Target Milestone: mozilla1.0 → mozilla0.9.1
Blocks: 75664
Whiteboard: needed by 05/08/01
the overall cleanup is going to be bigger than the scope of this bug. But I would go ahead and add this check proposed by jud to prevent crashing (for now) someone give me an sr on jud's patch?
we should sooo avoid my patch, it will sweep the problem under the carpet and we'll forget about the real solution and battle incarnations of this for years to come.
no you misunderstood my comments. Your patch is a temporary fix to get over the problem while me and rpotts are working on the good-clean fix which is possibly going to take longer than 0.9.1 deadline. I am totally in favor for fixing this the right way. Just that we may not be able to get that "right" way in time.
gagan, could we do the following? - keep this bug open - create a new bug just for the patch in the current bug - check-in the patch in both 0.9 branch and 0.9.1 trunk by 11/May/01 (after the proper procedures are observed) - close the newly created bug for the patch and keep the current bug open
dougt taking over URL parsing stuff...
Assignee: gagan → dougt
#73323. another bug where the caller had to be fixed. when this land, we need to make sure to remove all these temporary fixes.
Blocks: 73323
REPEAT OF QUESTIONS TO GAGAN: dougt, could we do the following? - keep this bug open - create a new bug just for the patch in the current bug - check-in the patch in both 0.9 branch and 0.9.1 trunk by 11/May/01 (after the proper procedures are observed) - close the newly created bug for the patch and keep the current bug open
No longer blocks: 64833, 75664
well: - keep this bug open until it is fixed, I will :-) - create a new bug just for the patch in the current bug I don't see the point. - check-in the patch in both 0.9 branch and 0.9.1 trunk by 11/May/01 (after the proper procedures are observed) we are past this date. - close the newly created bug for the patch and keep the current bug open I don't see the point. I am with Jud here. The patch is not something that we should do. people will be implementing against 0.9 and they may do the wrong thing here. If your customer want to patch their tree, have them do that, but lets not apply this patch to mozilla.
First, "hang[ing] general uri string parsing off of nsIIOService", is not really that easy. Parsing URI's is nontrival and I don't want to see it in two places. The way to do URL parsing is to create a URL Parser via the component manager. A problem is that there are 3 parsers to choose from: std, auth, and noauth. This is what creating a standard uri buys you - not having to deal with the complexities of parsing. If users are going to call SetSpec on a standard url that they created via the component manager, they must expect null's on the standard getters. Garbage in - Garbage out. Any simplified parser (as suggested) would not be expected to do more. I don't think that "building up a url object for parsing is overkill.". It is the only way to avoid "circumvent[ing] our entirely flexible necko protocol handler model". My suggestion is that clients call NS_NewURI() to ensure that they will support pluggable protocols. repeat contrapositive - if you don't want to support pluggable protocols, don't call NS_NewURI(). If clients have a hell-bent need for creating a standard uri outside of necko, remember to pass in qualified url string and program defensively. I am not sure that this should be officially supported in any embedding context. Just some thoughts.... comments?
Status: NEW → ASSIGNED
the problem is that we have several clients in the current tree who don't understand the plugable protocol architecture, and subsequently they don't expect nulls back. as a result, when you put our code in a plugged protocol environment (someone adding their own protocol handler) we fail in several places because people are trying to parse urls using the object from the component manager. the idea behind the url parsing off of nsIIOService was to just allow basic string parsing, nothing fancy, like the old Net_ParseURL() function. creating a full blown URI object is indeed overkill if all you're doing, over and over and over again is looking for the scheme. the point is that we have many places in the system where all people want is string parsing, and it's been suggested that we provide a centralized place for that, nsIIOService, so we get *some* degree of consistency. another option would be go to around to all those places and have them do the string parsing themselves (lots of room for error there). and another option, which you point out is that all callers could be "fixed" to use NS_NewURI() and deal w/ failures gracefully.
I think that I am arguing that all callers should either (a) be "fixed" to use NS_NewURI(), or (b) if their parsing is simple enough (like getting the scheme) they should do it themselves. I think that exposing some "simple url parser" on the IOService will just be plagued with bugs: "It does handle a, b, c", "I breaks this load", ect. I much rather have correct parsing even it if cost a bit more.
Blocks: 75664
Call sites (as of May 21) that must change: content\xbl\src\nsBindingManager.cpp(877) content\xbl\src\nsXBLService.cpp(1151) editor\base\nsHTMLDataTransfer.cpp(662) embedding\browser\powerplant\source\CBrowserApp.cp(416) embedding\components\ui\helperAppDlg\nsHelperAppDlg.js(422) extensions\inspector\resources\content\prefs\pref-sidebar.js(17) extensions\pipella\ui\content\protoCommon.js(435) extensions\python\xpcom\file.py(78) extensions\python\xpcom\file.py(155) extensions\wallet\signonviewer\SignonViewer.js(166) extensions\wallet\src\singsign.cpp(2044) extensions\wallet\src\singsign.cpp(2168) extensions\xml-rpc\src\nsXmlRpcClient.js(81) js\src\xpconnect\loader\mozJSComponentLoader.cpp(88) mailnews\base\src\nsMsgServiceProvider.cpp(107) mailnews\base\src\nsStatusBarBiffManager.cpp(115) mailnews\compose\resources\content\MsgComposeCommands.js(1937) mailnews\compose\src\nsMsgCompFields.cpp(130) modules\libimg\src\if.cpp(1749) netwerk\build\nsNetCID.h(71) netwerk\build\nsNetModule.cpp(635) rdf\chrome\src\nsChromeRegistry.cpp(213) rdf\chrome\src\nsChromeRegistry.cpp(1321) rdf\chrome\src\nsChromeRegistry.cpp(2514) rdf\chrome\src\nsChromeRegistry.cpp(2541) rdf\chrome\src\nsChromeRegistry.cpp(2837) rdf\chrome\src\nsChromeRegistry.cpp(3134) uriloader\exthandler\nsExternalHelperAppService.cpp(172) webshell\tests\viewer\samples\soundtest.html(8) widget\src\beos\nsFilePicker.cpp(236) widget\src\mac\nsFilePicker.cpp(627) widget\src\os2\nsFilePicker.cpp(229) widget\src\windows\nsClipboard.cpp(659) widget\src\windows\nsFilePicker.cpp(269) xpfe\browser\resources\content\metadata.js(414) xpfe\browser\resources\content\sessionHistoryUI.js(132) xpfe\browser\resources\content\sessionHistoryUI.js(135) xpfe\communicator\resources\content\contentAreaDD.js(275) xpfe\communicator\resources\content\nsContextMenu.js(676) xpfe\components\filepicker\res\content\filepicker.js(32) xpfe\components\filepicker\src\nsFilePicker.js(90) xpfe\components\prefwindow\resources\content\pref-applications-edit.xul(156) xpfe\components\prefwindow\resources\content\pref-applications-new.js(75) xpfe\components\prefwindow\resources\content\pref-applications.js(72) xpfe\components\search\src\nsInternetSearchService.cpp(1219) xpfe\components\sidebar\resources\customize.js(259) xpfe\components\sidebar\resources\sidebarOverlay.js(630) xpfe\components\sidebar\src\nsSidebar.js(44)
I am updating all callers sites that use SetSpec. I posted a question on the newsgroup regarding callers to SetFile. If you are interested, please comment.
Open Networking bugs, qa=tever -> qa to me.
QA Contact: tever → benc
has dougt ever given up on widesweeping changes that could hork the tree? ;-) lets take jud's short term crash protection scheme for 0.9.1/beta
LoL. The current patch is only 87 K this time and touches a mere 46 files. I am still working out the kinks, but it should be ready in a day or two.
Patch to convert callers is more risky then valeski's patch per PDT. Jud, can you check in your "hack". Once it is in, I can move this bug to 0.9.2.
once we cut a branch for 0.9.1, I'll put my patch on the branch, my patch is the wrong thing and I don't want it on the trunk.
Whiteboard: needed by 05/08/01 → workaround ready and will check into branch
workaround bug number = http://bugzilla.mozilla.org/show_bug.cgi?id=82385 Moving this bug out a milestone.
Keywords: mozilla0.9
Whiteboard: workaround ready and will check into branch
Target Milestone: mozilla0.9.1 → mozilla0.9.2
I don't think you need a full blown uri/url-object to get url-parsing, you only need the urlparser-object itself. And, if I remember correctly, the three available parsers are kind of static (global instances) with necko for performance reasons. To get urlparsing you can just use one of these "preloaded" parsers.
Andreas, that is what the nsStdURL does for you. It determines which preloaded parser you should get. The problem with going directly to one of these parsers outside of NS_NewURI is that addon protocol schemes will be ignored. There needs to be one bottlenecko for all URI creation
Doug, Currently, nsStdURL does alot more than simple URL parsing :-( It is MUCH more expensive to create a URI object than it is to simply parse the string. Maybe the "right" thing to do is to make nsStdURL *much less expensive*... But unless we do, I don't think that we can wash our hands of the problem and say "use NS_NewURI(...) or write your own parser..." I think that the WORST situation we can end up in is if consumers ACTUALLY START DOING THEIR OWN URL PARSING!! This would be a HUGE disaster and would ensure that pluggable protocols never worked :-( I think that we need to evaluate whether we can make URI object creation cheap enough that we can create millions of them for the sole purpose of string parsing... If we cannot, then I think that we must expose a consistant means of parsing URIs at the Necko level that is effecient...
I don't think that you can make url parsing less expensive while preserving plugablity.
Doug, why not use nsStdURLParser as default parser? I think it is currently done this way. If you don't wish for a special parser (bound to the scheme) you get the StDURLParser. It is designed to work for all protocols that are not simple. There are certain corner cases which make the other parsers necessary, but StdURLParser usually does the job if the urlsyntax fits the usual scheme. I think we can offer an api to access the StdURLParser without creating an URI-object.
Well, under-the-hood this is what we are going to basically do. There is some additional work to make this work pluggable protocols/uri parsers (a mapping of scheme to urlparser). The jist is to add a function on the IO service that would take a scheme (or string containing a scheme), and return a nsIURLParser. This *could* be QI'ed for a protocol specific uri parser if needed. Another function would be added which would take a uri string and part key, and would return you the request string part of of the URI similar to what netlib did. I will attach some IDL's early next week for comment.
Priority: -- → P3
sorry it took so long to get back to this bug, but now that 0.9.1 is done, I can focus on this for a while. The last attachment adds the two new interface that rick and I are proposing.
just out of curiosity: why isn't NS_IURLPARSER_KEY NS_URLPARSER_CONTRACTID? and shouldn't it be defined as "@mozilla.org/network/urlparser;1" instead of "@mozilla.org/urlparser;1"?
Attached patch patch v.1Splinter Review
The last patch changes all callers to use either the uri parsing api on the ioservice or to create a new URI via the NS_NewURI method. I know that I am leaking a URI in nsGlobalHistory::RemovePagesFromHost. Alec, can you help me out here? Is this what you meant by caching a URI? Can we just make this a nsCOMPtr and forget about it? Alec, darin, rick, please review my patch... Rick, that bug we were chasing existed on the trunk! D'oh.
Comments on the patch and API: for the sake of those of us who write javascript, can we please change this: + void extractUrlPart(in string urlString, + in short mask, + out unsigned long startPos, + out unsigned long endPos, + out string urlPart); to + string extractUrlPart(in string urlString, + in short mask, + out unsigned long startPos, + out unsigned long endPos); ---- GetParserForScheme hits the registry EVERY time - seems like we should be caching this in memory. Also, the entire function is enclosed in an "if (scheme) {..." why not just start the function with if (!scheme) return nsServiceManager::GetService(kStdURLParserCID, NS_GET_IID(nsIURLParser), (nsISupports **)_retval); I also notice that literally every single call to extractUrlPart() passes in null (and in fact, in SignonViewer, you're passing in 0, not null, which is wrong) - let's not create extra API parameters if nobody needs them.. (yes, I agree it would save allocations, etc, but I don't see any of the callers you list begging for it) Finally, I notice that in the API you refer to the first parameter as a "mask" which would imply that you could pass in a composition of values (i.e. url_Host | url_Username).. when in fact the API allows you to specify exactly one of these values. You need a new name, and you need to specify that in nsIIOService.idl...
D'Oh! Wrong patch! I think that I did mention to you yesterday that I was cacheing the stuff in GetParserForScheme. I will post a new patch over the weekend/monday. Also, I will change the api and JS callers to use the return value as the outstring. I will also change the parameter name from "mask" to "flag" to avoid confusion. However, I will not change the out vars for the start and end position. These all pass null now since under the hood we do a terrible job "guessing" what these values should be. Once the URI Parser is rewritten to use scc's string classes, these parameter we be more useful. I guess what you may be argueing for, or what I think I hear, is that you want an api that avoids extra parms. If this is the case, we could split this into two methods if you think that would be cleaner.
well, the way I look at it is if we're really bent on saving the allocations, we should be using ACString rather than "string" - that allows us to pass in a nsAWritableCString, which will avoid the allocations (not the copying, but I argue that the allocation is the bigger issue) - I'd like to here scc's advice on that..
I don't think nsACString is currently exposed to IDL, even though nsAString is. I'd like to see it exposed, because we would use it in the LDAP XPCOM SDK as well, if we could. scc?
It is all kind of a mute point since the uri parsers are not updated to scott strings....
Alec, this next patch will address most of your concerns. However, the one that I do not address is the return value being the string. I did not change this for two reasons: the first is that |extractScheme| already uses this syntax. Secondly, we are needing to land this in the next two days or a ton of stuff will fall off my plate. I know, that the best reasons/excuses, but I will file another bug to switch both extractScheme and extractUrlPart to the more use friendly forms sometime after 0.9.2. This next patch is only for mozilla/netwerk... the rest of the stuff stays the same. Alec, Darin can I please have a review?
looks good (r=darin) comments: - make sure you don't transfer your branch client.mk: MOZ_CO_TAG = DOUGT_URI_PARSER_06122001_BRANCH - from JS, you use 1<<1 instead of the actual urlpart flag... why not use ioService.url_Username? extensions/wallet/signonviewer/SignonViewer.js - the indentation seems off in: mailnews/base/src/nsMessengerMigrator.cpp netwerk/base/public/nsIIOService.idl - it seems unfortunate that in order to implement ExtractUrlPart, we have to parse the entire urlString.
thanks for reviewing this patch. It is long and boring.... - make sure you don't transfer your branch client.mk: MOZ_CO_TAG = DOUGT_URI_PARSER_06122001_BRANCH Doesn't everyone want to be on my branch?! :-) - from JS, you use 1<<1 instead of the actual urlpart flag... why not use ioService.url_Username? extensions/wallet/signonviewer/SignonViewer.js fixed. - the indentation seems off in: mailnews/base/src/nsMessengerMigrator.cpp netwerk/base/public/nsIIOService.idl ^M's will be pounded out... - it seems unfortunate that in order to implement ExtractUrlPart, we have to parse the entire urlString. Yeah, we need to optimized this later...
Priority: P3 → P1
I'll do a fuller review of this in a bit (just got back from vacation) but I really have to object over the IDL syntax of this change! When you IDLized nsIAtom, I asked for a similar change (re: ToString() and friends), and you said "I'll get to it later" - and here we are years later and no change. I write a lot of javascript for this product, and I /know/ I'll be using this down the road. It is a massive pain to have to use "out" values from JavaScript - and all told, there are only 3 changes you have to make to support this change, all JavaScript, because the C++ signature will not change in the slightest. Just search your patch for ".extractUrlPart" and you'll see the 3 places I'm talking about.
okay. I will change it. How is the rest of the patch?
You can't pass NULL for an XPCOM out param. End of story. Please pass pointers to dummy PRInt32s instead (they could both be the same one if you want to save the four bytes of stack). Similarly, you can't avoid creating the objects in JS. ioService.extractScheme(something, {}, {}, fourth); The {} is a clear way of showing that you're going to discard it, though you could make it a hair cheaper by doing: var unused = { }; ioService.extractScheme(something, unused, unused, fourth); The JS bit is a (large) style nit, but passing NULL as an out parameter is strictly verboten. Please do not check in this patch until you fix that.
is it verboten per XPConnect or JS? It seems silly to make that verboten - there are plenty of places where we have optional out parameters... the rest of the patch looks great though
The last patch should address the js out parameters. shaver, alec, can you review this please?
ack, unfortunately, your conversion from the out param to the return value isn't quite right ..example: try { - url = ioService.extractUrlPart(host, ioService.url_Username, 0, 0, username); // TODO verify! + username = ioService.extractUrlPart(host, ioService.url_Username, unused, unused); // TODO verify! } catch(e) {} if (username.value) { user = username.value; here, "username" is the actual value you're using, and "username.value" will have no value...the other two cases suffer the same ills..
yeah. I see. something like this would work right: + try { + username = ioService.extractUrlPart(host, ioService.url_Username, unused, unused); // TODO verify! + } catch(e) { + username = ""; + } + + if (username != "") { + user = username; } else { user = "<>"; } Also, I don't see the problem with the calls in sessionHistoryUI.js.
ah, I was reading that part wrong.. ok, with all of that, sr=alecf
great! alec, thank you for help.
Marking PDT+
Whiteboard: PDT+
a=blizzard on behalf of drivers for 0.9.2
Whiteboard: PDT+ → PDT+, critical for 0.9.2
I have a crasher bug 86800, crash because nsStdURL::SetSpec() does not do a null check on the parameter. I noticed a lot of stuff in that class does not have null checks. The patch here does not seem to add those null checks (but it might still fix my crasher, haven't checked yet). Anyway, do we really want to crash every time somebody calls these methods wrong? Wouldn't it be better to put some NS_ENSURE_ARG_POINTER magic in the file, so we would assert in debug builds and not crash and burn in optimized?
Heikki, the crashes were caused by a misuse of the nsStdURL implementation. I have fixed all the call sites. Read my post in the netlib newsgroup for more information.
Whiteboard: PDT+, critical for 0.9.2 → PDT+, critical for 0.9.2; ready for checkin
*** Bug 87131 has been marked as a duplicate of this bug. ***
fixed checked in.
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
+verifyme Is there a place I can verify these fixes, or are the qa people for the components that were sending null schemes going to need to check this?
Keywords: verifyme
The PDT list is shrinking, and I'd like to close this. Whats the steps to verify this?
verify that the patches were checked in?
VERIFYING, per bbaetz.
Status: RESOLVED → VERIFIED
Keywords: verifyme
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: