Closed Bug 161352 Opened 22 years ago Closed 22 years ago

URL: can't load local file with semicolon (;) in name

Categories

(Core :: Networking: File, defect)

defect
Not set
normal

Tracking

()

VERIFIED FIXED

People

(Reporter: mni, Assigned: neeti)

References

(Depends on 1 open bug)

Details

(Keywords: testcase)

Attachments

(4 files, 1 obsolete file)

From Bugzilla Helper: User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; pl-PL; rv:1.1b) Gecko/20020721 BuildID: 2002072104 Any filename on local disk, which contains semicolon ";" can't be loaded. The error message says that this file cannot be found. Example of filename: Title;dcopt.html I'm using Windows XP Prof English, Mozilla 1.1b. This error happens since I remember - mozilla version 0.9.8, english or localized. Reproducible: Always Steps to Reproduce: 1. Create file on local disk with name: Title;dcopt.html 2. Load it in Mozilla thru Open File (ctrl-O) Actual Results: Error message about file not to be found. There are similar reported, but concerning high-ascii codes or national characters (japanese), but no one reported problems with just semicolon.
Confirmed on linux too (today's trunk).
Assignee: law → dougt
Status: UNCONFIRMED → NEW
Component: File Handling → Networking: File
Ever confirmed: true
OS: Windows XP → All
QA Contact: sairuh → benc
Hardware: PC → All
andreas, ideas?
A quick look at RFC 1630's BNF shows this is "reserved": reserved = | ; | / | # | ? | : | space
Summary: Mozilla can't load local file with semicolon (;) in name → URL: can't load local file with semicolon (;) in name
Andreas: so in this case, shouldn't we escape it when making the URL? I'm assuming this should work in the right situations.
Yes, it yhould be escaped when creating the url from the local filename. I find it strange that it does not already happen. We escape a ; everywhere in the url except in directory, param query and ref. Escaping on this case should have happen with the filename mask.
Yes, and that is okay this way in my opinion. URLs have to be parsed this way. The trick is to escaped the ; in the filename while creating the url from the local filepath.
+nsbeta1 - these files are rare, but this seems to be a standards problem. UPDATE: mozilla 1.1b - Mac OS X I found a chart: http://i-technica.com/whitestuff/urlencodechart.html It said that the encode value was %3b. I tried to use that encoding in a file URL: file:///Macintosh%20HD/fileURLs/testof%3B.txt it works. And if you open the file via the XUL dir view of the directory: file:///Macintosh%20HD/fileURLs/ it does properly escape.
Keywords: nsbeta1
me
Assignee: dougt → neeti
Attached patch proposed patchSplinter Review
andreas: the above patch loads the file. Is this the correct way to do it?
I can't say I like the hardcoded replace. Instead why not change the Escape-Mask from esc_Directory to esc_FileBaseName as asked for in comment #5 ? That should do the trick.
andreas: I have tried changing the Escape-Mask from esc_Directory to esc_FileBaseName Index: nsURLHelperWin.cpp =================================================================== RCS file: /cvsroot/mozilla/netwerk/base/src/nsURLHelperWin.cpp,v retrieving revision 1.1 diff -u -r1.1 nsURLHelperWin.cpp --- nsURLHelperWin.cpp 13 Sep 2002 19:32:30 -0000 1.1 +++ nsURLHelperWin.cpp 3 Oct 2002 17:16:25 -0000 @@ -69,7 +69,7 @@ NS_NAMED_LITERAL_CSTRING(prefix, "file:///"); // Escape the path with the directory mask - if (NS_EscapeURL(ePath.get(), ePath.Length(), esc_Directory+esc_Forced, escPath)) + if (NS_EscapeURL(ePath.get(), ePath.Length(), esc_FileBaseName|esc_AlwaysCopy, escPath)) escPath.Insert(prefix, 0); else escPath.Assign(prefix + ePath); I get a lot of assertions and then it hangs with this change. First, we fail to get the base url in nsChromeRegistry::ConvertChromeURL(nsChromeRegistry *....) in mozilla/rdf/chrome/src/nsChromeRegistry.cpp, rv = GetBaseURL(package, provider, finalURL); when package is "communicator", and provider is "content" These are the assertions and warnings I get before it hangs Type Manifest File: D:\build\mozilla\dist\bin\components\xpti.dat +++ JavaScript debugging hooks installed. nsNativeComponentLoader: autoregistering begins. *** Registering necko_core_and_primary_protocols components (all right -- a gene ric module!) nsNativeComponentLoader: autoregistering succeeded nNCL: registering deferred (0) WARNING: dependent window created without a parent, file d:/build/mozilla/xpfe/b ootstrap/nsWindowCreator.cpp, line 148 WEBSHELL+ = 1 For application/x-java-vm found plugin C:\Program Files\JavaSoft\JRE\1.3.0_01\bi n\NPOJI600.dll WARNING: chrome: failed to get base url for chrome://communicator/content/profil e/profileSelection.xul -- using wacky default, file d:/build/mozilla/rdf/chrome/ src/nsChromeRegistry.cpp, line 562 WARNING: NS_ENSURE_TRUE(NS_SUCCEEDED(rv)) failed, file d:/build/mozilla/docshell /base/nsDocShell.cpp, line 2521 WARNING: NS_ENSURE_TRUE(window) failed, file d:/build/mozilla/xpfe/appshell/src/ nsContentTreeOwner.cpp, line 635 WARNING: NS_ENSURE_TRUE(docShellElement) failed, file d:/build/mozilla/xpfe/apps hell/src/nsXULWindow.cpp, line 1120 WARNING: NS_ENSURE_TRUE(windowElement) failed, file d:/build/mozilla/xpfe/appshe ll/src/nsXULWindow.cpp, line 1174 ###!!! ASSERTION: no xul:window: 'windowElement', file d:/build/mozilla/xpfe/app shell/src/nsXULWindow.cpp, line 935 ###!!! ASSERTION: no xul:window: 'windowElement', file d:/build/mozilla/xpfe/app shell/src/nsXULWindow.cpp, line 867 ###!!! ASSERTION: no xul:window: 'windowElement', file d:/build/mozilla/xpfe/app shell/src/nsXULWindow.cpp, line 989 WARNING: no preshell for window, file d:/build/mozilla/dom/src/base/nsGlobalWind ow.cpp, line 3915 WARNING: NS_ENSURE_TRUE(presShell) failed, file d:/build/mozilla/dom/src/base/ns GlobalWindow.cpp, line 3952 WEBSHELL- = 0 WARNING: CSSLoaderImpl::LoadAgentSheet: Load of URL 'chrome://global/skin/scroll bars.css' failed. Error code: 18, file d:/build/mozilla/content/html/style/src/ nsCSSLoader.cpp, line 1931 WARNING: CSSLoaderImpl::LoadAgentSheet: Load of URL 'resource:///res/forms.css' failed. Error code: 18, file d:/build/mozilla/content/html/style/src/nsCSSLoade r.cpp, line 1931 WEBSHELL+ = 1 WEBSHELL+ = 2 D:\build\mozilla\dist\bin\chrome\inspector.jar mtime changed, invalidating FastL oad file WARNING: chrome: failed to get base url for chrome://navigator/content/navigator .xul -- using wacky default, file d:/build/mozilla/rdf/chrome/src/nsChromeRegist ry.cpp, line 562 WARNING: CSSLoaderImpl::LoadAgentSheet: Load of URL 'resource:///res/ua.css' fai led. Error code: 18, file d:/build/mozilla/content/html/style/src/nsCSSLoader.c pp, line 1931 *** open of resource:/res/ua.css failed: error=80520012 ###!!! ASSERTION: unexpected null pointer: 'aUAStyleSheet', file d:/build/mozill a/content/base/src/nsDocumentViewer.cpp, line 1272 WARNING: unable to load UA style sheet, file d:/build/mozilla/content/base/src/n sDocumentViewer.cpp, line 1663 WARNING: CSSLoaderImpl::LoadAgentSheet: Load of URL 'chrome://global/skin/scroll bars.css' failed. Error code: 18, file d:/build/mozilla/content/html/style/src/ nsCSSLoader.cpp, line 1931 WARNING: CSSLoaderImpl::LoadAgentSheet: Load of URL 'resource:///res/forms.css' failed. Error code: 18, file d:/build/mozilla/content/html/style/src/nsCSSLoade r.cpp, line 1931 ###!!! ASSERTION: no quirk stylesheet: 'mQuirkStyleSheet', file d:/build/mozilla /content/base/src/nsStyleSet.cpp, line 867 Note: verifyreflow is disabled ###!!! ASSERTION: no quirk stylesheet: 'mQuirkStyleSheet', file d:/build/mozilla /content/base/src/nsStyleSet.cpp, line 867 Note: styleverifytree is disabled Note: frameverifytree is disabled
We are returning a NO_RDF_VALUE for rv = ds->GetTarget(aSource, aProperty, aTruthValue, aResult) in CompositeDataSourceImpl::GetTarget(...) CompositeDataSourceImpl::GetTarget(CompositeDataSourceImpl * const 0x01415d48, nsIRDFResource * 0x0193f8f8, nsIRDFResource * 0x014177b8, int 1, nsIRDFNode * * 0x0012e9a8) line 832 nsChromeRegistry::FollowArc(nsIRDFDataSource * 0x01415d48, nsACString & {...}, nsIRDFResource * 0x0193f8f8, nsIRDFResource * 0x014177b8) line 1220 + 46 bytes nsChromeRegistry::GetBaseURL(const nsACString & {...}, const nsACString & {...}, nsACString & {...}) line 670 + 48 bytes nsChromeRegistry::ConvertChromeURL(nsChromeRegistry * const 0x0144f5c8, nsIURI * 0x01437fa8, nsACString & {...}) line 551 + 26 bytes nsChromeProtocolHandler::NewChannel(nsChromeProtocolHandler * const 0x01437eb0, nsIURI * 0x01437fa8, nsIChannel * * 0x0012f0ac) line 640 + 40 bytes nsIOService::NewChannelFromURI(nsIOService * const 0x01440838, nsIURI * 0x01437fa8, nsIChannel * * 0x0012f0ac) line 511 + 40 bytes NS_NewChannel(nsIChannel * * 0x0012f1a4, nsIURI * 0x01437fa8, nsIIOService * 0x01440838, nsILoadGroup * 0x0145cb78, nsIInterfaceRequestor * 0x014395b0, unsigned int 524288) line 164 + 20 bytes nsDocShell::DoURILoad(nsIURI * 0x01437fa8, nsIURI * 0x00000000, nsISupports * 0x00000000, nsIInputStream * 0x00000000, nsIInputStream * 0x00000000, int 1, nsIDocShell * * 0x00000000, nsIRequest * * 0x00000000) line 5102 + 91 bytes nsDocShell::InternalLoad(nsDocShell * const 0x01439588, nsIURI * 0x01437fa8, nsIURI * 0x00000000, nsISupports * 0x00000000, int 1, const unsigned short * 0x01930bd0, nsIInputStream * 0x00000000, nsIInputStream * 0x00000000, unsigned int 1, nsISHEntry * 0x00000000, int 1, nsIDocShell * * 0x00000000, nsIRequest * * 0x00000000) line 5020 + 51 bytes nsDocShell::LoadURI(nsDocShell * const 0x01439588, nsIURI * 0x01437fa8, nsIDocShellLoadInfo * 0x0193f580, unsigned int 0, int 1) line 720 + 73 bytes nsWindowWatcher::OpenWindowJS(nsWindowWatcher * const 0x0191de24, nsIDOMWindow * 0x00000000, const char * 0x014404c0, const char * 0x0218da2c, const char * 0x0218d52c, int 1, unsigned int 1, long * 0x0190f2d0, nsIDOMWindow * * 0x0012fa44) line 771 nsWindowWatcher::OpenWindow(nsWindowWatcher * const 0x0191de20, nsIDOMWindow * 0x00000000, const char * 0x014404c0, const char * 0x0218da2c, const char * 0x0218d52c, nsISupports * 0x01440590, nsIDOMWindow * * 0x0012fa44) line 459 + 48 bytes nsProfile::LoadDefaultProfileDir(nsCString & {...}, int 1) line 580 + 94 bytes nsProfile::StartupWithArgs(nsProfile * const 0x01425cc8, nsICmdLineService * 0x01445c10, int 1) line 412 + 16 bytes nsAppShellService::DoProfileStartup(nsAppShellService * const 0x0145bdd8, nsICmdLineService * 0x01445c10, int 1) line 266 + 31 bytes InitializeProfileService(nsICmdLineService * 0x01445c10) line 1171 + 31 bytes main1(int 1, char * * 0x00284870, nsISupports * 0x00276f08) line 1450 + 14 bytes main(int 1, char * * 0x00284870) line 1883 + 37 bytes mainCRTStartup() line 338 + 17 bytes KERNEL32! 77e97d08()
My failure. I now think the best approach is to mask *each* segment of the path individually with the esc_FileBaseName mask. That should do it. Simply replacing the mask will get all the / escaped with the results seen above.
andreas: when you say we should mask *each* segment of the path individually with the esc_FileBaseName mask, is *each* segment, is the portion of the path delimited by the "/" ? thanks, Neeti
Yes, but I would do that with the native path, before changing \ to / for example or : to / on mac. In fact that change can happen while going through the path and escaping each segment.
Keywords: testcase
andreas:could you take a look at this patch
win, mac and os2 need to be tested by build buddies ...
Attachment #101732 - Attachment is obsolete: true
- 1023,1023,1023,1023,1023,1023,1023,1023,1023,1023, 912, 912, 0,1008, 0, 768, /* 3x 0123456789:;<=>? */ + 1023,1023,1023,1023,1023,1023,1023,1023,1023,1023,1008, 912, 0,1008, 0, 768, /* 3x 0123456789:;<=>? */ Here is andreas's explanation of how : will not be escaped when being part of the filename. This is a bit code, bit set says character is allowed in that part of the url without escaping. bite meaning: 1: scheme 2: user 4: password 8: host 16: directory 32: filebasename 64: fileextension 128: param 256: query 512: ref So 1023 says the character is allowed in every part of the url (no need for escape). In case of the : which had previously 912 and now 1008 this means: 912 = 512 + 256 + 128 + 16 = ref + query + param + directory 1008 = 512 + 256 + 128 + 64 + 32 + 16 = ref + query + param + fileextension + filebase + directory. Now the : will not be escaped when being part of the filename = filebasename + fileextension.
Comment on attachment 101804 [details] [diff] [review] patch to allow filenames with ; and in the process do not escape : in filenames this looks terribly expensive. why don't we come up with a new mask or modify the meaning of esc_Directory instead? we shouldn't need to modify the code in nsURLHelperXXX.cpp.
Attachment #101804 - Flags: needs-work+
We can't modify esc_Directory mask because we have to allow ; in the directory component as part of the URL spezification (param) ... those parts after the ; are not part of the directory name. Because of this while constructing a file url from a native path any ; has to be escaped. The only solution I see would be a new mask but those do not fit well with the current schema because it is oriented on parts of the url and not on what kind of url (file for example) is constructed.
but we don't support ';' inside the directory path. we only support it following the filename. i know RFC 2396 allows params within each path segment, but mozilla does not support that part of the RFC. just look at nsIURL::param. it only provides access to a param found between the filename and the query/ref. i just don't see the point of writing slow code in order to support something that is non-existent in practice. our frozen embedding APIs do not expose param between path segments, so should the implementation of those interfaces support params between path segments? it makes no sense at this point in time.
Okay, but we can't escape ; in esc_Directory because if we do not support it in mozilla we should at least not change the semantics of the url by escaping it and for example leave it to a http server to do the right thing and serve the correct page. That is what we do. Not escaping ; in the directory part is there for a good reason. That should not change. The problem here is that escaping the *whole* path with esc_directory does not cut it because that path contains the filename and when the filename contains a ; it will not get escaped and therefore part of the filename will be parsed as param and cut from the filename ... file not found ... Maybe we can instead just cut off the filename from the native path, escape it with the esc_FileBaseName mask and mask the directory part with esc_Directory and then put both together. That should be faster then going through the path step by step.
yeah, you make a good point. if some server is using params, we shouldn't munge them. so what if we just do a ReplaceSubstring(";", "%3b") on the result returned from NS_EscapeURL(esc_Directory)?
I can't say I like that programming style. It was also Neetis first idea. But what if masks esc_directory and esc_FileBaseName change in the future? Who will remember that have to do here more replaces?
but we just got done enumerating reasons why the escaping rules for esc_Directory and friends cannot change. anyways, ';' is special... i wouldn't advocate repeating this pattern if another character needs to be escaped. if that ever happens, we should write a single pass loop. bottom line: this is very performance critical code. the conversion routines between nsIFile and file:// are called a lot, especially at startup. neeti's original patch has the advantage of being very performant. we don't have to touch the heap allocator at all unless there is a ';' in the path. seems like the ideal situation IMO. i'd just like to see some documentation in the original patch.
They can not change regarding ; but otherwise ... it has happened in the past. But I agree with you, this point is very performance critical.
tested this on linux and windows
this patch compiles fine on OS/2 and does fix the problem.
Comment on attachment 102594 [details] [diff] [review] revised original patch with updated comments i think the comment could be simplified down to: // esc_Directory does not escape the semicolons, so if a filename // contains semicolons we need to manually escape them. should be apparent from the ReplaceSubstring call that ';' maps to %3b :) sr=darin with or without this change
Attachment #102594 - Flags: superreview+
Attachment #102801 - Flags: review+
Comment on attachment 102801 [details] [diff] [review] revised comments on the patch r=dougt
Comment on attachment 102801 [details] [diff] [review] revised comments on the patch sr=darin
Attachment #102801 - Flags: superreview+
Comment on attachment 102801 [details] [diff] [review] revised comments on the patch a=asa for checkin to 1.2b (on behalf of drivers)
Attachment #102801 - Flags: approval+
checked in fix
Status: NEW → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
VERIFIED: Mozilla 1.3a, allplats. Mariusz, I'd like to get your confirmation this works for you now. I was able to get these files to open in the file picker and via HTML view as well. Since this only happened via "open file", can someone explain why it worked in HTML view and/or XUL tree? Does file picker have it's own entry point w/ different escaping?
Status: RESOLVED → VERIFIED
Depends on: 473280
Comment on attachment 102594 [details] [diff] [review] revised original patch with updated comments >? diff1.txt >? neckobase_s.lib >? neckobase_s.pdb >Index: nsURLHelperMac.cpp >=================================================================== >RCS file: /cvsroot/mozilla/netwerk/base/src/nsURLHelperMac.cpp,v >retrieving revision 1.1 >diff -u -r1.1 nsURLHelperMac.cpp >--- nsURLHelperMac.cpp 13 Sep 2002 19:32:30 -0000 1.1 >+++ nsURLHelperMac.cpp 11 Oct 2002 17:04:15 -0000 >@@ -89,6 +89,11 @@ > // need encoding; use %2F which is a forward slash > escPath.ReplaceSubstring(":", "%2F"); > >+ // esc_Directory does not escape the semicolons but if a filename >+ // contains semicolons we need to encode it; use %3b which is a >+ // semicolon >+ escPath.ReplaceSubstring(";", "%3b"); >+ > escPath.Insert(prefix, 0); > > // XXX this should be unnecessary >Index: nsURLHelperOS2.cpp >=================================================================== >RCS file: /cvsroot/mozilla/netwerk/base/src/nsURLHelperOS2.cpp,v >retrieving revision 1.2 >diff -u -r1.2 nsURLHel
Flags: needinfo?(MAWAVANSMOOKEY)
Comment on attachment 102594 [details] [diff] [review] revised original patch with updated comments >? diff1.txt >? neckobase_s.lib >? neckobase_s.pdb >Index: nsURLHelperMac.cpp >=================================================================== >RCS file: /cvsroot/mozilla/netwerk/base/src/nsURLHelperMac.cpp,v >retrieving revision 1.1 >diff -u -r1.1 nsURLHelperMac.cpp >--- nsURLHelperMac.cpp 13 Sep 2002 19:32:30 -0000 1.1 >+++ nsURLHelperMac.cpp 11 Oct 2002 17:04:15 -0000 >@@ -89,6 +89,11 @@ > // need encoding; use %2F which is a forward slash > escPath.ReplaceSubstring(":", "%2F"); > >+ // esc_Directory does not escape the semicolons but if a filename >+ // contains semicolons we need to encode it; use %3b which is a >+ // semicolon >+ escPath.ReplaceSubstring(";", "%3b"); >+ > escPath.Insert(prefix, 0); > > // XXX this should be unnecessary >Index: nsURLHelperOS2.cpp >=================================================================== >RCS file: /cvsroot/mozilla/netwerk/base/src/nsURLHelperOS2.cpp,v >retrieving revision 1.2 >diff -u -r1.2 nsURLHel
Flags: needinfo?(MAWAVANSMOOKEY)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: