Closed Bug 954542 Opened 10 years ago Closed 10 years ago

Names of folder and files are not checked against forbidden names.

Categories

(Instantbird Graveyard :: Other, defect)

defect
Not set
minor

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: FeuerFliege, Assigned: FeuerFliege)

Details

Attachments

(1 file, 9 obsolete files)

*** Original post on bio 1108 at 2011-10-21 18:35:00 UTC ***

Names of folder and files are not checked against forbidden names.

I created a twitter account for testing purpose and named it LPT8. For this account no history is written, because Windows doesn't allow this name.

For more infos:
https://secure.wikimedia.org/wikipedia/en/wiki/Filename#Reserved_characters_and_words
*** Original post on bio 1108 at 2012-05-04 12:41:42 UTC ***

I also ran into this bug when sending myself messages from an IRC nick called "Mic|web", which contains the invalid pipe character.
Attached patch first approach (obsolete) — Splinter Review
*** Original post on bio 1108 as attmnt 1488 at 2012-05-19 19:41:00 UTC ***

My first patch attempt ever.

It should take care off reserved characters, reserved device names and strings ending with period.

I would appreciate any feedback.
Attachment #8353241 - Flags: feedback?
Assignee: nobody → bug
Status: NEW → ASSIGNED
*** Original post on bio 1108 at 2012-05-20 03:23:07 UTC ***

(In reply to comment #2)
> Created attachment 8353241 [details] [diff] [review] (bio-attmnt 1488) [details]
> first approach
> 
> My first patch attempt ever.
> 
> It should take care off reserved characters, reserved device names and strings
> ending with period.
> 
> I would appreciate any feedback.

So my first thought is: do we want to do this for all OSes or just for Windows? (All OSes would let you transfer logs seamlessly, which is desirable. But I figur it's worth asking the question.)

Anyway, it seems like there's a lot of duplicated code. I think you could have an array and then do a <some array>.indexOf(aName.toUpperCase) != -1 instead.

This also definitely needs some comments about what each part is checking for.

Also, I know it's really silly, but if you're in AUX, it'll become AUX_. Which is fine, but what happens if you're /actually/ in AUX_? It seems these two chats would get logged together.
Comment on attachment 8353241 [details] [diff] [review]
first approach

*** Original change on bio 1108 attmnt 1488 at 2012-05-20 03:26:12 UTC was without comment, so any subsequent comment numbers will be shifted ***
Attachment #8353241 - Flags: feedback? → feedback+
Attached patch better version of first patch (obsolete) — Splinter Review
*** Original post on bio 1108 as attmnt 1501 at 2012-05-21 15:19:00 UTC ***

added comments and array
Comment on attachment 8353241 [details] [diff] [review]
first approach

*** Original change on bio 1108 attmnt 1488 at 2012-05-21 15:19:42 UTC was without comment, so any subsequent comment numbers will be shifted ***
Attachment #8353241 - Attachment is obsolete: true
Attached patch alternative approach (obsolete) — Splinter Review
*** Original post on bio 1108 as attmnt 1502 at 2012-05-21 15:21:00 UTC ***

(In reply to comment #3)
> So my first thought is: do we want to do this for all OSes or just for Windows?

Windows has the most restrictive rules. Linux/UNIX allows everything but "/" although it is recommended to avoid /><|:&.

> Anyway, it seems like there's a lot of duplicated code. I think you could have
> an array and then do a <some array>.indexOf(aName.toUpperCase) != -1 instead.

> Also, I know it's really silly, but if you're in AUX, it'll become AUX_. Which
> is fine, but what happens if you're /actually/ in AUX_? It seems these two
> chats would get logged together.

True, but it could happen with every rename pattern. To avoid this issue we could rename every name.

[IB profile]
  ->logs
    ->[aAccount.protocol]
      ->[aAccount] + ".account"
        ->[_conv] + ".buddy"/".chat"/".twttr"

This would deal with window's reserved device names as well as forbidden file and directory name endings. .chat is already used for group conversations.
Comment on attachment 8353254 [details] [diff] [review]
alternative approach

*** Original change on bio 1108 attmnt 1502 at 2012-05-21 15:32:42 UTC was without comment, so any subsequent comment numbers will be shifted ***
Attachment #8353254 - Flags: feedback?
*** Original post on bio 1108 at 2012-05-21 20:03:26 UTC ***

(In reply to comment #5)
> Created attachment 8353254 [details] [diff] [review] (bio-attmnt 1502) [details]
> alternative approach
> 
> (In reply to comment #3)
> > So my first thought is: do we want to do this for all OSes or just for Windows?
> 
> Windows has the most restrictive rules. Linux/UNIX allows everything but "/"
> although it is recommended to avoid /><|:&.
> 
> > Anyway, it seems like there's a lot of duplicated code. I think you could have
> > an array and then do a <some array>.indexOf(aName.toUpperCase) != -1 instead.
> 
> > Also, I know it's really silly, but if you're in AUX, it'll become AUX_. Which
> > is fine, but what happens if you're /actually/ in AUX_? It seems these two
> > chats would get logged together.
> 
> True, but it could happen with every rename pattern. To avoid this issue we
> could rename every name.

This doesn't happen if you escape the character itself. That means if there's an underscore: _ in the name, replace it with two: __

Results:

AUX   -> AUX_
AUX_  -> AUX__
AUX__ -> AUX____

Not exactly nice (and confusing too! Who'd expect to have to look in to AUX__ when the account was named AUX_?) but it's reversible atleast.

Should the invalid characters like <>|.. also reversibly be encoded? The percentage encoding as in URLs should work if I'm not mistaken. We'd need to escape % then instead of _ of course. 


AUX   ->   %AUX
COM1  ->   %COM1 
<     ->   %3C

This would work because none of the reserved names start with [0-9A-F] that are 
needed to encode the other characters. 

> [IB profile]
>   ->logs
>     ->[aAccount.protocol]
>       ->[aAccount] + ".account"
>         ->[_conv] + ".buddy"/".chat"/".twttr"
> 
> This would deal with window's reserved device names as well as forbidden file
> and directory name endings. .chat is already used for group conversations.

The idea is not bad in my opinion, except that there are lots of logs (for me atleast) that would need to be converted and we'd have migration code sitting around for a while.
*** Original post on bio 1108 at 2012-05-22 00:49:59 UTC ***

(In reply to comment #6)
> (In reply to comment #5)
> >
> > True, but it could happen with every rename pattern. To avoid this issue we
> > could rename every name.
> 
> This doesn't happen if you escape the character itself. That means if there's
> an underscore: _ in the name, replace it with two: __

I must have been blind as this is kind of what I did in both of the latest patches for reserved characters.


> Should the invalid characters like <>|.. also reversibly be encoded? 

I did in the recent patches.


> > [IB profile]
> >   ->logs
> >     ->[aAccount.protocol]
> >       ->[aAccount] + ".account"
> >         ->[_conv] + ".buddy"/".chat"/".twttr"
> > 
> > This would deal with window's reserved device names as well as forbidden file
> > and directory name endings. .chat is already used for group conversations.
> 
> The idea is not bad in my opinion, except that there are lots of logs (for me
> atleast) that would need to be converted and we'd have migration code sitting
> around for a while.

I didn't consider that. But that reminded me of a bug I noticed today. The name of (at least) the twitter log directory is localized.
Attached patch safe filenames patch v3 (obsolete) — Splinter Review
*** Original post on bio 1108 as attmnt 1534 at 2012-05-29 09:15:00 UTC ***

I have tested the patch a few days, seems to work as expected.
Attachment #8353288 - Flags: review?(clokep)
Comment on attachment 8353253 [details] [diff] [review]
better version of first patch

*** Original change on bio 1108 attmnt 1501 at 2012-05-29 09:15:06 UTC was without comment, so any subsequent comment numbers will be shifted ***
Attachment #8353253 - Attachment is obsolete: true
Comment on attachment 8353288 [details] [diff] [review]
safe filenames patch v3

*** Original change on bio 1108 attmnt 1534 at 2012-05-29 21:59:25 UTC ***

>diff -r 8b2bae87ca15 chat/components/src/logger.js
>+function cleanName (aName)
No space before the (.

>+{
No empty line break before an opening bracket please.

>+/* name mustn't end with '.' or ' ', added '_' to use it as appendix */
Please  use a full sentence (capital starting letter, period at the end). We usually use // comments if it's only one line.

>+  let noEndChars = ["."," ","_"];
Spaces after each , please.

>+  if (noEndChars.indexOf(aName,aName.length-1) != -1)
>+    aName += "_";

>+/* The following array contains all reserved device names (windows)*/
Same comment about comments from above. Is there a source that lists these as reserved?

>+  let reservedNames = ["CON","PRN","AUX","NUL","COM1","COM2","COM3","COM4","COM5","COM6","COM7","COM8","COM9","LPT1","LPT2","LPT3","LPT4","LPT5","LPT6","LPT7","LPT8","LPT9",".",".."];
Please include a space after each comma and wrap your lines at 72 characters.

>+  if (reservedNames.indexOf(aName.toUpperCase()) != -1)
>+    return aName + "_";
What happens here if I actually am talking to CON_? It seems that both CON and CON_ would end up as CON_...that's bad.

>+/* 1) reserved characters (windows): <>:"/\|?* 2) avoid in linux/UNIX: /><|:& 3) added % to prevent possible collisions */
This comment needs to be expanded and formatted better.

>+  let pttrn = /[<>:"\/\\|?*&%]/g;
I think you also want to escape |, * and ?, they're all regexp special characters.

>+  while (pttrn.test(aName)) {
>+    let n = aName.search(pttrn);
>+    aName= aName.replace(aName.charAt(n), "%" + aName.charCodeAt(n).toString(16));
This is...replacing a character with it's hex code? I wonder if we should just encodeURI everything and not worry too specifically about what systems don't like what characters.

>+  }
>+  return aName;
Attachment #8353288 - Flags: review?(clokep) → review-
*** Original post on bio 1108 at 2012-05-30 06:45:13 UTC ***

(In reply to comment #9)
> Comment on attachment 8353288 [details] [diff] [review] (bio-attmnt 1534) [details]
> safe filenames patch v3

Thanks for your comments, I'll try to sharpen my coding style up :-/

 
> Is there a source that lists these as
> reserved?

<http://msdn.microsoft.com/en-us/library/aa365247%28v=vs.85%29.aspx#naming_conventions>


> >+  if (reservedNames.indexOf(aName.toUpperCase()) != -1)
> >+    return aName + "_";
> What happens here if I actually am talking to CON_? It seems that both CON and
> CON_ would end up as CON_...that's bad.

No, CON_ would be saved to CON__ because the previous check of the last characters includes '_' to make it suitable for the use as appendix:

+  let noEndChars = ["."," ","_"];
+  if (noEndChars.indexOf(aName,aName.length-1) != -1)
+    aName += "_";

> >+  let pttrn = /[<>:"\/\\|?*&%]/g;
> I think you also want to escape |, * and ?, they're all regexp special
> characters.

No need to escape them, inside of a character class the only regexp meta characters are ]\^- and / as it is the delimiter.


> >+  while (pttrn.test(aName)) {
> >+    let n = aName.search(pttrn);
> >+    aName= aName.replace(aName.charAt(n), "%" + aName.charCodeAt(n).toString(16));
> This is...replacing a character with it's hex code? I wonder if we should just
> encodeURI everything and not worry too specifically about what systems don't
> like what characters.

It is replacing the character with it's hex code and a leading '%' like encodeURI, but the characters do not match: /?:& wouldn't get encoded while many allowed characters would. But we could use encodeURIComponent() instead.
Attached patch safe filenames patch v4 (obsolete) — Splinter Review
*** Original post on bio 1108 as attmnt 1537 at 2012-05-31 18:43:00 UTC ***

I've picked up Mic's proposal for renaming reserved device names.
This patch includes a decode function as clokep suggested in #instantbird.
Attachment #8353292 - Flags: review?(clokep)
Comment on attachment 8353288 [details] [diff] [review]
safe filenames patch v3

*** Original change on bio 1108 attmnt 1534 at 2012-05-31 18:43:07 UTC was without comment, so any subsequent comment numbers will be shifted ***
Attachment #8353288 - Attachment is obsolete: true
Comment on attachment 8353254 [details] [diff] [review]
alternative approach

*** Original change on bio 1108 attmnt 1502 at 2012-05-31 18:45:50 UTC was without comment, so any subsequent comment numbers will be shifted ***
Attachment #8353254 - Flags: feedback?
Comment on attachment 8353292 [details] [diff] [review]
safe filenames patch v4

*** Original change on bio 1108 attmnt 1537 at 2012-06-12 16:44:48 UTC ***

>diff -r 3f9722c19f85 chat/components/src/logger.js
>--- a/chat/components/src/logger.js	Thu May 31 13:14:34 2012 +0200
>+++ b/chat/components/src/logger.js	Thu May 31 20:25:35 2012 +0200
>@@ -27,6 +27,37 @@
> 
> const kLineBreak = "@mozilla.org/windows-registry-key;1" in Cc ? "\r\n" : "\n";
> 
>+function encodeName(aName)
>+{
Put this on the line above.

>+// Reserved device names by windows 
Wrong indentation. Remove the trailing space, use a period at the end and capitalize Windows.

>+  let reservedNames = ["CON", "PRN", "AUX", "NUL", "COM1", "COM2", "COM3",
>+    "COM4", "COM5", "COM6", "COM7", "COM8", "COM9", "LPT1", "LPT2", "LPT3",
>+    "LPT4", "LPT5", "LPT6", "LPT7", "LPT8", "LPT9"];
>+  if (reservedNames.indexOf(aName.toUpperCase()) != -1)
>+    return "%" + aName;
Florian and I had discussed maybe using a regular expression: /^(CON|PRN|AUX|NUL|COM[1-9]|LPT[1-9])$/? I won't insist on one way or the other. This seems a bit excessive to rename every conversation that could potentially contain one of these though, shouldn't we only care if it is one of these or a % followed by these?

>+// "." and " " mustn't be at the end of a file or folder name, appending "_"
Fix indentation (as above) and add a period at the end.

>+  let noEndChars = [".", " ", "_"];
>+  if (noEndChars.indexOf(aName,aName.length-1) != -1)
>+    aName += "_";
I feel like there should be a simpler way to do this...but maybe not.

>+/* Reserved characters are replaced by %[hex value]. encodeURIComponent() is
>+ * not sufficient, nevertheless decodeURIComponent() can be used to decode.*/
Don't include text on the first line of the comment, put the closing comment on a separate line:
/*
 * Comment here.
 */
It could probably just use // though.

>+  function encodeReservedChars (match) {
>+    return "%" + match.charCodeAt(0).toString(16);
>+  }
A simple function like this can probably doesn't need the { return ... }, and no space after the name.

>+  return aName.replace(/[<>:"\/\\|?*&%]/g, encodeReservedChars);
>+}

>+function decodeName(aName)
>+{
>+  let reservedNames = ["%CON", "%PRN", "%AUX", "%NUL", "%COM1", "%COM2",
>+    "%COM3", "%COM4", "%COM5", "%COM6", "%COM7", "%COM8", "%COM9", "%LPT1",
>+    "%LPT2", "%LPT3", "%LPT4", "%LPT5", "%LPT6", "%LPT7", "%LPT8", "%LPT9"];
>+  if (reservedNames.indexOf(aName.toUpperCase()) != -1)
>+    return aName.substr(1);
>+  if (aName.charAt(aName.length-1) == "_")
>+    aName = aName.substring(0, aName.length-1);
>+  return decodeURIComponent(aName);
>+}
See the same comments as above.

Seems to me that decodeName needs to be used somewhere. :)
Attachment #8353292 - Flags: review?(clokep) → review-
Attached patch safe filenames patch v5 (obsolete) — Splinter Review
*** Original post on bio 1108 as attmnt 1711 at 2012-06-27 19:43:00 UTC ***

(In reply to comment #12)
> >+  let noEndChars = [".", " ", "_"];
> >+  if (noEndChars.indexOf(aName,aName.length-1) != -1)
> >+    aName += "_";
> I feel like there should be a simpler way to do this...but maybe not.

I changed it to the following, it seems simpler:
+  if (/[\. _]/.test(aName.slice(-1)))
+    aName += "_";

I hope I captured all flaws this time. I kept the indexOf() method for the reserved device names and dropped the decode function.
Attachment #8353469 - Flags: review?(clokep)
Comment on attachment 8353254 [details] [diff] [review]
alternative approach

*** Original change on bio 1108 attmnt 1502 at 2012-06-27 19:43:01 UTC was without comment, so any subsequent comment numbers will be shifted ***
Attachment #8353254 - Attachment is obsolete: true
Comment on attachment 8353292 [details] [diff] [review]
safe filenames patch v4

*** Original change on bio 1108 attmnt 1537 at 2012-06-27 19:43:01 UTC was without comment, so any subsequent comment numbers will be shifted ***
Attachment #8353292 - Attachment is obsolete: true
*** Original post on bio 1108 at 2012-07-14 14:58:37 UTC ***

(In reply to comment #13)
> Created attachment 8353469 [details] [diff] [review] (bio-attmnt 1711) [details]
> safe filenames patch v5

So I tested this and it seems that the folder names are properly made, but when I right click on a conversation/body, the "Show Logs" context menu is still disabled as if it can't find the logs. Maybe we're missing a place where the name needs to be encoded? Does this work for you?
OS: Windows XP → All
Hardware: x86 → All
*** Original post on bio 1108 at 2012-07-15 21:26:24 UTC ***

I do not have a built with this patch, but it works for me if I replace logger.js in omni.ja and start IB with -purgecaches.

What are you test cases?
Comment on attachment 8353469 [details] [diff] [review]
safe filenames patch v5

*** Original change on bio 1108 attmnt 1711 at 2012-07-16 05:59:02 UTC ***

:( Works for accounts with problematic names, but not for chats with buddies. Thank you for noticing this case.
Attachment #8353469 - Flags: review?(clokep)
Attached patch safe filenames patch v6 (obsolete) — Splinter Review
*** Original post on bio 1108 as attmnt 1748 at 2012-07-19 10:09:00 UTC ***

I forgot, that logs are not only written ...
Attachment #8353508 - Flags: review?(clokep)
Comment on attachment 8353469 [details] [diff] [review]
safe filenames patch v5

*** Original change on bio 1108 attmnt 1711 at 2012-07-19 10:09:31 UTC was without comment, so any subsequent comment numbers will be shifted ***
Attachment #8353469 - Attachment is obsolete: true
Comment on attachment 8353508 [details] [diff] [review]
safe filenames patch v6

*** Original change on bio 1108 attmnt 1748 at 2012-08-02 01:37:29 UTC ***

This looks good! This patch didn't fully apply though, I'm going to upload an updated version of it (and ask wnayes to look at it, as he's recently looked at much of the logger code).
Attachment #8353508 - Flags: review?(clokep) → review+
Attached patch Patch v6 unbitrotted (obsolete) — Splinter Review
*** Original post on bio 1108 as attmnt 1772 at 2012-08-02 01:39:00 UTC ***

Updated (unbitrotted) patch. Will can you take a look at this? You've been in the logger recently. Thanks!
Attachment #8353533 - Flags: review?(wnayes)
Comment on attachment 8353533 [details] [diff] [review]
Patch v6 unbitrotted

*** Original change on bio 1108 attmnt 1772 at 2012-08-02 17:00:03 UTC ***

While testing on my machine, COM0 and LPT0 were also reserved names. The Wikipedia link cited in the initial bug post also has these, so I think it would be good to include them in the check. CLOCK$ is also listed but it is not reserved on my machine, and maybe normalization takes care of the $ sign?

I also agree that the regular expression mentioned in Comment #12 should be considered. Checking for zeros would shorten it up even more (something like /^(CON|PRN|AUX|NUL|COM\d|LPT\d)$/).

The logger changes ran successfully with writing out several new folders and files, and I think this is very close to being finished.
Attachment #8353533 - Flags: review?(wnayes) → review-
*** Original post on bio 1108 as attmnt 1989 at 2012-10-19 08:46:00 UTC ***

unbitrotted, added COM0 and LPT0, now using RegExp as suggested by wnayes.

I have tested it with buddy, chat, and twitter conversations and everything seems to work fine.
Attachment #8353748 - Flags: review?(clokep)
Comment on attachment 8353508 [details] [diff] [review]
safe filenames patch v6

*** Original change on bio 1108 attmnt 1748 at 2012-10-19 08:46:45 UTC was without comment, so any subsequent comment numbers will be shifted ***
Attachment #8353508 - Attachment is obsolete: true
Comment on attachment 8353533 [details] [diff] [review]
Patch v6 unbitrotted

*** Original change on bio 1108 attmnt 1772 at 2012-10-19 08:46:45 UTC was without comment, so any subsequent comment numbers will be shifted ***
Attachment #8353533 - Attachment is obsolete: true
Comment on attachment 8353748 [details] [diff] [review]
patch v7 - unbitrotted, added COM0, LPT0, using RegExp

*** Original change on bio 1108 attmnt 1989 at 2012-10-24 23:16:12 UTC ***

I think we're just down to a few nits and this can land! :)

># HG changeset patch
># Parent 2685a879ee5c470f8dd5e4bbc184e503ce47eb34
># User Florian Janssen <bugi@media.fjmail.de>
>Patch for 1108, unbitrotted, added COM0 and LPT0, now using RegExp as suggested by wnayes
Please use the bug title as the message here: "Bug 954542 (bio 1108) - Names of folder and files are not checked against forbidden names, r=clokep." in this case.

>diff --git a/chat/components/src/logger.js b/chat/components/src/logger.js

>+function encodeName(aName){
Can you add a brief comment above this function describing what it is doing and why it is necessary (if you want to check wording or anything before asking for a full review again, just catch me on IRC).

>+  // "." and " " mustn't be at the end of a file or folder name, appending "_".
Nit: "must not" instead of "musnt"

Also, I tested this and it seemed to work as advertised!
Attachment #8353748 - Flags: review?(clokep) → review-
*** Original post on bio 1108 as attmnt 2015 at 2012-10-27 14:07:00 UTC ***

Patch with requested changes
Attachment #8353775 - Flags: review?(clokep)
Comment on attachment 8353748 [details] [diff] [review]
patch v7 - unbitrotted, added COM0, LPT0, using RegExp

*** Original change on bio 1108 attmnt 1989 at 2012-10-27 14:07:34 UTC was without comment, so any subsequent comment numbers will be shifted ***
Attachment #8353748 - Attachment is obsolete: true
Comment on attachment 8353775 [details] [diff] [review]
Patch for bug 1108, r=clokep

*** Original change on bio 1108 attmnt 2015 at 2012-10-29 16:36:21 UTC ***

I think this one looks fine!
Attachment #8353775 - Flags: review?(clokep) → review+
Whiteboard: [checkin-needed]
*** Original post on bio 1108 at 2012-11-03 04:18:27 UTC ***

Thanks for fixing this!

Checked in as http://hg.instantbird.org/instantbird/rev/e910ddfccb50
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Whiteboard: [checkin-needed]
Target Milestone: --- → 1.3
You need to log in before you can comment on or make changes to this bug.