Closed Bug 1155154 Opened 10 years ago Closed 9 years ago

input type file: filters incorrectly removed if accept attribute is in the form: ".prefix,.prefixPlusSomething"

Categories

(Core Graveyard :: File Handling, defect)

37 Branch
defect
Not set
normal

Tracking

(firefox42 fixed)

RESOLVED FIXED
mozilla42
Tracking Status
firefox42 --- fixed

People

(Reporter: l.rozmus, Assigned: arnaud.bienner)

References

()

Details

(Keywords: testcase)

Attachments

(3 files, 4 obsolete files)

User Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:37.0) Gecko/20100101 Firefox/37.0
Build ID: 20150403141925

Steps to reproduce:

make simple input fields with simple filter file extensions or mime types
as listed here
http://jsfiddle.net/sw2tgd36/23/

1.
<input type="file" accept=".xlsx,.xls">
<input type="file" accept=".xlsx, .xls">
<input type="file" accept=".xls,.xlsx">
<input type="file" accept=".xls, .xlsx">
2.
<input type="file" accept=".xls,.html,.xlsx">
3.
<input id="fileSelect" type="file" accept="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet, application/vnd.ms-excel" /> 
4.
<input id="fileSelect" type="file" accept="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet" /> 
5.
<input id="fileSelect" type="file" accept="application/vnd.ms-excel" /> 
    




Actual results:

Ubuntu 14.04LTS 64bit Firefox 37.0.1
- Filters on 1) doesn't work! shows "*.xlsx;"
rest works fine

Windows 7 64bit Firefox 37.0.1
- only Filter as 2)
- rest FAIL!



Expected results:

Filters on 1) allow files only with extensions .xls and .xlsx
Filters on 2) allow files only with extensions .xls and .xlsx and .html
Filters on 3) allow files only with extensions: xls, xlsx
Filters on 4) allow files only with extensions: xlsx
Filters on 5) allow files only with extensions: xls
Status: UNCONFIRMED → RESOLVED
Closed: 10 years ago
Component: Untriaged → DOM: Core & HTML
Product: Firefox → Core
Resolution: --- → DUPLICATE
Status: RESOLVED → UNCONFIRMED
Keywords: testcase
OS: Linux → All
Hardware: x86_64 → All
Resolution: DUPLICATE → ---
Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:40.0) Gecko/20100101 Firefox/40.0
Build ID: 20150420030204

Confirmed.
Status: UNCONFIRMED → NEW
Ever confirmed: true
See Also: → 1156273
Component: DOM: Core & HTML → File Handling
Do we have a tracking bug for @accept issues?
Flags: needinfo?(jonas)
I don't know.
Flags: needinfo?(jonas)
Everything works fine for me on Mac, but indeed I can reproduce the problem on Linux.
I will have a look.
Assignee: nobody → arnaud.bienner
Status: NEW → ASSIGNED
I think I've found what is the problem.
Since bug 826176 we remove duplicate filters (here in the code [1]).
For example, <input type=file accept="image/jpeg,.jpeg"> will lead to redundant filters, as image/jpeg will already be mapped to a list of extensions containing ".jpeg". So this second filter is removed.

The "is it a duplicate?" check is done only by looking at extension list string: this was the easiest way to do it when we implemented it since we were only manipulating extension lists as string in this part of the code.
If an extension list is a substring of another, then we consider it as a duplicate:
image/jpeg maps to "*.jpeg; *.jpg" extension list string, and our second filter ".jpg" maps to the "*.jpg" extension list is a substring of it, so we ignore it.
But with accept=".xls,.xlsx" (or any other in the form of ".prefix,.prefixPlusSomething") we incorrectly remove the first filter while it is different from the other one.

1) The very quick solution might be to have ".jpg" filter to map to "*.jpg; " instead of "*.jpg"
IIRC I just didn't want to have useless a ";" separator for extension list with only one element. Maybe because on Windows we display the extension list next to filter's title and that looks ugly. I will check.

2) Alternative, clearer solution, would be to rewrite the code so nsFilePickerFilter stores filters as a real list/set instead of a string [2], to make comparison easier and less error prone. But image/*,audio/* and video/* extensions lists are stored as string in [3]. Not sure it's worth to change everything just for this bug.

I will have a deeper look tomorrow.


[1]: http://lxr.mozilla.org/mozilla-central/source/dom/html/HTMLInputElement.cpp#7166
[2]: http://lxr.mozilla.org/mozilla-central/source/dom/html/HTMLInputElement.h#1395
[3]: http://lxr.mozilla.org/mozilla-central/source/toolkit/content/filepicker.properties#11
Summary: filter extension input type="file" accept=".xls,.xlsx" doesn't work → input type file: filters incorrectly removed if accept attribute is in the form: ".prefix,.prefixPlusSomething"
Attached patch bug1155154.patch (obsolete) — Splinter Review
I'm asking you Olli for review since you reviewed my patches in bug 826176.

To summarize my previous comment, as you might remember, we remove duplicated filters by checking if a filter extension string is a substring of the other one:
e.g. "*.jpeg" and ".jpeg; *.jpg": first one will be removed.

To avoid having useless ";" separator, we add them only when necessary.
But then the "substring check" will not work for filters like "*.xls" and "*.xlsx" while it works for "*.xls; " and "*.xlsx; "

This patch adds the missing ";" when comparing two filters.
Another solution would be to always add it. As I said previously, I guess I did this because on Windows, we display the list of filters, and I'm not sure the file picker is smart enough to remove extra ";" separator, and having something named "MP3 (*.mp3; )" instead of "MP3 (*.mp3)" doesn't look like IMO. But unfortunately I don't have a Windows dev env anymore so I can't test this.

I don't like the way we play with strings like this, but using more abstract types for filters would need lot of code rewriting. But maybe worth it, in a second step.

One last note: the first change in the patch isn't related to this, but I don't like having a filter named ".ext ;" while its extension list is ".ext". I'm not sure why I did that, but having it named ".ext" is looks definitely better.
Attachment #8634381 - Flags: review?(bugs)
Please don't do unrelated changes.

How do we deal with accept attribute containing spaces, for example
".xls, .xlsx" or ".xls,  .xlsx", or ".xlsx, .xls", or ".xlsx,  .xls" ?
What if there is ; in the accept attribute?

Might be good to add some more tests
(In reply to Olli Pettay [:smaug] from comment #9)
> Please don't do unrelated changes.

I thought this was really too small to be worth another commit, but fair enough, I will remove this.
 
> How do we deal with accept attribute containing spaces, for example
> ".xls, .xlsx" or ".xls,  .xlsx", or ".xlsx, .xls", or ".xlsx,  .xls" ?

Fine: it seems it's already handled by the tokenizer.

> What if there is ; in the accept attribute?

This one doesn't work well however :( Good catch.
The accept being ".xlsx;  *.*" for example will lead to the filter extension string being "*.xlsx; *.*"
While I'm not sure of the impact this has, this sounds definitely buggy.
So we should definitely do something about the characters that will be interpreted by the file picker (';' and '*' I don't have others in mind) or discard the filter if anything after the '.' isn't alphanumeric.

> Might be good to add some more tests

Do you have other tests in mind? I will try to think about more tests on my side too.
So I was thinking just those
".xls, .xlsx", ".xls,  .xlsx", ".xlsx, .xls", ".xlsx,  .xls" and something including ';'
Comment on attachment 8634381 [details] [diff] [review]
bug1155154.patch

so, if you could add some more tests to that patch.
Attachment #8634381 - Flags: review?(bugs)
As I said in comment 10, multiple filters with extra spaces works now. I will update the patch to add those tests.
But not ';' but that sounds like a different issue than the initial problem so I should fix this in a separate patch (even if we land them together).
About the ';' issue, what do you think we should do? Ignore the extension if it contains characters interpretable by the file picker? (';' '*' maybe others but I can think only about those two) Just remove them? Or ignore the extension if it is not only alphanumeric characters?
The latter sounds better, but maybe it should be "alphanumeric" + few other characters: looking at [1] I can see some extensions use '(' '#' etc. but those seem to be quite uncommon so maybe we should go with "alphanumeric" only.
What do you think?

[1]: https://en.wikipedia.org/wiki/Alphabetical_list_of_filename_extensions_%28A%E2%80%93E%29
Flags: needinfo?(bugs)
Could you still add a test about ;, so that we know what our current behavior is, and then we can
change the results when ; handling is possibly changed later.

What do other browsers do with ; ?
Flags: needinfo?(bugs)
(In reply to Olli Pettay [:smaug] from comment #14)
> What do other browsers do with ; ?

Chrome:
  - Mac/Linux: ';' and '*' are considered as normal characters (not ignored, not interpreted by the file picker)
  - Windows:
    '*' is interpreted by the file picker (i.e. .pd* will also show .pdf files)
    everything after ';' is ignored

Internet explorer: extension filters that contain ';' or '*' (and also some other characters like '&') are ignored.

I believe we should simply just ignore filters that contain problematic characters (';' and '*')
Attached patch bug1155154-2.patch (obsolete) — Splinter Review
Patch with updated tests, and first minor change removed, as it is unrelated.
Attachment #8634381 - Attachment is obsolete: true
Attachment #8634978 - Flags: review?(bugs)
And here is the small patch to change the filter's name.
As I said, there is no reason to have an extra ";" in the name.
Attachment #8634981 - Flags: review?(bugs)
Last part: ignore extension filter if it contains ';' or '*' + updated and new test cases.
Attachment #8634999 - Flags: review?(bugs)
Attachment #8634999 - Flags: review?(bugs) → review+
Attachment #8634981 - Flags: review?(bugs) → review+
Comment on attachment 8634978 [details] [diff] [review]
bug1155154-2.patch

Shouldn't just appending ';' be enough. Why '; '?
Attachment #8634978 - Flags: review?(bugs) → review+
Attachment #8634978 - Attachment is obsolete: true
Attachment #8636742 - Flags: review+
(In reply to Olli Pettay [:smaug] (away-ish until July 26, expect slower than usual reviews) from comment #20)
> Shouldn't just appending ';' be enough. Why '; '?

Yes, you're right. I just copy it from the other cases where we add an extra space when constructing the list of extensions, but here we don't need it.
I've updated the patch with this change + give it a correct description.
Update patch description, carrying r+
Attachment #8634981 - Attachment is obsolete: true
Attachment #8636745 - Flags: review+
ditto
Attachment #8634999 - Attachment is obsolete: true
Attachment #8636747 - Flags: review+
Pushed to try, everything looks fine: https://treeherder.mozilla.org/#/jobs?repo=try&revision=32548304020b
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/f03577f3c73b
https://hg.mozilla.org/mozilla-central/rev/21b8a570d156
https://hg.mozilla.org/mozilla-central/rev/f17fa0479c1a
Status: ASSIGNED → RESOLVED
Closed: 10 years ago9 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla42
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: