HTML4: Accept property for input type="file" should filter file types[form sub]

RESOLVED FIXED in mozilla16

Status


Core
Layout: Form Controls
--
enhancement
RESOLVED FIXED
16 years ago
2 years ago

People

(Reporter: Skewer, Assigned: mounir)

Tracking

(Blocks: 1 bug, {html4, testcase})

Trunk
mozilla16
html4, testcase
Points:
---
Bug Flags:
blocking1.9.2 -
wanted1.9.2 -
blocking1.9.1 -
wanted1.9.1 -
in-testsuite +

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [HTML4-17.3][HTML4-17.4] parity-Opera)

Attachments

(2 attachments)

(Reporter)

Description

16 years ago
Procedure: Try to upload a file other than image/gif or image/jpeg in the testcase.

Expected: Since the accept property is set to those two file types, it shouldn't
be possible to submit a form with any other kind of file. I don't have a way to
test the way the form is submitted, but I would expect a browser to enforce the
accept property after returning from a browse window (and it should probably
filter the browse window too, even though that method isn't fool-proof).

Actual: The browser allows any content-type to be selected in the browse window,
and, though untested, I have a feeling they are submitted this way too.

Build: 2001060104 Win98
(Reporter)

Comment 1

16 years ago
Created attachment 36883 [details]
INPUT TYPE="file" testcase (with accept and style)
(Reporter)

Comment 2

16 years ago
Keywording...
Keywords: correctness, html4, testcase, ui
over to form submission, setting status to new.
Status: UNCONFIRMED → NEW
Component: HTML Form Controls → Form Submission
Ever confirmed: true
OS: Windows 98 → All
Hardware: PC → All

Comment 4

16 years ago
reassigning
Assignee: rods → pollmann

Updated

16 years ago
Component: Form Submission → HTML Form Controls
Target Milestone: --- → Future

Comment 5

16 years ago
Not going to happen until crasher, dataloss, and correctness bugs are fixed.

Updated

16 years ago
Status: NEW → ASSIGNED
Bulk reassigning Eric Pollmann's remaining form submission bugs to Alex.
Assignee: pollmann → alexsavulov
Status: ASSIGNED → NEW

Updated

16 years ago
Summary: HTML4: Accept property for input type="file" should filter file types → HTML4: Accept property for input type="file" should filter file types[form sub]

Updated

16 years ago
Blocks: 46135
Created attachment 77374 [details]
Testcase for accept attribute on FORM

The entire form element can also have an accept attribute.
Whiteboard: [HTML4-17.3][HTML4-17.4]

Comment 8

14 years ago
ping?
this one was quiet for more than a year.
maybe just because it is of "minor" severity?
IMO browser has improved so much in this time that even such minor problems 
should be tackled.
Go for it.
Assignee: alexsavulov → stefan-moz01

Comment 10

14 years ago
So far i managed to pinpoint a file
http://lxr.mozilla.org/seamonkey/source/widget/public/nsIFilePicker.idl
...but have no clue if i am on the right track.
Keywords: helpwanted
Yep.  nsIFilePicker is what's used to control the filepicker that comes up.

So you could filter in there, or you could filter on return from the filepicker,
either way.

Not sure what the right thing is for filenames the user just types, of course...

Comment 12

14 years ago
Stefan, please keep this in mind:

> From: Tim Berners-Lee <timbl@w3.org>
> Date: Fri, 31 Mar 2000 16:37:02 -0500
>
>... you can write:
> 
> <INPUT name="audiofile1" type="file" accept="audio/*">
> 
> and be prompted for various means of audio input (a recorder,
> a mixing desk, a file icon drag and drop receptor, etc).  
> Here "file" does not mean "from a disk" but "large body of
> data with a MIME type".
> 
> As someone who used the NeXT machine's "lip service" many 
> years ago I see no reason why browsers should not implement 
> both audio and video and still capture in this way.   There
> are many occasions that voice input is valuable. We have speech 
> recognition systems in the lab, for example, and of course this 
> is very much needed....  So you don't need to convince me of
> the usefulness.
> 
> However, browser writers have not implemented this!
> 
> One needs to encourage this feature to be implemented, and 
> implemented well.
> 
> I hope this helps.
> 
> Tim Berners-Lee

please see also bug 46135

Comment 13

14 years ago
Re comment 12: Yes, James, i was aware of bug 46135, which is a broader issue, 
only to be done after basic file filtering is done.

Re comment 11: thanks for confirming that I am on the right track with
nsFilePicker. However, I think the filter should be given to the dialog
_before_ the user chooses a file, not applied after the choice, as the goal is
to help the user locate the appropriate file, not to prevent submission of the
wrong MIME type.

Reading the spec again:
http://www.w3.org/TR/html401/interact/forms.html#adef-accept
"This attribute specifies a comma-separated list of content types that a 
_server_ processing this form will handle correctly. User agents _may_ use this 
information to filter out non-conforming _files_..."
(_emphasis_ added by me)
This doesn't really say that the agent should refuse to upload any other MIME type
(e.g. if the user typed the filename, or switched the type combo from "Audio files"
to "All files" (which should still be available IMO) and chose some other
file).

I am somewhat new to the Mozilla source and short on time at the moment, so I
am not actually solving it. Do we have a better candidate out here? Please? :)
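
As an illustration of the "comma-separated list of content types" wording quoted above, here is a minimal sketch of splitting an accept value into tokens. The function name `parse_accept` is hypothetical, and lowercasing the tokens is an assumption made for easier comparison; this is not Mozilla code.

```python
def parse_accept(accept: str) -> list[str]:
    """Split an accept attribute value into lowercase content-type tokens."""
    tokens = []
    for raw in accept.split(","):
        token = raw.strip().lower()  # assumption: compare types case-insensitively
        if token:                    # skip empty entries such as "a,,b"
            tokens.append(token)
    return tokens
```

With the testcase's value, `parse_accept("image/gif, image/JPEG")` yields the two types as separate lowercase tokens, which a file picker could then turn into filters.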

Comment 14

14 years ago
"...User agents _may_ use this information to filter out non-conforming 
_files_..."

Good catch:  the word "may" means that this is actually an optional 
feature request in the file selection boxes, which are very different
across platforms.  I'm attempting to change this from "minor" to 
"enhancement" but don't know if I have the bugzilla privs.  I'm also 
redirecting the dependency accordingly.
No longer blocks: 46135
Severity: minor → enhancement
Depends on: 46135
"may" in said specification means "you better have a REALLY GOOD REASON for not
doing this" (standard RFC-speak).

Comment 16

14 years ago
Bradner                  Best Current Practice                  [Page 2]
RFC 2119                     RFC Key Words                    March 1997

5. MAY   This word, or the adjective "OPTIONAL", mean that an item is
   truly optional.  One vendor may choose to include the item because a
   particular marketplace requires it or because the vendor feels that
   it enhances the product while another vendor may omit the same item.
   An implementation which does not include a particular option MUST be
   prepared to interoperate with another implementation which does
   include the option, though perhaps with reduced functionality. In the
   same vein an implementation which does include a particular option
   MUST be prepared to interoperate with another implementation which
   does not include the option (except, of course, for the feature the
   option provides.)

Comment 17

14 years ago
Probably fighting whole May about "may" won't do any good :)

RFCs and common sense classify "may" as optional, not required to comply with the
spec. However, this does not mean that demanding Mozilla users don't need this
useful feature.

Dunno why James set this to depend on bug 46135. The only reason I see for
doing so is if he envisioned the filtering file picker as a "helper
application". Enlighten us, please :)

Comment 18

14 years ago
I was mistaken about the dependency swap; putting it back.
Blocks: 46135
No longer depends on: 46135

Updated

14 years ago
Blocks: 97806

Comment 19

14 years ago
Memorializing some pertinent documentation, the result of closed bug 61408:
http://bugzilla.mozilla.org/attachment.cgi?id=112432&action=view

Comment 20

14 years ago
http://dev.horde.org/api/horde/dev-doxygen/html/classMIME__Magic.html

I think it's based on Apache's mod_mime_magic which is probably
more up to date.  I think IANA has a database for this function.

Comment 21

14 years ago
Even better:
  http://freshmeat.net/projects/file/

Version numbers above 4 have been divided into the main file(1) program and 
libmagic(3), a library that other programs can use directly to get file information without 
needing to fork and exec file(1). 

Comment 22

14 years ago
Stefan, if you are uncomfortable with milestone 1.6a, then please either change to 1.7a 
or assign this to me.

Here are the pertinent Apache module docs:

  http://httpd.apache.org/docs-2.0/mod/mod_mime_magic.html
  http://httpd.apache.org/docs/mod/mod_mime_magic.html

I'm guessing that might be better but not as up to date as libmagic(3) and probably better 
than the MIME_Magic::filenameToMIME function from the Horde Application Framework.
Target Milestone: Future → mozilla1.6alpha

Comment 23

14 years ago
Are we all in agreement that if the filename is foo.gif but file(1) says it's a jpeg, then it's a 
jpeg?

More importantly:  do you really want to filter this in the file selection dialog box, or would 
it be better to implement a test just prior to form submission, causing any mismatches to 
result in a dialog such as, "The file (name here) selected for upload was supposed to 
match MIME type(s) (accept property list here), but it is MIME type (mime_magic output 
here). Continue with form submission? Yes or No"

If so, should the accept property (list) be shown as part of the file upload widget?
It can be implemented differently on different platforms, but my suggestion is
to have a list of file extensions that are associated with the MIME type and
only show those.  The dialog on Windows and Linux can be like that of many other
implementations--it shows only those files which have those extensions.  It
would also be possible (but time-consuming, especially for remote drives) to get
the magic file type from each file to deal with the .gif case you speak of.

Even if you filter the file extensions in the dialog box, you could still check
the mime type of the file at submission because the user might have typed the
filename into the Browse box instead of selecting a particular file from the GUI.
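
The check at submission time discussed above (foo.gif that is really a JPEG) could be sketched roughly as follows. This is only an illustration: it recognizes three well-known image signatures, compares against exact accept tokens only (no wildcards), and the names `sniff_type` and `conflicts_with_accept` are hypothetical. A real implementation would use something like libmagic rather than this tiny table.

```python
from typing import Optional

# Leading byte signatures (magic numbers) for a few common image formats.
SIGNATURES = [
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
]

def sniff_type(data: bytes) -> Optional[str]:
    """Return the MIME type implied by the leading bytes, or None if unknown."""
    for magic, mime in SIGNATURES:
        if data.startswith(magic):
            return mime
    return None

def conflicts_with_accept(data: bytes, accept: str) -> bool:
    """True when the sniffed type is known and matches no accept token."""
    sniffed = sniff_type(data)
    allowed = [t.strip().lower() for t in accept.split(",") if t.strip()]
    if sniffed is None or not allowed:
        return False  # nothing to compare, so nothing to warn about
    return sniffed not in allowed
```

A caller would show the confirmation dialog proposed in comment 23 only when `conflicts_with_accept` returns True, so the file name itself never enters the decision.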
> but my suggestion is to have a list of file extensions that are associated with
> the MIME type and only show those

So if I don't name my jpegs foo.jpg I get screwed?  Let's not, ok?  Especially
since some OSes (BeOS comes to mind) have MIME types for every file in the
filesystem metadata.
First, there is OS support on some OS's (notably Windows) for the *.jpg *.jpeg idea.
Second, for network drives determining the mime type can be very expensive, so
it may not be desirable to narrow down the file types for the user in that way
in all cases (though I'd be happy to see if there was a way to do that).
Third, for OS's that have quick support for determining that mime type, they can
sure as heck do that for the dialog.  No one says the file dialog has to be XP,
and it is not.
Fourth, on many OSes users generally save JPEGs as *.jpg and *.jpeg and are used to
being presented with dialogs that filter these files down.  If you are on such
an OS and you wish to name your JPEGs *.txt, you can type the name in directly
or (if we decide to allow it) change the file type filter to All Files (*.*).
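
To make the extension-based proposal concrete, here is a rough sketch of turning an accept list into picker filters while always keeping an "All Files" escape hatch. The `EXTENSIONS` table and the name `picker_filters` are hypothetical; a real implementation would take the mapping from the platform or a MIME registry rather than a hard-coded dictionary.

```python
# Hypothetical type-to-extension table, for illustration only.
EXTENSIONS = {
    "image/jpeg": ["jpg", "jpeg"],
    "image/gif": ["gif"],
    "text/html": ["html", "htm"],
}

def picker_filters(accept: str) -> list[tuple[str, str]]:
    """Return (label, patterns) pairs, always ending with an All Files entry."""
    filters = []
    for token in accept.split(","):
        mime = token.strip().lower()
        exts = EXTENSIONS.get(mime)
        if exts:  # types with no known extension contribute no filter
            patterns = ";".join("*." + ext for ext in exts)
            filters.append((mime, patterns))
    # The user can still pick a JPEG named foo.txt by switching to this entry.
    filters.append(("All Files", "*.*"))
    return filters
```

Unknown types simply fall through to the "All Files" entry, which matches the point above that the filter is a convenience and not an enforcement mechanism.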

Comment 27

14 years ago
File extensions can't be trusted, especially on Micros~1 where 
foo_JPG.bak could easily be a gif.  So you have to allow file 
selection filter overrides.  Having said that, not all operating 
systems have filepicker dialogs with type filtering to begin with.

Plus, according to the HTML 3.2 spec, filename filtering was 
supposed to be different than type filtering.  The idea was that 
you could do something like accept="*.jpg;*.jpeg" when you wanted
to filter by filename patterns, xor put a (list of) mime type(s)
in when you wanted to filter by type.  That parsing abomination 
was mercifully yanked from HTML 4.

I'm pretty happy with iterating mime_magic() over the uploaded 
files, using the dialog from comment 23 when there's a conflict,
and showing the accept property (list), when there is one, in the 
widget under both the filename text box and the browse button.

Or, should the accept property (list) be rendered above the other 
two sub-widgets instead of below?

I just looked at a bunch of upload-enabled forms from a couple 
different mass-market websites (Yahoo and Meetup) and it seems 
safe to assume that form authors allow for a wide variety of 
shapes and sizes for the upload widget.  I remember that many 
of the popular browsers have rendered them as line-breakable 
in low margin width situations, so adding height to the widget 
should not be a problem.

The more I think about use-cases, the less I think filtering 
by filename patterns is likely to save much time, since all 
the major OS file pickers allow for multiple ways to sort 
files, and most allow sorting by filename extension.

Comment 28

14 years ago
Very much agree with John in both comment 24 and comment 26. James, don't make
it too strict, please. The file picker is supposed to HELP the user. The server
still has to check what was uploaded (you know... there are some other browsers
that might not respect this _recommendation_).
- The file selection would be much easier and time saving if there are fewer 
  files to pick from (only *.jpg,*.jpeg,*.pjpeg for accept=image/jpeg)
- The browser should not totally prevent the upload of any other file, whether
  its type is determined by filename, file(1), or some other MIME magic.
- If user knowingly saved his jpeg to some.gif let him pick it by disabling the 
  filter (switch to "Show all files")
- If some OS (like BeOS) discourages use of *.jpg (and variants) for image/jpeg
  then it might be possible to literally filter files by MIME type (if it is
  part of the file metadata, as the filename is on all other OSes). A BeOS guru
  can jump in at any time (and will if filtering by name would be inconvenient
  for him).
Assignee: stefan-moz01 → jps

Comment 29

14 years ago
I am not opposed to file extension filtering AS AN OPTION, 
on the platform(s) that supports it without much extra work.  
Here is how Win32 does file extension filtering:

CFileDialog::CFileDialog( 

  BOOL,    // true  = File Open dialog box
           // false = File Save As dialog box

  LPCTSTR, // If an extension is not included in 
           // the Filename edit box, this extension
           // is automatically appended to the 
           // filename.   NULL = none
    // PLEASE NOTE THE USE OF THE SINGULAR EXTENSION.

  LPCTSTR, // The initial filename that appears in 
           // the filename edit box.  NULL = none

  OFN_ALLOWMULTISELECT,           // flag bitmask
    // allow multi-select iff bug 44464 resolved

  _T("Text files (.txt)|*.txt|")  // filter string
  _T("HTML files (.html; .htm)|*.html;*.htm|") 
    // another pair of "text|pattern(s)|"
  _T("All files|*.*|")            // IMPORTANT TO INCLUDE THIS
    // another pair of "text|pattern(s)|"
  _T("|"), // terminate concat of pairs with a bar

  CWnd*    // final argument is a pointer to the 
           // file dialog box object's parent or 
           // owner window, for notifications.
)
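
The filter argument annotated above is a sequence of "text|pattern(s)|" pairs terminated by one more bar. A small sketch of assembling such a string from pairs follows; the helper name `build_filter_string` is hypothetical and only mirrors the format described in this comment.

```python
def build_filter_string(pairs):
    """Join (label, patterns) pairs into 'label|patterns|' runs plus a final '|'."""
    parts = []
    for label, patterns in pairs:
        if "|" in label or "|" in patterns:
            # The bar is the delimiter, so it cannot appear inside a pair.
            raise ValueError("'|' is not allowed inside a label or pattern")
        parts.append(f"{label}|{patterns}|")
    return "".join(parts) + "|"  # terminate the concatenation with a bar
```

For example, passing `[("Text files (.txt)", "*.txt"), ("All files", "*.*")]` produces `Text files (.txt)|*.txt|All files|*.*||`, the same shape as the literal in the call above.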

Now, I still think it would be polite -- perhaps too polite 
-- to warn and offer to defer form submission when the MIME 
type as reported by mime_magic() doesn't match something in 
the accept property list, as in comment 23.  However, that 
is optional, too.

Whether you do that or not, the first step is to find a 
MIME type matcher suitable for matching lists of types 
with arbitrary MIME type patterns (i.e., 
"image,text/* audio/x-specific;type=with;parameter=values")
against the key types in mimeTypes.rdf.  The match must 
not be case sensitive, (except for the "values" maybe, if 
you even need to match those.)  There is probably one of 
those in the browser code already.  I think HTTP uses MIME 
list matching.

Then mimeTypes.rdf has the mapping from type to extension(s).
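
A matcher of the kind described in this comment might look like the sketch below. It assumes a comma-separated accept list (as in the HTML 4.01 text quoted in comment 13), compares case-insensitively, allows `*` as a wildcard on either side of the slash, and ignores any `;parameter=value` suffix. The name `matches_accept` is hypothetical and this is not existing Mozilla code.

```python
def matches_accept(mime: str, accept: str) -> bool:
    """Case-insensitive match of one MIME type against an accept list."""
    kind = mime.split(";", 1)[0].strip().lower()  # drop parameters
    if "/" not in kind:
        return False
    major, minor = kind.split("/", 1)
    for token in accept.split(","):
        pattern = token.split(";", 1)[0].strip().lower()
        if "/" not in pattern:
            continue  # malformed or empty entry, skip it
        p_major, p_minor = pattern.split("/", 1)
        if p_major in ("*", major) and p_minor in ("*", minor):
            return True
    return False
```

Running each key type from a type-to-extension table through `matches_accept` would give the set of extensions to offer in the picker.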

Comment 30

14 years ago
Asking Boris whether/how to use the http mime type list match.
Priority: -- → P2
Target Milestone: mozilla1.6alpha → mozilla1.5alpha
what "http mime type list"? I think you want this:
http://lxr.mozilla.org/seamonkey/source/netwerk/mime/public/nsIMIMEService.idl

Comment 32

14 years ago
Thanks, Christian, but nsIMIMEService.idl doesn't (yet) have anything like
 GetAllExtensionsFromMimeTypes(in string listWithWildcards)

Here's what I asked Boris:

Would you please tell stefan-moz01@baebler.net and me where the MIME type
match between a list of types with optional patterns -- the one that HTTP
uses for processing "Accept*:" header(s) -- is located in the Mozilla code?

Do you think it can be called from forms or will it have to be copied?

I welcome anyone who would like to chime in on these questions, please.

Comment 33

14 years ago
Boris is correct, as usual:

> It's not located anywhere.  Mozilla sends Accept* headers, but it 
> never receives them (being an HTTP client, not server), hence never 
> has to process them.

Stefan, would you like this back?  Do you think you can write the
case-insensitive MIME list-with-wildcards pattern match by 1.5alpha?  
I really hate to see this one slip.  Is there anything else you 
would need for it?
Priority: P2 → P1
Target Milestone: mozilla1.5alpha → mozilla1.6beta

Comment 34

14 years ago
To Stefan.  I've got to clear my calendar for at least the next several weeks.

Apache does have an Accept HTTP header parser, of course, which you may use for this:
  http://httpd.apache.org/docs/content-negotiation.html
  http://cvs.apache.org/viewcvs.cgi/apache-1.3/src/modules/standard/mod_negotiation.c

If you don't think you can do this by 1.7a, please swap the dependency with bug 46135.

Do you have all three platforms handy?  If not, I'd ask Boris; he knows this stuff like the 
back of his hand.  Although I can't promise that he has even looked once at the Mac file 
picker routines.
Assignee: jps → stefan-moz01
Priority: P1 → --
Target Milestone: mozilla1.6beta → ---

Comment 35

14 years ago
Unfortunately I won't be able to deal with it in the next months either.
If someone can, please do so.
Assignee: stefan-moz01 → form
QA Contact: vladimire → ian

Comment 36

13 years ago
so what is needed to be done here?
Everything.  None of this is implemented.

Comment 38

11 years ago
The Web Forms 2 specification also allows wild cards, e.g. accept="image/*":
http://whatwg.org/specs/web-forms/current-work/#upload

Comment 39

9 years ago
FWIW, Opera supports this including the wildcards of Web Forms 2.0

Would be nice for use cases like Flickr (image/* only shows pictures that can be uploaded).
Flags: wanted1.9.1?
Whiteboard: [HTML4-17.3][HTML4-17.4] → [HTML4-17.3][HTML4-17.4] parity-Opera
Flags: wanted1.9.1?
Flags: wanted1.9.1-
Flags: blocking1.9.1-

Updated

9 years ago
Keywords: polish
Whiteboard: [HTML4-17.3][HTML4-17.4] parity-Opera → [HTML4-17.3][HTML4-17.4][polish-hard][polish-interactive] parity-Opera

Updated

9 years ago
Flags: wanted1.9.2?

Updated

8 years ago
Assignee: layout.form-controls → nobody
Keywords: polish
QA Contact: ian → layout.form-controls
Whiteboard: [HTML4-17.3][HTML4-17.4][polish-hard][polish-interactive] parity-Opera → [HTML4-17.3][HTML4-17.4] parity-Opera
Flags: wanted1.9.2?
Flags: wanted1.9.2-
Flags: blocking1.9.2-

Updated

7 years ago
Duplicate of this bug: 542271

Updated

7 years ago
Depends on: 377624, 565272, 565274
(Assignee)

Updated

7 years ago
Assignee: nobody → mounir.lamouri
Status: NEW → ASSIGNED
(Assignee)

Updated

7 years ago
Keywords: helpwanted

Comment 41

6 years ago
This does not work with many MIME types, including:
application/*
text/html
text/plain
text/xml
text/*
image/jpg
...
image/[insertAnyCharacterHereThatIsNotA *]

At least for:
image/jpg
image/png
image/tiff
should treat as:
image/*
instead of ignoring.
Same goes to audio, video, etc... 

Could that be fixed?
(Assignee)

Comment 42

6 years ago
Only "image/*", "audio/*" and "video/*" are currently supported. We do not yet support MIME TYPES, see bug 565274. In addition, note that "<foo>/*" isn't allowed, except for image, audio and video.
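
The behavior described in this comment can be pictured with the sketch below: only the three wildcard tokens produce a filter, and everything else in the accept list is ignored. The name `honored_filters` is hypothetical, and the case-insensitive comparison is an assumption; this is not the actual implementation.

```python
SUPPORTED_WILDCARDS = {"image/*", "audio/*", "video/*"}

def honored_filters(accept: str) -> list[str]:
    """Return the accept tokens that would produce a picker filter."""
    honored = []
    for raw in accept.split(","):
        token = raw.strip().lower()  # assumption: tokens compare case-insensitively
        if token in SUPPORTED_WILDCARDS and token not in honored:
            honored.append(token)
    return honored
```

Under this model `accept="image/jpg"` yields no filter at all, which is the behavior comment 41 reports.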

Comment 43

6 years ago
(In reply to comment #42)
> Only "image/*", "audio/*" and "video/*" are currently supported.
Could you please make it so that image/<foo> is treated as image/* until you support all variants of image (same for audio and video)?

> We do not yet support MIME TYPES, see bug 565274. In addition, note that 
> "<foo>/*"
> isn't allowed, except for image, audio and video.
Could you please point to the RFC that indicates that? Maybe there's no strong enough reason for not allowing support of it.

Comment 44

6 years ago
(In reply to comment #42)
> Only "image/*", "audio/*" and "video/*" are currently supported. We do not
> yet support MIME TYPES, see bug 565274. In addition, note that "<foo>/*"
> isn't allowed, except for image, audio and video.

Wouldn't it be reasonable to add one more -- "text/*" -- to that list? I suspect it would be useful for certain web applications, such as bug trackers.

Comment 45

5 years ago
parity-Chrome?

<input type="file" accept="application/vnd.ms-excel"/>
presents a dialog only allowing excel files in Chrome 18 on windows XP.
(Assignee)

Comment 46

5 years ago
Please, see activity in bug 565274.
(Assignee)

Comment 47

5 years ago
This is now done.
Status: ASSIGNED → RESOLVED
Last Resolved: 5 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla16

Comment 48

2 years ago
In case anybody is still following this after so long...isn't this supposed to also provide the ability to filter by file extension? For example: ".xls, .xlsx, .csv". Also, this currently only adds filters that the user can select. If they don't select the filter then the file browser will still accept other file types, but the specification says that's not the desired behavior when supporting the "accept" attribute. The "accept" attribute should define the current/default/only filter. Sorry if I'm misunderstanding something about this feature. It's kind of annoying, though, when IE and Chrome both support this and Firefox doesn't...
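
For readers unfamiliar with the extension form of the attribute mentioned here, a rough sketch of separating extension tokens from MIME type tokens follows. The name `classify_accept` is hypothetical, and the rule used (a leading dot marks an extension, a slash marks a MIME type) is a simplification for illustration.

```python
def classify_accept(accept: str):
    """Split accept tokens into (extensions, mime_types)."""
    extensions, mime_types = [], []
    for raw in accept.split(","):
        token = raw.strip().lower()
        if token.startswith(".") and len(token) > 1:
            extensions.append(token)   # e.g. ".xls"
        elif "/" in token:
            mime_types.append(token)   # e.g. "image/*"
        # anything else is ignored in this sketch
    return extensions, mime_types
```

Given `".xls, .xlsx, .csv"`, all three tokens land in the extensions list and could be mapped directly to picker patterns.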

Comment 49

2 years ago
(In reply to Michael DeVos from comment #48)

This is was bug 826176 is about, and as you can see, it has been implemented and will be part of Firefox 37.

Comment 50

2 years ago
*This is what