Bug 83749 - HTML4: Accept property for input type="file" should filter file types [form sub]
Status: RESOLVED FIXED
Whiteboard: [HTML4-17.3][HTML4-17.4] parity-Opera
Keywords: html4, testcase
Product: Core
Classification: Components
Component: Layout: Form Controls
Version: Trunk
Platform: All All
Importance: -- enhancement with 35 votes
Target Milestone: mozilla16
Assigned To: Mounir Lamouri (:mounir)
:
: Jet Villegas (:jet)
Mentors:
Duplicates: 542271
Depends on: 377624 565272 565274
Blocks: input-helper-apps xforms
Reported: 2001-06-01 18:38 PDT by Skewer
Modified: 2015-01-15 01:22 PST
CC List: 30 users
roc: blocking1.9.2-
roc: wanted1.9.2-
roc: blocking1.9.1-
roc: wanted1.9.1-
mounir: in‑testsuite+


Attachments
INPUT TYPE="file" testcase (with accept and style) (1.18 KB, text/html)
2001-06-01 18:40 PDT, Skewer
Testcase for accept attribute on FORM (254 bytes, text/html)
2002-04-02 18:56 PST, Christopher Hoess (gone)

Description Skewer 2001-06-01 18:38:49 PDT
Procedure: Try to upload a file other than image/gif or image/jpeg in the testcase.

Expected: Since the accept property is set to those two file types, it shouldn't
be possible to submit a form with these kinds of files. I don't have a way to
test the way the form is submitted, but I would expect a browser to enforce the
accept property after returning from a browse window (and it should probably
filter the browse window too, even though that method isn't fool-proof).

Actual: The browser allows any content-type to be selected in the browse window,
and, though untested, I have a feeling they are submitted this way too.

Build: 2001060104 Win98
Comment 1 Skewer 2001-06-01 18:40:26 PDT
Created attachment 36883
INPUT TYPE="file" testcase (with accept and style)
Comment 2 Skewer 2001-06-01 18:51:38 PDT
Keywording...
Comment 3 Boris Zbarsky [:bz] (still a bit busy) 2001-06-01 20:07:28 PDT
over to form submission, setting status to new.
Comment 4 rods (gone) 2001-06-04 04:57:13 PDT
reassigning
Comment 5 Eric Pollmann 2001-06-07 20:18:03 PDT
Not going to happen until crasher, dataloss, and correctness bugs are fixed.
Comment 6 Kevin McCluskey (gone) 2001-10-29 11:58:26 PST
Bulk reassigning Eric Pollmann's remaining form submission bugs to Alex.
Comment 7 Christopher Hoess (gone) 2002-04-02 18:56:12 PST
Created attachment 77374
Testcase for accept attribute on FORM

The entire form element can also have an accept attribute.
Comment 8 Stefan Baebler 2003-04-16 23:16:00 PDT
ping?
this one was quiet for more than a year.
maybe just because it is of "minor" severity?
IMO the browser has improved so much in this time that even such minor problems 
should be tackled.
Comment 9 Boris Zbarsky [:bz] (still a bit busy) 2003-04-17 13:04:13 PDT
Go for it.
Comment 10 Stefan Baebler 2003-04-22 08:49:36 PDT
So far I have managed to pinpoint a file:
http://lxr.mozilla.org/seamonkey/source/widget/public/nsIFilePicker.idl
...but I have no clue whether I am on the right track.
Comment 11 Boris Zbarsky [:bz] (still a bit busy) 2003-04-22 09:44:37 PDT
Yep.  nsIFilePicker is what's used to control the filepicker that comes up.

So you could filter in there, or you could filter on return from the filepicker,
either way.

Not sure what the right thing is for filenames the user just types, of course...
Comment 12 James Salsman 2003-05-08 02:12:35 PDT
Stefan, please keep this in mind:

> From: Tim Berners-Lee <timbl@w3.org>
> Date: Fri, 31 Mar 2000 16:37:02 -0500
>
>... you can write:
> 
> <INPUT name="audiofile1" type="file" accept="audio/*">
> 
> and be prompted for various means of audio input (a recorder,
> a mixing desk, a file icon drag and drop receptor, etc).  
> Here "file" does not mean "from a disk" but "large body of
> data with a MIME type".
> 
> As someone who used the NeXT machine's "lip service" many 
> years ago I see no reason why browsers should not implement 
> both audio and video and still capture in this way.   There
> are many occasions that voice input is valuable. We have speech 
> recognition systems in the lab, for example, and of course this 
> is very much needed....  So you don't need to convince me of
> the usefulness.
> 
> However, browser writers have not implemented this!
> 
> One needs to encourage this feature to be implemented, and 
> implemented well.
> 
> I hope this helps.
> 
> Tim Berners-Lee

please see also bug 46135
Comment 13 Stefan Baebler 2003-05-08 03:09:34 PDT
Re comment 12: Yes, James, I was aware of bug 46135, which is a broader issue, 
only to be done after basic file filtering is done.

Re comment 11: thanks for confirming that I am on the right track with 
nsIFilePicker. However, I think the dialog should be told to filter 
_before_ the file is chosen, not after the user has chosen it, as the goal is 
to help the user locate the appropriate file, not to prevent submission of 
the wrong MIME type.

Reading the spec again:
http://www.w3.org/TR/html401/interact/forms.html#adef-accept
"This attribute specifies a comma-separated list of content types that a 
_server_ processing this form will handle correctly. User agents _may_ use this 
information to filter out non-conforming _files_..."
(_emphasis_ added by me)
This doesn't really say that the agent should not upload any other MIME type 
(e.g. if the user typed the filename, or switched the type combo from "Audio files" 
to "All files" (which should still be available IMO) and chose some other 
file).

I am somewhat new to the Mozilla source and short of time at the moment, so I 
am not actually solving it. Do we have a better candidate out here? Please? :)
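
For illustration only (this is not Mozilla code): a minimal Python sketch of reading the comma-separated accept list described in the spec text quoted above. The name parse_accept is hypothetical, and lowercasing tokens and dropping duplicates are assumptions made here, not requirements taken from the spec.

```python
def parse_accept(value):
    """Split an accept attribute value into trimmed, lowercase tokens.

    Empty entries (e.g. from a trailing comma) and repeated tokens are dropped.
    """
    tokens = []
    for raw in value.split(","):
        token = raw.strip().lower()
        if token and token not in tokens:
            tokens.append(token)
    return tokens


# Usage: the value from the original testcase description.
print(parse_accept("image/gif, image/jpeg"))
```

An empty result would simply mean "no hint from the author", in which case a picker would presumably show every file.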
Comment 14 James Salsman 2003-05-08 05:57:47 PDT
"...User agents _may_ use this information to filter out non-conforming 
_files_..."

Good catch:  the word "may" means that this is actually an optional 
feature request in the file selection boxes, which are very different
across platforms.  I'm attempting to change this from "minor" to 
"enhancement" but don't know if I have the bugzilla privs.  I'm also 
redirecting the dependency accordingly.
Comment 15 Boris Zbarsky [:bz] (still a bit busy) 2003-05-08 06:42:47 PDT
"may" in said specification means "you better have a REALLY GOOD REASON for not
doing this" (standard RFC-speak).
Comment 16 James Salsman 2003-05-08 06:51:51 PDT
Bradner                  Best Current Practice                  [Page 2]
RFC 2119                     RFC Key Words                    March 1997

5. MAY   This word, or the adjective "OPTIONAL", mean that an item is
   truly optional.  One vendor may choose to include the item because a
   particular marketplace requires it or because the vendor feels that
   it enhances the product while another vendor may omit the same item.
   An implementation which does not include a particular option MUST be
   prepared to interoperate with another implementation which does
   include the option, though perhaps with reduced functionality. In the
   same vein an implementation which does include a particular option
   MUST be prepared to interoperate with another implementation which
   does not include the option (except, of course, for the feature the
   option provides.)
Comment 17 Stefan Baebler 2003-05-08 07:16:44 PDT
Probably fighting whole May about "may" won't do any good :)

RFCs and common sense classify "may" as optional, not required to comply with 
the spec. However, this does not mean that demanding Mozilla users don't need this 
useful feature.

Dunno why James set this to depend on bug 46135. The only reason I see for 
doing so is if he envisioned the filtering file picker as a "helper 
application". Enlighten us, please :)
Comment 18 James Salsman 2003-05-08 09:43:48 PDT
I was mistaken about the dependency swap; putting it back.
Comment 19 James Salsman 2003-05-14 17:46:12 PDT
Memorializing some pertinent documentation, the result of closed bug 61408:
http://bugzilla.mozilla.org/attachment.cgi?id=112432&action=view
Comment 20 James Salsman 2003-05-15 10:21:35 PDT
http://dev.horde.org/api/horde/dev-doxygen/html/classMIME__Magic.html

I think it's based on Apache's mod_mime_magic which is probably
more up to date.  I think IANA has a database for this function.
Comment 21 James Salsman 2003-05-19 06:38:48 PDT
Even better:
  http://freshmeat.net/projects/file/

Version numbers above 4 have been divided into the main file(1) program and 
libmagic(3), a library that other programs can use directly to get file information without 
needing to fork and exec file(1). 
Comment 22 James Salsman 2003-05-19 16:16:29 PDT
Stefan, if you are uncomfortable with milestone 1.6a, then please either change to 1.7a 
or assign this to me.

Here are the pertinent Apache module docs:

  http://httpd.apache.org/docs-2.0/mod/mod_mime_magic.html
  http://httpd.apache.org/docs/mod/mod_mime_magic.html

I'm guessing that might be better but not as up to date as libmagic(3) and probably better 
than the MIME_Magic::filenameToMIME function from the Horde Application Framework.
Comment 23 James Salsman 2003-05-19 16:28:52 PDT
Are we all in agreement that if the filename is foo.gif but file(1) says it's a jpeg, then it's a 
jpeg?

More importantly:  do you really want to filter this in the file selection dialog box, or would 
it be better to implement a test just prior to form submission, causing any mismatches to 
result in a dialog such as, "The file (name here) selected for upload was supposed to 
match MIME type(s) (accept property list here), but it is MIME type (mime_magic output 
here). Continue with form submission? Yes or No"

If so, should the accept property (list) be shown as part of the file upload widget?
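
A rough Python sketch of the pre-submission check proposed above, for illustration only. It stands in for file(1)/mime_magic with a tiny hand-written signature table (an assumption; real magic databases are far larger), compares against exact types only (no wildcards), and stays silent when the content type cannot be determined. The names sniff_type and mismatch_warning are hypothetical.

```python
# Leading byte signatures for a few well-known image formats.
SIGNATURES = [
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
]


def sniff_type(head):
    """Return a MIME type guessed from the first bytes of a file, or None."""
    for magic, mime in SIGNATURES:
        if head.startswith(magic):
            return mime
    return None


def mismatch_warning(filename, head, accepted):
    """Return the warning text for a type mismatch, or None if there is none.

    The detected content wins over the filename extension, so a JPEG named
    foo.gif is reported as image/jpeg.
    """
    detected = sniff_type(head)
    accepted_lower = [a.strip().lower() for a in accepted]
    if detected is None or detected in accepted_lower:
        return None
    return (
        "The file %s selected for upload was supposed to match MIME type(s) "
        "%s, but it is MIME type %s. Continue with form submission?"
        % (filename, ", ".join(accepted), detected)
    )


# Usage: a JPEG saved under a .gif name, with a form that accepts only GIF.
print(mismatch_warning("foo.gif", b"\xff\xd8\xff\xe0\x00\x10JFIF", ["image/gif"]))
```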
Comment 24 John Keiser (jkeiser) 2003-05-19 16:46:17 PDT
It can be implemented differently on different platforms, but my suggestion is
to have a list of file extensions that are associated with the MIME type and
only show those.  The dialog on Windows and Linux can be like that of many other
implementations--it shows only those files which have those extensions.  It
would also be possible (but time-consuming, especially for remote drives) to get
the magic file type from each file to deal with the .gif case you speak of.

Even if you filter the file extensions in the dialog box, you could still check
the mime type of the file at submission because the user might have typed the
filename into the Browse box instead of selecting a particular file from the GUI.
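
To make the extension-list suggestion concrete, here is a small Python sketch (illustrative only, not picker code). The EXTENSIONS table is a hypothetical stand-in for whatever type-to-extension mapping the platform provides, and visible_files is a made-up name. An "All Files" override is included, and types with no known extensions fall back to showing everything rather than hiding every file.

```python
import os

# Hypothetical type-to-extension table; a real one would come from the OS
# or from the browser's own MIME database.
EXTENSIONS = {
    "image/jpeg": (".jpg", ".jpeg"),
    "image/gif": (".gif",),
    "text/plain": (".txt",),
}


def visible_files(filenames, accepted_types, show_all=False):
    """Return the filenames a picker would list for the given accepted types."""
    if show_all:
        return list(filenames)
    wanted = set()
    for mime in accepted_types:
        wanted.update(EXTENSIONS.get(mime.strip().lower(), ()))
    if not wanted:
        # Nothing known about these types: do not filter at all.
        return list(filenames)
    return [
        name for name in filenames
        if os.path.splitext(name)[1].lower() in wanted
    ]


# Usage: only the JPEG and GIF files remain unless the user asks for all files.
listing = ["a.JPG", "b.gif", "c.txt", "notes"]
print(visible_files(listing, ["image/jpeg", "image/gif"]))
print(visible_files(listing, ["image/jpeg", "image/gif"], show_all=True))
```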
Comment 25 Boris Zbarsky [:bz] (still a bit busy) 2003-05-19 18:36:17 PDT
> but my suggestion is to have a list of file extensions that are associated with
> the MIME type and only show those

So if I don't name my jpegs foo.jpg I get screwed?  Let's not, ok?  Especially
since some OSes (BeOS comes to mind) have MIME types for every file in the
filesystem metadata.
Comment 26 John Keiser (jkeiser) 2003-05-19 19:17:05 PDT
First, there is OS support on some OS's (notably Windows) for the *.jpg *.jpeg idea.
Second, for network drives determining the mime type can be very expensive, so
it may not be desirable to narrow down the file types for the user in that way
in all cases (though I'd be happy to see if there was a way to do that).
Third, for OS's that have quick support for determining that mime type, they can
sure as heck do that for the dialog.  No one says the file dialog has to be XP,
and it is not.
Fourth, many users generally save JPEGs as *.jpg and *.jpeg and are used to
being presented with dialogs that filter these files down.  If you are on such
an OS and you wish to name your JPEGs *.txt, you can type the name in directly
or (if we decide to allow it) change the file type filter to All Files (*.*).
Comment 27 James Salsman 2003-05-19 19:56:51 PDT
File extensions can't be trusted, especially on Micros~1 where 
foo_JPG.bak could easily be a gif.  So you have to allow file 
selection filter overrides.  Having said that, not all operating 
systems have filepicker dialogs with type filtering to begin with.

Plus, according to the HTML 3.2 spec, filename filtering was 
supposed to be different than type filtering.  The idea was that 
you could do something like accept="*.jpg;*.jpeg" when you wanted
to filter by filename patterns, xor put a (list of) mime type(s)
in when you wanted to filter by type.  That parsing abomination 
was mercifully yanked from HTML 4.

I'm pretty happy with iterating mime_magic() over the uploaded 
files, using the dialog from comment 23 when there's a conflict,
and showing the accept property (list), when there is one, in the 
widget under both the filename text box and the browse button.

Or, should the accept property (list) be rendered above the other 
two sub-widgets instead of below?

I just looked at a bunch of upload-enabled forms from a couple 
different mass-market websites (Yahoo and Meetup) and it seems 
safe to assume that form authors allow for a wide variety of 
shapes and sizes for the upload widget.  I remember that many 
of the popular browsers have rendered them as line-breakable 
in low margin width situations, so adding height to the widget 
should not be a problem.

The more I think about use-cases, the less I think filtering 
by filename patterns is likely to save much time, since all 
the major OS file pickers allow for multiple ways to sort 
files, and most allow sorting by filename extension.
Comment 28 Stefan Baebler 2003-05-19 22:04:49 PDT
Very much agree with John in both comment 24 and comment 26. James, don't make 
it too strict, please. The file picker is supposed to HELP the user. The server still has to 
check what was uploaded (you know... there are some other browsers that might 
not respect this _recommendation_).
- File selection would be much easier and quicker if there were fewer 
  files to pick from (only *.jpg,*.jpeg,*.pjpeg for accept=image/jpeg).
- The browser should not totally prevent the upload of any other file, whether 
  determined by filename, file(1) or some other MIME magic.
- If the user knowingly saved his JPEG as some.gif, let him pick it by disabling the 
  filter (switch to "Show all files").
- If some OS (like BeOS) discourages use of *.jpg (and variants) for image/jpeg,
  then it might be possible to literally filter files by MIME type (if it is
  part of the file metadata, as the filename is on all other OSes). A BeOS guru can
  jump in at any time (and will if filtering by name turns out to be inconvenient for 
  him).
Comment 29 James Salsman 2003-05-26 06:30:19 PDT
I am not opposed to file extension filtering AS AN OPTION, 
on the platform(s) that supports it without much extra work.  
Here is how Win32 does file extension filtering:

CFileDialog::CFileDialog( 

  BOOL,    // true  = File Open dialog box
           // false = File Save As dialog box

  LPCTSTR, // If an extension is not included in 
           // the Filename edit box, this extension
           // is automatically appended to the 
           // filename.   NULL = none
    // PLEASE NOTE THE USE OF THE SINGULAR EXTENSION.

  LPCTSTR, // The initial filename that appears in 
           // the filename edit box.  NULL = none

  OFN_ALLOWMULTISELECT,           // flag bitmask
    // allow multi-select iff bug 44464 resolved

  _T("Text files (.txt)|*.txt|")  // filter string
  _T("HTML files (.html; .htm)|*.html;*.htm|") 
    // another pair of "text|pattern(s)|"
  _T("All files|*.*|")            // IMPORTANT TO INCLUDE THIS
    // another pair of "text|pattern(s)|"
  _T("|"), // terminate concat of pairs with a bar

  CWnd*    // final argument is a pointer to the 
           // file dialog box object's parent or 
           // owner window, for notifications.
)
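
The filter-string layout annotated above (pairs of "text|pattern(s)|", an "All files" pair, then a closing bar) can be sketched in Python as follows. This only builds the string and is not Win32 code; build_filter_string is a hypothetical helper name.

```python
def build_filter_string(groups):
    """Build a "label|patterns|...||" filter string.

    groups is a list of (label, [patterns]) tuples. Patterns within a group
    are joined with ";". An "All files" entry is always appended so the user
    can override the filter, and the whole string ends with an extra bar.
    """
    parts = []
    for label, patterns in groups:
        parts.append("%s|%s|" % (label, ";".join(patterns)))
    parts.append("All files|*.*|")
    return "".join(parts) + "|"


# Usage: the same two groups as in the annotated constructor call.
print(build_filter_string([
    ("Text files (.txt)", ["*.txt"]),
    ("HTML files (.html; .htm)", ["*.html", "*.htm"]),
]))
```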

Now, I still think it would be polite -- perhaps too polite 
-- to warn and offer to defer form submission when the MIME 
type as reported by mime_magic() doesn't match something in 
the accept property list, as in comment 23.  However, that 
is optional, too.

Whether you do that or not, the first step is to find a 
MIME type matcher suitable for matching lists of types 
with arbitrary MIME type patterns (i.e., 
"image,text/* audio/x-specific;type=with;parameter=values")
against the key types in mimeTypes.rdf.  The match must 
not be case sensitive, (except for the "values" maybe, if 
you even need to match those.)  There is probably one of 
those in the browser code already.  I think HTTP uses MIME 
list matching.

Then mimeTypes.rdf has the mapping from type to extension(s).
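
A minimal Python sketch of the case-insensitive pattern match described above, offered as an illustration rather than as existing browser code. Two assumptions are made here: parameters after ";" are ignored entirely, and a bare major type such as "image" is treated like "image/*". The function names are hypothetical.

```python
def split_type(mime):
    """Return (major, minor) in lowercase, ignoring any ";parameter" suffix."""
    essence = mime.split(";", 1)[0].strip().lower()
    major, sep, minor = essence.partition("/")
    if not sep:
        # Assumption: a bare major type means "any subtype".
        minor = "*"
    return major, minor


def matches(pattern, mime):
    """True if a concrete MIME type matches a pattern that may use "*"."""
    p_major, p_minor = split_type(pattern)
    major, minor = split_type(mime)
    return p_major in ("*", major) and p_minor in ("*", minor)


def matches_any(patterns, mime):
    """True if the MIME type matches at least one pattern in the list."""
    return any(matches(pattern, mime) for pattern in patterns)


# Usage: check a key type against a list of patterns.
print(matches_any(["image", "text/*"], "text/html"))
```

The type-to-extension lookup would then run only over the key types that pass this match.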
Comment 30 James Salsman 2003-05-27 13:00:06 PDT
Asking Boris whether/how to use the http mime type list match.
Comment 31 Christian :Biesinger (don't email me, ping me on IRC) 2003-05-27 13:58:06 PDT
what "http mime type list"? I think you want this:
http://lxr.mozilla.org/seamonkey/source/netwerk/mime/public/nsIMIMEService.idl
Comment 32 James Salsman 2003-05-27 15:34:32 PDT
Thanks, Christian, but nsIMIMEService.idl doesn't (yet) have anything like
 GetAllExtensionsFromMimeTypes(in string listWithWildcards)

Here's what I asked Boris:

Would you please tell stefan-moz01@baebler.net and me where the MIME type 
match between a list of types with optional patterns -- the one that HTTP 
uses for processing "Accept*:" header(s) -- is located in the Mozilla code?

Do you think it can be called from forms or will it have to be copied?

I welcome anyone who would like to chime in on these questions, please.
Comment 33 James Salsman 2003-05-27 20:35:45 PDT
Boris is correct, as usual:

> It's not located anywhere.  Mozilla sends Accept* headers, but it 
> never receives them (being an HTTP client, not server), hence never 
> has to process them.

Stefan, would you like this back?  Do you think you can write the
case-insensitive MIME list-with-wildcards pattern match by 1.5alpha?  
I really hate to see this one slip.  Is there anything else you 
would need for it?
Comment 34 James Salsman 2003-05-31 01:02:39 PDT
To Stefan.  I've got to clear my calendar for at least the next several weeks.

Apache does have an Accept HTTP header parser, of course, which you may use for this:
  http://httpd.apache.org/docs/content-negotiation.html
  http://cvs.apache.org/viewcvs.cgi/apache-1.3/src/modules/standard/mod_negotiation.c

If you don't think you can do this by 1.7a, please swap the dependency with bug 46135.

Do you have all three platforms handy?  If not, I'd ask Boris; he knows this stuff like the 
back of his hand.  Although I can't promise that he has even looked once at the Mac file 
picker routines.
Comment 35 Stefan Baebler 2003-08-05 01:52:06 PDT
Unfortunately I won't be able to deal with it in the next months either. 
If someone can, please do so.
Comment 36 basic 2004-07-15 03:28:27 PDT
So what needs to be done here?
Comment 37 Boris Zbarsky [:bz] (still a bit busy) 2004-07-15 11:15:17 PDT
Everything.  None of this is implemented.
Comment 38 Christian Schmidt 2006-08-17 01:58:32 PDT
The Web Forms 2 specification also allows wild cards, e.g. accept="image/*":
http://whatwg.org/specs/web-forms/current-work/#upload
Comment 39 Daniel.S 2008-08-21 13:55:08 PDT
FWIW, Opera supports this including the wildcards of Web Forms 2.0

Would be nice for use cases like Flickr (image/* only shows pictures that can be uploaded).
Comment 40 Daniel.S 2010-05-20 13:47:12 PDT
*** Bug 542271 has been marked as a duplicate of this bug. ***
Comment 41 brunoais 2011-07-02 03:24:20 PDT
This does not work with many MIME types, including:
application/*
text/html
text/plain
text/xml
text/*
image/jpg
...
image/[insertAnyCharacterHereThatIsNotA *]

At least:
image/jpg
image/png
image/tiff
should be treated as:
image/*
instead of being ignored.
The same goes for audio, video, etc.

Could that be fixed?
Comment 42 Mounir Lamouri (:mounir) 2011-07-04 10:23:46 PDT
Only "image/*", "audio/*" and "video/*" are currently supported. We do not yet support MIME TYPES, see bug 565274. In addition, note that "<foo>/*" isn't allowed, except for image, audio and video.
Comment 43 brunoais 2011-07-04 11:05:59 PDT
(In reply to comment #42)
> Only "image/*", "audio/*" and "video/*" are currently supported.
Could you please make it so that image/<foo> is treated as image/* until you support all variants of image (same for audio and video)?

> We do not yet support MIME TYPES, see bug 565274. In addition, note that 
> "<foo>/*"
> isn't allowed, except for image, audio and video.
Could you please point to the RFC that indicates that? Maybe there's no strong enough reason for not allowing support of it.
Comment 44 Dave Webber 2011-07-24 03:26:45 PDT
(In reply to comment #42)
> Only "image/*", "audio/*" and "video/*" are currently supported. We do not
> yet support MIME TYPES, see bug 565274. In addition, note that "<foo>/*"
> isn't allowed, except for image, audio and video.

Wouldn't it be reasonable to add one more -- "text/*" -- to that list? I suspect it would be useful for certain web applications, such as bug trackers.
Comment 45 David Balažic 2012-04-12 06:05:30 PDT
parity-Chrome?

<input type="file" accept="application/vnd.ms-excel"/>
presents a dialog only allowing excel files in Chrome 18 on windows XP.
Comment 46 Mounir Lamouri (:mounir) 2012-04-12 07:18:23 PDT
Please, see activity in bug 565274.
Comment 47 Mounir Lamouri (:mounir) 2012-06-07 06:07:55 PDT
This is now done.
Comment 48 Michael DeVos 2015-01-14 14:08:06 PST
In case anybody is still following this after so long... isn't this supposed to also provide the ability to filter by file extension? For example: ".xls, .xlsx, .csv". Also, this currently only adds filters that the user can select. If they don't select the filter, the file browser will still accept other file types, but the specification says that's not the desired behavior when supporting the "accept" attribute. The "accept" attribute should define the current/default/only filter. Sorry if I'm misunderstanding something about this feature. It's kind of annoying, though, when IE and Chrome both support this and Firefox doesn't...
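
As a rough illustration of the token kinds discussed in this thread (file extensions, the three wildcard groups named in comment 42, and exact MIME types), here is a short Python sketch. It is not Firefox's parser; classify_accept and the "unknown" bucket are assumptions made for this example.

```python
WILDCARDS = ("image/*", "audio/*", "video/*")


def classify_accept(value):
    """Sort the tokens of an accept value into four buckets."""
    result = {"extensions": [], "wildcards": [], "types": [], "unknown": []}
    for raw in value.split(","):
        token = raw.strip().lower()
        if not token:
            continue
        if token.startswith(".") and len(token) > 1:
            result["extensions"].append(token)
        elif token in WILDCARDS:
            result["wildcards"].append(token)
        elif token.count("/") == 1 and "*" not in token and all(token.split("/")):
            result["types"].append(token)
        else:
            # Anything else, e.g. "text/*", is left for the caller to ignore.
            result["unknown"].append(token)
    return result


# Usage: a mixed list with extensions, a wildcard and an exact type.
print(classify_accept(".xls, .xlsx, .csv, image/*, application/vnd.ms-excel"))
```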
Comment 49 Arnaud Bienner 2015-01-15 01:22:13 PST
(In reply to Michael DeVos from comment #48)

This is was bug 826176 is about, and as you can see, it has been implemented and will be part of Firefox 37.
Comment 50 Arnaud Bienner 2015-01-15 01:22:41 PST
*This is what
