On 30 May 2011 at 05:03, Chris Young wrote:
On Sat, 14 May 2011 20:49:10 +0200, François Revol wrote:
> [const?] char *fetch_mime_by_ext(const char *filename);
Personally I hate detecting filetypes by filename/extension - it's
prone to errors/hacking. Look at all the problems on Windows:
renaming a file makes it think the filetype has changed, and you can
trick things like Outlook so easily just by renaming files.
+1, that's why we use MIME types on files in xattrs and have a MIME sniffer in Haiku,
just like BeOS did for 15 years. :p
I also realise that sometimes the filename is the only way to
determine a filetype. I think an "integrated" approach would be
better, one function which passes the filename and the data. I see no
reason why the data (or at least the first few bytes) couldn't be
available before you need to work out what type it is.
Sometimes the file is empty :p
Well, there are several cases:
- existing files (file:) that can carry a MIME xattr; the BeOS port already checks these in
fetch_filetype,
- virtual files (URLs not yet downloaded, or cached data that only exists in memory) which
the OS has not sniffed yet, so we have to sniff them ourselves.
Ideally we would:
- check whether the file is real and has MIME info,
- try to sniff it,
- fall back to extension matching.
The API doesn't have to provide separate functions indeed; it can be a single global one
where the data buffer is optional: if it is NULL, we just don't sniff.
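
A minimal sketch of that lookup order in C, under stated assumptions: guess_mime(),
mime_by_magic() and mime_by_ext() are hypothetical names, not existing NetSurf functions;
the os_mime argument stands in for whatever the platform already knows (for example a MIME
xattr read by the BeOS port); and the two tables are tiny illustrative samples, not a
complete list of types.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Illustrative magic-number table; a real sniffer would know many more types. */
struct magic { const char *bytes; size_t len; const char *mime; };
static const struct magic magic_table[] = {
    { "\x89PNG\r\n\x1a\n", 8, "image/png" },
    { "GIF8",              4, "image/gif" },
    { "\xff\xd8\xff",      3, "image/jpeg" },
    { "%PDF-",             5, "application/pdf" },
};

/* Illustrative extension table, used only as the last resort. */
struct ext_map { const char *ext; const char *mime; };
static const struct ext_map ext_table[] = {
    { "html", "text/html" }, { "htm", "text/html" }, { "css", "text/css" },
    { "png", "image/png" },  { "gif", "image/gif" }, { "jpg", "image/jpeg" },
    { "jpeg", "image/jpeg" }, { "txt", "text/plain" },
};

/* Case-insensitive comparison of two extensions. */
static int ext_equal(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;
}

/* Sniff the first bytes of the data; returns NULL when nothing matches. */
static const char *mime_by_magic(const unsigned char *data, size_t len)
{
    size_t i;
    for (i = 0; i < sizeof(magic_table) / sizeof(magic_table[0]); i++) {
        if (len >= magic_table[i].len &&
            memcmp(data, magic_table[i].bytes, magic_table[i].len) == 0)
            return magic_table[i].mime;
    }
    return NULL;
}

/* Match on the extension of the last path component; NULL when unknown. */
static const char *mime_by_ext(const char *filename)
{
    const char *base = strrchr(filename, '/');
    const char *dot;
    size_t i;

    base = (base != NULL) ? base + 1 : filename;
    dot = strrchr(base, '.');
    if (dot == NULL)
        return NULL;
    for (i = 0; i < sizeof(ext_table) / sizeof(ext_table[0]); i++) {
        if (ext_equal(dot + 1, ext_table[i].ext))
            return ext_table[i].mime;
    }
    return NULL;
}

/*
 * Single entry point (hypothetical). os_mime and data are both optional:
 * pass NULL for os_mime when the OS has no type, and NULL (or len 0) for
 * data to skip sniffing, e.g. for an empty file.
 */
const char *guess_mime(const char *os_mime, const char *filename,
                       const unsigned char *data, size_t len)
{
    const char *mime;

    if (os_mime != NULL && os_mime[0] != '\0')
        return os_mime;                      /* 1. trust the OS info */
    if (data != NULL && len > 0) {
        mime = mime_by_magic(data, len);     /* 2. sniff the content */
        if (mime != NULL)
            return mime;
    }
    if (filename != NULL) {
        mime = mime_by_ext(filename);        /* 3. fall back to extension */
        if (mime != NULL)
            return mime;
    }
    return "application/octet-stream";
}
```

With this shape a misnamed file is still typed by its content, while an empty file or a
NULL buffer drops straight through to the extension match.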
Of course there are some corner cases like directories (BeOS also has a MIME type for
those, but a different one from NS's), and common types like text/html which the OS might
sometimes differentiate a bit too much or identify differently (text/xml+html for
xhtml?).
For example, Haiku uses text/x-source-code and usually doesn't differentiate between
Python, Perl, bash, C++ or whatever source language.
François.