我正在寻找一种简单的方法来获得mime类型,其中文件扩展名是不正确的或没有给出,类似于这个问题只有在. net。


当前回答

在Urlmon.dll中,有一个名为FindMimeFromData的函数。

来自文档

MIME类型检测或“数据嗅探”是指从二进制数据中确定适当的MIME类型的过程。最终结果取决于服务器提供的MIME类型头、文件扩展名和/或数据本身的组合。通常,只有前256字节的数据是重要的。

因此,从文件中读取第一个(最多)256字节,并将其传递给FindMimeFromData。

其他回答

如果你正在使用。net Framework 4.5或更高版本,现在有一个MimeMapping.GetMimeMapping(filename)方法,它将返回一个字符串,其中包含传递文件名的正确Mime映射。注意,这里使用的是文件扩展名,而不是文件本身中的数据。

文档请见http://msdn.microsoft.com/en-us/library/system.web.mimemapping.getmimemapping

I think the right answer is a combination of Steve Morgan's and Serguei's answers. That's how Internet Explorer does it. The pinvoke call to FindMimeFromData works for only 26 hard-coded mime types. Also, it will give ambigous mime types (such as text/plain or application/octet-stream) even though there may exist a more specific, more appropriate mime type. If it fails to give a good mime type, you can go to the registry for a more specific mime type. The server registry could have more up-to-date mime types.

参考网址:http://msdn.microsoft.com/en-us/library/ms775147(VS.85).aspx

我发现运行这段代码有几个问题:

UInt32 mimetype;
FindMimeFromData(0, null, buffer, 256, null, 0, out mimetype, 0);

如果你尝试用x64/Win10运行它,你会得到

AccessViolationException "Attempted to read or write protected memory.
This is often an indication that other memory is corrupt"

多亏了这篇文章,PtrToStringUni在windows 10和@xanatos中无法工作

我修改了我的解决方案,在x64和。net Core 2.1下运行:

   [DllImport("urlmon.dll", CharSet = CharSet.Unicode, ExactSpelling = true, 
    SetLastError = false)]
    static extern int FindMimeFromData(IntPtr pBC,
        [MarshalAs(UnmanagedType.LPWStr)] string pwzUrl,
        [MarshalAs(UnmanagedType.LPArray, ArraySubType=UnmanagedType.I1, 
        SizeParamIndex=3)]
        byte[] pBuffer,
        int cbSize,
        [MarshalAs(UnmanagedType.LPWStr)] string pwzMimeProposed,
        int dwMimeFlags,
        out IntPtr ppwzMimeOut,
        int dwReserved);

   string getMimeFromFile(byte[] fileSource)
   {
            byte[] buffer = new byte[256];
            using (Stream stream = new MemoryStream(fileSource))
            {
                if (stream.Length >= 256)
                    stream.Read(buffer, 0, 256);
                else
                    stream.Read(buffer, 0, (int)stream.Length);
            }

            try
            {
                IntPtr mimeTypePtr;
                FindMimeFromData(IntPtr.Zero, null, buffer, buffer.Length,
                    null, 0, out mimeTypePtr, 0);

                string mime = Marshal.PtrToStringUni(mimeTypePtr);
                Marshal.FreeCoTaskMem(mimeTypePtr);
                return mime;
            }
            catch (Exception ex)
            {
                return "unknown/unknown";
            }
   }

谢谢

编辑:只需使用Mime侦探

我使用字节数组序列来确定给定文件的正确MIME类型。与只查看文件名的文件扩展名相比,这样做的好处是,如果用户要重命名一个文件以绕过某些文件类型的上传限制,则文件扩展名将无法捕捉到这一点。另一方面,通过字节数组获取文件签名将阻止这种有害行为的发生。

下面是c#中的一个例子:

public class MimeType
{
    private static readonly byte[] BMP = { 66, 77 };
    private static readonly byte[] DOC = { 208, 207, 17, 224, 161, 177, 26, 225 };
    private static readonly byte[] EXE_DLL = { 77, 90 };
    private static readonly byte[] GIF = { 71, 73, 70, 56 };
    private static readonly byte[] ICO = { 0, 0, 1, 0 };
    private static readonly byte[] JPG = { 255, 216, 255 };
    private static readonly byte[] MP3 = { 255, 251, 48 };
    private static readonly byte[] OGG = { 79, 103, 103, 83, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0 };
    private static readonly byte[] PDF = { 37, 80, 68, 70, 45, 49, 46 };
    private static readonly byte[] PNG = { 137, 80, 78, 71, 13, 10, 26, 10, 0, 0, 0, 13, 73, 72, 68, 82 };
    private static readonly byte[] RAR = { 82, 97, 114, 33, 26, 7, 0 };
    private static readonly byte[] SWF = { 70, 87, 83 };
    private static readonly byte[] TIFF = { 73, 73, 42, 0 };
    private static readonly byte[] TORRENT = { 100, 56, 58, 97, 110, 110, 111, 117, 110, 99, 101 };
    private static readonly byte[] TTF = { 0, 1, 0, 0, 0 };
    private static readonly byte[] WAV_AVI = { 82, 73, 70, 70 };
    private static readonly byte[] WMV_WMA = { 48, 38, 178, 117, 142, 102, 207, 17, 166, 217, 0, 170, 0, 98, 206, 108 };
    private static readonly byte[] ZIP_DOCX = { 80, 75, 3, 4 };

    public static string GetMimeType(byte[] file, string fileName)
    {

        string mime = "application/octet-stream"; //DEFAULT UNKNOWN MIME TYPE

        //Ensure that the filename isn't empty or null
        if (string.IsNullOrWhiteSpace(fileName))
        {
            return mime;
        }

        //Get the file extension
        string extension = Path.GetExtension(fileName) == null
                               ? string.Empty
                               : Path.GetExtension(fileName).ToUpper();

        //Get the MIME Type
        if (file.Take(2).SequenceEqual(BMP))
        {
            mime = "image/bmp";
        }
        else if (file.Take(8).SequenceEqual(DOC))
        {
            mime = "application/msword";
        }
        else if (file.Take(2).SequenceEqual(EXE_DLL))
        {
            mime = "application/x-msdownload"; //both use same mime type
        }
        else if (file.Take(4).SequenceEqual(GIF))
        {
            mime = "image/gif";
        }
        else if (file.Take(4).SequenceEqual(ICO))
        {
            mime = "image/x-icon";
        }
        else if (file.Take(3).SequenceEqual(JPG))
        {
            mime = "image/jpeg";
        }
        else if (file.Take(3).SequenceEqual(MP3))
        {
            mime = "audio/mpeg";
        }
        else if (file.Take(14).SequenceEqual(OGG))
        {
            if (extension == ".OGX")
            {
                mime = "application/ogg";
            }
            else if (extension == ".OGA")
            {
                mime = "audio/ogg";
            }
            else
            {
                mime = "video/ogg";
            }
        }
        else if (file.Take(7).SequenceEqual(PDF))
        {
            mime = "application/pdf";
        }
        else if (file.Take(16).SequenceEqual(PNG))
        {
            mime = "image/png";
        }
        else if (file.Take(7).SequenceEqual(RAR))
        {
            mime = "application/x-rar-compressed";
        }
        else if (file.Take(3).SequenceEqual(SWF))
        {
            mime = "application/x-shockwave-flash";
        }
        else if (file.Take(4).SequenceEqual(TIFF))
        {
            mime = "image/tiff";
        }
        else if (file.Take(11).SequenceEqual(TORRENT))
        {
            mime = "application/x-bittorrent";
        }
        else if (file.Take(5).SequenceEqual(TTF))
        {
            mime = "application/x-font-ttf";
        }
        else if (file.Take(4).SequenceEqual(WAV_AVI))
        {
            mime = extension == ".AVI" ? "video/x-msvideo" : "audio/x-wav";
        }
        else if (file.Take(16).SequenceEqual(WMV_WMA))
        {
            mime = extension == ".WMA" ? "audio/x-ms-wma" : "video/x-ms-wmv";
        }
        else if (file.Take(4).SequenceEqual(ZIP_DOCX))
        {
            mime = extension == ".DOCX" ? "application/vnd.openxmlformats-officedocument.wordprocessingml.document" : "application/x-zip-compressed";
        }

        return mime;
    }


}

注意,我对DOCX文件类型的处理是不同的,因为DOCX实际上只是一个ZIP文件。在这个场景中,只要验证了文件扩展名的顺序,我就会检查文件扩展名。对于某些人来说,这个示例还远远不够完整,但是您可以轻松地添加自己的示例。

如果您想添加更多MIME类型,您可以从这里获得许多不同文件类型的字节数组序列。另外,这里还有一个关于文件签名的很好的资源。

如果其他方法都失败了,我经常做的是逐步查看我正在寻找的特定类型的几个文件,并在文件的字节序列中寻找模式。最后,这仍然是基本的验证,不能用于确定文件类型的100%证明。

我发现这个很有用。 VB。NET开发人员:

    Public Shared Function GetFromFileName(ByVal fileName As String) As String
        Return GetFromExtension(Path.GetExtension(fileName).Remove(0, 1))
    End Function

    Public Shared Function GetFromExtension(ByVal extension As String) As String
        If extension.StartsWith("."c) Then
            extension = extension.Remove(0, 1)
        End If

        If MIMETypesDictionary.ContainsKey(extension) Then
            Return MIMETypesDictionary(extension)
        End If

        Return "unknown/unknown"
    End Function

    Private Shared ReadOnly MIMETypesDictionary As New Dictionary(Of String, String)() From { _
         {"ai", "application/postscript"}, _
         {"aif", "audio/x-aiff"}, _
         {"aifc", "audio/x-aiff"}, _
         {"aiff", "audio/x-aiff"}, _
         {"asc", "text/plain"}, _
         {"atom", "application/atom+xml"}, _
         {"au", "audio/basic"}, _
         {"avi", "video/x-msvideo"}, _
         {"bcpio", "application/x-bcpio"}, _
         {"bin", "application/octet-stream"}, _
         {"bmp", "image/bmp"}, _
         {"cdf", "application/x-netcdf"}, _
         {"cgm", "image/cgm"}, _
         {"class", "application/octet-stream"}, _
         {"cpio", "application/x-cpio"}, _
         {"cpt", "application/mac-compactpro"}, _
         {"csh", "application/x-csh"}, _
         {"css", "text/css"}, _
         {"dcr", "application/x-director"}, _
         {"dif", "video/x-dv"}, _
         {"dir", "application/x-director"}, _
         {"djv", "image/vnd.djvu"}, _
         {"djvu", "image/vnd.djvu"}, _
         {"dll", "application/octet-stream"}, _
         {"dmg", "application/octet-stream"}, _
         {"dms", "application/octet-stream"}, _
         {"doc", "application/msword"}, _
         {"dtd", "application/xml-dtd"}, _
         {"dv", "video/x-dv"}, _
         {"dvi", "application/x-dvi"}, _
         {"dxr", "application/x-director"}, _
         {"eps", "application/postscript"}, _
         {"etx", "text/x-setext"}, _
         {"exe", "application/octet-stream"}, _
         {"ez", "application/andrew-inset"}, _
         {"gif", "image/gif"}, _
         {"gram", "application/srgs"}, _
         {"grxml", "application/srgs+xml"}, _
         {"gtar", "application/x-gtar"}, _
         {"hdf", "application/x-hdf"}, _
         {"hqx", "application/mac-binhex40"}, _
         {"htm", "text/html"}, _
         {"html", "text/html"}, _
         {"ice", "x-conference/x-cooltalk"}, _
         {"ico", "image/x-icon"}, _
         {"ics", "text/calendar"}, _
         {"ief", "image/ief"}, _
         {"ifb", "text/calendar"}, _
         {"iges", "model/iges"}, _
         {"igs", "model/iges"}, _
         {"jnlp", "application/x-java-jnlp-file"}, _
         {"jp2", "image/jp2"}, _
         {"jpe", "image/jpeg"}, _
         {"jpeg", "image/jpeg"}, _
         {"jpg", "image/jpeg"}, _
         {"js", "application/x-javascript"}, _
         {"kar", "audio/midi"}, _
         {"latex", "application/x-latex"}, _
         {"lha", "application/octet-stream"}, _
         {"lzh", "application/octet-stream"}, _
         {"m3u", "audio/x-mpegurl"}, _
         {"m4a", "audio/mp4a-latm"}, _
         {"m4b", "audio/mp4a-latm"}, _
         {"m4p", "audio/mp4a-latm"}, _
         {"m4u", "video/vnd.mpegurl"}, _
         {"m4v", "video/x-m4v"}, _
         {"mac", "image/x-macpaint"}, _
         {"man", "application/x-troff-man"}, _
         {"mathml", "application/mathml+xml"}, _
         {"me", "application/x-troff-me"}, _
         {"mesh", "model/mesh"}, _
         {"mid", "audio/midi"}, _
         {"midi", "audio/midi"}, _
         {"mif", "application/vnd.mif"}, _
         {"mov", "video/quicktime"}, _
         {"movie", "video/x-sgi-movie"}, _
         {"mp2", "audio/mpeg"}, _
         {"mp3", "audio/mpeg"}, _
         {"mp4", "video/mp4"}, _
         {"mpe", "video/mpeg"}, _
         {"mpeg", "video/mpeg"}, _
         {"mpg", "video/mpeg"}, _
         {"mpga", "audio/mpeg"}, _
         {"ms", "application/x-troff-ms"}, _
         {"msh", "model/mesh"}, _
         {"mxu", "video/vnd.mpegurl"}, _
         {"nc", "application/x-netcdf"}, _
         {"oda", "application/oda"}, _
         {"ogg", "application/ogg"}, _
         {"pbm", "image/x-portable-bitmap"}, _
         {"pct", "image/pict"}, _
         {"pdb", "chemical/x-pdb"}, _
         {"pdf", "application/pdf"}, _
         {"pgm", "image/x-portable-graymap"}, _
         {"pgn", "application/x-chess-pgn"}, _
         {"pic", "image/pict"}, _
         {"pict", "image/pict"}, _
         {"png", "image/png"}, _
         {"pnm", "image/x-portable-anymap"}, _
         {"pnt", "image/x-macpaint"}, _
         {"pntg", "image/x-macpaint"}, _
         {"ppm", "image/x-portable-pixmap"}, _
         {"ppt", "application/vnd.ms-powerpoint"}, _
         {"ps", "application/postscript"}, _
         {"qt", "video/quicktime"}, _
         {"qti", "image/x-quicktime"}, _
         {"qtif", "image/x-quicktime"}, _
         {"ra", "audio/x-pn-realaudio"}, _
         {"ram", "audio/x-pn-realaudio"}, _
         {"ras", "image/x-cmu-raster"}, _
         {"rdf", "application/rdf+xml"}, _
         {"rgb", "image/x-rgb"}, _
         {"rm", "application/vnd.rn-realmedia"}, _
         {"roff", "application/x-troff"}, _
         {"rtf", "text/rtf"}, _
         {"rtx", "text/richtext"}, _
         {"sgm", "text/sgml"}, _
         {"sgml", "text/sgml"}, _
         {"sh", "application/x-sh"}, _
         {"shar", "application/x-shar"}, _
         {"silo", "model/mesh"}, _
         {"sit", "application/x-stuffit"}, _
         {"skd", "application/x-koan"}, _
         {"skm", "application/x-koan"}, _
         {"skp", "application/x-koan"}, _
         {"skt", "application/x-koan"}, _
         {"smi", "application/smil"}, _
         {"smil", "application/smil"}, _
         {"snd", "audio/basic"}, _
         {"so", "application/octet-stream"}, _
         {"spl", "application/x-futuresplash"}, _
         {"src", "application/x-wais-source"}, _
         {"sv4cpio", "application/x-sv4cpio"}, _
         {"sv4crc", "application/x-sv4crc"}, _
         {"svg", "image/svg+xml"}, _
         {"swf", "application/x-shockwave-flash"}, _
         {"t", "application/x-troff"}, _
         {"tar", "application/x-tar"}, _
         {"tcl", "application/x-tcl"}, _
         {"tex", "application/x-tex"}, _
         {"texi", "application/x-texinfo"}, _
         {"texinfo", "application/x-texinfo"}, _
         {"tif", "image/tiff"}, _
         {"tiff", "image/tiff"}, _
         {"tr", "application/x-troff"}, _
         {"tsv", "text/tab-separated-values"}, _
         {"txt", "text/plain"}, _
         {"ustar", "application/x-ustar"}, _
         {"vcd", "application/x-cdlink"}, _
         {"vrml", "model/vrml"}, _
         {"vxml", "application/voicexml+xml"}, _
         {"wav", "audio/x-wav"}, _
         {"wbmp", "image/vnd.wap.wbmp"}, _
         {"wbmxl", "application/vnd.wap.wbxml"}, _
         {"wml", "text/vnd.wap.wml"}, _
         {"wmlc", "application/vnd.wap.wmlc"}, _
         {"wmls", "text/vnd.wap.wmlscript"}, _
         {"wmlsc", "application/vnd.wap.wmlscriptc"}, _
         {"wrl", "model/vrml"}, _
         {"xbm", "image/x-xbitmap"}, _
         {"xht", "application/xhtml+xml"}, _
         {"xhtml", "application/xhtml+xml"}, _
         {"xls", "application/vnd.ms-excel"}, _
         {"xml", "application/xml"}, _
         {"xpm", "image/x-xpixmap"}, _
         {"xsl", "application/xml"}, _
         {"xslt", "application/xslt+xml"}, _
         {"xul", "application/vnd.mozilla.xul+xml"}, _
         {"xwd", "image/x-xwindowdump"}, _
         {"xyz", "chemical/x-xyz"}, _
         {"zip", "application/zip"} _
        }