Using .NET, how can you find the mime type of a file based on the file signature not the extension

【Posted】: 2017-11-19 09:01:28 【Question】:

I am looking for a simple way to get the mime type of a file when the file extension is incorrect or missing, something similar to this question, only in .NET.

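【Discussion】:

This sounds similar to this question.

When the requirement explicitly says not to use the extension, I wish I could delete all the "fake answers" that still use the file extension!

This may be an old question, but the problem remains. I would downvote every answer here, because they only check for Windows executables by content; what about Linux or iOS executables, or other dangerous files?

@PhillipH Write an answer for those.

【Answer 1】: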

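In the end I did use urlmon.dll. I thought there would be an easier way, but this works. I include the code to help anyone else and to let me find it again if I need it.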

using System;
using System.IO;
using System.Runtime.InteropServices;

...

    // urlmon.dll is Windows-only. Pointer-sized parameters are IntPtr (not UInt32)
    // so the call is also valid in 64-bit processes; the strings are wide (LPWStr).
    [DllImport("urlmon.dll", CharSet = CharSet.Unicode, ExactSpelling = true)]
    private static extern int FindMimeFromData(
        IntPtr pBC,
        [MarshalAs(UnmanagedType.LPWStr)] string pwzUrl,
        [MarshalAs(UnmanagedType.LPArray)] byte[] pBuffer,
        int cbSize,
        [MarshalAs(UnmanagedType.LPWStr)] string pwzMimeProposed,
        int dwMimeFlags,
        out IntPtr ppwzMimeOut,
        int dwReserved
    );

    public static string getMimeFromFile(string filename)
    {
        if (!File.Exists(filename))
            throw new FileNotFoundException(filename + " not found");

        byte[] buffer = new byte[256];
        int bytesRead;
        using (FileStream fs = new FileStream(filename, FileMode.Open, FileAccess.Read))
        {
            // Only the first 256 bytes (fewer for short files) are sniffed.
            bytesRead = fs.Read(buffer, 0, buffer.Length);
        }
        try
        {
            IntPtr mimeTypePtr;
            FindMimeFromData(IntPtr.Zero, null, buffer, bytesRead, null, 0, out mimeTypePtr, 0);
            // PtrToStringUni returns null for a null pointer (no type found).
            string mime = Marshal.PtrToStringUni(mimeTypePtr);
            Marshal.FreeCoTaskMem(mimeTypePtr);
            return mime ?? "unknown/unknown";
        }
        catch (Exception)
        {
            return "unknown/unknown";
        }
    }

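【Discussion】:

Probably whatever is mapped in the registry.

@flq, @mkmurray msdn.microsoft.com/en-us/library/…

I ran into problems hosting this code in IIS on Windows 8. Using the implementation on pinvoke.net (which differs slightly) solved it: pinvoke.net/default.aspx/urlmon.findmimefromdata

I have been testing this code with IIS 7 and it has not worked for me. I have a CSV file that I am testing with. I have been changing the CSV's extension (to .png, .jpeg, etc.) and the mimetype changes with the extension (image/png, image/jpeg). I could be wrong, but my understanding was that Urlmon.dll determines the mimetype from the file's metadata, not just the extension.

Does not work for 64-bit applications, see here: ***.com/questions/18358548/…

【Answer 2】: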

I found a hard-coded solution; I hope it helps someone:

using System.Collections.Generic;
using System.IO;

public static class MIMEAssistant
{
  // Maps a lower-case extension (without the dot) to a MIME type.
  private static readonly Dictionary<string, string> MIMETypesDictionary = new Dictionary<string, string>
  {
    {"ai", "application/postscript"},
    {"aif", "audio/x-aiff"},
    {"aifc", "audio/x-aiff"},
    {"aiff", "audio/x-aiff"},
    {"asc", "text/plain"},
    {"atom", "application/atom+xml"},
    {"au", "audio/basic"},
    {"avi", "video/x-msvideo"},
    {"bcpio", "application/x-bcpio"},
    {"bin", "application/octet-stream"},
    {"bmp", "image/bmp"},
    {"cdf", "application/x-netcdf"},
    {"cgm", "image/cgm"},
    {"class", "application/octet-stream"},
    {"cpio", "application/x-cpio"},
    {"cpt", "application/mac-compactpro"},
    {"csh", "application/x-csh"},
    {"css", "text/css"},
    {"dcr", "application/x-director"},
    {"dif", "video/x-dv"},
    {"dir", "application/x-director"},
    {"djv", "image/vnd.djvu"},
    {"djvu", "image/vnd.djvu"},
    {"dll", "application/octet-stream"},
    {"dmg", "application/octet-stream"},
    {"dms", "application/octet-stream"},
    {"doc", "application/msword"},
    {"docx", "application/vnd.openxmlformats-officedocument.wordprocessingml.document"},
    {"dotx", "application/vnd.openxmlformats-officedocument.wordprocessingml.template"},
    {"docm", "application/vnd.ms-word.document.macroEnabled.12"},
    {"dotm", "application/vnd.ms-word.template.macroEnabled.12"},
    {"dtd", "application/xml-dtd"},
    {"dv", "video/x-dv"},
    {"dvi", "application/x-dvi"},
    {"dxr", "application/x-director"},
    {"eps", "application/postscript"},
    {"etx", "text/x-setext"},
    {"exe", "application/octet-stream"},
    {"ez", "application/andrew-inset"},
    {"gif", "image/gif"},
    {"gram", "application/srgs"},
    {"grxml", "application/srgs+xml"},
    {"gtar", "application/x-gtar"},
    {"hdf", "application/x-hdf"},
    {"hqx", "application/mac-binhex40"},
    {"htm", "text/html"},
    {"html", "text/html"},
    {"ice", "x-conference/x-cooltalk"},
    {"ico", "image/x-icon"},
    {"ics", "text/calendar"},
    {"ief", "image/ief"},
    {"ifb", "text/calendar"},
    {"iges", "model/iges"},
    {"igs", "model/iges"},
    {"jnlp", "application/x-java-jnlp-file"},
    {"jp2", "image/jp2"},
    {"jpe", "image/jpeg"},
    {"jpeg", "image/jpeg"},
    {"jpg", "image/jpeg"},
    {"js", "application/x-javascript"},
    {"kar", "audio/midi"},
    {"latex", "application/x-latex"},
    {"lha", "application/octet-stream"},
    {"lzh", "application/octet-stream"},
    {"m3u", "audio/x-mpegurl"},
    {"m4a", "audio/mp4a-latm"},
    {"m4b", "audio/mp4a-latm"},
    {"m4p", "audio/mp4a-latm"},
    {"m4u", "video/vnd.mpegurl"},
    {"m4v", "video/x-m4v"},
    {"mac", "image/x-macpaint"},
    {"man", "application/x-troff-man"},
    {"mathml", "application/mathml+xml"},
    {"me", "application/x-troff-me"},
    {"mesh", "model/mesh"},
    {"mid", "audio/midi"},
    {"midi", "audio/midi"},
    {"mif", "application/vnd.mif"},
    {"mov", "video/quicktime"},
    {"movie", "video/x-sgi-movie"},
    {"mp2", "audio/mpeg"},
    {"mp3", "audio/mpeg"},
    {"mp4", "video/mp4"},
    {"mpe", "video/mpeg"},
    {"mpeg", "video/mpeg"},
    {"mpg", "video/mpeg"},
    {"mpga", "audio/mpeg"},
    {"ms", "application/x-troff-ms"},
    {"msh", "model/mesh"},
    {"mxu", "video/vnd.mpegurl"},
    {"nc", "application/x-netcdf"},
    {"oda", "application/oda"},
    {"ogg", "application/ogg"},
    {"pbm", "image/x-portable-bitmap"},
    {"pct", "image/pict"},
    {"pdb", "chemical/x-pdb"},
    {"pdf", "application/pdf"},
    {"pgm", "image/x-portable-graymap"},
    {"pgn", "application/x-chess-pgn"},
    {"pic", "image/pict"},
    {"pict", "image/pict"},
    {"png", "image/png"},
    {"pnm", "image/x-portable-anymap"},
    {"pnt", "image/x-macpaint"},
    {"pntg", "image/x-macpaint"},
    {"ppm", "image/x-portable-pixmap"},
    {"ppt", "application/vnd.ms-powerpoint"},
    {"pptx", "application/vnd.openxmlformats-officedocument.presentationml.presentation"},
    {"potx", "application/vnd.openxmlformats-officedocument.presentationml.template"},
    {"ppsx", "application/vnd.openxmlformats-officedocument.presentationml.slideshow"},
    {"ppam", "application/vnd.ms-powerpoint.addin.macroEnabled.12"},
    {"pptm", "application/vnd.ms-powerpoint.presentation.macroEnabled.12"},
    {"potm", "application/vnd.ms-powerpoint.template.macroEnabled.12"},
    {"ppsm", "application/vnd.ms-powerpoint.slideshow.macroEnabled.12"},
    {"ps", "application/postscript"},
    {"qt", "video/quicktime"},
    {"qti", "image/x-quicktime"},
    {"qtif", "image/x-quicktime"},
    {"ra", "audio/x-pn-realaudio"},
    {"ram", "audio/x-pn-realaudio"},
    {"ras", "image/x-cmu-raster"},
    {"rdf", "application/rdf+xml"},
    {"rgb", "image/x-rgb"},
    {"rm", "application/vnd.rn-realmedia"},
    {"roff", "application/x-troff"},
    {"rtf", "text/rtf"},
    {"rtx", "text/richtext"},
    {"sgm", "text/sgml"},
    {"sgml", "text/sgml"},
    {"sh", "application/x-sh"},
    {"shar", "application/x-shar"},
    {"silo", "model/mesh"},
    {"sit", "application/x-stuffit"},
    {"skd", "application/x-koan"},
    {"skm", "application/x-koan"},
    {"skp", "application/x-koan"},
    {"skt", "application/x-koan"},
    {"smi", "application/smil"},
    {"smil", "application/smil"},
    {"snd", "audio/basic"},
    {"so", "application/octet-stream"},
    {"spl", "application/x-futuresplash"},
    {"src", "application/x-wais-source"},
    {"sv4cpio", "application/x-sv4cpio"},
    {"sv4crc", "application/x-sv4crc"},
    {"svg", "image/svg+xml"},
    {"swf", "application/x-shockwave-flash"},
    {"t", "application/x-troff"},
    {"tar", "application/x-tar"},
    {"tcl", "application/x-tcl"},
    {"tex", "application/x-tex"},
    {"texi", "application/x-texinfo"},
    {"texinfo", "application/x-texinfo"},
    {"tif", "image/tiff"},
    {"tiff", "image/tiff"},
    {"tr", "application/x-troff"},
    {"tsv", "text/tab-separated-values"},
    {"txt", "text/plain"},
    {"ustar", "application/x-ustar"},
    {"vcd", "application/x-cdlink"},
    {"vrml", "model/vrml"},
    {"vxml", "application/voicexml+xml"},
    {"wav", "audio/x-wav"},
    {"wbmp", "image/vnd.wap.wbmp"},
    {"wbxml", "application/vnd.wap.wbxml"},
    {"wml", "text/vnd.wap.wml"},
    {"wmlc", "application/vnd.wap.wmlc"},
    {"wmls", "text/vnd.wap.wmlscript"},
    {"wmlsc", "application/vnd.wap.wmlscriptc"},
    {"wrl", "model/vrml"},
    {"xbm", "image/x-xbitmap"},
    {"xht", "application/xhtml+xml"},
    {"xhtml", "application/xhtml+xml"},
    {"xls", "application/vnd.ms-excel"},
    {"xml", "application/xml"},
    {"xpm", "image/x-xpixmap"},
    {"xsl", "application/xml"},
    {"xlsx", "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"},
    {"xltx", "application/vnd.openxmlformats-officedocument.spreadsheetml.template"},
    {"xlsm", "application/vnd.ms-excel.sheet.macroEnabled.12"},
    {"xltm", "application/vnd.ms-excel.template.macroEnabled.12"},
    {"xlam", "application/vnd.ms-excel.addin.macroEnabled.12"},
    {"xlsb", "application/vnd.ms-excel.sheet.binary.macroEnabled.12"},
    {"xslt", "application/xslt+xml"},
    {"xul", "application/vnd.mozilla.xul+xml"},
    {"xwd", "image/x-xwindowdump"},
    {"xyz", "chemical/x-xyz"},
    {"zip", "application/zip"}
  };

  public static string GetMIMEType(string fileName)
  {
    //get file extension
    string extension = Path.GetExtension(fileName).ToLowerInvariant();

    if (extension.Length > 0 && 
        MIMETypesDictionary.ContainsKey(extension.Remove(0, 1)))
    {
      return MIMETypesDictionary[extension.Remove(0, 1)];
    }
    return "unknown/unknown";
  }
}
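A variation on the lookup above, as a minimal sketch rather than a drop-in replacement: building the dictionary with StringComparer.OrdinalIgnoreCase makes the key lookup case-insensitive, so the extension no longer has to be lower-cased first. The class name MimeLookupSketch, the three-entry table, and the "application/octet-stream" fallback are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class MimeLookupSketch
{
    // Hypothetical, shortened table; the comparer treats "PNG" and "png" as the same key.
    private static readonly Dictionary<string, string> Types =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "png", "image/png" },
            { "jpg", "image/jpeg" },
            { "pdf", "application/pdf" }
        };

    public static string GetMimeType(string fileName)
    {
        // Path.GetExtension returns the extension with its dot (".png"),
        // an empty string if there is none, or null for a null path.
        string extension = Path.GetExtension(fileName);
        if (string.IsNullOrEmpty(extension))
            return "application/octet-stream";

        string mime;
        return Types.TryGetValue(extension.TrimStart('.'), out mime)
            ? mime
            : "application/octet-stream";
    }
}
```

TryGetValue also avoids the double lookup of ContainsKey followed by the indexer. Like the original, this still trusts the file name and says nothing about the file's content.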
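【Discussion】:

This is based on the file name. It may be useful to someone, though not to the OP, who wanted it done by file content.

A subset of this list is also handy for mapping WebImage.ImageFormat back to a mime type. Thanks!

Depending on your goal, you may want to return "application/octet-stream" instead of "unknown/unknown".

Since my edit was rejected, I will post it here: the extension must be all lower case, otherwise it will not be found in the dictionary.

@JalalAldeenSaa'd - IMHO, a better fix is to pass StringComparer.OrdinalIgnoreCase to the dictionary constructor. Ordinal comparison is faster than invariant comparison, and you get rid of .ToLower() and its variants.

【Answer 3】: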

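Edit: Just use Mime Detective

I use a byte array sequence to determine the correct MIME type of a given file. The advantage over looking only at the extension in the file name is that if a user renames a file to get around file type upload restrictions, an extension check will not catch it. Reading the file signature from the byte array, on the other hand, stops this kind of mischief.

Here is a C# example: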

using System.IO;
using System.Linq;

public class MimeType
{
    private static readonly byte[] BMP = { 66, 77 };
    private static readonly byte[] DOC = { 208, 207, 17, 224, 161, 177, 26, 225 };
    private static readonly byte[] EXE_DLL = { 77, 90 };
    private static readonly byte[] GIF = { 71, 73, 70, 56 };
    private static readonly byte[] ICO = { 0, 0, 1, 0 };
    private static readonly byte[] JPG = { 255, 216, 255 };
    private static readonly byte[] MP3 = { 255, 251, 48 };
    private static readonly byte[] OGG = { 79, 103, 103, 83, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0 };
    private static readonly byte[] PDF = { 37, 80, 68, 70, 45, 49, 46 };
    private static readonly byte[] PNG = { 137, 80, 78, 71, 13, 10, 26, 10, 0, 0, 0, 13, 73, 72, 68, 82 };
    private static readonly byte[] RAR = { 82, 97, 114, 33, 26, 7, 0 };
    private static readonly byte[] SWF = { 70, 87, 83 };
    private static readonly byte[] TIFF = { 73, 73, 42, 0 };
    private static readonly byte[] TORRENT = { 100, 56, 58, 97, 110, 110, 111, 117, 110, 99, 101 };
    private static readonly byte[] TTF = { 0, 1, 0, 0, 0 };
    private static readonly byte[] WAV_AVI = { 82, 73, 70, 70 };
    private static readonly byte[] WMV_WMA = { 48, 38, 178, 117, 142, 102, 207, 17, 166, 217, 0, 170, 0, 98, 206, 108 };
    private static readonly byte[] ZIP_DOCX = { 80, 75, 3, 4 };

    public static string GetMimeType(byte[] file, string fileName)
    {
        string mime = "application/octet-stream"; //DEFAULT UNKNOWN MIME TYPE

        //Ensure that the filename isn't empty or null
        if (string.IsNullOrWhiteSpace(fileName))
        {
            return mime;
        }

        //Get the file extension
        string extension = Path.GetExtension(fileName) == null
                               ? string.Empty
                               : Path.GetExtension(fileName).ToUpper();

        //Get the MIME Type
        if (file.Take(2).SequenceEqual(BMP))
        {
            mime = "image/bmp";
        }
        else if (file.Take(8).SequenceEqual(DOC))
        {
            mime = "application/msword";
        }
        else if (file.Take(2).SequenceEqual(EXE_DLL))
        {
            mime = "application/x-msdownload"; //both use same mime type
        }
        else if (file.Take(4).SequenceEqual(GIF))
        {
            mime = "image/gif";
        }
        else if (file.Take(4).SequenceEqual(ICO))
        {
            mime = "image/x-icon";
        }
        else if (file.Take(3).SequenceEqual(JPG))
        {
            mime = "image/jpeg";
        }
        else if (file.Take(3).SequenceEqual(MP3))
        {
            mime = "audio/mpeg";
        }
        else if (file.Take(14).SequenceEqual(OGG))
        {
            if (extension == ".OGX")
            {
                mime = "application/ogg";
            }
            else if (extension == ".OGA")
            {
                mime = "audio/ogg";
            }
            else
            {
                mime = "video/ogg";
            }
        }
        else if (file.Take(7).SequenceEqual(PDF))
        {
            mime = "application/pdf";
        }
        else if (file.Take(16).SequenceEqual(PNG))
        {
            mime = "image/png";
        }
        else if (file.Take(7).SequenceEqual(RAR))
        {
            mime = "application/x-rar-compressed";
        }
        else if (file.Take(3).SequenceEqual(SWF))
        {
            mime = "application/x-shockwave-flash";
        }
        else if (file.Take(4).SequenceEqual(TIFF))
        {
            mime = "image/tiff";
        }
        else if (file.Take(11).SequenceEqual(TORRENT))
        {
            mime = "application/x-bittorrent";
        }
        else if (file.Take(5).SequenceEqual(TTF))
        {
            mime = "application/x-font-ttf";
        }
        else if (file.Take(4).SequenceEqual(WAV_AVI))
        {
            mime = extension == ".AVI" ? "video/x-msvideo" : "audio/x-wav";
        }
        else if (file.Take(16).SequenceEqual(WMV_WMA))
        {
            mime = extension == ".WMA" ? "audio/x-ms-wma" : "video/x-ms-wmv";
        }
        else if (file.Take(4).SequenceEqual(ZIP_DOCX))
        {
            mime = extension == ".DOCX" ? "application/vnd.openxmlformats-officedocument.wordprocessingml.document" : "application/x-zip-compressed";
        }

        return mime;
    }
}

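Note that I handle the DOCX file type differently, because a DOCX is really just a ZIP file. In that case, once I have verified that the file has the ZIP sequence, I simply check the file extension. This example is far from complete for some purposes, but you can easily add your own signatures.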
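Instead of falling back to the extension for ZIP-based formats, one option is to open the archive and look at its entries. The following is only a sketch: the class and method names are made up for illustration, and it assumes the conventional main part names (word/document.xml, xl/workbook.xml, ppt/presentation.xml), which typical Office files use but which the package format does not strictly guarantee.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class OpenXmlSniffer
{
    // Hypothetical helper: guesses the type of ZIP-based content by its entry names.
    public static string GuessZipBasedMimeType(Stream zipStream)
    {
        try
        {
            // leaveOpen: true so the caller keeps ownership of the stream.
            using (var archive = new ZipArchive(zipStream, ZipArchiveMode.Read, leaveOpen: true))
            {
                if (archive.GetEntry("word/document.xml") != null)
                    return "application/vnd.openxmlformats-officedocument.wordprocessingml.document";
                if (archive.GetEntry("xl/workbook.xml") != null)
                    return "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet";
                if (archive.GetEntry("ppt/presentation.xml") != null)
                    return "application/vnd.openxmlformats-officedocument.presentationml.presentation";

                // A valid ZIP archive without any of the assumed Office parts.
                return "application/x-zip-compressed";
            }
        }
        catch (InvalidDataException)
        {
            // The stream could not be read as a ZIP archive at all.
            return "application/octet-stream";
        }
    }
}
```

This needs the whole archive (or at least its central directory at the end), so it cannot work from only the first few bytes, and a plain ZIP that merely contains a folder named word with a document.xml would still be misclassified.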

If you want to add more MIME types, you can get the byte array sequences of many different file types from here. Also, here is another good resource on file signatures.
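When the list of signatures grows, a table of signature records can replace the if-else chain, so that adding a type means adding one line. This is a sketch under assumptions: FileSignature and SignatureTable are illustrative names, only five formats are listed, and the PNG entry uses the 8-byte PNG signature rather than the 16-byte sequence from the example above.

```csharp
using System;
using System.Collections.Generic;

public sealed class FileSignature
{
    public FileSignature(string mimeType, params byte[] magic)
    {
        MimeType = mimeType;
        Magic = magic;
    }

    public string MimeType { get; private set; }
    public byte[] Magic { get; private set; }

    // True if data starts with the magic bytes of this signature.
    public bool Matches(byte[] data)
    {
        if (data == null || data.Length < Magic.Length)
            return false;
        for (int i = 0; i < Magic.Length; i++)
        {
            if (data[i] != Magic[i])
                return false;
        }
        return true;
    }
}

public static class SignatureTable
{
    // Checked in order; longer signatures are listed first so that a short
    // prefix cannot shadow a longer, more specific one.
    private static readonly List<FileSignature> Signatures = new List<FileSignature>
    {
        new FileSignature("image/png", 137, 80, 78, 71, 13, 10, 26, 10),
        new FileSignature("application/pdf", 37, 80, 68, 70, 45, 49, 46),
        new FileSignature("image/gif", 71, 73, 70, 56),
        new FileSignature("image/jpeg", 255, 216, 255),
        new FileSignature("image/bmp", 66, 77)
    };

    public static string Detect(byte[] data)
    {
        foreach (FileSignature signature in Signatures)
        {
            if (signature.Matches(data))
                return signature.MimeType;
        }
        return "application/octet-stream";
    }
}
```

For example, SignatureTable.Detect(new byte[] { 71, 73, 70, 56, 57, 97 }) matches the GIF entry. Short signatures such as the two-byte BMP prefix remain prone to false positives, as with the original if-else version.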

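If all else fails, what I often do is step through several files of the particular type I am looking for and look for a pattern in their byte sequences. In the end, this is still basic verification and cannot be taken as 100% proof of a file's type.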
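Looking for a shared pattern across sample files can be partly automated. The sketch below (all names are illustrative assumptions) computes the longest run of leading bytes that every sample has in common, which is a candidate signature to inspect by hand, not a verified one.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class SignatureExplorer
{
    // Returns the longest run of leading bytes shared by all samples.
    public static byte[] CommonPrefix(IList<byte[]> samples)
    {
        if (samples == null || samples.Count == 0)
            return new byte[0];

        int length = samples[0].Length;
        foreach (byte[] sample in samples)
        {
            // Count how many leading bytes this sample shares with the first one,
            // never looking past the prefix length found so far.
            int i = 0;
            while (i < length && i < sample.Length && sample[i] == samples[0][i])
                i++;
            length = i;
        }

        byte[] prefix = new byte[length];
        Array.Copy(samples[0], prefix, length);
        return prefix;
    }

    // Reads at most maxBytes from the start of each file and compares them.
    public static byte[] CommonPrefixOfFiles(IEnumerable<string> paths, int maxBytes)
    {
        var headers = new List<byte[]>();
        foreach (string path in paths)
        {
            using (FileStream fs = File.OpenRead(path))
            {
                byte[] buffer = new byte[maxBytes];
                int read = fs.Read(buffer, 0, buffer.Length);
                Array.Resize(ref buffer, read);
                headers.Add(buffer);
            }
        }
        return CommonPrefix(headers);
    }
}
```

BitConverter.ToString on the result gives a readable hex form. The more varied the sample files, the less likely the common prefix includes bytes that only happen to coincide.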

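【Discussion】:

Thanks @ROFLwTIME - I have enhanced this a bit for the case where we have a byte array but no file name/extension. (Of course, for some mime types it then has to fall back to a default, or needs further enhancement to identify them correctly.) If anyone wants me to post the code, let me know.

+1 for using bytes. This way you can even take just the number of bytes expected for a specific mime type to test against (rather than the default 256). However, I would go for a struct here, with extension, bytes and mime type as properties, and perhaps keep a dictionary of predefined structs. That would save me the endless if-else checks.

The problem with this approach is that, for example, a text file that starts with "MZ" will be interpreted as an .EXE file. In other words, you should at least take the extension into account in all cases, plus longer signatures or per-format heuristics, to avoid false positives.

@Nutshell I believe XLS has a 0 byte at the end where DOC does not, so check for XLS first and then DOC. As for XLSX/DOCX, they do share the same signature (ZIP), so to tell them apart you need to look deeper than the header. For example, XLSX files have the string "xl/_rels/workbook.xml.rels" near the header, while DOCX files have the string "word/_rels/document.xml.rels" near the header. This is just one of many ways to try to tell these particular types apart, and it certainly will not cover 100% of scenarios (for example, a Zip file that contains DOCX/XLSX files).

Hi all. I am the person who forked the original FileTypeDetective into MimeDetective on GitHub. I am glad if it is useful. I have talked with the developer, trailmax. We have changed the license to MIT!

【Answer 4】: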

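In Urlmon.dll, there is a function called FindMimeFromData.

From the documentation:

MIME type detection, or "data sniffing," refers to the process of determining an appropriate MIME type from binary data. The final result depends on a combination of server-supplied MIME type headers, file extension, and/or the data itself. Usually, only the first 256 bytes of data are significant.

So, read the first (up to) 256 bytes from the file and pass them to FindMimeFromData.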
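Reading "up to 256 bytes" has one subtlety: Stream.Read may return fewer bytes than requested even before the end of the stream. A minimal sketch of a helper that handles this (HeaderReader and ReadHeader are illustrative names, not part of any library):

```csharp
using System;
using System.IO;

public static class HeaderReader
{
    // Returns the first maxBytes bytes of the stream, or fewer if the stream is shorter.
    public static byte[] ReadHeader(Stream stream, int maxBytes = 256)
    {
        byte[] buffer = new byte[maxBytes];
        int total = 0;

        // Loop until the buffer is full or the stream ends (Read returns 0).
        while (total < maxBytes)
        {
            int read = stream.Read(buffer, total, maxBytes - total);
            if (read == 0)
                break;
            total += read;
        }

        // Trim the buffer so its length is the number of bytes actually read.
        Array.Resize(ref buffer, total);
        return buffer;
    }
}
```

The returned array's Length can then be passed as the size argument, so that a short file is not reported as 256 bytes padded with zeros.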

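【Discussion】:

How reliable is this method?

According to ***.com/questions/4833113/…, this function can only determine 26 types, so I don't think it is reliable. E.g. a '*.docx' file is detected as 'application/x-zip-compressed'.

I guess that is because a docx is, on the face of it, a zip file.

A docx is a zip file, and the mimetype for .docx is "application/vnd.openxmlformats-officedocument.wordprocessingml.document". While this can be determined by binary inspection alone, that is probably not the most efficient way, and in most cases you would have to read beyond the first 256 bytes.

I think this question is still relevant in the 2020s. Check out this answer.

The FileSignatures project seems somewhat more reliable, and lets you control exactly which file types you want to match.

【Answer 5】: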

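If you are using .NET Framework 4.5 or above, there is now a MimeMapping.GetMimeMapping(filename) method that returns a string with the Mime mapping for the passed file name. Note that this uses the file extension, not the data in the file itself.

Documentation is at http://msdn.microsoft.com/en-us/library/system.web.mimemapping.getmimemapping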

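【Discussion】:

This worked for me, and it only takes one line of code. var mimetype = System.Web.MimeMapping.GetMimeMapping(<pathToFile>);

This does not answer the original question, "if the file extension is incorrect or missing". GetMimeMapping only uses the extension and a static dictionary of mime entries.

I found this class useful, though :)

I suggest editing your comment to note that this uses the file extension internally, which is easy to fake.

Normally I don't downvote answers, but I did for this one because it is misleading. The question is about not trusting the file extension.

【Answer 6】: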

You can also look in the registry.

    using System.IO;
    using Microsoft.Win32;

    string GetMimeType(FileInfo fileInfo)
    {
        string mimeType = "application/unknown";

        // The registry keys under HKEY_CLASSES_ROOT are named after the extension, e.g. ".png".
        RegistryKey regKey = Registry.ClassesRoot.OpenSubKey(
            fileInfo.Extension.ToLower()
            );

        if(regKey != null)
        {
            object contentType = regKey.GetValue("Content Type");

            if(contentType != null)
                mimeType = contentType.ToString();
        }

        return mimeType;
    }

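One way or another you will have to tap into a database of MIME types - whether they are mapped from extensions or from magic numbers is somewhat trivial - and the Windows registry is one such place. For a platform-independent solution, though, you would have to ship this database with the code (or as a standalone library).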

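【Discussion】:

@Rabbi Although this question is about file content rather than the extension, this answer may still be useful to others passing by (like me). Even if such an answer is unlikely to be accepted, it is still good to have the information.

Doesn't this simply get the mime type from the extension of the file name? What if the file is a .docx and some joker decides to rename it to .doc? You would certainly get the wrong MIME type.

@kolin, you are absolutely right, but as the saying goes, "make something foolproof and someone will make a better fool". :)

When using Windows Azure Web Roles or any other host that runs your app in limited trust, don't forget that you will not be allowed to access the registry. A combination of a try-catch around the registry and an in-memory dictionary (like Anykey's answer) looks like a good solution that offers a bit of both.

【Answer 7】: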

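If you want to host your ASP.NET solution in a non-Windows environment, HeyRed.Mime.MimeGuesser.GuessMimeType from NuGet would be the ultimate solution.

File extension mapping is very insecure. If an attacker uploads a file with a fake extension, a mapping dictionary will, for example, allow an executable to be distributed inside a .jpg file. So always use a content-sniffing library to learn the real content type.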

 public static string MimeTypeFrom(byte[] dataBytes, string fileName)
 {
        // Sniff the content first; fall back to the file name only if that gives nothing.
        var contentType = HeyRed.Mime.MimeGuesser.GuessMimeType(dataBytes);
        if (string.IsNullOrEmpty(contentType))
        {
            return HeyRed.Mime.MimeTypesMap.GetMimeType(fileName);
        }
        return contentType;
 }
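【Discussion】:

The best library I have tried so far. It found the content type of every file I put in the folder. + .NET Core support!

Simply awesome. I also tried many libraries (NuGet packages, custom classes...). This one is the closest to file -bi [filename] on UNIX systems.

【Answer 8】: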

I use a hybrid solution:

    using System;
    using System.IO;
    using System.Runtime.InteropServices;

    // Pointer-sized parameters are IntPtr (not UInt32) so the call is also valid
    // in 64-bit processes; the strings are wide (LPWStr).
    [DllImport("urlmon.dll", CharSet = CharSet.Unicode, ExactSpelling = true)]
    private static extern int FindMimeFromData(
        IntPtr pBC,
        [MarshalAs(UnmanagedType.LPWStr)] string pwzUrl,
        [MarshalAs(UnmanagedType.LPArray)] byte[] pBuffer,
        int cbSize,
        [MarshalAs(UnmanagedType.LPWStr)] string pwzMimeProposed,
        int dwMimeFlags,
        out IntPtr ppwzMimeOut,
        int dwReserved
    );

    private string GetMimeFromRegistry (string Filename)
    {
        string mime = "application/octet-stream";
        string ext = System.IO.Path.GetExtension(Filename).ToLower();
        Microsoft.Win32.RegistryKey rk = Microsoft.Win32.Registry.ClassesRoot.OpenSubKey(ext);
        if (rk != null && rk.GetValue("Content Type") != null)
            mime = rk.GetValue("Content Type").ToString();
        return mime;
    }

    public string GetMimeTypeFromFileAndRegistry (string filename)
    {
        if (!File.Exists(filename))
        {
           return GetMimeFromRegistry (filename);
        }

        byte[] buffer = new byte[256];
        int bytesRead;

        using (FileStream fs = new FileStream(filename, FileMode.Open, FileAccess.Read))
        {
            // Only the first 256 bytes (fewer for short files) are sniffed.
            bytesRead = fs.Read(buffer, 0, buffer.Length);
        }

        try
        {
            IntPtr mimeTypePtr;

            FindMimeFromData(IntPtr.Zero, null, buffer, bytesRead, null, 0, out mimeTypePtr, 0);

            string mime = Marshal.PtrToStringUni(mimeTypePtr);

            Marshal.FreeCoTaskMem(mimeTypePtr);

            // Generic results carry little information, so ask the registry instead.
            if (string.IsNullOrWhiteSpace (mime) || 
                mime =="text/plain" || mime == "application/octet-stream")                    
            {
                return GetMimeFromRegistry (filename);
            }

            return mime;
        }
        catch (Exception)
        {
            return GetMimeFromRegistry (filename);
        }
    }

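【Discussion】:

Thanks for the code. It partly works. For "doc" and "tif" files it returns "application/octet-stream". Are there any other options?

It would be nice to see a hybrid solution of the extension dictionary above and urlmon.

@PranavShah, note that the server's knowledge of mime types (the ones returned by the registry lookup) depends on what software is installed on the server. A basic Windows installation or a dedicated web server cannot be relied on to know the mime types of third-party software that has no need to be installed there. It should know what a .doc file is, though.

【Answer 9】: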

I wrote a mime type validator. Happy to share it with everyone.

private readonly Dictionary<string, byte[]> _mimeTypes = new Dictionary<string, byte[]>
    {
        {"image/jpeg", new byte[] {255, 216, 255}},
        {"image/jpg", new byte[] {255, 216, 255}},
        {"image/pjpeg", new byte[] {255, 216, 255}},
        {"image/apng", new byte[] {137, 80, 78, 71, 13, 10, 26, 10, 0, 0, 0, 13, 73, 72, 68, 82}},
        {"image/png", new byte[] {137, 80, 78, 71, 13, 10, 26, 10, 0, 0, 0, 13, 73, 72, 68, 82}},
        {"image/bmp", new byte[] {66, 77}},
        {"image/gif", new byte[] {71, 73, 70, 56}},
    };

private bool ValidateMimeType(byte[] file, string contentType)
    {
        // Unknown content types fail validation instead of throwing.
        byte[] signature;
        if (file == null || contentType == null || !_mimeTypes.TryGetValue(contentType, out signature))
            return false;

        return file.Take(signature.Length).SequenceEqual(signature);
    }

【Discussion】:

【Answer 10】:

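I think the right answer is a combination of Steve Morgan's and Serguei's answers. That is how Internet Explorer does it. The pinvoke call to FindMimeFromData works for only 26 hard-coded mime types. Also, it will give ambiguous mime types (such as text/plain or application/octet-stream) even though a more specific, more appropriate mime type may exist. If it fails to give a good mime type, you can go to the registry for a more specific one. The server registry may have more up-to-date mime types.

Reference: http://msdn.microsoft.com/en-us/library/ms775147(VS.85).aspx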
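The decision logic described here can be sketched independently of the two lookups. In the sketch below, MimeFallback, IsAmbiguous and Resolve are illustrative names; sniff and lookup are hypothetical delegates that would wrap the FindMimeFromData call and the registry query respectively, so the code itself makes no Windows calls.

```csharp
using System;

public static class MimeFallback
{
    // True for results that say little about the actual content.
    public static bool IsAmbiguous(string mime)
    {
        return string.IsNullOrWhiteSpace(mime)
            || string.Equals(mime, "text/plain", StringComparison.OrdinalIgnoreCase)
            || string.Equals(mime, "application/octet-stream", StringComparison.OrdinalIgnoreCase);
    }

    // sniff: content-based detection; lookup: the more specific secondary source.
    public static string Resolve(Func<string> sniff, Func<string> lookup)
    {
        string sniffed = sniff();
        if (!IsAmbiguous(sniffed))
            return sniffed;

        // Sniffing was inconclusive, so try the secondary source.
        string looked = lookup();
        if (!string.IsNullOrWhiteSpace(looked))
            return looked;

        // Neither source helped: keep the generic sniffed type if there was one.
        return string.IsNullOrWhiteSpace(sniffed) ? "application/octet-stream" : sniffed;
    }
}
```

Passing the lookups in as delegates keeps the second, possibly slower or permission-restricted, source from being queried when sniffing already gave a specific answer.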

【Discussion】:

【Answer 11】:

This class uses the previous answers to try 3 different ways: hard-coded by extension, the FindMimeFromData API, and the registry.

using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.InteropServices;

using Microsoft.Win32;

namespace YourNamespace
{
    public static class MimeTypeParser
    {
        // Pointer-sized parameters are IntPtr (not UInt32) so the call is also valid
        // in 64-bit processes; the strings are wide (LPWStr).
        [DllImport("urlmon.dll", CharSet = CharSet.Unicode, ExactSpelling = true)]
        private static extern int FindMimeFromData(
                IntPtr pBC,
                [MarshalAs(UnmanagedType.LPWStr)] string pwzUrl,
                [MarshalAs(UnmanagedType.LPArray)] byte[] pBuffer,
                int cbSize,
                [MarshalAs(UnmanagedType.LPWStr)] string pwzMimeProposed,
                int dwMimeFlags,
                out IntPtr ppwzMimeOut,
                int dwReserved
        );

        public static string GetMimeType(string sFilePath)
        {
            string sMimeType = GetMimeTypeFromList(sFilePath);

            if (String.IsNullOrEmpty(sMimeType))
            {
                sMimeType = GetMimeTypeFromFile(sFilePath);

                if (String.IsNullOrEmpty(sMimeType))
                {
                    sMimeType = GetMimeTypeFromRegistry(sFilePath);
                }
            }

            return sMimeType;
        }

        public static string GetMimeTypeFromList(string sFileNameOrPath)
        {
            string sMimeType = null;
            // TrimStart (rather than Substring(1)) also copes with a missing extension.
            string sExtensionWithoutDot = Path.GetExtension(sFileNameOrPath).TrimStart('.').ToLower();

            if (!String.IsNullOrEmpty(sExtensionWithoutDot) && spDicMIMETypes.ContainsKey(sExtensionWithoutDot))
            {
                sMimeType = spDicMIMETypes[sExtensionWithoutDot];
            }

            return sMimeType;
        }

        public static string GetMimeTypeFromRegistry(string sFileNameOrPath)
        {
            string sMimeType = null;
            string sExtension = Path.GetExtension(sFileNameOrPath).ToLower();
            RegistryKey pKey = Registry.ClassesRoot.OpenSubKey(sExtension);

            if (pKey != null && pKey.GetValue("Content Type") != null)
            {
                sMimeType = pKey.GetValue("Content Type").ToString();
            }

            return sMimeType;
        }

        public static string GetMimeTypeFromFile(string sFilePath)
        {
            string sMimeType = null;

            if (File.Exists(sFilePath))
            {
                byte[] abytBuffer = new byte[256];
                int nBytesRead;

                using (FileStream pFileStream = new FileStream(sFilePath, FileMode.Open, FileAccess.Read))
                {
                    // Only the first 256 bytes (fewer for short files) are sniffed.
                    nBytesRead = pFileStream.Read(abytBuffer, 0, abytBuffer.Length);
                }

                try
                {
                    IntPtr pMimeType;

                    FindMimeFromData(IntPtr.Zero, null, abytBuffer, nBytesRead, null, 0, out pMimeType, 0);

                    string sMimeTypeFromFile = Marshal.PtrToStringUni(pMimeType);

                    Marshal.FreeCoTaskMem(pMimeType);

                    if (!String.IsNullOrEmpty(sMimeTypeFromFile) && sMimeTypeFromFile != "text/plain" && sMimeTypeFromFile != "application/octet-stream")
                    {
                        sMimeType = sMimeTypeFromFile;
                    }
                }
                catch { } // Ignore failures; the caller falls back to the registry.
            }

            return sMimeType;
        }

        private static readonly Dictionary<string, string> spDicMIMETypes = new Dictionary<string, string>
        {
            {"ai", "application/postscript"},
            {"aif", "audio/x-aiff"},
            {"aifc", "audio/x-aiff"},
            {"aiff", "audio/x-aiff"},
            {"asc", "text/plain"},
            {"atom", "application/atom+xml"},
            {"au", "audio/basic"},
            {"avi", "video/x-msvideo"},
            {"bcpio", "application/x-bcpio"},
            {"bin", "application/octet-stream"},
            {"bmp", "image/bmp"},
            {"cdf", "application/x-netcdf"},
            {"cgm", "image/cgm"},
            {"class", "application/octet-stream"},
            {"cpio", "application/x-cpio"},
            {"cpt", "application/mac-compactpro"},
            {"csh", "application/x-csh"},
            {"css", "text/css"},
            {"dcr", "application/x-director"},
            {"dif", "video/x-dv"},
            {"dir", "application/x-director"},
            {"djv", "image/vnd.djvu"},
            {"djvu", "image/vnd.djvu"},
            {"dll", "application/octet-stream"},
            {"dmg", "application/octet-stream"},
            {"dms", "application/octet-stream"},
            {"doc", "application/msword"},
            {"docx", "application/vnd.openxmlformats-officedocument.wordprocessingml.document"},
            {"dotx", "application/vnd.openxmlformats-officedocument.wordprocessingml.template"},
            {"docm", "application/vnd.ms-word.document.macroEnabled.12"},
            {"dotm", "application/vnd.ms-word.template.macroEnabled.12"},
            {"dtd", "application/xml-dtd"},
            {"dv", "video/x-dv"},
            {"dvi", "application/x-dvi"},
            {"dxr", "application/x-director"},
            {"eps", "application/postscript"},
            {"etx", "text/x-setext"},
            {"exe", "application/octet-stream"},
            {"ez", "application/andrew-inset"},
            {"gif", "image/gif"},
            {"gram", "application/srgs"},
            {"grxml", "application/srgs+xml"},
            {"gtar", "application/x-gtar"},
            {"hdf", "application/x-hdf"},
            {"hqx", "application/mac-binhex40"},
            {"htc", "text/x-component"},
            {"htm", "text/html"},
            {"html", "text/html"},
            {"ice", "x-conference/x-cooltalk"},
            {"ico", "image/x-icon"},
            {"ics", "text/calendar"},
            {"ief", "image/ief"},
            {"ifb", "text/calendar"},
            {"iges", "model/iges"},
            {"igs", "model/iges"},
            {"jnlp", "application/x-java-jnlp-file"},
            {"jp2", "image/jp2"},
            {"jpe", "image/jpeg"},
            {"jpeg", "image/jpeg"},
            {"jpg", "image/jpeg"},
            {"js", "application/x-javascript"},
            {"kar", "audio/midi"},
            {"latex", "application/x-latex"},
            {"lha", "application/octet-stream"},
            {"lzh", "application/octet-stream"},
            {"m3u", "audio/x-mpegurl"},
            {"m4a", "audio/mp4a-latm"},
            {"m4b", "audio/mp4a-latm"},
            {"m4p", "audio/mp4a-latm"},
            {"m4u", "video/vnd.mpegurl"},
            {"m4v", "video/x-m4v"},
            {"mac", "image/x-macpaint"},
            {"man", "application/x-troff-man"},
            {"mathml", "application/mathml+xml"},
            {"me", "application/x-troff-me"},
            {"mesh", "model/mesh"},
            {"mid", "audio/midi"},
            {"midi", "audio/midi"},
            {"mif", "application/vnd.mif"},
            {"mov", "video/quicktime"},
            {"movie", "video/x-sgi-movie"},
            {"mp2", "audio/mpeg"},
            {"mp3", "audio/mpeg"},
            {"mp4", "video/mp4"},
            {"mpe", "video/mpeg"},
            {"mpeg", "video/mpeg"},
            {"mpg", "video/mpeg"},
            {"mpga", "audio/mpeg"},
            {"ms", "application/x-troff-ms"},
            {"msh", "model/mesh"},
            {"mxu", "video/vnd.mpegurl"},
            {"nc", "application/x-netcdf"},
            {"oda", "application/oda"},
            {"ogg", "application/ogg"},
            {"pbm", "image/x-portable-bitmap"},
            {"pct", "image/pict"},
            {"pdb", "chemical/x-pdb"},
            {"pdf", "application/pdf"},
            {"pgm", "image/x-portable-graymap"},
            {"pgn", "application/x-chess-pgn"},
            {"pic", "image/pict"},
            {"pict", "image/pict"},
            {"png", "image/png"},
            {"pnm", "image/x-portable-anymap"},
            {"pnt", "image/x-macpaint"},
            {"pntg", "image/x-macpaint"},
            {"ppm", "image/x-portable-pixmap"},
            {"ppt", "application/vnd.ms-powerpoint"},
            {"pptx", "application/vnd.openxmlformats-officedocument.presentationml.presentation"},
            {"potx", "application/vnd.openxmlformats-officedocument.presentationml.template"},
            {"ppsx", "application/vnd.openxmlformats-officedocument.presentationml.slideshow"},
            {"ppam", "application/vnd.ms-powerpoint.addin.macroEnabled.12"},
            {"pptm", "application/vnd.ms-powerpoint.presentation.macroEnabled.12"},
            {"potm", "application/vnd.ms-powerpoint.template.macroEnabled.12"},
            {"ppsm", "application/vnd.ms-powerpoint.slideshow.macroEnabled.12"},
            {"ps", "application/postscript"},
            {"qt", "video/quicktime"},
            {"qti", "image/x-quicktime"},
            {"qtif", "image/x-quicktime"},
            {"ra", "audio/x-pn-realaudio"},
            {"ram", "audio/x-pn-realaudio"},
            {"ras", "image/x-cmu-raster"},
            {"rdf", "application/rdf+xml"},
            {"rgb", "image/x-rgb"},
            {"rm", "application/vnd.rn-realmedia"},
            {"roff", "application/x-troff"},
            {"rtf", "text/rtf"},
            {"rtx", "text/richtext"},
            {"sgm", "text/sgml"},
            {"sgml", "text/sgml"},
            {"sh", "application/x-sh"},
            {"shar", "application/x-shar"},
            {"silo", "model/mesh"},
            {"sit", "application/x-stuffit"},
            {"skd", "application/x-koan"},
            {"skm", "application/x-koan"},
            {"skp", "application/x-koan"},
            {"skt", "application/x-koan"},
            {"smi", "application/smil"},
            {"smil", "application/smil"},
            {"snd", "audio/basic"},
            {"so", "application/octet-stream"},
            {"spl", "application/x-futuresplash"},
            {"src", "application/x-wais-source"},
            {"sv4cpio", "application/x-sv4cpio"},
            {"sv4crc", "application/x-sv4crc"},
            {"svg", "image/svg+xml"},
            {"swf", "application/x-shockwave-flash"},
            {"t", "application/x-troff"},
            {"tar", "application/x-tar"},
            {"tcl", "application/x-tcl"},
            {"tex", "application/x-tex"},
            {"texi", "application/x-texinfo"},
            {"texinfo", "application/x-texinfo"},
            {"tif", "image/tiff"},
            {"tiff", "image/tiff"},
            {"tr", "application/x-troff"},
            {"tsv", "text/tab-separated-values"},
            {"txt", "text/plain"},
            {"ustar", "application/x-ustar"},
            {"vcd", "application/x-cdlink"},
            {"vrml", "model/vrml"},
            {"vxml", "application/voicexml+xml"},
            {"wav", "audio/x-wav"},
            {"wbmp", "image/vnd.wap.wbmp"},
            {"wbxml", "application/vnd.wap.wbxml"},
            {"wml", "text/vnd.wap.wml"},
            {"wmlc", "application/vnd.wap.wmlc"},
            {"wmls", "text/vnd.wap.wmlscript"},
            {"wmlsc", "application/vnd.wap.wmlscriptc"},
            {"wrl", "model/vrml"},
            {"xbm", "image/x-xbitmap"},
            {"xht", "application/xhtml+xml"},
            {"xhtml", "application/xhtml+xml"},
            {"xls", "application/vnd.ms-excel"},
            {"xml", "application/xml"},
            {"xpm", "image/x-xpixmap"},
            {"xsl", "application/xml"},
            {"xlsx", "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"},
            {"xltx", "application/vnd.openxmlformats-officedocument.spreadsheetml.template"},
            {"xlsm", "application/vnd.ms-excel.sheet.macroEnabled.12"},
            {"xltm", "application/vnd.ms-excel.template.macroEnabled.12"},
            {"xlam", "application/vnd.ms-excel.addin.macroEnabled.12"},
            {"xlsb", "application/vnd.ms-excel.sheet.binary.macroEnabled.12"},
            {"xslt", "application/xslt+xml"},
            {"xul", "application/vnd.mozilla.xul+xml"},
            {"xwd", "image/x-xwindowdump"},
            {"xyz", "chemical/x-xyz"},
            {"zip", "application/zip"}
        };
    }
}

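【Discussion】:

Don't forget to wrap the registry access in a try-catch: you will not be allowed to read the registry in restricted mode, which is the case for Azure Web Roles running in limited trust and for any other host with limited trust.【Answer 12】:

I found this useful. For VB.NET developers: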

    Public Shared Function GetFromFileName(ByVal fileName As String) As String
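        ' GetFromExtension strips the leading dot itself; Remove(0, 1) here would throw for file names without an extension.
        Return GetFromExtension(Path.GetExtension(fileName))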
    End Function

    Public Shared Function GetFromExtension(ByVal extension As String) As String
        If extension.StartsWith("."c) Then
            extension = extension.Remove(0, 1)
        End If

        If MIMETypesDictionary.ContainsKey(extension) Then
            Return MIMETypesDictionary(extension)
        End If

        Return "unknown/unknown"
    End Function

    Private Shared ReadOnly MIMETypesDictionary As New Dictionary(Of String, String)() From { _
         {"ai", "application/postscript"}, _
         {"aif", "audio/x-aiff"}, _
         {"aifc", "audio/x-aiff"}, _
         {"aiff", "audio/x-aiff"}, _
         {"asc", "text/plain"}, _
         {"atom", "application/atom+xml"}, _
         {"au", "audio/basic"}, _
         {"avi", "video/x-msvideo"}, _
         {"bcpio", "application/x-bcpio"}, _
         {"bin", "application/octet-stream"}, _
         {"bmp", "image/bmp"}, _
         {"cdf", "application/x-netcdf"}, _
         {"cgm", "image/cgm"}, _
         {"class", "application/octet-stream"}, _
         {"cpio", "application/x-cpio"}, _
         {"cpt", "application/mac-compactpro"}, _
         {"csh", "application/x-csh"}, _
         {"css", "text/css"}, _
         {"dcr", "application/x-director"}, _
         {"dif", "video/x-dv"}, _
         {"dir", "application/x-director"}, _
         {"djv", "image/vnd.djvu"}, _
         {"djvu", "image/vnd.djvu"}, _
         {"dll", "application/octet-stream"}, _
         {"dmg", "application/octet-stream"}, _
         {"dms", "application/octet-stream"}, _
         {"doc", "application/msword"}, _
         {"dtd", "application/xml-dtd"}, _
         {"dv", "video/x-dv"}, _
         {"dvi", "application/x-dvi"}, _
         {"dxr", "application/x-director"}, _
         {"eps", "application/postscript"}, _
         {"etx", "text/x-setext"}, _
         {"exe", "application/octet-stream"}, _
         {"ez", "application/andrew-inset"}, _
         {"gif", "image/gif"}, _
         {"gram", "application/srgs"}, _
         {"grxml", "application/srgs+xml"}, _
         {"gtar", "application/x-gtar"}, _
         {"hdf", "application/x-hdf"}, _
         {"hqx", "application/mac-binhex40"}, _
         {"htm", "text/html"}, _
         {"html", "text/html"}, _
         {"ice", "x-conference/x-cooltalk"}, _
         {"ico", "image/x-icon"}, _
         {"ics", "text/calendar"}, _
         {"ief", "image/ief"}, _
         {"ifb", "text/calendar"}, _
         {"iges", "model/iges"}, _
         {"igs", "model/iges"}, _
         {"jnlp", "application/x-java-jnlp-file"}, _
         {"jp2", "image/jp2"}, _
         {"jpe", "image/jpeg"}, _
         {"jpeg", "image/jpeg"}, _
         {"jpg", "image/jpeg"}, _
         {"js", "application/x-javascript"}, _
         {"kar", "audio/midi"}, _
         {"latex", "application/x-latex"}, _
         {"lha", "application/octet-stream"}, _
         {"lzh", "application/octet-stream"}, _
         {"m3u", "audio/x-mpegurl"}, _
         {"m4a", "audio/mp4a-latm"}, _
         {"m4b", "audio/mp4a-latm"}, _
         {"m4p", "audio/mp4a-latm"}, _
         {"m4u", "video/vnd.mpegurl"}, _
         {"m4v", "video/x-m4v"}, _
         {"mac", "image/x-macpaint"}, _
         {"man", "application/x-troff-man"}, _
         {"mathml", "application/mathml+xml"}, _
         {"me", "application/x-troff-me"}, _
         {"mesh", "model/mesh"}, _
         {"mid", "audio/midi"}, _
         {"midi", "audio/midi"}, _
         {"mif", "application/vnd.mif"}, _
         {"mov", "video/quicktime"}, _
         {"movie", "video/x-sgi-movie"}, _
         {"mp2", "audio/mpeg"}, _
         {"mp3", "audio/mpeg"}, _
         {"mp4", "video/mp4"}, _
         {"mpe", "video/mpeg"}, _
         {"mpeg", "video/mpeg"}, _
         {"mpg", "video/mpeg"}, _
         {"mpga", "audio/mpeg"}, _
         {"ms", "application/x-troff-ms"}, _
         {"msh", "model/mesh"}, _
         {"mxu", "video/vnd.mpegurl"}, _
         {"nc", "application/x-netcdf"}, _
         {"oda", "application/oda"}, _
         {"ogg", "application/ogg"}, _
         {"pbm", "image/x-portable-bitmap"}, _
         {"pct", "image/pict"}, _
         {"pdb", "chemical/x-pdb"}, _
         {"pdf", "application/pdf"}, _
         {"pgm", "image/x-portable-graymap"}, _
         {"pgn", "application/x-chess-pgn"}, _
         {"pic", "image/pict"}, _
         {"pict", "image/pict"}, _
         {"png", "image/png"}, _
         {"pnm", "image/x-portable-anymap"}, _
         {"pnt", "image/x-macpaint"}, _
         {"pntg", "image/x-macpaint"}, _
         {"ppm", "image/x-portable-pixmap"}, _
         {"ppt", "application/vnd.ms-powerpoint"}, _
         {"ps", "application/postscript"}, _
         {"qt", "video/quicktime"}, _
         {"qti", "image/x-quicktime"}, _
         {"qtif", "image/x-quicktime"}, _
         {"ra", "audio/x-pn-realaudio"}, _
         {"ram", "audio/x-pn-realaudio"}, _
         {"ras", "image/x-cmu-raster"}, _
         {"rdf", "application/rdf+xml"}, _
         {"rgb", "image/x-rgb"}, _
         {"rm", "application/vnd.rn-realmedia"}, _
         {"roff", "application/x-troff"}, _
         {"rtf", "text/rtf"}, _
         {"rtx", "text/richtext"}, _
         {"sgm", "text/sgml"}, _
         {"sgml", "text/sgml"}, _
         {"sh", "application/x-sh"}, _
         {"shar", "application/x-shar"}, _
         {"silo", "model/mesh"}, _
         {"sit", "application/x-stuffit"}, _
         {"skd", "application/x-koan"}, _
         {"skm", "application/x-koan"}, _
         {"skp", "application/x-koan"}, _
         {"skt", "application/x-koan"}, _
         {"smi", "application/smil"}, _
         {"smil", "application/smil"}, _
         {"snd", "audio/basic"}, _
         {"so", "application/octet-stream"}, _
         {"spl", "application/x-futuresplash"}, _
         {"src", "application/x-wais-source"}, _
         {"sv4cpio", "application/x-sv4cpio"}, _
         {"sv4crc", "application/x-sv4crc"}, _
         {"svg", "image/svg+xml"}, _
         {"swf", "application/x-shockwave-flash"}, _
         {"t", "application/x-troff"}, _
         {"tar", "application/x-tar"}, _
         {"tcl", "application/x-tcl"}, _
         {"tex", "application/x-tex"}, _
         {"texi", "application/x-texinfo"}, _
         {"texinfo", "application/x-texinfo"}, _
         {"tif", "image/tiff"}, _
         {"tiff", "image/tiff"}, _
         {"tr", "application/x-troff"}, _
         {"tsv", "text/tab-separated-values"}, _
         {"txt", "text/plain"}, _
         {"ustar", "application/x-ustar"}, _
         {"vcd", "application/x-cdlink"}, _
         {"vrml", "model/vrml"}, _
         {"vxml", "application/voicexml+xml"}, _
         {"wav", "audio/x-wav"}, _
         {"wbmp", "image/vnd.wap.wbmp"}, _
         {"wbxml", "application/vnd.wap.wbxml"}, _
         {"wml", "text/vnd.wap.wml"}, _
         {"wmlc", "application/vnd.wap.wmlc"}, _
         {"wmls", "text/vnd.wap.wmlscript"}, _
         {"wmlsc", "application/vnd.wap.wmlscriptc"}, _
         {"wrl", "model/vrml"}, _
         {"xbm", "image/x-xbitmap"}, _
         {"xht", "application/xhtml+xml"}, _
         {"xhtml", "application/xhtml+xml"}, _
         {"xls", "application/vnd.ms-excel"}, _
         {"xml", "application/xml"}, _
         {"xpm", "image/x-xpixmap"}, _
         {"xsl", "application/xml"}, _
         {"xslt", "application/xslt+xml"}, _
         {"xul", "application/vnd.mozilla.xul+xml"}, _
         {"xwd", "image/x-xwindowdump"}, _
         {"xyz", "chemical/x-xyz"}, _
         {"zip", "application/zip"} _
        }

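【Discussion】:

This looks like it may be an old list... there is no .docx, .xlsx, etc.
Is there an online list somewhere? The list above looks a bit more complete, and I found some of the missing entries here: ***.com/questions/4212861/… -- but it seems there should be a web service to which you could send a file name and a few bytes, and which would make a best guess at the rest…
I would use configuration for this, so I can pick the mime types I need and modify them accordingly without changing a single line of code.【Answer 13】:

If anyone is up for it, they could port the excellent Perl module File::Type to .NET. Its code contains a set of file-header magic-number lookups, or regular-expression matches, for each file type.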

Here is a .NET file type detection library, http://filetypedetective.codeplex.com/, but it currently detects only a small number of file types.
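The magic-number lookup mentioned above can be sketched in a few lines of managed C#. This is only an illustration, not a port of File::Type: the class name `SignatureSniffer`, the method `Detect`, and the deliberately tiny signature table are all assumptions made for this example, and a real detector would need many more formats plus offset-based and text heuristics.

```csharp
using System;

public static class SignatureSniffer
{
    // Hypothetical, deliberately small table of well-known leading byte sequences.
    private static readonly (byte[] Magic, string Mime)[] Signatures =
    {
        (new byte[] { 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A }, "image/png"),
        (new byte[] { 0xFF, 0xD8, 0xFF }, "image/jpeg"),
        (new byte[] { 0x47, 0x49, 0x46, 0x38 }, "image/gif"),              // "GIF8"
        (new byte[] { 0x25, 0x50, 0x44, 0x46, 0x2D }, "application/pdf"),  // "%PDF-"
        (new byte[] { 0x50, 0x4B, 0x03, 0x04 }, "application/zip"),        // also used by zip-based containers
    };

    // Returns the MIME type of the first matching signature,
    // or "application/octet-stream" when nothing matches.
    public static string Detect(byte[] header)
    {
        if (header == null) throw new ArgumentNullException(nameof(header));

        foreach (var (magic, mime) in Signatures)
        {
            if (StartsWith(header, magic)) return mime;
        }
        return "application/octet-stream";
    }

    private static bool StartsWith(byte[] data, byte[] prefix)
    {
        if (data.Length < prefix.Length) return false;
        for (int i = 0; i < prefix.Length; i++)
        {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }
}
```

Because the lookup never consults the file name, renaming a PNG to `.csv` does not change the result. The price is that every supported format has to be listed explicitly.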

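【Discussion】:

【Answer 14】:

This answer is a copy of the author's answer (Richard Gourlay), improved according to Rohland's comment pointing to http://www.pinvoke.net/default.aspx/urlmon.findmimefromdata, in order to fix a problem on IIS 8 / Win2012 (where the function would crash the application pool).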

using System;
using System.IO;
using System.Runtime.InteropServices;

...

public static string GetMimeFromFile(string filename)
{
    if (!File.Exists(filename))
        throw new FileNotFoundException(filename + " not found");

    const int maxContent = 256;

    var buffer = new byte[maxContent];
    using (var fs = new FileStream(filename, FileMode.Open))
    {
        if (fs.Length >= maxContent)
            fs.Read(buffer, 0, maxContent);
        else
            fs.Read(buffer, 0, (int) fs.Length);
    }

    var mimeTypePtr = IntPtr.Zero;
    try
    {
        var result = FindMimeFromData(IntPtr.Zero, null, buffer, maxContent, null, 0, out mimeTypePtr, 0);
        if (result != 0)
        {
            // The catch block below frees mimeTypePtr, so it must not be freed here as well.
            throw Marshal.GetExceptionForHR(result);
        }

        var mime = Marshal.PtrToStringUni(mimeTypePtr);
        Marshal.FreeCoTaskMem(mimeTypePtr);
        return mime;
    }
    catch (Exception)
    {
        if (mimeTypePtr != IntPtr.Zero)
        {
            Marshal.FreeCoTaskMem(mimeTypePtr);
        }
        return "unknown/unknown";
    }
}

[DllImport("urlmon.dll", CharSet = CharSet.Unicode, ExactSpelling = true, SetLastError = false)]
private static extern int FindMimeFromData(IntPtr pBC,
    [MarshalAs(UnmanagedType.LPWStr)] string pwzUrl,
    [MarshalAs(UnmanagedType.LPArray, ArraySubType = UnmanagedType.I1, SizeParamIndex = 3)] byte[] pBuffer,
    int cbSize,
    [MarshalAs(UnmanagedType.LPWStr)] string pwzMimeProposed,
    int dwMimeFlags,
    out IntPtr ppwzMimeOut,
    int dwReserved);
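The snippets in this thread call `Stream.Read` once and ignore its return value. `Stream.Read` is allowed to return fewer bytes than requested, so a small loop is the safer way to collect the header. The sketch below is an assumption-laden helper, not part of any answer: `HeaderReader` and `ReadHeader` are made-up names, and the 256-byte default simply mirrors the buffer size used above.

```csharp
using System;
using System.IO;

public static class HeaderReader
{
    // Reads up to maxBytes from the current position of the stream and
    // returns exactly the bytes that were available (possibly fewer).
    public static byte[] ReadHeader(Stream stream, int maxBytes = 256)
    {
        if (stream == null) throw new ArgumentNullException(nameof(stream));
        if (maxBytes < 0) throw new ArgumentOutOfRangeException(nameof(maxBytes));

        var buffer = new byte[maxBytes];
        int total = 0;
        while (total < maxBytes)
        {
            int read = stream.Read(buffer, total, maxBytes - total);
            if (read == 0) break; // end of stream reached
            total += read;
        }

        // Trim the array so that its Length equals the number of bytes actually read.
        if (total < maxBytes) Array.Resize(ref buffer, total);
        return buffer;
    }
}
```

The returned array's `Length` is then the real number of header bytes, which a caller could pass as the size argument instead of a fixed 256.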

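【Discussion】:

Nice C+P answer. Chris【Answer 15】:

@Steve Morgan and @Richard Gourlay, this is a great solution, thank you. One small drawback is that when the file contains 255 bytes or fewer, the detected mime type is sometimes "application/octet-stream", which is somewhat inaccurate for files that are expected to yield "text/plain". I have updated your original method to handle this case, as follows:

If the file contains 255 bytes or fewer and the inferred mime type is "application/octet-stream", create a new byte array consisting of the original file bytes repeated n times, until the total number of bytes is >= 256. Then re-check the mime type of that new byte array.

Modified method: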

Imports System.IO
Imports System.Linq
Imports System.Runtime.InteropServices

' IntPtr-based declaration, so the returned string pointer is not truncated in 64-bit processes.
<DllImport("urlmon.dll", CharSet:=CharSet.Unicode, ExactSpelling:=True, SetLastError:=False)> _
Private Shared Function FindMimeFromData(pBC As IntPtr, <MarshalAs(UnmanagedType.LPWStr)> pwzUrl As String, <MarshalAs(UnmanagedType.LPArray, ArraySubType:=UnmanagedType.I1, SizeParamIndex:=3)> pBuffer As Byte(), cbSize As Integer, <MarshalAs(UnmanagedType.LPWStr)> pwzMimeProposed As String, dwMimeFlags As Integer, _
ByRef ppwzMimeOut As IntPtr, dwReserved As Integer) As Integer
End Function
Private Function GetMimeType(ByVal f As FileInfo) As String
    'See http://***.com/questions/58510/using-net-how-can-you-find-the-mime-type-of-a-file-based-on-the-file-signature
    Dim returnValue As String = ""
    Dim fileStream As FileStream = Nothing
    Dim fileStreamLength As Long = 0
    Dim fileStreamIsLessThanBByteSize As Boolean = False

    Const byteSize As Integer = 255
    Const bbyteSize As Integer = byteSize + 1

    Const ambiguousMimeType As String = "application/octet-stream"
    Const unknownMimeType As String = "unknown/unknown"

    Dim buffer As Byte() = New Byte(byteSize) {}
    Dim fnGetMimeTypeValue As New Func(Of Byte(), Integer, String)(
        Function(_buffer As Byte(), _bbyteSize As Integer) As String
            Dim _returnValue As String = ""
            Dim mimeTypePtr As IntPtr = IntPtr.Zero
            FindMimeFromData(IntPtr.Zero, Nothing, _buffer, _bbyteSize, Nothing, 0, mimeTypePtr, 0)
            _returnValue = Marshal.PtrToStringUni(mimeTypePtr)
            Marshal.FreeCoTaskMem(mimeTypePtr)
            Return _returnValue
        End Function)

    If (f.Exists()) Then
        Try
            fileStream = New FileStream(f.FullName(), FileMode.Open, FileAccess.Read, FileShare.ReadWrite)
            fileStreamLength = fileStream.Length()

            If (fileStreamLength >= bbyteSize) Then
                fileStream.Read(buffer, 0, bbyteSize)
            Else
                fileStreamIsLessThanBByteSize = True
                fileStream.Read(buffer, 0, CInt(fileStreamLength))
            End If

            returnValue = fnGetMimeTypeValue(buffer, bbyteSize)

            If (returnValue.Equals(ambiguousMimeType, StringComparison.OrdinalIgnoreCase) AndAlso fileStreamIsLessThanBByteSize AndAlso fileStreamLength > 0) Then
                'Duplicate the stream content until the stream length is >= bbyteSize to get a more deterministic mime type analysis.
                Dim currentBuffer As Byte() = buffer.Take(fileStreamLength).ToArray()
                Dim repeatCount As Integer = Math.Floor((bbyteSize / fileStreamLength) + 1)
                Dim bBufferList As List(Of Byte) = New List(Of Byte)
                While (repeatCount > 0)
                    bBufferList.AddRange(currentBuffer)
                    repeatCount -= 1
                End While
                Dim bbuffer As Byte() = bBufferList.Take(bbyteSize).ToArray()
                returnValue = fnGetMimeTypeValue(bbuffer, bbyteSize)
            End If
        Catch ex As Exception
            returnValue = unknownMimeType
        Finally
            If (fileStream IsNot Nothing) Then fileStream.Close()
        End Try
    End If
    Return returnValue
End Function
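The byte-repetition step described in this answer can be isolated into a small C# helper. This is a minimal sketch under stated assumptions: `SniffBufferPadding` and `RepeatToLength` are hypothetical names, and the helper only builds the padded buffer; whether the padded buffer then yields a better MIME guess depends on the sniffer it is handed to.

```csharp
using System;

public static class SniffBufferPadding
{
    // Builds an array of exactly targetLength bytes by cycling through content,
    // e.g. {1, 2, 3} padded to 7 bytes becomes {1, 2, 3, 1, 2, 3, 1}.
    public static byte[] RepeatToLength(byte[] content, int targetLength)
    {
        if (content == null) throw new ArgumentNullException(nameof(content));
        if (content.Length == 0) throw new ArgumentException("Content must not be empty.", nameof(content));
        if (targetLength < 0) throw new ArgumentOutOfRangeException(nameof(targetLength));

        var padded = new byte[targetLength];
        for (int i = 0; i < targetLength; i++)
        {
            padded[i] = content[i % content.Length];
        }
        return padded;
    }
}
```

A caller would apply it only in the situation the answer describes: the file is shorter than 256 bytes and the first attempt returned "application/octet-stream".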

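【Discussion】:

This is the problem I was running into, and your idea of duplicating the bytes is great. I had to implement it in C#, but using the file length and a buffer holding the first bytes of the file, I was able to loop over all the missing bytes and copy bytes within the array to repeat the file (I simply copied the byte located one file length earlier in the array).【Answer 16】:

IIS 7 or later

Use this code, but note that you need to be an administrator on the server.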

public bool CheckMimeMapExtension(string fileExtension)
{
    try
    {
        using (ServerManager serverManager = new ServerManager())
        {
            // connects to default app.config
            var config = serverManager.GetApplicationHostConfiguration();
            var staticContent = config.GetSection("system.webServer/staticContent");
            var mimeMap = staticContent.GetCollection();

            foreach (var mimeType in mimeMap)
            {
                if (((String)mimeType["fileExtension"]).Equals(fileExtension, StringComparison.OrdinalIgnoreCase))
                    return true;
            }
        }
        return false;
    }
    catch (Exception ex)
    {
        Console.WriteLine("An exception has occurred: \n{0}", ex.Message);
        Console.Read();
    }

    return false;
}

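【Discussion】:

What about spoofing?【Answer 17】:

When using a Windows Azure Web Role, or any other host that runs your application in limited trust, don't forget that you will not be allowed to access the registry or unmanaged code. A hybrid approach, combining a try-catch around the registry lookup with an in-memory dictionary, looks like a good solution that covers everything.

I used this code to do it: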

public class DefaultMimeResolver : IMimeResolver
{
    private readonly IFileRepository _fileRepository;

    public DefaultMimeResolver(IFileRepository fileRepository)
    {
        _fileRepository = fileRepository;
    }

    // IntPtr-based declaration, so the returned string pointer is not truncated in 64-bit processes.
    [DllImport("urlmon.dll", CharSet = CharSet.Unicode, ExactSpelling = true, SetLastError = false)]
    private static extern int FindMimeFromData(
        IntPtr pBC,
        [MarshalAs(UnmanagedType.LPWStr)] string pwzUrl,
        [MarshalAs(UnmanagedType.LPArray, ArraySubType = UnmanagedType.I1, SizeParamIndex = 3)] byte[] pBuffer,
        int cbSize,
        [MarshalAs(UnmanagedType.LPWStr)] string pwzMimeProposed,
        int dwMimeFlags,
        out IntPtr ppwzMimeOut,
        int dwReserved);

    public string GetMimeTypeFromFileExtension(string fileExtension)
    {
        if (string.IsNullOrEmpty(fileExtension))
        {
            throw new ArgumentNullException("fileExtension");
        }

        string mimeType = GetMimeTypeFromList(fileExtension);

        if (String.IsNullOrEmpty(mimeType))
        {
            mimeType = GetMimeTypeFromRegistry(fileExtension);
        }

        return mimeType;
    }

    public string GetMimeTypeFromFile(string filePath)
    {
        if (string.IsNullOrEmpty(filePath))
        {
            throw new ArgumentNullException("filePath");
        }

        if (!File.Exists(filePath))
        {
            throw new FileNotFoundException("File not found : ", filePath);
        }

        string mimeType = GetMimeTypeFromList(Path.GetExtension(filePath).ToLower());

        if (String.IsNullOrEmpty(mimeType))
        {
            mimeType = GetMimeTypeFromRegistry(Path.GetExtension(filePath).ToLower());

            if (String.IsNullOrEmpty(mimeType))
            {
                mimeType = GetMimeTypeFromFileInternal(filePath);
            }
        }

        return mimeType;
    }

    private string GetMimeTypeFromList(string fileExtension)
    {
        string mimeType = null;

        if (fileExtension.StartsWith("."))
        {
            fileExtension = fileExtension.TrimStart('.');
        }

        if (!String.IsNullOrEmpty(fileExtension) && _mimeTypes.ContainsKey(fileExtension))
        {
            mimeType = _mimeTypes[fileExtension];
        }

        return mimeType;
    }

    private string GetMimeTypeFromRegistry(string fileExtension)
    {
        string mimeType = null;
        try
        {
            using (RegistryKey key = Registry.ClassesRoot.OpenSubKey(fileExtension))
            {
                if (key != null && key.GetValue("Content Type") != null)
                {
                    mimeType = key.GetValue("Content Type").ToString();
                }
            }
        }
        catch (Exception)
        {
            // Empty. When this code is running in limited mode accessing registry is not allowed.
        }

        return mimeType;
    }

    private string GetMimeTypeFromFileInternal(string filePath)
    {
        string mimeType = null;

        if (!File.Exists(filePath))
        {
            return null;
        }

        byte[] byteBuffer = new byte[256];

        using (FileStream fileStream = _fileRepository.Get(filePath))
        {
            if (fileStream.Length >= 256)
            {
                fileStream.Read(byteBuffer, 0, 256);
            }
            else
            {
                fileStream.Read(byteBuffer, 0, (int)fileStream.Length);
            }
        }

        try
        {
            IntPtr mimeTypePtr;

            int result = FindMimeFromData(IntPtr.Zero, null, byteBuffer, byteBuffer.Length, null, 0, out mimeTypePtr, 0);

            if (result == 0 && mimeTypePtr != IntPtr.Zero)
            {
                string mimeTypeFromFile = Marshal.PtrToStringUni(mimeTypePtr);

                Marshal.FreeCoTaskMem(mimeTypePtr);

                if (!String.IsNullOrEmpty(mimeTypeFromFile) && mimeTypeFromFile != "text/plain" && mimeTypeFromFile != "application/octet-stream")
                {
                    mimeType = mimeTypeFromFile;
                }
            }
        }
        catch
        {
            // Empty.
        }

        return mimeType;
    }

    private readonly Dictionary<string, string> _mimeTypes = new Dictionary<string, string>
        {
            { "ai", "application/postscript" },
            { "aif", "audio/x-aiff" },
            { "aifc", "audio/x-aiff" },
            { "aiff", "audio/x-aiff" },
            { "asc", "text/plain" },
            { "atom", "application/atom+xml" },
            { "au", "audio/basic" },
            { "avi", "video/x-msvideo" },
            { "bcpio", "application/x-bcpio" },
            { "bin", "application/octet-stream" },
            { "bmp", "image/bmp" },
            { "cdf", "application/x-netcdf" },
            { "cgm", "image/cgm" },
            { "class", "application/octet-stream" },
            { "cpio", "application/x-cpio" },
            { "cpt", "application/mac-compactpro" },
            { "csh", "application/x-csh" },
            { "css", "text/css" },
            { "dcr", "application/x-director" },
            { "dif", "video/x-dv" },
            { "dir", "application/x-director" },
            { "djv", "image/vnd.djvu" },
            { "djvu", "image/vnd.djvu" },
            { "dll", "application/octet-stream" },
            { "dmg", "application/octet-stream" },
            { "dms", "application/octet-stream" },
            { "doc", "application/msword" },
            { "docx", "application/vnd.openxmlformats-officedocument.wordprocessingml.document" },
            { "dotx", "application/vnd.openxmlformats-officedocument.wordprocessingml.template" },
            { "docm", "application/vnd.ms-word.document.macroEnabled.12" },
            { "dotm", "application/vnd.ms-word.template.macroEnabled.12" },
            { "dtd", "application/xml-dtd" },
            { "dv", "video/x-dv" },
            { "dvi", "application/x-dvi" },
            { "dxr", "application/x-director" },
            { "eps", "application/postscript" },
            { "etx", "text/x-setext" },
            { "exe", "application/octet-stream" },
            { "ez", "application/andrew-inset" },
            { "gif", "image/gif" },
            { "gram", "application/srgs" },
            { "grxml", "application/srgs+xml" },
            { "gtar", "application/x-gtar" },
            { "hdf", "application/x-hdf" },
            { "hqx", "application/mac-binhex40" },
            { "htc", "text/x-component" },
            { "htm", "text/html" },
            { "html", "text/html" },
            { "ice", "x-conference/x-cooltalk" },
            { "ico", "image/x-icon" },
            { "ics", "text/calendar" },
            { "ief", "image/ief" },
            { "ifb", "text/calendar" },
            { "iges", "model/iges" },
            { "igs", "model/iges" },
            { "jnlp", "application/x-java-jnlp-file" },
            { "jp2", "image/jp2" },
            { "jpe", "image/jpeg" },
            { "jpeg", "image/jpeg" },
            { "jpg", "image/jpeg" },
            { "js", "application/x-javascript" },
            { "kar", "audio/midi" },
            { "latex", "application/x-latex" },
            { "lha", "application/octet-stream" },
            { "lzh", "application/octet-stream" },
            { "m3u", "audio/x-mpegurl" },
            { "m4a", "audio/mp4a-latm" },
            { "m4b", "audio/mp4a-latm" },
            { "m4p", "audio/mp4a-latm" },
            { "m4u", "video/vnd.mpegurl" },
            { "m4v", "video/x-m4v" },
            { "mac", "image/x-macpaint" },
            { "man", "application/x-troff-man" },
            { "mathml", "application/mathml+xml" },
            { "me", "application/x-troff-me" },
            { "mesh", "model/mesh" },
            { "mid", "audio/midi" },
            { "midi", "audio/midi" },
            { "mif", "application/vnd.mif" },
            { "mov", "video/quicktime" },
            { "movie", "video/x-sgi-movie" },
            { "mp2", "audio/mpeg" },
            { "mp3", "audio/mpeg" },
            { "mp4", "video/mp4" },
            { "mpe", "video/mpeg" },
            { "mpeg", "video/mpeg" },
            { "mpg", "video/mpeg" },
            { "mpga", "audio/mpeg" },
            { "ms", "application/x-troff-ms" },
            { "msh", "model/mesh" },
            { "mxu", "video/vnd.mpegurl" },
            { "nc", "application/x-netcdf" },
            { "oda", "application/oda" },
            { "ogg", "application/ogg" },
            { "pbm", "image/x-portable-bitmap" },
            { "pct", "image/pict" },
            { "pdb", "chemical/x-pdb" },
            { "pdf", "application/pdf" },
            { "pgm", "image/x-portable-graymap" },
            { "pgn", "application/x-chess-pgn" },
            { "pic", "image/pict" },
            { "pict", "image/pict" },
            { "png", "image/png" },
            { "pnm", "image/x-portable-anymap" },
            { "pnt", "image/x-macpaint" },
            { "pntg", "image/x-macpaint" },
            { "ppm", "image/x-portable-pixmap" },
            { "ppt", "application/vnd.ms-powerpoint" },
            { "pptx", "application/vnd.openxmlformats-officedocument.presentationml.presentation" },
            { "potx", "application/vnd.openxmlformats-officedocument.presentationml.template" },
            { "ppsx", "application/vnd.openxmlformats-officedocument.presentationml.slideshow" },
            { "ppam", "application/vnd.ms-powerpoint.addin.macroEnabled.12" },
            { "pptm", "application/vnd.ms-powerpoint.presentation.macroEnabled.12" },
            { "potm", "application/vnd.ms-powerpoint.template.macroEnabled.12" },
            { "ppsm", "application/vnd.ms-powerpoint.slideshow.macroEnabled.12" },
            { "ps", "application/postscript" },
            { "qt", "video/quicktime" },
            { "qti", "image/x-quicktime" },
            { "qtif", "image/x-quicktime" },
            { "ra", "audio/x-pn-realaudio" },
            { "ram", "audio/x-pn-realaudio" },
            { "ras", "image/x-cmu-raster" },
            { "rdf", "application/rdf+xml" },
            { "rgb", "image/x-rgb" },
            { "rm", "application/vnd.rn-realmedia" },
            { "roff", "application/x-troff" },
            { "rtf", "text/rtf" },
            { "rtx", "text/richtext" },
            { "sgm", "text/sgml" },
            { "sgml", "text/sgml" },
            { "sh", "application/x-sh" },
            { "shar", "application/x-shar" },
            { "silo", "model/mesh" },
            { "sit", "application/x-stuffit" },
            { "skd", "application/x-koan" },
            { "skm", "application/x-koan" },
            { "skp", "application/x-koan" },
            { "skt", "application/x-koan" },
            { "smi", "application/smil" },
            { "smil", "application/smil" },
            { "snd", "audio/basic" },
            { "so", "application/octet-stream" },
            { "spl", "application/x-futuresplash" },
            { "src", "application/x-wais-source" },
            { "sv4cpio", "application/x-sv4cpio" },
            { "sv4crc", "application/x-sv4crc" },
            { "svg", "image/svg+xml" },
            { "swf", "application/x-shockwave-flash" },
            { "t", "application/x-troff" },
            { "tar", "application/x-tar" },
            { "tcl", "application/x-tcl" },
            { "tex", "application/x-tex" },
            { "texi", "application/x-texinfo" },
            { "texinfo", "application/x-texinfo" },
            { "tif", "image/tiff" },
            { "tiff", "image/tiff" },
            { "tr", "application/x-troff" },
            { "tsv", "text/tab-separated-values" },
            { "txt", "text/plain" },
            { "ustar", "application/x-ustar" },
            { "vcd", "application/x-cdlink" },
            { "vrml", "model/vrml" },
            { "vxml", "application/voicexml+xml" },
            { "wav", "audio/x-wav" },
            { "wbmp", "image/vnd.wap.wbmp" },
            { "wbxml", "application/vnd.wap.wbxml" },
            { "wml", "text/vnd.wap.wml" },
            { "wmlc", "application/vnd.wap.wmlc" },
            { "wmls", "text/vnd.wap.wmlscript" },
            { "wmlsc", "application/vnd.wap.wmlscriptc" },
            { "wrl", "model/vrml" },
            { "xbm", "image/x-xbitmap" },
            { "xht", "application/xhtml+xml" },
            { "xhtml", "application/xhtml+xml" },
            { "xls", "application/vnd.ms-excel" },
            { "xml", "application/xml" },
            { "xpm", "image/x-xpixmap" },
            { "xsl", "application/xml" },
            { "xlsx", "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet" },
            { "xltx", "application/vnd.openxmlformats-officedocument.spreadsheetml.template" },
            { "xlsm", "application/vnd.ms-excel.sheet.macroEnabled.12" },
            { "xltm", "application/vnd.ms-excel.template.macroEnabled.12" },
            { "xlam", "application/vnd.ms-excel.addin.macroEnabled.12" },
            { "xlsb", "application/vnd.ms-excel.sheet.binary.macroEnabled.12" },
            { "xslt", "application/xslt+xml" },
            { "xul", "application/vnd.mozilla.xul+xml" },
            { "xwd", "image/x-xwindowdump" },
            { "xyz", "chemical/x-xyz" },
            { "zip", "application/zip" }
        };
}
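【Discussion】:

I would sincerely appreciate any comment explaining the downvotes - I would really like to understand any potential flaws in this code.
I don't see any try-catch block in your code where you use GetMimeTypeFromFileInternal. So it looks like by default you only check the file extension, which doesn't really help if you are not sure what is actually in the file. And I still can't tell whether GetMimeTypeFromFileInternal works in Azure under limited trust. If it doesn't, why is it still in the code?
In a restricted execution context, the code can be limited to using only the list. Yes, there are more extensions, but only the developer knows the context of the application and can add more to the list. A try-catch is certainly a good addition.【Answer 18】:

I ended up using Winista MimeDetector from Netomatix. The source code can be downloaded for free after you create an account: http://www.netomatix.com/Products/DocumentManagement/MimeDetector.aspx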

MimeTypes g_MimeTypes = new MimeTypes("mime-types.xml");
sbyte [] fileData = null;

using (System.IO.FileStream srcFile = new System.IO.FileStream(strFile, System.IO.FileMode.Open))
{
    byte [] data = new byte[srcFile.Length];
    srcFile.Read(data, 0, (Int32)srcFile.Length);
    fileData = Winista.Mime.SupportUtil.ToSByteArray(data);
}

MimeType oMimeType = g_MimeTypes.GetMimeType(fileData);

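This is part of another question answered here: Alternative to FindMimeFromData method in Urlmon.dll one which has more MIME types. I think it is the best solution to this problem.

【Discussion】:

【Answer 19】:

I ran into several problems when running this code: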

UInt32 mimetype;
FindMimeFromData(0, null, buffer, 256, null, 0, out mimetype, 0);

If you try to run it on x64/Win10, you will get

AccessViolationException "Attempted to read or write protected memory.
This is often an indication that other memory is corrupt"

Thanks to the post "PtrToStringUni doesnt work in windows 10" and to @xanatos.

I modified my solution to run under x64 and .NET Core 2.1:

   [DllImport("urlmon.dll", CharSet = CharSet.Unicode, ExactSpelling = true, 
    SetLastError = false)]
    static extern int FindMimeFromData(IntPtr pBC,
        [MarshalAs(UnmanagedType.LPWStr)] string pwzUrl,
        [MarshalAs(UnmanagedType.LPArray, ArraySubType=UnmanagedType.I1, 
        SizeParamIndex=3)]
        byte[] pBuffer,
        int cbSize,
        [MarshalAs(UnmanagedType.LPWStr)] string pwzMimeProposed,
        int dwMimeFlags,
        out IntPtr ppwzMimeOut,
        int dwReserved);

   string getMimeFromFile(byte[] fileSource)
   {
            byte[] buffer = new byte[256];
            using (Stream stream = new MemoryStream(fileSource))
            {
                if (stream.Length >= 256)
                    stream.Read(buffer, 0, 256);
                else
                    stream.Read(buffer, 0, (int)stream.Length);
            }

            try
            {
                IntPtr mimeTypePtr;
                FindMimeFromData(IntPtr.Zero, null, buffer, buffer.Length,
                    null, 0, out mimeTypePtr, 0);

                string mime = Marshal.PtrToStringUni(mimeTypePtr);
                Marshal.FreeCoTaskMem(mimeTypePtr);
                return mime;
            }
            catch (Exception)
            {
                return "unknown/unknown";
            }
   }

Thanks
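A likely explanation for the access violation on x64 is that the older declaration stores the returned string pointer in a `UInt32`, which cannot hold a 64-bit address. The sketch below only illustrates that arithmetic and does not call urlmon.dll; `PointerTruncationDemo` and `SurvivesUInt32RoundTrip` are hypothetical names, and the sample addresses are made-up values.

```csharp
using System;

public static class PointerTruncationDemo
{
    // Simulates keeping an address in a 32-bit field and checks
    // whether the original value can still be recovered from it.
    public static bool SurvivesUInt32RoundTrip(long address)
    {
        uint truncated = unchecked((uint)address); // drops the upper 32 bits
        return truncated == address;
    }
}
```

An address such as `0x7FFF0000` survives the round trip, while a typical 64-bit user-mode address such as `0x00007FF812345678` does not. This is why the `IntPtr`-based declaration is the safer choice on any platform.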

【Discussion】:

【Answer 20】:

Hi, I have adapted the Winista.MimeDetect project to .NET Core/Framework, with a fallback to urlmon.dll. Feel free to use it: nuget package.

   //init
   var mimeTypes = new MimeTypes();

   //usage by filepath
   var mimeType1 = mimeTypes.GetMimeTypeFromFile(filePath);

【Discussion】:

The GitHub code sample here is wrong: github.com/GetoXs/MimeDetect. There is no overload mimeTypes.GetMimeTypeFromFile(bytes);
