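Using .NET, how can you find the MIME type of a file based on the file signature, not the extension?
【Posted】: 2015-12-06 07:02:23
【Question】: I am looking for a simple way to get the MIME type of a file when its extension is incorrect or missing, similar to this question, but in .NET.
【Question discussion】:
This sounds similar to this question.
When the requirements explicitly say not to use the extension, I wish I could delete all the "fake answers" that still rely on the file extension!
This may be an old question, but the problem still exists. I would downvote every answer here, because they only check for Windows executables by content; what about Linux or iOS executables, or other dangerous files?
@PhillipH Write an answer for those.
【Answer 1】:
I wrote a MIME type validator. Feel free to share it.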
using System.Collections.Generic;
using System.Linq;

private readonly Dictionary<string, byte[]> _mimeTypes = new Dictionary<string, byte[]>
{
    { "image/jpeg", new byte[] { 255, 216, 255 } },
    { "image/jpg", new byte[] { 255, 216, 255 } },
    { "image/pjpeg", new byte[] { 255, 216, 255 } },
    { "image/apng", new byte[] { 137, 80, 78, 71, 13, 10, 26, 10, 0, 0, 0, 13, 73, 72, 68, 82 } },
    { "image/png", new byte[] { 137, 80, 78, 71, 13, 10, 26, 10, 0, 0, 0, 13, 73, 72, 68, 82 } },
    { "image/bmp", new byte[] { 66, 77 } },
    { "image/gif", new byte[] { 71, 73, 70, 56 } },
};

private bool ValidateMimeType(byte[] file, string contentType)
{
    // An unregistered content type fails validation instead of throwing on a null signature.
    byte[] signature;
    if (contentType == null || !_mimeTypes.TryGetValue(contentType, out signature))
        return false;
    return file.Take(signature.Length).SequenceEqual(signature);
}
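【Discussion】:
【Answer 2】:
If you want to host an ASP.NET solution in a non-Windows environment, HeyRed.Mime.MimeGuesser.GuessMimeType from NuGet is the ultimate solution.
Mapping file extensions is very insecure. If an attacker uploads a file with a misleading extension, a mapping dictionary would, for example, allow an executable to be distributed inside a .jpg file. Therefore, always use a content-sniffing library to learn the real content type.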
public static string MimeTypeFrom(byte[] dataBytes, string fileName)
{
    var contentType = HeyRed.Mime.MimeGuesser.GuessMimeType(dataBytes);
    // Fall back to the extension only when content sniffing yields nothing.
    if (string.IsNullOrEmpty(contentType))
        return HeyRed.Mime.MimeTypesMap.GetMimeType(fileName);
    return contentType;
}
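【Discussion】:
The best library I have tried so far. It found the content type of every file I put in the folder. Plus .NET Core support!
Simply awesome. I also tried many libraries (NuGet packages, custom classes...). This one comes closest to file -bi [filename] on UNIX systems.
【Answer 3】:
Hi, I have adapted the Winista.MimeDetect project to .NET Core/Framework, with a fallback to urlmon.dll. Feel free to use it: nuget package.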
//init
var mimeTypes = new MimeTypes();
//usage by filepath
var mimeType1 = mimeTypes.GetMimeTypeFromFile(filePath);
【Discussion】:
The GitHub code sample here is wrong: github.com/GetoXs/MimeDetect. There is no overload mimeTypes.GetMimeTypeFromFile(bytes);
【Answer 4】:
I found several problems when running this code:
UInt32 mimetype;
FindMimeFromData(0, null, buffer, 256, null, 0, out mimetype, 0);
If you try to run it on x64/Win10, you get:
AccessViolationException "Attempted to read or write protected memory.
This is often an indication that other memory is corrupt"
Thanks to the post "PtrToStringUni doesnt work in windows 10" and to @xanatos,
I modified my solution to run under x64 and .NET Core 2.1:
[DllImport("urlmon.dll", CharSet = CharSet.Unicode, ExactSpelling = true,
SetLastError = false)]
static extern int FindMimeFromData(IntPtr pBC,
[MarshalAs(UnmanagedType.LPWStr)] string pwzUrl,
[MarshalAs(UnmanagedType.LPArray, ArraySubType=UnmanagedType.I1,
SizeParamIndex=3)]
byte[] pBuffer,
int cbSize,
[MarshalAs(UnmanagedType.LPWStr)] string pwzMimeProposed,
int dwMimeFlags,
out IntPtr ppwzMimeOut,
int dwReserved);
string getMimeFromFile(byte[] fileSource)
{
    byte[] buffer = new byte[256];
    using (Stream stream = new MemoryStream(fileSource))
    {
        // Read at most the first 256 bytes; shorter input leaves the rest zero-filled.
        if (stream.Length >= 256)
            stream.Read(buffer, 0, 256);
        else
            stream.Read(buffer, 0, (int)stream.Length);
    }
    try
    {
        IntPtr mimeTypePtr;
        FindMimeFromData(IntPtr.Zero, null, buffer, buffer.Length,
            null, 0, out mimeTypePtr, 0);
        string mime = Marshal.PtrToStringUni(mimeTypePtr);
        Marshal.FreeCoTaskMem(mimeTypePtr);
        return mime;
    }
    catch (Exception)
    {
        return "unknown/unknown";
    }
}
Thanks
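【Discussion】:
【Answer 5】:
If you are using .NET Framework 4.5 or later, there is now a MimeMapping.GetMimeMapping(filename) method that returns a string containing the correct MIME mapping for the file name passed in. Note that this uses the file extension, not the data in the file itself.
The documentation is at http://msdn.microsoft.com/en-us/library/system.web.mimemapping.getmimemapping
【Discussion】:
This worked for me and took only one line of code: var mimetype = System.Web.MimeMapping.GetMimeMapping(<pathToFile>);
This does not answer the original question, "if the file extension is incorrect or missing". GetMimeMapping only uses the extension and a static dictionary of MIME entries.
I found this class useful :)
I suggest editing your comment to note that this uses the file extension internally, which is easy to fake.
Normally I don't downvote answers, but since this one is misleading, I did. The question is about not trusting the file extension.
【Answer 6】:
I ended up using Winista MimeDetector from Netomatix. The source code can be downloaded for free after you create an account: http://www.netomatix.com/Products/DocumentManagement/MimeDetector.aspx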
MimeTypes g_MimeTypes = new MimeTypes("mime-types.xml");
sbyte[] fileData = null;
using (System.IO.FileStream srcFile = new System.IO.FileStream(strFile, System.IO.FileMode.Open))
{
    byte[] data = new byte[srcFile.Length];
    srcFile.Read(data, 0, (Int32)srcFile.Length);
    fileData = Winista.Mime.SupportUtil.ToSByteArray(data);
}
MimeType oMimeType = g_MimeTypes.GetMimeType(fileData);
This was part of another question answered here: Alternative to FindMimeFromData method in Urlmon.dll one which has more MIME types. I think it is the best solution to this problem.
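【Discussion】:
【Answer 7】:
When using a Windows Azure Web role, or any other host that runs your application in limited trust, do not forget that you will have no access to the registry or to unmanaged code. A hybrid approach, combining a try-catch around the registry lookup with an in-memory dictionary, looks like a good solution that covers everything.
I use this code to do it: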
public class DefaultMimeResolver : IMimeResolver
{
    private readonly IFileRepository _fileRepository;

    public DefaultMimeResolver(IFileRepository fileRepository)
    {
        _fileRepository = fileRepository;
    }

    // Pointer-sized parameters are declared as IntPtr; a UInt32 pointer breaks in 64-bit processes.
    [DllImport("urlmon.dll", CharSet = CharSet.Unicode, ExactSpelling = true, SetLastError = false)]
    private static extern int FindMimeFromData(
        IntPtr pBC,
        [MarshalAs(UnmanagedType.LPWStr)] string pwzUrl,
        [MarshalAs(UnmanagedType.LPArray, ArraySubType = UnmanagedType.I1, SizeParamIndex = 3)] byte[] pBuffer,
        int cbSize,
        [MarshalAs(UnmanagedType.LPWStr)] string pwzMimeProposed,
        int dwMimeFlags,
        out IntPtr ppwzMimeOut,
        int dwReserved);

    public string GetMimeTypeFromFileExtension(string fileExtension)
    {
        if (string.IsNullOrEmpty(fileExtension))
            throw new ArgumentNullException("fileExtension");

        string mimeType = GetMimeTypeFromList(fileExtension);
        if (String.IsNullOrEmpty(mimeType))
            mimeType = GetMimeTypeFromRegistry(fileExtension);
        return mimeType;
    }

    public string GetMimeTypeFromFile(string filePath)
    {
        if (string.IsNullOrEmpty(filePath))
            throw new ArgumentNullException("filePath");
        if (!File.Exists(filePath))
            throw new FileNotFoundException("File not found : ", filePath);

        // Order of lookups: hard-coded list, then registry, then content sniffing.
        string mimeType = GetMimeTypeFromList(Path.GetExtension(filePath).ToLower());
        if (String.IsNullOrEmpty(mimeType))
            mimeType = GetMimeTypeFromRegistry(Path.GetExtension(filePath).ToLower());
        if (String.IsNullOrEmpty(mimeType))
            mimeType = GetMimeTypeFromFileInternal(filePath);
        return mimeType;
    }

    private string GetMimeTypeFromList(string fileExtension)
    {
        string mimeType = null;
        if (fileExtension.StartsWith("."))
            fileExtension = fileExtension.TrimStart('.');
        if (!String.IsNullOrEmpty(fileExtension) && _mimeTypes.ContainsKey(fileExtension))
            mimeType = _mimeTypes[fileExtension];
        return mimeType;
    }

    private string GetMimeTypeFromRegistry(string fileExtension)
    {
        string mimeType = null;
        try
        {
            RegistryKey key = Registry.ClassesRoot.OpenSubKey(fileExtension);
            if (key != null && key.GetValue("Content Type") != null)
                mimeType = key.GetValue("Content Type").ToString();
        }
        catch (Exception)
        {
            // Empty. When this code is running in limited mode accessing registry is not allowed.
        }
        return mimeType;
    }

    private string GetMimeTypeFromFileInternal(string filePath)
    {
        string mimeType = null;
        if (!File.Exists(filePath))
            return null;

        byte[] byteBuffer = new byte[256];
        using (FileStream fileStream = _fileRepository.Get(filePath))
        {
            if (fileStream.Length >= 256)
                fileStream.Read(byteBuffer, 0, 256);
            else
                fileStream.Read(byteBuffer, 0, (int)fileStream.Length);
        }
        try
        {
            IntPtr mimeTypePtr;
            FindMimeFromData(IntPtr.Zero, null, byteBuffer, 256, null, 0, out mimeTypePtr, 0);
            string mimeTypeFromFile = Marshal.PtrToStringUni(mimeTypePtr);
            Marshal.FreeCoTaskMem(mimeTypePtr);
            if (!String.IsNullOrEmpty(mimeTypeFromFile) && mimeTypeFromFile != "text/plain" && mimeTypeFromFile != "application/octet-stream")
                mimeType = mimeTypeFromFile;
        }
        catch
        {
            // Empty.
        }
        return mimeType;
    }

    private readonly Dictionary<string, string> _mimeTypes = new Dictionary<string, string>
    {
        { "ai", "application/postscript" },
        { "aif", "audio/x-aiff" },
        { "aifc", "audio/x-aiff" },
        { "aiff", "audio/x-aiff" },
        { "asc", "text/plain" },
        { "atom", "application/atom+xml" },
        { "au", "audio/basic" },
        { "avi", "video/x-msvideo" },
        { "bcpio", "application/x-bcpio" },
        { "bin", "application/octet-stream" },
        { "bmp", "image/bmp" },
        { "cdf", "application/x-netcdf" },
        { "cgm", "image/cgm" },
        { "class", "application/octet-stream" },
        { "cpio", "application/x-cpio" },
        { "cpt", "application/mac-compactpro" },
        { "csh", "application/x-csh" },
        { "css", "text/css" },
        { "dcr", "application/x-director" },
        { "dif", "video/x-dv" },
        { "dir", "application/x-director" },
        { "djv", "image/vnd.djvu" },
        { "djvu", "image/vnd.djvu" },
        { "dll", "application/octet-stream" },
        { "dmg", "application/octet-stream" },
        { "dms", "application/octet-stream" },
        { "doc", "application/msword" },
        { "docx", "application/vnd.openxmlformats-officedocument.wordprocessingml.document" },
        { "dotx", "application/vnd.openxmlformats-officedocument.wordprocessingml.template" },
        { "docm", "application/vnd.ms-word.document.macroEnabled.12" },
        { "dotm", "application/vnd.ms-word.template.macroEnabled.12" },
        { "dtd", "application/xml-dtd" },
        { "dv", "video/x-dv" },
        { "dvi", "application/x-dvi" },
        { "dxr", "application/x-director" },
        { "eps", "application/postscript" },
        { "etx", "text/x-setext" },
        { "exe", "application/octet-stream" },
        { "ez", "application/andrew-inset" },
        { "gif", "image/gif" },
        { "gram", "application/srgs" },
        { "grxml", "application/srgs+xml" },
        { "gtar", "application/x-gtar" },
        { "hdf", "application/x-hdf" },
        { "hqx", "application/mac-binhex40" },
        { "htc", "text/x-component" },
        { "htm", "text/html" },
        { "html", "text/html" },
        { "ice", "x-conference/x-cooltalk" },
        { "ico", "image/x-icon" },
        { "ics", "text/calendar" },
        { "ief", "image/ief" },
        { "ifb", "text/calendar" },
        { "iges", "model/iges" },
        { "igs", "model/iges" },
        { "jnlp", "application/x-java-jnlp-file" },
        { "jp2", "image/jp2" },
        { "jpe", "image/jpeg" },
        { "jpeg", "image/jpeg" },
        { "jpg", "image/jpeg" },
        { "js", "application/x-javascript" },
        { "kar", "audio/midi" },
        { "latex", "application/x-latex" },
        { "lha", "application/octet-stream" },
        { "lzh", "application/octet-stream" },
        { "m3u", "audio/x-mpegurl" },
        { "m4a", "audio/mp4a-latm" },
        { "m4b", "audio/mp4a-latm" },
        { "m4p", "audio/mp4a-latm" },
        { "m4u", "video/vnd.mpegurl" },
        { "m4v", "video/x-m4v" },
        { "mac", "image/x-macpaint" },
        { "man", "application/x-troff-man" },
        { "mathml", "application/mathml+xml" },
        { "me", "application/x-troff-me" },
        { "mesh", "model/mesh" },
        { "mid", "audio/midi" },
        { "midi", "audio/midi" },
        { "mif", "application/vnd.mif" },
        { "mov", "video/quicktime" },
        { "movie", "video/x-sgi-movie" },
        { "mp2", "audio/mpeg" },
        { "mp3", "audio/mpeg" },
        { "mp4", "video/mp4" },
        { "mpe", "video/mpeg" },
        { "mpeg", "video/mpeg" },
        { "mpg", "video/mpeg" },
        { "mpga", "audio/mpeg" },
        { "ms", "application/x-troff-ms" },
        { "msh", "model/mesh" },
        { "mxu", "video/vnd.mpegurl" },
        { "nc", "application/x-netcdf" },
        { "oda", "application/oda" },
        { "ogg", "application/ogg" },
        { "pbm", "image/x-portable-bitmap" },
        { "pct", "image/pict" },
        { "pdb", "chemical/x-pdb" },
        { "pdf", "application/pdf" },
        { "pgm", "image/x-portable-graymap" },
        { "pgn", "application/x-chess-pgn" },
        { "pic", "image/pict" },
        { "pict", "image/pict" },
        { "png", "image/png" },
        { "pnm", "image/x-portable-anymap" },
        { "pnt", "image/x-macpaint" },
        { "pntg", "image/x-macpaint" },
        { "ppm", "image/x-portable-pixmap" },
        { "ppt", "application/vnd.ms-powerpoint" },
        { "pptx", "application/vnd.openxmlformats-officedocument.presentationml.presentation" },
        { "potx", "application/vnd.openxmlformats-officedocument.presentationml.template" },
        { "ppsx", "application/vnd.openxmlformats-officedocument.presentationml.slideshow" },
        { "ppam", "application/vnd.ms-powerpoint.addin.macroEnabled.12" },
        { "pptm", "application/vnd.ms-powerpoint.presentation.macroEnabled.12" },
        { "potm", "application/vnd.ms-powerpoint.template.macroEnabled.12" },
        { "ppsm", "application/vnd.ms-powerpoint.slideshow.macroEnabled.12" },
        { "ps", "application/postscript" },
        { "qt", "video/quicktime" },
        { "qti", "image/x-quicktime" },
        { "qtif", "image/x-quicktime" },
        { "ra", "audio/x-pn-realaudio" },
        { "ram", "audio/x-pn-realaudio" },
        { "ras", "image/x-cmu-raster" },
        { "rdf", "application/rdf+xml" },
        { "rgb", "image/x-rgb" },
        { "rm", "application/vnd.rn-realmedia" },
        { "roff", "application/x-troff" },
        { "rtf", "text/rtf" },
        { "rtx", "text/richtext" },
        { "sgm", "text/sgml" },
        { "sgml", "text/sgml" },
        { "sh", "application/x-sh" },
        { "shar", "application/x-shar" },
        { "silo", "model/mesh" },
        { "sit", "application/x-stuffit" },
        { "skd", "application/x-koan" },
        { "skm", "application/x-koan" },
        { "skp", "application/x-koan" },
        { "skt", "application/x-koan" },
        { "smi", "application/smil" },
        { "smil", "application/smil" },
        { "snd", "audio/basic" },
        { "so", "application/octet-stream" },
        { "spl", "application/x-futuresplash" },
        { "src", "application/x-wais-source" },
        { "sv4cpio", "application/x-sv4cpio" },
        { "sv4crc", "application/x-sv4crc" },
        { "svg", "image/svg+xml" },
        { "swf", "application/x-shockwave-flash" },
        { "t", "application/x-troff" },
        { "tar", "application/x-tar" },
        { "tcl", "application/x-tcl" },
        { "tex", "application/x-tex" },
        { "texi", "application/x-texinfo" },
        { "texinfo", "application/x-texinfo" },
        { "tif", "image/tiff" },
        { "tiff", "image/tiff" },
        { "tr", "application/x-troff" },
        { "tsv", "text/tab-separated-values" },
        { "txt", "text/plain" },
        { "ustar", "application/x-ustar" },
        { "vcd", "application/x-cdlink" },
        { "vrml", "model/vrml" },
        { "vxml", "application/voicexml+xml" },
        { "wav", "audio/x-wav" },
        { "wbmp", "image/vnd.wap.wbmp" },
        { "wbxml", "application/vnd.wap.wbxml" },
        { "wml", "text/vnd.wap.wml" },
        { "wmlc", "application/vnd.wap.wmlc" },
        { "wmls", "text/vnd.wap.wmlscript" },
        { "wmlsc", "application/vnd.wap.wmlscriptc" },
        { "wrl", "model/vrml" },
        { "xbm", "image/x-xbitmap" },
        { "xht", "application/xhtml+xml" },
        { "xhtml", "application/xhtml+xml" },
        { "xls", "application/vnd.ms-excel" },
        { "xml", "application/xml" },
        { "xpm", "image/x-xpixmap" },
        { "xsl", "application/xml" },
        { "xlsx", "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet" },
        { "xltx", "application/vnd.openxmlformats-officedocument.spreadsheetml.template" },
        { "xlsm", "application/vnd.ms-excel.sheet.macroEnabled.12" },
        { "xltm", "application/vnd.ms-excel.template.macroEnabled.12" },
        { "xlam", "application/vnd.ms-excel.addin.macroEnabled.12" },
        { "xlsb", "application/vnd.ms-excel.sheet.binary.macroEnabled.12" },
        { "xslt", "application/xslt+xml" },
        { "xul", "application/vnd.mozilla.xul+xml" },
        { "xwd", "image/x-xwindowdump" },
        { "xyz", "chemical/x-xyz" },
        { "zip", "application/zip" }
    };
}
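【Discussion】:
I would sincerely appreciate any comment on the downvote. I would really like to understand any potential problem with this code.
I don't see any try-catch block in your code where you use GetMimeTypeFromFileInternal. So it looks like by default you only check the file extension, which doesn't really help if you aren't sure what is actually in the file. And I still can't tell whether GetMimeTypeFromFileInternal works in Azure under limited trust. If not, why is it still in the code?
In a restricted execution context, the code can be limited to using only the list. Yes, there are more extensions, but only the developer knows the application's context and can add more to the list. And certainly, try-catch is a good addition.
【Answer 8】:
Edit: just use Mime Detective.
I use byte array sequences to determine the correct MIME type of a given file. The advantage over just looking at the file name's extension is that if a user renames a file to get around file-type upload restrictions, an extension check will not catch it. Reading the file signature from the byte array, on the other hand, prevents this kind of mischief.
Here is a C# example: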
public class MimeType
{
    private static readonly byte[] BMP = { 66, 77 };
    private static readonly byte[] DOC = { 208, 207, 17, 224, 161, 177, 26, 225 };
    private static readonly byte[] EXE_DLL = { 77, 90 };
    private static readonly byte[] GIF = { 71, 73, 70, 56 };
    private static readonly byte[] ICO = { 0, 0, 1, 0 };
    private static readonly byte[] JPG = { 255, 216, 255 };
    private static readonly byte[] MP3 = { 255, 251, 48 };
    private static readonly byte[] OGG = { 79, 103, 103, 83, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0 };
    private static readonly byte[] PDF = { 37, 80, 68, 70, 45, 49, 46 };
    private static readonly byte[] PNG = { 137, 80, 78, 71, 13, 10, 26, 10, 0, 0, 0, 13, 73, 72, 68, 82 };
    private static readonly byte[] RAR = { 82, 97, 114, 33, 26, 7, 0 };
    private static readonly byte[] SWF = { 70, 87, 83 };
    private static readonly byte[] TIFF = { 73, 73, 42, 0 };
    private static readonly byte[] TORRENT = { 100, 56, 58, 97, 110, 110, 111, 117, 110, 99, 101 };
    private static readonly byte[] TTF = { 0, 1, 0, 0, 0 };
    private static readonly byte[] WAV_AVI = { 82, 73, 70, 70 };
    private static readonly byte[] WMV_WMA = { 48, 38, 178, 117, 142, 102, 207, 17, 166, 217, 0, 170, 0, 98, 206, 108 };
    private static readonly byte[] ZIP_DOCX = { 80, 75, 3, 4 };

    public static string GetMimeType(byte[] file, string fileName)
    {
        string mime = "application/octet-stream"; //DEFAULT UNKNOWN MIME TYPE

        //Ensure that the filename isn't empty or null
        if (string.IsNullOrWhiteSpace(fileName))
            return mime;

        //Get the file extension
        string extension = Path.GetExtension(fileName) == null
            ? string.Empty
            : Path.GetExtension(fileName).ToUpper();

        //Get the MIME Type
        if (file.Take(2).SequenceEqual(BMP))
            mime = "image/bmp";
        else if (file.Take(8).SequenceEqual(DOC))
            mime = "application/msword";
        else if (file.Take(2).SequenceEqual(EXE_DLL))
            mime = "application/x-msdownload"; //both use same mime type
        else if (file.Take(4).SequenceEqual(GIF))
            mime = "image/gif";
        else if (file.Take(4).SequenceEqual(ICO))
            mime = "image/x-icon";
        else if (file.Take(3).SequenceEqual(JPG))
            mime = "image/jpeg";
        else if (file.Take(3).SequenceEqual(MP3))
            mime = "audio/mpeg";
        else if (file.Take(14).SequenceEqual(OGG))
        {
            if (extension == ".OGX")
                mime = "application/ogg";
            else if (extension == ".OGA")
                mime = "audio/ogg";
            else
                mime = "video/ogg";
        }
        else if (file.Take(7).SequenceEqual(PDF))
            mime = "application/pdf";
        else if (file.Take(16).SequenceEqual(PNG))
            mime = "image/png";
        else if (file.Take(7).SequenceEqual(RAR))
            mime = "application/x-rar-compressed";
        else if (file.Take(3).SequenceEqual(SWF))
            mime = "application/x-shockwave-flash";
        else if (file.Take(4).SequenceEqual(TIFF))
            mime = "image/tiff";
        else if (file.Take(11).SequenceEqual(TORRENT))
            mime = "application/x-bittorrent";
        else if (file.Take(5).SequenceEqual(TTF))
            mime = "application/x-font-ttf";
        else if (file.Take(4).SequenceEqual(WAV_AVI))
            mime = extension == ".AVI" ? "video/x-msvideo" : "audio/x-wav";
        else if (file.Take(16).SequenceEqual(WMV_WMA))
            mime = extension == ".WMA" ? "audio/x-ms-wma" : "video/x-ms-wmv";
        else if (file.Take(4).SequenceEqual(ZIP_DOCX))
            mime = extension == ".DOCX" ? "application/vnd.openxmlformats-officedocument.wordprocessingml.document" : "application/x-zip-compressed";

        return mime;
    }
}
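Note that I handle the DOCX file type differently, because a DOCX file is really just a ZIP file. In that case, once I have confirmed that the file has that sequence, I simply check the file extension. This example is far from complete for some people, but you can easily add your own types.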
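As a rough sketch of looking deeper than the ZIP header instead of trusting the extension, the archive can be opened and checked for the part each Office format conventionally stores. The names OpenXmlSniffer and GetZipBasedMimeType are hypothetical, and the entry paths word/document.xml, xl/workbook.xml and ppt/presentation.xml are the usual locations rather than a guarantee; macro-enabled and template files would need extra checks.

```csharp
using System.IO;
using System.IO.Compression;

public static class OpenXmlSniffer
{
    // Hypothetical helper: classifies a ZIP-based file by its conventional main part.
    public static string GetZipBasedMimeType(byte[] file)
    {
        try
        {
            using (var stream = new MemoryStream(file))
            using (var archive = new ZipArchive(stream, ZipArchiveMode.Read))
            {
                // Assumption: each Office package keeps its main part at the usual path.
                if (archive.GetEntry("word/document.xml") != null)
                    return "application/vnd.openxmlformats-officedocument.wordprocessingml.document";
                if (archive.GetEntry("xl/workbook.xml") != null)
                    return "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet";
                if (archive.GetEntry("ppt/presentation.xml") != null)
                    return "application/vnd.openxmlformats-officedocument.presentationml.presentation";
                return "application/zip";
            }
        }
        catch (InvalidDataException)
        {
            // The bytes are not a readable ZIP archive after all.
            return "application/octet-stream";
        }
    }
}
```

This reads the whole archive directory, so it costs more than a header check, but a renamed extension no longer changes the result.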
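If you want to add more MIME types, you can get the byte array sequences of many different file types from here. Also, here is another good resource on file signatures.
If all else fails, what I often do is step through several files of the particular type I am looking for and look for a pattern in their byte sequences. In the end, this is still basic validation and cannot determine a file's type with 100% certainty.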
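One way to make adding signatures easier is to keep them in a table and loop over it rather than extending an if-else chain. This is only a minimal sketch under that assumption: FileSignature and SignatureTable are made-up names, and the table holds just a few of the signatures from the example above.

```csharp
using System.Collections.Generic;
using System.Linq;

public sealed class FileSignature
{
    public FileSignature(string mimeType, byte[] magic)
    {
        MimeType = mimeType;
        Magic = magic;
    }

    public string MimeType { get; }
    public byte[] Magic { get; }

    // True when the file is long enough and starts with the magic bytes.
    public bool Matches(byte[] file)
    {
        return file.Length >= Magic.Length && file.Take(Magic.Length).SequenceEqual(Magic);
    }
}

public static class SignatureTable
{
    // Illustrative subset only; extend with the signatures you need.
    private static readonly List<FileSignature> Signatures = new List<FileSignature>
    {
        new FileSignature("image/png", new byte[] { 137, 80, 78, 71, 13, 10, 26, 10 }),
        new FileSignature("application/pdf", new byte[] { 37, 80, 68, 70, 45 }),
        new FileSignature("image/gif", new byte[] { 71, 73, 70, 56 }),
        new FileSignature("application/zip", new byte[] { 80, 75, 3, 4 }),
        new FileSignature("image/jpeg", new byte[] { 255, 216, 255 }),
    };

    public static string Detect(byte[] file)
    {
        // Longest signatures are tried first so a short prefix cannot shadow a longer one.
        foreach (var signature in Signatures.OrderByDescending(s => s.Magic.Length))
        {
            if (signature.Matches(file))
                return signature.MimeType;
        }
        return "application/octet-stream";
    }
}
```

Adding a type then means adding one row, for example a new FileSignature entry for RAR, with no change to Detect.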
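【Discussion】:
Thanks @ROFLwTIME. I have improved on this a bit for the case where we have a byte array but no file name/extension. (Of course, for some MIME types it needs a default, or further enhancement, to identify them correctly.) If anyone wants me to post the code, let me know.
+1 for using bytes. Now it is even possible to take the expected number of bytes for a specific MIME type to test against (rather than the default 256). However, I would go for a struct here that holds the extension, the bytes, and the MIME type as properties, and probably keep a dictionary of predefined structs. That would save me the endless if-else checks.
The problem with this approach is that, for example, a text file starting with "MZ" will be interpreted as an .EXE file. In other words, you should at least consider the extension in all cases, plus longer signatures or per-format heuristics to avoid false positives.
@Nutshell I believe XLS has a 0 byte at the end where DOC does not, so check XLS first, then DOC. As for XLSX/DOCX, they do share the same signature (ZIP), so to tell them apart you need to look deeper than the header. For example, an XLSX file has the string "xl/_rels/workbook.xml.rels" near the header, and a DOCX file has the string "word/_rels/document.xml.rels" near the header. This is just one of many ways to try to distinguish these particular types, and it certainly won't cover 100% of scenarios (for example, a Zip file that contains DOCX/XLSX files).
Hi all. I am the person who forked the original FileTypeDetective into MimeDetective on GitHub. I'm glad if it is useful. I have talked with the developer, trailmax, and we have changed the license to MIT!
【Answer 9】:
@Steve Morgan and @Richard Gourlay, this is a great solution, thank you. One small drawback is that when the file contains 255 bytes or fewer, the MIME type sometimes comes out as "application/octet-stream", which is slightly inaccurate for files expected to produce "text/plain". I have updated your original method to handle this case, as follows:
If the file contains 255 bytes or fewer and the inferred MIME type is "application/octet-stream", create a new byte array made up of the original file bytes repeated n times until the total number of bytes is >= 256. Then re-check the MIME type of that new byte array.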
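For readers working in C#, the padding step alone might look like the sketch below. BufferPadding and RepeatToLength are hypothetical names; the result would then be passed to FindMimeFromData in place of the short buffer.

```csharp
using System;

public static class BufferPadding
{
    // Repeats the source bytes until the result is exactly targetLength bytes long.
    public static byte[] RepeatToLength(byte[] source, int targetLength)
    {
        if (source == null || source.Length == 0)
            throw new ArgumentException("source must contain at least one byte", nameof(source));

        var padded = new byte[targetLength];
        for (int i = 0; i < targetLength; i++)
        {
            // Wrap around to the start of the source once its end is reached.
            padded[i] = source[i % source.Length];
        }
        return padded;
    }
}
```

For a 3-byte file, BufferPadding.RepeatToLength(bytes, 256) yields 256 bytes that cycle through the same 3 values.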
Modified method:
Imports System.IO
Imports System.Linq
Imports System.Runtime.InteropServices

'IntPtr is used for the pointer parameters so the call also works in 64-bit processes.
<DllImport("urlmon.dll", CharSet:=CharSet.Unicode, ExactSpelling:=True, SetLastError:=False)>
Private Shared Function FindMimeFromData(pBC As IntPtr,
        <MarshalAs(UnmanagedType.LPWStr)> pwzUrl As String,
        <MarshalAs(UnmanagedType.LPArray, ArraySubType:=UnmanagedType.I1, SizeParamIndex:=3)> pBuffer As Byte(),
        cbSize As Integer,
        <MarshalAs(UnmanagedType.LPWStr)> pwzMimeProposed As String,
        dwMimeFlags As Integer,
        ByRef ppwzMimeOut As IntPtr,
        dwReserved As Integer) As Integer
End Function

Private Function GetMimeType(ByVal f As FileInfo) As String
    'See http://***.com/questions/58510/using-net-how-can-you-find-the-mime-type-of-a-file-based-on-the-file-signature
    Dim returnValue As String = ""
    Dim fileStream As FileStream = Nothing
    Dim fileStreamLength As Long = 0
    Dim fileStreamIsLessThanBByteSize As Boolean = False
    Const byteSize As Integer = 255
    Const bbyteSize As Integer = byteSize + 1
    Const ambiguousMimeType As String = "application/octet-stream"
    Const unknownMimeType As String = "unknown/unknown"
    Dim buffer As Byte() = New Byte(byteSize) {}
    Dim fnGetMimeTypeValue As New Func(Of Byte(), Integer, String)(
        Function(_buffer As Byte(), _bbyteSize As Integer) As String
            Dim mimeTypePtr As IntPtr = IntPtr.Zero
            FindMimeFromData(IntPtr.Zero, Nothing, _buffer, _bbyteSize, Nothing, 0, mimeTypePtr, 0)
            Dim _returnValue As String = Marshal.PtrToStringUni(mimeTypePtr)
            Marshal.FreeCoTaskMem(mimeTypePtr)
            Return _returnValue
        End Function)
    If (f.Exists()) Then
        Try
            fileStream = New FileStream(f.FullName(), FileMode.Open, FileAccess.Read, FileShare.ReadWrite)
            fileStreamLength = fileStream.Length()
            If (fileStreamLength >= bbyteSize) Then
                fileStream.Read(buffer, 0, bbyteSize)
            Else
                fileStreamIsLessThanBByteSize = True
                fileStream.Read(buffer, 0, CInt(fileStreamLength))
            End If
            returnValue = fnGetMimeTypeValue(buffer, bbyteSize)
            If (returnValue.Equals(ambiguousMimeType, StringComparison.OrdinalIgnoreCase) AndAlso fileStreamIsLessThanBByteSize AndAlso fileStreamLength > 0) Then
                'Duplicate the stream content until the stream length is >= bbyteSize to get a more deterministic mime type analysis.
                Dim currentBuffer As Byte() = buffer.Take(CInt(fileStreamLength)).ToArray()
                Dim repeatCount As Integer = CInt(Math.Floor(bbyteSize / fileStreamLength)) + 1
                Dim bBufferList As List(Of Byte) = New List(Of Byte)
                While (repeatCount > 0)
                    bBufferList.AddRange(currentBuffer)
                    repeatCount -= 1
                End While
                Dim bbuffer As Byte() = bBufferList.Take(bbyteSize).ToArray()
                returnValue = fnGetMimeTypeValue(bbuffer, bbyteSize)
            End If
        Catch ex As Exception
            returnValue = unknownMimeType
        Finally
            If (fileStream IsNot Nothing) Then fileStream.Close()
        End Try
    End If
    Return returnValue
End Function
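【Discussion】:
This is the problem I ran into, and your idea of repeating the bytes is great. I had to implement it in C#, but using the file's length and a buffer holding the first bytes of the file, I was able to loop over all the missing bytes and copy bytes within the array to repeat the file (at each index I just copied the byte one file length earlier in the array).
【Answer 10】:
This answer is a copy of the author's answer (Richard Gourlay), improved to fix a problem on IIS 8 / Win2012 (the function would crash the application pool), based on Rohland's comment pointing to http://www.pinvoke.net/default.aspx/urlmon.findmimefromdata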
using System.Runtime.InteropServices;
...
public static string GetMimeFromFile(string filename)
{
    if (!File.Exists(filename))
        throw new FileNotFoundException(filename + " not found");

    const int maxContent = 256;
    var buffer = new byte[maxContent];
    using (var fs = new FileStream(filename, FileMode.Open))
    {
        if (fs.Length >= maxContent)
            fs.Read(buffer, 0, maxContent);
        else
            fs.Read(buffer, 0, (int) fs.Length);
    }

    var mimeTypePtr = IntPtr.Zero;
    try
    {
        var result = FindMimeFromData(IntPtr.Zero, null, buffer, maxContent, null, 0, out mimeTypePtr, 0);
        if (result != 0)
            throw Marshal.GetExceptionForHR(result);
        return Marshal.PtrToStringUni(mimeTypePtr);
    }
    catch (Exception)
    {
        return "unknown/unknown";
    }
    finally
    {
        // Free the native string exactly once on every path (the original could free it twice).
        if (mimeTypePtr != IntPtr.Zero)
            Marshal.FreeCoTaskMem(mimeTypePtr);
    }
}

[DllImport("urlmon.dll", CharSet = CharSet.Unicode, ExactSpelling = true, SetLastError = false)]
private static extern int FindMimeFromData(IntPtr pBC,
    [MarshalAs(UnmanagedType.LPWStr)] string pwzUrl,
    [MarshalAs(UnmanagedType.LPArray, ArraySubType = UnmanagedType.I1, SizeParamIndex = 3)] byte[] pBuffer,
    int cbSize,
    [MarshalAs(UnmanagedType.LPWStr)] string pwzMimeProposed,
    int dwMimeFlags,
    out IntPtr ppwzMimeOut,
    int dwReserved);
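【Discussion】:
Nice C+P answer. Chris
【Answer 11】:
This class draws on the previous answers to try three different approaches: a hard-coded extension list, the FindMimeFromData API, and the registry.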
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.InteropServices;
using Microsoft.Win32;

namespace YourNamespace
{
    public static class MimeTypeParser
    {
        // Pointer-sized parameters are declared as IntPtr; a UInt32 pointer breaks in 64-bit processes.
        [DllImport("urlmon.dll", CharSet = CharSet.Unicode, ExactSpelling = true, SetLastError = false)]
        private static extern int FindMimeFromData(
            IntPtr pBC,
            [MarshalAs(UnmanagedType.LPWStr)] string pwzUrl,
            [MarshalAs(UnmanagedType.LPArray, ArraySubType = UnmanagedType.I1, SizeParamIndex = 3)] byte[] pBuffer,
            int cbSize,
            [MarshalAs(UnmanagedType.LPWStr)] string pwzMimeProposed,
            int dwMimeFlags,
            out IntPtr ppwzMimeOut,
            int dwReserved
        );

        public static string GetMimeType(string sFilePath)
        {
            string sMimeType = GetMimeTypeFromList(sFilePath);
            if (String.IsNullOrEmpty(sMimeType))
            {
                sMimeType = GetMimeTypeFromFile(sFilePath);
                if (String.IsNullOrEmpty(sMimeType))
                    sMimeType = GetMimeTypeFromRegistry(sFilePath);
            }
            return sMimeType;
        }

        public static string GetMimeTypeFromList(string sFileNameOrPath)
        {
            string sMimeType = null;
            // TrimStart avoids the exception Substring(1) throws when the path has no extension.
            string sExtensionWithoutDot = Path.GetExtension(sFileNameOrPath).TrimStart('.').ToLower();
            if (!String.IsNullOrEmpty(sExtensionWithoutDot) && spDicMIMETypes.ContainsKey(sExtensionWithoutDot))
                sMimeType = spDicMIMETypes[sExtensionWithoutDot];
            return sMimeType;
        }

        public static string GetMimeTypeFromRegistry(string sFileNameOrPath)
        {
            string sMimeType = null;
            string sExtension = Path.GetExtension(sFileNameOrPath).ToLower();
            RegistryKey pKey = Registry.ClassesRoot.OpenSubKey(sExtension);
            if (pKey != null && pKey.GetValue("Content Type") != null)
                sMimeType = pKey.GetValue("Content Type").ToString();
            return sMimeType;
        }

        public static string GetMimeTypeFromFile(string sFilePath)
        {
            string sMimeType = null;
            if (File.Exists(sFilePath))
            {
                byte[] abytBuffer = new byte[256];
                using (FileStream pFileStream = new FileStream(sFilePath, FileMode.Open))
                {
                    if (pFileStream.Length >= 256)
                        pFileStream.Read(abytBuffer, 0, 256);
                    else
                        pFileStream.Read(abytBuffer, 0, (int)pFileStream.Length);
                }
                try
                {
                    IntPtr pMimeType;
                    FindMimeFromData(IntPtr.Zero, null, abytBuffer, 256, null, 0, out pMimeType, 0);
                    string sMimeTypeFromFile = Marshal.PtrToStringUni(pMimeType);
                    Marshal.FreeCoTaskMem(pMimeType);
                    if (!String.IsNullOrEmpty(sMimeTypeFromFile) && sMimeTypeFromFile != "text/plain" && sMimeTypeFromFile != "application/octet-stream")
                        sMimeType = sMimeTypeFromFile;
                }
                catch
                {
                    // Ignored: sniffing is best effort, so the caller falls through to the registry.
                }
            }
            return sMimeType;
        }

        private static readonly Dictionary<string, string> spDicMIMETypes = new Dictionary<string, string>
        {
            { "ai", "application/postscript" },
            { "aif", "audio/x-aiff" },
            { "aifc", "audio/x-aiff" },
            { "aiff", "audio/x-aiff" },
            { "asc", "text/plain" },
            { "atom", "application/atom+xml" },
            { "au", "audio/basic" },
            { "avi", "video/x-msvideo" },
            { "bcpio", "application/x-bcpio" },
            { "bin", "application/octet-stream" },
            { "bmp", "image/bmp" },
            { "cdf", "application/x-netcdf" },
            { "cgm", "image/cgm" },
            { "class", "application/octet-stream" },
            { "cpio", "application/x-cpio" },
            { "cpt", "application/mac-compactpro" },
            { "csh", "application/x-csh" },
            { "css", "text/css" },
            { "dcr", "application/x-director" },
            { "dif", "video/x-dv" },
            { "dir", "application/x-director" },
            { "djv", "image/vnd.djvu" },
            { "djvu", "image/vnd.djvu" },
            { "dll", "application/octet-stream" },
            { "dmg", "application/octet-stream" },
            { "dms", "application/octet-stream" },
            { "doc", "application/msword" },
            { "docx", "application/vnd.openxmlformats-officedocument.wordprocessingml.document" },
            { "dotx", "application/vnd.openxmlformats-officedocument.wordprocessingml.template" },
            { "docm", "application/vnd.ms-word.document.macroEnabled.12" },
            { "dotm", "application/vnd.ms-word.template.macroEnabled.12" },
            { "dtd", "application/xml-dtd" },
            { "dv", "video/x-dv" },
            { "dvi", "application/x-dvi" },
            { "dxr", "application/x-director" },
            { "eps", "application/postscript" },
            { "etx", "text/x-setext" },
            { "exe", "application/octet-stream" },
            { "ez", "application/andrew-inset" },
            { "gif", "image/gif" },
            { "gram", "application/srgs" },
            { "grxml", "application/srgs+xml" },
            { "gtar", "application/x-gtar" },
            { "hdf", "application/x-hdf" },
            { "hqx", "application/mac-binhex40" },
            { "htc", "text/x-component" },
            { "htm", "text/html" },
            { "html", "text/html" },
            { "ice", "x-conference/x-cooltalk" },
            { "ico", "image/x-icon" },
            { "ics", "text/calendar" },
            { "ief", "image/ief" },
            { "ifb", "text/calendar" },
            { "iges", "model/iges" },
            { "igs", "model/iges" },
            { "jnlp", "application/x-java-jnlp-file" },
            { "jp2", "image/jp2" },
            { "jpe", "image/jpeg" },
            { "jpeg", "image/jpeg" },
            { "jpg", "image/jpeg" },
            { "js", "application/x-javascript" },
            { "kar", "audio/midi" },
            { "latex", "application/x-latex" },
            { "lha", "application/octet-stream" },
            { "lzh", "application/octet-stream" },
            { "m3u", "audio/x-mpegurl" },
            { "m4a", "audio/mp4a-latm" },
            { "m4b", "audio/mp4a-latm" },
            { "m4p", "audio/mp4a-latm" },
            { "m4u", "video/vnd.mpegurl" },
            { "m4v", "video/x-m4v" },
            { "mac", "image/x-macpaint" },
            { "man", "application/x-troff-man" },
            { "mathml", "application/mathml+xml" },
            { "me", "application/x-troff-me" },
            { "mesh", "model/mesh" },
            { "mid", "audio/midi" },
            { "midi", "audio/midi" },
            { "mif", "application/vnd.mif" },
            { "mov", "video/quicktime" },
            { "movie", "video/x-sgi-movie" },
            { "mp2", "audio/mpeg" },
            { "mp3", "audio/mpeg" },
            { "mp4", "video/mp4" },
            { "mpe", "video/mpeg" },
            { "mpeg", "video/mpeg" },
            { "mpg", "video/mpeg" },
            { "mpga", "audio/mpeg" },
            { "ms", "application/x-troff-ms" },
            { "msh", "model/mesh" },
            { "mxu", "video/vnd.mpegurl" },
            { "nc", "application/x-netcdf" },
            { "oda", "application/oda" },
            { "ogg", "application/ogg" },
            { "pbm", "image/x-portable-bitmap" },
            { "pct", "image/pict" },
            { "pdb", "chemical/x-pdb" },
            { "pdf", "application/pdf" },
            { "pgm", "image/x-portable-graymap" },
            { "pgn", "application/x-chess-pgn" },
            { "pic", "image/pict" },
            { "pict", "image/pict" },
            { "png", "image/png" },
            { "pnm", "image/x-portable-anymap" },
            { "pnt", "image/x-macpaint" },
            { "pntg", "image/x-macpaint" },
            { "ppm", "image/x-portable-pixmap" },
            { "ppt", "application/vnd.ms-powerpoint" },
            { "pptx", "application/vnd.openxmlformats-officedocument.presentationml.presentation" },
            { "potx", "application/vnd.openxmlformats-officedocument.presentationml.template" },
            { "ppsx", "application/vnd.openxmlformats-officedocument.presentationml.slideshow" },
            { "ppam", "application/vnd.ms-powerpoint.addin.macroEnabled.12" },
            { "pptm", "application/vnd.ms-powerpoint.presentation.macroEnabled.12" },
            { "potm", "application/vnd.ms-powerpoint.template.macroEnabled.12" },
            { "ppsm", "application/vnd.ms-powerpoint.slideshow.macroEnabled.12" },
            { "ps", "application/postscript" },
            { "qt", "video/quicktime" },
            { "qti", "image/x-quicktime" },
            { "qtif", "image/x-quicktime" },
            { "ra", "audio/x-pn-realaudio" },
            { "ram", "audio/x-pn-realaudio" },
            { "ras", "image/x-cmu-raster" },
            { "rdf", "application/rdf+xml" },
            { "rgb", "image/x-rgb" },
            { "rm", "application/vnd.rn-realmedia" },
            { "roff", "application/x-troff" },
            { "rtf", "text/rtf" },
            { "rtx", "text/richtext" },
            { "sgm", "text/sgml" },
            { "sgml", "text/sgml" },
            { "sh", "application/x-sh" },
            { "shar", "application/x-shar" },
            { "silo", "model/mesh" },
            { "sit", "application/x-stuffit" },
            { "skd", "application/x-koan" },
            { "skm", "application/x-koan" },
            { "skp", "application/x-koan" },
            { "skt", "application/x-koan" },
            { "smi", "application/smil" },
            { "smil", "application/smil" },
            { "snd", "audio/basic" },
            { "so", "application/octet-stream" },
            { "spl", "application/x-futuresplash" },
            { "src", "application/x-wais-source" },
            { "sv4cpio", "application/x-sv4cpio" },
            { "sv4crc", "application/x-sv4crc" },
            { "svg", "image/svg+xml" },
            { "swf", "application/x-shockwave-flash" },
            { "t", "application/x-troff" },
            { "tar", "application/x-tar" },
            { "tcl", "application/x-tcl" },
            { "tex", "application/x-tex" },
            { "texi", "application/x-texinfo" },
            { "texinfo", "application/x-texinfo" },
            { "tif", "image/tiff" },
            { "tiff", "image/tiff" },
            { "tr", "application/x-troff" },
            { "tsv", "text/tab-separated-values" },
            { "txt", "text/plain" },
            { "ustar", "application/x-ustar" },
            { "vcd", "application/x-cdlink" },
            { "vrml", "model/vrml" },
            { "vxml", "application/voicexml+xml" },
            { "wav", "audio/x-wav" },
            { "wbmp", "image/vnd.wap.wbmp" },
            { "wbxml", "application/vnd.wap.wbxml" },
            { "wml", "text/vnd.wap.wml" },
            { "wmlc", "application/vnd.wap.wmlc" },
            { "wmls", "text/vnd.wap.wmlscript" },
            { "wmlsc", "application/vnd.wap.wmlscriptc" },
            { "wrl", "model/vrml" },
            { "xbm", "image/x-xbitmap" },
            { "xht", "application/xhtml+xml" },
            { "xhtml", "application/xhtml+xml" },
            { "xls", "application/vnd.ms-excel" },
            { "xml", "application/xml" },
            { "xpm", "image/x-xpixmap" },
            { "xsl", "application/xml" },
            { "xlsx", "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet" },
            { "xltx", "application/vnd.openxmlformats-officedocument.spreadsheetml.template" },
            { "xlsm", "application/vnd.ms-excel.sheet.macroEnabled.12" },
            { "xltm", "application/vnd.ms-excel.template.macroEnabled.12" },
            { "xlam", "application/vnd.ms-excel.addin.macroEnabled.12" },
            { "xlsb", "application/vnd.ms-excel.sheet.binary.macroEnabled.12" },
            { "xslt", "application/xslt+xml" },
            { "xul", "application/vnd.mozilla.xul+xml" },
            { "xwd", "image/x-xwindowdump" },
            { "xyz", "chemical/x-xyz" },
            { "zip", "application/zip" }
        };
    }
}
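【Discussion】:
Don't forget a try-catch around the registry access. You will not be allowed to access it in restricted mode, which is the case for Azure Web roles in limited trust and for any other host with limited trust.
【Answer 12】:
I found a hard-coded solution; I hope it helps someone: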
using System.Collections.Generic;
using System.IO;

public static class MIMEAssistant
{
    // Maps lower-case file extensions (without the leading dot) to MIME types.
    private static readonly Dictionary<string, string> MIMETypesDictionary = new Dictionary<string, string>
    {
        {"ai", "application/postscript"},
        {"aif", "audio/x-aiff"},
        {"aifc", "audio/x-aiff"},
        {"aiff", "audio/x-aiff"},
        {"asc", "text/plain"},
        {"atom", "application/atom+xml"},
        {"au", "audio/basic"},
        {"avi", "video/x-msvideo"},
        {"bcpio", "application/x-bcpio"},
        {"bin", "application/octet-stream"},
        {"bmp", "image/bmp"},
        {"cdf", "application/x-netcdf"},
        {"cgm", "image/cgm"},
        {"class", "application/octet-stream"},
        {"cpio", "application/x-cpio"},
        {"cpt", "application/mac-compactpro"},
        {"csh", "application/x-csh"},
        {"css", "text/css"},
        {"dcr", "application/x-director"},
        {"dif", "video/x-dv"},
        {"dir", "application/x-director"},
        {"djv", "image/vnd.djvu"},
        {"djvu", "image/vnd.djvu"},
        {"dll", "application/octet-stream"},
        {"dmg", "application/octet-stream"},
        {"dms", "application/octet-stream"},
        {"doc", "application/msword"},
        {"docx", "application/vnd.openxmlformats-officedocument.wordprocessingml.document"},
        {"dotx", "application/vnd.openxmlformats-officedocument.wordprocessingml.template"},
        {"docm", "application/vnd.ms-word.document.macroEnabled.12"},
        {"dotm", "application/vnd.ms-word.template.macroEnabled.12"},
        {"dtd", "application/xml-dtd"},
        {"dv", "video/x-dv"},
        {"dvi", "application/x-dvi"},
        {"dxr", "application/x-director"},
        {"eps", "application/postscript"},
        {"etx", "text/x-setext"},
        {"exe", "application/octet-stream"},
        {"ez", "application/andrew-inset"},
        {"gif", "image/gif"},
        {"gram", "application/srgs"},
        {"grxml", "application/srgs+xml"},
        {"gtar", "application/x-gtar"},
        {"hdf", "application/x-hdf"},
        {"hqx", "application/mac-binhex40"},
        {"htm", "text/html"},
        {"html", "text/html"},
        {"ice", "x-conference/x-cooltalk"},
        {"ico", "image/x-icon"},
        {"ics", "text/calendar"},
        {"ief", "image/ief"},
        {"ifb", "text/calendar"},
        {"iges", "model/iges"},
        {"igs", "model/iges"},
        {"jnlp", "application/x-java-jnlp-file"},
        {"jp2", "image/jp2"},
        {"jpe", "image/jpeg"},
        {"jpeg", "image/jpeg"},
        {"jpg", "image/jpeg"},
        {"js", "application/x-javascript"},
        {"kar", "audio/midi"},
        {"latex", "application/x-latex"},
        {"lha", "application/octet-stream"},
        {"lzh", "application/octet-stream"},
        {"m3u", "audio/x-mpegurl"},
        {"m4a", "audio/mp4a-latm"},
        {"m4b", "audio/mp4a-latm"},
        {"m4p", "audio/mp4a-latm"},
        {"m4u", "video/vnd.mpegurl"},
        {"m4v", "video/x-m4v"},
        {"mac", "image/x-macpaint"},
        {"man", "application/x-troff-man"},
        {"mathml", "application/mathml+xml"},
        {"me", "application/x-troff-me"},
        {"mesh", "model/mesh"},
        {"mid", "audio/midi"},
        {"midi", "audio/midi"},
        {"mif", "application/vnd.mif"},
        {"mov", "video/quicktime"},
        {"movie", "video/x-sgi-movie"},
        {"mp2", "audio/mpeg"},
        {"mp3", "audio/mpeg"},
        {"mp4", "video/mp4"},
        {"mpe", "video/mpeg"},
        {"mpeg", "video/mpeg"},
        {"mpg", "video/mpeg"},
        {"mpga", "audio/mpeg"},
        {"ms", "application/x-troff-ms"},
        {"msh", "model/mesh"},
        {"mxu", "video/vnd.mpegurl"},
        {"nc", "application/x-netcdf"},
        {"oda", "application/oda"},
        {"ogg", "application/ogg"},
        {"pbm", "image/x-portable-bitmap"},
        {"pct", "image/pict"},
        {"pdb", "chemical/x-pdb"},
        {"pdf", "application/pdf"},
        {"pgm", "image/x-portable-graymap"},
        {"pgn", "application/x-chess-pgn"},
        {"pic", "image/pict"},
        {"pict", "image/pict"},
        {"png", "image/png"},
        {"pnm", "image/x-portable-anymap"},
        {"pnt", "image/x-macpaint"},
        {"pntg", "image/x-macpaint"},
        {"ppm", "image/x-portable-pixmap"},
        {"ppt", "application/vnd.ms-powerpoint"},
        {"pptx", "application/vnd.openxmlformats-officedocument.presentationml.presentation"},
        {"potx", "application/vnd.openxmlformats-officedocument.presentationml.template"},
        {"ppsx", "application/vnd.openxmlformats-officedocument.presentationml.slideshow"},
        {"ppam", "application/vnd.ms-powerpoint.addin.macroEnabled.12"},
        {"pptm", "application/vnd.ms-powerpoint.presentation.macroEnabled.12"},
        {"potm", "application/vnd.ms-powerpoint.template.macroEnabled.12"},
        {"ppsm", "application/vnd.ms-powerpoint.slideshow.macroEnabled.12"},
        {"ps", "application/postscript"},
        {"qt", "video/quicktime"},
        {"qti", "image/x-quicktime"},
        {"qtif", "image/x-quicktime"},
        {"ra", "audio/x-pn-realaudio"},
        {"ram", "audio/x-pn-realaudio"},
        {"ras", "image/x-cmu-raster"},
        {"rdf", "application/rdf+xml"},
        {"rgb", "image/x-rgb"},
        {"rm", "application/vnd.rn-realmedia"},
        {"roff", "application/x-troff"},
        {"rtf", "text/rtf"},
        {"rtx", "text/richtext"},
        {"sgm", "text/sgml"},
        {"sgml", "text/sgml"},
        {"sh", "application/x-sh"},
        {"shar", "application/x-shar"},
        {"silo", "model/mesh"},
        {"sit", "application/x-stuffit"},
        {"skd", "application/x-koan"},
        {"skm", "application/x-koan"},
        {"skp", "application/x-koan"},
        {"skt", "application/x-koan"},
        {"smi", "application/smil"},
        {"smil", "application/smil"},
        {"snd", "audio/basic"},
        {"so", "application/octet-stream"},
        {"spl", "application/x-futuresplash"},
        {"src", "application/x-wais-source"},
        {"sv4cpio", "application/x-sv4cpio"},
        {"sv4crc", "application/x-sv4crc"},
        {"svg", "image/svg+xml"},
        {"swf", "application/x-shockwave-flash"},
        {"t", "application/x-troff"},
        {"tar", "application/x-tar"},
        {"tcl", "application/x-tcl"},
        {"tex", "application/x-tex"},
        {"texi", "application/x-texinfo"},
        {"texinfo", "application/x-texinfo"},
        {"tif", "image/tiff"},
        {"tiff", "image/tiff"},
        {"tr", "application/x-troff"},
        {"tsv", "text/tab-separated-values"},
        {"txt", "text/plain"},
        {"ustar", "application/x-ustar"},
        {"vcd", "application/x-cdlink"},
        {"vrml", "model/vrml"},
        {"vxml", "application/voicexml+xml"},
        {"wav", "audio/x-wav"},
        {"wbmp", "image/vnd.wap.wbmp"},
        {"wbxml", "application/vnd.wap.wbxml"},
        {"wml", "text/vnd.wap.wml"},
        {"wmlc", "application/vnd.wap.wmlc"},
        {"wmls", "text/vnd.wap.wmlscript"},
        {"wmlsc", "application/vnd.wap.wmlscriptc"},
        {"wrl", "model/vrml"},
        {"xbm", "image/x-xbitmap"},
        {"xht", "application/xhtml+xml"},
        {"xhtml", "application/xhtml+xml"},
        {"xls", "application/vnd.ms-excel"},
        {"xml", "application/xml"},
        {"xpm", "image/x-xpixmap"},
        {"xsl", "application/xml"},
        {"xlsx", "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"},
        {"xltx", "application/vnd.openxmlformats-officedocument.spreadsheetml.template"},
        {"xlsm", "application/vnd.ms-excel.sheet.macroEnabled.12"},
        {"xltm", "application/vnd.ms-excel.template.macroEnabled.12"},
        {"xlam", "application/vnd.ms-excel.addin.macroEnabled.12"},
        {"xlsb", "application/vnd.ms-excel.sheet.binary.macroEnabled.12"},
        {"xslt", "application/xslt+xml"},
        {"xul", "application/vnd.mozilla.xul+xml"},
        {"xwd", "image/x-xwindowdump"},
        {"xyz", "chemical/x-xyz"},
        {"zip", "application/zip"}
    };

    public static string GetMIMEType(string fileName)
    {
        // Get the file extension in lower case, e.g. ".PNG" becomes ".png"
        string extension = Path.GetExtension(fileName).ToLowerInvariant();
        // Strip the leading dot before looking the extension up
        if (extension.Length > 0 &&
            MIMETypesDictionary.ContainsKey(extension.Remove(0, 1)))
        {
            return MIMETypesDictionary[extension.Remove(0, 1)];
        }
        return "unknown/unknown";
    }
}
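【Discussion】:
This is based on the file name. It may be useful to some readers, but not to the OP, who wants detection by file content.
A subset of this list is also handy for mapping WebImage.ImageFormat back to a mime type. Thanks!
Depending on your goal, you may want to return "application/octet-stream" instead of "unknown/unknown".
Since my edit was rejected, I'll post it here: the extension must be all lowercase, otherwise it will not be found in the dictionary.
@JalalAldeenSaa'd - IMHO a better fix is to pass StringComparer.OrdinalIgnoreCase to the dictionary constructor. Ordinal comparison is faster than invariant, and you get rid of .ToLower() and its variants.
【Answer 13】: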
For IIS 7 or later:
Use this code, but note that you need to be an administrator on the server.
using System;
using Microsoft.Web.Administration;

public bool CheckMimeMapExtension(string fileExtension)
{
    try
    {
        using (ServerManager serverManager = new ServerManager())
        {
            // reads the server's applicationHost.config
            var config = serverManager.GetApplicationHostConfiguration();
            var staticContent = config.GetSection("system.webServer/staticContent");
            var mimeMap = staticContent.GetCollection();
            foreach (var mimeType in mimeMap)
            {
                if (((String)mimeType["fileExtension"]).Equals(fileExtension, StringComparison.OrdinalIgnoreCase))
                    return true;
            }
            return false;
        }
    }
    catch (Exception ex)
    {
        Console.WriteLine("An exception has occurred: \n{0}", ex.Message);
        Console.Read();
        return false;
    }
}
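【Discussion】:
What about spoofing?
【Answer 14】:
If somebody were up for it, they could port the excellent Perl module File::Type to .NET. Its code is a set of file-header magic-number lookups or regex matches, one for each file type.
There is also a .NET file type detection library at http://filetypedetective.codeplex.com/, but it currently detects only a small number of file types.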
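To illustrate what a magic-number lookup of that kind might look like in C#, here is a minimal sketch. It is not a port of File::Type: the class name MagicNumberSniffer, the method Sniff, and the handful of signatures are all assumptions chosen for illustration, and a real table would need many more entries (and offset-based matches for formats whose signature is not at byte 0).

```csharp
using System;
using System.Linq;

public static class MagicNumberSniffer
{
    // Illustrative subset only: leading bytes of a file and the MIME type they suggest.
    private static readonly (byte[] Magic, string Mime)[] Signatures =
    {
        (new byte[] { 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A }, "image/png"),
        (new byte[] { 0xFF, 0xD8, 0xFF }, "image/jpeg"),
        (new byte[] { 0x47, 0x49, 0x46, 0x38 }, "image/gif"),          // "GIF8"
        (new byte[] { 0x25, 0x50, 0x44, 0x46, 0x2D }, "application/pdf"), // "%PDF-"
        (new byte[] { 0x50, 0x4B, 0x03, 0x04 }, "application/zip"),    // "PK\x03\x04"
        // "MZ" marks a Windows executable; the MIME label here is one commonly used, unofficial choice.
        (new byte[] { 0x4D, 0x5A }, "application/x-msdownload"),
    };

    // Returns the MIME type of the first signature that prefixes the header,
    // or "application/octet-stream" when nothing matches.
    public static string Sniff(byte[] header)
    {
        if (header == null) throw new ArgumentNullException(nameof(header));
        foreach (var (magic, mime) in Signatures)
        {
            if (header.Length >= magic.Length &&
                header.Take(magic.Length).SequenceEqual(magic))
            {
                return mime;
            }
        }
        return "application/octet-stream";
    }
}
```

Because the lookup never sees the file name, an executable renamed to .jpg is still reported by its "MZ" header rather than as an image.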
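【Discussion】:
【Answer 15】:
In the end I did use urlmon.dll. I thought there would be an easier way, but this works. I am including the code to help others, and so that I can find it again when I need it.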
using System;
using System.IO;
using System.Runtime.InteropServices;
...
// Pointer-sized parameters are declared as IntPtr (not UInt32) so the call
// also works in 64-bit processes; the string parameters are wide (LPCWSTR).
[DllImport(@"urlmon.dll", CharSet = CharSet.Unicode, ExactSpelling = true)]
private extern static System.UInt32 FindMimeFromData(
    System.IntPtr pBC,
    [MarshalAs(UnmanagedType.LPWStr)] System.String pwzUrl,
    [MarshalAs(UnmanagedType.LPArray)] byte[] pBuffer,
    System.UInt32 cbSize,
    [MarshalAs(UnmanagedType.LPWStr)] System.String pwzMimeProposed,
    System.UInt32 dwMimeFlags,
    out System.IntPtr ppwzMimeOut,
    System.UInt32 dwReserved
);

public static string getMimeFromFile(string filename)
{
    if (!File.Exists(filename))
        throw new FileNotFoundException(filename + " not found");

    // Read at most the first 256 bytes of the file
    byte[] buffer = new byte[256];
    int bytesRead;
    using (FileStream fs = new FileStream(filename, FileMode.Open, FileAccess.Read))
    {
        bytesRead = fs.Read(buffer, 0, buffer.Length);
    }
    try
    {
        System.IntPtr mimeTypePtr;
        // Pass the number of bytes actually read rather than a fixed 256
        FindMimeFromData(IntPtr.Zero, null, buffer, (uint)bytesRead, null, 0, out mimeTypePtr, 0);
        string mime = Marshal.PtrToStringUni(mimeTypePtr);
        Marshal.FreeCoTaskMem(mimeTypePtr);
        return mime;
    }
    catch (Exception)
    {
        return "unknown/unknown";
    }
}
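【Discussion】:
Probably whatever is mapped in the registry.
@flq, @mkmurray msdn.microsoft.com/en-us/library/…
I ran into problems hosting this code in IIS on Windows 8. Using the implementation on pinvoke.net (which differs slightly) solved it: pinvoke.net/default.aspx/urlmon.findmimefromdata
I have been testing this code with IIS 7 and it has not worked for me. I am testing with a CSV file. When I change the CSV's extension (to .png, .jpeg, etc.), the mime type changes with the extension (image/png, image/jpeg). I may be wrong, but my understanding was that Urlmon.dll determines the mime type from the file's metadata, not just the extension.
Doesn't work for 64-bit applications, see here: ***.com/questions/18358548/…
【Answer 16】:
In Urlmon.dll there is a function called FindMimeFromData.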
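From the documentation:
MIME type detection, or "data sniffing", refers to the process of determining an appropriate MIME type from binary data. The final result depends on a combination of server-supplied MIME type headers, file extension, and/or the data itself. Usually, only the first 256 bytes of data are significant.
So, read the first (up to) 256 bytes from the file and pass them to FindMimeFromData.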
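As a rough illustration of the "read the first (up to) 256 bytes" step on its own, the sketch below reads a file header into a buffer trimmed to the number of bytes actually available. The names HeaderReader and ReadHeader are hypothetical and not part of any library; the returned array is what one would then hand to FindMimeFromData or to any other sniffer.

```csharp
using System;
using System.IO;

public static class HeaderReader
{
    // Reads up to maxBytes from the start of the file and returns exactly
    // the bytes that were read (fewer than maxBytes for short files).
    public static byte[] ReadHeader(string path, int maxBytes = 256)
    {
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read))
        {
            var buffer = new byte[maxBytes];
            int total = 0;
            // Stream.Read may return fewer bytes than requested, so loop until
            // the buffer is full or the end of the file is reached.
            while (total < maxBytes)
            {
                int read = fs.Read(buffer, total, maxBytes - total);
                if (read == 0) break;
                total += read;
            }
            Array.Resize(ref buffer, total);
            return buffer;
        }
    }
}
```

Trimming the buffer means the caller can pass its Length as the size argument, instead of a fixed 256 that would include trailing zero padding for short files.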
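【Discussion】:
How reliable is this approach?
According to ***.com/questions/4833113/…, the function can only determine 26 types, so I don't think it is reliable. For example, '*.docx' files are detected as 'application/x-zip-compressed'.
I guess that is because a docx is, on the surface, a zip file.
Docx is a zip file, but the mime type of .docx is "application/vnd.openxmlformats-officedocument.wordprocessingml.document". While this can be determined by binary inspection alone, it is probably not the most efficient way, and in most cases you would have to read more than the first 256 bytes.
I think this question is still relevant in the 2020s. Check out this answer. The FileSignatures project seems somewhat more reliable and lets you control exactly which file types to match.
【Answer 17】:
I think the right answer is a combination of Steve Morgan's and Serguei's answers. That is how Internet Explorer does it. The pinvoke call to FindMimeFromData works for only 26 hard-coded mime types. It will also return an ambiguous mime type (such as text/plain or application/octet-stream) even when a more specific, more appropriate mime type exists. If it does not give a good mime type, you can go to the registry for a more specific one. The server registry may hold more up-to-date mime types.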
Reference: http://msdn.microsoft.com/en-us/library/ms775147(VS.85).aspx
【Discussion】:
【Answer 18】:
I use a hybrid solution:
using System;
using System.IO;
using System.Runtime.InteropServices;

// Pointer-sized parameters are declared as IntPtr (not UInt32) so the call
// also works in 64-bit processes; the string parameters are wide (LPCWSTR).
[DllImport(@"urlmon.dll", CharSet = CharSet.Unicode, ExactSpelling = true)]
private extern static System.UInt32 FindMimeFromData(
    System.IntPtr pBC,
    [MarshalAs(UnmanagedType.LPWStr)] System.String pwzUrl,
    [MarshalAs(UnmanagedType.LPArray)] byte[] pBuffer,
    System.UInt32 cbSize,
    [MarshalAs(UnmanagedType.LPWStr)] System.String pwzMimeProposed,
    System.UInt32 dwMimeFlags,
    out System.IntPtr ppwzMimeOut,
    System.UInt32 dwReserved
);

// Extension-based fallback: looks up "Content Type" under HKEY_CLASSES_ROOT\.ext
private string GetMimeFromRegistry(string Filename)
{
    string mime = "application/octet-stream";
    string ext = System.IO.Path.GetExtension(Filename).ToLower();
    Microsoft.Win32.RegistryKey rk = Microsoft.Win32.Registry.ClassesRoot.OpenSubKey(ext);
    if (rk != null && rk.GetValue("Content Type") != null)
        mime = rk.GetValue("Content Type").ToString();
    return mime;
}

public string GetMimeTypeFromFileAndRegistry(string filename)
{
    if (!File.Exists(filename))
        return GetMimeFromRegistry(filename);

    // Read at most the first 256 bytes of the file
    byte[] buffer = new byte[256];
    int bytesRead;
    using (FileStream fs = new FileStream(filename, FileMode.Open, FileAccess.Read))
    {
        bytesRead = fs.Read(buffer, 0, buffer.Length);
    }
    try
    {
        System.IntPtr mimeTypePtr;
        FindMimeFromData(IntPtr.Zero, null, buffer, (uint)bytesRead, null, 0, out mimeTypePtr, 0);
        string mime = Marshal.PtrToStringUni(mimeTypePtr);
        Marshal.FreeCoTaskMem(mimeTypePtr);
        // Fall back to the registry when sniffing only yields a generic type
        if (string.IsNullOrWhiteSpace(mime) ||
            mime == "text/plain" || mime == "application/octet-stream")
        {
            return GetMimeFromRegistry(filename);
        }
        return mime;
    }
    catch (Exception)
    {
        return GetMimeFromRegistry(filename);
    }
}
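【Discussion】:
Thanks for the code. It partly works. For "doc" and "tif" files it returns "application/octet-stream". Are there any other options?
It would be nice to see a hybrid solution combining the extension dictionary above with urlmon.
@PranavShah, note that the server's knowledge of mime types (the types returned by the registry lookup) depends on the software installed on the server. A basic Windows installation or a dedicated web server cannot be relied on to know the mime types of third-party software that has no need to be installed there. It should, however, know what a .doc file is.
【Answer 19】:
I found this useful. For VB.NET developers: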
Imports System.Collections.Generic
Imports System.IO

Public Shared Function GetFromFileName(ByVal fileName As String) As String
    ' GetFromExtension strips the leading dot itself, so a file name
    ' without an extension no longer throws here.
    Return GetFromExtension(Path.GetExtension(fileName))
End Function

Public Shared Function GetFromExtension(ByVal extension As String) As String
    If extension.StartsWith("."c) Then
        extension = extension.Remove(0, 1)
    End If
    ' The dictionary keys are lower case
    extension = extension.ToLowerInvariant()
    If MIMETypesDictionary.ContainsKey(extension) Then
        Return MIMETypesDictionary(extension)
    End If
    Return "unknown/unknown"
End Function

Private Shared ReadOnly MIMETypesDictionary As New Dictionary(Of String, String)() From { _
    {"ai", "application/postscript"}, _
    {"aif", "audio/x-aiff"}, _
    {"aifc", "audio/x-aiff"}, _
    {"aiff", "audio/x-aiff"}, _
    {"asc", "text/plain"}, _
    {"atom", "application/atom+xml"}, _
    {"au", "audio/basic"}, _
    {"avi", "video/x-msvideo"}, _
    {"bcpio", "application/x-bcpio"}, _
    {"bin", "application/octet-stream"}, _
    {"bmp", "image/bmp"}, _
    {"cdf", "application/x-netcdf"}, _
    {"cgm", "image/cgm"}, _
    {"class", "application/octet-stream"}, _
    {"cpio", "application/x-cpio"}, _
    {"cpt", "application/mac-compactpro"}, _
    {"csh", "application/x-csh"}, _
    {"css", "text/css"}, _
    {"dcr", "application/x-director"}, _
    {"dif", "video/x-dv"}, _
    {"dir", "application/x-director"}, _
    {"djv", "image/vnd.djvu"}, _
    {"djvu", "image/vnd.djvu"}, _
    {"dll", "application/octet-stream"}, _
    {"dmg", "application/octet-stream"}, _
    {"dms", "application/octet-stream"}, _
    {"doc", "application/msword"}, _
    {"dtd", "application/xml-dtd"}, _
    {"dv", "video/x-dv"}, _
    {"dvi", "application/x-dvi"}, _
    {"dxr", "application/x-director"}, _
    {"eps", "application/postscript"}, _
    {"etx", "text/x-setext"}, _
    {"exe", "application/octet-stream"}, _
    {"ez", "application/andrew-inset"}, _
    {"gif", "image/gif"}, _
    {"gram", "application/srgs"}, _
    {"grxml", "application/srgs+xml"}, _
    {"gtar", "application/x-gtar"}, _
    {"hdf", "application/x-hdf"}, _
    {"hqx", "application/mac-binhex40"}, _
    {"htm", "text/html"}, _
    {"html", "text/html"}, _
    {"ice", "x-conference/x-cooltalk"}, _
    {"ico", "image/x-icon"}, _
    {"ics", "text/calendar"}, _
    {"ief", "image/ief"}, _
    {"ifb", "text/calendar"}, _
    {"iges", "model/iges"}, _
    {"igs", "model/iges"}, _
    {"jnlp", "application/x-java-jnlp-file"}, _
    {"jp2", "image/jp2"}, _
    {"jpe", "image/jpeg"}, _
    {"jpeg", "image/jpeg"}, _
    {"jpg", "image/jpeg"}, _
    {"js", "application/x-javascript"}, _
    {"kar", "audio/midi"}, _
    {"latex", "application/x-latex"}, _
    {"lha", "application/octet-stream"}, _
    {"lzh", "application/octet-stream"}, _
    {"m3u", "audio/x-mpegurl"}, _
    {"m4a", "audio/mp4a-latm"}, _
    {"m4b", "audio/mp4a-latm"}, _
    {"m4p", "audio/mp4a-latm"}, _
    {"m4u", "video/vnd.mpegurl"}, _
    {"m4v", "video/x-m4v"}, _
    {"mac", "image/x-macpaint"}, _
    {"man", "application/x-troff-man"}, _
    {"mathml", "application/mathml+xml"}, _
    {"me", "application/x-troff-me"}, _
    {"mesh", "model/mesh"}, _
    {"mid", "audio/midi"}, _
    {"midi", "audio/midi"}, _
    {"mif", "application/vnd.mif"}, _
    {"mov", "video/quicktime"}, _
    {"movie", "video/x-sgi-movie"}, _
    {"mp2", "audio/mpeg"}, _
    {"mp3", "audio/mpeg"}, _
    {"mp4", "video/mp4"}, _
    {"mpe", "video/mpeg"}, _
    {"mpeg", "video/mpeg"}, _
    {"mpg", "video/mpeg"}, _
    {"mpga", "audio/mpeg"}, _
    {"ms", "application/x-troff-ms"}, _
    {"msh", "model/mesh"}, _
    {"mxu", "video/vnd.mpegurl"}, _
    {"nc", "application/x-netcdf"}, _
    {"oda", "application/oda"}, _
    {"ogg", "application/ogg"}, _
    {"pbm", "image/x-portable-bitmap"}, _
    {"pct", "image/pict"}, _
    {"pdb", "chemical/x-pdb"}, _
    {"pdf", "application/pdf"}, _
    {"pgm", "image/x-portable-graymap"}, _
    {"pgn", "application/x-chess-pgn"}, _
    {"pic", "image/pict"}, _
    {"pict", "image/pict"}, _
    {"png", "image/png"}, _
    {"pnm", "image/x-portable-anymap"}, _
    {"pnt", "image/x-macpaint"}, _
    {"pntg", "image/x-macpaint"}, _
    {"ppm", "image/x-portable-pixmap"}, _
    {"ppt", "application/vnd.ms-powerpoint"}, _
    {"ps", "application/postscript"}, _
    {"qt", "video/quicktime"}, _
    {"qti", "image/x-quicktime"}, _
    {"qtif", "image/x-quicktime"}, _
    {"ra", "audio/x-pn-realaudio"}, _
    {"ram", "audio/x-pn-realaudio"}, _
    {"ras", "image/x-cmu-raster"}, _
    {"rdf", "application/rdf+xml"}, _
    {"rgb", "image/x-rgb"}, _
    {"rm", "application/vnd.rn-realmedia"}, _
    {"roff", "application/x-troff"}, _
    {"rtf", "text/rtf"}, _
    {"rtx", "text/richtext"}, _
    {"sgm", "text/sgml"}, _
    {"sgml", "text/sgml"}, _
    {"sh", "application/x-sh"}, _
    {"shar", "application/x-shar"}, _
    {"silo", "model/mesh"}, _
    {"sit", "application/x-stuffit"}, _
    {"skd", "application/x-koan"}, _
    {"skm", "application/x-koan"}, _
    {"skp", "application/x-koan"}, _
    {"skt", "application/x-koan"}, _
    {"smi", "application/smil"}, _
    {"smil", "application/smil"}, _
    {"snd", "audio/basic"}, _
    {"so", "application/octet-stream"}, _
    {"spl", "application/x-futuresplash"}, _
    {"src", "application/x-wais-source"}, _
    {"sv4cpio", "application/x-sv4cpio"}, _
    {"sv4crc", "application/x-sv4crc"}, _
    {"svg", "image/svg+xml"}, _
    {"swf", "application/x-shockwave-flash"}, _
    {"t", "application/x-troff"}, _
    {"tar", "application/x-tar"}, _
    {"tcl", "application/x-tcl"}, _
    {"tex", "application/x-tex"}, _
    {"texi", "application/x-texinfo"}, _
    {"texinfo", "application/x-texinfo"}, _
    {"tif", "image/tiff"}, _
    {"tiff", "image/tiff"}, _
    {"tr", "application/x-troff"}, _
    {"tsv", "text/tab-separated-values"}, _
    {"txt", "text/plain"}, _
    {"ustar", "application/x-ustar"}, _
    {"vcd", "application/x-cdlink"}, _
    {"vrml", "model/vrml"}, _
    {"vxml", "application/voicexml+xml"}, _
    {"wav", "audio/x-wav"}, _
    {"wbmp", "image/vnd.wap.wbmp"}, _
    {"wbxml", "application/vnd.wap.wbxml"}, _
    {"wml", "text/vnd.wap.wml"}, _
    {"wmlc", "application/vnd.wap.wmlc"}, _
    {"wmls", "text/vnd.wap.wmlscript"}, _
    {"wmlsc", "application/vnd.wap.wmlscriptc"}, _
    {"wrl", "model/vrml"}, _
    {"xbm", "image/x-xbitmap"}, _
    {"xht", "application/xhtml+xml"}, _
    {"xhtml", "application/xhtml+xml"}, _
    {"xls", "application/vnd.ms-excel"}, _
    {"xml", "application/xml"}, _
    {"xpm", "image/x-xpixmap"}, _
    {"xsl", "application/xml"}, _
    {"xslt", "application/xslt+xml"}, _
    {"xul", "application/vnd.mozilla.xul+xml"}, _
    {"xwd", "image/x-xwindowdump"}, _
    {"xyz", "chemical/x-xyz"}, _
    {"zip", "application/zip"} _
}
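【Discussion】:
Looks like it might be an old list... no .docx, .xlsx, etc.
Is there an online list somewhere? The list above looks a bit more complete, and I found some of the missing entries here: ***.com/questions/4212861/… -- but it seems there should be a web service that you could send a file name and some bytes to, and that would make its best guess at the rest...
I would use configuration for this, so I can choose the mime types I need and modify them without changing a single line of code.
【Answer 20】:
You can also look in the registry.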
using System.IO;
using Microsoft.Win32;

string GetMimeType(FileInfo fileInfo)
{
    string mimeType = "application/unknown";
    // Look up HKEY_CLASSES_ROOT\.ext and read its "Content Type" value
    RegistryKey regKey = Registry.ClassesRoot.OpenSubKey(
        fileInfo.Extension.ToLower()
    );
    if (regKey != null)
    {
        object contentType = regKey.GetValue("Content Type");
        if (contentType != null)
            mimeType = contentType.ToString();
    }
    return mimeType;
}
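One way or another you are going to have to tap into a database of MIME types. Whether they are mapped from extensions or from magic numbers is a secondary detail; the Windows registry is one such database. For a platform-independent solution, though, you would have to ship this database with the code (or as a standalone library).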
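【Discussion】:
@Rabbi Even though this question is about file content rather than extension, this answer may still be useful to others passing by (like myself). Even if an answer like this is unlikely to be accepted, it is still good to have the information.
Doesn't this simply get the mime type from the extension of the file name? What if the file is a .docx and some joker decides to rename it to .doc? You would surely get the wrong mime type.
@kolin, you are absolutely right, but as the saying goes, "make something idiot-proof, and someone will build a better idiot." :)
When using Windows Azure Web Roles or any other host that runs your app in limited trust, don't forget that you will not be allowed to access the registry. A combination of a try-catch around the registry lookup and an in-memory dictionary (like Anykey's answer) looks like a good solution that gets a bit of both.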
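The try-catch-plus-dictionary combination suggested in the last comment could be sketched roughly as follows. This is only an illustration under stated assumptions: MimeFallback, Resolve, and the three-entry table are hypothetical names and data, and the registry access is passed in as a delegate (in a real Windows application it would wrap something like the GetMimeType method above) so the sketch does not depend on Microsoft.Win32. Note that it is still extension-based and does not inspect file content.

```csharp
using System;
using System.Collections.Generic;
using System.Security;

public static class MimeFallback
{
    // Tiny in-memory table standing in for a full extension dictionary.
    // The case-insensitive comparer makes ".PDF" and ".pdf" equivalent.
    private static readonly Dictionary<string, string> KnownTypes =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { ".pdf", "application/pdf" },
            { ".png", "image/png" },
            { ".txt", "text/plain" },
        };

    // Tries the supplied lookup first (e.g. a registry query); if it is denied
    // or returns nothing, falls back to the in-memory table.
    public static string Resolve(string extension, Func<string, string> registryLookup)
    {
        extension = extension ?? string.Empty;
        try
        {
            string fromRegistry = registryLookup?.Invoke(extension);
            if (!string.IsNullOrWhiteSpace(fromRegistry))
                return fromRegistry;
        }
        catch (Exception ex) when (ex is SecurityException || ex is UnauthorizedAccessException)
        {
            // Limited-trust hosts may refuse registry access; use the table instead.
        }
        return KnownTypes.TryGetValue(extension, out var mime)
            ? mime
            : "application/octet-stream";
    }
}
```

Catching only the access-related exceptions keeps genuine programming errors visible, while a denied lookup degrades quietly to the built-in table.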