Get file name from URL

Posted 2023-02-22


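【Title】: Get file name from URL 【Posted】: 2010-10-10 23:30:57 【Question】:

In Java, given a java.net.URL or a String of the form http://www.example.com/some/path/to/a/file.xml , what is the easiest way to get the file name, minus the extension? So, in this example, I am looking for something that returns "file".

I can think of several ways to do this, but I am looking for something that is easy to read and short.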

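【Question comments】:

You do realize that there is no requirement for a file name at the end, or even for something that looks like a file name. In this case, there may or may not be a file.xml on the server.

In that case, the result would be an empty string, or maybe null.

I think you need to define the problem more clearly. What about URLs with the following endings? ..../abc, ..../abc/, ..../abc.def, ..../abc.def.ghi, ..../abc?def.ghi

I think it is pretty clear. If the URL points to a file, I am interested in the file name minus the extension (if it has one). The query part falls outside the file name.

The file name is the part of the URL after the last slash. The file extension is the part of the file name after the last period.

【Answer 1】: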

return new File(Uri.parse(url).getPath()).getName()

【Comments】:

【Answer 2】:

A one-liner:

new File(uri.getPath).getName

Full code (in a Scala REPL):

import java.io.File
import java.net.URI

val uri = new URI("http://example.org/file.txt?whatever")

new File(uri.getPath).getName
res18: String = file.txt

Note: URI#getPath is already smart enough to strip off the query parameters and the protocol scheme. Examples:

new URI("http://example.org/hey/file.txt?whatever").getPath
res20: String = /hey/file.txt

new URI("hdfs:///hey/file.txt").getPath
res21: String = /hey/file.txt

new URI("file:///hey/file.txt").getPath
res22: String = /hey/file.txt
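Since the question asks for Java and for the name without its extension, here is a minimal Java sketch of the same idea. The class name UriFileName and the methods fileName and baseName are made up for illustration; only URI#getPath and File#getName come from the answer above. It uses URI.create, which throws an unchecked IllegalArgumentException on malformed input.

```java
import java.io.File;
import java.net.URI;

// Hypothetical helper, not part of the original answer.
final class UriFileName {

    /** Last path segment of the URL, e.g. "file.txt"; empty if there is none. */
    static String fileName(String url) {
        // getPath() excludes the scheme, host, query and fragment
        String path = URI.create(url).getPath();
        // Opaque URIs such as mailto: have no path at all
        return path == null ? "" : new File(path).getName();
    }

    /** File name with everything from the last '.' onwards removed. */
    static String baseName(String url) {
        String name = fileName(url);
        int dot = name.lastIndexOf('.');
        return dot == -1 ? name : name.substring(0, dot);
    }

    public static void main(String[] args) {
        System.out.println(fileName("http://example.org/file.txt?whatever"));
        System.out.println(baseName("http://www.example.com/some/path/to/a/file.xml"));
    }
}
```

With the definitions given in the question comments, ..../abc.def.ghi yields "abc.def" and ..../abc?def.ghi yields "abc". One thing to check for your own use case is a path ending in a slash: java.io.File normalizes the path before taking the name, so it may report the directory name rather than an empty string.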

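【Comments】:

Nice solution! This is the best option, since it uses only the standard JDK.

At the end of the day, this is the one I settled on. Elegant solution.

【Answer 3】:

If you are using Spring, there is a helper for handling URIs. Here is the solution: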

List<String> pathSegments = UriComponentsBuilder.fromUriString(url).build().getPathSegments();
String filename = pathSegments.get(pathSegments.size()-1);

【Comments】:

【Answer 4】:

I had the same problem as you. I solved it like this:

var URL = window.location.pathname; // Gets page name
var page = URL.substring(URL.lastIndexOf('/') + 1); 
console.info(page)

【Comments】:

Java is not JavaScript.

【Answer 5】:

There are several ways:

Java 7 file I/O:

String fileName = Paths.get(strUrl).getFileName().toString();

Apache Commons:

String fileName = FilenameUtils.getName(strUrl);

Using Jersey:

UriBuilder buildURI = UriBuilder.fromUri(strUrl);
URI uri = buildURI.build();
String fileName = Paths.get(uri.getPath()).getFileName().toString();

Substring:

String fileName = strUrl.substring(strUrl.lastIndexOf('/') + 1);

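【Comments】:

Unfortunately, your Java 7 file I/O solution did not work for me; I got an exception. This worked for me: Paths.get(new URL(strUrl).getFile()).getFileName().toString(); Thanks for the idea!

【Answer 6】:

This should about do it (I will leave the error handling to you):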

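int slashIndex = url.lastIndexOf('/');
// Caution: lastIndexOf(ch, fromIndex) searches backwards from fromIndex,
// so this looks for a dot before the last slash (see the comments below)
int dotIndex = url.lastIndexOf('.', slashIndex);
String filenameWithoutExtension;
if (dotIndex == -1) {
  filenameWithoutExtension = url.substring(slashIndex + 1);
} else {
  filenameWithoutExtension = url.substring(slashIndex + 1, dotIndex);
}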

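【Comments】:

One error-handling aspect to consider: if you accidentally pass it a URL that has no file name (such as http://www.example.com/ or http://www.example.com/folder/), you will end up with an empty string.

The code does not work. lastIndexOf does not work this way. But the intention is clear.

Downvoted because it will not work if the fragment part contains slashes, and because there are dedicated functions for this in Apache Commons and in Java since 1.7.

【Answer 7】:

The Url object in urllib lets you access the path's unescaped file name. Here are some examples: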

String raw = "http://www.example.com/some/path/to/a/file.xml";
assertEquals("file.xml", Url.parse(raw).path().filename());

raw = "http://www.example.com/files/r%C3%A9sum%C3%A9.pdf";
assertEquals("résumé.pdf", Url.parse(raw).path().filename());
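For readers who would rather stay within the JDK, the same escaped versus unescaped distinction can be sketched with java.net.URI: getPath() returns the path with percent-escapes decoded, while getRawPath() returns it as written. The class DecodedFileName and its methods are hypothetical names used only for this illustration.

```java
import java.net.URI;

// Hypothetical helper, not part of the urllib answer above.
final class DecodedFileName {

    /** Last path segment with percent-escapes decoded. */
    static String decoded(String url) {
        return lastSegment(URI.create(url).getPath());
    }

    /** Last path segment exactly as it is written in the URL. */
    static String raw(String url) {
        return lastSegment(URI.create(url).getRawPath());
    }

    private static String lastSegment(String path) {
        if (path == null) {
            return "";   // opaque URIs such as mailto: have no path
        }
        return path.substring(path.lastIndexOf('/') + 1);
    }

    public static void main(String[] args) {
        String url = "http://www.example.com/files/r%C3%A9sum%C3%A9.pdf";
        System.out.println(raw(url));
        System.out.println(decoded(url));
    }
}
```

A caveat with this sketch: an escaped slash (%2F) inside a segment is decoded by getPath() as well, so decoded() would split on it. Taking the last segment of the raw path first and decoding it afterwards would avoid that.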

【Comments】:

【Answer 8】:

Besides all the advanced methods, my simple trick is StringTokenizer:

import java.util.ArrayList;
import java.util.StringTokenizer;

public class URLName {
    public static void main(String[] args) {
        String url = "http://www.example.com/some/path/to/a/file.xml";
        StringTokenizer tokens = new StringTokenizer(url, "/");

        ArrayList<String> parts = new ArrayList<>();

        while (tokens.hasMoreTokens()) {
            parts.add(tokens.nextToken());
        }
        // The last token is the file name, e.g. "file.xml"
        String file = parts.get(parts.size() - 1);
        // Cut at the first dot; keep the whole name when there is no dot
        int dot = file.indexOf(".");
        String fileName = dot == -1 ? file : file.substring(0, dot);
        System.out.println(fileName);
    }
}

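【Comments】:

【Answer 9】:

I found that some URLs, when passed directly to FilenameUtils.getName, return unwanted results, so the call needs to be wrapped to avoid exploits.

For example,

System.out.println(FilenameUtils.getName("http://www.google.com/.."));

prints

..

which I doubt anyone wants to allow.

The following function seems to work fine, is shown below with some test cases, and returns null when the file name cannot be determined.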

public static String getFilenameFromUrl(String url)
{
    if (url == null)
        return null;

    try
    {
        // Add a protocol if none found
        if (! url.contains("//"))
            url = "http://" + url;

        URL uri = new URL(url);
        String result = FilenameUtils.getName(uri.getPath());

        if (result == null || result.isEmpty())
            return null;

        if (result.contains(".."))
            return null;

        return result;
    }
    catch (MalformedURLException e)
    {
        return null;
    }
}

The following example includes a few simple test cases:

import java.util.Objects;
import java.net.MalformedURLException;
import java.net.URL;
import org.apache.commons.io.FilenameUtils;

class Main {

  public static void main(String[] args) {
    validateFilename(null, null);
    validateFilename("", null);
    validateFilename("www.google.com/../me/you?trex=5#sdf", "you");
    validateFilename("www.google.com/../me/you?trex=5 is the num#sdf", "you");
    validateFilename("http://www.google.com/test.png?test", "test.png");
    validateFilename("http://www.google.com", null);
    validateFilename("http://www.google.com#test", null);
    validateFilename("http://www.google.com////", null);
    validateFilename("www.google.com/..", null);
    validateFilename("http://www.google.com/..", null);
    validateFilename("http://www.google.com/test", "test");
    validateFilename("https://www.google.com/../../test.png", "test.png");
    validateFilename("file://www.google.com/test.png", "test.png");
    validateFilename("file://www.google.com/../me/you?trex=5", "you");
    validateFilename("file://www.google.com/../me/you?trex", "you");
  }

  private static void validateFilename(String url, String expectedFilename) {
    String actualFilename = getFilenameFromUrl(url);

    System.out.println("");
    System.out.println("url:" + url);
    System.out.println("filename:" + expectedFilename);

    if (! Objects.equals(actualFilename, expectedFilename))
      throw new RuntimeException("Problem, actual=" + actualFilename + " and expected=" + expectedFilename + " are not equal");
  }

  public static String getFilenameFromUrl(String url)
  {
    if (url == null)
      return null;

    try
    {
      // Add a protocol if none found
      if (! url.contains("//"))
        url = "http://" + url;

      URL uri = new URL(url);
      String result = FilenameUtils.getName(uri.getPath());

      if (result == null || result.isEmpty())
        return null;

      if (result.contains(".."))
        return null;

      return result;
    }
    catch (MalformedURLException e)
    {
      return null;
    }
  }
}

【Comments】:

【Answer 10】:

Rather than reinventing the wheel, use Apache commons-io:

import java.net.URL;

import org.apache.commons.io.FilenameUtils;

public class FilenameUtilTest {

    public static void main(String[] args) throws Exception {
        URL url = new URL("http://www.example.com/some/path/to/a/file.xml?foo=bar#test");

        System.out.println(FilenameUtils.getBaseName(url.getPath())); // -> file
        System.out.println(FilenameUtils.getExtension(url.getPath())); // -> xml
        System.out.println(FilenameUtils.getName(url.getPath())); // -> file.xml
    }
}

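【Comments】:

In version 2.2 of commons-io, at least, you still need to handle URLs with parameters manually. E.g. "example.com/file.xml?date=2010-10-20"

FilenameUtils.getName(url) is a better fit.

It seems odd to add a dependency on commons-io when easy solutions are readily available using just the JDK (see URL#getPath and String#substring, or Path#getFileName, or File#getName).

The FilenameUtils class is designed to work with Windows and *nix paths, not URLs.

Updated the example to use a URL, show sample output values, and use query parameters.

【Answer 11】:

Create a URL object from the String. Once you have a URL object, there are methods to easily pull out just about any piece of information you need.

I can strongly recommend the Javaalmanac web site, which has tons of examples but has since moved. You might find http://exampledepot.8waytrips.com/egs/java.io/File2Uri.html interesting: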

// Create a file object
File file = new File("filename");

// Convert the file object to a URL
URL url = null;
try {
    // The file need not exist. It is made into an absolute path
    // by prefixing the current working directory
    url = file.toURL();          // file:/d:/almanac1.4/java.io/filename
} catch (MalformedURLException e) {
}

// Convert the URL to a file object
file = new File(url.getFile());  // d:/almanac1.4/java.io/filename

// Read the file contents using the URL
try {
    // Open an input stream
    InputStream is = url.openStream();

    // Read from is

    is.close();
} catch (IOException e) {
    // Could not open the file
}

【Comments】:

【Answer 12】:

If you just want to get the file name from a java.net.URL (not including any query parameters), you can use the following function:

public static String getFilenameFromURL(URL url) {
    return new File(url.getPath()).getName();
}

For example, this input URL:

"http://example.com/image.png?version=2&modificationDate=1449846324000"

is translated to this output String:

image.png

【Comments】:

【Answer 13】:

String fileName = url.substring(url.lastIndexOf('/') + 1);
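The one-liner above looks at the whole string, so a slash in the query or fragment throws it off. A possible variant, sketched under the assumption that the input is an absolute URL with a path, cuts the fragment and the query away first. SubstringFileName and fileName are names invented for this illustration.

```java
// Hypothetical helper, not part of the original answer.
final class SubstringFileName {

    /** Drops the fragment and the query, then returns what follows the last '/'. */
    static String fileName(String url) {
        String s = url;
        int hash = s.indexOf('#');
        if (hash != -1) {
            s = s.substring(0, hash);       // the fragment starts at the first '#'
        }
        int question = s.indexOf('?');
        if (question != -1) {
            s = s.substring(0, question);   // the query starts at the first '?'
        }
        return s.substring(s.lastIndexOf('/') + 1);
    }

    public static void main(String[] args) {
        System.out.println(fileName("http://example.org/file.xml#/p=foo&q=bar"));
    }
}
```

This remains plain string handling: for a URL with no path at all, such as http://www.example.com, the last slash is the one in "//", so the host name comes back. Parsing with java.net.URI first avoids that case.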

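【Comments】:

This does not work if the query string contains "/" (and trust me, it can).

@maaw, please share an example.

https://host.com:9021/path/2721/filename.txt?X-Amz-Credential=n-it-cloud/20201214/standard/s3/aws4_request

Then you can add an extra check for the query separately.

【Answer 14】:

If you do not need to get rid of the file extension, here is a way to do it without resorting to error-prone String manipulation and without using external libraries. Works with Java 1.7+: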

import java.net.URI;
import java.nio.file.Paths;

String url = "http://example.org/file?p=foo&q=bar";
String filename = Paths.get(new URI(url).getPath()).getFileName().toString();
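One detail worth guarding against with this approach: Path#getFileName returns null when the path has no elements, which is the case for a URL whose path is just "/". The following sketch wraps the same calls in an Optional; PathFileName is a hypothetical class name, and URI.create is used instead of the URI constructor only to avoid the checked exception.

```java
import java.net.URI;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

// Hypothetical helper, not part of the original answer.
final class PathFileName {

    /** File name of the URL path, or empty when the path is missing, empty or just "/". */
    static Optional<String> fileName(String url) {
        String path = URI.create(url).getPath();
        if (path == null || path.isEmpty()) {
            return Optional.empty();
        }
        // getFileName() returns null for a path with no name elements, such as "/"
        Path last = Paths.get(path).getFileName();
        return Optional.ofNullable(last).map(Path::toString);
    }

    public static void main(String[] args) {
        System.out.println(fileName("http://example.org/file?p=foo&q=bar"));
        System.out.println(fileName("http://example.org/"));
    }
}
```

Paths.get interprets the string with the rules of the local file system, so a URL path containing characters that are illegal in local file names may still raise InvalidPathException on some platforms.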

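【Comments】:

@Carcigenicate I just tested it again and it seems to work fine. URI.getPath() returns a String, so I do not see why it would not work.

Nvm. I realize now that my problem was due to how Clojure handles var-args during Java interop. The String overload was not working because an empty array needed to be passed as well to handle the var-args of Paths/get. It still works if you get rid of the call to getPath and use the URI overload instead.

@Carcigenicate you mean Paths.get(new URI(url))? That does not seem to work for me.

getFileName requires Android API level 26.

【Answer 15】:

To return the file name without the extension and without parameters, use the following: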

String filenameWithParams = FilenameUtils.getBaseName(urlStr); // may hold params if http://example.com/a?param=yes
return filenameWithParams.split("\\?")[0]; // removing parameters from url if they exist

To return the file name with the extension but without parameters, use this:

/** Parses a URL and extracts the filename from it or returns an empty string (if filename is non existent in the url) <br/>
 * This method will work in win/unix formats, will work with mixed case of slashes (forward and backward) <br/>
 * This method will remove parameters after the extension
 *
 * @param urlStr original url string from which we will extract the filename
 * @return filename from the url if it exists, or an empty string in all other cases */
private String getFileNameFromUrl(String urlStr) {
    String baseName = FilenameUtils.getBaseName(urlStr);
    String extension = FilenameUtils.getExtension(urlStr);

    try {
        extension = extension.split("\\?")[0]; // removing parameters from url if they exist
        return baseName.isEmpty() ? "" : baseName + "." + extension;
    } catch (NullPointerException npe) {
        return "";
    }
}

【Comments】:

【Answer 16】:

Get the file name with extension, without extension, and only the extension, in just 3 lines:

String urlStr = "http://www.example.com/yourpath/foler/test.png";

String fileName = urlStr.substring(urlStr.lastIndexOf('/')+1, urlStr.length());
String fileNameWithoutExtension = fileName.substring(0, fileName.lastIndexOf('.'));
String fileExtension = urlStr.substring(urlStr.lastIndexOf("."));

Log.i("File Name", fileName);
Log.i("File Name Without Extension", fileNameWithoutExtension);
Log.i("File Extension", fileExtension);

Log result:

File Name(13656): test.png
File Name Without Extension(13656): test
File Extension(13656): .png

Hope this helps.

【Comments】:

【Answer 17】:

Create a new File from the image path string:

    String imagePath;
    File test = new File(imagePath);
    test.getName();
    test.getPath();
    getExtension(test.getName());


    public static String getExtension(String uri) {
            if (uri == null) {
                return null;
            }

            int dot = uri.lastIndexOf(".");
            if (dot >= 0) {
                return uri.substring(dot);
            } else {
                // No extension.
                return "";
            }
    }

【Comments】:

【Answer 18】:

Keep it simple:

/**
 * This function will take an URL as input and return the file name.
 * <p>Examples :</p>
 * <ul>
 * <li>http://example.com/a/b/c/test.txt -> test.txt</li>
 * <li>http://example.com/ -> an empty string </li>
 * <li>http://example.com/test.txt?param=value -> test.txt</li>
 * <li>http://example.com/test.txt#anchor -> test.txt</li>
 * </ul>
 * 
 * @param url The input URL
 * @return The URL file name
 */
public static String getFileNameFromUrl(URL url) {

    String urlString = url.getFile();

    return urlString.substring(urlString.lastIndexOf('/') + 1).split("\\?")[0].split("#")[0];
}

【Comments】:

@AlexNauda Replace url.getFile() with url.toString() and it works with # in the path.

【Answer 19】:

The URL can have parameters after it; this handles them:

 /**
 * Getting file name from url without extension
 * @param url string
 * @return file name
 */
public static String getFileName(String url) {
    String fileName;
    int slashIndex = url.lastIndexOf("/");
    int qIndex = url.lastIndexOf("?");
    if (qIndex > slashIndex) { //if has parameters
        fileName = url.substring(slashIndex + 1, qIndex);
    } else {
        fileName = url.substring(slashIndex + 1);
    }
    if (fileName.contains(".")) {
        fileName = fileName.substring(0, fileName.lastIndexOf("."));
    }

    return fileName;
}

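【Comments】:

/ can appear in the fragment. You will extract the wrong thing.

【Answer 20】:

This is the easiest way to do it on Android. I know it will not work in plain Java, but it may help Android application developers.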

import android.webkit.URLUtil;

public String getFileNameFromURL(String url) {
    String fileNameWithExtension = null;
    String fileNameWithoutExtension = null;
    if (URLUtil.isValidUrl(url)) {
        fileNameWithExtension = URLUtil.guessFileName(url, null, null);
        if (fileNameWithExtension != null && !fileNameWithExtension.isEmpty()) {
            // split() takes a regex, so the dot must be escaped
            String[] f = fileNameWithExtension.split("\\.");
            if (f != null && f.length > 1) {
                fileNameWithoutExtension = f[0];
            }
        }
    }
    return fileNameWithoutExtension;
}

【Comments】:

【Answer 21】:

How about this:

String filenameWithoutExtension = null;
String fullname = new File(
    new URI("http://www.xyz.com/some/deep/path/to/abc.png").getPath()).getName();

int lastIndexOfDot = fullname.lastIndexOf('.');
filenameWithoutExtension = fullname.substring(0, 
    lastIndexOfDot == -1 ? fullname.length() : lastIndexOfDot);

【Comments】:

【Answer 22】:

public String getFileNameWithoutExtension(URL url) {
    String path = url.getPath();

    if (StringUtils.isBlank(path)) {
        return null;
    }
    if (StringUtils.endsWith(path, "/")) {
        //is a directory ..
        return null;
    }

    File file = new File(url.getPath());
    String fileNameWithExt = file.getName();

    int sepPosition = fileNameWithExt.lastIndexOf(".");
    String fileNameWithOutExt = null;
    if (sepPosition >= 0) {
        fileNameWithOutExt = fileNameWithExt.substring(0,sepPosition);
    } else {
        fileNameWithOutExt = fileNameWithExt;
    }

    return fileNameWithOutExt;
}

【Comments】:

【Answer 23】:

public static String getFileName(URL extUrl) {
        //URL: "http://photosaaaaa.net/photos-ak-snc1/v315/224/13/659629384/s659629384_752969_4472.jpg"
        String filename = "";
        //PATH: /photos-ak-snc1/v315/224/13/659629384/s659629384_752969_4472.jpg
        String path = extUrl.getPath();
        //Checks for both forward and/or backslash
        //NOTE:**While backslashes are not supported in URL's
        //most browsers will autoreplace them with forward slashes
        //So technically if you're parsing an html page you could run into
        //a backslash , so i'm accounting for them here;
        String[] pathContents = path.split("[\\\\/]");
        if(pathContents != null){
            int pathContentsLength = pathContents.length;
            System.out.println("Path Contents Length: " + pathContentsLength);
            for (int i = 0; i < pathContents.length; i++) {
                System.out.println("Path " + i + ": " + pathContents[i]);
            }
            //lastPart: s659629384_752969_4472.jpg
            String lastPart = pathContents[pathContentsLength-1];
            String[] lastPartContents = lastPart.split("\\.");
            if(lastPartContents != null && lastPartContents.length > 1){
                int lastPartContentLength = lastPartContents.length;
                System.out.println("Last Part Length: " + lastPartContentLength);
                //filenames can contain . , so we assume everything before
                //the last . is the name, everything after the last . is the
                //extension
                String name = "";
                for (int i = 0; i < lastPartContentLength; i++) {
                    System.out.println("Last Part " + i + ": "+ lastPartContents[i]);
                    if(i < (lastPartContents.length -1)){
                        name += lastPartContents[i] ;
                        if(i < (lastPartContentLength -2)){
                            name += ".";
                        }
                    }
                }
                String extension = lastPartContents[lastPartContentLength -1];
                filename = name + "." +extension;
                System.out.println("Name: " + name);
                System.out.println("Extension: " + extension);
                System.out.println("Filename: " + filename);
            }
        }
        return filename;
}

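【Comments】:

【Answer 24】:

Andy's answer redone using split():

URL u= ...;
String[] pathparts= u.getPath().split("\\/");
// A limit of 2 splits at the first dot only; a limit of 1 would return the whole name
String filename= pathparts[pathparts.length-1].split("\\.", 2)[0];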

【Comments】:

【Answer 25】:

import java.io.*;

import java.net.*;

public class ConvertURLToFileName {

   public static void main(String[] args) throws IOException {
      BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
      System.out.print("Please enter the URL : ");

      String str = in.readLine();

      try {
         URL url = new URL(str);

         System.out.println("File : " + url.getFile());
         System.out.println("Converting process Successfully");
      } catch (MalformedURLException me) {
         System.out.println("Converting process error");
      }
   }
}

I hope this helps.
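To make the behaviour of URL#getFile concrete, the small sketch below prints the pieces of one URL side by side. The class GetFileVsGetPath and the helper parse are hypothetical names for this illustration; the accessors themselves are the standard java.net.URL methods.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

// Hypothetical demo class, not part of the original answer.
final class GetFileVsGetPath {

    /** Parses a URL string, rethrowing the checked exception as an unchecked one. */
    static URL parse(String s) {
        try {
            return URI.create(s).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        URL url = parse("http://example.com/a/file.xml?foo=bar#test");
        System.out.println(url.getPath());   // /a/file.xml
        System.out.println(url.getQuery());  // foo=bar
        System.out.println(url.getFile());   // /a/file.xml?foo=bar
        System.out.println(url.getRef());    // test
    }
}
```

So getFile() is the path plus the query string, which is why getPath() is usually the better starting point when only the file name is wanted.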

【Comments】:

getFile() does not do what you think. According to the docs it is actually getPath()+getQuery, which is rather pointless. java.sun.com/j2se/1.4.2/docs/api/java/net/URL.html#getFile()

【Answer 26】:

String fileName = url.substring( url.lastIndexOf('/')+1, url.length() );

String fileNameWithoutExtn = fileName.substring(0, fileName.lastIndexOf('.'));

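【Comments】:

Why the downvote? This is unfair. My code works; I verified it again after seeing the downvote.

I upvoted you, because it is more readable than my version. The downvote may be because it does not work when there is no extension or no file.

You can leave off the second parameter to substring().

This works for none of http://example.org/file#anchor, http://example.org/file?p=foo&q=bar and http://example.org/file.xml#/p=foo&q=bar

If you let String url = new URL(original_url).getPath() and add a special case for file names that do not contain a ., then this works fine.

【Answer 27】:

I came up with this: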

String url = "http://www.example.com/some/path/to/a/file.xml";
String file = url.substring(url.lastIndexOf('/')+1, url.lastIndexOf('.'));

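【Comments】:

Or on URLs with no file, just a path.

Your code is correct too. We are not supposed to check for negative conditions anyway. An upvote for you. By the way, does the name dirk kuyt sound familiar?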
