Scanning Files on Android and Counting Files by Category

Posted 2022-12-30 写代码的林克

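I have recently been writing my own file manager, modeled on Xiaomi's. One of its features is a full scan of storage that shows how many files of each type exist. My first version used a single thread and was painfully slow. Switching to a multithreaded scan made it noticeably faster, so I wrapped the logic in a utility class and am sharing it here.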

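I. The Problems

First, the problems I ran into:

1. Scan every file on the storage of an Android device

2. A single worker thread is too slow, so the scan has to use multiple threads

3. Count the files in each category (for example video, picture, and audio files)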

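II. The Approach

Here is how I tackled each point:

1. A directory hierarchy is a tree, so scanning it is a tree traversal. I use a level-order (breadth-first) traversal, implemented iteratively rather than recursively, with a queue holding the directories that still have to be visited. The full code comes later in the post.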
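
To make the queue-based idea concrete before the multithreaded version, here is a minimal single-threaded sketch in plain Java. The class name `LevelOrderWalk` and the method `listFiles` are made up for illustration and are not part of the utility class in this post. It skips entries whose names start with a dot, as the post's code intends.

```java
import java.io.File;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical helper, for illustration only.
final class LevelOrderWalk {
    /** Returns every non-hidden regular file under root, visiting directories level by level. */
    static List<File> listFiles(File root) {
        List<File> result = new ArrayList<>();
        Queue<File> pending = new ArrayDeque<>();
        pending.offer(root);
        File dir;
        while ((dir = pending.poll()) != null) {
            File[] children = dir.listFiles();
            if (children == null) {
                continue; // not a directory, or it could not be read
            }
            for (File child : children) {
                if (child.getName().startsWith(".")) {
                    continue; // skip hidden files and directories
                }
                if (child.isDirectory()) {
                    pending.offer(child); // visit later, after the current level
                } else {
                    result.add(child);
                }
            }
        }
        return result;
    }
}
```

Because directories are only enqueued and never recursed into, the depth of the tree cannot overflow the call stack, which is the main practical reason to prefer this over recursion on a large storage volume.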

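2. The second point is the parameter that has to be passed in, which is really a question of how to store the category data. I use the following structure: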

Map<String, Set<String>>

In this map, the key is a category name and the value is a Set containing the file suffixes that belong to that category. For example:

Map<String, Set<String>> CATEGORY_SUFFIX = new HashMap<>();
Set<String> set = new HashSet<>();
set.add("mp4");
set.add("avi");
set.add("wmv");
set.add("flv");
CATEGORY_SUFFIX.put("video", set);

set = new HashSet<>();
set.add("txt");
set.add("pdf");
set.add("doc");
set.add("docx");
set.add("xls");
set.add("xlsx");
CATEGORY_SUFFIX.put("document", set);

set = new HashSet<>();
set.add("jpg");
set.add("jpeg");
set.add("png");
set.add("bmp");
set.add("gif");
CATEGORY_SUFFIX.put("picture", set);

set = new HashSet<>();
set.add("mp3");
set.add("ogg");
CATEGORY_SUFFIX.put("music", set);

set = new HashSet<>();
set.add("apk");
CATEGORY_SUFFIX.put("apk", set);

set = new HashSet<>();
set.add("zip");
set.add("rar");
set.add("7z");
CATEGORY_SUFFIX.put("zip", set);

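Why store the suffixes in a Set? Because each file's suffix later has to be looked up to find the category it belongs to, and a HashSet lookup takes constant time on average.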
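
The lookup described above can be sketched as follows. This is an illustration under assumptions, not the code used later in the post: `SuffixLookup`, `suffixOf`, and `categoryOf` are hypothetical names, and the sketch takes the text after the last dot as the suffix so that a name such as `a.tar.gz` yields `gz`.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, for illustration only.
final class SuffixLookup {
    /** Returns the lower-cased text after the last dot, or "" if the name has no dot. */
    static String suffixOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot < 0 ? "" : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    /** Returns the category whose suffix set contains the file's suffix, or null if none does. */
    static String categoryOf(String fileName, Map<String, Set<String>> categorySuffix) {
        String suffix = suffixOf(fileName);
        for (Map.Entry<String, Set<String>> entry : categorySuffix.entrySet()) {
            if (entry.getValue().contains(suffix)) { // hash lookup, O(1) on average
                return entry.getKey();
            }
        }
        return null;
    }
}
```

With the map built above, `categoryOf("Holiday.2017.MP4", CATEGORY_SUFFIX)` would return `"video"`. The loop over categories is short (six entries here), so the cost per file is dominated by the constant-time `contains` calls.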

3. As mentioned, the traversal relies on a queue for the level-order walk. Because several threads share that queue, I chose the thread-safe ConcurrentLinkedQueue:

ConcurrentLinkedQueue<File> mFileConcurrentLinkedQueue;
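
One detail is worth a sketch when several threads consume the same queue: `isEmpty()` followed by `poll()` is two separate steps, and another thread can take the last element in between, in which case `poll()` returns null. Looping on the result of `poll()` itself avoids that gap. The class `QueueDrain` and its method `drain` below are hypothetical names used only for this illustration.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, for illustration only.
final class QueueDrain {
    /** Drains the shared queue from several threads and returns how many items were taken. */
    static int drain(final ConcurrentLinkedQueue<String> queue, int workers)
            throws InterruptedException {
        final AtomicInteger taken = new AtomicInteger();
        Thread[] threads = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            threads[i] = new Thread(new Runnable() {
                @Override
                public void run() {
                    // poll() returns null when the queue is empty, so a single call
                    // both tests and removes; no other thread can slip in between.
                    while (queue.poll() != null) {
                        taken.incrementAndGet();
                    }
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        return taken.get();
    }
}
```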

4. The counts also have to be stored somewhere, and here too I chose a thread-safe map:

private ConcurrentHashMap<String, Integer> mCountResult;

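The key of this map is the file category and the value is the number of files in that category. Because several threads access it, I chose the thread-safe ConcurrentHashMap.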
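
A caveat when counting from several threads: ConcurrentHashMap makes each individual call thread-safe, but a `get` followed by a `put` is still two calls, so two threads can read the same old value and one increment is lost. The sketch below shows one way to increment atomically using only `putIfAbsent` and the three-argument `replace`, both of which belong to the `ConcurrentMap` interface. The class `AtomicCount` and its methods are hypothetical names for this illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical helper, for illustration only.
final class AtomicCount {
    /** Adds 1 to the counter for key; safe to call from several threads at once. */
    static void increment(ConcurrentMap<String, Integer> counts, String key) {
        while (true) {
            Integer old = counts.putIfAbsent(key, 1);
            if (old == null) {
                return; // first occurrence: the counter now holds 1
            }
            // replace() succeeds only if the value is still `old`; otherwise retry.
            if (counts.replace(key, old, old + 1)) {
                return;
            }
        }
    }

    /** Starts several threads that each increment "video" perThread times; returns the total. */
    static int hammer(int threads, final int perThread) throws InterruptedException {
        final ConcurrentMap<String, Integer> counts = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(new Runnable() {
                @Override
                public void run() {
                    for (int n = 0; n < perThread; n++) {
                        increment(counts, "video");
                    }
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        Integer total = counts.get("video");
        return total == null ? 0 : total;
    }
}
```

Storing `AtomicInteger` values instead of `Integer` would be another option, at the cost of changing the map's value type.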

5. For the threads themselves, I use a fixed-size thread pool whose maximum number of threads equals the number of CPU cores:

final ExecutorService executorService = Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());

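6. For passing the data back: Android does not allow the UI to be updated from a worker thread, so a Handler is passed in and used to deliver the final counts to the UI thread for display.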

III. The Code

Here is the code first:


import android.os.Handler;
import android.os.Message;

import java.io.File;
import java.io.FilenameFilter;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/**
 * Created by 尚振鸿 on 17-12-16. 14:26
 * mail:szh@codekong.cn
 * Utility class that scans files and counts them by category
 */
public class ScanFileCountUtil {
    // Filters out hidden entries. The first argument of accept() is the parent
    // directory, so the check has to be made on the entry name.
    private static final FilenameFilter NOT_HIDDEN = new FilenameFilter() {
        @Override
        public boolean accept(File dir, String name) {
            return !name.startsWith(".");
        }
    };

    // Path of the directory to scan
    private String mFilePath;
    // File suffixes that belong to each category
    private Map<String, Set<String>> mCategorySuffix;
    // Final counts
    private ConcurrentHashMap<String, Integer> mCountResult;
    // Directories waiting to be visited in the level-order traversal
    private ConcurrentLinkedQueue<File> mFileConcurrentLinkedQueue;
    private Handler mHandler = null;

    public void scanCountFile() {
        if (mFilePath == null) {
            return;
        }
        final File file = new File(mFilePath);

        // Return immediately if the path does not exist or is not a directory
        if (!file.exists() || file.isFile()) {
            return;
        }
        // Initialize the count of every category to 0
        for (String category : mCategorySuffix.keySet()) {
            mCountResult.put(category, 0);
        }

        // List the files and directories directly under the root
        final File[] files = file.listFiles(NOT_HIDDEN);
        if (files == null) {
            // listFiles() returns null on an I/O error or when access is denied
            return;
        }
        // Collect the tasks so they can all be submitted to the pool below
        List<Runnable> runnableList = new ArrayList<>();
        for (File f : files) {
            if (f.isDirectory()) {
                // Add the directory to the queue
                mFileConcurrentLinkedQueue.offer(f);
                // One task is created per directory under the root
                runnableList.add(new Runnable() {
                    @Override
                    public void run() {
                        countFile();
                    }
                });
            } else {
                countOne(f);
            }
        }

        // Fixed-size pool: at most one thread per CPU core, extra tasks wait in its queue
        final ExecutorService executorService = Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        for (Runnable runnable : runnableList) {
            executorService.submit(runnable);
        }
        // Accept no more tasks
        executorService.shutdown();
        // Wait until every task in the pool has finished
        while (true) {
            if (executorService.isTerminated()) {
                break;
            }
            try {
                TimeUnit.SECONDS.sleep(1);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
        // Deliver the counts to the UI
        Message msg = Message.obtain();
        msg.obj = mCountResult;
        mHandler.sendMessage(msg);
    }

    /**
     * Counts the files of each category
     */
    private void countFile() {
        // Level-order traversal. poll() returns null once the queue is empty,
        // which avoids the gap between a separate isEmpty() check and poll().
        File dir;
        while ((dir = mFileConcurrentLinkedQueue.poll()) != null) {
            final File[] fileArray = dir.listFiles(NOT_HIDDEN);
            if (fileArray == null) {
                continue; // the directory could not be read
            }
            for (File f : fileArray) {
                if (f.isDirectory()) {
                    // Add the directory to the queue
                    mFileConcurrentLinkedQueue.offer(f);
                } else {
                    countOne(f);
                }
            }
        }
    }

    /**
     * Finds the category of a single file and increments its count
     */
    private void countOne(File f) {
        String name = f.getName();
        int dot = name.lastIndexOf('.');
        if (dot < 0) {
            return; // no suffix
        }
        // The suffix is the text after the last dot
        String suffix = name.substring(dot + 1).toLowerCase();
        for (Map.Entry<String, Set<String>> entry : mCategorySuffix.entrySet()) {
            if (entry.getValue().contains(suffix)) {
                increment(entry.getKey());
                // Found it, stop searching
                break;
            }
        }
    }

    /**
     * Atomically adds 1 to a category's count (a plain get followed by put can lose updates)
     */
    private void increment(String category) {
        Integer old;
        do {
            old = mCountResult.get(category);
        } while (!mCountResult.replace(category, old, old + 1));
    }

    public static class Builder {
        private Handler mHandler;
        private String mFilePath;
        // File suffixes that belong to each category
        private Map<String, Set<String>> mCategorySuffix;

        public Builder(Handler handler) {
            this.mHandler = handler;
        }

        public Builder setFilePath(String filePath) {
            this.mFilePath = filePath;
            return this;
        }

        public Builder setCategorySuffix(Map<String, Set<String>> categorySuffix) {
            this.mCategorySuffix = categorySuffix;
            return this;
        }

        private void applyConfig(ScanFileCountUtil scanFileCountUtil) {
            scanFileCountUtil.mFilePath = mFilePath;
            scanFileCountUtil.mCategorySuffix = mCategorySuffix;
            scanFileCountUtil.mHandler = mHandler;
            scanFileCountUtil.mCountResult = new ConcurrentHashMap<String, Integer>(mCategorySuffix.size());
            scanFileCountUtil.mFileConcurrentLinkedQueue = new ConcurrentLinkedQueue<>();
        }

        public ScanFileCountUtil create() {
            ScanFileCountUtil scanFileCountUtil = new ScanFileCountUtil();
            applyConfig(scanFileCountUtil);
            return scanFileCountUtil;
        }
    }
}

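The key points in the code above are either commented or were covered earlier. A few additional notes:

1. The message to the UI thread must not be sent until every task has finished. The code waits for that by polling: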

while (true) {
    if (executorService.isTerminated()) {
        break;
    }
    try {
        TimeUnit.SECONDS.sleep(1);
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
}

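2. Because this polling blocks the calling thread, scanCountFile() should be called from a worker thread

3. The utility class is instantiated with the builder pattern. If that is unfamiliar, see my other blog post:
http://blog.csdn.net/bingjianit/article/details/53607856

4. The code above submits one task per directory under the root, and the pool runs at most as many of them at a time as there are CPU cores. Adjust this to suit your own needs.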
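
On point 1, polling with a one-second sleep can add up to a second of latency after the last task finishes. As an alternative sketch, `ExecutorService.awaitTermination` blocks until the pool has terminated or a timeout expires. The class `PoolWait`, the method `runAll`, and the timeout parameters are assumptions made for this example and are not part of the utility class above.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, for illustration only.
final class PoolWait {
    /** Runs all tasks on a fixed pool; returns true if they finished within the timeout. */
    static boolean runAll(List<Runnable> tasks, long timeout, TimeUnit unit)
            throws InterruptedException {
        ExecutorService pool =
                Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        for (Runnable task : tasks) {
            pool.submit(task);
        }
        pool.shutdown(); // accept no new tasks; queued ones still run
        boolean finished = pool.awaitTermination(timeout, unit);
        if (!finished) {
            pool.shutdownNow(); // give up: interrupt whatever is still running
        }
        return finished;
    }
}
```

Like the polling loop, this still blocks the calling thread, so point 2 applies unchanged.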

IV. Calling It

Here is a brief look at how to call the code above:

private Handler mHandler = new Handler(Looper.getMainLooper()) {
    @Override
    public void handleMessage(Message msg) {
        // Receive the result
        Map<String, Integer> countRes = (Map<String, Integer>) msg.obj;
        // Display it as needed
    }
};

/**
 * Scan files
 */
private void scanFile() {
    if (!Environment.getExternalStorageState().equals(Environment.MEDIA_MOUNTED)) {
        return;
    }
    final String path = Environment.getExternalStorageDirectory().getAbsolutePath();

    // FILE_CATEGORY_ICON is not shown in this post (presumably an array defined
    // elsewhere in the project); here it only sets the map's initial capacity
    final Map<String, Set<String>> CATEGORY_SUFFIX = new HashMap<>(FILE_CATEGORY_ICON.length);
    Set<String> set = new HashSet<>();
    set.add("mp4");
    set.add("avi");
    set.add("wmv");
    set.add("flv");
    CATEGORY_SUFFIX.put("video", set);

    // Start a fresh set so the document suffixes do not end up in "video"
    set = new HashSet<>();
    set.add("txt");
    set.add("pdf");
    set.add("doc");
    set.add("docx");
    set.add("xls");
    set.add("xlsx");
    CATEGORY_SUFFIX.put("document", set);

    set = new HashSet<>();
    set.add("jpg");
    set.add("jpeg");
    set.add("png");
    set.add("bmp");
    set.add("gif");
    CATEGORY_SUFFIX.put("picture", set);

    set = new HashSet<>();
    set.add("mp3");
    set.add("ogg");
    CATEGORY_SUFFIX.put("music", set);

    set = new HashSet<>();
    set.add("apk");
    CATEGORY_SUFFIX.put("apk", set);

    set = new HashSet<>();
    set.add("zip");
    set.add("rar");
    set.add("7z");
    CATEGORY_SUFFIX.put("zip", set);

    // Single-thread executor, so the blocking scan runs off the UI thread
    ExecutorService singleExecutorService = Executors.newSingleThreadExecutor();
    singleExecutorService.submit(new Runnable() {
        @Override
        public void run() {
            // Build the object
            ScanFileCountUtil scanFileCountUtil = new ScanFileCountUtil
                    .Builder(mHandler)
                    .setFilePath(path)
                    .setCategorySuffix(CATEGORY_SUFFIX)
                    .create();
            scanFileCountUtil.scanCountFile();
        }
    });
    // Let the executor's thread exit once the scan has finished
    singleExecutorService.shutdown();
}

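V. Afterword

With a single thread the scan took about 3 minutes. With multiple threads it dropped to 30-40 seconds. One more thing: for the program to run on Android, the app needs the permission to access the SD card, and on Android 6.0 and later that permission also has to be requested at runtime.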

If you found this useful, feel free to follow me, or take a look at my file manager on GitHub, which is still being improved:
https://github.com/codekongs/FileExplorer/
