Scanner & .hasNext() issue

Posted 2023-02-25

【Title】: Scanner & .hasNext() issue 【Posted】: 2015-10-30 18:16:38 【Question】:

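I'm new to Java and very new to the Scanner class. I'm writing a program that asks the user for a word and then searches a file for that word. Each time the word is found, it is printed on a new line in a JOptionPane, along with the words before and after it. Everything works as it should, with two exceptions:

If the word being searched for happens to be the last word in the file, a "NoSuchElementException" is thrown.

If the word being searched for appears twice in a row (unlikely, but still a problem I found), it is only returned once. For example, if the search word is "had" and the file contains the sentence "He said that he had had enough. He had been up all night", the output is:

he had had
He had been

when it should be:

he had had
had had enough.
He had been

I believe my problem lies in the fact that I use while(scan.hasNext()) and, within this loop, call scan.next() twice. However, I can't find a solution that still achieves what I want the program to return.

Here is my code: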

//WordSearch.java
/*
 * Program which asks the user to enter a filename followed
 * by a word to search for within the file. The program then
 * returns every occurrence of this word as well as the
 * previous and next word it appears with. Each of these
 * occurrences is printed on a new line when displayed
 * to the user.
 */

import java.io.FileNotFoundException;
import java.io.FileReader;
import java.util.ArrayList;
import java.util.Scanner;
import javax.swing.JOptionPane;

public class WordSearch {

    public static void main(String[] args) throws FileNotFoundException {

        String fileName = JOptionPane.showInputDialog("Enter the name of the file to be searched:");
        FileReader reader = new FileReader(fileName);

        String searchWord = JOptionPane.showInputDialog("Enter the word to be searched for in \"" + fileName + "\":");
        Scanner scan = new Scanner(reader);

        int occurrenceNum = 0;
        ArrayList<String> occurrenceList = new ArrayList<String>();
        String word = "", previousWord, nextWord = "", message = "", occurrence, allOccurrences = "";

        while(scan.hasNext()) {
            previousWord = word;
            word = scan.next();

            if(word.equalsIgnoreCase(searchWord)) {
                nextWord = scan.next();

                if(previousWord.equals("")) {
                    message = word + " is the first word of the file.\nHere are the occurrences of it:\n\n";
                    occurrence = word + " " + nextWord;
                }
                else {
                    occurrence = previousWord + " " + word + " " + nextWord;
                }

                occurrenceNum++;
                occurrenceList.add(occurrence);
            }
        }

        for(int i = 0; i < occurrenceNum; i++) {
            allOccurrences += occurrenceList.get(i) + "\n";
        }

        JOptionPane.showMessageDialog(null, message + allOccurrences);

        scan.close();
    }
}

Also, as a side note: how can I use scan.useDelimiter() to ignore any question marks, commas, periods, apostrophes, etc.?
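
On the side question about useDelimiter(), one possible setup is sketched below. It assumes that a "word" is a run of ASCII letters, digits or underscores, so the regular expression \W+ (one or more non-word characters) serves as the separator. The class name DelimiterDemo and the helper tokens are made up for illustration, and a String stands in for the file.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class DelimiterDemo {

    // Splits text into tokens, treating every run of non-word characters
    // (spaces, commas, periods, question marks, apostrophes, ...) as a separator.
    static List<String> tokens(String text) {
        List<String> result = new ArrayList<>();
        try (Scanner scan = new Scanner(text)) {
            // Assumption: "\\W+" is an acceptable definition of "not part of a word".
            scan.useDelimiter("\\W+");
            while (scan.hasNext()) {
                result.add(scan.next());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(tokens("He had enough. He had been up, all night?"));
    }
}
```

One trade-off of this pattern: because apostrophes count as separators, a contraction such as "didn't" is split into two tokens, and the punctuation no longer appears in the printed occurrences.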

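【Discussion】:

May I suggest you break this up into multiple functions?

Yes, I intend to do that later when I clean it up.

.... after we have all had to read through it. Note that breaking code into functions often helps you spot problems like this.

【Answer 1】: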

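If the word being searched for happens to be the last word in the file, a NoSuchElementException is thrown.

This is because of this line:

if(word.equalsIgnoreCase(searchWord)) {
    nextWord = scan.next();
    ...
}

You don't check whether scan actually hasNext(); you go straight to scan.next(). You can fix this by adding a condition that calls scan.hasNext().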
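
A minimal sketch of such a guard follows. It is not the asker's program: the class name GuardedNext and the method findOccurrences are hypothetical, a String replaces the file, and a List is returned instead of showing a dialog. When the match is the last token, nextWord is left empty rather than read.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class GuardedNext {

    // Collects "previous word next" for each match. If the match is the
    // last token, nextWord stays empty instead of triggering an exception.
    static List<String> findOccurrences(String text, String searchWord) {
        List<String> occurrences = new ArrayList<>();
        String word = "";
        try (Scanner scan = new Scanner(text)) {
            while (scan.hasNext()) {
                String previousWord = word;
                word = scan.next();
                if (word.equalsIgnoreCase(searchWord)) {
                    // Only read ahead when another token really exists.
                    String nextWord = scan.hasNext() ? scan.next() : "";
                    occurrences.add((previousWord + " " + word + " " + nextWord).trim());
                }
            }
        }
        return occurrences;
    }

    public static void main(String[] args) {
        System.out.println(findOccurrences("up all night", "night"));
    }
}
```

This only removes the exception. The token read ahead is still consumed inside the if block, so a word that appears twice in a row is still reported once.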

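If the word being searched for appears twice in a row (unlikely, but still a problem I found), it is only returned once.

The same problem is at work here: when you find a word, you immediately retrieve the next one.

Fixing this is a bit trickier: you need to change the algorithm to look at one word at a time, and use previousWord (which you store anyway) on subsequent iterations of the while loop.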

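【Discussion】:

Thanks for the solution, everything is almost working well! My last issue is that if the word being searched for occurs but is followed by a full stop/comma/question mark/etc., it isn't recognized as a match. I have seen the Scanner.useDelimiter() method, but I don't really understand its format or how to get it working properly. Could you explain it to me briefly?

@KOB One way to solve this is to use replaceAll to strip all non-word characters from both ends of the string, i.e. nextWord = scan.next().replaceAll("^\\W+", "").replaceAll("\\W+$", "");

【Answer 2】: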

What you can do is call hasNext before using next again.

while(scan.hasNext()) {
    previousWord = word;
    word = scan.next();

    if(word.equalsIgnoreCase(searchWord) && scan.hasNext()) { // this line change
        nextWord = scan.next();

        if(previousWord.equals("")) {
            message = word + " is the first word of the file.\nHere are the occurrences of it:\n\n";
            occurrence = word + " " + nextWord;
        }
        else {
            occurrence = previousWord + " " + word + " " + nextWord;
        }

        occurrenceNum++;
        occurrenceList.add(occurrence);
    }
}

You don't want to use equalsIgnoreCase; just use .equals().

【Answer 3】:

The solution is to keep two words, in the same way you currently keep previousWord. Something like:

while (scan.hasNext()) {
    previousWord = word;
    word = nextWord;
    nextWord = scan.next();
    // check word here, using previousWord and nextWord
}

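Then you check word. If it matches what you need, you can print it together with previousWord and nextWord. In other words, on each iteration you are checking the word you read on the previous iteration.

This way, you only need one hasNext() and one next() in your loop.

Note that after the loop ends, nextWord may actually be your word. That means your word is the last word in the file, and you should check for that and print it accordingly.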
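
The three-word window described above, including the check after the loop, could look roughly like the sketch below. The class name WindowSearch and the method findOccurrences are hypothetical, the input is a String rather than a file, and the results are returned as a List instead of being shown in a JOptionPane.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class WindowSearch {

    // Slides a window of (previousWord, word, nextWord) over the tokens and
    // checks the middle word, so each iteration calls next() exactly once.
    static List<String> findOccurrences(String text, String searchWord) {
        List<String> occurrences = new ArrayList<>();
        String previousWord = "";
        String word = "";
        String nextWord = "";
        try (Scanner scan = new Scanner(text)) {
            while (scan.hasNext()) {
                previousWord = word;
                word = nextWord;
                nextWord = scan.next();
                // word is empty only on the first iteration, before any token reaches it.
                if (!word.isEmpty() && word.equalsIgnoreCase(searchWord)) {
                    occurrences.add((previousWord + " " + word + " " + nextWord).trim());
                }
            }
        }
        // The last token never became "word" inside the loop, so check it here.
        if (!nextWord.isEmpty() && nextWord.equalsIgnoreCase(searchWord)) {
            occurrences.add((word + " " + nextWord).trim());
        }
        return occurrences;
    }

    public static void main(String[] args) {
        String text = "He said that he had had enough. He had been up all night";
        for (String occurrence : findOccurrences(text, "had")) {
            System.out.println(occurrence);
        }
    }
}
```

Because the matched word is never skipped over by an extra next() call, two consecutive matches each get their own window, and the trailing check covers both the last-word and the single-word file.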

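【Discussion】:

Wouldn't this just flip the problem, so that the word isn't recognized if it is the first word in the file?

No. Although you only check it on the second iteration rather than the first, previousWord is still "" at that point, so you will find it. The only problem caused by the delayed processing is when the word is the last word in the file, as I already mentioned. You may also need to handle the case where the file contains only one word.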
