StringTokenizer: the string tokenizer

Posted 2020-10-21 maxudong

Original work; please credit the source when reposting.

Class definition

/**
 * The string tokenizer class allows an application to break a
 * string into tokens. The tokenization method is much simpler than
 * the one used by the <code>StreamTokenizer</code> class. The
 * <code>StringTokenizer</code> methods do not distinguish among
 * identifiers, numbers, and quoted strings, nor do they recognize
 * and skip comments.
 *
 * The set of delimiters (the characters that separate tokens) may
 * be specified either at creation time or on a per-token basis.
 *
 * An instance of <code>StringTokenizer</code> behaves in one of two
 * ways, depending on whether it was created with the
 * <code>returnDelims</code> flag having the value <code>true</code>
 * or <code>false</code>:
 *
 * If the flag is <code>false</code>, delimiter characters serve to
 * separate tokens. A token is a maximal sequence of consecutive
 * characters that are not delimiters.
 *
 * If the flag is <code>true</code>, delimiter characters are themselves
 * considered to be tokens. A token is thus either one delimiter
 * character, or a maximal sequence of consecutive characters that are
 * not delimiters.
 *
 * A <tt>StringTokenizer</tt> object internally maintains a current
 * position within the string to be tokenized. Some operations advance this
 * current position past the characters processed.<p>
 * A token is returned by taking a substring of the string that was used to
 * create the <tt>StringTokenizer</tt> object.
 * <p>
 * The following is one example of the use of the tokenizer. The code:
 *     StringTokenizer st = new StringTokenizer("this is a test");
 *     while (st.hasMoreTokens()) {
 *         System.out.println(st.nextToken());
 *     }
 * prints the following output:
 * <blockquote><pre>
 *     this
 *     is
 *     a
 *     test
 * </pre></blockquote>
 *
 * <p>
 * <tt>StringTokenizer</tt> is a legacy class that is retained for
 * compatibility reasons although its use is discouraged in new code. It is
 * recommended that anyone seeking this functionality use the <tt>split</tt>
 * method of <tt>String</tt> or the java.util.regex package instead.
 * <p>
 * The following example illustrates how the <tt>String.split</tt>
 * method can be used to break up a string into its basic tokens:
 * <blockquote><pre>
 *     String[] result = "this is a test".split("\\s");
 *     for (int x=0; x&lt;result.length; x++)
 *         System.out.println(result[x]);
 * </pre></blockquote>
 * <p>
 * prints the following output:
 * <blockquote><pre>
 *     this
 *     is
 *     a
 *     test
 * </pre></blockquote>
 */

public class StringTokenizer implements Enumeration<Object> {}

This is a legacy class kept for compatibility; new code generally uses the split method of String instead.
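Swapping one for the other is not always a drop-in change. The sketch below compares the default StringTokenizer with String.split on input containing repeated whitespace. The class name WhitespaceSplitDemo and its two helper methods are made up for this illustration, and the `\\s+` pattern plus trim() is just one common way to approximate the tokenizer's behaviour, not an official equivalent.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class, not part of the JDK.
class WhitespaceSplitDemo {
    // Collects every token produced with the default whitespace delimiters.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(text);
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    // Splits on runs of whitespace; trim() avoids a leading empty string.
    static List<String> splitOnWhitespace(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    public static void main(String[] args) {
        String text = "this  is a\ttest"; // note the two spaces after "this"
        System.out.println(tokenize(text));                   // [this, is, a, test]
        System.out.println(splitOnWhitespace(text));          // [this, is, a, test]
        // A single-character pattern yields an empty string between the two spaces.
        System.out.println(Arrays.asList(text.split("\\s"))); // [this, , is, a, test]
    }
}
```

StringTokenizer never returns empty tokens, whereas split with a single-character pattern does, so which one fits depends on whether empty fields carry meaning in the input.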

Constructors:

// str: the string to split; delim: the delimiter characters; returnDelims: whether delimiters are returned as tokens
public StringTokenizer(String str, String delim, boolean returnDelims) {
    ...
}
// returnDelims defaults to false
public StringTokenizer(String str, String delim) {
    this(str, delim, false);
}
// Uses space, tab, newline, carriage return, and form feed as delimiters; returnDelims defaults to false
public StringTokenizer(String str) {
    this(str, " \t\n\r\f", false);
}
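As a quick check of how the three constructors differ, here is a minimal sketch that feeds the same short string to each of them. ConstructorDemo and its drain helper are hypothetical names introduced only for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class, not part of the JDK.
class ConstructorDemo {
    // Reads a tokenizer to the end and returns its tokens in order.
    static List<String> drain(StringTokenizer st) {
        List<String> tokens = new ArrayList<>();
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String input = "a=1;b=2";
        // returnDelims = true: each delimiter character is a token of its own.
        System.out.println(drain(new StringTokenizer(input, "=;", true))); // [a, =, 1, ;, b, =, 2]
        // returnDelims defaults to false: delimiters only separate tokens.
        System.out.println(drain(new StringTokenizer(input, "=;")));       // [a, 1, b, 2]
        // Default delimiters are whitespace, so nothing is split here.
        System.out.println(drain(new StringTokenizer(input)));             // [a=1;b=2]
    }
}
```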

Main methods

// Whether any more tokens are available
public boolean hasMoreTokens() {
    newPosition = skipDelimiters(currentPosition);
    return (newPosition < maxPosition);
}
// Same as hasMoreTokens()
public boolean hasMoreElements() {
    return hasMoreTokens();
}
// Returns the next token
public String nextToken() {
    ...
}
// Same as nextToken(), but declared to return Object
public Object nextElement() {
    return nextToken();
}
/**
 * Calculates the number of times that this tokenizer's
 * nextToken method can be called before it generates an
 * exception.
 */
public int countTokens() {
    ...
}
// Switches the delimiter set to the characters in delim, then returns the next token in this string tokenizer's string.
public String nextToken(String delim) {
    ...
}
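The methods above can be exercised with a small sketch like the following. TokenizerMethodsDemo and parseTimestamp are hypothetical names, and the timestamp layout is only an assumed sample input. It shows that countTokens() reports what is left without advancing, that nextToken(String delim) changes the delimiter set for later calls as well, and that nextToken() throws NoSuchElementException once the input is exhausted.

```java
import java.util.NoSuchElementException;
import java.util.StringTokenizer;

// Hypothetical demo class, not part of the JDK.
class TokenizerMethodsDemo {
    // Returns {year, month, day, hour, minute} for input such as "2020-10-21 12:30".
    static String[] parseTimestamp(String text) {
        StringTokenizer st = new StringTokenizer(text, "- ");
        String year = st.nextToken();
        String month = st.nextToken();
        String day = st.nextToken();
        // Switch the delimiter set; it remains in effect for the calls that follow.
        String hour = st.nextToken(" :");
        String minute = st.nextToken();
        return new String[] {year, month, day, hour, minute};
    }

    public static void main(String[] args) {
        StringTokenizer st = new StringTokenizer("a,b,c", ",");
        System.out.println(st.countTokens());   // 3
        System.out.println(st.nextToken());     // a
        System.out.println(st.countTokens());   // 2, countTokens() itself does not consume anything
        st.nextToken();
        st.nextToken();
        System.out.println(st.hasMoreTokens()); // false
        try {
            st.nextToken();
        } catch (NoSuchElementException e) {
            System.out.println("no more tokens");
        }
        System.out.println(String.join("|", parseTimestamp("2020-10-21 12:30"))); // 2020|10|21|12|30
    }
}
```

One thing to watch when switching delimiters: the old delimiter that ended the previous token has not been skipped yet, so the new set should still contain it (here the space), otherwise it ends up at the front of the next token.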

Example

import java.util.StringTokenizer;

public class Test {
    public static void main(String[] args) {
        String str = "aaa@b    bb##ccc%ddd&eee*&";
        // arguments: string to split, delimiters, whether to return delimiters
        StringTokenizer st1 = new StringTokenizer(str, "@#%&*", true);
        // arguments: string to split, delimiters
        StringTokenizer st2 = new StringTokenizer(str, "@#%&*");
        // argument: string to split only
        StringTokenizer st3 = new StringTokenizer(str);
        System.out.println("StringTokenizer constructor 1 test: " + st1.countTokens() + " tokens");
        while (st1.hasMoreTokens()) {
            System.out.println(st1.nextToken());
        }
        System.out.println();
        System.out.println("StringTokenizer constructor 2 test: " + st2.countTokens() + " tokens");
        while (st2.hasMoreTokens()) {
            System.out.println(st2.nextToken());
        }
        System.out.println();
        System.out.println("StringTokenizer constructor 3 test: " + st3.countTokens() + " tokens");
        while (st3.hasMoreTokens()) {
            System.out.println(st3.nextToken());
        }
        System.out.println();
        System.out.println("String.split test:");
        String[] strArr = str.split("##");
        for (int i = 0; i < strArr.length; i++) {
            System.out.println(strArr[i]);
        }
    }
}

Output

StringTokenizer constructor 1 test: 12 tokens
aaa
@
b    bb
#
#
ccc
%
ddd
&
eee
*
&

StringTokenizer constructor 2 test: 5 tokens
aaa
b    bb
ccc
ddd
eee

StringTokenizer constructor 3 test: 2 tokens
aaa@b
bb##ccc%ddd&eee*&

String.split test:
aaa@b    bb
ccc%ddd&eee*&

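From this we can see:

In StringTokenizer, the delimiter string is a set of characters: every character in it is a delimiter in its own right, and the string is never matched as a whole.

With String.split, the delimiter string is matched as a whole (strictly speaking, as a regular expression), so "##" splits only where two # characters appear together.

So the two really do behave differently; otherwise the StringTokenizer utility class would not have been kept.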
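The difference can be seen side by side in a short sketch. DelimiterSemanticsDemo and its helpers are hypothetical names for this illustration; Pattern.quote is used here on the assumption that the delimiter is meant literally rather than as a regular expression.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the JDK.
class DelimiterSemanticsDemo {
    // StringTokenizer: every character of delimiterChars is a separate delimiter.
    static List<String> byCharacterSet(String text, String delimiterChars) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(text, delimiterChars);
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    // String.split: the whole delimiter is matched; quote it to disable regex syntax.
    static List<String> byLiteral(String text, String literal) {
        return Arrays.asList(text.split(Pattern.quote(literal)));
    }

    public static void main(String[] args) {
        String text = "a#b##c";
        System.out.println(byCharacterSet(text, "##")); // [a, b, c]
        System.out.println(byLiteral(text, "##"));      // [a#b, c]
        // An unquoted "." is a regex that matches any character, so nothing survives.
        System.out.println("a.b.c".split(".").length);  // 0
        System.out.println(byLiteral("a.b.c", "."));    // [a, b, c]
    }
}
```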
