A Walkthrough of the java.lang.String Class Source Code

Posted aben-blog


  • The String class implements three interfaces, java.io.Serializable, Comparable<String>, and CharSequence, and is declared final.
    public final class String
        implements java.io.Serializable, Comparable<String>, CharSequence {
  1. String is backed by a char[] array
        /** The value is used for character storage. */
        private final char value[];
    
        /** Cache the hash code for the string */
        private int hash; // Default to 0

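    value[] stores the string's characters. The field is private and final, and String never modifies the array after construction, so the content cannot change once the object has been created. Assigning a new value to a String variable simply points the variable at a different String instance.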
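
The effect of this immutability can be observed from ordinary code. The following is a minimal sketch; the class and method names are made up for illustration and are not part of the JDK.

```java
class ImmutabilityDemo {
    // Returns true when concat() produced a new object and the
    // original instance kept its content.
    static boolean concatLeavesOriginalUntouched() {
        String original = "abc";
        String alias = original;            // second reference to the same instance
        original = original.concat("def");  // rebinds the variable to a new String
        return alias.equals("abc")
                && original.equals("abcdef")
                && alias != original;
    }

    public static void main(String[] args) {
        System.out.println(concatLeavesOriginalUntouched());
    }
}
```

The alias still sees "abc" because concat never touches the existing char[]; it builds a new String instead.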

  2. Serialization support

        /** use serialVersionUID from JDK 1.0.2 for interoperability */
        private static final long serialVersionUID = -6849794470754667710L;
    
        /**
         * Class String is special cased within the Serialization Stream Protocol.
         *
         * A String instance is written into an ObjectOutputStream according to
         * <a href="{@docRoot}/../platform/serialization/spec/output.html">
         * Object Serialization Specification, Section 6.2, "Stream Elements"</a>
         */
        private static final ObjectStreamField[] serialPersistentFields =
            new ObjectStreamField[0];

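    serialVersionUID records the serialization version of the class. serialPersistentFields lists the fields that should be serialized; for String it is an empty array, because String is special-cased in the serialization stream protocol, as the comment in the source notes.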
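
From the caller's side, a String is serialized like any other Serializable object. The sketch below writes a String to an in-memory stream and reads it back; the class name and the roundTrip helper are hypothetical, and checked exceptions are wrapped only to keep the example short.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

class StringSerializationDemo {
    // Serializes s to a byte array and deserializes it again.
    static String roundTrip(String s) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(s);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                return (String) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello"));
    }
}
```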

  3. String constructors

        // A String's value is immutable, so Strings representing the same text can share one char[]
        public String() {
            this.value = "".value;
        }
        public String(String original) {
            this.value = original.value;
            this.hash = original.hash;
        }
        // Arrays.copyOf allocates a new char[] and fills it with System.arraycopy
        public String(char value[]) {
            this.value = Arrays.copyOf(value, value.length);
        }
        public String(char value[], int offset, int count) {
            if (offset < 0) {
                throw new StringIndexOutOfBoundsException(offset);
            }
            if (count <= 0) {
                if (count < 0) {
                    throw new StringIndexOutOfBoundsException(count);
                }
                if (offset <= value.length) {
                    this.value = "".value;
                    return;
                }
            }
            // Note: offset or count might be near -1>>>1.
            if (offset > value.length - count) {
                throw new StringIndexOutOfBoundsException(offset + count);
            }
            this.value = Arrays.copyOfRange(value, offset, offset+count);
        }
    
        // Create a String from an array of code points
        public String(int[] codePoints, int offset, int count) {
            if (offset < 0) {
                throw new StringIndexOutOfBoundsException(offset);
            }
            if (count <= 0) {
                if (count < 0) {
                    throw new StringIndexOutOfBoundsException(count);
                }
                if (offset <= codePoints.length) {
                    this.value = "".value;
                    return;
                }
            }
            // Note: offset or count might be near -1>>>1.
            if (offset > codePoints.length - count) {
                throw new StringIndexOutOfBoundsException(offset + count);
            }
    
            final int end = offset + count;
    
            // Pass 1: Compute precise size of char[]
            int n = count;
            for (int i = offset; i < end; i++) {
                int c = codePoints[i];
                if (Character.isBmpCodePoint(c))
                    continue;
                else if (Character.isValidCodePoint(c))
                    n++;
                else throw new IllegalArgumentException(Integer.toString(c));
            }
    
            // Pass 2: Allocate and fill in char[]
            final char[] v = new char[n];
    
            for (int i = offset, j = 0; i < end; i++, j++) {
                int c = codePoints[i];
                if (Character.isBmpCodePoint(c))
                    v[j] = (char)c;
                else
                    Character.toSurrogates(c, v, j++);
            }
    
            this.value = v;
        }
        /* Common private utility method used to bounds check the byte array
         * and requested offset & length values used by the String(byte[],..)
         * constructors.
         */
        private static void checkBounds(byte[] bytes, int offset, int length) {
            if (length < 0)
                throw new StringIndexOutOfBoundsException(length);
            if (offset < 0)
                throw new StringIndexOutOfBoundsException(offset);
            if (offset > bytes.length - length)
                throw new StringIndexOutOfBoundsException(offset + length);
        }
    
        /**
         * Constructs a new {@code String} by decoding the specified subarray of
         * bytes using the specified charset.  The length of the new {@code String}
         * is a function of the charset, and hence may not be equal to the length
         * of the subarray.
         *
         * <p> The behavior of this constructor when the given bytes are not valid
         * in the given charset is unspecified.  The {@link
         * java.nio.charset.CharsetDecoder} class should be used when more control
         * over the decoding process is required.
         *
         * @param  bytes
         *         The bytes to be decoded into characters
         *
         * @param  offset
         *         The index of the first byte to decode
         *
         * @param  length
         *         The number of bytes to decode
         *
         * @param  charsetName
         *         The name of a supported {@linkplain java.nio.charset.Charset
         *         charset}
         *
         * @throws  UnsupportedEncodingException
         *          If the named charset is not supported
         *
         * @throws  IndexOutOfBoundsException
         *          If the {@code offset} and {@code length} arguments index
         *          characters outside the bounds of the {@code bytes} array
         *
         * @since  JDK1.1
         */
        public String(byte bytes[], int offset, int length, String charsetName)
                throws UnsupportedEncodingException {
            if (charsetName == null)
                throw new NullPointerException("charsetName");
            checkBounds(bytes, offset, length);
            this.value = StringCoding.decode(charsetName, bytes, offset, length);
        }
    
        /**
         * Constructs a new {@code String} by decoding the specified subarray of
         * bytes using the specified {@linkplain java.nio.charset.Charset charset}.
         * The length of the new {@code String} is a function of the charset, and
         * hence may not be equal to the length of the subarray.
         *
         * <p> This method always replaces malformed-input and unmappable-character
         * sequences with this charset's default replacement string.  The {@link
         * java.nio.charset.CharsetDecoder} class should be used when more control
         * over the decoding process is required.
         *
         * @param  bytes
         *         The bytes to be decoded into characters
         *
         * @param  offset
         *         The index of the first byte to decode
         *
         * @param  length
         *         The number of bytes to decode
         *
         * @param  charset
         *         The {@linkplain java.nio.charset.Charset charset} to be used to
         *         decode the {@code bytes}
         *
         * @throws  IndexOutOfBoundsException
         *          If the {@code offset} and {@code length} arguments index
         *          characters outside the bounds of the {@code bytes} array
         *
         * @since  1.6
         */
        public String(byte bytes[], int offset, int length, Charset charset) {
            if (charset == null)
                throw new NullPointerException("charset");
            checkBounds(bytes, offset, length);
            this.value =  StringCoding.decode(charset, bytes, offset, length);
        }
    
        /**
         * Constructs a new {@code String} by decoding the specified array of bytes
         * using the specified {@linkplain java.nio.charset.Charset charset}.  The
         * length of the new {@code String} is a function of the charset, and hence
         * may not be equal to the length of the byte array.
         *
         * <p> The behavior of this constructor when the given bytes are not valid
         * in the given charset is unspecified.  The {@link
         * java.nio.charset.CharsetDecoder} class should be used when more control
         * over the decoding process is required.
         *
         * @param  bytes
         *         The bytes to be decoded into characters
         *
         * @param  charsetName
         *         The name of a supported {@linkplain java.nio.charset.Charset
         *         charset}
         *
         * @throws  UnsupportedEncodingException
         *          If the named charset is not supported
         *
         * @since  JDK1.1
         */
        public String(byte bytes[], String charsetName)
                throws UnsupportedEncodingException {
            this(bytes, 0, bytes.length, charsetName);
        }
    
        /**
         * Constructs a new {@code String} by decoding the specified array of
         * bytes using the specified {@linkplain java.nio.charset.Charset charset}.
         * The length of the new {@code String} is a function of the charset, and
         * hence may not be equal to the length of the byte array.
         *
         * <p> This method always replaces malformed-input and unmappable-character
         * sequences with this charset's default replacement string.  The {@link
         * java.nio.charset.CharsetDecoder} class should be used when more control
         * over the decoding process is required.
         *
         * @param  bytes
         *         The bytes to be decoded into characters
         *
         * @param  charset
         *         The {@linkplain java.nio.charset.Charset charset} to be used to
         *         decode the {@code bytes}
         *
         * @since  1.6
         */
        public String(byte bytes[], Charset charset) {
            this(bytes, 0, bytes.length, charset);
        }
    
        /**
         * Constructs a new {@code String} by decoding the specified subarray of
         * bytes using the platform's default charset.  The length of the new
         * {@code String} is a function of the charset, and hence may not be equal
         * to the length of the subarray.
         *
         * <p> The behavior of this constructor when the given bytes are not valid
         * in the default charset is unspecified.  The {@link
         * java.nio.charset.CharsetDecoder} class should be used when more control
         * over the decoding process is required.
         *
         * @param  bytes
         *         The bytes to be decoded into characters
         *
         * @param  offset
         *         The index of the first byte to decode
         *
         * @param  length
         *         The number of bytes to decode
         *
         * @throws  IndexOutOfBoundsException
         *          If the {@code offset} and the {@code length} arguments index
         *          characters outside the bounds of the {@code bytes} array
         *
         * @since  JDK1.1
         */
        public String(byte bytes[], int offset, int length) {
            checkBounds(bytes, offset, length);
            this.value = StringCoding.decode(bytes, offset, length);
        }
    
        /**
         * Constructs a new {@code String} by decoding the specified array of bytes
         * using the platform's default charset.  The length of the new {@code
         * String} is a function of the charset, and hence may not be equal to the
         * length of the byte array.
         *
         * <p> The behavior of this constructor when the given bytes are not valid
         * in the default charset is unspecified.  The {@link
         * java.nio.charset.CharsetDecoder} class should be used when more control
         * over the decoding process is required.
         *
         * @param  bytes
         *         The bytes to be decoded into characters
         *
         * @since  JDK1.1
         */
        public String(byte bytes[]) {
            this(bytes, 0, bytes.length);
        }
    
        /**
         * Allocates a new string that contains the sequence of characters
         * currently contained in the string buffer argument. The contents of the
         * string buffer are copied; subsequent modification of the string buffer
         * does not affect the newly created string.
         *
         * @param  buffer
         *         A {@code StringBuffer}
         */
        public String(StringBuffer buffer) {
            synchronized(buffer) {
                this.value = Arrays.copyOf(buffer.getValue(), buffer.length());
            }
        }
    
        /**
         * Allocates a new string that contains the sequence of characters
         * currently contained in the string builder argument. The contents of the
         * string builder are copied; subsequent modification of the string builder
         * does not affect the newly created string.
         *
         * <p> This constructor is provided to ease migration to {@code
         * StringBuilder}. Obtaining a string from a string builder via the {@code
         * toString} method is likely to run faster and is generally preferred.
         *
         * @param   builder
         *          A {@code StringBuilder}
         *
         * @since  1.5
         */
        public String(StringBuilder builder) {
            this.value = Arrays.copyOf(builder.getValue(), builder.length());
        }
    
        /*
        * Package private constructor which shares value array for speed.
        * this constructor is always expected to be called with share==true.
        * a separate constructor is needed because we already have a public
        * String(char[]) constructor that makes a copy of the given char[].
        */
        String(char[] value, boolean share) {
            // assert share : "unshared not supported";
            this.value = value;
        }
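    Because a String is immutable, several Strings can share one char[] to represent the same text; the copy constructor shows this with this.value = original.value.
    A String can also be created from code points. Following that constructor requires some background on character encoding, in particular the difference between BMP code points and supplementary code points.
    Every String created from a byte[] obtains its char[] from StringCoding.decode.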
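
A few of these constructors can be exercised as follows. This is only a sketch: the class and method names are illustrative, and U+20BB7 was picked as an arbitrary code point outside the BMP.

```java
import java.nio.charset.StandardCharsets;

class ConstructorDemo {
    // String(char[]) copies the array, so later writes to the
    // source array are not visible through the String.
    static String fromCharsThenMutateSource() {
        char[] source = {'a', 'b', 'c'};
        String s = new String(source);
        source[0] = 'x';
        return s;
    }

    // String(int[], int, int): a supplementary code point expands
    // to two chars, so two code points yield three chars here.
    static int lengthFromCodePoints() {
        int[] codePoints = {0x41, 0x20BB7}; // 'A' and U+20BB7
        return new String(codePoints, 0, codePoints.length).length();
    }

    // String(byte[], Charset): the bytes are decoded into chars.
    static String fromUtf8Bytes() {
        byte[] utf8 = {(byte) 0xE4, (byte) 0xBD, (byte) 0xA0}; // UTF-8 for U+4F60
        return new String(utf8, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(fromCharsThenMutateSource());
        System.out.println(lengthFromCodePoints());
        System.out.println(fromUtf8Bytes().equals("\u4F60"));
    }
}
```
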
  4. Commonly used String methods

        public int length() {
            return value.length;
        }

        public boolean isEmpty() {
            return value.length == 0;
        }

        public char charAt(int index) {
            if ((index < 0) || (index >= value.length)) {
                throw new StringIndexOutOfBoundsException(index);
            }
            return value[index];
        }

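    As the source shows, length() returns the number of chars, not the number of characters. A String stores its text in the char array as UTF-16, so a rare character such as those listed at http://www.qqxiuzi.cn/zh/hanzi-unicode-bianma.php?zfj=kzb needs two chars (a surrogate pair). length() therefore does not always equal the number of characters.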
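
The gap between chars and characters is easy to reproduce. The sketch below uses U+20BB7 as an assumed example of a rare character outside the BMP; the class and member names are illustrative.

```java
class LengthDemo {
    // One character, built from a single supplementary code point.
    static final String RARE = new String(Character.toChars(0x20BB7));

    // Number of UTF-16 code units, which is what length() reports.
    static int charLength() {
        return RARE.length();
    }

    // charAt(0) returns only the first half of the surrogate pair.
    static boolean firstCharIsHighSurrogate() {
        return Character.isHighSurrogate(RARE.charAt(0));
    }

    public static void main(String[] args) {
        System.out.println(charLength());
        System.out.println(firstCharIsHighSurrogate());
    }
}
```

Code that indexes with charAt should therefore be prepared to meet half of a surrogate pair.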

  5. Code point methods

        /**
         * Returns the character (Unicode code point) at the specified
         * index. The index refers to {@code char} values
         * (Unicode code units) and ranges from {@code 0} to
         * {@link #length()}{@code  - 1}.
         *
         * <p> If the {@code char} value specified at the given index
         * is in the high-surrogate range, the following index is less
         * than the length of this {@code String}, and the
         * {@code char} value at the following index is in the
         * low-surrogate range, then the supplementary code point
         * corresponding to this surrogate pair is returned. Otherwise,
         * the {@code char} value at the given index is returned.
         *
         * @param      index the index to the {@code char} values
         * @return     the code point value of the character at the
         *             {@code index}
         * @exception  IndexOutOfBoundsException  if the {@code index}
         *             argument is negative or not less than the length of this
         *             string.
         * @since      1.5
         */
        public int codePointAt(int index) {
            if ((index < 0) || (index >= value.length)) {
                throw new StringIndexOutOfBoundsException(index);
            }
            return Character.codePointAtImpl(value, index, value.length);
        }

        /**
         * Returns the character (Unicode code point) before the specified
         * index. The index refers to {@code char} values
         * (Unicode code units) and ranges from {@code 1} to {@link
         * CharSequence#length() length}.
         *
         * <p> If the {@code char} value at {@code (index - 1)}
         * is in the low-surrogate range, {@code (index - 2)} is not
         * negative, and the {@code char} value at {@code (index -
         * 2)} is in the high-surrogate range, then the
         * supplementary code point value of the surrogate pair is
         * returned. If the {@code char} value at {@code index -
         * 1} is an unpaired low-surrogate or a high-surrogate, the
         * surrogate value is returned.
         *
         * @param     index the index following the code point that should be returned
         * @return    the Unicode code point value before the given index.
         * @exception IndexOutOfBoundsException if the {@code index}
         *            argument is less than 1 or greater than the length
         *            of this string.
         * @since     1.5
         */
        public int codePointBefore(int index) {
            int i = index - 1;
            if ((i < 0) || (i >= value.length)) {
                throw new StringIndexOutOfBoundsException(index);
            }
            return Character.codePointBeforeImpl(value, index, 0);
        }

        /**
         * Returns the number of Unicode code points in the specified text
         * range of this {@code String}. The text range begins at the
         * specified {@code beginIndex} and extends to the
         * {@code char} at index {@code endIndex - 1}. Thus the
         * length (in {@code char}s) of the text range is
         * {@code endIndex-beginIndex}. Unpaired surrogates within
         * the text range count as one code point each.
         *
         * @param beginIndex the index to the first {@code char} of
         * the text range.
         * @param endIndex the index after the last {@code char} of
         * the text range.
         * @return the number of Unicode code points in the specified text
         * range
         * @exception IndexOutOfBoundsException if the
         * {@code beginIndex} is negative, or {@code endIndex}
         * is larger than the length of this {@code String}, or
         * {@code beginIndex} is larger than {@code endIndex}.
         * @since  1.5
         */
        public int codePointCount(int beginIndex, int endIndex) {
            if (beginIndex < 0 || endIndex > value.length || beginIndex > endIndex) {
                throw new IndexOutOfBoundsException();
            }
            return Character.codePointCountImpl(value, beginIndex, endIndex - beginIndex);
        }

        /**
         * Returns the index within this {@code String} that is
         * offset from the given {@code index} by
         * {@code codePointOffset} code points. Unpaired surrogates
         * within the text range given by {@code index} and
         * {@code codePointOffset} count as one code point each.
         *
         * @param index the index to be offset
         * @param codePointOffset the offset in code points
         * @return the index within this {@code String}
         * @exception IndexOutOfBoundsException if {@code index}
         *   is negative or larger then the length of this
         *   {@code String}, or if {@code codePointOffset} is positive
         *   and the substring starting with {@code index} has fewer
         *   than {@code codePointOffset} code points,
         *   or if {@code codePointOffset} is negative and the substring
         *   before {@code index} has fewer than the absolute value
         *   of {@code codePointOffset} code points.
         * @since 1.5
         */
        public int offsetByCodePoints(int index, int codePointOffset) {
            if (index < 0 || index > value.length) {
                throw new IndexOutOfBoundsException();
            }
            return Character.offsetByCodePointsImpl(value, 0, value.length,
                    index, codePointOffset);
        }

    As noted in point 4, length() does not give the number of characters in a String. Use codePointCount(0, str.length()) to count the Unicode code points instead.
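
The code point methods can be tried on a string that mixes BMP and supplementary characters. In this sketch the sample text and all names are assumptions made for illustration.

```java
class CodePointDemo {
    // 'a', one supplementary character (U+20BB7), then 'b'.
    static final String TEXT = "a" + new String(Character.toChars(0x20BB7)) + "b";

    static int charCount() {
        return TEXT.length();
    }

    static int codePointCount() {
        return TEXT.codePointCount(0, TEXT.length());
    }

    // Reads the whole surrogate pair starting at char index 1.
    static int secondCodePoint() {
        return TEXT.codePointAt(1);
    }

    // Char index reached after skipping two code points from index 0.
    static int indexOfThirdCodePoint() {
        return TEXT.offsetByCodePoints(0, 2);
    }

    public static void main(String[] args) {
        System.out.println(charCount() + " chars, " + codePointCount() + " code points");
        System.out.println(Integer.toHexString(secondCodePoint()));
        System.out.println(TEXT.charAt(indexOfThirdCodePoint()));
    }
}
```

Here the string holds four chars but three code points, and the third code point starts at char index 3.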

  6. Copying chars

        void getChars(char dst[], int dstBegin) {
            System.arraycopy(value, 0, dst, dstBegin, value.length);
        }

        public void getChars(int srcBegin, int srcEnd, char dst[], int dstBegin) {
            if (srcBegin < 0) {
                throw new StringIndexOutOfBoundsException(srcBegin);
            }
            if (srcEnd > value.length) {
                throw new StringIndexOutOfBoundsException(srcEnd);
            }
            if (srcBegin > srcEnd) {
                throw new StringIndexOutOfBoundsException(srcEnd - srcBegin);
            }
            System.arraycopy(value, srcBegin, dst, dstBegin, srcEnd - srcBegin);
        }

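    getChars copies chars out of the String's value array. The first overload has default (package-private) access, so it can only be called from inside the java.lang package. Both overloads are implemented with System.arraycopy.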
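
Only the four-argument overload is callable from application code. A small sketch of it follows; the class name, the sample text, and the '-' filler are chosen purely for illustration.

```java
import java.util.Arrays;

class GetCharsDemo {
    // Copies "world" (source indexes 6..10) into the middle of a
    // pre-filled destination array.
    static char[] copyMiddle() {
        String s = "hello world";
        char[] dst = new char[7];
        Arrays.fill(dst, '-');
        s.getChars(6, 11, dst, 1); // srcEnd is exclusive
        return dst;
    }

    public static void main(String[] args) {
        System.out.println(new String(copyMiddle()));
    }
}
```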

  7. Encoding a String to bytes

        public byte[] getBytes(String charsetName)
                throws UnsupportedEncodingException {
            if (charsetName == null) throw new NullPointerException();
            return StringCoding.encode(charsetName, value, 0, value.length);
        }

        public byte[] getBytes(Charset charset) {
            if (charset == null) throw new NullPointerException();
            return StringCoding.encode(charset, value, 0, value.length);
        }

        public byte[] getBytes() {
            return StringCoding.encode(value, 0, value.length);
        }

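    String provides three getBytes overloads. The first two take the charset as an argument, by name or as a Charset object; the third encodes with the platform's default charset.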
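
Because the byte length depends on the charset, passing the charset explicitly usually gives more predictable results than relying on the platform default. The sketch below compares two encodings of the same two-character text; the names are illustrative.

```java
import java.nio.charset.StandardCharsets;

class GetBytesDemo {
    // Two BMP characters, written as escapes so the source file
    // encoding does not matter.
    static final String TEXT = "\u4F60\u597D";

    // Each of these characters takes three bytes in UTF-8.
    static int utf8Length() {
        return TEXT.getBytes(StandardCharsets.UTF_8).length;
    }

    // UTF-16BE writes two bytes per char and no byte order mark.
    static int utf16Length() {
        return TEXT.getBytes(StandardCharsets.UTF_16BE).length;
    }

    // Decoding with the same charset restores the original text.
    static boolean roundTrips() {
        byte[] bytes = TEXT.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.UTF_8).equals(TEXT);
    }

    public static void main(String[] args) {
        System.out.println(utf8Length() + " bytes in UTF-8");
        System.out.println(utf16Length() + " bytes in UTF-16BE");
        System.out.println(roundTrips());
    }
}
```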

  8. The equals methods

        // Compares the two char arrays index by index
        public boolean equals(Object anObject) {
            if (this == anObject) {
                return true;
            }
            if (anObject instanceof String) {
                String anotherString = (String)anObject;
                int n = value.length;
                if (n == anotherString.value.length) {
                    char v1[] = value;
                    char v2[] = anotherString.value;
                    int i = 0;
                    while (n-- != 0) {
                        if (v1[i] != v2[i])
                            return false;
                        i++;
                    }
                    return true;
                }
            }
            return false;
        }

        // Ends up synchronizing on the argument, because StringBuffer is a synchronized class
        public boolean contentEquals(StringBuffer sb) {
            return contentEquals((CharSequence)sb);
        }

        // No synchronization here (StringBuilder is unsynchronized); compares the char arrays index by index
        private boolean nonSyncContentEquals(AbstractStringBuilder sb) {
            char v1[] = value;
            char v2[] = sb.getValue();
            int n = v1.length;
            if (n != sb.length()) {
                return false;
            }
            for (int i = 0; i < n; i++) {
                if (v1[i] != v2[i]) {
                    return false;
                }
            }
            return true;
        }

        // A CharSequence is handled much like a char[]: the chars are compared index by index
        public boolean contentEquals(CharSequence cs) {
            // Argument is a StringBuffer, StringBuilder
            if (cs instanceof AbstractStringBuilder) {
                if (cs instanceof StringBuffer) {
                    synchronized(cs) {
                       return nonSyncContentEquals((AbstractStringBuilder)cs);
                    }
                } else {
                    return nonSyncContentEquals((AbstractStringBuilder)cs);
                }
            }
            // Argument is a String
            if (cs instanceof String) {
                return equals(cs);
            }
            // Argument is a generic CharSequence
            char v1[] = value;
            int n = v1.length;
            if (n != cs.length()) {
                return false;
            }
            for (int i = 0; i < n; i++) {
                if (v1[i] != cs.charAt(i)) {
                    return false;
                }
            }
            return true;
        }

        // Unlike the methods above, which compare chars index by index themselves,
        // this method delegates the comparison to regionMatches
        public boolean equalsIgnoreCase(String anotherString) {
            return (this == anotherString) ? true
                    : (anotherString != null)
                    && (anotherString.value.length == value.length)
                    && regionMatches(true, 0, anotherString, 0, value.length);
        }

        // Matches the given ranges of the two strings index by index
        public boolean regionMatches(boolean ignoreCase, int toffset,
                String other, int ooffset, int len) {
            char ta[] = value;
            int to = toffset;
            char pa[] = other.value;
            int po = ooffset;
            // Note: toffset, ooffset, or len might be near -1>>>1.
            if ((ooffset < 0) || (toffset < 0)
                    || (toffset > (long)value.length - len)
                    || (ooffset > (long)other.value.length - len)) {
                return false;
            }
            while (len-- > 0) {
                char c1 = ta[to++];
                char c2 = pa[po++];
                if (c1 == c2) {
                    continue;
                }
                if (ignoreCase) {
                    // If characters don't match but case may be ignored,
                    // try converting both characters to uppercase.
                    // If the results match, then the comparison scan should
                    // continue.
                    char u1 = Character.toUpperCase(c1);
                    char u2 = Character.toUpperCase(c2);
                    if (u1 == u2) {
                        continue;
                    }
                    // Unfortunately, conversion to uppercase does not work properly
                    // for the Georgian alphabet, which has strange rules about case
                    // conversion.  So we need to make one last check before
                    // exiting.
                    if (Character.toLowerCase(u1) == Character.toLowerCase(u2)) {
                        continue;
                    }
                }
                return false;
            }
            return true;
        }

        public boolean startsWith(String prefix, int toffset) {
            char ta[] = value;
            int to = toffset;
            char pa[] = prefix.value;
            int po = 0;
            int pc = prefix.value.length;
            // Note: toffset might be near -1>>>1.
            if ((toffset < 0) || (toffset > value.length - pc)) {
                return false;
            }
            while (--pc >= 0) {
                if (ta[to++] != pa[po++]) {
                    return false;
                }
            }
            return true;
        }

        public boolean startsWith(String prefix) {
            return startsWith(prefix, 0);
        }

        public boolean endsWith(String suffix) {
            return startsWith(suffix, value.length - suffix.value.length);
        }
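
The differences between these comparison methods show up in a few short calls. This is a minimal sketch with hypothetical class and method names.

```java
class EqualsDemo {
    // equals() compares content, == compares references.
    static boolean sameContentDifferentObjects() {
        String a = "hello";
        String b = new String(a); // distinct object holding the same chars
        return a != b && a.equals(b);
    }

    // contentEquals() accepts any CharSequence, such as a StringBuilder.
    static boolean matchesBuilder() {
        return "hello".contentEquals(new StringBuilder("hello"));
    }

    // equals() is false for a non-String argument, even with equal content.
    static boolean equalsBuilder() {
        Object builder = new StringBuilder("hello");
        return "hello".equals(builder);
    }

    // equalsIgnoreCase() and regionMatches(true, ...) ignore case.
    static boolean ignoresCase() {
        return "Hello".equalsIgnoreCase("hELLO")
                && "hello world".regionMatches(true, 6, "WORLD", 0, 5);
    }

    public static void main(String[] args) {
        System.out.println(sameContentDifferentObjects());
        System.out.println(matchesBuilder());
        System.out.println(equalsBuilder());
        System.out.println(ignoresCase());
    }
}
```
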
  9. Ordering: the compareTo methods

        // Walks both strings index by index and orders them by comparing chars
        public int compareTo(String anotherString) {
            int len1 = value.length;
            int len2 = anotherString.value.length;
            int lim = Math.min(len1, len2);
            char v1[] = value;
            char v2[] = anotherString.value;

            int k = 0;
            while (k < lim) {
                char c1 = v1[k];
                char c2 = v2[k];
                if (c1 != c2) {
                    return c1 - c2;
                }
                k++;
            }
            return len1 - len2;
        }

        // A static Comparator that ignores case
        public static final Comparator<String> CASE_INSENSITIVE_ORDER
                                             = new CaseInsensitiveComparator();
        // A private nested class: the case-insensitive string comparator
        private static class CaseInsensitiveComparator
                implements Comparator<String>, java.io.Serializable {
            // use serialVersionUID from JDK 1.2.2 for interoperability
            private static final long serialVersionUID = 8575799808933029326L;

            public int compare(String s1, String s2) {
                int n1 = s1.length();
                int n2 = s2.length();
                int min = Math.min(n1, n2);
                for (int i = 0; i < min; i++) {
                    char c1 = s1.charAt(i);
                    char c2 = s2.charAt(i);
                    if (c1 != c2) {
                        c1 = Character.toUpperCase(c1);
                        c2 = Character.toUpperCase(c2);
                        if (c1 != c2) {
                            c1 = Character.toLowerCase(c1);
                            c2 = Character.toLowerCase(c2);
                            if (c1 != c2) {
                                // No overflow because of numeric promotion
                                return c1 - c2;
                            }
                        }
                    }
                }
                return n1 - n2;
            }

            /** Replaces the de-serialized object. */
            private Object readResolve() { return CASE_INSENSITIVE_ORDER; }
        }

        // Comparison that ignores case
        public int compareToIgnoreCase(String str) {
            return CASE_INSENSITIVE_ORDER.compare(this, str);
        }

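    The private nested class CaseInsensitiveComparator defines a readResolve() method. Its purpose is to keep CaseInsensitiveComparator a singleton even across deserialization: the deserialized object is replaced by the existing CASE_INSENSITIVE_ORDER instance.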
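
Both the return values of compareTo and the singleton guarantee can be checked from outside the class. The following sketch uses an in-memory serialization round trip; the class and method names are illustrative, and checked exceptions are wrapped only for brevity.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

class CompareDemo {
    // One string is a prefix of the other: the result is the length difference.
    static int prefixDiff() {
        return "abc".compareTo("abcd");
    }

    // First differing chars decide: 'a' - 'c'.
    static int charDiff() {
        return "a".compareTo("c");
    }

    static int ignoreCase() {
        return "HELLO".compareToIgnoreCase("hello");
    }

    // Serializes the comparator and checks that deserialization
    // hands back the very same instance, thanks to readResolve().
    static boolean comparatorStaysSingleton() {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(String.CASE_INSENSITIVE_ORDER);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                return in.readObject() == String.CASE_INSENSITIVE_ORDER;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(prefixDiff());
        System.out.println(charDiff());
        System.out.println(ignoreCase());
        System.out.println(comparatorStaysSingleton());
    }
}
```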

  10. String overrides hashCode

        /**
         * s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
         */
        public int hashCode() {
            int h = hash;
            if (h == 0 && value.length > 0) {
                char val[] = value;

                for (int i = 0; i < value.length; i++) {
                    h = 31 * h + val[i];
                }
                hash = h;
            }
            return h;
        }

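    String's hashCode override uses the prime 31 as its multiplier. The result is cached in the hash field, so for a non-empty string it is computed only on the first call. The one exception is a string whose hash happens to be 0: since 0 also means "not yet computed", it is recalculated on every call.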
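    To make the formula concrete, here is a small sketch that recomputes the same polynomial outside of String. The class `PolyHashDemo` and its two methods are illustrative names, not JDK code. The second method shows a commonly cited property of 31: multiplying by it can be written as a shift and a subtraction.

```java
final class PolyHashDemo {
    /** Evaluates s[0]*31^(n-1) + ... + s[n-1] with Horner's rule, like the loop above. */
    static int polyHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i); // int overflow simply wraps around
        }
        return h;
    }

    /** Same recurrence, using the identity 31 * h == (h << 5) - h. */
    static int polyHashShift(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = (h << 5) - h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        // 'a' is 97 and 'b' is 98, so "ab" hashes to 97 * 31 + 98 = 3105.
        System.out.println(polyHash("ab"));
        System.out.println(polyHash("hello") == "hello".hashCode());
        System.out.println(polyHashShift("hello") == "hello".hashCode());
    }
}
```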

  11. indexOf and related methods: looking up an index by value

        public int indexOf(int ch) {
            return indexOf(ch, 0);
        }

        /**
         * Returns the index within this string of the first occurrence of the
         * specified character, starting the search at the specified index.
         * <p>
         * If a character with value {@code ch} occurs in the
         * character sequence represented by this {@code String}
         * object at an index no smaller than {@code fromIndex}, then
         * the index of the first such occurrence is returned. For values
         * of {@code ch} in the range from 0 to 0xFFFF (inclusive),
         * this is the smallest value <i>k</i> such that:
         * <blockquote><pre>
         * (this.charAt(<i>k</i>) == ch) {@code &&} (<i>k</i> &gt;= fromIndex)
         * </pre></blockquote>
         * is true. For other values of {@code ch}, it is the
         * smallest value <i>k</i> such that:
         * <blockquote><pre>
         * (this.codePointAt(<i>k</i>) == ch) {@code &&} (<i>k</i> &gt;= fromIndex)
         * </pre></blockquote>
         * is true. In either case, if no such character occurs in this
         * string at or after position {@code fromIndex}, then
         * {@code -1} is returned.
         *
         * <p>
         * There is no restriction on the value of {@code fromIndex}. If it
         * is negative, it has the same effect as if it were zero: this entire
         * string may be searched. If it is greater than the length of this
         * string, it has the same effect as if it were equal to the length of
         * this string: {@code -1} is returned.
         *
         * <p>All indices are specified in {@code char} values
         * (Unicode code units).
         *
         * @param   ch          a character (Unicode code point).
         * @param   fromIndex   the index to start the search from.
         * @return  the index of the first occurrence of the character in the
         *          character sequence represented by this object that is greater
         *          than or equal to {@code fromIndex}, or {@code -1}
         *          if the character does not occur.
         */
        public int indexOf(int ch, int fromIndex) {
            final int max = value.length;
            if (fromIndex < 0) {
                fromIndex = 0;
            } else if (fromIndex >= max) {
                // Note: fromIndex might be near -1>>>1.
                return -1;
            }

            if (ch < Character.MIN_SUPPLEMENTARY_CODE_POINT) {
                // handle most cases here (ch is a BMP code point or a
                // negative value (invalid code point))
                final char[] value = this.value;
                for (int i = fromIndex; i < max; i++) {
                    if (value[i] == ch) {
                        return i;
                    }
                }
                return -1;
            } else {
                return indexOfSupplementary(ch, fromIndex);
            }
        }

        /**
         * Handles (rare) calls of indexOf with a supplementary character.
         */
        private int indexOfSupplementary(int ch, int fromIndex) {
            if (Character.isValidCodePoint(ch)) {
                final char[] value = this.value;
                final char hi = Character.highSurrogate(ch);
                final char lo = Character.lowSurrogate(ch);
                final int max = value.length - 1;
                for (int i = fromIndex; i < max; i++) {
                    if (value[i] == hi && value[i + 1] == lo) {
                        return i;
                    }
                }
            }
            return -1;
        }

        /**
         * Returns the index within this string of the last occurrence of
         * the specified character. For values of {@code ch} in the
         * range from 0 to 0xFFFF (inclusive), the index (in Unicode code
         * units) returned is the largest value <i>k</i> such that:
         * <blockquote><pre>
         * this.charAt(<i>k</i>) == ch
         * </pre></blockquote>
         * is true. For other values of {@code ch}, it is the
         * largest value <i>k</i> such that:
         * <blockquote><pre>
         * this.codePointAt(<i>k</i>) == ch
         * </pre></blockquote>
         * is true.  In either case, if no such character occurs in this
         * string, then {@code -1} is returned.  The
         * {@code String} is searched backwards starting at the last
         * character.
         *
         * @param   ch   a character (Unicode code point).
         * @return  the index of the last occurrence of the character in the
         *          character sequence represented by this object, or
         *          {@code -1} if the character does not occur.
         */
        public int lastIndexOf(int ch) {
            return lastIndexOf(ch, value.length - 1);
        }

        /**
         * Returns the index within this string of the last occurrence of
         * the specified character, searching backward starting at the
         * specified index. For values of {@code ch} in the range
         * from 0 to 0xFFFF (inclusive), the index returned is the largest
         * value <i>k</i> such that:
         * <blockquote><pre>
         * (this.charAt(<i>k</i>) == ch) {@code &&} (<i>k</i> &lt;= fromIndex)
         * </pre></blockquote>
         * is true. For other values of {@code ch}, it is the
         * largest value <i>k</i> such that:
         * <blockquote><pre>
         * (this.codePointAt(<i>k</i>) == ch) {@code &&} (<i>k</i> &lt;= fromIndex)
         * </pre></blockquote>
         * is true. In either case, if no such character occurs in this
         * string at or before position {@code fromIndex}, then
         * {@code -1} is returned.
         *
         * <p>All indices are specified in {@code char} values
         * (Unicode code units).
         *
         * @param   ch          a character (Unicode code point).
         * @param   fromIndex   the index to start the search from. There is no
         *          restriction on the value of {@code fromIndex}. If it is
         *          greater than or equal to the length of this string, it has
         *          the same effect as if it were equal to one less than the
         *          length of this string: this entire string may be searched.
         *          If it is negative, it has the same effect as if it were -1:
         *          -1 is returned.
         * @return  the index of the last occurrence of the character in the
         *          character sequence represented by this object that is less
         *          than or equal to {@code fromIndex}, or {@code -1}
         *          if the character does not occur before that point.
         */
        public int lastIndexOf(int ch, int fromIndex) {
            if (ch < Character.MIN_SUPPLEMENTARY_CODE_POINT) {
                // handle most cases here (ch is a BMP code point or a
                // negative value (invalid code point))
                final char[] value = this.value;
                int i = Math.min(fromIndex, value.length - 1);
                for (; i >= 0; i--) {
                    if (value[i] == ch) {
                        return i;
                    }
                }
                return -1;
            } else {
                return lastIndexOfSupplementary(ch, fromIndex);
            }
        }

        /**
         * Handles (rare) calls of lastIndexOf with a supplementary character.
         */
        private int lastIndexOfSupplementary(int ch, int fromIndex) {
            if (Character.isValidCodePoint(ch)) {
                final char[] value = this.value;
                char hi = Character.highSurrogate(ch);
                char lo = Character.lowSurrogate(ch);
                int i = Math.min(fromIndex, value.length - 2);
                for (; i >= 0; i--) {
                    if (value[i] == hi && value[i + 1] == lo) {
                        return i;
                    }
                }
            }
            return -1;
        }

        /**
         * Returns the index within this string of the first occurrence of the
         * specified substring.
         *
         * <p>The returned index is the smallest value <i>k</i> for which:
         * <blockquote><pre>
         * this.startsWith(str, <i>k</i>)
         * </pre></blockquote>
         * If no such value of <i>k</i> exists, then {@code -1} is returned.
         *
         * @param   str   the substring to search for.
         * @return  the index of the first occurrence of the specified substring,
         *          or {@code -1} if there is no such occurrence.
         */
        public int indexOf(String str) {
            return indexOf(str, 0);
        }

        /**
         * Returns the index within this string of the first occurrence of the
         * specified substring, starting at the specified index.
         *
         * <p>The returned index is the smallest value <i>k</i> for which:
         * <blockquote><pre>
         * <i>k</i> &gt;= fromIndex {@code &&} this.startsWith(str, <i>k</i>)
         * </pre></blockquote>
         * If no such value of <i>k</i> exists, then {@code -1} is returned.
         *
         * @param   str         the substring to search for.
         * @param   fromIndex   the index from which to start the search.
         * @return  the index of the first occurrence of the specified substring,
         *          starting at the specified index,
         *          or {@code -1} if there is no such occurrence.
         */
        public int indexOf(String str, int fromIndex) {
            return indexOf(value, 0, value.length,
                    str.value, 0, str.value.length, fromIndex);
        }

        /**
         * Code shared by String and AbstractStringBuilder to do searches. The
         * source is the character array being searched, and the target
         * is the string being searched for.
         *
         * @param   source       the characters being searched.
         * @param   sourceOffset offset of the source string.
         * @param   sourceCount  count of the source string.
         * @param   target       the characters being searched for.
         * @param   fromIndex    the index to begin searching from.
         */
        static int indexOf(char[] source, int sourceOffset, int sourceCount,
                String target, int fromIndex) {
            return indexOf(source, sourceOffset, sourceCount,
                           target.value, 0, target.value.length,
                           fromIndex);
        }

        /**
         * Code shared by String and StringBuffer to do searches. The
         * source is the character array being searched, and the target
         * is the string being searched for.
         *
         * @param   source       the characters being searched.
         * @param   sourceOffset offset of the source string.
         * @param   sourceCount  count of the source string.
         * @param   target       the characters being searched for.
         * @param   targetOffset offset of the target string.
         * @param   targetCount  count of the target string.
         * @param   fromIndex    the index to begin searching from.
         */
        static int indexOf(char[] source, int sourceOffset, int sourceCount,
                char[] target, int targetOffset, int targetCount,
                int fromIndex) {
            if (fromIndex >= sourceCount) {
                return (targetCount == 0 ? sourceCount : -1);
            }
            if (fromIndex < 0) {
                fromIndex = 0;
            }
            if (targetCount == 0) {
                return fromIndex;
            }

            char first = target[targetOffset];
            int max = sourceOffset + (sourceCount - targetCount);

            for (int i = sourceOffset + fromIndex; i <= max; i++) {
                /* Look for first character. */
                if (source[i] != first) {
                    while (++i <= max && source[i] != first);
                }

                /* Found first character, now look at the rest of v2 */
                if (i <= max) {
                    int j = i + 1;
                    int end = j + targetCount - 1;
                    for (int k = targetOffset + 1; j < end && source[j]
                            == target[k]; j++, k++);

                    if (j == end) {
                        /* Found whole string. */
                        return i - sourceOffset;
                    }
                }
            }
            return -1;
        }

        /**
         * Returns the index within this string of the last occurrence of the
         * specified substring.  The last occurrence of the empty string ""
         * is considered to occur at the index value {@code this.length()}.
         *
         * <p>The returned index is the largest value <i>k</i> for which:
         * <blockquote><pre>
         * this.startsWith(str, <i>k</i>)
         * </pre></blockquote>
         * If no such value of <i>k</i> exists, then {@code -1} is returned.
         *
         * @param   str   the substring to search for.
         * @return  the index of the last occurrence of the specified substring,
         *          or {@code -1} if there is no such occurrence.
         */
        public int lastIndexOf(String str) {
            return lastIndexOf(str, value.length);
        }

        /**
         * Returns the index within this string of the last occurrence of the
         * specified substring, searching backward starting at the specified index.
         *
         * <p>The returned index is the largest value <i>k</i> for which:
         * <blockquote><pre>
         * <i>k</i> {@code <=} fromIndex {@code &&} this.startsWith(str, <i>k</i>)
         * </pre></blockquote>
         * If no such value of <i>k</i> exists, then {@code -1} is returned.
         *
         * @param   str         the substring to search for.
         * @param   fromIndex   the index to start the search from.
         * @return  the index of the last occurrence of the specified substring,
         *          searching backward from the specified index,
         *          or {@code -1} if there is no such occurrence.
         */
        public int lastIndexOf(String str, int fromIndex) {
            return lastIndexOf(value, 0, value.length,
                    str.value, 0, str.value.length, fromIndex);
        }

        /**
         * Code shared by String and AbstractStringBuilder to do searches. The
         * source is the character array being searched, and the target
         * is the string being searched for.
         *
         * @param   source       the characters being searched.
         * @param   sourceOffset offset of the source string.
         * @param   sourceCount  count of the source string.
         * @param   target       the characters being searched for.
         * @param   fromIndex    the index to begin searching from.
         */
        static int lastIndexOf(char[] source, int sourceOffset, int sourceCount,
                String target, int fromIndex) {
            return lastIndexOf(source, sourceOffset, sourceCount,
                           target.value, 0, target.value.length,
                           fromIndex);
        }

        /**
         * Code shared by String and StringBuffer to do searches. The
         * source is the character array being searched, and the target
         * is the string being searched for.
         *
         * @param   source       the characters being searched.
         * @param   sourceOffset offset of the source string.
         * @param   sourceCount  count of the source string.
         * @param   target       the characters being searched for.
         * @param   targetOffset offset of the target string.
         * @param   targetCount  count of the target string.
         * @param   fromIndex    the index to begin searching from.
         */
        static int lastIndexOf(char[] source, int sourceOffset, int sourceCount,
                char[] target, int targetOffset, int targetCount,
                int fromIndex) {
            /*
             * Check arguments; return immediately where possible. For
             * consistency, don't check for null str.
             */
            int rightIndex = sourceCount - targetCount;
            if (fromIndex < 0) {
                return -1;
            }
            if (fromIndex > rightIndex) {
                fromIndex = rightIndex;
            }
            /* Empty string always matches. */
            if (targetCount == 0) {
                return fromIndex;
            }

            int strLastIndex = targetOffset + targetCount - 1;
            char strLastChar = target[strLastIndex];
            int min = sourceOffset + targetCount - 1;
            int i = min + fromIndex;

        startSearchForLastChar:
            while (true) {
                while (i >= min && source[i] != strLastChar) {
                    i--;
                }
                if (i < min) {
                    return -1;
                }
                int j = i - 1;
                int start = j - (targetCount - 1);
                int k = strLastIndex - 1;

                while (j > start) {
                    if (source[j--] != target[k--]) {
                        i--;
                        continue startSearchForLastChar;
                    }
                }
                return start - sourceOffset + 1;
            }
        }
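    A few of the edge cases handled in the listing above (out-of-range fromIndex, an empty search string, and supplementary code points) are easier to see with concrete calls. The sketch below uses only public String methods. The class `IndexOfDemo` and the helper `withSupplementary` are illustrative names, and U+1F600 is just an arbitrary code point above 0xFFFF.

```java
final class IndexOfDemo {
    /** Builds "a", then one supplementary code point (two chars), then "b". */
    static String withSupplementary() {
        return "a" + new String(Character.toChars(0x1F600)) + "b";
    }

    public static void main(String[] args) {
        String s = "hello";
        // A negative fromIndex is treated as 0, so the whole string is searched.
        System.out.println(s.indexOf('l', -5));      // 2
        // A fromIndex at or past the end finds nothing.
        System.out.println(s.indexOf('l', 99));      // -1
        // The empty string matches at the clamped position, here length().
        System.out.println(s.indexOf("", 99));       // 5
        System.out.println(s.lastIndexOf(""));       // 5
        // A negative fromIndex for lastIndexOf returns -1 immediately.
        System.out.println(s.lastIndexOf('l', -1));  // -1

        String t = withSupplementary();
        // The supplementary code point occupies two char slots (a surrogate pair),
        // so 'b' sits at index 3 and the string length is 4.
        System.out.println(t.length());              // 4
        System.out.println(t.indexOf(0x1F600));      // 1
        System.out.println(t.indexOf('b'));          // 3
    }
}
```

    All indices here count char values (UTF-16 code units), not code points, which is why the character after the surrogate pair is at index 3.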
  12. String operations that return a new String

        /**
         * Returns a string that is a substring of this string. The
         * substring begins with the character at the specified index and
         * extends to the end of this string. <p>
         * Examples:
         * <blockquote><pre>
         * "unhappy".substring(2) returns "happy"
         * "Harbison".substring(3) returns "bison"
         * "emptiness".substring(9) returns "" (an empty string)
         * </pre></blockquote>
         *
         * @param      beginIndex   the beginning index, inclusive.
         * @return     the specified substring.
         * @exception  IndexOutOfBoundsException  if
         *             {@code beginIndex} is negative or larger than the
         *             length of this {@code String} object.
         */
        public String substring(int beginIndex) {
            if (beginIndex < 0) {
                throw new StringIndexOutOfBoundsException(beginIndex);
            }
            int subLen = value.length - beginIndex;
            if (subLen < 0) {
                throw new StringIndexOutOfBoundsException(subLen);
            }
            return (beginIndex == 0) ? this : new String(value, beginIndex, subLen);
        }

        /**
         * Returns a string that is a substring of this string. The
         * substring begins at the specified {@code beginIndex} and
         * extends to the character at index {@code endIndex - 1}.
         * Thus the length of the substring is {@code endIndex-beginIndex}.
         * <p>
         * Examples:
         * <blockquote><pre>
         * "hamburger".substring(4, 8) returns "urge"
         * "smiles".substring(1, 5) returns "mile"
         * </pre></blockquote>
         *
         * @param      beginIndex   the beginning index, inclusive.
         * @param      endIndex     the ending index, exclusive.
         * @return     the specified substring.
         * @exception  IndexOutOfBoundsException  if the
         *             {@code beginIndex} is negative, or
         *             {@code endIndex} is larger than the length of
         *             this {@code String} object, or
         *             {@code beginIndex} is larger than
         *             {@code endIndex}.
         */
        public String substring(int beginIndex, int endIndex) {
            if (beginIndex < 0) {
                throw new StringIndexOutOfBoundsException(beginIndex);
            }
            if (endIndex > value.length) {
                throw new StringIndexOutOfBoundsException(endIndex);
            }
            int subLen = endIndex - beginIndex;
            if (subLen < 0) {
                throw new StringIndexOutOfBoundsException(subLen);
            }
            return ((beginIndex == 0) && (endIndex == value.length)) ? this
                    : new String(value, beginIndex, subLen);
        }

        /**
         * Returns a character sequence that is a subsequence of this sequence.
         *
         * <p> An invocation of this method of the form
         *
         * <blockquote><pre>
         * str.subSequence(begin,&nbsp;end)</pre></blockquote>
         *
         * behaves in exactly the same way as the invocation
         *
         * <blockquote><pre>
         * str.substring(begin,&nbsp;end)</pre></blockquote>
         *
         * @apiNote
         * This method is defined so that the {@code String} class can implement
         * the {@link CharSequence} interface.
         *
         * @param   beginIndex   the begin index, inclusive.
         * @param   endIndex     the end index, exclusive.
         * @return  the specified subsequence.
         *
         * @throws  IndexOutOfBoundsException
         *          if {@code beginIndex} or {@code endIndex} is negative,
         *          if {@code endIndex} is greater than {@code length()},
         *          or if {@code beginIndex} is greater than {@code endIndex}
         *
         * @since 1.4
         * @spec JSR-51
         */
        public CharSequence subSequence(int beginIndex, int endIndex) {
            return this.substring(beginIndex, endIndex);
        }

        /**
         * Concatenates the specified string to the end of this string.
         * <p>
         * If the length of the argument string is {@code 0}, then this
         * {@code String} object is returned. Otherwise, a
         * {@code String} object is returned that represents a character
         * sequence that is the concatenation of the character sequence
         * represented by this {@code String} object and the character
         * sequence represented by the argument string.<p>
         * Examples:
         * <blockquote><pre>
         * "cares".concat("s") returns "caress"
         * "to".concat("get").concat("her") returns "together"
         * </pre></blockquote>
         *
         * @param   str   the {@code String} that is concatenated to the end
         *                of this {@code String}.
         * @return  a string that represents the concatenation of this object's
         *          characters followed by the string argument's characters.
         */
        public String concat(String str) {
            int otherLen = str.length();
            if (otherLen == 0) {
                return this;
            }
            int len = value.length;
            char buf[] = Arrays.copyOf(value, len + otherLen);
            str.getChars(buf, len);
            return new String(buf, true);
        }

        /**
         * Returns a string resulting from replacing all occurrences of
         * {@code oldChar} in this string with {@code newChar}.
         * <p>
         * If the character {@code oldChar} does not occur in the
         * character sequence represented by this {@code String} object,
         * then a reference to this {@code String} object is returned.
         * Otherwise, a {@code String} object is returned that
         * represents a character sequence identical to the character sequence
         * represented by this {@code String} object, except that every
         * occurrence of {@code oldChar} is replaced by an occurrence
         * of {@code newChar}.
         * <p>
         * Examples:
         * <blockquote><pre>
         * "mesquite in your cellar".replace('e', 'o')
         *         returns "mosquito in your collar"
         * "the war of baronets".replace('r', 'y')
         *         returns "the way of bayonets"
         * "sparring with a purple porpoise".replace('p', 't')
         *         returns "starring with a turtle tortoise"
         * "JonL".replace('q', 'x') returns "JonL" (no change)
         * </pre></blockquote>
         *
         * @param   oldChar   the old character.
         * @param   newChar   the new character.
         * @return  a string derived from this string by replacing every
         *          occurrence of {@code oldChar} with {@code newChar}.
         */
        public String replace(char oldChar, char newChar) {
            if (oldChar != newChar) {
                int len = value.length;
                int i = -1;
                char[] val = value; /* avoid getfield opcode */

                while (++i < len) {
                    if (val[i] == oldChar) {
                        break;
                    }
                }
                if (i < len) {
                    char buf[] = new char[len];
                    for (int j = 0; j < i; j++) {
                        buf[j] = val[j];
                    }
                    while (i < len) {
                        char c = val[i];
                        buf[i] = (c == oldChar) ? newChar : c;
                        i++;
                    }
                    return new String(buf, true);
                }
            }
            return this;
        }

        /**
         * Tells whether or not this string matches the given <a
         * href="../util/regex/Pattern.html#sum">regular expression</a>.
         *
         * <p> An invocation of this method of the form
         * <i>str</i>{@code .matches(}<i>regex</i>{@code )} yields exactly the
         * same result as the expression
         *
         * <blockquote>
         * {@link java.util.regex.Pattern}.{@link java.util.regex.Pattern#matches(String,CharSequence)
         * matches(<i>regex</i>, <i>str</i>)}
         * </blockquote>
         *
         * @param   regex
         *          the regular expression to which this string is to be matched
         *
         * @return  {@code true} if, and only if, this string matches the
         *          given regular expression
         *
         * @throws  PatternSyntaxException
         *          if the regular expression's syntax is invalid
         *
         * @see java.util.regex.Pattern
         *
         * @since 1.4
         * @spec JSR-51
         */
        public boolean matches(String regex) {
            return Pattern.matches(regex, this);
        }

        /**
         * Returns true if and only if this string contains the specified
         * sequence of char values.
         *
         * @param s the sequence to search for
         * @return true if this string contains {@code s}, false otherwise
         * @since 1.5
         */
        public boolean contains(CharSequence s) {
            return indexOf(s.toString()) > -1;
        }

        /**
         * Replaces the first substring of this string that matches the given <a
         * href="../util/regex/Pattern.html#sum">regular expression</a> with the
         * given replacement.
         *
         * <p> An invocation of this method of the form
         * <i>str</i>{@code .replaceFirst(}<i>regex</i>{@code ,} <i>repl</i>{@code )}
         * yields exactly the same result as the expression
         *
         * <blockquote>
         * <code>
         * {@link java.util.regex.Pattern}.{@link
         * java.util.regex.Pattern#compile compile}(<i>regex</i>).{@link
         * java.util.regex.Pattern#matcher(java.lang.CharSequence) matcher}(<i>str</i>).{@link
         * java.util.regex.Matcher#replaceFirst replaceFirst}(<i>repl</i>)
         * </code>
         * </blockquote>
         *
         *<p>
         * Note that backslashes ({@code \}) and dollar signs ({@code $}) in the
         * replacement string may cause the results to be different than if it were
         * being treated as a literal replacement string; see
         * {@link java.util.regex.Matcher#replaceFirst}.
         * Use {@link java.util.regex.Matcher#quoteReplacement} to suppress the special
         * meaning of these characters, if desired.
         *
         * @param   regex
         *          the regular expression to which this string is to be matched
         * @param   replacement
         *          the string to be substituted for the first match
         *
         * @return  The resulting {@code String}
         *
         * @throws  PatternSyntaxException
         *          if the regular expression's syntax is invalid
         *
         * @see java.util.regex.Pattern
         *
         * @since 1.4
         * @spec JSR-51
         */
        public String replaceFirst(String regex, String replacement) {
            return Pattern.compile(regex).matcher(this).replaceFirst(replacement);
        }

    274     /**
    275      * Replaces each substring of this string that matches the given <a
    276      * href="../util/regex/Pattern.html#sum">regular expression</a> with the
    277      * given replacement.
    278      *
    279      * <p> An invocation of this method of the form
    280      * <i>str</i>{@code .replaceAll(}<i>regex</i>{@code ,} <i>repl</i>{@code )}
    281      * yields exactly the same result as the expression
    282      *
    283      * <blockquote>
    284      * <code>
    285      * {@link java.util.regex.Pattern}.{@link
    286      * java.util.regex.Pattern#compile compile}(<i>regex</i>).{@link
    287      * java.util.regex.Pattern#matcher(java.lang.CharSequence) matcher}(<i>str</i>).{@link
    288      * java.util.regex.Matcher#replaceAll replaceAll}(<i>repl</i>)
    289      * </code>
    290      * </blockquote>
    291      *
    292      *<p>
    293      * Note that backslashes ({@code \}) and dollar signs ({@code $}) in the
    294      * replacement string may cause the results to be different than if it were
    295      * being treated as a literal replacement string; see
    296      * {@link java.util.regex.Matcher#replaceAll Matcher.replaceAll}.
    297      * Use {@link java.util.regex.Matcher#quoteReplacement} to suppress the special
    298      * meaning of these characters, if desired.
    299      *
    300      * @param   regex
    301      *          the regular expression to which this string is to be matched
    302      * @param   replacement
    303      *          the string to be substituted for each match
    304      *
    305      * @return  The resulting {@code String}
    306      *
    307      * @throws  PatternSyntaxException
    308      *          if the regular expression's syntax is invalid
    309      *
    310      * @see java.util.regex.Pattern
    311      *
    312      * @since 1.4
    313      * @spec JSR-51
    314      */
    315     public String replaceAll(String regex, String replacement) {
    316         return Pattern.compile(regex).matcher(this).replaceAll(replacement);
    317     }
    318 
    319     /**
    320      * Replaces each substring of this string that matches the literal target
    321      * sequence with the specified literal replacement sequence. The
    322      * replacement proceeds from the beginning of the string to the end, for
    323      * example, replacing "aa" with "b" in the string "aaa" will result in
    324      * "ba" rather than "ab".
    325      *
    326      * @param  target The sequence of char values to be replaced
    327      * @param  replacement The replacement sequence of char values
    328      * @return  The resulting string
    329      * @since 1.5
    330      */
    331     public String replace(CharSequence target, CharSequence replacement) {
    332         return Pattern.compile(target.toString(), Pattern.LITERAL).matcher(
    333                 this).replaceAll(Matcher.quoteReplacement(replacement.toString()));
    334     }
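The listing above shows that `replace` compiles its target with `Pattern.LITERAL` and quotes the replacement, while `replaceFirst` and `replaceAll` treat the first argument as a regular expression and give `$` and `\` special meaning in the replacement. The following is a minimal sketch of that difference; the class name `ReplaceDemo` and its helper methods are hypothetical and exist only for illustration.

```java
import java.util.regex.Matcher;

public class ReplaceDemo {
    // Literal replacement: neither argument is interpreted as a regex.
    static String literal(String s, String target, String repl) {
        return s.replace(target, repl);
    }

    // Regex match, but the replacement is escaped with Matcher.quoteReplacement
    // so that '$' and '\' are inserted verbatim instead of acting as group references.
    static String regexWithLiteralReplacement(String s, String regex, String repl) {
        return s.replaceAll(regex, Matcher.quoteReplacement(repl));
    }

    public static void main(String[] args) {
        System.out.println(literal("aaa", "aa", "b"));         // ba
        System.out.println(literal("a.b.c", ".", "-"));        // a-b-c
        System.out.println("a.b.c".replaceAll(".", "-"));      // -----
        System.out.println("a.b.c".replaceFirst("\\.", "-"));  // a-b.c
        System.out.println(
            regexWithLiteralReplacement("price: 5", "\\d+", "$10")); // price: $10
    }
}
```

When the replacement text comes from user input or otherwise may contain `$` or `\`, either `replace` or `Matcher.quoteReplacement` is usually the safer choice.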
  13. Analysis of the split method

    /**
     * Splits this string around matches of the given
     * <a href="../util/regex/Pattern.html#sum">regular expression</a>.
     *
     * <p> The array returned by this method contains each substring of this
     * string that is terminated by another substring that matches the given
     * expression or is terminated by the end of the string.  The substrings in
     * the array are in the order in which they occur in this string.  If the
     * expression does not match any part of the input then the resulting array
     * has just one element, namely this string.
     *
     * <p> When there is a positive-width match at the beginning of this
     * string then an empty leading substring is included at the beginning
     * of the resulting array. A zero-width match at the beginning however
     * never produces such empty leading substring.
     *
     * <p> The {@code limit} parameter controls the number of times the
     * pattern is applied and therefore affects the length of the resulting
     * array.  If the limit <i>n</i> is greater than zero then the pattern
     * will be applied at most <i>n</i>&nbsp;-&nbsp;1 times, the array's
     * length will be no greater than <i>n</i>, and the array's last entry
     * will contain all input beyond the last matched delimiter.  If <i>n</i>
     * is non-positive then the pattern will be applied as many times as
     * possible and the array can have any length.  If <i>n</i> is zero then
     * the pattern will be applied as many times as possible, the array can
     * have any length, and trailing empty strings will be discarded.
     *
     * <p> The string {@code "boo:and:foo"}, for example, yields the
     * following results with these parameters:
     *
     * <blockquote><table cellpadding=1 cellspacing=0 summary="Split example showing regex, limit, and result">
     * <tr>
     *     <th>Regex</th>
     *     <th>Limit</th>
     *     <th>Result</th>
     * </tr>
     * <tr><td align=center>:</td>
     *     <td align=center>2</td>
     *     <td>{@code { "boo", "and:foo" }}</td></tr>
     * <tr><td align=center>:</td>
     *     <td align=center>5</td>
     *     <td>{@code { "boo", "and", "foo" }}</td></tr>
     * <tr><td align=center>:</td>
     *     <td align=center>-2</td>
     *     <td>{@code { "boo", "and", "foo" }}</td></tr>
     * <tr><td align=center>o</td>
     *     <td align=center>5</td>
     *     <td>{@code { "b", "", ":and:f", "", "" }}</td></tr>
     * <tr><td align=center>o</td>
     *     <td align=center>-2</td>
     *     <td>{@code { "b", "", ":and:f", "", "" }}</td></tr>
     * <tr><td align=center>o</td>
     *     <td align=center>0</td>
     *     <td>{@code { "b", "", ":and:f" }}</td></tr>
     * </table></blockquote>
     *
     * <p> An invocation of this method of the form
     * <i>str.</i>{@code split(}<i>regex</i>{@code ,}&nbsp;<i>n</i>{@code )}
     * yields the same result as the expression
     *
     * <blockquote>
     * <code>
     * {@link java.util.regex.Pattern}.{@link
     * java.util.regex.Pattern#compile compile}(<i>regex</i>).{@link
     * java.util.regex.Pattern#split(java.lang.CharSequence,int) split}(<i>str</i>,&nbsp;<i>n</i>)
     * </code>
     * </blockquote>
     *
     *
     * @param  regex
     *         the delimiting regular expression
     *
     * @param  limit
     *         the result threshold, as described above
     *
     * @return  the array of strings computed by splitting this string
     *          around matches of the given regular expression
     *
     * @throws  PatternSyntaxException
     *          if the regular expression's syntax is invalid
     *
     * @see java.util.regex.Pattern
     *
     * @since 1.4
     * @spec JSR-51
     */
    public String[] split(String regex, int limit) {
        /* fastpath if the regex is a
         (1)one-char String and this character is not one of the
            RegEx's meta characters ".$|()[{^?*+\\", or
         (2)two-char String and the first char is the backslash and
            the second is not the ascii digit or ascii letter.
         */
        char ch = 0;
        if (((regex.value.length == 1 &&
             ".$|()[{^?*+\\".indexOf(ch = regex.charAt(0)) == -1) ||
             (regex.length() == 2 &&
              regex.charAt(0) == '\\' &&
              (((ch = regex.charAt(1))-'0')|('9'-ch)) < 0 &&
              ((ch-'a')|('z'-ch)) < 0 &&
              ((ch-'A')|('Z'-ch)) < 0)) &&
            (ch < Character.MIN_HIGH_SURROGATE ||
             ch > Character.MAX_LOW_SURROGATE))
        {
            int off = 0;
            int next = 0;
            boolean limited = limit > 0;
            ArrayList<String> list = new ArrayList<>();
            while ((next = indexOf(ch, off)) != -1) {
                if (!limited || list.size() < limit - 1) {
                    list.add(substring(off, next));
                    off = next + 1;
                } else {    // last one
                    //assert (list.size() == limit - 1);
                    list.add(substring(off, value.length));
                    off = value.length;
                    break;
                }
            }
            // If no match was found, return this
            if (off == 0)
                return new String[]{this};

            // Add remaining segment
            if (!limited || list.size() < limit)
                list.add(substring(off, value.length));

            // Construct result
            int resultSize = list.size();
            if (limit == 0) {
                while (resultSize > 0 && list.get(resultSize - 1).length() == 0) {
                    resultSize--;
                }
            }
            String[] result = new String[resultSize];
            return list.subList(0, resultSize).toArray(result);
        }
        return Pattern.compile(regex).split(this, limit);
    }

    /**
     * Splits this string around matches of the given <a
     * href="../util/regex/Pattern.html#sum">regular expression</a>.
     *
     * <p> This method works as if by invoking the two-argument {@link
     * #split(String, int) split} method with the given expression and a limit
     * argument of zero.  Trailing empty strings are therefore not included in
     * the resulting array.
     *
     * <p> The string {@code "boo:and:foo"}, for example, yields the following
     * results with these expressions:
     *
     * <blockquote><table cellpadding=1 cellspacing=0 summary="Split examples showing regex and result">
     * <tr>
     *  <th>Regex</th>
     *  <th>Result</th>
     * </tr>
     * <tr><td align=center>:</td>
     *     <td>{@code { "boo", "and", "foo" }}</td></tr>
     * <tr><td align=center>o</td>
     *     <td>{@code { "b", "", ":and:f" }}</td></tr>
     * </table></blockquote>
     *
     *
     * @param  regex
     *         the delimiting regular expression
     *
     * @return  the array of strings computed by splitting this string
     *          around matches of the given regular expression
     *
     * @throws  PatternSyntaxException
     *          if the regular expression's syntax is invalid
     *
     * @see java.util.regex.Pattern
     *
     * @since 1.4
     * @spec JSR-51
     */
    public String[] split(String regex) {
        return split(regex, 0);
    }

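    The split method has two code paths. The fast path applies when the delimiter is a single character that is not a regex metacharacter, or a backslash followed by a character that is not an ASCII letter or digit: it scans the char array with indexOf and collects the pieces in an ArrayList. Every other delimiter falls through to Pattern.compile(regex).split(this, limit).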
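The two paths are meant to be indistinguishable from the caller's side. The sketch below checks this for a few inputs from the Javadoc table by comparing `String.split` with `Pattern.split`; the class `SplitDemo` and its helpers `show` and `sameAsPattern` are hypothetical names used only for this illustration.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class SplitDemo {
    // Renders an array so that empty strings remain visible between commas.
    static String show(String[] parts) {
        return Arrays.toString(parts);
    }

    // True when String.split and Pattern.split agree for the given arguments.
    static boolean sameAsPattern(String s, String regex, int limit) {
        return Arrays.equals(s.split(regex, limit),
                             Pattern.compile(regex).split(s, limit));
    }

    public static void main(String[] args) {
        String s = "boo:and:foo";

        // ":" and "o" are single non-metacharacters, so they qualify for the fast path.
        System.out.println(show(s.split(":", 2)));   // [boo, and:foo]
        System.out.println(show(s.split("o", -2)));  // [b, , :and:f, , ]
        System.out.println(show(s.split("o", 0)));   // [b, , :and:f]

        // "\\." is a backslash plus a non-alphanumeric char, which also qualifies.
        System.out.println(show("a.b.c".split("\\.")));  // [a, b, c]

        // A character class is a real regex and goes through Pattern.
        System.out.println(show(s.split("[:o]")));   // [b, , , and, f]

        System.out.println(sameAsPattern(s, "o", -2));    // true
        System.out.println(sameAsPattern(s, "[:o]", 0));  // true
    }
}
```

Note how a limit of zero drops trailing empty strings while a negative limit keeps them, which matches the `limit == 0` trimming loop at the end of the fast path.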

  14. The join method

    /**
     * Returns a new String composed of copies of the
     * {@code CharSequence elements} joined together with a copy of
     * the specified {@code delimiter}.
     *
     * <blockquote>For example,
     * <pre>{@code
     *     String message = String.join("-", "Java", "is", "cool");
     *     // message returned is: "Java-is-cool"
     * }</pre></blockquote>
     *
     * Note that if an element is null, then {@code "null"} is added.
     *
     * @param  delimiter the delimiter that separates each element
     * @param  elements the elements to join together.
     *
     * @return a new {@code String} that is composed of the {@code elements}
     *         separated by the {@code delimiter}
     *
     * @throws NullPointerException If {@code delimiter} or {@code elements}
     *         is {@code null}
     *
     * @see java.util.StringJoiner
     * @since 1.8
     */
    public static String join(CharSequence delimiter, CharSequence... elements) {
        Objects.requireNonNull(delimiter);
        Objects.requireNonNull(elements);
        // Number of elements not likely worth Arrays.stream overhead.
        StringJoiner joiner = new StringJoiner(delimiter);
        for (CharSequence cs: elements) {
            joiner.add(cs);
        }
        return joiner.toString();
    }

    /**
     * Returns a new {@code String} composed of copies of the
     * {@code CharSequence elements} joined together with a copy of the
     * specified {@code delimiter}.
     *
     * <blockquote>For example,
     * <pre>{@code
     *     List<String> strings = new LinkedList<>();
     *     strings.add("Java");strings.add("is");
     *     strings.add("cool");
     *     String message = String.join(" ", strings);
     *     //message returned is: "Java is cool"
     *
     *     Set<String> strings = new LinkedHashSet<>();
     *     strings.add("Java"); strings.add("is");
     *     strings.add("very"); strings.add("cool");
     *     String message = String.join("-", strings);
     *     //message returned is: "Java-is-very-cool"
     * }</pre></blockquote>
     *
     * Note that if an individual element is {@code null}, then {@code "null"} is added.
     *
     * @param  delimiter a sequence of characters that is used to separate each
     *         of the {@code elements} in the resulting {@code String}
     * @param  elements an {@code Iterable} that will have its {@code elements}
     *         joined together.
     *
     * @return a new {@code String} that is composed from the {@code elements}
     *         argument
     *
     * @throws NullPointerException If {@code delimiter} or {@code elements}
     *         is {@code null}
     *
     * @see    #join(CharSequence,CharSequence...)
     * @see    java.util.StringJoiner
     * @since 1.8
     */
    public static String join(CharSequence delimiter,
            Iterable<? extends CharSequence> elements) {
        Objects.requireNonNull(delimiter);
        Objects.requireNonNull(elements);
        StringJoiner joiner = new StringJoiner(delimiter);
        for (CharSequence cs: elements) {
            joiner.add(cs);
        }
        return joiner.toString();
    }

    join is implemented with StringJoiner, which in turn wraps a StringBuilder.
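As a rough illustration of that delegation, the sketch below rebuilds the same loop with `StringJoiner` directly and compares it with `String.join`. The class `JoinDemo` and the method `joinWithJoiner` are hypothetical names; the prefix and suffix constructor shown at the end is a `StringJoiner` feature that `String.join` itself does not expose.

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

public class JoinDemo {
    // Hand-written equivalent of String.join, mirroring the loop in the listing.
    static String joinWithJoiner(CharSequence delimiter,
                                 Iterable<? extends CharSequence> elements) {
        StringJoiner joiner = new StringJoiner(delimiter);
        for (CharSequence cs : elements) {
            joiner.add(cs);  // a null element is appended as the text "null"
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("Java", "is", null, "cool");
        System.out.println(String.join("-", words));     // Java-is-null-cool
        System.out.println(joinWithJoiner("-", words));  // Java-is-null-cool

        // StringJoiner can also add a prefix and suffix around the joined text.
        StringJoiner bracketed = new StringJoiner(", ", "[", "]");
        bracketed.add("a").add("b");
        System.out.println(bracketed);                   // [a, b]
    }
}
```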

  15. Case conversion and whitespace trimming methods

      1     /**
      2      * Converts all of the characters in this {@code String} to lower
      3      * case using the rules of the given {@code Locale}.  Case mapping is based
      4      * on the Unicode Standard version specified by the {@link java.lang.Character Character}
      5      * class. Since case mappings are not always 1:1 char mappings, the resulting
      6      * {@code String} may be a different length than the original {@code String}.
      7      * <p>
      8      * Examples of lowercase  mappings are in the following table:
      9      * <table border="1" summary="Lowercase mapping examples showing language code of locale, upper case, lower case, and description">
     10      * <tr>
     11      *   <th>Language Code of Locale</th>
     12      *   <th>Upper Case</th>
     13      *   <th>Lower Case</th>
     14      *   <th>Description</th>
     15      * </tr>
     16      * <tr>
     17      *   <td>tr (Turkish)</td>
     18      *   <td>&#92;u0130</td>
     19      *   <td>&#92;u0069</td>
     20      *   <td>capital letter I with dot above -&gt; small letter i</td>
     21      * </tr>
     22      * <tr>
     23      *   <td>tr (Turkish)</td>
     24      *   <td>&#92;u0049</td>
     25      *   <td>&#92;u0131</td>
     26      *   <td>capital letter I -&gt; small letter dotless i </td>
     27      * </tr>
     28      * <tr>
     29      *   <td>(all)</td>
     30      *   <td>French Fries</td>
     31      *   <td>french fries</td>
     32      *   <td>lowercased all chars in String</td>
     33      * </tr>
     34      * <tr>
     35      *   <td>(all)</td>
     36      *   <td><img src="doc-files/capiota.gif" alt="capiota"><img src="doc-files/capchi.gif" alt="capchi">
     37      *       <img src="doc-files/captheta.gif" alt="captheta"><img src="doc-files/capupsil.gif" alt="capupsil">
     38      *       <img src="doc-files/capsigma.gif" alt="capsigma"></td>
     39      *   <td><img src="doc-files/iota.gif" alt="iota"><img src="doc-files/chi.gif" alt="chi">
     40      *       <img src="doc-files/theta.gif" alt="theta"><img src="doc-files/upsilon.gif" alt="upsilon">
     41      *       <img src="doc-files/sigma1.gif" alt="sigma"></td>
     42      *   <td>lowercased all chars in String</td>
     43      * </tr>
     44      * </table>
     45      *
     46      * @param locale use the case transformation rules for this locale
     47      * @return the {@code String}, converted to lowercase.
     48      * @see     java.lang.String#toLowerCase()
     49      * @see     java.lang.String#toUpperCase()
     50      * @see     java.lang.String#toUpperCase(Locale)
     51      * @since   1.1
     52      */
     53     public String toLowerCase(Locale locale) {
     54         if (locale == null) {
     55             throw new NullPointerException();
     56         }
     57 
     58         int firstUpper;
     59         final int len = value.length;
     60 
     61         /* Now check if there are any characters that need to be changed. */
     62         scan: {
     63             for (firstUpper = 0 ; firstUpper < len; ) {
     64                 char c = value[firstUpper];
     65                 if ((c >= Character.MIN_HIGH_SURROGATE)
     66                         && (c <= Character.MAX_HIGH_SURROGATE)) {
     67                     int supplChar = codePointAt(firstUpper);
     68                     if (supplChar != Character.toLowerCase(supplChar)) {
     69                         break scan;
     70                     }
     71                     firstUpper += Character.charCount(supplChar);
     72                 } else {
     73                     if (c != Character.toLowerCase(c)) {
     74                         break scan;
     75                     }
     76                     firstUpper++;
     77                 }
     78             }
     79             return this;
     80         }
     81 
     82         char[] result = new char[len];
     83         int resultOffset = 0;  /* result may grow, so i+resultOffset
     84                                 * is the write location in result */
     85 
     86         /* Just copy the first few lowerCase characters. */
     87         System.arraycopy(value, 0, result, 0, firstUpper);
     88 
     89         String lang = locale.getLanguage();
     90         boolean localeDependent =
     91                 (lang == "tr" || lang == "az" || lang == "lt");
     92         char[] lowerCharArray;
     93         int lowerChar;
     94         int srcChar;
     95         int srcCount;
     96         for (int i = firstUpper; i < len; i += srcCount) {
     97             srcChar = (int)value[i];
     98             if ((char)srcChar >= Character.MIN_HIGH_SURROGATE
     99                     && (char)srcChar <= Character.MAX_HIGH_SURROGATE) {
    100                 srcChar = codePointAt(i);
    101                 srcCount = Character.charCount(srcChar);
    102             } else {
    103                 srcCount = 1;
    104             }
    105             if (localeDependent ||
    106                 srcChar == '\u03A3' || // GREEK CAPITAL LETTER SIGMA
    107                 srcChar == '\u0130') { // LATIN CAPITAL LETTER I WITH DOT ABOVE
    108                 lowerChar = ConditionalSpecialCasing.toLowerCaseEx(this, i, locale);
    109             } else {
    110                 lowerChar = Character.toLowerCase(srcChar);
    111             }
    112             if ((lowerChar == Character.ERROR)
    113                     || (lowerChar >= Character.MIN_SUPPLEMENTARY_CODE_POINT)) {
    114                 if (lowerChar == Character.ERROR) {
    115                     lowerCharArray =
    116                             ConditionalSpecialCasing.toLowerCaseCharArray(this, i, locale);
    117                 } else if (srcCount == 2) {
    118                     resultOffset += Character.toChars(lowerChar, result, i + resultOffset) - srcCount;
    119                     continue;
    120                 } else {
    121                     lowerCharArray = Character.toChars(lowerChar);
    122                 }
    123 
    124                 /* Grow result if needed */
    125                 int mapLen = lowerCharArray.length;
    126                 if (mapLen > srcCount) {
    127                     char[] result2 = new char[result.length + mapLen - srcCount];
    128                     System.arraycopy(result, 0, result2, 0, i + resultOffset);
    129                     result = result2;
    130                 }
    131                 for (int x = 0; x < mapLen; ++x) {
    132                     result[i + resultOffset + x] = lowerCharArray[x];
    133                 }
    134                 resultOffset += (mapLen - srcCount);
    135             } else {
    136                 result[i + resultOffset] = (char)lowerChar;
    137             }
    138         }
    139         return new String(result, 0, len + resultOffset);
    140     }
    141 
    142     /**
    143      * Converts all of the characters in this {@code String} to lower
    144      * case using the rules of the default locale. This is equivalent to calling
    145      * {@code toLowerCase(Locale.getDefault())}.
    146      * <p>
    147      * <b>Note:</b> This method is locale sensitive, and may produce unexpected
    148      * results if used for strings that are intended to be interpreted locale
    149      * independently.
    150      * Examples are programming language identifiers, protocol keys, and HTML
    151      * tags.
    152      * For instance, {@code "TITLE".toLowerCase()} in a Turkish locale
    153      * returns {@code "t\u005Cu0131tle"}, where '\u005Cu0131' is the
    154      * LATIN SMALL LETTER DOTLESS I character.
    155      * To obtain correct results for locale insensitive strings, use
    156      * {@code toLowerCase(Locale.ROOT)}.
    157      * <p>
    158      * @return  the {@code String}, converted to lowercase.
    159      * @see     java.lang.String#toLowerCase(Locale)
    160      */
    161     public String toLowerCase() {
    162         return toLowerCase(Locale.getDefault());
    163     }
    164 
    165     /**
    166      * Converts all of the characters in this {@code String} to upper
    167      * case using the rules of the given {@code Locale}. Case mapping is based
    168      * on the Unicode Standard version specified by the {@link java.lang.Character Character}
    169      * class. Since case mappings are not always 1:1 char mappings, the resulting
    170      * {@code String} may be a different length than the original {@code String}.
    171      * <p>
    172      * Examples of locale-sensitive and 1:M case mappings are in the following table.
    173      *
    174      * <table border="1" summary="Examples of locale-sensitive and 1:M case mappings. Shows Language code of locale, lower case, upper case, and description.">
    175      * <tr>
    176      *   <th>Language Code of Locale</th>
    177      *   <th>Lower Case</th>
    178      *   <th>Upper Case</th>
    179      *   <th>Description</th>
    180      * </tr>
    181      * <tr>
    182      *   <td>tr (Turkish)</td>
    183      *   <td>&#92;u0069</td>
    184      *   <td>&#92;u0130</td>
    185      *   <td>small letter i -&gt; capital letter I with dot above</td>
    186      * </tr>
    187      * <tr>
    188      *   <td>tr (Turkish)</td>
    189      *   <td>&#92;u0131</td>
    190      *   <td>&#92;u0049</td>
    191      *   <td>small letter dotless i -&gt; capital letter I</td>
    192      * </tr>
    193      * <tr>
    194      *   <td>(all)</td>
    195      *   <td>&#92;u00df</td>
    196      *   <td>&#92;u0053 &#92;u0053</td>
    197      *   <td>small letter sharp s -&gt; two letters: SS</td>
    198      * </tr>
    199      * <tr>
    200      *   <td>(all)</td>
    201      *   <td>Fahrvergn&uuml;gen</td>
    202      *   <td>FAHRVERGN&Uuml;GEN</td>
    203      *   <td></td>
    204      * </tr>
    205      * </table>
    206      * @param locale use the case transformation rules for this locale
    207      * @return the {@code String}, converted to uppercase.
    208      * @see     java.lang.String#toUpperCase()
    209      * @see     java.lang.String#toLowerCase()
    210      * @see     java.lang.String#toLowerCase(Locale)
    211      * @since   1.1
    212      */
    213     public String toUpperCase(Locale locale) {
    214         if (locale == null) {
    215             throw new NullPointerException();
    216         }
    217 
    218         int firstLower;
    219         final int len = value.length;
    220 
    221         /* Now check if there are any characters that need to be changed. */
    222         scan: {
    223             for (firstLower = 0 ; firstLower < len; ) {
    224                 int c = (int)value[firstLower];
    225                 int srcCount;
    226                 if ((c >= Character.MIN_HIGH_SURROGATE)
    227                         && (c <= Character.MAX_HIGH_SURROGATE)) {
    228                     c = codePointAt(firstLower);
    229                     srcCount = Character.charCount(c);
    230                 } else {
    231                     srcCount = 1;
    232                 }
    233                 int upperCaseChar = Character.toUpperCaseEx(c);
    234                 if ((upperCaseChar == Character.ERROR)
    235                         || (c != upperCaseChar)) {
    236                     break scan;
    237                 }
    238                 firstLower += srcCount;
    239             }
    240             return this;
    241         }
    242 
    243         /* result may grow, so i+resultOffset is the write location in result */
    244         int resultOffset = 0;
    245         char[] result = new char[len]; /* may grow */
    246 
    247         /* Just copy the first few upperCase characters. */
    248         System.arraycopy(value, 0, result, 0, firstLower);
    249 
    250         String lang = locale.getLanguage();
    251         boolean localeDependent =
    252                 (lang == "tr" || lang == "az" || lang == "lt");
    253         char[] upperCharArray;
    254         int upperChar;
    255         int srcChar;
    256         int srcCount;
    257         for (int i = firstLower; i < len; i += srcCount) {
    258             srcChar = (int)value[i];
    259             if ((char)srcChar >= Character.MIN_HIGH_SURROGATE &&
    260                 (char)srcChar <= Character.MAX_HIGH_SURROGATE) {
    261                 srcChar = codePointAt(i);
    262                 srcCount = Character.charCount(srcChar);
    263             } else {
    264                 srcCount = 1;
    265             }
    266             if (localeDependent) {
    267                 upperChar = ConditionalSpecialCasing.toUpperCaseEx(this, i, locale);
    268             } else {
    269                 upperChar = Character.toUpperCaseEx(srcChar);
    270             }
    271             if ((upperChar == Character.ERROR)
    272                     || (upperChar >= Character.MIN_SUPPLEMENTARY_CODE_POINT)) {
    273                 if (upperChar == Character.ERROR) {
    274                     if (localeDependent) {
    275                         upperCharArray =
    276                                 ConditionalSpecialCasing.toUpperCaseCharArray(this, i, locale);
    277                     } else {
    278                         upperCharArray = Character.toUpperCaseCharArray(srcChar);
    279                     }
    280                 } else if (srcCount == 2) {
    281                     resultOffset += Character.toChars(upperChar, result, i + resultOffset) - srcCount;
    282                     continue;
    283                 } else {
    284                     upperCharArray = Character.toChars(upperChar);
    285                 }
    286 
    287                 /* Grow result if needed */
    288                 int mapLen = upperCharArray.length;
    289                 if (mapLen > srcCount) {
    290                     char[] result2 = new char[result.length + mapLen - srcCount];
    291                     System.arraycopy(result, 0, result2, 0, i + resultOffset);
    292                     result = result2;
    293                 }
    294                 for (int x = 0; x < mapLen; ++x) {
    295                     result[i + resultOffset + x] = upperCharArray[x];
    296                 }
    297                 resultOffset += (mapLen - srcCount);
    298             } else {
    299                 result[i + resultOffset] = (char)upperChar;
    300             }
    301         }
    302         return new String(result, 0, len + resultOffset);
    303     }
    304 
    305     /**
    306      * Converts all of the characters in this {@code String} to upper
    307      * case using the rules of the default locale. This method is equivalent to
    308      * {@code toUpperCase(Locale.getDefault())}.
    309      * <p>
    310      * <b>Note:</b> This method is locale sensitive, and may produce unexpected
    311      * results if used for strings that are intended to be interpreted locale
    312      * independently.
    313      * Examples are programming language identifiers, protocol keys, and HTML
    314      * tags.
    315      * For instance, {@code "title".toUpperCase()} in a Turkish locale
    316      * returns {@code "T\u005Cu0130TLE"}, where '\u005Cu0130' is the
    317      * LATIN CAPITAL LETTER I WITH DOT ABOVE character.
    318      * To obtain correct results for locale insensitive strings, use
    319      * {@code toUpperCase(Locale.ROOT)}.
    320      * <p>
    321      * @return  the {@code String}, converted to uppercase.
    322      * @see     java.lang.String#toUpperCase(Locale)
    323      */
    324     public String toUpperCase() {
    325         return toUpperCase(Locale.getDefault());
    326     }
    327 
    328     /**
    329      * Returns a string whose value is this string, with any leading and trailing
    330      * whitespace removed.
    331      * <p>
    332      * If this {@code String} object represents an empty character
    333      * sequence, or the first and last characters of character sequence
    334      * represented by this {@code String} object both have codes
    335      * greater than {@code ‘\u005Cu0020‘} (the space character), then a
    336      * reference to this {@code String} object is returned.
    337      * <p>
    338      * Otherwise, if there is no character with a code greater than
    339      * {@code ‘\u005Cu0020‘} in the string, then a
    340      * {@code String} object representing an empty string is
    341      * returned.
    342      * <p>
    343      * Otherwise, let <i>k</i> be the index of the first character in the
    344      * string whose code is greater than {@code ‘\u005Cu0020‘}, and let
    345      * <i>m</i> be the index of the last character in the string whose code
    346      * is greater than {@code ‘\u005Cu0020‘}. A {@code String}
    347      * object is returned, representing the substring of this string that
    348      * begins with the character at index <i>k</i> and ends with the
    349      * character at index <i>m</i>-that is, the result of
    350      * {@code this.substring(k, m + 1)}.
    351      * <p>
    352      * This method may be used to trim whitespace (as defined above) from
    353      * the beginning and end of a string.
    354      *
    355      * @return  A string whose value is this string, with any leading and trailing white
    356      *          space removed, or this string if it has no leading or
    357      *          trailing white space.
    358      */
    359     public String trim() {
    360         int len = value.length;
    361         int st = 0;
    362         char[] val = value;    /* avoid getfield opcode */
    363 
    364         while ((st < len) && (val[st] <= ‘ ‘)) {
    365             st++;
    366         }
    367         while ((st < len) && (val[len - 1] <= ‘ ‘)) {
    368             len--;
    369         }
    370         return ((st > 0) || (len < value.length)) ? substring(st, len) : this;
    371     }
    lowupperandtrim
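
The behaviour documented in the listing above can be checked with a small sketch. The class name `CaseAndTrimDemo` and its helper `upper` are hypothetical names made up for this illustration; only `toUpperCase(Locale)`, `trim()` and `Locale.forLanguageTag` come from the JDK. It shows the Turkish-locale case from the Javadoc, a character whose upper-case form is longer than the original (the situation in which the `result2` array is allocated), and the fact that `trim()` only strips characters with a code of U+0020 or below.

```java
import java.util.Locale;

// Hypothetical demo class; not part of java.lang.String.
final class CaseAndTrimDemo {
    // Upper-cases with an explicit locale so the result does not depend on the JVM default.
    static String upper(String s, Locale locale) {
        return s.toUpperCase(locale);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr");
        System.out.println(upper("title", turkish));            // "T" + U+0130 + "TLE"
        System.out.println(upper("title", Locale.ROOT));        // TITLE
        // U+00DF (sharp s) maps to two chars, so the result is longer than the input.
        System.out.println(upper("stra\u00DFe", Locale.ROOT));  // STRASSE

        String padded = " \t hello \n";
        System.out.println("[" + padded.trim() + "]");          // [hello]
        String clean = "hello";
        // Nothing to strip, so trim() returns this very object.
        System.out.println(clean.trim() == clean);              // true
        // U+00A0 (no-break space) is greater than U+0020, so trim() keeps it.
        System.out.println("\u00A0x".trim().length());          // 2
    }
}
```

For identifiers, protocol keys and similar locale-independent text, passing `Locale.ROOT` explicitly, as the Javadoc recommends, avoids the Turkish-locale surprise.
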
  16. toString and toCharArray methods

        /**
         * This object (which is already a string!) is itself returned.
         *
         * @return  the string itself.
         */
        public String toString() {
            return this;
        }

        /**
         * Converts this string to a new character array.
         *
         * @return  a newly allocated character array whose length is the length
         *          of this string and whose contents are initialized to contain
         *          the character sequence represented by this string.
         */
        public char[] toCharArray() {
            // Cannot use Arrays.copyOf because of class initialization order issues
            char result[] = new char[value.length];
            System.arraycopy(value, 0, result, 0, value.length);
            return result;
        }
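
Because `toCharArray()` copies `value` into a newly allocated array, callers can modify the returned array without affecting the string. The sketch below illustrates this; `ToCharArrayDemo` and `mutateCopy` are hypothetical names used only for this example.

```java
// Hypothetical demo class; not part of java.lang.String.
final class ToCharArrayDemo {
    // Mutates the array obtained from toCharArray() and returns the original string.
    static String mutateCopy(String s) {
        char[] copy = s.toCharArray();
        if (copy.length > 0) {
            copy[0] = '#';   // only the copy changes
        }
        return s;
    }

    public static void main(String[] args) {
        String s = "abc";
        System.out.println(s.toString() == s);                   // true: toString() returns this
        System.out.println(mutateCopy(s));                       // abc
        System.out.println(s.toCharArray() == s.toCharArray());  // false: a fresh array per call
    }
}
```
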
  17. format: public static formatting methods

        public static String format(String format, Object... args) {
            return new Formatter().format(format, args).toString();
        }

        public static String format(Locale l, String format, Object... args) {
            return new Formatter(l).format(format, args).toString();
        }
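
Both overloads delegate to `java.util.Formatter`; the only difference is whether a locale is supplied. A minimal usage sketch follows, in which `FormatDemo`, `pair` and `twoDecimals` are hypothetical names chosen for illustration.

```java
import java.util.Locale;

// Hypothetical demo class; not part of java.lang.String.
final class FormatDemo {
    // %s does not depend on the locale, so the one-argument-locale-free overload is safe here.
    static String pair(String key, String value) {
        return String.format("%s=%s", key, value);
    }

    // The locale decides the decimal separator used by %f.
    static String twoDecimals(Locale locale, double v) {
        return String.format(locale, "%.2f", v);
    }

    public static void main(String[] args) {
        System.out.println(pair("key", "value"));                  // key=value
        System.out.println(twoDecimals(Locale.ROOT, 3.14159));     // 3.14
        System.out.println(twoDecimals(Locale.GERMANY, 3.14159));  // 3,14
    }
}
```

As with `toUpperCase`, output that other programs will parse is usually formatted with an explicit locale such as `Locale.ROOT` rather than the JVM default.
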
  18. valueOf
        /**
         * Returns the string representation of the {@code Object} argument.
         *
         * @param   obj   an {@code Object}.
         * @return  if the argument is {@code null}, then a string equal to
         *          {@code "null"}; otherwise, the value of
         *          {@code obj.toString()} is returned.
         * @see     java.lang.Object#toString()
         */
        public static String valueOf(Object obj) {
            return (obj == null) ? "null" : obj.toString();
        }

        /**
         * Returns the string representation of the {@code char} array
         * argument. The contents of the character array are copied; subsequent
         * modification of the character array does not affect the returned
         * string.
         *
         * @param   data     the character array.
         * @return  a {@code String} that contains the characters of the
         *          character array.
         */
        public static String valueOf(char data[]) {
            return new String(data);
        }

        /**
         * Returns the string representation of a specific subarray of the
         * {@code char} array argument.
         * <p>
         * The {@code offset} argument is the index of the first
         * character of the subarray. The {@code count} argument
         * specifies the length of the subarray. The contents of the subarray
         * are copied; subsequent modification of the character array does not
         * affect the returned string.
         *
         * @param   data     the character array.
         * @param   offset   initial offset of the subarray.
         * @param   count    length of the subarray.
         * @return  a {@code String} that contains the characters of the
         *          specified subarray of the character array.
         * @exception IndexOutOfBoundsException if {@code offset} is
         *          negative, or {@code count} is negative, or
         *          {@code offset+count} is larger than
         *          {@code data.length}.
         */
        public static String valueOf(char data[], int offset, int count) {
            return new String(data, offset, count);
        }

        /**
         * Equivalent to {@link #valueOf(char[], int, int)}.
         *
         * @param   data     the character array.
         * @param   offset   initial offset of the subarray.
         * @param   count    length of the subarray.
         * @return  a {@code String} that contains the characters of the
         *          specified subarray of the character array.
         * @exception IndexOutOfBoundsException if {@code offset} is
         *          negative, or {@code count} is negative, or
         *          {@code offset+count} is larger than
         *          {@code data.length}.
         */
        public static String copyValueOf(char data[], int offset, int count) {
            return new String(data, offset, count);
        }

        /**
         * Equivalent to {@link #valueOf(char[])}.
         *
         * @param   data   the character array.
         * @return  a {@code String} that contains the characters of the
         *          character array.
         */
        public static String copyValueOf(char data[]) {
            return new String(data);
        }

        /**
         * Returns the string representation of the {@code boolean} argument.
         *
         * @param   b   a {@code boolean}.
         * @return  if the argument is {@code true}, a string equal to
         *          {@code "true"} is returned; otherwise, a string equal to
         *          {@code "false"} is returned.
         */
        public static String valueOf(boolean b) {
            return b ? "true" : "false";
        }

        /**
         * Returns the string representation of the {@code char}
         * argument.
         *
         * @param   c   a {@code char}.
         * @return  a string of length {@code 1} containing
         *          as its single character the argument {@code c}.
         */
        public static String valueOf(char c) {
            char data[] = {c};
            return new String(data, true);
        }

        /**
         * Returns the string representation of the {@code int} argument.
         * <p>
         * The representation is exactly the one returned by the
         * {@code Integer.toString} method of one argument.
         *
         * @param   i   an {@code int}.
         * @return  a string representation of the {@code int} argument.
         * @see     java.lang.Integer#toString(int, int)
         */
        public static String valueOf(int i) {
            return Integer.toString(i);
        }

        /**
         * Returns the string representation of the {@code long} argument.
         * <p>
         * The representation is exactly the one returned by the
         * {@code Long.toString} method of one argument.
         *
         * @param   l   a {@code long}.
         * @return  a string representation of the {@code long} argument.
         * @see     java.lang.Long#toString(long)
         */
        public static String valueOf(long l) {
            return Long.toString(l);
        }

        /**
         * Returns the string representation of the {@code float} argument.
         * <p>
         * The representation is exactly the one returned by the
         * {@code Float.toString} method of one argument.
         *
         * @param   f   a {@code float}.
         * @return  a string representation of the {@code float} argument.
         * @see     java.lang.Float#toString(float)
         */
        public static String valueOf(float f) {
            return Float.toString(f);
        }

        /**
         * Returns the string representation of the {@code double} argument.
         * <p>
         * The representation is exactly the one returned by the
         * {@code Double.toString} method of one argument.
         *
         * @param   d   a {@code double}.
         * @return  a  string representation of the {@code double} argument.
         * @see     java.lang.Double#toString(double)
         */
        public static String valueOf(double d) {
            return Double.toString(d);
        }

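    Among the valueOf overloads in the source, passing a null whose static type is Object (or String) selects valueOf(Object), which returns the string "null". Passing a bare null literal behaves differently: overload resolution picks the most specific applicable overload, valueOf(char[]), and new String((char[]) null) then throws a NullPointerException. (The "null" string is an easy trap to fall into.)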
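
The two null cases can be reproduced with the sketch below. `ValueOfNullDemo`, `fromObject` and `bareNullThrows` are hypothetical names introduced for this example; the behaviour shown relies only on the `valueOf(Object)` and `valueOf(char[])` overloads of `String`.

```java
// Hypothetical demo class; not part of java.lang.String.
final class ValueOfNullDemo {
    // The static type Object selects valueOf(Object), which maps null to "null".
    static String fromObject(Object o) {
        return String.valueOf(o);
    }

    // Returns true if the bare-literal call throws NullPointerException.
    static boolean bareNullThrows() {
        try {
            String.valueOf(null);   // resolves to valueOf(char[]), the most specific overload
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String missing = null;
        System.out.println(fromObject(missing));           // null (the four-character string)
        System.out.println(fromObject(missing).length());  // 4
        System.out.println(bareNullThrows());              // true
    }
}
```

Code that concatenates or logs the result of `valueOf` may therefore end up containing the literal text `null`, so an explicit null check before the call is often the safer choice.
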

  19. intern: a native method, meaning its implementation lives in native code inside the JVM rather than in Java source.
        /**
         * Returns a canonical representation for the string object.
         * <p>
         * A pool of strings, initially empty, is maintained privately by the
         * class {@code String}.
         * <p>
         * When the intern method is invoked, if the pool already contains a
         * string equal to this {@code String} object as determined by
         * the {@link #equals(Object)} method, then the string from the pool is
         * returned. Otherwise, this {@code String} object is added to the
         * pool and a reference to this {@code String} object is returned.
         * <p>
         * It follows that for any two strings {@code s} and {@code t},
         * {@code s.intern() == t.intern()} is {@code true}
         * if and only if {@code s.equals(t)} is {@code true}.
         * <p>
         * All literal strings and string-valued constant expressions are
         * interned. String literals are defined in section 3.10.5 of the
         * <cite>The Java&trade; Language Specification</cite>.
         *
         * @return  a string that has the same contents as this string, but is
         *          guaranteed to be from a pool of unique strings.
         */
        public native String intern();

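    How this method works is tied to the string constant pool: as the Javadoc states, it returns the pooled instance equal to this string, first adding this string to the pool if no equal one is present.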
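
The pool semantics described in the Javadoc can be observed through reference comparison. In this sketch, `InternDemo` and `sameAfterIntern` are hypothetical names used for illustration only.

```java
// Hypothetical demo class; not part of java.lang.String.
final class InternDemo {
    // By the Javadoc contract this is true exactly when a.equals(b).
    static boolean sameAfterIntern(String a, String b) {
        return a.intern() == b.intern();
    }

    public static void main(String[] args) {
        String literal = "hello";             // literals are interned automatically
        String built = new String("hello");   // a distinct object with equal contents

        System.out.println(built == literal);                 // false: different objects
        System.out.println(built.equals(literal));            // true: same characters
        System.out.println(built.intern() == literal);        // true: the pooled instance
        System.out.println(sameAfterIntern(built, "world"));  // false: contents differ
    }
}
```
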








