Reading the java.lang.String source code

Posted 2020-10-29 aben-blog


The String class implements three interfaces, java.io.Serializable, Comparable<String>, and CharSequence, and is declared final.
```
public final class String
    implements java.io.Serializable, Comparable<String>, CharSequence {
```

String is backed by a char[] array
```
    /** The value is used for character storage. */
    private final char value[];

    /** Cache the hash code for the string */
    private int hash; // Default to 0
```
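The value array stores the string's characters. It is private and final, and String never exposes or modifies it after construction, so the content of a String cannot change once it has been created. Assigning a new value to a String variable just points the variable at a different String instance.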
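The reassignment behaviour can be checked with a small sketch. The class and method names here are made up for illustration and are not part of the JDK.

```java
class StringReassignDemo {
    // Returns true when "modifying" a String leaves the original object untouched.
    static boolean originalUnchanged() {
        String a = "hello";
        String b = a;            // a and b refer to the same instance
        a = a.concat(" world");  // concat returns a new String; a now points to it
        return b.equals("hello") && a.equals("hello world") && a != b;
    }

    public static void main(String[] args) {
        System.out.println(originalUnchanged());
    }
}
```

The variable a ends up referring to a second object, while b still sees the original content.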

Serialization support

    /** use serialVersionUID from JDK 1.0.2 for interoperability */
    private static final long serialVersionUID = -6849794470754667710L;

    /**
     * Class String is special cased within the Serialization Stream Protocol.
     *
     * A String instance is written into an ObjectOutputStream according to
     * <a href="{@docRoot}/../platform/serialization/spec/output.html">
     * Object Serialization Specification, Section 6.2, "Stream Elements"</a>
     */
    private static final ObjectStreamField[] serialPersistentFields =
        new ObjectStreamField[0];


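serialVersionUID records the serialization version of the class. serialPersistentFields declares which fields are serialized; here it is an empty array, because String is special-cased in the serialization stream protocol, as the comment in the source notes.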

String constructors

    // A String's value never changes, so strings with the same content can share one char[] array
    public String() {
        this.value = "".value;
    }
    public String(String original) {
        this.value = original.value;
        this.hash = original.hash;
    }
    // Arrays.copyOf allocates a new char[] and copies into it with System.arraycopy
    public String(char value[]) {
        this.value = Arrays.copyOf(value, value.length);
    }
    public String(char value[], int offset, int count) {
        if (offset < 0) {
            throw new StringIndexOutOfBoundsException(offset);
        }
        if (count <= 0) {
            if (count < 0) {
                throw new StringIndexOutOfBoundsException(count);
            }
            if (offset <= value.length) {
                this.value = "".value;
                return;
            }
        }
        // Note: offset or count might be near -1>>>1.
        if (offset > value.length - count) {
            throw new StringIndexOutOfBoundsException(offset + count);
        }
        this.value = Arrays.copyOfRange(value, offset, offset+count);
    }

    // Construct from an array of Unicode code points
    public String(int[] codePoints, int offset, int count) {
        if (offset < 0) {
            throw new StringIndexOutOfBoundsException(offset);
        }
        if (count <= 0) {
            if (count < 0) {
                throw new StringIndexOutOfBoundsException(count);
            }
            if (offset <= codePoints.length) {
                this.value = "".value;
                return;
            }
        }
        // Note: offset or count might be near -1>>>1.
        if (offset > codePoints.length - count) {
            throw new StringIndexOutOfBoundsException(offset + count);
        }

        final int end = offset + count;

        // Pass 1: Compute precise size of char[]
        int n = count;
        for (int i = offset; i < end; i++) {
            int c = codePoints[i];
            if (Character.isBmpCodePoint(c))
                continue;
            else if (Character.isValidCodePoint(c))
                n++;
            else throw new IllegalArgumentException(Integer.toString(c));
        }

        // Pass 2: Allocate and fill in char[]
        final char[] v = new char[n];

        for (int i = offset, j = 0; i < end; i++, j++) {
            int c = codePoints[i];
            if (Character.isBmpCodePoint(c))
                v[j] = (char)c;
            else
                Character.toSurrogates(c, v, j++);
        }

        this.value = v;
    }
    /* Common private utility method used to bounds check the byte array
     * and requested offset & length values used by the String(byte[],..)
     * constructors.
     */
    private static void checkBounds(byte[] bytes, int offset, int length) {
        if (length < 0)
            throw new StringIndexOutOfBoundsException(length);
        if (offset < 0)
            throw new StringIndexOutOfBoundsException(offset);
        if (offset > bytes.length - length)
            throw new StringIndexOutOfBoundsException(offset + length);
    }

    /**
     * Constructs a new {@code String} by decoding the specified subarray of
     * bytes using the specified charset.  The length of the new {@code String}
     * is a function of the charset, and hence may not be equal to the length
     * of the subarray.
     *
     * <p> The behavior of this constructor when the given bytes are not valid
     * in the given charset is unspecified.  The {@link
     * java.nio.charset.CharsetDecoder} class should be used when more control
     * over the decoding process is required.
     *
     * @param  bytes
     *         The bytes to be decoded into characters
     *
     * @param  offset
     *         The index of the first byte to decode
     *
     * @param  length
     *         The number of bytes to decode
     *
     * @param  charsetName
     *         The name of a supported {@linkplain java.nio.charset.Charset
     *         charset}
     *
     * @throws  UnsupportedEncodingException
     *          If the named charset is not supported
     *
     * @throws  IndexOutOfBoundsException
     *          If the {@code offset} and {@code length} arguments index
     *          characters outside the bounds of the {@code bytes} array
     *
     * @since  JDK1.1
     */
    public String(byte bytes[], int offset, int length, String charsetName)
            throws UnsupportedEncodingException {
        if (charsetName == null)
            throw new NullPointerException("charsetName");
        checkBounds(bytes, offset, length);
        this.value = StringCoding.decode(charsetName, bytes, offset, length);
    }

    /**
     * Constructs a new {@code String} by decoding the specified subarray of
     * bytes using the specified {@linkplain java.nio.charset.Charset charset}.
     * The length of the new {@code String} is a function of the charset, and
     * hence may not be equal to the length of the subarray.
     *
     * <p> This method always replaces malformed-input and unmappable-character
     * sequences with this charset's default replacement string.  The {@link
     * java.nio.charset.CharsetDecoder} class should be used when more control
     * over the decoding process is required.
     *
     * @param  bytes
     *         The bytes to be decoded into characters
     *
     * @param  offset
     *         The index of the first byte to decode
     *
     * @param  length
     *         The number of bytes to decode
     *
     * @param  charset
     *         The {@linkplain java.nio.charset.Charset charset} to be used to
     *         decode the {@code bytes}
     *
     * @throws  IndexOutOfBoundsException
     *          If the {@code offset} and {@code length} arguments index
     *          characters outside the bounds of the {@code bytes} array
     *
     * @since  1.6
     */
    public String(byte bytes[], int offset, int length, Charset charset) {
        if (charset == null)
            throw new NullPointerException("charset");
        checkBounds(bytes, offset, length);
        this.value =  StringCoding.decode(charset, bytes, offset, length);
    }

    /**
     * Constructs a new {@code String} by decoding the specified array of bytes
     * using the specified {@linkplain java.nio.charset.Charset charset}.  The
     * length of the new {@code String} is a function of the charset, and hence
     * may not be equal to the length of the byte array.
     *
     * <p> The behavior of this constructor when the given bytes are not valid
     * in the given charset is unspecified.  The {@link
     * java.nio.charset.CharsetDecoder} class should be used when more control
     * over the decoding process is required.
     *
     * @param  bytes
     *         The bytes to be decoded into characters
     *
     * @param  charsetName
     *         The name of a supported {@linkplain java.nio.charset.Charset
     *         charset}
     *
     * @throws  UnsupportedEncodingException
     *          If the named charset is not supported
     *
     * @since  JDK1.1
     */
    public String(byte bytes[], String charsetName)
            throws UnsupportedEncodingException {
        this(bytes, 0, bytes.length, charsetName);
    }

    /**
     * Constructs a new {@code String} by decoding the specified array of
     * bytes using the specified {@linkplain java.nio.charset.Charset charset}.
     * The length of the new {@code String} is a function of the charset, and
     * hence may not be equal to the length of the byte array.
     *
     * <p> This method always replaces malformed-input and unmappable-character
     * sequences with this charset's default replacement string.  The {@link
     * java.nio.charset.CharsetDecoder} class should be used when more control
     * over the decoding process is required.
     *
     * @param  bytes
     *         The bytes to be decoded into characters
     *
     * @param  charset
     *         The {@linkplain java.nio.charset.Charset charset} to be used to
     *         decode the {@code bytes}
     *
     * @since  1.6
     */
    public String(byte bytes[], Charset charset) {
        this(bytes, 0, bytes.length, charset);
    }

    /**
     * Constructs a new {@code String} by decoding the specified subarray of
     * bytes using the platform's default charset.  The length of the new
     * {@code String} is a function of the charset, and hence may not be equal
     * to the length of the subarray.
     *
     * <p> The behavior of this constructor when the given bytes are not valid
     * in the default charset is unspecified.  The {@link
     * java.nio.charset.CharsetDecoder} class should be used when more control
     * over the decoding process is required.
     *
     * @param  bytes
     *         The bytes to be decoded into characters
     *
     * @param  offset
     *         The index of the first byte to decode
     *
     * @param  length
     *         The number of bytes to decode
     *
     * @throws  IndexOutOfBoundsException
     *          If the {@code offset} and the {@code length} arguments index
     *          characters outside the bounds of the {@code bytes} array
     *
     * @since  JDK1.1
     */
    public String(byte bytes[], int offset, int length) {
        checkBounds(bytes, offset, length);
        this.value = StringCoding.decode(bytes, offset, length);
    }

    /**
     * Constructs a new {@code String} by decoding the specified array of bytes
     * using the platform's default charset.  The length of the new {@code
     * String} is a function of the charset, and hence may not be equal to the
     * length of the byte array.
     *
     * <p> The behavior of this constructor when the given bytes are not valid
     * in the default charset is unspecified.  The {@link
     * java.nio.charset.CharsetDecoder} class should be used when more control
     * over the decoding process is required.
     *
     * @param  bytes
     *         The bytes to be decoded into characters
     *
     * @since  JDK1.1
     */
    public String(byte bytes[]) {
        this(bytes, 0, bytes.length);
    }

    /**
     * Allocates a new string that contains the sequence of characters
     * currently contained in the string buffer argument. The contents of the
     * string buffer are copied; subsequent modification of the string buffer
     * does not affect the newly created string.
     *
     * @param  buffer
     *         A {@code StringBuffer}
     */
    public String(StringBuffer buffer) {
        synchronized(buffer) {
            this.value = Arrays.copyOf(buffer.getValue(), buffer.length());
        }
    }

    /**
     * Allocates a new string that contains the sequence of characters
     * currently contained in the string builder argument. The contents of the
     * string builder are copied; subsequent modification of the string builder
     * does not affect the newly created string.
     *
     * <p> This constructor is provided to ease migration to {@code
     * StringBuilder}. Obtaining a string from a string builder via the {@code
     * toString} method is likely to run faster and is generally preferred.
     *
     * @param   builder
     *          A {@code StringBuilder}
     *
     * @since  1.5
     */
    public String(StringBuilder builder) {
        this.value = Arrays.copyOf(builder.getValue(), builder.length());
    }

    /*
    * Package private constructor which shares value array for speed.
    * this constructor is always expected to be called with share==true.
    * a separate constructor is needed because we already have a public
    * String(char[]) constructor that makes a copy of the given char[].
    */
    String(char[] value, boolean share) {
        // assert share : "unshared not supported";
        this.value = value;
    }


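Because a String is immutable, strings with the same content can share one char[] array, which is why the copy constructor simply assigns this.value = original.value.
A String can also be built from code points; following that constructor requires some background on character encoding and on concepts such as BMP code points.
Strings created from a byte[] all obtain their char[] through StringCoding.decode.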
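As a rough illustration of the three constructor families, the sketch below uses only public constructors. The class and method names are hypothetical, and the sample code point U+1F600 and the UTF-8 bytes for U+4E2D are example inputs chosen here, not taken from the article.

```java
import java.nio.charset.StandardCharsets;

class StringConstructorDemo {
    // String(char[]) copies the array, so later changes to the source do not leak in.
    static String fromCharsThenMutate() {
        char[] source = {'a', 'b', 'c'};
        String s = new String(source);
        source[0] = 'z';
        return s;
    }

    // String(int[], int, int): U+1F600 lies outside the BMP, so it becomes two chars.
    static String fromCodePoints() {
        int[] codePoints = {'A', 0x1F600};
        return new String(codePoints, 0, codePoints.length);
    }

    // String(byte[], Charset): decodes the bytes with an explicit charset.
    static String fromUtf8Bytes() {
        byte[] bytes = {(byte) 0xE4, (byte) 0xB8, (byte) 0xAD}; // UTF-8 encoding of U+4E2D
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(fromCharsThenMutate());
        System.out.println(fromCodePoints().length());
        System.out.println(fromUtf8Bytes());
    }
}
```

Passing an explicit Charset avoids depending on the platform default, which is why the byte[] example does not use the single-argument constructor.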

Commonly used String methods

    public int length() {
        return value.length;
    }

    public boolean isEmpty() {
        return value.length == 0;
    }

    public char charAt(int index) {
        if ((index < 0) || (index >= value.length)) {
            throw new StringIndexOutOfBoundsException(index);
        }
        return value[index];
    }

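As the source shows, length() returns the number of chars, not the number of characters. String stores its content UTF-16 encoded in the char array, so rare characters such as those listed at http://www.qqxiuzi.cn/zh/hanzi-unicode-bianma.php?zfj=kzb need two chars (a surrogate pair) each. length() therefore does not necessarily equal the number of characters.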
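A minimal sketch of that effect, assuming U+20000 (a CJK Extension B ideograph) as the sample rare character. The class and member names are illustrative only.

```java
class StringLengthDemo {
    // One character (code point U+20000), stored in UTF-16 as a surrogate pair.
    static final String RARE = new String(Character.toChars(0x20000));

    // Number of chars (UTF-16 code units), which is what length() reports.
    static int charLength() {
        return RARE.length();
    }

    // charAt(0) returns only the first half of the pair.
    static boolean firstCharIsHighSurrogate() {
        return Character.isHighSurrogate(RARE.charAt(0));
    }

    public static void main(String[] args) {
        System.out.println(charLength());
        System.out.println(firstCharIsHighSurrogate());
    }
}
```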

Code point methods

    /**
     * Returns the character (Unicode code point) at the specified
     * index. The index refers to {@code char} values
     * (Unicode code units) and ranges from {@code 0} to
     * {@link #length()}{@code  - 1}.
     *
     * <p> If the {@code char} value specified at the given index
     * is in the high-surrogate range, the following index is less
     * than the length of this {@code String}, and the
     * {@code char} value at the following index is in the
     * low-surrogate range, then the supplementary code point
     * corresponding to this surrogate pair is returned. Otherwise,
     * the {@code char} value at the given index is returned.
     *
     * @param      index the index to the {@code char} values
     * @return     the code point value of the character at the
     *             {@code index}
     * @exception  IndexOutOfBoundsException  if the {@code index}
     *             argument is negative or not less than the length of this
     *             string.
     * @since      1.5
     */
    public int codePointAt(int index) {
        if ((index < 0) || (index >= value.length)) {
            throw new StringIndexOutOfBoundsException(index);
        }
        return Character.codePointAtImpl(value, index, value.length);
    }

    /**
     * Returns the character (Unicode code point) before the specified
     * index. The index refers to {@code char} values
     * (Unicode code units) and ranges from {@code 1} to {@link
     * CharSequence#length() length}.
     *
     * <p> If the {@code char} value at {@code (index - 1)}
     * is in the low-surrogate range, {@code (index - 2)} is not
     * negative, and the {@code char} value at {@code (index -
     * 2)} is in the high-surrogate range, then the
     * supplementary code point value of the surrogate pair is
     * returned. If the {@code char} value at {@code index -
     * 1} is an unpaired low-surrogate or a high-surrogate, the
     * surrogate value is returned.
     *
     * @param     index the index following the code point that should be returned
     * @return    the Unicode code point value before the given index.
     * @exception IndexOutOfBoundsException if the {@code index}
     *            argument is less than 1 or greater than the length
     *            of this string.
     * @since     1.5
     */
    public int codePointBefore(int index) {
        int i = index - 1;
        if ((i < 0) || (i >= value.length)) {
            throw new StringIndexOutOfBoundsException(index);
        }
        return Character.codePointBeforeImpl(value, index, 0);
    }

    /**
     * Returns the number of Unicode code points in the specified text
     * range of this {@code String}. The text range begins at the
     * specified {@code beginIndex} and extends to the
     * {@code char} at index {@code endIndex - 1}. Thus the
     * length (in {@code char}s) of the text range is
     * {@code endIndex-beginIndex}. Unpaired surrogates within
     * the text range count as one code point each.
     *
     * @param beginIndex the index to the first {@code char} of
     * the text range.
     * @param endIndex the index after the last {@code char} of
     * the text range.
     * @return the number of Unicode code points in the specified text
     * range
     * @exception IndexOutOfBoundsException if the
     * {@code beginIndex} is negative, or {@code endIndex}
     * is larger than the length of this {@code String}, or
     * {@code beginIndex} is larger than {@code endIndex}.
     * @since  1.5
     */
    public int codePointCount(int beginIndex, int endIndex) {
        if (beginIndex < 0 || endIndex > value.length || beginIndex > endIndex) {
            throw new IndexOutOfBoundsException();
        }
        return Character.codePointCountImpl(value, beginIndex, endIndex - beginIndex);
    }

    /**
     * Returns the index within this {@code String} that is
     * offset from the given {@code index} by
     * {@code codePointOffset} code points. Unpaired surrogates
     * within the text range given by {@code index} and
     * {@code codePointOffset} count as one code point each.
     *
     * @param index the index to be offset
     * @param codePointOffset the offset in code points
     * @return the index within this {@code String}
     * @exception IndexOutOfBoundsException if {@code index}
     *   is negative or larger then the length of this
     *   {@code String}, or if {@code codePointOffset} is positive
     *   and the substring starting with {@code index} has fewer
     *   than {@code codePointOffset} code points,
     *   or if {@code codePointOffset} is negative and the substring
     *   before {@code index} has fewer than the absolute value
     *   of {@code codePointOffset} code points.
     * @since 1.5
     */
    public int offsetByCodePoints(int index, int codePointOffset) {
        if (index < 0 || index > value.length) {
            throw new IndexOutOfBoundsException();
        }
        return Character.offsetByCodePointsImpl(value, 0, value.length,
                index, codePointOffset);
    }


As noted earlier, length() does not give the number of characters in a String. codePointCount(0, str.length()) returns the number of code points instead.
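The difference between counting chars and counting code points might look like this in practice. The sample text (the letter a, the supplementary character U+20000, the letter b) and all names are assumptions made for the example.

```java
class CodePointCountDemo {
    // Three characters, four chars: U+20000 occupies a surrogate pair.
    static final String TEXT = "a" + new String(Character.toChars(0x20000)) + "b";

    static int chars() {
        return TEXT.length();
    }

    static int codePoints() {
        return TEXT.codePointCount(0, TEXT.length());
    }

    // char index reached after skipping two code points from index 0.
    static int indexAfterTwoCodePoints() {
        return TEXT.offsetByCodePoints(0, 2);
    }

    // Reads the whole surrogate pair starting at char index 1.
    static int secondCodePoint() {
        return TEXT.codePointAt(1);
    }

    public static void main(String[] args) {
        System.out.println(chars() + " chars, " + codePoints() + " code points");
        System.out.println(Integer.toHexString(secondCodePoint()));
    }
}
```

offsetByCodePoints is the piece that lets code walk a string one code point at a time without splitting a surrogate pair.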

copy chars

    void getChars(char dst[], int dstBegin) {
        System.arraycopy(value, 0, dst, dstBegin, value.length);
    }

    public void getChars(int srcBegin, int srcEnd, char dst[], int dstBegin) {
        if (srcBegin < 0) {
            throw new StringIndexOutOfBoundsException(srcBegin);
        }
        if (srcEnd > value.length) {
            throw new StringIndexOutOfBoundsException(srcEnd);
        }
        if (srcBegin > srcEnd) {
            throw new StringIndexOutOfBoundsException(srcEnd - srcBegin);
        }
        System.arraycopy(value, srcBegin, dst, dstBegin, srcEnd - srcBegin);
    }

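getChars copies out of the String's value array. The first overload has default (package-private) access, so it can only be used inside the java.lang package. Both overloads are implemented with System.arraycopy.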
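Only the public four-argument overload is callable from application code. A small sketch of how it fills part of a destination array follows; the class name, the sample string, and the '-' filler are invented for the example.

```java
import java.util.Arrays;

class GetCharsDemo {
    // Copies chars 1..3 ("bcd") of the source into dst starting at dst[1].
    static char[] copyMiddle() {
        String s = "abcdef";
        char[] dst = new char[5];
        Arrays.fill(dst, '-');
        s.getChars(1, 4, dst, 1); // srcEnd is exclusive
        return dst;
    }

    public static void main(String[] args) {
        System.out.println(new String(copyMiddle()));
    }
}
```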

encode for string

    public byte[] getBytes(String charsetName)
            throws UnsupportedEncodingException {
        if (charsetName == null) throw new NullPointerException();
        return StringCoding.encode(charsetName, value, 0, value.length);
    }

    public byte[] getBytes(Charset charset) {
        if (charset == null) throw new NullPointerException();
        return StringCoding.encode(charset, value, 0, value.length);
    }

    public byte[] getBytes() {
        return StringCoding.encode(value, 0, value.length);
    }


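String offers three getBytes overloads. The first two take the charset as a parameter (by name or as a Charset object); the third encodes with the platform's default charset.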
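To show how the chosen charset changes the result, here is a sketch using the Charset overload with constants from java.nio.charset.StandardCharsets. The class name and the sample characters U+4E2D and U+6587 are assumptions for illustration.

```java
import java.nio.charset.StandardCharsets;

class GetBytesDemo {
    // U+4E2D takes three bytes in UTF-8.
    static int utf8Length() {
        return "\u4E2D".getBytes(StandardCharsets.UTF_8).length;
    }

    // The same character takes two bytes in UTF-16BE (no byte order mark).
    static int utf16beLength() {
        return "\u4E2D".getBytes(StandardCharsets.UTF_16BE).length;
    }

    // Encoding and decoding with the same charset restores the original text.
    static boolean roundTrip() {
        String text = "\u4E2D\u6587";
        byte[] encoded = text.getBytes(StandardCharsets.UTF_8);
        return new String(encoded, StandardCharsets.UTF_8).equals(text);
    }

    public static void main(String[] args) {
        System.out.println(utf8Length() + " " + utf16beLength() + " " + roundTrip());
    }
}
```

The no-argument getBytes() is left out on purpose, since its output depends on the machine's default charset.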

Analysis of the equals method

    // Compares the two char arrays element by element
    public boolean equals(Object anObject) {
        if (this == anObject) {
            return true;
        }
        if (anObject instanceof String) {
            String anotherString = (String)anObject;
            int n = value.length;
            if (n == anotherString.value.length) {
                char v1[] = value;
                char v2[] = anotherString.value;
                int i = 0;
                while (n-- != 0) {
                    if (v1[i] != v2[i])
                        return false;
                    i++;
                }
                return true;
            }
        }
        return false;
    }

    // Ends up synchronizing on the argument, because StringBuffer is a synchronized class
    public boolean contentEquals(StringBuffer sb) {
        return contentEquals((CharSequence)sb);
    }

    // Does no locking itself (the caller locks a StringBuffer; StringBuilder is
    // unsynchronized). Compares the char arrays element by element
    private boolean nonSyncContentEquals(AbstractStringBuilder sb) {
        char v1[] = value;
        char v2[] = sb.getValue();
        int n = v1.length;
        if (n != sb.length()) {
            return false;
        }
        for (int i = 0; i < n; i++) {
            if (v1[i] != v2[i]) {
                return false;
            }
        }
        return true;
    }

    // A CharSequence is read much like a char[]; characters are compared index by index
    public boolean contentEquals(CharSequence cs) {
        // Argument is a StringBuffer, StringBuilder
        if (cs instanceof AbstractStringBuilder) {
            if (cs instanceof StringBuffer) {
                synchronized(cs) {
                   return nonSyncContentEquals((AbstractStringBuilder)cs);
                }
            } else {
                return nonSyncContentEquals((AbstractStringBuilder)cs);
            }
        }
        // Argument is a String
        if (cs instanceof String) {
            return equals(cs);
        }
        // Argument is a generic CharSequence
        char v1[] = value;
        int n = v1.length;
        if (n != cs.length()) {
            return false;
        }
        for (int i = 0; i < n; i++) {
            if (v1[i] != cs.charAt(i)) {
                return false;
            }
        }
        return true;
    }

    // Unlike the methods above, which compare element by element directly,
    // this one delegates to regionMatches with ignoreCase set to true
    public boolean equalsIgnoreCase(String anotherString) {
        return (this == anotherString) ? true
                : (anotherString != null)
                && (anotherString.value.length == value.length)
                && regionMatches(true, 0, anotherString, 0, value.length);
    }

    // Compares the given regions of the two strings element by element
    public boolean regionMatches(boolean ignoreCase, int toffset,
            String other, int ooffset, int len) {
        char ta[] = value;
        int to = toffset;
        char pa[] = other.value;
        int po = ooffset;
        // Note: toffset, ooffset, or len might be near -1>>>1.
        if ((ooffset < 0) || (toffset < 0)
                || (toffset > (long)value.length - len)
                || (ooffset > (long)other.value.length - len)) {
            return false;
        }
        while (len-- > 0) {
            char c1 = ta[to++];
            char c2 = pa[po++];
            if (c1 == c2) {
                continue;
            }
            if (ignoreCase) {
                // If characters don't match but case may be ignored,
                // try converting both characters to uppercase.
                // If the results match, then the comparison scan should
                // continue.
                char u1 = Character.toUpperCase(c1);
                char u2 = Character.toUpperCase(c2);
                if (u1 == u2) {
                    continue;
                }
                // Unfortunately, conversion to uppercase does not work properly
                // for the Georgian alphabet, which has strange rules about case
                // conversion.  So we need to make one last check before
                // exiting.
                if (Character.toLowerCase(u1) == Character.toLowerCase(u2)) {
                    continue;
                }
            }
            return false;
        }
        return true;
    }

    public boolean startsWith(String prefix, int toffset) {
        char ta[] = value;
        int to = toffset;
        char pa[] = prefix.value;
        int po = 0;
        int pc = prefix.value.length;
        // Note: toffset might be near -1>>>1.
        if ((toffset < 0) || (toffset > value.length - pc)) {
            return false;
        }
        while (--pc >= 0) {
            if (ta[to++] != pa[po++]) {
                return false;
            }
        }
        return true;
    }

    public boolean startsWith(String prefix) {
        return startsWith(prefix, 0);
    }

    public boolean endsWith(String suffix) {
        return startsWith(suffix, value.length - suffix.value.length);
    }

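The practical difference between these comparison methods can be sketched as follows. The class name and sample strings are hypothetical; only public String methods are called.

```java
class StringEqualityDemo {
    // equals requires the argument to be a String, so a StringBuilder never matches.
    static boolean equalsBuilder() {
        Object builder = new StringBuilder("abc");
        return "abc".equals(builder);
    }

    // contentEquals compares characters against any CharSequence.
    static boolean contentEqualsBuilder() {
        return "abc".contentEquals(new StringBuilder("abc"));
    }

    // equalsIgnoreCase goes through regionMatches with ignoreCase = true.
    static boolean ignoreCase() {
        return "Hello".equalsIgnoreCase("hELLO");
    }

    // Compares "World" (offset 6, length 5) with "WORLD", ignoring case.
    static boolean region() {
        return "Hello World".regionMatches(true, 6, "WORLD", 0, 5);
    }

    // endsWith is startsWith applied at offset length - suffix.length.
    static boolean suffix() {
        return "Hello".endsWith("llo") && "Hello".startsWith("llo", 2);
    }

    public static void main(String[] args) {
        System.out.println(equalsBuilder() + " " + contentEqualsBuilder());
        System.out.println(ignoreCase() + " " + region() + " " + suffix());
    }
}
```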

Analysis of compareTo (ordering comparison)

    // Walks both arrays index by index and orders by char value
    public int compareTo(String anotherString) {
        int len1 = value.length;
        int len2 = anotherString.value.length;
        int lim = Math.min(len1, len2);
        char v1[] = value;
        char v2[] = anotherString.value;

        int k = 0;
        while (k < lim) {
            char c1 = v1[k];
            char c2 = v2[k];
            if (c1 != c2) {
                return c1 - c2;
            }
            k++;
        }
        return len1 - len2;
    }

    // Static Comparator constant that ignores case
    public static final Comparator<String> CASE_INSENSITIVE_ORDER
                                         = new CaseInsensitiveComparator();
    // Private static nested class: the case-insensitive String comparator
    private static class CaseInsensitiveComparator
            implements Comparator<String>, java.io.Serializable {
        // use serialVersionUID from JDK 1.2.2 for interoperability
        private static final long serialVersionUID = 8575799808933029326L;

        public int compare(String s1, String s2) {
            int n1 = s1.length();
            int n2 = s2.length();
            int min = Math.min(n1, n2);
            for (int i = 0; i < min; i++) {
                char c1 = s1.charAt(i);
                char c2 = s2.charAt(i);
                if (c1 != c2) {
                    c1 = Character.toUpperCase(c1);
                    c2 = Character.toUpperCase(c2);
                    if (c1 != c2) {
                        c1 = Character.toLowerCase(c1);
                        c2 = Character.toLowerCase(c2);
                        if (c1 != c2) {
                            // No overflow because of numeric promotion
                            return c1 - c2;
                        }
                    }
                }
            }
            return n1 - n2;
        }

        /** Replaces the de-serialized object. */
        private Object readResolve() { return CASE_INSENSITIVE_ORDER; }
    }

    // Case-insensitive comparison
    public int compareToIgnoreCase(String str) {
        return CASE_INSENSITIVE_ORDER.compare(this, str);
    }


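The private nested class CaseInsensitiveComparator defines readResolve() so that deserialization returns the existing CASE_INSENSITIVE_ORDER instance, which keeps the comparator a singleton even after a serialization round trip.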
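Both the return values of compareTo and the readResolve behaviour can be observed from outside the class. The sketch below is one way to do that; the class and method names are made up, and the round trip uses an in-memory byte array rather than a file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

class CompareDemo {
    // First difference is at index 2: 'p' (112) minus 'r' (114).
    static int differingChar() {
        return "apple".compareTo("apricot");
    }

    // One string is a prefix of the other, so the lengths decide: 3 - 5.
    static int prefix() {
        return "abc".compareTo("abcde");
    }

    static int ignoreCase() {
        return "ABC".compareToIgnoreCase("abc");
    }

    // Serializes CASE_INSENSITIVE_ORDER and checks that reading it back
    // yields the very same instance, as readResolve() promises.
    static boolean singletonSurvivesSerialization() {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(String.CASE_INSENSITIVE_ORDER);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                return in.readObject() == String.CASE_INSENSITIVE_ORDER;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(differingChar() + " " + prefix() + " " + ignoreCase());
        System.out.println(singletonSurvivesSerialization());
    }
}
```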

String overrides hashCode

    /**
     * s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
     */
    public int hashCode() {
        int h = hash;
        if (h == 0 && value.length > 0) {
            char val[] = value;

            for (int i = 0; i < value.length; i++) {
                h = 31 * h + val[i];
            }
            hash = h;
        }
        return h;
    }

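String overrides hashCode using the prime 31 as the multiplier and caches the result in the hash field; a cached value of 0 means the hash has not been computed yet.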
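The polynomial can be recomputed by hand and compared with the library result. This is a sketch with a hypothetical helper, manualHash, that mirrors the loop in the source without the caching.

```java
class HashCodeDemo {
    // Evaluates s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1] with int overflow,
    // using the same Horner-style loop as String.hashCode().
    static int manualHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        System.out.println(manualHash("hello") == "hello".hashCode());
        System.out.println("ab".hashCode()); // 'a' * 31 + 'b' = 97 * 31 + 98
    }
}
```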

indexOf and other methods that look up an index by value

  1     public int indexOf(int ch) {
  2         return indexOf(ch, 0);
  3     }
  4 
  5     /**
  6      * Returns the index within this string of the first occurrence of the
  7      * specified character, starting the search at the specified index.
  8      * <p>
  9      * If a character with value {@code ch} occurs in the
 10      * character sequence represented by this {@code String}
 11      * object at an index no smaller than {@code fromIndex}, then
 12      * the index of the first such occurrence is returned. For values
 13      * of {@code ch} in the range from 0 to 0xFFFF (inclusive),
 14      * this is the smallest value <i>k</i> such that:
 15      * <blockquote><pre>
 16      * (this.charAt(<i>k</i>) == ch) {@code &&} (<i>k</i> &gt;= fromIndex)
 17      * </pre></blockquote>
 18      * is true. For other values of {@code ch}, it is the
 19      * smallest value <i>k</i> such that:
 20      * <blockquote><pre>
 21      * (this.codePointAt(<i>k</i>) == ch) {@code &&} (<i>k</i> &gt;= fromIndex)
 22      * </pre></blockquote>
 23      * is true. In either case, if no such character occurs in this
 24      * string at or after position {@code fromIndex}, then
 25      * {@code -1} is returned.
 26      *
 27      * <p>
 28      * There is no restriction on the value of {@code fromIndex}. If it
 29      * is negative, it has the same effect as if it were zero: this entire
 30      * string may be searched. If it is greater than the length of this
 31      * string, it has the same effect as if it were equal to the length of
 32      * this string: {@code -1} is returned.
 33      *
 34      * <p>All indices are specified in {@code char} values
 35      * (Unicode code units).
 36      *
 37      * @param   ch          a character (Unicode code point).
 38      * @param   fromIndex   the index to start the search from.
 39      * @return  the index of the first occurrence of the character in the
 40      *          character sequence represented by this object that is greater
 41      *          than or equal to {@code fromIndex}, or {@code -1}
 42      *          if the character does not occur.
 43      */
 44     public int indexOf(int ch, int fromIndex) {
 45         final int max = value.length;
 46         if (fromIndex < 0) {
 47             fromIndex = 0;
 48         } else if (fromIndex >= max) {
 49             // Note: fromIndex might be near -1>>>1.
 50             return -1;
 51         }
 52 
 53         if (ch < Character.MIN_SUPPLEMENTARY_CODE_POINT) {
 54             // handle most cases here (ch is a BMP code point or a
 55             // negative value (invalid code point))
 56             final char[] value = this.value;
 57             for (int i = fromIndex; i < max; i++) {
 58                 if (value[i] == ch) {
 59                     return i;
 60                 }
 61             }
 62             return -1;
 63         } else {
 64             return indexOfSupplementary(ch, fromIndex);
 65         }
 66     }
 67 
 68     /**
 69      * Handles (rare) calls of indexOf with a supplementary character.
 70      */
 71     private int indexOfSupplementary(int ch, int fromIndex) {
 72         if (Character.isValidCodePoint(ch)) {
 73             final char[] value = this.value;
 74             final char hi = Character.highSurrogate(ch);
 75             final char lo = Character.lowSurrogate(ch);
 76             final int max = value.length - 1;
 77             for (int i = fromIndex; i < max; i++) {
 78                 if (value[i] == hi && value[i + 1] == lo) {
 79                     return i;
 80                 }
 81             }
 82         }
 83         return -1;
 84     }
 85 
 86     /**
 87      * Returns the index within this string of the last occurrence of
 88      * the specified character. For values of {@code ch} in the
 89      * range from 0 to 0xFFFF (inclusive), the index (in Unicode code
 90      * units) returned is the largest value <i>k</i> such that:
 91      * <blockquote><pre>
 92      * this.charAt(<i>k</i>) == ch
 93      * </pre></blockquote>
 94      * is true. For other values of {@code ch}, it is the
 95      * largest value <i>k</i> such that:
 96      * <blockquote><pre>
 97      * this.codePointAt(<i>k</i>) == ch
 98      * </pre></blockquote>
 99      * is true.  In either case, if no such character occurs in this
100      * string, then {@code -1} is returned.  The
101      * {@code String} is searched backwards starting at the last
102      * character.
103      *
104      * @param   ch   a character (Unicode code point).
105      * @return  the index of the last occurrence of the character in the
106      *          character sequence represented by this object, or
107      *          {@code -1} if the character does not occur.
108      */
109     public int lastIndexOf(int ch) {
110         return lastIndexOf(ch, value.length - 1);
111     }
112 
113     /**
114      * Returns the index within this string of the last occurrence of
115      * the specified character, searching backward starting at the
116      * specified index. For values of {@code ch} in the range
117      * from 0 to 0xFFFF (inclusive), the index returned is the largest
118      * value <i>k</i> such that:
119      * <blockquote><pre>
120      * (this.charAt(<i>k</i>) == ch) {@code &&} (<i>k</i> &lt;= fromIndex)
121      * </pre></blockquote>
122      * is true. For other values of {@code ch}, it is the
123      * largest value <i>k</i> such that:
124      * <blockquote><pre>
125      * (this.codePointAt(<i>k</i>) == ch) {@code &&} (<i>k</i> &lt;= fromIndex)
126      * </pre></blockquote>
127      * is true. In either case, if no such character occurs in this
128      * string at or before position {@code fromIndex}, then
129      * {@code -1} is returned.
130      *
131      * <p>All indices are specified in {@code char} values
132      * (Unicode code units).
133      *
134      * @param   ch          a character (Unicode code point).
135      * @param   fromIndex   the index to start the search from. There is no
136      *          restriction on the value of {@code fromIndex}. If it is
137      *          greater than or equal to the length of this string, it has
138      *          the same effect as if it were equal to one less than the
139      *          length of this string: this entire string may be searched.
140      *          If it is negative, it has the same effect as if it were -1:
141      *          -1 is returned.
142      * @return  the index of the last occurrence of the character in the
143      *          character sequence represented by this object that is less
144      *          than or equal to {@code fromIndex}, or {@code -1}
145      *          if the character does not occur before that point.
146      */
147     public int lastIndexOf(int ch, int fromIndex) {
148         if (ch < Character.MIN_SUPPLEMENTARY_CODE_POINT) {
149             // handle most cases here (ch is a BMP code point or a
150             // negative value (invalid code point))
151             final char[] value = this.value;
152             int i = Math.min(fromIndex, value.length - 1);
153             for (; i >= 0; i--) {
154                 if (value[i] == ch) {
155                     return i;
156                 }
157             }
158             return -1;
159         } else {
160             return lastIndexOfSupplementary(ch, fromIndex);
161         }
162     }
163 
164     /**
165      * Handles (rare) calls of lastIndexOf with a supplementary character.
166      */
167     private int lastIndexOfSupplementary(int ch, int fromIndex) {
168         if (Character.isValidCodePoint(ch)) {
169             final char[] value = this.value;
170             char hi = Character.highSurrogate(ch);
171             char lo = Character.lowSurrogate(ch);
172             int i = Math.min(fromIndex, value.length - 2);
173             for (; i >= 0; i--) {
174                 if (value[i] == hi && value[i + 1] == lo) {
175                     return i;
176                 }
177             }
178         }
179         return -1;
180     }
181 
182     /**
183      * Returns the index within this string of the first occurrence of the
184      * specified substring.
185      *
186      * <p>The returned index is the smallest value <i>k</i> for which:
187      * <blockquote><pre>
188      * this.startsWith(str, <i>k</i>)
189      * </pre></blockquote>
190      * If no such value of <i>k</i> exists, then {@code -1} is returned.
191      *
192      * @param   str   the substring to search for.
193      * @return  the index of the first occurrence of the specified substring,
194      *          or {@code -1} if there is no such occurrence.
195      */
196     public int indexOf(String str) {
197         return indexOf(str, 0);
198     }
199 
200     /**
201      * Returns the index within this string of the first occurrence of the
202      * specified substring, starting at the specified index.
203      *
204      * <p>The returned index is the smallest value <i>k</i> for which:
205      * <blockquote><pre>
206      * <i>k</i> &gt;= fromIndex {@code &&} this.startsWith(str, <i>k</i>)
207      * </pre></blockquote>
208      * If no such value of <i>k</i> exists, then {@code -1} is returned.
209      *
210      * @param   str         the substring to search for.
211      * @param   fromIndex   the index from which to start the search.
212      * @return  the index of the first occurrence of the specified substring,
213      *          starting at the specified index,
214      *          or {@code -1} if there is no such occurrence.
215      */
216     public int indexOf(String str, int fromIndex) {
217         return indexOf(value, 0, value.length,
218                 str.value, 0, str.value.length, fromIndex);
219     }
220 
221     /**
222      * Code shared by String and AbstractStringBuilder to do searches. The
223      * source is the character array being searched, and the target
224      * is the string being searched for.
225      *
226      * @param   source       the characters being searched.
227      * @param   sourceOffset offset of the source string.
228      * @param   sourceCount  count of the source string.
229      * @param   target       the characters being searched for.
230      * @param   fromIndex    the index to begin searching from.
231      */
232     static int indexOf(char[] source, int sourceOffset, int sourceCount,
233             String target, int fromIndex) {
234         return indexOf(source, sourceOffset, sourceCount,
235                        target.value, 0, target.value.length,
236                        fromIndex);
237     }
238 
239     /**
240      * Code shared by String and StringBuffer to do searches. The
241      * source is the character array being searched, and the target
242      * is the string being searched for.
243      *
244      * @param   source       the characters being searched.
245      * @param   sourceOffset offset of the source string.
246      * @param   sourceCount  count of the source string.
247      * @param   target       the characters being searched for.
248      * @param   targetOffset offset of the target string.
249      * @param   targetCount  count of the target string.
250      * @param   fromIndex    the index to begin searching from.
251      */
252     static int indexOf(char[] source, int sourceOffset, int sourceCount,
253             char[] target, int targetOffset, int targetCount,
254             int fromIndex) {
255         if (fromIndex >= sourceCount) {
256             return (targetCount == 0 ? sourceCount : -1);
257         }
258         if (fromIndex < 0) {
259             fromIndex = 0;
260         }
261         if (targetCount == 0) {
262             return fromIndex;
263         }
264 
265         char first = target[targetOffset];
266         int max = sourceOffset + (sourceCount - targetCount);
267 
268         for (int i = sourceOffset + fromIndex; i <= max; i++) {
269             /* Look for first character. */
270             if (source[i] != first) {
271                 while (++i <= max && source[i] != first);
272             }
273 
274             /* Found first character, now look at the rest of v2 */
275             if (i <= max) {
276                 int j = i + 1;
277                 int end = j + targetCount - 1;
278                 for (int k = targetOffset + 1; j < end && source[j]
279                         == target[k]; j++, k++);
280 
281                 if (j == end) {
282                     /* Found whole string. */
283                     return i - sourceOffset;
284                 }
285             }
286         }
287         return -1;
288     }
289 
290     /**
291      * Returns the index within this string of the last occurrence of the
292      * specified substring.  The last occurrence of the empty string ""
293      * is considered to occur at the index value {@code this.length()}.
294      *
295      * <p>The returned index is the largest value <i>k</i> for which:
296      * <blockquote><pre>
297      * this.startsWith(str, <i>k</i>)
298      * </pre></blockquote>
299      * If no such value of <i>k</i> exists, then {@code -1} is returned.
300      *
301      * @param   str   the substring to search for.
302      * @return  the index of the last occurrence of the specified substring,
303      *          or {@code -1} if there is no such occurrence.
304      */
305     public int lastIndexOf(String str) {
306         return lastIndexOf(str, value.length);
307     }
308 
309     /**
310      * Returns the index within this string of the last occurrence of the
311      * specified substring, searching backward starting at the specified index.
312      *
313      * <p>The returned index is the largest value <i>k</i> for which:
314      * <blockquote><pre>
315      * <i>k</i> {@code <=} fromIndex {@code &&} this.startsWith(str, <i>k</i>)
316      * </pre></blockquote>
317      * If no such value of <i>k</i> exists, then {@code -1} is returned.
318      *
319      * @param   str         the substring to search for.
320      * @param   fromIndex   the index to start the search from.
321      * @return  the index of the last occurrence of the specified substring,
322      *          searching backward from the specified index,
323      *          or {@code -1} if there is no such occurrence.
324      */
325     public int lastIndexOf(String str, int fromIndex) {
326         return lastIndexOf(value, 0, value.length,
327                 str.value, 0, str.value.length, fromIndex);
328     }
329 
330     /**
331      * Code shared by String and AbstractStringBuilder to do searches. The
332      * source is the character array being searched, and the target
333      * is the string being searched for.
334      *
335      * @param   source       the characters being searched.
336      * @param   sourceOffset offset of the source string.
337      * @param   sourceCount  count of the source string.
338      * @param   target       the characters being searched for.
339      * @param   fromIndex    the index to begin searching from.
340      */
341     static int lastIndexOf(char[] source, int sourceOffset, int sourceCount,
342             String target, int fromIndex) {
343         return lastIndexOf(source, sourceOffset, sourceCount,
344                        target.value, 0, target.value.length,
345                        fromIndex);
346     }
347 
348     /**
349      * Code shared by String and StringBuffer to do searches. The
350      * source is the character array being searched, and the target
351      * is the string being searched for.
352      *
353      * @param   source       the characters being searched.
354      * @param   sourceOffset offset of the source string.
355      * @param   sourceCount  count of the source string.
356      * @param   target       the characters being searched for.
357      * @param   targetOffset offset of the target string.
358      * @param   targetCount  count of the target string.
359      * @param   fromIndex    the index to begin searching from.
360      */
361     static int lastIndexOf(char[] source, int sourceOffset, int sourceCount,
362             char[] target, int targetOffset, int targetCount,
363             int fromIndex) {
364         /*
365          * Check arguments; return immediately where possible. For
366          * consistency, don't check for null str.
367          */
368         int rightIndex = sourceCount - targetCount;
369         if (fromIndex < 0) {
370             return -1;
371         }
372         if (fromIndex > rightIndex) {
373             fromIndex = rightIndex;
374         }
375         /* Empty string always matches. */
376         if (targetCount == 0) {
377             return fromIndex;
378         }
379 
380         int strLastIndex = targetOffset + targetCount - 1;
381         char strLastChar = target[strLastIndex];
382         int min = sourceOffset + targetCount - 1;
383         int i = min + fromIndex;
384 
385     startSearchForLastChar:
386         while (true) {
387             while (i >= min && source[i] != strLastChar) {
388                 i--;
389             }
390             if (i < min) {
391                 return -1;
392             }
393             int j = i - 1;
394             int start = j - (targetCount - 1);
395             int k = strLastIndex - 1;
396 
397             while (j > start) {
398                 if (source[j--] != target[k--]) {
399                     i--;
400                     continue startSearchForLastChar;
401                 }
402             }
403             return start - sourceOffset + 1;
404         }
405     }

indexOf
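The boundary handling in the listing above is easier to see with a few concrete calls. The following is a minimal sketch, not JDK code: `IndexOfDemo` and `naiveIndexOf` are hypothetical names introduced here, and `naiveIndexOf` is a simplified restatement of the `char[]` search that drops the `sourceOffset`/`targetOffset` parameters and assumes non-null arguments.

```java
public class IndexOfDemo {

    // Hypothetical helper: same early-exit rules as the shared indexOf
    // in the listing, then a plain left-to-right comparison.
    static int naiveIndexOf(String source, String target, int fromIndex) {
        int n = source.length();
        int m = target.length();
        if (fromIndex >= n) {
            // An empty target still "matches" at the end of the source.
            return m == 0 ? n : -1;
        }
        if (fromIndex < 0) {
            fromIndex = 0; // negative start behaves like 0
        }
        if (m == 0) {
            return fromIndex;
        }
        int max = n - m; // last index where the target can still fit
        for (int i = fromIndex; i <= max; i++) {
            int j = 0;
            while (j < m && source.charAt(i + j) == target.charAt(j)) {
                j++;
            }
            if (j == m) {
                return i; // whole target matched
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String s = "hello world";

        // fromIndex is clamped rather than rejected.
        System.out.println(s.indexOf('o', -5));      // 4
        System.out.println(s.indexOf('o', 100));     // -1
        System.out.println(s.lastIndexOf('o', 100)); // 7
        System.out.println(s.lastIndexOf('o', -1));  // -1

        // The empty string is found at the end of the string.
        System.out.println(s.lastIndexOf(""));       // 11

        // A supplementary code point occupies two char positions,
        // so the index of 'b' is 3, not 2.
        String t = "a" + new String(Character.toChars(0x1F600)) + "b";
        System.out.println(t.indexOf(0x1F600));      // 1
        System.out.println(t.indexOf('b'));          // 3

        System.out.println(naiveIndexOf(s, "world", 0)); // 6
    }
}
```

The JDK version first scans for the target's first character and only then compares the rest. The sketch skips that step for brevity but should return the same indices.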

String operations that return a new String

  1 /**
  2      * Returns a string that is a substring of this string. The
  3      * substring begins with the character at the specified index and
  4      * extends to the end of this string. <p>
  5      * Examples:
  6      * <blockquote><pre>
  7      * "unhappy".substring(2) returns "happy"
  8      * "Harbison".substring(3) returns "bison"
  9      * "emptiness".substring(9) returns "" (an empty string)
 10      * </pre></blockquote>
 11      *
 12      * @param      beginIndex   the beginning index, inclusive.
 13      * @return     the specified substring.
 14      * @exception  IndexOutOfBoundsException  if
 15      *             {@code beginIndex} is negative or larger than the
 16      *             length of this {@code String} object.
 17      */
 18     public String substring(int beginIndex) {
 19         if (beginIndex < 0) {
 20             throw new StringIndexOutOfBoundsException(beginIndex);
 21         }
 22         int subLen = value.length - beginIndex;
 23         if (subLen < 0) {
 24             throw new StringIndexOutOfBoundsException(subLen);
 25         }
 26         return (beginIndex == 0) ? this : new String(value, beginIndex, subLen);
 27     }
 28 
 29     /**
 30      * Returns a string that is a substring of this string. The
 31      * substring begins at the specified {@code beginIndex} and
 32      * extends to the character at index {@code endIndex - 1}.
 33      * Thus the length of the substring is {@code endIndex-beginIndex}.
 34      * <p>
 35      * Examples:
 36      * <blockquote><pre>
 37      * "hamburger".substring(4, 8) returns "urge"
 38      * "smiles".substring(1, 5) returns "mile"
 39      * </pre></blockquote>
 40      *
 41      * @param      beginIndex   the beginning index, inclusive.
 42      * @param      endIndex     the ending index, exclusive.
 43      * @return     the specified substring.
 44      * @exception  IndexOutOfBoundsException  if the
 45      *             {@code beginIndex} is negative, or
 46      *             {@code endIndex} is larger than the length of
 47      *             this {@code String} object, or
 48      *             {@code beginIndex} is larger than
 49      *             {@code endIndex}.
 50      */
 51     public String substring(int beginIndex, int endIndex) {
 52         if (beginIndex < 0) {
 53             throw new StringIndexOutOfBoundsException(beginIndex);
 54         }
 55         if (endIndex > value.length) {
 56             throw new StringIndexOutOfBoundsException(endIndex);
 57         }
 58         int subLen = endIndex - beginIndex;
 59         if (subLen < 0) {
 60             throw new StringIndexOutOfBoundsException(subLen);
 61         }
 62         return ((beginIndex == 0) && (endIndex == value.length)) ? this
 63                 : new String(value, beginIndex, subLen);
 64     }
 65 
 66     /**
 67      * Returns a character sequence that is a subsequence of this sequence.
 68      *
 69      * <p> An invocation of this method of the form
 70      *
 71      * <blockquote><pre>
 72      * str.subSequence(begin,&nbsp;end)</pre></blockquote>
 73      *
 74      * behaves in exactly the same way as the invocation
 75      *
 76      * <blockquote><pre>
 77      * str.substring(begin,&nbsp;end)</pre></blockquote>
 78      *
 79      * @apiNote
 80      * This method is defined so that the {@code String} class can implement
 81      * the {@link CharSequence} interface.
 82      *
 83      * @param   beginIndex   the begin index, inclusive.
 84      * @param   endIndex     the end index, exclusive.
 85      * @return  the specified subsequence.
 86      *
 87      * @throws  IndexOutOfBoundsException
 88      *          if {@code beginIndex} or {@code endIndex} is negative,
 89      *          if {@code endIndex} is greater than {@code length()},
 90      *          or if {@code beginIndex} is greater than {@code endIndex}
 91      *
 92      * @since 1.4
 93      * @spec JSR-51
 94      */
 95     public CharSequence subSequence(int beginIndex, int endIndex) {
 96         return this.substring(beginIndex, endIndex);
 97     }
 98 
 99     /**
100      * Concatenates the specified string to the end of this string.
101      * <p>
102      * If the length of the argument string is {@code 0}, then this
103      * {@code String} object is returned. Otherwise, a
104      * {@code String} object is returned that represents a character
105      * sequence that is the concatenation of the character sequence
106      * represented by this {@code String} object and the character
107      * sequence represented by the argument string.<p>
108      * Examples:
109      * <blockquote><pre>
110      * "cares".concat("s") returns "caress"
111      * "to".concat("get").concat("her") returns "together"
112      * </pre></blockquote>
113      *
114      * @param   str   the {@code String} that is concatenated to the end
115      *                of this {@code String}.
116      * @return  a string that represents the concatenation of this object's
117      *          characters followed by the string argument's characters.
118      */
119     public String concat(String str) {
120         int otherLen = str.length();
121         if (otherLen == 0) {
122             return this;
123         }
124         int len = value.length;
125         char buf[] = Arrays.copyOf(value, len + otherLen);
126         str.getChars(buf, len);
127         return new String(buf, true);
128     }
129 
130     /**
131      * Returns a string resulting from replacing all occurrences of
132      * {@code oldChar} in this string with {@code newChar}.
133      * <p>
134      * If the character {@code oldChar} does not occur in the
135      * character sequence represented by this {@code String} object,
136      * then a reference to this {@code String} object is returned.
137      * Otherwise, a {@code String} object is returned that
138      * represents a character sequence identical to the character sequence
139      * represented by this {@code String} object, except that every
140      * occurrence of {@code oldChar} is replaced by an occurrence
141      * of {@code newChar}.
142      * <p>
143      * Examples:
144      * <blockquote><pre>
145      * "mesquite in your cellar".replace('e', 'o')
146      *         returns "mosquito in your collar"
147      * "the war of baronets".replace('r', 'y')
148      *         returns "the way of bayonets"
149      * "sparring with a purple porpoise".replace('p', 't')
150      *         returns "starring with a turtle tortoise"
151      * "JonL".replace('q', 'x') returns "JonL" (no change)
152      * </pre></blockquote>
153      *
154      * @param   oldChar   the old character.
155      * @param   newChar   the new character.
156      * @return  a string derived from this string by replacing every
157      *          occurrence of {@code oldChar} with {@code newChar}.
158      */
159     public String replace(char oldChar, char newChar) {
160         if (oldChar != newChar) {
161             int len = value.length;
162             int i = -1;
163             char[] val = value; /* avoid getfield opcode */
164 
165             while (++i < len) {
166                 if (val[i] == oldChar) {
167                     break;
168                 }
169             }
170             if (i < len) {
171                 char buf[] = new char[len];
172                 for (int j = 0; j < i; j++) {
173                     buf[j] = val[j];
174                 }
175                 while (i < len) {
176                     char c = val[i];
177                     buf[i] = (c == oldChar) ? newChar : c;
178                     i++;
179                 }
180                 return new String(buf, true);
181             }
182         }
183         return this;
184     }
185 
186     /**
187      * Tells whether or not this string matches the given <a
188      * href="../util/regex/Pattern.html#sum">regular expression</a>.
189      *
190      * <p> An invocation of this method of the form
191      * <i>str</i>{@code .matches(}<i>regex</i>{@code )} yields exactly the
192      * same result as the expression
193      *
194      * <blockquote>
195      * {@link java.util.regex.Pattern}.{@link java.util.regex.Pattern#matches(String,CharSequence)
196      * matches(<i>regex</i>, <i>str</i>)}
197      * </blockquote>
198      *
199      * @param   regex
200      *          the regular expression to which this string is to be matched
201      *
202      * @return  {@code true} if, and only if, this string matches the
203      *          given regular expression
204      *
205      * @throws  PatternSyntaxException
206      *          if the regular expression's syntax is invalid
207      *
208      * @see java.util.regex.Pattern
209      *
210      * @since 1.4
211      * @spec JSR-51
212      */
213     public boolean matches(String regex) {
214         return Pattern.matches(regex, this);
215     }
216 
217     /**
218      * Returns true if and only if this string contains the specified
219      * sequence of char values.
220      *
221      * @param s the sequence to search for
222      * @return true if this string contains {@code s}, false otherwise
223      * @since 1.5
224      */
225     public boolean contains(CharSequence s) {
226         return indexOf(s.toString()) > -1;
227     }
228 
229     /**
230      * Replaces the first substring of this string that matches the given <a
231      * href="../util/regex/Pattern.html#sum">regular expression</a> with the
232      * given replacement.
233      *
234      * <p> An invocation of this method of the form
235      * <i>str</i>{@code .replaceFirst(}<i>regex</i>{@code ,} <i>repl</i>{@code )}
236      * yields exactly the same result as the expression
237      *
238      * <blockquote>
239      * <code>
240      * {@link java.util.regex.Pattern}.{@link
241      * java.util.regex.Pattern#compile compile}(<i>regex</i>).{@link
242      * java.util.regex.Pattern#matcher(java.lang.CharSequence) matcher}(<i>str</i>).{@link
243      * java.util.regex.Matcher#replaceFirst replaceFirst}(<i>repl</i>)
244      * </code>
245      * </blockquote>
246      *
247      *<p>
248      * Note that backslashes ({@code \}) and dollar signs ({@code $}) in the
249      * replacement string may cause the results to be different than if it were
250      * being treated as a literal replacement string; see
251      * {@link java.util.regex.Matcher#replaceFirst}.
252      * Use {@link java.util.regex.Matcher#quoteReplacement} to suppress the special
253      * meaning of these characters, if desired.
254      *
255      * @param   regex
256      *          the regular expression to which this string is to be matched
257      * @param   replacement
258      *          the string to be substituted for the first match
259      *
260      * @return  The resulting {@code String}
261      *
262      * @throws  PatternSyntaxException
263      *          if the regular expression's syntax is invalid
264      *
265      * @see java.util.regex.Pattern
266      *
267      * @since 1.4
268      * @spec JSR-51
269      */
270     public String replaceFirst(String regex, String replacement) {
271         return Pattern.compile(regex).matcher(this).replaceFirst(replacement);
272     }
273 
274     /**
275      * Replaces each substring of this string that matches the given <a
276      * href="../util/regex/Pattern.html#sum">regular expression</a> with the
277      * given replacement.
278      *
279      * <p> An invocation of this method of the form
280      * <i>str</i>{@code .replaceAll(}<i>regex</i>{@code ,} <i>repl</i>{@code )}
281      * yields exactly the same result as the expression
282      *
283      * <blockquote>
284      * <code>
285      * {@link java.util.regex.Pattern}.{@link
286      * java.util.regex.Pattern#compile compile}(<i>regex</i>).{@link
287      * java.util.regex.Pattern#matcher(java.lang.CharSequence) matcher}(<i>str</i>).{@link
288      * java.util.regex.Matcher#replaceAll replaceAll}(<i>repl</i>)
289      * </code>
290      * </blockquote>
291      *
292      *<p>
293      * Note that backslashes ({@code \}) and dollar signs ({@code $}) in the
294      * replacement string may cause the results to be different than if it were
295      * being treated as a literal replacement string; see
296      * {@link java.util.regex.Matcher#replaceAll Matcher.replaceAll}.
297      * Use {@link java.util.regex.Matcher#quoteReplacement} to suppress the special
298      * meaning of these characters, if desired.
299      *
300      * @param   regex
301      *          the regular expression to which this string is to be matched
302      * @param   replacement
303      *          the string to be substituted for each match
304      *
305      * @return  The resulting {@code String}
306      *
307      * @throws  PatternSyntaxException
308      *          if the regular expression's syntax is invalid
309      *
310      * @see java.util.regex.Pattern
311      *
312      * @since 1.4
313      * @spec JSR-51
314      */
315     public String replaceAll(String regex, String replacement) {
316         return Pattern.compile(regex).matcher(this).replaceAll(replacement);
317     }
318 
319     /**
320      * Replaces each substring of this string that matches the literal target
321      * sequence with the specified literal replacement sequence. The
322      * replacement proceeds from the beginning of the string to the end, for
323      * example, replacing "aa" with "b" in the string "aaa" will result in
324      * "ba" rather than "ab".
325      *
326      * @param  target The sequence of char values to be replaced
327      * @param  replacement The replacement sequence of char values
328      * @return  The resulting string
329      * @since 1.5
330      */
331     public String replace(CharSequence target, CharSequence replacement) {
332         return Pattern.compile(target.toString(), Pattern.LITERAL).matcher(
333                 this).replaceAll(Matcher.quoteReplacement(replacement.toString()));
334     }

string operations
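Two points from the listing above are worth trying out: several of these methods return `this` when nothing changes, and the literal `replace(CharSequence, CharSequence)` behaves differently from the regex-based `replaceAll`. The following is a minimal sketch. `StringOpsDemo` and `replaceAllWithLiteral` are hypothetical names introduced here, and the identity check on `substring(0)` relies on the implementation shown above, not on a documented guarantee.

```java
import java.util.regex.Matcher;

public class StringOpsDemo {

    // Hypothetical helper: regex match, but the replacement is taken
    // literally, so '$' and '\' in it carry no special meaning.
    static String replaceAllWithLiteral(String input, String regex, String literal) {
        return input.replaceAll(regex, Matcher.quoteReplacement(literal));
    }

    public static void main(String[] args) {
        String s = "hamburger";

        // "No change" paths hand back the same instance.
        System.out.println(s.concat("") == s);        // true (documented)
        System.out.println(s.replace('q', 'x') == s); // true (documented)
        // Implementation detail of the source shown above:
        System.out.println(s.substring(0) == s);

        // endIndex is exclusive.
        System.out.println(s.substring(4, 8));        // urge

        // Literal replace proceeds from left to right.
        System.out.println("aaa".replace("aa", "b")); // ba

        // replace treats "." literally; replaceAll treats it as a regex.
        System.out.println("a.b.c".replace(".", "-"));    // a-b-c
        System.out.println("a.b.c".replaceAll(".", "-")); // -----

        // Unquoted, "$5" would be read as a reference to group 5.
        System.out.println(replaceAllWithLiteral("price: X", "X", "$5")); // price: $5
    }
}
```

Because `value` is never modified in place, returning `this` on the no-change paths is safe and avoids an unnecessary array copy.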

Analysis of the split method

  1    /**
  2      * Splits this string around matches of the given
  3      * <a href="../util/regex/Pattern.html#sum">regular expression</a>.
  4      *
  5      * <p> The array returned by this method contains each substring of this
  6      * string that is terminated by another substring that matches the given
  7      * expression or is terminated by the end of the string.  The substrings in
  8      * the array are in the order in which they occur in this string.  If the
  9      * expression does not match any part of the input then the resulting array
 10      * has just one element, namely this string.
 11      *
 12      * <p> When there is a positive-width match at the beginning of this
 13      * string then an empty leading substring is included at the beginning
 14      * of the resulting array. A zero-width match at the beginning however
 15      * never produces such empty leading substring.
 16      *
 17      * <p> The {@code limit} parameter controls the number of times the
 18      * pattern is applied and therefore affects the length of the resulting
 19      * array.  If the limit <i>n</i> is greater than zero then the pattern
 20      * will be applied at most <i>n</i>&nbsp;-&nbsp;1 times, the array's
 21      * length will be no greater than <i>n</i>, and the array's last entry
 22      * will contain all input beyond the last matched delimiter.  If <i>n</i>
 23      * is non-positive then the pattern will be applied as many times as
 24      * possible and the array can have any length.  If <i>n</i> is zero then
 25      * the pattern will be applied as many times as possible, the array can
 26      * have any length, and trailing empty strings will be discarded.
 27      *
 28      * <p> The string {@code "boo:and:foo"}, for example, yields the
 29      * following results with these parameters:
 30      *
 31      * <blockquote><table cellpadding=1 cellspacing=0 summary="Split example showing regex, limit, and result">
 32      * <tr>
 33      *     <th>Regex</th>
 34      *     <th>Limit</th>
 35      *     <th>Result</th>
 36      * </tr>
 37      * <tr><td align=center>:</td>
 38      *     <td align=center>2</td>
 39      *     <td>{@code { "boo", "and:foo" }}</td></tr>
 40      * <tr><td align=center>:</td>
 41      *     <td align=center>5</td>
 42      *     <td>{@code { "boo", "and", "foo" }}</td></tr>
 43      * <tr><td align=center>:</td>
 44      *     <td align=center>-2</td>
 45      *     <td>{@code { "boo", "and", "foo" }}</td></tr>
 46      * <tr><td align=center>o</td>
 47      *     <td align=center>5</td>
 48      *     <td>{@code { "b", "", ":and:f", "", "" }}</td></tr>
 49      * <tr><td align=center>o</td>
 50      *     <td align=center>-2</td>
 51      *     <td>{@code { "b", "", ":and:f", "", "" }}</td></tr>
 52      * <tr><td align=center>o</td>
 53      *     <td align=center>0</td>
 54      *     <td>{@code { "b", "", ":and:f" }}</td></tr>
 55      * </table></blockquote>
 56      *
 57      * <p> An invocation of this method of the form
 58      * <i>str.</i>{@code split(}<i>regex</i>{@code ,}&nbsp;<i>n</i>{@code )}
 59      * yields the same result as the expression
 60      *
 61      * <blockquote>
 62      * <code>
 63      * {@link java.util.regex.Pattern}.{@link
 64      * java.util.regex.Pattern#compile compile}(<i>regex</i>).{@link
 65      * java.util.regex.Pattern#split(java.lang.CharSequence,int) split}(<i>str</i>,&nbsp;<i>n</i>)
 66      * </code>
 67      * </blockquote>
 68      *
 69      *
 70      * @param  regex
 71      *         the delimiting regular expression
 72      *
 73      * @param  limit
 74      *         the result threshold, as described above
 75      *
 76      * @return  the array of strings computed by splitting this string
 77      *          around matches of the given regular expression
 78      *
 79      * @throws  PatternSyntaxException
 80      *          if the regular expression's syntax is invalid
 81      *
 82      * @see java.util.regex.Pattern
 83      *
 84      * @since 1.4
 85      * @spec JSR-51
 86      */
 87     public String[] split(String regex, int limit) {
 88         /* fastpath if the regex is a
 89          (1)one-char String and this character is not one of the
 90             RegEx's meta characters ".$|()[{^?*+\\", or
 91          (2)two-char String and the first char is the backslash and
 92             the second is not the ascii digit or ascii letter.
 93          */
 94         char ch = 0;
 95         if (((regex.value.length == 1 &&
 96              ".$|()[{^?*+\\".indexOf(ch = regex.charAt(0)) == -1) ||
 97              (regex.length() == 2 &&
 98               regex.charAt(0) == '\\' &&
 99               (((ch = regex.charAt(1))-'0')|('9'-ch)) < 0 &&
100               ((ch-'a')|('z'-ch)) < 0 &&
101               ((ch-'A')|('Z'-ch)) < 0)) &&
102             (ch < Character.MIN_HIGH_SURROGATE ||
103              ch > Character.MAX_LOW_SURROGATE))
104         {
105             int off = 0;
106             int next = 0;
107             boolean limited = limit > 0;
108             ArrayList<String> list = new ArrayList<>();
109             while ((next = indexOf(ch, off)) != -1) {
110                 if (!limited || list.size() < limit - 1) {
111                     list.add(substring(off, next));
112                     off = next + 1;
113                 } else {    // last one
114                     //assert (list.size() == limit - 1);
115                     list.add(substring(off, value.length));
116                     off = value.length;
117                     break;
118                 }
119             }
120             // If no match was found, return this
121             if (off == 0)
122                 return new String[]{this};
123 
124             // Add remaining segment
125             if (!limited || list.size() < limit)
126                 list.add(substring(off, value.length));
127 
128             // Construct result
129             int resultSize = list.size();
130             if (limit == 0) {
131                 while (resultSize > 0 && list.get(resultSize - 1).length() == 0) {
132                     resultSize--;
133                 }
134             }
135             String[] result = new String[resultSize];
136             return list.subList(0, resultSize).toArray(result);
137         }
138         return Pattern.compile(regex).split(this, limit);
139     }
140 
141     /**
142      * Splits this string around matches of the given <a
143      * href="../util/regex/Pattern.html#sum">regular expression</a>.
144      *
145      * <p> This method works as if by invoking the two-argument {@link
146      * #split(String, int) split} method with the given expression and a limit
147      * argument of zero.  Trailing empty strings are therefore not included in
148      * the resulting array.
149      *
150      * <p> The string {@code "boo:and:foo"}, for example, yields the following
151      * results with these expressions:
152      *
153      * <blockquote><table cellpadding=1 cellspacing=0 summary="Split examples showing regex and result">
154      * <tr>
155      *  <th>Regex</th>
156      *  <th>Result</th>
157      * </tr>
158      * <tr><td align=center>:</td>
159      *     <td>{@code { "boo", "and", "foo" }}</td></tr>
160      * <tr><td align=center>o</td>
161      *     <td>{@code { "b", "", ":and:f" }}</td></tr>
162      * </table></blockquote>
163      *
164      *
165      * @param  regex
166      *         the delimiting regular expression
167      *
168      * @return  the array of strings computed by splitting this string
169      *          around matches of the given regular expression
170      *
171      * @throws  PatternSyntaxException
172      *          if the regular expression's syntax is invalid
173      *
174      * @see java.util.regex.Pattern
175      *
176      * @since 1.4
177      * @spec JSR-51
178      */
179     public String[] split(String regex) {
180         return split(regex, 0);
181     }


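The split method has two code paths. When the regex is a single literal character (or a backslash followed by a non-alphanumeric character), a fast path scans the string with indexOf and collects the pieces in an ArrayList. In every other case it delegates to Pattern.compile(regex).split(this, limit).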
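The effect of the limit argument, and the kind of input each path handles, can be sketched as follows. This is a minimal illustration, not part of the JDK source: the class name SplitDemo and its helper are hypothetical, and which internal path runs is inferred from the JDK 8 code shown above rather than observable from the output.

```java
import java.util.Arrays;

final class SplitDemo {
    // Hypothetical thin wrapper around String.split(regex, limit).
    static String[] split(String input, String regex, int limit) {
        return input.split(regex, limit);
    }

    public static void main(String[] args) {
        String s = "boo:and:foo";
        // ":" is one non-meta character, so the fast path in the source above applies.
        System.out.println(Arrays.toString(split(s, ":", 2)));   // [boo, and:foo]
        // limit == 0 discards trailing empty strings.
        System.out.println(Arrays.toString(split(s, "o", 0)));   // [b, , :and:f]
        // A negative limit keeps trailing empty strings.
        System.out.println(Arrays.toString(split(s, "o", -2)));  // [b, , :and:f, , ]
        // "a|b" contains a meta character, so it goes through Pattern.compile(...).split(...).
        System.out.println(Arrays.toString(split("1a2b3", "a|b", 0))); // [1, 2, 3]
    }
}
```

The first three results match the table in the Javadoc above.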

The join method

 1     /**
 2      * Returns a new String composed of copies of the
 3      * {@code CharSequence elements} joined together with a copy of
 4      * the specified {@code delimiter}.
 5      *
 6      * <blockquote>For example,
 7      * <pre>{@code
 8      *     String message = String.join("-", "Java", "is", "cool");
 9      *     // message returned is: "Java-is-cool"
10      * }</pre></blockquote>
11      *
12      * Note that if an element is null, then {@code "null"} is added.
13      *
14      * @param  delimiter the delimiter that separates each element
15      * @param  elements the elements to join together.
16      *
17      * @return a new {@code String} that is composed of the {@code elements}
18      *         separated by the {@code delimiter}
19      *
20      * @throws NullPointerException If {@code delimiter} or {@code elements}
21      *         is {@code null}
22      *
23      * @see java.util.StringJoiner
24      * @since 1.8
25      */
26     public static String join(CharSequence delimiter, CharSequence... elements) {
27         Objects.requireNonNull(delimiter);
28         Objects.requireNonNull(elements);
29         // Number of elements not likely worth Arrays.stream overhead.
30         StringJoiner joiner = new StringJoiner(delimiter);
31         for (CharSequence cs: elements) {
32             joiner.add(cs);
33         }
34         return joiner.toString();
35     }
36 
37     /**
38      * Returns a new {@code String} composed of copies of the
39      * {@code CharSequence elements} joined together with a copy of the
40      * specified {@code delimiter}.
41      *
42      * <blockquote>For example,
43      * <pre>{@code
44      *     List<String> strings = new LinkedList<>();
45      *     strings.add("Java");strings.add("is");
46      *     strings.add("cool");
47      *     String message = String.join(" ", strings);
48      *     //message returned is: "Java is cool"
49      *
50      *     Set<String> strings = new LinkedHashSet<>();
51      *     strings.add("Java"); strings.add("is");
52      *     strings.add("very"); strings.add("cool");
53      *     String message = String.join("-", strings);
54      *     //message returned is: "Java-is-very-cool"
55      * }</pre></blockquote>
56      *
57      * Note that if an individual element is {@code null}, then {@code "null"} is added.
58      *
59      * @param  delimiter a sequence of characters that is used to separate each
60      *         of the {@code elements} in the resulting {@code String}
61      * @param  elements an {@code Iterable} that will have its {@code elements}
62      *         joined together.
63      *
64      * @return a new {@code String} that is composed from the {@code elements}
65      *         argument
66      *
67      * @throws NullPointerException If {@code delimiter} or {@code elements}
68      *         is {@code null}
69      *
70      * @see    #join(CharSequence,CharSequence...)
71      * @see    java.util.StringJoiner
72      * @since 1.8
73      */
74     public static String join(CharSequence delimiter,
75             Iterable<? extends CharSequence> elements) {
76         Objects.requireNonNull(delimiter);
77         Objects.requireNonNull(elements);
78         StringJoiner joiner = new StringJoiner(delimiter);
79         for (CharSequence cs: elements) {
80             joiner.add(cs);
81         }
82         return joiner.toString();
83     }


join is implemented with StringJoiner, which in turn wraps a StringBuilder.
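As a rough sketch of that delegation, the helper below repeats the loop from the source above using StringJoiner directly and compares it with String.join. The class JoinDemo and the method joinWithJoiner are hypothetical names introduced only for this illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

final class JoinDemo {
    // Same shape as the String.join source above: add each element to a StringJoiner.
    static String joinWithJoiner(CharSequence delimiter,
                                 Iterable<? extends CharSequence> elements) {
        StringJoiner joiner = new StringJoiner(delimiter);
        for (CharSequence cs : elements) {
            joiner.add(cs); // a null element is appended as the text "null"
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("Java", "is", null, "cool");
        System.out.println(String.join("-", words));    // Java-is-null-cool
        System.out.println(joinWithJoiner("-", words)); // Java-is-null-cool
    }
}
```

Both calls produce the same text, including the literal "null" for the null element, as the Javadoc notes.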

Case conversion and whitespace trimming methods

  1     /**
  2      * Converts all of the characters in this {@code String} to lower
  3      * case using the rules of the given {@code Locale}.  Case mapping is based
  4      * on the Unicode Standard version specified by the {@link java.lang.Character Character}
  5      * class. Since case mappings are not always 1:1 char mappings, the resulting
  6      * {@code String} may be a different length than the original {@code String}.
  7      * <p>
  8      * Examples of lowercase  mappings are in the following table:
  9      * <table border="1" summary="Lowercase mapping examples showing language code of locale, upper case, lower case, and description">
 10      * <tr>
 11      *   <th>Language Code of Locale</th>
 12      *   <th>Upper Case</th>
 13      *   <th>Lower Case</th>
 14      *   <th>Description</th>
 15      * </tr>
 16      * <tr>
 17      *   <td>tr (Turkish)</td>
 18      *   <td>&#92;u0130</td>
 19      *   <td>&#92;u0069</td>
 20      *   <td>capital letter I with dot above -&gt; small letter i</td>
 21      * </tr>
 22      * <tr>
 23      *   <td>tr (Turkish)</td>
 24      *   <td>&#92;u0049</td>
 25      *   <td>&#92;u0131</td>
 26      *   <td>capital letter I -&gt; small letter dotless i </td>
 27      * </tr>
 28      * <tr>
 29      *   <td>(all)</td>
 30      *   <td>French Fries</td>
 31      *   <td>french fries</td>
 32      *   <td>lowercased all chars in String</td>
 33      * </tr>
 34      * <tr>
 35      *   <td>(all)</td>
 36      *   <td><img src="doc-files/capiota.gif" alt="capiota"><img src="doc-files/capchi.gif" alt="capchi">
 37      *       <img src="doc-files/captheta.gif" alt="captheta"><img src="doc-files/capupsil.gif" alt="capupsil">
 38      *       <img src="doc-files/capsigma.gif" alt="capsigma"></td>
 39      *   <td><img src="doc-files/iota.gif" alt="iota"><img src="doc-files/chi.gif" alt="chi">
 40      *       <img src="doc-files/theta.gif" alt="theta"><img src="doc-files/upsilon.gif" alt="upsilon">
 41      *       <img src="doc-files/sigma1.gif" alt="sigma"></td>
 42      *   <td>lowercased all chars in String</td>
 43      * </tr>
 44      * </table>
 45      *
 46      * @param locale use the case transformation rules for this locale
 47      * @return the {@code String}, converted to lowercase.
 48      * @see     java.lang.String#toLowerCase()
 49      * @see     java.lang.String#toUpperCase()
 50      * @see     java.lang.String#toUpperCase(Locale)
 51      * @since   1.1
 52      */
 53     public String toLowerCase(Locale locale) {
 54         if (locale == null) {
 55             throw new NullPointerException();
 56         }
 57 
 58         int firstUpper;
 59         final int len = value.length;
 60 
 61         /* Now check if there are any characters that need to be changed. */
 62         scan: {
 63             for (firstUpper = 0 ; firstUpper < len; ) {
 64                 char c = value[firstUpper];
 65                 if ((c >= Character.MIN_HIGH_SURROGATE)
 66                         && (c <= Character.MAX_HIGH_SURROGATE)) {
 67                     int supplChar = codePointAt(firstUpper);
 68                     if (supplChar != Character.toLowerCase(supplChar)) {
 69                         break scan;
 70                     }
 71                     firstUpper += Character.charCount(supplChar);
 72                 } else {
 73                     if (c != Character.toLowerCase(c)) {
 74                         break scan;
 75                     }
 76                     firstUpper++;
 77                 }
 78             }
 79             return this;
 80         }
 81 
 82         char[] result = new char[len];
 83         int resultOffset = 0;  /* result may grow, so i+resultOffset
 84                                 * is the write location in result */
 85 
 86         /* Just copy the first few lowerCase characters. */
 87         System.arraycopy(value, 0, result, 0, firstUpper);
 88 
 89         String lang = locale.getLanguage();
 90         boolean localeDependent =
 91                 (lang == "tr" || lang == "az" || lang == "lt");
 92         char[] lowerCharArray;
 93         int lowerChar;
 94         int srcChar;
 95         int srcCount;
 96         for (int i = firstUpper; i < len; i += srcCount) {
 97             srcChar = (int)value[i];
 98             if ((char)srcChar >= Character.MIN_HIGH_SURROGATE
 99                     && (char)srcChar <= Character.MAX_HIGH_SURROGATE) {
100                 srcChar = codePointAt(i);
101                 srcCount = Character.charCount(srcChar);
102             } else {
103                 srcCount = 1;
104             }
105             if (localeDependent ||
106                 srcChar == '\u03A3' || // GREEK CAPITAL LETTER SIGMA
107                 srcChar == '\u0130') { // LATIN CAPITAL LETTER I WITH DOT ABOVE
108                 lowerChar = ConditionalSpecialCasing.toLowerCaseEx(this, i, locale);
109             } else {
110                 lowerChar = Character.toLowerCase(srcChar);
111             }
112             if ((lowerChar == Character.ERROR)
113                     || (lowerChar >= Character.MIN_SUPPLEMENTARY_CODE_POINT)) {
114                 if (lowerChar == Character.ERROR) {
115                     lowerCharArray =
116                             ConditionalSpecialCasing.toLowerCaseCharArray(this, i, locale);
117                 } else if (srcCount == 2) {
118                     resultOffset += Character.toChars(lowerChar, result, i + resultOffset) - srcCount;
119                     continue;
120                 } else {
121                     lowerCharArray = Character.toChars(lowerChar);
122                 }
123 
124                 /* Grow result if needed */
125                 int mapLen = lowerCharArray.length;
126                 if (mapLen > srcCount) {
127                     char[] result2 = new char[result.length + mapLen - srcCount];
128                     System.arraycopy(result, 0, result2, 0, i + resultOffset);
129                     result = result2;
130                 }
131                 for (int x = 0; x < mapLen; ++x) {
132                     result[i + resultOffset + x] = lowerCharArray[x];
133                 }
134                 resultOffset += (mapLen - srcCount);
135             } else {
136                 result[i + resultOffset] = (char)lowerChar;
137             }
138         }
139         return new String(result, 0, len + resultOffset);
140     }
141 
142     /**
143      * Converts all of the characters in this {@code String} to lower
144      * case using the rules of the default locale. This is equivalent to calling
145      * {@code toLowerCase(Locale.getDefault())}.
146      * <p>
147      * <b>Note:</b> This method is locale sensitive, and may produce unexpected
148      * results if used for strings that are intended to be interpreted locale
149      * independently.
150      * Examples are programming language identifiers, protocol keys, and HTML
151      * tags.
152      * For instance, {@code "TITLE".toLowerCase()} in a Turkish locale
153      * returns {@code "t\u005Cu0131tle"}, where '\u005Cu0131' is the
154      * LATIN SMALL LETTER DOTLESS I character.
155      * To obtain correct results for locale insensitive strings, use
156      * {@code toLowerCase(Locale.ROOT)}.
157      * <p>
158      * @return  the {@code String}, converted to lowercase.
159      * @see     java.lang.String#toLowerCase(Locale)
160      */
161     public String toLowerCase() {
162         return toLowerCase(Locale.getDefault());
163     }
164 
165     /**
166      * Converts all of the characters in this {@code String} to upper
167      * case using the rules of the given {@code Locale}. Case mapping is based
168      * on the Unicode Standard version specified by the {@link java.lang.Character Character}
169      * class. Since case mappings are not always 1:1 char mappings, the resulting
170      * {@code String} may be a different length than the original {@code String}.
171      * <p>
172      * Examples of locale-sensitive and 1:M case mappings are in the following table.
173      *
174      * <table border="1" summary="Examples of locale-sensitive and 1:M case mappings. Shows Language code of locale, lower case, upper case, and description.">
175      * <tr>
176      *   <th>Language Code of Locale</th>
177      *   <th>Lower Case</th>
178      *   <th>Upper Case</th>
179      *   <th>Description</th>
180      * </tr>
181      * <tr>
182      *   <td>tr (Turkish)</td>
183      *   <td>&#92;u0069</td>
184      *   <td>&#92;u0130</td>
185      *   <td>small letter i -&gt; capital letter I with dot above</td>
186      * </tr>
187      * <tr>
188      *   <td>tr (Turkish)</td>
189      *   <td>&#92;u0131</td>
190      *   <td>&#92;u0049</td>
191      *   <td>small letter dotless i -&gt; capital letter I</td>
192      * </tr>
193      * <tr>
194      *   <td>(all)</td>
195      *   <td>&#92;u00df</td>
196      *   <td>&#92;u0053 &#92;u0053</td>
197      *   <td>small letter sharp s -&gt; two letters: SS</td>
198      * </tr>
199      * <tr>
200      *   <td>(all)</td>
201      *   <td>Fahrvergn&uuml;gen</td>
202      *   <td>FAHRVERGN&Uuml;GEN</td>
203      *   <td></td>
204      * </tr>
205      * </table>
206      * @param locale use the case transformation rules for this locale
207      * @return the {@code String}, converted to uppercase.
208      * @see     java.lang.String#toUpperCase()
209      * @see     java.lang.String#toLowerCase()
210      * @see     java.lang.String#toLowerCase(Locale)
211      * @since   1.1
212      */
213     public String toUpperCase(Locale locale) {
214         if (locale == null) {
215             throw new NullPointerException();
216         }
217 
218         int firstLower;
219         final int len = value.length;
220 
221         /* Now check if there are any characters that need to be changed. */
222         scan: {
223             for (firstLower = 0 ; firstLower < len; ) {
224                 int c = (int)value[firstLower];
225                 int srcCount;
226                 if ((c >= Character.MIN_HIGH_SURROGATE)
227                         && (c <= Character.MAX_HIGH_SURROGATE)) {
228                     c = codePointAt(firstLower);
229                     srcCount = Character.charCount(c);
230                 } else {
231                     srcCount = 1;
232                 }
233                 int upperCaseChar = Character.toUpperCaseEx(c);
234                 if ((upperCaseChar == Character.ERROR)
235                         || (c != upperCaseChar)) {
236                     break scan;
237                 }
238                 firstLower += srcCount;
239             }
240             return this;
241         }
242 
243         /* result may grow, so i+resultOffset is the write location in result */
244         int resultOffset = 0;
245         char[] result = new char[len]; /* may grow */
246 
247         /* Just copy the first few upperCase characters. */
248         System.arraycopy(value, 0, result, 0, firstLower);
249 
250         String lang = locale.getLanguage();
251         boolean localeDependent =
252                 (lang == "tr" || lang == "az" || lang == "lt");
253         char[] upperCharArray;
254         int upperChar;
255         int srcChar;
256         int srcCount;
257         for (int i = firstLower; i < len; i += srcCount) {
258             srcChar = (int)value[i];
259             if ((char)srcChar >= Character.MIN_HIGH_SURROGATE &&
260                 (char)srcChar <= Character.MAX_HIGH_SURROGATE) {
261                 srcChar = codePointAt(i);
262                 srcCount = Character.charCount(srcChar);
263             } else {
264                 srcCount = 1;
265             }
266             if (localeDependent) {
267                 upperChar = ConditionalSpecialCasing.toUpperCaseEx(this, i, locale);
268             } else {
269                 upperChar = Character.toUpperCaseEx(srcChar);
270             }
271             if ((upperChar == Character.ERROR)
272                     || (upperChar >= Character.MIN_SUPPLEMENTARY_CODE_POINT)) {
273                 if (upperChar == Character.ERROR) {
274                     if (localeDependent) {
275                         upperCharArray =
276                                 ConditionalSpecialCasing.toUpperCaseCharArray(this, i, locale);
277                     } else {
278                         upperCharArray = Character.toUpperCaseCharArray(srcChar);
279                     }
280                 } else if (srcCount == 2) {
281                     resultOffset += Character.toChars(upperChar, result, i + resultOffset) - srcCount;
282                     continue;
283                 } else {
284                     upperCharArray = Character.toChars(upperChar);
285                 }
286 
287                 /* Grow result if needed */
288                 int mapLen = upperCharArray.length;
289                 if (mapLen > srcCount) {
290                     char[] result2 = new char[result.length + mapLen - srcCount];
291                     System.arraycopy(result, 0, result2, 0, i + resultOffset);
292                     result = result2;
293                 }
294                 for (int x = 0; x < mapLen; ++x) {
295                     result[i + resultOffset + x] = upperCharArray[x];
296                 }
297                 resultOffset += (mapLen - srcCount);
298             } else {
299                 result[i + resultOffset] = (char)upperChar;
300             }
301         }
302         return new String(result, 0, len + resultOffset);
303     }
304 
305     /**
306      * Converts all of the characters in this {@code String} to upper
307      * case using the rules of the default locale. This method is equivalent to
308      * {@code toUpperCase(Locale.getDefault())}.
309      * <p>
310      * <b>Note:</b> This method is locale sensitive, and may produce unexpected
311      * results if used for strings that are intended to be interpreted locale
312      * independently.
313      * Examples are programming language identifiers, protocol keys, and HTML
314      * tags.
315      * For instance, {@code "title".toUpperCase()} in a Turkish locale
316      * returns {@code "T\u005Cu0130TLE"}, where '\u005Cu0130' is the
317      * LATIN CAPITAL LETTER I WITH DOT ABOVE character.
318      * To obtain correct results for locale insensitive strings, use
319      * {@code toUpperCase(Locale.ROOT)}.
320      * <p>
321      * @return  the {@code String}, converted to uppercase.
322      * @see     java.lang.String#toUpperCase(Locale)
323      */
324     public String toUpperCase() {
325         return toUpperCase(Locale.getDefault());
326     }
327 
328     /**
329      * Returns a string whose value is this string, with any leading and trailing
330      * whitespace removed.
331      * <p>
332      * If this {@code String} object represents an empty character
333      * sequence, or the first and last characters of character sequence
334      * represented by this {@code String} object both have codes
335      * greater than {@code '\u005Cu0020'} (the space character), then a
336      * reference to this {@code String} object is returned.
337      * <p>
338      * Otherwise, if there is no character with a code greater than
339      * {@code '\u005Cu0020'} in the string, then a
340      * {@code String} object representing an empty string is
341      * returned.
342      * <p>
343      * Otherwise, let <i>k</i> be the index of the first character in the
344      * string whose code is greater than {@code '\u005Cu0020'}, and let
345      * <i>m</i> be the index of the last character in the string whose code
346      * is greater than {@code '\u005Cu0020'}. A {@code String}
347      * object is returned, representing the substring of this string that
348      * begins with the character at index <i>k</i> and ends with the
349      * character at index <i>m</i>-that is, the result of
350      * {@code this.substring(k, m + 1)}.
351      * <p>
352      * This method may be used to trim whitespace (as defined above) from
353      * the beginning and end of a string.
354      *
355      * @return  A string whose value is this string, with any leading and trailing white
356      *          space removed, or this string if it has no leading or
357      *          trailing white space.
358      */
359     public String trim() {
360         int len = value.length;
361         int st = 0;
362         char[] val = value;    /* avoid getfield opcode */
363 
364         while ((st < len) && (val[st] <= ' ')) {
365             st++;
366         }
367         while ((st < len) && (val[len - 1] <= ' ')) {
368             len--;
369         }
370         return ((st > 0) || (len < value.length)) ? substring(st, len) : this;
371     }

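The locale sensitivity and the length-changing mappings described in the Javadoc above can be seen in a short sketch. The class CaseTrimDemo and its helpers are hypothetical names used only for illustration; the expected values come from the examples in the Javadoc itself.

```java
import java.util.Locale;

final class CaseTrimDemo {
    // Locale-independent lowercasing, as the Javadoc recommends for identifiers and keys.
    static String lowerRoot(String s) {
        return s.toLowerCase(Locale.ROOT);
    }

    // Lowercasing under Turkish rules, where capital I maps to dotless i.
    static String lowerTurkish(String s) {
        return s.toLowerCase(Locale.forLanguageTag("tr"));
    }

    public static void main(String[] args) {
        System.out.println(lowerRoot("TITLE"));                          // title
        System.out.println(lowerTurkish("TITLE").equals("t\u0131tle"));  // true
        // One char can map to two: sharp s uppercases to "SS".
        System.out.println("\u00df".toUpperCase(Locale.ROOT));           // SS
        // trim removes leading and trailing chars with codes up to the space character.
        System.out.println("[" + " \t abc \n".trim() + "]");             // [abc]
        String plain = "abc";
        // With nothing to trim, the same String object is returned.
        System.out.println(plain.trim() == plain);                       // true
    }
}
```

For text that is not meant for a particular language, passing Locale.ROOT avoids surprises when the default locale is Turkish, Azerbaijani, or Lithuanian.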

The toString method

    /**
     * This object (which is already a string!) is itself returned.
     *
     * @return  the string itself.
     */
    public String toString() {
        return this;
    }

    /**
     * Converts this string to a new character array.
     *
     * @return  a newly allocated character array whose length is the length
     *          of this string and whose contents are initialized to contain
     *          the character sequence represented by this string.
     */
    public char[] toCharArray() {
        // Cannot use Arrays.copyOf because of class initialization order issues
        char result[] = new char[value.length];
        System.arraycopy(value, 0, result, 0, value.length);
        return result;
    }


format: public static formatting methods

    public static String format(String format, Object... args) {
        return new Formatter().format(format, args).toString();
    }

    public static String format(Locale l, String format, Object... args) {
        return new Formatter(l).format(format, args).toString();
    }


valueOf

  1     /**
  2      * Returns the string representation of the {@code Object} argument.
  3      *
  4      * @param   obj   an {@code Object}.
  5      * @return  if the argument is {@code null}, then a string equal to
  6      *          {@code "null"}; otherwise, the value of
  7      *          {@code obj.toString()} is returned.
  8      * @see     java.lang.Object#toString()
  9      */
 10     public static String valueOf(Object obj) {
 11         return (obj == null) ? "null" : obj.toString();
 12     }
 13 
 14     /**
 15      * Returns the string representation of the {@code char} array
 16      * argument. The contents of the character array are copied; subsequent
 17      * modification of the character array does not affect the returned
 18      * string.
 19      *
 20      * @param   data     the character array.
 21      * @return  a {@code String} that contains the characters of the
 22      *          character array.
 23      */
 24     public static String valueOf(char data[]) {
 25         return new String(data);
 26     }
 27 
 28     /**
 29      * Returns the string representation of a specific subarray of the
 30      * {@code char} array argument.
 31      * <p>
 32      * The {@code offset} argument is the index of the first
 33      * character of the subarray. The {@code count} argument
 34      * specifies the length of the subarray. The contents of the subarray
 35      * are copied; subsequent modification of the character array does not
 36      * affect the returned string.
 37      *
 38      * @param   data     the character array.
 39      * @param   offset   initial offset of the subarray.
 40      * @param   count    length of the subarray.
 41      * @return  a {@code String} that contains the characters of the
 42      *          specified subarray of the character array.
 43      * @exception IndexOutOfBoundsException if {@code offset} is
 44      *          negative, or {@code count} is negative, or
 45      *          {@code offset+count} is larger than
 46      *          {@code data.length}.
 47      */
 48     public static String valueOf(char data[], int offset, int count) {
 49         return new String(data, offset, count);
 50     }
 51 
 52     /**
 53      * Equivalent to {@link #valueOf(char[], int, int)}.
 54      *
 55      * @param   data     the character array.
 56      * @param   offset   initial offset of the subarray.
 57      * @param   count    length of the subarray.
 58      * @return  a {@code String} that contains the characters of the
 59      *          specified subarray of the character array.
 60      * @exception IndexOutOfBoundsException if {@code offset} is
 61      *          negative, or {@code count} is negative, or
 62      *          {@code offset+count} is larger than
 63      *          {@code data.length}.
 64      */
 65     public static String copyValueOf(char data[], int offset, int count) {
 66         return new String(data, offset, count);
 67     }
 68 
 69     /**
 70      * Equivalent to {@link #valueOf(char[])}.
 71      *
 72      * @param   data   the character array.
 73      * @return  a {@code String} that contains the characters of the
 74      *          character array.
 75      */
 76     public static String copyValueOf(char data[]) {
 77         return new String(data);
 78     }
 79 
    /**
     * Returns the string representation of the {@code boolean} argument.
     *
     * @param   b   a {@code boolean}.
     * @return  if the argument is {@code true}, a string equal to
     *          {@code "true"} is returned; otherwise, a string equal to
     *          {@code "false"} is returned.
     */
    public static String valueOf(boolean b) {
        return b ? "true" : "false";
    }

    /**
     * Returns the string representation of the {@code char}
     * argument.
     *
     * @param   c   a {@code char}.
     * @return  a string of length {@code 1} containing
     *          as its single character the argument {@code c}.
     */
    public static String valueOf(char c) {
        char data[] = {c};
        return new String(data, true);
    }

    /**
     * Returns the string representation of the {@code int} argument.
     * <p>
     * The representation is exactly the one returned by the
     * {@code Integer.toString} method of one argument.
     *
     * @param   i   an {@code int}.
     * @return  a string representation of the {@code int} argument.
     * @see     java.lang.Integer#toString(int, int)
     */
    public static String valueOf(int i) {
        return Integer.toString(i);
    }

    /**
     * Returns the string representation of the {@code long} argument.
     * <p>
     * The representation is exactly the one returned by the
     * {@code Long.toString} method of one argument.
     *
     * @param   l   a {@code long}.
     * @return  a string representation of the {@code long} argument.
     * @see     java.lang.Long#toString(long)
     */
    public static String valueOf(long l) {
        return Long.toString(l);
    }

    /**
     * Returns the string representation of the {@code float} argument.
     * <p>
     * The representation is exactly the one returned by the
     * {@code Float.toString} method of one argument.
     *
     * @param   f   a {@code float}.
     * @return  a string representation of the {@code float} argument.
     * @see     java.lang.Float#toString(float)
     */
    public static String valueOf(float f) {
        return Float.toString(f);
    }

    /**
     * Returns the string representation of the {@code double} argument.
     * <p>
     * The representation is exactly the one returned by the
     * {@code Double.toString} method of one argument.
     *
     * @param   d   a {@code double}.
     * @return  a  string representation of the {@code double} argument.
     * @see     java.lang.Double#toString(double)
     */
    public static String valueOf(double d) {
        return Double.toString(d);
    }

valueOf

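Among the valueOf overloads in the source, a null reference typed as Object goes to valueOf(Object) and comes back as the string "null", whereas passing the bare literal null resolves to valueOf(char[]) and throws a NullPointerException. (The "null" string is a trap.)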
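The difference between the two null cases can be sketched as follows. This is a minimal illustration, not part of the JDK source; the class name `ValueOfNullDemo` and its helper methods are made up for this example.

```java
public class ValueOfNullDemo {

    // A null typed as Object selects valueOf(Object),
    // which returns the four-character string "null".
    static String typedNull() {
        Object o = null;
        return String.valueOf(o);
    }

    // A bare null literal matches both valueOf(Object) and valueOf(char[]).
    // The compiler picks the more specific one, valueOf(char[]),
    // which then fails on the null array at runtime.
    static boolean bareNullThrows() {
        try {
            String.valueOf(null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(typedNull());
        System.out.println(bareNullThrows());
    }
}
```

Because the result of the typed case is an ordinary non-null string, a later check such as `s == null` will not catch it, which is why the text calls it a trap.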

intern is a native method, which means it is implemented in native code inside the JVM rather than in Java.

    /**
     * Returns a canonical representation for the string object.
     * <p>
     * A pool of strings, initially empty, is maintained privately by the
     * class {@code String}.
     * <p>
     * When the intern method is invoked, if the pool already contains a
     * string equal to this {@code String} object as determined by
     * the {@link #equals(Object)} method, then the string from the pool is
     * returned. Otherwise, this {@code String} object is added to the
     * pool and a reference to this {@code String} object is returned.
     * <p>
     * It follows that for any two strings {@code s} and {@code t},
     * {@code s.intern() == t.intern()} is {@code true}
     * if and only if {@code s.equals(t)} is {@code true}.
     * <p>
     * All literal strings and string-valued constant expressions are
     * interned. String literals are defined in section 3.10.5 of the
     * <cite>The Java&trade; Language Specification</cite>.
     *
     * @return  a string that has the same contents as this string, but is
     *          guaranteed to be from a pool of unique strings.
     */
    public native String intern();

intern

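How this method works comes down to the string constant pool: as the Javadoc above states, intern returns the pooled string when an equal one is already present, and otherwise adds this string to the pool and returns it.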
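A small sketch of the pool behaviour described in the Javadoc, assuming nothing beyond the standard library. The class name `InternDemo` and its methods are hypothetical and exist only for this illustration.

```java
public class InternDemo {

    // A literal is interned automatically, while new String(...) always
    // creates a separate object, so the two references differ.
    static boolean distinctBeforeIntern() {
        String literal = "hello";
        String built = new String("hello");
        return built != literal;
    }

    // intern() finds the equal string already in the pool (the literal)
    // and returns that same reference.
    static boolean sameAfterIntern() {
        String literal = "hello";
        String built = new String("hello");
        return built.intern() == literal;
    }

    public static void main(String[] args) {
        System.out.println(distinctBeforeIntern());
        System.out.println(sameAfterIntern());
    }
}
```

Both methods compare references with `==` on purpose, since the point is object identity; comparing contents would still use `equals`.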
